The future of generative AI in the enterprise could be smaller, more focused language models

The amazing abilities of OpenAI’s ChatGPT wouldn’t be possible without large language models (LLMs). These models are trained on billions, sometimes trillions, of examples of text. The idea behind ChatGPT is to understand language so well that it can plausibly anticipate which word comes next, in a fraction of a second. It takes a ton of training, compute resources, and developer knowledge to get there.
But perhaps the future of these models is more focused than the boil-the-ocean approach we’ve seen from OpenAI and others, who want to be able to answer every question under the sun. What if every industry, or even every business, had its own model trained to understand the jargon, language, and approach of that individual entity? Perhaps we would then get fewer completely made-up answers, because the answers would come from a more limited universe of words and phrases.
In an AI-driven future, each company’s own data could be its most valuable asset. If you’re an insurance company, you have a completely different lexicon than a hospital, auto company, or law firm, and when you combine that with your customer data and the full body of content across the organization, you have the makings of a language model. It might not be large in the sense of a truly large language model, but it would be just the model you need: a model created for one and not for the masses.
It will also require a set of tools to continuously collect, aggregate, and update the enterprise data set in a way that makes it usable by these smaller large language models (sLLMs).
Building these models could pose a challenge. They’ll likely build on existing LLMs, whether open source or from a private company, and then fine-tune them on industry or company data, all in a more secure environment than the generic LLM variety.
This represents a huge opportunity for the startup community, and we see many companies getting a head start on this idea.