Large Language Model ecosystem
Since the initial release of ChatGPT (built on GPT-3.5) in late November 2022 and the enthusiasm it generated, the world of large language models has been changing irreversibly. We have seen huge momentum from companies, startups and the research community announcing their own variants of large language models.
This momentum shows no signs of slowing: as of February 2024, the mainstreaming of AI continues unabated. It has been matched by excitement and demand from businesses eager to experience, experiment with and leverage this power, since for organizations striving to stay ahead in digital transformation, innovation and efficiency are paramount for survival and growth.
As the ecosystem of models has emerged, we have seen proprietary, open-source and quasi-open-source model announcements alike. In a world moving at a frantic pace, the headlines have carried at least one or two significant model announcements every week over the past 12 months.
With so many options to experiment with, and at such breathtaking speed, confusion among creators and businesses wanting to experiment is natural.
Open-source vs Proprietary
The debate over open versus closed source models is an unsettled one. At Cynepia, we believe there is no single answer yet and, like everything else, there is a tradeoff. However, specifically in the context of enterprise innovation, we hold a strong opinion in favor of open-source large language models (LLMs), especially when the use-case value is significant. Their democratization brings the flexibility to choose, fine-tune and contextualize models as needed and, more importantly, to take advantage of the race for model superiority underway, in which the gap between the winning model and the other good options is shrinking. We also believe that in an enterprise context, task- and domain-specific models have a significant role to play in shaping the desired outcome.
That said, a balanced and carefully chosen approach can be the way to go for most enterprises.
Advantages of Open-source in Enterprise Context
To begin with, open-source LLMs offer greater flexibility and freedom compared to proprietary models. They foster an ecosystem of inclusivity and adaptability, granting AI researchers and enterprises unrestricted access to the underlying architecture, data and/or code. Such accessibility is important for personalizing these models to the unique needs of a business.
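As a concrete illustration of that flexibility, the sketch below shows one common way to adapt an open checkpoint with parameter-efficient (LoRA) fine-tuning using the Hugging Face transformers and peft libraries. The checkpoint name and hyperparameters are illustrative assumptions, not a prescription for any particular use case.

```python
# Minimal sketch, assuming the transformers and peft packages are installed.
# The checkpoint and LoRA hyperparameters below are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "mistralai/Mistral-7B-v0.1"  # any open checkpoint you are licensed to use
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Attach low-rank adapters: only the small adapter matrices are trained,
# which keeps the cost of contextualizing the model to domain data low.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights
# ...train the adapters on your own domain dataset from here...
```

Because only the adapter weights change, the base model stays intact and several task-specific adapters can be swapped over the same checkpoint.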
Many of the non-functional requirements of each use case are unique and differentiated, and applying a single model to all of them will most likely fail to meet the use-case or system goal. These non-functional requirements include criteria such as reliability, consistency, performance, inference speed, running cost and expected ROI.
Economic attractiveness, transparency and the freedom to personalize are a few important reasons to consider open-source LLMs.
Challenges in using Open-source LLMs
While open-source LLMs have the significant advantages articulated above, there are many challenges too:
- An ever-evolving and continuously changing landscape that requires careful consideration.
- At the time of writing, open-source models still have catching up to do vis-à-vis top-of-the-line proprietary models in terms of accuracy and various task-specific benchmarks. Picking an open-source model is often determined by the need to contextualize it with other sources of data not seen by proprietary top-of-the-line models, and such data is crucial for the reliability of the end use case.
Popular Open Source Foundational Language Models
There are many popular open-source LLM options available today in public repositories such as Hugging Face. Some of the popular options in the >30B parameter category include the following:
- Meta LLaMA 33B/65B, Llama 2 Chat 70B
- Mistral AI Mixtral 8x7B
- Vicuna 33B
- Falcon 40B
- Yi-34B
In the <30B parameter category there are numerous options; some of the popular ones include the following (a minimal loading sketch follows this list):
- Stanford Alpaca series 7B/13B
- Salesforce CodeGen 16B mono/multi
- Falcon 7B
- Meta LLaMA 7B
- Meta OPT 6.7B/13B
- Vicuna 13B
- Google Gemma 2B/7B
- Mistral 7B / Mistral 7B-Instruct
- Microsoft Phi-2
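To make the above concrete, here is a minimal sketch of pulling one of these open checkpoints from Hugging Face and running local inference with the transformers library. The model id is only an example; any of the listed models could be substituted, subject to its license and your hardware.

```python
# Minimal sketch, assuming the transformers and accelerate packages are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # example; swap in any listed model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "List three benefits of open-source LLMs for enterprises."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```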