Top 10 Open-Source LLMs to Use in Your Next LLM Application
Build LLM-Based Applications Without Running Out of Money
Large language model (LLM) based applications have been on the hype for the last ten months, ever since OpenAI released ChatGPT. Since then, many companies and startups have launched applications and products built on LLMs.
However, using commercial LLMs such as GPT-4 can be very costly in the long run, especially if you would like to work on a side project.
Luckily, there are now very good open-source LLMs that can be used and fine-tuned for your problem. In this article, we will discuss ten of the most powerful open-source LLMs that you can use in your next LLM-based application.
Table of Contents:
LLaMA
Falcon
Dolly
Guanaco
BloomZ
Alpaca
OpenChatKit
GPT4ALL
Vicuna
Flan-T5
Looking to start a career in data science and AI and don't know how? I offer data science mentoring sessions and long-term career mentoring:
Mentoring sessions: https://lnkd.in/dXeg3KPW
Long-term mentoring: https://lnkd.in/dtdUYBrM
All the resources and tools you need to teach yourself Data Science for free!
The best interactive roadmaps for Data Science roles. With links to free learning resources. Start here: https://aigents.co/learn/roadmaps/intro
The search engine for Data Science learning resources. 100K handpicked articles and tutorials. With GPT-powered summaries and explanations. https://aigents.co/learn
Teach yourself Data Science with the help of an AI tutor (powered by GPT-4). https://community.aigents.co/spaces/10362739/
1. LLaMA
The LLaMA project comprises a range of language models with different sizes, from 7 billion to 65 billion parameters. These models have been trained on trillions of tokens, using only publicly available datasets. As a result, the LLaMA-13B model performs better than GPT-3 (175B), while LLaMA-65B is on par with top-performing models such as Chinchilla-70B and PaLM-540B.
License: Apache 2.0
Release Date: May 5, 2023
Github: Source Code
Paper: Meet OpenLLaMA — An Open-Source Reproduction of Meta AI’s LLaMA Large Language Model
2. Falcon
Falcon, most notably Falcon LLM 40B, is a large language model released by the UAE’s Technology Innovation Institute (TII). The 40B refers to the 40 billion parameters used by this large language model. TII has also developed a 7B (7 billion parameter) model trained on 1,500 billion tokens, while the Falcon LLM 40B model is trained on 1 trillion tokens from RefinedWeb. What makes this LLM different from others is that it is transparent and open source.
Falcon is an autoregressive decoder-only model. It was trained on AWS Cloud continuously for two months with 384 GPUs. The pretraining data largely consisted of public data, with a few sources drawn from research papers and social media conversations.
TII has open-sourced Falcon LLM for research and commercial utilization.
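As a rough sketch of how a decoder-only model like Falcon can be queried, assuming the Hugging Face transformers library and the tiiuae/falcon-7b checkpoint on the Hub (the first run downloads several GB of weights, so the heavy call is left commented out):

```python
def build_falcon_pipeline(model_id: str = "tiiuae/falcon-7b"):
    """Return a Hugging Face text-generation pipeline for a Falcon checkpoint."""
    # Imported lazily so that merely defining this helper does not
    # require transformers to be installed.
    from transformers import pipeline
    # Falcon is an autoregressive decoder-only model, so the
    # "text-generation" task is the appropriate pipeline for it.
    return pipeline("text-generation", model=model_id)

# Usage (downloads several GB of weights on first run):
# generator = build_falcon_pipeline()
# result = generator("Open-source LLMs are useful because", max_new_tokens=50)
# print(result[0]["generated_text"])
```

The same pattern works for the 40B model by swapping in its model id, given enough GPU memory.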
3. Dolly
Dolly 2.0 is a LLM from Databricks. It is based on the EleutherAI Pythia model family and has been fine-tuned with a high-quality human-generated instruction dataset. Dolly 2.0 is available for commercial use and can be used for a variety of tasks, including summarization, content generation, and question-answering.
Description: Pythia 12B LLM trained on Databricks ML platform
Params: 12B
License: Apache 2.0
Release Date: Apr 12, 2023
Github: Source Code
Paper: Free Dolly — Introducing the World’s First Truly Open Instruction-Tuned LLM
4. Guanaco
Guanaco is an LLM trained with a finetuning method called QLoRA, developed by Tim Dettmers et al. in the UW NLP group. With QLoRA, it becomes possible to finetune a model of up to 65B parameters on a single 48GB GPU without loss of performance relative to 16-bit finetuning. The Guanaco model family outperforms all previously released models on the Vicuna benchmark. However, because the models are based on the LLaMA model family, commercial use is not permitted.
Description: LLM model released with efficient finetuning approach QLoRA
Params: 65B
License: MIT
Release Date: May 24, 2023
Github: Source Code
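The memory saving behind the 48GB claim is easy to verify with back-of-the-envelope arithmetic: 65 billion weights at 16 bits each need roughly 130GB, while 4-bit quantization (the "Q" in QLoRA) brings that down to roughly 32.5GB, which fits on a single 48GB GPU. A minimal sketch (weight storage only; activations, optimizer state, and the LoRA adapters add overhead on top):

```python
# Back-of-the-envelope weight-memory estimate for a 65B-parameter model,
# comparing 16-bit weights with the 4-bit quantization used by QLoRA.

def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    bytes_total = n_params * bits_per_param / 8  # bits -> bytes
    return bytes_total / 1e9                     # bytes -> GB

n_params = 65e9
fp16_gb = weight_memory_gb(n_params, 16)  # 130.0 GB: too big for one GPU
int4_gb = weight_memory_gb(n_params, 4)   # 32.5 GB: fits on a 48 GB GPU

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
```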
5. Bloom
As a Large Language Model (LLM), BLOOM is trained to continue and complete text from a prompt, which is, in essence, the process of generating text (the words completion, generation, and continuation are often used interchangeably). BLOOM is able to generate text in 46 natural languages and 13 programming languages.
Description: Cross-lingual Generalization through Multitask Finetuning
Params: 176B
License: Apache 2.0
Release Date: Apr 19, 2023
Github: Source Code
Paper: Cross-lingual Generalization through Multitask Finetuning
6. Alpaca
Alpaca is fine-tuned from Meta’s LLaMA 7B model. It was trained on 52K instruction-following demonstrations generated in the style of self-instruct using text-davinci-003. On the self-instruct evaluation set, Alpaca shows many behaviors similar to OpenAI’s text-davinci-003, yet it is surprisingly small and easy and cheap to reproduce.
Description: Stanford’s Instruction-following LLaMA Model
Params: 7B
License: Apache 2.0
Release Date: Mar 13, 2023
Github: Source Code
Paper: Alpaca — A Strong, Replicable Instruction-Following Model
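Instruction-tuned models like Alpaca respond best to prompts in the format they were trained on. A small helper that builds Alpaca-style prompts (the templates below follow the format published in the Stanford Alpaca repository; double-check them against the checkpoint you actually deploy):

```python
# Sketch: building Alpaca-style instruction prompts, with and without
# an optional input field providing extra context.

TEMPLATE_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

TEMPLATE_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)

def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction (and optional input) as an Alpaca-style prompt."""
    if input_text:
        return TEMPLATE_WITH_INPUT.format(instruction=instruction, input=input_text)
    return TEMPLATE_NO_INPUT.format(instruction=instruction)

print(build_prompt("Summarize the following text.", "Open-source LLMs are ..."))
```

The model's answer is then whatever it generates after the final "### Response:" marker.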
7. OpenChatKit
OpenChatKit provides an open-source, powerful set of tools to create generalized and specialized chatbot applications. It is the first version of the model, and developers have released a set of tools and processes to improve the model with the help of community contribution.
Together Computer has released OpenChatKit 0.15 under an Apache-2.0 license that comes with source code, model weights, and training datasets.
You can try the base model demo on Hugging Face: OpenChatKit. It works much like ChatGPT: you write a prompt, and the model responds with an answer, code blocks, tables, or text.
8. GPT4ALL
GPT4All is a free chatbot that runs locally and respects your privacy, so you don’t need a GPU or an internet connection to use it. The GPT4All ecosystem allows you to create and use language models that are powerful and customized to your needs. Best of all, these models run smoothly on consumer-grade CPUs.
You can download the roughly 4GB GPT4All model file and connect it to the open-source GPT4All ecosystem software. Nomic AI, the company behind it, maintains the software ecosystem and strives to empower individuals and organizations to easily train and run their own large language models locally.
Description: Ecosystem to train and deploy powerful and customized LLMs
Params: 7–13B
License: MIT
Release Date: Apr 24, 2023
Github: Source Code
Paper: GPT4All: An ecosystem of open-source on-edge large language models.
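A sketch of running a model locally through the official gpt4all Python bindings (pip install gpt4all). The model file name below is just an example from the GPT4All catalog; once the file is downloaded, no GPU or internet connection is needed:

```python
def ask_local_model(prompt: str,
                    model_name: str = "orca-mini-3b-gguf2-q4_0.gguf") -> str:
    """Generate a reply from a locally running GPT4All model."""
    # Imported lazily so defining this helper does not require gpt4all.
    from gpt4all import GPT4All
    model = GPT4All(model_name)  # downloads the model file on first use
    with model.chat_session():
        return model.generate(prompt, max_tokens=128)

# Usage:
# print(ask_local_model("Explain what a large language model is."))
```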
9. Vicuna
Vicuna-13B is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieves more than 90%* of the quality of OpenAI’s ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90%* of cases. The cost of training Vicuna-13B is around $300. The code, weights, and an online demo are publicly available for non-commercial use.
10. Flan-T5
Flan-T5-XXL is a T5 model fine-tuned on a collection of datasets phrased as instructions. Instruction fine-tuning dramatically improves performance across a variety of model classes such as PaLM, T5, and U-PaLM. The Flan-T5-XXL model is fine-tuned on more than 1,000 additional tasks covering more languages.
Description: Chatbot trained by fine-tuning Flan-t5-xl on user-shared conversations collected from ShareGPT
Params: 3B
License: Apache 2.0
Release Date: Apr 28, 2023
Github: Source Code
Paper: FastChat-T5 — our compact and commercial-friendly chatbot!
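Because Flan-T5 is an encoder-decoder (seq2seq) model, it is loaded with the seq2seq classes in transformers rather than the causal-LM ones. A sketch using the small checkpoint to keep the download manageable; swap in a larger checkpoint such as flan-t5-xxl if you have the hardware:

```python
def answer(prompt: str, model_id: str = "google/flan-t5-small") -> str:
    """Run an instruction-style prompt through a Flan-T5 checkpoint."""
    # Imported lazily so the heavy dependency loads only when called.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Usage (downloads the checkpoint on first run):
# print(answer("Translate English to German: How old are you?"))
```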