LLM Worksheet

1	Organization	Model Name	Parameters	Context	Type	License	Release	Notes	See Also	Properties

2	Skywork	Skywork-MoE	22Ba146B	8K	EN/CN	Lawful	2024-06-03	https://github.com/SkyworkAI/Skywork-MoE
3	LLM360	K2	65B	2K	General	Apache 2.0	2024-05-29	https://www.llm360.ai/blog/several-new-releases-to-further-our-mission.html
4	IEIT-Yuan	Yuan2.0-M32	3.7Ba40B	8K		Unknown	2024-05-28	https://huggingface.co/IEITYuan/Yuan2-M32-hf
5	DeepSeek	DeepSeek-V2-Lite	2.4Ba16B	32K	EN/CN	Lawful	2024-05-16	https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite
6	IBM	Granite	3B, 8B, 20B, 34B	2K-8K	Code	Apache 2.0	2024-05-07	https://research.ibm.com/blog/granite-code-models-open-source
7	DeepSeek	DeepSeek-V2	21Ba236B	128K	EN/CN	Lawful	2024-05-06	https://github.com/deepseek-ai/DeepSeek-V2
8	Snowflake	Arctic	17Ba480B	4K		Apache 2.0	2024-04-24	https://www.snowflake.com/blog/arctic-open-efficient-foundation-language-models-snowflake/
9	Microsoft	Phi-3	3.8B, 7B, 14B	4-128K		MIT	2024-04-23	https://news.microsoft.com/source/features/ai/the-phi-3-small-language-models-with-big-potential/
10	Meta	Llama 3	8B, 70B, 405B	8K		<700M MAU	2024-04-18	https://ai.meta.com/blog/meta-llama-3/
11	Mistral	Mixtral 8x22B	39Ba141B	64K		Apache 2.0	2024-04-17	https://mistral.ai/news/mixtral-8x22b/
12	Hugging Face	Idefics2	8B	32K	Vision	Apache 2.0	2024-04-15	https://huggingface.co/blog/idefics2
13	Cohere	Command-R+	104B	128K	Multilingual	CC-BY-NC	2024-04-04	https://cohere.com/blog/command-r-plus-microsoft-azure
14	AI21	Jamba	12Ba52B	256K		Apache 2.0	2024-03-28	https://www.ai21.com/blog/announcing-jamba
15	Databricks	DBRX	36Ba132B	32K		Apache 2.0	2024-03-27	https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm
16	xAI	Grok-1	79Ba314B	8K		Apache 2.0	2024-03-17	https://x.ai/blog/grok-os
17	Cohere	Command-R	35B	128K	Multilingual	CC-BY-NC	2024-03-11	https://cohere.com/blog/command-r
18	BigCode	StarCoder2	3B, 7B, 15B	16K	Code	Apache 2.0	2024-02-28	https://huggingface.co/blog/starcoder2
19	Alibaba	Qwen 1.5	0.5B, 1.8B, 4B, 7B, 14B, 72B	32K		<100M MAU	2024-02-04	https://qwenlm.github.io/blog/qwen1.5/
20	OpenBMB	MiniCPM	2.7B			China	2024-02-01	https://openbmb.vercel.app/minicpm-en
21	Allen Institute	OLMo	7B	2K		Apache 2.0	2024-02-01	https://allenai.org/olmo
22	LLaVA	LLaVA-NeXT	7B, 13B, 34B	4K-32K	Vision	Various	2024-01-30	https://llava-vl.github.io/blog/2024-01-30-llava-next/
23	BlinkDL	RWKV-5	7B	4K+		Apache 2.0	2024-01-28	https://twitter.com/BlinkDL_AI/status/1751542433039651304
24	ORION STAR	Orion	14B	4K, 200K	Multilingual	Lawful/+Comm	2024-01-22	https://github.com/OrionStarAI/Orion
25	InternLM	InternLM2	7B, 20B	200K	EN/CN	Lawful/+Comm	2024-01-17	https://github.com/InternLM/InternLM
26	Mistral	Mixtral 8x7B	13Ba47B			Apache 2.0	2023-12-11	https://mistral.ai/news/mixtral-of-experts/
27	XVERSE	XVERSE	7B, 13B, 65B, 65B-2	16K	EN/CN	Lawful/+Comm	2023-12-08	https://github.com/xverse-ai
28	DeepSeek	LLM	7B, 67B	4K	EN/CN	Lawful	2023-11-29	https://github.com/deepseek-ai/DeepSeek-LLM
29	DeepSeek	Coder	1.3B, 6.7B, 33B	16K	Code	MIT	2023-11-03	https://deepseekcoder.github.io/
30	01.ai	Yi	6B, 34B	4K	EN/CN	Apache 2.0	2023-11-02	https://github.com/01-ai/Yi
31	Mistral	Mistral	7B	4K	General	Apache 2.0	2023-09-27	https://mistral.ai/news/announcing-mistral-7b/
32	CofeAI	FLM	101B	2K+	EN/CN	Apache 2.0	2023-09-08	Terrible perf https://www.reddit.com/r/LocalLLaMA/comments/16danhb/flm101b_an_open_llm_and_how_to_train_it_with_100k/
33	Meta	CodeLlama	7B, 13B, 34B	16K	Code	<700M MAU	2023-08-24	https://about.fb.com/news/2023/08/code-llama-ai-for-coding/
34	matsuo-lab	weblab	10B	2K	JP	CC-BY-NC 4.0	2023-08-18	https://weblab.t.u-tokyo.ac.jp/100%E5%84%84%E3%83%91%E3%83%A9%E3%83%A1%E3%83%BC%E3%82%BF%E3%82%B5%E3%82%A4%E3%82%BA%E3%83%BB%E6%97%A5%E8%8B%B12%E3%83%B6%E5%9B%BD%E8%AA%9E%E5%AF%BE%E5%BF%9C%E3%81%AE%E5%A4%A7%E8%A6%8F%E6%A8%A1/
35	LINE	japanese-large-lm	3.6B	2K	JP	Apache 2.0	2023-08-14	https://engineering.linecorp.com/ja/blog/3.6-billion-parameter-japanese-language-model
36	Stability AI	Japanese StableLM	7B	2K	JP	Apache 2.0	2023-08-10	https://stability.ai/blog/stability-ai-new-jplm-japanese-language-model-stablelm
37	Alibaba	Qwen	7B, 14B	8K	EN/CN	<100M MAU	2023-08-03	https://github.com/QwenLM/Qwen-7B
38	Meta	Llama 2	7B, 13B, 34B, 70B	4K	General	<700M MAU	2023-07-18	https://ai.meta.com/llama/
39	Salesforce	CodeGen2.5	7B		Code	Apache 2.0	2023-07-06	https://blog.salesforceairesearch.com/codegen25/
40	Salesforce	XGen	7B		General/Code	Apache 2.0	2023-06-28	https://blog.salesforceairesearch.com/xgen/
41	BAAI	Aquila	7B, 33B		EN/CN	Lawful SA	2023-06-09
42	TII	Falcon	7B, 40B	2K	General	Apache 2.0	2023-05-25	license changed 5/31
43	s-JoL	Open-Llama	7B		General	MIT	2023-05-11	https://github.com/s-JoL/Open-Llama
44	conceptofmind	PaLM	1B		General	MIT	2023-05-08	Open Source Reimplementation of Google's PaLM, only C4 trained https://www.reddit.com/r/MachineLearning/comments/13bxu2g/p_opensource_palm_models_trained_at_8k_context/
45	RedPajama	INCITE	3B, 7B		General	Apache 2.0	2023-05-05	https://www.together.xyz/blog/redpajama-models-v1
46	BigCode	StarCoder	15.5B		Code	OpenRAIL	2023-05-04
47	openlm-research	OpenLLaMA	3B, 7B, 13B, 20B		General	Apache 2.0	2023-05-02	All models done training to 1T
48	MosaicML	MPT	1B, 7B, 30B		General	Apache 2.0	2023-04-20	More being trained https://twitter.com/jefrankle/status/1649060478910357504
49	Stability AI	StableLM	3B, 7B, 15B, 30B		General	CC-BY-SA 4.0	2023-04-19	Still training (alpha checkpoint not good)
50	NVIDIA	GPT-2B-001	2B		General	CC-BY 4.0	2023-04-17
51	GeoV	GeoV	9B		General	OpenRAIL	2023-04-02	Still Training (checkpoints available)
52	Cerebras	Cerebras-GPT	1.3B, 2.7B, 6.7B, 13B		General	Apache 2.0	2023-03-28
53	Anthropic	Claude	claude-instant-1, claude-2	100K	General	Commercial API	2023-03-14	Wait List
54	OpenAI	GPT-4	1.8T?	32K	General	Commercial API	2023-03-14	Wait List
55	Together	GPT-JT-Moderation	6B		Moderation	Apache 2.0	2023-03-10
56	Together	GPT-NeoXT-Chat-Base	20B		Instruction	Apache 2.0	2023-03-10
57	AI21	J2	7.5B, 17B, 178B		General	Commercial API	2023-03-09
58	Meta	LLaMA	7B, 13B, 33B, 65B	2K	General	NC Research	2023-02-24
59	BlinkDL	RWKV	1B, 3B, 7B, 14B	4K+	General	Apache 2.0	2023-02-15
60	EleutherAI	Pythia	1B, 1.4B, 2.8B, 6.9B, 12B		General	Apache 2.0	2023-02-13
61	BigCode	Santacoder	1.1B		Code	OpenRAIL	2022-12-22
62	Stanford	BioMedLM	2.7B		Academic (Bio)	RAIL	2022-12-16
63	EleutherAI	Polyglot	1.3B, 3.8B, 5.8B		KO	Apache 2.0	2022-12-15
64	OpenAI	GPT-3.5	175B?		General	Commercial API	2022-11-30
65	Meta	Galactica	120B		Academic	CC-BY-NC 4.0	2022-11-16
66	Cohere	cohere	6B, 13B, 52B		General	Commercial API	2022-11-08
67	Google	Flan-T5	3B, 11B		General	Apache 2.0	2022-10-22
68	OpenBMB	CPM-Ant	1B, 3B, 7B, 10B		EN/CN	GML Open	2022-10-12
69	NVIDIA	NeMo	1.3B, 5B, 20B		General	CC-BY 4.0	2022-09-15
70	THUDM	GLM-130B	130B		EN/CN	NC China	2022-08-04
71	BigScience	BLOOM	1B, 3B, 7B, 176B		Multilingual	OpenRAIL	2022-07-12
72	Yandex	YaLM	100B		EN/RU	Apache 2.0	2022-06-23
73	Meta	OPT	1.3B, 2.7B, 13B, 30B, 66B, 175B		General	NC Research	2022-05-03
74	Salesforce	CodeGen	2B, 6B, 16B		Code	BSD	2022-04-06
75	EleutherAI	GPT-Neo	1.3B, 2.7B		General	Apache 2.0	2022-03-21
76	Google	Flan-UL2	20B		General	Apache 2.0	2022-03-03
77	EleutherAI	GPT-NeoX	20B		General	Apache 2.0	2022-02-02
78	Huawei	PanGu-α	2.6B, 13B, (200B)		EN/CN	Apache 2.0	2021-12-30
79	Meta	Fairseq	1.3B, 2.7B, 6.7B, 13B		General	MIT	2021-12-21
80	OpenAI	Codex	cushman, davinci		Code	Commercial API	2021-08-10
81	EleutherAI	GPT-J	6B		General	Apache 2.0	2021-06-04
82	Google	mT5	1.2B, 3.7B, 13B		Multilingual	Apache 2.0	2020-12-02
83	OpenAI	GPT-3	ada, babbage, curie, davinci		General	Commercial API	2020-06-11
84	Meta	Megatron	11B		General	MIT	2020-04-04