1
Model (pink = recently added) | creator | MMLU | LMArena score | SWE-Bench Score | AL score | Parameters (Bn) | Tokens trained (B) | Ratio Tokens | Announced | year | month | date | Lab | Playground | MMLU-Pro | GPQA | Link | Architecture | Note | open access | force label | show only
2
source: LifeArchitect
https://docs.google.com/spreadsheets/d/1kc262HZSMAWI6FVsh0zJwbB-ooYvzhCHaHcNUiA0_hY/edit?gid=1158069878#gid=1158069878
https://huggingface.co/spaces/lmarena-ai/chatbot-arena-leaderboard
where available
https://www.vals.ai/benchmarks/legal_bench-09-08-2025
ALScore
"ALScore" is a quick and dirty rating of the model's power. The formula is:
ALScore = √(Parameters × Tokens) ÷ 300, with Parameters and Tokens both in billions.
Any ALScore ≥ 1.0 counted as a powerful model as of mid-2023.
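The ALScore formula above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `al_score` is hypothetical, and rounding to one decimal place is an assumption about how the sheet displays the value. The inputs are taken from the Chinchilla, GPT-3 and Claude 3 Opus rows of this sheet.

```python
import math


def al_score(params_bn: float, tokens_bn: float) -> float:
    """Square root of (parameters x tokens), divided by 300.

    Both inputs are in billions, matching the "Parameters (Bn)" and
    "Tokens trained (B)" columns. The function name is hypothetical.
    """
    return math.sqrt(params_bn * tokens_bn) / 300


# (parameters in Bn, tokens in B) copied from rows of this sheet.
rows = {
    "Chinchilla": (70, 1_400),
    "GPT-3": (175, 300),
    "Claude 3 Opus": (2_000, 40_000),
}

for name, (params_bn, tokens_bn) in rows.items():
    # Rounding to one decimal is an assumption about the sheet's display.
    print(f"{name}: {round(al_score(params_bn, tokens_bn), 1)}")
```

With these inputs the rounded scores agree with the AL score column for the same three rows (1.0, 0.8 and 29.8).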
Ratio Tokens:Params
(Chinchilla scaling≥20:1)
as numeric
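The Tokens:Params ratio can be sketched the same way. The helper names below are hypothetical, and rounding to the nearest whole number is an assumption about how the sheet formats the ratio; the 20:1 threshold is the Chinchilla scaling figure quoted in the column note.

```python
def tokens_per_param(params_bn: float, tokens_bn: float) -> float:
    """Training tokens per parameter (both inputs in billions)."""
    return tokens_bn / params_bn


def format_ratio(params_bn: float, tokens_bn: float) -> str:
    """Render the ratio in the sheet's "N:1" style (rounding is assumed)."""
    return f"{round(tokens_per_param(params_bn, tokens_bn)):,}:1"


def meets_chinchilla(params_bn: float, tokens_bn: float,
                     threshold: float = 20.0) -> bool:
    """True when the ratio is at least the Chinchilla 20:1 guideline."""
    return tokens_per_param(params_bn, tokens_bn) >= threshold


# Values copied from the Chinchilla, AMD-Llama-135m and GPT-3 rows.
for name, params_bn, tokens_bn in [
    ("Chinchilla", 70, 1_400),
    ("AMD-Llama-135m", 0.135, 670),
    ("GPT-3", 175, 300),
]:
    print(name, format_ratio(params_bn, tokens_bn),
          meets_chinchilla(params_bn, tokens_bn))
```

For these three rows the formatted ratios match the sheet (20:1, 4,963:1 and 2:1), and only GPT-3 falls below the 20:1 guideline.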
3
AMD-Llama-135mother23.00.00.1356704,963:1Sep/2024202495.75AMDhttps://huggingface.co/amd/AMD-Llama-135mhttps://www.amd.com/en/developer/resources/technical-articles/introducing-amd-first-slm-135m-model-fuels-ai-advancements.htmlDenseSmall language model (SLM) trained on 70,000 open access books
4
Apple On-Device Jun 24other26.80.23.041,500494:1Jun/2024202465.50Applehttps://github.com/apple/corenet/tree/main/projects/openelmhttps://arxiv.org/abs/2404.14619DenseYESsignificant models
5
Arcticother67.34.34803,5008:1Apr/2024202445.33
Snowflake AI Research
https://arctic.streamlit.app/https://www.snowflake.com/blog/arctic-open-efficient-foundation-language-models-snowflake/Hybrid
6
Atlasmeta47.90.111404:1Aug/2022202283.67Meta AIhttps://arxiv.org/abs/2208.03299Dense
7
Baichuan 2chinese54.20.6132,600200:1Sep/2023202394.75Baichuan Intelligencehttps://cdn.baichuan-ai.com/paper/Baichuan2-technical-report.pdfDenseChinese open-access equivalent to Meta's Llama modelYES
8
BLOOMother39.10.81763663:1Jul/2022202273.58BigSciencehttps://huggingface.co/spaces/huggingface/bloom_demohttps://github.com/bigscience-workshop/bigscience/tree/master/train/tr11-176B-mlDense(tr11-176B-ml)YESsignificant models
9
BloombergGPTother39.20.65056912:1Mar/2023202334.25Bloomberghttps://arxiv.org/abs/2303.17564DenseFinance-focussed (of course), based on BLOOM, underperforms against GPT 3significant models
10
Bytedance 175Bchinese440.81753002:1Feb/2024202425.17ByteDancehttps://arxiv.org/abs/2402.15627DenseGPT 3-esque
11
Bytedance 530Bchinese441.35303001:1Feb/2024202425.17ByteDancehttps://arxiv.org/abs/2402.15627DenseGPT 3-esqueYES
12
Chameleonmeta65.81.9349,200271:1May/2024202455.42Meta AIhttps://ai.meta.com/resources/models-and-libraries/chameleon-downloads/?gk_enable=chameleon_web_flow_is_livehttps://arxiv.org/abs/2405.09818Dense
13
ChatGPT (gpt-3.5-turbo)openAI70.020Nov/20222022113.92OpenAIhttps://chat.openai.com/28.1https://openai.com/blog/chatgptDensesignificant models
14
Chinchillagoogle67.51.0701,40020:1Mar/2022202233.25DeepMindhttps://arxiv.org/abs/2203.15556DenseFirst to double tokens per size increase
15
ChuXinother0.21.62,3001,438:1May/2024202455.42Independenthttps://huggingface.co/chuxin-llm/Chuxin-1.6B-Basehttps://arxiv.org/abs/2405.04828DenseChuXin-1M performs well across all context window lengths up to 1M (!)
16
Claude 2anthropic78.51.91302,50020:1Jul/2023202384.67Anthropichttps://claude.ai/https://www-files.anthropic.com/production/images/Model-Card-Claude-2.pdfDense
Expanded input and output length (up to 100,000 tokens), allowing the model to analyze long documents such as technical guides or entire books
significant models
17
Claude 2.1anthropic78.51.91302,50020:1Nov/20232023114.92Anthropichttps://claude.ai/https://www.anthropic.com/index/claude-2-1DenseFewer hallucinations, 200k context length, tool useYESsignificant models
18
Claude 3 Opusanthropic86.81,24829.8200040,00020:1Mar/2024202435.25Anthropichttps://claude.ai/68.559.5https://www.anthropic.com/claude-3-model-cardDense200K context window! 1M for researchersYESsignificant models
19
Claude 3.5 Sonnet*anthropic90.51,283Jun/2024202465.50Anthropichttps://poe.com/Claude-3.5-Sonnet7865https://www.anthropic.com/news/claude-3-5-sonnetDenseYESsignificant models
20
Command-Rother37.90.53570020:1Mar/2024202435.25CohereCohere37.9https://txt.cohere.com/command-r/Densehas knowledge outside training set (RAG), tool use
21
Command-R+other75.72.11044,00039:1Apr/2024202445.33Coherehttps://huggingface.co/spaces/CohereForAI/c4ai-command-r-plushttps://huggingface.co/CohereForAI/c4ai-command-r-plusDensebusiness orientated - "purpose-built to excel at real-world enterprise use cases."
22
DBRXother73.74.213212,00091:1Mar/2024202435.25MosaicMLhttps://huggingface.co/spaces/databricks/dbrx-instructhttps://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llmMoE
23
DCLM-Baseline 7B 2.6Tother63.70.472,600372:1Jun/2024202465.50Internationalhttps://huggingface.co/apple/DCLM-Baseline-7Bhttps://arxiv.org/abs/2406.11794Dense
24
DeepSeek-V2chinese54.81,2584.62368,10035:1May/2024202455.42DeepSeek-AIhttps://chat.deepseek.com/54.8https://arxiv.org/abs/2405.04434MoEHuge dataset, 12% Chinese "Therefore, we acknowledge that DeepSeek-V2 still has a slight gap in basic English capabilities with LLaMA3 70B".
25
DeepSeek-V3chinese87.11,31767114,80022:1Dec/20242024126.00DeepSeek-AIhttps://chat.deepseek.com/64.459.1https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdfMoEPrelude to DeepSeek-R1YESYESsignificant models
26
DeepSeek-R1chinese90.81,35767114,80022:1Jan/2025202516.08DeepSeek-AIhttps://chat.deepseek.com/8471.5https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdfMoEMajor reasoning open-source LLM with performance comparable to ChatGPT-o1YESYESsignificant models
27
Ernie 3.5chinese65.1Jul/2023202374.58Baiduhttps://yiyan.baidu.com/https://www.reuters.com/technology/chinas-baidu-unveils-latest-version-its-ernie-ai-model-2023-10-17/Dense"Enhanced Representation through kNowledge IntEgration"
28
Ernie 4.0chinese8614.9100020,00020:1Oct/20232023104.83Baiduhttps://yiyan.baidu.com/https://www.reuters.com/technology/chinas-baidu-unveils-latest-version-its-ernie-ai-model-2023-10-17/DenseEnhanced Representation through kNowledge IntEgration
29
Ernie 4.0 Turbo*chinese8614.9100020,00020:1Jun/2024202465.50Baiduhttps://www.reuters.com/technology/artificial-intelligence/baidu-launches-upgraded-ai-model-says-user-base-hits-300-mln-2024-06-28/https://www.reuters.com/technology/chinas-baidu-unveils-latest-version-its-ernie-ai-model-2023-10-17/Dense"rivals GPT-4 in capabilities"
30
EXAONE 3.0other27.40.87.88,0001,026:1Aug/2024202485.67LGhttps://huggingface.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct27.410.1https://arxiv.org/abs/2408.03541Dense“EXAONE”=“EXpert AI for EveryONE”significant models
31
EXAONE 3.5other78.31.5326,500204:1Dec/20242024126.00LGhttps://huggingface.co/collections/LGAI-EXAONE/exaone-35-674d0e1bb3dcd2ab6f39dbb439.7https://huggingface.co/collections/LGAI-EXAONE/exaone-35-674d0e1bb3dcd2ab6f39dbb4Dense“EXAONE”=“EXpert AI for EveryONE”
32
Falcon 180Bother70.62.61803,50020:1Sep/2023202394.75TIIhttps://huggingface.co/spaces/tiiuae/falcon-180b-demohttps://arxiv.org/abs/2311.16867DenseLargest open-access modelYES
33
Falcon 2 11Bother58.40.8115,500500:1May/202420245TIIhttps://huggingface.co/tiiuae/falcon-11Bhttps://www.tii.ae/news/falcon-2-uaes-technology-innovation-institute-releases-new-ai-model-series-outperforming-metasDensefoundational large language model (LLM) with 11 billion parameters trained on 5.5 trillion tokenssignificant models
34
Falcon Mamba 7Bother62.10.776,000858:1Aug/2024202485.67TIIhttps://falconllm.tii.ae/falcon-models.html14.478.05https://falconllm.tii.ae/tii-releases-first-sslm-with-falcon-mamba-7b.htmlDense
35
Flan-PaLMgoogle73.52.25407802:1Oct/20222022103.83Googlehttps://arxiv.org/abs/2210.11416Dense
36
Galacticameta52.60.81204504:1Nov/20222022113.92Meta AIhttps://galactica.org/https://galactica.org/static/paper.pdfDensescientific onlyYESsignificant models
37
Gemini 1.5 Flashgoogle78.91,271May/2024202455.42Google DeepMindhttps://aistudio.google.com/app/prompts/new_chat59.139.5https://goo.gle/GeminiV1-5MoE1M context length.significant models
38
Gemini 1.5 Progoogle85.91,26022.4150030,00020:1Feb/2024202425.17Google DeepMindhttps://aistudio.google.com/app/prompts/new_chat6946.2https://goo.gle/GeminiV1-5MoE100 languagesYESsignificant models
39
Gemini 2.0google76.41,369Dec/20242024126.00Googlehttps://console.cloud.google.com/vertex-ai/generative/multimodal/create/text?model=gemini-2.0-flash-exp&pli=1&inv=1&invt=Abj6Sg76.462.1https://console.cloud.google.com/vertex-ai/generative/multimodal/create/text?model=gemini-2.0-flash-exp&pli=1&inv=1&invt=Abj6SgMoEintroduces native image generation and controllable text-to-speech capabilitiesYES
40
Gemini Ultra 1.0google83.722.4150030,00020:1Dec/20232023125.00Google DeepMindhttps://deepmind.google/technologies/gemini/35.7https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdfDenseBot, based on Chinchillasignificant models
41
Gemini-1.5google75.81,30122.4150030,00020:1Sep/2024202495.75Google DeepMindhttps://aistudio.google.com/app/prompts/new_chat75.859.1https://developers.googleblog.com/en/updated-production-ready-gemini-models-reduced-15-pro-pricing-increased-rate-limits-and-more/MoEYESsignificant models
42
Gemmagoogle64.30.776,000858:1Feb/2024202425.17Google DeepMindhttps://labs.pplx.ai/33.7https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdfDenseYES
43
Gemma 2google75.21,22022713,000482:1Jun/2024202465.50Google DeepMindhttps://huggingface.co/google/gemma-2-27b-ithttps://storage.googleapis.com/deepmind-media/gemma/gemma-2-report.pdfDenseYES
44
GLM-4chinese81.51,27432004,00020:1Jan/2024202415.08Tsinghuahttps://open.bigmodel.cn/https://pandaily.com/zhipu-ai-unveils-glm-4-model-with-advanced-performance-paralleling-gpt-4/DenseBest Chinese model to date based on analysis. Follows OpenAI roadmapsignificant models
45
Gophergoogle60.01.02803002:1Dec/20212021123.00DeepMindhttps://arxiv.org/abs/2112.11446Dense
46
GPT-2openAI32.40.01.5107:1Feb/2019201920.17OpenAIHugging Facehttps://openai.com/blog/better-language-models/Densetrained on Reddit onlyYESsignificant models
47
GPT-3openAI43.90.81753002:1May/2020202052.65OpenAIhttps://arxiv.org/abs/2005.14165Densesignificant models
48
GPT-4 ClassicopenAI86.41,18615.9176013,0008:1Mar/2023202334.25OpenAIhttps://chat.openai.com/35.7https://cdn.openai.com/papers/gpt-4.pdfMoEsignificant models
49
GPT-4 Turbo*openAI86.41,25613,000Nov/20232023114.92OpenAIhttps://chat.openai.com/46.5https://cdn.openai.com/papers/gpt-4.pdfMoEsignificantly better model than other earlier GPTssignificant models
50
GPT-4o mini*openAI82.01,2731.1813,0001,625:1Jul/2024202475.58OpenAIhttps://chatgpt.com/40.2https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/MoE"OpenAI would not disclose exactly how large GPT-4o mini is, but said it’s roughly in the same tier as other small AI models, such as Llama 3 8b, Claude Haiku and Gemini 1.5 Flash."significant models
51
GPT-4o*openAI88.71,2656.720020,000100:1May/2024202455.42OpenAIhttps://chatgpt.com/72.653.6https://openai.com/index/gpt-4o-system-card/MoElikely early "beta" of GPT-5YESsignificant models
52
GPT-NeoXother33.620Mar/2023202334.25Togetherhttps://huggingface.co/spaces/togethercomputer/OpenChatKithttps://github.com/togethercomputer/OpenChatKitDenseYES
53
Graniteother57.00.6132,500193:1Sep/2023202394.75IBMhttps://www.ibm.com/granitehttps://www.ibm.com/downloads/cas/X9W4O6BMtrained on 2.5T tokens
54
Griffingoogle49.50.21430022:1Feb/2024202425.17Google DeepMindhttps://arxiv.org/abs/2402.19427Dense
55
GRIN MoEmicrosoft79.41.6604,02568:1Sep/2024202495.75Microsofthttps://huggingface.co/microsoft/GRIN-MoEhttps://huggingface.co/microsoft/GRIN-MoE/blob/main/GRIN_MoE.pdfMoE
56
Grok-1.5other81.34.63146,00020:1Mar/2024202435.25xAIhttps://grok.x.ai/https://x.ai/blog/grok-1.5MoETwitter chatbot trained on Twitter data, Context=128k.
57
Grok-2other87.51,28810.060015,00025:1Aug/2024202485.67xAIhttps://x.com/i/grok75.556https://x.ai/blog/grok-2MoETwitter chatbot trained on Twitter dataYESsignificant models
58
H2O-Danube3-4Bother55.20.546,0001,500:1Jul/2024202475.58H2O.aihttps://h2o.ai/platform/danube/personal-gpt/https://arxiv.org/abs/2407.09276Dense"Runs natively and fully offline on mobile phone."
59
Hawkgoogle35.00.2730043:1Feb/2024202425.17Google DeepMindhttps://arxiv.org/abs/2402.19427Dense
60
HLATother41.30.471,800258:1Apr/2024202445.33Amazonhttps://arxiv.org/abs/2404.10630Dense
61
iFlytekSpark-13Bother63.00.7133,000231:1Jan/2024202415.08iFlyTekhttps://gitee.com/iflytekopensource/iFlytekSpark-13Bhttps://www.ithome.com/0/748/030.htmDense
62
Inflection-2.5other85.516.3120020,00017:1Mar/2024202435.25Inflection AIhttps://inflection.ai/inflection-238.4https://inflection.ai/inflection-2-5Dense
63
InternLM2chinese67.70.8202,600130:1Jan/2024202415.08
Shanghai AI Laboratory/SenseTime
https://github.com/InternLM/InternLMhttps://arxiv.org/abs/2403.17297DenseImage-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS
64
InternLM2.5chinese73.50.8202,600130:1Jul/2024202475.58
Shanghai AI Laboratory/SenseTime
https://huggingface.co/internlm/internlm2_5-20b-chat38.4https://github.com/InternLM/InternLM/blob/main/model_cards/internlm2.5_7b.mdDenseImage-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionSYES
65
Jamba 1other67.41.7525,00097:1Mar/2024202435.25AI21https://huggingface.co/ai21labs/Jamba-v0.1https://arxiv.org/abs/2403.19887MoE
66
Jamba 1.5other81.21,2215.93988,00021:1Aug/2024202485.67AI21https://huggingface.co/collections/ai21labs/jamba-15-66c44befa474a917fcf5525153.536.9https://arxiv.org/abs/2408.12570MoE
optimized for business use cases and capabilities such as function calling, structured output (JSON), and grounded generation.
67
JetMoE-8Bother49.20.381,250157:1Apr/2024202445.33MIThttps://www.lepton.ai/playground/chat?model=jetmoe-8b-chathttps://huggingface.co/jetmoe/jetmoe-8bMoE
68
K2other64.81.0651,40022:1May/2024202455.42LLM360https://huggingface.co/LLM360/K2https://www.llm360.ai/blog/several-new-releases-to-further-our-mission.htmlDenseoutperforming Llama 2 70B using 35% less compute.
69
LFM-40Bother78.80.9402,00050:1Sep/2024202495.75Liquid AIhttps://labs.perplexity.ai/55.6https://www.liquid.ai/liquid-foundation-modelsMoE
70
Llama 2meta68.91.2702,00029:1Jul/2023202384.67Meta AIhttps://www.llama2.ai/37.526.26https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/DenseOpen source LLM comes in 3 parameter sizes - 7, 13, and 70 bn, Context window=4096YESsignificant models
71
Llama 3 70Bmeta82.01,2063.47015,000215:1Apr/2024202445.33Meta AIhttps://meta.ai/52.8https://ai.meta.com/blog/meta-llama-3/DenseYESsignificant models
72
Llama 3.1 405Bmeta88.61,2698.240515,00038:1Jul/2024202475.58Meta AIhttps://www.meta.ai/73.351.1https://ai.meta.com/research/publications/the-llama-3-herd-of-models/DenseYESYESsignificant models
73
Llama 3.2 3Bmeta63.40.63.219,0002,804:1Sep/2024202495.75Meta AIhttps://www.llama.com/32.8https://www.llama.com/DenseYESsignificant models
74
Llama 3.3meta86.01,2563.47015,000215:1Dec/20242024126.00Meta AIhttps://huggingface.co/meta-llama/Llama-3.3-70B-Instruct68.950.5https://huggingface.co/meta-llama/Llama-3.3-70B-InstructDenseEnhanced performanceYESYES
75
LLaMA-65Bmeta68.91.0651,40022:1Feb/2023202324.17Meta AIWeights leaked: https://github.com/facebookresearch/llama/pull/73/files https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/DenseResearchers only, noncommercial only.YES
76
Mambaother26.20.12.8300108:1Dec/20232023125.00CMUhttps://huggingface.co/havenhq/mamba-chathttps://arxiv.org/abs/2312.00752Dense
77
MAP-Neoother58.10.674,500643:1May/2024202455.42Internationalhttps://map-neo.github.io/https://arxiv.org/abs/2405.19327DenseYES
78
Minitron-4Bother58.60.149424:1Aug/2024202485.67NVIDIAhttps://huggingface.co/nvidia/Minitron-4B-Basehttps://arxiv.org/abs/2407.14679Dense
79
Minitron-8Bother63.80.149424:1Jul/2024202475.58NVIDIAhttps://huggingface.co/nvidia/Mistral-NeMo-Minitron-8B-Basehttps://blogs.nvidia.com/blog/mistral-nemo-minitron-8b-small-language-model/Densesignificant models
80
Mistral 7Bmistral30.90.37.3800110:1Sep/2023202394.75Mistralhttps://huggingface.co/mistralaihttps://mistral.ai/news/announcing-mistral-7b/Open source, outperforms Llama2YESsignificant models
81
Mistral Largemistral81.21,2515.23008,00027:1Feb/2024202425.17Mistralhttps://poe.com/Mistral-Largehttps://mistral.ai/news/mistral-large/Densenatively fluent in English, French, Spanish, German, and Italiansignificant models
82
Mistral Large 2mistral84.01,2493.31238,00066:1Jul/2024202475.58Mistralhttps://huggingface.co/mistralai/Mistral-Large-Instruct-2407https://mistral.ai/news/mistral-large-2407/Densesignificant models
83
Mistral Smallmistral72.20.573,000429:1Feb/2024202425.17Mistralhttps://chat.mistral.ai/chathttps://mistral.ai/news/mistral-large/DenseOptimised for latency and cost.significant models
84
Mistral-mediummistral75.32.61803,50020:1Dec/20232023125.00Mistralhttps://poe.com/https://mistral.ai/news/la-plateforme/Dense
85
mixtral-8x22bmistral77.81.81412,00015:1Apr/2024202445.33Mistralhttps://huggingface.co/mistral-community/Mixtral-8x22B-v0.1https://mistral.ai/news/mixtral-8x22b/MoE
86
mixtral-8x7b-32kseqlenmistral70.62.046.78,000172:1Dec/20232023125.00Mistralhttps://www.together.ai/blog/mixtral43.3https://arxiv.org/abs/2401.04088MoE"processes input and generates output at the same speed and for the same cost as a 12B model."
87
NeMomistral68.00.5122,000167:1Jul/2024202475.58Mistralhttps://huggingface.co/mistralai/Mistral-Nemo-Base-2407https://mistral.ai/news/mistral-nemo/Dense
88
Nemotron-3 22Bother54.41.0223,800173:1Nov/20232023114.92NVIDIAhttps://huggingface.co/nvidia/nemotron-3-8b-base-4khttps://developer.nvidia.com/blog/nvidia-ai-foundation-models-build-custom-enterprise-chatbots-and-co-pilots-with-production-ready-llms/DenseNvidia's LLM
89
Nemotron-4 15Bother64.21.2158,000534:1Feb/2024202425.17NVIDIAhttps://arxiv.org/abs/2402.16819DenseNvidia's LLM
90
Nemotron-4-340Bother81.15.83409,00027:1Jun/2024202465.50NVIDIAhttps://build.nvidia.com/nvidia/nemotron-4-340b-instructhttps://d1qx31qr3h6wln.cloudfront.net/publications/Nemotron_4_340B_8T.pdfDenseNvidia's LLM. Open-source equiv of Mar/2023 GPT-4YESYESsignificant models
91
NVLM 1.0other82.03.87218,000250:1Sep/2024202495.75NVIDIAhttps://huggingface.co/nvidia/NVLM-D-72Bhttps://arxiv.org/abs/2409.11402DenseFlamingo clone.
92
Nova Pro*other85.91,2443.29010,000112:1Dec/20242024126.00Amazonhttps://aws.amazon.com/bedrock/46.9https://www.aboutamazon.com/news/aws/amazon-nova-artificial-intelligence-bedrock-awsDenseFirst major LLM from Amazon, same performance as Llama 3.2
93
Nova*other88.8May/2024202455.42Rubiks AIhttps://rubiks.ai/https://rubiks.ai/nova/release/Dense
94
o1 pro*openAI92.31,3646.720020,000100:1Dec/20242024126.00OpenAIhttps://chatgpt.com/9179https://chatgpt.com/MoEa version of our most intelligent model that thinks longer for the most reliable responsesYES
95
o1*openAI92.31,3356.720020,000100:1Sep/2024202495.75OpenAIhttps://chatgpt.com/9178.3https://openai.com/index/introducing-openai-o1-preview/MoEYESsignificant models
96
OLMoE-1B-7Bother54.10.76.95,900856:1Sep/2024202495.75Allen AIhttps://huggingface.co/collections/allenai/olmoe-66cf678c047657a30c8cd3da23https://arxiv.org/abs/2409.02060v1MoE
97
OpenELMother26.80.23.041,500494:1Apr/2024202445.33Applehttps://huggingface.co/apple/OpenELM-3B-Instructhttps://arxiv.org/abs/2404.14619Dense
98
Orion-14Bother69.60.6142,500179:1Jan/2024202415.08OrionStarhttps://github.com/OrionStarAI/Orionhttps://arxiv.org/abs/2401.12246DenseEnglish, Chinese, Japanese, Korean, and other languages.
99
Palmyra Xother70.21.0721,20017:1Jan/2024202415.08Writerhttps://writer.com/blog/palmyra-helm-benchmark/Dense
100
phi-3-mediummicrosoft78.20.9144,800343:1Apr/2024202445.33Microsofthttps://huggingface.co/microsoft/Phi-3-medium-128k-instruct55.7https://arxiv.org/abs/2404.14219Dense