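Evaluation Leaderboard
Committed to exploring state-of-the-art large models and providing the industry and research communities with comprehensive, objective, and neutral evaluation references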
| Rank | Model | Score |
|---|---|---|
| 1 | Claude4.5-sonnet-thinking | 37.65 |
| 2 | Gemini3-pro | 37.02 |
| 3 | Doubao-Seed-1.6-Thinking | 36.79 |
| 4 | o4-mini | 34.51 |
| 5 | Grok4 | 34.34 |
| 6 | Hunyuan-2.0-Thinking | 33.38 |
| 7 | DeepSeek-V3.2-thinking | 32.6 |
| 8 | DeepSeek-V3.2-Speciale | 31.42 |
| 9 | GPT-5.2-Thinking | 29.98 |
| 10 | DeepSeek-R1 | 29.43 |
| 11 | Llama4-Maverick | 28.87 |
| 12 | Qwen3-Max | 28.66 |
| 13 | Gemini2.5-pro | 28.48 |
| 14 | Qwen3-32B | 26.12 |
| 15 | ERNIE4.0 | 22.69 |
| 16 | Doubao-pro | 19.13 |
| 17 | GLM4-32B | 14.82 |
| 18 | Baichuan4 | 13.88 |
| Rank | Model | Score |
|---|---|---|
| 1 | Claude4.5-sonnet-thinking | 30.61 |
| 2 | DeepSeek-V3.2-Speciale | 29 |
| 3 | DeepSeek-V3.2-thinking | 25.87 |
| 4 | Doubao-Seed-1.6-Thinking | 24.72 |
| 5 | Gemini3-pro | 23.9 |
| 6 | Grok4 | 23.61 |
| 7 | Hunyuan-2.0-Thinking | 23 |
| 8 | o4-mini | 21.36 |
| 9 | ERNIE4.0 | 20.6 |
| 10 | Qwen3-Max | 20.44 |
| 11 | Llama4-Maverick | 18.76 |
| 12 | GPT-5.2-Thinking | 18.65 |
| 13 | Gemini2.5-pro | 18.42 |
| 14 | DeepSeek-R1 | 17.14 |
| 15 | Baichuan4 | 14.95 |
| 16 | GLM4-32B | 12.36 |
| 17 | Qwen3-32B | 11.32 |
| 18 | Doubao-pro | 9.68 |
| Rank | Model | Score |
|---|---|---|
| 1 | o4-mini | 55.07 |
| 2 | Doubao-Seed-1.6-Thinking | 53.5 |
| 3 | DeepSeek-V3.2-thinking | 51.6 |
| 4 | Hunyuan-2.0-Thinking | 47.61 |
| 5 | Qwen3-32B | 47.33 |
| 6 | DeepSeek-V3.2-Speciale | 46.89 |
| 7 | Claude4.5-sonnet-thinking | 46.61 |
| 8 | DeepSeek-R1 | 45.53 |
| 9 | Grok4 | 45.11 |
| 10 | Gemini3-pro | 41.83 |
| 11 | GPT-5.2-Thinking | 37.56 |
| 12 | Qwen3-Max | 35.98 |
| 13 | Llama4-Maverick | 35.43 |
| 14 | Gemini2.5-pro | 34 |
| 15 | GLM4-32B | 27.33 |
| 16 | ERNIE4.0 | 27.33 |
| 17 | Doubao-pro | 27.07 |
| 18 | Baichuan4 | 20.67 |
| Rank | Model | Score |
|---|---|---|
| 1 | Claude4.5-sonnet-thinking | 53 |
| 2 | GPT-5.2-Thinking | 53 |
| 3 | Gemini3-pro | 48.33 |
| 4 | Hunyuan-2.0-Thinking | 48.33 |
| 5 | o4-mini | 45.4 |
| 6 | Grok4 | 45 |
| 7 | Qwen3-Max | 43 |
| 8 | Doubao-Seed-1.6-Thinking | 41.44 |
| 9 | DeepSeek-V3.2-thinking | 41 |
| 10 | DeepSeek-V3.2-Speciale | 39 |
| 11 | DeepSeek-R1 | 37 |
| 12 | Gemini2.5-pro | 37 |
| 13 | Llama4-Maverick | 35 |
| 14 | Qwen3-32B | 33 |
| 15 | GLM4-32B | 26.5 |
| 16 | Baichuan4 | 15.8 |
| 17 | Doubao-pro | 5.5 |
| 18 | ERNIE4.0 | 2 |
| Rank | Model | Score |
|---|---|---|
| 1 | Gemini2.5-pro | 29.92 |
| 2 | DeepSeek-V3.2-thinking | 27.92 |
| 3 | Gemini3-pro | 26 |
| 4 | Grok4 | 24 |
| 5 | Qwen3-32B | 22 |
| 6 | DeepSeek-R1 | 20.4 |
| 7 | Llama4-Maverick | 20 |
| 8 | Doubao-Seed-1.6-Thinking | 20 |
| 9 | DeepSeek-V3.2-Speciale | 19.83 |
| 10 | o4-mini | 18.09 |
| 11 | GPT-5.2-Thinking | 15.38 |
| 12 | Baichuan4 | 14 |
| 13 | Qwen3-Max | 12 |
| 14 | Hunyuan-2.0-Thinking | 11.54 |
| 15 | Doubao-pro | 11.1 |
| 16 | Claude4.5-sonnet-thinking | 8 |
| 17 | ERNIE4.0 | 7.18 |
| 18 | GLM4-32B | 3.38 |
| Rank | Model | Score |
|---|---|---|
| 1 | Gemini3-pro | 62.44 |
| 2 | Claude4.5-sonnet-thinking | 48.96 |
| 3 | Doubao-Seed-1.6-Thinking | 47.96 |
| 4 | Llama4-Maverick | 45.7 |
| 5 | Grok4 | 44.85 |
| 6 | GPT-5.2-Thinking | 42.74 |
| 7 | Doubao-pro | 41.68 |
| 8 | Hunyuan-2.0-Thinking | 41.4 |
| 9 | Gemini2.5-pro | 40.38 |
| 10 | Qwen3-Max | 39.4 |
| 11 | o4-mini | 39.32 |
| 12 | ERNIE4.0 | 39.04 |
| 13 | DeepSeek-R1 | 36.72 |
| 14 | Qwen3-32B | 29.92 |
| 15 | DeepSeek-V3.2-thinking | 19.1 |
| 16 | DeepSeek-V3.2-Speciale | 16.28 |
| 17 | GLM4-32B | 2.08 |
| 18 | Baichuan4 | 0 |
Domain Leaderboards
| Rank | Model | Score |
|---|---|---|
| 1 | Doubao-Seed-1.6-Thinking | 45.42 |
| 2 | Grok4 | 44.79 |
| 3 | Gemini3-pro | 43.98 |
| 4 | Llama4-Maverick | 43.12 |
| 5 | o4-mini | 42.93 |
| 6 | Claude4.5-sonnet-thinking | 41.42 |
| 7 | Gemini2.5-pro | 37.02 |
| 8 | Hunyuan-2.0-Thinking | 36.02 |
| 9 | Qwen3-32B | 35.47 |
| 10 | GPT-5.2-Thinking | 34.08 |
| 11 | DeepSeek-R1 | 33.39 |
| 12 | Qwen3-Max | 33.14 |
| 13 | Doubao-pro | 26.75 |
| 14 | DeepSeek-V3.2-thinking | 26.59 |
| 15 | ERNIE4.0 | 24.01 |
| 16 | DeepSeek-V3.2-Speciale | 23.87 |
| 17 | GLM4-32B | 14.44 |
| 18 | Baichuan4 | 13.62 |
| Rank | Model | Score |
|---|---|---|
| 1 | Doubao-Seed-1.6-Thinking | 48.33 |
| 2 | DeepSeek-V3.2-Speciale | 48.33 |
| 3 | Gemini3-pro | 43.33 |
| 4 | Hunyuan-2.0-Thinking | 41.67 |
| 5 | Claude4.5-sonnet-thinking | 35 |
| 6 | DeepSeek-V3.2-thinking | 31.67 |
| 7 | ERNIE4.0 | 30 |
| 8 | Grok4 | 30 |
| 9 | Gemini2.5-pro | 26.67 |
| 10 | Qwen3-Max | 25 |
| 11 | GPT-5.2-Thinking | 23.33 |
| 12 | GLM4-32B | 22 |
| 13 | Qwen3-32B | 22 |
| 14 | o4-mini | 21.67 |
| 15 | Baichuan4 | 20 |
| 16 | DeepSeek-R1 | 20 |
| 17 | Llama4-Maverick | 16.67 |
| 18 | Doubao-pro | 16.67 |
| Rank | Model | Score |
|---|---|---|
| 1 | Gemini3-pro | 32 |
| 2 | DeepSeek-V3.2-Speciale | 30 |
| 3 | DeepSeek-V3.2-thinking | 26 |
| 4 | Claude4.5-sonnet-thinking | 22 |
| 5 | Hunyuan-2.0-Thinking | 20 |
| 6 | Qwen3-Max | 18 |
| 7 | GPT-5.2-Thinking | 18 |
| 8 | Gemini2.5-pro | 8 |
| 9 | DeepSeek-R1 | 6 |
| 10 | Qwen3-32B | 2 |
| 11 | Doubao-Seed-1.6-Thinking | 2 |
| 12 | GLM4-32B | 0 |
| 13 | Baichuan4 | 0 |
| 14 | Llama4-Maverick | 0 |
| 15 | Doubao-pro | 0 |
| 16 | o4-mini | 0 |
| 17 | ERNIE4.0 | 0 |
| 18 | Grok4 | 0 |
| Rank | Model | Score |
|---|---|---|
| 1 | Claude4.5-sonnet-thinking | 33.75 |
| 2 | DeepSeek-V3.2-Speciale | 30 |
| 3 | Qwen3-Max | 29.6 |
| 4 | ERNIE4.0 | 29.5 |
| 5 | DeepSeek-V3.2-thinking | 29.17 |
| 6 | Hunyuan-2.0-Thinking | 26.66 |
| 7 | GPT-5.2-Thinking | 25.95 |
| 8 | DeepSeek-R1 | 25.4 |
| 9 | Grok4 | 24.01 |
| 10 | o4-mini | 21.28 |
| 11 | Llama4-Maverick | 18.9 |
| 12 | Gemini3-pro | 18.75 |
| 13 | Baichuan4 | 17.9 |
| 14 | Gemini2.5-pro | 17.56 |
| 15 | GLM4-32B | 15.9 |
| 16 | Doubao-Seed-1.6-Thinking | 15.55 |
| 17 | Doubao-pro | 14.55 |
| 18 | Qwen3-32B | 10.3 |
| Rank | Model | Score |
|---|---|---|
| 1 | DeepSeek-V3.2-thinking | 87.8 |
| 2 | DeepSeek-V3.2-Speciale | 75.67 |
| 3 | o4-mini | 74.2 |
| 4 | Claude4.5-sonnet-thinking | 70.83 |
| 5 | Doubao-Seed-1.6-Thinking | 67.5 |
| 6 | Hunyuan-2.0-Thinking | 57.83 |
| 7 | DeepSeek-R1 | 55.6 |
| 8 | GPT-5.2-Thinking | 46.67 |
| 9 | Gemini3-pro | 41.5 |
| 10 | Grok4 | 41.32 |
| 11 | GLM4-32B | 34 |
| 12 | Qwen3-32B | 34 |
| 13 | ERNIE4.0 | 32 |
| 14 | Qwen3-Max | 30.93 |
| 15 | Gemini2.5-pro | 20 |
| 16 | Baichuan4 | 15 |
| 17 | Llama4-Maverick | 13.3 |
| 18 | Doubao-pro | 12.2 |
