SuperCLUE Chinese Math Reasoning Leaderboard

The SuperCLUE Chinese math reasoning leaderboard tests how well models solve math problems posed in Chinese.

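| Rank | Model | Organization | Score |
|---|---|---|---|
| 1 | Gemini 3.1 Pro Preview (High) | Google | |
| 2 | GPT 5.5 High | OpenAI | |
| 3 | Claude Opus 4.7 (high) | Anthropic | |
| 4 | Gemini 3.5 Flash (high) | Google | |
| 5 | Doubao Seed 2.0 Pro 260215 (High) | ByteDance | |
| 6 | Qwen3.7 Max | Qwen | |
| 7 | DeepSeek V4 Pro (Max) | DeepSeek | |
| 8 | Kimi K2.6 | Moonshot | |
| 9 | DeepSeek V4 Flash (Max) | DeepSeek | |
| 10 | Qwen 3.6 Max Preview | Alibaba | |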

SuperCLUE Chinese Large Language Model Leaderboard

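A Chinese large language model evaluation benchmark that comprehensively assesses AI models' Chinese understanding and generation abilities.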

Ranking by Math Reasoning Ability

| Rank | Model | Organization | Overall | Code | Math | Instruction | Science | Hallucination | Agent | Change |
|---|---|---|---|---|---|---|---|---|---|---|
| 🥇 1 | DeepSeek V4 Flash (Max) | DeepSeek | 67 | 67 | 83 | 37 | 72 | 71 | 76 | 3 |
| 🥈 2 | Gemini 3.1 Pro Preview (High) | Google | 76 | 81 | 82 | 56 | 72 | 87 | 75 | |
| 🥉 3 | GPT 5.5 High | OpenAI | 74 | 73 | 82 | 53 | 63 | 87 | 87 | |
| 4 | Gemini 3.5 Flash (high) | Google | 72 | 71 | 82 | 45 | 75 | 86 | 70 | |
| 5 | Qwen3.7 Max | Qwen | 70 | 80 | 82 | 31 | 74 | 83 | 71 | |
| 6 | Claude Opus 4.7 (high) | Anthropic | 74 | 79 | 81 | 56 | 68 | 81 | 76 | |
| 7 | Doubao Seed 2.0 Pro 260215 (High) | ByteDance | 70 | 68 | 77 | 44 | 75 | 80 | 76 | 2 |
| 8 | Kimi K2.6 | Moonshot | 69 | 76 | 76 | 30 | 70 | 79 | 81 | |
| 9 | Doubao-Seed-2.0-lite-260428(high) | ByteDance | 66 | 58 | 75 | 40 | 72 | 79 | 73 | |
| 10 | Gemma 4 31B | Google | 58 | 66 | 75 | 1 | 67 | 83 | 57 | |
| 11 | DeepSeek V4 Pro (Max) | DeepSeek | 70 | 75 | 72 | 49 | 70 | 79 | 78 | |
| 12 | GLM 5.1 | Zhipu | 63 | 71 | 70 | 29 | 68 | 75 | 67 | |
| 13 | MiMo V2.5 Pro | Xiaomi | 57 | 68 | 70 | 13 | 67 | 65 | 62 | |
| 14 | Ernie 5.1 | Baidu | 63 | 58 | 68 | 48 | 58 | 77 | 70 | |
| 15 | Qwen3.6-27B(Thinking) | Qwen | 62 | 63 | 68 | 21 | 68 | 77 | 73 | |
| 16 | Spark X2 | iFlytek | 55 | 51 | 68 | 3 | 70 | 63 | 72 | 3 |
| 17 | Qwen 3.6 Max Preview | Alibaba | 67 | 66 | 67 | 32 | 68 | 85 | 83 | |
| 18 | Step 3.5 Flash | StepFun | 54 | 63 | 65 | 12 | 60 | 61 | 65 | |
| 19 | Minimax M2.7 | MiniMax | 52 | 62 | 65 | 23 | 46 | 57 | 60 | 2 |
| 20 | Grok 4.3 | xAI | 56 | 67 | 58 | 23 | 61 | 71 | 54 | |
| 21 | Hy3 preview(high) | Unknown | 50 | 56 | 51 | 9 | 58 | 68 | 56 | |