SuperCLUE Chinese Instruction-Following Leaderboard

The SuperCLUE Chinese instruction-following leaderboard tests how well models carry out tasks given as instructions in Chinese.

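| Rank | Model | Organization | Score |
|------|-------|--------------|-------|
| 1 | Gemini 3.1 Pro Preview (High) | Google | |
| 2 | Claude Opus 4.7 (high) | Anthropic | |
| 3 | GPT 5.5 High | OpenAI | |
| 4 | Gemini 3.5 Flash (high) | Google | |
| 5 | Doubao Seed 2.0 Pro 260215 (High) | ByteDance | |
| 6 | DeepSeek V4 Pro (Max) | DeepSeek | |
| 7 | Qwen3.7 Max | Qwen | |
| 8 | Kimi K2.6 | moonshot | |
| 9 | Qwen 3.6 Max Preview | alibaba | |
| 10 | DeepSeek V4 Flash (Max) | DeepSeek | |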

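SuperCLUE Chinese LLM Leaderboard

A Chinese large language model evaluation benchmark that comprehensively assesses AI models' Chinese comprehension and generation abilities.

Instruction-Following Ranking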

| Rank | Model | Organization | Overall | Code | Math | Instruction | Science | Hallucination | Agent | Change |
|------|-------|--------------|---------|------|------|-------------|---------|---------------|-------|--------|
| 1 | Gemini 3.1 Pro Preview (High) | Google | 76 | 81 | 82 | 56 | 72 | 87 | 75 | |
| 2 | Claude Opus 4.7 (high) | Anthropic | 74 | 79 | 81 | 56 | 68 | 81 | 76 | |
| 3 | GPT 5.5 High | OpenAI | 74 | 73 | 82 | 53 | 63 | 87 | 87 | |
| 4 | DeepSeek V4 Pro (Max) | DeepSeek | 70 | 75 | 72 | 49 | 70 | 79 | 78 | |
| 5 | Ernie 5.1 | Baidu | 63 | 58 | 68 | 48 | 58 | 77 | 70 | |
| 6 | Gemini 3.5 Flash (high) | Google | 72 | 71 | 82 | 45 | 75 | 86 | 70 | |
| 7 | Doubao Seed 2.0 Pro 260215 (High) | ByteDance | 70 | 68 | 77 | 44 | 75 | 80 | 76 | 2 |
| 8 | Doubao-Seed-2.0-lite-260428(high) | ByteDance | 66 | 58 | 75 | 40 | 72 | 79 | 73 | |
| 9 | DeepSeek V4 Flash (Max) | DeepSeek | 67 | 67 | 83 | 37 | 72 | 71 | 76 | 3 |
| 10 | Qwen 3.6 Max Preview | alibaba | 67 | 66 | 67 | 32 | 68 | 85 | 83 | |
| 11 | Qwen3.7 Max | Qwen | 70 | 80 | 82 | 31 | 74 | 83 | 71 | |
| 12 | Kimi K2.6 | moonshot | 69 | 76 | 76 | 30 | 70 | 79 | 81 | |
| 13 | GLM 5.1 | Zhipu | 63 | 71 | 70 | 29 | 68 | 75 | 67 | |
| 14 | Grok 4.3 | xAI | 56 | 67 | 58 | 23 | 61 | 71 | 54 | |
| 15 | Minimax M2.7 | MiniMax | 52 | 62 | 65 | 23 | 46 | 57 | 60 | 2 |
| 16 | Qwen3.6-27B(Thinking) | Qwen | 62 | 63 | 68 | 21 | 68 | 77 | 73 | |
| 17 | MiMo V2.5 Pro | Xiaomi | 57 | 68 | 70 | 13 | 67 | 65 | 62 | |
| 18 | Step 3.5 Flash | StepFun | 54 | 63 | 65 | 12 | 60 | 61 | 65 | |
| 19 | Hy3 preview(high) | Unknown | 50 | 56 | 51 | 9 | 58 | 68 | 56 | |
| 20 | Spark X2 | iFlytek | 55 | 51 | 68 | 3 | 70 | 63 | 72 | 3 |
| 21 | Gemma 4 31B | Google | 58 | 66 | 75 | 1 | 67 | 83 | 57 | |