Test Case Library
Browse all evaluation dimensions and test cases, and compare the outputs generated by each model.
L-AgentMCP
L-ChinesePinyin
L-Code
L-Comprehension
L-Consistency
L-Context
L-Creative
L-Instruction
L-Knowledge
L-Logic
L-Math
L-Multilingual
L-QA
L-ReasoningChain
L-Roleplay
L-Safety
L-Summary
L-Translation
L-Writing
L-Hallucination
L-CriticalThinking
L-Polish
L-Hallucination
xsct-l
Real-time data query boundaries
L-Hallucination
xsct-l
Identifying fictitious legal provisions
L-Hallucination
xsct-l
Identifying fictitious medical concepts
L-Hallucination
xsct-l
Correcting mistaken cultural common knowledge
L-Hallucination
xsct-l
Future event prediction boundaries
L-Hallucination
xsct-l
Correcting mistaken geographic common knowledge
L-Hallucination
xsct-l
Identifying fictitious companies and business cases
L-Hallucination
xsct-l
Identifying and correcting fictitious scientific laws and physical constants
L-Hallucination
xsct-l
Identifying and correcting fictitious historical events and figures
L-Hallucination
xsct-l