CLUE1.0 Overall Leaderboard
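Difference between CLUE1.1 and CLUE1.0: unlike the original CLUE1.0, CLUE1.1 uses new test sets for some tasks, while the training and validation sets are unchanged. CLUE1.0 retains the CMNLI natural language inference task.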
| Rank | Model | Organization | Evaluation date | Score1.0 | Certification | AFQMC | TNEWS1.0 | IFLYTEK | OCNLI_50K | CMNLI | WSC1.0 | CSL | CMRC2018 | CHID1.0 | C3 1.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ShenZhou | QQ Browser Lab | 21-09-19 | 85.881 | Pending | 80.55 | 74.15 | 67.65 | 86.37 | 86.49 | 96.55 | 90.97 | 87.85 | 95.58 | 92.65 |
| 2 | HUMAN | CLUE | 19-12-01 | 85.610 | Certified | 81 | 71 | 80.3 | 90.3 | 76 | 98 | 84 | 92.4 | 87.1 | 96 |
| 3 | Mengzi | Langboat Technology-Sinovation Ventures | 21-09-14 | 84.939 | Pending | 81.79 | 75.06 | 65.08 | 82.57 | 86.13 | 96.55 | 89.87 | 83.95 | 96.0 | 92.39 |
| 4 | Motian | QQ Browser Search | 21-06-25 | 84.056 | Pending | 78.3 | 73.18 | 65.46 | 84.97 | 85.44 | 94.83 | 90.17 | 85.3 | 94.42 | 88.49 |
| 5 | BERTSG | Sogou Search | 21-06-25 | 83.824 | Pending | 79.85 | 74.15 | 64.54 | 85.93 | 85.3 | 95.17 | 89 | 83.8 | 93.06 | 87.44 |
| 6 | Pangu | Huawei Cloud-Recurrent AI | 21-04-23 | 83.045 | Pending | 78.11 | 72.07 | 65.19 | 83.3 | 85.19 | 95.52 | 87.73 | 84.45 | 93.25 | 85.64 |
| 7 | MT-BERTs | Meituan NLP | 21-03-10 | 81.065 | Pending | 77.36 | 70.03 | 64.31 | 83.47 | 85.14 | 89.66 | 87.4 | 83.2 | 89.79 | 80.29 |
| 8 | LICHEE | Tencent Kandian | 21-01-08 | 80.507 | Pending | 76.97 | 70.5 | 64.15 | 81.3 | 84.54 | 90.69 | 87.4 | 79.8 | 87.5 | 82.22 |
| 9 | roberta_selfrun | OPPO Xiaobu Assistant | 21-09-29 | 80.238 | Pending | 77.88 | 69.37 | 63.92 | 80.4 | 82.94 | 93.1 | 87.27 | 80.1 | 90.11 | 77.29 |
| 10 | BERTs | BERTs | 20-12-24 | 80.220 | Pending | 76.77 | 69.94 | 63.92 | 82.9 | 84.48 | 88.97 | 86.77 | 80.5 | 89.51 | 78.44 |
| 11 | UER-ensemble | TencentPretrain & TI-ONE | 20-11-28 | 80.086 | Pending | 76.82 | 72.2 | 64 | 80.8 | 84.09 | 90.34 | 85.83 | 79.15 | 86.03 | 81.6 |
| 12 | Archer-24E-SINGLE | search-nlp | 20-12-24 | 79.795 | Pending | 77.26 | 69.54 | 62.27 | 83.57 | 85.23 | 90 | 85.73 | 75.65 | 85.66 | 83.04 |
| 13 | BI-ALBERT | It's me. | 21-03-04 | 79.570 | Pending | 76.04 | 68.27 | 63.81 | 82.17 | 83.25 | 87.93 | 86.77 | 81.25 | 88 | 78.21 |
| 14 | selfrun-ensemble | OPPO Xiaobu Assistant | 20-12-22 | 79.531 | Pending | 76.09 | 69.1 | 63.92 | 80.4 | 82.56 | 91.38 | 87.27 | 78.5 | 88.8 | 77.29 |
| 15 | Archer-24l | search-nlp | 20-11-30 | 79.338 | Pending | 77.44 | 69.96 | 62.69 | 82.57 | 84.78 | 87.24 | 85.17 | 74.05 | 85.41 | 84.07 |
| 16 | NEZHA-large | Huawei Noah's Ark lab | 20-11-14 | 79.289 | Pending | 76.59 | 69.37 | 63.62 | 80.93 | 84.21 | 89.31 | 85.27 | 77.9 | 86.53 | 79.16 |
| 17 | NvWa | Convolutional AI | 21-05-27 | 79.278 | Pending | 76.12 | 66.61 | 61.31 | 81.5 | 84.59 | 88.97 | 86.77 | 70.65 | 89.72 | 86.54 |
| 18 | BI-ALBERT | It's me! | 21-01-25 | 79.206 | Pending | 76.04 | 67.89 | 63.81 | 81.77 | 83.06 | 87.93 | 86.7 | 79.7 | 88 | 77.16 |
| 19 | RoFormerV2 large | Zhuiyi Technology | 22-03-19 | 78.027 | Pending | 76.95 | 58.87 | 62.65 | 75.83 | 81.2 | 86.21 | 84.97 | 80.5 | 87.68 | 85.41 |
| 20 | aoteman | aoteman | 20-12-03 | 77.534 | Pending | 75.83 | 68.75 | 62.65 | 77.13 | 82.54 | 85.52 | 83.43 | 78.15 | 86.57 | 74.77 |
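
The page does not state how Score1.0 is derived. The figures in the table are consistent with an unweighted mean of the ten task scores, and the following minimal sketch checks that for the first three rows. The names `ROWS`, `mean_score`, and `check_rows` are illustrative assumptions, not part of any CLUE tooling, and the averaging formula itself is inferred from the numbers rather than confirmed by the source.

```python
# Task scores copied from the leaderboard rows above, in column order:
# AFQMC, TNEWS1.0, IFLYTEK, OCNLI_50K, CMNLI, WSC1.0, CSL, CMRC2018, CHID1.0, C3 1.0
# Each entry pairs the ten task scores with the reported Score1.0.
ROWS = {
    "ShenZhou": ([80.55, 74.15, 67.65, 86.37, 86.49, 96.55, 90.97, 87.85, 95.58, 92.65], 85.881),
    "HUMAN": ([81, 71, 80.3, 90.3, 76, 98, 84, 92.4, 87.1, 96], 85.610),
    "Mengzi": ([81.79, 75.06, 65.08, 82.57, 86.13, 96.55, 89.87, 83.95, 96.0, 92.39], 84.939),
}


def mean_score(task_scores):
    """Unweighted mean of the task scores, rounded to three decimals (assumed formula)."""
    return round(sum(task_scores) / len(task_scores), 3)


def check_rows(rows=ROWS, tol=1e-6):
    """Return, per model, whether the recomputed mean agrees with the reported score."""
    return {
        name: abs(mean_score(scores) - reported) < tol
        for name, (scores, reported) in rows.items()
    }


if __name__ == "__main__":
    for name, ok in check_rows().items():
        print(name, "matches" if ok else "differs")
```

Because every task carries equal weight under this reading, a large gap on a single task (for example IFLYTEK, where HUMAN scores 80.3 and the top model 67.65) moves the overall score by only a tenth of that gap.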
https://mp.weixin.qq.com/s/PODShmOo0tg9cmchNhzvtw
In-house ShenZhou large pretrained model; parameter count on the order of 10 billion, 2 TB of high-value data (ensemble)
10B
https://github.com/CLUEbenchmark/CLUE
Human evaluation scores
-
https://langboat.com/
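Mengzi pretrained model; parameter count on the order of 1B; several hundred GB of high-quality corpora (web pages, community forums, news, e-commerce, finance, etc.). A Transformer-based denoising pretrained model.
On the order of 1B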
https://mp.weixin.qq.com/s/HQL0Hk49UR6kVNtrvcXEGA
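Motian large pretrained model; parameter count on the order of 1 billion, 1 TB of high-value data, an optimized masking scheme for masked language modeling, a newly developed relative position encoding scheme, and optimized training for large-scale, large-batch pretraining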
1B
None
BERTSG
BERTSG
https://mp.weixin.qq.com/s/gHoeUiZ2b4IvAb-S-wMdtw
Pangu large model
110 billion parameters, 40 TB of industry text data, and more than 4 million hours of industry speech data
MT-BERTs
MT-BERTs
https://mp.weixin.qq.com/s/em_mzM71edVA7XzrkeLDnw
Large-scale pretraining with architectural innovations. https://github.com/BitVoyage/lichee
Large-scale pretraining with architectural innovations
https://github.com/CLUEbenchmark/ZeroCLUE.git
na
na
BERTs
BERTs
https://github.com/dbiir/UER-py
Semi-supervised learning; dynamic masking; span masking; MLM objective; adversarial training; DUMA; WWM; (cross-lingual) multi-task learning; model ensembling
36-layer Transformer
https://github.com/zhouxincheng/super-bert
24-layer ALBERT model
24-layer ALBERT
X.W. S
X.W. S
X.W. S
https://github.com/CLUEbenchmark/CLUE
na
na
https://github.com/zhouxincheng/super-bert
24-layer BERT
-
https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/NEZHA
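The pretrained model is NEZHA. During downstream fine-tuning, the TNEWS task used the keywords provided in the dataset, and the WSC task used data augmentation. The IFLYTEK and MRC tasks used XNLI data for coarse tuning.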
-
aaa
-
-
Not Now!
Not Now!
Not Now!
https://github.com/ZhuiyiTechnology/roformer-v2
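A new version of RoFormer that simplifies the model structure for faster speed and improves performance through unsupervised MLM plus supervised multi-task pretraining.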
314m
https://github.com/
na
na
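ALBERT (Ensemble)
GitHub/model URL:
Submission date: September 17
Score: September 17
More details:
Parameter description
Single-task fine-tuning. For RTE, STS, and MRPC, we start from the MNLI fine-tuned model.
Diagnostic information
Main diagnostic confusion matrix

|   | C | N | E |
|---|---|---|---|
| C | 182 | 36 | 40 |
| N | 81 | 189 | 116 |
| E | 17 | 69 | 374 |

C = contradiction
N = neutral
E = entailment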
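
A confusion matrix like the one above can be summarized by the share of examples on its diagonal, where the two labelings agree. The page does not say whether rows hold gold labels or predictions, so this minimal sketch computes only the overall agreement, which is the same under either reading. The names `LABELS`, `MATRIX`, and `agreement` are illustrative assumptions.

```python
# Counts copied from the diagnostic confusion matrix above.
# Row and column order: C (contradiction), N (neutral), E (entailment).
LABELS = ["C", "N", "E"]
MATRIX = [
    [182, 36, 40],
    [81, 189, 116],
    [17, 69, 374],
]


def agreement(matrix):
    """Fraction of all examples that fall on the diagonal of a square matrix."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total


if __name__ == "__main__":
    print(f"overall agreement: {agreement(MATRIX):.4f}")
```

Here the diagonal holds 745 of the 1,104 examples, so the overall agreement is roughly 67.5%. Per-class precision and recall would require knowing which axis carries the gold labels.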