pCLUE: a multi-task prompt learning dataset with 1,200,000+ examples
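The model description must include the keyword "pCLUE". The pCLUE leaderboard opened on 2022-10-01. The "p" stands for prompt learning. Because the dataset covers multiple tasks, it is also a multi-task learning dataset. The tasks fall into four types: classification, reading comprehension, inference, and generation.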
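Submissions must be made under a real identity: the team name, model name, URL/GitHub, and model description must all be genuine and valid. Meaningless submissions will be removed. For questions, email CLUEbenchmark@163.com. A valid submission must perform multi-task learning and prediction with a single model.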
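The submission rules above can be checked mechanically before uploading. The following is a minimal sketch, not the leaderboard's own validation: the `Submission` class, its field names, and the case-insensitive keyword match are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Keyword that the rules require in the model description.
REQUIRED_KEYWORD = "pCLUE"


@dataclass
class Submission:
    # Hypothetical field names; the real submission form may differ.
    team_name: str
    model_name: str
    url: str
    description: str


def validation_errors(sub: Submission) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    errors = []
    # Every metadata field must be filled in.
    for field_name in ("team_name", "model_name", "url", "description"):
        if not getattr(sub, field_name).strip():
            errors.append(f"{field_name} is empty")
    # Assumption: the keyword match ignores case.
    if REQUIRED_KEYWORD.lower() not in sub.description.lower():
        errors.append(f'description does not mention "{REQUIRED_KEYWORD}"')
    return errors


if __name__ == "__main__":
    example = Submission("my-team", "my-model", "https://github.com/CLUEbenchmark/pCLUE", "pCLUE, prompt")
    print(validation_errors(example))
```

A check like this only covers the formal rules; whether the metadata is genuine, and whether one model was used for all tasks, still has to be judged by the organizers.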
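Rank | Model | Institution | Evaluation date | Score | Reading comprehension F1 | Reading comprehension EM | Classification (acc) | Inference (acc) | Generation (rouge-l)* |
---|---|---|---|---|---|---|---|---|---|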
1 | Human | CLUE | 22-10-01 | 0.812 | 0.973 | 0.918 | 0.800 | 0.903 | 0.600 |
2 | nlutest2-large-gen | nucky | 23-09-07 | 0.589 | 0.790 | 0.680 | 0.579 | 0.686 | 0.357 |
3 | nlutest2-gen | nucky | 23-08-29 | 0.569 | 0.761 | 0.651 | 0.575 | 0.638 | 0.357 |
4 | pvpvpvpv | threecolor | 22-11-27 | 0.561 | 0.775 | 0.649 | 0.593 | 0.580 | 0.359 |
5 | hemu_pclue_merge | hemu | 23-02-21 | 0.552 | 0.713 | 0.585 | 0.585 | 0.627 | 0.347 |
6 | p_10_2_12 | threecolor | 22-11-21 | 0.550 | 0.748 | 0.621 | 0.584 | 0.575 | 0.356 |
7 | multi_rm_20_2_12 | threecolor | 22-11-23 | 0.549 | 0.746 | 0.615 | 0.579 | 0.577 | 0.358 |
8 | base_test_transformer_3_24_m | threecolor | 22-11-08 | 0.548 | 0.747 | 0.617 | 0.583 | 0.570 | 0.357 |
9 | multi_fv_10_2_12 | threecolor | 22-11-23 | 0.548 | 0.750 | 0.622 | 0.578 | 0.569 | 0.359 |
10 | pclue_mengzi_param7-true | cxy | 22-11-23 | 0.548 | 0.747 | 0.618 | 0.583 | 0.570 | 0.357 |
11 | hemu_pclue_new | hemu | 23-02-21 | 0.548 | 0.713 | 0.585 | 0.571 | 0.627 | 0.347 |
12 | p_20?_2_12 | threecolor | 22-11-21 | 0.547 | 0.747 | 0.619 | 0.583 | 0.568 | 0.354 |
13 | pclue_mengzi_param5 | cxy | 22-11-23 | 0.547 | 0.748 | 0.619 | 0.583 | 0.570 | 0.351 |
14 | pclue_mengzi_param6 | cxy | 22-11-23 | 0.547 | 0.748 | 0.619 | 0.583 | 0.570 | 0.353 |
15 | pclue_mengzi_param7 | cxy | 22-11-23 | 0.547 | 0.748 | 0.619 | 0.583 | 0.570 | 0.353 |
16 | single_rm_10_2_12 | threecolor | 22-11-23 | 0.547 | 0.751 | 0.621 | 0.579 | 0.565 | 0.359 |
17 | multi_rm_10_2_12 | threecolor | 22-11-23 | 0.547 | 0.751 | 0.622 | 0.576 | 0.566 | 0.359 |
18 | pclue_mengzi_param4 | cxy | 22-11-22 | 0.546 | 0.748 | 0.618 | 0.583 | 0.570 | 0.350 |
19 | mmodel | mzy | 23-04-25 | 0.546 | 0.736 | 0.594 | 0.596 | 0.575 | 0.349 |
20 | ixm | ixm | 23-04-25 | 0.546 | 0.736 | 0.594 | 0.596 | 0.575 | 0.349 |
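The page does not state how the overall Score is computed. The values in the table are consistent with an unweighted mean over the four task types, where reading comprehension is first averaged over F1 and EM. The sketch below checks that assumption against a few rows copied from the table; the function name `overall_score` is hypothetical and the formula is inferred from the numbers, not taken from the organizers.

```python
def overall_score(rc_f1, rc_em, cls_acc, inf_acc, gen_rouge_l):
    """Assumed aggregation: mean of the four task-type scores.

    Reading comprehension is represented by the mean of F1 and EM.
    """
    reading_comprehension = (rc_f1 + rc_em) / 2
    return (reading_comprehension + cls_acc + inf_acc + gen_rouge_l) / 4


# (reported Score, RC F1, RC EM, classification, inference, generation)
rows = {
    "Human": (0.812, 0.973, 0.918, 0.800, 0.903, 0.600),
    "nlutest2-large-gen": (0.589, 0.790, 0.680, 0.579, 0.686, 0.357),
    "pvpvpvpv": (0.561, 0.775, 0.649, 0.593, 0.580, 0.359),
}

for name, (reported, *parts) in rows.items():
    recomputed = round(overall_score(*parts), 3)
    print(f"{name}: reported={reported:.3f} recomputed={recomputed:.3f}")
```

For the rows shown, the recomputed value matches the reported Score to three decimals, which suggests that a high classification or inference accuracy counts as much as the whole reading comprehension pair.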
Submission details for the 20 entries, in page order. Every entry's diagnostic confusion matrix (C/E/N) is reported as all zeros.
# | URL | Model description | Parameters |
---|---|---|---|
1 | https://github.com/CLUEbenchmark/pCLUE | Human-level performance (not a model), pclue | Parameter count of the human brain |
2 | https://code.alibaba-inc.com/fubang.zfb/uie-siamese/blob/master/code/main.py | pCLUE general-purpose natural language understanding | 350M |
3 | https://code.alibaba-inc.com/fubang.zfb/uie-siamese/blob/master/code/main.py | pCLUE general-purpose natural language understanding | 200M |
4 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, prompt | 220M |
5 | https://github.com/CLUEbenchmark/pCLUE | pCLUE | 13B |
6 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, prompt | 220M |
7 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, prompt | 220M |
8 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, PLM change | 220M |
9 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, prompt | 220M |
10 | https://github.com/CLUEbenchmark/pCLUE | pCLUE | 220M |
11 | https://github.com/CLUEbenchmark/pCLUE | pCLUE | 13B |
12 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, prompt | 220M |
13 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, mengzi, test sampling 0.3 | 220M |
14 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, mengzi, test sampling 0.2 | 220M |
15 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, mengzi, test sampling 0.5 | 220M |
16 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, prompt | 220M |
17 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, prompt | 220M |
18 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, mengzi, test sampling 0.4 | 220M |
19 | https://github.com/CLUEbenchmark/pCLUE | pCLUE, mixed multi-round pre-training | 1.3B |
20 | https://github.com/CLUEbenchmark/pCLUE | pCLUE | 1.3B |
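ALBERT (Ensemble)
GitHub/model URL:
Submission date: September 17
Score: September 17
More details:
Parameter description
Single-task fine-tuning. For RTE, STS, and MRPC, we start from the model fine-tuned on MNLI.
Diagnostic information
Main diagnostic confusion matrix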
| C | N | E |
---|---|---|---|
C | 182 | 36 | 40 |
N | 81 | 189 | 116 |
E | 17 | 69 | 374 |
C = contradiction
N = neutral
E = entailment
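A confusion matrix like the one above can be summarized with a few lines of code. This is a minimal sketch: the counts are copied from the matrix, while the helper names are hypothetical. The page does not say whether rows hold gold labels or predictions, so the per-row figure is described neutrally as the share of each row that lies on the diagonal; overall accuracy does not depend on that orientation.

```python
LABELS = ["C", "N", "E"]  # contradiction, neutral, entailment

# Counts copied from the diagnostic confusion matrix (row order C, N, E).
MATRIX = [
    [182, 36, 40],
    [81, 189, 116],
    [17, 69, 374],
]


def total_count(matrix):
    """Number of examples covered by the matrix."""
    return sum(sum(row) for row in matrix)


def accuracy(matrix):
    """Fraction of examples on the diagonal (row label equals column label)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total_count(matrix)


def diagonal_share_per_row(matrix, labels):
    """For each row label, the fraction of that row's count on the diagonal."""
    return {
        label: matrix[i][i] / sum(matrix[i])
        for i, label in enumerate(labels)
    }


if __name__ == "__main__":
    print("examples:", total_count(MATRIX))
    print("accuracy:", round(accuracy(MATRIX), 3))
    for label, share in diagonal_share_per_row(MATRIX, LABELS).items():
        print(label, round(share, 3))
```

The neutral row has the smallest diagonal share of the three, so most of the disagreement in this matrix involves the N label.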