CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Abstract
CLUENER2020, a fine-grained Chinese NER dataset with diverse categories and challenging tasks, is introduced along with released baselines and a leader-board.
In this paper, we introduce the NER dataset from the CLUE organization (CLUENER2020), a well-defined fine-grained dataset for named entity recognition in Chinese. CLUENER2020 contains 10 categories. Apart from common labels like person, organization, and location, it contains more diverse categories. It is more challenging than other current Chinese NER datasets and could better reflect real-world applications. For comparison, we implement several state-of-the-art baselines as sequence labeling tasks and report human performance, along with an analysis of it. To facilitate future work on fine-grained NER for Chinese, we release our dataset, baselines, and leader-board.
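To make "NER as a sequence labeling task" concrete, the following is a minimal sketch of one common encoding: each character receives a BIO tag, and entity spans are recovered from the tag sequence. The abstract does not state which tagging scheme the baselines use, so the BIO scheme, the function names `spans_to_bio` and `bio_to_spans`, the span format, and the example sentence are all assumptions made for illustration. Only the label names `person` and `location` come from the abstract.

```python
from typing import List, Tuple

# Hypothetical span format: (start index, inclusive end index, label).
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Encode character-level entity spans as one BIO tag per character."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        if not (0 <= start <= end < len(text)):
            raise ValueError(f"span ({start}, {end}) is outside the text")
        tags[start] = f"B-{label}"          # first character of the entity
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{label}"          # remaining characters
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence back into (start, end, label) spans."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        if start is not None and tag != f"I-{label}":
            spans.append((start, i - 1, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Illustrative sentence (not taken from the dataset): "Zhang San works in Beijing".
text = "张三在北京工作"
gold = [(0, 1, "person"), (3, 4, "location")]
tags = spans_to_bio(text, gold)
print(list(zip(text, tags)))
print(bio_to_spans(tags))
```

Under this encoding, a model predicts one tag per character, and span-level evaluation compares the decoded spans against the gold spans.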
Models citing this paper 4
daman043/roberta-base-finetuned-cluener2020-chinese
Datasets citing this paper 1
zjunlp/iepile
Spaces citing this paper 3