CrossNER: Evaluating Cross-Domain Named Entity Recognition
Abstract
A cross-domain NER dataset (CrossNER) and accompanying experiments demonstrate that domain-specialized pre-training strategies improve domain adaptation for NER tasks.
Cross-domain named entity recognition (NER) models can cope with the scarcity of NER samples in target domains. However, most existing NER benchmarks lack domain-specialized entity types or do not focus on a particular domain, which makes cross-domain evaluation less effective. To address these obstacles, we introduce a cross-domain NER dataset (CrossNER), a fully labeled collection of NER data spanning five diverse domains, with specialized entity categories for each domain. We also provide a domain-related corpus, since using it to continue pre-training language models (domain-adaptive pre-training) is effective for domain adaptation. We then conduct comprehensive experiments to explore the effectiveness of leveraging different levels of the domain corpus and different pre-training strategies for domain-adaptive pre-training on the cross-domain task. Results show that focusing on the fraction of the corpus that contains domain-specialized entities and using a more challenging pre-training strategy in domain-adaptive pre-training both benefit NER domain adaptation, and that our proposed method consistently outperforms existing cross-domain NER baselines. Nevertheless, the experiments also illustrate how challenging this cross-domain NER task remains. We hope that our dataset and baselines will catalyze research in NER domain adaptation. The code and data are available at https://github.com/zliucr/CrossNER.
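As a rough illustration of what "focusing on the fraction of the corpus that contains domain-specialized entities" could look like in practice, the sketch below keeps only the sentences that mention at least one term from a domain entity list. The abstract does not describe how the authors actually build this subset, so the function name `select_entity_sentences`, the term list, and the sample sentences are all hypothetical.

```python
import re


def select_entity_sentences(sentences, entity_terms):
    """Keep sentences that mention at least one domain-specialized entity term.

    Hypothetical helper: the matching rule (case-insensitive, whole-term)
    is an assumption, not the paper's documented procedure.
    """
    # Deduplicate terms and try longer ones first so multi-word terms win.
    terms = sorted({t.strip() for t in entity_terms if t.strip()},
                   key=len, reverse=True)
    if not terms:
        return []
    # Lookarounds prevent matching a term inside a longer word.
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )
    return [s for s in sentences if pattern.search(s)]


# Made-up example data for an AI-domain corpus.
corpus = [
    "Gradient descent is used to train neural networks.",
    "The weather was pleasant throughout the conference.",
    "Support vector machines remain a strong baseline.",
]
ai_terms = ["gradient descent", "support vector machines"]

subset = select_entity_sentences(corpus, ai_terms)
```

The resulting `subset` would then serve as the text for continued pre-training of a language model, in place of the full domain corpus.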