LEGAL-BERT: The Muppets straight out of Law School
Abstract
A systematic investigation of BERT adaptation and fine-tuning strategies for legal domain applications reveals that domain-specific pre-training is essential for optimal performance.
BERT has achieved impressive performance in several NLP tasks. However, there has been limited investigation on its adaptation guidelines in specialised domains. Here we focus on the legal domain, where we explore several approaches for applying BERT models to downstream legal tasks, evaluating on multiple datasets. Our findings indicate that the previous guidelines for pre-training and fine-tuning, often blindly followed, do not always generalize well in the legal domain. Thus we propose a systematic investigation of the available strategies when applying BERT in specialised domains. These are: (a) use the original BERT out of the box, (b) adapt BERT by additional pre-training on domain-specific corpora, and (c) pre-train BERT from scratch on domain-specific corpora. We also propose a broader hyper-parameter search space when fine-tuning for downstream tasks and we release LEGAL-BERT, a family of BERT models intended to assist legal NLP research, computational law, and legal technology applications.
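As a rough illustration of what a broader fine-tuning search space could look like in practice, the sketch below enumerates every combination in a small grid. The hyper-parameter names, the specific values, and the helper `iter_configs` are illustrative assumptions and are not taken from the abstract.

```python
from itertools import product

# Hypothetical search space: these names and values are assumptions
# chosen for illustration, not the grid reported by the authors.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 2e-5, 3e-5, 4e-5, 5e-5],
    "batch_size": [4, 8, 16, 32],
    "dropout": [0.1, 0.2],
}


def iter_configs(space):
    """Yield one dict per combination of hyper-parameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(SEARCH_SPACE))
print(len(configs), "candidate fine-tuning configurations")
```

Each resulting dictionary would typically drive one fine-tuning run, with the final configuration chosen by development-set performance. The same grid could be reused across strategies (a), (b), and (c) so that they are compared under an equal tuning budget.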
Models citing this paper 3
avichr/Legal-heBERT_ft