The E2E Dataset: New Challenges For End-to-End Generation
Abstract
A new, large-scale natural language generation dataset for the restaurant domain presents challenges in lexical richness, syntactic variation, and content selection, offering potential for more varied and natural system outputs.
This paper describes the E2E data, a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area. The E2E dataset poses new challenges: (1) its human reference texts show more lexical richness and syntactic variation, including discourse phenomena; (2) generating from this set requires content selection. As such, learning from this dataset promises more natural, varied and less template-like system utterances. We also establish a baseline on this dataset, which illustrates some of the difficulties associated with this data.
arXiv: 1706.09254