arxiv:2311.00430

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Published on Nov 1, 2023 · Submitted by AK on Nov 2, 2023
#1 Paper of the day

Abstract

Distil-Whisper, a smaller and faster variant of the Whisper model, achieves nearly the same performance with fewer resources and is optimized for low-latency environments.

As the size of pre-trained speech recognition models increases, running these large models in low-latency or resource-constrained environments becomes challenging. In this work, we leverage pseudo-labelling to assemble a large-scale open-source dataset which we use to distill the Whisper model into a smaller variant, called Distil-Whisper. Using a simple word error rate (WER) heuristic, we select only the highest quality pseudo-labels for training. The distilled model is 5.8 times faster with 51% fewer parameters, while performing to within 1% WER on out-of-distribution test data in a zero-shot transfer setting. Distil-Whisper maintains the robustness of the Whisper model to difficult acoustic conditions, while being less prone to hallucination errors on long-form audio. Distil-Whisper is designed to be paired with Whisper for speculative decoding, yielding a 2 times speed-up while mathematically ensuring the same outputs as the original model. To facilitate further research in this domain, we make our training code, inference code and models publicly accessible.
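The WER-based selection of pseudo-labels mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each teacher-generated pseudo-label is compared against an existing reference transcript and kept only when the WER falls at or below a threshold. The abstract does not give these details, so the function names, the dictionary keys, the lowercase-only normalisation, and the `max_wer=0.1` default are all hypothetical choices for illustration, not the authors' implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercasing and whitespace splitting stand in for a real text normaliser.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0

    # Levenshtein distance over words, keeping only one previous row in memory.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def filter_pseudo_labels(examples, max_wer=0.1):
    """Keep examples whose pseudo-label is within max_wer of the reference.

    `max_wer` is a hypothetical threshold chosen for this sketch.
    """
    return [
        ex for ex in examples
        if word_error_rate(ex["reference"], ex["pseudo_label"]) <= max_wer
    ]


examples = [
    {"reference": "the cat sat on the mat", "pseudo_label": "the cat sat on the mat"},
    {"reference": "the cat sat on the mat", "pseudo_label": "the cat sat on a mat"},
    {"reference": "hello world", "pseudo_label": "goodbye"},
]
kept = filter_pseudo_labels(examples, max_wer=0.1)
```

With the threshold at 0.1, only the first example survives: the second has one substitution in six words (WER of about 0.17) and the third differs in every word. A looser threshold keeps more training data at the cost of noisier labels.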

Models citing this paper: 66

Datasets citing this paper: 0

Spaces citing this paper: 345

Collections including this paper: 27