WhiStress: Enriching Transcriptions with Sentence Stress Detection

WhiStress allows you to detect emphasized words in your speech.

Check out our paper: 📚 WhiStress

Architecture

The model is built on Whisper, using whisper-small.en as the backbone. WhiStress adds a decoder-based classifier that predicts a stress label for each transcription token.
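Because the classifier works at the token level, per-token labels have to be merged into per-word labels before emphasized words can be shown. The sketch below is one possible way to do that, not the WhiStress implementation. It assumes that labels are binary (1 = stressed), that a token beginning with a space starts a new word (as Whisper's decoded tokens typically do), and that a word counts as stressed if any of its tokens is stressed. The function name `tokens_to_word_stress` and the example data are hypothetical.

```python
from typing import List, Tuple


def tokens_to_word_stress(tokens: List[str], labels: List[int]) -> List[Tuple[str, bool]]:
    """Merge per-token stress labels into per-word stress labels.

    Assumptions (not taken from the WhiStress code):
    - a token that starts with a space begins a new word;
    - a word is stressed if at least one of its tokens has label 1.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    words: List[list] = []  # each entry is [word_text, is_stressed]
    for token, label in zip(tokens, labels):
        starts_word = token.startswith(" ") or not words
        if starts_word:
            words.append([token.strip(), bool(label)])
        else:
            # Continuation token: extend the current word and carry stress over.
            words[-1][0] += token
            words[-1][1] = words[-1][1] or bool(label)
    return [(text, stressed) for text, stressed in words]


# Hypothetical tokens and labels, for illustration only.
example_tokens = [" I", " didn", "'t", " say", " that"]
example_labels = [0, 1, 0, 0, 0]
word_stress = tokens_to_word_stress(example_tokens, example_labels)
```

Here the stress on the token " didn" carries over to the whole word "didn't", while the other words stay unstressed.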

Training Data

WhiStress was trained on TinyStress-15K, which is derived from the TinyStories dataset.

Inference Demo

Upload an audio file or record your own voice to transcribe the speech and highlight the emphasized words.
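As a rough illustration of what the output can look like, the sketch below renders a transcript with emphasized words wrapped in a marker. It is not the demo's actual rendering code. The function `render_transcript`, the `**` marker, and the sample words are assumptions made for this example.

```python
from typing import Iterable, Tuple


def render_transcript(words: Iterable[Tuple[str, bool]], marker: str = "**") -> str:
    """Join words into one line, wrapping stressed words in `marker`."""
    return " ".join(
        f"{marker}{word}{marker}" if stressed else word
        for word, stressed in words
    )


# Hypothetical (word, is_stressed) pairs, for illustration only.
demo_words = [("I", False), ("didn't", True), ("say", False), ("that", False)]
rendered = render_transcript(demo_words)
print(rendered)
```

With this sample input the stressed word "didn't" is the only one wrapped in the marker.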

For best results, please speak clearly.