Whisper

OpenAI's powerful speech recognition model

Open Source, Self Hosted, Offline Capable

About

Whisper is a general-purpose speech recognition model from OpenAI, trained on a large and diverse audio dataset (about 680,000 hours of multilingual, multitask supervised audio for the original release). It is a Transformer sequence-to-sequence model that performs multilingual transcription, speech-to-English translation, and spoken language identification as a single multitask system, and it is robust to accents, background noise, and technical vocabulary. Pretrained model checkpoints and inference code are released under the MIT license.
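
As a rough illustration of the tasks described above, the sketch below calls the `openai-whisper` Python package and turns its timestamped segments into SRT subtitles. It assumes the package is installed (`pip install openai-whisper`) and that `ffmpeg` is available on the PATH. The helper names `format_timestamp`, `segments_to_srt`, and `run_whisper`, and the file name `audio.mp3` in the comments, are illustrative and not part of Whisper itself.

```python
from typing import Any


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict[str, Any]]) -> str:
    """Convert Whisper-style segments (dicts with start, end, text) to SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        span = f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{span}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


def run_whisper(path: str, model_name: str = "base", translate: bool = False) -> dict[str, Any]:
    """Transcribe one audio file, or translate it to English if translate=True.

    The import is deferred so the helpers above work without the package installed.
    """
    import whisper  # assumption: openai-whisper is installed and ffmpeg is on PATH

    model = whisper.load_model(model_name)
    task = "translate" if translate else "transcribe"
    # The result dict includes "text", "segments", and the detected "language".
    return model.transcribe(path, task=task)


# Example use (hypothetical file name):
#   result = run_whisper("audio.mp3")
#   print(result["language"])
#   print(segments_to_srt(result["segments"]))
```

Larger checkpoints generally trade speed for accuracy, so `"base"` is only a convenient default for a first try.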

Details

Category: Audio & Speech
Price: Free
Platform: Local/Desktop
Difficulty: Easy (2/5)
License: MIT
Added: Jan 29, 2026

Tags

speech-to-text transcription openai multilingual

Related Tools

Featured

TextSpeakPro

Free text-to-speech generator with multiple voices, accents, and languages. No signup required.

Beginner

5.0 (1)

Featured

faster-whisper

CTranslate2-based reimplementation of Whisper with up to 4x faster transcription

Open Source, Self Hosted, Offline

Easy

0.0 (0)

BigVGAN

Universal neural vocoder from NVIDIA that converts mel spectrograms into waveforms up to 44 kHz.

Open Source, Self Hosted, Offline, GPU

Intermediate

0.0 (0)

GLM-4-Voice

End-to-end Chinese and English spoken dialogue model from Zhipu AI with streaming speech output.

Open Source, Self Hosted, Offline, GPU

Intermediate

0.0 (0)

Kimi-Audio

Audio foundation model unifying speech recognition, understanding, and conversation in one 7B model.

Open Source, Self Hosted, Offline, GPU

Intermediate

0.0 (0)

Coqui TTS

Deep learning toolkit for text-to-speech synthesis

Open Source, Self Hosted, Offline

Intermediate

0.0 (0)