Whisper JAX

JAX-based Whisper implementation optimized for TPU/GPU, with a claimed speedup of over 70x versus the original PyTorch implementation.

Open Source · Self Hosted · Offline Capable · GPU Required (8GB+ VRAM)

About

Whisper JAX reimplements OpenAI's Whisper in JAX and Flax, building on the Hugging Face Transformers codebase, and targets raw transcription throughput on accelerator hardware. Its speed comes from four things: JIT compilation, data parallelism across devices via JAX's pmap, optional float16 or bfloat16 computation, and a batching strategy that splits long audio into 30-second chunks and transcribes them in parallel rather than sequentially. Benchmarks in the repository show a single A100 GPU transcribing an hour of audio in about 75 seconds, against roughly 1,000 seconds for the original PyTorch version, and a TPU v4-8 doing the same in under 14 seconds; that TPU result is the source of the headline claim of over 70 times faster inference. All Whisper checkpoints from tiny through large-v2 are supported for multilingual transcription, translation, and timestamps. The code runs on CPU, GPU, or TPU, either standalone or as a Gradio inference endpoint. It is open source under the Apache 2.0 license and suits researchers and services that batch-process large audio archives on TPU or GPU infrastructure.
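The chunk-and-batch idea described above can be illustrated with a minimal sketch in plain Python. It is not Whisper JAX's own code: the function names `plan_chunks` and `group_batches` and the batch size of 16 are hypothetical, chosen only for illustration, and the sketch ignores any overlap between neighbouring chunks and the stitching of transcripts at chunk boundaries that a real pipeline would need.

```python
import math
from typing import List, Tuple

CHUNK_SECONDS = 30.0  # chunk length taken from the description above


def plan_chunks(duration_s: float, chunk_s: float = CHUNK_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, end) times that cover the audio in fixed-size windows."""
    if duration_s <= 0:
        return []
    n = math.ceil(duration_s / chunk_s)
    # The final window is clipped to the end of the audio.
    return [(i * chunk_s, min((i + 1) * chunk_s, duration_s)) for i in range(n)]


def group_batches(chunks: List[Tuple[float, float]], batch_size: int) -> List[List[Tuple[float, float]]]:
    """Group chunks so that each group could be sent through the model together."""
    return [chunks[i:i + batch_size] for i in range(0, len(chunks), batch_size)]


chunks = plan_chunks(3600.0)         # one hour of audio
batches = group_batches(chunks, 16)  # batch size of 16 is an assumed, illustrative value
print(len(chunks), len(batches))     # prints "120 8"
```

An hour of audio becomes 120 independent chunks, so with a batch of 16 the model runs 8 batched passes instead of walking through the recording window by window. That independence is what lets the work spread across several accelerator devices.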

Details

Category: Speech-to-Text / Speech Recognition
Price: Free
Platform: Local/Desktop
Difficulty: Advanced (4/5)
License: Apache-2.0
Minimum VRAM: 8 GB
Added: Apr 3, 2026
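
As a rough guide to how precision affects memory, the sketch below estimates the size of the model weights alone for each checkpoint. It assumes the approximate parameter counts published for the Whisper checkpoints, and the helper name `weight_gb` is hypothetical. Activations, decoding caches, and compilation workspace add to these figures, so treat the result as a lower bound rather than a VRAM requirement.

```python
# Approximate parameter counts for the Whisper checkpoints, in millions (assumed values).
PARAMS_M = {"tiny": 39, "base": 74, "small": 244, "medium": 769, "large-v2": 1550}

# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_gb(model: str, dtype: str) -> float:
    """Estimate the memory taken by the weights alone, in gigabytes (10^9 bytes)."""
    return PARAMS_M[model] * 1e6 * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("float32", "bfloat16"):
    print(dtype, round(weight_gb("large-v2", dtype), 1))
```

Under these assumptions, half precision halves the weight footprint of large-v2, from about 6.2 GB to about 3.1 GB.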

Related Tools

Conformer (ESPnet)

ESPnet

Insanely Fast Whisper

Kaldi

Wav2Vec 2.0

Canary (NVIDIA NeMo)