FunASR (Paraformer)

Industrial-grade ASR toolkit by Alibaba with Paraformer non-autoregressive models.

Open Source · Self Hosted · Offline Capable · GPU Required (4GB+ VRAM)

About

FunASR is an industrial-grade speech recognition toolkit from Alibaba, published through ModelScope. Rather than betting on a single model, it covers offline, streaming, and edge deployment. Its catalog includes Paraformer, a 220M-parameter non-autoregressive recognizer for Chinese and English with timestamp output; SenseVoiceSmall, which adds emotion recognition and audio event detection across five languages; and larger Fun-ASR and multilingual variants that reach dozens of languages. Around the recognizers sit voice activity detection, punctuation restoration, and speaker diarization modules that combine into complete transcription pipelines. The models are efficient enough that CPU inference can outpace Whisper on GPU, vLLM acceleration is reported at up to 340x realtime, and GGUF builds run through llama.cpp on edge devices without Python. Deployment options include Docker, WebSocket streaming servers, and OpenAI-compatible APIs. The code is MIT licensed, with model weights licensed separately, and the toolkit is widely used in production Chinese ASR systems.
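
As a rough illustration of how the recognizer, voice activity detection, and punctuation modules might be combined, here is a minimal sketch. It assumes the `funasr` Python package exposes an `AutoModel` class and that the model aliases `paraformer-zh`, `fsmn-vad`, and `ct-punc` are available; none of these names come from the listing above, so check them against the project's own documentation. The helpers `build_pipeline_kwargs` and `transcribe` are hypothetical names used only for this example.

```python
def build_pipeline_kwargs(with_vad=True, with_punc=True):
    """Collect constructor arguments for a Paraformer-based pipeline.

    The model aliases below are assumptions, not taken from this listing.
    """
    kwargs = {"model": "paraformer-zh"}      # the recognizer itself
    if with_vad:
        kwargs["vad_model"] = "fsmn-vad"     # splits long audio at silences
    if with_punc:
        kwargs["punc_model"] = "ct-punc"     # restores punctuation in the text
    return kwargs


def transcribe(audio_path, with_vad=True, with_punc=True):
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the configuration helper above can be used
    # even where funasr is not installed.
    from funasr import AutoModel

    model = AutoModel(**build_pipeline_kwargs(with_vad, with_punc))
    results = model.generate(input=audio_path)
    # Assumed result shape: a list with one dict per input, holding a "text" field.
    return results[0]["text"]
```

Calling `transcribe("meeting.wav")` (a placeholder file name) would download the models on first use if the assumptions above hold. Keeping the configuration in a separate function makes it easy to disable a stage, for example `with_punc=False` when raw text is wanted.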

Details

Category: Speech-to-Text / Speech Recognition
Price: Free
Platform: Local/Desktop
Difficulty: Intermediate (3/5)
License: MIT
Minimum VRAM: 4 GB
Added: Apr 3, 2026

Mentioned in

Beyond Whisper: Parakeet, SenseVoice and ASR in 2026

Whisper is no longer the default: how Parakeet, SenseVoice, Kimi-Audio, Ultravox and Moshi compare on...

Max P

Related Tools

Conformer (ESPnet)

ESPnet

Insanely Fast Whisper

Kaldi

Wav2Vec 2.0

Canary (NVIDIA NeMo)
