AWQ (Activation-aware Weight Quantization)
Efficient LLM quantization preserving important weight channels.
About
AWQ (Activation-aware Weight Quantization), from MIT HAN Lab, compresses large language and multimodal models to 3- or 4-bit weights. It uses activation statistics to identify the small fraction of salient weight channels and protects them during quantization, which preserves more quality than naive quantization at the same bit-width. It pairs with the TinyChat runtime for efficient 4-bit inference, including for vision-language models. Released under the MIT license.
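To make the idea of protecting salient channels concrete, here is a minimal NumPy sketch of activation-aware scaling. It is an illustration under stated assumptions, not the project's actual implementation: the function names `quantize_rtn` and `awq_style_quantize`, the group size, and the grid-search resolution are all hypothetical choices made for this example. The sketch scales each input channel by a power of its mean activation magnitude before rounding, then undoes the scaling, and keeps the exponent that gives the lowest output error on calibration data.

```python
import numpy as np

def quantize_rtn(w, n_bits=4, group_size=32):
    """Plain round-to-nearest group quantization (the 'naive' baseline).

    Returns the dequantized weights so the error can be measured directly.
    """
    out_f, in_f = w.shape
    assert in_f % group_size == 0, "in_features must be divisible by group_size"
    g = w.reshape(out_f, in_f // group_size, group_size)
    w_min = g.min(axis=-1, keepdims=True)
    w_max = g.max(axis=-1, keepdims=True)
    q_max = 2 ** n_bits - 1
    scale = np.maximum(w_max - w_min, 1e-8) / q_max   # step size per group
    zero = np.round(-w_min / scale)                   # integer zero point
    q = np.clip(np.round(g / scale) + zero, 0, q_max)
    return ((q - zero) * scale).reshape(out_f, in_f)

def awq_style_quantize(w, x, n_bits=4, group_size=32, n_grid=20):
    """Illustrative activation-aware quantization (hypothetical helper).

    w: (out_features, in_features) weight matrix
    x: (n_samples, in_features) calibration activations
    Returns (dequantized weights, chosen exponent alpha, output MSE).
    """
    # Per-input-channel activation magnitude: large values mark salient channels.
    act_mag = np.maximum(np.abs(x).mean(axis=0), 1e-8)
    ref = x @ w.T                                     # full-precision output
    best_w, best_alpha, best_err = None, None, np.inf
    for i in range(n_grid):
        alpha = i / n_grid                            # alpha = 0 is plain RTN
        s = act_mag ** alpha
        s = s / np.sqrt(s.max() * s.min())            # keep scales centered around 1
        # Scale salient channels up before rounding, then fold the scale back.
        w_q = quantize_rtn(w * s, n_bits, group_size) / s
        err = float(np.mean((x @ w_q.T - ref) ** 2))
        if err < best_err:
            best_w, best_alpha, best_err = w_q, alpha, err
    return best_w, best_alpha, best_err

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(256, 128))
    x[:, :4] *= 20.0                                  # a few high-magnitude channels
    w = rng.normal(size=(64, 128))
    w_q, alpha, err = awq_style_quantize(w, x)
    rtn_err = float(np.mean((x @ quantize_rtn(w).T - x @ w.T) ** 2))
    print(f"alpha={alpha:.2f}  awq-style MSE={err:.4f}  RTN MSE={rtn_err:.4f}")
```

Because the search includes an exponent of 0, which reduces to plain round-to-nearest, the selected result in this sketch is never worse than the naive baseline on the calibration data. In a real deployment the scales would be folded into the preceding layer rather than divided out of the dequantized weights.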
Details
- Category: Model Training & Fine-Tuning
- Price: Free
- Platform: Local/Desktop
- Difficulty: Intermediate (3/5)
- License: MIT
- Minimum VRAM: 8 GB
- Added: Apr 3, 2026
Related Tools
No-code tool by Hugging Face for training ML models automatically.
Video model fine-tuning toolkit by Hugging Face Diffusers team.
Low-code framework for building custom AI models by Predibase.
Library for training LLMs with reinforcement learning (RLHF, DPO, PPO).
All-in-one framework for fine-tuning 100+ LLMs with web UI.
Efficient fine-tuning method using 4-bit quantized base model with LoRA adapters.