GPTQ (Quantization)

Post-training quantization method for compressing large language models.

Open Source · Self Hosted · Offline Capable · GPU Required (8GB+ VRAM)

About

GPTQ is a one-shot post-training quantization method for large language models, introduced in the ICLR 2023 paper "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers". It quantizes model weights to 4-bit or 3-bit precision layer by layer, using approximate second-order information gathered from a small calibration set, with little quality loss. An optional activation-order heuristic further improves accuracy on outlier-heavy models. At 4 bits, weights take roughly a quarter of the memory they need at 16 bits, which is what makes large models runnable on consumer GPUs. The reference implementation is openly available.
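
To make the idea concrete, here is a minimal NumPy sketch of the column-by-column update at the core of the method, assuming a single linear layer with weights W and calibration inputs X. It is an illustration under simplifying assumptions, not the reference implementation: it omits blocking, grouping, and the activation-order heuristic, and the names `gptq_sketch`, `row_grid`, and `quantize_rtn`, the per-row min/max grid, and the `damp` default are all choices made here for the example.

```python
import numpy as np


def quantize_rtn(w, scale, zero, maxq):
    """Round-to-nearest onto a uniform grid, then map back to floats."""
    q = np.clip(np.round(w / scale) + zero, 0, maxq)
    return scale * (q - zero)


def row_grid(W, bits):
    """Per-row asymmetric min/max grid (an illustrative choice)."""
    maxq = 2 ** bits - 1
    lo = np.minimum(W.min(axis=1), 0.0)
    hi = np.maximum(W.max(axis=1), 0.0)
    scale = (hi - lo) / maxq
    scale[scale == 0] = 1.0  # guard against all-zero rows
    zero = np.round(-lo / scale)
    return scale, zero, maxq


def gptq_sketch(W, X, bits=4, damp=0.01):
    """Quantize W (rows x cols) given calibration inputs X (cols x samples).

    Columns are quantized one at a time; the rounding error of each column
    is pushed onto the not-yet-quantized columns using the inverse Hessian.
    """
    W = W.astype(np.float64).copy()
    cols = W.shape[1]
    scale, zero, maxq = row_grid(W, bits)

    # Hessian of the layer-wise squared error, with damping for stability.
    H = 2.0 * X @ X.T
    H += damp * np.mean(np.diag(H)) * np.eye(cols)

    # Upper Cholesky factor U of the inverse Hessian (Hinv = U.T @ U).
    U = np.linalg.cholesky(np.linalg.inv(H)).T

    Q = np.zeros_like(W)
    for i in range(cols):
        q = quantize_rtn(W[:, i], scale, zero, maxq)
        Q[:, i] = q
        err = (W[:, i] - q) / U[i, i]
        # Compensate the remaining columns for the error just introduced.
        W[:, i + 1:] -= np.outer(err, U[i, i + 1:])
    return Q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(8, 16))
    X = rng.normal(size=(16, 64))
    scale, zero, maxq = row_grid(W, 4)
    Q_rtn = quantize_rtn(W, scale[:, None], zero[:, None], maxq)
    Q_gptq = gptq_sketch(W, X, bits=4)
    # Layer-output error of plain rounding versus the compensated update.
    print("RTN error: ", np.linalg.norm(W @ X - Q_rtn @ X))
    print("GPTQ error:", np.linalg.norm(W @ X - Q_gptq @ X))
```

The demo at the bottom prints the layer-output error for plain round-to-nearest and for the compensated update on random data, so the two can be compared directly. Real calibration inputs are correlated across features, which is where error compensation is expected to help most.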

Details

Price: Free
Platform: Local/Desktop
Difficulty: Intermediate (3/5)
License: Apache-2.0
Minimum VRAM: 8 GB
Added: Apr 3, 2026

Related Tools

No-code tool by Hugging Face for training ML models automatically.

Open Source · Self Hosted · Offline · GPU 8GB+
Beginner

Efficient LLM quantization preserving important weight channels.

Open Source · Self Hosted · Offline · GPU 8GB+
Intermediate

Video model fine-tuning toolkit by Hugging Face Diffusers team.

Open Source · Self Hosted · Offline · GPU 16GB+
Advanced

Low-code framework for building custom AI models by Predibase.

Open Source · Self Hosted · Offline · GPU 8GB+
Easy

Library for training LLMs with reinforcement learning (RLHF, DPO, PPO).

Open Source · Self Hosted · Offline · GPU 8GB+
Intermediate

Efficient fine-tuning method using 4-bit quantized base model with LoRA adapters.

Open Source · Self Hosted · Offline · GPU 8GB+
Intermediate