Triton Inference Server
NVIDIA's inference-serving platform for deploying AI models at scale.
Open Source · Self-Hosted · Offline Capable · GPU Required (8 GB+ VRAM)
About
Triton Inference Server by NVIDIA supports deploying models from all major frameworks (TensorRT, PyTorch, TensorFlow, ONNX, vLLM). It provides dynamic batching, model ensembles, GPU and CPU inference, and built-in metrics. Released under the BSD-3-Clause license.
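Triton loads models from a model repository, where each model carries a `config.pbtxt` describing its inputs, outputs, and scheduling. Below is a minimal sketch of such a configuration with dynamic batching enabled; the model name, backend, and tensor shapes are illustrative assumptions (a hypothetical ONNX image classifier), not taken from this listing.

```protobuf
# config.pbtxt -- illustrative example, names and dims are assumptions
name: "resnet50"
platform: "onnxruntime_onnx"
max_batch_size: 8

input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]

# Dynamic batching: Triton groups individual requests into server-side
# batches, waiting up to 100 microseconds to fill a preferred batch size.
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```

With this file placed at `<model-repository>/resnet50/config.pbtxt` (and the model binary in a numbered version subdirectory), Triton can batch concurrent requests transparently, which is the main lever for GPU utilization at scale.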
Details
- Category: AI Deployment & MLOps
- Price: Free
- Platform: Local/Desktop
- Difficulty: Advanced (4/5)
- License: BSD-3-Clause
- Minimum VRAM: 8 GB
- Added: Apr 3, 2026
Similar Tools
- Open-source ML deployment platform for Kubernetes. (Open Source · Self-Hosted · Offline · Intermediate)
- Framework for building production-ready AI application services. (Open Source · Self-Hosted · Offline · Easy)
- Kubernetes-native platform for deploying ML models to production. (Open Source · Self-Hosted · Offline · Advanced)