Triton Inference Server

NVIDIA inference serving platform for deploying AI models at scale.

Open Source, Self Hosted, Offline Capable, GPU Required (8GB+ VRAM)

About

Triton Inference Server by NVIDIA is an inference-serving platform that deploys models through backends for many frameworks and runtimes, including TensorRT, PyTorch, TensorFlow, ONNX, OpenVINO, Python, and vLLM. It offers dynamic batching, model ensembles, concurrent model execution, and metrics, and it runs across cloud, data center, edge, and embedded devices on NVIDIA GPUs or on x86 and ARM CPUs. It is released under the BSD-3-Clause license.
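
As a rough illustration of what serving a model through Triton looks like from the client side, the sketch below builds a request for the server's HTTP inference endpoint, which follows the KServe v2 protocol. It is a minimal sketch under stated assumptions, not a definitive client: the model name `my_model` and the input tensor name `INPUT0` are hypothetical placeholders, and the base URL assumes a server listening on `localhost` at Triton's default HTTP port, 8000. The helper function names are illustrative and are not part of any Triton library.

```python
import json
import urllib.request


def build_infer_request(input_name, values, datatype="FP32"):
    """Build a KServe v2 style request body for a single batch of 1-D data.

    `input_name` must match the input declared in the model's configuration;
    "INPUT0" in the example below is a hypothetical name.
    """
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [1, len(values)],  # batch of 1, len(values) elements
                "datatype": datatype,
                "data": list(values),       # flattened, row-major tensor data
            }
        ]
    }


def infer_url(base_url, model_name):
    """Return the HTTP inference endpoint for a named model."""
    return f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"


def infer(base_url, model_name, body, timeout=10.0):
    """POST the request body to a running server and return the parsed JSON.

    This needs a reachable server with the model loaded; it is not called
    in the demo below.
    """
    request = urllib.request.Request(
        infer_url(base_url, model_name),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Assumed values: a local server and a hypothetical model and input name.
    body = build_infer_request("INPUT0", [1.0, 2.0, 3.0, 4.0])
    print(infer_url("http://localhost:8000", "my_model"))
    print(json.dumps(body))
```

With a server running and a matching model loaded, `infer("http://localhost:8000", "my_model", body)` would send the request. Features such as dynamic batching are configured on the server side, so a client like this one does not change when they are enabled.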

Details

Price: Free
Platform: Local/Desktop
Difficulty: Advanced (4/5)
License: BSD-3-Clause
Minimum VRAM: 8 GB
Added: Apr 3, 2026

Related Tools

Framework for building production-ready AI application services.
Open Source, Self Hosted, Offline. Difficulty: Easy.

Container tool by Replicate for packaging ML models as standard Docker images.
Open Source, Self Hosted, Offline. Difficulty: Easy.

Local AI API platform that runs LLMs on your hardware with an OpenAI-compatible API.
Open Source, Self Hosted, Offline. Difficulty: Easy.

Production model serving system for TensorFlow models.
Open Source, Self Hosted, Offline. Difficulty: Intermediate.

PyTorch model serving framework for production deployment.
Open Source, Self Hosted, Offline. Difficulty: Intermediate.

Open-source ML deployment platform for Kubernetes.
Open Source, Self Hosted, Offline. Difficulty: Intermediate.