BentoML
Framework for building production-ready AI application services.
About
BentoML is a Python framework for building model inference APIs and multi-model serving systems from models trained in any framework. It packages a model together with its dependencies into a standard unit, exposes it as an online service with request batching and GPU support, and containerizes it for deployment anywhere. The aim is reliable, cost-efficient AI services. BentoML is released under the Apache 2.0 license.
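To make "request batching" concrete, here is a minimal standard-library sketch of the general idea: pending requests are grouped into batches so the model runs once per batch rather than once per request. It does not use BentoML and is not BentoML's implementation; `run_batched`, `double_all`, and `max_batch_size` are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Sequence


def run_batched(
    requests: Sequence[float],
    predict_batch: Callable[[List[float]], List[float]],
    max_batch_size: int = 4,
) -> List[float]:
    """Group pending requests into batches and call the model once per batch.

    Illustrative only: a real serving framework would also bound how long
    a request may wait for a batch to fill.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")

    results: List[float] = []
    for start in range(0, len(requests), max_batch_size):
        # Slice out the next group of at most max_batch_size requests.
        batch = list(requests[start:start + max_batch_size])
        outputs = predict_batch(batch)
        if len(outputs) != len(batch):
            raise ValueError("model must return one output per input")
        # Outputs keep the order of the inputs, so each caller gets its own result.
        results.extend(outputs)
    return results


def double_all(batch: List[float]) -> List[float]:
    """Hypothetical stand-in for a model: doubles every input in the batch."""
    return [2 * x for x in batch]
```

Calling `run_batched([1, 2, 3, 4, 5], double_all, max_batch_size=2)` invokes the stand-in model three times (batches of 2, 2, and 1) instead of five. Fewer, larger model calls are what make batching pay off on GPUs.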
Details
- Category: AI Deployment & MLOps
- Price: Free
- Platform: Local/Desktop
- Difficulty: Easy (2/5)
- License: Apache-2.0
- Added: Apr 3, 2026
Related Tools
Container tool by Replicate for packaging ML models as standard Docker images.
Local AI API platform that runs LLMs on your hardware with OpenAI-compatible API.
Production model serving system for TensorFlow models.
PyTorch model serving framework for production deployment.
NVIDIA inference serving platform for deploying AI models at scale.
Open-source ML deployment platform for Kubernetes.