AI Observability & Evaluation Tools
Open-source tools for monitoring, logging, tracing, evaluating, and debugging AI model outputs, prompts, and pipelines.
CLI and library for testing, evaluating, and red-teaming LLM outputs.
Open-source AI metadata tracker for logging and comparing ML experiments.
Open-source ML/AI development platform with experiment tracking and orchestration.
Open-source library for evaluating and tracking LLM applications.
Python framework for unit testing and evaluating LLM applications with metrics like G-Eval.
LangChain's platform for debugging and monitoring LLM apps.
Open-source LLM observability platform for logging, caching, and monitoring.
Open-source AI observability platform for tracing, evaluation, and experimentation.
Open-source ML monitoring framework for data drift and model quality.
Open-source testing framework for ML models and data validation.
Open-source testing framework for AI models focusing on quality and safety.
UK AI Security Institute framework for large language model evaluations and benchmarks.