AI Observability & Evaluation AI Tools

ML experiment tracking, visualization, and collaboration.

Open Source

Easy

0.0 (0)

Featured

Langfuse

Open-source LLM engineering platform for tracing and analytics.

Open Source, Self Hosted

Easy

0.0 (0)

Featured

Promptfoo

CLI and library for testing, evaluating, and red-teaming LLM outputs.

Open Source, Self Hosted, Offline

Easy

0.0 (0)

Aim

Open-source AI metadata tracker for logging and comparing ML experiments.

Open Source, Self Hosted, Offline

Easy

0.0 (0)

ClearML

Open-source ML/AI development platform with experiment tracking and orchestration.

Open Source, Self Hosted, Offline

Easy

0.0 (0)

TruLens

Open-source library for evaluating and tracking LLM applications.

Open Source, Self Hosted

Easy

0.0 (0)

DeepEval

Python framework for unit testing and evaluating LLM applications with metrics like G-Eval.

Open Source, Self Hosted, Offline

Easy

0.0 (0)

LangSmith

LangChain's platform for debugging and monitoring LLM apps.

Easy

0.0 (0)

Helicone

Open-source LLM observability platform for logging, caching, and monitoring.

Open Source, Self Hosted

Beginner

0.0 (0)

Phoenix (Arize)

Open-source AI observability platform for tracing, evaluation, and experimentation.

Open Source, Self Hosted

Easy

0.0 (0)

Evidently AI

Open-source ML monitoring framework for data drift and model quality.

Open Source, Self Hosted, Offline

Easy

0.0 (0)

DeepChecks

Open-source testing framework for ML models and data validation.

Open Source, Self Hosted, Offline

Easy

0.0 (0)

Giskard

Open-source testing framework for AI models focusing on quality and safety.

Open Source, Self Hosted, Offline

Easy

0.0 (0)

Opik

Open-source LLM evaluation and tracing platform by Comet.

Open Source, Self Hosted

Easy

0.0 (0)

Inspect AI

UK AI Security Institute framework for large language model evaluations and benchmarks.

Open Source, Self Hosted, Offline

Easy

0.0 (0)

MLflow

Open-source platform for the ML lifecycle.

Open Source, Self Hosted, Offline

Intermediate

0.0 (0)

Phoenix

AI observability and evaluation from Arize.

Open Source, Self Hosted

Easy

0.0 (0)