Documentation Index
Fetch the complete documentation index at: https://evalgate.com/docs/llms.txt
Use this file to discover all available pages before exploring further.
Evalgate: AI Quality Infrastructure for LLM Apps
Evalgate traces real AI failures, turns them into eval cases, and blocks regressions in CI — no separate observability stack required.

Evalgate is an evaluation control plane for AI teams shipping LLM applications to production. It closes the loop between what breaks in the real world and what gets tested before the next release: trace real agent behavior, promote failures into reusable evaluation coverage, and enforce quality gates in CI so the same issue never ships twice.
Quick Start
Set up your first eval gate in under 5 minutes — no account required for local gating.
Authentication
Get your API key and start making authenticated requests to the Evalgate platform.
SDK Reference
TypeScript and Python SDKs with full type safety, built-in assertions, and CLI tools.
API Reference
Complete REST API reference with request/response examples and interactive endpoints.
How Evalgate works
Evalgate is built around one operating loop: trace → eval → gate. Every feature feeds into this path.

Collect traces from real AI behavior
Instrument your LLM application with the SDK to capture production and staging behavior — inputs, outputs, tool calls, latency, and token usage — with full structured context.
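The capture step above can be sketched generically. This is not the Evalgate SDK's actual API — it is a minimal, self-contained illustration of what "instrumenting a call to record inputs, outputs, and latency" means; the `traced` decorator and the trace-record shape are assumptions for illustration.

```python
import time
import functools

def traced(trace_log):
    """Append a structured trace record (inputs, output, latency) for each call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            output = fn(*args, **kwargs)
            trace_log.append({
                "name": fn.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": output,
                "latency_ms": (time.monotonic() - start) * 1000,
            })
            return output
        return wrapper
    return decorator

traces = []

@traced(traces)
def answer(question: str) -> str:
    # Stand-in for a real LLM call.
    return f"echo: {question}"

answer("What is an eval gate?")
```

In a real integration the record would also carry tool calls and token usage, and would be shipped to the platform rather than held in a local list.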
Turn failures into reusable eval coverage
Promote failing patterns into test cases and suites. Label traces interactively, cluster failures by behavior, and synthesize golden datasets from real production gaps.
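The promotion step can be illustrated with a small sketch. The function names and data shapes here are hypothetical, not Evalgate's API: the point is only that a captured failing trace, plus a corrected expectation, becomes a reusable case that a suite can re-run on every release.

```python
def promote_to_case(trace, expected_output):
    """Turn a recorded failing trace into a reusable eval case with a corrected expectation."""
    return {
        "input": trace["input"],
        "expected": expected_output,
        "source": "production-trace",
    }

def run_suite(cases, model_fn):
    """Run every case through the model under test and report pass/fail per case."""
    results = []
    for case in cases:
        got = model_fn(case["input"])
        results.append({"input": case["input"], "passed": got == case["expected"]})
    return results

failing_trace = {"input": "2+2", "output": "5"}      # a real failure captured in production
case = promote_to_case(failing_trace, expected_output="4")
results = run_suite([case], model_fn=lambda q: "4")  # the fixed model now passes this case
```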
Explore by topic
Core Concepts
Understand the trace → eval → gate loop and Evalgate’s data model.
CI/CD Integration
Wire Evalgate into GitHub Actions or GitLab CI to gate every PR.
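A CI quality gate ultimately reduces to an exit code: the pipeline step fails when the suite's pass rate drops below a threshold. A minimal sketch of that logic, assuming a hypothetical threshold and result shape rather than Evalgate's actual configuration:

```python
def gate(results, min_pass_rate=0.9):
    """Return a CI exit code: 0 if the suite meets the pass-rate threshold, 1 otherwise."""
    passed = sum(1 for r in results if r["passed"])
    rate = passed / len(results) if results else 0.0
    print(f"pass rate: {rate:.0%} (threshold {min_pass_rate:.0%})")
    return 0 if rate >= min_pass_rate else 1

results = [{"passed": True}, {"passed": True}, {"passed": False}]
exit_code = gate(results, min_pass_rate=0.9)  # 2/3 is below 90%, so the PR is blocked
```

In GitHub Actions or GitLab CI, a nonzero exit code from this step marks the job red and blocks the merge.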
LLM Judge
Orchestrate multi-judge evaluation with disagreement handling and provider flexibility.
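One common form of disagreement handling is majority voting with escalation: accept a verdict only when enough judges agree, otherwise flag the case for human review. This sketch assumes that pattern for illustration; the `aggregate_judges` function and quorum parameter are hypothetical, not Evalgate's API.

```python
from collections import Counter

def aggregate_judges(verdicts, quorum=2):
    """Majority vote across judge verdicts; flag the case for human review on disagreement."""
    counts = Counter(verdicts)
    label, votes = counts.most_common(1)[0]
    if votes >= quorum:
        return {"verdict": label, "needs_review": False}
    return {"verdict": None, "needs_review": True}

# Three judges (e.g. from different providers) score the same model output.
clear = aggregate_judges(["pass", "pass", "fail"])    # clear majority
split = aggregate_judges(["pass", "fail", "error"])   # no quorum: escalate to a human
```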
MCP Integration
Use Evalgate tools directly from Cursor, Claude Desktop, or ChatGPT.