Annotations API — human labeling and review
Create annotation tasks, assign traces for human review, and submit labels to build the golden dataset used for measuring LLM judge credibility.
Human annotations are the foundation of judge credibility in Evalgate. When you label a set of traces as pass or fail, those labels become the ground truth that the LLM Judge alignment endpoint compares against automated judge scores. A judge with high alignment against a well-labeled dataset is one you can trust to gate your CI pipeline.
GET /api/annotations/tasks — list annotation tasks
Returns annotation tasks for the authenticated organization.
Response
POST /api/annotations/tasks — create an annotation task
Creates a new task and assigns a set of traces for labeling.
Request body
Display name for this annotation task.
Array of numeric trace IDs to include in this task. Each trace will become one annotation item.
Optional guidance text shown to annotators when they open the task.
Optional array of label strings annotators can choose from. Defaults to ["pass", "fail"] when not specified.
Response (201)
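As a rough illustration of the create call described above, the sketch below builds (but does not send) the POST request using only the Python standard library. Only the endpoint path and the `labelOptions` name come from this page. The host in `BASE_URL`, the bearer-token header, and the field names `name`, `traceIds`, and `instructions` are assumptions made for the example, so check them against your own Evalgate setup.

```python
import json
import urllib.request

# Hypothetical host; replace with the base URL of your Evalgate deployment.
BASE_URL = "https://your-evalgate-host.example"


def build_create_task_request(name, trace_ids, instructions=None,
                              label_options=None, api_key="YOUR_API_KEY"):
    """Build, without sending, a request that creates an annotation task."""
    # "name", "traceIds" and "instructions" are assumed field names.
    # "labelOptions" is the only name confirmed by this page.
    payload = {"name": name, "traceIds": list(trace_ids)}
    if instructions is not None:
        payload["instructions"] = instructions
    if label_options is not None:
        # Omitting this field leaves the server default of ["pass", "fail"].
        payload["labelOptions"] = list(label_options)
    return urllib.request.Request(
        f"{BASE_URL}/api/annotations/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; a successful creation is expected to come back with status 201.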
GET /api/annotations/tasks/ — get task details
Returns a single annotation task with its items.
Path parameters
Numeric ID of the annotation task.
Response
POST /api/annotations/tasks//items — submit an annotation
Submits a label for a single annotation item within a task.
Path parameters
Numeric ID of the annotation task.
Request body
Numeric ID of the annotation item to label.
The label to assign. Must be one of the task’s configured labelOptions, or pass / fail by default.
Optional free-text notes explaining the label decision. These are stored alongside the label for audit and inter-rater review.
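The submit call might be sketched as follows, with a client-side check that the label is one of the allowed options before the request is built. The path shape (task ID followed by `/items`) follows the endpoint above. The host, the bearer-token header, and the body field names `itemId`, `label`, and `notes` are hypothetical and used only for illustration.

```python
import json
import urllib.request

# Hypothetical host; replace with the base URL of your Evalgate deployment.
BASE_URL = "https://your-evalgate-host.example"
DEFAULT_LABELS = ("pass", "fail")


def build_submit_annotation_request(task_id, item_id, label, notes=None,
                                    label_options=DEFAULT_LABELS,
                                    api_key="YOUR_API_KEY"):
    """Build, without sending, a request that labels one annotation item."""
    # Reject labels the task would not accept before making a network call.
    if label not in label_options:
        raise ValueError(f"label {label!r} is not one of {list(label_options)}")
    # "itemId", "label" and "notes" are assumed field names.
    payload = {"itemId": item_id, "label": label}
    if notes:
        payload["notes"] = notes
    return urllib.request.Request(
        f"{BASE_URL}/api/annotations/tasks/{task_id}/items",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

For a task created with custom labels, pass the same list as `label_options` so the local check matches what the task accepts.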
Response
Once a task has enough labels, run the LLM Judge alignment check to measure how well your automated judge agrees with your team’s ground truth. A high-alignment judge is safe to use as an automated CI gate.
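To make the idea of agreement concrete, here is a minimal sketch of raw percent agreement between human labels and judge verdicts, keyed by trace ID. This is an illustration only: the metric Evalgate's alignment check actually reports is not specified on this page and may differ, and the function name and input shape are invented for the example.

```python
def agreement_rate(human_labels, judge_labels):
    """Fraction of shared traces where the judge matches the human label.

    Both arguments map a trace ID to a label such as "pass" or "fail".
    Traces present in only one mapping are ignored.
    """
    shared = human_labels.keys() & judge_labels.keys()
    if not shared:
        raise ValueError("no traces were labeled by both the human and the judge")
    matches = sum(1 for trace_id in shared
                  if human_labels[trace_id] == judge_labels[trace_id])
    return matches / len(shared)


human = {1: "pass", 2: "fail", 3: "pass", 4: "fail"}
judge = {1: "pass", 2: "pass", 3: "pass", 4: "fail"}
print(agreement_rate(human, judge))  # 0.75: the judge disagrees on trace 2 only
```

Raw agreement can look high when one label dominates the dataset, so it helps to label a mix of passing and failing traces before relying on the number.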