What is Agenta?

Agenta is an open-source (MIT-licensed) LLMOps platform that combines prompt management, LLM evaluation, observability (tracing), and a prompt playground in a single tool. It supports Git-style versioning, branching, and environment switching for prompts, so developers and non-engineers can collaborate on LLM applications in production. Self-hosting is free; a cloud-hosted version is also available.

Business problems it solves

【Engineer】

About "Agenta"

How to Use

Set up your environment (cloud or self-hosted)

Sign up for the cloud version, or self-host the open-source edition on your own servers with Docker Compose. Self-hosting is free.

Build prompts in the playground

Write prompts in the playground and compare results across multiple models and parameter settings. Agenta supports more than 50 models as well as custom providers.

Version your prompts

Manage your prompts in a Git-like way. You can branch and switch between environments (such as production and staging), and non-engineers can edit and operate prompts without touching code.
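
The versioning idea described above can be sketched as a toy in-memory model. This is a conceptual illustration only, not Agenta's SDK or data model: the `PromptRegistry` class, its method names, and the variant names are all hypothetical, chosen to show how immutable versions, branches, and environment pointers relate to each other.

```python
from dataclasses import dataclass, field


@dataclass
class PromptRegistry:
    """Toy in-memory model of versioned prompts with environment pointers (hypothetical)."""

    # variant name -> list of prompt texts; list index + 1 is the version number
    variants: dict[str, list[str]] = field(default_factory=dict)
    # environment name -> (variant name, version number)
    environments: dict[str, tuple[str, int]] = field(default_factory=dict)

    def commit(self, variant: str, prompt: str) -> int:
        """Append a new version to a variant and return its version number."""
        history = self.variants.setdefault(variant, [])
        history.append(prompt)
        return len(history)

    def branch(self, source: str, new_variant: str) -> None:
        """Start a new variant from a copy of an existing variant's history."""
        self.variants[new_variant] = list(self.variants[source])

    def deploy(self, environment: str, variant: str, version: int) -> None:
        """Point an environment at one specific version of a variant."""
        if not 1 <= version <= len(self.variants[variant]):
            raise ValueError(f"{variant} has no version {version}")
        self.environments[environment] = (variant, version)

    def fetch(self, environment: str) -> str:
        """Return the prompt text currently deployed to an environment."""
        variant, version = self.environments[environment]
        return self.variants[variant][version - 1]


registry = PromptRegistry()
v1 = registry.commit("default", "Summarize the text: {text}")
registry.deploy("production", "default", v1)

# Experiment on a branch without touching what production serves.
registry.branch("default", "concise")
v2 = registry.commit("concise", "Summarize the text in one sentence: {text}")
registry.deploy("staging", "concise", v2)

print(registry.fetch("production"))
print(registry.fetch("staging"))
```

Because application code asks for "whatever is deployed to production" rather than a hard-coded prompt, a prompt can be edited and promoted without a code change, which is the property that lets non-engineers operate prompts.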

Run evaluations

Create test sets from production data or CSV files, and measure prompt quality by combining built-in evaluators (20+), custom evaluations, and human review.
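
As a rough illustration of what an evaluation run involves, the sketch below loads a test set from CSV text and scores an application with an exact-match evaluator. It uses only the Python standard library and does not call Agenta; the column names (`input`, `expected`), the `fake_app` stand-in, and the sample rows are assumptions made up for the example.

```python
import csv
import io

# Hypothetical test set; in practice this would come from a CSV file or production data.
CSV_TEXT = """input,expected
What is 2 + 2?,4
What is the capital of France?,Paris
Who wrote Hamlet?,Shakespeare
"""


def load_test_set(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text with 'input' and 'expected' columns into test cases."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 if output equals expected, ignoring case and surrounding whitespace."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def fake_app(question: str) -> str:
    """Stand-in for an LLM application; returns canned answers."""
    answers = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "paris ",
        "Who wrote Hamlet?": "William Shakespeare",
    }
    return answers.get(question, "")


def run_evaluation(test_set, app, evaluator) -> float:
    """Run the app on every test case and return the mean evaluator score."""
    scores = [evaluator(app(case["input"]), case["expected"]) for case in test_set]
    return sum(scores) / len(scores) if scores else 0.0


test_set = load_test_set(CSV_TEXT)
score = run_evaluation(test_set, fake_app, exact_match)
print(f"exact-match score: {score:.2f} over {len(test_set)} cases")
```

The third case fails exact match even though the answer is arguably correct, which is the kind of gap that custom evaluators or human review are meant to cover.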

Monitor with observability

Use OpenTelemetry-native tracing to visualize cost, latency, and execution details, and continuously improve your production LLM apps.
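
To make "cost, latency, and execution details" concrete, here is a minimal standard-library sketch of the kind of data a trace span carries. It is not the OpenTelemetry API or the Agenta SDK: the `span` helper, the attribute names, the token counts, and the per-token prices are all hypothetical.

```python
import time
from contextlib import contextmanager

# Hypothetical prices in USD per 1,000 tokens; real prices depend on the model and provider.
PRICE_PER_1K_INPUT = 0.50
PRICE_PER_1K_OUTPUT = 1.50

spans: list[dict] = []  # collected spans; a real tracer would export these to a backend


@contextmanager
def span(name: str):
    """Record the duration and attributes of one step as a span-like dict."""
    record = {"name": name, "attributes": {}}
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["latency_ms"] = (time.perf_counter() - start) * 1000
        spans.append(record)


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD from token counts and the assumed prices."""
    return (input_tokens * PRICE_PER_1K_INPUT + output_tokens * PRICE_PER_1K_OUTPUT) / 1000


def fake_llm_call(prompt: str) -> str:
    """Stand-in for a model call; sleeps briefly to simulate latency."""
    time.sleep(0.01)
    return "stub answer"


with span("llm_call") as s:
    answer = fake_llm_call("Summarize this document.")
    # Token counts are made up here; a provider response would normally report them.
    s["attributes"].update(input_tokens=200, output_tokens=100)
    s["attributes"]["cost_usd"] = estimate_cost(200, 100)

for record in spans:
    print(record["name"], f'{record["latency_ms"]:.1f} ms', record["attributes"])
```

In a real setup the spans would be emitted through an OpenTelemetry-compatible tracer and nested per workflow step, so that cost and latency can be attributed to individual calls.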

Features

Prompt playground

Test prompts against test cases while switching between multiple models and parameters. Supports 50+ models and custom providers.

Prompt management and Git-like versioning

Version your prompts, branch them, and switch between environments (production/staging). Non-engineers such as subject-matter experts (SMEs) can collaborate with developers and edit and operate prompts without touching code.

LLM evaluation

Create test sets from production data or CSV files, and quantitatively evaluate quality using 20+ built-in evaluators, custom evaluations, and human feedback. You can run evaluations from the UI or programmatically.

Observability and tracing

OpenTelemetry-native, with monitoring of cost and performance and detailed tracing of complex workflows for debugging. Integrations for various models and frameworks are also provided.

Open source (MIT) and self-hosting

Open source under the MIT license, with self-hosting via Docker Compose. You can keep your data in your own environment.

Pricing

Pricing is current as of June 2026. Prices are in US dollars, so the equivalent in Japanese yen varies with the exchange rate. The self-hosted open-source edition is free; the cloud plans are listed below. Check the official pricing page for the latest details.

Plan	Monthly Price	Key Features
Self-hosted (OSS)	Free	MIT license, self-operated via Docker Compose, all features available on your own infrastructure
Hobby	$0	2 users, 5,000 traces/month, 20 evaluations/month, 30-day retention, community support
Pro	$49/month	3 users (additional seats $20 each, up to 10), 10,000 traces/month, unlimited evaluations, 90-day retention, in-app support
Business	$399/month	Unlimited users, 1 million traces/month, SSO/SOC2/RBAC, 365-day retention, dedicated Slack
Enterprise	Contact us	Self-hosting onboarding support, custom SLA, audit logs, dedicated support, BYOC support

Note: usage beyond a plan's trace or evaluation limits may be billed as overage on a usage basis (for example, $5 per additional 10,000 traces).
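
For budgeting, the Pro plan figures above can be combined into a rough monthly estimate. The sketch below uses only the numbers quoted in the table and the overage example; how overage is actually rounded is not stated, so charging per started block of 10,000 traces is an assumption, and the function name is hypothetical.

```python
import math

# Figures taken from the pricing table and overage example; treat them as illustrative.
PRO_BASE_USD = 49
PRO_INCLUDED_SEATS = 3
PRO_EXTRA_SEAT_USD = 20
PRO_INCLUDED_TRACES = 10_000
OVERAGE_USD_PER_BLOCK = 5
OVERAGE_BLOCK_TRACES = 10_000


def estimate_pro_monthly_cost(seats: int, traces: int) -> int:
    """Estimate a monthly Pro bill in USD, assuming overage is charged per started block."""
    extra_seats = max(0, seats - PRO_INCLUDED_SEATS)
    extra_traces = max(0, traces - PRO_INCLUDED_TRACES)
    overage_blocks = math.ceil(extra_traces / OVERAGE_BLOCK_TRACES)
    return (
        PRO_BASE_USD
        + extra_seats * PRO_EXTRA_SEAT_USD
        + overage_blocks * OVERAGE_USD_PER_BLOCK
    )


print(estimate_pro_monthly_cost(seats=3, traces=10_000))  # 49: within plan limits
print(estimate_pro_monthly_cost(seats=5, traces=45_000))  # 109: 49 + 2*20 + 4*5
```

Under these assumptions, trace overage is a small part of the bill compared with seats, so team size is likely the main driver when deciding between Pro and Business.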

Pros & Cons

Pros

Prompt management, evaluation, and observability are all handled in a single tool
MIT-licensed OSS, so if you self-host you can keep your data in-house and run it for free
Git-like version control and environment switching make it easier to manage prompts in production
Non-engineers such as subject-matter experts (SMEs) can edit prompts without touching code, making it well suited to team collaboration

Cons

The interface and documentation are primarily in English
Self-hosting requires knowledge of infrastructure operations such as Docker
The cloud version caps usage (for example, the number of traces per month), so you need to choose a plan that matches your scale

Reviews & Reputation

Developers praise it for bringing prompt management, evaluation, and tracing together in one place, making it easy to adopt as an operational foundation for LLM apps.
Some note that because it can be self-hosted as OSS, it can be used even in environments where data cannot leave the organization, and that its Git-like prompt management fits real-world operations.
On the other hand, some point out that the UI is in English and that self-hosting assumes infrastructure management.

FAQ

Q. Can I use Agenta for free?

Yes. If you self-host the MIT-licensed open-source edition, you can use it for free. The cloud version also has a free Hobby plan, and there are paid plans from Pro and up for full-scale operations.

Q. Is it open source? Can it be self-hosted?

Yes. It is MIT-licensed OSS, and you can self-host it on your own servers using Docker Compose.

Q. Which LLMs does it support?

It supports more than 50 models from providers including OpenAI, Anthropic, and Google (Gemini), as well as custom providers.

Q. Does it support Japanese?

The interface and documentation are primarily in English. The prompts and data you handle can be in any language, but a Japanese-localized UI is not provided.

How Agenta differs from other LLMOps and AI app development tools

Aspect	Agenta	Typical AI app development platform	When Agenta fits
Focus	LLMOps for prompt management, evaluation, and observability	More geared toward app building and orchestration	Choose Agenta if you prioritize prompt operations and evaluation
Delivery	OSS (MIT) plus cloud	Varies by service	Convenient when you want to keep data in-house
Versioning	Git-like branching and environment switching	May be limited depending on the tool	Suited to strictly managing production prompts