Phoenix

AI Observability & Evaluation

Rating: 0.0 (0 votes)

Downloads: 0 total

Price: Free (no login needed)

Works With: Claude Code, Cursor, Windsurf, VS Code, Developer tool

About

Phoenix is an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting. It provides:

  • **_Tracing_** - Trace your LLM application's runtime using OpenTelemetry-based instrumentation.
  • **_Evaluation_** - Leverage LLMs to benchmark your application's performance using response and retrieval evals.
  • **_Datasets_** - Create versioned datasets of examples for experimentation, evaluation, and fine-tuning.
  • **_Experiments_** - Track and evaluate changes to prompts, LLMs, and retrieval.
  • **_Playground_** - Optimize prompts, compare models, adjust parameters, and replay traced LLM calls.
  • **_Prompt Management_** - Manage and test prompt changes systematically using version control, tagging, and experimentation.

Phoenix is vendor and language agnostic with out-of-the-box support for popular frameworks (OpenAI Agents SDK, Claude Agent SDK, LangGraph, Vercel AI SDK, Mastra, CrewAI, LlamaIndex, DSPy) and LLM providers (OpenAI, Anthropic, Google GenAI, Google ADK, AWS Bedrock, OpenRouter, LiteLLM, and more). For details on auto-instrumentation, check out the OpenInference project.
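
As a rough illustration of the auto-instrumentation mentioned above, the sketch below registers an OpenTelemetry tracer provider that exports to Phoenix. It assumes the `phoenix.otel` module is installed (it ships with `arize-phoenix`) and that at least one OpenInference instrumentor package for your framework is present. The helper name `setup_tracing` and the project name `my-llm-app` are placeholders invented for this example, not part of Phoenix.

```python
# Sketch only: assumes phoenix.otel is available and that an OpenInference
# instrumentor for your framework is installed. `setup_tracing` and
# "my-llm-app" are hypothetical names used for illustration.


def setup_tracing(project_name="my-llm-app"):
    """Register a tracer provider that exports spans to Phoenix.

    Returns the tracer provider, or None if phoenix.otel is not installed.
    """
    try:
        from phoenix.otel import register
    except ImportError:
        return None
    # auto_instrument=True asks Phoenix to activate whichever OpenInference
    # instrumentors it finds installed in the current environment.
    return register(project_name=project_name, auto_instrument=True)


if __name__ == "__main__":
    provider = setup_tracing()
    print("tracing enabled" if provider else "phoenix.otel not installed")
```

Call `setup_tracing()` once at startup, before the first LLM call, so that later calls are captured. Where the spans are sent depends on your Phoenix endpoint configuration, which this sketch leaves at its defaults.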

Phoenix runs practically anywhere, including your local machine, a Jupyter notebook, a containerized deployment, or in the cloud.

Installation

Install Phoenix via pip or conda:

```shell
pip install arize-phoenix
```
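
To confirm the install worked, a small check like the following may help. It uses only the Python standard library and assumes the distribution is named `arize-phoenix` and exposes a top-level `phoenix` module. The helper name `phoenix_version` is made up for this example.

```python
import importlib.util
from importlib import metadata


def phoenix_version():
    """Return the installed arize-phoenix version string, or None if absent."""
    # The pip package is arize-phoenix, but the importable module is `phoenix`.
    if importlib.util.find_spec("phoenix") is None:
        return None
    try:
        return metadata.version("arize-phoenix")
    except metadata.PackageNotFoundError:
        # A module named `phoenix` exists but comes from another distribution.
        return None


if __name__ == "__main__":
    version = phoenix_version()
    print(f"arize-phoenix {version}" if version else "arize-phoenix is not installed")
```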

a-gnt's Take

Our honest review

AI Observability & Evaluation. Best for anyone looking to make their AI assistant more capable in devops & monitoring. It's completely free and works across most major AI apps. This one just landed in the catalog — worth trying while it's fresh.

Tips for getting started

1. Tap "Get" above, pick your AI app, and follow the steps. Most installs take under 30 seconds.

What's New

Version 1.0.0 (6 days ago)

Imported from GitHub
