Plasticity
Environment Design
Dynamic Worlds

Patronus Products

Central to our product evolution is dynamism. Our industry landscape has evolved rapidly over the past few years, and so have we.

We started by evaluating models at static checkpoints.

Then we moved toward understanding agents over time — across traces, tools, and decisions.

Today, we build dynamic environments where behavior emerges through interaction.

01 Platform

Our core eval platform gives teams a centralized solution for experiments, logging, comparisons, traces, and more.

LLM-as-a-Judge

Enables developers to score multimodal AI systems on image-to-text tasks

Glider

Powerful 3B evaluator LLM that can score any text input on user-defined criteria

Lynx

A SOTA hallucination detection LLM that is capable of advanced reasoning

02 Percival

Percival is our eval copilot for agentic systems, built to detect 20+ failure modes in agentic traces, suggest optimizations, and evaluate a suite of reasoning and planning errors.

Percival

Eval copilot that analyzes traces, identifies issues, and suggests optimizations

Agent

Interactive AI assistant that lets you unlock the power of Percival

03 Generative Simulator

We are a team of AI researchers and engineers formerly at companies such as Meta AI, Amazon AGI, and Google. Our work has led to product contributions serving top Fortune 500 clients.

Generative Simulators

Adaptive environments that co-generate tasks, world dynamics, and reward functions

RL Environments

Dynamic, feedback-driven environments for domain-specific agent training and evaluation

MemTrack

Benchmark to evaluate long-term memory and state tracking in multi-platform agent environments


The “compressed 21st century”: the idea that after powerful AI is developed, we will in a few years make all the progress in biology and medicine that we would have made in the whole 21st century

– Dario Amodei, Machines of Loving Grace