Experimentation

Latest

How Khan Academy Optimizes AI Tutoring with Experimentation

Khan Academy went from vibes-based prompt testing to running A/B experiments on their AI tutor, Khanmigo, in production. Kelli Hill shares the journey, including a fascinating iterative case study on latency vs. math accuracy.

· 5 min read

AI Evals vs. A/B Testing: Why You Need Both to Ship GenAI

Stop relying on "vibe checks" to ship GenAI. While AI Evals answer "can the model do the job?", only A/B testing answers "do users care?" Discover how to combine offline evaluation with online experimentation to build a reliable pipeline for shipping LLM features.

· 9 min read

7 Steps to Better Experiment Design

Only 10–30% of experiments produce a clear winner — and that’s not a problem, it’s reality. This article shows how high-performing teams design experiments to learn faster and make better decisions, even when results are neutral.

· 4 min read

Feedback Loops Are the Next Breakthrough in Agentic Coding

Most AI coding tools today help teams build faster, but they don’t provide the feedback needed to know what’s worth building. Feedback loops—drawn from millions of experiments—will be the next breakthrough in agentic coding, making AI smarter, faster, and more valuable to every software team.

· 2 min read

How GrowthBook Holdouts Work Under the Hood

Most holdouts measure only shipped features. Ours measure everything—including failed experiments. This technical deep dive reveals why we chose reality over clean rooms, and how we built it.

· 4 min read

Holdouts in GrowthBook: The Gold Standard for Measuring Cumulative Impact

Many teams ship features weekly but struggle to measure their true cumulative impact. Holdouts in GrowthBook provide a simple way to maintain a control group across multiple features, answering the critical question: What did all this shipping actually do to our key metrics?

· 4 min read

GrowthBook Version 4.0

GrowthBook 4.0 includes a huge number of new features and updates. Continue reading for a full list of changes.

· 3 min read

Flavors of Experimentation in GrowthBook

GrowthBook offers three types of experimentation, each for a different purpose: Bandits for picking a winner among many, Safe Rollouts for releasing safely, and Experiments for learning.

· 4 min read
