Blog

Conceptual explainers and essays on causal evaluation, LLM judges, and alignment.

Start Here

Your AI Metrics Are Lying to You

Why "You're absolutely right!" scored 9/10 but tanked user satisfaction by 18%. Zero math, just the core insight.

Looking for theory?

Research papers with formal proofs and identification results

Looking for benchmark results?

Read the CJE empirical paper on arXiv
