CJE Documentation
Everything you need to get started with Causal Judge Evaluation
Getting Started
Installation & Setup
→Get CJE running in 5 minutes with pip install and a quick verification step
Data Format Guide
→Required fields for each mode, log probability setup, and validation rules
API Reference
→Complete reference for the analyze_dataset() function, its parameters, and its results
Guides & Workflows
Quick-Start Recipes
→7-step workflow, mode selection (DM/IPS/DR), and copy-and-use recipe cards
Diagnostics & Fixes
→The 5 highest-leverage diagnostics with alert thresholds and fix strategies
Method Details
Direct Method (DM)
→AutoCal-R calibration, OUA for honest uncertainty, and what to report
Off-Policy Re-use (IPS & DR)
→Calibrated IPS with SIMCal stabilization and doubly robust estimation for log re-use
Assumptions (Plain English)
→When your estimates are causally interpretable: shared assumptions and mode-specific requirements
Additional Resources
GitHub Repository
Source code, examples, and developer documentation
Working Example
See examples/arena_sample/ in the GitHub repo for a complete working example with 100 samples from the real Arena 10K evaluation.