How We Developed Zeta2 — Zed's Blog

What stands out here is that product quality improved through better data and evaluation loops, not just a stronger base model. The discussion of the reversal problem is especially sharp: it shows how easily a model can optimize for plausible code while fighting the user’s immediate intent. More broadly, it reads like a case study in applied ML maturity, where better teacher prompts, better traces, and more realistic evals drive much of the improvement.