Kevin Schaul
Visual journalist/hacker covering AI
Jan 12, 2026
Opus 4.5 result on METR task duration is wild