

1 month ago
Interestingly, LLMs are horrible at Zork: https://arxiv.org/abs/2602.15867
Our results reveal that all tested models achieve less than 10% completion on average, with even the best-performing model (Claude Opus 4.5) reaching only approximately 75 out of 350 possible points


I got my first victory after a mere 80 hours, so you could say I’m a God gamer. And the only thing Gods fear is polymorph.