False, chess Elo is pretty good: https://maxim-saplin.github.io/llm_chess/ ets no...

bigfishrunning · 2026-03-09T20:36:08 1773088568

If you look at the "workflow" section of that page, they had to add a bunch of scaffolding around telling the model what moves are legal -- an LLM can't keep enough context to know how to play chess, only to choose an advantageous move from a given list. But feel free to "cherry pick".
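To make "scaffolding" concrete, here is a rough sketch of what such a move loop could look like. This is not the benchmark's actual code: `ask_for_move`, `choose`, `show_legal_moves`, and `max_mistakes` are hypothetical names, and the exact forfeit rule is an assumption made for illustration.

```python
from typing import Callable, Optional, Sequence


def ask_for_move(
    choose: Callable[[Optional[Sequence[str]]], str],
    legal_moves: Sequence[str],
    show_legal_moves: bool = True,
    max_mistakes: int = 3,
) -> Optional[str]:
    """Ask a move chooser (e.g. a wrapper around a model call) for one move.

    choose           -- hypothetical callback; receives the legal-move list,
                        or None when the list is withheld
    show_legal_moves -- whether the chooser is told which moves are legal
    max_mistakes     -- assumed rule: this many illegal replies forfeits
    Returns the chosen legal move, or None if the mistake budget runs out.
    """
    mistakes = 0
    while mistakes < max_mistakes:
        offered = list(legal_moves) if show_legal_moves else None
        move = choose(offered)
        if move in legal_moves:
            return move
        # Illegal reply: count it against the budget and ask again.
        mistakes += 1
    return None
```

With `show_legal_moves=False` and `max_mistakes=1`, the chooser gets no list and no second chance, which is the stricter setup in this sketch.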

simianwords · 2026-03-09T20:38:09 1773088689

why do you think this falsifies that it can't reason?

simianwords · 2026-03-09T22:44:50 1773096290

I ran the benchmark without the valid-moves tool and without the three-mistakes grace, and gpt-5.4 holds up well. It can achieve 1000 Elo, which is much higher than my own.

This clearly tells me that GPT is good at chess, at least better than a normal person who has played ~30-40 games in their life.
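For a sense of what a rating gap means, the standard Elo expected-score formula can be sketched as below. The 800 rating in the usage note is a made-up stand-in for a casual player, not a number from this thread.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) of player A against
    player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
```

For example, `expected_score(1000, 800)` comes out to roughly 0.76, so under this model a 1000-rated player would be expected to score about three points out of four against a hypothetical 800-rated opponent.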