Comment by zniturah

8 months ago

"Already won position" or "99% win rate" is a statistic given by Stockfish (or a professional chess player). It is odd to assume that the same statement holds for the trained LLM, since the LLM itself is what we are assessing. If Stockfish is being used during the game, then the system is searching, so the title doesn't reflect the actual work.

2 comments

dmurray 8 months ago

It's quite clear from the article that the 99% is the model's predicted win rate for a position, not its evaluation by Stockfish (which doesn't return evaluations in those terms).

It's true that this is a relatively large deficiency in practice: how strong would a player be if he played the middlegame at grandmaster strength but couldn't reliably mate with king and rook?

The authors overcame the practical problem by just punting to Stockfish in these few cases. However, I think it's clearly solvable with LLM methods too. Their model performs poorly here because of an artifact in the training process where mate-in-one is valued as highly as mate-in-fifteen. Train another instance of the model purely on checkmate patterns - it can probably be done with many fewer parameters - and punt to that instead.
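
A minimal sketch of that punting logic, to make the idea concrete. Everything here is hypothetical: `choose_move`, the `fallback` callable (standing in for either Stockfish or a small mating-pattern model), and the rule "more than one move at or above 99%" are assumptions for illustration, and the exact trigger used in the article may differ.

```python
from typing import Callable, Dict

# Assumed cutoff, taken from the "99%" figure discussed in this thread.
WIN_THRESHOLD = 0.99


def choose_move(
    win_probs: Dict[str, float],
    fallback: Callable[[], str],
    threshold: float = WIN_THRESHOLD,
) -> str:
    """Pick a move from the model's predicted win probabilities.

    win_probs maps each legal move to the model's predicted win rate.
    fallback is a hypothetical second player (an engine or a small
    checkmate-only model) that is consulted when the main model cannot
    tell winning moves apart.
    """
    if not win_probs:
        raise ValueError("no legal moves")

    # Moves whose predicted win rate has saturated near 100%.
    saturated = [m for m, p in win_probs.items() if p >= threshold]

    # If several moves look "already won", the main model gives no signal
    # about which one actually makes progress, so defer to the fallback.
    if len(saturated) > 1:
        return fallback()

    # Otherwise trust the main model's top choice.
    return max(win_probs, key=win_probs.get)
```

For example, `choose_move({"Qh7#": 0.995, "Kg2": 0.993}, fallback)` would call `fallback`, while a normal middlegame position with one clearly best move never touches it.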

im3w1l 8 months ago

Human players have this concept of progress. I couldn't give a good succinct description of exactly what that entails, but basically: if you are trading off pieces, that's progress; if your king is breaking through the defensive formation in a pawn endgame, that's progress. If you are pushing your passed pawn up the board, that's progress. If you are slowly constricting the other king, that's progress.
When we have a won position we want to progress and convert it to an actual win.
I think the operational definition I would use for progress is a prediction of how many more moves the game will last. A neural network can be used for that.
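
A rough sketch of how that could be used as a tie-breaker, assuming a second network that outputs a predicted number of remaining moves for each candidate. The function name, the tuple layout, and the `tolerance` value are all made up for illustration.

```python
from typing import List, Tuple

# (move, predicted win probability, predicted moves remaining in the game)
Candidate = Tuple[str, float, float]


def pick_progress_move(candidates: List[Candidate], tolerance: float = 0.01) -> str:
    """Among moves that are roughly equally winning, prefer the fastest win.

    tolerance is an assumed margin: moves whose win probability is within
    this distance of the best one are treated as tied.
    """
    best = max(p for _, p, _ in candidates)

    # Keep only moves the value model cannot meaningfully distinguish.
    near_best = [c for c in candidates if best - c[1] <= tolerance]

    # "Progress" = the game is predicted to end sooner.
    return min(near_best, key=lambda c: c[2])[0]


moves = [("Kg2", 0.995, 14.0), ("Qh7#", 0.993, 1.0), ("Qa1", 0.60, 3.0)]
print(pick_progress_move(moves))
```

Here `Qa1` is ignored because it is not close to winning, and between the two near-tied moves the one with the shorter predicted game is chosen.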