Comment by mtlmtlmtlmtl

8 months ago

Yeah, NNUE is a separate invention that, unfortunately, DeepMind often gets undeserved credit for inspiring. It didn't even originate in chess engines, but in a shogi version of Stockfish. The architecture is completely different from the nets in Leela or AlphaZero.

Wait, so progress on Stockfish would have happened regardless of AlphaZero? I always thought the newer versions were inspired by it, and got a much improved rating from incorporating it.

  • Well, NNUE is surprisingly similar to what Stockfish was doing before NNUE. Before, it used what are called piece-square tables. The basic idea (the Stockfish evaluator had a lot more going on in addition, using multiple tables and interpolating between them based on game phase) is to assign some heuristic value to every square, for every piece type. So it's just a 6x8x8 array that maps piece positions to values.

    To get the evaluation of the whole position, you add up all of these mappings for the pieces on the board with opposite signs for the opposing players.

    If you blur your eyes a little, this already looks a lot like a neural net. It's just a big summation of terms, and if you leave in a 0*(whatever value) for every piece-square combination that's not on the board, you've effectively embedded your lookup table into a giant mathematical expression that can be optimised by gradient descent.

    The reason computer shogi programmers stumbled on this is that they were experimenting with adding more dimensions to the piece-square table, specifically by indexing on king position as well. So now you have 4 or 5 dimensions, making for a pretty massive array. Hand-tuning all the values becomes less and less feasible, so I think discovering this idea of rearchitecting it as a neural net was more or less inevitable.

    So NNUE is actually just a pretty natural evolution of what they were doing before.
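
To make the "lookup table is already a linear layer" point concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the table values are random rather than real engine values, squares are assumed to be given from each side's own perspective, and the names (`evaluate_lookup`, `encode`, `evaluate_linear`, `sgd_step`) are hypothetical. It only shows the equivalence described above, not the actual NNUE architecture, which adds king-relative features and hidden layers on top.

```python
import random

NUM_PIECE_TYPES = 6   # pawn, knight, bishop, rook, queen, king
NUM_SQUARES = 64      # the 8x8 board, flattened

# Hypothetical piece-square table with made-up values: pst[piece][square].
random.seed(0)
pst = [[random.uniform(-1.0, 1.0) for _ in range(NUM_SQUARES)]
       for _ in range(NUM_PIECE_TYPES)]


def evaluate_lookup(table, white, black):
    """Classic evaluation: add table values for White, subtract for Black.

    `white` and `black` are lists of (piece_index, square) pairs, with
    squares assumed to be expressed from each side's own perspective.
    """
    score = sum(table[p][sq] for p, sq in white)
    score -= sum(table[p][sq] for p, sq in black)
    return score


def encode(white, black):
    """Turn a position into a 6*64 input vector: +1, -1, or 0 per entry."""
    x = [0.0] * (NUM_PIECE_TYPES * NUM_SQUARES)
    for p, sq in white:
        x[p * NUM_SQUARES + sq] += 1.0
    for p, sq in black:
        x[p * NUM_SQUARES + sq] -= 1.0
    return x


def evaluate_linear(weights, x):
    """The same evaluation written as one big sum with explicit zero terms."""
    return sum(w * xi for w, xi in zip(weights, x))


def sgd_step(weights, x, target, lr=0.05):
    """One gradient-descent step on the squared error (prediction - target)^2."""
    err = evaluate_linear(weights, x) - target
    return [w - lr * 2.0 * err * xi for w, xi in zip(weights, x)]


# Flatten the table into a weight vector and compare the two evaluations.
weights = [v for row in pst for v in row]
white = [(0, 12), (5, 4)]    # a pawn and a king, arbitrary squares
black = [(0, 20), (5, 60)]
x = encode(white, black)
same = abs(evaluate_lookup(pst, white, black) - evaluate_linear(weights, x)) < 1e-9
```

Once the table is written as a weight vector like this, `sgd_step` can nudge the entries toward a target evaluation instead of someone tuning them by hand, which is the step that stops being optional when the table grows extra king-position dimensions.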