Comment by salamo
8 months ago
I think this is an interesting finding from a practical perspective. A function that can reliably approximate Stockfish at a certain depth could replace it, basically "compressing" search down to that depth. And unlike NNUE, which is optimized for CPU, a large neural network is highly parallelizable on GPU, meaning you could send all possible future positions (at depth N) through the network in one batch and use the results for a primitive tree search.
The Stockfish installer is ~45 MB. At 16 bits per parameter, the 270M-parameter model would be over 500 MB. The 9M model would be smaller than Stockfish, but you could probably find a smaller chess engine that achieves 2000 Elo.
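The size arithmetic, as a quick sketch (assuming fp16, i.e. 2 bytes per parameter):

```python
# Back-of-the-envelope model sizes at 16 bits (2 bytes) per parameter.
def size_mb(params: float, bytes_per_param: int = 2) -> float:
    return params * bytes_per_param / 1e6

print(size_mb(270e6))  # 270M params -> 540.0 MB, well over Stockfish's ~45 MB
print(size_mb(9e6))    # 9M params   -> 18.0 MB, comfortably under it
```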
Dedicated chess computers were hitting 2000 Elo with an 8-bit 6502 running at <10 MHz in the late 1980s. The Novag Super Expert had 96 KB of ROM, which also included the opening book, so yeah, quite a bit smaller.
https://schach-computer.info/wiki/index.php?title=Novag_Supe...
https://www.schach-computer.info/wiki/index.php?title=Mephis...
The advantage of this approach is that we can run many simultaneous evaluations on the GPU/TPU. Instead of using maybe a dozen CPU threads, we can approximate the value of a few thousand positions at the same time.
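A minimal sketch of that batched-leaf idea in numpy. Everything here is made up for illustration: the "value net" is a single random weight matrix, not a real chess model, and the 768-feature board encoding (12 piece planes × 64 squares) is just one plausible choice. The point is only that scoring thousands of depth-N leaves collapses into one matrix multiply, which is exactly the shape of work GPUs/TPUs are good at:

```python
import numpy as np

FEATURES = 768  # hypothetical encoding: 12 piece planes x 64 squares, flattened

rng = np.random.default_rng(0)
W = rng.standard_normal((FEATURES, 1))  # stand-in for trained network weights


def evaluate_batch(positions: np.ndarray) -> np.ndarray:
    """Score (batch, FEATURES) encoded boards in one shot -> (batch,) values."""
    return (positions @ W).ravel()


# Score a few thousand hypothetical leaf positions in a single call,
# then pick the most promising one -- a primitive one-ply "search".
leaves = rng.standard_normal((4096, FEATURES))
scores = evaluate_batch(leaves)
best = int(np.argmax(scores))
```

On a real accelerator the batch dimension is where the parallelism comes from: going from a dozen leaves to a few thousand barely changes the wall-clock time of the matrix multiply.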