Comment by janalsncm

8 months ago

A sufficiently large NN can learn an arbitrary function, yes. But Stockfish is also theoretically perfect given infinite computational resources.

What is interesting is performing well under reasonable computational constraints, i.e. doing it faster / with fewer FLOPs than Stockfish.
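As a back-of-envelope illustration of that comparison (a minimal sketch; every number below except the 270M parameter count is my own assumption, not from the paper):

```python
# Rough FLOP comparison: a dense transformer forward pass costs about
# 2 * params FLOPs per token; Stockfish's cost per move is roughly
# (nodes searched) * (ops per node). All constants here are guesses.

PARAMS = 270e6                 # model size from the paper
FLOPS_PER_TOKEN = 2 * PARAMS   # standard 2N estimate for a dense forward pass
TOKENS_PER_POSITION = 80       # assumed FEN-like encoding length

model_flops = FLOPS_PER_TOKEN * TOKENS_PER_POSITION  # one position evaluation

NODES_PER_MOVE = 1e7           # assumed Stockfish node count per move
OPS_PER_NODE = 1e4             # assumed cost of move-gen + eval per node

stockfish_ops = NODES_PER_MOVE * OPS_PER_NODE

print(f"model:     ~{model_flops:.2e} FLOPs per position")  # ~4.3e10
print(f"stockfish: ~{stockfish_ops:.2e} ops per move")      # ~1.0e11
```

Under these (very rough) assumptions the per-decision compute lands within an order of magnitude of each other; the real question is how efficiently each workload maps onto its hardware.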

Is the model more efficient than Stockfish? I think Stockfish runs on a regular CPU, and I'd guess this "270M parameter transformer model" requires a GPU, but I can't find any reference to efficiency in the paper.

Also found in the paper: "While our largest model achieves very good performance, it does not completely close the gap to Stockfish 16". It's actually inferior, but they still think it's an interesting exercise. And that's the thing: it's primarily an exercise, like calculating pi to a billion decimal places or overclocking a gaming laptop.

  • Well, I think it's interesting to the extent that it optimizes the solution for a different piece of hardware, the TPU. Their results also carry over to GPUs. Since the problem is highly parallelizable, we might expect a viable model to evaluate many positions at once, quickly approximating a more accurate evaluation and perhaps making up in volume what it loses per position (see the sketch below).
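To make the "make it up in volume" point concrete, here is a minimal sketch of batched position evaluation on an accelerator. The tiny model and the one-token-per-square encoding are stand-ins of my own, not the paper's architecture: the point is only that one forward pass scores thousands of candidate positions, so per-position cost falls with batch size.

```python
import torch

BOARD_TOKENS, VOCAB, DIM = 64, 13, 256  # assumed: one token per square

class TinyEval(torch.nn.Module):
    """Toy transformer evaluator: board tokens in, scalar value out."""
    def __init__(self):
        super().__init__()
        self.embed = torch.nn.Embedding(VOCAB, DIM)
        self.encoder = torch.nn.TransformerEncoder(
            torch.nn.TransformerEncoderLayer(DIM, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.head = torch.nn.Linear(DIM, 1)  # scalar evaluation per position

    def forward(self, boards):               # boards: (batch, 64) int tensor
        x = self.encoder(self.embed(boards))
        return self.head(x.mean(dim=1)).squeeze(-1)

model = TinyEval().eval()
batch = torch.randint(0, VOCAB, (4096, BOARD_TOKENS))  # 4096 candidate positions
with torch.no_grad():
    values = model(batch)  # one batched forward pass scores all 4096 positions
print(values.shape)        # torch.Size([4096])
```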