Comment by janalsncm

8 months ago

A sufficiently large NN can learn an arbitrary function, yes. But Stockfish is also theoretically perfect given infinite computational resources.

What is interesting is performing well under reasonable computational constraints, i.e. doing it faster / with fewer FLOPs than Stockfish.
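As a back-of-envelope illustration of that comparison (a minimal sketch; every number below except the 270M parameter count is my own assumption, not from the paper):

```python
# Rough FLOP comparison: a dense transformer forward pass costs about
# 2 * params FLOPs per token; Stockfish's cost per move is roughly
# (nodes searched) * (ops per node). All constants here are guesses.

PARAMS = 270e6                 # model size from the paper
FLOPS_PER_TOKEN = 2 * PARAMS   # standard 2N estimate for a dense forward pass
TOKENS_PER_POSITION = 80       # assumed FEN-like encoding length

model_flops = FLOPS_PER_TOKEN * TOKENS_PER_POSITION  # one position evaluation

NODES_PER_MOVE = 1e7           # assumed Stockfish node count per move
OPS_PER_NODE = 1e4             # assumed cost of move-gen + eval per node

stockfish_ops = NODES_PER_MOVE * OPS_PER_NODE

print(f"model:     ~{model_flops:.2e} FLOPs per position")  # ~4.3e10
print(f"stockfish: ~{stockfish_ops:.2e} ops per move")      # ~1.0e11
```

Under these (very rough) assumptions the per-decision compute lands within an order of magnitude of each other; the real question is how efficiently each workload maps onto its hardware.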

Is the model more efficient than Stockfish? I think Stockfish runs on a regular CPU, and I'd guess this "270M parameter transformer model" requires a GPU, but I can't find any reference to efficiency in the paper.

Also found in the paper: "While our largest model achieves very good performance, it does not completely close the gap to Stockfish 16". It's actually inferior, but they still think it's an interesting exercise. And that's the thing: it's primarily an exercise, like calculating pi to a billion decimal places or overclocking a gaming laptop.

  • Well, I think it's interesting to the extent that it optimizes the solution for a different piece of hardware, the TPU. Their results also carry over to GPUs. Since the problem is highly parallelizable, we might expect a viable model to evaluate many positions at once, quickly approximating a more accurate evaluation and perhaps making up in volume what it loses per position (see the sketch below).
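To make the "make it up in volume" point concrete, here is a minimal sketch of batched position evaluation on an accelerator. The tiny model and the one-token-per-square encoding are stand-ins of my own, not the paper's architecture: the point is only that one forward pass scores thousands of candidate positions, so per-position cost falls with batch size.

```python
import torch

BOARD_TOKENS, VOCAB, DIM = 64, 13, 256  # assumed: one token per square

class TinyEval(torch.nn.Module):
    """Toy transformer evaluator: board tokens in, scalar value out."""
    def __init__(self):
        super().__init__()
        self.embed = torch.nn.Embedding(VOCAB, DIM)
        self.encoder = torch.nn.TransformerEncoder(
            torch.nn.TransformerEncoderLayer(DIM, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.head = torch.nn.Linear(DIM, 1)  # scalar evaluation per position

    def forward(self, boards):               # boards: (batch, 64) int tensor
        x = self.encoder(self.embed(boards))
        return self.head(x.mean(dim=1)).squeeze(-1)

model = TinyEval().eval()
batch = torch.randint(0, VOCAB, (4096, BOARD_TOKENS))  # 4096 candidate positions
with torch.no_grad():
    values = model(batch)  # one batched forward pass scores all 4096 positions
print(values.shape)        # torch.Size([4096])
```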