Comment by joe_the_user

8 months ago

Is the model more efficient than Stockfish? I think Stockfish runs on a regular CPU, and I'd guess this "270M parameter transformer model" requires a GPU, but I can't find any reference to efficiency in the paper.

Also found in the paper: "While our largest model achieves very good performance, it does not completely close the gap to Stockfish 16". It's actually inferior, but they still think it's an interesting exercise. But that's the thing: it's primarily an exercise, like calculating pi to a billion decimal places or overclocking a gaming laptop.

Well, I think it's interesting to the extent that it optimizes the solution for a different piece of hardware, the TPU. Their results are also applicable to GPUs. Since the problem is highly parallelizable, we might expect a viable model to quickly approximate a more accurate evaluation, and perhaps even make up for the lost accuracy in volume.