Comment by actionfromafar

8 months ago

I wonder if one could make a neural net play in a human-like way, at various levels, for instance by training smaller or larger nets. And by human-like, I don't mean Elo rating, but something more like a Turing test: "does this feel like playing against a human?"

I wonder how many time-annotated chess play logs are out there. (Between humans, I mean.)

I suppose varying the neural net size wouldn't be the best way of doing that; very small nets can behave in very "un-human-like" ways. I'm not an expert in reinforcement learning, but that's typically the case in other areas of deep learning.

I think that, to simulate weaker human-like players, it would be better to just increase the temperature: don't always select the best move, at every step just select one of the top 10. Weight each candidate by some function of the model-predicted probability that it is "the best" move, e.g. a power of that probability: very large powers always give the best move (the strongest player), while powers close to 0 choose nearly uniformly at random (the weakest player). The only thing I'm not certain about: if you train the original network well enough, stupid blunders (the kind a very bad human player like me would make) may be scored so low that this algorithm will never pick them up. The only way to know would be to try.
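As a minimal sketch of the power-based sampling described above (the move names and probabilities are hypothetical, just for illustration):

```python
import random

def sample_move(move_probs, power, k=10):
    """Pick a move from the top-k candidates, weighted by probability**power.

    power -> large: effectively always the top move (strongest player).
    power -> 0: nearly uniform over the top k (weakest player).
    """
    # Keep only the k most probable moves.
    top = sorted(move_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    moves = [m for m, _ in top]
    weights = [p ** power for _, p in top]
    return random.choices(moves, weights=weights, k=1)[0]

# Hypothetical model output: probability that each move is "the best".
probs = {"e4": 0.55, "d4": 0.30, "Nf3": 0.10, "c4": 0.04, "a3": 0.01}

sample_move(probs, power=50.0)  # near-deterministic, strongest play
sample_move(probs, power=0.1)   # close to uniform, weakest play
```

Temperature sampling in the usual sense corresponds to `power = 1/T`, so this is the same knob under a different parameterization.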

  • > don't always select the best move, at every step just select one of the top 10

    Engines already do this when you turn down their skill level. It does not lead to human-like play.

    The problem is that bad (or just non-expert) human players don't make completely random mistakes. They tend to make very specific types of mistakes. For example, they may miss certain types of tactics, or underestimate king safety, or forget about hanging pieces.

    In order to make a bot that feels like a human, you need to somehow capture the specific weaknesses that human players have.

Maia chess did this. It works OK for low to mid Elo levels (around 50% move-prediction accuracy). But their project also didn't use any search; it directly predicted the move. Humans actually do perform search, so a more accurate model at higher Elos will probably need to do something like that. However, humans don't do a complete game-tree search like Stockfish, and we don't do full-game rollouts like lc0 either.