Comment by tzs

7 hours ago

OT: what's the state of the art in non-GM level computer chess?

Say I want to play chess with an opponent that is at about the same skill level as me, or perhaps I want to play with an opponent about 100 rating points above me for training.

Most engines let you dumb them down by cutting search depth, but that usually doesn't work well. Sure, you end up beating them about half the time if you cut the search down enough, but it generally feels like they were still outplaying you for much of the game and you only won because they made one or two blunders.

What I want is a computer opponent that plays at a level of my choosing but plays a game that feels like that of a typical human player of that level.

Are there such engines?

Maia does this reasonably well! You can play against it on Lichess. I have gotten a few "feels like a human" moments when playing against it - for example, getting it to fall into a trap that could trick a human but would easily be seen by a traditional search algorithm. It's not adjustable but there are a few different versions with different ratings (although it's not a very wide range).

https://www.maiachess.com/

https://lichess.org/@/maia1

  • Piggy-backing off this - does anyone know of a quick way to evaluate the maia weights from python or js for a single board state? I'm trying to hack something together with my own search func intended for human play and I can't quite figure it out from the cpp in Lc0.
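One route that avoids reading the C++ is to treat Lc0 as a black box and talk to it over UCI from Python. This is a minimal sketch, not the Maia authors' method: it assumes an `lc0` binary is installed and that you have a Maia weights file, and `query_engine` plus the file name in the usage note are hypothetical placeholders. As far as I know Maia is meant to be run with a one-node limit, so the move comes straight from the network rather than from search. It returns only the top move, not the full policy you would want for your own search.

```python
import subprocess
from typing import List, Optional


def uci_commands(fen: str) -> List[str]:
    """UCI commands that ask for a move in one position with no real search."""
    return [
        "uci",
        "isready",
        f"position fen {fen}",
        "go nodes 1",  # one node: the reply reflects the network, not a tree search
    ]


def parse_bestmove(output: str) -> Optional[str]:
    """Return the move from the first 'bestmove' line in UCI output, if any."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "bestmove":
            return parts[1]
    return None


def query_engine(cmd: List[str], fen: str) -> Optional[str]:
    """Hypothetical helper: start the engine, send the commands, read the reply."""
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    move = None
    try:
        for command in uci_commands(fen):
            proc.stdin.write(command + "\n")
        proc.stdin.flush()
        for line in proc.stdout:
            move = parse_bestmove(line)
            if move is not None:
                break
    finally:
        proc.kill()  # crude shutdown, acceptable for a sketch
        proc.wait()
    return move
```

Usage would look something like `query_engine(["lc0", "--weights=maia-1100.pb.gz"], fen)`, where the weights path is a placeholder for whichever Maia file you downloaded.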

I built something like this. It works as long as you're not too high-rated: chessmate.ai. Once players get higher rated it is more difficult to predict their moves because you need to model their search process, not just their intuitive move choice. It's also possible to train on one player's games only so that it is more personalized.

It uses a similar approach to Maia but with a different neural network, so it has slightly better move-matching performance. On top of that it has an expectation maximization algorithm so that the bot will try to exploit your mistakes.

  • Really nice work! The tabs other than "play" don't seem to be working, but I was able to try some novelty openings and it certainly felt like it was responding with human moves. It would be great to have the ability to go back/forth moves to try out different variations.

    I'm curious how you combined Stockfish with your own model - but no worries if you're keeping the secret sauce a secret. All the best to you in building out this app!

A long time ago I had the Fritz engine from chessbase. It had a sparring feature where if you maintained good play it would give up a tactical puzzle in the middle of the game. It could either warn you or not. If you didn't play solidly enough, you would just lose.

As far as I can tell, they got rid of this feature. It was the only computer opponent that felt real. Like it made a human mistake when put under pressure, rather than just playing like a computer and randomly deciding to play stupid.

What’s your rating? Have you tried GPT-4o?

It’s supposedly good up to about 1300, and beyond that, prompting makes the style of play somewhat tunable (for example aggressive or defensive).

  • Do you know if there are any interfaces to play against GPT-4o? Or is it just typing in algebraic moves back and forth?

    • It prints an ASCII board, and then yes, you type algebraic moves back and forth and it updates the board for you each turn.

No, not with adjustable rating. The best human-like engine is fairymax, but its Elo is estimated at between 1700 and 2000.

I'm currently trying to build one, fwiw.

  • Cool! I've been wondering for a while whether it would be possible to use lichess games at various rating levels to make an engine play the mistakes typical of each level.

    I'm also curious whether it would be possible to mimic certain playing styles. Two beginners can have the same rating, but one might lose because they have a weak opening and the other because they mess up the endgame, for example.

    Random mistakes don't mimic human play very well.

GPT-3.5-turbo-instruct has a peak Elo of around 1800 (but of course can be prompted to play with less skill) and is as human-like as you'll get at that level.

It doesn't seem that difficult to pull off - take one of the existing engines, get the top y moves, choose randomly. For each level down increase y by 1.

  • No, it doesn't work at all. Human mistakes are not at all like computer mistakes. Like -- blundering a piece in a 1-2 move combination will straight up never show up in the stockfish move tree, no matter what you set `y` to.

  • It doesn't work that way. There are many positions with lots of moves that are reasonable, but many others with only 1-2 sensible moves. It would make lots of obvious blunders that an amateur human would never make.

    • Also attention. Lower level human players are more likely to make a move close to their own/their opponent's recent move. They're focused on one area of the board.

      Basic computer opponents on the other hand can make moves all over the place. They look at the board state holistically. This can be very frustrating to play against as a human who has enough problems just thinking their way through some subset of the board, but is thrown off by the computer again and again.

      It's not that bad in chess at least (compared to Go), but it's still worth keeping in mind if you're trying to make an AI that is fun to play against as an amateur.
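To make the attention point concrete, here is a rough sketch of biasing move choice toward the square of the most recent move. Everything in it is an assumption for illustration: the names `locality_weights` and `pick_local`, the Chebyshev (king-move) distance, and the `falloff` value are made up here, not taken from any engine.

```python
import random
from typing import List, Tuple


def square_to_coords(square: str) -> Tuple[int, int]:
    """Convert a square like 'e4' to zero-based (file, rank) coordinates."""
    return ord(square[0]) - ord("a"), int(square[1]) - 1


def chebyshev(a: str, b: str) -> int:
    """King-move distance between two squares."""
    (ax, ay), (bx, by) = square_to_coords(a), square_to_coords(b)
    return max(abs(ax - bx), abs(ay - by))


def locality_weights(
    candidates: List[str], last_move_to: str, falloff: float = 0.7
) -> List[float]:
    """Weight each UCI move (e.g. 'g1f3') by how close it is to the last move.

    A move counts as close if either its origin or its destination square is
    near last_move_to. The weights are normalised to sum to 1.
    """
    raw = []
    for move in candidates:
        distance = min(
            chebyshev(move[:2], last_move_to),
            chebyshev(move[2:4], last_move_to),
        )
        raw.append(falloff ** distance)
    total = sum(raw)
    return [w / total for w in raw]


def pick_local(candidates: List[str], last_move_to: str, rng=random) -> str:
    """Sample one move, favouring the area around the last move."""
    weights = locality_weights(candidates, last_move_to)
    return rng.choices(candidates, weights=weights, k=1)[0]
```

For example, after a move landing on e5, `locality_weights(["g1f3", "a2a3"], "e5")` gives the knight move (which ends two squares from e5) a larger weight than the rook-pawn push on the far side of the board. In practice you would combine this with some measure of move quality instead of using it alone.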

  • Seems this might still have the problem of moves being either extremely good or extremely bad depending on how many good moves are found, rather than playing at a consistent level. Or for example in a degenerate case where there are only two moves and one leads to mate, the computer will be picking randomly.
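The top-`y` proposal and the degenerate case above are easy to see in a few lines. This is a sketch under stated assumptions: the moves and centipawn scores are invented, and `top_y_choice` is a made-up name standing in for "ask an engine for its top `y` moves and pick one uniformly".

```python
import random
from typing import List, Tuple


def top_y_choice(scored_moves: List[Tuple[str, int]], y: int, rng=random) -> str:
    """Pick uniformly at random among the y best-scoring moves.

    scored_moves holds (move, score) pairs, with the score in centipawns
    from the point of view of the side to move (higher is better).
    """
    ranked = sorted(scored_moves, key=lambda pair: pair[1], reverse=True)
    return rng.choice(ranked[:y])[0]


# Degenerate position: only two legal moves, one mates and one hangs the queen.
two_move_position = [("h5f7", 100_000), ("h5h4", -900)]

rng = random.Random(0)
picks = [top_y_choice(two_move_position, y=2, rng=rng) for _ in range(1000)]
blunders = picks.count("h5h4")  # roughly half the picks throw the game away
```

With `y=1` the function always returns the mating move, and with `y=2` it is a coin flip between mate and a lost queen, so the playing strength swings with the position rather than staying at a consistent level.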