I ported my particle swarm tuning code from MadChess 2.x to MadChess 3.0 Beta, then simplified and improved it. My code uses the Texel tuning technique described by Peter Österlund in a TalkChess forum post. I improved my particle swarm algorithm in the following manner.
- Simplified update of evaluation parameters via EvaluationConfig class.
- Run the Iterate method of each ParticleSwarm on the .NET Core threadpool instead of using dedicated threads.
- Locate global best particle (among all particle swarms) and update particle velocities after all Iterate methods have completed. This eliminates need to synchronize reads and writes of global state via a lock (to make code thread-safe). Algorithm performs best if number of particle swarms <= number of physical CPU cores.
- Randomize the location of all particles, except the global best, if 10 iterations fail to improve evaluation error. Retaining the global best causes other particles to be drawn back towards it, though their motions are randomized: they’re also drawn towards their individual best and swarm best with random magnitudes, so they jitter in particle-space. They may encounter a new global best on their path towards the last known global best.
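The structure described in the list above can be sketched as follows. This is a minimal illustration in Python, not MadChess's C# code: the names `Particle`, `Swarm`, `optimize`, `error_fn`, and the `inertia`, `pull`, and `patience` values are all assumptions made for the example. Each swarm's `iterate` runs on a thread pool and touches only its own particles; the global best is located, stagnation is checked, and velocities are updated on one thread after every `iterate` call has finished, so no lock is required. Python threads will not speed up a pure-Python error function the way .NET threadpool threads do; the pool is here only to show the shape of the algorithm.

```python
import random
from concurrent.futures import ThreadPoolExecutor


class Particle:
    def __init__(self, bounds, rng):
        self.bounds, self.rng = bounds, rng
        self.velocity = [0.0] * len(bounds)
        self.position = [rng.uniform(lo, hi) for lo, hi in bounds]
        self.best_position, self.best_error = list(self.position), float("inf")

    def randomize(self):
        # New random location; the remembered individual best is kept.
        self.position = [self.rng.uniform(lo, hi) for lo, hi in self.bounds]
        self.velocity = [0.0] * len(self.bounds)

    def step(self, error_fn):
        # Move (clamped to the bounds), then score the new location.
        for i, (lo, hi) in enumerate(self.bounds):
            self.position[i] = min(hi, max(lo, self.position[i] + self.velocity[i]))
        error = error_fn(self.position)
        if error < self.best_error:
            self.best_position, self.best_error = list(self.position), error

    def update_velocity(self, swarm_best, global_best, inertia=0.7, pull=1.0):
        # Random-magnitude pulls towards the individual, swarm and global bests.
        for i, x in enumerate(self.position):
            self.velocity[i] = (
                inertia * self.velocity[i]
                + pull * self.rng.random() * (self.best_position[i] - x)
                + pull * self.rng.random() * (swarm_best.best_position[i] - x)
                + pull * self.rng.random() * (global_best.best_position[i] - x)
            )


class Swarm:
    def __init__(self, size, bounds, seed):
        rng = random.Random(seed)  # per-swarm generator, no shared random state
        self.particles = [Particle(bounds, rng) for _ in range(size)]

    def best(self):
        return min(self.particles, key=lambda p: p.best_error)

    def iterate(self, error_fn):
        # Touches only this swarm's particles, so it needs no lock.
        for particle in self.particles:
            particle.step(error_fn)


def optimize(error_fn, bounds, swarms=4, size=10, iterations=200, patience=10, seed=0):
    # Assumes iterations >= 1.
    all_swarms = [Swarm(size, bounds, seed + i) for i in range(swarms)]
    best_error, stale, global_best = float("inf"), 0, None
    with ThreadPoolExecutor(max_workers=swarms) as pool:
        for _ in range(iterations):
            # Run every swarm's iterate step on the pool; list() waits for all of them.
            list(pool.map(lambda s: s.iterate(error_fn), all_swarms))
            # From here on only this thread runs, so global state is safe to touch.
            global_best = min((s.best() for s in all_swarms), key=lambda p: p.best_error)
            if global_best.best_error < best_error:
                best_error, stale = global_best.best_error, 0
            else:
                stale += 1
            if stale >= patience:
                # Stagnation: scatter every particle except the global best.
                for s in all_swarms:
                    for p in s.particles:
                        if p is not global_best:
                            p.randomize()
                stale = 0
            for s in all_swarms:
                swarm_best = s.best()
                for p in s.particles:
                    p.update_velocity(swarm_best, global_best)
    return global_best.best_position, global_best.best_error
```

Calling `optimize(lambda x: sum((v - 3.0) ** 2 for v in x), [(-10.0, 10.0)] * 3)` minimizes a toy error function over three parameters. In a real tuner, `error_fn` would apply the parameter vector to the evaluation function and return the evaluation error over the game database.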
In his post, Peter describes how to calibrate evaluation parameters by examining a large collection of games played by strong Grandmasters or engines. Score every position in every game, map the score (which represents fractional pawn units) to a winning percentage using a sigmoid function, then sum the squares of the differences between the winning percentage and the actual result of the game (0 = player-to-move loses, 1/2 = draw, 1 = win). The sum is the evaluation error, and minimizing it improves the chess engine’s strength. Peter does not describe how to minimize the error; that effort is left for the chess engine programmer. The minimization effort is challenging because the parameter space is huge. MadChess 3.0 Beta’s evaluation function considers only material and piece square location, and yet, given reasonable minimum and maximum integer values for all evaluation parameters, 1.75 x 10^68 discrete parameter combinations are possible.
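The evaluation error described above can be sketched in a few lines. This is an illustration in Python under stated assumptions, not MadChess's code: the function names are hypothetical, and the base-10 logistic curve with a scaling constant `k` is one commonly used form of the sigmoid for this technique (the divisor 4 corresponds to the usual 400 when scores are in centipawns). The text above does not say which sigmoid or constant MadChess uses.

```python
def win_probability(score_pawns, k=1.0):
    """Map a score in pawn units (side to move's view) to an expected result in [0, 1].

    Assumed form: base-10 logistic curve; k is a tunable scaling constant.
    """
    return 1.0 / (1.0 + 10.0 ** (-k * score_pawns / 4.0))


def evaluation_error(scored_positions, k=1.0):
    """Sum of squared differences between predicted and actual results.

    scored_positions: iterable of (score_pawns, result) pairs, where result is
    0 (player-to-move loses), 0.5 (draw) or 1 (player-to-move wins).
    """
    return sum(
        (result - win_probability(score, k)) ** 2
        for score, result in scored_positions
    )
```

A dead-even score in a drawn game contributes nothing to the error (`evaluation_error([(0.0, 0.5)])` is `0.0`), while a confident score in a game that was then lost contributes close to 1. Dividing the sum by the number of positions gives a mean error that is comparable across databases of different sizes.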
How large is 1.75 x 10^68? I need to add only a few more evaluation parameters for the number of discrete parameter combinations to surpass the number of atoms in the universe.
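The size of such a search space is the product of the number of integer values each parameter may take. A minimal sketch in Python follows; the function name and the ranges in the usage note are made up for illustration and are not MadChess's actual parameter limits.

```python
import math


def count_combinations(ranges):
    """Number of discrete parameter vectors, given inclusive (min, max) integer ranges."""
    total = 1
    for lo, hi in ranges:
        total *= hi - lo + 1  # count of integers in [lo, hi]
    return total


def order_of_magnitude(ranges):
    """log10 of the number of combinations, computed without building the huge product."""
    return sum(math.log10(hi - lo + 1) for lo, hi in ranges)
```

For example, three hypothetical parameters that each range over 0..9 give `count_combinations([(0, 9)] * 3) == 1000`. Every added parameter multiplies the total by the size of its range, which is why a handful of extra parameters adds many orders of magnitude.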
I majored in physics in college, but it’s been a while since I’ve read math-intensive scientific papers, so rather than implement ADAM or another multivariate optimization algorithm (or determine how to plug the MadChess 3.0 Beta evaluation error cost function into a third-party optimization library), I decided to trust the particles. They succeeded in finding better evaluation parameters for MadChess 2.x, and they’ve succeeded again for MadChess 3.0 Beta.
While not a complete listing, the following source files illustrate my implementation of a multi-threaded particle swarm optimizer:
- UciStream.cs
- ParticleSwarms.cs
- ParticleSwarm.cs
- Particle.cs
- Evaluation.cs
Tuning Results
I used ChessBase to export all games played between two Grandmasters rated >= 2700 Elo since the year 2000. I realize I could tune MadChess’ evaluation function using games played by stronger engines, such as Stockfish or Komodo. However, I wish to avoid biasing my engine’s playing style towards that of other engines; I’d rather have it emulate a more human playing style. I fed these games to my optimization algorithm, and the particles found new evaluation parameters worth 47 Elo in playing strength.
Feature | Category | Date | Rev [1] | WAC [2] | Elo Rating [3] | Improvement |
---|---|---|---|---|---|---|
Eval Param Tuning | Evaluation | 2018 Nov 24 | 75 | 272 | 2143 | +47 |
Sophisticated Search, Material and Piece Location | Baseline | 2018 Nov 08 | 58 | 269 | 2096 | 0 |

[1] Subversion source code revision
[2] Win At Chess position test, 3 seconds per position
[3] Bullet chess, 2 min / game + 1 sec / move
Perhaps the best way to calibrate the parameters is to use a high-quality database of long time-control, over-the-board games, excluding Rapid, Blitz, Blindfold, Consultation, Internet, Radio, Telegraph, Exhibition and Simultaneous games, similar to Norman Pollock’s databases.
I’m not sure, but it seems that you used some blitz chess games.
That’s an excellent point, Pablo. I focused on writing the particle swarm algorithm and treated the games database almost as an afterthought. I mean, I made sure to find games played between evenly matched super GMs, but I didn’t make an effort to filter out games played at fast time controls or non-tournament conditions.
One of the reasons I wrote the tuning code so early in the process of developing MadChess 3.0 is that I want to tune the evaluation parameters regularly. Each new evaluation feature unbalances the engine, so I intend to tune its evaluation every two or three features.
Thanks for pointing out the inconsistent quality of the games. I’ll be sure to use a higher quality database next time.