I ported my particle swarm tuning code from MadChess 2.x to MadChess 3.0 Beta, then simplified and improved it. My code uses the Texel tuning technique described by Peter Österlund in a TalkChess forum post. I improved my particle swarm algorithm in the following manner.
- Simplified update of evaluation parameters via EvaluationConfig class.
- Run the Iterate method of each ParticleSwarm on the .NET Core threadpool instead of using dedicated threads.
- Locate global best particle (among all particle swarms) and update particle velocities after all Iterate methods have completed. This eliminates need to synchronize reads and writes of global state via a lock (to make code thread-safe). Algorithm performs best if number of particle swarms <= number of physical CPU cores.
- Randomize location of all particles, except the global best, if 10 iterations fail to improve evaluation error. Retaining the global best causes other particles to be drawn back towards it, though their motions are randomized- they’re also drawn towards their individual best and swarm best with random magnitudes, so they jitter in particle-space. They may encounter a new global best on their path towards the last known global best.
In his post, Peter describes how to calibrate evaluation parameters by examining a large collection of games played by strong Grandmasters or engines. By scoring every position in every game, mapping the score (which represents fractional pawn units) to a winning percentage using a sigmoid function, then summing the square of the difference between the winning percentage and the actual result of the game (0 = player-to-move loses, 1/2 = draw, 1 = win), the chess engine’s strength can be improved by minimizing the evaluation error. Peter does not describe how to minimize the error- that effort is left for the chess engine programmer. The minimization effort is challenging because the parameter space is huge. MadChess 3.0 Beta’s evaluation function considers only material and piece square location, and yet, given reasonable minimum and maximum integer values for all evaluation parameters, 1.75 x 1068 discrete parameter combinations are possible.
How large is 1.75 x 1068? I need add only a few more evaluation parameters for the number of discrete parameter combinations to surpass the number of atoms in the universe.
I majored in physics in college, but it’s been a while since I’ve read math-intensive scientific papers, so rather than implement ADAM or other multivariate, derivative-free gradient descent algorithms (or determine how to plug the MadChess 3.0 Beta evaluation error cost function into a third-party optimization library), I decided to trust the particles. They succeeded in finding better evaluation parameters for MadChess 2.x, and they’ve succeeded again for MadChess 3.0 Beta.
While not a complete listing, here’s code that illustrates my implementation of a multi-threaded particle swarm optimizer.
I used Chessbase to export all games played between two Grandmasters rated >= 2700 ELO since the year 2000. I realize I could tune MadChess’ evaluation function using games played by stronger engines, such as Stockfish or Komodo. However, I wish to avoid biasing my engine’s playing style towards that of other engines. I’d rather have it emulate a more human playing style. I fed my optimization algorithm the games played by 2700+ ELO Grandmasters since the year 2000 and the particles found new evaluation parameters worth 47 ELO in playing strength.
|Eval Param Tuning||Evaluation||2018 Nov 24||75||272||2143||+47|
Material and Piece Location
|Baseline||2018 Nov 08||58||269||2096||0|
- Subversion source code revision
- Win At Chess position test, 3 seconds per position
- Bullet chess, 2 min / game + 1 sec / move