MadChess 3.0 Beta Build 075 (Eval Param Tuning)

I ported my particle swarm tuning code from MadChess 2.x to MadChess 3.0 Beta, then simplified and improved it. My code uses the Texel tuning technique described by Peter Österlund in a TalkChess forum post. I improved my particle swarm algorithm in the following manner.

  • Simplified update of evaluation parameters via EvaluationConfig class.
  • Run the Iterate method of each ParticleSwarm on the .NET Core threadpool instead of using dedicated threads.
  • Locate global best particle (among all particle swarms) and update particle velocities after all Iterate methods have completed. This eliminates need to synchronize reads and writes of global state via a lock (to make code thread-safe). Algorithm performs best if number of particle swarms <= number of physical CPU cores.
  • Randomize location of all particles, except the global best, if 10 iterations fail to improve evaluation error. Retaining the global best causes other particles to be drawn back towards it, though their motions are randomized- they’re also drawn towards their individual best and swarm best with random magnitudes, so they jitter in particle-space. They may encounter a new global best on their path towards the last known global best.

In his post, Peter describes how to calibrate evaluation parameters by examining a large collection of games played by strong Grandmasters or engines. By scoring every position in every game, mapping the score (which represents fractional pawn units) to a winning percentage using a sigmoid function, then summing the square of the difference between the winning percentage and the actual result of the game (0 = player-to-move loses, 1/2 = draw, 1 = win), the chess engine’s strength can be improved by minimizing the evaluation error. Peter does not describe how to minimize the error- that effort is left for the chess engine programmer. The minimization effort is challenging because the parameter space is huge. MadChess 3.0 Beta’s evaluation function considers only material and piece square location, and yet, given reasonable minimum and maximum integer values for all evaluation parameters, 1.75 x 1068 discrete parameter combinations are possible.

Skip Code, Go To Results

Particle.cs

How large is 1.75 x 1068? I need add only a few more evaluation parameters for the number of discrete parameter combinations to surpass the number of atoms in the universe.

I majored in physics in college, but it’s been a while since I’ve read math-intensive scientific papers, so rather than implement ADAM or other multivariate, derivative-free gradient descent algorithms (or determine how to plug the MadChess 3.0 Beta evaluation error cost function into a third-party optimization library), I decided to trust the particles. They succeeded in finding better evaluation parameters for MadChess 2.x, and they’ve succeeded again for MadChess 3.0 Beta.

While not a complete listing, here’s code that illustrates my implementation of a multi-threaded particle swarm optimizer.

UciStream.cs

ParticleSwarms.cs

ParticleSwarm.cs

Particle.cs

Evaluation.cs

Tuning Results

I used Chessbase to export all games played between two Grandmasters rated >= 2700 Elo since the year 2000. I realize I could tune MadChess’ evaluation function using games played by stronger engines, such as Stockfish or Komodo. However, I wish to avoid biasing my engine’s playing style towards that of other engines. I’d rather have it emulate a more human playing style. I fed my optimization algorithm the games played by 2700+ Elo Grandmasters since the year 2000 and the particles found new evaluation parameters worth 47 Elo in playing strength.

 

Feature Category Date Rev1 WAC2 Elo Rating3 Improvement
Eval Param Tuning Evaluation 2018 Nov 24 75 272 2143 +47
Sophisticated Search
Material and Piece Location
Baseline 2018 Nov 08 58 269 2096 0
  1. Subversion source code revision
  2. Win At Chess position test, 3 seconds per position
  3. Bullet chess, 2 min / game + 1 sec / move

MadChess 3.0 Beta Build 058 (Baseline)

I’ve reached an important milestone in the development of my new chess engine. MadChess 3.0 Beta can play a timed game of chess.

I copied the search function from MadChess 2.0 but implemented an evaluation function from scratch. The search function is rather sophisticated.

  • Alpha / beta negamax
  • PVS with aspiration windows
  • MVV / LVA move order
  • MultiPV with tracking of all principal variations (in search, not in hash table)
  • Hashtable with score, bounds, and best move
  • Delayed move generation (play best move from hashtable before generating moves)
  • Null move pruning
  • Killer moves
  • Move history with aging (used to sort quiet moves)
  • Late move reductions
  • Futility pruning
  • Time management based on total material and material advantage, with “panic” time loss prevention

The evaluation function is very simple.

  • Material
  • Piece location
  • Draw by repetition, 50 moves without a capture or pawn move, or insufficient material
  • Checkmate (obviously)

Move generation is limited to a single function: all moves are generated. I have not implemented separate move generators for captures-only or check-evasion.

I figure there’s no point for me to repeat the exact process I used to develop MadChess 2.0, where I slowly built up search and evaluation functions in tandem. In MadChess 3.0 Beta, I’ve changed the board representation from mailbox to bitboards. This greatly affects the evaluation logic (no more traversing a piece square table) but not the search logic, so it makes sense to retain the search function as-is. After all, the search code is a product of hundreds of hours of engineering and testing.

It will be interesting to measure the Elo value of evaluation features as I add them to MadChess 3.0 Beta. I am aware through personal experience of the highly non-linear relationship between individual search and evaluation features (with regards to contribution to Elo playing strength), so it’s unlikely I’ll find the same Elo values for evaluation features as I found in MadChess 2.0. Now I’m adding them to an engine with a mature search function. Who knows if they’ll be worth more or less than they were in MadChess 2.0? Stay tuned.

OK, what was the result of all this effort just to make a rather dumb chess engine? To answer this question, I ran a gauntlet tournament, pitting MadChess 3.0 Beta against familiar engines.

45) MadChess 3.0        2096.3 :   2000 (+783,=348,-869),  47.9 %

    vs.                        :  games (   +,   =,   -),   (%) :    Diff,    SD, CFS (%)
    Sungorus 1.4               :    200 (  40,  27, 133),  26.8 :  -213.7,  12.8,    0.0
    Waxman 2017                :    200 (  39,  38, 123),  29.0 :  -158.7,  11.1,    0.0
    Zevra 1.8.4                :    200 (  39,  36, 125),  28.5 :  -138.3,  12.9,    0.0
    Napoleon 1.8               :    200 (  43,  43, 114),  32.3 :  -117.7,  16.9,    0.0
    Galjoen 0.37.2             :    200 (  66,  33, 101),  41.3 :   -73.9,  11.9,    0.0
    BikJump 2.01               :    200 (  69,  44,  87),  45.5 :   -18.0,  14.3,   10.4
    Monarch 1.7                :    200 (  90,  37,  73),  54.3 :   +42.9,  15.1,   99.8
    Gerbil 02                  :    200 ( 122,  27,  51),  67.8 :  +114.0,   6.6,  100.0
    Faile 1.4                  :    200 ( 111,  47,  42),  67.3 :  +115.7,  13.8,  100.0
    TSCP 1.81                  :    200 ( 164,  16,  20),  86.0 :  +334.8,  14.6,  100.0

This establishes a baseline rating.

Feature Category Date Rev1 WAC2 Elo Rating3 Improvement
Sophisticated Search
Material and Piece Location
Baseline 2018 Nov 08 58 269 2096 0
  1. Subversion source code revision
  2. Win At Chess position test, 3 seconds per position
  3. Bullet chess, 2 min / game + 1 sec / move

I will add one evaluation (occasionally search) feature at a time and update this table, so I may track the progress of MadChess 3.0 Beta.

Banks 71st Amateur Series Division 7

MadChess 2.2 participated in Graham Banks’ 71st amateur tournament in division 7.

                                Orio Bets Topp Lozz Jumb Rook MadC Talt Dros Jone Romi Ares  
 1. Orion 0.5 64-bit            #### 11=0 =1== 000= 1011 0=1= 1=11 0==1 ==11 111= =1=1 =11=  65%  28.5 ( 942.0, 594.0)
 2. Betsabe II 1.84             00=1 #### 10=1 1==0 =110 ===1 011= 1=1= 0101 11=1 0111 10==  61%  27.0 ( 948.0, 573.0)
 3. Topple 0.2.1                =0== 01=0 #### 11=1 0=10 =000 ==01 0111 0011 011= 1=11 1111  58%  25.5 ( 954.0, 520.0)
 4. Lozza 1.18 64-bit           111= 0==1 00=0 #### 0=0= ==11 0=1= 0111 1011 11=1 ===0 ==01  57%  25.0 ( 956.0, 542.5)
 5. Jumbo 0.6.51 64-bit         0100 =001 1=01 1=1= #### 101= =0=0 0111 ===1 =00= 0111 =11=  55%  24.0 ( 960.0, 505.5)
 6. RookieMonster 1.6.3 64-bit  1=0= ===0 =111 ==00 010= #### =1=1 10=1 0111 =1=0 =00= 011=  53%  23.5 ( 962.0, 511.3)
 7. MadChess 2.2 64-bit         0=00 100= ==10 1=0= =1=1 =0=0 #### =000 0111 1==1 11=1 =011  51%  22.5 ( 966.0, 464.3)
 8. Taltos rev118 64-bit        1==0 0=0= 1000 1000 1000 01=0 =111 #### 1001 0==1 1001 1111  48%  21.0 ( 972.0, 432.5)
 9. Drosophila 1.5.1 64-bit     ==00 1010 1100 0100 ===0 1000 1000 0110 #### =100 1=01 11=1  43%  19.0 ( 980.0, 396.5)
10. Jonesy 1.0 64-bit           000= 00=0 100= 00=0 =11= =0=1 0==0 1==0 =011 #### 1=0= =111  43%  19.0 ( 980.0, 387.0)
11. RomiChess P3n 64-bit        =0=0 1000 0=00 ===1 1000 =11= 00=0 0110 0=10 0=1= #### 0010  36%  16.0 ( 992.0, 358.0)
12. Ares GB 64-bit              =00= 01== 0000 ==10 =00= 100= =100 0000 00=0 =000 1101 ####  30%  13.0 (1004.0, 292.5)

Games

MadChess 3.0 Beta Build 039 (Bitboards)

For the last month or so, in the evenings and on the weekends, I’ve been writing a new version of MadChess. For this 3.0 version, I’m writing code using bitboards instead of the mailbox board representation I used in MadChess 1.x and 2.x. I considered using C++ and even went as far as purchasing Bjarne Stroustrup’s The C++ Programming Language book and reading the first four chapters. But in the end I decided to stick with C#, the programming language with which I’m most familiar, for a few reasons.

  • Microsoft has been adding high-performance features to C# in recent editions, such as ref locals and ref returns.
  • Microsoft has embraced the open source movement with its .NET Core development platform.
  • C# is fast enough.

Perhaps I’ll consider using C++ for version 4 of MadChess.

OK, back to version 3: At program startup, I pre-calculate moves for sliding pieces using magic hashing. I found paulwal222’s answer to the Sliding Move Generation Using Magic Bitboard topic on Stack Overflow a very clear explanation of the technique. See my code below.

I’ve reached the first important milestone: legal move generation. I’m happy with the performance of my C# bitboard code. On my PC, MadChess 3.0 Beta generates legal moves at a rate of 41 million per second from the starting position (4.7x faster than MadChess 2.2). This includes generating pseudo-legal moves with the minimal requirements of From Square, To Square, and Pawn Promotion Piece, plus other metadata that eventually will be used by the search function; finding pinned pieces; and testing move legality (does move expose own king to check) for pinned pieces.

I’ve successfully passed my suite of test positions– the same positions I used to verify correct legal move generation in MadChess 1.x and 2.x. MadChess 3.0 Beta completes the test suite 4.2x faster than MadChess 2.2.

PS C:\Users\Erik\Documents\Visual Studio 2019\Projects\MadChess\Engine\bin\Publish> .\MadChess.Engine.exe
testpositions "C:\Users\Erik\Documents\Chess\Tests\TestPositions.txt"
Number                                                                     Position  Depth     Expected        Moves  Correct    Pct
======  ===========================================================================  =====  ===========  ===========  =======  =====
     1                     rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1      1           20           20     True  100.0
     2                     rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1      2          400          400     True  100.0
     3                     rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1      3        8,902        8,902     True  100.0
     4                     rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1      4      197,281      197,281     True  100.0
     5                     rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1      5    4,865,609    4,865,609     True  100.0
     6                     rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1      6  119,060,324  119,060,324     True  100.0
     7                                           8/8/1k6/8/2pP4/8/5BK1/8 b - d3 0 1      6      824,064      824,064     True  100.0
     8                                          8/8/1k6/2b5/2pP4/8/5K2/8 b - d3 0 1      6    1,440,467    1,440,467     True  100.0
     9                                          8/5k2/8/2Pp4/2B5/1K6/8/8 w - d6 0 1      6    1,440,467    1,440,467     True  100.0
    10                                               5k2/8/8/8/8/8/8/4K2R w K - 0 1      6      661,072      661,072     True  100.0
    11                                               4k2r/8/8/8/8/8/8/5K2 b k - 0 1      6      661,072      661,072     True  100.0
    12                                               3k4/8/8/8/8/8/8/R3K3 w Q - 0 1      6      803,711      803,711     True  100.0
    13                                               r3k3/8/8/8/8/8/8/3K4 b q - 0 1      6      803,711      803,711     True  100.0
    14                                    r3k2r/1b4bq/8/8/8/8/7B/R3K2R w KQkq - 0 1      4    1,274,206    1,274,206     True  100.0
    15                                    r3k2r/7b/8/8/8/8/1B4BQ/R3K2R b KQkq - 0 1      4    1,274,206    1,274,206     True  100.0
    16                                     r3k2r/8/3Q4/8/8/5q2/8/R3K2R b KQkq - 0 1      4    1,720,476    1,720,476     True  100.0
    17                                     r3k2r/8/5Q2/8/8/3q4/8/R3K2R w KQkq - 0 1      4    1,720,476    1,720,476     True  100.0
    18                                            2K2r2/4P3/8/8/8/8/8/3k4 w - - 0 1      6    3,821,001    3,821,001     True  100.0
    19                                            3K4/8/8/8/8/8/4p3/2k2R2 b - - 0 1      6    3,821,001    3,821,001     True  100.0
    20                                          8/8/1P2K3/8/2n5/1q6/8/5k2 b - - 0 1      5    1,004,658    1,004,658     True  100.0
    21                                          5K2/8/1Q6/2N5/8/1p2k3/8/8 w - - 0 1      5    1,004,658    1,004,658     True  100.0
    22                                               4k3/1P6/8/8/8/8/K7/8 w - - 0 1      6      217,342      217,342     True  100.0
    23                                               8/k7/8/8/8/8/1p6/4K3 b - - 0 1      6      217,342      217,342     True  100.0
    24                                                8/P1k5/K7/8/8/8/8/8 w - - 0 1      6       92,683       92,683     True  100.0
    25                                                8/8/8/8/8/k7/p1K5/8 b - - 0 1      6       92,683       92,683     True  100.0
    26                                                K1k5/8/P7/8/8/8/8/8 w - - 0 1      6        2,217        2,217     True  100.0
    27                                                8/8/8/8/8/p7/8/k1K5 b - - 0 1      6        2,217        2,217     True  100.0
    28                                               8/k1P5/8/1K6/8/8/8/8 w - - 0 1      7      567,584      567,584     True  100.0
    29                                               8/8/8/8/1k6/8/K1p5/8 b - - 0 1      7      567,584      567,584     True  100.0
    30                                            8/8/2k5/5q2/5n2/8/5K2/8 b - - 0 1      4       23,527       23,527     True  100.0
    31                                            8/5k2/8/5N2/5Q2/2K5/8/8 w - - 0 1      4       23,527       23,527     True  100.0
    32         r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1      5  193,690,690  193,690,690     True  100.0
    33                                    8/2p5/3p4/KP5r/1R3p1k/8/4P1P1/8 w - - 0 1      6   11,030,083   11,030,083     True  100.0
    34             r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1      5   15,833,292   15,833,292     True  100.0
    35                rnbqkb1r/pp1p1ppp/2p5/4P3/2B5/8/PPP1NnPP/RNBQK2R w KQkq - 0 1      3       53,392       53,392     True  100.0
    36      r4rk1/1pp1qppp/p1np1n2/2b1p1B1/2B1P1b1/P1NP1N2/1PP1QPPP/R4RK1 w - - 0 1      5  164,075,551  164,075,551     True  100.0
    37                                    8/2p5/3p4/KP5r/1R3p1k/8/4P1P1/8 w - - 0 1      7  178,633,661  178,633,661     True  100.0
    38             r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1      6  706,045,033  706,045,033     True  100.0
    39                    rnbq1k1r/pp1Pbppp/2p5/8/2B5/8/PPP1NnPP/RNBQK2R w KQ - 1 8      5   89,941,194   89,941,194     True  100.0
    40                                            1k6/1b6/8/8/7R/8/8/4K2R b K - 0 1      5    1,063,513    1,063,513     True  100.0
    41                                            3k4/3p4/8/K1P4r/8/8/8/8 b - - 0 1      6    1,134,888    1,134,888     True  100.0
    42                                           8/8/4k3/8/2p5/8/B2P2K1/8 w - - 0 1      6    1,015,133    1,015,133     True  100.0

Counted 1,846,360,249 nodes (31,154,555 nodes per second).

PrecalculatedMoves.cs:

Bitwise.cs:

Banks 70th Amateur Series Division 7

MadChess 2.2 participated in Graham Banks’ 70th amateur tournament in division 7.

                                 1    2    3    4    5    6    7    8    9    0    1    2    
1   Bagatur 1.5e 64-bit          **** 11½½ 0½½½ ½½11 0½½1 1011 ½1½½ 01½½ ½001 01½1 ½110 1½1½  26.5/44  574.00
2   MadChess 2.2 64-bit          00½½ **** ½½00 1½1½ ½0½1 1011 1½0½ 1½½1 111½ 1½1½ ½½½1 110½  26.5/44  554.25
3   Drosophila 1.5.1 64-bit      1½½½ ½½11 **** 1000 ½100 ½½0½ ½½10 01½½ 111½ 0½11 ½110 ½½11  25.0/44
4   Lozza 1.18 64-bit            ½½00 0½0½ 0111 **** 1011 101½ 10½½ ½0½1 1½½1 00½1 11½½ ½0½1  24.0/44  515.75
5   Shallow r694 64-bit          1½½0 ½1½0 ½011 0100 **** ½½1½ 100½ 01½1 0101 ½½11 11½½ 0011  24.0/44  514.25
6   Jumbo 0.6.45 64-bit          0100 0100 ½½1½ 010½ ½½0½ **** ½100 ½101 1½1½ 11½1 11½½ 101½  23.5/44
7   RomiChess P3n 64-bit         ½0½½ 0½1½ ½½01 01½½ 011½ ½011 **** ½½½1 0½1½ 1010 010½ ½101  23.0/44
8   RookieMonster 1.5.13 64-bit  10½½ 0½½0 10½½ ½1½0 10½0 ½010 ½½½0 **** 1000 1000 1½1½ ½101  19.0/44  418.00
9   Waxman 2017                  ½110 000½ 000½ 0½½0 1010 0½0½ 1½0½ 0111 **** ½100 01½1 1½01  19.0/44  406.25
10  Topple 0.2.1 64-bit          10½0 0½0½ 1½00 11½0 ½½00 00½0 0101 0111 ½011 **** ½1½0 ½000  18.0/44  394.75
11  Nemeton 1.7                  ½001 ½½½0 ½001 00½½ 00½½ 00½½ 101½ 0½0½ 10½0 ½0½1 **** 11½1  18.0/44  390.75
12  Galjoen 0.39.2 64-bit        0½0½ 001½ ½½00 ½1½0 1100 010½ ½010 ½010 0½10 ½111 00½0 ****  17.5/44

Games