Tournaments

I run chess engine tournaments using the Cute Chess GUI.

Structure

  • Tournament
    • Regular
      • 12 Engines per Division
      • Type = Round Robin
      • Rounds = 100
      • Games Per Encounter = 2
      • Play Each Opening = 2 Times
      • Swap Sides = Checked
      • Games per Engine = 2,200 (13,200 Games per Tournament)
    • Cross-Division
      • 6 Engines
        • 3 from Upper Division
        • 3 from Lower Division
      • Type = Round Robin
      • Rounds = 100
      • Games Per Encounter = 2
      • Play Each Opening = 2 Times
      • Swap Sides = Checked
      • Games per Engine = 1,000 (3,000 Games per Tournament)
  • Time Control
    • Moves = Whole Game
    • Bullet = 2 min/game + 1 sec/move
    • Blitz =  5 min/game + 3 sec/move
    • Rapid = 14 min/game + 7 sec/move
  • Opening Suite
    • PGN / EPD = Arasan.pgn
    • Depth = 99 Plies
    • Order = Random
  • Opening Book = None
  • Draw Adjudication (Move Number) = Off
  • Resign Adjudication (Move Number) = Off
  • Thinking on Opponent’s Time = Unchecked
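
The game counts listed for each tournament type follow from round-robin arithmetic: every engine meets every other engine, each pairing is repeated once per round, and each encounter consists of two games. The Python sketch below reproduces the figures under that reading of the settings. The function name round_robin_games is hypothetical and used only for illustration.

```python
def round_robin_games(engines, rounds, games_per_encounter):
    """Return (games per engine, games per tournament) for a round robin.

    Assumes every pairing of engines is played once per round, with
    games_per_encounter games each time they meet.
    """
    opponents = engines - 1
    per_engine = opponents * rounds * games_per_encounter
    # Each game involves two engines, so summing over engines counts it twice.
    total = engines * per_engine // 2
    return per_engine, total


regular = round_robin_games(engines=12, rounds=100, games_per_encounter=2)
cross_division = round_robin_games(engines=6, rounds=100, games_per_encounter=2)
print(regular)         # (2200, 13200)
print(cross_division)  # (1000, 3000)
```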

Multithreading

I configure every engine to use a single thread. Because Thinking on Opponent’s Time is unchecked, only one engine per game is searching at any moment, so each concurrent game occupies one CPU core. My Chess PC has 64 cores (128 logical processors with hyper-threading). I set Cute Chess to run 60 games at once via Tools > Settings > Tournaments > Maximum Number of Concurrent Games. This leaves 4 cores for the operating system and other applications.

Quality Control

Shortly after a tournament begins (ideally), I search its PGN file for the following strings to find games that ended prematurely because of a buggy engine. Each is a word stem, so "abandon" also matches "abandoned".

  • abandon
  • stall
  • disconnect
  • forfeit

If any engine repeatedly causes such terminations, I remove it from my PC (to avoid accidentally including it in future tournaments), download a replacement engine of similar strength, and restart the tournament. If I don’t notice premature terminations until the tournament has progressed significantly, I wait for the tournament to complete, remove the buggy engine’s games (using Hiarcs Chess Explorer Pro), then run a gauntlet tournament in which a replacement engine plays the same opponents the buggy engine faced.

Replacing a buggy engine eliminates invalid games that artificially raise the ratings of other engines by crediting them with cheap wins. I am willing to tolerate a few prematurely terminated games per engine. However, an engine with a failure rate above 0.5% (more than 1 game in 200) reduces the reliability of my rating lists, so I replace it with a reliable engine of similar playing strength.
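
To make the threshold concrete, one half of one percent is 1 game in 200. The Python sketch below applies that limit to the per-engine game counts from the Structure section. The function names are hypothetical, and treating the limit as strictly "higher than" 0.5% is an assumption based on the wording above.

```python
def exceeds_tolerance(failed_games, total_games):
    """True if failed_games / total_games is strictly greater than 0.5%."""
    # 0.5% equals 1/200; comparing integers avoids floating-point rounding.
    return failed_games * 200 > total_games


def max_tolerated_failures(total_games):
    """Largest number of failed games that stays at or below 0.5%."""
    return total_games // 200


print(max_tolerated_failures(2200))  # 11 (regular tournament, per engine)
print(max_tolerated_failures(1000))  # 5  (cross-division tournament, per engine)
print(exceeds_tolerance(12, 2200))   # True
```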
