Tournaments

I run chess engine tournaments using the Cute Chess GUI.

  • Tournament
    • Regular
      • 12 Engines per Division
      • Variant = Standard Chess
      • Type = Round Robin
      • Games Per Encounter = 2
      • Rounds = 50
      • 1100 Games per Engine, 6600 Games per Tournament
    • Cross-Divisional
      • 6 Engines
        • 3 from Upper Division
        • 3 from Lower Division
      • Variant = Standard Chess
      • Type = Round Robin
      • Games Per Encounter = 2
      • Rounds = 25
      • 250 Games per Engine, 750 Games per Tournament
  • Time Control
    • Moves = Whole Game
    • Bullet = 2 min/game +  1 sec/move
    • Blitz =  5 min/game +  3 sec/move
    • Rapid = 15 min/game + 10 sec/move
  • Opening Suite
    • PGN / EPD = Arasan.pgn
    • Depth = 99 Plies
    • Order = Random
  • Opening Book = None
  • Draw Adjudication (Move Number) = Off
  • Resign Adjudication (Move Number) = Off
  • Thinking on Opponent’s Time = Unchecked
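
The game counts listed above follow from the round robin settings. Here is a minimal sketch of that arithmetic, assuming a "round" means every pairing meets once and plays Games Per Encounter games. The function name is my own illustration, not part of Cute Chess.

```python
def round_robin_games(engines: int, games_per_encounter: int, rounds: int) -> tuple[int, int]:
    """Return (games per engine, games per tournament) for a round robin.

    Assumes one round = every engine meets every other engine once,
    playing `games_per_encounter` games in that meeting.
    """
    opponents = engines - 1
    per_engine = opponents * games_per_encounter * rounds
    # Every game involves two engines, so summing per-engine counts
    # over all engines counts each game twice.
    total = engines * per_engine // 2
    return per_engine, total


print(round_robin_games(12, 2, 50))  # regular division: (1100, 6600)
print(round_robin_games(6, 2, 25))   # cross-divisional: (250, 750)
```

Both results match the figures in the list.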

Multi-Threading

I configure every engine to use a single CPU core. My chess PC has 16 physical cores (32 logical cores with hyper-threading). Because thinking on the opponent’s time is disabled, only one engine per game is searching at any moment, so each concurrent game occupies about one core. I set Cute Chess to run 12 – 15 concurrent games (depending on what other work I intend to do on the PC while the tournament is running) via Tools > Settings > Tournaments > Maximum Number of Concurrent Games. This leaves 1 – 4 cores for the operating system and other applications.
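
The core budget can be written as a small calculation. This is only a sketch: the function name and the rule of one core per concurrent game are illustrative assumptions based on the single-core, no-pondering setup described here.

```python
def concurrent_games(physical_cores: int, reserved_cores: int) -> int:
    """Concurrent games to allow, assuming each game occupies one core."""
    if not 0 <= reserved_cores < physical_cores:
        raise ValueError("reserved_cores must be between 0 and physical_cores - 1")
    return physical_cores - reserved_cores


# Reserving 1 to 4 of 16 physical cores gives 15 down to 12 concurrent games.
for reserved in range(1, 5):
    print(reserved, "reserved ->", concurrent_games(16, reserved), "concurrent games")
```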

Quality Control

Shortly after a tournament begins (roughly one hour in), I search the PGN file for the following strings to find games that terminated prematurely because of buggy engines.

  • abandon
  • stall
  • disconnect
  • forfeit

If any engine repeatedly causes these terminations, I remove the engine from my PC (to prevent accidentally including it in future tournaments), download a replacement engine of similar strength, and restart the tournament.

After the tournament concludes, I import the games into a ChessBase database. Then I search for invalid games using Home > Filter List with the following criteria, combined with a logical AND:

  • Annotation Text 1 (Whole Word unchecked) = abandon
  • Game Data > Result = 1/2 – 1/2

I change each matching game’s result to a loss for the offending engine. Then I repeat the search with the following criteria and update the results the same way:

  • Annotation Text 1 (Whole Word unchecked) = stall
  • Game Data > Result = 1/2 – 1/2

Once more:

  • Annotation Text 1 (Whole Word unchecked) = disconnect
  • Game Data > Result = 1/2 – 1/2

Finally:

  • Annotation Text 1 (Whole Word unchecked) = forfeit
  • Game Data > Result = 1/2 – 1/2

These corrections keep a buggy engine from gaining rating points it did not earn: each such game was recorded as a draw, although the engine deserved a loss for the instability it caused. I am willing to tolerate a few invalid games per engine (fewer than two percent). More frequent problems should be caught by my initial check of the PGN file one hour into the tournament, and the engine replaced. If not, I delete the offending engine’s games from the database, then run a gauntlet tournament with a replacement engine.
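
The two percent tolerance can be checked with integer arithmetic. This is a minimal sketch that assumes the percentage is measured against the engine’s own game count, which the text above does not state outright. The function name is illustrative.

```python
def within_tolerance(invalid_games: int, total_games: int, tolerance_percent: int = 2) -> bool:
    """True if invalid games are strictly below tolerance_percent of total_games.

    Integer arithmetic avoids floating-point rounding right at the threshold.
    """
    if total_games <= 0:
        raise ValueError("total_games must be positive")
    return invalid_games * 100 < tolerance_percent * total_games


# With 1100 games per engine, 21 invalid games is under two percent,
# while 22 is exactly two percent and so no longer tolerated.
print(within_tolerance(21, 1100))  # True
print(within_tolerance(22, 1100))  # False
```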
