MadChess user Budana Prijadi reached out via the Contact Me link on this blog. They expressed interest in playing against MadChess at reduced strength but had questions about how to weaken the engine. I’m sharing questions and answers from our email conversation in the hope that other users find them helpful.
For a technical explanation of how the strength of MadChess is reduced, see The MadChess UCI_LimitStrength Algorithm.
Q: Is the Elo setting based on FIDE, CCRL, or something else?
A: Who knows? Calibrating a chess engine to accurately resemble weak human play is black magic. I’ve attempted to address the qualitative aspect by excluding unreasonable moves (see the Unreasonable Inferior Moves section of The MadChess UCI_LimitStrength Algorithm). Calibrating the quantitative aspect can only be done by having MadChess play thousands of games against human players in order to establish an Elo rating with a narrow error bar known with high confidence. I lack the resources to organize such a test. (Though such testing is routine for engine-engine matches that establish a chess engine’s rating at full strength.)
Q: How does MadChess’ rating convert to FIDE?
A: I have no idea.
Q: How do I get, say, 1100 Elo on my hardware? What time controls do I use? Or depth?
A: Set the Elo rating of the engine in your GUI to 1100. Use any time control. (MadChess reduces search speed instead of outright limiting search depth.) If you feel the engine plays too weakly or too strongly for the 1100 Elo rating, then customize the formulas used to calculate the internal limit-strength engine parameters as described in the Advanced Configuration section of The MadChess UCI_LimitStrength Algorithm.
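For readers curious about what the GUI does behind the scenes, here is a minimal sketch of the commands it likely sends when you set the engine's rating. It assumes MadChess exposes the standard UCI_Elo option alongside UCI_LimitStrength, as the UCI protocol defines them. The function name limit_strength_commands is hypothetical, made up for this illustration. Most GUIs handle this for you.

```python
def limit_strength_commands(elo: int) -> list[str]:
    """Build the UCI commands a GUI would send to cap an engine's strength.

    UCI_LimitStrength and UCI_Elo are option names from the UCI protocol.
    This helper itself is hypothetical and not part of MadChess.
    """
    if elo <= 0:
        raise ValueError("elo must be positive")
    return [
        "setoption name UCI_LimitStrength value true",
        f"setoption name UCI_Elo value {elo}",
    ]


# Print the commands for an 1100 Elo opponent.
for command in limit_strength_commands(1100):
    print(command)
```

If you run the engine from a console instead of a GUI, you could type those two lines by hand before starting a game.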
Q: How do I limit NPS on my hardware? Any tips to get 1100 Elo on my PC?
A: NPS and other internal limit-strength engine parameters are calculated from the Elo rating. See the page linked above for details. If you feel the engine does not play convincingly as an 1100 or 1500 (or whatever rating) player typically plays, then experiment by tweaking values in the Advanced Configuration JSON file.
Q: Stockfish and Crafty have a “bench” command to adjust time controls to get the strength of the engine on any PC used. Why doesn’t MadChess have a “bench” command?
A: That’s not relevant here. A “bench” command estimates engine strength at full strength. Testers use it to adjust time controls so that games run on an old PC and games run on a new PC can be combined into a single rating list. The idea is to equalize computing power: run games on the new PC at a faster time control than on the old PC, because the new PC’s faster CPU lets chess engines search just as deeply in less time.
Personally, I think it’s useless. I think engine testers are fooling themselves into believing they’re controlling testing conditions to preserve the integrity of a rating list. In my opinion, they’re not. The only way to ensure rating list integrity is to run all games on the same PC. Of course, that technique does not scale.
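To make the testers' arithmetic concrete, here is a rough sketch of the time-control scaling described in this answer. It assumes search depth depends only on the number of nodes searched, so time scales linearly with the ratio of node speeds. The function and parameter names are hypothetical and do not come from Stockfish, Crafty, or MadChess.

```python
def scaled_time_control(base_seconds: float, reference_nps: float, local_nps: float) -> float:
    """Scale a time control so a PC searches about as many nodes as the reference PC.

    Assumes a linear relationship between node speed and time (an illustration only).
    """
    if reference_nps <= 0 or local_nps <= 0:
        raise ValueError("node speeds must be positive")
    return base_seconds * reference_nps / local_nps


# A PC twice as fast as the reference gets half the time: 120 s becomes 60 s.
print(scaled_time_control(120, reference_nps=1_000_000, local_nps=2_000_000))  # 60.0
```

None of this applies to limit-strength play, where MadChess derives its own search speed from the Elo rating.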
Q: The good thing with Lucas Chess (a GUI) is that I don’t have to specify time controls.
A: OK, that wasn’t a question but I’ll comment nonetheless.
I don’t understand how that’s a good thing, but to each their own. To me, playing chess without a clock is like playing poker without money. It’s not the same game.
One of the reasons I programmed MadChess to reduce its search speed in limit-strength mode, rather than outright limiting depth, is that I want it to take time to “think” about its move just like a human player does. I find it jarring when a reduced-strength engine moves instantly. In addition to the negative psychological impact, it reduces my time to think. Normally, if I’m focused, while waiting for my opponent to move I think about their potential replies and how I should respond. When a chess engine moves instantly, it robs me of that time to contemplate continuations. That artificially lowers my playing strength.
I find chess engines that manage their time similar to how a human player manages theirs are more enjoyable to spar against than instant-movers.
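As an illustration of reducing search speed rather than limiting depth, here is a minimal sketch of a node throttle. It is not the MadChess implementation (see The MadChess UCI_LimitStrength Algorithm for that). The class name NodeThrottle and its parameters are hypothetical. The idea: count nodes as the search visits them and pause whenever the search gets ahead of a target nodes-per-second budget.

```python
import time


class NodeThrottle:
    """Caps a search loop at roughly target_nps nodes per second (illustrative only)."""

    def __init__(self, target_nps, clock=time.monotonic, sleep=time.sleep):
        if target_nps <= 0:
            raise ValueError("target_nps must be positive")
        self.target_nps = target_nps
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.nodes = 0

    def visit(self, count=1):
        """Record visited nodes and pause if the search is ahead of schedule."""
        self.nodes += count
        # Time at which this many nodes should have been reached.
        due = self.start + self.nodes / self.target_nps
        delay = due - self.clock()
        if delay > 0:
            self.sleep(delay)
        return max(delay, 0.0)


# Simulate a tiny "search" of 50 nodes at 1,000 nodes per second.
throttle = NodeThrottle(target_nps=1000)
for _ in range(50):
    throttle.visit()
print(f"{throttle.nodes} nodes in {time.monotonic() - throttle.start:.2f} s")
```

Because the engine still spends its clock time, it moves at a human-like pace while seeing far fewer positions than it would at full speed.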
Continual Tinkering
I include this last Q&A (below) not to embarrass the user who asked the question. No. I include it to dispel the urge to continually tinker with MadChess. Believe me, I get it. I’ve been a software developer most of my life, professionally since 1998. Computers encourage continual tinkering. Chess engine development exacerbates the problem.
Q: I played hundreds of games of MadChess (limit-strength) versus Houdini 1.5 (limited to search depth 6) to “quick calibrate” MadChess, so I could play against it at an Elo under 1500 benchmarked against the attached research paper. My reason for doing this is to have MadChess as my go-to sparring partner as I progress from 1000 to 1500 Elo in limit-strength mode, with MadChess playing at 2 seconds per move and me playing with free time controls. It appears MadChess does not play as reliably at a 2 seconds per move time control as it does when given time controls with bonus time per move. Are you aware of this inconsistency in MadChess’ search?
A: You’re overthinking it.
MadChess, in limit-strength mode, is meant to be played against humans with human-like time controls. No human would play a game requiring them to move exactly every two seconds. Give it the same time control as you have, just as you would when playing against a human chess player. Don’t attempt to weaken MadChess by handicapping its time control. Use its existing handicap features:
- Elo Rating
- Customizable formulas that adjust internal engine parameters based on Elo rating.
You’re creating a science experiment where none is necessary. Simply play MadChess in a GUI with the engine set to match your rating. After a few games, and some rating oscillation, you’ll arrive at a rating where you and MadChess are evenly matched. If that rating is too high or too low compared to your actual rating, perhaps known from your over-the-board (OTB) tournament or online chess experience, then adjust MadChess’ limit-strength formulas and play more games. Eventually you’ll settle at a rating where…
- MadChess gives you a competitive but even-chance game. And…
- That rating aligns with your known OTB or online chess rating.
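The rating oscillation described above can be pictured as a simple staircase. The sketch below is a hypothetical bookkeeping aid, not a MadChess feature: after each game you raise the engine's Elo setting if you won, lower it if you lost, and leave it alone after a draw. The function name and the 50-point step are assumptions chosen for illustration.

```python
def next_engine_elo(current_elo: int, result: float, step: int = 50) -> int:
    """Suggest the engine's next Elo setting from the human's result.

    result: 1.0 = human won, 0.5 = draw, 0.0 = human lost.
    The step size is an arbitrary assumption, not a MadChess parameter.
    """
    if result not in (0.0, 0.5, 1.0):
        raise ValueError("result must be 0.0, 0.5, or 1.0")
    return current_elo + round((result - 0.5) * 2 * step)


# Two wins, a loss, then a draw, starting from 1100.
elo = 1100
history = []
for result in (1.0, 1.0, 0.0, 0.5):
    elo = next_engine_elo(elo, result)
    history.append(elo)
print(history)  # [1150, 1200, 1150, 1150]
```

Once the setting stops drifting, compare it to your known rating and decide whether the limit-strength formulas need adjusting.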
The paper you attached is interesting but irrelevant. It discusses chess engine performance when playing at full strength except for search depth limitations. It does not take into account an engine performing a Multi-PV (open alpha / beta window) search to get exact scores for multiple root moves and purposefully selecting an inferior move. Nor does it take into account an engine disabling evaluation features to simulate a human player who does not grasp, or cannot correctly evaluate, subtle positional aspects that occur in the course of a game.
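To show why depth-limited results do not transfer, here is a simplified sketch of purposefully selecting an inferior move from Multi-PV results. It is an illustration under stated assumptions, not the actual MadChess algorithm (which The MadChess UCI_LimitStrength Algorithm describes). The function pick_move and its max_error_cp parameter are hypothetical, and the scores are made-up centipawn values.

```python
import random


def pick_move(scored_moves, max_error_cp, rng=random):
    """Pick a random move scoring within max_error_cp of the best root move.

    scored_moves: list of (move, score_in_centipawns) pairs, as a Multi-PV
    search with an open window would produce exact scores for each root move.
    """
    if not scored_moves:
        raise ValueError("scored_moves must not be empty")
    best_score = max(score for _, score in scored_moves)
    # Keep only moves whose loss versus the best move is tolerable.
    candidates = [move for move, score in scored_moves if best_score - score <= max_error_cp]
    return rng.choice(candidates)


# Made-up scores: two reasonable moves and one blunder.
moves = [("e2e4", 30), ("d2d4", 25), ("a2a3", -200)]
print(pick_move(moves, max_error_cp=50))
```

With max_error_cp set to 50, the blunder is never chosen, yet the engine does not always play its best move. A depth-limited engine playing its top choice behaves quite differently.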
I appreciate your enthusiasm but if you contort MadChess, in limit-strength mode, into playing in a manner for which it was not designed, you’ll find it’s not much of a sparring partner. Simply challenge it to a game of 10m or 3m+2s or 2m+1s or whatever is your favorite time control and MadChess will give you a good game.