Endgame Tablebases


The Chess-DB.com endgame tablebases are a proprietary implementation of WDL (win/draw/loss) bitbases and DTM tablebases. Most of our work so far has gone into WDL generation, indexing and compression. A short overview of where we stand with the generation, and how we compare to other popular bitbase (WDL tablebase) implementations, can be found below. In short, the good news is that endgame tablebases can be calculated and stored in much less space than is widely assumed today, while remaining useful for fast O(1) probing.


Number of pieces     Chess-DB    Scorpio    Syzygy     Shredderbases  RobboTripple  Lomonosov
4 man                0.36 MB     0.96 MB    1.2 MB     1.36 MB        4 MB          N/A
5 man                46 MB       215 MB     378 MB     156 MB         570 MB        384+ MB
6 man [Pawnless]     1.43 GB     8.74 GB    11.2 GB    N/A            N/A           N/A
6 man [All]          8.7 GB      48 GB      67.8 GB    N/A            100+ GB       64.2 GB
7 man [Pawnless]     7% ready    N/A        N/A        N/A            N/A           N/A
7 man [All]          2% ready    N/A        N/A        N/A            N/A           8-15 TB
8 man                1% ready    N/A        N/A        N/A            N/A           N/A
9 man                N/A         N/A        N/A        N/A            N/A           N/A

References with citations (links) for all those numbers will follow soon.



About Chess-DB tablebases


Our work is still in an alpha stage. More information about our main ideas and the implementation of the tablebases will be made available continuously. Eventually, we intend to release everything around our tablebase generation and probing under permissive licenses. For the time being you can use our online 5, 6 and 7-man endgame tablebase viewer (not all endgame configurations are supported yet; new ones are published continuously as they become available), or read more about the endgame statistics we are producing based on the tablebases.


Indexing and Compression Techniques


In this section we outline some of the ideas we have implemented, as well as some that are still on our backlog. Most of them support better indexing and compression, i.e. minimizing the tablebase size while improving, or at least not hurting, probing performance:
  • Eliminate board symmetries: For pawnless endgames we index only the 486 distinct placements of the two kings, so all horizontal, vertical and diagonal board symmetries are never indexed or calculated redundantly, and illegal king placements are excluded from the very beginning. For endgames with pawns only the left-right mirror symmetry remains, so 1806 two-king placements are calculated.
  • Eliminate same-piece symmetries: For an endgame like KQQQQvK we don't need to generate all 64 squares for each queen independently: the number of unique configurations is 486*64*63*62*61/(1*2*3*4), a much smaller number. Such symmetries exist, for example, in endgames like KxxxxvK, KxxxvKy, KxxvKyy and others (a minimal indexing sketch for identical pieces is given after this list).
  • Store "half-tables": Calculate results only for white to move or only for black to move, whichever results in a smaller tablebase.
  • 3-bit encoding: To store win/draw/loss information we need far less than the naive 2 bits per position (the theoretical minimum is log2(3) ≈ 1.58 bits). We designed a special encoding that shrinks the space per position to "1.5 bits" while still maintaining efficient probing (a generic bit-packing sketch is given after this list). In cases where we also need a fourth "undefined" value (e.g. to implement heuristic correction tables) we use 2 bits per position.
  • Don't calculate illegal positions: All illegal positions can be excluded. This is hard to do at the indexing level (i.e. by reducing the overall maximum index value), but the fact that the values of illegal positions don't matter can still be exploited to reduce the stored size.
  • Pick the best piece order for the index: Using a different piece order when calculating the index of a position leads to different compressed sizes, so we pick the order that compresses best for each endgame.
  • Exploit endgame configuration patterns: If you gzip a raw endgame tablebase file (i.e. one that stores the result for each position, before any custom compression), the result is sometimes much smaller than what per-block generic compression achieves. This is a good indication that certain endgame configurations contain patterns that the generic compressors do not detect.
  • Ignore winning captures: One technique that we also use, but with caution: capture moves that lead to a win for the side to move can be ignored (similar ideas are to filter out pawn promotions, positions where one side could capture and win if the side to move were switched, forced sequences leading to a capture or to mate/draw, etc.). Those results can be recomputed during probing from the (N-1)-man tablebases (for promotions from an N-man tablebase with different material). However, in combination with keeping half-tables this can lead to much slower probing, so it should be used carefully. Interestingly, applying such filtering together with generic compression sometimes gives worse results for certain endgame configurations, because patterns that would otherwise be present in the data get lost.
  • Heuristic correction tables: Let's take an example to explain the idea in short. An endgame like KBBBBvK with white to move is almost always won for white if at least two of the white bishops stand on opposite-colored squares; if all the white bishops are on light squares only, or on dark squares only, the result is a draw. A heuristic function makes a "guess" based on either chess knowledge or generic chess patterns. Whenever the guess is right the position is stored as "undefined", and whenever the guess is wrong we store the actual value. The heuristic function need not be complete for all positions. Probing always queries the heuristic function first, and if the heuristic gives a result it is "double-checked" against the stored tablebase value: if the stored value is "undefined" the heuristic result is the true result, otherwise the stored value is the correct one (a short probing sketch is given after this list).
  • Custom compression for small blocks: We use custom compression that best fits the data we see. A simple example: a block of positions that all hold the same result value can be compressed to just 3 bits: the first bit signals that the custom method is used, and the second and third bits store the single value for the whole block (see the sketch after this list). A generic compression method would need tens of bytes to encode the same constant block.
  • Double-indexing: If you use 4K or even 64K blocks of positions, each block needs an index entry pointing to its disk/file offset, held either in memory or on disk (at the cost of a second read). This index alone can easily grow to tens of gigabytes for 7- and 8-man tablebases. With very low overhead we implement double-indexing (with in-memory compression) of the block indices, so that the 6-man tablebases need only a few megabytes of memory for them. We still have work to do here, as generic compression of the index performs poorly: the list of disk offsets looks essentially random, with high entropy.
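To make the same-piece symmetry idea concrete, here is a minimal sketch of indexing identical pieces with the combinatorial number system, so that every unordered placement of k identical pieces maps to one index in C(64, k). The function and variable names are ours and purely illustrative; this is not the actual Chess-DB indexing code.

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Binomial coefficient C(n, k) for the small values needed here (n <= 64).
    static uint64_t binom(int n, int k) {
        if (k < 0 || k > n) return 0;
        uint64_t r = 1;
        for (int i = 1; i <= k; ++i) r = r * (n - k + i) / i;
        return r;
    }

    // Map an unordered set of squares (0..63) occupied by identical pieces to a
    // single index in [0, C(64, k)) using the combinatorial number system.
    // Sorting makes the result independent of the order the squares arrive in,
    // which is exactly how the k! same-piece permutations are collapsed.
    uint64_t identical_piece_index(std::vector<int> squares) {
        std::sort(squares.begin(), squares.end());
        uint64_t index = 0;
        for (size_t i = 0; i < squares.size(); ++i)
            index += binom(squares[i], static_cast<int>(i) + 1);
        return index;
    }

With a two-king index in front, a full pawnless index could then be formed as king_index * C(64, k) + identical_piece_index(...); note that the KQQQQvK count quoted above, 486*64*63*62*61/(1*2*3*4), is exactly 486 * C(64, 4).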
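One common way to approach the theoretical minimum of log2(3) ≈ 1.58 bits per WDL value is base-3 packing: five ternary values per byte (1.6 bits per position) with O(1) random access. This is only a generic illustration of the idea, not the exact "1.5-bit" Chess-DB encoding mentioned above.

    #include <cstdint>

    // Pack five WDL values (0 = loss, 1 = draw, 2 = win) into one byte in
    // base 3: 3^5 = 243 fits in 8 bits, i.e. 1.6 bits per position.
    uint8_t pack5(const uint8_t wdl[5]) {
        uint8_t b = 0;
        for (int i = 4; i >= 0; --i) b = static_cast<uint8_t>(b * 3 + wdl[i]);
        return b;
    }

    // O(1) random access: extract value number i (0..4) from a packed byte.
    uint8_t unpack1(uint8_t b, int i) {
        static const uint8_t pow3[5] = {1, 3, 9, 27, 81};
        return (b / pow3[i]) % 3;
    }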
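The probing rule for heuristic correction tables can be summarized in a few lines. The names below are our own illustrative placeholders; only the decision logic is taken from the description above.

    #include <cstdint>

    enum class Wdl : uint8_t { Loss = 0, Draw = 1, Win = 2, Undefined = 3 };

    // Resolve the true result of a position from the heuristic guess and the
    // 2-bit value stored in the correction table. The table stores Undefined
    // wherever the heuristic is right, and the actual value wherever the
    // heuristic is wrong or has no opinion.
    Wdl resolve(Wdl heuristic_guess, Wdl stored) {
        if (heuristic_guess != Wdl::Undefined && stored == Wdl::Undefined)
            return heuristic_guess;  // the heuristic was right for this position
        return stored;               // otherwise the stored value is the truth
    }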
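The constant-block example above can also be made explicit. The layout below (one flag bit, followed either by a single shared 2-bit value or by raw 2-bit values) is only a sketch of the idea, not the actual Chess-DB block format.

    #include <cstdint>
    #include <vector>

    // Minimal bit writer: appends bits most-significant-first into a byte buffer.
    struct BitWriter {
        std::vector<uint8_t> bytes;
        int used = 0;  // number of bits already used in the last byte
        void put(uint32_t value, int nbits) {
            for (int i = nbits - 1; i >= 0; --i) {
                if (used == 0) bytes.push_back(0);
                bytes.back() |= static_cast<uint8_t>(((value >> i) & 1u) << (7 - used));
                used = (used + 1) % 8;
            }
        }
    };

    // Encode one block of 2-bit result values. A block in which every position
    // holds the same value costs just 3 bits: a flag bit plus the shared value.
    // Any other block falls back to one flag bit plus 2 raw bits per position.
    void encode_block(BitWriter& out, const std::vector<uint8_t>& block) {
        bool constant = !block.empty();
        for (uint8_t v : block) constant = constant && (v == block[0]);
        if (constant) {
            out.put(1, 1);         // flag: custom "constant block" method
            out.put(block[0], 2);  // the single value for the whole block
        } else {
            out.put(0, 1);         // flag: raw fallback
            for (uint8_t v : block) out.put(v, 2);
        }
    }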

References


As usual, Wikipedia contains one of the best introductions to endgame tablebases; you can find it here. The chess programming wiki is another excellent technical resource on tablebase generation, to which we hope to contribute in the future as well. Good discussion forums on endgame tablebases are Kirill Kryukov's endgame tablebase forums and the talkchess programming discussion forums.