When I computed the suicide checkers 6-piece endgame database, I was in for an unpleasent surprise: My compression program which worked fine for the regular checkers endgame database didn't do a good job at all compressing the suicide checkers database. While the regular 6-piece database is only about 40MB, the suicide 6-piece database is a full 150MB, nearly 4 times larger. Why is this? To understand why it doesn't compress well, you first need to know how the compressor works: it uses a scheme known as "run length encoding" or RLE for short. RLE replaces "runs" of the same value in the database with a pair of values: the length of the run, and the value of the run. If this sounds gibberish, an example might help. Let's say you have a way of enumerating all positions in your database from 1...N. You write down the value for every position and get something like
with W a win, D a draw and L a loss - none of them present in the above database. The compressor now works by storing (10,W), (10,D), (10,W), (10,D) instead. In many databases the runs reach amazing lengths, especially in lopsided databases where most positions are just a win for the stronger side. There are of course the positions where one side can capture something, and they usually have more or less random results. For this reason, positions with captures are just removed from the database; your program will have to look one ply deeper and look up the position after the capture. This results in nice compression ratios for regular checkers, the compressed database is approximately 10x smaller than the uncompressed database. Now you can probably guess why the suicide checkers database doesn't compress well: The results in those databases are much less monotonous, and therefore this compression scheme is no longer particularly efficient.
Again, an example might make things clearer: In regular checkers, all positions with 2 vs 1 king are wins for the 2 kings if you remove the positions with a capture, with only one rather pathological exception; red kings on 3 and 12 and a white king on 11, with red to move. Therefore, this database compresses exceptionally well. In suicide checkers, 2 vs 1 king is normally a win for the side with 2 kings, however, if you played through the game I posted yesterday, you can see that in the endgame red could get 2 kings vs 1, but he would lose all the same. Therefore, the 2 vs 1 king database in suicide checkers has wins as well as losses in it, and doesn't compress well. On another note, this also means that the database is more important in suicide checkers than in regular checkers: Obviously, it would be quite difficult to write a heuristic that can guess the likely outcome of the game with 2 vs 1 kings on the board, something which is very simple for regular checkers. And of course for more complicated endgames, it gets even more difficult.