views:

171

answers:

7

I'm currently debugging my transposition table for a chess variant engine where pieces can be placed (ie. originally not on the board). I need to know how often I'm hitting key collisions. I'm saving the piece list in each table index, along with the usual hash data. My simple solution for determining if two positions are equal is failing on transpositions because I'm linearly comparing the two piece lists.

Please do not suggest that I should be storing by board-centric instead of piece-centric. I have to store the piece list because of the unique nature of placable and captured pieces. Pieces in those states are like they are occupying an overlapping and position-less location. Please look at the description of how pieces are stored.

// [Piece List]
// 
// Contents: The location of the pieces.
//           Values 0-63 are board indexes; -2 is dead; -1 is placeable
// Structure: Black pieces are at indexes 0-15
//            White pieces are at indexes 16-31
//            Within each set of colors the pieces are arranged as following:
//            8 Pawns, 2 Knights, 2 Bishops, 2 Rooks, 1 Queen, 1 King
// Example: piece[15] = 6 means the black king is on board index 6
//          piece[29] = -2 means the white rook is dead
char piece[32];

A transposition happens when pieces are moved in a different order, but the end result is the same board position. For example the following positions are equal:

1) first rook on A1; second rook on D7
2) first rook on D7; second rook on A1

The following is a non-optimised general algorithm; and the inner loop is similar to another general problem, but with the added restraint that values in 0-63 will only happen once (ie. only one piece per square).

for each color:
    for each piece type:
        are all pieces in the same position, disregarding transpositions?

The following comparison does NOT work because of transpositions. What I need is a way to detect transpositions as equal and only report actually different positions.

bool operator==(const Position &b)
{
    for (int i = 0; i < 32; i++)
        if (piece[i] != b.piece[i])
            return false;
    return true;
}

Performance/memory is a consideration because the table gets over 100K hits (where keys are equal) per turn and a typical table has 1 million items. Henceforth, I'm looking for something faster than copying and sorting the lists.

+4  A: 

Why not keep an 64 byte string in your database that corresponds to the chessboard layout? Every type of piece, including 'no piece' represents a letter (different caps for both colors, ie ABC for black, abc for white). Board comparison boils down to simple string comparison.

In general, comparing from the chessboard perspective, instead of the piece perspective, will get rid of your transpositions problem!

Jochem
That would work for traditional chess. But in the chess variant that I'm writing, **pieces can also be placable**. Which means your board perspective solution wouldn't work as pieces to be placed or that were dead wouldn't be listed in the 64 byte string. The **board doesn't contain all the data, and my problem is that the piece list is harder to detect transpositions**.
Justin
Hm, too bad. There would still be a way around that by appending the 64 chars for the board by an ordered list of all pieces that are dead (ie aabAC). But perhaps that's starting to become impractical? It's a classic problem of storage-time vs retrieval-time tradeoff.
Jochem
Justin. Just expand the stored board to allow for an area of "not-placed" pieces outside the board.
Esben Skov Pedersen
@Esben: this would not work unless sorted as I mentioned, because this not-placed area is more like a position-less pool.
Jochem
Jochem. Then define some standard ordering. Or just sort them
Esben Skov Pedersen
A: 

From the piece perspective you could do this:

for each color:
    for each piece type:
        start new list for board A
        for each piece of this piece type on board A
            add piece position to the list
        start new list for board B
        for each piece of this piece type on board B
            add piece position to the list
        order both lists and compare them

Optimizations can come in different ways. Your advantage is: as soon as you notice a difference: your done!

You could for instance start with a quick and dirty check by summing up all the indexes for all pieces, for both boards. The sums should be equal. If not, there's a difference.

If the sums are equal, you could quickly compare the positions of the unique pieces (King and Queen). Then you could write out (in somewhat complicated if statements) the comparisons for the pieces that are in pairs. All you then have to do is compare the pawns using the above stated method.

Jochem
Again this wouldn't work for my variant. Pieces on the board are unique, but placable/captured pieces are not. For example the beginning of the game starts out with all values "-2" (placeable). You are losing the piece type information of the index groupings. Without this losing 2 pawns would be the same as losing 2 rooks. Also the sum idea won't work see [this](http://stackoverflow.com/questions/3886986).
Justin
The sum idea is only to quickly EXCLUDE certain boards (as I expected would be the most common case). Because in order to be equal, the sum should at least be equal. I do think this would work, placeable is just another index. If you have one pawn on B2, the ordered list will always yield -2,-2,-2,-2,-2,-2,-2,X (X being the B2 index). In your question you said -1 was placeable but okay.
Jochem
@Jochem Yes, equal positions have equal sums, but the reverse is not true. Furthermore, my simple solution can already detect equal positions where the pieces are not transposed. But a transposition can't be detected, and that's what I need.
Justin
@Jochem So basically a piecewise copy, sort and compare. That would work, but I'm looking for a solution without copy/sort. The one time cost for this is nothing, but 100,000+ of them adds up.
Justin
A: 

And a third option (I really do hope posting 3 answers to one question is ok, stackoverflow-wise ;)):

Always keep your pieces of the same type in index-order, ie the first pawn in the list should always have the lowest index. If a move takes place that breaks this, just flip the pawns positions in the list. The use won't see the difference, a pawn is a pawn.

Now when comparing positions, you can be sure there's no transpositions problem and you can just use your proposed for-loop.

Jochem
I don't think that would work because my search function is recursive and probably wouldn't be able to undo the move if I keep swapping the pieces around. The move object use the index to make/unmake moves. You're just adding more overhead.
Justin
This is a very good approach to the question as asked: more generally, when comparing objects (such as board positions) where there's more than one valid representation for the same object (due to transpositions etc), a powerful strategy is to try to find a "canonical" representaton - i.e. for each different board position that may have many representations, pick one as the "right" one, and then when comparing two positions (or beforehand), transform them into this "right" ("canonical") representation - that way, if they don't compare equal, you can be certain they're different positions.
psmears
BTW the recursion shouldn't be an issue (just have a function that moves a piece and updates the representation, then call it once on the way in to move a piece, and again on the way out to move it back). And in the vast majority of cases the updating is trivial - at most swap two entries - so the "hard" cases should be rare enough to make it a win overall. But if you're prepared to change your board representation, then Amoss's solution is better, and Mathieu Pagé explains how to use hashing to make it even more efficient...
psmears
@justin: So, try making it work. For instance, whenever you construct a new position (whether moving forwards or backwards), you can make sure it's in canonical order. And, seriously, stop whining -- you're devoting more text to complaining how everybody's ideas couldn't possibly work for you than you are to asking the question.
comingstorm
A: 

Given your choice of game state representation, you have to sort the black pawns' indices, the white pawns' indices, etc., one way or the other. If you don't do it in the course of creating a new game state, you will have to do it upon comparison. Because you only need to sort a maximum of 8 elements, this can be done quite fast.

There are a few alternatives to represent your game states:

  • Represent each type of piece as a bit field. The first 64 bits mean that there is a piece of this type on that board coordinate; then there are n bits of "placeable" and n bits of "dead" slots, which have to be filled from one side (n is the number of pieces of this type).

or

  • Give each type of piece a unique ID, e.g. white pawns could be 0x01. A game state consists of an array of 64 pieces (the board) and two ordered lists of "placeable" and "dead" pieces. Maintaining the order of these lists can be done quite efficiently upon inserting and deleting.

These two alternatives would not have a transposition problem.

Anyway, I have the impression that you are fiddling around with micro-optimizations when you should first get it to work.

Svante
+1  A: 

"do not suggest that I should be storing by board-centric instead of piece-centric".

You're so focused on not doing that, that you miss the obvious solution. Compare board-specific. To compare two position lists L1 and L2, place all elements of L1 on a (temporary) board. Then, for each element of L2, check if it's present on the temporary board. If an element of L2 is not present on the board (and thus in L1), return unequal.

If after removing all elements of L2, there are still pieces left on the board, then L1 must have had elements not present in L2 and the lists are equal. L1 and L2 are only equal when the temporary board is empty afterwards.

An optimization is to check the lengths of L1 and L2 first. Not only will this catch many discrepancies quickly, it also eliminates the need to remove the elemetns of L2 from the baord and the "empty board" check at the end. That is only needed to catch the case where L1 is a true superset of L2. If L1 and L2 have the same size, and L2 is a subset of L1, then L1 and L2 must be equal.

MSalters
This answer doesn't fully solve the problem. (I think it can be extended to do so without much trouble, but at present it doesn't account for the difference between pieces which are "placeable" versus pieces which are "dead"...)
psmears
+1  A: 

There is lot of research done on computer chess and the way to create unique hash for a position is a well know problem with an universal solution used by virtually every chess engine.

What you need to do is use Zobrist Hashing to create a unique (not really unique, but we'll see later why this is not a problem in practice) key for each different positions. The Algorithm applied to chess is explained here.

When you start your program you create what we call zobrist keys. These are 64 bits random integers for each piece/square pairs. In C you would have an 2 dimension array like this :

unsigned long long zobKeys[NUMBER_OF_PIECES][NUMBER_OF_SQUARES];

Each of this key are initialized with a good random number generator (Warning : the random number generator provided with gcc or VC++ are not good enough, use an implementation of the Mersenne Twister).

When the board is empty you arbitrarily set it's hash key to 0, then when you add a piece on the board, say a Rook on A1, you also update the hash key by XORing the zobrist key for a rook on A1 with the hash key of the board. Like this (in C) :

boardHash = boardHash ^ zobKeys[ROOK][A1];

If you later remove the rook from this square you need to reverse what you just did, since a XOR can be reversed by applaying it again, you can simply use the same command again when you remove the piece :

boardHash = boardHash ^ zobKeys[ROOK][A1];

If you move a piece, say the rook on A1 goest to B1, you need to do two XOR, one to remove the rook on A1 and one to add a rook on B2.

boardHash = boardHash ^ zobKeys[ROOK][A1] ^ boardHash ^ zobKeys[ROOK][B1];

This way everytime you modify the board you also modify the hash. It is very efficient. You could also compute the hash from scatch each time by xoring the zobKeys corresponding to all pieces on the board. You will also need to XOR the position of the pawn that can be taken en passant and the status of the rooking capabilities of both side. You do it the same way, by creating zobris keys for each possible values.

This algotitm does not guaranty that each position has a unique hash, however, if you use a good pseudo random number generator, the odds of a collision occuring are so low that even if you let your engine play you whole life there is virtually no chances of a collision ever occuring.

edit: I just red that you are trying to implement this for a variant of chess that has off-board pieces. Zobrist hashing is still the right solution for you. You will have to find a way to incorporate theses information in the hash. You could for example have some keys for the off-the-board pieces :

unsigned long long offTheBoardZobKeys[NUMBER_OF_PIECE][MAXIMUM_NUMBER_OF_ON_PIECE_TYPE];

If you have 2 paws off the board and put one of this pawn on a2, you will have to do 2 operations :

// remove one pawn from the off-the-board
boardHash = boardHash ^ offTheBoardZobKeys[WHITE_PAWN][numberOfWhitePawsOffTheBoard];

// Put a pawn on a2
boardHash = boardHash ^ zobKeys[WHITE_PAWN][A2];
Mathieu Pagé
I'm already using a variant of Zobrist Hashing. But I'm using add/sub because of the placable pieces.
Justin
I'm not sure how you use add/sub... in place of XOR? But you simply need to find a way to hash in the "off the board" pieces status. It can be done with the regular algorithm, I explained one way to do it in my edit.
Mathieu Pagé
+2  A: 

Your main objection to storing the states board-wise is that you have a bag of position-less pieces. Why not maintain a board + a vector of pieces? This would meet your requirements and it has the advantage that it is a canonical representation for your states. Hence you don't need the sorting, and you either use this representation internally or convert to it when you need to compare :

Piece-type in A1
... 63 more squares
Number of white pawns off-board
Number of black pawns off-board
... other piece types
Amoss
This is another good canonicalisation approach!
psmears