views:

253

answers:

6

Hi All

I am working on representation of the chess board, and I am planning to store it in 32 bytes array, where each byte will be used to store two pieces. (That way only 4 bits are needed per piece)

Doing it in that way, results in a overhead for accessing particular index of the board. Do you think that, this code can be optimised or completely different method of accessing indexes can be used?

c++

char getPosition(unsigned char* c, int index){
    //moving pointer
    c+=(index>>1);

    //odd number
    if (index & 1){
        //taking right part
        return *c & 0xF;
    }else
    {
        //taking left part
        return *c>>4;
    }
}


void setValue(unsigned char* board, char value, int index){
    //moving pointer
    board+=(index>>1);

    //odd number
    if (index & 1){
        //replace right part
                 //save left       value only 4 bits
        *board = (*board & 0xF0) + value;
    }else
    {
        //replacing left part
        *board  = (*board & 0xF) + (value<<4);
    }
}


int main() {

    char* c = (char*)malloc(32);

    for (int i = 0; i < 64 ; i++){
        setValue((unsigned char*)c, i % 8,i);
    }

    for (int i = 0; i < 64 ; i++){
        cout<<(int)getPosition((unsigned char*)c, i)<<" ";

        if (((i+1) % 8 == 0) && (i > 0)){
            cout<<endl;
        }


    }


    return 0;
}

I am equally interested in your opinions regarding chess representations, and optimisation of the method above, as a stand alone problem.

Thanks a lot

EDIT

Thanks for your replies. A while ago I created checkers game, where I was using 64 bytes board representation. This time I am trying some different methods, just to see what I like. Memory is not such a big problem. Bit-boards is definitely on my list to try. Thanks

+2  A: 

Well, 64 bytes is a very small amount of RAM. You're better off just using a char[8][8]. That is, unless you plan on storing a ton of chess boards. Doing char[8][8] makes it easier (and faster) to access the board and do more complex operations on it.

If you're still interested in storing the board in packed representation (either for practice or to store a lot of boards), I say you're "doing it right" regarding the bit operations. You may want to consider inlining your accessors if you're going for speed using the inline keyword.

Joey Adams
If you're planning to make any kind of chess AI, you will need to store so many boards that space will become an issue.
Joe Gauterin
+1 I wrote a chess game on a PIC16c66 with 176 bytes of RAM that could look 5 move ahead. The board was stored in 64bytes. Moves were made *on the board* during tree search and moves were "undone" to go back up the tree. Only 1 board instance in memory. Different story if you want to store visited positions in a hash table.
phkahler
@joe - I very much doubt that. As I showed in my answer the memory issue is not even that great a consideration with some very extreme assumptions. In reality most chess engines are going to have at most depth copies of the board during the search and unless you let them run for hours on one move (for analysis) most engines never even reach depth > 20.
Noah Roberts
Joe - this is a TON of extra cycles for only a factor of 2 in space savings. Rarely should you go through that much work (or make the computer go though that much work everytime to access the board) just to win by a factor of 2.
miked
A: 

Is space enough of a consideration where you can't just use a full byte to represent a square? That would make accesses easier to follow on the program and additionally most likely faster as the bit manipulations are not required.

Otherwise to make sure everything goes smoothly I would make sure all your types are unsigned: getPosition return unsigned char, and qualify all your numeric literals with "U" (0xF0U for example) to make sure they're always interpreted as unsigned. Most likely you won't have any problems with signed-ness but why take chances on some architecture that behaves unexpectedly?

Mark B
A: 

Nice code, but if you are really that deep into performance optimization, you should probably learn more about your particular CPU architecture.

AFAIK, you may found that storing a chess piece in as much 8 bytes will be more efficient. Even if you recurse 15 moves deep, L2 cache size would hardly be a constraint, but RAM misalignment may be. I would guess that proper handling of a chess board would include Expand() and Reduce() functions to translate between board representations during different parts of the algorithm: some may be faster on compact representation, and some vice versa. For example, caching, and algorithms involving hashing by composition of two adjacent cells might be good for the compact structure, all else no.

I would also consider developing some helper hardware, like some FPGA board, or some GPU code, if performance is so important..

Pavel Radzivilovsky
+9  A: 

That's the problem with premature optimization. Where your chess board would have taken 64 bytes to store, now it takes 32. What has this really boughten you? Did you actually analyze the situation to see if you needed to save that memory?

Assuming that you used one of the least optimal search method, straight AB search to depth D with no heuristics, and you generate all possible moves in a position before searching, then absolute maximum memory required for your board is going to be sizeof(board) * W * D. If we assume a rather large W = 100 and large D = 30 then you're going to have 3000 boards in memory at depth D. 64k vs 32k...is it really worth it?

On the other hand, you've increased the amount of operations necessary to access board[location] and this will be called many millions of times per search.

When building chess AI's the main thing you'll end up looking for is cpu cycles, not memory. This may vary a little bit if you're targeting a cell phone or something, but even at that you're going to worry more about speed before you'll ever reach enough depth to cause any memory issues.

As to which representation I prefer...I like bitboards. Haven't done a lot of serious measurements but I did compare two engines I made, one bitboard and one array, and the bitboard one was faster and could reach much greater depths than the other.

Noah Roberts
BTW, if you're interested in bitboard implementations I'd suggest gnuchess, crafty, and my own junfa ( http://sourceforge.net/projects/xiangqi-engine/files/ ). Note that JunFa was written when I was in college and didn't know as much about C++ as now. Not that it isn't fairly decent code, it just isn't up to the standards I have now.
Noah Roberts
Depth of 30 only requires 3000 boards!? I think your calculation is a *little* off... (it should be `(X^d-1)/(X-1)`, where `X=sizeof(board)*W` - this gives you about `1x10^58` boards for depth of 30-ply ie. 15 moves...)
BlueRaja - Danny Pflughoeft
@raja - you only visit one branch at a time and it would be a beyond horrible implementation if you were keeping all copies after you were finished evaluating them. You'll visit that many positions but will not have that many copies of the board in memory. My math is correct.
Noah Roberts
"boughten" ? :-)
Mike Dunlavey
@mike - A past participle of buy. http://www.thefreedictionary.com/boughten
Noah Roberts
When I was a kid on a farm (in northern Illinois) we used to say "My mom's bread is almost as good as store-boughten" but we meant it as a joke. I guess if enough people say it, it's official. I bet nu-cu-lar is next :-)
Mike Dunlavey
+3  A: 

Let me be the first to point out a potential bug (depending on compilers and compiler settings). And bugs being why premature optimization is evil:

   //taking left part
    return *c>>4;

if *c is negative, then >> may repeat the negative high bit. ie in binary:

0b10100000 >> 4 == 0b11111010

for some compilers (ie the C++ standard leaves it to the compiler to decide - both whether to carry the high bit, and whether a char is signed or unsigned).

If you do want to go forward with your packed bits (and let me say that you probably shouldn't bother, but it is up to you), I would suggest wrapping the packed bits into a class, and overriding [] such that

board[x][y] 

gives you the unpacked bits. Then you can turn the packing on and off easily, and having the same syntax in either case. If you inline the operator overloads, it should be as efficient as the code you have now.

tony
Good call. You should mention this only applies to signed types. Unless I'm mistaken, right-shifting an unsigned value should never sign extend. Also note that the signedness of `char` is implementation-dependent (for instance, GCC defaults to unsigned char on PowerPC/Linux).
Joey Adams
yeah, the signed part was slightly skipped over, and only hinted at when I wrote "and whether a char is signed or unsigned". I could be more explicit.
tony
A: 

As a chess player, I can tell you: There's more to a position than the mere placement of each piece. You have to take in to consideration some other things:

  1. Which side has to move next?
  2. Can white and/or black castle king and/or queenside?
  3. Can a pawn be taken en passant?
  4. How many moves have passed since the last pawn move and/or capturing move?

If the data structure you use to represent a position doesn't reflect this information, then you're in big trouble.

Eduardo León