If you have geographic data stored as an ESRI shapefile, you have at least three files: one ending with .shp containing the vector data, one ending with .dbf containing the attributes, and one ending with .shx containing an index.

I'm interested in the .shx file. How does it work? Does it contain a complete mapping, like 'first geometry maps to third row in the dbf and second geometry maps to the first row', for every geometry? Or does it work differently?

+4  A: 

According to the spec, the .shx file contains a 100-byte header followed by a sequence of 8-byte records. Each record stores a 4-byte offset and a 4-byte content length for a record in the main .shp data file.

+-----------------------------------------------+
| header (100 bytes)                            |
+-----------------+------------------+----------+
| offset(4 bytes) | length (4 bytes) | 
+-----------------+------------------+
| offset(4 bytes) | length (4 bytes) | 
+-----------------+------------------+
| offset(4 bytes) | length (4 bytes) | 
+-----------------+------------------+
| offset(4 bytes) | length (4 bytes) | 
+-----------------+------------------+
| ....                               | 
+-----------------+------------------+

Note that the offset is specified in 16-bit words, so the offset for the first record is 50 (as the .shp header is 100 bytes, or 50 words, long). The content length is also specified in 16-bit words.
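To make the unit conversion concrete, here is a minimal Python sketch that decodes one index record into byte values. It assumes both fields are big-endian 32-bit integers, which is how the ESRI spec describes the index records; the function name `decode_shx_record` is made up for illustration.

```python
import struct

def decode_shx_record(raw):
    """Decode one 8-byte .shx record into (byte_offset, byte_length)."""
    # Assumption: both fields are big-endian 32-bit integers (">ii").
    offset_words, length_words = struct.unpack(">ii", raw)
    # Both values are counted in 16-bit words, so multiply by 2 for bytes.
    return offset_words * 2, length_words * 2

# Synthetic record: offset 50 words, content length 10 words.
first = struct.pack(">ii", 50, 10)
print(decode_shx_record(first))  # (100, 20)
```

The offset of 50 words comes out as byte 100, which is right after the .shp header.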

So, you can figure out the number of records from (index_file_length-100)/8, and use the index to access a particular shape record in the .shp file at random or in sequence.
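As a rough sketch of that random access, the following Python works on the raw bytes of an .shx file. The names `record_count` and `shape_record_span` are hypothetical. It also relies on two details from the spec that are not spelled out above: the fields are big-endian, and each record in the .shp file starts with its own 8-byte record header, which the content length does not include.

```python
import struct

SHX_HEADER_BYTES = 100  # fixed-size index header
SHX_RECORD_BYTES = 8    # 4-byte offset + 4-byte content length

def record_count(shx):
    """Number of shapes, derived from the index size alone."""
    return (len(shx) - SHX_HEADER_BYTES) // SHX_RECORD_BYTES

def shape_record_span(shx, n):
    """Return (start, end) byte positions in the .shp file of the
    contents of the n-th shape record (0-based)."""
    pos = SHX_HEADER_BYTES + n * SHX_RECORD_BYTES
    # Assumption: big-endian 32-bit integers, values in 16-bit words.
    offset_words, length_words = struct.unpack_from(">ii", shx, pos)
    # Skip the 8-byte record header that precedes the contents in .shp.
    start = offset_words * 2 + 8
    return start, start + length_words * 2

# Synthetic index: empty header plus two records of 10 words each.
# The second record starts 14 words later (4 header words + 10 content words).
shx = bytes(100) + struct.pack(">ii", 50, 10) + struct.pack(">ii", 64, 10)
print(record_count(shx))          # 2
print(shape_record_span(shx, 1))  # (136, 156)
```

With a real shapefile you would read the .shx bytes with `open(path, "rb").read()`, then `seek()` to the returned start position in the .shp file and read `end - start` bytes.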

Paul Dixon
So the order of items in the dbf file has nothing to do with it, and the index is only for fast access to the correct geometry in the shape file? And if your explanation is right, the formula should be (index_file_length-100)/8 (that would also exactly match my example data).
Mnementh
The order of the dbf records is equal to the order of the shapes.
Gamecat
Yes sorry, that should be 8. I've corrected it.
Paul Dixon
A: 

Fine answer by Paul Dixon.

Though I was wondering what you are going to do with it! If you're going to write code to read or write SHP files, I would strongly suggest using a library instead. There are some good free open source ones like GDAL, and also some good commercial ones.

MarkJ