Hi!

I'm using gfortran and need to write a function that reads records from a .dbf file associated with an ESRI Shapefile. The file I need to read is available online at http://diss.rm.ingv.it/diss/DISS_3.0.4.shp.zip

This is what the file command reports about the format:

$ file GGSources_polyline.dbf
GGSources_polyline.dbf: \012- DBase 3 data file\012-  (119 records)

Thanks for your suggestions

+1  A: 

I found a rough description of the file format here. It mixes variable types and sizes throughout, which complicates things somewhat. I don't know if Fortran is the best option for reading this data, but if you must, here are some hints:

  • Open the file for direct access unformatted I/O. Unformatted means that you can just read the bytes straight out of the file, and direct access won't add any padding to records.
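  • Set the record length to a size that evenly divides every field you need to read. With fields of mixed sizes this may be as small as one byte (gfortran measures recl in bytes for unformatted files).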
  • Use the transfer() function to interpret a location in memory as a particular type. This will allow you to read the binary data from the file into a variable of type integer but then assign to a real without doing a type cast.
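The difference between reinterpreting bytes, which is what transfer() does, and converting a value can be sketched in a few lines. The sketch is in Python rather than Fortran, and the bit pattern 0x40490FDB (the IEEE 754 single-precision encoding of pi) is chosen only for illustration; it does not come from the .dbf file.

```python
import struct

bits = 0x40490FDB  # illustrative 32-bit pattern, not taken from the .dbf file

# Value conversion: the integer simply becomes the float with the same value.
converted = float(bits)

# Bit reinterpretation (the analogue of Fortran's transfer()): the same
# four bytes are read back as an IEEE 754 single-precision real.
(reinterpreted,) = struct.unpack("<f", struct.pack("<I", bits))

print(converted)
print(reinterpreted)
```

The first value is the integer itself written as a float; the second is approximately pi, because the bytes were reinterpreted rather than converted.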

I'm in a similar situation now, trying to read a file with a structure very similar to the dBase file (i.e. headers of varying sizes pointing to regions of the file with different types), and ended up using Python and NumPy to read it. Reading consists of seeking to a location in the file, reading a bunch of bytes, then using the numpy.fromstring function (numpy.frombuffer in current NumPy releases) to convert them into real*4, real*8, integer*8, etc. You can make this work in Fortran, but you may want to keep your options open.
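As a rough illustration of that seek, read, and convert approach, here is a minimal sketch of a dBase III reader. It uses the standard-library struct module instead of NumPy, assumes the usual dBase III layout (32-byte file header, 32-byte field descriptors, a 0x0D terminator, fixed-width ASCII records), and keeps every value as a string. The function names and the tiny synthetic table are hypothetical and exist only so the example can run without the real file.

```python
import io
import struct


def read_dbf(stream):
    """Return (field_names, rows); every value is kept as a stripped string."""
    # Fixed 32-byte file header, little-endian: version, last-update date
    # (3 bytes), record count, header length, record length.
    head = stream.read(32)
    _version, _yy, _mm, _dd, n_records, header_len, record_len = struct.unpack(
        "<4BIHH", head[:12]
    )

    # One 32-byte descriptor per field, followed by a 0x0D terminator byte.
    n_fields = (header_len - 33) // 32
    fields = []
    for _ in range(n_fields):
        raw_name, _ftype, length, _decimals = struct.unpack(
            "<11sc4xBB14x", stream.read(32)
        )
        name = raw_name.split(b"\x00")[0].decode("ascii")
        fields.append((name, length))

    # Records start at header_len; each begins with a one-byte deletion flag.
    stream.seek(header_len)
    rows = []
    for _ in range(n_records):
        record = stream.read(record_len)
        if record[:1] == b"*":  # '*' marks a deleted record
            continue
        pos, row = 1, []
        for _name, length in fields:
            row.append(record[pos:pos + length].decode("ascii").strip())
            pos += length
        rows.append(row)
    return [name for name, _ in fields], rows


def build_sample():
    """Build a synthetic two-field table (hypothetical field widths)."""
    fields = [(b"IDSOURCE", b"C", 7, 0), (b"LENGHT", b"N", 6, 2)]
    records = [b" ITGG001  4.40", b" ITGG001  8.60"]
    header_len = 32 + 32 * len(fields) + 1
    record_len = 1 + sum(f[2] for f in fields)
    # Version 3, an arbitrary date, then the counts and lengths.
    out = struct.pack("<4BIHH20x", 3, 110, 1, 1, len(records), header_len, record_len)
    for name, ftype, length, decimals in fields:
        out += struct.pack("<11sc4xBB14x", name, ftype, length, decimals)
    return out + b"\x0d" + b"".join(records) + b"\x1a"


names, rows = read_dbf(io.BytesIO(build_sample()))
print(names)
print(rows)
```

For a real file, the same function could be called as read_dbf(open("GGSources_polyline.dbf", "rb")). Files with non-ASCII text would need a different decoding than the "ascii" assumed here.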

Tim Whitcomb
+1  A: 

Your best bet is to convert the dbf file into something else, using e.g. the OGR tools, which are available in most Linux distributions. You can convert the contents of the dbf file into a CSV file with ogr2ogr:

ogr2ogr -f "CSV" output.csv FaultScarps_polyline.shp FaultScarps_polyline

(Note that you need to include the layer name, which, for Shapefiles, is identical to the shapefile's name.) The first 3 lines of the CSV look like this:

IDSOURCE,IDSCARP,SOURCENAME,FAULTSCARP,LENGHT,HEIGHT,AVGVOFFSET,MAXVOFFSET,VOFFSETTYP,AVGHOFFSET,MAXHOFFSET,HOFFSETTYP,AGE,NOEVENTS,LENGHTQ,HEIGHTQ,VOFFSETQ,HOFFSETQ,AGEQ,NOEVENTSQ,LENGHTN,HEIGHTN,VOFFSETN,HOFFSETN,AGEN,NOEVENTSN,REFERENCE
ITGG001,          1,Ovindoli-Pezza,Ovindoli-Pezza Fault Piano Pezza,  4.40, 18.00,   9.750,  16.000,          1,   0.000,   0.000,3,             10.000000000000000,3,1,0,1,1,1,1,Based on topographic observations.,Max height in late Pleistocene-Holocene fluvioglacial deposits.,Based on geological survey and refers to late Pleistocene-Holocene deposits.,Based on geological survey.,Based on geological observations.,Refers to Holocene and based on paleoseismology.,Pantosti et al. [1996].
ITGG001,          2,Ovindoli-Pezza,Ovindoli-Pezza  Fault Campo Porcaro,  8.60,  0.00,   8.700,  12.000,          1,   3.045,   4.025,1,             18.000000000000000,3,1,0,1,1,1,1,Based on topographic observations.,,Max offset observed  in the late Pleistocene-Holocene fluvioglacial and moraine deposits.,"Calculated as 35 % of the vertical component, on the basis of literature data.",Based on geological observations.,Refers to Holocene and based on paleoseismology.,Pantosti et al. [1996]
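One detail in that output matters for whatever reads it next: the second record contains a quoted field with an embedded comma, so splitting lines on commas alone would miscount the columns. A minimal sketch with Python's csv module, using an excerpt trimmed to four columns of the output above (the trimming and the parse_csv helper are only for illustration):

```python
import csv
import io

# Excerpt of the ogr2ogr output, trimmed to four columns for brevity.
sample = (
    "IDSOURCE,IDSCARP,SOURCENAME,HOFFSETN\n"
    'ITGG001,          2,Ovindoli-Pezza,'
    '"Calculated as 35 % of the vertical component, '
    'on the basis of literature data."\n'
)


def parse_csv(text):
    """Parse CSV text into a list of dicts, stripping the padding spaces."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: value.strip() for key, value in row.items()} for row in reader]


rows = parse_csv(sample)
print(rows[0]["HOFFSETN"])
```

The quoted field comes back as a single value with its comma intact. A reader written in another language needs to honour the quotes in the same way.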

An alternative would be to access the Shapefile using OGR (or Shapelib), do the processing in C, and return the results to the main Fortran program.

Jose
A: 

You may struggle reading binary unformatted files in Fortran which were not written from a Fortran write statement unless your compiler has some extensions. Fortran binary unformatted files have beginning of record and end of record marks. These marks are usually the length of the record in bytes. So the runtime system will try to interpret characters in the file as record marks and get confused.
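To make the record marks concrete, here is a small sketch of how a sequential-access unformatted record is commonly laid out on disk. It assumes 4-byte little-endian length marks before and after each record, which is gfortran's default on little-endian machines; other compilers or options may differ. The helper names are hypothetical.

```python
import struct


def frame_record(payload):
    """Wrap payload in leading and trailing 4-byte length marks (assumed layout)."""
    mark = struct.pack("<i", len(payload))
    return mark + payload + mark


def read_records(blob):
    """Split a byte string into records, checking both marks of each record."""
    records, pos = [], 0
    while pos < len(blob):
        if pos + 4 > len(blob):
            raise ValueError("truncated record mark")
        (n,) = struct.unpack_from("<i", blob, pos)
        if n < 0 or pos + 8 + n > len(blob):
            raise ValueError("leading mark does not describe a valid record")
        (tail,) = struct.unpack_from("<i", blob, pos + 4 + n)
        if tail != n:
            raise ValueError("leading and trailing marks disagree")
        records.append(blob[pos + 4:pos + 4 + n])
        pos += n + 8
    return records


framed = frame_record(b"ITGG001") + frame_record(b"ab")
print(read_records(framed))

# The first bytes of a dBase header are not a record mark, so they are rejected.
try:
    read_records(b"\x03\x6e\x01\x01" + b"\x00" * 28)
except ValueError as err:
    print("rejected:", err)
```

A file written by another program has no such marks, so a sequential unformatted read interprets its first four bytes as a length and fails in much the same way as the last call.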

Converting to ASCII CSV and reading that from Fortran will work. If you need to read other file types, writing some C functions that interface to the C I/O library should let you read the files directly.

hugok
Fortran binary unformatted files have the begin/end records (that specify the length of the record) only when the file is opened for sequential access, not direct.
Tim Whitcomb