I have several DBF files generated by a third party that I need to be able to query. I am having trouble because all of the column types have been defined as character, but some of these fields actually contain binary data. If I try to read these fields using an OleDbDataReader as anything other than a string or character array, an InvalidCastException is thrown. I need to be able to read them as binary values, or at least cast/convert them after they are read. The columns that actually DO contain text are returned as expected.

For example, the very first column is defined as a character field with a length of 2 bytes, but the field contains a 16-bit integer.

I have written the following test code to read the first column and convert it to the appropriate data type, but the value is not coming out right.

The first row of the database has a value of 17365 (0x43D5) in the first column. Running the following code, what I end up getting is 17215 (0x433F). I'm pretty sure it has to do with using the ASCII encoding to get the bytes from the string returned by the data reader, but I'm not sure of another way to get the value into the format that I need, other than to write my own DBF reader and bypass ADO.NET altogether, which I don't want to do unless I absolutely have to. Any help would be greatly appreciated.

        byte[] c0;
        int i0; 

        string con = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\ASTM;Extended Properties=dBASE III;User ID=Admin;Password=;";

        using (OleDbConnection c = new OleDbConnection(con))
        {
            c.Open();
            OleDbCommand cmd = c.CreateCommand();
            cmd.CommandText = "SELECT * FROM astm2007";
            OleDbDataReader dr = cmd.ExecuteReader();
            while (dr.Read())
            {
                c0 = Encoding.ASCII.GetBytes(dr.GetValue(0).ToString());

                i0 = BitConverter.ToInt16(c0, 0);
            }
            dr.Dispose();
        }
A: 

What you may be running into is actually a memo-based field. These are columns whose raw text is stored in ANOTHER file, typically a .DBT (dBASE) or .FPT (FoxPro). The DBF column holds only a pointer offset, stored in 4 bytes, into that text content file, which is free-form in length and written in blocks.

If you have access to a .dbf viewer and can see it somewhat natively, that would probably help you some.

DRapp
Thanks, but these fields are not memo fields. They contain byte offset and length information for an image file, and I have verified that the values stored in the DBF file are correct.
figabytes
A: 

I am pretty sure that you are correct about the ASCII character conversions. I looked a bit for the supported scalar functions for the Jet engine but was not able to find them ... or rather I found scalar functions listed but no syntax. The CONVERT function is probably what you want. Something like:

SELECT CONVERT(twobytefield, SQL_BINARY) from astm2007

Then you could call dr.GetBytes() to read the raw data. However, I was not able to construct a statement using that function that the Jet engine liked.
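The ASCII suspicion can be checked in isolation, without Jet. The following is a minimal sketch, assuming the provider decodes the 2-byte field with a single-byte code page; ISO-8859-1 is used here only as a stand-in for that code page, and the variable names are illustrative. On a little-endian machine it reproduces the two values from the question, 17215 (0x433F) and 17365 (0x43D5).

```csharp
using System;
using System.Text;

// Raw little-endian bytes of 17365 (0x43D5) as stored in the 2-byte field.
byte[] stored = { 0xD5, 0x43 };

// Assumption: the provider turns the field into a string using a
// single-byte code page. ISO-8859-1 stands in for it because it maps
// every byte 0x00-0xFF to the code point with the same value.
Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
string asText = latin1.GetString(stored);

// ASCII cannot represent U+00D5, so GetBytes substitutes '?' (0x3F).
byte[] viaAscii = Encoding.ASCII.GetBytes(asText);
short wrong = BitConverter.ToInt16(viaAscii, 0);

// Encoding with the same single-byte code page restores the original bytes.
byte[] viaLatin1 = latin1.GetBytes(asText);
short right = BitConverter.ToInt16(viaLatin1, 0);

Console.WriteLine("ASCII:   {0} (0x{0:X4})", wrong);
Console.WriteLine("Latin-1: {0} (0x{0:X4})", right);
```

Whether this round trip works against the real files depends on which code page the provider actually uses. Windows-1252 and the OEM code pages do not map every byte to the same code point, so the encoding passed to GetBytes would have to match the provider's, and any trimming or truncation done by the provider would still lose data.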

If you are not able to get CONVERT working with Jet, another possibility is to use the Advantage .NET Data Provider. There is also an Advantage OLE DB provider, but the .NET data provider might be a better fit since you are using C#. It reads DBF files, supports the CONVERT scalar function, and has a free local engine.

Since you mention you are going to try it and since I tested it to make sure I wasn't lying, here is the code snippet I used:

AdsConnection conn = new AdsConnection( 
   @"data source=c:\path;chartype=ansi;ServerType=local;TableType=cdx;" );
conn.Open();
AdsCommand cmd = conn.CreateCommand();
cmd.CommandText = "select cast(somefield as sql_binary) from sometable";
cmd.CommandType = CommandType.Text;
AdsExtendedReader rdr = cmd.ExecuteExtendedReader();
rdr.Read();
byte[] c0 = rdr.GetBytes( 0 );
int i0 = BitConverter.ToInt16( c0, 0 );
Console.WriteLine( "val = {0}", i0 );
Mark Wilkins
Thanks, I will give Advantage .NET Data Provider a shot
figabytes
@Jason: In the spirit of full disclosure, I should mention I am one of the developers on the Advantage product line. But it won't cost anything to use the local engine (so hopefully it's not bad manners on my part). I will update the answer with the connection string info that I used to test it.
Mark Wilkins
That did the trick. Thank you very much! BTW, I'm glad you posted the connection string, because I was struggling with that part.
figabytes
@Mark, I seem to have found an issue with the provider periodically returning NULL in fields that contain non-NULL values. I can reproduce it in the Advantage ODBC driver as well. It happens with or without the cast, but I have not been able to tell what in the data, if anything, is causing it.
figabytes
@figabytes: I just looked at the code. I see that if the DBF character field contains zeros, the resulting cursor from that select statement will have zeros, which are then interpreted as DBNull at the client. I suspect that this is what you are seeing. You could check for this case with a call to rdr.IsDBNull(col).
Mark Wilkins
@Mark, that makes sense. IsDBNull() does return true for those columns. The first byte is 0, so I guess it's being interpreted as a NULL-terminated string and subsequent bytes aren't even being evaluated. Thanks
figabytes
@figabytes: Yes, you are correct. I didn't realize it only required the first byte to be zero. So, ultimately, it is still being treated as a string. Rats. I don't see a simple way to get around that. A convoluted way would be to use "tabletype=vfp" in the connection string and then use ALTER TABLE to change the field type to varbinary, which is really a "byte" type field and better suited to this operation.
Mark Wilkins
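
The effect described in the comments above can be illustrated without the provider. This is only a sketch of NUL-terminated string handling in general, not the Advantage client's actual code; ReadAsCString is a hypothetical helper written for this example.

```csharp
using System;

// Hypothetical helper: returns the bytes a C-style string reader would see,
// that is, everything before the first zero byte.
static byte[] ReadAsCString(byte[] field)
{
    int end = Array.IndexOf(field, (byte)0);
    if (end < 0) end = field.Length;   // no terminator: the whole field is visible
    byte[] visible = new byte[end];
    Array.Copy(field, visible, end);
    return visible;
}

byte[] noZero = { 0xD5, 0x43 };      // 17365 (0x43D5): both bytes survive
byte[] leadingZero = { 0x00, 0x43 }; // 17152 (0x4300): first byte is the terminator

Console.WriteLine("0x43D5 visible bytes: {0}", ReadAsCString(noZero).Length);
Console.WriteLine("0x4300 visible bytes: {0}", ReadAsCString(leadingZero).Length);
```

Any 16-bit value whose low byte is zero ends up as an empty string under this reading, which is consistent with IsDBNull() returning true for those rows.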