Hello, today I got a copy of an old system from which I need to import data. The system is written in C and runs under DOS. It uses some kind of database. The file format seems rather simple (1 file = 1 table; the header contains some description, followed by the records; fields are delimited by the ASCII 0 (NUL) character), but it's not as simple as it seems.

The question is: how can I recognize which database is used?

Is there any kind of software that maybe opens many formats?

Or is there any software that could help me?

Or any links to sites describing DOS databases?

Or just anything that can help will be appreciated :)

PS> I can post some small files from the db if anyone wants to try guessing.

One small db file:

http://www.2shared.com/file/9137583/f840f261/WCENNIK.html

+1  A: 

Most of those older flat-file apps used proprietary (i.e., non-standard) formats. If the db is a standard format, you should see some kind of identifier in or near the header that tells you what it is.

If you can't determine the format by visually inspecting the file in a hex editor, your best bet is to trace through the C code that reads each record and reverse-engineer the format.
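If no hex editor is at hand, a few lines of Python can serve for the visual inspection step. This is only a minimal sketch: the function name `hex_dump` and the 16-byte row width are my own choices, not anything taken from the original system.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as rows of: offset, hex values, printable ASCII."""
    pad = width * 3 - 1  # width of a full row of "xx " groups, minus the last space
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(pad)
        # Show non-printable bytes (including the NUL field delimiters) as dots.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}  {text_part}")
    return "\n".join(lines)


# Example use (the path is hypothetical):
# with open("WCENNIK.dat", "rb") as fh:
#     print(hex_dump(fh.read(256)))
```

Dumping the first few hundred bytes of several tables side by side usually shows which header bytes stay constant (likely a signature or version) and which vary (likely counts or lengths).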

David Lively
I was looking at a dBase file dumper that I wrote in the 1980s. I expected to see "DBF" reserved in the header somewhere. There's no such thing. It just begins with version number and then goes into the last update timestamp, number of records, record length, etc.
wallyk
Ah, dBase. And sweet memories of Paradox... (shiver)
David Lively
+2  A: 

Almost every version of Unix, including Linux and Mac OS, has a command called "file" that recognizes a huge range of file types by their content. Try copying one of the data files to a Mac OS or Linux computer and running

file [filename]

from the command line.

rjmunro
That's well worth the trouble to try, but now that I reflect on it, files of that era contained few patterns that the file command could know about, let alone inspect. Recognizing an old MS-DOS .COM file in this way is nearly impossible. It just begins with instructions: no header, no containers, nothing. I believe the file extension had great importance for declaring the file type in those days.
wallyk
I'll give it a try, but the extensions might not help much, because they are probably in Polish: "baz" might mean "baza" (database) and "ind" might mean "indeks" (index).
kubal5003
Tried it on Debian, "file" doesn't recognize it.
NXT
I was going to suggest that if you try the Unix `file` command, you might also try `strings`, but if the file contents are Polish, the strings might resemble random sequences of consonants. (Well, that's how it looks to me ;-)
pavium
+1  A: 

Sounds like a dBase file to me. They were very common. It's not necessary for DBF to appear in the header. See the format description here:

http://www.dbase.com/knowledgebase/int/db7%5Ffile%5Ffmt.htm

Edit: better link:

http://www.clicketyclick.dk/databases/xbase/format/

What's the value of the first byte?

I just double checked some DBF files that I have on hand and they do not have DBF in the header.
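To rule dBase in or out quickly, the fixed part of the DBF header can be decoded and checked for plausibility. The sketch below assumes the classic dBase III-style layout (byte 0 = version, bytes 1-3 = last update as YY MM DD, bytes 4-7 = record count, bytes 8-9 = header length, bytes 10-11 = record length, all little-endian). The function names are hypothetical, and the size check is only a heuristic.

```python
import struct


def parse_dbf_header(header: bytes) -> dict:
    """Decode the first 12 bytes of a (presumed) DBF file."""
    if len(header) < 12:
        raise ValueError("need at least 12 bytes")
    version, yy, mm, dd, n_records, header_len, record_len = struct.unpack(
        "<4BIHH", header[:12]
    )
    return {
        "version": version,
        "last_update": (yy, mm, dd),  # raw values; yy is an offset from 1900
        "n_records": n_records,
        "header_len": header_len,
        "record_len": record_len,
    }


def looks_like_dbf(header: bytes, file_size: int) -> bool:
    """Heuristic: date fields are sane and the sizes add up to the file size."""
    info = parse_dbf_header(header)
    _, mm, dd = info["last_update"]
    if not (1 <= mm <= 12 and 1 <= dd <= 31):
        return False
    expected = info["header_len"] + info["n_records"] * info["record_len"]
    # Many writers append a single 0x1A end-of-file byte after the last record.
    return file_size in (expected, expected + 1)
```

If `looks_like_dbf` returns False for every table in the system, the files are probably not plain dBase, whatever the first byte happens to be.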

NXT
Not dBase unfortunately, I would recognize it immediately and that was my first thought.
kubal5003
I've investigated the link that you posted and that might really be it. I was thinking of the dBase that I know from Windows, but I see that there was much more than that.
kubal5003
I don't think it's dBase. I posted another link above that is better. Fortunately it looks very easy to write a decoder in C. Hints: byte 2 (0x52) looks like the record length minus 2, and I'm guessing that 0x90 0xAD is the record separator.
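A rough way to test that guess before writing a full decoder, sketched here in Python rather than C: split the data on the suspected 0x90 0xAD separator, then split each record on NUL. Everything in this snippet is an assumption: the separator is only the guess above, the header length is unknown (so the caller has to slice the header off), and cp852 is just one plausible DOS code page for Polish text.

```python
RECORD_SEPARATOR = b"\x90\xad"  # guessed in this thread, not verified
FIELD_DELIMITER = b"\x00"       # NUL, as described in the question


def split_records(body: bytes,
                  record_sep: bytes = RECORD_SEPARATOR,
                  field_sep: bytes = FIELD_DELIMITER,
                  encoding: str = "cp852") -> list:
    """Split raw record data into lists of decoded field strings."""
    records = []
    for raw in body.split(record_sep):
        if not raw:
            continue  # skip empty pieces, e.g. after a trailing separator
        fields = [f.decode(encoding, errors="replace")
                  for f in raw.split(field_sep)]
        records.append(fields)
    return records


# Example use (path and header length are hypothetical):
# data = open("WCENNIK.dat", "rb").read()
# for rec in split_records(data[HEADER_LEN:]):
#     print(rec)
```

If the guess is right, the records should come out with a consistent number of fields; wildly varying field counts would suggest the separator or the delimiter is something else.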
NXT