I have to convert .DBF and .FPT files from Visual FoxPro to MySQL. Right now my script works for .DBF files: it opens and reads them with dbase_open() and dbase_get_record_with_names() and then executes the MySQL INSERT commands.

However, some fields of these .DBF files are of type MEMO and are therefore stored in separate files ending in .FPT. How do I read these files?

I have found the specification of this file type on MSDN, but I don't know how to read such a file byte-wise with PHP (also, I would really prefer a simpler solution).

Any ideas?

A: 

I think it's unlikely there are FoxPro libraries in PHP.

You may have to code it from scratch. For byte-wise reading, meet fopen(), fread() and colleagues.

Edit: There seems to be a Visual FoxPro ODBC driver. You may be able to connect to a FoxPro database through PDO and that connector. What the chances of success are, and how much work it would be, I don't know.

Pekka
Thanks for the quick answer. Right now I am trying to read the file with fopen() and fread(). The only problem I am facing here is that I don't know how to convert a Byte into an Integer (how do I store a BYTE in a PHP variable anyway?).
sHiRoKKo
OK I did some research and found the unpack() function which can convert raw data to several kinds of integers or other formats. If I am able to solve this problem then I'll let you know. Otherwise - further suggestions are always welcome :)
sHiRoKKo
A: 

This is VFP rather than PHP. VFP uses 1-based positions and I think PHP uses zero-based ones, so you'll have to decipher and adjust accordingly. But this works, and hopefully you'll be able to post your version of this portion when finished.

FILETOSTR() in VFP will open a file and read the entire content into a single memory variable as a character string, with all escape keys, high-byte characters, etc. intact. You'll probably need to rely on FOPEN(), FSEEK(), FCLOSE(), etc. instead.

MemoTest.FPT was my sample memo table/file:

fpt1 = FILETOSTR( "MEMOTEST.FPT" )

First, you'll have to detect the MEMO BLOCK SIZE used when the file was created. Typically this would be 64 BYTES, but read the actual value from the header, per the spec you linked in your post.

Header positions 6-7 identify the size (in VFP, positions 7 and 8). The first byte is the high-order byte.

nBlockSize = ASC( SUBSTR( fpt1, 7, 1 )) * 256 + ASC( SUBSTR( fpt1, 8, 1 ))
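For comparison, here is a minimal PHP sketch of the same step. It assumes the header layout described above (block size as a big-endian 16-bit value at zero-based offsets 6-7). The function name fptBlockSize() and the in-memory stand-in for a real .FPT file are only for illustration.

```php
<?php
// Read the memo block size from bytes 6-7 (zero-based) of an .FPT header.
// The value is stored high byte first, which is unpack() format "n".
function fptBlockSize($handle): int
{
    fseek($handle, 6);
    $data = unpack("n", fread($handle, 2));
    return $data[1];
}

// Stand-in for a real file: a 512-byte header with block size 64 at offset 6.
$fpt = fopen("php://memory", "w+b");
fwrite($fpt, str_pad(pack("N", 8) . "\0\0" . pack("n", 64), 512, "\0"));

echo fptBlockSize($fpt), "\n";
```

With a real file you would pass a handle from fopen("MEMOTEST.FPT", "rb") instead.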

Now, on to your individual records. Wherever your DBF structure has a memo FIELD (and you can have many per record structure), there will be 4 bytes. In THE RECORD, this field identifies the "block" in the memo file where the content is stored.

MemoBytes = 4 bytes at your identified field location. These will be stored as byte values from 0-255. This field is stored with the FIRST byte as low-order and the 4th byte as 256^3 = 16777216. The first "Block" ever used will start at offset 512, since per the memo .fpt file spec the header takes up positions 0-511.

So, say your first memo field has a content of "8000", where the 8 is the actual byte 0x08 (not the character "8", which is 0x38) and the zeros are 0x00.

YourMemoField = "8000" (actually raw byte values; shown as digits here for readability)

First Byte is ASCII value  * 1   ( 256 ^ 0 )
Second Byte is ASCII value * 256   (256 ^ 1)
Third Byte is ASCII value * 65536   (256 ^ 2)
Fourth Byte is ASCII value * 16777216 (256 ^ 3)

nMemoBlock =  byte1 + ( byte2 * 256 ) + ( byte3 * 65536 ) + ( byte4 * 16777216 )
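In PHP the same decoding could look like the sketch below. unpack("V") reads an unsigned 32-bit little-endian integer, which matches the byte order described above. Both function names are made up for this example.

```php
<?php
// Decode the 4-byte memo pointer of a VFP record: low-order byte first.
function memoBlockNumber(string $raw): int
{
    // unpack("V") reads an unsigned 32-bit little-endian integer.
    return unpack("V", $raw)[1];
}

// The same result by hand, mirroring the formula above.
function memoBlockNumberManual(string $raw): int
{
    return ord($raw[0])
         + ord($raw[1]) * 256
         + ord($raw[2]) * 65536
         + ord($raw[3]) * 16777216;
}

$field = "\x08\x00\x00\x00";              // the "8000" example: 0x08, then three 0x00
echo memoBlockNumber($field), "\n";        // 8
echo memoBlockNumberManual($field), "\n";  // 8
```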

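Now, you'll need to FSEEK() to the first byte of the block you are looking for:

FSEEK( handle, nMemoBlock * nBlockSize +1 )

The +1 is the adjustment for 1-based positions (as with SUBSTR() on the string read by FILETOSTR()); with a zero-based offset, as PHP's fseek() takes, drop it. In the example above, block 8 with a 64-byte block size starts at zero-based offset 8 * 64 = 512, right after the header.

This will point to the BLOCK header. Per the spec, the first 4 bytes identify the block SIGNATURE and the next 4 bytes are the length of the content. For these two, the bytes are stored with the HIGH byte first.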

From your FSEEK() position, the byte order is the REVERSE of nMemoBlock above: the high byte comes first. "byte1" through "byte8" here are counted from your FSEEK() position.

nSignature = ( byte1 * 16777216 ) + ( byte2 * 65536 ) + ( byte3 * 256 ) + byte4

nMemoLength = ( byte5 * 16777216 ) + ( byte6 * 65536 ) + ( byte7 * 256 ) + byte8
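A hedged PHP equivalent of these two formulas: unpack() format "N" reads an unsigned 32-bit big-endian integer, and the format string can name both values at once. The function name parseBlockHeader() and the sample header bytes are assumptions for illustration.

```php
<?php
// Parse the 8-byte header at the start of a memo block.
// Both values are stored high byte first (unpack format "N").
function parseBlockHeader(string $header): array
{
    return unpack("Nsignature/Nlength", $header);
}

// Sample header: signature 1, content length 11 bytes.
$header = "\x00\x00\x00\x01\x00\x00\x00\x0B";
print_r(parseBlockHeader($header));
```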

Now, FSEEK() to the 9th byte of the block (the 1st actual character of the data, AFTER the 8 bytes of header you just read for signature and memo length). This is the beginning of your data. If you count from zero, that is the block start + 8.

Now, read the rest of the content:

cFinalMemoData = FREAD( handle, nMemoLength )
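Put together, the steps above can be sketched in PHP roughly as follows. This is only a sketch under the layout described in this answer. The function name readMemo() is hypothetical, and the memo file is faked in memory so the example runs on its own. PHP's fseek() takes a zero-based offset, so no +1 is applied.

```php
<?php
// Read one memo from an open .FPT handle, given its block number.
// Returns null for block 0, which is treated here as "no memo".
function readMemo($handle, int $block, int $blockSize): ?string
{
    if ($block === 0) {
        return null;
    }
    fseek($handle, $block * $blockSize);   // zero-based offset of the block
    $header = unpack("Nsignature/Nlength", fread($handle, 8));
    if ($header["length"] === 0) {
        return "";                         // fread() rejects a zero length on PHP 8
    }
    return fread($handle, $header["length"]);
}

// Fake memo file: 512-byte header, then one text memo in block 8 (8 * 64 = 512).
$blockSize = 64;
$fpt = fopen("php://memory", "w+b");
fwrite($fpt, str_pad(pack("N", 9) . "\0\0" . pack("n", $blockSize), 512, "\0"));
fwrite($fpt, pack("NN", 1, 11) . "Hello, memo");

echo readMemo($fpt, 8, $blockSize), "\n";
```

For a real file, the block number would come from the 4-byte memo field of the record and the block size from the .FPT header.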

I know this isn't perfect, nor is it PHP script, but it's enough pseudo-code on HOW things are stored and hopefully gets you WELL on your way.

Again, as you step through your debugging, PLEASE check whether each position is on a 0 or 1 offset basis. To help simplify and test this, I created a simple .DBF with 2 fields (a character field and a memo field), added a few records and some basic content to confirm all content, positions, etc.

DRapp
Thanks for this detailed explanation. The code that I am developing roughly works as you have described. However, there seems to be a problem with the data, as some records have fewer fields than others, meaning I cannot tell where a record starts and where it ends. You said that the DBF would identify the block in the FPT; do you have more information on this? I'm using PHP's dbase functions, and I think they don't return binary data... The memo fields in the DBF are just NULL.
sHiRoKKo
If null, then there would be no memo content associated with that entry. However, the bytes allocated FOR the memo field will always be 4 bytes. If it HAS a value, it will parse as I've described, pointing to the block within the .FPT file.
DRapp
A: 

Alright, I have carefully studied the MSDN specifications of the DBF and FPT file structures, and the outcome is a beautiful PHP class which can open a DBF and, optionally, an FPT memo file at the same time. This class will give you record after record and fetch any memos from the memo file along the way, if one is opened.

class Prodigy_DBF {
    private $Filename, $DB_Type, $DB_Update, $DB_Records, $DB_FirstData, $DB_RecordLength, $DB_Flags, $DB_CodePageMark, $DB_Fields, $FileHandle, $FileOpened;
    private $Memo_Handle, $Memo_Opened, $Memo_BlockSize;

    private function Initialize() {

        if($this->FileOpened) {
            fclose($this->FileHandle);
        }

        if($this->Memo_Opened) {
            fclose($this->Memo_Handle);
        }

        $this->FileOpened = false;
        $this->FileHandle = NULL;
        $this->Filename = NULL;
        $this->DB_Type = NULL;
        $this->DB_Update = NULL;
        $this->DB_Records = NULL;
        $this->DB_FirstData = NULL;
        $this->DB_RecordLength = NULL;
        $this->DB_CodePageMark = NULL;
        $this->DB_Flags = NULL;
        $this->DB_Fields = array();

        $this->Memo_Handle = NULL;
        $this->Memo_Opened = false;
        $this->Memo_BlockSize = NULL;
    }

    public function __construct($Filename, $MemoFilename = NULL) {
        $this->Prodigy_DBF($Filename, $MemoFilename);
    }

    public function Prodigy_DBF($Filename, $MemoFilename = NULL) {
        $this->Initialize();
        $this->OpenDatabase($Filename, $MemoFilename);
    }

    public function OpenDatabase($Filename, $MemoFilename = NULL) {
        $Return = false;
        $this->Initialize();

        $this->FileHandle = fopen($Filename, "rb"); // "b" keeps reads binary-safe on Windows
        if($this->FileHandle) {
            // DB Open, reading headers
            $this->DB_Type = dechex(ord(fread($this->FileHandle, 1)));
            $LUPD = fread($this->FileHandle, 3);
            $this->DB_Update = ord($LUPD[0])."/".ord($LUPD[1])."/".ord($LUPD[2]);
            $Rec = unpack("V", fread($this->FileHandle, 4));
            $this->DB_Records = $Rec[1];
            $Pos = fread($this->FileHandle, 2);
            $this->DB_FirstData = (ord($Pos[0]) + ord($Pos[1]) * 256);
            $Len = fread($this->FileHandle, 2);
            $this->DB_RecordLength = (ord($Len[0]) + ord($Len[1]) * 256);
            fseek($this->FileHandle, 28); // Ignoring "reserved" bytes, jumping to table flags
            $this->DB_Flags = dechex(ord(fread($this->FileHandle, 1)));
            $this->DB_CodePageMark = ord(fread($this->FileHandle, 1));
            fseek($this->FileHandle, 2, SEEK_CUR);    // Ignoring next 2 "reserved" bytes

            // Now reading field captions and attributes
            while(!feof($this->FileHandle)) {

                // Checking for end of header
                if(ord(fread($this->FileHandle, 1)) == 13) {
                    break;  // End of header!
                } else {
                    // Go back
                    fseek($this->FileHandle, -1, SEEK_CUR);
                }

                $Field["Name"] = trim(fread($this->FileHandle, 11));
                $Field["Type"] = fread($this->FileHandle, 1);
                fseek($this->FileHandle, 4, SEEK_CUR);  // Skipping attribute "displacement"
                $Field["Size"] = ord(fread($this->FileHandle, 1));
                fseek($this->FileHandle, 15, SEEK_CUR); // Skipping any remaining attributes
                $this->DB_Fields[] = $Field;
            }

            // Setting file pointer to the first record
            fseek($this->FileHandle, $this->DB_FirstData);

            $this->FileOpened = true;
            $Return = true;  // Report success to the caller

            // Open memo file, if exists
            if(!empty($MemoFilename) and file_exists($MemoFilename) and preg_match("%^(.+)\.fpt$%i", $MemoFilename)) {
                $this->Memo_Handle = fopen($MemoFilename, "rb");
                if($this->Memo_Handle) {
                    $this->Memo_Opened = true;

                    // Getting block size
                    fseek($this->Memo_Handle, 6);
                    $Data = unpack("n", fread($this->Memo_Handle, 2));
                    $this->Memo_BlockSize = $Data[1];
                }
            }
        }

        return $Return;
    }

    public function GetNextRecord($FieldCaptions = false) {
        $Return = NULL;
        $Record = array();

        if(!$this->FileOpened) {
            $Return = false;
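        } elseif(feof($this->FileHandle) or ftell($this->FileHandle) >= $this->DB_FirstData + $this->DB_Records * $this->DB_RecordLength) {
            // Past the last record; feof() alone only turns true after a read beyond the end
            $Return = NULL;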
        } else {
            // File open and not EOF
            fseek($this->FileHandle, 1, SEEK_CUR);  // Ignoring DELETE flag
            foreach($this->DB_Fields as $Field) {
                $RawData = fread($this->FileHandle, $Field["Size"]);
                // Checking for memo reference
                if($Field["Type"] == "M" and $Field["Size"] == 4 and !empty($RawData)) {
                    // Binary memo reference: little-endian block number
                    $Memo_BO = unpack("V", $RawData);
                    if(!$this->Memo_Opened) {
                        $Value = "{NO_MEMO_FILE_OPEN}";
                    } elseif($Memo_BO[1] == 0) {
                        // Block 0 means the memo is empty
                        $Value = "";
                    } else {
                        fseek($this->Memo_Handle, $Memo_BO[1] * $this->Memo_BlockSize);
                        $Type = unpack("N", fread($this->Memo_Handle, 4));
                        $Len = unpack("N", fread($this->Memo_Handle, 4));
                        if($Type[1] == 1) {
                            // fread() rejects a zero length on PHP 8
                            $Value = $Len[1] > 0 ? trim(fread($this->Memo_Handle, $Len[1])) : "";
                        } else {
                            // Pictures will not be shown
                            $Value = "{BINARY_PICTURE}";
                        }
                    }
                } else {
                    $Value = trim($RawData);
                }

                if($FieldCaptions) {
                    $Record[$Field["Name"]] = $Value;
                } else {
                    $Record[] = $Value;
                }
            }

            $Return = $Record;
        }

        return $Return;
    }

    function __destruct() {
        // Cleanly close any open files before destruction
        $this->Initialize();
    }
}

The class can be used like this:

    $Test = new Prodigy_DBF("customer.DBF", "customer.FPT");
    while(($Record = $Test->GetNextRecord(true)) and !empty($Record)) {
        print_r($Record);
    }

It might not be an almighty perfect class, but it works for me. Feel free to use this code, but note that the class is VERY tolerant: it never checks what fread() and fseek() return, so you might want to improve it a bit before using it.

Also note that there are many private variables like number of records, recordsize etc. which are not used at the moment.
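To get from these records to MySQL, each array returned by GetNextRecord(true) could be turned into a parameterized INSERT. The sketch below is one possible way to do that, not part of the class. The function name buildInsert(), the table name and the sample record are hypothetical.

```php
<?php
// Build a parameterized INSERT statement for one record (column => value).
// Identifiers are wrapped in backticks; values become "?" placeholders.
function buildInsert(string $table, array $record): array
{
    $quote = function ($name) {
        return "`" . str_replace("`", "``", $name) . "`";
    };
    $columns = implode(", ", array_map($quote, array_keys($record)));
    $placeholders = implode(", ", array_fill(0, count($record), "?"));
    $sql = "INSERT INTO " . $quote($table) . " (" . $columns . ") VALUES (" . $placeholders . ")";
    return [$sql, array_values($record)];
}

list($sql, $params) = buildInsert("customer", ["NAME" => "Acme", "NOTES" => "Hello, memo"]);
echo $sql, "\n";
```

With an open PDO connection, the pair could then be executed as $pdo->prepare($sql)->execute($params), which leaves escaping of the memo text to the driver.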

sHiRoKKo
A: 

The FPT file contains memo data. In the DBF you have columns of type memo, and the information in this column is a pointer to the entry in the FPT file.

If you are querying the data from the table you only have to reference the memo column to get the data. You do not need to parse the data out of the FPT file separately. The OLE DB driver (or the ODBC driver if your files are VFP 6 or earlier) should just give you this information.

There is a tool that will automatically migrate your Visual FoxPro data to MySQL. You might want to check it out to see if you can save some time:

Go to http://leafe.com/dls/vfp

and search for "Stru2MySQL_2" for the tool to migrate the structures of the data and "VFP2MySQL Data Upload program" for tools to help with the migration.

Rick Schummer VFP MVP

Rick Schummer
A: 

You also might want to check the PHP dbase libraries. They work quite well with DBF files.

mtanish
Yes you can read them easily with dbase functions, but I found that these functions do not show data from MEMO fields (which are most likely stored in a separate file). This is why I wrote my own class which can automatically read the MEMO files accordingly.
sHiRoKKo
A: 

sHiRoKKo;

Your script works well, thanks.

But if a field has size = 10 and type = M, it returns the int value instead of the memo content.

Any ideas how to solve this?

Chu SOon
A: 
<?php
class Prodigy_DBF {
    private $Filename, $DB_Type, $DB_Update, $DB_Records, $DB_FirstData, $DB_RecordLength, $DB_Flags, $DB_CodePageMark, $DB_Fields, $FileHandle, $FileOpened;
    private $Memo_Handle, $Memo_Opened, $Memo_BlockSize;

    private function Initialize() {

        if($this->FileOpened) {
            fclose($this->FileHandle);
        }

        if($this->Memo_Opened) {
            fclose($this->Memo_Handle);
        }

        $this->FileOpened = false;
        $this->FileHandle = NULL;
        $this->Filename = NULL;
        $this->DB_Type = NULL;
        $this->DB_Update = NULL;
        $this->DB_Records = NULL;
        $this->DB_FirstData = NULL;
        $this->DB_RecordLength = NULL;
        $this->DB_CodePageMark = NULL;
        $this->DB_Flags = NULL;
        $this->DB_Fields = array();

        $this->Memo_Handle = NULL;
        $this->Memo_Opened = false;
        $this->Memo_BlockSize = NULL;
    }

    public function __construct($Filename, $MemoFilename = NULL) {
        $this->Prodigy_DBF($Filename, $MemoFilename);
    }

    public function Prodigy_DBF($Filename, $MemoFilename = NULL) {
        $this->Initialize();
        $this->OpenDatabase($Filename, $MemoFilename);
    }

    public function OpenDatabase($Filename, $MemoFilename = NULL) {
        $Return = false;
        $this->Initialize();

        $this->FileHandle = fopen($Filename, "rb"); // "b" keeps reads binary-safe on Windows
        if($this->FileHandle) {
            // DB Open, reading headers
            $this->DB_Type = dechex(ord(fread($this->FileHandle, 1)));
            $LUPD = fread($this->FileHandle, 3);
            $this->DB_Update = ord($LUPD[0])."/".ord($LUPD[1])."/".ord($LUPD[2]);
            $Rec = unpack("V", fread($this->FileHandle, 4));
            $this->DB_Records = $Rec[1];
            $Pos = fread($this->FileHandle, 2);
            $this->DB_FirstData = (ord($Pos[0]) + ord($Pos[1]) * 256);
            $Len = fread($this->FileHandle, 2);
            $this->DB_RecordLength = (ord($Len[0]) + ord($Len[1]) * 256);
            fseek($this->FileHandle, 28); // Ignoring "reserved" bytes, jumping to table flags
            $this->DB_Flags = dechex(ord(fread($this->FileHandle, 1)));
            $this->DB_CodePageMark = ord(fread($this->FileHandle, 1));
            fseek($this->FileHandle, 2, SEEK_CUR);    // Ignoring next 2 "reserved" bytes

            // Now reading field captions and attributes
            while(!feof($this->FileHandle)) {

                // Checking for end of header
                if(ord(fread($this->FileHandle, 1)) == 13) {
                    break;  // End of header!
                } else {
                    // Go back
                    fseek($this->FileHandle, -1, SEEK_CUR);
                }

                $Field["Name"] = trim(fread($this->FileHandle, 11));
                $Field["Type"] = fread($this->FileHandle, 1);
                fseek($this->FileHandle, 4, SEEK_CUR);  // Skipping attribute "displacement"
                $Field["Size"] = ord(fread($this->FileHandle, 1));
                fseek($this->FileHandle, 15, SEEK_CUR); // Skipping any remaining attributes
                $this->DB_Fields[] = $Field;
            }

            // Setting file pointer to the first record
            fseek($this->FileHandle, $this->DB_FirstData);

            $this->FileOpened = true;
            $Return = true;  // Report success to the caller

            // Open memo file, if exists
            if(!empty($MemoFilename) and file_exists($MemoFilename) and preg_match("%^(.+)\.fpt$%i", $MemoFilename)) {
                $this->Memo_Handle = fopen($MemoFilename, "rb");
                if($this->Memo_Handle) {
                    $this->Memo_Opened = true;

                    // Getting block size
                    fseek($this->Memo_Handle, 6);
                    $Data = unpack("n", fread($this->Memo_Handle, 2));
                    $this->Memo_BlockSize = $Data[1];
                }
            }
        }

        return $Return;
    }

    public function GetNextRecord($FieldCaptions = false) {
        $Return = NULL;
        $Record = array();

        if(!$this->FileOpened) {
            $Return = false;
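        } elseif(feof($this->FileHandle) or ftell($this->FileHandle) >= $this->DB_FirstData + $this->DB_Records * $this->DB_RecordLength) {
            // Past the last record; feof() alone only turns true after a read beyond the end
            $Return = NULL;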
        } else {
            // File open and not EOF
            fseek($this->FileHandle, 1, SEEK_CUR);  // Ignoring DELETE flag
            foreach($this->DB_Fields as $Field) {
                $RawData = fread($this->FileHandle, $Field["Size"]);
                // Checking for memo reference
                if($Field["Type"] == "M" and $Field["Size"] == 4 and !empty($RawData)) {
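                    // Binary memo reference: little-endian block number
                    $Memo_BO = unpack("V", $RawData);
                    if(!$this->Memo_Opened) {
                        $Value = "{NO_MEMO_FILE_OPEN}";
                    } elseif($Memo_BO[1] == 0) {
                        // Block 0 means the memo is empty
                        $Value = "";
                    } else {
                        fseek($this->Memo_Handle, $Memo_BO[1] * $this->Memo_BlockSize);
                        $Type = unpack("N", fread($this->Memo_Handle, 4));
                        $Len = unpack("N", fread($this->Memo_Handle, 4));
                        if($Type[1] == 1) {
                            // fread() rejects a zero length on PHP 8
                            $Value = $Len[1] > 0 ? trim(fread($this->Memo_Handle, $Len[1])) : "";
                        } else {
                            // Pictures will not be shown
                            $Value = "{BINARY_PICTURE}";
                        }
                    }
                } elseif($Field["Type"] == "M") {
                    // Memo reference stored as ASCII digits (e.g. field size 10)
                    $Block = (int) trim($RawData);
                    if(!$this->Memo_Opened) {
                        $Value = "{NO_MEMO_FILE_OPEN}";
                    } elseif($Block > 0) {
                        // Skip the 4-byte block type, then read the 4-byte length
                        fseek($this->Memo_Handle, $Block * $this->Memo_BlockSize + 4);
                        $Len = unpack("N", fread($this->Memo_Handle, 4));
                        $Value = $Len[1] > 0 ? trim(fread($this->Memo_Handle, $Len[1])) : "";
                    } else {
                        // No block number: empty memo (avoids reusing the previous field's value)
                        $Value = "";
                    }
                } else {
                    $Value = trim($RawData);
                }

                if($FieldCaptions) {
                    $Record[$Field["Name"]] = $Value;
                } else {
                    $Record[] = $Value;
                }
            }

            $Return = $Record;
        }

        return $Return;
    }

    function __destruct() {
        // Cleanly close any open files before destruction
        $this->Initialize();
    }
}
?>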

Chu SOon
Did you solve your problem? If so, which lines did you modify?
sHiRoKKo
Yes, that solved my problem. I replaced the line "$Value = trim($RawData);" with a branch: if $Field["Type"] == "M" and trim($RawData) > 0, the trimmed value is used as the block number to fseek() into the memo file and the memo is read from there; for every other type, $Value = trim($RawData) as before. Is that the right way to do it? Either way, it solved my problem.
Chu SOon