views:

3026

answers:

8

I am trying to extract the schema of an .mdb database, so that I can recreate the database elsewhere.

How can I pull off something like this?

+1  A: 

It's hard to do DDL scripts / queries in Access. It can be done but you'd be better off just creating a copy of the database - deleting all the data and compacting it. Then use a copy of this for recreating the database elsewhere.

DJ
I feared this might be the answer. So build integration is out of question.
AngryHacker
But that wasn't your question. You didn't ask about build integration, you asked about extracting the schema. You can certainly script the schema build process by writing the code, just like the rest of your application.
le dorfier
+1  A: 

Check out the docmd.TransferDatabase command. It is probably your best bet for build integration that needs to replicate the data structure

JohnFx
+7  A: 

It is possible to do a little with VBA. For example, here is a start on creating script for a database with local tables.

Dim db As Database
Dim tdf As TableDef
Dim fld As DAO.Field
Dim ndx As DAO.Index
Dim strSQL As String
Dim strFlds As String
Dim strCn As String

Dim fs, f

    Set db = CurrentDb

    Set fs = CreateObject("Scripting.FileSystemObject")
    Set f = fs.CreateTextFile("C:\Docs\Schema.txt")

    For Each tdf In db.TableDefs
        If Left(tdf.Name, 4) <> "Msys" Then
            strSQL = "strSQL=""CREATE TABLE [" & tdf.Name & "] ("

            strFlds = ""

            For Each fld In tdf.Fields

                strFlds = strFlds & ",[" & fld.Name & "] "

                Select Case fld.Type

                    Case dbText
                        'No look-up fields
                        strFlds = strFlds & "Text (" & fld.Size & ")"

                    Case dbLong
                        If (fld.Attributes And dbAutoIncrField) = 0& Then
                            strFlds = strFlds & "Long"
                        Else
                            strFlds = strFlds & "Counter"
                        End If

                    Case dbBoolean
                        strFlds = strFlds & "YesNo"

                    Case dbByte
                        strFlds = strFlds & "Byte"

                    Case dbInteger
                        strFlds = strFlds & "Integer"

                    Case dbCurrency
                        strFlds = strFlds & "Currency"

                    Case dbSingle
                        strFlds = strFlds & "Single"

                    Case dbDouble
                        strFlds = strFlds & "Double"

                    Case dbDate
                        strFlds = strFlds & "DateTime"

                    Case dbBinary
                        strFlds = strFlds & "Binary"

                    Case dbLongBinary
                        strFlds = strFlds & "OLE Object"

                    Case dbMemo
                        If (fld.Attributes And dbHyperlinkField) = 0& Then
                            strFlds = strFlds & "Memo"
                        Else
                            strFlds = strFlds & "Hyperlink"
                        End If

                    Case dbGUID
                        strFlds = strFlds & "GUID"

                End Select

            Next

            strSQL = strSQL & Mid(strFlds, 2) & " )""" & vbCrLf & "Currentdb.Execute strSQL"

            f.WriteLine vbCrLf & strSQL

            'Indexes
            For Each ndx In tdf.Indexes

                If ndx.Unique Then
                    strSQL = "strSQL=""CREATE UNIQUE INDEX "
                Else
                    strSQL = "strSQL=""CREATE INDEX "
                End If

                strSQL = strSQL & "[" & ndx.Name & "] ON [" & tdf.Name & "] ("

                strFlds = ""

                For Each fld In tdf.Fields
                    strFlds = ",[" & fld.Name & "]"
                Next

                strSQL = strSQL & Mid(strFlds, 2) & ") "

                strCn = ""

                If ndx.Primary Then
                    strCn = " PRIMARY"
                End If

                If ndx.Required Then
                    strCn = strCn & " DISALLOW NULL"
                End If

                If ndx.IgnoreNulls Then
                    strCn = strCn & " IGNORE NULL"
                End If

                If Trim(strCn) <> vbNullString Then
                    strSQL = strSQL & " WITH" & strCn & " "
                End If

                f.WriteLine vbCrLf & strSQL & """" & vbCrLf & "Currentdb.Execute strSQL"
            Next
        End If
    Next

    f.Close
Remou
This is really good. How do you get the default values out, foreign keys, etc?
AngryHacker
It is necessary to use ADO to add default values. Foreign keys etc can be added with CONSTRAINT ReferForeignField FOREIGN KEY(<f1>, <f2>,..,<fn>) REFERENCES <table> (<f1>, <f2>,..,<fn>)I will see what i can do to add to the example.
Remou
If you're using DAO, you should be using the relationships collection to apply foreign key restraints, no?
David-W-Fenton
Yes, but if it is to be done out in sql, like most schemas, I think, it looks like it ought to be ADO. As has already been said, for DAO the best bet is probably to simply copy the mdb. YesNo?
Remou
@David W. Fenton: the DAO model isn't as rich as the SQL DLL syntax e.g. DAO can't still can't create 'fast foreign keys'. I know, I know: you've never used a fast foreign key and therefore you don't miss it but it's still a limitation :)
onedaywhen
+2  A: 

You can use the ACE/Jet OLE DB Provider and an ADO Connection object's OpenSchema method to get schema information as a Recordset (which is arguable better than a Collection because it can be filtered, sorted, etc).

The basic methodology is to use adSchemaTables to get the base tables (not VIEWs), then use each TABLE_NAME to fetch adSchemaColumns for ORDINAL_POSITION, !DATA_TYPE, !IS_NULLABLE, !COLUMN_HASDEFAULT, !COLUMN_DEFAULT, !CHARACTER_MAXIMUM_LENGTH, !NUMERIC_PRECISION,!NUMERIC_SCALE.

adSchemaPrimaryKeys is straightforward. adSchemaIndexes is where you will find UNIQUE constraints, not sure wether these can be distinguished from unique indexes, also the names of FOREIGN KEYs to plug into the adSchemaForeignKeys rowset e.g. (pseudo code):

rsFK.Filter = "FK_NAME = '" & !INDEX_NAME & "'")

-- watch for the gotcha that Jet 3.51 allows a FK based on a nameless PK (!!)

Names of Validation Rules and CHECK constraints can be found in the adSchemaTableConstraints rowset, using the table name in the OpenSchema call, then use the name in the call to the adSchemaCheckConstraints rowset, filter for CONSTRAINT_TYPE = 'CHECK' (a gotcha is a constraint named 'ValidationRule' + Chr$(0), so best to escape the null characters form the name). Remember that ACE/Jet Validation rules can be either row-level or table-level (CHECK constraints are always table-level), so you may need to use the table name in the filter: for adSchemaTableConstraints is [].[].ValidationRule will be [].ValidationRule in adSchemaCheckConstraints. Anotehr gotcha (suspected bug) is that the Field is 255 characters wide, so any Validation Rule/CHECK constraint definition of more than 255 characters will have a NULL value.

adSchemaViews, for Access Query objects based on non-paramaterized SELECT SQL DML, is straightforward; you can use the VIEW name in adSchemaColumns to get the column details.

PROCEDURES are in adSchemaProcedures, being all other flavours of Access Query objects including parameterized SELECT DML; for the latter I prefer to replace the PARAMETERS syntax with CREATE PROCEDURE PROCEDURE_NAME in the PROCEDURE_DEFINITION. Don't boterh looking in the adSchemaProcedureParameters, you won't find anything: the parameters can be enumerated by using an ADOX Catalog object to return an ADO Command e.g. (pseudo code):

Set Command = Catalog.Procedures(PROCEDURE_NAME).Command

then enumerate the Comm.Parameters collection for the .Name, .Type for DATA_TYPE, (.Attributes And adParamNullable) for IS_NULLABLE, .Value for COLUMN_HASDEFAULT and COLUMN_DEFAULT, .Size, .Precision, .NumericScale.

For ACE/Jet-specific properties such as Unicode compression you need to use another kind of object. For example, a Long Integer Autonumber in Access-speak can be found using an ADO Catalog object e.g. (pseudo code):

bIsAutoincrement = Catalog.Tables(TABLE_NAME).Columns(COLUMN_NAME).Properties("Autoincrement").Value

Good luck :)

onedaywhen
+4  A: 

If you're happy to use something other than pure Access SQL, you could persist a collection of ADOX objects and use those to recreate the table structure.

Example (in Python, doesn't currently recreate relationships and indexes as it wasn't needed for the project I was working on):

import os
import sys
import datetime
import comtypes.client as client

class Db:
    def __init__(self, original_con_string = None, file_path = None,
                 new_con_string = None, localise_links = False):
        self.original_con_string = original_con_string
        self.file_path = file_path
        self.new_con_string = new_con_string
        self.localise_links = localise_links

    def output_table_structures(self, verbosity = 0):
        if os.path.exists(self.file_path):
            if not os.path.isdir(self.file_path):
                raise Exception("file_path must be a directory!")
        else:
            os.mkdir(self.file_path)
        cat = client.CreateObject("ADOX.Catalog")
        cat.ActiveConnection = self.original_con_string
        linked_tables = ()
        for table in cat.Tables:
            if table.Type == u"TABLE":
                f = open(self.file_path + os.path.sep +
                         "Tablestruct_" + table.Name + ".txt", "w")
                conn = client.CreateObject("ADODB.Connection")
                conn.ConnectionString = self.original_con_string
                rs = client.CreateObject("ADODB.Recordset")
                conn.Open()
                rs.Open("SELECT TOP 1 * FROM [%s];" % table.Name, conn)
                for field in rs.Fields:
                    col = table.Columns[field.Name]
                    col_details = (col.Name, col.Type, col.DefinedSize,
                                   col.Attributes)
                    property_dict = {}
                    property_dict["Autoincrement"] = (
                        col.Properties["Autoincrement"].Value)
                    col_details += property_dict,
                    f.write(repr(col_details) + "\n")
                rs.Close()
                conn.Close()
                f.close()
            if table.Type == u"LINK":
                table_details = table.Name,
                table_details += table.Properties(
                    "Jet OLEDB:Link DataSource").Value,
                table_details += table.Properties(
                    "Jet OLEDB:Link Provider String").Value,
                table_details += table.Properties(
                    "Jet OLEDB:Remote Table Name").Value,
                linked_tables += table_details,
        if linked_tables != ():
            f = open(self.file_path + os.path.sep +
                     "linked_list.txt", "w")
            for t in linked_tables:
                f.write(repr(t) + "\n")
        cat.ActiveConnection.Close()

A similar reverse function reconstructs the database using the second connection string.

mavnn
There is no such thing as "Access SQL." Access uses Jet SQL by default, but that's wholly independent of Access.
David-W-Fenton
True, true. I dock myself a pedant point :)
mavnn
@mavnn: yes, shame on you. The fact that *everyone* knew exactly what you meant has no bearing on the matter whatsoever :)
onedaywhen
onedaywhen
Correction: forget informal, looks like it's now official for the Access2007 era. See: http://msdn.microsoft.com/en-us/library/bb245488.aspx -- it's a whole section of MSDB entitiled 'Microsoft Access SQL Reference'. I can't see a single mention of 'Jet' :)
onedaywhen
+3  A: 

The following C# outlines how to obtain the schema from a .mdb file.

Obtain a connection to the database:

String f = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + "database.mdb";
OleDbConnection databaseConnection = new OleDbConnection(f);
databaseConnection.Open();

Get the name of each table:

DataTable dataTable = databaseConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
int numTables = dataTable.Rows.Count;
for (int tableIndex = 0; tableIndex < numTables; ++tableIndex)
{
    String tableName = dataTable.Rows[tableIndex]["TABLE_NAME"].ToString();

Get the fields for each table:

    DataTable schemaTable = databaseConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, new object[] { null, null, tableName, null });
    foreach (DataRow row in schemaTable.Rows)
    {
        String fieldName = row[3];
        String fieldType = row[11];
        String fieldDescription = row[27];
    }
}

Where do the 3, 11 and 27 come from? I found them by inspecting DataRow.ItemArray with a debugger, does anyone know the 'correct' way?

Dave Tapley
To get the field name for each column in the database:OleDbDataReader.GetName(i), where i is used to iterate through each of the field index numbers.OleDbDataReader.GetFieldType(i) returns the data type of the column.
Stewbob
Aha, I knew there must be a way! Cheers.
Dave Tapley
+2  A: 

Compare'Em http://home.gci.net/~mike-noel/CompareEM-LITE/CompareEM.htm will happily generate the VBA code need to recreate an MDB. Or the code to create the differences between two MDBs so you can do a version upgrade of the already existing BE MDB. It's a bit quirky but works. Note it does not support the new ACE (Access2007) ACCDB etc formats.

I use it all the time.

(OneDayWhen's edit was one third right and two thirds wrong.)

Tony Toews
A: 

You can try DBWScript.

exp