I have recently started work on a project that has already been running for several years. The application is written in C# using Windows Forms for the UI and SQLite for the database. The database is currently accessed using ADO.NET via the System.Data.SQLite namespace.
From time to time clients have received application and database updates, but not all at once, so there are numerous structurally different versions of the database in existence. To make matters worse, clients are able to add their own fields to the application's tables to enable customized reports, etc. As a result, the number of different database layouts is out of control. Each feature enhancement has added more "if-then-else" code to keep the application running against all of these database variations.
On top of this, SQLite has the annoying feature that a field can store a value of any type, not just the type declared in the table's create statement. It is common for data to be imported from CSV files exported from Excel that contain incorrect date/time values. SQLite happily imports them, but in code we then get all sorts of invalid type exceptions.
I want to put a stop to all of this.
To clean up the database I am introducing a standard database design that will not change unless we release an official update along with code to automatically migrate the data.
I have taken the latest "ad-hoc" design and, without changing the table and field names, have made it at least Entity Framework friendly by ensuring that primary keys are defined for the existing fields and that consistent types are used.
I'm now attempting to migrate the various legacy databases one by one. I'm using a technique of loading all the records in each table using "SELECT * FROM [LEGACY_TABLE];" and creating new Entity Framework objects using new EF_Table() { ID = GetValue<long>(dr, "ID"), Description = GetValue<string>(dr, "Description"), };. I've defined GetValue to handle converting badly formatted data, DBNull references, etc.
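For context, GetValue is roughly along these lines. This is a simplified sketch rather than the exact code: the LegacyReader class name and the choice to fall back to default(T) when a value cannot be converted are assumptions made for illustration.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class LegacyReader
{
    // Hypothetical helper: reads a column by name and coerces it to T.
    // Returns default(T) when the value is DBNull or cannot be converted
    // (an assumed policy; a real migration might log or throw instead).
    public static T GetValue<T>(IDataRecord dr, string column)
    {
        object raw = dr[column]; // throws if the column does not exist
        if (raw == null || raw == DBNull.Value)
            return default(T);

        // Value is already stored as the requested type.
        if (raw is T)
            return (T)raw;

        // Unwrap Nullable<T> so Convert.ChangeType gets the underlying type.
        Type target = Nullable.GetUnderlyingType(typeof(T)) ?? typeof(T);
        try
        {
            return (T)Convert.ChangeType(raw, target, CultureInfo.InvariantCulture);
        }
        catch (Exception ex) when (ex is FormatException
                                   || ex is InvalidCastException
                                   || ex is OverflowException)
        {
            // Badly formatted legacy data, e.g. text in a numeric or date column.
            return default(T);
        }
    }
}
```

Because the helper takes an IDataRecord, it works with any ADO.NET data reader. Silently returning default(T) keeps the migration running, at the cost of hiding bad rows unless they are logged separately.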
My major issue now is that many of the tables have primary keys defined as "INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL", which won't allow Entity Framework to assign the existing key values from the legacy database.
In the migration process I need to ensure that primary keys remain the same, because there are no foreign key relationships defined and existing primary key values are stored outside of the database in application configuration data.
I have looked around for a solution, but can't see one. I would have thought that this kind of a problem - the migration and/or cleansing of an existing SQLite database - would have been a solved problem.
Can anyone point me in a productive direction to solve this?