I have a growing set of Excel spreadsheets none with the same data structure. I need a mechanism to query each of these spreadsheets (DataTables) using a single interface. So essentially, you choose the DataTable from a dropdownlist and then perform your search.
My initial thought was to handle it like this.
- Create a generic data structure in SQL. This would be comprised of four tables. 1) Study 2) StudyProperty 3) StudyPropertyType (a static list of datatypes) and 4) StudyData.
- The first step would be to define a study. Meaning that you create an entry in the Study table, to give it a name and description. Joined to that you would create a StudyProperty record for each corresponding field in the Excel spreadsheet, and select the appropriate StudyPropertyType.
- So if you had a spreadsheet with two columns (Name and Date of Birth), you'd get one Study record, and two StudyProperty records (Name with the StudyPropertyType = string) and (Date of Birth with the StudyPropertyType = DateTime)
- Once the Study has been defined, I can import the data from the Excel spreadsheet into the StudyData table.
- The StudyData table has an Id and a column for each possible datatype. Depending on the Study definition, the imported data is stored in the appropriate field in the Study table based on dataType. So Date Of Birth would be stored in the ValueDateTime field of the Study table.
So with all that said and done, I can have all the spreadsheets stored in SQL, but now I am stumped on how to build a generic query mechanism.
I feel like I have made this much more complicated than need be, and I'm hoping there's a better way to handle unknown data structures.