I am currently messing around with some stuff for an idea for a site - where I pretty much want to enable my users to create "Tables" which holds data and then allow them to query over this data (in a less geeky way than writing up SQL Queries and hopefully easier than using excel).
My idea, so far, is to represent this in my database using a couple of tables - have one table representing a table, one table representing columns for the table, having one table that represents each row in a table and finally one that represents values. Something akin to (PSEUDO SQL):
CREATE TABLE 'Tables' (
Id INT NOT NULL PRIMARY KEY,
NAME VARCHAR(255)
)
CREATE TABLE 'TableColumns' (
Id INT NOT NULL PRIMARY KEY,
TableId INT NOT NULL FOREIGN KEY ON 'Tables',
NAME VARCHAR(255)
)
CREATE TABLE 'TableRows' (
Id INT NOT NULL PRIMARY KEY,
TableId INT NOT NULL FOREIGN KEY ON 'Tables',
RowNumber INT NOT NULL
)
CREATE TABLE 'TableValues' (
RowId INT NOT NULL PRIMARY KEY,
ColumnId INT NOT NULL PRIMARY KEY,
Value VARCHAR(255)
)
(note that the TableValues table has 2 primary key fields here, it's supposed to represent a "composite" primary key, don't bother too much with the fact that my syntax is not legal SQL, it's just supposed to show the idea).
I did a bit of testing with this and was able to successfully do simple querying (simple filtering, ordering and so forth). My way of doing this was to first query the TableRows table - for filtering I then filtered out the rows who's columns did not match the criteria, and for sorting i sorted the RowIds based on their column's content (as specified by the sorting specified). Resulting in a list of Row Ids in the desired order, from here on it was merely just to select what was needed.
All this works fine, but I am a bit stuck from here on. I'd like to somehow be able to represent different data types (which is my main issue really) and also later on work out how to do joins.
While thinking all this through I start to wonder if there's a better way of doing this. Note that performance here, of course, is a factor, but I'm not planning on supporting virtual tables with hundreds of thousands of rows, maybe about 1000 rows per virtual table - of course the entire system needs to be capable of handling many of these.
I know I could always just actually create tables in my database with queries created on the fly in C# to accomplish this, and likewise query using just SQL Queries - however I have never been a huge fan of letting users "construct" queries against my database like this - and it seems to me as if that would lead down a path where many bugs would appear - and in worst case scenario end up allowing the user to kill the database in one way or another.
Also, then my issue becomes how I can deal with this in a way that would make sense from a C# perspective. So far I think I am leaning on using LINQ and then create my own extension methods which would apply the needed functionality - that is ExtensionMethods extending IQueryable.
So what I'd really like would be some ideas of how this could be done, ideas of how to tune performance, ideas of how to deal with separate data types in the table (of course store the type in the tablecolumn, but how to actually store the value so I can filter, sort and so forth by it? - without just adding a "TextValue", "MoneyValue" and so forth column on my tablevalues table). And last but not least, hopefully some good discussions here - I at the very least consider this to be somewhat an interesting topic.