views:

539

answers:

5

I'm developping a web application using google appengine and django, but I think my problem is more general.

The users have the possibility to create tables, look: tables are not represented as TABLES in the database. I give you an example:

First form:
 Name of the the table: __________
 First column name: __________
 Second column name: _________
 ...

The number of columns is not fixed, but there is a maximum (100 for example). The type in every columns is the same.

Second form (after choosing a particular table the user can fill the table):
 column_name1: _____________
 column_name2: _____________
 ....

I'm using this solution, but it's wrong:


class Table(db.Model):
    name = db.StringProperty(required = True)

class Column(db.Model):
    name = db.StringProperty(required = True)
    number = db.IntegerProperty()
    table = db.ReferenceProperty(table, collection_name="columns")

class Value(db.Model):
    time = db.TimeProperty()
    column = db.ReferenceProperty(Column, collection_name="values")

when I want to list a table I take its columns and from every columns I take their values:


    data = []
    for column in data.columns:
        column_data = []
        for value in column.values:
            column_data.append(value.time)
        data.append(column_data)
    data = zip(*data)

I think that the problem is the order of the values, because it is not true that the order for one column is the same for the others. I'm waiting for this bug (but until now I never seen it):

Table as I want:   as I will got:
a z c                 a e c
d e f                 d h f
g h i                 g z i

Better solutions? Maybe using ListProperty?

+1  A: 

Why don't you add an IntegerProperty to Value for rowNumber and increment it every time you add a new row of values and then you can reconstruct the table by sorting by rowNumber.

mckeed
I thought about it, it solve my problem, but I'm looking for a completely different solution. I think mine is not a new problem and a lot of people has already solved it in a better way.
wiso
I think the way you describe it is the way most people would do it with a relational database, but since the App Engine's database is schemaless, you could give Table a list of column names as a ListProperty and then have a Row class that has a list of times.
mckeed
Yes, but the problem in this case is to handle empty values.
wiso
A: 

You're going to make life very hard for yourself unless your user's 'tables' are actually stored as real tables in a relational database. Find some way of actually creating tables and use the power of an RDBMS, or you're reinventing a very complex and sophisticated wheel.

vincebowdren
I have to give the CREATE permission to the user? I think it's bad. Nevertheless I'm working with google app engine so I'm working with models and not directly with database. Keep in mind that every user can create hundreds of "tables"
wiso
No, not necessarily. Let the user submit the spec of their table to a stored procedure which creates the table for them. That stored procedure can have plenty of logic to do with what kind of tables are allowed (and how many) and you don't need to give the end users any more freedom than you have to.
vincebowdren
A: 

This is the conceptual idea I would use: I would create two classes for the data-store:

  1. table this would serve as a dictionary, storing the structure of the pseudo-tables your app would create. it would have two fields : table_name, column_name, column_order . where column_order would give the position of the column within the table

  2. data this would store the actual data in the pseudo-tables. it would have four fields : row_id, table_name, column_name , column_data. row_id would be the same for data pertaining to the same row and would be unique for data across the various pseudo-tables.

Aadith
it's not very different from my solution. I only need a row_id, as mckeed said.
wiso
A: 

Put the data in a LongBlob.

The power of a database is to be able to search and organise data so that you are able to get only the part you want for performances and simplicity issues : you don't want the whole database, you just want a part of it and want it fast. But from what I understand, when you retrieve a user's data, you retrieve it all and display it. So you don't need to sotre the data in a normal "database" way.

What I would suggest is to simply format and store the whole data from a single user in a single column with a suitable type (LongBlob for example). The format would be an object with a list of columns and rows of type. And you define the object in whatever language you use to communicate with the database.

The columns in your (real) database would be : User int, TableNo int, Table Longblob. If user8 has 3 tables, you will have the following rows :

8, 1, objectcontaintingtable1;
8, 2, objectcontaintingtable2;
8, 3, objectcontaintingtable3;
Silence
+1  A: 

Here's a data model that might do the trick for you:

class Table(db.Model):
 name = db.StringProperty(required=True)
 owner = db.UserProperty()
 column_names = db.StringListProperty()

class Row(db.Model):
 values = db.ListProperty(yourtype)
 table = db.ReferenceProperty(Table, collection_name='rows')

My reasoning: You don't really need a separate entity to store column names. Since all columns are of the same data type, you only need to store the name, and the fact that they are stored in a list gives you an implicit order number.

By storing the values in a list in the Row entity, you can use an index into the *column_names* property to find the matching value in the values property.

By storing all of the values for a row together in a single entity, there is no possibility of values appearing out of their correct order.

Caveat emptor: This model will not work well if the table can have columns added to it after it has been populated with data. To make that possible, every time that a column is added, every existing row belonging to that table would have to have a value appended to its values list. If it were possible to efficiently store dictionaries in the datastore, this would not be a problem, but list can really only be appended to.

Alternatively, you could use Expando...

Another possibility is that you could define the Row model as an Expando, which allows you to dynamically create properties on an entity. You could set column values only for the columns that have values in them, and that you could also add columns to the table after it has data in it and not break anything:

class Row(db.Expando):
    table = db.ReferenceProperty(Table, collection_name='rows')

    @staticmethod
    def __name_for_column_index(index):
        return "column_%d" % index

    def __getitem__(self, key):
        # Allows one to get at the columns of Row entities with
        # subscript syntax:
        # first_row = Row.get()
        # col1 = first_row[1]
        # col12 = first_row[12]
        value = None
        try:
            value = self.__dict__[Row.__name_for_column_index]
        catch KeyError:
            # The given column is not defined for this Row
            pass
        return value

    def __setitem__(self, key, value):
        # Allows one to set the columns of Row entities with
        # subscript syntax:
        # first_row = Row.get()
        # first_row[5] = "New values for column 5"

        self.__dict__[Row.__name_for_column_index] = value
        # In order to allow efficient multiple column changes,
        # the put() can go somewhere else.
        self.put()
Adam Crossland
as I previously said this solution is good, but the problem is how to handle empty values, for examples is in a row not every columns are filled.
wiso
When you add a new row, replace unfilled columns with None.
Adam Crossland