ansaurus

Question

How to build a fully customizable application (i.e. its database) without losing performance or good design?

Answer 1

A:

If you do it this way, all of your queries will have to join to and use the table_refer column, which will kill performance, make simple queries hard, and make hard queries very difficult.

If you want multiple e-mails, split the email out to another table so you can have many rows.

KM 2009-06-04 15:51:52

And how is 'splitting the email out to another table' different from my solution? If you mean having 1 table for emails, 1 for phones, etc., I must stop you: I don't want to have 1000 sub-tables, one for every item ;)

DaNieL 2009-06-04 16:28:13

There has to be a reasonable limit on what you will allow multiple values for. Will you need multiple rows and new tables for name, surname, and/or nickname? I don't think so. Possibly make a Contact table with a type column ("e" for e-mail or "p" for phone) and a multipurpose Value field.
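A minimal sketch of such a Contact table, using Python's built-in sqlite3 module only so that it runs anywhere. The table names, column names, and sample rows are assumptions for illustration; they are not taken from the question.

```python
import sqlite3

# Hypothetical schema: names and sample data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        surname TEXT NOT NULL
    );
    CREATE TABLE contacts (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        type    TEXT NOT NULL CHECK (type IN ('e', 'p')),  -- 'e'mail or 'p'hone
        value   TEXT NOT NULL
    );
    CREATE INDEX contacts_user_type ON contacts (user_id, type);
""")
conn.execute("INSERT INTO users (id, name, surname) VALUES (1, 'Ann', 'Lee')")
conn.executemany(
    "INSERT INTO contacts (user_id, type, value) VALUES (?, ?, ?)",
    [(1, "e", "ann@example.com"), (1, "e", "ann@work.example"), (1, "p", "555-0100")],
)

# All e-mail addresses of one user: one lookup on (user_id, type).
emails = [row[0] for row in conn.execute(
    "SELECT value FROM contacts WHERE user_id = ? AND type = 'e' ORDER BY id", (1,)
)]
```

Only the fields that may repeat live in the extra table; name and surname stay as plain columns on users.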

KM 2009-06-04 17:23:39

Of course, not all the fields will have the 'duplicate' option. Man, don't focus on the email-phone example, it is just an example. For the users table, there would be no more than 5 'multiple' fields, I guess... but what happens if a user needs a field that I haven't included?

DaNieL 2009-06-05 06:17:19

Just edited the question.

DaNieL 2009-06-05 06:17:33

The app has to know what is permitted to have multiples. If that is the case, just split out those tables from the get-go, optimize your queries for those tables, and forget about making an all-powerful stores-anything table that will zap performance and make writing your queries hard.

KM 2009-06-05 12:59:19

Answer 2

+4 A:

As I said in my Answer to a similar question, "Database Design is Hard." You are going to have to make the decision about which is better for you, normalizing the tables and bringing phone numbers and e-mail addresses into their own tables, with the associated JOIN-ing to retrieve the data, and the extra effort of referential integrity, or having some number n e-mail and phone fields in your table, and the "data-messiness" that that entails.

Database design is always a series of tradeoffs. You need to look at all angles, maybe bodge up some prototypes and do some profiling, etc. There is no "One True Answer™".

Adrien 2009-06-04 15:53:15

+1: Database design is hard. If you want to give them unlimited customization, you have to give them the code.

S.Lott 2009-06-04 15:58:35

Seems to be true. Only good advice can be given here... but not answers.

Jet 2009-06-04 16:00:21

I agree with you that there is no 'one true answer', but maybe there will be a best answer... and that is why I am here: looking for other ideas about this kind of design.

DaNieL 2009-06-05 06:09:50

@S.Lott: I can't give them the code. The application will be used by users who are able only to surf the web, not to customize their database or code.

DaNieL 2009-06-05 06:11:30

Answer 3

A:

You could design your application to request additional data (such as the user's list of e-mails) on demand, using AJAX etc. In highly customizable, rich applications you usually have no need to display all the data at once, only a single category.

To store custom records you can create a table field_types(id, name, datatype) and a table custom_fields(user_id, field_type_id, value), and then select something like this:

SELECT * FROM custom_fields WHERE user_id=XXX AND field_type_id IN (X,Y,Z);

Now you can retrieve the data in 1 fast query, split the fields into categories, and parse their values by their respective datatypes in your code, without performance issues.
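Roughly, the two tables and the "parse by datatype" step could look like the sketch below. It uses Python's sqlite3 module so it is runnable as-is. The datatype labels ("text", "int", "bool"), the CASTS mapping, the load_fields helper, and the join to field_types are assumptions added for illustration; only the table and column names come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE field_types (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        datatype TEXT NOT NULL
    );
    CREATE TABLE custom_fields (
        user_id       INTEGER NOT NULL,
        field_type_id INTEGER NOT NULL REFERENCES field_types(id),
        value         TEXT NOT NULL
    );
    CREATE INDEX custom_fields_lookup ON custom_fields (user_id, field_type_id);
""")
conn.executemany(
    "INSERT INTO field_types VALUES (?, ?, ?)",
    [(1, "email", "text"), (2, "age", "int"), (3, "newsletter", "bool")],
)
conn.executemany(
    "INSERT INTO custom_fields VALUES (?, ?, ?)",
    [(7, 1, "a@example.com"), (7, 1, "b@example.com"), (7, 2, "42"), (7, 3, "1")],
)

# Hypothetical mapping from the stored datatype label to a Python converter.
CASTS = {"text": str, "int": int, "bool": lambda v: v == "1"}

def load_fields(conn, user_id, type_ids):
    """Fetch the requested custom fields in one query and cast each value."""
    placeholders = ",".join("?" for _ in type_ids)
    rows = conn.execute(
        f"""SELECT ft.name, ft.datatype, cf.value
            FROM custom_fields cf
            JOIN field_types ft ON ft.id = cf.field_type_id
            WHERE cf.user_id = ? AND cf.field_type_id IN ({placeholders})
            ORDER BY cf.rowid""",
        (user_id, *type_ids),
    )
    result = {}
    for name, datatype, value in rows:
        # A field may occur several times, so values are collected in a list.
        result.setdefault(name, []).append(CASTS[datatype](value))
    return result

fields = load_fields(conn, 7, [1, 2])
```

All values are stored as text, so the casting happens in application code, which is the trade-off this design accepts.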

Jet 2009-06-04 15:55:15

So, 1 table with the custom fields for every object? That could work... even partitioning would be easier, I guess.

DaNieL 2009-06-04 16:29:45

It depends on the data, but in most cases a table with custom fields should be better.

Jet 2009-06-09 14:39:00

Answer 4

A:

I'm not sure about the specifics of PostgreSQL, but if you want highly customisable data structures in a DB that you don't really want to search on, then serializing the data to a LOB is an option.

In fact, this is the way ASP.NET works by default with Personalization, which stores per-user settings.

I don't recommend this approach if you wish to search the fields for any reason.

RichardOD 2009-06-04 16:08:26

Answer 5

A:

Your proposed model is composed of two database patterns: an entity-attribute-value table and a polymorphic association.

Entity-attribute-value has some pretty big issues in both the performance and data integrity departments. If you don't need to access the additional attributes in queries, then you can serialize the attribute-value mapping to a text field in some standard serialization format (JSON, XML). This is not "pure" from the database design standpoint, but it is possibly a good pragmatic choice, provided that you are aware of the tradeoffs. On Postgres you can also use the hstore contrib module to store key-value pairs and make them usable in queries, if the limitation of string-only values is acceptable.
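As a rough illustration of the serialization option, the sketch below keeps the extra attributes as JSON in a single text column. It uses Python's sqlite3 and json modules; the extra column and the save_extra / load_extra helpers are hypothetical names. It does not show hstore, which is specific to Postgres.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        extra TEXT NOT NULL DEFAULT '{}'  -- serialized attribute-value mapping
    )
""")

def save_extra(conn, user_id, attrs):
    """Serialize the mapping and store it on the owner row."""
    conn.execute(
        "UPDATE users SET extra = ? WHERE id = ?",
        (json.dumps(attrs, sort_keys=True), user_id),
    )

def load_extra(conn, user_id):
    """Return the deserialized mapping, or None if the user does not exist."""
    row = conn.execute("SELECT extra FROM users WHERE id = ?", (user_id,)).fetchone()
    return json.loads(row[0]) if row else None

conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ann')")
save_extra(conn, 1, {"nickname": "annie", "emails": ["ann@example.com"]})
extra = load_extra(conn, 1)
```

Reading and writing are cheap, but the database cannot see inside the column, which is exactly the limitation on querying described above.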

For polymorphic association, you can get referential integrity by introducing an association table:

users                attrib_assocs       custom_attribs
-----                -------------       --------------
attrib_assoc_id -->  id             <--  assoc_id
...                  entity_type         field
                                         value

To get slightly more integrity, also add the entity_type to the primary key and the corresponding foreign keys, and add a check constraint on the users table that the entity_type equals 'user'.
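One way to read the diagram and the composite-key suggestion is sketched below with Python's sqlite3 module, chosen only to keep the example runnable. The column types, the sample rows, and the try_insert helper are assumptions; the table and column names follow the diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE attrib_assocs (
        id          INTEGER NOT NULL,
        entity_type TEXT    NOT NULL,
        PRIMARY KEY (id, entity_type)
    );
    CREATE TABLE users (
        id              INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        attrib_assoc_id INTEGER NOT NULL,
        entity_type     TEXT NOT NULL DEFAULT 'user' CHECK (entity_type = 'user'),
        FOREIGN KEY (attrib_assoc_id, entity_type)
            REFERENCES attrib_assocs (id, entity_type)
    );
    CREATE TABLE custom_attribs (
        assoc_id    INTEGER NOT NULL,
        entity_type TEXT    NOT NULL,
        field       TEXT    NOT NULL,
        value       TEXT,
        FOREIGN KEY (assoc_id, entity_type)
            REFERENCES attrib_assocs (id, entity_type)
    );
""")
conn.execute("INSERT INTO attrib_assocs VALUES (1, 'user')")
conn.execute("INSERT INTO attrib_assocs VALUES (2, 'product')")
conn.execute("INSERT INTO users (id, name, attrib_assoc_id) VALUES (1, 'Ann', 1)")
conn.execute("INSERT INTO custom_attribs VALUES (1, 'user', 'nickname', 'annie')")

def try_insert(sql):
    """Return True if the statement is accepted, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A user cannot point at an association row that belongs to another entity type.
wrong_type_ok = try_insert(
    "INSERT INTO users (id, name, attrib_assoc_id) VALUES (2, 'Bob', 2)"
)
# An attribute cannot reference an association that does not exist.
orphan_ok = try_insert("INSERT INTO custom_attribs VALUES (99, 'user', 'x', 'y')")
```

Both rejected inserts would be accepted by a plain polymorphic association, where the database has no way to check the reference.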

Ants Aasma 2009-06-04 16:13:02

WTF, a text field with the values in a JSON array? No mate, thanks for the answer, but as I specified before, I'll need to query the additional fields as well as the normal fields...

DaNieL 2009-06-05 06:14:04

If the ability to query truly schema-free data is a requirement, then EAV is the pattern to go for. Depending on the performance requirements, you may still need to denormalize the data, e.g. use a trigger to keep an aggregated form of the data on the owner row. My experience with an EAV table of a few hundred million rows is that there is almost two orders of magnitude of performance difference between joining the data and having it available on the owner row. With Postgres you can create GIN indexes over the aggregated form to get similar speedups for querying.
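The trigger idea can be sketched as follows. This is not the Postgres setup described above: it uses Python's sqlite3 module so it runs without a server, and it aggregates into a plain 'field=value;...' string instead of an indexable Postgres type. The attrs column, the trigger name, and the string format are assumptions; triggers for UPDATE and DELETE are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        attrs TEXT NOT NULL DEFAULT ''   -- aggregated copy of the EAV rows
    );
    CREATE TABLE custom_attribs (
        user_id INTEGER NOT NULL REFERENCES users(id),
        field   TEXT NOT NULL,
        value   TEXT NOT NULL
    );
    -- Rebuild the aggregated form on the owner row after every insert.
    CREATE TRIGGER custom_attribs_after_insert
    AFTER INSERT ON custom_attribs
    BEGIN
        UPDATE users
        SET attrs = (
            SELECT group_concat(field || '=' || value, ';')
            FROM custom_attribs
            WHERE user_id = NEW.user_id
        )
        WHERE id = NEW.user_id;
    END;
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ann')")
conn.executemany(
    "INSERT INTO custom_attribs VALUES (?, ?, ?)",
    [(1, "nickname", "annie"), (1, "city", "Oslo")],
)

# The attributes are now readable from the owner row without a join.
attrs = conn.execute("SELECT attrs FROM users WHERE id = 1").fetchone()[0]
pairs = dict(item.split("=", 1) for item in attrs.split(";"))
```

The EAV table stays the source of truth; the aggregated column is a read-optimized copy that the trigger keeps in sync.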

Ants Aasma 2009-06-05 21:42:43
