views:

46

answers:

6

Here is the full details.

The web application creates report from CSV files There are different formats of CSV files User will be able to add new formats or will be able to add new fields.

My client want me to create table for each type of the CSV file ... So for each new formats there will be a new table ... I believe this is a bad design decision. But not sure how to explain this to my client.

Here are the reasons I can think of right now:

  1. It is a bad design decision
  2. After one year there will be like >10000 tables and the System will become unstable
  3. Implementation is going to be time consuming
  4. Maintenance is going to be difficult
  5. Data will get unmanageable eventually
  6. Not possible to migrate to other servers easily

Are my reasons valid? What do you think? Can you give me some light on this?

Thank you very much for your help

A: 

If you're just creating report from CSV files, why create tables for it. You could just store the CSV and create the reports on the fly...

BTW It's a bad design decision because... "it's a bad design decision", is a good example of why business users don't trust programmers.

pascal
thats what I am doing right now ... Yeah I am pretty bad at explaining sometimes
tmahmood
A: 

You provide negatives. I think if you additionally provide good solution with positives you have much more chances.

How often will new format come? May be dynamic changes of one table (as you said it could be only columns adding) is decision?

Donz
Well I already have a decent way (as I believe) implemented. That does what they want. They new formats comes quiet frequently.
tmahmood
+1  A: 

Taking your points:

  1. Although true, you will need to back this up with why.
  2. Unstable? Depends on the DBMS you are using.
  3. Not necessarily. Creating and dropping tables is a relatively simple task.
  4. Again, not necessarily. The tables should be managed entirely by the application.. which means little to no direct maintenance
  5. Not sure how you figure this.
  6. Moving to servers should be as simple as backing up and restoring the database to another machine. Assuming you can't do that, then there are a plethora of tools out there to do this for you.

That said, the reasons why I wouldn't do this are:

  1. Security. In order to allow the app to manage tables you will have to give the user rights to create/drop them. This opens up the possibility for a hacker to completely nuke the db. (POET anyone?)
  2. Security. You are pretty much guaranteed that you will be using dynamic sql to accomplish this with regular DBMSs which means you have to be aware of SQL Injection possibilities.
  3. Security. Did I mention that the app user would need basically full rights over your DB?

There are at least two alternatives. One is to use a NoSql database like MongoDB. It can create and alter tables on the fly with very little guidance by you. This type of relatively unstructured data is exactly what those database systems are for. You might consider just having the uploads go to MongoDB or similar while the rest of your app uses a traditional RDBMs.

A second is to use a table design which supports virtual tables. Meaning you have a main table holding the table names, another which holds the fields, and a final one which holds the data in a name/value pair association. This path is not for the faint of heart as reporting can be a b*tch, but skips past all of the security issues.

Chris Lively
Thank you very much! I am using Php+MySql, and no options to use other DBMS. I think Mysql is already unstable enough. I can do export data but not sure my client would be able to handle that by himself. I love your second idea. I have something similar to this already but its not good enough yet cause I still suck at designing :(
tmahmood
+1  A: 

Don't confuse what a customer wants the program to do, with how you actually achieve it.

They want your program to read in CSV files and report on them, that is it. It is up to you and your technical skills to achieve that, this customer should have ZERO say in how you implement this. Internal table and application design is not for the end users to dictate.

Would this guy tell a doctor how to cut into his aorta?

KM
I agree .. and I am trying to explain that to them ... but they are being a little obsessed with this.
tmahmood
A: 

What is the actual requirement being served here? Where is the data coming from that populates the CSV files? Why do you need another copy of that data?

It seems likely that the data would be unusable in the form you have described it. But since you haven't said what you actually want to achieve it's hard to be more specific.

dportas
there are lots of problem with their suggested way ... the more I think and research on this the more problem I see .. but some reason they are getting crazy about the multi-table solutuin
tmahmood
A: 

You are missing a critical piece of information in the presentation of the problem: What information is represented by the CSV files you are collecting?

As a general principle, if the CSV files are simply different ways of presenting the same kind of information (that is, some people send you more or fewer detail fields about the same "things") then, if you're collecting this information in relational database, I would say that you definitely want to have one common table (or a small group of common tables to correctly represent the data). You would have a separate import program for each format of CSV file that you might need to convert into your "thing", but the storage would be the same no matter what kind of CSV file the "thing" came in.

On the other hand, if each CSV file represents different "things" and you're really building a general-purpose CSV reporting tool of some kind, then I question whether a relational database is even the correct storage mechanism for the project although I would want to know more details about what you're actually doing before making any kind of categorical statement.

Larry Lustig
There are different kinds of information. some can be person's information some can be location. And seems there will be a lot of formats will be added ... I need to generate reports in other different formats ...
tmahmood