views: 54

answers: 1

I'm trying to develop a web-based digital asset management application. I'm planning to build it in CodeIgniter with MySQL as the database. It's for a popular regional newspaper. They will have thousands of entries and terabytes of data, since tons of information will be entered daily.

There will be different content types like Media, Personality, Event, Issue, Misc, etc. All of this will be categorized, and everything will be interconnected. For example, the event "Olympics" will be connected to all the participants in the Personality table and all the media related to it. I'm planning to implement this complex interconnection using a table called 'connections':

 id | subject | connection | type
----------------------------------
  1 |      98 |        190 | media
  2 |     283 |        992 | issue
  3 |     498 |        130 | info

So when a person opens the event "Olympics", all its connections will be populated from this table. The 'subject' column will hold the id of 'Olympics' and 'connection' will hold the id of the connected entry.
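Roughly, this is what I have in mind for the table and the lookup (the column types, storage engine and id values below are just placeholders):

    -- Sketch of the 'connections' table
    CREATE TABLE connections (
        id           INT UNSIGNED NOT NULL AUTO_INCREMENT,
        subject      INT UNSIGNED NOT NULL,  -- id of the entry being viewed, e.g. 'Olympics'
        `connection` INT UNSIGNED NOT NULL,  -- id of the connected entry
        type         VARCHAR(20)  NOT NULL,  -- media, issue, info, ...
        PRIMARY KEY (id)
    ) ENGINE=InnoDB;

    -- Pull everything connected to a given subject, e.g. the 'Olympics' event
    SELECT `connection`, type
    FROM connections
    WHERE subject = 98;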

Is there a better way to do this? The content will have to be searched based on hundreds of different criteria, but there will be very few end users. Only the newspaper's reporters (max 100) will have access to this app, so the traffic and load will be very low, but the amount of information stored will be very high. I would like to hear from experienced developers, as I don't have much experience building something this big.

+3  A: 

This is a complex question, in that you need to know a lot about tuning and configuring your MySQL database in order to handle both the load and the data. With such a low number of users you will be okay in terms of connections, so query execution time is the real bottleneck.

If you are on a 32-bit server, the maximum for a table is about 4.2 billion rows and 4 GB without any configuration changes. You can raise the 4 GB table size limit, but as far as I know the 4.2 billion row limit is the ceiling on a 32-bit server.
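If the 4 GB cap in question is MyISAM's default pointer size (that is an assumption on my part about which limit you'll hit), it can be raised per table with the MAX_ROWS/AVG_ROW_LENGTH table options:

    -- Only relevant for MyISAM tables; the values here are just an illustration
    ALTER TABLE connections MAX_ROWS = 1000000000, AVG_ROW_LENGTH = 50;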

Your table seems like it would be okay, but I would change "type" to an ENUM so the data is not stored as text (this reduces the overall table size).
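For example, something along these lines (the exact list of type values is a guess based on the content types you mention):

    ALTER TABLE connections
        MODIFY type ENUM('media', 'personality', 'event', 'issue', 'info', 'misc') NOT NULL;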

You will have to index this table properly, and from the looks of it the index would be on (subject, type). Without hard numbers or example queries with joins it's hard to guesstimate how fast a query would run, but if the table is indexed properly and the index has high cardinality you should be okay.
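Assuming the typical lookup is "all connections of a given type for a given subject", a composite index like this would be my starting point (adjust it to the queries you actually run):

    -- Composite index matching the subject + type lookup pattern
    ALTER TABLE connections ADD INDEX idx_subject_type (subject, type);

    -- A query that can use this index
    SELECT `connection`
    FROM connections
    WHERE subject = 98
      AND type = 'media';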

You can always throw a Memcache layer between PHP and MySQL to cache some results, which will get you better performance if users execute similar searches. With "hundreds of different criteria", though, you will most likely be hitting the database quite a bit anyway.

Alternatively, you could also take a look at some NoSQL options such as MongoDB, which, depending on your data, might be a better fit.

methodin
"Contrarily you could also take a look at some NoSQL options such as MongoDB which depending on your data might be a better fit." http://highscalability.com/blog/2010/9/5/hilarious-video-relational-database-vs-nosql-fanbois.html
iddqd