views:

78

answers:

1

If we have table such as this

message: Id (INT) | message(TEXT) | date(TIMSTAMP)

And lets imagine that there is constantly data comming into this table. For example

1, 'Hello World', 12345678;

2, 'Yoooo Whats Up', 12345679;

3, 'My name is Hal!', 12345680;

Now this table is huge and is never deleted

SELECT * FROM articles WHERE MATCH (message) AGAINST ('Hal');

Now would this be feasable and what are the pitfalls

I was also thinking to have a group value that group each day. Because most of the queries are represented what came in on what day so if we change the DB structure to

message: Id (INT) | group(INT) | message(TEXT) | date(TIMESTAMP)

Where the group is an ID value that will be incremented once we are into a new day

Also what would be the best way to increment the group, Can it done be using MySQL or do we have to do it programaticly

Then we could do this to cut down time by using the below

SELECT * FROM articles WHERE MATCH (message) AGAINST ('Hal') WHERE group=3;

Let me know your thoughts or ideas on what would be the best way to store a huge database table and query it to return specific values.

A: 

After doing some research i have come accross some concepts and ideas that will make things in a massive SQL database faster.

The grouping idea is not a bad one as it will shorten the size of the Balanced Tree indexed stored in the database however having said that how do you decide the group order. Do you base it on Daily, Monthly, or using some other criteria.

Which every way you decide the group order keep in mind the structure of the Balanced Tree Index.

My Project above has a massive scope compared to the above example, but from the above i would make an index for date. This will give me a balanced tree for the primary key and a balanced tree for date. Therefore i can work my way through both the balanced tree to get to the value i require.

Bear in mind the pitfalls and advantages of balanced tree Indexing structure:

You need to remember certial aspects.

  • Balanced tree provides you with faster search on the cost of memory.
  • Balanced tree if not design carefully can increase the size of your table, So much so that the balanced tree uses more space than the data it self. So be conservitive on balanced tree. Dont just throw each colum at it
  • Design your code in such a way that it uses the colums that are indexed to shorten the return query size.

Hope this helps other people that have to design tables with massive data sizes :D

Shahmir Javaid