I am looking at storing some JMX data from JVMs on many servers for about 90 days. This data would be statistics like heap size and thread count. This will mean that one of the tables will have around 388 million records.

From this data I am building some graphs so you can compare the stats retrieved from the MBeans. This means I will be grabbing data at intervals, filtering by timestamp.

So the real question is: is there any way to optimize the table or the queries so they run in a reasonable amount of time?

Thanks,

Josh

+2  A: 

3 suggestions:

  1. index
  2. index
  3. index

P.S. For timestamps you may run into performance issues: depending on how MySQL handles DATETIME and TIMESTAMP internally, it may be better to store timestamps as plain integers (seconds since 1970, or whatever).
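
A minimal sketch of the integer-timestamp idea, against a hypothetical jmx_stats table (the table, column names, and values are assumptions for illustration, not from the question):

    -- Hypothetical raw-samples table; recorded_at holds seconds since the epoch.
    CREATE TABLE jmx_stats (
        id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        server_id   INT UNSIGNED    NOT NULL,
        metric      VARCHAR(32)     NOT NULL,  -- e.g. 'heap_used', 'thread_count'
        recorded_at INT UNSIGNED    NOT NULL,  -- UNIX_TIMESTAMP() at collection time
        value       BIGINT          NOT NULL
    );

    -- Write with UNIX_TIMESTAMP() so the column stays a plain integer...
    INSERT INTO jmx_stats (server_id, metric, recorded_at, value)
    VALUES (42, 'heap_used', UNIX_TIMESTAMP(), 536870912);

    -- ...and convert back with FROM_UNIXTIME() only for display.
    SELECT FROM_UNIXTIME(recorded_at) AS taken_at, value
    FROM jmx_stats
    WHERE server_id = 42
      AND metric = 'heap_used'
      AND recorded_at >= UNIX_TIMESTAMP() - 86400;  -- last 24 hours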

Jason S
+3  A: 

Well, for a start, I would suggest you use "offline" processing to produce 'graph ready' data (for most of the common cases) rather than trying to query the raw data on demand.
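
A sketch of that idea, reusing the hypothetical jmx_stats table from the answer above: a job run offline (cron, or a MySQL event) rolls the raw samples up into hourly aggregates, and the graphs query only the small rollup table. All names and the window arithmetic are illustrative:

    -- Pre-aggregated, graph-ready rollup: one row per server/metric/hour.
    CREATE TABLE jmx_stats_hourly (
        server_id  INT UNSIGNED NOT NULL,
        metric     VARCHAR(32)  NOT NULL,
        hour_start INT UNSIGNED NOT NULL,  -- epoch seconds, truncated to the hour
        avg_value  DOUBLE       NOT NULL,
        max_value  BIGINT       NOT NULL,
        PRIMARY KEY (server_id, metric, hour_start)
    );

    -- Run offline, e.g. once per hour for the previous hour.
    INSERT INTO jmx_stats_hourly (server_id, metric, hour_start, avg_value, max_value)
    SELECT server_id, metric,
           recorded_at - (recorded_at % 3600) AS hour_start,
           AVG(value), MAX(value)
    FROM jmx_stats
    WHERE recorded_at >= UNIX_TIMESTAMP() - 7200
      AND recorded_at <  UNIX_TIMESTAMP() - 3600  -- roughly the previous hour
    GROUP BY server_id, metric, hour_start
    ON DUPLICATE KEY UPDATE avg_value = VALUES(avg_value),
                            max_value = VALUES(max_value);  -- safe to re-run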

Andrew Rollings
+7  A: 

There are several things you can do:

  1. Build your indexes to match the queries you are running. Run EXPLAIN to see how each query is executed, and make sure they all use an index where possible (a sketch follows this list).

  2. Partition your table. Partitioning is a technique for splitting a large table into several smaller ones by a specific (aggregate) key. MySQL supports it natively from version 5.1 (see the second sketch below).

  3. If necessary, build summary tables that cache the costlier parts of your queries, and run your queries against those instead. Similarly, temporary in-memory tables can be used to store a simplified view of your table as a pre-processing stage (see the third sketch below).
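
For point 1, a sketch against the hypothetical jmx_stats table introduced earlier; the composite index matches the typical graphing query (filter by server and metric, range-scan by time):

    -- Composite index covering the WHERE clause below.
    CREATE INDEX idx_server_metric_time
        ON jmx_stats (server_id, metric, recorded_at);

    -- EXPLAIN shows whether the index is actually used: look for
    -- type=range and key=idx_server_metric_time in the output.
    EXPLAIN
    SELECT recorded_at, value
    FROM jmx_stats
    WHERE server_id = 42
      AND metric = 'thread_count'
      AND recorded_at BETWEEN 1230000000 AND 1230086400;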
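For point 2, a sketch of RANGE partitioning by time on MySQL 5.1 or later. The partition boundaries are illustrative; note that 5.1 requires any primary or unique key to include the partitioning column:

    -- One partition per month of epoch-seconds; expired months can be
    -- dropped cheaply with ALTER TABLE ... DROP PARTITION instead of a
    -- huge DELETE, which suits a 90-day retention window.
    CREATE TABLE jmx_stats_part (
        server_id   INT UNSIGNED NOT NULL,
        metric      VARCHAR(32)  NOT NULL,
        recorded_at INT UNSIGNED NOT NULL,
        value       BIGINT       NOT NULL,
        PRIMARY KEY (server_id, metric, recorded_at)  -- must include recorded_at
    )
    PARTITION BY RANGE (recorded_at) (
        PARTITION p200901 VALUES LESS THAN (UNIX_TIMESTAMP('2009-02-01')),
        PARTITION p200902 VALUES LESS THAN (UNIX_TIMESTAMP('2009-03-01')),
        PARTITION p200903 VALUES LESS THAN (UNIX_TIMESTAMP('2009-04-01')),
        PARTITION pmax    VALUES LESS THAN MAXVALUE
    );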
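And for point 3, a sketch of the in-memory pre-processing idea using the MEMORY engine; the metric name and time window are assumptions:

    -- Stage a narrow slice of the big table in RAM, then run the
    -- expensive aggregations against the small staged copy.
    CREATE TEMPORARY TABLE recent_heap ENGINE=MEMORY AS
    SELECT server_id, recorded_at, value
    FROM jmx_stats
    WHERE metric = 'heap_used'
      AND recorded_at >= UNIX_TIMESTAMP() - 86400;  -- last 24 hours only

    SELECT server_id, MAX(value) AS peak_heap
    FROM recent_heap
    GROUP BY server_id;

    DROP TEMPORARY TABLE recent_heap;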

Eran Galperin
+1  A: 

If you are using MySQL 5.1 you can use its new features, but be warned that they still contain a lot of bugs.

First you should use indexes. If this is not enough, you can try to split the table by using partitioning.

If this also doesn't work, you can try load balancing; a sketch of one common approach follows.
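
A rough sketch of what load balancing typically means here: MySQL replication, with writes going to a master and the graphing SELECTs spread across read-only slaves. All connection values below are placeholders:

    -- On each read slave (MySQL 5.x syntax). Writes still go to the master;
    -- the graphing application then distributes its SELECTs across the slaves.
    CHANGE MASTER TO
        MASTER_HOST     = 'master.example.com',  -- placeholder host
        MASTER_USER     = 'repl',                -- placeholder replication account
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000001',    -- placeholder binlog coordinates
        MASTER_LOG_POS  = 4;
    START SLAVE;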

Bernd Ott