views:

204

answers:

8

I have a table which consists of 200 Companies Stock prices for 5 years. This is one large table which consists of Company Name, Stock Open, High, Low, Close, Date

I am now required to do some processing on the same and also let users [up to 10] access this database to fetch reports on different sets of parameters and queries.

Should I use the database as it is or do you have any suggestion to make it more optimized.

Thanks.

+1  A: 

If it is really just company name with that data then it is already normalized. If there was more about the company like address, phone, etc then you would want to break that out into a separate table.

carson
A: 

i'd have a table for the company, and a table for the stock prices on a given day (the open/high/low/close bits), to save duplicating the company info everywhere.

benlumley
+3  A: 

Pull out the name and use an integer ID. It should be faster and will tolerate name changes. The stock symbol can also be extracted to the parent table.

Nerdfest
My earlier post was very early. It will not actually speed the retrieval up by much for the raw data, and will likely slow it down slightly because of a join. There is value in the normalization if you're adding any other company data, or allowing the ability to change the name though.
Nerdfest
+2  A: 

I think you need to consider the report, are they always going to be month by month, for example? if so you could create a table of aggregated data.

Otherwise i think careful indexes are your only option for performance

Andrew Bullock
The question isn't about optimization it is about normalization.
carson
Ok I'll never think outside the box and sit in my cave of ignorance. There was very little room for normalisation, and his problem is clearly query speed. My answer isnt about normalisation no, but it addresses what appears to be his underlying issue
Andrew Bullock
I think it is important that the topic be understood as it is asked. Normalization is an important part of database design and just shouldn't be confused with optimization. I would argue this table doesn't need to be optimized either.
carson
No I agree with you, still, just trying to suggest ideas he might not have thought of to help out
Andrew Bullock
Every analyst will have his own query and nothing is fixed. As said by Ken [below], if things become slow or an issue then I can post it.Thanks for your time and suggestions.
A: 

I would add an UID field and several Dimensions for Date (i.e. years table, years+months table, years+quarters table, fiscal years table, etc.).

vmarquez
A: 

Normalized and optimized are not always the same thing.

What are your users going to do with the data?

Walter Mitty
A: 

This is not an optimization (although you could argue that it is a normalization provided companies can change name):

CREATE TABLE company (
  id INTEGER PRIMARY KEY, -- Well, this would be a serial, but that works different in different DBMS
  name VARCHAR(256) UNIQUE
);

CREATE TABLE price (
  company_id INTEGER REFERENCES company(id) NOT NULL,
  date  TIMESTAMP NOT NULL,
  open  DECIMAL, -- Just grabbed a type here, probably not right for you.
  high  DECIMAL,
  low   DECIMAL,
  close DECIMAL,

  PRIMARY KEY(company_id, date)
);

See here for info on the key generation.

How do you handle companies changing name, by the way? By ignoring it would be the simple answer, but is it correct? :)

So, anyway if the table grows too big for good performance I would just partition it.

Henrik Gustafsson
Caveat: I only wrote this without testing it at all. It probably doesn't work :)
Henrik Gustafsson
+1  A: 

A misquote of someone:

Rules of Optimization

  1. Don't do it.
  2. For EXPERTS only: Don't do it yet.

If the question is "... do I leave it alone or do I make it more optimized", leave it alone until you know, by measurement, that there is a problem.

If there is a problem with the query or update of the table, then update your question with details about the query, any indexes, how frequently the table is updated/accessed, etc. You'll get all kinds of suggestions at that point.

As was mentioned earlier, as far as normalization goes, you might consider extracting Company Name to its own table if the same Company Name appears multiple times in the table.

Ken Gentle