views:

117

answers:

4

Hello, Maybe you can help me. I need to query 3 tables in order to get data for a financial stock.

The idea is to go to the instruments table, find the index for each instrument and then bring all the prices for that particular instrument together with the indicators that are on a separate table.

Tables stockdata and indicators are both almost 50.000 records. instruments just 30.

This is the query that is not working:

SELECT
  indicators.ddate,
  instruments.name,
  indicators.sma_14,
  indicators.sma_5,
  stockdata.close
FROM
 indicators
 INNER JOIN instruments ON (indicators.instrument_id=instruments.id)
 INNER JOIN stockdata ON (instruments.name=stockdata.name)

Here is the EXPLAIN result

+----+-------------+-------------+-------+-----------------------------+---------------------+---------+------+-------------+
| id | select_type | table       | type  | possible_keys               | key                 | key_len | rows | Extra       |
+----+-------------+-------------+-------+-----------------------------+---------------------+---------+------+-------------+
| 1  | SIMPLE      | instruments | index | PRIMARY,instruments_index01 | instruments_index01 |      61 |   25 | Using index |
| 1  | SIMPLE      | indicators  | ref   | indicators_index01          | indicators_index01  |       5 |  973 | Using where |
| 1  | SIMPLE      | stockdata   | ref   | stockdata_index01           | stockdata_index01   |      31 | 1499 | Using where |
+----+-------------+-------------+-------+-----------------------------+---------------------+---------+------+-------------+

I really appreciate any help you can provide!

This is the schema for the parts of the tables that are involved in my question:

TABLE `indicators` (
  `id`             int AUTO_INCREMENT NOT NULL,<br>
  `instrument_id`  int,
  `date`           date,
  `sma_5`          float(10,3),
  `sma_14`         float(10,3),
  `ema_14`         float(10,3),
  /* Keys */
  PRIMARY KEY (`id`)
)

TABLE `instruments` (
  `id`         int AUTO_INCREMENT NOT NULL,
  `name`       char(20),
  `country`    char(50),
  `newsquery`  char(100),
  /* Keys */
  PRIMARY KEY (`id`)
)

TABLE `stockdata` (
  `id`        int AUTO_INCREMENT NOT NULL,
  `name`      char(10),
  `date`      date,
  `open`      float,
  `high`      float,
  `low`       float,
  `close`     float,
  `volume`    int,
  `adjclose`  float,
  /* Keys */
  PRIMARY KEY (`id`)
)
+1  A: 
SELECT 
  ind.ddate,
  ins.name,
  ind.sma_14,
  ind.sma_5,
  sto.close
FROM indicators ind
JOIN instruments ins ON ind.instrument_id = ins.instrument_id
JOIN stockdata sto ON ins.name = sto.name

another option:

select ind.ddate, ins.name, ind.sma_14, ind.sma_5, 
     (select close from stockdata where name = ins.name limit 1) as close
from indicators ind
join instruments ins on ind.instrument_id = ins.instrument_id
Fosco
Hello, the second option worked as follows select indicators.`date`, instruments.name, indicators.sma_14, indicators.sma_5, (select close from stockdata where name = instruments.name limit 1) as closefrom indicators indicatorsjoin instruments instruments on indicators.instrument_id = instruments.idBut still took : 44619 rows fetched (29,42 sec)
JordanBelf
+1  A: 

I am suspicious of joining on the stockdata.name field. Do you have the right indexes defined on the name field in the stockdata and instruments table? Is it possible that joining on the name could be returning invalid results and you could join on another .id field?

Wil P
the thing is that the stockdata table does not have an instrument_id column, thats why I need to bring in the insturment table to get the intrument id for that given name from there and then join with the indicators table.
JordanBelf
A: 

You're querying on the nonindexed field name in stockdata. Either create an index or instead join on id. (I would do the latter, changing name to id in instruments)

Visage
+4  A: 

You are joining the indicators table to the instruments table, and the indicators.instrument_id column is not indexed.

You are also joining the instruments table to the stockdata table using the instruments.name and stockdata.name columns, both of which are type CHAR. Joining using CHAR or VARCHAR is usually significantly slower than joining using INT columns:

Using CHAR keys for joins, how much is the overhead?

To make matters worse, your CHAR columns are different sizes (char(20) and char(10) respectively), and they are not indexed. This really makes things difficult for MySQL! See How MySQL Uses Indexes for more information.

Ideally, you should alter your table structure so that the join can be performed using indexed INT fields. Something like this:

CREATE TABLE `instruments` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` char(20) DEFAULT NULL,
  `country` char(50) DEFAULT NULL,
  `newsquery` char(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

CREATE TABLE `indicators` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `instrument_id` int(11) DEFAULT NULL,
  `date` date DEFAULT NULL,
  `sma_5` float(10,3) DEFAULT NULL,
  `sma_14` float(10,3) DEFAULT NULL,
  `ema_14` float(10,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_instrument_indicators` (`instrument_id`),
  CONSTRAINT `fk_instrument_indicators` FOREIGN KEY (`instrument_id`) REFERENCES `instruments` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB;

 CREATE TABLE `stockdata` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `instrument_id` int(11) NOT NULL,
  `name` char(20) DEFAULT NULL,
  `date` date DEFAULT NULL,
  `open` float DEFAULT NULL,
  `high` float DEFAULT NULL,
  `low` float DEFAULT NULL,
  `close` float DEFAULT NULL,
  `volume` int(11) DEFAULT NULL,
  `adjclose` float DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_instrument_stockdata` (`instrument_id`),
  CONSTRAINT `fk_instrument_stockdata` FOREIGN KEY (`instrument_id`) REFERENCES `instruments` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB;

Then use the indexed fields in your join:

SELECT
  indicators.date,
  instruments.name,
  indicators.sma_14,
  indicators.sma_5,
  stockdata.close
FROM
 indicators
 INNER JOIN instruments ON (indicators.instrument_id=instruments.id)
 INNER JOIN stockdata ON (instruments.id=stockdata.instrument_id)

By using indexed INT columns, your joins will be much faster. Using InnoDB constraints will help ensure data integrity.

If there is a reason why you must join on the name column, make both the same size, and index them.

Mike
thanks Mike, I will give it a try. I let you know
JordanBelf