views:

52

answers:

2

I have two MySQL tables, states and trans:

states (200,000 entries) looks like:

id (INT) - also the primary key
energy (DOUBLE)
[other stuff]

trans (14,000,000 entries) looks like:

i (INT) - a foreign key referencing states.id
j (INT) - a foreign key referencing states.id
A (DOUBLE)

I'd like to search for all entries in trans with trans.A > 30. (say), and then return the energy entries from the (unique) states referenced by each matching entry. So I do it with two intermediate tables:

CREATE TABLE ij SELECT i,j FROM trans WHERE A>30.;
CREATE TABLE temp SELECT DISTINCT i FROM ij UNION SELECT DISTINCT j FROM ij;
SELECT energy from states,temp WHERE id=temp.i;

This seems to work, but is there any way to do it without the intermediate tables? When I tried to create the temp table with a single command straight from trans:

CREATE TABLE temp SELECT DISTINCT i FROM trans WHERE A>30. UNION SELECT DISTINCT j FROM trans WHERE A>30.;

it took a longer (presumably because it had to search the large trans table twice. I'm new to MySQL and I can't seem to find an equivalent problem and answer out there on the interwebs. Many thanks, Christian

A: 

this should do the trick, I've tested it in SQL Server 2008, so hopefully it works in MySQL too.

select energy from states
where id in
(
    select distinct i from trans where a > 30
    union
    select distinct j from trans where a > 30   
)
Axarydax
Thanks for your reply, Axarydax, but that's not quite what I need: I'd like a list of each of the unique (distinct) entries from the union of trans.i and trans.j such that trans.A>30, and for each id in this list, I'd like its state.energy value.With your method, I only get those ids which match trans.j and I'll get duplicates within that list.Thanks for your help,Christian
updated original text
Axarydax
Thanks again - I can see how that would work in one line.But when I try it, the command takes a very long time to execute (10 minutes and waiting...), whereas by using intermediate tables it takes less than 10 seconds.Do you know if there's a way to optimize your command in MySQL?
well the command should fine as it is - it tells database what to do, but not how to do it.The database engine is looking over 200 000 000 records in 'trans' table and comparing each 'a' if it is < 30. Try creating index on 'a' column, it may help speed things up.
Axarydax
A: 

OK... with a bit of help from Axarydax and others, I use the (essentially equivalent) command:

SELECT energy FROM states,
(SELECT i FROM trans where A>30. UNION SELECT j i FROM trans WHERE A>30.) temp
WHERE states.id=temp.i;

which works fast enough if I've got an index on i,j and A.