views:

36

answers:

3

Assuming I have 2 large tables with well over 10 Million Rows in each table.

Table 1 is a Movie table with a userID stating the user saw this movie, and Table 2 is a table of Music with a userID stating the user herd this song.

What I want to do is be able to run a query whenever the user wants to know if there is a Song with the same name as a Movie he has seen.

For example.

SELECT movies.name FROM movies, music WHERE movies.name = music.name and movies.userID = '8a80828c2b46d717012b4801d2650002';

I have indexed Movie Name and Music Name in both tables. The problem is when the user calls the query, it seems like it would run slow depending on how many rows there are in the tables and how many movies/music the user has watched/herd.

Assume the user has watched 9000 movies and herd 90k songs. How would I go about optimizing this query to be fast (under 15 seconds max)?

EDIT: Would it make sense for me to use Triggers after each watched movie to go and check if there is a song that exists and update a third table with userID,movieID,musicID? Would the high volume of inserts into the database cause triggers to work slowly which in turn will effect the database performance?

A: 

Why not create an index on movie userid? Since it is in your WHERE clause, it should run faster.

Benoit
Hey Benoit, thanks for the quick answer. How would adding an index on movie userID help?
Faisal Abid
Indexes are useful to filter rows on a table. Here you are filtering table MOVIE on content of column UserID.
Benoit
A: 

Use integer keys, not strings; also move the userId (integer now) check before the string comparison check.

vulkanino
+1  A: 
select name from songs where name in (select name from movies where userid=5);

OR

select s.name from songs s
join (select name from movies where userid=5) as m
on s.name = m.name;
  • Keep an index on userid in movies
  • Keep an index on name in songs
Hasan Khan