I have two database tables: "places" and "translations". To translate place names, I select the records from "places" that do not yet have a translation into the specified language:

SELECT `id`, `name`
FROM `places`
WHERE `id` NOT IN (SELECT `place_id` FROM `translations` WHERE `lang` = 'en')

This worked fine with 7 000 place records, but crashed when the number of translations reached 5 000. Since then, the query runs for about 10 seconds and then fails with the error:

2006 - MySQL server has gone away

As I understand it, the main problem is that the subquery returns too many results, but how can I solve this if I need to select all the places that are not translated yet?

My plan B is to add a new boolean field called "translated" to the "places" table and reset it to "false" each time I change the language, which would avoid the subquery. However, maybe I could just modify my current SQL statement and avoid adding an extra field?

+1  A: 

The obvious alternative:

SELECT
  `id`, `name`
FROM
  `places`
WHERE 
  NOT EXISTS (
    SELECT 1 FROM `translations` WHERE `place_id` = `places`.`id` AND `lang` = 'en'
  )

There should be a clustered composite index over (translations.place_id, translations.lang) (composite means: a single index over multiple fields; clustered means: the index governs how the table is sorted).
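
A minimal sketch of creating such an index in MySQL, assuming the column in "translations" that references a place is `place_id`, as in the question's query. The index name `idx_translations_place_lang` is made up for illustration. Note that in InnoDB the clustered index is always the primary key, so this sketch creates an ordinary secondary composite index; making it clustered would mean defining (place_id, lang) as the table's primary key instead.

```sql
-- Hypothetical index name; column names assumed from the question's query.
CREATE INDEX `idx_translations_place_lang`
  ON `translations` (`place_id`, `lang`);

-- Check whether the optimizer uses the index for the correlated subquery.
EXPLAIN
SELECT `id`, `name`
FROM `places`
WHERE NOT EXISTS (
  SELECT 1 FROM `translations`
  WHERE `translations`.`place_id` = `places`.`id`
    AND `translations`.`lang` = 'en'
);
```

If the index is picked up, the EXPLAIN row for `translations` should list it in the `key` column.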

Tomalak
Thank you, that's exactly what I was missing.
Kernius
@Kernius: Can you post what difference that change makes for you in comparison? (the index is important!)
Tomalak
I had indexed those fields before and was still having that problem, which simply disappeared when I switched to your suggestion.
Kernius