I have two tables, written as name(fields):
data(object_id, property_id, value_id)
and
string(id, value)
All the data is in the "string" table; "data" only refers to the corresponding strings.
For example, I have:
data(1,2,3)
data(1,4,5)
data(6,4,7)
string(1, 'car')
string(2, 'color')
string(3, 'red')
string(4, 'make')
string(5, 'audi')
string(6, 'car2')
string(7, 'toyota')
What I want is that when I delete rows from the data table, all orphaned rows in the string table are deleted as well.
For example, if I delete data(6,4,7), then the strings with ids 6 and 7 should be deleted (because they are no longer used); 4 is still used in another data row and must therefore not be deleted.
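To make the example reproducible, here is a minimal sketch of the sample data and the expected orphans. It assumes SQLite through Python's sqlite3 module with an in-memory database (the engine is an assumption chosen only so the snippet is self-contained), and it leaves out the indexes and foreign key constraints.

```python
import sqlite3

# In-memory SQLite database; the engine is an assumption for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE string (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE data (object_id INTEGER, property_id INTEGER, value_id INTEGER);
    INSERT INTO string VALUES
        (1, 'car'), (2, 'color'), (3, 'red'), (4, 'make'),
        (5, 'audi'), (6, 'car2'), (7, 'toyota');
    INSERT INTO data VALUES (1, 2, 3), (1, 4, 5), (6, 4, 7);
""")

# Delete the row data(6,4,7) from the example.
conn.execute(
    "DELETE FROM data WHERE object_id = 6 AND property_id = 4 AND value_id = 7"
)

# A string is an orphan when its id appears in no column of any data row.
orphan_ids = [row[0] for row in conn.execute("""
    SELECT s.id
    FROM string s
    WHERE s.id NOT IN (SELECT object_id FROM data
                       UNION SELECT property_id FROM data
                       UNION SELECT value_id FROM data)
    ORDER BY s.id
""")]
print(orphan_ids)  # [6, 7]
```

Ids 6 and 7 come back as orphans, while 4 stays because data(1,4,5) still references it.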
My question is: how do I write an optimized delete query for the string table?
Currently I have something like this (it works, but is very slow):
-- a string is an orphan when no data row references it in any column
delete
from string s
where (select count(*) from data where object_id = s.id) = 0
and (select count(*) from data where property_id = s.id) = 0
and (select count(*) from data where value_id = s.id) = 0
I have also tried the following, which is sometimes 10-20% faster, depending on the number of orphans:
delete from string
where id not in (
    select usedids.id
    from (
        select object_id as id from data
        union
        select property_id as id from data
        union
        select value_id as id from data
    ) as usedids
);
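For comparison, the same orphan condition can also be written with NOT EXISTS. The snippet below is only a sketch that checks, on the sample data, that this form removes the same rows. It assumes SQLite through Python's sqlite3 module (an assumption; no indexes or foreign key constraints are created), and it has not been benchmarked, so it says nothing about speed on the real tables.

```python
import sqlite3

# In-memory SQLite database; the engine is an assumption for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE string (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE data (object_id INTEGER, property_id INTEGER, value_id INTEGER);
    INSERT INTO string VALUES
        (1, 'car'), (2, 'color'), (3, 'red'), (4, 'make'),
        (5, 'audi'), (6, 'car2'), (7, 'toyota');
    INSERT INTO data VALUES (1, 2, 3), (1, 4, 5), (6, 4, 7);
    DELETE FROM data WHERE object_id = 6 AND property_id = 4 AND value_id = 7;
""")

# One NOT EXISTS probe per referencing column; each probe can stop at the
# first matching data row instead of counting all of them.
conn.execute("""
    DELETE FROM string
    WHERE NOT EXISTS (SELECT 1 FROM data d WHERE d.object_id = string.id)
      AND NOT EXISTS (SELECT 1 FROM data d WHERE d.property_id = string.id)
      AND NOT EXISTS (SELECT 1 FROM data d WHERE d.value_id = string.id)
""")

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM string ORDER BY id")]
print(remaining_ids)  # [1, 2, 3, 4, 5]
```

Unlike NOT IN, the NOT EXISTS form keeps its meaning if one of the data columns ever contains NULL.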
I have about 100k rows in each table. If I delete about 6000 rows from the data table, cleaning up the string table takes about 3 minutes. I have an index on every field, and I also have foreign key constraints.