tags:

views:

86

answers:

7

I have a hierarchical table 'pages', with 'id' and 'id_parent' fields. Thanks to some crappy old code there are still a lot of rows without a parent. I want to delete these rows. Is this possible using only SQL?

+4  A: 

yes. This is simple and unoptimized, (and assumes that the parent is no longer in existence) but it would give you a place to start

DELETE FROM Pages WHERE Id_Parent NOT IN (SELECT ID FROM PagesParent);

Based on the comment, it appears that this is a self-referencing table. Basically, ID_Parent of Pages is linked to the ID of Pages.

Therefore you need to torture TSQL a bit. NOTE: I work with SQL SERVER and am not certain if this is standard SQL and will work in other RDBMS. That said, something along these lines should get what you're after.

WITH t(ID, ID_Parent) AS
(
    SELECT ID,ID_Parent from Pages where id = @ID
    UNION ALL
    SELECT st.ID, st.ID_Parent FROM Pages st
       INNER JOIN  t ON st.ID_Parent = t.id
)
DELETE PAGES
       FROM Pages s
            LEFT JOIN t ON s.ID = t.ID
       WHERE t.ID IS NULL
Stephen Wrighton
I triedDELETE FROM pages WHERE id != 1 AND id_parent NOT IN (SELECT id FROM pages)But i get "You can't specify target table 'pages' for update in FROM clause"
tape2
@tape2: The table specified in the subquery needs to be the parent table. Typo?
Jon Seigel
@Jon - not necessarily. We're making the assumption that there are 2 distinct tables here, but from his comment that's not necessarily the case....
Stephen Wrighton
@Stephen: Actually, come to think of it after re-reading the question, I think there's only a single table.
Jon Seigel
A: 
SELECT id
--DELETE 
FROM myTable
WHERE id_parent IS NULL

or

SELECT id
--DELETE 
FROM myTable
WHERE id_parent IS NOT IN (SELECT id FROM myTable)
Nathan Koop
+1  A: 

You can do:

DELETE FROM pages WHERE id_parent NOT IN (SELECT id FROM pages)

Assuming your parent rows have a null id_parent you'll need to add:

AND NOT id_parent IS NULL
Whisk
A: 
DELETE
FROM    pages pc
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    pages pp
        WHERE   pp.id = pc.id_parent
        )

Unlike IN solutions, works for rows with id_parent being a NULL.

Quassnoi
+1  A: 

Pretty much as you would say it in English

Delete Pages Where parent does not exist,

or in SQL:

 Delete Pages
 Where Not Exists 
     (Select * From pages
      Where Id = p.id_Parent)
Charles Bretana
+2  A: 

You can also do it that way:

delete P 
from Pages P 
left join PagesParent Parent
   on Parent.Id_Parent=P.Id_Parent
where Parent.Id_Parent is null -- This tells you that the row doesn't exists in the table PagesParent
Danielle
A: 

You can use a left join for this in SQL Server. Don't know if it will work in all databases)

delete p
     --select *
from pages p
left join Parent pt on p.id = pt.id
where pt.id is null

I added the select part in the comments becasue you can run it first to see exactly what records you will be deleting.

HLGEM