Hi All,

I have a category table with:

  • categoryID
  • parentCategoryID
  • categoryName

and an items table with:

  • itemID
  • categoryID
  • itemName

I am using MySQL. I want to write a query that returns a count of items in a category, given a categoryID. The query should return the total count of all items in the given category and in all of its subcategories.

I hope this makes sense. Sorry if I am not using the correct nomenclature.

A: 

How stuck are you on that schema? It's called an "Adjacency List", and it's simple enough conceptually, but it has some real shortcomings. The most significant is that, without recursive queries, you cannot fetch all descendants of a node in a single query.
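For reference, where recursive common table expressions are available (MySQL has supported WITH RECURSIVE since version 8.0), an adjacency list can be walked in one statement. Here is a minimal sketch, not a definitive implementation: it uses Python's built-in sqlite3 module only so that it runs self-contained, the table names categories and items and all sample rows are assumptions, and the ? placeholder would become %s with most MySQL drivers.

```python
import sqlite3

# Hypothetical sample data; column names follow the question,
# table names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    categoryID INTEGER PRIMARY KEY,
    parentCategoryID INTEGER,
    categoryName TEXT
);
CREATE TABLE items (
    itemID INTEGER PRIMARY KEY,
    categoryID INTEGER,
    itemName TEXT
);
INSERT INTO categories VALUES
    (1, NULL, 'Electronics'),
    (2, 1, 'Computers'),
    (3, 2, 'Laptops'),
    (4, NULL, 'Books');
INSERT INTO items VALUES
    (1, 1, 'Gift card'),
    (2, 2, 'Desktop PC'),
    (3, 3, 'Ultrabook'),
    (4, 3, 'Gaming laptop'),
    (5, 4, 'Novel');
""")

# The recursive CTE starts at the given category and repeatedly adds
# the children of every category found so far.
COUNT_SQL = """
WITH RECURSIVE subtree (categoryID) AS (
    SELECT categoryID FROM categories WHERE categoryID = ?
    UNION ALL
    SELECT c.categoryID
    FROM categories AS c
    JOIN subtree AS s ON c.parentCategoryID = s.categoryID
)
SELECT COUNT(*)
FROM items
WHERE categoryID IN (SELECT categoryID FROM subtree)
"""


def count_items_in_subtree(conn, category_id):
    """Count items in the category and in all of its descendants."""
    return conn.execute(COUNT_SQL, (category_id,)).fetchone()[0]
```

With this sample data, count_items_in_subtree(conn, 1) covers Electronics, Computers and Laptops, while count_items_in_subtree(conn, 3) covers only Laptops.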

Take a look at this, and consider whether an alternative method of representing trees might work better for you:

http://pugs.postgresql.org/files/ModelingTrees.pdf

Jeff Dege
A: 

Of course it's possible, but not very efficient. You should go with nested set structures: http://intelligent-enterprise.informationweek.com/001020/celko1_1.jhtml;jsessionid=AFUXE0ZF4PTNXQE1GHPSKH4ATMY32JVN

If you don't like that, have a look at this: http://explainextended.com/2010/04/18/hierarchical-query-in-mysql-limiting-parents/

Joe Hopfgartner
A: 

If, as you say, there are only two levels of categories, then a simple join/alias query will work just fine. If you were allowing arbitrary depths, then you'd have to go with the fancy recursive queries or adjacency sets and whatnot.

Assuming you only allow items to be attached to the 'lower' category, then something like this should get you the results you need:

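-- Count the items in one second-level category, shown alongside its parent.
-- Replace $your_category with the ID of the subcategory to inspect.
SELECT top.categoryID, top.categoryName, bottom.categoryID, bottom.categoryName,
    COUNT(items.itemID) AS itemCount  -- no space before "(": MySQL rejects "COUNT (" by default
FROM categories AS top
LEFT JOIN categories AS bottom ON top.categoryID = bottom.parentCategoryID
LEFT JOIN items ON bottom.categoryID = items.categoryID
WHERE bottom.categoryID = $your_category
GROUP BY top.categoryID, top.categoryName, bottom.categoryID, bottom.categoryName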

If you need to filter on the top-level category instead, change the WHERE clause as appropriate (for example, test top.categoryID rather than bottom.categoryID).

Marc B
OK, I think this will work great, but I was wrong: there are three levels. Can this query be modified to handle three possible levels?
rodrick
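
For a fixed depth of three levels, one way the join approach could be stretched is with one more self-join. The following is only a sketch under assumptions: it runs through Python's sqlite3 module so it is self-contained, the alias middle and the sample rows are made up for illustration, and it counts items attached at any of the three levels below (and including) the given category. The SQL itself uses nothing SQLite-specific, but the ? placeholder would differ with a MySQL driver.

```python
import sqlite3

# Hypothetical sample data: a three-level tree plus an unrelated root.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    categoryID INTEGER PRIMARY KEY,
    parentCategoryID INTEGER,
    categoryName TEXT
);
CREATE TABLE items (
    itemID INTEGER PRIMARY KEY,
    categoryID INTEGER,
    itemName TEXT
);
INSERT INTO categories VALUES
    (1, NULL, 'Electronics'),
    (2, 1, 'Computers'),
    (3, 2, 'Laptops'),
    (4, NULL, 'Books'),
    (5, 1, 'Phones');
INSERT INTO items VALUES
    (1, 1, 'Gift card'),
    (2, 2, 'Desktop PC'),
    (3, 3, 'Ultrabook'),
    (4, 3, 'Gaming laptop'),
    (5, 4, 'Novel'),
    (6, 5, 'Smartphone');
""")

# Each row of the joins is one (top, middle, bottom) chain; missing
# levels come back as NULL and simply never match an item.
# COUNT(DISTINCT ...) is needed because an item attached to a higher
# level matches once per chain that passes through that level.
THREE_LEVEL_SQL = """
SELECT COUNT(DISTINCT items.itemID)
FROM categories AS top
LEFT JOIN categories AS middle
    ON middle.parentCategoryID = top.categoryID
LEFT JOIN categories AS bottom
    ON bottom.parentCategoryID = middle.categoryID
JOIN items
    ON items.categoryID IN (top.categoryID, middle.categoryID, bottom.categoryID)
WHERE top.categoryID = ?
"""


def count_items_three_levels(conn, category_id):
    """Count items in the category, its children and its grandchildren."""
    return conn.execute(THREE_LEVEL_SQL, (category_id,)).fetchone()[0]
```

Note that this silently ignores anything deeper than three levels, so it only fits if the depth limit is really enforced.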
A: 

If you are weighing the alternatives to your current Adjacency List structure that are described in the document Jeff Dege referenced:

Nested Sets are very fast for data that is read often but changes rarely. Reads use SQL BETWEEN and indexes, while changes can be expensive because they may have to update many existing records. Path Enumeration (also known as Materialized Path) gives acceptable to good read performance when using indexes and LIKE '[path]%' queries (at least in MySQL, as far as I know), good performance for insertions, and acceptable performance when moving a category under a different parent.
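
To illustrate the BETWEEN-style read on Nested Sets, here is a rough sketch, assuming each category row carries two extra columns (named lft and rgt here, which is a common convention but an assumption on my part) that bracket the whole subtree. It is wrapped in Python's sqlite3 module so it runs on its own; the sample rows and bounds are made up.

```python
import sqlite3

# Hypothetical nested-set data. A node's (lft, rgt) interval encloses
# the intervals of all of its descendants:
#   Electronics (1, 8) > Computers (2, 5) > Laptops (3, 4)
#   Electronics (1, 8) > Phones (6, 7)
#   Books (9, 10)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    categoryID INTEGER PRIMARY KEY,
    categoryName TEXT,
    lft INTEGER NOT NULL,
    rgt INTEGER NOT NULL
);
CREATE TABLE items (
    itemID INTEGER PRIMARY KEY,
    categoryID INTEGER,
    itemName TEXT
);
INSERT INTO categories VALUES
    (1, 'Electronics', 1, 8),
    (2, 'Computers', 2, 5),
    (3, 'Laptops', 3, 4),
    (5, 'Phones', 6, 7),
    (4, 'Books', 9, 10);
INSERT INTO items VALUES
    (1, 1, 'Gift card'),
    (2, 2, 'Desktop PC'),
    (3, 3, 'Ultrabook'),
    (4, 3, 'Gaming laptop'),
    (5, 4, 'Novel'),
    (6, 5, 'Smartphone');
""")

# A node belongs to the parent's subtree (the parent itself included)
# when its lft value falls inside the parent's interval.
NESTED_SET_SQL = """
SELECT COUNT(i.itemID)
FROM categories AS parent
JOIN categories AS node
    ON node.lft BETWEEN parent.lft AND parent.rgt
JOIN items AS i
    ON i.categoryID = node.categoryID
WHERE parent.categoryID = ?
"""


def count_items_nested_set(conn, category_id):
    """Count items in the category and all of its descendants."""
    return conn.execute(NESTED_SET_SQL, (category_id,)).fetchone()[0]
```

The read needs no recursion at any depth; the cost shows up on writes, since inserting or moving a category means renumbering lft and rgt on other rows.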

I personally have a project where I'm using Path Enumeration with database IDs as path elements and a dot . to separate the elements (e.g. ancestor path 1.2.3.).
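
A prefix lookup on such paths might look like the sketch below. The assumptions: a hypothetical path column that holds the IDs from the root down to and including the category itself, each followed by a dot, plus made-up sample rows. It is written against Python's sqlite3 module so it runs as-is; the two plain queries should carry over to MySQL apart from the placeholder syntax.

```python
import sqlite3

# Hypothetical data: path lists the IDs from the root down to the
# category itself, each followed by a dot.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    categoryID INTEGER PRIMARY KEY,
    categoryName TEXT,
    path TEXT NOT NULL
);
CREATE TABLE items (
    itemID INTEGER PRIMARY KEY,
    categoryID INTEGER,
    itemName TEXT
);
INSERT INTO categories VALUES
    (1, 'Electronics', '1.'),
    (2, 'Computers', '1.2.'),
    (3, 'Laptops', '1.2.3.'),
    (5, 'Phones', '1.5.'),
    (4, 'Books', '4.'),
    (10, 'Toys', '10.');
INSERT INTO items VALUES
    (1, 1, 'Gift card'),
    (2, 2, 'Desktop PC'),
    (3, 3, 'Ultrabook'),
    (4, 3, 'Gaming laptop'),
    (5, 4, 'Novel'),
    (6, 5, 'Smartphone'),
    (7, 10, 'Board game');
""")


def count_items_by_path(conn, category_id):
    """Count items in the category and every category below it."""
    row = conn.execute(
        "SELECT path FROM categories WHERE categoryID = ?",
        (category_id,),
    ).fetchone()
    if row is None:
        return 0  # unknown category
    # Every descendant's path starts with the ancestor's path. The
    # trailing dot keeps '1.' from also matching '10.'.
    prefix = row[0] + "%"
    return conn.execute(
        """
        SELECT COUNT(i.itemID)
        FROM categories AS c
        JOIN items AS i ON i.categoryID = c.categoryID
        WHERE c.path LIKE ?
        """,
        (prefix,),
    ).fetchone()[0]
```

Because the pattern has no leading wildcard, an index on the path column can typically be used for the LIKE, which is where the read performance comes from.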

You might want to run your own benchmarks to compare these approaches, especially if you have lots of categories (several thousand or more).

Archimedix