tags:

views:

158

answers:

5

I have two tables, omc_categories and omc_products.

I want to pull out categories name where omc_proudcts.category_id is equal to omc_categories.id.

I created the following sql but I am not sure which is right one.

SELECT C.Name AS CatName
   FROM omc_categories AS C
   LEFT JOIN omc_products AS P
     ON C.id = P.category_id
   WHERE P.category_id = $category_id
     AND p.status = "active"

or

SELECT C.Name AS CatName
   FROM omc_products AS P
   LEFT JOIN omc_categories AS C
     ON C.id = P.category_id
   WHERE P.category_id = $category_id
     AND p.status = "active"

Can anyone tell me which one right (if there is any) and why please. I am confused with join.

CREATE TABLE IF NOT EXISTS `omc_categories` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `shortdesc` varchar(255) NOT NULL,
  `longdesc` text NOT NULL,
  `status` enum('active','inactive') NOT NULL,
  `parentid` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=43 ;

and

CREATE TABLE IF NOT EXISTS `omc_products` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `shortdesc` varchar(255) NOT NULL,
  `longdesc` text NOT NULL,
  `thumbnail` varchar(255) NOT NULL,
  `image` varchar(255) NOT NULL,
  `class` varchar(255) DEFAULT NULL,
  `grouping` varchar(16) DEFAULT NULL,
  `status` enum('active','inactive') NOT NULL,
  `category_id` int(11) NOT NULL,
  `featured` enum('true','false') NOT NULL,
  `price` float(4,2) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=74 ;

--UPDATE--

Sample data of omc_categories

INSERT INTO `omc_categories` (`id`, `name`, `shortdesc`, `longdesc`, `status`, `parentid`) VALUES
(21, 'Front top', 'front top', '', 'active', 4),
(20, 'Galleri 2', 'Galleri 2', '', 'active', 4),
...
(41, 'Trær', '', '', 'active', 27),
(42, 'newfolder', '', '', 'active', 27);

Sample data of omc_products

    INSERT INTO `omc_products` (`id`, `name`, `shortdesc`, `longdesc`, `thumbnail`, `image`, `class`, `grouping`, `status`, `category_id`, `featured`, `price`) VALUES
(1, 'Doggie', 'Cappelen forlag: New Flight', 'Doggie from New flight.', 'images/newflight_doggie_small.jpg', 'images/newflight_doggie_big.jpg', 'new-flight', 'imagebox-new', 'active', 5, 'false', 0.00),
(2, 'Jinnie', 'New flight Jinnie', '', 'images/newflight_jinnie_small.jpg', 'images/newflight_jinnie_big1.jpg', 'new-flight', 'imagebox-new', 'active', 5, 'false', 0.00),
...
...
(73, 'new image', '', '', 'images/daffodil_big.jpg.jpg', 'images/daffodil_big.jpg', '', '', 'active', 42, 'false', 0.00);

For example last line id 73 has 42 of category_id.

I want to pull out name of category "newfolder" from omc_categories.

A: 

Most probably, you want this:

SELECT  name
FROM    omc_categories
WHERE   id IN
        (
        SELECT  category_id
        FROM    products
        WHERE   status = 'active'
        )
        AND id = $id

This will return category's name if there are active products in this category, nothing otherwise.

Quassnoi
Mmmm, this is new to me who are beginner. It seems like a recursive.Is this different from using join?
shin
you want to use IN like above because your LEFT JOIN in your first query would give you all categories and this is not what you want. LEFT JOINS returns all results for the relation on the left side (excluding where clause exceptions of course). You could add DISTINCT to the "SELECT category_id" as well to reduce the set size.
Arthur Thomas
Just be careful about `IN` and subqueries. If poorly written, the subquery will be executed for each row in the outer query; the `AND id=$id` in the above means there's only one row from the outer query, so this isn't a problem.
outis
A: 

First of all, they are both wrong for $category_id is undefined.

Secondly, if you remove the line

WHERE P.category_id = $category_id

(which is excessive) they will both be right and would be different in result unless you had category_id NOT NULL. Since it cannot be null, they're just the same. As well as inner join or outer join.

BTW it would be nice if you told instructed the database that category_id is in fact a foreign key ;)

Antony Hatchkins
MyISAM tables ignore FOREIGN KEYs.
outis
and what about the rest of the answer? ;)
Antony Hatchkins
`$category_id` is a stand-in; `?` or `:category_id` would do just as well. Since the OP is asking about the SQL query, rather than why a query is failing when run from an unspecified host language, whether or not `$category_id` is defined doesn't apply. The `WHERE` line can't be removed because then the `AND p.status = "active"` expression would result in a syntax error. Removing the `P.category_id = $category_id\n AND` results in a different query; it is not excessive. Whether it should be removed depends on what the OP is trying to find.
outis
+2  A: 

If your looking to pull the categories name where omc_proudcts.category_id is equal to omc_categories.id the following should do that for you (unless I'm missing something...):

SELECT C.Name AS CatName
   FROM omc_categories AS C
   INNER JOIN omc_products AS P
     ON C.id = P.category_id
   WHERE p.status = "active"

INNER JOIN is saying JOIN these tables where these two values equal each other (identical to "WHERE C.id = P.category_id")

contactmatt
+3  A: 

The issue with your statements, as Arthur Thomas points out, is that a left join will include all rows from the left table, using NULLs as values for the columns coming from the right table when the ON clause fails. A plain (inner) join will only include rows for which the ON clause succeeds.

SELECT DISTINCT C.Name AS CatName
   FROM omc_categories AS C
   JOIN omc_products AS P
     ON C.id = P.category_id
        AND P.category_id = $category_id
   WHERE p.status = "active"

Note you can include constant expressions in the ON clause, which reduces the size of the join (no point in adding rows you later remove).

Be careful about how you're building the statement. You don't want it vulnerable to SQL injection.

outis
A: 

You want to use an inner-join..the simplest way to reconstruct your query is this:

SELECT C.Name AS CatName
   FROM omc_categories AS C
   INNER JOIN omc_products AS P
     ON C.id = P.category_id
   WHERE P.category_id = $category_id
     AND p.status = "active"

The inner join is going to give the intersection between the two tables, on the ID that is common to the two tables (id in the categoris, category_id in products).

A better way to solve your problem (show all the categories that have an active product) is this:

SELECT C.Name AS CatName
   FROM omc_categories AS C
   where exists (Select * from omc_products AS P where p.categoryID = c.id)

This is similar to the in statement.

If you wanted to know how many products you have per category, try this:

SELECT C.Name AS CatName, count(p.*), sum(case when p.status = "active" then 1 else 0 end) as numActiveProducts
   FROM omc_categories AS C
   left outer JOIN omc_products AS P
     ON C.id = P.category_id 
group by c.name
Rob
The second statement has the worst performance. What sense of the word "better" do you mean?
outis
Better in the sense that I would want to know what categories have been used in any product. "worst performance" is relative. The last query is probably the "worst performer" of the ones I've posted.
Rob
I'm still not understanding what sense of the word "better" you're using. "[W]hat categories have been used in any product" could describe the result of either query. Do you mean "clearer", as in the SQL statement is closest to the English description of the query?
outis
Not true. The 1st query is parametrized on $category_id and status="active", so you're only going to get back a result based on that one $category_id. The second query tells you any category that is used in a product. The third query tells you the counts of products used per category (including 0) as well as the count of active products.
Rob
The first query has an additional filter, true, but both return categories that have some product. My point is that "what categories have been used in any product" doesn't point out a sense of the word "better" that explains how the second query is better than the first. "More direct translation", "clearer", "simpler" and "faster" are all senses of the word "better" that could be applied to queries.
outis
outis
So, like I said... better in the sense that I would want to know what categories are used in any product and use that in the application (i.e. cache it up for other queries to use). Seems to me you'll have a lot less categories than products. Getting one category name whether it has an active product or not based on a particular categoryID seems...the opposite of better.
Rob
"Better" is still ambiguous, and getting even more so as new senses (such as "more appropriate for the problem at hand") come to light. Do you mean "faster, when combined with caching?" If you only need to look up a single category with each request (that is, answer the question "Does category $category_id contain any products?"), the first query will likely be faster overall, as portions of the cache will be invalidated after a request is handled, though it depends on where and how caching is implemented.
outis
Turn the first statement into the equivalent of the second (by removing the `P.category_id = $category_id` test) and the result is likely faster than the second, as the optimizer can do more with it. Adding an index on omc_products.category_id will improve the second query's performance, but probably not enough to match the first query.
outis
If only views didn't have such terrible performance, they might offer the fastest approach.
outis
Rob
I'm not certain if MySQL optimizes `EXISTS`, but you could always force it by adding a `LIMIT 1` clause to the subquery. As for relative table sizes, the optimizer could just as well decide to loop over the categories first, rather than products, in an inner join.
outis
interesting...I'll give this a go, and see how it works out. I only have SQL Server installs in front of me now, but I'll try MySQL when I get home.
Rob