views: 159
answers: 6

Hi all,

MY PLATFORM:

PHP & MySQL

WHAT I HAVE HERE:

I have 4 tables, namely, 'books', 'book_type', 'book_categories', 'all_categories'.

WHAT I AM TRYING TO DO:

In simple words, I want to display all the books that are in stock (i.e. in_stock = 'y'), with all the book-related information from all the tables, showing each book only once without repeating entries. Currently each book is repeated, and I want it to appear only once.

THE CURRENT PROBLEM:

In the frontend of my app, the entries are shown repeatedly when I expect them to show up only once (as in DISTINCT / UNIQUE) and not repeat themselves.

MY SUSPICION:

I suspect that the repeated rows are caused by the categories that each book belongs to. Each book entry is shown as many times as the number of categories it belongs to. Confusing? I mean that if book1 belongs to 4 categories, then book1 is shown 4 times; if book2 belongs to 2 categories, then it is shown twice.

WHAT I NEED:

I need the PHP & MySQL code that would solve the above problem. I am hoping that we can solve it without using GROUP_CONCAT in MySQL, as its result is truncated at group_concat_max_len (1024 bytes by default). A book can belong to many categories and I do not want to risk losing any data by using GROUP_CONCAT. I would also like to do this in a single query, without hitting the database repeatedly in a loop. Thanks for understanding.

All the tables and the corresponding data to replicate the problem are as follows:

CREATE TABLE IF NOT EXISTS `books` (
  `book_id` int(11) NOT NULL auto_increment,
  `book_type_id` int(11) NOT NULL,
  `book_title` varchar(50) NOT NULL,
  `book_price` smallint(4) NOT NULL,
  `in_stock` char(1) NOT NULL,
  PRIMARY KEY  (`book_id`),
  KEY `book_type_id` (`book_type_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;

--
-- Dumping data for table `books`
--

INSERT INTO `books` (`book_id`, `book_type_id`, `book_title`, `book_price`, `in_stock`) VALUES
(1, 1, 'My Book 1', 10, 'y'),
(2, 1, 'My Book 2', 20, 'n'),
(3, 2, 'My Book 3', 30, 'y'),
(4, 3, 'My Book 4', 40, 'y'),
(5, 2, 'My Book 5', 50, 'n'),
(6, 1, 'My Book 6', 60, 'y'),
(7, 3, 'My Book 7', 70, 'n'),
(8, 2, 'My Book 8', 80, 'n'),
(9, 1, 'My Book 9', 90, 'y'),
(10, 3, 'My Book 10', 100, 'n');

--
-- Table structure for table `book_type`
--

CREATE TABLE IF NOT EXISTS `book_type` (
  `book_type_id` int(11) NOT NULL auto_increment,
  `book_type` varchar(50) NOT NULL,
  PRIMARY KEY  (`book_type_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;

--
-- Dumping data for table `book_type`
--

INSERT INTO `book_type` (`book_type_id`, `book_type`) VALUES
(1, 'Good'),
(2, 'Better'),
(3, 'Best');


--
-- Table structure for table `book_categories`
--

CREATE TABLE IF NOT EXISTS `book_categories` (
  `book_id` int(11) NOT NULL,
  `cat_id` int(11) NOT NULL,
  PRIMARY KEY  (`book_id`,`cat_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

--
-- Dumping data for table `book_categories`
--

INSERT INTO `book_categories` (`book_id`, `cat_id`) VALUES
(1, 1),
(1, 2),
(1, 3),
(1, 4),
(1, 5),
(2, 1),
(2, 2),
(3, 1),
(3, 2),
(3, 3);


--
-- Table structure for table `all_categories`
--

CREATE TABLE IF NOT EXISTS `all_categories` (
  `cat_id` int(11) NOT NULL auto_increment,
  `category` varchar(50) NOT NULL,
  PRIMARY KEY  (`cat_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;

--
-- Dumping data for table `all_categories`
--

INSERT INTO `all_categories` (`cat_id`, `category`) VALUES
(1, 'Comedy'),
(2, 'Drama'),
(3, 'Romance'),
(4, 'Horror'),
(5, 'Trivia'),
(6, 'Puzzles'),
(7, 'Riddles'),
(8, 'Kids'),
(9, 'Gents'),
(10, 'Ladies');

MY TARGET:

//MY QUERY:
SELECT books.book_title,  books.book_price,
       book_type.book_type,
       all_categories.category
FROM books 
LEFT JOIN book_type       ON books.book_type_id = book_type.book_type_id
LEFT JOIN book_categories ON books.book_id = book_categories.book_id
LEFT JOIN all_categories  ON book_categories.cat_id = all_categories.cat_id
WHERE books.in_stock = 'y' 

CURRENT OUTPUT:

book_title   book_price   book_type   category
My Book 1    10           Good        Comedy
My Book 1    10           Good        Drama
My Book 1    10           Good        Romance
My Book 1    10           Good        Horror
My Book 1    10           Good        Trivia
My Book 3    30           Better      Comedy
My Book 3    30           Better      Drama
My Book 3    30           Better      Romance
My Book 4    40           Best        NULL
My Book 6    60           Good        NULL
My Book 9    90           Good        NULL

NEED THE FOLLOWING OUTPUT:

book_title   book_price   book_type   category
My Book 1    10           Good        Comedy, Drama, Romance, Horror, Trivia
My Book 3    30           Better      Comedy, Drama, Romance
My Book 4    40           Best        NULL
My Book 6    60           Good        NULL
My Book 9    90           Good        NULL

Thanks to all in advance.

+1  A: 

Buddy, you should take a look at the GROUP_CONCAT() function of MySQL: http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_group-concat

It will be something along the lines of:

SELECT book_title, book_price, book_type,
       GROUP_CONCAT(DISTINCT category ORDER BY category DESC SEPARATOR ', ')
FROM ...   -- the same joins as in your query
GROUP BY book_title, book_price, book_type
Clash
Hi Clash. Only a maximum of 1024 BYTES is allowed for GROUP_CONCAT. So that means if the combined length of the categories plus separators is > 1024 bytes, the result would be truncated. Am I not correct? As I mentioned, a book can belong to many categories. Say, on average, a category name is 15 bytes long. If a book belongs to 100 categories, the space taken is 1500 bytes, which is more than 1024. This would lead to incorrect results. I was aware of GROUP_CONCAT and wanted to avoid it for the same reason. Please let me know your thoughts. Thanks.
Devner
There is the setting `group_concat_max_len` (default 1024), maybe you can set it higher?
Clash
Hi Clash, I am not too sure if that can be done from my scripts. I think that might involve the web-hosting guys, who (as far as I know) would be reluctant to change anything. My peers report that requests to change server settings are very often denied, as web hosts tend to avoid changing settings on user demand to prevent abuse. So I thought of solving it without relying on their support. I think that if we can get a proper solution for this, then we need not worry about the hosting guys ever again with regards to a similar problem. Hope that makes sense.
Devner
The SQL command is `SET [GLOBAL | SESSION] group_concat_max_len = val;`. But do you really think you will ever go over the 1024-byte limit? That's a lot of characters... although if there is that possibility, it is good to avoid future problems.
Clash
Hi Clash, it may be possible to go over the limit, as I indicated in one of my previous comments above. Maybe we can do something with the new approach that you have shown. Also, I found something interesting yet troublesome. I have another table which records all orders placed for a book. If I try to get a count of all the orders placed, the count result is incorrect: it returns the count of all the categories present, not the count of the orders placed. This happens when I fetch the categories using GROUP_CONCAT. Is there any way we can rectify this? Thanks.
Devner
About your count query, it depends on your GROUP BY clause; can you post the query and table structure?
Clash
+1 for the reference and follow ups. I figured it out. Thank you.
Devner
+3  A: 

The best way to make sure you don't lose ANY data would be multiple queries. Query the tables separately and join them in PHP, so your result sets would look like this:

book_id  book_title  book_price  book_type
1        My Book 1   10          Good
3        My Book 3   30          Better
4        My Book 4   40          Best
6        My Book 6   60          Good
9        My Book 9   90          Good

book_id  category
1        Comedy
1        Drama
1        Romance
3        Comedy

etc

Edit:

No, you shouldn't need 100 hits on the DB, just two: one to get the books, the other to get the categories. The looping would be done in PHP: loop through the second result set and merge its rows into the data from the first. The second query could be:

SELECT books.book_id, all_categories.category
FROM book_categories
JOIN all_categories ON book_categories.cat_id = all_categories.cat_id
JOIN books          ON books.book_id = book_categories.book_id
WHERE books.in_stock = 'y';

OR

SELECT book_categories.book_id, all_categories.category
FROM book_categories
JOIN all_categories ON book_categories.cat_id = all_categories.cat_id
WHERE book_categories.book_id IN (SELECT book_id FROM books WHERE books.in_stock = 'y');
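
For illustration only (this PHP is not part of the original answer), here is a minimal sketch of the merge step, assuming an already-open mysqli connection in $mysqli; the variable and array names are my own:

<?php
// Sketch: run the two queries once each and merge the categories into each book row in PHP.
// Assumes an already-open mysqli connection in $mysqli.
$books = array();

// 1) One row per in-stock book.
$result = $mysqli->query(
    "SELECT books.book_id, books.book_title, books.book_price, book_type.book_type
     FROM books
     LEFT JOIN book_type ON books.book_type_id = book_type.book_type_id
     WHERE books.in_stock = 'y'"
);
while ($row = $result->fetch_assoc()) {
    $row['categories'] = array();     // will collect this book's category names
    $books[$row['book_id']] = $row;   // index by book_id for the merge below
}

// 2) One row per (book, category) pair; append each category to its book.
$result = $mysqli->query(
    "SELECT book_categories.book_id, all_categories.category
     FROM book_categories
     JOIN all_categories ON book_categories.cat_id = all_categories.cat_id
     JOIN books          ON books.book_id = book_categories.book_id
     WHERE books.in_stock = 'y'"
);
while ($row = $result->fetch_assoc()) {
    $books[$row['book_id']]['categories'][] = $row['category'];
}

// $books now holds exactly one entry per in-stock book, e.g.
// $books[1]['categories'] == array('Comedy', 'Drama', 'Romance', 'Horror', 'Trivia')
?>

This keeps one row per book no matter how many categories it has, and there is no concatenation length limit to worry about.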

MindStalker
Thanks for the response. That would mean that, apart from the query that gets my data, there is an additional loop that fetches the categories for each of the books. Which means that if I have 100 books in stock, then just to get the categories it's an additional 100 hits on the DB (1 hit/query per book). I feel that so many hits from many users would seriously take down the site. Does that make sense? Please let me know your thoughts on this one.
Devner
Make it work now, optimize later
Joe Philllips
@d03boy Ok, now how do I optimize it? Thanks.
Devner
@MindStalker Thanks for the response. I am trying to fetch the results for all the books that are in stock, not just a specific book. So in the query, we cannot supply a specific book_id. Rather, we would have to just join the tables and fetch all the books that have in_stock = 'y' (be it 100 books or 500 or 1000). With this in mind, how can we modify your query to accommodate the same?
Devner
+1  A: 

You could also use CONCAT_WS (concat with separator) with a subquery, something like:

SELECT books.book_title,
       books.book_price,
       book_type.book_type,
       CONCAT_WS(', ', (SELECT all_categories.category
                        FROM all_categories
                        WHERE all_categories.cat_id = book_categories.cat_id)
                ) AS book_category_list
FROM books
LEFT JOIN book_type       ON books.book_type_id = book_type.book_type_id
LEFT JOIN book_categories ON books.book_id = book_categories.book_id
WHERE books.in_stock = 'y'

Sorry, formatting got a little weird there.

Skone
Hi Skone. Did you get a chance to try out your query in practice? I did, and it still gives me the same results as in CURRENT OUTPUT, except that the results from your query do NOT show the NULL values. It still repeats the rows, which is what I wanted to avoid. Also, what would be the limit for CONCAT_WS? The limit for GROUP_CONCAT is 1024 bytes, so do you have an idea about the limit for CONCAT_WS? Please let me know. Thanks.
Devner
I didn't actually try it -- sorry that it didn't work as expected. I'm a bit busy at the moment but I'll try to respond later this evening with an answer (no promises) :P Cheers.
Skone
Thanks and appreciate your efforts.
Devner
+1 for the GROUP_CONCAT
Devner
+1  A: 

Very simple. Add a group by. So it'd be as such:

EDIT: It looks like the best way to do this would be to use GROUP_CONCAT (even though you are concerned about its upper limit). To get around that limit, I would suggest raising it at run time. So it'd be something like this:

SET GLOBAL group_concat_max_len = SOME_BIG_VALUE; #SOME_BIG_VALUE >> 1024

SELECT books.book_title,  books.book_price,
       book_type.book_type,
       GROUP_CONCAT(all_categories.category)
FROM books 
LEFT JOIN book_type       ON books.book_type_id = book_type.book_type_id
LEFT JOIN book_categories ON books.book_id = book_categories.book_id
LEFT JOIN all_categories  ON book_categories.cat_id = all_categories.cat_id
WHERE books.in_stock = 'y'
GROUP BY books.book_title
WillMatt
Sorry I edited this before I read your comments on having to go through the hosting company.
WillMatt
+1 for the extended GROUP_CONCAT code
Devner
A: 

Have you tried adding a DISTINCT to your select?

SELECT DISTINCT books.book_title, books.book_price,
       book_type.book_type,
       all_categories.category
FROM books
LEFT JOIN book_type       ON books.book_type_id = book_type.book_type_id
LEFT JOIN book_categories ON books.book_id = book_categories.book_id
LEFT JOIN all_categories  ON book_categories.cat_id = all_categories.cat_id
WHERE books.in_stock = 'y'
GROUP BY books.book_title
Jim Ford
Thanks for the comment, Jim. The result of your query is the same as the result of the query provided by WillMatt. Kindly refer to my comment on his answer. Thanks again.
Devner
+1  A: 

Set the session variable to a large value, run your query with GROUP_CONCAT, then reset it back to the global value.

SET SESSION group_concat_max_len=@@max_allowed_packet;

SELECT books.book_title,  books.book_price,
       book_type.book_type,
       GROUP_CONCAT(all_categories.category)
FROM books 
LEFT JOIN book_type       ON books.book_type_id = book_type.book_type_id
LEFT JOIN book_categories ON books.book_id = book_categories.book_id
LEFT JOIN all_categories  ON book_categories.cat_id = all_categories.cat_id
WHERE books.in_stock = 'y'
GROUP BY books.book_id;

SET SESSION group_concat_max_len = @@GLOBAL.group_concat_max_len;
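
Since the question asks for PHP as well, here is a rough sketch of running those three statements from PHP with mysqli, each in its own query() call (mysqli::query() only accepts a single statement per call); $mysqli is an assumed, already-open connection and is not part of this answer:

<?php
// Raise the limit for this session only (no server-wide change needed).
$mysqli->query("SET SESSION group_concat_max_len = @@max_allowed_packet");

$sql = "SELECT books.book_title, books.book_price,
               book_type.book_type,
               GROUP_CONCAT(all_categories.category) AS categories
        FROM books
        LEFT JOIN book_type       ON books.book_type_id = book_type.book_type_id
        LEFT JOIN book_categories ON books.book_id = book_categories.book_id
        LEFT JOIN all_categories  ON book_categories.cat_id = all_categories.cat_id
        WHERE books.in_stock = 'y'
        GROUP BY books.book_id";

$result = $mysqli->query($sql);
while ($row = $result->fetch_assoc()) {
    // 'categories' is NULL for books without categories,
    // otherwise a comma-separated list such as 'Comedy,Drama,Romance'.
    echo $row['book_title'], ': ', $row['categories'], "\n";
}

// Put the session value back to the server default.
$mysqli->query("SET SESSION group_concat_max_len = @@GLOBAL.group_concat_max_len");
?>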
mluebke
+1 for the trick. Thank you.
Devner