I know i am writing query's wrong and when we get a lot of traffic, our database gets hit HARD and the page slows to a grind... I think I need to write queries based on CREATE VIEW from the last 30 days from the CURDATE ?? But not sure where to begin or if this will be MORE efficient query for the database?

Anyways, here is a sample query I have written..

$query_Recordset6 = "SELECT `date`, title, category, url, comments 
                       FROM cute_news 
                      WHERE category LIKE '%45%' 
                   ORDER BY `date` DESC";

Any help or suggestions would be great! I have about 11 queries like this, but I am confident if I could get help on one of these, then I can implement them to the rest!!

SELECT `date`, title, category, url, comments 
                       FROM cute_news 
                      WHERE category LIKE '%45%' 
                   ORDER BY `date` DESC

The LIKE '%45%' means a full table scan will need to be performed. Are you perhaps storing a list of categories in the column? If so creating a new table storing category and news_article_id will allow an index to be used to retrieve the matching records much more efficiently.

Martin Smith
+2  A: 

Putting a wildcard on the left side of a value comparison:

LIKE '%xyz'

...means that an index can not be used, even if one exists. Might want to consider using Full Text Searching (FTS), which means adding full text indexing.

Normalizing the data would be another step to consider - categories should likely be in a separate table.

OMG Ponies
Wondering if this would work better?SELECT... FROM... LIMIT 50

OK, time for psychic debugging.

In my mind's eye, I see that query performance would be improved considerably through database normalization, specifically by splitting the category multi-valued column into a a separate table that has two columns: the primary key for cute_news and the category ID.

This would also allow you to directly link said table to the categories table without having to parse it first.

Or, as Chris Date said: "Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else)."

R. Bemrose

Anything with LIKE '%XXX%' is going to be slow. Its a slow operation.

For something like categories, you might want to separate categories out into another table and use a foreign key in the cute_news table. That way you can have category_id, and use that in the query which will be MUCH faster.

Also, I'm not quite sure why you're talking about using CREATE VIEW. Views will not really help you for speed. Not unless its a materialized view, which MySQL doesn't suppose natively.


If your database is getting hit hard, the solution isn't to make a view (the view is still basically the same amount of work for the database to do), the solution is to cache the results.

This is especially applicable since, from what it sounds like, your data only needs to be refreshed once every 30 days.

Lotus Notes

I'd guess that your category column is a list of category values like "12,34,45,78" ?

This is not good relational database design. One reason it's not good is as you've discovered: it's incredibly slow to search for a substring that might appear in the middle of that list.

Some people have suggested using fulltext search instead of the LIKE predicate with wildcards, but in this case it's simpler to create another table so you can list one category value per row, with a reference back to your cute_news table:

CREATE TABLE cute_news_category (
  news_id INT NOT NULL,
  category INT NOT NULL,
  PRIMARY KEY (news_id, category),
  FOREIGN KEY (news_id) REFERENCES cute_news(news_id)

Then you can query and it'll go a lot faster:

SELECT n.`date`, n.title, c.category, n.url, n.comments 
FROM cute_news n
JOIN cute_news_category c ON (n.news_id = c.news_id)
WHERE c.category = 45 
ORDER BY n.`date` DESC
Bill Karwin

Any answer is a guess, show:
- the relevant SHOW CREATE TABLE outputs
- the EXPLAIN output from your common queries.

And Bill Karwin's comment certainly applies.

After all this & optimizing, sampling the data into a table with only the last 30 days could still be desired, in which case you're better of running a daily cronjob to do just that.