views:

1308

answers:

9

How can I increase the performance of my MySQL database? My website is hosted on a shared server, and the host suspended my account because of "too many queries". The staff suggested adding an "index", using a "cache", or trimming my database, but I don't know what "index" and "cache" mean or how to do this in PHP. Thanks.

A: 

You may have poorly-written queries, and/or poorly-written pages that run too many queries. Could you give us specific examples of queries you're running on a regular basis?

Jonathan Sampson
A: 

Sure, this is the query to fetch the last 3 posts:

select * from posts
where visible = 1
  and date > ($server_date - 86400)
  and dont_show_in_frontpage = 0
order by votes*1000 + ((1440 - ($server_date - date))/60)*2 + visites*600 desc
limit 3

What do you think?

ahmed
What is the point of all this arithmetic? Why not just "ORDER BY votes DESC"? Also, how many queries are you running on average per hit to your site? What type of site is this?
Jonathan Sampson
Thanks. About why the query is complicated: it is an algorithm to order the articles on my website; my website is a blogs aggregator. I don't know how many queries - but I think there are a lot - I will try to see.
ahmed
1. Make sure you have an index on the 'date' column, so that MySQL can immediately disregard the older rows. 2. Even with an index on 'votes', the calculation means MySQL can't use it, so it has to scan every row into a temporary table for sorting, not just the 'best' 3.
bobince
+1  A: 

You should never use select *. Instead, select only the columns you need for that particular call. And what is your intention here?

order by votes*1000+((1440 - ($server_date - date))/60)*2+visites*600 desc
Jonathan Sampson
Thanks for the advice about select *; hopefully I will change this soon. About the rest of the query: it is an algorithm to order the articles on my website, because it is a blogs aggregator.
ahmed
+4  A: 

There are a couple of things you can look into:

  1. Query Design - look into more advanced and faster solutions
  2. Hardware - throw better/faster hardware at the problem
  3. Database Design - use indexes and practice good database design

All of these are easier said than done, but it is a start.

Berek Bryan
+6  A: 

Set up a copy of your application locally, enable the MySQL query log, and set up xdebug or some other profiler. Then start collecting data and testing your application. There are lots of guides and books available about how to optimize things. It is important that you spend time testing and collecting data first, so you optimize the right things.

Using the data you have collected, try to reduce the number of queries per page view. Ideally, you should be able to get everything you need in fewer than 5-10 queries.

Look at the logs and see if you are asking for the same thing twice. It is a bad idea to request a record in one portion of your code and then request it again from the database a few lines later, unless you are sure the value is likely to have changed.

Look for queries embedded in loops, and try to refactor them so you make a single query and simply loop over the results.
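The loop refactor looks roughly like this (a Python/SQLite sketch with invented table data; in PHP/MySQL the idea is identical - replace N single-row queries with one IN query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts VALUES (?, ?)",
                 [(i, f"post {i}") for i in range(1, 6)])

wanted = [1, 3, 5]

# Anti-pattern: one round-trip to the database per loop iteration.
slow = [conn.execute("SELECT title FROM posts WHERE id = ?", (i,)).fetchone()[0]
        for i in wanted]

# Better: a single query with IN (...), then loop over the results.
placeholders = ",".join("?" * len(wanted))
fast = [row[0] for row in conn.execute(
    f"SELECT title FROM posts WHERE id IN ({placeholders}) ORDER BY id",
    wanted)]
```

Both produce the same titles, but the second version costs one query instead of one per id.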

The select * you mention using is an indication you may be doing something wrong. You should probably be listing the fields you explicitly need. Check this site or Google for lots of good arguments about why select * is evil.

Start looking at your queries and then run EXPLAIN on them. For queries that are used frequently, make sure they are using a good index and not doing a full table scan. Tweak indexes on your development database and test.
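Here is what that workflow looks like in miniature. The sketch uses SQLite's EXPLAIN QUERY PLAN because it is easy to run standalone; MySQL's EXPLAIN output has different columns but answers the same question (the table, columns, and index name are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, date INTEGER, votes INTEGER)")

def plan(sql):
    # EXPLAIN QUERY PLAN describes how SQLite intends to run the statement.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM posts WHERE date > 100"

before = plan(query)  # no usable index: full table scan

conn.execute("CREATE INDEX idx_posts_date ON posts (date)")
after = plan(query)   # now a range search on idx_posts_date
```

The same before/after check with MySQL's EXPLAIN (watch the `type` and `key` columns) tells you whether your tweaked index is actually being used.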

Zoredache
+1  A: 

Firstly, sack your host: get off shared hosting and into an environment you have full control over, where you stand a chance of being able to tune decently.

Replicate that environment in your lab, ideally with the same hardware as production; this includes things like RAID controller.

Did I mention that you need a RAID controller? Yes, you do. You can't achieve decent write performance without one - specifically, one with a battery-backed cache. If you don't have one, each write has to physically hit the disc, which is ruinous for performance.

Anyway, back to read performance, once you've got the machine with the same spec RAID controller (and same discs, obviously) as production in your lab, you can try to tune stuff up.

More RAM is usually the cheapest way of achieving better performance - make sure that you've got MySQL configured to use it - which means tuning storage-engine specific parameters.

I am assuming here that you have at least 100G of data; if not, just buy enough RAM that your entire DB fits in RAM, and read performance is essentially solved.


Software changes that others have mentioned, such as optimising queries and adding indexes, are helpful too - but only once you've got a development hardware environment that lets you do performance work usefully, i.e. measure the performance of your application meaningfully. That means real hardware (not VMs) that is consistent with the hardware environment used in production.


Oh yes - one more thing - don't even THINK about deploying a database server on a 32-bit OS; it's a ruinous waste of good RAM.

MarkR
+2  A: 

Indexing is done on database tables in order to speed up queries. If you don't know what it means, you probably have none. At a minimum you should have indexes on every foreign key and on most fields that are used frequently in the where clauses of your queries. Primary keys get indexes automatically, assuming you set them up to begin with - which I would find unlikely in someone who doesn't know what an index is. Are your tables normalized?

BTW, since you are doing division in your math (why, I haven't a clue), you should Google "integer math". You may not be getting correct results.
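The integer-math trap in one line, sketched in Python (MySQL's `/` operator returns a decimal result while its `DIV` operator truncates, so check which one your expression actually performs; the numbers below reuse the 1440-minute figure from the query):

```python
# Dividing (1440 - 90) minutes by 60: integer vs floating-point division.
minutes_int = (1440 - 90) // 60    # integer division truncates: 22
minutes_float = (1440 - 90) / 60   # true division keeps the fraction: 22.5
```

If the ranking formula is silently truncating, posts with fractionally different scores can end up tied or mis-ordered.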

HLGEM
My database is normalized, but I didn't know about indexing; anyway, I think this is the time to learn it. Thanks for the explanation. Sorry, but I could not understand what you mean by "you should Google integer math. You may not be getting correct results"! English is my second language :)
ahmed
+4  A: 

What an index is:

Think of a database table as a library - you have a big collection of books (records), each with associated data (author name, publisher, publication date, ISBN, content). Also assume that this is a very naive library, where all the books are shelved in order by ISBN (primary key). Just as the books can only have one physical ordering, a database table can only have one primary key index.

Now imagine someone comes to the librarian (database program) and says, "I would like to know how many Nora Roberts books are in the library". To answer this question, the librarian has to walk the aisles and look at every book in the library, which is very slow. If the librarian gets many requests like this, it is worth his time to set up a card catalog by author name (index on name) - then he can answer such questions much more quickly by referring to the catalog instead of walking the shelves. Essentially, the index sets up an 'alternative ordering' of the books - it treats them as if they were sorted alphabetically by author.

Notice that 1) it takes time to set up the catalog, 2) the catalog takes up extra space in the library, and 3) it complicates the process of adding a book to the library - instead of just sticking a book on the shelf in order, the librarian also has to fill out an index card and add it to the catalog. In just the same way, adding an index on a database field can speed up your queries, but the index itself takes storage space and slows down inserts. For this reason, you should only create indexes in response to need - there is no point in indexing a field you rarely search on.

What caching is:

If the librarian has many people coming in and asking the same questions over and over, it may be worth his time to write the answer down at the front desk. Instead of checking the stacks or the catalog, he can simply say, "here is the answer I gave to the last person who asked that question".

In your script, this may apply in different ways. You can store the results of a database query or a calculation or part of a rendered web page; you can store it to a secondary database table or a file or a session variable or to a memory service like memcached. You can store a pre-parsed database query, ready to run. Some libraries like Smarty will automatically store part or all of a page for you. By storing the result and reusing it you can avoid doing the same work many times.

In every case, you have to worry about how long the answer will remain valid. What if the library got a new book in? Is it OK to use an answer that may be five minutes out of date? What about a day out of date?

Caching is very application-specific; you will have to think about what your data means, how often it changes, how expensive the calculation is, how often the result is needed. If the data changes slowly, it may be best to recalculate and store the result every time a change is made; if it changes often but is not crucial, it may be sufficient to update only if the cached value is more than a certain age.
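The "is the answer still fresh?" logic above can be sketched as a minimal time-based cache. This is an illustrative Python sketch, not a production cache (`expensive_count` stands in for a slow query; in PHP you would typically back this with memcached, a file, or a table instead of a dict):

```python
import time

# Minimal TTL cache: reuse a stored answer until it is older than max_age.
_cache = {}

def cached(key, compute, max_age_seconds):
    now = time.time()
    if key in _cache:
        value, stored_at = _cache[key]
        if now - stored_at < max_age_seconds:
            return value            # still fresh: reuse the stored answer
    value = compute()               # missing or stale: recompute and store
    _cache[key] = (value, now)
    return value

calls = 0
def expensive_count():
    """Stand-in for a slow database query."""
    global calls
    calls += 1
    return 42

a = cached("front_page_count", expensive_count, max_age_seconds=300)
b = cached("front_page_count", expensive_count, max_age_seconds=300)  # cached
```

The `max_age_seconds` value encodes exactly the question in the text: is an answer five minutes out of date acceptable for this piece of data?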

Hugh Bothwell
Great analogy. Thanks.
andyk