views:

111

answers:

3

Hi,

I am trying to do this:

<?php
  $good_customer = 0;
  $q = mysql_query("SELECT user FROM users WHERE activated = '1'"); // this gives me about 40k users

  while($r = mysql_fetch_assoc($q)){
    $money_spent = 0;

    $user = $r['user'];
    // Do queries on another 20 tables
    for($i = 1; $i <= 20; $i++){
      $tbl_name = 'data' . $i;

      $q2 = mysql_query("SELECT money_spent FROM $tbl_name WHERE user = '{$user}'");
      while($r2 = mysql_fetch_assoc($q2)){
        $money_spent += $r2['money_spent'];
      }
    }

    // Check the total across all 20 tables, so each user is counted at most once
    if($money_spent > 1000000){
      $good_customer += 1;
    }
  }

This is just an example. I am testing on localhost; for a single user it returns very fast, but when I try 1000 users it takes forever, not to mention 40k users.

Is there any way to optimise/improve this code?

EDIT: By the way, each of the other 20 tables has ~20-40k records.

EDIT2:

Okay, drop the "money spent" idea. This is my current structure:

user table => user is PK

logs_week_1 table => user is FK

logs_week_2 table => user is FK

logs_week_3 table => user is FK

... will have more logs tables in future.

I want to find the "average time" they spend on my site, with the time stored in each of the logs tables.

So you guys are saying that storing the logs weekly is a bad idea? I should merge them into one table?

+2  A: 

Sounds like you have a problem with your model. Why do you have 20 data tables instead of one with a week column?
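For example, the single table could look roughly like this (the column types are just a guess based on your description):

Create Table data (
  user        Int Not Null,            -- FK to users
  week        Int Not Null,            -- replaces the 20 separate tables
  money_spent Decimal(12,2) Not Null,
  Index ( user )
)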

Then you could do a

Select user, Sum( money_spent ) As total_money_spent
From data
Group By user

or even

Select Count(*) As good_customer_count
From
  ( Select user
    From data
    Group By user
    Having Sum( money_spent ) > 1000000
  ) t

With your current structure you can only do something like this:

Select u.user, d1.money_spent + d2.money_spent + ...
From users u
Join data1 d1 On ( d1.user = u.user )
Join data2 d2 On ( d2.user = u.user )
...

or

Select Count(*) As good_customer_count
From
  ( Select u.user, d1.money_spent + d2.money_spent + ... As total_money_spent
    From users u
    Join data1 d1 On ( d1.user = u.user )
    Join data2 d2 On ( d2.user = u.user )
    ...
  ) t
Where total_money_spent > 1000000

This will certainly be faster than your current solution.


And the time spent on a page should be stored in a numeric field.

Peter Lang
If money_spent is a "time" type, how do I do the sum? E.g. 00:10:23, 00:12:01, etc.
mysqllearner
I think we need information about your table structures and about your data to answer that.
Peter Lang
`money_spent` did sound like some sort of number column to me...
Peter Lang
@peter: I will edit my question to add the structures.
mysqllearner
+1, but I updated my answer with a UNION ALL option :)
Unreason
A: 

You should store the time spent on your site as a number (in minutes or seconds), not as a time. Then you can calculate averages and sums on this value. And keep your logs in one table.
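For example, assuming a single logs table with a numeric duration_seconds column (the names are just illustrative):

SELECT user,
       AVG(duration_seconds) AS avg_seconds,
       SEC_TO_TIME(AVG(duration_seconds)) AS avg_time
FROM logs
GROUP BY user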

Riho
+1  A: 

As Peter already gave a good answer, I will only post how the query would look with a proper design (all log data in one table):

SELECT user, SEC_TO_TIME(AVG(TIME_TO_SEC(TIMEDIFF(end_time, start_time)))) AS average_time
FROM logs
GROUP BY user

You can apply further WHERE conditions to the above to get stats for only a certain period (week, month, etc.), or you can group by another level.

You can also get MAX and COUNT in the same query (as well as standard deviation and other aggregate functions) in an efficient manner.
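For example, assuming the log table also has a log_date column to filter on (that name and the dates are just illustrative), one week could be summarised like this:

SELECT user,
       COUNT(*) AS visits,
       SEC_TO_TIME(MAX(TIME_TO_SEC(TIMEDIFF(end_time, start_time)))) AS longest_visit,
       SEC_TO_TIME(AVG(TIME_TO_SEC(TIMEDIFF(end_time, start_time)))) AS average_time
FROM logs
WHERE log_date BETWEEN '2010-04-05' AND '2010-04-11'
GROUP BY user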

Of course, take care of your indexes for best performance with larger data sets.

EDIT:

Just as I was giving Peter +1, I noticed that he did not mention the UNION ALL option.

So, you could do this (it is not optimal, and it does not contradict the design warnings given by others):

SELECT user, SEC_TO_TIME(AVG(TIME_TO_SEC(TIMEDIFF(end_time, start_time)))) AS average_time
FROM (
    SELECT * FROM logs_week_1
    UNION ALL
    SELECT * FROM logs_week_2
    UNION ALL
    SELECT * FROM logs_week_3
    ...
) U
GROUP BY user

You can also create a VIEW for this union.
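A rough sketch of such a view (logs_all is just an illustrative name; extend the union as new weekly tables are added):

CREATE VIEW logs_all AS          -- one virtual table over all weekly log tables
    SELECT * FROM logs_week_1
    UNION ALL
    SELECT * FROM logs_week_2
    UNION ALL
    SELECT * FROM logs_week_3

The GROUP BY query above can then simply select FROM logs_all.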

Unreason
@Unreason: What's the difference between using UNION and INNER JOIN? Currently I am using UNION, and the result seems a bit weird to me. I am trying to use INNER JOIN now.
mysqllearner
@mysqllearner: If your inner join is 1-1, it puts your tables next to each other, so you can select all columns from all tables in a single row. If you use UNION, the number of columns remains the same (it must be the same) and the results are appended one after another (with a big difference in performance between UNION and UNION ALL: UNION returns only unique rows, which requires building an index to de-duplicate, while UNION ALL can return duplicate records, but I assumed that your logs don't overlap).
Unreason
@mysqllearner: I posted the UNION ALL solution because it is conceptually equivalent (for selects) to fixing your design - keeping all logs in one table (but it is still not the same performance-wise).
Unreason