tags:

views:

49

answers:

2

Hi everyone.

My site logs clicks to a database whenever a user views an article. This table is automatically cleared out every 3 days. We use the data to work out the most viewed pages over that 3 day period.

I'd also like to use this data for marketing purposes, eg, to determine which users like which sections of the site.

I've written a script in php that does the following:

  • grab the user IDs of any logged-in member who viewed an article in the past 3 days

  • for each user, i query to count how many times they viewed articles within each section, Eg, Bobby viewed 10 pages in Food & Drink, and 6 pages in Sport. I tried combining this step and the previous one together but got weird results for the totals.

  • This gives me a nice array in this form:

    [Bobby Jones] => Array ( [Film] => 10 [Home Page] => 1 [Food & Drink] => 2 [Health & Beauty] => 1 [Gifts & Gadgets] => 3 )

What I want from this data is to eventually have a table that logs the data in the array above, and then when I run my queries, increments it.

Unfortunately, this adds a lot of overhead. When I have my arrays like the one above, I have to run another query to check if that combination of user and category already exists in the database, and if it does, I have to increment it by that day's values. Eg, if Bobby viewed 10 Film pages last week, and this week viewed 6, I need to UPDATE the table. If he's never viewed a Food page before, I need to INSERT instead.

This query is returning 400 or so users who've interacted with the site in the last 3 days. This means that for each I user I have to do 1 query to get their browsing totals, 1 query to see if they've already browsed that category before, and another query to update/insert, depending on whether they've browsed it or not. You can see how inefficient this is.

Can anyone suggest a better way of doing this? My ultimate goal is to end up with a table that shows me how frequently my users browse my categories, so I can say "show me all the users who like Food & Drink" etc.

Thanks, Matt

+1  A: 

You can accomplish the UPDATE/INSERT behavior using MySQL's INSERT...ON DUPLICATE KEY UPDATE.

You can combine the SELECT and INSERT queries using MySQL's INSERT...SELECT.

If you post more details (say, a schema, for example), I could show you a sample query combining those two techniques, though the MySQL Manual is pretty in-depth on both subjects.

James McNellis
Thanks, this was what I needed. I added a UNIQUE index to some of my fields then used the INSERT ... ON DUPLICATE KEY UPDATE function - this makes the page manageable (runs in about 20 seconds) and does what I need with the data, without any weird results. Thank you!
Matt Andrews
+1  A: 

If you're using MySQL and the version is sufficiently high, look into INSERT ... ON DUPLICATE KEY UPDATE. That should cut down on a query.

Then make sure your tables are properly keyed, and those two queries should be a breeze.

Adam Bard