I was looking at the Amazon website and was wondering how one of its features might have been implemented. The feature: what customers buy after viewing a particular item.

If I were to develop such a feature, I would probably generate a session id for each user session and store each (session id, page id) combination in a log file. If a book is bought, I would set a separate flag on that (session id, page id) entry. A separate program could then be run on the log file periodically to identify the groups of items that were bought together or viewed together, and that information could be stored in a persistent file.

This is of course a simple solution that does not take the distributed nature of the servers into consideration. Would this suffice, or can you help me identify a better design?
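The periodic log-processing step described above could be sketched roughly as follows. The log format (`session_id,page_id,flag`, one line per page hit, in the order the hits occurred) and the function name are assumptions made for illustration, not something the question specifies.

```python
from collections import Counter, defaultdict


def bought_after_viewing(log_lines):
    """Count, for each viewed page, which items were later bought in the same session.

    Assumes (hypothetically) that each log line is 'session_id,page_id,flag',
    that flag is '1' when the item on that page was bought, and that lines
    appear in the order the page hits occurred.
    """
    seen = defaultdict(set)        # session_id -> pages viewed so far
    counts = defaultdict(Counter)  # viewed page -> Counter of items bought afterwards

    for line in log_lines:
        session_id, page_id, flag = line.strip().split(",")
        if flag == "1":
            # Credit the purchase to every page viewed earlier in this session.
            for earlier in seen[session_id]:
                if earlier != page_id:
                    counts[earlier][page_id] += 1
        seen[session_id].add(page_id)

    return counts
```

Calling `counts[page].most_common(5)` on the result would give the top five items bought after viewing `page`. With several servers, each one could write its own log file and the batch job could read them all, provided the lines for a given session are processed in time order.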

+1  A: 

What you can do is track what each viewer has looked at by storing (user_id, product_id, date) in a database. You can then look at what people bought on the day they looked at that item.

But you will want to improve on this: if I looked at a cooking pan and then bought a book, you don't want to show the two together unless, perhaps, it was a cooking-related book.

Storing the session id is probably not the best choice, though. What happens if I looked at something and then later, after talking with my SO, decided I could buy that iPad? Your way, it wouldn't show up as something I bought after looking at a notebook.
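As a minimal sketch of the (user_id, product_id, date) idea, the same-day lookup could look something like this using Python's built-in `sqlite3`. The `views` and `purchases` table names, the separate purchases table, and the function names are assumptions for illustration only.

```python
import sqlite3


def make_db():
    """Create an in-memory database with hypothetical views/purchases tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE views     (user_id TEXT, product_id TEXT, date TEXT);
        CREATE TABLE purchases (user_id TEXT, product_id TEXT, date TEXT);
        -- A single index keeps inserts cheap while still supporting the lookup.
        CREATE INDEX idx_views_product ON views (product_id, date);
        """
    )
    return conn


def bought_same_day(conn, product_id):
    """Return (item, buyer count) for items bought on the day users viewed product_id."""
    rows = conn.execute(
        """
        SELECT p.product_id, COUNT(DISTINCT p.user_id) AS buyers
        FROM views v
        JOIN purchases p
          ON p.user_id = v.user_id AND p.date = v.date
        WHERE v.product_id = ? AND p.product_id <> v.product_id
        GROUP BY p.product_id
        ORDER BY buyers DESC, p.product_id
        """,
        (product_id,),
    )
    return rows.fetchall()
```

Because the join is on user and date rather than on a session id, a purchase made later the same day in a different browser session is still counted. Widening the date condition to a range would cover purchases made a few days after the view.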

So, start by writing out some user stories, basically just a few scenarios that show what you want to do. The design will probably be easier after that.

James Black
Good idea. But in many cases the user might not be registered while searching for a product. I considered using a db initially, but won't it be a severe performance drag when there are millions of page hits a day?
praveen
The user story driven design is a great idea.
praveen
If your table is not heavily indexed then the inserts are fast, so just have a couple of indexes and it should be fine.
James Black
You could register people by IP address and unixtime, just to have a unique identifier, but if they are registered then you just use their registered user_id.
James Black
@praveen - You could just store the IP address and treat that as an unregistered user, if you wanted. That would just mean you may have more than one person tied to that account information, but how much that matters depends on how important accuracy is to you. This is why scenarios are useful: you can write out several different ones and decide how you want everything to work.
James Black
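The identifier scheme suggested in these comments could be sketched as below. The function name, parameter names, and the `user:`/`anon:` key format are hypothetical; as noted above, several people behind one IP address may end up sharing an anonymous key.

```python
import time


def visitor_id(ip_address, user_id=None, first_seen=None):
    """Return a tracking key: the registered user_id if known, else IP + unixtime."""
    if user_id is not None:
        return f"user:{user_id}"
    if first_seen is None:
        # Unixtime of the first hit; the caller would need to store and reuse
        # this value so that later hits map to the same anonymous key.
        first_seen = int(time.time())
    return f"anon:{ip_address}:{first_seen}"
```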