Hello,

I have written some code that provides statistical information about visitors to a website (like "Google Analytics").

I have a REFERER table with refererID and refererURL columns. The refererURL column contains the whole URL, including the query parameters.

Other tables, such as the statistics table, refer to this table by the refererID column.

This is not efficient in terms of database storage space.

How could I save the referer URL differently to save as much space as I can?

Thank you Yaron

+4  A: 

You could strip the query parameters from the URLs before inserting them and keep the parameters in a separate table. Then, in your audit table (the one that counts references), store only the refererID and the date of the reference. This would normalize your referral data. I'd be hesitant to go any further than this, as you'll start to lose information, but you might also want to keep just the host part of the referring site in a separate column, to make it easier to calculate statistics by "site" in the future.
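As a rough illustration of the split described above, here is a minimal Python sketch using the standard library's urllib.parse. The function name split_referer and the sample URL are made up for the example; they are not from the original code, and the real application may be written in a different language.

```python
from urllib.parse import urlsplit

def split_referer(url):
    """Split a referer URL into (host, base_url, query).

    Hypothetical helper: base_url is the URL without its query string,
    host is the lowercased network location, query is the raw query string.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    base_url = f"{parts.scheme}://{host}{parts.path}"
    return host, base_url, parts.query

# Sample URL invented for the example.
host, base_url, query = split_referer(
    "http://www.google.com/search?q=analytics&hl=en"
)
```

The base URL and the host repeat across many visits, so they can be stored once and referenced by ID, while the query string goes to its own table.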

tvanfosson
A: 

I would recommend storing the domain, path, and query each separately. The domain will get a lot of reuse, especially from all of the Google traffic.

When displaying analytics data, you will normally show the site which sent the traffic; when drilling down into the information, you will display the more specific URL which sent the traffic.
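One possible way to lay this out is sketched below with Python's built-in sqlite3 module. The DOMAIN table, its columns, the path and query columns, and the get_referer_id helper are all assumptions made for illustration; only REFERER and refererID come from the question, and the actual database engine is not stated.

```python
import sqlite3
from urllib.parse import urlsplit

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domain (
    domainID INTEGER PRIMARY KEY,
    host     TEXT NOT NULL UNIQUE
);
CREATE TABLE referer (
    refererID INTEGER PRIMARY KEY,
    domainID  INTEGER NOT NULL REFERENCES domain(domainID),
    path      TEXT NOT NULL,
    query     TEXT NOT NULL,
    UNIQUE (domainID, path, query)
);
""")

def get_referer_id(conn, url):
    """Return the refererID for url, inserting rows only when needed."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Each host is stored once, however many URLs share it.
    conn.execute("INSERT OR IGNORE INTO domain (host) VALUES (?)", (host,))
    domain_id = conn.execute(
        "SELECT domainID FROM domain WHERE host = ?", (host,)
    ).fetchone()[0]
    row = (domain_id, parts.path, parts.query)
    conn.execute(
        "INSERT OR IGNORE INTO referer (domainID, path, query) VALUES (?, ?, ?)",
        row,
    )
    return conn.execute(
        "SELECT refererID FROM referer"
        " WHERE domainID = ? AND path = ? AND query = ?",
        row,
    ).fetchone()[0]
```

With this layout, per-site statistics become a join on domainID, and the full URL is only reassembled when drilling down.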

Brendan Enrick
Thank you Brendan
Yaron
+1  A: 

I think your approach is fine. Storage is not a big deal in 2010.

You should be much more concerned about performance.

Pierre 303