views:

81

answers:

3

Hi,

What I wish to achieve is to log all information about each and every visit to every page of my website (like IP address, browser, referring page, etc.). Now this is easy to do.

What I am interested in is doing this in a way that causes minimum overhead (runtime) in the PHP scripts. What is the best approach efficiency-wise:

1) Log all information to a database table

2) Write to a file (from php directly)

3) Call a C++ executable that will write this info to a file in parallel (so the script can continue execution without waiting for the file write to occur... is this even possible?)

I may be trying to optimize unnecessarily/prematurely, but still, any thoughts/ideas on this would be appreciated. (I think the efficiency of file writing/logging can really be a concern if I have, say, 100 visits per minute...)

Thanks & Regards,

JP

+4  A: 

You have this C++ executable already. It's called a web server. It logs every hit to your site.

Col. Shrapnel
Plus most web servers can be configured to log everything you mentioned (and more)...
ircmaxell
Hi, my site is hosted in a shared hosting environment (GoDaddy, if that helps), so I think I will not have access to the web server logs? They have something called "access logs". There are two problems: i) it doesn't seem to log every visit; ii) I don't see any way to configure it. Perhaps I am missing something. Any suggestions? Thanks, JP
JP19
@JP19 Man, don't take offense, but that's the funniest comment I've ever seen. Would you please ask your daddy if he can stand even 10 visits per minute? If not, why concern yourself with things so far beyond your shared hosting's abilities? Dunno about GoDaddy, but access logs are **intended** to log every hit. That's the access log's only purpose. I run a load of access log analyzers on my sites precisely because the access log records everything, while a PHP-based logger will only log accesses to PHP files. But 100 per minute on shared... Oh my...
Col. Shrapnel
BTW, 100 per minute is under 2 per second. Not a load that needs anything special. Any mechanism you mentioned will handle it easily. I misread it as 100 per second at first.
Col. Shrapnel
I take no offense, but there can be cases where a page on my site attracts heavy peak traffic for one hour while the average monthly traffic is pretty low. I was just planning for the worst-case scenario. (In any case, premature optimization is a weakness I am trying to overcome :)
JP19
@JP19 You didn't take the whole point either. **Writing logs is the least of the tasks your server has to accomplish at peak traffic.** Your application will hang far earlier than anything happens to the logging system.
Col. Shrapnel
@Col S.: Yes, I agree with that. However, the whole point was not having access to the system logs (on shared hosting).
JP19
+1  A: 
  1. Robust, but could be a pain to implement.
  2. Be careful with concurrent writes: what happens if two users invoke your PHP script simultaneously and the file is already open for writing?
  3. Same as 2, except the failure will occur in the C++ executable.

I would suggest using a logging framework such as log4php.

Darin Dimitrov
He could write to different log files based on IP? User 1: 193.123.123.1, user 2: 194.123.123.1; write to two files, file_193.txt and file_194.txt.
Mihai Iorga
What if all the users are behind the same proxy?
Darin Dimitrov
Ya, simultaneous multiple writes were my concern. I will check out log4php to see if it addresses my issue (efficiency). Thanks, JP
JP19
A: 
  • A database write can really make things get ugly.
  • Calling an executable is not OK, because it can get stuck. What will you do when your C++ process gets stuck? You will get 100 hung processes per minute.
  • I think the best way in this situation is to write to plain text files...

my opinion.

Mihai Iorga
Hmmm... ya, 100 C++ executable calls per minute looks like a bad idea. About the 'stuck' thing, my point was: it's okay to have some errors in the executable (even to miss some logs), but at least the PHP script will not suffer a slowdown. I could even implement a queue kind of thing in C++. Any thoughts? Thanks, JP
JP19
yeah on shared hosting
Col. Shrapnel
I know Col. Shrapnel means no offense, but the thing is, there are cases where some articles on a website can get heavy peak traffic for one hour while the average monthly traffic is still pretty low. Needless to say, I was planning for the worst-case scenario. (Anyway, old thread, but I wanted to clarify :)
JP19