views: 59
answers: 2

What is the best database model to store user visits and count unique users by IP in a big database, with 1,000,000 rows for example?

SELECT COUNT(DISTINCT ip) FROM visits

But with 1,000,000 different IPs it can be a slow query, and caching will not return the real number.

How do big stats systems count unique visits?

A: 

Don't use a relational database for that. It's not designed to store that type of information.

You can try a NoSQL database such as Mongo (I know a lot of places use that for their logging since it has so little overhead).

If you must stick with MySQL, you can add an index to the ip column which should speed things up significantly...
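
For example, assuming the `visits` table from the question, adding that index could look like this (the index name is just illustrative):

ALTER TABLE visits ADD INDEX idx_ip (ip);

-- the count can then be answered from the index instead of scanning the full table:
SELECT COUNT(DISTINCT ip) FROM visits;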

ircmaxell
That's what I would suggest. Also, think about the concept of calculating unique users: calculate it just once and then reuse it. The number of yesterday's unique visitors will not change, and the number of unique visitors last week will not change either.
dwich
Based upon that, you could shard per day/week/month/whatever, and create a new table for each new period. That way you still retain the information (if you **really** need it), and get the performance gain of dealing with relatively small tables. But I must ask, why do you need to retain that much data? Why not just summarize once per day and then delete after a month or two?
ircmaxell
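
As a rough sketch of that "summarize once per day" idea (the `daily_uniques` table and the `visited_at` column are hypothetical, not from the question):

-- run once per day (e.g. from cron) to snapshot yesterday's unique-visitor count
INSERT INTO daily_uniques (visit_date, unique_visitors)
SELECT DATE(visited_at), COUNT(DISTINCT ip)
FROM visits
WHERE visited_at >= CURDATE() - INTERVAL 1 DAY
  AND visited_at < CURDATE()
GROUP BY DATE(visited_at);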
@ircmaxell: I know how to use indexes... I'm only asking about DB models for highly populated databases. I need to save all the data because my framework needs all the information about all clients on different servers, for statistics and other things. Thanks. (What's faster: an index on IP, or the other solution, a table with unique IPs?)
Wiliam
Well, it all depends on the server. Don't forget that this table will be extremely write-heavy (assuming it's being written to live). So the `on duplicate key update` may have a performance hit, since it needs to read the index and then seek to the position, where a plain insert would just need the seek (sounds like a tiny bit extra, and it is for one query; for thousands per second it's significant). Plus plain inserts enable writes to be streamed back to back rather than requiring seeks all over the place. Bottom line: test it. Make a test db and a script to write to it, and see...
ircmaxell
+1  A: 

Have another MyISAM table with only an IP column and a UNIQUE index on it. You'll get the proper count in no time (MyISAM caches the number of rows in a table).
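
A minimal sketch of such a table, assuming IPs stored as integers and the `visitCounter` name used in the snippet below:

CREATE TABLE visitCounter (
  ip INT UNSIGNED NOT NULL,  -- IPv4 address as returned by INET_ATON()
  UNIQUE KEY (ip)
) ENGINE=MyISAM;

-- MyISAM stores the exact row count in its table metadata, so this is effectively free:
SELECT COUNT(*) FROM visitCounter;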

[added after comments]

If you also need to count visits from each IP, add one more column visitCount and use

INSERT INTO
  visitCounter (ip, visitCount)
VALUES
  (INET_ATON($ip), 1)
ON DUPLICATE KEY UPDATE
  visitCount = visitCount + 1;
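
If the table originally had only the IP column (as in the sketch above), the extra column can be added beforehand with something like:

ALTER TABLE visitCounter
  ADD COLUMN visitCount INT UNSIGNED NOT NULL DEFAULT 1;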
Mchl
@Mchl if the IP column is UNIQUE, won't that table always return COUNT = 1 per IP?
Frankie
It will, but I understood that William wanted to count the number of all distinct IPs. This can still be modified by adding a `count` field and using `INSERT ... ON DUPLICATE KEY UPDATE ... ` syntax to increment it.
Mchl
For unique visits it's a good solution. Save the unique IP and the actual timestamp.
Wiliam
@Mchl: Storing the IP as an integer is better than storing it as a string? Faster?
Wiliam
Absolutely. It's storing a 4-byte integer instead of up to 15 bytes... Plus it doesn't require the charset and collation routines for searching and retrieving...
ircmaxell
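
To illustrate the size difference (the values here are just the textbook example):

SELECT INET_ATON('127.0.0.1');  -- 2130706433, fits in a 4-byte INT UNSIGNED
SELECT INET_NTOA(2130706433);   -- '127.0.0.1', up to 15 characters as a string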
Yeah. Just don't do `WHERE INET_NTOA(ip) = '127.0.0.1'` but do `WHERE ip = INET_ATON('127.0.0.1')`. The difference is that the second one uses the index.
Mchl