ansaurus

Question

Is there way to match IP with IP+CIDR straight from SELECT query?

Answer 1

A:

Hmmm. You could build a table of the cidr masks, join it, and then compare the ip anded (& in MySQL) with the mask with the ban block ipaddress. Would that do what you want?

-- MarkusQ

P.S. If you don't want to build a mask table, you could compute the mask as -1 << (x-cidr) with x = 64 or 32 depending.

MarkusQ 2009-02-27 17:43:22

Answer 2

A:

For IPv4, you can use:

SET @length = 4;

SELECT  INET_NTOA(ipaddr), INET_NTOA(searchaddr), INET_NTOA(mask)
FROM  (
  SELECT
        (1 << (@length * 8)) - 1 & ~((1 << (@length * 8 - cidr)) - 1) AS mask,
        CAST(CONV(SUBSTR(HEX(ipaddr), 1, @length * 2), 16, 10) AS DECIMAL(20)) AS ipaddr,
        CAST(CONV(SUBSTR(HEX(@myaddr), 1, @length * 2), 16, 10) AS DECIMAL(20)) AS searchaddr
  FROM  ip
) ipo
WHERE ipaddr & mask = searchaddr & mask

Quassnoi 2009-02-27 18:06:26

Answer 3

+3 A:

Remember that IPs are not a textual address, but a numeric ID. I have a similar situation (we're doing geo-ip lookups), and if you store all your IP addresses as integers (for example, my IP address is 192.115.22.33 so it is stored as 3228767777), then you can lookup IPs easily by using right shift operators.

The downside of all these types of lookups is that you can't benefit from indexes and you have to do a full table scan whenever you do a lookup. The above scheme can be improved by storing both the network IP address of the CIDR network (the beginning of the range) and the broadcast address (the end of the range), so for example to store 192.168.1.0/24 you can store two columns:

network     broadcast
3232235776, 3232236031

And then you can to match it you simply do

SELECT count(*) FROM bans WHERE 3232235876 >= network AND 3232235876 <= broadcast

This would let you store cIDR networks in the database and match them against IP addresses quickly and efficiently by taking advantage of quick numeric indexes.

Guss 2009-02-27 20:45:34

Unfortunately, MySQL cannot combine two indexes together. It of course will try to use the index on either network or broadcast, but as ip addresses are distibuted evenly, full table scan will be much more efficient in this case.

Quassnoi 2009-02-27 23:04:21

This myth is a bit out of date :-) starting with MySQL 5.0, the server can merge multiple indexes (http://dev.mysql.com/doc/refman/5.1/en/index-merge-optimization.html). Regardless, I can't see how a full table scan is better then using an index, even if it is only 1 index.

Guss 2009-02-28 01:27:10

If your filter returns more than about 10% or rows, full table scan is better. Try it :)

Quassnoi 2009-02-28 20:18:56

And you don't need COUNT(*) here, SELECT ... LIMIT 1 is enough to ban :)

Quassnoi 2009-02-28 20:19:28

And MySQL cannot combine two ranged conditions with an AND clause: http://dev.mysql.com/doc/refman/5.1/en/index-merge-intersection.html

Quassnoi 2009-02-28 20:31:34

I'm really not sure where you get those ideas. (a) there is no way a full table scan is better then an index scan - granted if your index scan generates a significant number of rows that must be retrieved fully then the difference will be negligible. "Significant" here is 80%-90%, not 10!

Guss 2009-03-01 07:45:08

P.S. this condition is very unlikely in this problem domain, and can be mitigated easily by only selecting on the columns in the index at which point MySQL can skip retrieving full rows and just answer from the index.

Guss 2009-03-01 07:46:40

(b) You are of course right, but the original question used `count()` so I just copied that.

Guss 2009-03-01 07:47:12

(c) The link you show specifically says that AND condition works. At worst it means that you have to have index(network,broadcast) to make it work efficiently. And this is only one of 3 index merge algorithms MySQL can choose to employ. Quassnoi - please read the material you link to.

Guss 2009-03-01 07:49:23

Guss, believe me, I do read and check thing before I post them. If you read my link, you will see: either EQUALS condition on the whole index, or a RANGED CONDITION OVER PRIMARY KEY. You can't have two primary keys in one table, you cannot combine two RANGED (RANGED) conditions with an AND.

Quassnoi 2009-03-01 08:59:53

for (a): just checked in my database. SELECT COUNT(*) FROM it WHERE id > 10000 AND source > 0: 0,11 sec (TABLE SCAN), same with FORCE INDEX (PRIMARY): 1,58 sec, 14 times longer. PRIMARY INDEX is on ID, "id > 10000" filters out about 10% rows, "source > 0" is always TRUE.

Quassnoi 2009-03-01 09:11:28

@Quassnoi - sorry for disparaging, you are 99% correct. range scans can only be done on primary keys (with InnoDB engine, which probably means that it wont work with MyISAM engine).

Guss 2009-03-01 12:19:50

That being said, if you only select on indexed columns, then even if the server scans all the rows (which it will do in the above case), it doesn't have to do a full table scan (which requires loading full rows and is very slow) - it can just load the entire index -that fits into memory most times.

Guss 2009-03-01 12:22:21

ah... BTW Quassnoi: if "id>10000" filters out 10% of the rows, then you are selecting on 90% of the rows, see my point (a) above.

Guss 2009-03-01 12:49:03

I just demoed this on a MySQL 5.1: with an index lookup that hits about 90% of the rows, using the index (force index) is indeed slower then a full table scan (ignore index). If the index lookup hits about 10% of the rows, (like in this question) the situation is reversed (even for range lookups).

Guss 2009-03-01 17:06:16

Maybe it depends on the table data. In the tables I checked against, INDEX SCANs perform better than TABLE SCANs only if index condition covers about 10% of data or less. If table rows are long, situation may probably change.

Quassnoi 2009-03-02 17:08:13

ansaurus

tags:

views:

answers:

Is there way to match IP with IP+CIDR straight from SELECT query?

related questions