views:

1050

answers:

3

Something like

SELECT COUNT(*) AS c FROM BANS WHERE typeid=6 AND (SELECT ipaddr,cidr FROM BANS) MATCH AGAINST 'this_ip';

So you don't first fetch all records from DB and then match them one-by one.

If c > 0 then were matched.

BANS table:

id int auto incr PK
typeid TINYINT (1=hostname, 4=ipv4, 6=ipv6)
ipaddr BINARY(128)
cidr INT
host VARCHAR(255)

DB: MySQL 5

IP and IPv type (4 or 6) is known when querying.

IP is for example ::1 in binary format

BANNED IP is for example ::1/64

A: 

Hmmm. You could build a table of the cidr masks, join it, and then compare the ip anded (& in MySQL) with the mask with the ban block ipaddress. Would that do what you want?

-- MarkusQ

P.S. If you don't want to build a mask table, you could compute the mask as -1 << (x-cidr) with x = 64 or 32 depending.

MarkusQ
A: 

For IPv4, you can use:

SET @length = 4;

SELECT  INET_NTOA(ipaddr), INET_NTOA(searchaddr), INET_NTOA(mask)
FROM  (
  SELECT
        (1 << (@length * 8)) - 1 & ~((1 << (@length * 8 - cidr)) - 1) AS mask,
        CAST(CONV(SUBSTR(HEX(ipaddr), 1, @length * 2), 16, 10) AS DECIMAL(20)) AS ipaddr,
        CAST(CONV(SUBSTR(HEX(@myaddr), 1, @length * 2), 16, 10) AS DECIMAL(20)) AS searchaddr
  FROM  ip
) ipo
WHERE ipaddr & mask = searchaddr & mask
Quassnoi
+3  A: 

Remember that IPs are not a textual address, but a numeric ID. I have a similar situation (we're doing geo-ip lookups), and if you store all your IP addresses as integers (for example, my IP address is 192.115.22.33 so it is stored as 3228767777), then you can lookup IPs easily by using right shift operators.

The downside of all these types of lookups is that you can't benefit from indexes and you have to do a full table scan whenever you do a lookup. The above scheme can be improved by storing both the network IP address of the CIDR network (the beginning of the range) and the broadcast address (the end of the range), so for example to store 192.168.1.0/24 you can store two columns:

network     broadcast
3232235776, 3232236031

And then you can to match it you simply do

SELECT count(*) FROM bans WHERE 3232235876 >= network AND 3232235876 <= broadcast

This would let you store cIDR networks in the database and match them against IP addresses quickly and efficiently by taking advantage of quick numeric indexes.

Guss
Unfortunately, MySQL cannot combine two indexes together. It of course will try to use the index on either network or broadcast, but as ip addresses are distibuted evenly, full table scan will be much more efficient in this case.
Quassnoi
This myth is a bit out of date :-) starting with MySQL 5.0, the server can merge multiple indexes (http://dev.mysql.com/doc/refman/5.1/en/index-merge-optimization.html). Regardless, I can't see how a full table scan is better then using an index, even if it is only 1 index.
Guss
If your filter returns more than about 10% or rows, full table scan is better. Try it :)
Quassnoi
And you don't need COUNT(*) here, SELECT ... LIMIT 1 is enough to ban :)
Quassnoi
And MySQL cannot combine two ranged conditions with an AND clause: http://dev.mysql.com/doc/refman/5.1/en/index-merge-intersection.html
Quassnoi
I'm really not sure where you get those ideas. (a) there is no way a full table scan is better then an index scan - granted if your index scan generates a significant number of rows that must be retrieved fully then the difference will be negligible. "Significant" here is 80%-90%, not 10!
Guss
P.S. this condition is very unlikely in this problem domain, and can be mitigated easily by only selecting on the columns in the index at which point MySQL can skip retrieving full rows and just answer from the index.
Guss
(b) You are of course right, but the original question used `count()` so I just copied that.
Guss
(c) The link you show specifically says that AND condition works. At worst it means that you have to have index(network,broadcast) to make it work efficiently. And this is only one of 3 index merge algorithms MySQL can choose to employ. Quassnoi - please read the material you link to.
Guss
Guss, believe me, I do read and check thing before I post them. If you read my link, you will see: either EQUALS condition on the whole index, or a RANGED CONDITION OVER PRIMARY KEY. You can't have two primary keys in one table, you cannot combine two RANGED (RANGED) conditions with an AND.
Quassnoi
for (a): just checked in my database. SELECT COUNT(*) FROM it WHERE id > 10000 AND source > 0: 0,11 sec (TABLE SCAN), same with FORCE INDEX (PRIMARY): 1,58 sec, 14 times longer. PRIMARY INDEX is on ID, "id > 10000" filters out about 10% rows, "source > 0" is always TRUE.
Quassnoi
@Quassnoi - sorry for disparaging, you are 99% correct. range scans can only be done on primary keys (with InnoDB engine, which probably means that it wont work with MyISAM engine).
Guss
That being said, if you only select on indexed columns, then even if the server scans all the rows (which it will do in the above case), it doesn't have to do a full table scan (which requires loading full rows and is very slow) - it can just load the entire index -that fits into memory most times.
Guss
ah... BTW Quassnoi: if "id>10000" filters out 10% of the rows, then you are selecting on 90% of the rows, see my point (a) above.
Guss
I just demoed this on a MySQL 5.1: with an index lookup that hits about 90% of the rows, using the index (force index) is indeed slower then a full table scan (ignore index). If the index lookup hits about 10% of the rows, (like in this question) the situation is reversed (even for range lookups).
Guss
Maybe it depends on the table data. In the tables I checked against, INDEX SCANs perform better than TABLE SCANs only if index condition covers about 10% of data or less. If table rows are long, situation may probably change.
Quassnoi