tags:

views:

1924

answers:

13

For my server app, I need to check if an ip address is in our blacklist.

What is the most efficient way of comparing ip addresses? Would converting the IP address to integer and comparing them efficient?

+17  A: 

Depends what language you're using, but an IP address is usually stored as a 32-bit unsigned integer, at least at the network layer, making comparisons quite fast. Even if it's not, unless you're designing a high performance packet switching application it's not likely to be a performance bottleneck. Avoid premature optimization - design your program for testability and scalability and if you have performance problems then you can use a profiler to see where the bottlenecks are.

Edit: to clarify, IPv4 addresses are stored as 32-bit integers, plus a netmask (which is not necessary for IP address comparisons). If you're using the newer and currently more rare IPv6, then the addresses will be 128 bits long.

sk
+3  A: 

Yes I have found that to be efficient, it will be a long though, and of course you have to index blacklisted IPs in integer form.

eulerfx
+6  A: 

32-bit integers are the way to go -- until you start dealing with 128-bit IPv6 addresses.

Graeme Perrow
+2  A: 

Use a tool like PeerGuardian which disallows incoming TCP/IP connections at the driver level to IPs on a blacklist. Highly secure, no code required (arguably: highly secure, because no code required).

Justice
A: 

if you receive the IP address as a string, comparing it to a string may be more efficient than converting it to integer representation

but i'd profile both solutions to be certain, if a few milliseconds (nanoseconds!) are going to matter on this operation ;-)

Steven A. Lowe
+1  A: 

Do you have an existing problem with efficiency?

If so then by all means post the code (or pseudo-code) and we can pick at the corpse.

If not then I would suggest trying something simple like storing the entries in a sorted list and using your environment's existing Sort() and Find().

Mark Glorie
+4  A: 

You mean if you should compare it as a text string or convert int to int and compare as an int?

That's not usually the bottleneck in this sort of lookups. you can just try to implement both methods and see which one runs faster.

The real issue with IP address lookup is usually making efficient queries, taking advantage of the fact that you are dealing with IP addresses and not just random numbers. to accomplish this you can lookup LC trie and maybe this article

Obviously this should interest you only if your blacklist holds tens of thousands or millions of entries. If it has only 10-20 entries a linear search should be preferred and indeed the more interesting question is textual comparison vs integer comparison.

shoosh
+1  A: 

Integer comparisons are much faster than string comparisons.

If you store the integers in a sorted list, you can find them faster than in an unsorted list.

Adam Pierce
+2  A: 

I've done this and I've tested it, using an unsigned int (32 bit) is the fastest - I'm assuming that you're comparing this to the string representation.

Another thing that might help you is when creating the table, in the past I've had 2 colums: LowIP and HighIP; that way I have been able to blacklist entire ranges of IP's with 1 record entry and still get good performance by checking for the IP in the range.

Guy
+2  A: 

I once inherited code where somebody thought that storing IP addresses as 4 int's was a really good thing, except they spent all their time converting to/from int's.

Keeping them as strings in the database was far easier, and it only required a single index. You'd be surprised how well sql server can index strings as opposed to 4 columns of integers. But this IP list wasn't for blacklisting. A database round-trip is pretty costly.

If a database is overkill, store them in a dictionary in memory, but that's just a guess since we've no idea how many you need to compare. Since most hashcodes are 32-bit int's, and IPv4 addresses are 32 bits, the IP address itself might just be a good hashcode.

But as others point out, the best option might be to reduce the load on your server and buy specialized hardware. Maybe you keep recently blacklisted IP's in memory and periodically publish new one's to the router.

If you're the one trying to make some software inside a router, then you'll need to fish out your data-structures book and create something like a b-tree.

Robert Paulson
Comparing one bad representation (4 ints) with another (string) isn't a fair comparison really.
MSalters
It was anecdotal response of the KISS principle, and storing an IP address as a string was sufficient for the purpose at hand.
Robert Paulson
A: 

And what if you need to do fast filtering with as little memory requirement as possible?

I'm thinking of using a Bloom filter. Any suggestions on those?

+1  A: 

The Radix or PATRICIA Trie is the optimal structure for this.

Check out the C source for flow-tools: http://www.splintered.net/sw/flow-tools/

I worked on this years ago.

mparaz
+2  A: 
static public bool IsEqual(string ToCompare,
                                      string CompareAgainst)
  {

     return IPAddressToLongBackwards(ToCompare)==IPAddressToLongBackwards(CompareAgainst);
  }

static private uint IPAddressToLongBackwards(string IPAddr)
  {
     System.Net.IPAddress oIP=System.Net.IPAddress.Parse(IPAddr);
     byte[] byteIP=oIP.GetAddressBytes();


     uint ip=(uint)byteIP[0]<<24;
     ip+=(uint)byteIP[1]<<16;
     ip+=(uint)byteIP[2]<<8;
     ip+=(uint)byteIP[3];

     return ip;
  }

If I understand you correctly this is code to compare two IP Addresses, Do you want this ? you can further do such things like...

static public bool IsGreater(string ToCompare,
                               string CompareAgainst)
  {

     return IPAddressToLongBackwards(ToCompare)>
        IPAddressToLongBackwards(CompareAgainst);
  }

because you got the address bytes... If you like that then vote me UP but please do not vote me negative

Aizaz