This is a table design issue. I have a table that stores IP addresses. The data in the table is queried very heavily. The IPs can have different flags such as "unblocked", "temporarily blocked" and "permanently blocked". 95% - 99% of the IP addresses do not have any type of block on them.

Is there a way to limit the number of rows in the table without excluding any of the data, while keeping all of the data in the same table?

A suggestion that was made to me was to use comma-delimited values in one of the fields (I presume for the unblocked IP addresses). I am not at all familiar with this technique, however.

A: 

I think you should have one table that holds the list of unique IP addresses, and a second table for the transactions on those IP addresses (such as "unblocked", "temporarily blocked" and "permanently blocked").

Colour Blend
A: 

Do you actually have a performance problem on this table?

The key here is indexing. Assuming you have an index on the IP column, on the flag column, or on both, you should be able to query the rows you need quickly regardless of how many rows there are.

If you are concerned about performance but need to keep the data, you could always have two tables -- one for IPs with flags, one for unflagged rows. You'd need a set of procs/triggers for inserting/deleting rows as flag states change.

Don't use the comma-separated approach. It makes querying for individual IPs much more laborious (not to mention that you still have to deal with changes in flags).
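
To make the indexing point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `ip_status`, its columns, and the sample rows are assumptions for illustration only; your actual database and schema will differ.

```python
import sqlite3

# Hypothetical schema: one row per IP, with the flag stored next to it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ip_status (ip TEXT PRIMARY KEY, flag TEXT NOT NULL)")
# The primary key already indexes lookups by IP; this adds an index for lookups by flag.
conn.execute("CREATE INDEX idx_ip_status_flag ON ip_status (flag)")

conn.executemany(
    "INSERT INTO ip_status (ip, flag) VALUES (?, ?)",
    [
        ("10.0.0.1", "unblocked"),
        ("10.0.0.2", "temporarily blocked"),
        ("10.0.0.3", "permanently blocked"),
        ("10.0.0.4", "unblocked"),
    ],
)


def flag_for(ip):
    """Return the flag stored for one IP, or None if the IP is not in the table."""
    row = conn.execute("SELECT flag FROM ip_status WHERE ip = ?", (ip,)).fetchone()
    return row[0] if row else None


def ips_with_flag(flag):
    """Return all IPs carrying the given flag, sorted for stable output."""
    rows = conn.execute(
        "SELECT ip FROM ip_status WHERE flag = ? ORDER BY ip", (flag,)
    )
    return [r[0] for r in rows]
```

Both lookups can use an index rather than scanning every row, which is why the row count alone should not be a problem.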

Joe
+2  A: 

Are the IP addresses string URLs, like http://www.Amazon.com, or are they in actual dotted-quad notation? If they are the latter, and if you are doing this to try to improve performance, then consider storing the 32-bit integer representation of the IP address instead of the dotted-quad string representation. (Are you using IPv4 or IPv6 addresses?)

The string representation xxx.xxx.xxx.xxx takes up to 15 bytes; a 32-bit integer takes only 4. Add a byte for your statusFlag, and you have a table whose rows are only 5 bytes wide. This should be performant enough to hold every possible IPv4 address (about 4 billion of them).
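
As a rough illustration of the conversion and of the 5-byte row, here is a sketch using Python's standard ipaddress and struct modules. The helper names and the numeric flag values are made up for the example; how your database stores the integer and the flag depends on its column types.

```python
import ipaddress
import struct

# Hypothetical one-byte encodings for the status flag.
UNBLOCKED = 0
TEMPORARILY_BLOCKED = 1
PERMANENTLY_BLOCKED = 2


def ip_to_int(dotted):
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))


def int_to_ip(value):
    """Convert a 32-bit integer back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


def pack_row(dotted, flag):
    """Pack an address and a flag as 4 bytes + 1 byte (network byte order)."""
    return struct.pack("!IB", ip_to_int(dotted), flag)


print(ip_to_int("192.168.1.1"))                             # 3232235777
print(int_to_ip(3232235777))                                # 192.168.1.1
print(len(pack_row("192.168.1.1", TEMPORARILY_BLOCKED)))    # 5
```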

Charles Bretana
A: 

What is the difference between an address that is not mentioned in your table and an unblocked address? If a large majority of addresses are unblocked, then maybe you should represent "unblocked" simply by the address being absent from the table.
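
A minimal sketch of that idea, assuming a hypothetical `blocked_ip` table in SQLite that only ever holds blocked addresses, so that a missing row means "unblocked":

```python
import ipaddress
import sqlite3

# Hypothetical table: only blocked addresses get a row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocked_ip (ip INTEGER PRIMARY KEY, flag TEXT NOT NULL)")


def _key(dotted):
    """Store addresses as integers; the dotted-quad form is only used at the edges."""
    return int(ipaddress.IPv4Address(dotted))


def block(dotted, flag):
    """Insert or update the block flag for an address."""
    conn.execute(
        "INSERT OR REPLACE INTO blocked_ip (ip, flag) VALUES (?, ?)",
        (_key(dotted), flag),
    )


def unblock(dotted):
    """Unblocking is just deleting the row."""
    conn.execute("DELETE FROM blocked_ip WHERE ip = ?", (_key(dotted),))


def status(dotted):
    """Absent from the table means unblocked."""
    row = conn.execute(
        "SELECT flag FROM blocked_ip WHERE ip = ?", (_key(dotted),)
    ).fetchone()
    return row[0] if row else "unblocked"
```

With 95% - 99% of addresses unblocked, such a table would hold only the remaining few percent of rows. Whether that is acceptable depends on whether you need to distinguish "seen but unblocked" from "never seen".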

Otherwise, if you are storing IPv4 addresses (not domain names), take Charles Bretana's suggestion and store the addresses as raw integers. If you do, you can also add another 32-bit integer for a netmask, so you can store entire ranges (e.g. to block every address in 10.0.0.0 - 10.255.255.255, you store address 10.0.0.0 as one integer and netmask 255.0.0.0 as another). This can greatly reduce the number of rows (depending on your blocking behaviour), but it also makes efficient querying for a specific address a bit more complicated.

The same basic techniques can be applied to IPv6 addresses as well, except that they are longer.
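
The address/netmask check itself is a single bitwise AND. Below is a small sketch in Python, with a made-up in-memory list standing in for the table rows. It scans every range linearly, which shows the matching rule but is not the efficient query mentioned above.

```python
import ipaddress


def to_int(dotted):
    """Dotted-quad IPv4 string to 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


# Hypothetical rows: (network address, netmask), both stored as integers.
BLOCKED_RANGES = [
    (to_int("10.0.0.0"), to_int("255.0.0.0")),            # 10.0.0.0 - 10.255.255.255
    (to_int("192.168.1.5"), to_int("255.255.255.255")),   # a single host
]


def is_blocked(dotted):
    """An address matches a row when masking it yields the row's network address."""
    addr = to_int(dotted)
    return any((addr & mask) == network for network, mask in BLOCKED_RANGES)


print(is_blocked("10.1.2.3"))      # True
print(is_blocked("11.0.0.0"))      # False
print(is_blocked("192.168.1.5"))   # True
print(is_blocked("192.168.1.6"))   # False
```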

Rasmus Kaj