What is the most efficient way to store and retrieve IP addresses in MySQL? Right now I'm doing:

SELECT * FROM logins WHERE ip = '1.2.3.4'

Where ip is a VARCHAR(15) field.

Is there a better way to do this? I'm using Python.

+2  A: 

The most important thing is to make sure that column is indexed. This could make a huge difference to queries based on IP address.
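A minimal sketch of what the index changes, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server (that substitution, the `user` column, and the index name `idx_logins_ip` are assumptions for illustration). In MySQL the corresponding statement would be `CREATE INDEX idx_logins_ip ON logins (ip)`, and `EXPLAIN` plays the role of `EXPLAIN QUERY PLAN`.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logins (id INTEGER PRIMARY KEY, user TEXT, ip VARCHAR(15))"
)
conn.executemany(
    "INSERT INTO logins (user, ip) VALUES (?, ?)",
    [("user%d" % i, "10.0.%d.%d" % (i // 256, i % 256)) for i in range(1000)],
)


def query_plan(connection):
    """Return the plan SQLite chooses for the lookup by IP address."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM logins WHERE ip = ?", ("10.0.1.2",)
    ).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " ".join(row[3] for row in rows)


plan_before = query_plan(conn)  # without an index: a scan of the whole table
conn.execute("CREATE INDEX idx_logins_ip ON logins (ip)")
plan_after = query_plan(conn)   # with the index: a search through idx_logins_ip

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, and MySQL's `EXPLAIN` output looks different again, so treat this only as a way to see the scan turn into an index search.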

Mark Byers
A: 

Why do you need it to be faster? ... "Premature optimization"

drachenstern
If you have a database storing millions of records with IP addresses, changing a VARCHAR(15) field to an INT field will have a significant impact on performance. I'm not sure if that's what ensnare is running into, but there are cases where this would be a legitimate concern.
Spencer Ruport
Yes, that is my concern.
ensnare
It's tagged beginner, so I doubt millions of records. In any case, profiling and benchmarking are the only way to sanely make a recommendation for that scenario.
DGM
@drachenstern: -1 for a useless response. Someone asks how to do something better and your answer is "don't" ? Even if this is just a hobby project, there's nothing wrong with trying to improve yourself and use the best possible methods.
nickf
No, it's not an invalid response. It never hurts to make sure the WHY of the question is as valid as the HOW. And besides, I commented favorably on some other answers in this thread. I would say that he has been given at least three ways to do this besides the default, and my answer is just counseling patience. I doubt that the SQL engine needs to be optimized overly much for comparing a field of this size, and considering very few people are going to store all 4.3 billion addresses, I seriously contend that the question should be redefined. For ~1 million records, SQL knows what to do.
drachenstern
But yes, indexing is the first thing I would offer, assuming profiling shows that it's needed/beneficial. Next would be conversion from string, but that eats computation time somewhere.
drachenstern
@nickf, you have equated "more efficient" with "better"; they do not always converge. Things like correctness, clarity, maintainability, testability, readability, modularity, and documentation are easily as important as performance. What's more, performance can be rather odd, with bottlenecks occurring at surprising places and unexpected solutions outperforming ones you would predict to have the best performance. The vast majority of performance-oriented problems I've seen on SO were not based on any evidence that anyone was optimizing the right code or that the optimized code helped.
Mike Graham
Indeed, I suspect that with common usage, replacing this VARCHAR with an INT UNSIGNED would save a little storage space but take a little longer, since you need to convert all the data. This may or may not matter. Performance probably isn't the best way to determine what the *best* way to store an IP in a database is.
Mike Graham
@Mike: I see this as analogous to asking "Which is faster, i++ or ++i?" - whichever method you use, it's not going to have any *actual* impact in 99.9999999% of cases, but it doesn't hurt to know the answer or to implement the better solution, however marginal the gain *(on the proviso that it is equally maintainable/legible etc)*. This answer should have been a comment, IMO.
nickf
@nickf, `++i` has a reputation of being faster, but when they are used such that they mean the same thing, a good C compiler should render them *exactly the same*. Herein lies the problem with the way many people do optimization: they base it on intuition and myth and rumour rather than concrete evidence.
Mike Graham
A: 

maybe store the integer value directly in an integer field? An IP address is basically 4 "shorts".

Check it out: http://en.kioskea.net/faq/945-converting-a-32-bit-integer-into-ip

Brian Dilley
So he would store it in four fields? And they would be bytes, no? 2^8 every time. Alternatively, he could convert them to decimal numbers and store the decimal; that would be a larger field, but it would be a single field.
drachenstern
A more appropriate vernacular would be to say that an IP address is 4 bytes.
Spencer Ruport
+10  A: 

For IPv4 addresses, you may want to store them as an INT UNSIGNED and use the INET_ATON() function to convert the dotted-quad string to its numeric value, and INET_NTOA() to convert the numeric value back to the string.

Example:

SELECT INET_ATON('127.0.0.1');

+------------------------+
| INET_ATON('127.0.0.1') |
+------------------------+
|             2130706433 | 
+------------------------+
1 row in set (0.00 sec)


SELECT INET_NTOA('2130706433');

+-------------------------+
| INET_NTOA('2130706433') |
+-------------------------+
| 127.0.0.1               | 
+-------------------------+
1 row in set (0.02 sec)
Daniel Vassallo
You wouldn't store it as `INT` but as `INT UNSIGNED`. I like the use of purpose-built functions, though. +1
pstanton
@pstanton: You're right. Fixed my answer.
Daniel Vassallo
Thanks. Do you know if there is a python function for this?
ensnare
@ensnare: Yes, check this: http://snipplr.com/view/14807/convert-ip-to-int-and-int-to-ip/.
Daniel Vassallo
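For reference, a minimal sketch of that conversion in plain Python. The function names `ip_to_int` and `int_to_ip` are hypothetical, and the sketch only handles full four-part addresses, which is stricter than what INET_ATON() accepts.

```python
def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to an integer, like INET_ATON()."""
    parts = [int(p) for p in ip.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("not a valid IPv4 address: %r" % (ip,))
    a, b, c, d = parts
    # Each octet occupies one byte of the 32-bit value, most significant first.
    return (a << 24) | (b << 16) | (c << 8) | d


def int_to_ip(n):
    """Convert a 32-bit integer back to a dotted-quad string, like INET_NTOA()."""
    if not 0 <= n <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit unsigned value: %r" % (n,))
    return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(ip_to_int("127.0.0.1"))   # 2130706433
print(int_to_ip(2130706433))    # 127.0.0.1
```

On Python 3.3 and later, the standard library's `ipaddress` module should give the same results via `int(ipaddress.IPv4Address("127.0.0.1"))` and `str(ipaddress.IPv4Address(2130706433))`.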
That was very helpful. Thank you.
ensnare
+5  A: 

If you only want to store IPv4 addresses, then you can store them in a 32-bit integer field.

If you want to support IPv6 as well, then a string is probably the easiest format to read and use. You could technically store the addresses in a 16-byte VARBINARY() field, but it would be annoying to generate SQL statements that select by IP address "by hand".
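A minimal sketch of the 16-byte option, assuming Python 3 and its standard `ipaddress` module. The helper names are hypothetical, and storing IPv4 addresses in IPv4-mapped form (`::ffff:a.b.c.d`) is one possible convention chosen here for illustration, not something the answer prescribes.

```python
import ipaddress


def ip_to_bytes16(ip):
    """Return a 16-byte value for an IPv4 or IPv6 address string."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        # Assumption: represent IPv4 as an IPv4-mapped IPv6 address so that
        # every stored value has the same 16-byte length.
        addr = ipaddress.IPv6Address("::ffff:" + ip)
    return addr.packed


def bytes16_to_ip(raw):
    """Turn a stored 16-byte value back into its text form."""
    addr = ipaddress.IPv6Address(raw)
    mapped = addr.ipv4_mapped  # None unless the value is ::ffff:a.b.c.d
    return str(mapped) if mapped is not None else str(addr)


raw_v4 = ip_to_bytes16("1.2.3.4")
raw_v6 = ip_to_bytes16("2001:db8::1")
print(raw_v4.hex())
print(bytes16_to_ip(raw_v4), bytes16_to_ip(raw_v6))
```

The hex form printed above is what makes hand-written queries awkward: the value has to be typed as a hex literal rather than as a readable address.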

Dean Harding
+1: the only answer that predicts usage of IPv6
Juliano
A: 

Whatever is easiest for you to work with. Size or speed is not an issue until profiling shows that it is. In some cases, a string might be easier to work with if you need to do partial matching. But as a space or performance issue, don't worry about it unless you have real cause to worry about it.
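To make the partial-matching trade-off concrete: with a string column, a prefix match looks like `ip LIKE '192.168.1.%'`, while with an integer column the same subnet becomes a numeric range. A minimal sketch of computing that range, assuming Python 3's `ipaddress` module, an INT UNSIGNED column named `ip`, and the `%s` placeholder style used by drivers such as MySQLdb (the helper name is hypothetical):

```python
import ipaddress


def network_int_range(cidr):
    """Return the (first, last) integer addresses of an IPv4 network."""
    net = ipaddress.IPv4Network(cidr)
    return int(net.network_address), int(net.broadcast_address)


first, last = network_int_range("192.168.1.0/24")

# Hypothetical query against an integer ip column; the values are passed as
# parameters rather than formatted into the SQL string.
query = "SELECT * FROM logins WHERE ip BETWEEN %s AND %s"
params = (first, last)
print(query, params)
```

Which form is easier depends on the queries you actually write: prefixes that fall on octet boundaries are simple with strings, while arbitrary CIDR blocks are simpler as integer ranges.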

DGM