Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
95 views
in Technique[技术] by (71.8m points)

python - How to store an IP in mySQL

We've got a healthy debate going on in the office this week. We're creating a Db to store proxy information, for the most part we have the schema worked out except for how we should store IPs. One camp wants to use 4 smallints, one for each octet and the other wants to use a 1 big int,INET_ATON.

These tables are going to be huge so performance is key. I am in middle here as I normally use MS SQL and 4 small ints in my world. I don't have enough experience with this type of volume storing IPs.

We'll be using perl and python scripts to access the database to further normalize the data into several other tables for top talkers, interesting traffic etc.

I am sure there are some here in the community that have done something simular to what we are doing and I am interested in hearing about their experiences and which route is best, 1 big int, or 4 small ints for IP addresses.

EDIT - One of our concerns is space, this database is going to be huge like in 500,000,000 records a day. So we are trying to weigh the space issue along with the performance issue.

EDIT 2 Some of the conversation has turned over to the volume of data we are going to store...that's not my question. The question is which is the preferable way to store an IP address and why. Like I've said in my comments, we work for a large fortune 50 company. Our log files contain usage data from our users. This data in turn will be used within a security context to drive some metrics and to drive several security tools.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I would suggest looking at what type of queries you will be running to decide which format you adopt.

Only if you need to pull out or compare individual octets would you have to consider splitting them up into separate fields.

Otherwise, store it as a 4 byte integer. That also has the bonus of allowing you to use the MySQL built-in INET_ATON() and INET_NTOA() functions.

Performance vs. Space

Storage:

If you are only going to support IPv4 addresses then your datatype in MySQL can be an UNSIGNED INT which only uses 4 bytes of storage.

To store the individual octets you would only need to use UNSIGNED TINYINT datatypes, not SMALLINTS, which would use up 1 byte each of storage.

Both methods would use similar storage with perhaps slightly more for separate fields for some overhead.

More info:

Performance:

Using a single field will yield much better performance, it's a single comparison instead of 4. You mentioned that you will only run queries against the whole IP address, so there should be no need to keep the octets separate. Using the INET_* functions of MySQL will do the conversion between the text and integer representations once for the comparison.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

56.9k users

...