Hi, I have a copy of a log file that I want to make easier to view and edit.
I use TextPad to remove the parts I do not want. It lets me enter a regular expression as the search term and use \1.\2.\3.\4 in the target field for captured groups.
I would like to change all IP addresses which start each line from

[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}

to

[0-9]{3}\.[0-9]{3}\.[0-9]{3}\.[0-9]{3}

with padded leading zeros. How can I do that in one go?

Example input:

10.2.123.4
110.12.23.40
123.123.123.123
1.2.3.4

Example output:

010.002.123.004
110.012.023.040
123.123.123.123
001.002.003.004

See my own answer for what works

Thanks for your input

+1  A: 

Split on ".", pad, join. No regex needed; a regex would not even provide any benefit.

JavaScript, for example:

var ip = "110.12.23.40";

ip = ip.split(".").map( function(i) {
  return ("00"+i).slice(-3);
}).join(".");

alert(ip);  // 110.012.023.040
Tomalak
Erm - where in my post or tags does it say I can run jQuery? Or is your post a TextPad snippet I do not know about? I am in an editor with 1.5 million lines and need to run a regex. I could do this in JS just fine if I could run JS in my editor.
mplungjan
@mplungjan: Erm - where in my answer do I say that this code would be jQuery? It says JavaScript, and that's what it is. There are methods to manipulate text files outside of an editor, and I think this is one case where it would be beneficial to do so.
Tomalak
Thanks for your comment and sorry for the erm - I am just a little tired of what I see as unrelated answers to my questions at SO so far. I need a regex in my editor. I use JS daily. I can also write a Java or REXX program that reads through the file and pads the IP addresses, but that is not what I want or need right now. PS: Your "map" is not average JS, but JS 1.6. Very clever and useful in other contexts.
mplungjan
@mplungjan: My JavaScript was only to illustrate a point. I could have used another language. You cannot easily use regex for this since a) regex *finds*, it does not replace; b) finding the correct spots to insert zeros can be difficult in the context of a huge log file; and c) this will always be a multi-step operation: doing it correctly would require 8 search-and-replace steps (4 bytes separately, "0" and "00" prefixes separately). What I'm saying is this: Use a programming language to process your text file. Regex-find IPs, replace using something equivalent to my code.
Tomalak
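A minimal sketch of what "regex-find IPs, replace using something equivalent to my code" could look like, applied line by line. The helper name `padLeadingIp` and the sample log lines are hypothetical, and the trailing `\b` is an added assumption so that a fourth octet with more than three digits is not matched in part.

```javascript
// Hypothetical helper: pads the four octets of an IPv4-like address that
// starts a line to three digits each. Lines without a leading address are
// returned unchanged.
function padLeadingIp(line) {
  var leadingIp = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b/;
  return line.replace(leadingIp, function (match, a, b, c, d) {
    // Same split/pad/join idea as in the answer, applied to the captures.
    return [a, b, c, d].map(function (octet) {
      return ("00" + octet).slice(-3);
    }).join(".");
  });
}

// Made-up sample lines; the real log format is not shown in the question.
var sampleLines = [
  "10.2.123.4 GET /index.html",
  "110.12.23.40 GET /favicon.ico",
  "no address at the start of this line 1.2.3.4"
];

var paddedLines = sampleLines.map(padLeadingIp);
console.log(paddedLines.join("\n"));
```

Because the pattern is anchored with `^`, addresses, dates, or times later in the line are left alone.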
I see what you are saying, but as I mentioned, I can replace captured strings. If the captured input is [0-9]{1} my replace is 00\1, if [0-9]{2} it is 0\1, and if [0-9]{3} it is \1. Since I can do it one at a time, I was sure someone could come up with a compound regex like this single replace: `([0-9]{1})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})` -> `00\1.\2.\3.\4`. So I guess I can do it in 4 regexes now.
mplungjan
more than 4 actually
mplungjan
@mplungjan: Yes, eight, as I said. You need to capture and replace these cases separately: `"x.*.*.*"`, `"xx.*.*.*"`, `"*.x.*.*"`, `"*.xx.*.*"`, `"*.*.x.*"`, `"*.*.xx.*"`, `"*.*.*.x"`, `"*.*.*.xx"`. Not replacing complete IP-address-like structures will leave you with leading zeros in dates, times and other data. It's not impossible to do in a text editor, but a lot more hassle than it's worth. (And as far as efficiency goes: This way you would need to go over the entire file eight times - but with a small helper program only once.)
Tomalak
It was actually not a huge thing: 8 times Enter, and I had the ^ making sure it was only the first part of the line. It took less time than I spent on this question ;) Also, I had to sort and remove crap requests, something TextPad is very good at helping your eyes with and programming languages are not... Thanks for your input - at least I learned about map().
mplungjan
+1  A: 

OK, I decided to do it in more than one go. I am posting it here for future reference, or in case someone comes up with a single regex.

Note that there is a trailing space on each find and each replace.

^([0-9]{1})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3}) -> 00\1.\2.\3.\4 
^([0-9]{2})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3}) -> 0\1.\2.\3.\4 

^([0-9]{3})\.([0-9]{1})\.([0-9]{1,3})\.([0-9]{1,3}) -> \1.00\2.\3.\4 
^([0-9]{3})\.([0-9]{2})\.([0-9]{1,3})\.([0-9]{1,3}) -> \1.0\2.\3.\4 

^([0-9]{3})\.([0-9]{3})\.([0-9]{1})\.([0-9]{1,3}) -> \1.\2.00\3.\4 
^([0-9]{3})\.([0-9]{3})\.([0-9]{2})\.([0-9]{1,3}) -> \1.\2.0\3.\4 

^([0-9]{3})\.([0-9]{3})\.([0-9]{3})\.([0-9]{1}) -> \1.\2.\3.00\4 
^([0-9]{3})\.([0-9]{3})\.([0-9]{3})\.([0-9]{2}) -> \1.\2.\3.0\4 

TextPad syntax:

^\([0-9]\{1\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\) -> 00\1.\2.\3.\4 
^\([0-9]\{2\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\) -> 0\1.\2.\3.\4 

^\([0-9]\{3\}\)\.\([0-9]\{1\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\) -> \1.00\2.\3.\4 
^\([0-9]\{3\}\)\.\([0-9]\{2\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\) -> \1.0\2.\3.\4 

^\([0-9]\{3\}\)\.\([0-9]\{3\}\)\.\([0-9]\{1\}\)\.\([0-9]\{1,3\}\) -> \1.\2.00\3.\4 
^\([0-9]\{3\}\)\.\([0-9]\{3\}\)\.\([0-9]\{2\}\)\.\([0-9]\{1,3\}\) -> \1.\2.0\3.\4 

^\([0-9]\{3\}\)\.\([0-9]\{3\}\)\.\([0-9]\{3\}\)\.\([0-9]\{1\}\) -> \1.\2.\3.00\4 
^\([0-9]\{3\}\)\.\([0-9]\{3\}\)\.\([0-9]\{3\}\)\.\([0-9]\{2\}\) -> \1.\2.\3.0\4 
mplungjan
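To sanity-check the eight passes outside the editor, they can be replayed in JavaScript against the example input from the question. This is only a sketch: `buildPasses` and `padInEightPasses` are hypothetical names, `\1` becomes `$1` in JavaScript replacement strings, and the trailing space of each find is swapped for `\b` (an assumption) so that bare addresses match too.

```javascript
// Build the eight find/replace pairs. For octet position `pos` and a short
// width of 1 or 2 digits, all earlier octets are already three digits wide.
function buildPasses() {
  var passes = [];
  for (var pos = 0; pos < 4; pos++) {
    for (var width = 1; width <= 2; width++) {
      var groups = [];
      var targets = [];
      for (var i = 0; i < 4; i++) {
        if (i < pos) groups.push("([0-9]{3})");
        else if (i === pos) groups.push("([0-9]{" + width + "})");
        else groups.push("([0-9]{1,3})");
        // "00" for a one-digit octet, "0" for a two-digit octet.
        var pad = i === pos ? "00".slice(width - 1) : "";
        targets.push(pad + "$" + (i + 1));
      }
      passes.push({
        // "m" makes ^ match at the start of every line, "g" replaces all lines.
        find: new RegExp("^" + groups.join("\\.") + "\\b", "gm"),
        replace: targets.join(".")
      });
    }
  }
  return passes;
}

// Run the passes in order, like pressing "Replace All" eight times.
function padInEightPasses(text) {
  return buildPasses().reduce(function (acc, pass) {
    return acc.replace(pass.find, pass.replace);
  }, text);
}

var input = ["10.2.123.4", "110.12.23.40", "123.123.123.123", "1.2.3.4"].join("\n");
console.log(padInEightPasses(input));
```

The order matters: each pass for a later octet relies on the earlier octets already being three digits wide.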
Why are you escaping all parentheses and curly braces? Is this a TextPad thing?
Kobi
"in case someone comes up with a single regex" -> Nobody will, because this is impossible. And yes, I use that word rarely and if at all, then without exaggeration.
Tomalak
@Kobi: Agreed, the backslashes should be removed. They have nothing to do with the shown regexes.
Tomalak
Yes, it is a TextPad thing.
mplungjan
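As a side note that is not from either answer: a single plain find/replace still looks out of reach, but the number of passes can probably be cut to two by first over-padding every octet with "00" and then trimming each octet back to its last three digits. The sketch below shows the idea in JavaScript; `padInTwoPasses` is a hypothetical name, and whether TextPad's regex dialect accepts these exact patterns has not been verified here.

```javascript
// Two-pass alternative (an assumption, not the method used in the thread):
// pass 1 prefixes every octet of a leading address with "00",
// pass 2 drops the surplus zeros so that exactly three digits remain.
function padInTwoPasses(text) {
  var overPadded = text.replace(
    /^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\b/gm,
    "00$1.00$2.00$3.00$4"
  );
  return overPadded.replace(
    /^0*([0-9]{3})\.0*([0-9]{3})\.0*([0-9]{3})\.0*([0-9]{3})\b/gm,
    "$1.$2.$3.$4"
  );
}

var exampleInput = ["10.2.123.4", "110.12.23.40", "123.123.123.123", "1.2.3.4"].join("\n");
console.log(padInTwoPasses(exampleInput));
```

In the second pass, `0*` backtracks until exactly three digits are left before the dot, so "0010" becomes "010" and "002" stays "002".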