Hi,

How can I extract the unique numbers from a long list of numbers (one per line)? I need a regular expression to do that.

I am using a log program called baretailpro, which can search through log data and display the lines that match a regex.

Example Input:

123
321
123
432
343
343
432
811
932
432
...list extends up to n numbers, one per line

Example Output:

123
321
432
343
811
932
...list extends up to n numbers, one per line

Rgds,
Anita

A: 

Normally, you don't use a regex to find unique items. You use tools like uniq (on *nix/Windows; it only collapses adjacent duplicates, so it needs sorted input) or sort -u to find unique strings. Or you can use a programming language with array/dictionary/hash support, e.g. Perl or Python. If it's possible for the baretail log analyser to output its data to a plain text/CSV file, you can then use these tools to get your unique strings.

Example, assuming you exported your data to a text file:

$ cat file
123
321
123
432
343
343
432
811
932
432
$ sort -u file
123
321
343
432
811
932
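
Note that sort -u returns the numbers in sorted order, while the example output in the question keeps the order of first appearance. If that order matters, a dictionary-based approach works. Below is a minimal Python sketch, assuming the exported lines are already in a list; the name unique_in_order is only illustrative.

```python
def unique_in_order(lines):
    """Return the lines with duplicates dropped, keeping first-seen order."""
    # A dict preserves insertion order (Python 3.7+), so its keys are the
    # distinct values in the order they first appeared.
    return list(dict.fromkeys(lines))


numbers = ["123", "321", "123", "432", "343", "343", "432", "811", "932", "432"]
print("\n".join(unique_in_order(numbers)))
```

For the sample data this prints 123, 321, 432, 343, 811, 932, one per line. To use it on an exported file, the list could come from something like open("file").read().splitlines().
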
ghostdog74
A: 

If it really has to be a regex, and if baretailpro supports lookahead regexes, then you can search for

^(\d+)$\r?\n?(?=.*^\1$)

and replace all with nothing (empty string). This assumes that ^ and $ match at line breaks (multiline mode). You also need to set the option for the dot to match all characters including newlines; if you don't have that option, use ^(\d+)$\r?\n?(?=[\s\S]*^\1$) instead.
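
To see what the pattern does on the sample data, here is a small Python sketch using the standard re module. It is only an illustration outside baretailpro: it assumes \n line endings and uses the [\s\S] variant, with re.MULTILINE so that ^ and $ match at every line boundary. The names DUPLICATE_LINE, text and result are made up for the example.

```python
import re

# The lookahead pattern from the answer, in its [\s\S] form.
# re.MULTILINE makes ^ and $ match at the start and end of every line.
DUPLICATE_LINE = re.compile(r"^(\d+)$\r?\n?(?=[\s\S]*^\1$)", re.MULTILINE)

text = "123\n321\n123\n432\n343\n343\n432\n811\n932\n432\n"

# Delete every line that has an identical line somewhere after it.
result = DUPLICATE_LINE.sub("", text)
print(result)
```

On this input the remaining lines are 321, 123, 343, 811, 932, 432.
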

The problem with this is that this regex will remove all the occurrences of duplicate numbers except the last one. If you want to keep the first one, your regex engine must support lookbehind and also infinite repetition inside lookbehind. Hardly any regex engine does besides .NET and JGSoft. But if you can use it, then this one will be better:

^(\d+)$\r?\n?(?<=^\1$.*^\1$\r?\n?)
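
If the engine has no infinite lookbehind, one possible workaround (not part of the original suggestion) is to reverse the line order, apply the lookahead pattern, and reverse again: keeping the last occurrence in the reversed text is the same as keeping the first occurrence in the original. A rough Python sketch, assuming \n line endings; keep_first and DUP_BEFORE_LAST are hypothetical names.

```python
import re

# Lookahead-only pattern: matches a line that also occurs again further down.
DUP_BEFORE_LAST = re.compile(r"^(\d+)$\n?(?=[\s\S]*^\1$)", re.MULTILINE)


def keep_first(text):
    """Drop duplicate lines, keeping the first occurrence of each number."""
    # Reverse the lines so that "first occurrence" becomes "last occurrence".
    reversed_text = "\n".join(reversed(text.splitlines()))
    # Remove every line that is repeated later in the reversed text.
    deduped = DUP_BEFORE_LAST.sub("", reversed_text)
    # Restore the original order.
    return "\n".join(reversed(deduped.splitlines()))


sample = "123\n321\n123\n432\n343\n343\n432\n811\n932\n432"
print(keep_first(sample))
```

For the sample list this gives 123, 321, 432, 343, 811, 932, which matches the example output in the question.
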
Tim Pietzcker
A: 

Hi, I am on Windows and don't have Excel or any database software, so I did it like this:

  1. Installed the UnxUtils port for Win32 (3.6MB, or download and use only the tools you want).
  2. Copied the file containing the numbers (one per line) into the folder that contains the utilities, i.e. wbin.
  3. Followed the instructions at www.ibm.com/developerworks/linux/library/l-tiptex6.html, or the ones given by ghostdog74 above. NB: on Windows I redirected the output of the commands to a text file, e.g. > result.txt
  4. And it's all done.

The file is sorted.

rgds, Sunita [anita's jumbo colleague]

Sunita