Hi!

I have a large text file with 4-digit codes and some information about them in every row. It looks something like this:

3456 information
1234 info
2222 Some other info

I need to sort this file so that the codes are in ascending order. Also, some codes appear more than once, so I need to remove the duplicates. Can I do this with Perl, awk, or some other scripting language?

Thanks in advance,

-skazhy

+3  A: 
sort happybirthday.txt | uniq

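Source: an IBM article, the first Google result for "unix remove duplicate lines".

Note that this only removes rows that are identical in full; sort -u happybirthday.txt does the same in a single command.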

mcandre
Wow, thanks! The answer to my question was so simple :)
skazhy
A: 

You can create a hash, then read the file line by line, and for each line:

  • split at the first space
  • check whether val(0), the code you just split off, is already in the hash
  • if not, insert val(1), the rest of the line, into the hash under the key val(0)
  • continue with the next line

Then print the hash, sorted by key, to the file.
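
The steps above could be sketched in Python roughly as follows. This is only one way to do it: the function name sort_unique_by_code is made up for illustration, the extra "1234 duplicate entry" row is invented sample data, and the sketch assumes that the first row seen for each code is the one to keep.

```python
def sort_unique_by_code(lines):
    """Keep the first row seen for each code, then return the rows sorted by code."""
    first_seen = {}  # code -> rest of the line
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split at the first space: code on the left, the info on the right.
        code, _, info = line.partition(" ")
        if code not in first_seen:
            first_seen[code] = info
    # All codes have 4 digits, so sorting them as strings matches numeric order.
    return [f"{code} {info}".rstrip() for code, info in sorted(first_seen.items())]


# Hypothetical sample data: the question's rows plus one invented duplicate code.
sample = [
    "3456 information",
    "1234 info",
    "2222 Some other info",
    "1234 duplicate entry",
]
for row in sort_unique_by_code(sample):
    print(row)
```

With a real file you would pass the open file object instead of the sample list, since iterating over a file yields its lines one at a time. Unlike sort | uniq, this keeps only one row per code even when the info text differs.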

Kyra