views: 234
answers: 7

I'm trying to use Vim to remove a duplicate line in an XML file I created. (I can't recreate the file because the ID numbers will change.)

The file looks something like this:

    <tag k="natural" v="water"/>
    <tag k="nhd:fcode" v="39004"/>
    <tag k="natural" v="water"/>

I'm trying to remove one of the duplicate `k="natural" v="water"` lines. When I try to use the `\_` modifier to include newlines in my regex replacements, Vim doesn't seem to find anything.

Any tips on what regex or tool to use?

A: 

Are you trying to search and replace the line with nothing? You could try the g command instead:

:%g/search_expression_here/d

The d at the end tells it to delete the lines that match.

You may find more tips here.

Jeremy Stein
Thanks for this tip, but I'm also trying to find a RegEx that matches the three lines above so I can find the lines to remove.
magneticMonster
Will it always be 3 lines? If the first and last lines of the file are the same, you wouldn't want a regex that matches the entire file.
adi92
+1  A: 

Instead of using Vim, you could do something like:

sort filename | uniq -c | grep -v "^[ \t]*1[ \t]"

to figure out which line is duplicated, and then just use a normal search in Vim to visit it and delete it.
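For comparison, the counting step of that pipeline can be sketched in Python with `collections.Counter` (a rough equivalent, not a replacement for the one-liner); the sample lines are the three from the question:

```python
from collections import Counter

# The three lines from the question's XML file.
lines = [
    '<tag k="natural" v="water"/>',
    '<tag k="nhd:fcode" v="39004"/>',
    '<tag k="natural" v="water"/>',
]

# Counter plays the role of `sort | uniq -c`; the filter plays the
# role of `grep -v` dropping lines whose count is exactly 1.
counts = Counter(lines)
duplicates = [line for line, n in counts.items() if n > 1]
print(duplicates)  # → ['<tag k="natural" v="water"/>']
```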

adi92
No need for cat; pass the filename to sort.
As originally written, the grep only worked on lines repeated twice, 20-29, 200-299, ... times. It would probably be better to use `grep -v '^[ \t]*1[ \t]'`, where I've written `\t` to indicate that a tab should appear in the command line.
Jonathan Leffler
Great idea; I've added that in.
adi92
+1  A: 

This link addresses the problem.

Kimvais
A: 

Answers using `uniq` suffer from the problem that `uniq` only finds adjacent duplicate lines; otherwise the data file has to be sorted first, which loses the positional information.

If no line may ever be repeated, then it is relatively simple to do in Perl (or another scripting language with regex and associative-array support), assuming that the data source is not incredibly humongous:

#!/usr/bin/perl -w
# BEWARE: untested code!
use strict;
my(%lines);
while (<>)
{
    print if !defined $lines{$_};
    $lines{$_} = 1;
}

However, if it is used indiscriminately, this is likely to break the XML since end tags are legitimately repeated. How to avoid this? Maybe by a whitelist of 'OK to repeat' lines? Or maybe only lines with open tags with values are subject to duplicate elimination:

#!/usr/bin/perl -w
# BEWARE: untested code!
use strict;
my(%lines);
while (<>)
{
    if (m%^\s*<[^\s>]+\s[^\s>]+%)
    {
         print if !defined $lines{$_};
         $lines{$_} = 1;
    }
    else
    {
         print;
    }
}

Of course, there is also the (largely valid) argument that processing XML with regular expressions is misguided. This coding assumes the XML comes with lots of convenient line breaks; real XML may not contain any, or only a very few.
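For what it's worth, the second script's idea translates to Python along these lines (a sketch in the same untested spirit; the regex is copied from the Perl above): only lines that look like an opening tag carrying attributes are candidates for de-duplication, so legitimately repeated close tags pass through untouched.

```python
import re

# Lines that look like an open tag with at least one attribute,
# e.g. <tag k="natural" v="water"/>, are subject to dedup.
OPEN_TAG = re.compile(r'^\s*<[^\s>]+\s[^\s>]+')

def remove_duplicate_tags(lines):
    seen = set()
    out = []
    for line in lines:
        if OPEN_TAG.match(line):
            if line in seen:
                continue  # drop a repeated attribute-bearing tag
            seen.add(line)
        out.append(line)  # close tags etc. always pass through
    return out
```

On the question's data this keeps the first `k="natural" v="water"` line and drops the second, while a repeated end tag such as `</way>` survives both times.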

Jonathan Leffler
+1  A: 

You could select the lines and then do a `:'<,'>sort u` if you don't care about the ordering. It will sort them and remove duplicates.
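Outside Vim, the same sort-and-dedupe (original order lost, like `:sort u`) is a one-liner in Python; the sample lines are the three from the question:

```python
lines = [
    '<tag k="natural" v="water"/>',
    '<tag k="nhd:fcode" v="39004"/>',
    '<tag k="natural" v="water"/>',
]

# Equivalent of :sort u — sort and drop duplicates.
deduped = sorted(set(lines))
print(deduped)
```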

Pierre-Antoine LaFayette
+1  A: 

To the OP: if you have bash 4.0, you can use an associative array:

#!/bin/bash
# Requires bash 4.0+ for associative arrays.
declare -A DUP
file="myfile.txt"
: > temp    # start with an empty output file
while IFS= read -r line
do
    if [ -z "${DUP[$line]}" ]; then
        DUP[$line]=1
        echo "$line" >> temp
    fi
done < "$file"
mv temp "$file"
+1  A: 

With Python, to remove all repeated lines:

#!/usr/bin/env python

import sys

def remove_identical(filein, fileout):
    lines = []
    for line in open(filein, 'r'):
        if line not in lines:
            lines.append(line)
    fout = open(fileout, 'w')
    fout.write(''.join(lines))
    fout.close()

remove_identical(sys.argv[1], sys.argv[2])
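A quick way to sanity-check that approach without touching a real file: the same function, exercised inline on a temporary copy of the question's data (the file names here are made up):

```python
import os
import tempfile

def remove_identical(filein, fileout):
    # Same list-membership dedupe as the script above.
    lines = []
    for line in open(filein, 'r'):
        if line not in lines:
            lines.append(line)
    with open(fileout, 'w') as fout:
        fout.write(''.join(lines))

data = ('<tag k="natural" v="water"/>\n'
        '<tag k="nhd:fcode" v="39004"/>\n'
        '<tag k="natural" v="water"/>\n')

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'in.xml')
    dst = os.path.join(tmp, 'out.xml')
    with open(src, 'w') as f:
        f.write(data)
    remove_identical(src, dst)
    result = open(dst).read()

print(result)  # only the first two lines survive
```

Note that, like the uniq-based answers, this removes every repeated line indiscriminately, so on a file with legitimately repeated close tags it would break the XML.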
skeept