Scenario:

  • I have a text file that has pipe (as in the "|" character) delimited data.
  • Each field of data in the pipe delimited fields can be of variable length, so counting characters won't work (or using some sort of substring function... if that even exists in VIM).

Is it possible, using VIM / Vi to delete all data from the second pipe to the end of the line for the entire file? There are approx 150,000 lines, so doing this manually would only be appealing to a masochist...

e.g.

Change the following lines from:

1111|random sized text 12345|more random data la la la|1111|abcde

2222|random sized text abcdefghijk|la la la la|2222|defgh

3333|random sized text|more random data|33333|ijklmnop

to:

1111|random sized text 12345

2222|random sized text abcdefghijk

3333|random sized text

I'm sure this can be done somehow... I hope.

TIA

UPDATE: I should have mentioned that I'm running this on Windows XP, so I don't have access to some of the mentioned *nix commands (CUT is not recognized on Windows).

+20  A: 
:%s/^\v([^|]+\|[^|]+)\|.*$/\1/
Brian Carper
What does \v do?
Paul Tomblin
Turns on "very magic" mode for that regex. It lets you avoid backslash-escaping parens and pluses and whatnot. See :h /\v
Brian Carper
+1 for \v. I've used vim for so long without knowing
PEZ
Worked like a charm thanks.
Jason Down
A: 

I've found that vim isn't great at handling very large files. I'm not sure how large your file is. Maybe cat and sed together would work better.

sjbotha
What do you need cat for? sed can read files on its own!
André
Maybe you're after one of those useless use of cat awards? http://partmaps.org/era/unix/award.html =)
PEZ
Anyway, 150000 lines of that type should be OK with Vim.
PEZ
I've had 100MB+ files (generated) and Vim is the only editor I've seen that will handle them... I don't know why you would be having problems with it!
rmeador
Worked fine with this file =)
Jason Down
And I have edited 20 GB log files with Vim regexes. Yes, it was slow, but it did the job.
Zsolt Botykai
A: 

Here is a sed solution:

sed -e 's/^\([^|]*|[^|]*\).*$/\1/'
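As written, the command reads standard input. A minimal sketch of running it against a file and writing the result elsewhere; the function name keep_two_fields and the file names are placeholders, not part of the original answer, and on Windows XP this assumes a port of sed is installed:

```shell
# keep_two_fields is a hypothetical wrapper around the sed command above.
# $1 = input file, $2 = output file (both names are chosen by the caller).
keep_two_fields() {
    # Capture "field1|field2", then drop whatever follows on the line.
    sed -e 's/^\([^|]*|[^|]*\).*$/\1/' "$1" > "$2"
}
```

For example, keep_two_fields data.txt trimmed.txt. Lines with fewer than two pipes pass through unchanged.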
unbeknown
A: 

Why use vim?

Why not just do

cat my_pipe_file | cut -d'|' -f1-2

HTH

cheers,

Rob

Rob Wells
Don't need the "cat".
Paul Tomblin
@Paul, you're right. Just force of habit I guess. (-:
Rob Wells
Fap fap fap! :) http://blog.jrock.us/articles/Useless%20use%20of%20%22useless%20use%22.pod
Philip Durbin
+7  A: 

If you don't have to use Vim, another alternative is the Unix "cut" command.

cut -d '|' -f 1-2 file > out.file
Paul Tomblin
+13  A: 

You can also record a macro:

qq02f|Djq

and then you will be able to play it with 100@q to run the macro on the next 100 lines.

Macro explanation:

  • qq - starts macro recording
  • 0 - go to the first character of the line
  • 2f| - find the second occurrence of the | character on the line
  • D - delete from the cursor position (the second pipe included) to the end of the line
  • j - go to the next line
  • q - ends macro recording
CMS
This will stop working prematurely if any line doesn't have two | on it, but works otherwise.
Brian Carper
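To check in advance whether the macro would stop early, one option outside Vim is to count the lines that have fewer than two pipes. A minimal sketch, assuming grep is available (on Windows XP that means a Unix tools port); count_short_lines is a made-up name:

```shell
# count_short_lines is a hypothetical helper: it prints how many lines
# of the file named in $1 contain fewer than two pipe characters.
count_short_lines() {
    # -v selects lines that do NOT match "pipe, anything, pipe";
    # -c prints the number of selected lines instead of the lines.
    # grep exits non-zero when that number is 0, so the status is ignored.
    grep -vc '|.*|' "$1" || true
}
```

If the count is 0, every line has at least two pipes and the macro can run through the whole file.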
+1  A: 

You can also do:

:%s/^\([^|]\+|[^|]\+\)|.*$/\1/g
Jay
A: 

This will filter all lines in the buffer (1,$) through cut to do the job.

:1,$!cut -d '|' -f 1-2

To try it on the current line only, use:

:.!cut -d '|' -f 1-2

PEZ
+2  A: 

Just another vim way to do the same thing:

%s/^\(.\{-}|\)\{2}\zs.*//
%s/^\(.\{-}\zs|\)\{2}.*// " if you want to remove the 2nd pipe as well

This time, the regex matches as few characters as possible (\{-}) that are followed by a pipe, and does so twice (\{2}). Those two fields are then left out of the match (\zs), so only the text that follows them is replaced by nothing.
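For comparison, the same "field plus pipe, twice" idea can be sketched with sed. This is a rough translation rather than the answer's own method: sed has no lazy quantifier and no \zs, so [^|]*| stands in for .\{-}| (it behaves the same here) and a capture group stands in for \zs. The function names are made up for illustration:

```shell
# Hypothetical helpers that read standard input and write standard output.

# Keep the first two fields and the second pipe, like the first command.
keep_through_second_pipe() {
    # Group 1 is two repetitions of "non-pipe characters, then a pipe".
    sed -e 's/^\(\([^|]*|\)\{2\}\).*/\1/'
}

# Keep the first two fields and drop the second pipe as well,
# like the second command.
keep_before_second_pipe() {
    sed -e 's/^\([^|]*|[^|]*\)|.*/\1/'
}
```

Lines with fewer than two pipes do not match either pattern and are left as they are.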

Luc Hermitte
+1, I need to go read up on vim regexps.
PEZ
Same here. Nice that this answer actually explains what it is doing a bit.
Jason Down
+1  A: 

Use :command to make a user command you can easily run.

:command -range=% YourNameHere <line1>,<line2>s/^\v([^|]+\|[^|]+)\|.*$/\1/
graywh
+1  A: 

Use awk.

awk -F"|" '{$0=$1"|"$2}1' file
ghostdog74