ansaurus

Question

How can I print specific lines from a file in Unix?

Answer 1

A:

I wouldn't do it this way with large files, but (untested):

open(my $fh1, "<", "line_number_file.txt") or die "Err: $!";
chomp(my @line_numbers = <$fh1>);
$_-- for @line_numbers;
close $fh1;

open(my $fh2, "<", "text_file.txt") or die "Err: $!";
my @lines = <$fh2>;

print @lines[@line_numbers];
close $fh2;

runrig 2010-07-23 15:19:55

<3 Pancake Bunny

R. Bemrose 2010-07-23 15:25:42

Here's an example: File 1 has this data: Anna, Bob, Cathy, Darren. File 2 has this: 2, 4. I want to use File 2 to determine which lines of File 1 are printed. In this case, I want to print the 2nd and 4th lines of File 1, so my results would be: Bob, Darren. Thanks!

itzy 2010-07-23 15:26:50

Hm, the comment didn't format as expected... the files all have just one word or number on each line.

itzy 2010-07-23 15:28:11

Ok, I've got the idea now.

runrig 2010-07-23 15:37:06

Answer 2

+4 A:

Assuming the line numbers to be printed are sorted.

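open my $fh, '<', 'line_numbers' or die $!;
chomp(my @ln = <$fh>);
open my $tx, '<', 'text_file' or die $!;
foreach my $ln (@ln) {
  my $line;
  # Read forward until the wanted line number, or until text_file runs out.
  do {
    $line = <$tx>;
  } until !defined $line or $. == $ln;
  # Stop at end of file instead of looping forever on a missing line.
  last unless defined $line;
  print $line;
}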

M42 2010-07-23 15:25:59

+1 for using best practices throughout. nice example!

Ether 2010-07-23 16:20:55

Thanks and thanks again for correction

M42 2010-07-23 17:28:37

Answer 3

+3 A:

$ cat numbers
1
4
6
$ cat file
one
two
three
four
five
six
seven
$ awk 'FNR==NR{num[$1];next}(FNR in num)' numbers file
one
four
six

ghostdog74 2010-07-23 15:26:48

+1 nice and clean :)

nico 2010-07-23 15:31:14

for a GNU tools answer, how about sed?

Cole 2010-07-23 16:22:15

Answer 4

A:

I'd do it like this:

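#!/bin/bash
numbersfile=numbers
datafile=data

# Redirect the numbers file into the whole loop; redirecting it into
# read itself would reopen the file and re-read its first line forever.
while read -r lineno; do
    sed -n "${lineno}p" "$datafile"
done < "$numbersfile"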

The downside of this approach is that it spawns one sed process per line number, so it will be slower than the other options. It is much more readable, though.

Daenyth 2010-07-23 15:41:08

Answer 5

+2 A:

You can avoid the limitations of some of the other answers (such as requiring sorted line numbers) simply by using eof within a basic while(<>) block. That tells you when you've stopped reading line numbers and started reading data. Note that you need to reset $. when the switch occurs.

# Usage: perl script.pl LINE_NUMS_FILE DATA_FILE

use strict;
use warnings;

my %keep;
my $reading_line_nums = 1;

while (<>){
    if ($reading_line_nums){
        chomp;
        $keep{$_} = 1;
        $reading_line_nums = $. = 0 if eof;
    }
    else {
        print if exists $keep{$.};    
    }
}

FM 2010-07-23 16:26:33

Answer 6

A:

This is a short solution using bash and sed

sed -n -e "$(cat num |sed 's/$/p/')" file

Here num is the file of numbers and file is the input file (tested on OS X Snow Leopard).

$ cat num
1
3
5

$ cat file
Line One
Line Two
Line Three
Line Four
Line Five

$ sed -n -e "$(cat num |sed 's/$/p/')" file
Line One
Line Three
Line Five
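
For large inputs, one possible refinement is to have sed quit after the highest requested line, so it does not read the rest of the file. This is only a sketch and was not part of the tested command above. The function name print_lines_sed is hypothetical, and it assumes the numbers file holds exactly one positive integer per line with no blank lines.

```shell
# Hypothetical helper: print_lines_sed NUMBERS_FILE DATA_FILE
# Assumes NUMBERS_FILE has one positive integer per line and no blank lines.
print_lines_sed() (
    nums=$1
    data=$2
    # Highest requested line number; sed can stop once it has been printed.
    last=$(sort -n "$nums" | tail -n 1)
    # Nothing requested: print nothing.
    [ -n "$last" ] || exit 0
    # First script: "Np" for every requested N. Second script: quit at the last one.
    sed -n -e "$(sed 's/$/p/' "$nums")" -e "${last}q" "$data"
)
```

Called as print_lines_sed num file, it should print the same lines as the command above. The q command only saves work when the requested lines sit well before the end of the file.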

Steve Weet 2010-07-23 16:42:55

Answer 7

+2 A:

cat -n foo | join foo2 - | cut -d" " -f2-

where foo is your file with lines to print and foo2 is your file of line numbers
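
One caveat: join expects both inputs to be sorted on the join field in lexical order, while cat -n numbers lines numerically, so matches can be missed once line numbers reach two digits (lexically, 10 sorts before 2). A possible workaround, sketched here under assumptions, is to zero-pad the numbers on both sides. The function name print_lines_join and the nine-digit padding are hypothetical choices, and mktemp is assumed to be available.

```shell
# Hypothetical helper: print_lines_join NUMBERS_FILE TEXT_FILE
# Assumes NUMBERS_FILE has one positive integer per line.
print_lines_join() (
    nums=$1
    text=$2
    tab=$(printf '\t')
    tmp=$(mktemp) || exit 1
    # Zero-pad the requested numbers so that lexical order equals numeric order.
    sort -n "$nums" | awk '{ printf "%09d\n", $1 }' > "$tmp"
    # Number the text lines with the same padding, join on that field,
    # then drop the number. A tab separator keeps spaces in the text intact.
    awk '{ printf "%09d\t%s\n", NR, $0 }' "$text" |
        join -t "$tab" "$tmp" - |
        cut -f2-
    rm -f "$tmp"
)
```

The output comes back in file order, not in the order the numbers were listed.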

frankc 2010-07-23 16:58:52

similar, but probably slower (textfile and lines are the 2 files): cat -n textfile | grep -f lines | cut -d' ' -f2

dblu 2010-07-23 22:48:28

That one will print the wrong lines. grep -f matches substrings, so if the lines file contains 3, it will print lines 3, 13, 23, etc., plus any line whose text happens to contain a 3.
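
To illustrate, a grep-based variant would need patterns anchored to the number column that cat -n writes (padding spaces, the number, then a tab). The following is a sketch under assumptions: the function name print_lines_grep is hypothetical, and the lines file is assumed to hold one positive integer per line with no blank lines.

```shell
# Hypothetical helper: print_lines_grep LINES_FILE TEXT_FILE
# Assumes LINES_FILE has one positive integer per line and no blank lines.
print_lines_grep() (
    lines=$1
    text=$2
    tab=$(printf '\t')
    # Turn each number N into the pattern "^ *N<TAB>", which matches only the
    # line-number column that cat -n writes, never the text after it.
    patterns=$(sed "s/.*/^ *&${tab}/" "$lines")
    # An empty pattern would match every line, so stop if nothing was requested.
    [ -n "$patterns" ] || exit 0
    # Filter the numbered lines, then drop the number column (cut splits on tabs).
    cat -n "$text" | grep -e "$patterns" | cut -f2-
)
```

With 3 in the lines file, this should print only line 3, not lines 13 or 23.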

frankc 2010-07-23 23:52:41

Answer 8

A:

$ cat input
every
good
bird
does
fly

$ cat lines
2
4

$ perl -ne 'BEGIN{($a,$b) = `cat lines`} print if $.==$a .. $.==$b' input
good
bird
does

If that's too much for a one-liner, use

#! /usr/bin/perl

use warnings;
use strict;

sub start_stop {
  my($path) = @_;
  open my $fh, "<", $path
    or die "$0: open $path: $!";

  local $/;
  return ($1,$2) if <$fh> =~ /\s*(\d+)\s*(\d+)/;
  die "$0: $path: could not find start and stop line numbers";
}

my($start,$stop) = start_stop "lines";

while (<>) {
  print if $. == $start .. $. == $stop;
}

Perl's magic open allows for creative possibilities such as

$ ./lines-between 'tac lines-between|'
  print if $. == $start .. $. == $stop;
while (<>) {

Greg Bacon 2010-07-23 17:41:00

Answer 9

+1 A:

Here is a way to do this in Perl without slurping anything so that the memory footprint of the program is independent of the sizes of both files (it does assume that the line numbers to be printed are sorted):

#!/usr/bin/perl

use strict; use warnings;
use autodie;

@ARGV == 2
    or die "Supply src_file and filter_file as arguments\n";

my ($src_file, $filter_file) = @ARGV;

open my $src_h, '<', $src_file;
open my $filter_h, '<', $filter_file;

my $to_print = <$filter_h>;

while ( my $src_line = <$src_h> ) {
    last unless defined $to_print;
    if ( $. == $to_print ) {
        print $src_line;
        $to_print = <$filter_h>;
    }
}

close $filter_h;
close $src_h;

Generate the source file:

C:\>  perl -le "print for aa .. zz" > src

Generate the filter file:

C:\> perl -le "print for grep { rand > 0.75 } 1 .. 52" > filter

C:\> cat filter
4
6
10
12
13
19
23
24
28
44
49
50

Output:

C:\> f src filter
ad
af
aj
al
am
as
aw
ax
bb
br
bw
bx

To deal with an unsorted filter file, you can modify the while loop:

while ( my $src_line = <$src_h> ) {
    last unless defined $to_print;
    if ( $. > $to_print ) {
        seek $src_h, 0, 0;
        $. = 0;
    }
    if ( $. == $to_print ) {
        print $src_line;
        $to_print = <$filter_h>;
    }
}

This would waste a lot of time if the contents of the filter file are fairly random because it would keep rewinding to the beginning of the source file. In that case, I would recommend using Tie::File.
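
As a rough alternative for an unsorted filter file, the source can be read once while only the requested lines are kept in memory, then printed in the order the filter file lists them. This is an awk sketch, not the Perl approach above. The function name print_lines_in_order is hypothetical, and the filter file is assumed to be non-empty with one positive integer per line.

```shell
# Hypothetical helper: print_lines_in_order FILTER_FILE SRC_FILE
# Assumes FILTER_FILE is non-empty, with one positive integer per line.
print_lines_in_order() (
    filter=$1
    src=$2
    awk '
        # First file: remember the requested numbers and their order.
        NR == FNR { order[++n] = $1; want[$1]; next }
        # Second file: keep only the lines that were asked for.
        FNR in want { text[FNR] = $0 }
        # Print in the order given by the filter file; skip numbers past EOF.
        END { for (i = 1; i <= n; i++) if (order[i] in text) print text[order[i]] }
    ' "$filter" "$src"
)
```

Memory use grows with the number of requested lines, not with the size of the source file, and the source is never rewound.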

Sinan Ünür 2010-07-25 16:38:52

Answer 10

A:

Here is a way to do this using Tie::File:

#!/usr/bin/perl

use strict; use warnings;
use autodie;
use Tie::File;

@ARGV == 2
    or die "Supply src_file and filter_file as arguments\n";

my ($src_file, $filter_file) = @ARGV;

tie my @source, 'Tie::File', $src_file, autochomp => 0
    or die "Cannot tie source '$src_file': $!";

open my $filter_h, '<', $filter_file;

while ( my $to_print = <$filter_h> ) {
    print $source[$to_print - 1];
}

close $filter_h;

untie @source;

Sinan Ünür 2010-07-25 17:03:01
