ansaurus

Question

Finding the missing values in a range using any scripting language: Perl, Python, or shell script

Answer 1

+2 A:

Python:

for line in open("inputfile.txt"):
    vals = set(map(int, line.split()))
    minv, maxv = min(vals), max(vals)
    missing = [str(v) for v in xrange(minv + 1, maxv) if v not in vals]
    print " ".join(missing)

balpha 2010-04-28 08:30:42

This is working fine, thanks a lot. Can you explain it a bit, if possible?

manu 2010-04-28 12:56:49

@manu: the code is very linear. Read input (skip duplicates (why?)), convert to integers; get min, max; iterate from min to max, skip values in input, make list of strings out of them; join list to make a single string.

badp 2010-04-28 13:03:59

@bp: Converting to a set was more for the lookups in the list comprehension than for removing dupes. I admit it's not totally consistent; if there are just a few values, it doesn't really matter; if there are many, one would probably use `itertools.imap` instead of `map`.

balpha 2010-04-28 13:24:25
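
The steps described in the comments above can be sketched in Python 3 as follows. This is a hedged rewrite, not balpha's original code: `missing_in_line` is a hypothetical helper name, `range` replaces `xrange`, a generator expression stands in for `map`/`itertools.imap`, and the empty-line guard is an added assumption. The sample lines are the ones used in the Perl answers on this page.

```python
def missing_in_line(line):
    """Return the integers strictly between min and max that are absent from the line."""
    # A set gives fast membership tests in the comprehension below;
    # dropping duplicates is only a side effect.
    vals = set(int(tok) for tok in line.split())
    if not vals:
        return []  # assumption: a blank line has nothing missing
    minv, maxv = min(vals), max(vals)
    # min and max are present by definition, so only the values between them are checked.
    return [v for v in range(minv + 1, maxv) if v not in vals]


for line in ["673 673 673 676 676 680", "2667 2667 2668 2670 2671 2674"]:
    print(" ".join(str(v) for v in missing_in_line(line)))
```

Because it takes `min` and `max` rather than the first and last token, this version does not depend on the input being sorted.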

Answer 2

+6 A:

In Python:

def report_missing_numbers(f):
    for line in f:
        numbers = [int(n) for n in line.split()]
        all_numbers = set(range(numbers[0], numbers[-1]))
        missing = all_numbers - set(numbers)
        yield missing

Note: all_numbers is a bit of a lie, since the range excludes the final number, but since that number is guaranteed to be in the set, it doesn't affect the correctness of the algorithm.

Note: I removed the [-1] from my original answer, since int(n) doesn't care about the trailing '\n'.

Marcelo Cantos 2010-04-28 08:33:08
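
The first note can be checked with a small experiment. The sketch below repeats the generator from this answer and assumes, as the answer does, that each line is sorted in ascending order; `io.StringIO` stands in for a real file, and the sample data is taken from the Perl answers on this page.

```python
import io


def report_missing_numbers(f):
    for line in f:
        numbers = [int(n) for n in line.split()]
        # Half-open range: the last number is excluded, but it is in the input anyway.
        all_numbers = set(range(numbers[0], numbers[-1]))
        yield all_numbers - set(numbers)


sample = io.StringIO("673 673 673 676 676 680\n2667 2667 2668 2670 2671 2674\n")
results = [sorted(missing) for missing in report_missing_numbers(sample)]
print(results)

# Including the final number in the range does not change the difference.
numbers = [673, 673, 673, 676, 676, 680]
half_open = set(range(numbers[0], numbers[-1])) - set(numbers)
closed = set(range(numbers[0], numbers[-1] + 1)) - set(numbers)
print(half_open == closed)
```

The sets are sorted before printing because a Python `set` has no guaranteed order.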

+1, I was thinking of writing the same thing.

S.Mark 2010-04-28 08:36:09

Your code reads just one line from the input. By the way, `file` is a built-in name; don't use it as a variable.

J.F. Sebastian 2010-04-28 11:19:34

Good points, @J.F. I've fixed it. Can I have my vote back?

Marcelo Cantos 2010-04-28 12:13:49

Yes, you can. By the way, `.split()` without arguments doesn't break on accidental double spaces: `'1  2'.split()` -> `['1', '2']` but `'1  2'.split(' ')` -> `['1', '', '2']`. `int()` doesn't work on empty strings.

J.F. Sebastian 2010-04-28 14:13:44
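
The difference between the two `split` calls is easy to reproduce. A minimal sketch, using a made-up input string with two spaces between the numbers:

```python
line = "1  2"  # two spaces between the numbers

print(line.split())     # splits on runs of whitespace
print(line.split(" "))  # splits on every single space, leaving an empty string

# int() tolerates surrounding whitespace such as a trailing newline...
with_newline = int("680\n")

# ...but it rejects the empty string produced by split(" ").
try:
    int("")
    empty_ok = True
except ValueError:
    empty_ok = False

print(with_newline, empty_ok)
```

This is why `line.split()` followed by `int(n)` needs neither the `[:-1]` slice nor any special handling of repeated spaces.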

Cool! Kind thanks for returning the vote and the tip on `split()`. Sometime in the dim dark past, I used to know that, but I had forgotten.

Marcelo Cantos 2010-04-28 22:04:22

Answer 3

+2 A:

Sample code using Perl:

#!/usr/bin/perl
use strict;
use warnings;

my @missing;

while(<DATA>) {
    my @data = split (/[ ]/, $_);
    my $i = shift @data;
    foreach (@data) {
        if ($_ != ++$i) {
            push @missing, $i .. $_ - 1;
            $i = $_;
        }
    }
}

print join " ", @missing;

__DATA__
673 673 673 676 676 680
2667 2667 2668 2670 2671 2674

OUTPUT

674 675 677 678 679 2669 2672 2673

Space 2010-04-28 09:01:10

Answer 4

A:

In Perl:

#!/usr/bin/perl

use 5.010;

for (1..1000) {
    say "I will not ask the internet to do my homework";
}

mscha 2010-04-28 09:02:30

Are you thinking of editing your post?

Space 2010-04-28 09:04:46

If you think the question is bad, you should vote to close it, not post a silly answer.

Thomas Wouters 2010-04-28 09:06:55

It might not be homework. This exact problem came up at my job a few months ago when we needed to list which records had gone missing from our DB.

Nefrubyr 2010-04-28 09:09:58

@all: Thanks a ton, my problem is solved :) This forum is amazing. Jai ho Stack Overflow! @mscha: Well, this is not homework. I had some gene locus information as hits but was not able to pick up the loci in between. Now I can get them :)

manu 2010-04-28 12:56:05

@Thomas he has 56 reputation - how can he vote to close it?

Konerak 2010-05-11 08:03:35

Answer 5

+5 A:

Perl:

use Modern::Perl;

for my $line (<DATA>) {
    chomp $line;
    my @numbers     = split /\s+/, $line;
    my ($min, $max) = (sort { $a <=> $b } @numbers)[0, -1];
    my @missing     = grep { not $_ ~~ @numbers } $min .. $max;
    say join " ", @missing;
}

__DATA__
673 673 673 676 676 680
2667 2667 2668 2670 2671 2674

/I3az/

draegtun 2010-04-28 09:09:24

Answer 6

A:

Modification of Marcelo's solution with safe release of file handle in the event of an exception:

with open('myfile.txt') as f:
    numbers = [int(n) for n in f.readline().split()]
all_numbers = set(range(numbers[0], numbers[-1]))
missing = all_numbers - set(numbers)

This also avoids using the builtin name file.

blokeley 2010-04-28 10:34:38
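
The point about releasing the file handle can be demonstrated with a deliberately bad input line. This is only an illustrative sketch: the temporary file and its contents are made up, and the `ValueError` is provoked on purpose.

```python
import os
import tempfile

# Write a throwaway input file; the second token is deliberately not a number.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("673 oops 676\n")
    path = tmp.name

try:
    with open(path) as f:
        numbers = [int(n) for n in f.readline().split()]  # raises ValueError on "oops"
except ValueError:
    pass

# The with block closed the file even though the body raised.
closed_after_error = f.closed
os.remove(path)
print(closed_after_error)
```

Without the `with` statement, the handle would stay open until it was closed explicitly or garbage-collected.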

Answer 7

A:

Shell solution using Bash, sort, uniq & jot (Mac OS X):

# Two sample inputs; the second assignment overrides the first.
numbers="673 673 673 676 676 680"
numbers="2667 2667 2668 2670 2671 2674"
sorted=($(IFS=$'\n' echo "${numbers}" | tr " " '\n' | sort -u ))
low=${sorted[0]}
high=${sorted[@]: -1}
( printf "%s\n" "${sorted[@]}"; jot $((${high} - ${low} + 1)) ${low} ${high} ) | sort | uniq -u

dinco 2010-04-28 12:57:39

Answer 8

+1 A:

Ruby:

$stdin.each_line do |line|
  numbers = line.scan(/\d+/).map(&:to_i)
  missing = (numbers.min..numbers.max).to_a - numbers
  puts missing.join " "
end

Golf version (79 characters):

puts $stdin.map{|l|n=l.scan(/\d+/).map(&:to_i);((n.min..n.max).to_a-n).join" "}

Lars Haugseth 2010-04-28 13:11:31

Answer 9

A:

Bash solution:

cat file_of_numbers | xargs -n2 seq | sort -nu

frankc 2010-04-28 14:06:30

This lists *all* numbers from min to max, not just the missing ones.

Lars Haugseth 2010-04-29 08:01:22

Answer 10

A:

Pure Bash:

while read -a line ; do
  firstvalue=${line[0]}
  lastvalue=${line[${#line[@]}-1]}
  output=()
  # prepare the output array
  for (( item=firstvalue; item<=lastvalue; item++ )); do
    output[$item]=1
  done
  # unset array elements with an index from the input set
  for item in ${line[@]}; do
    unset  "output[$item]"
  done
  # echo the remaining indices
  echo -e "${!output[@]}"
done < "$infile"

fgm 2010-05-11 07:21:43
