What I'm doing is this: I have one file as input and another as output. I choose a random line in the input, put it in the output, and then delete it from the input.

Now, I've iterated over the file and am on the line I want. I've copied it to the output file. Is there a way to delete it? I'm doing something like this:

for i in 0...number_of_lines_to_remove # exclusive range, so the loop runs exactly that many times
    line = rand(lines_in_file-2) + 1 #not removing the first line
    counter = 0
    IO.foreach("input.csv") { |current_line| # a second argument would be the line separator, not a mode
      if counter == line
        File.open("output.csv", "a") { |output|
          output.write(current_line)
        }
      end
      counter += 1
    }
end

So, I have current_line, but I'm not sure how to remove it from the source file.

A: 

Array#delete_at might do. Given an index, it removes the element at that index and returns it.

input.csv:

one,1
two,2
three,3

Program:

#!/usr/bin/ruby1.8

lines = File.readlines('/tmp/input.csv')
File.open('/tmp/output.csv', 'a') do |file|
  file.write(lines.delete_at(rand(lines.size)))
end
p lines    # e.g. ["two,2\n", "three,3\n"] when the first line was picked

output.csv:

one,1
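
The program above leaves input.csv itself untouched and may pick the first line. A minimal sketch of how the same idea could be extended to skip the first line and write the remaining lines back, assuming the first line is a header; the helper name `move_random_lines` and its parameters are hypothetical and not part of the original program:

```ruby
# Hypothetical helper: moves up to `count` random non-header lines from
# input_path to output_path, then rewrites input_path without them.
def move_random_lines(input_path, output_path, count)
  # Split off the first line so it can never be selected.
  header, *rows = File.readlines(input_path)
  moved = []
  count.times do
    break if rows.empty?
    # delete_at removes the element and returns it in one step.
    moved << rows.delete_at(rand(rows.size))
  end
  # Append the moved lines to the output file.
  File.open(output_path, 'a') { |out| out.write(moved.join) }
  # Overwrite the input file with the header plus the remaining rows.
  File.open(input_path, 'w') { |f| f.write(([header].compact + rows).join) }
  moved
end
```

Calling `move_random_lines('/tmp/input.csv', '/tmp/output.csv', 2)` would then return the two lines that were moved.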
Wayne Conrad
A: 

You have to rewrite the source file after removing a line; otherwise the modifications won't stick, because they're performed on an in-memory copy of the data.

Keep in mind that any operation which modifies a file in-place runs the risk of truncating the file if there's an error of any sort and the operation cannot complete.

It would be safer to use some kind of simple database for this kind of thing, as libraries like SQLite and BDB have mechanisms for ensuring data integrity. If that's not an option, you just need to be careful when writing the new input file.
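
One common way to be careful is to write the new contents to a temporary file and rename it over the original only once the write has finished. A minimal sketch, assuming a POSIX-like system where the rename replaces the target in one step; the helper name `rewrite_file_safely` and the `.tmp` suffix are hypothetical choices for illustration:

```ruby
# Hypothetical helper: replaces the contents of `path` with `lines`
# by writing a temporary file next to it and renaming it into place.
def rewrite_file_safely(path, lines)
  tmp_path = "#{path}.tmp" # assumed naming scheme for the temporary file
  # If this write fails partway, the original file is still intact.
  File.open(tmp_path, 'w') { |tmp| tmp.write(lines.join) }
  # Swap the finished file into place only after the write succeeded.
  File.rename(tmp_path, path)
end
```

For example, `rewrite_file_safely('input.csv', remaining_lines)` would replace input.csv with the lines that are left, where `remaining_lines` is an array of strings that still include their trailing newlines.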

tadman
I don't have to rewrite the source with that solution; I just have to keep reusing the string that is copied from the input. Since the lines are being removed from the string, all that is left is to decrease its total size to match the removed lines. I've also never done anything with SQLite/BDB, so I have no idea how to proceed. My SQL knowledge only goes as far as some SELECT queries. If you have any examples to help me out, I'd appreciate it.
zxcvbnm
Beanish has an example of rewriting the file below, where changes are preserved between runs. If you have no experience with SQL, then SQLite is probably overkill.
tadman
A: 

Here is a Randomline class. You create a new Randomline object by passing it an input file name and an output file name. You can then call the deleterandom method on that object, passing it the number of lines to delete.

The data is stored internally in arrays as well as being written to file. Currently the output file is opened in append mode, so if you reuse the same file it will just add to the end; you could change the "a" to a "w" if you wanted to start the file fresh each time.

class Randomline
  attr_accessor :inputarray, :outputarray

  def initialize(filein, fileout)
    @filename = filein
    @fileout = fileout
    @inputarray = []
    @outputarray = []

    readin()
  end

  # Read every line of the input file into @inputarray.
  def readin()
    @inputarray = File.readlines(@filename)
  end

  # Move numtodelete random lines from the input file to the output file.
  def deleterandom(numtodelete)
    # The block form of File.open closes (and flushes) the file when done.
    File.open(@fileout, "a") do |output|
      numtodelete.times do
        break if @inputarray.empty?
        line = @inputarray.delete_at(rand(@inputarray.size))
        @outputarray << line
        output.puts line
      end
    end

    # Rewrite the input file without the removed lines.
    File.open(@filename, "w") do |input|
      @inputarray.each { |line| input.puts line }
    end
  end
end

Here is an example of it being used:

a = Randomline.new("testin.csv","testout.csv")

a.deleterandom(3)
Beanish