Hello,

So I have a table that looks like this:

A    B
A    C
B    A
C    A
C    B

I want to delete the lines whose connection between two values is already represented (so A----B is the same connection as B----A). Basically, I want my table to look like this:

A    B
A    C
B    C

How can I do this in Ruby?

-Bobby

EDIT:

Here is my current code:

require 'rubygems'


f = File.new("uniquename.txt","w")
i = IO.readlines('bioportnetwork.txt').collect{|l| l.split.sort}.uniq
i.each do |z|
f.write(z + "\n")
end

I tried this code, but I think IO.readlines did not read my columns correctly. Here is part of my table:

9722,9754   8755
8755         9722,9754
9722,9754   7970,7971
7970,7971    9722,9754  

How can I get it to read correctly and then save the result as a TSV file?

-Bobby

+3  A: 

So, let's say you have loaded your TSV file into an array of pairs:

arr = [["A", "B"], ["A", "C"], ["B", "A"], ["C", "A"], ["C", "B"]]
Hash[arr.map{|pair| [pair.sort, pair]}].values
#=> [["B", "A"], ["C", "A"], ["C", "B"]]

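This is OK if the order of the pairs in the original array is not important. Note that each surviving pair keeps the orientation of its last occurrence.

And if the order of the elements within each pair is not important either: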

arr.map(&:sort).uniq
#=> [["A", "B"], ["A", "C"], ["B", "C"]]
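If you would rather keep the first occurrence of each pair exactly as it appears in the input, one possible variant is Array#uniq with a block. This is only a sketch: the name first_seen is made up for illustration, and it assumes a Ruby version where uniq accepts a block (1.9.2 or later).

```ruby
arr = [["A", "B"], ["A", "C"], ["B", "A"], ["C", "A"], ["C", "B"]]

# Array#uniq with a block compares items by the block's return value
# (here the sorted pair) and keeps the first item seen for each value.
first_seen = arr.uniq { |pair| pair.sort }
p first_seen
#=> [["A", "B"], ["A", "C"], ["C", "B"]]
```

Unlike the Hash version, nothing is overwritten here, so ["A", "B"] survives instead of ["B", "A"].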
Mladen Jablanović
+1  A: 

I'm assuming by 'table' you mean an array-of-arrays similar to this:

x = [['A', 'B'],
     ['A', 'C'],
     ['B', 'A'],
     ['C', 'A'],
     ['C', 'B']]

If so, you can de-duplicate the list with x.collect{|a| a.sort}.uniq.

Update: To read the data out of the file and into the array, use something like:

lines = IO.readlines('filename.txt')
x = []
lines.each {|l| x << l.split}

Update 2: Or, you can one-line the whole thing:

IO.readlines('test.txt').collect{|l| l.split.sort}.uniq

Update 3: When writing out to the file, don't pass the array straight to IO#write. It converts the array to a string automatically, which might be where you are running into your problem. Instead, build each line yourself and use IO#puts (here pair is one element of x):

f.puts pair[0].to_s + "\t" + pair[1].to_s
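Putting the updates together, the whole read, de-duplicate and write cycle might look roughly like the sketch below. The helper name dedupe_pairs and the file names are hypothetical, and it assumes two whitespace-separated columns per line, so a value such as 9722,9754 stays in one piece because split only breaks on whitespace.

```ruby
require 'tmpdir'

# Hypothetical helper: reads a two-column, whitespace-separated file,
# drops mirrored pairs, and writes the survivors as tab-separated lines.
def dedupe_pairs(in_path, out_path)
  rows  = File.readlines(in_path).map { |line| line.split }
  rows  = rows.reject { |cols| cols.empty? }    # skip blank lines
  pairs = rows.map { |cols| cols.sort }.uniq    # "B A" and "A B" become equal
  File.open(out_path, "w") do |f|
    pairs.each { |pair| f.puts pair.join("\t") }
  end
  pairs
end

# Demo using the sample rows from the question, in a throwaway directory.
Dir.mktmpdir do |dir|
  input  = File.join(dir, "network.txt")
  output = File.join(dir, "unique.tsv")
  sample = ["9722,9754   8755",
            "8755         9722,9754",
            "9722,9754   7970,7971",
            "7970,7971    9722,9754  "]
  File.write(input, sample.join("\n") + "\n")
  dedupe_pairs(input, output)
  print File.read(output)   # two lines remain, one per connection
end
```

Joining each pair with "\t" before calling puts avoids adding a string to an array, and the sort is a plain string sort, so it only serves to make mirrored pairs identical.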
bta
+1  A: 

Set equality is defined in Ruby and ignores element order, and a Set uses that equality to reject members it already contains, so you can use a nested set structure to solve this quickly and easily.

require 'set'

# file is assumed to be an already opened File (or any IO object)
set_of_all_sets = Set.new
file.each_line do |line|
  next unless line =~ /(\S+)\s+(\S+)/
  set_of_all_sets << Set.new([$1, $2])
end
set_of_all_sets.map{|set| set.to_a}
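To see why the nested sets collapse mirrored pairs, here is a small standalone check. The variable names pair, flipped and links are only for illustration.

```ruby
require 'set'

# Two sets with the same members are equal regardless of insertion order,
pair    = Set.new(["A", "B"])
flipped = Set.new(["B", "A"])
puts pair == flipped   #=> true

# so adding both to an outer Set stores only one of them.
links = Set.new
links << pair << flipped
puts links.size        #=> 1
```

One trade-off of this approach is that a Set does not promise any particular ordering of the two values when you convert it back with to_a, so sort the result if the column order matters.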
animal