tags:

views:

232

answers:

3

Hi, I have a tsv file with many columns like so; genename X1 X100 X103 X105 X115 X117 X120 X122 X123
Gene20728 0.415049 0.517868 0.820183 0.578081 0.30997 0.395181

I would like to import it into mathematica, and then extract and sort a column.

ie I want to extract column ["X117"] and sort it, and output the sorted list.

Thanks, Tom

+1  A: 
table = Import["file.csv", "Table"];
x117 = Drop[table[[All, 7]], 1];
sorted = Sort[x117];
ragfield
hi, I don't know what the column index for x117 will be on import. So do I have to read in the header/colnames first line and convert from colname to index on the fly?
Tom
+2  A: 

I do not think there is a built in method of achieving the smart structure you seem to be asking for. Below is the what I think is the most straight forward implementation out of the various possible methods.

stringdata = "h1\th2\n1\t2\n3\t4\n5"

h1  h2
1   2
5   4
3

Clear[ImportColumnsByName];
ImportColumnsByName[filename_] := 
 Module[{data, headings, columns, struc},
  data = ImportString[filename, "TSV"];
  headings = data[[1]];
  columns = Transpose[PadRight[data[[2 ;; -1]]]];
  MapThread[(struc[#1] = #2) &, {headings, columns}];
  struc
  ]

Clear[test];
test = ImportColumnsByName[stringdata];
test["h1"]
test["h2"]
Sort[test["h1"]]

outputs:

{1, 3, 5}
{2, 4, 0}
{1, 3, 5}

Building on ragfield's solution, this is a more dynamic method, however every call to this structure makes a call to Position and Part.

Clear[ImportColumnsByName];
ImportColumnsByName[filename_] := Module[{data, temp},
  data = PadRight@ImportString[filename, "Table"];
  temp[heading_] := 
   Rest[data[[All, Position[data[[1]], heading][[1, 1]]]]];
  temp
  ]

Clear[test];
test = ImportColumnsByName[stringdata];
test["h1"]
test["h2"]
Sort[test["h1"]]

outputs:

{1, 3, 5}
{2, 4, 0}
{1, 3, 5}
Davorak
I forgot to mention, ImportString needs to be replaced by Import to operate on files.
Davorak
+1  A: 

Starting from ragfield's code:

table = Import["file.csv", "Table"];
colname = "X117"
x117 = Drop[table[[All, Position[tb[[1, All]], colname]//Flatten]],
            1]//Flatten;
sorted = Sort[x117];
AVB