Hello, I am trying to import a large number of files into Matlab for processing. A typical file would look like this:

    mass      intensity
 350.85777         238
 350.89252        3094
 350.98688        2762
 351.87899         468
 352.17712         569
 352.28449         426
Some text and numbers here, describing the experimental setup, eg  
Scan 3763 @ 81.95, contains 1000 points:

The numbers in the two columns are separated by 8 spaces. However, sometimes the experiment will go wrong and the machine will produce a datafile like this one:

mass      intensity

Some text and numbers here, describing the experimental setup, eg  
Scan 3763 @ 81.95, contains 1000 points:

I found that treating the files as space-separated with a single header row, i.e.

importdata(path_to_file,' ',  1);

works best for the normal files. However, it totally fails on all the abnormal files. What would be the easiest way to fix this? Should I stick with importdata (I have already tried all possible settings and it just doesn't work), or should I try writing my own parser? Ideally, I would like to get those values in an Nx2 matrix for normal files and [0 0] for abnormal files.

Thanks.

+1  A: 

What do you mean by 'totally fails on abnormal files'?

You can check whether importdata finds any data using, e.g.:

>> imported = importdata(path_to_file,' ',  1);
>> isfield(imported, 'data')
second
Sorry, I should've been more precise. 'Totally failed' means it imports the number after the '@', in the example above 81.95. It's not a complete failure, but not what I need.
reseter
Odd, mine just comes up empty. Could you post an example of an 'abnormal' file (and which MATLAB version you are using)?
second
+4  A: 

I don't think you need to create your own parser, nor is this all that abnormal. Using textscan is your best option here.

fid = fopen('input.txt', 'rt');
data = textscan(fid, '%f %u', 'Headerlines', 1);
fclose(fid);

mass = data{1};
intensity = data{2};

Yields:

mass =
  350.8578
  350.8925
  350.9869
  351.8790
  352.1771
  352.2845

intensity =
         238
        3094
        2762
         468
         569
         426

for your first file, and:

    mass =
       Empty matrix: 0-by-1

    intensity =
       Empty matrix: 0-by-1

for your empty one.

By default, textscan treats whitespace as a delimiter, and it keeps reading according to the format you give it only until the input no longer matches; that is why it ignores the final lines in your file. You can also run a second textscan after this one if you want to pick up those additional fields:

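fid = fopen('input.txt', 'rt');
% First pass: skip the header row, then read the numeric rows.
data = textscan(fid, '%f %u', 'Headerlines', 1);

mass = data{1};
intensity = data{2};

% Second pass: continue from where the first pass stopped, skip one line,
% then pick the numbers out of the "Scan ..." line.
data = textscan(fid, '%*s %u %*c %f %*c %*s %u %*s', 'Headerlines', 1);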

scan = data{1};
level = data{2};
points = data{3};

fclose(fid);

Along with your mass and intensity data, this gives:

    scan =
            3763

    level =
       81.9500

    points =
            1000
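
The question asks for an Nx2 matrix for normal files and [0 0] for abnormal ones, so here is a minimal sketch of how the textscan result could be combined into that shape. It reads both columns with %f instead of %u, because concatenating a double column with a uint32 column would turn the whole matrix into an integer type and round the masses. The temporary sample file and the variable names result and n are only for illustration and are not from the original post.

```matlab
% Write a small sample file so the sketch runs on its own (demo only).
path_to_file = [tempname '.txt'];
fid = fopen(path_to_file, 'wt');
fprintf(fid, 'mass      intensity\n');
fprintf(fid, ' 350.85777         238\n');
fprintf(fid, ' 350.89252        3094\n');
fprintf(fid, 'Some text and numbers here\n');
fprintf(fid, 'Scan 3763 @ 81.95, contains 1000 points:\n');
fclose(fid);

% Read both columns as doubles so they can share one numeric matrix.
fid = fopen(path_to_file, 'rt');
data = textscan(fid, '%f %f', 'HeaderLines', 1);
fclose(fid);
delete(path_to_file);

% Guard against columns of unequal length if parsing stopped mid-row.
n = min(numel(data{1}), numel(data{2}));
if n == 0
    result = [0 0];                          % abnormal file: no data rows
else
    result = [data{1}(1:n), data{2}(1:n)];   % normal file: N-by-2 matrix
end
```

For a large number of files, the reading part could be placed in a loop over the file names, storing each result in a cell array.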
Geodesic
