I had a similar question, but what I am trying to do now is read files in .txt format into MATLAB. My problem is with the headers. Often, due to errors, the system rewrites the headers in the middle of the file, and then MATLAB cannot read the file. Is there a way to skip them? I know I can skip some characters if I know what the characters are.

Here is the code I am using:

[c,pathc]=uigetfile({'*.txt'},'Select the data','V:\data');
file=[pathc c];
data= dlmread(file, ',', 1,4);

This way I let the user pick the file. My files are huge, typically [86400 125], so each has 125 header fields or more, depending on the file.

Thanks

Because the files are so big I cannot copy one here, but the format looks like this:

day time col1 col2 col3 col4 ...............................
2/3/2010 0:10 3.4 4.5 5.6 4.4 ...............................
..................................................................
..................................................................

and so on

+1  A: 

With DLMREAD you can read only numeric data. It will not read the date and time in your first two columns. If all the other data are numeric, you can tell DLMREAD to skip the first row and the first 2 columns on the left:

data = dlmread(file, ' ', 1,2); 

To also import the day and time, you can use IMPORTDATA instead of DLMREAD:

A = importdata(file, ' ', 1);
dt = datenum(A.textdata(2:end,1),'mm/dd/yyyy');
tm = datenum(A.textdata(2:end,2),'HH:MM');
data = A.data;

The date and time will be converted to serial date numbers. You can convert them back with the DATESTR function.
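As a minimal sketch of that round trip, assuming the two sample strings from the question's layout (the variable names `t` and `stamp` are made up for illustration): a time-only `datenum` carries a default date in its integer part, so only the fractional part is added to the date before converting back.

```matlab
% Sample strings taken from the layout shown in the question.
dateStr = '2/3/2010';
timeStr = '0:10';

dt = datenum(dateStr, 'mm/dd/yyyy');   % serial number for the date
tm = datenum(timeStr, 'HH:MM');        % time of day plus a default date part

% Keep only the time-of-day fraction of tm, then add it to the date.
t = dt + mod(tm, 1);

% Convert the combined serial number back to text.
stamp = datestr(t, 'mm/dd/yyyy HH:MM');
disp(stamp)
```

The same arithmetic works element-wise on the `dt` and `tm` vectors built from `A.textdata`.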

yuk
For the above problem, I am just looking for a way for MATLAB to read my files. As you saw in the code above, I am skipping the headers. I was wondering if there is a way for MATLAB not to stop reading if it encounters a header row again in the middle, for example on row 1000 or so (I won't know which row it is until I run it). Thanks
Paul
This is why I asked for an example. Use textscan as in Jonas's answer.
yuk
+1  A: 

It turns out that you can still use textscan, except that you read everything as strings and then attempt to convert to double. 'str2double' returns NaN for strings that cannot be parsed as numbers, and since the header fields are all text, you can identify header rows as rows that are all NaN.

For example:

%# find and open file
[c,pathc]=uigetfile({'*.txt'},'Select the data','V:\data'); 
file=[pathc c];
fid = fopen(file);

%# read all text
strData = textscan(fid,'%s%s%s%s%s%s','Delimiter',','); 

%# close the file again
fclose(fid);

%# catenate, b/c textscan returns a column of cells for each column in the data
strData = cat(2,strData{:}); 

%# convert cols 3:6 to double
doubleData = str2double(strData(:,3:end));

%# find header rows. headerRowsL is a logical array
headerRowsL = all(isnan(doubleData),2);

%# since I guess you know what the headers are, you can just remove the header rows
dateAndTimeCell = strData(~headerRowsL,1:2);
dataArray = doubleData(~headerRowsL,:);

%# and you're ready to start working with your data 
Jonas
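To see the header test in isolation, here is a small sketch on an in-memory cell array instead of a file. The sample values are hypothetical, modelled on the layout in the question, with the header row repeated in the middle.

```matlab
% Hypothetical sample: the header row reappears as row 3.
strData = { 'day',      'time', 'col1', 'col2'; ...
            '2/3/2010', '0:10', '3.4',  '4.5' ; ...
            'day',      'time', 'col1', 'col2'; ...
            '2/3/2010', '0:20', '5.6',  '4.4' };

% Convert the value columns; text that is not a number becomes NaN.
doubleData = str2double(strData(:, 3:end));

% A row whose value columns are all NaN is treated as a header row.
headerRowsL = all(isnan(doubleData), 2);

% Keep only the non-header rows.
dateAndTimeCell = strData(~headerRowsL, 1:2);
dataArray = doubleData(~headerRowsL, :);

disp(find(headerRowsL)')   % row numbers flagged as headers
disp(dataArray)
```

One caveat with this test: a genuine data row whose value fields are all empty or non-numeric would also be flagged as a header and dropped.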
You don't need to use cellfun for str2double. Just `doubleData=str2double(strData);` will work.
yuk
Check quotes in uigetfile and textscan lines.
yuk
Oops, thanks for finding the error, @yuk. Also, thanks for pointing out that str2double works without cellfun. I learned something new.
Jonas
@Jonas : Thanks a lot
Paul
@Jonas: Silly question... which one is my data matrix? Earlier it was `data`, which had all the values. Another major issue: it is running out of memory, so I cannot even run it. I know my files are huge, about 60 MB.
Paul
@Paul: dataArray contains the numeric data, dateAndTimeCell contains the corresponding strings for date and time. I'm a bit surprised that you should run out of memory with a 60 MB file. It may help a little to reuse variables instead of creating new ones: replace `dateAndTimeCell` with `strData`, and `dataArray` with `doubleData` (in which case the numeric data would be in `doubleData`). Also, did you run `clear classes` before you tried the function? Did you do `dbstop if error` and check the size of the arrays and the free memory with `memory` at the point where you run out of RAM?
Jonas
@ What is the maximum MATLAB can handle? I am running two files of 60-70 MB each. My RAM should be fine: even when I get the out-of-memory error, I can see memory use is at 1-1.2 GB, and I have 2 GB of memory. I have noticed that any time MATLAB uses more than 800,000K it maxes out, and I have to restart MATLAB.
Paul
On Windows, run `memory`; it will tell you how much memory MATLAB can use (see also the documentation for `memory`).
Jonas
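One way to check array sizes on any platform is `whos`, which reports the bytes each variable occupies. The sketch below uses a small made-up array (the names `numData` and `strCells` and the size `n` are assumptions for illustration) to compare a double array with the same values stored as one string per cell, which is the form the data takes right after reading everything as strings. Whether this overhead is what exhausts memory on the real files is a guess, not a diagnosis.

```matlab
n = 1000;                        % hypothetical small size, not the real file
numData = rand(n, 5);            % numeric values stored as doubles

% The same values stored as text, one string per cell.
strCells = arrayfun(@(x) sprintf('%.1f', x), numData, 'UniformOutput', false);

% whos returns a struct with a bytes field when called with an output.
infoNum = whos('numData');
infoStr = whos('strCells');

fprintf('double array:          %d bytes\n', infoNum.bytes);
fprintf('cell array of strings: %d bytes\n', infoStr.bytes);
```

Each cell carries its own bookkeeping on top of the characters it holds, so the cell array comes out much larger than the double array holding the same values.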
