tags:

views:

98

answers:

4

I have a .csv file with records written in the following form:

2010-04-20 15:15:00,"8.9915176259e+00","8.8562623697e+00"
2010-04-20 15:30:00,"8.5718021723e+00","8.6633827160e+00"
2010-04-20 15:45:00,"8.4484844117e+00","8.4336586330e+00"
2010-04-20 16:00:00,"1.1106980342e+01","8.4333062208e+00"
2010-04-20 16:15:00,"9.0643470589e+00","8.6885660103e+00"
2010-04-20 16:30:00,"8.2133517943e+00","8.2677822671e+00"
2010-04-20 16:45:00,"8.2499419380e+00","8.1523501983e+00"
2010-04-20 17:00:00,"8.2948492278e+00","8.2884797924e+00"

From these data I would like to make clusters - I would like to add a column with number indicating the hour - so in case of the first row a value 15 has to be added in a new row.

The first problem is that calling a function

[numData, textData, rawData] = xlsread('testData.csv')

creates an empty matrix numData and one-column textData and rawData structures.

Is it possible to create any template which recognizes a yyyy, MM, dd, hh, mm, ss values from the data above?

What I would basically like to do with these data is to categorize the values by hours so from the example row of input:

2010-04-20 15:15:00,"8.9915176259e+00","8.8562623697e+00"

update 1: in Matlab the line above is recognized as a string:

'2010-04-26 13:00:00,"1.0428104753e+00","2.3456394130e+00"'

I would want this to be the output:

15, 8.9915176259e+00, 8.8562623697e+00

update 1: a string has to be parsed

Does anyone know how to parse a string and retrieve a timestamp ('2010-04-20 15:15:00'), value1 (1.0428104753e+00) and value2 (2.3456394130e+00) from it as separate values?

+1  A: 
datevec('2010-04-20 15:15:00')
ans =

        2010           4          20          15          15           0

str2num for the other two columns and csvwrite will also be your friend.

I am not sure I understand all of your question, but I suspect this will get you well on the way to solving it.

MatlabDoug
Sorry, Doug, haven't seen your answer.
yuk
You can always delete your answer before you get the reputation points that I want! :)
MatlabDoug
Thanks Doug for a response regarding the timestamp parsing. I've refined a question about the string parsing so hopefully now is clearer what I would like to do.
niko
A: 

You can use DATEVEC to convert date strings to year, month, day, hours, minutes and sec.

DV = datevec(textData)

DV =

        2010           4          20          15          15           0
        2010           4          20          15          30           0
        2010           4          20          15          45           0
        2010           4          20          16           0           0
        2010           4          20          16          15           0
        2010           4          20          16          30           0
        2010           4          20          16          45           0
        2010           4          20          17           0           0

Hours = DV(:,4);

EDIT:

If I use (R2010a) your sample data and your code

[numData, textData, rawData] = xlsread('testData.csv')

I get two last columns in numData and the first column in textData. Applying DATEVEC to textData will convert the date string to numbers.

EDIT2:

To parse your string you can do, for example:

s = '2010-04-20 15:15:00,"8.9915176259","8.8562623697"';
s2 = regexprep(s,'"','');
A = textscan(s2,'%s %f %f','Delimiter',',');
DV = datevec(A{1});
result = [DV(4) A{2:3}];
yuk
I tried to use xlsread but again numData is empty while rawData and textData are 2857x1 structures (there are 2857 records in the file).
niko
s2 = '2010-04-20 15:15:00,8.9915176259e+00,8.8562623697e+00'>> A = textscan(s2,'%s %f %f','Delimiter',',');??? Error using ==> textscanFirst input must be of type double or string.
niko
Because you are applying textscan to cell array of strings. I would use textscan to read the file, if xlsread does not work as expected, but you already get that answer from gnovice.
yuk
A: 

using http://www.mathworks.com/matlabcentral/fileexchange/19707-xls2struct

and assuming your .csv file has a 'header' row. I used (time, v1, v2) for this example:

 data=xls2struct('t.csv');
 timeVec = datevec(data.time);
 data.time = timeVec(:,4); %replace time with hours
 matData=[data.time, data.v1,data.v2] %concat data from the struct

matData =

   15.0000    8.9915    8.8563
   15.0000    8.5718    8.6634
   15.0000    8.4485    8.4337
   16.0000   11.1070    8.4333
   16.0000    9.0643    8.6886
   16.0000    8.2134    8.2678
   16.0000    8.2499    8.1524
   17.0000    8.2948    8.2885
hash blue
+2  A: 

If I use XLSREAD with your sample file data, I get the correct outputs just as yuk did. This statement from the documentation for XLSREAD may explain the problem you are getting:

If your system does not have Excel for Windows installed, or MATLAB cannot access the COM server, xlsread operates in basic mode. In this mode, xlsread only reads XLS files.

As an alternative, I was also able to read your sample data file using TEXTSCAN:

>> fid = fopen('testData.csv','r');                    %# Open the file
>> data = textscan(fid,'%s %s %s','Delimiter',',',...  %# Read the data
                   'CollectOutput',true);
>> fclose(fid);                                        %# Close the file
>> data = strrep(data{1},'"','')  %# Format the data and remove '"'

data = 

    '2010-04-20 15:15:00'    '8.9915176259e+00'    '8.8562623697e+00'
    '2010-04-20 15:30:00'    '8.5718021723e+00'    '8.6633827160e+00'
    '2010-04-20 15:45:00'    '8.4484844117e+00'    '8.4336586330e+00'
    '2010-04-20 16:00:00'    '1.1106980342e+01'    '8.4333062208e+00'
    '2010-04-20 16:15:00'    '9.0643470589e+00'    '8.6885660103e+00'
    '2010-04-20 16:30:00'    '8.2133517943e+00'    '8.2677822671e+00'
    '2010-04-20 16:45:00'    '8.2499419380e+00'    '8.1523501983e+00'
    '2010-04-20 17:00:00'    '8.2948492278e+00'    '8.2884797924e+00'

Now you can format the strings in the cell array data into a matrix of numeric values using DATEVEC, CELLFUN, and STR2NUM:

>> dateVectors = datevec(data(:,1));  %# Format the date string
>> newData = [dateVectors(:,4) cellfun(@str2num,(data(:,2:3)))]

newData =                        %# A numeric matrix

   15.0000    8.9915    8.8563
   15.0000    8.5718    8.6634
   15.0000    8.4485    8.4337
   16.0000   11.1070    8.4333
   16.0000    9.0643    8.6886
   16.0000    8.2134    8.2678
   16.0000    8.2499    8.1524
   17.0000    8.2948    8.2885
gnovice