no  time  scores
1    10    123
2    11    22
3    12    22
4    50    55
5    60    22
6    70    66
.    .     .
.    .     .
n    n     n 

Above is the content of my txt file (thousands of lines).

1st column - sample number
2nd column - time (accumulated from beginning to end)
3rd column - scores

I want to create a new file in which each line is the sum of the scores of three consecutive samples divided by the time difference between the last and first of those samples.

e.g. (123+22+22)/ (12-10) = 167/2 = 83.5
     (55+22+66)/(70-50) = 143/20 = 7.15

new txt file

83.5
7.15
.
.
.
n

So far I have this code:

fid=fopen('data.txt')
data = textscan(fid,'%*d %d %d')
time = (data{1})
score= (data{2})
for sample=1:length(score)
     % ..... I'm stuck here ..
end
....
A: 

For what it's worth, here is how you would do that in Python. It can probably be adapted to Matlab.

import numpy
no, time, scores = numpy.loadtxt('data.txt', skiprows=1).T

# This assumes n is a multiple of 3; otherwise trim the arrays first.
sums = scores[::3]+scores[1::3]+scores[2::3]
dt = time[2::3]-time[::3]

result = sums/dt
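
If n is not a multiple of 3, one way to adjust is to drop the trailing incomplete group before slicing. This is only a minimal sketch: the function name grouped_rate is made up for illustration, and the six sample rows from the question (plus one extra leftover row) stand in for the real file.

```python
import numpy

def grouped_rate(time, scores, size=3):
    """Sum of each group of `size` scores divided by that group's time span."""
    time = numpy.asarray(time, dtype=float)
    scores = numpy.asarray(scores, dtype=float)
    # Keep only complete groups; leftover samples at the end are discarded.
    usable = (len(scores) // size) * size
    time, scores = time[:usable], scores[:usable]
    # Add the 1st, 2nd, ... member of every group together.
    sums = sum(scores[k::size] for k in range(size))
    # Last time stamp of each group minus its first time stamp.
    dt = time[size - 1::size] - time[::size]
    return sums / dt

# Sample rows from the question, plus one leftover row that gets dropped.
time = [10, 11, 12, 50, 60, 70, 80]
scores = [123, 22, 22, 55, 22, 66, 5]
result = grouped_rate(time, scores)
print(result)
```

With the sample rows this reproduces the two values worked out by hand in the question, 83.5 and 7.15.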
Olivier
+2  A: 
%# Easier to load with importdata
data = importdata('data.txt',' ',1);
%# Get the number of rows
n = size(data,1);
%# Column IDs
time = 2;score = 3;
%# The interval size (3 in your example)
interval = 3;
%# Pre-allocate space
new_data = zeros(numel(interval:interval:n),1);
%# For each new element in the new data
index = 1;
%# This will ignore elements past the closest (floor) multiple of 3 as requested
for i = interval:interval:n
    %# First and last elements in a batch
    a = i-interval+1;
    b = i;
    %# Compute the new data
    new_data(index) = sum( data(a:b,score) )/(data(b,time)-data(a,time));
    %# Increment
    index = index+1;
end
Jacob
What is the difference between importdata and textscan?
Jessy
In `textscan` you need to specify the formatting, `importdata` figures it out (most of the time).
Jacob
Does the "data" in file.data refer to the name of the file, data.txt?
Jessy
I got this error when I ran the code: Attempt to reference field of non-structure array. :(
Jessy
No, the `data` is the data component after `importdata`. Can you just type `file = importdata('data.txt');disp(file)` and tell me what the output is?
Jacob
It prints many numbers in the form of a matrix, without any errors. But when I run the whole code, it gives this error: Attempt to reference field of non-structure array.
Jessy
OK, I've changed the code, try it now.
Jacob
A new error comes up: Index exceeds matrix dimensions. Could this be because the total number of lines in the data is not necessarily a multiple of 3?
Jessy
Could you send a link to the data file somewhere? Also, if you successfully got the data matrix with `textscan` try using that instead of `importdata`
Jacob
I couldn't get textscan to work either :( Here is the link to the file: http;//www.2shared.com/document/SDQS/data.html
Jessy
Why do I keep getting this error? :( Index exceeds matrix dimensions. Error in -> new_data(index) = sum( data(a:b,score) )/(data(b,time)-data(a,time)); This only happens when I change the interval to a number other than 3.
Jessy
I fixed a bug in my code --- sorry! Try it now.
Jacob
Thanks Jacob :-) This is really great! You helped me a lot, thank you very much. I have a small problem though: when I checked the output, it seems that it took an interval of 6 and not 5.
Jessy
My apologies!! I didn't check it before posting, try the code now.
Jacob
I updated it again, try the current version.
Jacob
THANKS Jacob! this works PERFECT!... God Bless you...!!
Jessy
Haha, no problems, glad to help :)
Jacob
A: 

I suggest you use the importdata() function to get your data into your variable called data. Something like this:

data = importdata('data.txt',' ', 1)

Replace ' ' with the delimiter your file uses; the 1 tells Matlab to skip one header line. Then, to compute your results, try this statement:

(data(1:3:end,3)+data(2:3:end,3)+data(3:3:end,3))./(data(3:3:end,2)-data(1:3:end,2))

This worked on your sample data, should work on the real data you have. If you figure it out yourself you'll learn some useful Matlab.

Then use save() to write the results back to a file.

PS If you find yourself writing loops in Matlab you are probably doing something wrong.
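
For anyone following the Python answer rather than Matlab, writing one result per line can be sketched roughly as below. The helper name write_results and the file name result.txt are placeholders, not something from the question, and the two values are the ones computed by hand there.

```python
def write_results(values, path):
    """Write one value per line, the layout shown in the question."""
    with open(path, "w") as handle:
        for value in values:
            # 'g' formatting drops trailing zeros (83.5 rather than 83.500000).
            handle.write(f"{value:g}\n")

# The two results worked out by hand in the question.
write_results([83.5, 7.15], "result.txt")
```

Note that the 'g' format keeps six significant digits by default, so a different format string would be needed if more precision matters.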

High Performance Mark
@Mark: Loops aren't always bad. This has been discussed before on SO. Also, this solution fixes the interval at 3 (`data(1:3:end,3) + data(2:3:end,3) + ...`).
Jacob
It gave me this error: mrdivide out of memory
Jessy
@Jessy: how big is your dataset?
High Performance Mark
Thousands of lines: 318687 in total.
Jessy
+7  A: 

If you are feeling adventurous, here's a vectorized one-line solution using ACCUMARRAY (assuming you already read the file in a matrix variable data like the others have shown):

NUM = 3;
result = accumarray(reshape(repmat(1:size(data,1)/NUM,NUM,1),[],1),data(:,3)) ...
    ./ (data(NUM:NUM:end,2)-data(1:NUM:end,2))

Note that here the number of samples NUM=3 is a parameter and can be substituted by any other value.
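
For anyone following along in Python instead, numpy.bincount with its weights argument plays roughly the role ACCUMARRAY does here. A minimal sketch, assuming the row count is already a multiple of NUM and using the six sample rows from the question in place of the file:

```python
import numpy

NUM = 3
# Columns: sample number, accumulated time, score (sample rows from the question).
data = numpy.array([
    [1, 10, 123],
    [2, 11, 22],
    [3, 12, 22],
    [4, 50, 55],
    [5, 60, 22],
    [6, 70, 66],
], dtype=float)

# Group label for every row: 0, 0, 0, 1, 1, 1, ...
groups = numpy.arange(data.shape[0]) // NUM
# bincount with weights adds up the scores that share a label.
sums = numpy.bincount(groups, weights=data[:, 2])
# Last time stamp of each group minus its first time stamp.
dt = data[NUM - 1::NUM, 1] - data[::NUM, 1]
result = sums / dt
print(result)
```

Integer division of the row index by NUM builds the same group labels that the repmat/reshape expression produces, only zero-based.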

Also, reading your comment above, if the number of samples is not a multiple of this number (3), then simply discard the remaining samples by doing this beforehand:

data = data(1:fix(size(data,1)/NUM)*NUM,:);

I'm sorry, here's a much simpler one :P

result  = sum(reshape(data(:,3), NUM, []))' ./ (data(NUM:NUM:end,2)-data(1:NUM:end,2));
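
The same reshape idea carries over to NumPy, with one difference worth noting: Matlab fills a reshaped matrix column by column, while NumPy fills it row by row, so each block ends up as a row instead of a column. A rough sketch, with block_rates as a made-up name and the question's sample rows as stand-in data:

```python
import numpy

def block_rates(data, num=3):
    """Per-block score total divided by the block's elapsed time."""
    data = numpy.asarray(data, dtype=float)
    # Discard trailing rows that do not fill a complete block.
    usable = (data.shape[0] // num) * num
    data = data[:usable]
    # NumPy reshapes row by row, so each row of the result is one block.
    sums = data[:, 2].reshape(-1, num).sum(axis=1)
    times = data[:, 1].reshape(-1, num)
    # Elapsed time per block: last time stamp minus first.
    return sums / (times[:, -1] - times[:, 0])

# Columns: sample number, accumulated time, score (rows from the question).
sample = [[1, 10, 123], [2, 11, 22], [3, 12, 22],
          [4, 50, 55], [5, 60, 22], [6, 70, 66]]
print(block_rates(sample))         # blocks of 3
print(block_rates(sample, num=2))  # blocks of 2
```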
Amro
+1: I've never come across accumarray before, much neater solution than mine. Thanks @Amro.
High Performance Mark
+1: Still, it hurts my eyes!
Jacob
Thank you :-) ..
Jessy