This is a follow-up to a question I asked earlier. With the help of some people here I was able to start on the function I want to write, but I have not completed it yet. Here is my earlier question: I have a series of files with the extension .msr. They contain measured numerical values of more than ten parameters (date, time, temperature, pressure, ...), separated by semicolons. Example data values are shown below.

2010-03-03 15:55:06; 8.01; 24.9; 14.52; 0.09; 84; 12.47;
2010-03-03 15:55:10; 31.81; 24.9; 14.51; 0.08; 82; 12.40;
2010-03-03 15:55:14; 45.19; 24.9; 14.52; 0.08; 86; 12.32;
2010-03-03 15:55:17; 63.09; 24.9; 14.51; 0.07; 84; 12.24;

The files are named REG_2010-03-03, REG_2010-03-04, REG_2010-03-05, ... and they are all contained in a single directory.

  1. I want to extract from each file the date (in this case 2010-03-03), column 3, and column 6.
  2. Find the mean of each of columns 3 and 6.
  3. Store the results in a new file that contains only the date and the calculated means of those columns, for further analysis.
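To make the column convention concrete, here is a minimal C sketch of parsing one such line and averaging over a file. It assumes fields are split on `;` and counted from 1, so that field 1 is the timestamp, field 3 is 24.9 and field 6 is 84 in the first sample line. The helper names `parse_msr_line` and `mean_of_columns` and the 256-byte line limit are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse one record such as
 * "2010-03-03 15:55:06; 8.01; 24.9; 14.52; 0.09; 84; 12.47;"
 * Assumption: fields are split on ';' and counted from 1.
 * Limitation: strtok() collapses empty fields (";;") and is not reentrant.
 * Returns 0 on success, -1 if the line is too long or has fewer than 6 fields.
 * The outputs are only written on success. */
int parse_msr_line(const char *line, char date[11], double *col3, double *col6)
{
    char copy[256];
    char day[11] = "";
    double v3 = 0.0, v6 = 0.0;
    char *field;
    int index = 0;

    assert(line != NULL && date != NULL && col3 != NULL && col6 != NULL);
    if (strlen(line) >= sizeof copy)
        return -1;
    strcpy(copy, line);             /* strtok() modifies its input */

    for (field = strtok(copy, ";"); field != NULL; field = strtok(NULL, ";")) {
        ++index;
        if (index == 1) {
            /* keep only the "YYYY-MM-DD" part, drop the time */
            if (sscanf(field, "%10s", day) != 1)
                return -1;
        } else if (index == 3) {
            v3 = strtod(field, NULL);
        } else if (index == 6) {
            v6 = strtod(field, NULL);
        }
    }
    if (index < 6)
        return -1;

    strcpy(date, day);
    *col3 = v3;
    *col6 = v6;
    return 0;
}

/* Hypothetical helper: average fields 3 and 6 over every valid line of an
 * already opened stream. Returns 0 on success, -1 if no valid line was read. */
int mean_of_columns(FILE *in, char date[11], double *mean3, double *mean6)
{
    char line[256];
    double sum3 = 0.0, sum6 = 0.0, v3, v6;
    long n = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (parse_msr_line(line, date, &v3, &v6) != 0)
            continue;               /* skip blank or malformed lines */
        sum3 += v3;
        sum6 += v6;
        ++n;
    }
    if (n == 0)
        return -1;
    *mean3 = sum3 / n;
    *mean6 = sum6 / n;
    return 0;
}
```

If columns are meant to be counted differently (for example, with date and time as separate columns), only the two index checks need to change.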

My question now: I want to open the directory that contains 30 files with the .msr extension. For each file inside it, I want to extract the information described above and store the date (uniform within each file) and the mean values of columns 3 and 6 in a single output file. The destination file will thus contain three space-separated columns on each line (the date, the mean of the 3rd column, and the mean of the 6th column), for a total of 30 rows. Below is the code I started with; I would appreciate your guidance on how to implement this.

Here is the outline of what I want to achieve:

1) Open the directory that contains the files (here it is a USB key).
2) Read all the msr filenames inside it.
3) Open each msr file.
4) Extract the date (it is the first column in the file), ignore the time and the separator.
5) Extract data 1 (data at the 3rd column).
6) Extract data 2 (data at the 6th column).
7) Calculate the mean for the 3rd column and the 6th column.
8) Output to file (date, mean 3rd column, mean 6th column).
9) Close the msr files.
10) Close the directory (if possible).
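Steps 1, 2 and 10 of the outline can be sketched as follows. This assumes a POSIX system, where `opendir`, `readdir` and `closedir` from `<dirent.h>` are available; on other platforms a different directory API would be needed. The helper names `has_msr_extension` and `for_each_msr_file`, the callback shape, and the 1024-byte path buffer are hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if name ends with ".msr", otherwise 0. */
int has_msr_extension(const char *name)
{
    size_t len;

    assert(name != NULL);
    len = strlen(name);
    return len > 4 && strcmp(name + len - 4, ".msr") == 0;
}

/* Hypothetical helper: calls handle(path, user_data) once for every *.msr
 * entry in dir_path. Returns the number of files visited, or -1 if the
 * directory cannot be opened. */
int for_each_msr_file(const char *dir_path,
                      void (*handle)(const char *path, void *user_data),
                      void *user_data)
{
    DIR *dir = opendir(dir_path);
    struct dirent *entry;
    char path[1024];
    int count = 0;

    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        int n;

        if (!has_msr_extension(entry->d_name))
            continue;               /* also skips "." and ".." */

        n = snprintf(path, sizeof path, "%s/%s", dir_path, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;               /* path would not fit, skip it */

        handle(path, user_data);
        ++count;
    }

    closedir(dir);
    return count;
}
```

The callback would open each path, compute the two means and append one row to the output file. Note that `readdir` makes no promise about ordering, so the names would have to be collected and sorted first if the rows must appear in date order.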

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Reads the file path+infile, averages columns 3 and 6 (fields split on ';',
   counted from 1) and appends one line "date mean3 mean6" to outfile.
   strline must point to a buffer of at least BUFSIZ bytes.
   Returns 0 on success, -1 on failure. */
int file_getline_analyse(char *infile, char *outfile, char *path, char *strline) {

    int return_value = -1;

    FILE *fd = NULL;    // pointer for data source
    FILE *fo = NULL;    // destination file
    char *file_path = NULL;

    char date[11] = ""; // "YYYY-MM-DD" plus terminating '\0'
    char *tmp;
    double sum3 = 0, sum6 = 0;
    long n = 0;         // number of valid data lines read

    // +1 for the terminating '\0'; the element size is sizeof(char), not sizeof(char *)
    file_path = calloc(strlen(path) + strlen(infile) + 1, sizeof *file_path);
    if (file_path == NULL) {
        printf("file_path in get_line\n");
        exit(EXIT_FAILURE);
    }

    strcpy(file_path, path);    // copies the path into the allocated memory
    strcat(file_path, infile);  // appends the source file name to the path

    fd = fopen(file_path, "r");
    fo = fopen(outfile, "a");   // append, so that each input file adds one row

    // both files must be open before we continue
    if ((fd != NULL) && (fo != NULL)) {
        while (fgets(strline, BUFSIZ, fd) != NULL) {
            int i = 0;          // field index within the current line
            char day[11] = "";
            double v3 = 0, v6 = 0;

            for (tmp = strtok(strline, ";"); tmp != NULL; tmp = strtok(NULL, ";")) {
                ++i;
                if (i == 1) {
                    sscanf(tmp, "%10s", day);   // keep the date, skip the time
                } else if (i == 3) {
                    v3 = strtod(tmp, NULL);
                } else if (i == 6) {
                    v6 = strtod(tmp, NULL);
                }
            }

            // only count lines that really contain six or more fields
            if (i >= 6 && day[0] != '\0') {
                strcpy(date, day);
                sum3 += v3;
                sum6 += v6;
                ++n;
            }
        }

        if (n > 0) {
            fprintf(fo, "%s %.2f %.2f\n", date, sum3 / n, sum6 / n);
            return_value = 0;
        }
    }

    if (fd != NULL) fclose(fd);
    if (fo != NULL) fclose(fo);

    free(file_path);
    file_path = NULL;

    return return_value;
}
A: 

If you don't need it to be in C, I would choose another language, for example Perl:

sub analyze($) {
  my ($fname) = @_;
  my ($date, $sum3, $sum6, $n) = (undef, 0, 0, 0);

  open(F, "<", $fname) or die "$fname: $!";
  while (defined(my $line = <F>)) {
    my @words = split(m";", $line);
    $date = (split(" ", $words[0]))[0]; # only use the date, not the time
    $sum3 += $words[2];
    $sum6 += $words[5];
    $n++;
  }
  close(F) or die "$fname: $!";
  printf("%s;%f;%f\n", $date, $sum3 / $n, $sum6 / $n);
}

foreach my $fname (@ARGV) {
  analyze($fname);
}

In C, you are missing handy features such as:

  • Automatic memory management
  • Easy support for strings, for example concatenation, splitting
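To illustrate the point about strings, here is a minimal sketch of what a simple concatenation looks like in C when the memory has to be managed by hand. The helper name `join_path` is hypothetical. The two details that are easy to get wrong are the extra byte for the terminating `'\0'` and the fact that the caller has to `free` the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding
 * path followed by name, or NULL if the allocation fails.
 * The caller owns the result and must free() it. */
char *join_path(const char *path, const char *name)
{
    size_t path_len, name_len;
    char *joined;

    assert(path != NULL && name != NULL);
    path_len = strlen(path);
    name_len = strlen(name);

    joined = malloc(path_len + name_len + 1);   /* +1 for the '\0' */
    if (joined == NULL)
        return NULL;

    memcpy(joined, path, path_len);
    memcpy(joined + path_len, name, name_len + 1);  /* copies the '\0' too */
    return joined;
}
```

In Perl the same operation is a single `.` operator with no allocation or cleanup to think about.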
Roland Illig
Thank you Roland, I understand your points perfectly well, but I am using this to add some functions to an already existing C project, so it has to be in C. The files in question are on a USB key, so I have to open the USB key directory, find the directory that contains the files, and then extract the contents as described earlier. That's all.
chriscol