I need to read in data files which look like this:

* SZA: 10.00
 2.648  2.648  2.648  2.648  2.648  2.648  2.648  2.649  2.650  2.650 
 2.652  2.653  2.652  2.653  2.654  2.654  2.654  2.654  2.654  2.654 
 2.654  2.654  2.654  2.655  2.656  2.656  2.657  2.657  2.657  2.656 
 2.656  2.655  2.655  2.653  2.653  2.653  2.654  2.658  2.669  2.669 
 2.667  2.666  2.666  2.664  2.663  2.663  2.663  2.662  2.663  2.663 
 2.663  2.663  2.663  2.663  2.662  2.660  2.656  2.657  2.657  2.657 
 2.654  2.653  2.652  2.651  2.648  2.647  2.646  2.642  2.641  2.637 
 2.636  2.636  2.634  2.635  2.635  2.635  2.635  2.634  2.633  2.633 
 2.633  2.634  2.634  2.635  2.637  2.638  2.637  2.639  2.640  2.640 
 2.639  2.640  2.640  2.639  2.639  2.638  2.640  2.640  2.638  2.639 
 2.638  2.638  2.638  2.638  2.637  2.637  2.637  2.634  2.635  2.636 
 2.637  2.639  2.641  2.641  2.643  2.643  2.643  2.642  2.643  2.642 
 2.641  2.642  2.642  2.643  2.645  2.645  2.645  2.645

So now I wonder what would be the most elegant way to read this file into an array of floats. I know how to read each single line into a string, and I know how to convert the string to a float using atof(). But what is the easiest way to do the rest? I've heard about string buffers; might they help me?

Any help is greatly appreciated :)

+10  A: 

Since this is tagged as C++, the most obvious way would be using streams. Off the top of my head, something like this might do:

std::vector<float> readFile(std::istream& is)
{
  char chdummy;
  is >> std::ws >> chdummy >> std::ws; 
  if(!is || chdummy != '*') error();
  std::string strdummy;
  std::getline(is,strdummy,':');
  if(!is || strdummy != "SZA") error();

  std::vector<float> result;
  for(;;)
  {
    float number;
    if( !(is >> number) ) break; // parentheses needed: ! binds tighter than >>
    result.push_back(number);
  }
  if( !is.eof() ) error();

  return result;
}

Why float, BTW? Usually, double is much better.

Edit, since it was questioned whether returning a copy of the vector is a good idea:

For a first solution, I'd certainly do the obvious. The function is reading a file into a vector, and the most obvious thing for a function to do is to return its result. Whether this results in a noticeable slowdown depends on a lot of things (the size of the vector, how often the function is called and from where, the speed of the disk this reads from, whether the compiler can apply RVO). I wouldn't want to spoil the obvious solution with an optimization, but if profiling indeed shows that this is too slow, the vector should be passed in by non-const reference.

(Also note that C++1x with rvalue support, hopefully soon to be available by means of a compiler near you, will render this discussion moot, as it will prevent the vector from being copied upon returning from the function.)

sbi
The generic "read all floats"-loop would be `float number; while (is >> number) result.push_back(number);`
sth
Though yours is of course equivalent.
sth
@sth: Indeed, that's more terse, although I don't like the `number` variable "leaking" out of the loop.
sbi
@andreash: Um, I just noted the `10.00` from your example seems to be part of the header. Sorry, I overlooked that. But you seemed to have been able to abstract from that... `:)`
sbi
You can avoid the loop by using std::copy: 'std::copy( std::istream_iterator<float>(is), std::istream_iterator<float>(), std::back_inserter( result ) );'
David Rodríguez - dribeas
dribeas: Though sadly it's questionable that the "loopless" code is an improvement from a readability standpoint.
quark
sbi: Do you really want to be returning a full vector by copy?
quark
the back inserter is the way to go
EvilTeach
@quark: Thanks for the question! I've added a paragraph discussing this.
sbi
@dribeas: Yes, I could have done this. But, frankly, I also don't see how this improves readability. Plus, wherever I have worked in the last decade, this would have required just about everyone else to look it up (or ask me) in order to understand it. If I were tempted to make this any terser, I'd go for sth's loop, but no further. (Well, then again, if the `std::copy` were hidden in a nicely named function, maybe the ability of others to immediately understand it would not be as badly hampered, and would be outweighed by their chance to learn something...)
sbi
I think you forgot to return something.
Viktor Sehr
@Viktor: Indeed. Thanks for pointing it out. I fixed it.
sbi
A: 

I would do something like this:

std::ifstream input("input.txt");
std::vector<float> floats;
std::string header;
std::getline(input, header); // read in the "* SZA: 10.00" line
if(header_is_correct(header)) {
    float value;
    // while we could successfully read in a float from the file...
    while(input >> value) {
        // store it in the vector.
        floats.push_back(value);
    }
}

NOTE: header_is_correct(header) is just a placeholder; you will need to implement whatever error checking you want for that first line yourself.
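
As a rough idea of what such a check could look like, here is a minimal sketch. It assumes that a valid header always has the form "* SZA: <number>", which is a guess based on the single sample file in the question; adjust it to whatever your files really contain.

```cpp
#include <sstream>
#include <string>

// One possible header_is_correct: accepts lines of the form "* SZA: <number>".
// What counts as a valid header is an assumption based on the sample file.
bool header_is_correct(const std::string& header)
{
    std::istringstream iss(header);
    std::string star, label;
    float angle;
    // operator>> splits on whitespace: "*", then "SZA:", then the number
    if (!(iss >> star >> label >> angle)) return false;
    return star == "*" && label == "SZA:";
}
```

If the angle value is needed later, the function could return it through an extra reference parameter instead of discarding it.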

Evan Teran
why the downvote? I've tested this and it correctly reads each float from the file into a vector.
Evan Teran
+6  A: 
Beh Tou Cheh
interesting, though you should be clear that this is only for c++0x compilers.
Evan Teran
Very true, but the lambda can just as easily be placed into a struct-style predicate. I thought that for style and future reference (let's face it, 1-2 years from now the above code and the like will be the norm) it would be a good idea to show a different view of how things can be done.
Beh Tou Cheh
I liked this. Nice usage of the new lambdas, even if this can't be the answer.
Edison Gustavo Muenz
+1 for using lambda and strtk
Viktor Sehr
A: 

Simple solution using STL algorithms:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

struct data
{
   float first; // in case it is required, and assuming it is 
                // different from the rest
   std::vector<float> values;
};

data read_file( std::istream& in )
{
   std::string tmp;
   data d;
   in >> tmp >> tmp >> d.first; // skips "*" and "SZA:", then reads the header value
   if ( !in ) throw std::runtime_error( "Failed to parse line" );

   std::copy( std::istream_iterator<float>( in ), std::istream_iterator<float>(),
         std::back_inserter( d.values ) );

   return d;
}

If you really need to use an array, you must first allocate it (either dynamically or statically if you know the size), and then you can use the same copy algorithm:

// parsing the first line would be equivalent
float data[128]; // assuming 128 elements known at compile time
std::copy( std::istream_iterator<float>(is), std::istream_iterator<float>(), 
      data );

But I would recommend using std::vector even in this case. If you need to pass the data into a function that takes an array, you can always pass a pointer to the first element:

void f( float* data, int size );
int main()
{
   std::vector<float> v; // and populate
   f( &v[0], v.size() ); // memory is guaranteed to be contiguous
}
David Rodríguez - dribeas