Hi all,

I suppose this might be a simple question for all the gurus here, but somehow I couldn't figure out the answer.

I want to be able to write CSV cells to a stream as simply as this:

stream << 1 << 2 << "Tom" << std::endl;

which would create output like 1,2,Tom. How can I achieve that? I figured that I need to create a custom streambuf (I don't think the stream level is the right place to do it; it would be a real pain to overload << for all the types), but I'm not sure how << is normally implemented. Does it call put, or write, or something else? Should I override those? Or did I miss something completely?

I'd appreciate any help :)

Cheers,

A: 

If A is an iterator over elements...

copy(A, A + N, ostream_iterator<int>(cout, ","));
Chris H
Hm, I'm not sure how this would help me. I don't have a container for the values being written to the stream; they are written as they are generated. In addition, the iterator is for a specific type, but I'd like to write anything in the same manner as I would with a normal stream.
Tom
+9  A: 

Getting something like 98% of the way there isn't terribly difficult:

#include <iostream>

class add_comma { 
    std::ostream &os;
    bool begin;
    typedef add_comma &ref;
public:
    add_comma(std::ostream &o) : os(o), begin(true) {}

    template <class T>
    ref operator<<(T const &t) { 
        if (!begin)
            os << ",";
        os << "\"" << t << "\"";
        begin = false;
        return *this;
    }

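    // Unparameterized manipulators (std::endl, std::flush, ...) land here.
    ref operator<<(std::ostream &(*manip)(std::ostream &)) {
        // The cast selects the narrow-char instantiation of the std::endl template.
        if (manip == static_cast<std::ostream &(*)(std::ostream &)>(std::endl))
            reset();
        manip(os);
        return *this;
    }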

    void reset() { begin = true; }

    // Non-null while the wrapped stream is usable (also compiles as C++11 and later).
    operator void *() { return os.fail() ? 0 : this; }
};

int main() { 
    add_comma a(std::cout);

    a << 1 << 2 << "This is a string" << std::endl;
    a << 3 << 4 << "Another string" << std::endl;
    return 0;
}

Edit: I've fixed the code to at least some degree -- it now only puts commas between items that are written, not at the beginning of a line. It only, however, recognizes "endl" as signaling the beginning of a new record -- a newline in a string literal, for example, won't work.

Jerry Coffin
Yes, I actually thought about this solution before, but it wouldn't work with manipulators (the other problems, commas everywhere and quoting the whole cell if it contains the delimiter, could easily be solved inside the overloaded operator function). Now that I think about it again, I suppose I could overload << for the manipulators as well. I'm just wondering if this is the right way to solve the problem :)
Tom
I've done a bit more work on it. It's now to the point that it's probably at least reasonably usable. It won't handle manipulators that take parameters, but that should just be a matter of adding (yet) another overload. Shouldn't be too horrible, but a pain anyway.
Jerry Coffin
This is a good solution. Suppose there was some alternative to defining a wrapper class (e.g., inheriting from `std::ostream` and overriding something, or some magic `csv_mode` manipulator). If you wanted to insert user-defined types you'd be in trouble. Custom `operator<<` methods are usually implemented using the built-in `operator<<` methods, so you'd end up with extra commas.
Dan
@Dan: True, but generally speaking, in a CSV file you pretty much need to break any aggregate type up into its constituent pieces. The usual target is something like a spreadsheet or database, where you'd (generally) want each of those pieces in a separate cell/field. Of course, if that's *not* what you wanted, fixing it is going to be just a bit interesting...
Jerry Coffin
@Dan: I remember being told here once that I shouldn't derive my own classes from std::*stream, as they were not intended for that purpose; rather, you should create a custom streambuf class. That's why I was asking whether this is a good solution :) @Jerry: Your edit was basically what I had in mind. I just need to create my own endl manipulator (or my own streambuf), as sometimes I need to use DOS line endings.
Tom
@Tom: I thought about a filtering streambuf, but in this case I don't think it'll work. A streambuf receives a stream of characters, but in this case you need to know about the individual items presented to the stream, so I think it almost has to work at the level of the stream.
Jerry Coffin
@Tom: very true. The only virtual method in `ostream` is the dtor, so deriving from it won't do you much good. When I wrote "Suppose..." I really was speaking hypothetically.
Dan
@Jerry: That was actually my question, more or less: how would I know about the individual items presented to the stream at the streambuf level? :) You put it in better words. But I guess I now have the solution I was looking for from you :). Cheers
Tom
@Dan: My bad, I somehow missed that :). Thanks for your help as well
Tom
+4  A: 

While I can appreciate the idea of overloading the stream operator, I would question the practice for the problem at hand.

1. Object-Oriented approach

If you are writing to a .csv file, then each line should probably have the very same format as the others? Unfortunately, your stream operator does not check this.

I think you need to create a Line object that will be streamable and will validate each field before writing it to the file (and write it in the proper format). While not as fashionable, this gives you a much better chance of achieving a robust implementation.

Let's say that (for example) you want to output 2 integers and a string:

class Line
{
public:
  Line(int foo, int bar, std::string firstName):
    mFoo(foo), mBar(bar), mFirstName(firstName) {}

  friend std::ostream& operator<<(std::ostream& out, const Line& line)
  {
    return out << line.mFoo << ',' << line.mBar << ','
               << line.mFirstName << std::endl;
  }
private:
  int mFoo;
  int mBar;
  std::string mFirstName;
};

And using it remains very simple:

std::cout << Line(1,3,"Tom") << Line(2,4,"John") << Line(3,5,"Edward");

2. Wanna have fun ?

Now, this may seem dull, and you could wish to play and yet still have some control over what is written... well, let me introduce template meta programming into the fray ;)

Here is the intended usage:

// Yeah, I could wrap this mpl_::vector bit... but it takes some work!
typedef CsvWriter< mpl_::vector<int,int,std::string> > csv_type;

csv_type(std::cout) << 1 << 3 << "Tom" << 2 << 4 << "John" << 3 << 5 << "Edward";

csv_type(std::cout) << 1 << 2 << 3; // Compile Time Error:
                                    // 3 is not convertible to std::string

Now that would be interesting right ? It would format the line and ensure a measure of validation... One could always complicate the design so that it does more (like registering validators for each field, or for the whole line, etc...) but it's already complicated enough.

// namespace mpl_ = boost::mpl

/// Sequence: MPL sequence
/// pos: mpl_::size_t<N>, position in the Sequence

namespace result_of {
  template <class Sequence, class pos> struct operator_in;
}

template < class Sequence, class pos = mpl_::size_t<0> >
class CsvWriter
{
public:
  typedef typename mpl_::at<Sequence,pos>::type current_type;
  typedef typename boost::call_traits<current_type>::param_type param_type;

  CsvWriter(std::ostream& out): mOut(out) {}

  typename result_of::operator_in<Sequence,pos>::type
  operator<<(param_type item)
  {
    typedef typename result_of::operator_in<Sequence,pos>::type result_type;

    if (pos::value != 0) mOut << ',';
    mOut << item;

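    // is_last_type lives in result_of::operator_in, not in the returned writer
    if (result_of::operator_in<Sequence,pos>::is_last_type::value) mOut << std::endl;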

    return result_type(mOut);
  }

private:
  std::ostream& mOut;
}; // class CsvWriter


/// Lil' bit of black magic
namespace result_of { // thanks Boost for the tip ;)

  template <class Sequence, class pos>
  struct operator_in
  {
    // True when pos is the last index of Sequence; the values are compared
    // because mpl_::size and mpl_::next may yield different constant types.
    typedef mpl_::bool_<
        (mpl_::size<Sequence>::value == mpl_::next<pos>::type::value)
      > is_last_type;

    typedef typename mpl_::if_<
      is_last_type,
      CsvWriter< Sequence, mpl_::size_t<0> >,
      CsvWriter< Sequence, typename mpl_::next<pos>::type >
    >::type type;
  }; // struct operator_in<Sequence,pos>

} // namespace result_of

Here you have a stream writer that ensures that the CSV file is properly formatted... barring newline characters in the strings ;)

Matthieu M.
For 1) Actually no :). I usually need to write CSV in a third-party format, and each line can have a different format (I'd say it's a kind of protocol; CSVs are used as "packets" :)). But I was speaking generally: I just wanted to throw something at this CSV object and have it handle proper CSV formatting... 2) Wow, that code looks really good, but I have to look up what it actually does :))
Tom