Hi, I'd like to read the whole content of a text file into a std::string object with C++.

With Python, I can write:

text = open("text.txt", "rt").read()

It is very simple and elegant. I hate ugly stuff, so I'd like to know: what is the most elegant way to read a text file with C++? Thanks.

+3  A: 

You seem to treat elegance as simply meaning "little code". This is of course subjective to some extent. Some would say that omitting all error handling isn't very elegant. Some would say that clear and compact code you understand right away is elegant.

Write your own one-liner function/method that reads the file contents, but make it rigorous and safe beneath the surface, and you will have covered both aspects of elegance.
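
As a minimal sketch of that idea: the helper below hides the stream handling behind one call. The name read_file and the choice to throw std::runtime_error are assumptions made for illustration, not something from this thread, and the only error it handles is a file that cannot be opened.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (name and error policy are assumptions):
// returns the whole file as a string, or throws if the file cannot be opened.
std::string read_file(const std::string& path)
{
    // Binary mode so the bytes arrive untouched (no newline translation).
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    // The extra parentheses around the first argument keep this from being
    // parsed as a function declaration.
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}
```

The call site is then a one-liner close to the Python version: std::string text = read_file("text.txt");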

All the best

/Robert

sharkin
Corollary: Elegance is as elegance does; notions of elegant code differ between languages and paradigms. What a C++ programmer might consider elegant could be horrific for a Ruby or Python programmer, and vice-versa.
Rob
+36  A: 

There are many ways; pick whichever is the most elegant for you.

Reading into char*:

ifstream file ("file.txt", ios::in|ios::binary|ios::ate);
if (file.is_open())
{
    std::streamsize size = file.tellg();  // opened with ios::ate, so the position is the file size
    char *contents = new char [size];
    file.seekg (0, ios::beg);
    file.read (contents, size);
    file.close();
    //... do something with it
    delete [] contents;
}

Into std::string:

std::ifstream in("file.txt");
std::string contents((std::istreambuf_iterator<char>(in)), 
    std::istreambuf_iterator<char>());

Into vector<char>:

std::ifstream in("file.txt");
std::vector<char> contents((std::istreambuf_iterator<char>(in)),
    std::istreambuf_iterator<char>());

Into string, using stringstream:

std::ifstream in("file.txt");
std::stringstream buffer;
buffer << in.rdbuf();
std::string contents(buffer.str());

file.txt is just an example. Everything works fine for binary files as well; just make sure you pass ios::binary to the ifstream constructor.
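
A possible variant of the first snippet, sketched here with std::vector<char> owning the buffer so that no delete[] is needed. The function name read_bytes and the choice to return an empty vector on failure are assumptions for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: reads a file into a vector<char>.
// Returns an empty vector if the file cannot be opened or is empty.
std::vector<char> read_bytes(const char* path)
{
    // ios::ate positions the stream at the end, so tellg() yields the size.
    std::ifstream file(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!file.is_open())
        return std::vector<char>();

    const std::streamoff size = file.tellg();
    if (size <= 0)
        return std::vector<char>();

    std::vector<char> contents(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);
    file.read(&contents[0], size);

    // Keep only what was actually read, in case the read came up short.
    contents.resize(static_cast<std::size_t>(file.gcount()));
    return contents;
}
```

The vector releases its memory automatically when it goes out of scope, including when an exception is thrown while the contents are being processed.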

Milan Babuškov
I like your answer even better than mine, which is not something I say often. Good job! +1
Chris Jester-Young
you actually need an extra set of parentheses around the first argument to contents' constructor with istreambuf_iterator<> to prevent it from being treated as a function declaration.
Greg Rogers
@Greg: thanks, I fixed it now.
Milan Babuškov
delete [] missing from char* version?
Shadow2531
memblock in the first version should probably be contents.
Roskoto
@Shadow2531: I figured it should not be deleted until you're done doing something with it.
Milan Babuškov
@Roskoto: thanks, I fixed it.
Milan Babuškov
@Milan: Understood and thanks for clarifying.
Shadow2531
FWIW, I think ios_base format flags like 'binary' and 'ate' etc. should be referenced by ios_base::binary and ios_base::ate etc. I think using ios::binary and ios::ate etc. is deprecated.
Shadow2531
@Shadow2531: I tried with fairly recent GCC (4.2.3) but it does not give any deprecation warning. Care to give some URL that talks about it?
Milan Babuškov
'Deprecated' might not be the correct term. But, I've been told that ios_base::binary is the proper way and that ios::binary is a left-over pre-Standardization.
Shadow2531
To find out for sure, I think you'd have to look in a copy of ISO/IEC 14882. However, fwiw, binary and such is defined under the ios_base class in include\c++\4.2.1-dw2\bits\ios_base.h
Shadow2531
Ferruccio
@Ferruccio: please see above comment by Greg Rogers.
Milan Babuškov
Ah. I see. Thanks.
Ferruccio
Use std::vector<char> contents(size) rather than char* contents;
Martin York
A: 
Shadow2531
+3  A: 

There's another thread on this subject.

My solutions from this thread (both one-liners):

The nice (see Milan's second solution):

string str((istreambuf_iterator<char>(ifs)), istreambuf_iterator<char>());

and the fast:

string str(static_cast<stringstream const&>(stringstream() << ifs.rdbuf()).str());
Konrad Rudolph
+1  A: 

But beware that a C++ string (more concretely, an STL string) is no more capable than a C string of holding a string of arbitrary length!

Take a look at the member max_size(), which gives you the maximum number of characters a string might contain. This is an implementation-defined number and may not be portable across platforms. Visual Studio gives a value of about 4 GB for strings, others might give you only 64 KB, and on 64-bit platforms it might give you something really huge! It depends, and of course you will normally run into a bad_alloc exception due to memory exhaustion long before reaching the 4 GB limit...

BTW: max_size() is a member of other STL containers as well! It gives you the maximum number of elements (of the type for which you instantiated the container) that the container will theoretically be able to hold.

So, if you're reading from a file of unknown origin you should:
- Check its size and make sure it's smaller than max_size()
- Catch and process bad_alloc exceptions
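
The two checks above could be sketched roughly as follows. The function name read_checked and the bool return convention are assumptions for illustration, and the size is taken from tellg() on a stream opened in binary mode.

```cpp
#include <fstream>
#include <new>
#include <string>

// Hypothetical helper: fills `out` and returns true on success. Returns false
// (leaving `out` untouched) if the file cannot be opened, is larger than a
// std::string can hold, or memory runs out.
bool read_checked(const char* path, std::string& out)
{
    std::ifstream in(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!in)
        return false;

    const std::streamoff size = in.tellg();
    if (size < 0)
        return false;

    // Check 1: the file must fit into a string at all.
    if (static_cast<unsigned long long>(size) > out.max_size())
        return false;

    try {
        std::string buffer(static_cast<std::string::size_type>(size), '\0');
        in.seekg(0, std::ios::beg);
        if (size > 0) {
            in.read(&buffer[0], size);
            buffer.resize(static_cast<std::string::size_type>(in.gcount()));
        }
        out.swap(buffer);
    } catch (const std::bad_alloc&) {
        // Check 2: memory was exhausted before max_size() was reached.
        return false;
    }
    return true;
}
```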

And another point: why are you keen on reading the file into a string? I would expect you to process it further, by incrementally parsing it or something, right? So instead of reading it into a string you might as well read it into a stringstream (which basically is just some syntactic sugar for a string) and do the processing there. But then you could do the processing directly from the file as well, because if properly programmed, the stringstream can be seamlessly replaced by a filestream, i.e. by the file itself, or by any other input stream. They all share the same members and operators and can thus be interchanged!
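
A small sketch of that interchangeability: a function written against std::istream& accepts a file stream and a string stream alike. The function count_lines is a hypothetical stand-in for whatever processing is actually needed.

```cpp
#include <cstddef>
#include <fstream>   // std::ifstream, for the file-based usage
#include <istream>
#include <sstream>   // std::istringstream, for the string-based usage
#include <string>

// Hypothetical processing step: counts the lines of any input stream.
std::size_t count_lines(std::istream& in)
{
    std::size_t lines = 0;
    std::string line;
    while (std::getline(in, line))
        ++lines;
    return lines;
}

// Usage: the same function works on either kind of stream.
//   std::ifstream file("file.txt");
//   std::size_t a = count_lines(file);
//   std::istringstream text("first\nsecond\n");
//   std::size_t b = count_lines(text);
```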

And for the processing itself: there's also a lot you can have the compiler automate! E.g. let's say you want to tokenize the string. With a proper template defined, the following actions:
- Reading from a file (or a string or any other input stream)
- Tokenizing the content
- Pushing all found tokens into an STL container
- Sorting the tokens alphabetically
- Eliminating any duplicate values
can all(!!) be achieved in one single(!) line of C++ code (leaving aside the template itself and the error handling)! It's just a single call to the function std::copy()! Just google for "token iterator" and you'll get an idea of what I mean. So this appears to me to be even more "elegant" than just reading from a file...
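
A rough sketch of the single std::copy() call, with one substitution stated plainly: instead of a custom token iterator it uses the standard std::istream_iterator<std::string>, which splits on whitespace only, and it copies into a std::set, which keeps the tokens sorted and drops duplicates. The function name unique_tokens is an assumption for illustration.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <set>
#include <sstream>   // std::istringstream, for the usage shown below
#include <string>

// Reads whitespace-separated tokens from any input stream into a std::set.
// The set orders the tokens alphabetically and stores each one only once.
std::set<std::string> unique_tokens(std::istream& in)
{
    std::set<std::string> tokens;
    std::copy(std::istream_iterator<std::string>(in),
              std::istream_iterator<std::string>(),
              std::inserter(tokens, tokens.end()));
    return tokens;
}

// Usage:
//   std::istringstream in("pear apple pear fig");
//   std::set<std::string> t = unique_tokens(in);   // holds apple, fig, pear
```

Splitting on other delimiters would need a custom iterator or a different tokenizer, which this sketch does not attempt.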

Don Pedro