views:

582

answers:

9

I have a function that reads lines from a log file, converts these lines to a certain class and returns a STL list of instances of this class.

My question is: how should I declare this function so that the whole list is NOT copied when attributing it to the caller? Without loss of generality, assume:

list<Request> requests = log_manipulator.getAsRequestList();

How should I declare getAsRequestList()? Returning a reference to a list or just returning a list?

This is a serious issue because in this particular assignment the lists will contain circa 1.5M elements, and thus a mistake like that can screw up memory usage.

Thanks in advance.

+1  A: 

A pointer to a list or you can rewrite it as:

list<request> requests;
log_manipulator.populate_list(&requests);
jonnii
this is currently the best way to do this with c++03 or below. c++0x will allow you to define move constructors that does exactly what you want here without copying, but the standard isn't finalized and you won't see it in common use for a few years.
Charles Ma
I dunno, I think move constructors will be the first things added to the STL containers.
GMan
A: 

Declare is as

void getAsRequestList(list<Request>* requests);

And call it as

list<Request> requests;
log_manipulator.getAsRequestList(&requests);
Igor Krivokon
Why was it downvoted?
Igor Krivokon
Upvoted to compensate for unexplainable downvote (not mine). I'm not a big fan of pointers, but I don't think there is anything wrong with this answer.
Rafael Almeida
Using of pointer vs. (non-const) reference was discussed many times. Pro-reference: you can't pass NULL pointer. Pro-pointer - your values never change "magically". In our company, we accepted a standard that prohibits passing of non-const references.
Igor Krivokon
+11  A: 

Returning a reference is not advisable, and returning the list object would cause copying. Best would be to change the method's signature to accept and populate a list reference:

list<Request> requests;
log_manipulator.getRequestListByRef(requests);

with void getRequestListByRef(list<Request>&) as the method's signature.

Alex Martelli
This is the way to do it! You just have to love in/out parameters. Especially with large containers of some sort :)
Magnus Skog
This works great with the million-line-long input I have and is a little bit less 'ugly' than pointers, so I'm going with this one. (RVO is interesting, but, as explained below, not guaranteed, and this is a program-critical part).
Rafael Almeida
The wrong answer was accepted. Use output iterators instead.
John Dibling
+8  A: 

You have two easy options:

A return parameter

void getAsRequestList(list<Request>& requests);

...

list<Request> requests;
log_manipulator.getAsRequestList(requests);

An output iterator

template <class OutputIterator>
void getAsRequestList(OutputIterator dest);

...

list<Request> requests;
log_manipulator.getAsRequestList( 
   insert_iterator< list<Request> >(requests, requests.begin()) );


See also: How do I return hundreds of values from a C++ function?

Shog9
Note that you can avoid having to specify the insert_iterator type using the "inserter" free function: log_manipulator.getAsRequestList(inserter(requests, requests.begin()));. Or back_inserter(requests).
Steve Jessop
The output iterator is the way to go. Passing references to lists breaks genericity and just ain't cool.
John Dibling
+1  A: 

Return a local variable as a reference is a funny thing in C++,

list<Request>& requests = log_manipulator.getAsRequestList();

it is compiler dependent so works in one machine but not the other.

your best bet is to declare your getAsRequestList() this way

void getAsRequestList(list<Request>& requests) 
{
   // populate results in requests
}

or new your requests on heap with in getAsRequestList() and return a pointer

oykuo
What do you mean by "compiler dependant"? Don't you mean "undefined"? As in "don't do this"?
1800 INFORMATION
Sure, but dude didn't tell me to actually do it. In fact, I'm guessing that he was actually implicitly telling me not to.
Rafael Almeida
+5  A: 

You might be able to get away with returning a simple list - search for "Return Value Optimization" for details. Simply, the compiler is allowed to generate code that bypasses the costly copy constructor.

Only after trying this and not being satisfied with the results would I recommend the "populate" versions suggested in the other answers. It's uglier.

Of course, the caveat is that RVO is not guaranteed to happen.

aib
I agree that the reference passing is uglier. Also, this RVO fact is vary interesting (wikipedia has a respectable reference at http://en.wikipedia.org/wiki/Return_value_optimization). For this particular situation, where this is a critical point in the program, I'm going with the reference passing solution for certainty, but this is certainly a nice tip! Thanks!
Rafael Almeida
+3  A: 

Return a auto_ptr to the list:

auto_ptr<list<Request> > getAsRequestList()
{
    auto_ptr<list<Request> > list = new list<Request>();
    // populate list
    return list;
}
Eclipse
Interesting approach!
Rafael Almeida
and a good one +1
markh44
It's essentially what C# and Java would do (albeit with a more simplistic garbage collection system).
Eclipse
+1  A: 

Pass as in/out parameter by reference is definately my choice.

But just to give an alternative you could pass an iterator. Normally you would have to pre-size the container so that assignment by the iterator goes into a pre-allocated slot but the STL also has a back inserting iterator that you could use.

#include <iostream>
#include <iterator>
#include <list>


template<typename T>
void FillMyContainer(T inserter)
{
    for(int loop=0;loop < 10;++loop)
    {
        (*inserter)    = loop;
        ++inserter;
    }
}

int main()
{
    /*
     * Example of back_inserter in use.
     */
    std::list<int>      data;
    std::copy(std::istream_iterator<int>(std::cin),
              std::istream_iterator<int>(),
              std::back_inserter(data));

    /*
     * Using back_inserter in the context of the question
     */
    std::list<int>      fill;
    FillMyContainer(std::back_inserter(fill));
}

So the question now becomes why does log_manipulator not have methods to return an iterator rather than a list. Then your underlying implementation does not need to rely on a specific container type?

Martin York
A: 

Since you have mentioned about memory usage, Have you considering treating log_manipulator as 'iterator' or treat log_manipulator as container and write an iteator (e.g. log_iter) which iterates over it? Such that iterator operator++ will result in reading and parsing the next line from the log file and operator * will return a current 'request object'. As long as the iterator corresponds to STL iterator semantics, you can use any STL algorithms on it.

For example, You can then easily convert it to a vector using

std::vector<Request> reqvector;
std::copy(log_manipulator.begin(), log_manipulator.end(), std::back_inserter(reqvector))

(assuming begin(),end() function return a 'log_iter')

Check Writing Your Own Iterators article from Dr. Dobb's journal

Nitin Bhide