I'm new to C++ and am learning it by converting a Java program to C++. The following code causes a segmentation fault (SIGSEGV) when executed.

//add web page reference to pages queue (STL)
void CrawlerQueue::addWebPage(WebPage & webpage) {
    pagesBuffer.push(webpage);
}

//remove and return web page reference from pages queue
WebPage & CrawlerQueue::getWebPage() {
    if (pagesBuffer.size() > 0) {
        WebPage & page = pagesBuffer.front();
        pagesBuffer.pop();
        return page;
    } else
        throw "Web pages queue is empty!";
}

//code that results in segmentation fault when called
void PageParser::extractLinks(){ 
    try {
        WebPage &  page =  crawlerqueue.getWebPage();
    }catch (const char * error) {
       return;
    }
}

The changes to the above code that fix the segmentation fault are marked with <====:

//return a const WebPage object instead of a WebPage reference
const WebPage CrawlerQueue::getWebPage() {          <====
    if (pagesBuffer.size() > 0) {
        WebPage page = pagesBuffer.front();         <==== 
        pagesBuffer.pop();
        return page;
    } else
        throw "Web pages queue is empty!";
}

//no segmentation fault occurs with these modifications
void PageParser::extractLinks(){ 
    try {
        WebPage page =  crawlerqueue.getWebPage(); <====
    }catch (const char * error) {
       return;
    }
}

What gives? I'm still trying to understand references and pointers.

+1  A: 

A reference (like a pointer) refers to a piece of data somewhere. When you had the version of getWebPage() that returned a reference, that reference referred to an element stored inside pagesBuffer. When you ran pop() after that, you removed that element from the queue and destroyed it, but your reference still referred to it, so it became a dangling reference. Using a dangling reference is undefined behavior, which in your case showed up as a segmentation fault.

When you modified your code to return by value, you made copies of the object that you were returning, so the copies were still around even after you ran pop().

(C++ isn't like Java, where a reference keeps an object from being deleted -- you have to manage that yourself.)
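The copy-before-pop pattern described above can be sketched as follows. This is only an illustration under stated assumptions: std::string stands in for WebPage, and takeFront is a hypothetical helper name that does not appear in the question.

```cpp
#include <cassert>
#include <queue>
#include <string>

// Hypothetical helper: copies the front element out *before* pop()
// destroys the queue's own copy, then returns that copy by value.
std::string takeFront(std::queue<std::string>& q) {
    assert(!q.empty());            // caller must not pass an empty queue
    std::string copy = q.front();  // independent copy, owned by this function
    q.pop();                       // destroys the element stored in the queue
    return copy;                   // still valid: it never lived in the queue
}

// By contrast, the following would dangle (do not do this):
//   std::string& ref = q.front();
//   q.pop();      // the element that ref refers to is destroyed here
//   return ref;   // undefined behavior as soon as the caller uses it
```

The returned string stays usable because it is a separate object from the one the queue destroyed.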

Ken Bloom
+3  A: 
pagesBuffer.pop();

This line invalidates your reference.

Remember that standard containers work with values, not "references", so when you add an object through a reference to it, you are in fact adding a copy of the object to the container.

Then, by calling pop(), you destroy that copy, making any reference or pointer to it invalid.

Maybe you should store (shared) pointers instead of the objects.
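A minimal sketch of that suggestion, assuming C++11's std::shared_ptr (boost::shared_ptr would look much the same). The WebPage struct below is a stand-in with a made-up url member, since the real class is not shown, and returning a null pointer for an empty queue is a choice made for this sketch rather than the question's behavior:

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Stand-in for the question's WebPage class (its real members are not shown).
struct WebPage {
    std::string url;
};

class CrawlerQueue {
public:
    void addWebPage(std::shared_ptr<WebPage> page) {
        assert(page != nullptr);  // null is reserved to mean "queue empty" below
        pagesBuffer.push(std::move(page));
    }

    // Returns a null pointer when the queue is empty (assumption of this sketch).
    std::shared_ptr<WebPage> getWebPage() {
        if (pagesBuffer.empty()) return nullptr;
        std::shared_ptr<WebPage> page = pagesBuffer.front();  // copies the pointer, not the page
        pagesBuffer.pop();  // destroys only the queue's pointer; the page survives
        return page;
    }

private:
    std::queue<std::shared_ptr<WebPage>> pagesBuffer;
};
```

Here pop() only destroys the queue's copy of the pointer. The WebPage itself lives until the last shared_ptr to it goes away.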

Klaim
I think one of the big differences between Java and C++ lies in understanding that in Java everything is a reference and in C++, unless specifically declared as such, everything is a value. However, Java references are more akin to C++ pointers than C++ references.
Craig W. Wright
@Craig: the **really** important part is understanding how garbage collection works with Java references so you don't have to think about object lifetimes, and how C++ doesn't have garbage collection forcing you to think about object lifetimes.
Ken Bloom
@Ken Bloom: You may need to think about object lifetime, but it is not a huge burden. A Java reference is basically equivalent to a boost::shared_ptr<T>. If you use this in your code you get the same basic functionality but still have the advantage of deterministic destruction and the ability to use RAII in your code.
Martin York
@Martin: I agree: object lifetime isn't a huge burden once you remember it's there and understand it. Coming from Java, though, it is an adjustment you have to make.
Ken Bloom
A: 

If you want to store values in the queue, your code needs to be changed:

WebPage  CrawlerQueue::getWebPage() {
    if (pagesBuffer.size() > 0) {
        WebPage  page = pagesBuffer.front();
        pagesBuffer.pop();
        return page;
    } else
        throw "Web pages queue is empty!";
}

When using C++ you need to have a very clear idea in your head about the differences between values, references and pointers. You should also be aware that it is extremely unlikely that a coding style that works in Java will work in C++ - the two languages have almost nothing in common except for some trivial syntactic similarities.

Also, never write code like this:

void PageParser::extractLinks(){ 
    try {
        WebPage &  page =  crawlerqueue.getWebPage();
    }catch (const char * error) {
       return;
    }
}

Silently swallowing exceptions is always a very bad idea, as (normally) is catching very near the throw site.
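One possible alternative, sketched under assumptions: PageQueue, extractLinks returning a count, and crawlOnce are hypothetical names, std::string stands in for WebPage, and std::runtime_error replaces the raw string literal that the original code throws.

```cpp
#include <cstddef>
#include <exception>
#include <queue>
#include <stdexcept>
#include <string>

// Hypothetical value-returning queue; std::string stands in for WebPage.
class PageQueue {
public:
    void add(const std::string& page) { pages.push(page); }

    std::string get() {
        if (pages.empty())
            throw std::runtime_error("Web pages queue is empty!");
        std::string page = pages.front();  // copy before pop()
        pages.pop();
        return page;
    }

private:
    std::queue<std::string> pages;
};

// No try/catch here: if the queue is empty, the exception propagates to a
// caller that is in a position to decide what to do about it.
std::size_t extractLinks(PageQueue& queue) {
    std::string page = queue.get();
    return page.size();  // placeholder for real link extraction
}

// A higher-level caller catches the failure and reports it.
bool crawlOnce(PageQueue& queue, std::string& errorOut) {
    try {
        extractLinks(queue);
        return true;
    } catch (const std::exception& e) {
        errorOut = e.what();  // surface the reason instead of discarding it
        return false;
    }
}
```

Throwing a type derived from std::exception lets callers catch it generically and still read the message through what().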

anon