Python has a function, urljoin, that takes a base URL and a second (possibly relative) URL and joins them intelligently. Is there a library that provides a similar function in C++?

urljoin documentation: http://docs.python.org/library/urlparse.html

And a Python example:

urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html') ==> 'http://www.cwi.nl/%7Eguido/FAQ.html'

+1  A: 

Short answer: not really.

You would have to parse the string and replace the tail. This would be fairly easy using, for example, boost::regex.

Joel
It's not that simple. See RFC 2396 section 5.2 for the full algorithm on how to resolve a relative URI.
Nicolás
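
To illustrate why replacing the tail is not enough, here is a minimal sketch of only the path-merging step of relative resolution (the part RFC 2396 section 5.2 spends most of its text on). The helper name resolve_path is hypothetical, it assumes the base path starts with '/', and it ignores scheme, authority, query and fragment handling, which a complete implementation would also need. Leftover ".." segments above the root are simply dropped, which is one of the behaviours the RFC permits.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: resolves a relative path against an absolute base path.
// Handles only the path component; no scheme, authority, query or fragment.
std::string resolve_path(const std::string &base_path, const std::string &rel_path)
{
    std::string merged;
    if (!rel_path.empty() && rel_path[0] == '/') {
        // An absolute reference path replaces the base path entirely.
        merged = rel_path;
    } else {
        // Keep the base path up to and including its last '/', then append.
        const std::string::size_type slash = base_path.rfind('/');
        merged = (slash == std::string::npos) ? std::string("/")
                                              : base_path.substr(0, slash + 1);
        merged += rel_path;
    }

    std::vector<std::string> out;   // segments that survive normalisation
    bool ends_with_slash = false;   // set when the final segment is "." or ".."
    std::string::size_type pos = 1; // skip the leading '/'
    for (;;) {
        const std::string::size_type next = merged.find('/', pos);
        const bool is_last = (next == std::string::npos);
        const std::string seg =
            merged.substr(pos, is_last ? std::string::npos : next - pos);

        if (seg == ".") {
            ends_with_slash = is_last;          // "." contributes nothing
        } else if (seg == "..") {
            if (!out.empty()) out.pop_back();   // step back one segment
            ends_with_slash = is_last;
        } else {
            out.push_back(seg);                 // ordinary (or empty) segment
        }

        if (is_last) break;
        pos = next + 1;
    }

    std::string result;
    for (std::vector<std::string>::size_type i = 0; i < out.size(); ++i) {
        result += '/';
        result += out[i];
    }
    if (ends_with_slash || result.empty()) result += '/';
    return result;
}
```

Even this covers only one step of the algorithm; a reference that carries its own scheme or authority, or that consists of just a query or fragment, needs separate handling before the paths are merged.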
That RFC 2396 is a bitch of a read.
CarbonAsh
A: 

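I figured it out. I used the uriparser library (http://uriparser.sourceforge.net/) and hastily implemented the function as follows. Error checking is sparse: any failure just returns an empty string.

#include <cstddef>
#include <string>
#include <vector>
#include <uriparser/Uri.h>

// Resolves `relative` against `base`; returns "" if parsing or resolving fails.
std::string urljoin(const std::string &base, const std::string &relative)
{
    UriParserStateA state;
    UriUriA uriOne;
    UriUriA uriTwo;

    state.uri = &uriOne;
    if (uriParseUriA(&state, base.c_str()) != URI_SUCCESS)
    {
        uriFreeUriMembersA(&uriOne);
        return "";
    }
    state.uri = &uriTwo;
    if (uriParseUriA(&state, relative.c_str()) != URI_SUCCESS)
    {
        uriFreeUriMembersA(&uriOne);
        uriFreeUriMembersA(&uriTwo);
        return "";
    }

    // The parsed inputs are no longer needed once the result has been built.
    UriUriA result;
    const int resolved = uriAddBaseUriA(&result, &uriTwo, &uriOne);
    uriFreeUriMembersA(&uriOne);
    uriFreeUriMembersA(&uriTwo);
    if (resolved != URI_SUCCESS)
    {
        uriFreeUriMembersA(&result);
        return "";
    }

    int charsRequired;
    if (uriToStringCharsRequiredA(&result, &charsRequired) != URI_SUCCESS)
    {
        uriFreeUriMembersA(&result);
        return "";
    }
    charsRequired++; // room for the terminating NUL

    // std::vector releases the buffer on every return path.
    std::vector<char> buf(charsRequired);
    const int written = uriToStringA(&buf[0], &result, charsRequired, NULL);
    uriFreeUriMembersA(&result);
    if (written != URI_SUCCESS)
        return "";

    return std::string(&buf[0]);
}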
CarbonAsh
+1  A: 

The Poco::URI class in the POCO C++ Libraries can do that (see the resolve() member function).

obiltschnig