I have a bunch of worker threads, each running a simple curl class with its own curl easy handle. They only do HEAD lookups on random web sites. Locking functions are also in place to enable multi-threaded SSL, as documented here. Everything works except for two sites, ilsole24ore.com (seen in the example below) and ninemsn.com.au/, which sometimes produce a segfault, as shown in this trace output:

    #0  *__GI___libc_res_nquery (statp=0xb4d12df4, name=0x849e9bd "ilsole24ore.com", class=1, type=1, answer=0xb4d0ca10 "", anslen=1024, answerp=0xb4d0d234,
        answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:182
    #1  0x00434e8b in __libc_res_nquerydomain (statp=0xb4d12df4, name=0xb4d0ca10 "", domain=0x0, class=1, type=1, answer=0xb4d0ca10 "", anslen=1024,
        answerp=0xb4d0d234, answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:576
    #2  0x004352b5 in *__GI___libc_res_nsearch (statp=0xb4d12df4, name=0x849e9bd "ilsole24ore.com", class=1, type=1, answer=0xb4d0ca10 "", anslen=1024,
        answerp=0xb4d0d234, answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:377
    #3  0x009c0bd6 in *__GI__nss_dns_gethostbyname3_r (name=0x849e9bd "ilsole24ore.com", af=2, result=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512,
        errnop=0xb4d12b30, h_errnop=0xb4d0d614, ttlp=0x0, canonp=0x0) at nss_dns/dns-host.c:197
    #4  0x009c0f2b in _nss_dns_gethostbyname2_r (name=0x849e9bd "ilsole24ore.com", af=2, result=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512,
        errnop=0xb4d12b30, h_errnop=0xb4d0d614) at nss_dns/dns-host.c:251
    #5  0x0079eacd in __gethostbyname2_r (name=0x849e9bd "ilsole24ore.com", af=2, resbuf=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512, result=0xb4d0d618,
        h_errnop=0xb4d0d614) at ../nss/getXXbyYY_r.c:253
    #6  0x00760010 in gaih_inet (name=<value optimized out>, service=<value optimized out>, req=0xb4d0f83c, pai=0xb4d0d764, naddrs=0xb4d0d754)
        at ../sysdeps/posix/getaddrinfo.c:531
    #7  0x00761a65 in *__GI_getaddrinfo (name=0x849e9bd "ilsole24ore.com", service=0x0, hints=0xb4d0f83c, pai=0xb4d0f860) at ../sysdeps/posix/getaddrinfo.c:2160
    #8  0x00917f9a in ?? () from /usr/lib/libkrb5support.so.0
    #9  0x003b2f45 in krb5_sname_to_principal () from /usr/lib/libkrb5.so.3
    #10 0x0028a278 in ?? () from /usr/lib/libgssapi_krb5.so.2
    #11 0x0027eff2 in ?? () from /usr/lib/libgssapi_krb5.so.2
    #12 0x0027fb00 in gss_init_sec_context () from /usr/lib/libgssapi_krb5.so.2
    #13 0x00d8770e in ?? () from /usr/lib/libcurl.so.4
    #14 0x00d62c27 in ?? () from /usr/lib/libcurl.so.4
    #15 0x00d7e25b in ?? () from /usr/lib/libcurl.so.4
    #16 0x00d7e597 in ?? () from /usr/lib/libcurl.so.4
    #17 0x00d7f133 in curl_easy_perform () from /usr/lib/libcurl.so.4

My function looks something like this:

int do_http_check(taskinfo *info,standardResult *data)
{
    standardResultInit(data);

    char errorBuffer[CURL_ERROR_SIZE];

    CURL *curl;
    CURLcode result;

    curl = curl_easy_init();

    if(curl)
    {
        //required options first
        curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, errorBuffer);
        curl_easy_setopt(curl, CURLOPT_URL, info->address.c_str());
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writer);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &data->body);
        curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, writer);
        curl_easy_setopt(curl, CURLOPT_WRITEHEADER, &data->head);
        // numeric options are read as long, so pass long literals
        curl_easy_setopt(curl, CURLOPT_DNS_USE_GLOBAL_CACHE, 0L);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 30L);
        curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1L);
        curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, 240L);

        //optional options
        if(info->options.follow)
        {
            curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
            curl_easy_setopt(curl, CURLOPT_MAXREDIRS, info->options.redirects);
        }

        result = curl_easy_perform(curl);

        if (result == CURLE_OK)
        {
            data->success = true;
            curl_easy_getinfo(curl,CURLINFO_RESPONSE_CODE,&data->httpMsg);
            curl_easy_getinfo(curl,CURLINFO_REDIRECT_COUNT,&data->numRedirects);
            data->msg = "OK";
        }
        else
        {
            // ... handle error
        }

        // release the handle created by curl_easy_init()
        curl_easy_cleanup(curl);
    }

    return 1;
}

When I call the function without any threads, directly from main, it never breaks, so I thought the problem was connected to threading, or maybe to how the result structure is returned. But from the trace it looks like the fault is generated inside the curl_easy_perform() call, and that confuses me. If anyone has an idea where I should look next, it would be most helpful. Thanks.

+4  A: 

The libcurl documentation has a whole section dedicated to multi-threading.

The first basic rule is that you must never share a libcurl handle (be it easy or multi or whatever) between multiple threads. Only use one handle in one thread at a time.
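To make the rule concrete, here is a minimal sketch of a thread fan-out in which nothing curl-related is shared. The helper name `run_workers` and the `check` callable are hypothetical and not part of libcurl; the assumption is that `check` wraps something like the question's `do_http_check`, creating and destroying its own easy handle inside the call.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: runs `check` once per URL, each call on its own thread.
// Assumption: `check` creates, uses, and destroys its own easy handle
// internally, so no handle is ever visible to more than one thread.
std::vector<int> run_workers(const std::vector<std::string>& urls,
                             const std::function<int(const std::string&)>& check)
{
    assert(static_cast<bool>(check));          // an empty callable would throw later
    std::vector<int> results(urls.size(), 0);  // one result slot per worker
    std::vector<std::thread> workers;
    workers.reserve(urls.size());

    for (std::size_t i = 0; i < urls.size(); ++i) {
        workers.emplace_back([&urls, &results, &check, i] {
            results[i] = check(urls[i]);       // thread i writes only to slot i
        });
    }
    for (std::thread& t : workers) {
        t.join();                              // results are safe to read after join
    }
    return results;
}
```

The only state crossing thread boundaries is the read-only URL list and one result slot per worker, so no locking is needed around the handles themselves.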

libcurl is completely thread safe, except for two issues: signals and SSL/TLS handlers. Signals are used for timing out name resolves (during DNS lookup) - when built without c-ares support and not on Windows.

If you are accessing HTTPS or FTPS URLs in a multi-threaded manner, you are then of course using the underlying SSL library multi-threaded and those libs might have their own requirements on this issue. Basically, you need to provide one or two functions to allow it to function properly. For all details, see this:

OpenSSL

http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION

GnuTLS

http://www.gnu.org/software/gnutls/manual/html_node/Multi_002dthreaded-applications.html

NSS

is claimed to be thread-safe already without anything required.

yassl

Required actions unknown.
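For OpenSSL, the "one or two functions" are a locking callback and, optionally, a thread-id callback. The sketch below shows only the shape of the locking callback, written against the standard library so that it compiles on its own. `kLockBit` is a stand-in for OpenSSL's `CRYPTO_LOCK` flag, and `init_locks`, `g_locks`, and `demo_guarded_count` are illustrative names. Real code would include `<openssl/crypto.h>`, size the table from `CRYPTO_num_locks()`, and register the function with `CRYPTO_set_locking_callback()`. As far as I know, this is only needed for OpenSSL versions before 1.1.0.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for OpenSSL's CRYPTO_LOCK bit (assumption for this sketch);
// real code should use the constant from <openssl/crypto.h>.
const int kLockBit = 1;

// One mutex per lock slot the library asks for.
static std::vector<std::mutex> g_locks;

// Allocate the mutex table. Must run before any worker thread starts.
void init_locks(std::size_t count)
{
    std::vector<std::mutex> fresh(count);  // mutexes cannot be moved, so build and swap
    g_locks.swap(fresh);
}

// Same signature as the pre-1.1.0 OpenSSL locking callback:
// lock slot n when the lock bit is set in mode, otherwise unlock it.
void locking_callback(int mode, int n, const char* /*file*/, int /*line*/)
{
    assert(n >= 0 && static_cast<std::size_t>(n) < g_locks.size());
    if (mode & kLockBit) {
        g_locks[n].lock();
    } else {
        g_locks[n].unlock();
    }
}

// Demo: several threads increment a plain int, guarded only by slot 0.
int demo_guarded_count(int threads, int iterations)
{
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                locking_callback(kLockBit, 0, __FILE__, __LINE__);  // lock
                ++counter;
                locking_callback(0, 0, __FILE__, __LINE__);         // unlock
            }
        });
    }
    for (std::thread& th : pool) {
        th.join();
    }
    return counter;
}
```

The demo only shows that the callback serializes access to a slot. In a real program the SSL library, not application code, decides when the callback is invoked.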

When using multiple threads you should set the CURLOPT_NOSIGNAL option to 1 for all handles. Everything will or might work fine except that timeouts are not honored during the DNS lookup - which you can work around by building libcurl with c-ares support. c-ares is a library that provides asynchronous name resolves. On some platforms, libcurl simply will not function properly multi-threaded unless this option is set.

Also, note that CURLOPT_DNS_USE_GLOBAL_CACHE is not thread-safe.

Yeah, I've read that part of the curl documentation, and my program follows all the guidelines there. But I've located the problem and it's not curl's fault; the debugger was just pointing there. I did learn some new things about curl, thanks anyway.
Mogwai