I am trying to download the following URL with the downloadURL function shown below:

http://www.ncbi.nlm.nih.gov/nuccore/27884304

However, the data I get back differs from what the browser shows. I now know this is because the server expects certain request information (such as the browser type). How can I find out which information I need to set, and how do I set it (with the setHeader function or some other way)?

In VC++, the CInternetSession and CHttpConnection objects return the correct data without any extra settings. Is there a similar way in Qt or another cross-platform C++ network library? (Yes, I need it to be cross-platform.)

QNetworkReply::NetworkError downloadURL(const QUrl &url, QByteArray &data) {
    QNetworkAccessManager manager;
    QNetworkRequest request(url);
    request.setHeader(QNetworkRequest::ContentTypeHeader ,"Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 (.NET CLR 3.5.30729)");
    QNetworkReply *reply = manager.get(request);

    QEventLoop loop;
    QObject::connect(reply, SIGNAL(finished()), &loop, SLOT(quit()));
    loop.exec();


    QVariant statusCodeV = reply->attribute(QNetworkRequest::RedirectionTargetAttribute);
    QUrl redirectTo = statusCodeV.toUrl();

    if (!redirectTo.isEmpty())
    {
        if (redirectTo.host().isEmpty())
        {
            const QByteArray newaddr = ("http://"+url.host()+redirectTo.encodedPath()).toAscii();
            redirectTo.setEncodedUrl(newaddr);
            redirectTo.setHost(url.host());
        }
        return (downloadURL(redirectTo, data));
    }

    if (reply->error() != QNetworkReply::NoError)
    {
        return reply->error();
    }
    data = reply->readAll();
    delete reply;
    return QNetworkReply::NoError;
}

With VC++, the following is enough; the correct data then ends up in the CHttpFile:

CString downloadURL (CString sGetFromURL)
{
    // create an internet session 
    CInternetSession csiSession;

    int pos;
    BOOL neof;

    // parse URL to get server/object/port 

    DWORD dwServiceType;
    CString sServerName;
    CString sObject;
    INTERNET_PORT nPort;
    CHttpConnection* pHTTPServer = NULL; 
    CHttpFile* pFile = NULL;


        AfxParseURL ( sGetFromURL, dwServiceType, sServerName, sObject, nPort );

        // open HTTP connection 
        pHTTPServer = csiSession.GetHttpConnection ( sServerName, nPort ); 

        // get HTTP object 
        pFile = pHTTPServer->OpenRequest ( CHttpConnection::HTTP_VERB_GET, sObject, NULL, 1, NULL, NULL, INTERNET_FLAG_RELOAD ); 

        pFile->SendRequest();

}
A: 

You are setting the wrong header: Content-Type. The value you provided belongs in the User-Agent header instead.
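To see why the header name matters, it may help to look at roughly what a GET request looks like on the wire. The following is only a sketch in plain standard C++ (no Qt); `buildGetRequest` is a hypothetical helper made up for illustration, and a real client typically sends more headers than this. User-Agent describes the client making the request, while Content-Type describes the media type of a message body, which a GET request normally does not carry.

```cpp
#include <map>
#include <string>

// Hypothetical helper, for illustration only: renders the text of a
// minimal HTTP/1.1 GET request with the given extra headers.
std::string buildGetRequest(const std::string &host,
                            const std::string &path,
                            const std::map<std::string, std::string> &headers)
{
    std::string out = "GET " + path + " HTTP/1.1\r\n";
    out += "Host: " + host + "\r\n";
    // Each header is one "Name: value" line.
    for (const auto &h : headers)
        out += h.first + ": " + h.second + "\r\n";
    out += "\r\n"; // An empty line ends the header section.
    return out;
}
```

Calling it with `{{"User-Agent", "Mozilla/5.0 ..."}}` puts the browser string where a server looks for the client's identity; the same string under Content-Type would not be interpreted that way.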

Kamil Klimek
Thanks :) I've found another URL to get the data from, but this is still helpful for solving the problem.
Claire Huang
A: 

Close, but you aren't setting the correct header. You need to do:

request.setRawHeader("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 (.NET CLR 3.5.30729)" );
Brian Roach
Thanks :) I've found another URL to get the data from. Still, it's good to know where the problem in my code was. Next time I'll know what to do ^^
Claire Huang