I'm using getaddrinfo to do DNS queries from C++ on Windows. I used to use the Windows API DnsQuery and that worked fine, but when adding IPv6 support to my software I switched to getaddrinfo. Since then, I've seen the following:

My problem is that getaddrinfo sometimes takes a very long time to complete. A typical call returns in just a few milliseconds, but roughly 1 time out of 10,000 it takes much longer: in some cases around 15 seconds, and there have been several cases where it took several minutes.

I've run Wireshark on the server and analyzed my application's debug logs, and I see the following:

  • I call the function getaddrinfo.
  • 15 seconds later, my machine queries the DNS server.
  • Some milliseconds later, I get the response from the DNS server.

The weird thing here is that the actual DNS query takes only a tenth of a second, but getaddrinfo itself runs for much longer.
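To compare the wall-clock time of the call against the Wireshark trace, the measurement can be isolated roughly like this. This is only a minimal sketch: `TimedLookup` and `timed_getaddrinfo` are hypothetical names, not part of my software, and on Windows it assumes WSAStartup has already been called elsewhere.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#else
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#endif

#include <chrono>
#include <string>

// Hypothetical helper type: outcome of one timed lookup.
struct TimedLookup {
    int error;                          // return value of getaddrinfo (0 on success)
    std::chrono::milliseconds elapsed;  // wall-clock time spent inside the call
};

// Times a single getaddrinfo call for the given host name.
// Assumption: on Windows, WSAStartup has already been called.
TimedLookup timed_getaddrinfo(const std::string& host) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // ask for both IPv4 and IPv6 results
    hints.ai_socktype = SOCK_STREAM;

    addrinfo* results = nullptr;
    const auto start = std::chrono::steady_clock::now();
    const int error = getaddrinfo(host.c_str(), nullptr, &hints, &results);
    const auto stop = std::chrono::steady_clock::now();

    if (error == 0) {
        freeaddrinfo(results);  // only free the list when the call succeeded
    }
    return {error,
            std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)};
}
```

Logging `elapsed` next to a timestamp for every call makes it possible to line up the slow calls with the moment the query actually appears on the wire.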

The problem has been reported by many users, so it's not something specific to my machine.

So what does getaddrinfo do besides contacting the DNS server?

Edit:

  • The problem has occurred with several addresses. If I try to reproduce the problem using these addresses, the problem does not occur.
  • I have done something stupid: on every DNS query, etc/services is parsed. However, that doesn't explain a delay of several minutes. (Thanks, D.Shawley.)
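One way to take the services lookup out of the picture is to pass no service name at all and fill in the port afterwards. The following is a sketch under the assumption that the parsing was triggered by the service-name argument; `resolve_with_numeric_port` is a hypothetical helper, and on Windows WSAStartup is assumed to have been called already.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#else
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#endif

// Resolves host with a null service argument, then writes the port into each
// returned address by hand. Returns the getaddrinfo result list (the caller
// frees it with freeaddrinfo), or nullptr if the lookup failed.
addrinfo* resolve_with_numeric_port(const char* host, unsigned short port) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    addrinfo* results = nullptr;
    // Null service: no service name has to be translated to a port.
    if (getaddrinfo(host, nullptr, &hints, &results) != 0) {
        return nullptr;
    }

    for (addrinfo* p = results; p != nullptr; p = p->ai_next) {
        if (p->ai_family == AF_INET) {
            reinterpret_cast<sockaddr_in*>(p->ai_addr)->sin_port = htons(port);
        } else if (p->ai_family == AF_INET6) {
            reinterpret_cast<sockaddr_in6*>(p->ai_addr)->sin6_port = htons(port);
        }
    }
    return results;
}
```

For lookups that never open a socket, such as DNSBL checks, the port step can be skipped entirely and only the null service argument matters.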

Edit 2

  • One type of DNS query made by my software is the anti-spam DNSBL query. The log from one user showed me that the lookup for ip.address1.example.com always seemed to take exactly 2039 seconds, while the lookup for another.ip.address.example.com always took exactly 1324 seconds. The day after that, the lookups for those addresses were just fine. At first I thought that the DNSBL authors had put some kind of timeout on their side. But if that were the core problem, shouldn't getaddrinfo have timed out earlier?
+1  A: 

Windows has a local daemon that does DNS caching. Your call to getaddrinfo() is getting routed to that daemon, which presumably is checking its cache before submitting the query to your DNS server.

See Windows Knowledge Base article 318803 for more information on disabling the cache.

[Edited]

It sounds to me as though your Windows Server 2003 instance is not configured correctly for IPv6. Once the IPv6 lookups time out, it will fall back to IPv4. Knowledge Base articles that might help include:

Unfortunately, I don't have access to any Windows Servers, so I can't test/replicate this myself.
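One way to check the IPv6 theory from the application side is to restrict the lookup to a single address family and see whether the delay disappears when only IPv4 is requested. This is an untested sketch, not a confirmed fix: `count_addresses` is a hypothetical helper, and on Windows it assumes WSAStartup has already been called.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#else
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#endif

// Looks up host for one address family only (AF_INET, AF_INET6 or AF_UNSPEC).
// Returns the number of addresses found, or -1 if getaddrinfo failed.
int count_addresses(const char* host, int family) {
    addrinfo hints{};
    hints.ai_family = family;         // AF_INET restricts the lookup to IPv4
    hints.ai_socktype = SOCK_STREAM;

    addrinfo* results = nullptr;
    if (getaddrinfo(host, nullptr, &hints, &results) != 0) {
        return -1;
    }

    int count = 0;
    for (const addrinfo* p = results; p != nullptr; p = p->ai_next) {
        ++count;
    }
    freeaddrinfo(results);
    return count;
}
```

If calls made with AF_INET never show the long delay while calls with AF_UNSPEC or AF_INET6 do, that would point toward the IPv6 side of the resolution.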

Craig Trader
Well, that kind of answers my question. But the same cache was used by DnsQuery, and I never saw the problem when using that function. My software is deployed in ~10 000 locations, and it wasn't until I switched to getaddrinfo that a lot of users started to report this issue. Also, it would seem absurd for a lookup in the local DNS cache to take 15 seconds.
Nitramk