I was wondering how the Windows host-name resolution system works.
More precisely, I wonder about the use, or lack thereof, of local caching in the process.
According to Microsoft's "TCP/IP Host Name Resolution Order", the process is as follows:

  1. The client checks to see if the name queried is its own.
  2. The client then searches a local Hosts file, a list of IP address and names stored on the local computer.
  3. Domain Name System (DNS) servers are queried.
  4. If the name is still not resolved, NetBIOS name resolution sequence is used as a backup. This order can be changed by configuring the NetBIOS node type of the client.
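
As a rough illustration of the order quoted above, the stages can be modelled as a chain of lookups where the first one to return an address wins. This is only a sketch: the stage functions, the sample names, and the addresses are hypothetical stand-ins, not the real Windows resolver.

```python
from typing import Callable, Optional, Sequence

Resolver = Callable[[str], Optional[str]]

def resolve(name: str, stages: Sequence[Resolver]) -> Optional[str]:
    """Try each stage in order and return the first address found."""
    for stage in stages:
        address = stage(name)
        if address is not None:
            return address
    return None

# Hypothetical data standing in for the local name, HOSTS file and DNS.
LOCAL_NAME = "mypc"
HOSTS = {"blocked.example": "127.0.0.1"}
DNS = {"www.example.com": "192.0.2.10"}

def own_name(name: str) -> Optional[str]:
    return "127.0.0.1" if name.lower() == LOCAL_NAME else None

def hosts_file(name: str) -> Optional[str]:
    return HOSTS.get(name.lower())

def dns_query(name: str) -> Optional[str]:
    return DNS.get(name.lower())

def netbios(name: str) -> Optional[str]:
    return None  # placeholder: the NetBIOS fallback is not modelled here

STAGES = [own_name, hosts_file, dns_query, netbios]
```

Calling `resolve("blocked.example", STAGES)` stops at the HOSTS stage and never reaches the DNS stand-in, which is the behaviour a HOSTS-based blocker relies on.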

What I was wondering is whether stage (2) is cached in some way.
The sudden interest arose these last few days, as I installed a malware protection tool (SpyBot) that uses the HOSTS file. In fact, the file now holds 14K entries, and counting...
The file is currently sorted by host name, but of course it doesn't have to be.
lg(14K) ≈ 14, so even with a binary search that means about 14 steps through the file for each resolution request. These requests probably arrive at a rate of a few every second, and usually for the same few hundred hosts (tops).
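
To put numbers on that estimate, here is a small back-of-the-envelope calculation. It assumes exactly 14,000 entries and compares a binary search over a sorted in-memory list with a plain line-by-line scan; the 14-step figure only holds if the entries are sorted and searchable that way.

```python
import math

entries = 14_000  # assumed size of the HOSTS file

# Worst-case comparisons for a binary search over a sorted in-memory list.
worst_case_binary = math.ceil(math.log2(entries))

# Average lines read by a linear scan for a name that is present.
average_linear = entries // 2

print(worst_case_binary, average_linear)
```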

My view of how this should work is like this:

  1. On system startup the Windows DNS-resolution mechanism loads the HOSTS file a single time.
  2. It makes a single pass over the file to sort it. A working copy is loaded into memory.
    The original HOSTS file will not be read again during the lifetime of the resolution process.
  3. All network processes (IE, Firefox, MSN...) work via this process/mechanism.
    No other process reads the HOSTS file directly.
  4. Upon receiving a name resolution request, the process checks its memory-resident cache.
    If it finds the proper IP, it answers appropriately.
  5. Otherwise (it's not cached), the resolution process continues to the memory-resident (sorted) HOSTS file and does a quick binary search over it. From here on, the process continues as originally described.
    The result of the resolution is cached for further use.
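
The scheme in the list above can be sketched roughly as follows. This is a minimal illustration of the asker's proposal, not how Windows actually does it: `parse_hosts`, `SketchResolver`, and the `fallback` callable (standing in for DNS and NetBIOS) are all hypothetical names.

```python
import bisect
from typing import Callable, Dict, List, Optional, Tuple

def parse_hosts(text: str) -> List[Tuple[str, str]]:
    """Return sorted (name, address) pairs parsed from HOSTS-file text."""
    pairs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            pairs.append((name.lower(), address))
    pairs.sort()
    return pairs

class SketchResolver:
    def __init__(self, hosts_text: str,
                 fallback: Callable[[str], Optional[str]]) -> None:
        self._hosts = parse_hosts(hosts_text)        # steps 1 and 2
        self._names = [name for name, _ in self._hosts]
        self._cache: Dict[str, str] = {}
        self._fallback = fallback                    # assumed DNS stand-in

    def resolve(self, name: str) -> Optional[str]:
        key = name.lower()
        if key in self._cache:                       # step 4: cache hit
            return self._cache[key]
        i = bisect.bisect_left(self._names, key)     # step 5: binary search
        if i < len(self._names) and self._names[i] == key:
            address = self._hosts[i][1]
        else:
            address = self._fallback(key)
        if address is not None:
            self._cache[key] = address               # remember the result
        return address
```

In practice a plain dictionary keyed by host name would make the sort and the binary search unnecessary; they are kept here only to mirror the steps as described.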

Though I am not sure as to the significance of these, I would really appreciate an answer.
I just want to see if my reasoning is right, and if not, why so?
I am aware that in this age of always-on PCs the cache must be periodically (or incrementally) purged. I ignore this for now.

A: 

Your method does not work when the IP address of a known hostname is changed in the hosts file without adding or changing the name: the stale address would keep being served from the in-memory copy.

Technet says that the file will be loaded into the DNS client resolver cache.

IMO this is mostly irrelevant: a lookup in a local file (once it's in the disk cache) will still be several orders of magnitude faster than asking the DNS servers of your ISP.

Turbo J
A: 

I don't think that each process maintains its own cache. If there is a cache, it probably exists in the TCP/IP stack or kernel somewhere, and even then, only for a very short while.

I've had situations where I'll be tinkering around with my hosts file and then using the addresses in a web browser and it will update the resolved names without me having to restart the browser.

jay.lee
+1  A: 

In the DNS Client service (dnsrslvr) you can see a function called LoadHostFileIntoCache. It goes something like this:

file = HostsFile_Open(...);

if (file)
{
    while (HostsFile_ReadLine(...))
    {
        Cache_RecordList(...);
        ...
    }

    HostsFile_Close(...);
}

So how does the service know when the hosts file has been changed? At startup a thread is created which executes NotifyThread, and it calls CreateHostsFileChangeHandle, which calls FindFirstChangeNotificationW to start monitoring the drivers\etc directory. When there's a change the thread clears the cache using Cache_Flush.
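
The load-then-flush-on-change behaviour described above can be imitated in a few lines. Note the swapped-in technique: instead of a notification thread built on FindFirstChangeNotificationW, this portable sketch polls the file's modification time and size on each lookup. `ReloadingHostsCache` is a hypothetical name, not part of any Windows API.

```python
import os
from typing import Dict, Optional, Tuple

class ReloadingHostsCache:
    def __init__(self, path: str) -> None:
        self._path = path
        self._stamp: Optional[Tuple[int, int]] = None
        self._cache: Dict[str, str] = {}

    def _load(self) -> None:
        self._cache.clear()  # plays the role of the flush step
        with open(self._path, encoding="utf-8") as f:
            for line in f:
                fields = line.split("#", 1)[0].split()
                if len(fields) >= 2:
                    for name in fields[1:]:
                        # first entry for a name wins
                        self._cache.setdefault(name.lower(), fields[0])

    def lookup(self, name: str) -> Optional[str]:
        st = os.stat(self._path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self._stamp:  # file changed since the last load
            self._load()
            self._stamp = stamp
        return self._cache.get(name.lower())
```

Because the cache is rebuilt whenever the file changes, an edit that only alters the IP address of an existing name is picked up too, which matches the observation that a browser sees hosts changes without a restart.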

wj32
The Technet link [1] says so as well; however, this answer seems more exact. So I'm voting for this one, though both are fine. [1] http://technet.microsoft.com/en-us/library/bb727005.aspx
David דוד