views:

696

answers:

4

I was reading about output buffering in JavaScript here, and was trying to get my head around the script the author says was the fastest at printing 1 to 1,000,000 to a web page. (Scroll down to the header "The winning one million number script".) After studying it a bit, I have a few questions:

  • What makes this script so efficient compared to other approaches?
  • Why does buffering speed things up?
  • How do you determine the proper buffer size to use?
  • Does anyone here have any tricks up her/his sleeve that could optimize this script further?

(I realize this is probably CS101, but I'm one of those blasted, self-taught hackers and I was hoping to benefit from the wisdom of the collective on this one. Thanks!)

+2  A: 

I would bet the slowest thing in printing 1m numbers is the browser redrawing the page, so the fewer times you call document.write(), the better. Of course, this needs to be balanced against the cost of large string concatenations (which involve allocating and copying).

The right buffer size is found through experimentation.
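
As a minimal sketch of that pattern (not the article's script; writeNumbersBuffered and the chunk size here are just illustrative):

function writeNumbersBuffered(n, chunkSize) {
    // Collect numbers and flush them with one document.write per
    // chunk instead of one call per number.
    var buffer = [];
    for (var i = 1; i <= n; i++) {
        buffer.push(i);
        if (buffer.length == chunkSize) {
            document.write(buffer.join(' ') + ' ');
            buffer.length = 0;        // reset for the next chunk
        }
    }
    if (buffer.length > 0) {
        document.write(buffer.join(' '));   // flush the remainder
    }
}

// chunkSize is the value to experiment with, e.g.:
// writeNumbersBuffered(1000000, 1000);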

In other contexts, buffering helps you align work along natural boundaries. Some examples:

  • 32-bit CPUs can transfer data most efficiently 32 bits at a time.
  • TCP/IP packets have maximum sizes.
  • File I/O classes have internal buffers.
  • Images, like TIFFs, may be stored with their data in strips.

Aligning with the natural boundaries of other systems can often have performance benefits.

Lou Franco
+1  A: 

One way to think about it is to consider that a single call to document.write() is very expensive. However, building an array and joining that array into a string is not. So reducing the number of calls to document.write() effectively reduces the overall computational time needed to write the numbers.
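
To make the contrast concrete, here is a minimal sketch (hypothetical functions, not the article's code); the first version pays the document.write cost a million times, the second pays it once and does the cheap work with an array join:

// One expensive document.write call per number: 1,000,000 calls.
function writeOneAtATime(n) {
    for (var i = 1; i <= n; i++) {
        document.write(i + ' ');
    }
}

// Build the whole string with a cheap array join, then write once.
function writeAllAtOnce(n) {
    var parts = [];
    for (var i = 1; i <= n; i++) {
        parts.push(i);
    }
    document.write(parts.join(' '));
}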

Buffers are a way to tie together two pieces of work with different cost profiles.

Computing and filling arrays has a small cost for small arrays but a large cost for large arrays. document.write has a large constant cost regardless of the size of the write, but scales at less than O(n) for larger inputs.

So queuing up larger strings to write, or buffering them, speeds overall performance.

Nice find on the article by the way.

Nathan Feger
+5  A: 

What makes this script so efficient compared to other approaches?

There are several optimizations that the author is making to this algorithm. Each of these requires a fairly deep understanding of how the underlying mechanisms are utilized (e.g. JavaScript, CPU, registers, cache, video card, etc.). I think there are two key optimizations that he is making (the rest are just icing):

  • Buffering the output
  • Using integer math rather than string manipulation

I'll discuss buffering shortly since you ask about it explicitly. The integer math that he's utilizing has two performance benefits: integer addition is cheaper per operation than string manipulation and it uses less memory.
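
For a rough feel of that gap, here is a hypothetical micro-benchmark (not anything from the article; absolute numbers vary wildly by browser):

// Time 1,000,000 integer additions.
var intStart = new Date();
var sum = 0;
for (var i = 1; i <= 1000000; i++) {
    sum += i;
}
var intTime = new Date() - intStart;

// Time building the same numbers by string concatenation.
var strStart = new Date();
var s = '';
for (var j = 1; j <= 1000000; j++) {
    s += j + ' ';
}
var strTime = new Date() - strStart;

// intTime is typically a small fraction of strTime, and the string
// version also has to hold a multi-megabyte string in memory.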

I don't know how JavaScript and web browsers handle the conversion of an integer to a display glyph in the browser, so there may be a penalty associated with passing an integer to document.write when compared to a string. However, he is performing (1,000,000 / 1000) document.write calls versus 1,000,000 - 1,000 integer additions. This means he is performing roughly 3 orders of magnitude more operations to form the message than he is to send it to the display. Therefore, the penalty for sending an integer vs a string to document.write would have to exceed 3 orders of magnitude to offset the performance advantage of manipulating integers.
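
Spelled out, the arithmetic behind that estimate is:

var writes = 1000000 / 1000;       // 1,000 document.write calls
var additions = 1000000 - 1000;    // 999,000 integer additions
var ratio = additions / writes;    // 999, i.e. roughly 3 orders of magnitude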

Why does buffering speed things up?

The specifics of why it works vary depending on what platform, hardware, and implementation you are using. In any case, it's all about balancing your algorithm against your bottleneck-inducing resources.

For instance, in the case of file I/O, buffering is helpful because it takes advantage of the fact that a rotating disk can only write a certain amount at a time. Give it too little work and it won't be using every available bit that passes under the drive head as the platter rotates. Give it too much, and your application will have to wait (or be put to sleep) while the disk finishes your write - time that could be spent getting the next record ready for writing! Some of the key factors that determine the ideal buffer size for file I/O include sector size, file system chunk size, interleaving, number of heads, rotation speed, and areal density, among others.

In the case of the CPU, it's all about keeping the pipeline full. If you give the CPU too little work, it will spend time executing NOPs while it waits for you to task it. If you give it too much, you may not dispatch requests to other resources, such as the disk or the video card, which could execute in parallel; later on the CPU will have to wait for these to return with nothing else to do. The primary factor for buffering in the CPU is keeping everything it needs as close to the FPU/ALU as possible. In a typical architecture this is (in order of decreasing proximity): registers, L1 cache, L2 cache, L3 cache, RAM.

In the case of writing a million numbers to the screen, it's about drawing polygons on your screen with your video card. Think about it like this: let's say that for each new number that is added, the video card must do 100,000,000 operations to draw the polygons on your screen. At one extreme, if you put 1 number on the page at a time, have your video card draw it, and do this for 1,000,000 numbers, the video card will have to do 10^14 operations - 100 trillion operations! At the other extreme, if you took the entire 1 million numbers and sent them to the video card all at once, it would take only 100,000,000 operations. The optimal point is somewhere in the middle. If you do it one at a time, the CPU does a unit of work and then waits around for a long time while the GPU updates the display. If you write the entire 1M-item string first, the GPU is doing nothing while the CPU churns away.
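
Plugging the made-up per-redraw cost from that example into a quick calculation shows how fast the redraw work shrinks as the chunk grows (this ignores the CPU/GPU overlap that pushes the real optimum into the middle):

var opsPerRedraw = 100000000;     // hypothetical cost of one redraw
var totalNumbers = 1000000;
var chunkSizes = [1, 1000, 1000000];

for (var k = 0; k < chunkSizes.length; k++) {
    var redraws = totalNumbers / chunkSizes[k];
    var gpuOps = redraws * opsPerRedraw;
    // chunk of 1         -> 1,000,000 redraws -> 1e14 operations
    // chunk of 1,000     -> 1,000 redraws     -> 1e11 operations
    // chunk of 1,000,000 -> 1 redraw          -> 1e8 operations
    document.write(chunkSizes[k] + ": " + gpuOps + " operations<br>");
}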

How do you determine which buffer size to use?

Unless you are targeting a very specific and well-defined platform with a specific algorithm (e.g. writing packet routing for an internet router), you typically cannot determine this mathematically. Typically, you find it empirically: guess a value, try it, record the results, then pick another. You can make some educated guesses about where to start and what range to investigate based on the bottlenecks you are managing.
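
That guess-and-measure loop is easy to automate. A sketch (timeBufferSize and sweepBufferSizes are made-up names; any buffered writer with an (n, bufferSize) signature, such as printAllNumberBuffered in the experiment further down, would fit):

// Time a single run of a buffered writer at one candidate size.
function timeBufferSize(writer, n, bufferSize) {
    var start = new Date();
    writer(n, bufferSize);
    return new Date() - start;       // elapsed milliseconds
}

// Try each candidate, record the timing, then narrow the range
// around whichever value came out fastest.
function sweepBufferSizes(writer, n, candidates) {
    var results = [];
    for (var c = 0; c < candidates.length; c++) {
        results.push(candidates[c] + ": " + timeBufferSize(writer, n, candidates[c]) + " ms");
    }
    return results;
}

// e.g. sweepBufferSizes(printAllNumberBuffered, 1000000, [100, 500, 1000, 5000, 10000]);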

Does anyone here have any tricks up her/his sleeve that could optimize this script further?

I don't know if this would work and I have not tested it; however, buffer sizes typically come in powers of 2, since the underpinnings of computers are binary and word sizes typically are as well (but this isn't always the case!). For example, 64 bytes is more likely to be optimal than 60 bytes, and 1024 is more likely to be optimal than 1000.

One of the bottlenecks specific to this problem is that most browsers to date (Google Chrome being the first exception that I'm aware of) run JavaScript serially, within the same thread as the rest of the page rendering mechanics. This means that the JavaScript does some work filling the buffer and then waits a long time until the document.write call returns. If the JavaScript were run as a separate process, asynchronously, as in Chrome, you would likely get a major speed-up. This is of course attacking the source of the bottleneck rather than the algorithm that uses it, but sometimes that is the best option.
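
As an aside (not something the article covers): within a single-threaded browser you cannot move the script to another process yourself, but you can yield between chunks with setTimeout so the page gets a chance to render in between. This trades some total runtime for responsiveness rather than making the script itself faster. A sketch, writing into an element rather than using document.write (which cannot be called after the page has loaded):

// Append n numbers to a container element in chunks, yielding to the
// browser between chunks so rendering is not blocked the whole time.
function writeNumbersAsync(container, n, chunkSize) {
    var next = 1;
    function writeChunk() {
        var parts = [];
        var limit = Math.min(next + chunkSize - 1, n);
        for (; next <= limit; next++) {
            parts.push(next);
        }
        container.textContent += parts.join(' ') + ' ';
        if (next <= n) {
            setTimeout(writeChunk, 0);   // yield, then continue
        }
    }
    writeChunk();
}

// e.g. writeNumbersAsync(document.getElementById("divNumbers"), 1000000, 1000);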

Not nearly as succinct as I would like, but hopefully it's a good starting point. Buffering is an important concept for all sorts of performance issues in computing. Having a good understanding of the underlying mechanisms that your code is using (both hardware and software) is extremely useful in avoiding or addressing performance issues.

Burly
+1  A: 

So this one has been driving me crazy because I don't think it really is the fastest. So here is my experiment that anyone can play with. Perhaps I wrote it wrong or something, but it would appear that doing it all at once instead of using a buffer is actually the faster operation. Or at least it is in my experiments.

<html>
<head>
<script type="text/javascript">
    function printAllNumberBuffered(n, bufferSize)
    {
        var startTime = new Date();

        var oRuntime = document.getElementById("divRuntime");
        var oNumbers = document.getElementById("divNumbers");

        var i = 0;
        var currentNumber;
        var numArray = new Array(bufferSize);

        for (currentNumber = 1; currentNumber <= n; currentNumber++)
        {
            numArray[i] = currentNumber;

            // Flush the buffer every bufferSize numbers.
            if (currentNumber % bufferSize == 0)
            {
                oNumbers.textContent += numArray.join(' ');
                i = 0;
            }
            else
            {
                i++;
            }
        }

        // Flush the leftover numbers; only the first i slots are valid.
        if (i > 0)
        {
            numArray.length = i;
            oNumbers.textContent += numArray.join(' ');
        }

        var endTime = new Date();

        oRuntime.innerHTML += "<div>Number: " + n + " Buffer Size: " + bufferSize + " Runtime: " + (endTime - startTime) + "</div>";
    }

    function PrintNumbers()
    {
        var oNumbers = document.getElementById("divNumbers");
        var tbNumber = document.getElementById("tbNumber");
        var tbBufferSize = document.getElementById("tbBufferSize");

        var n = parseInt(tbNumber.value, 10);
        var bufferSize = parseInt(tbBufferSize.value, 10);

        oNumbers.textContent = "";

        printAllNumberBuffered(n, bufferSize);
    }
</script>
</head>
<body>
<table  border="1">
    <tr>
     <td colspan="2">
      <div>Number:&nbsp;<input id="tbNumber" type="text" />Buffer Size:&nbsp;<input id="tbBufferSize" type="text" /><input type="button" value="Run" onclick="PrintNumbers();" /></div>
     </td>
    </tr>
    <tr>
     <td style="vertical-align:top" width="30%">
      <div  id="divRuntime"></div>
     </td>
     <td width="90%">
      <div id="divNumbers"></div>
     </td>
    </tr>
</table>
</body>
</html>
Josh