views:

293

answers:

1

Lua is currently the fastest scripting language out there, and its not so much slower than C/C++ for some sort of programs (on par when doing pidgits 1:1), however Lua scores really bad in a few benchmarks against C/C++.

One of those is the fannkuch test (Indexed-access to tiny integer-sequence), where it scores a horrible 1:148

-- The Computer Language Benchmarks Game
-- http://shootout.alioth.debian.org/
-- contributed by Mike Pall

local function fannkuch(n)
  local p, q, s, odd, check, maxflips = {}, {}, {}, true, 0, 0
  for i=1,n do p[i] = i; q[i] = i; s[i] = i end
  repeat
    -- Print max. 30 permutations.
    if check < 30 then
      if not p[n] then return maxflips end  -- Catch n = 0, 1, 2.
      io.write(unpack(p)); io.write("\n")
      check = check + 1
    end
    -- Copy and flip.
    local q1 = p[1]    -- Cache 1st element.
    if p[n] ~= n and q1 ~= 1 then    -- Avoid useless work.
      for i=2,n do q[i] = p[i] end   -- Work on a copy.
      for flips=1,1000000 do      -- Flip ...
    local qq = q[q1]
    if qq == 1 then    -- ... until 1st element is 1.
      if flips > maxflips then maxflips = flips end -- New maximum?
      break
    end
    q[q1] = q1
    if q1 >= 4 then
      local i, j = 2, q1 - 1
      repeat q[i], q[j] = q[j], q[i]; i = i + 1; j = j - 1; until i >= j
    end
    q1 = qq
      end
    end
    -- Permute.
    if odd then
      p[2], p[1] = p[1], p[2]; odd = false  -- Rotate 1<-2.
    else
      p[2], p[3] = p[3], p[2]; odd = true   -- Rotate 1<-2 and 1<-2<-3.
      for i=3,n do
    local sx = s[i]
    if sx ~= 1 then s[i] = sx-1; break end
    if i == n then return maxflips end -- Out of permutations.
    s[i] = i
    -- Rotate 1<-...<-i+1.
    local t = p[1]; for j=1,i do p[j] = p[j+1] end; p[i+1] = t
      end
    end
  until false
end

local n = tonumber(arg and arg[1]) or 1
io.write("Pfannkuchen(", n, ") = ", fannkuch(n), "\n")

So how could this be optimized (of course as with any optimization you have to measure your implementation to be sure its faster). And you aren't allowed to alter the C-core of Lua for this, or use LuaJit, its about finding ways to optimizing one of Lua's weak weak points.

+3  A: 

Robert Gould > One of those is the fannkuch test (Indexed-access to tiny integer-sequence), where it scores a horrible 1:148

When you quote numbers from the benchmarks game please show where those numbers come from so readers have some context.

In this case you seem to have taken numbers measured on the quadcore machine where the fastest programs have been re-written to exploit multiple cores. Instead of looking at elapsed time sort by CPU time and you'll see the ratio drop to 1:43.

Or look at the median and quartiles to get a better impression of how the set of C++ measurements compares to the set of Lua measurements.

Or there's a whole set of measurements where programs are forced to use just one core - Lua compared with C++ - and if you take a look at those Lua pi-digits programs you'll see that they use the C language GNU GMP library.

igouy
That's a good point, however the weakness in this area remains. So it should be possible to gain some boosts.
Robert Gould
Look at the whole set of measurements where programs are forced to use just one core and then it seems as if pi-digits is the exception and most of the programs are 1:20 - 1:30
igouy
Is Lua not friendly to multi-core/multiprocessing/threading/etc?
Adam Davis