Status: So far the best answer's program executes in 33% of the time of the original program! But there is probably still other ways to optimize it.
Lua is currently the fastest scripting language out there, however Lua scores really bad in a few benchmarks against C/C++.
One of those is the mandelbrot test (Generate Mandelbrot set portable bitmap file N=16,000), where it scores a horrible 1:109(Multi Core) or 1:28(Single Core)
Since the Delta in speed is quite large, this is a good candidate for optimizations. Also I'm sure some that those who know who Mike Pall is might believe its not possible to optimize this any further, but that's blatantly wrong. Anyone who has done optimizations knows it is always possible to do better. Besides I did manage to get some extra performance with a few tweaks, so I know its possible :)
-- The Computer Language Shootout
-- http://shootout.alioth.debian.org/
-- contributed by Mike Pall
local width = tonumber(arg and arg[1]) or 100
local height, wscale = width, 2/width
local m, limit2 = 50, 4.0
local write, char = io.write, string.char
write("P4\n", width, " ", height, "\n")
for y=0,height-1 do
local Ci = 2*y / height - 1
for xb=0,width-1,8 do
local bits = 0
local xbb = xb+7
for x=xb,xbb < width and xbb or width-1 do
bits = bits + bits
local Zr, Zi, Zrq, Ziq = 0.0, 0.0, 0.0, 0.0
local Cr = x * wscale - 1.5
for i=1,m do
local Zri = Zr*Zi
Zr = Zrq - Ziq + Cr
Zi = Zri + Zri + Ci
Zrq = Zr*Zr
Ziq = Zi*Zi
if Zrq + Ziq > limit2 then
bits = bits + 1
break
end
end
end
if xbb >= width then
for x=width,xbb do bits = bits + bits + 1 end
end
write(char(255-bits))
end
end
So how could this be optimized (of course as with any optimization you have to measure your implementation to be sure its faster). And you aren't allowed to alter the C-core of Lua for this, or use LuaJit, its about finding ways to optimizing one of Lua's weak weak points.
Edit: Putting a Bounty on this as to make the challenge more fun.