I was wondering whether we could replace our Atom N270-based nettops, which run a Rails (Ruby 1.8.6...) web app, with an equivalent ARM-based device (we like the fanless setup, low power consumption, etc.).

The ARM device is an XScale-PXA270 @ 520 MHz with 128MB of RAM (and probably some slower SDRAM), running Linux. There was always enough free memory, and performance was comparable to a jailbroken iPhone.

Benchmarking the production database (SQLite) gave us promising results (ARM was just 20-30% slower), so I tried to build ruby (1.9.2p0).

The Rails app ran very slowly on ARM (fetching from SQL and generating templates was 10-20x slower), so I decided to run some benchmarks to find the bottlenecks.

Again, some results were OK (on par with the older Ruby 1.8.6 we are using now on the Atom, 6x slower than Ruby 1.9.2 on the Atom), and some were very slow (20-30x slower). For example, hash methods appear to be 40x slower on ARM. Running the Ruby Benchmark Suite showed more bottlenecks: strings, threads, arrays...
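To narrow down which primitive is slow, the categories mentioned above can be timed separately. The following is only a minimal sketch: the helper name `micro_benchmarks` and the chosen cases are hypothetical and are not part of the Ruby Benchmark Suite. It compares an empty loop against hashing, hash insertion, array appends and string appends, so that loop overhead can be subtracted from the container costs.

```ruby
require 'benchmark'

# Hypothetical helper (not from the Ruby Benchmark Suite): times a few
# primitive operations separately so that loop overhead, hashing and
# container growth can be told apart on each platform.
def micro_benchmarks(n)
  cases = {
    "empty loop"    => lambda { (0..n).each { |i| i } },
    "integer #hash" => lambda { (0..n).each { |i| i.hash } },
    "hash insert"   => lambda { h = {}; (0..n).each { |i| h[i] = i } },
    "array append"  => lambda { a = []; (0..n).each { |i| a << i } },
    "string append" => lambda { s = String.new; (0..n).each { |i| s << "x" } }
  }
  results = {}
  # Benchmark.realtime returns the elapsed wall-clock seconds as a Float.
  cases.each { |label, work| results[label] = Benchmark.realtime { work.call } }
  results
end

if __FILE__ == $0
  micro_benchmarks(10**4).each do |label, seconds|
    printf("%-14s %10.6f s\n", label, seconds)
  end
end
```

Running the same file on both machines and comparing the per-case ratios should show whether the slowdown is specific to hashes or also affects the plain loop.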

I knew ARM is slower than Atom; I just was not expecting such a huge difference, especially after SQLite ran fine.

Is there some flaw in Ruby on ARM? Do I need to apply some patches? Is this hopeless, meaning I should rewrite the whole app in C if I want to use the ARM device, or does the device simply not have enough computing power?

Examples

def fib(n) 
  return 1 if n < 2
  fib(n-1)+fib(n-2)
end 

Benchmark.bm do |x| 
  x.report { fib(32) }
  x.report { fib(36) }
  x.report { h = {}; (0..10**3).each {|i| h[i] = i}  } 
  x.report { h = {}; (0..10**4).each {|i| h[i] = i}  } 
  x.report { h = {}; (0..10**5).each {|i| h[i] = i}  } 
end

Run with:

ruby -rbenchmark bench.rb

Atom N270, 1GB

ruby 1.9.2p0 (2010-08-18) [i686-linux]
      user     system      total        real
  2.440000   0.000000   2.440000 (  2.459400)
 16.780000   0.030000  16.810000 ( 17.293015)
  0.000000   0.000000   0.000000 (  0.001180)
  0.020000   0.000000   0.020000 (  0.012180)
  0.160000   0.000000   0.160000 (  0.161803)

ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-linux]
      user     system      total        real
 12.500000   0.020000  12.520000 ( 12.628106)
 84.450000   0.170000  84.620000 ( 85.879380)
  0.010000   0.000000   0.010000 (  0.002216)
  0.040000   0.000000   0.040000 (  0.032939)
  0.240000   0.010000   0.250000 (  0.255756)

XScale-PXA270 @ 520, 128MB

ruby 1.9.2p0 (2010-08-18) [arm-linux]
      user     system      total        real
 12.470000   0.000000  12.470000 ( 12.526507)
 85.480000   0.000000  85.480000 ( 85.939294)
  0.060000   0.000000   0.060000 (  0.060643)
  0.640000   0.000000   0.640000 (  0.642136)
  6.460000   0.130000   6.590000 (  6.605553)
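To make the comparison concrete, the slowdown factors can be computed directly from the "real" column of the two Ruby 1.9.2 tables above. This is just a small sketch; the constant and method names are made up for illustration. With these figures the fib cases come out at roughly 5x and the hash cases at roughly 40-50x.

```ruby
# Wall-clock ("real") seconds copied from the Ruby 1.9.2 result tables above.
LABELS   = ["fib(32)", "fib(36)", "hash 10**3", "hash 10**4", "hash 10**5"]
ATOM_192 = [2.459400, 17.293015, 0.001180, 0.012180, 0.161803]
ARM_192  = [12.526507, 85.939294, 0.060643, 0.642136, 6.605553]

# Returns how many times slower each entry of `other` is than `baseline`.
def slowdowns(baseline, other)
  baseline.zip(other).map { |b, o| o / b }
end

LABELS.zip(slowdowns(ATOM_192, ARM_192)).each do |label, ratio|
  printf("%-12s %5.1fx slower on ARM\n", label, ratio)
end
```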

Build with:


 ./configure --host=arm-linux --without-X11 --disable-largefile \
--enable-socket=yes --without-Win32API --disable-ipv6 \
--disable-install-doc --prefix=/opt --with-openssl-include=/opt/include/ \
--with-openssl-lib=/opt/include/lib

ENV:

PFX=arm-iwmmxt-linux-gnueabi

export DISCIMAGE="/opt"
export CROSS_COMPILE="arm-linux-"
export HOST="arm-linux"
export TARGET="arm-linux"
export CROSS_COMPILING=1
export CC=$PFX-gcc
export CFLAGS="-O3 -I/opt/include"
export LDFLAGS="-O3 -L/opt/lib/"
#LIBS=
#CPPFLAGS=
export CXX=$PFX-g++
#CXXFLAGS=
export CPP=$PFX-cpp

export OBJCOPY="$PFX-objcopy"
export LD="$PFX-ld"
export AR="$PFX-ar" 
export RANLIB="$PFX-ranlib"
export NM="$PFX-nm"
export STRIP="$PFX-strip"
export ac_cv_func_setpgrp_void=yes
export ac_cv_func_isinf=no
export ac_cv_func_isnan=no
export ac_cv_func_finite=no

+2  A: 

It seems you're complaining that the optimizations new in Ruby 1.9.2 (compared to 1.8.x) are x86-specific. The Atom and ARM performance is comparable for Ruby 1.8.x. Perhaps you could ask on a Ruby-specific mailing list. A quick search shows that yes, there were many changes in Ruby 1.9.x:

Ruby 1.9.2 brings [...] major speed improvements to Ruby by way of the Yet Another Ruby VM (YARV) interpreter

Perhaps the right question is "Does YARV have x86-specific optimizations? Could these optimizations be duplicated in the ARM port?"
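Before asking that question on a mailing list, it may help to attach how each interpreter was actually built. The sketch below prints a few entries from `RbConfig::CONFIG`; the helper name `build_summary` and the selection of keys are assumptions made for illustration, and some keys may be unset on a given build. It does not show whether YARV contains x86-specific code paths, only whether the two builds differ in compiler or flags.

```ruby
require 'rbconfig'

# Keys chosen for illustration; any of them may be missing on a given build.
BUILD_KEYS = %w[arch host CC CFLAGS optflags LDFLAGS configure_args]

# Returns [key, value] pairs describing how this Ruby was compiled.
def build_summary(config = RbConfig::CONFIG)
  BUILD_KEYS.map { |key| [key, config[key] || "(not set)"] }
end

if __FILE__ == $0
  puts RUBY_DESCRIPTION
  build_summary.each { |key, value| puts "#{key}: #{value}" }
end
```

Running this on both the Atom and the ARM device and comparing the output side by side gives a concrete starting point for the discussion.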

TomMD