views:

103

answers:

1

I am looking to do some quite processor-intensive brute force processing for string matching. I have run my prototype in a multi-threaded environment and compared the performance to an implementation using Gridgain with a couple of nodes (also multithreaded).

The performance I observed was that my Gridgain implementation performed slower to my multithreaded implementation. It could be the case that there was a flaw in my gridgain implementation, but it was only a prototype, and I thought the results were indicative. So my question is this:

What are the advantages of having to learn and then build an implementation for a particular grid platform (hadoop, gridgain, or EC2 if going hosted - other suggestions welcome), when one could fairly easily put together a lightweight compute grid platform with a much shallower learning curve?...i.e. what do we get for free with these cloud/grid platforms that are worth having/tricky to implement?

(Please note, I don't have any need for a data grid)

Cheers,

-James

(p.s. Happy to make this community wiki if needbe)

+1  A: 

What kind of grid are you dealing with? A dozen hosts running the same OS would be pretty straightforward to run a grid for - all you really have to deal with is sending work to each host, maybe a little load balancing, maybe take into account what to do if a host goes down, maybe deal with distributing new service code to the hosts when you update your service, but if you don't deal with any of those it's not a big deal since the grid is a manageable size. If you're dealing with 1000s of hosts, or with a service that should never be down or have errors due to single hosts going down then you suddenly have to worry about:

  • not overloading any single host
  • distributing new service code
  • detecting when a host isn't responding and not sending it new work, as well as resending whatever it was working on
  • possibly working across different OSes and architectures (little vs. big endian)
  • energy savings - shutting down hosts during low load and bringing them back up for high load
  • scaling - if you add 100 hosts to your grid tomorrow how long does it take to get them connected and working?
  • reliability - some services may actually perform calculations on 2-3 different hosts and only return an answer that all the hosts agree on

That's a short list of things that most grid software should do for you if you need it. If you're working on something small or non-critical then by all means, roll your own. If you're working on something that has to work, or is big enough that having any manual steps in a deployment process would be a maintenance nightmare then you probably want to go with something that already exists.

tloach
@tloach, this is exactly the sort of list I was looking for, thanks very much. I was surprised that this question didn't attract more answers or more views...Perhaps grid/cloud computing is not quite as big as the media is making it out to be, or people are so busy deploying interesting apps to their grids that they don't have time for SO!
James B