Hey there!

I'm developing a web application with Merb and I'm looking for a safe and stable image processing library. I used to work with Imagick in PHP, then moved to Ruby and started using RMagick. But there is a problem: long-running scripts cause memory leaks. There are a couple of solutions out there, but I don't know which one is the most stable. So, what do you think?

Right now my app uses an internal API that I wrote in PHP to process images. It runs on a separate server along with other applications, so it's not a big problem. But I don't think it's a good architecture.

Anyway, I'll consider any practical tips.

+1  A: 

I too have encountered this issue - the solution is to force garbage collection.

Once you have reassigned the image variable to a new image, call GC.start so that the old, now unreferenced image is collected and its memory is released.

On later versions of RMagick, I believe you can also call destroy! on the image when you have finished processing it.

A combination of the two would probably ensure you are covered, but I'm not sure of the real-life impact on performance (I would assume it is negligible in most cases).

Alternatively, you could use mini-magick, which is a wrapper for the ImageMagick command-line client.
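A minimal sketch of that combination follows. The with_image helper and the FakeImage class are hypothetical names made up for illustration. FakeImage only stands in for Magick::Image so the snippet runs without RMagick installed, and the sketch assumes the real image object responds to destroy!.

```ruby
# Hypothetical helper (not part of RMagick): yields an image and always
# releases it afterwards, even if the block raises.
def with_image(image)
  yield image
ensure
  image.destroy! if image.respond_to?(:destroy!)
end

# Stand-in for Magick::Image so this sketch runs without RMagick.
class FakeImage
  attr_reader :destroyed

  def initialize
    @destroyed = false
  end

  def destroy!
    @destroyed = true
    self
  end
end

img = FakeImage.new
result = with_image(img) { |_image| :processed }

# Optional: force a collection after a batch of images, not after every one.
GC.start
```

With RMagick itself, the call would look something like with_image(Magick::Image.read("in.jpg").first) { |image| ... }, where the file name is a placeholder.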

Matt
Yes, that was one of the solutions I've heard about. On the other hand, calling the GC all the time is not a good idea (I saw an article about it a while ago): it can slow things down, and GC is a very 'expensive' operation. I'm not sure about it, but right now I have no other options. There is also an improved version of RMagick, but it still has memory leaks; it's just a matter of time.
Dan Sosedoff
My advice would be to profile the garbage collection and see if you can deal with it. An alternative would be ImageScience (http://seattlerb.rubyforge.org/ImageScience.html), but it's not as capable as RMagick.
Matt
By calling Image#destroy! on intermediate images, I was able to reduce memory usage by an order of magnitude in my RMagick-bound Rails app, from 200MB down to > 40MB. The trick was to keep the number of Magick::Image/Magick::ImageList objects in RAM at the same time as low as possible. Calling GC.start was not necessary.
foz
A: 

This is not due to ImageMagick; it's due to Ruby itself, and it's a well known problem. My suggestion is to split your program into two parts: a long-running part that allocates little memory and just deals with the control of the system, and a separate program that actually does the processing work. The long-running control process should do just enough to find some work for a child process that it spawns, and the child should do all of the processing for that particular work item.
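As a rough sketch of that split, assuming a Unix-like system where fork is available: the parent below only hands out work, and each item is processed in a short-lived child whose memory goes back to the OS when it exits. process_item and run_in_child are hypothetical names, and the string result stands in for real image work.

```ruby
# Hypothetical work function: stands in for the memory-hungry image processing.
def process_item(item)
  "processed #{item}"
end

# Run one work item in a forked child and hand its result back through a pipe.
def run_in_child(item)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(process_item(item))
    writer.close
    exit!(0) # leave without running at_exit handlers inherited from the parent
  end
  writer.close
  output = reader.read
  reader.close
  Process.wait(pid)
  raise "child failed for #{item}" unless $?.success?
  output
end

# The long-running control process only finds work and spawns children.
results = %w[a.jpg b.jpg].map { |item| run_in_child(item) }
```

The pipe is only one way to get a result back; writing the processed file to disk and checking the exit status would work just as well.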

Another option would be to leave the two combined, but after a work unit is complete, use exec to replace your process with a freshly started version of the same program, which would search for another work item, process it, and exec itself again.

This is assuming that the work items are fairly large, which they almost certainly are if you're using ImageMagick. If they're not, you'll find that the overhead of spawning a new process and having the Ruby interpreter re-parse your entire program starts to get a little too large. You can deal with this by having your program do more work units (say, ten or a hundred) before re-executing itself.
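The batching idea could be sketched roughly like this. run_batch, BATCH_SIZE and REEXEC are hypothetical names, and the batch size of ten is just an assumed value. Note that exec discards everything in memory, so in a real program the queue would have to live outside the process (a database table, a directory of files); the in-memory array here is only for demonstration, and the demo passes a stub instead of really re-executing.

```ruby
require "rbconfig"

BATCH_SIZE = 10 # assumed value; tune it to the size of your work items

# Default re-exec: replace this process with a fresh copy of the same script.
REEXEC = -> { exec(RbConfig.ruby, $PROGRAM_NAME, *ARGV) }

# Process up to batch_size items, then call reexec if work remains.
def run_batch(queue, batch_size: BATCH_SIZE, reexec: REEXEC)
  done = []
  batch_size.times do
    item = queue.shift
    break if item.nil?
    done << yield(item)
  end
  reexec.call unless queue.empty?
  done
end

# Demonstration with a stub in place of the real exec.
reexec_calls = 0
queue = [1, 2, 3]
done = run_batch(queue, batch_size: 2, reexec: -> { reexec_calls += 1 }) { |n| n * 2 }
```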

Curt Sampson
Like I said in the original post, I'm currently using an external program that is implemented as an API.
Dan Sosedoff
Yes, I see now. I guess my advice then is still the same, except you might interpret it as, "stick with it"...
Curt Sampson
+1  A: 

Actually, it isn't really a Ruby-specific problem; other interpreters share it as well. The concrete problem is that Ruby's GC only sees memory that was allocated by Ruby itself, not memory allocated by external libraries (the notable exception being libraries that use Ruby's memory management facilities). So an ImageMagick object in Ruby's memory space is really small, but the image in the space managed by ImageMagick is large. This is not a leak per se, but it behaves like one. Ruby's garbage collector never kicks in if your process stays under a certain limit (8MB is standard). As ImageMagick never creates large objects in Ruby space, it probably never kicks in. So you can either use the proposed method of spawning a new process, or use exec. Another rather nifty option is to have an image processing service in the backend that forks for every task. Yet another would be to have some kind of monitoring in place that kickstarts the GC every once in a while.
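The last option could look something like the following sketch: a background thread that forces a GC run at a fixed interval. start_gc_monitor is a hypothetical name, and the short interval is chosen only so the demonstration finishes quickly; a real service would likely use seconds or minutes.

```ruby
# Hypothetical monitor: forces a GC run every `interval` seconds
# from a background thread until the thread is killed.
def start_gc_monitor(interval)
  Thread.new do
    loop do
      sleep interval
      GC.start
    end
  end
end

runs_before = GC.count
monitor = start_gc_monitor(0.05)
sleep 0.3 # stands in for the application doing its image work
monitor.kill
monitor.join
runs_after = GC.count
```

GC.count reports how many collections have happened so far, which makes it easy to check that the monitor is actually doing something.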

There is another library called MagickWand by Timothy Paul Hunter (the author of RMagick) that tries to address these issues and provide a nicer API. It's in alpha and requires a rather new release of ImageMagick, though.

Skade
Well, based on the comment to my answer, I might well be wrong about the particulars of this situation. But I am reasonably confident in saying that if there's a leak, it's far more likely in Ruby than in a well-used library like ImageMagick. (Well, OK, there's also the possibility that it's related to the interface between the two, but that counts as Ruby for me.) But I think your general characterisation of the GC behaviour is incorrect. So long as the glue code that interfaces to the library is managing memory appropriately, whatever the library allocates should be seen by the GC.
Curt Sampson
As I said: it is not a leak by definition, though it behaves like one. If the GC dereferences the object, the memory is released correctly. The point is that the GC never sees enough allocated memory to even kick in. And Ruby's GC is really conservative about when it kicks in, especially given that it _never_ kicks in as long as the memory it can see stays under a certain size. You are right: a well-crafted library that ensures all allocated memory can be seen by Ruby fixes this problem - this is basically what MagickWand is doing.
Skade