views: 101 | answers: 4

I'm trying to build something (ultimately a gem but for now an application) that works as follows.

Suppose for example the DB records are breeds of dog. There's a Dog parent class and a child class for each breed. The actual breeds aren't known until runtime.

When the server begins, it will load records from the DB and instantiate instances of classes based on those records; e.g. I may have two beagles and a poodle. When someone comes to the server they may want to access one of those dog instances.

Why not just create the instance on the fly? In my case the "dogs" are basically classes that hold an algorithm and data. The algorithm doesn't change, the data changes rarely (on the order of days), but the algorithm itself, which uses both that data and some dynamic input passed in (such as a timestamp), will be run multiple times a second.
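
To make that concrete, one of these "dogs" looks roughly like this (the class name, data, and algorithm are made up for illustration):

class Beagle
  def initialize
    @data = load_breed_data           # slow: imagine a DB read here, done once
  end

  def run(timestamp)
    @data[:weight] * timestamp.to_i   # fast: called many times a second
  end

  private

  def load_breed_data
    { :weight => 10 }
  end
end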

It would be silly to have to recreate an instance of the object and load the data each time just to serve a request, only to do it again on the next request (the requests don't change the state of the object). I'd be creating and destroying multiple objects a second when I could just reuse the same object.

It doesn't make sense to keep it in the session, since someone wanting a poodle shouldn't need to have the beagle's information in her session object; it's irrelevant (and doesn't scale).

How do I persist these objects in memory? I basically want a lookup table to hold the instances. In Java I would create a singleton with some type of hashmap or array that sits in memory. In Rails I tried this by creating a singleton class in the lib folder. I think (I may not be understanding this right) that the instance (the fact that it's a singleton is moot) is being lost when the session disappears.
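
Roughly the kind of thing I tried in lib (using the Beagle class sketched above; in reality there'd be one entry per breed):

require 'singleton'

class DogRegistry
  include Singleton

  def initialize
    @dogs = { :beagle => Beagle.new }   # built once, one entry per breed class
  end

  def [](key)
    @dogs[key.to_sym]
  end
end

# DogRegistry.instance[:beagle]  # expected: the same object on every request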

The closest answer I found was http://www.ruby-forum.com/topic/129372 which basically puts everything in class fields and methods. Somehow that doesn't seem right.

TIA!

Addition: I come from Java. In Java I'd just create an object that sits on the heap or maybe in a JNDI tree, and as HTTP requests came in they'd be handled by a servlet or EJB or some per-request item which could then access the persistent object. I can't seem to find the equivalent in Rails.

+2  A: 

I wouldn't worry too much about loading and discarding objects unless you can come up with a benchmark that proves it's an issue. Each request creates an extraordinary number of intermediate objects as a matter of course, and these are generally created and destroyed within a few milliseconds.

It's generally better to focus on loading only what is required, de-normalizing your database to push frequently accessed data or methods into a convenient location, or saving the results of complicated calculations in a cache field.
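
For instance, the "cache field" idea might look roughly like this (the Dog model, the score_cache column, and the calculation are invented for illustration):

class Dog < ActiveRecord::Base
  before_save :cache_score

  private

  def cache_score
    # stand-in for the slow calculation you don't want to repeat on every request
    self.score_cache = (1..10_000).inject(:+)
  end
end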

Benchmark first, optimize only when required.

Saving instances of models to a class cache can work, but it is only an option in a production environment where the model classes are not reloaded with each request. It can also expose you to errors caused by stale data.

If you have a scaling problem that cannot be solved with these methods, you might want to investigate building a persistent server for a portion of your functionality using a combination of Rack and EventMachine. There are a number of ways to go about building a background process that can perform complicated calculations using a pre-loaded set of data, but the specific approach will depend on several things such as the type of data you're working with and how frequently it will be accessed.
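
As a very rough sketch of that kind of standalone process (the breed class, port, and line protocol below are all invented for illustration), an EventMachine server could pre-load the objects once and answer lookups over TCP:

require 'eventmachine'

class Beagle                                       # stand-in for a pre-loaded "dog"
  def initialize; @data = { :weight => 10 }; end   # imagine a slow DB read here
  def run(timestamp); "#{@data[:weight]}@#{timestamp.to_i}"; end
end

DOGS = { 'beagle' => Beagle.new }                  # built once when the process boots

module DogLookup
  def receive_data(data)                           # one lookup per line of input
    dog = DOGS[data.strip]
    send_data(dog ? dog.run(Time.now) + "\n" : "unknown breed\n")
  end
end

EventMachine.run do
  EventMachine.start_server('127.0.0.1', 9000, DogLookup)
end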

tadman
Thank you for taking the time to respond. Scaling aside, this just does not seem like proper programming. Continually recreating the same object is just wrong from an OO standpoint when the object itself can be reused. Rack seems like it's just a much lighter-weight version of Rails; I didn't see anything in the documentation suggesting it can do persistence that Rails can't. EventMachine seems like the right answer. http://www.neeraj.name/2009/12/15/ruby-eventmachine-a-short-introduction.html has a good explanation of it. I'm going to take a deeper look.
hershey
The stateless nature of HTTP often contributes to problems such as this, where one request may have nothing to do with the next, and an application framework such as Rails often has to clean house on all objects used in the prior request to make room for the next. If you need a persistent environment, build an engine which runs in the background and you'll find massive performance gains, not just from avoiding redundant reloads but from asynchronous operations. Writing a server that "speaks" the Memcache protocol is pretty easy, and it can be used like a regular cache even if it computes answers.
tadman
I spoke too soon. EventMachine only persists state within the connection. I won't be able to persist a connection across web requests, so I'll still lose the state. Any other ways to build an engine?
hershey
I'm talking about building an EventMachine engine that runs as a process separate from the Rails app, not building an EventMachine module within your Rails environment. How many objects are you talking about loading here? Most modern hardware can import and instantiate 100K per second without too much in the way of optimization.
tadman
It's not that many objects; early on it can probably scale OK. It's more philosophical, since something just doesn't feel right. If you look at the edit I did at the bottom of my original post you'll see how I've done things in Java on multiple occasions. I could have just recreated the objects (in some but not all of those cases), but Java let me keep the object in memory on the heap (across any and all sessions). It just feels wrong that Rails can't do that, that a useful piece of functionality is missing.
hershey
Although the Java approach is arguably better since it doesn't involve reloading the same records so frequently, it's also made possible because the Java frameworks are designed with the presumption that objects of this sort will persist between requests. Rails has been designed from the outset so that each request generally lives in its own context, with very little being carried over from one to the next. The "Rails Way" is to simply reload things often to ensure they're fresh, and where required, denormalize or cache. I've found that approach is usually good enough.
tadman
A: 

Maybe your example is confusing in its simplicity. I assume your objects are quite complicated and that your benchmarking shows that constructing them on each request is unreasonably expensive.

In production mode, classes are not unloaded between requests, but instances of those classes are. So using class variables sounds like the way to go to me. Just use them to store your instances.

class ObjectCache
  @@objects = { :beagle => Beagle.new, :poodle => Poodle.new }

  # e.g. ObjectCache.lookup(:beagle) from anywhere in the app
  def self.lookup(key)
    @@objects[key.to_sym]
  end
end
fullware
Thanks for the reply. You mentioned "in production mode." I'm trying this in development mode and I'm still losing the object between requests (even when the methods are class methods as well, e.g. def self.lookup key) but maybe that's because of the mode. I'd settle for that working for now but frankly this really feels like the wrong way to do it.
hershey
It's not very well defined what happens when you keep instances of ActiveRecord models between requests, so this is probably best avoided. If you're storing just plain-old Ruby objects it might work.
tadman
Good warning; fortunately these are just Ruby objects, not ActiveRecord objects, so I'm not worried about state being confused in and out of the database. The state of the "dog" objects in the example I gave above comes from some data loaded (read from, never written to) from the DB, plus the algorithm. There's no risk of an object being inconsistent with the DB, since there's no corresponding object in the DB in this case.
hershey
You can prevent classes from being unloaded in development mode as well.
fullware
How? http://apidock.com/rails/Module/unloadable says unloadable is deprecated. The best I found was http://strd6.com/2009/04/cant-dup-nilclass-maybe-try-unloadable/; is that the right way to do it? (Unloadable seemed much cleaner.)
hershey
I wasn't talking about "unloadable", but on that topic I haven't found any specific details as to why it is supposedly deprecated, or a way to achieve the same effect. I don't think it's applicable to this example, however. A plugin's classes, for example, aren't unloaded by default, even in development mode, so that's roughly the behavior you need.
fullware
A: 

In production, the controllers and model classes are not reloaded between requests, so you have several options:

  • set the objects as class variables in your application_controller (see the sketch after this list)
  • create singleton methods in the model classes themselves that return the values
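
A minimal sketch of the first option, assuming a hypothetical Dog.load_all that builds the instances:

class ApplicationController < ActionController::Base
  @@dog_cache = nil

  # built once per process and reused across requests, as long as the class stays loaded
  def self.dog_cache
    @@dog_cache ||= Dog.load_all
  end
end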
Toby Hede
A: 

Yes, classes can be prevented from unloading in development mode, too; it's not only a production mode thing. It happens by default in production mode, whereas you must enable it manually in development mode.
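
For the Rails versions of this era, that means something like the following in config/environments/development.rb (at the cost of restarting the server to pick up code changes):

config.cache_classes = true   # classes, and anything cached on them, survive between requests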

oddlyzen