As I learn more and more about OOP, and start to implement various design patterns, I keep coming back to cases where people are hating on Active Record.

Often, people say that it doesn't scale well (citing Twitter as their prime example) -- but nobody actually explains why it doesn't scale well, and/or how to achieve the pros of AR without the cons (via a similar but different pattern?).

Hopefully this won't turn into a holy war about design patterns -- all I want to know is **specifically** what's wrong with Active Record.

If it doesn't scale well, why not?

What other problems does it have?

+2  A: 

The main complaint I've seen about Active Record is that when you create a model around a table and select several instances of that model, you're basically doing a "select * from ...". This is fine for editing or displaying a record, but if you want to, say, display a list of the cities for all the contacts in your database, you could do "select City from ..." and only get the cities. Doing this with Active Record would mean selecting all the columns but only using City.

Of course, varying implementations will handle this differently. Nevertheless, it's one issue.

Now, you can get around this by creating a new model for the specific thing you're trying to do, but some people would argue that the effort outweighs the benefit.
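
To make the difference concrete, here is a minimal sketch in plain Ruby. The Contact class, its sql_for and all methods, and the in-memory CONTACTS rows are all hypothetical stand-ins, not the API of Rails or of any real library; the only point is the contrast between fetching every column and fetching just one.

```ruby
# Hypothetical in-memory "table": each row is a Hash of column => value.
CONTACTS = [
  { id: 1, name: "Ann", city: "Oslo",   notes: "x" * 1000 },
  { id: 2, name: "Bob", city: "Bergen", notes: "y" * 1000 }
].freeze

# Toy Active Record style class (an illustration, not the Rails API).
class Contact
  TABLE = "contacts"

  # The SQL a simple implementation might send: "*" unless columns are named.
  def self.sql_for(columns = nil)
    list = columns ? Array(columns).join(", ") : "*"
    "SELECT #{list} FROM #{TABLE}"
  end

  # Simulated fetch: return whole rows, or only the requested columns.
  def self.all(columns = nil)
    return CONTACTS.map(&:dup) if columns.nil?

    keys = Array(columns)
    CONTACTS.map { |row| row.slice(*keys) }
  end
end

puts Contact.sql_for          # SELECT * FROM contacts
puts Contact.sql_for(:city)   # SELECT city FROM contacts
p Contact.all(:city)          # rows that carry only the :city column
```

With the default call every row drags the large notes column along, even if the caller only wants the city. Whether a given Active Record implementation lets you narrow the column list like this depends on the implementation.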

Me, I dig Active Record. :-)

HTH

Tim Sullivan
"Doing this with Active Record would require that you're selecting all the columns, but only using City." It's actually extremely easy to specify a select clause.
MattMcKnight
A: 

The problem that I see with Active Record is that it's always just about one table. That's okay as long as you really work with just that one table, but when you work with data, in most cases you'll have some kind of join somewhere.

Yes, a join is usually worse than no join at all when it comes to performance, but a join is usually better than a "fake" join, where you first read the whole of table A and then use the information gained to read and filter table B.
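
A rough way to see the cost of the "fake" join is to count round trips. The sketch below uses a hypothetical FakeDb class with made-up customers and contacts; it is not a real database driver, and the SQL in the comments only indicates what each method stands for.

```ruby
# Hypothetical in-memory database that counts the "queries" it receives.
class FakeDb
  attr_reader :query_count

  def initialize
    @query_count = 0
    @customers = [{ id: 1, name: "Acme" }, { id: 2, name: "Globex" }]
    @contacts = [
      { id: 10, customer_id: 1, email: "a@acme.test" },
      { id: 11, customer_id: 2, email: "b@globex.test" },
      { id: 12, customer_id: 2, email: "c@globex.test" }
    ]
  end

  # Stands for: SELECT * FROM customers
  def all_customers
    @query_count += 1
    @customers
  end

  # Stands for: SELECT * FROM contacts WHERE customer_id = ?
  def contacts_for(customer_id)
    @query_count += 1
    @contacts.select { |contact| contact[:customer_id] == customer_id }
  end

  # Stands for: SELECT ... FROM customers LEFT JOIN contacts ON ...
  def customers_with_contacts
    @query_count += 1
    @customers.map do |customer|
      matching = @contacts.select { |contact| contact[:customer_id] == customer[:id] }
      customer.merge(contacts: matching)
    end
  end
end

# "Fake" join: read table A, then issue one more query per row for table B.
def emails_fake_join(db)
  db.all_customers.flat_map do |customer|
    db.contacts_for(customer[:id]).map { |contact| contact[:email] }
  end
end

# Real join: a single round trip returns customers with their contacts.
def emails_real_join(db)
  db.customers_with_contacts.flat_map do |customer|
    customer[:contacts].map { |contact| contact[:email] }
  end
end

slow = FakeDb.new
emails_fake_join(slow)
fast = FakeDb.new
emails_real_join(fast)
puts "fake join queries: #{slow.query_count}"
puts "real join queries: #{fast.query_count}"
```

Both functions return the same emails; the fake join simply needs one query for the customers plus one more for every customer row, so the gap grows with the size of table A.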

BlaM
Using :joins or :include in the finders, e.g. Customer.find(:all, :include => :contacts, :conditions => "active = 1"), will do an SQL join, not a full table scan of either table.
Tilendor
A: 

@BlaM: You're absolutely right. Although I've never used Active Record, I have used other bolted-on ORM systems (particularly NHibernate), and there are two big complaints I have: silly ways to create objects (i.e., .hbm.xml files, each of which gets compiled into its own assembly) and the performance hit incurred just loading objects (NHibernate can spike a single-core proc for several seconds executing a query that loads nothing at all, when an equivalent SQL query takes almost no processing).

Not specific to Active Record of course, but I find most ORM systems (and ORM-like systems) seem to suffer from these types of problems.

DannySmurf
There are many alternatives to using hbm.xml files. See for example NHibernate.Mapping.Attributes and fluent-nhibernate.
Mauricio Scheffer
About object creation performance, I've never run into such perf problems, you might wanna check with a profiler.
Mauricio Scheffer
@mausch: Don't need a profiler. It's a fairly well-known issue. Don't know if it applies to the latest version (which I am not using at my job yet). http://ayende.com/Blog/archive/2007/10/26/Real-World-NHibernate-Reducing-startup-times-for-large-amount-of.aspx
DannySmurf
And, looking at NHibernate.Mapping.Attributes, it looks like all this does is generate the .hbm.xml files in memory, which doesn't solve anything. My issue is not maintaining those; it's the startup penalty that comes with them. Auto-generating doesn't solve that.
DannySmurf
The post you quote deals with mapping creation at app startup, not object creation at query-time. When I mentioned object creation perf, I was referring to the latter.
Mauricio Scheffer
Mapping creation at startup *is* a problem if you're dealing with 1000s of entities.
Mauricio Scheffer
@mausch: Both things are an issue: app startup, and query-time creation. Startup is the bigger pig; query-time is just resource-intensive.
DannySmurf
+2  A: 

I love the way SubSonic does the one-column-only thing.
Either:

DataBaseTable.GetList(DataBaseTable.Columns.ColumnYouWant)

or:

Query q = DataBaseTable.CreateQuery()
               .WHERE(DataBaseTable.Columns.ColumnToFilterOn,value);
q.SelectList = DataBaseTable.Columns.ColumnYouWant;
q.Load();

But Linq is still king when it comes to lazy loading.

Lars Mæhlum
+49  A: 

There's ActiveRecord the Design Pattern and ActiveRecord the Rails ORM Library, and there are also a ton of knock-offs for .NET and other languages.

These are all different things. They mostly follow that design pattern, but extend and modify it in many different ways, so before anyone says "ActiveRecord Sucks" it needs to be qualified by asking "which ActiveRecord? There are heaps."

I'm only familiar with Rails' ActiveRecord, so I'll try to address all the complaints which have been raised in the context of using it.

@BlaM

The problem that I see with Active Records is, that it's always just about one table

Code:

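class Person < ActiveRecord::Base
    belongs_to :company   # each person row carries a company_id foreign key
end
# Load all people and eager-load their associated companies
people = Person.find(:all, :include => :company)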

This generates SQL with LEFT JOIN companies ON companies.id = people.company_id, and automatically builds the associated Company objects, so you can do people.first.company without hitting the database again because the data is already present.

@pix0r

The inherent problem with Active Record is that database queries are automatically generated and executed to populate objects and modify database records

Code:

person = Person.find_by_sql("giant complicated sql query")

This is discouraged as it's ugly, but for the cases where you just plain and simply need to write raw SQL, it's easily done.

@Tim Sullivan

...and you select several instances of the model, you're basically doing a "select * from ..."

Code:

people = Person.find(:all, :select=>'name, id')

This will only select the name and ID columns from the database; all the other 'attributes' in the mapped objects will just be nil unless you manually reload that object, and so on.

Orion Edwards
+1  A: 

@BlaM: Sometimes I just implemented an active record for the result of a join. It doesn't always have to be the relation Table <--> Active Record. Why not "Result of a Join statement" <--> Active Record?

Johannes
A: 

@Orion Edwards: Mighty! I didn't know about that specific feature. Yet another pro-AR argument for me to put into my arsenal.

Tim Sullivan
A: 

The problem with ActiveRecord is that the queries it automatically generates for you can cause performance problems.

You end up doing some unintuitive tricks to optimize the queries that leave you wondering if it would have been more time effective to write the query by hand in the first place.

engtech
+26  A: 

I have always found that ActiveRecord is good for quick CRUD-based applications where the Model is relatively flat (as in, not a lot of class hierarchies). However, for applications with complex OO hierarchies, a DataMapper is probably a better solution. While ActiveRecord assumes a 1:1 ratio between your tables and your data objects, that kind of relationship gets unwieldy with more complex domains. In his book on patterns, Martin Fowler points out that ActiveRecord tends to break down under conditions where your Model is fairly complex, and suggests a DataMapper as the alternative.

I have found this to be true in practice. In cases where you have a lot of inheritance in your domain, it is harder to map inheritance to your RDBMS than it is to map associations or composition.

The way I do it is to have "domain" objects that are accessed by your controllers via these DataMapper (or "service layer") classes. These do not directly mirror the database, but act as your OO representation for some real-world object. Say you have a User class in your domain, and need to have references to, or collections of other objects, already loaded when you retrieve that User object. The data may be coming from many different tables, and an ActiveRecord pattern can make it really hard.

Instead of loading the User object directly and accessing data using an ActiveRecord style API, your controller code retrieves a User object by calling, for instance, the UserMapper.getUser() method. It is that mapper that is responsible for loading any associated objects from their respective tables and returning the completed User "domain" object to the caller.

Essentially, you are just adding another layer of abstraction to make the code more manageable. Whether your DataMapper classes contain raw custom SQL, or calls to a data abstraction layer API, or even access an ActiveRecord pattern themselves, doesn't really matter to the controller code that is receiving a nice, populated User object.
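
As a minimal sketch of that idea in plain Ruby: the User struct, the get_user method name, and the in-memory arrays standing in for tables are all assumptions made for illustration. A real mapper would run SQL or call a data access layer where this one filters arrays.

```ruby
# Plain domain object: it knows nothing about tables or SQL.
User = Struct.new(:id, :name, :posts, :comments)

# Hypothetical mapper; the arrays of hashes stand in for real tables.
class UserMapper
  def initialize(users:, posts:, comments:)
    @users = users
    @posts = posts
    @comments = comments
  end

  # Load the user row plus its associated rows and return one
  # fully assembled domain object (or nil if the user is unknown).
  def get_user(id)
    row = @users.find { |user_row| user_row[:id] == id }
    return nil unless row

    User.new(
      row[:id],
      row[:name],
      @posts.select { |post| post[:user_id] == id }.map { |post| post[:title] },
      @comments.select { |comment| comment[:user_id] == id }.map { |comment| comment[:body] }
    )
  end
end

mapper = UserMapper.new(
  users: [{ id: 1, name: "sam" }],
  posts: [{ user_id: 1, title: "Hello" }],
  comments: [{ user_id: 1, body: "Nice" }, { user_id: 2, body: "Other" }]
)

user = mapper.get_user(1)
puts "#{user.name}: #{user.posts.length} post(s), #{user.comments.length} comment(s)"
```

The controller only ever sees the populated User; how the mapper gathered the pieces can change without touching the calling code.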

Anyway, that's how I do it.

Sam McAfee
A: 

Although all the other comments regarding SQL optimization are certainly valid, my main complaint with the active record pattern is that it usually leads to impedance mismatch. I like keeping my domain clean and properly encapsulated, which the active record pattern usually destroys all hope of doing.

Kevin Pang
ActiveRecord actually *solves* the impedance mismatch problem by letting you code in an OO fashion against a relational schema.
Mauricio Scheffer
Care to elaborate? General consensus is that objects modeled after a relational database are, by definition, not object oriented (since relational databases don't revolve around OO concepts such as inheritance and polymorphism).
Kevin Pang
There are three known ways to map inheritance to a relational schema. Ref: http://www.castleproject.org/ActiveRecord/documentation/trunk/usersguide/typehierarchy.html
Mauricio Scheffer
I think you're mistaking the Castle Active Record OSS project for Active Record the design pattern. The original question (and my response) are referring to the design pattern. The Castle Active Record project has things baked into it to help with OO development, but the pattern itself does not.
Kevin Pang
I was just quoting Castle as reference. RoR's ActiveRecord implements Single table inheritance only (http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html), but the other strategies are being considered (http://blog.zerosum.org/2007/2/16/inheritance-vs-relational-databases-in-ror)
Mauricio Scheffer
Even with single table inheritance, you can use inheritance and polymorphism (http://api.rubyonrails.org/classes/ActiveRecord/Base.html)
Mauricio Scheffer
So ActiveRecord can be used with any of these three strategies to get inheritance and polymorphism. Maybe it could be thought of as a simple extension to the pattern.
Mauricio Scheffer
+2  A: 

The question is about the Active Record design pattern. Not an ORM tool.

The original question is tagged with rails and refers to Twitter which is built in Ruby on Rails. The ActiveRecord framework within Rails is an implementation of Fowler's Active Record design pattern.

John Topley
+5  A: 

I think there is likely a very different set of reasons behind why people are "hating" on ActiveRecord and what is "wrong" with it.

On the hating issue, there is a lot of venom towards anything Rails related. As far as what is wrong with it, it is likely that it is like all technology and there are situations where it is a good choice and situations where there are better choices. The situation where you don't get to take advantage of most of the features of Rails ActiveRecord, in my experience, is where the database is badly structured. If you are accessing data without primary keys, with things that violate first normal form, where there are lots of stored procedures required to access the data, you are better off using something that is more of just a SQL wrapper. If your database is relatively well structured, ActiveRecord lets you take advantage of that.

To add to the theme of replying to commenters who say things are hard in ActiveRecord with a code snippet rejoinder:

@Sam McAfee Say you have a User class in your domain, and need to have references to, or collections of other objects, already loaded when you retrieve that User object. The data may be coming from many different tables, and an ActiveRecord pattern can make it really hard.

user = User.find(id, :include => ["posts", "comments"])
first_post = user.posts.first
first_comment = user.comments.first

By using the include option, ActiveRecord lets you override the default lazy-loading behavior.

MattMcKnight
A: 

Try doing a many to many polymorphic relationship. Not so easy. Especially when you aren't using STI.

Omega
A: 

How does Active Record address the mismatch between OOP architecture and database architecture, i.e. the impedance mismatch? Isn't strong coupling between OOP design and database design doomed to scalability issues?

A: 

My long and late answer, not even complete, but a good explanation WHY I hate this pattern, opinions and even some emotions:

1) short version: Active Record creates a "thin layer" of "strong binding" between the database and the application code. Which solves no logical, no whatever-problems, no problems at all. IMHO it does not provide ANY VALUE, except some syntactic sugar for the programmer (who may then use an "object syntax" to access some data that exists in a relational database). The effort to create some comfort for the programmers should (IMHO...) better be invested in low level database access tools, e.g. some variations of simple, easy, plain hash_map get_record( string id_value, string table_name, string id_column_name="id" ) and similar methods (of course, the concepts and elegance vary greatly with the language used).

2) long version: In any database-driven project where I had the "conceptual control" of things, I avoided AR, and it was good. I usually build a layered architecture (sooner or later you do divide your software into layers, at least in medium- to large-sized projects):

A1) the database itself, tables, relations, even some logic if the DBMS allows it (MySQL is also grown-up now)

A2) very often, there is more than one data store: file system (blobs in a database are not always a good decision...), legacy systems (imagine for yourself "how" they will be accessed, many varieties possible.. but that's not the point...)

B) database access layer (at this level, tool methods, helpers to easily access the data in the database are very welcome, but AR does not provide any value here, except some syntactic sugar)

C) application objects layer: "application objects" sometimes are simple rows of a table in the database, but most times they are compound objects anyway, and have some higher logic attached, so investing time in AR objects at this level is just plainly useless, a waste of precious coders' time, because the "real value", the "higher logic" of those objects needs to be implemented on top of the AR objects, anyway - with and without AR! And, for example, why would you want to have an abstraction of "Log entry objects"? App logic code writes them, but should that have the ability to update or delete them? Sounds silly, and App::Log("I am a log message") is some magnitudes easier to use than le=new LogEntry(); le.time=now(); le.text="I am a log message"; le.Insert();. And for example: using a "Log entry object" in the log view in your application will work for 100, 1000 or even 10000 log lines, but sooner or later you will have to optimize - and I bet in most cases, you will just use that small beautiful SQL SELECT statement in your app logic (which totally breaks the AR idea..), instead of wrapping that small statement in rigid fixed AR idea frames with lots of code wrapping and hiding it. The time you wasted writing and/or building AR code could have been invested in a much more clever interface for reading lists of log entries (many, many ways, the sky is the limit). Coders should dare to invent new abstractions to realize their application logic that fit the intended application, and not stupidly re-implement silly patterns that sound good at first sight!

D) the application logic - implements the logic of interacting objects and creating, deleting and listing(!) of application logic objects (NO, those tasks should rarely be anchored in the application logic objects themselves: does the sheet of paper on your desk tell you the names and locations of all other sheets in your office? Forget "static" methods for listing objects, that's silly, a bad compromise created to make the human way of thinking fit into [some-not-all-AR-framework-like-]AR thinking)

E) the user interface - well, what I will write in the following lines is very, very, very subjective, but in my experience, projects that built on AR often neglected the UI part of an application - time was wasted on creating obscure abstractions. In the end such applications wasted a lot of coders' time and feel like applications from coders for coders, tech-inclined inside and outside. The coders feel good (hard work finally done, everything finished and correct, according to the concept on paper...), and the customers "just have to learn that it needs to be like that", because that's "professional".. ok, sorry, I digress ;-)

Well, admittedly, this is all subjective, but it's my experience (Ruby on Rails excluded, it may be different, and I have zero practical experience with that approach).

In paid projects, I often heard the demand to start with creating some "active record" objects as a building block for the higher level application logic. In my experience, this conspicuously often was some kind of excuse for the fact that the customer (a software dev company in most cases) did not have a good concept, a big view, an overview of what the product should finally be. Those customers think in rigid frames ("in the project ten years ago it worked well.."), they may flesh out entities, they may define entity relations, they may break down data relations and define basic application logic, but then they stop and hand it over to you, and think that's all you need... they often lack a complete concept of application logic, user interface, usability and so on and so on... they lack the big view and they lack love for the details, and they want you to follow that AR way of things, because.. well, why, it worked in that project years ago, it keeps people busy and silent? I don't know. But the "details" separate the men from the boys, or .. how was the original advertisement slogan ? ;-)

After many years (ten years of active development experience), whenever a customer mentions an "active record pattern", my alarm bell rings. I learned to try to get them back to that essential conceptual phase, let them think twice, try to show them their conceptual weaknesses or just avoid them altogether if they are undiscerning (in the end, you know, a customer that does not yet know what it wants, maybe even thinks it knows but doesn't, or tries to externalize concept work to ME for free, costs me many precious hours, days, weeks and months of my time, life is too short ... ).

So, finally: THIS ALL is why I hate that silly "active record pattern", and I do and will avoid it whenever possible.

EDIT: I would even call this a No-Pattern. It does not solve any problem (patterns are not meant to create syntactic sugar). It creates many problems: the root of all its problems (mentioned in many answers here..) is that it just hides the good old well-developed and powerful SQL behind an interface that is, by the pattern's definition, extremely limited.

This pattern replaces flexibility with syntactic sugar!

Think about it, which problem does AR solve for you?

frunsi
It is a data source architectural pattern. Perhaps you should read Fowler's Patterns of Enterprise Application Architecture? I had similar thoughts to yours prior to actually using the pattern/ORM and finding how much it simplified things.
MattMcKnight
I share your feelings. I smell something wrong when a framework does not support compound keys.... I avoided any kind of ORM before SQLAlchemy, and we often use it at a lower level, as a SQL generator. It implements Data Mapper and is very flexible.
Marco Mariani
For two days now I have been involved in a project that uses "state-of-the-art" ORM, maybe the implementations have matured now (in comparison to what I worked with some years ago). Maybe my mind will change, we'll see in three months :-)
frunsi
The project is done, and you know what? ORM still sucks, I wasted so much time mapping problems that are easily expressed in a relational way onto a bunch of "object-oriented code". Well, of course the ORM provided ways to express queries in a kind of OOP+SQL-Mix - of course in an OOP-like syntax - but that just took more time than simply writing an SQL query. The abstraction leaked, the "OOPSQLExperiment" on top of OOP - to allow users to write SQL in OOP syntax - was the worst idea ever. No, never again.
frunsi