tags:

views:

298

answers:

5

So I'm working on a project for a realtor. I have the following objects/MySQL tables in my design:

Complexes
Units
Amenities
Pictures
Links
Documents
Events
Agents

These are the relationships between the above objects.

Complexes have a single Agent.
Complexes have multiple Units, Amenities, Pictures, Links, Documents, and Events.
Units have multiple Pictures, Links, and Documents.

Amenities, Pictures, Links, Documents, and Events all have the necessary foreign keys in the database to specify which unit/complex they belong to.

I need to load the necessary objects from the database into PHP so I can use them in my project.

If I try to select all the data out of the table in 1 query, using LEFT JOINS, I'll get AT LEAST (# of links) * (# of pictures) * (# of documents) rows for each unique unit. Add amenities, and events to that and I'll get all that * # of amenities * # of events for each complex...Not sure I want to try to deal with loading that into an object in PHP.

The other possibility is for each complex/unit, execute 1 separate SQL statement each for links, pictures, documents, events and amenities

My questions are as follows:

If I properly index all my tables, is it REALLY a bad idea to execute 3-5 extra queries for each complex/unit?

If not, how else can I get the data I need to load into a PHP object. Ideally, I would have an object as follows for units:

Unit Object
(
    [id]
    [mls_number]
    [type]
    [retail_price]
    [investor_price]
    [quantity]
    [beds]
    [baths]
    [square_feet]
    [description]
    [featured]
    [year_built]
    [has_garage]
    [stories]
    [other_features]
    [investor_notes]
    [tour_link]
    [complex] => Complex Object
        (
            [id]
            [name]
            [description]
            etc.
        )
    [agent] => Agent Object
        (
            [id]
            [first_name]
            [last_name]
            [email]
            [phone]
            [phone2] 
            etc.

        )
    [pictures] => Array
        (
            [1] => Picture Object
                (
                )
        )
    [links] => Array
        (
            [1] => Link Object
                (
                )
        )
    [documents] => Array
        (
            [1] => Document Object
                (
                )
        )    
)

I don't ALWAYS need ALL of this information, sometimes I only need the primary key of the complex, sometimes I only need the primary key of the agent, etc. But I figured the correct way to do this would be to load the entire object every time I instantiate it.

I've been doing a lot of research on OO PHP, but most (read all) online examples use only 1 table. That obviously doesn't help as the project I'm working on has many complex relationships. Any ideas? Am I totally off the mark here?

Thanks

[UPDATE]

On the other hand, usually on the front-end, which everyone will see, I WILL need ALL the information. For instance, when someone wants information on a specific complex, I need to display all units belonging to that complex, all pictures, document, links, events for the complex as well as all pictures, documents and links for the unit.

What I was hoping to avoid was, during one page load, executing one query to get the complex I need. Then another query to get the 20 units associated with the complex. Then for each of the 20 units, executing a query for picture, another for documents, another for links, etc. I wanted to get them all at once, with one trip through the database.

[EDIT 2] Also, note that the queries to select the pictures, documents, links, events, and agent from the database are pretty simple. Just basic SELECT [list of columns] FROM [table] WHERE [primary_key] = [value] with the occasional INNER JOIN. I'm not doing any complex computations or subqueries, just basic stuff.

[BENCHMARK] So after reading all the answers to my question, I decided to run a benchmark on what I decided to do. What I do is load all the units that I need. Then as I need to display pictures, document, blah blah, I load them at that time. I created 30,000 test units, each with 100 pictures, 100 documents, and 100 links. Then I loaded a certain number of units (I started with 1000, then 100, then the more realistic 10), looped through them, then loaded all pictures, documents and links associated to the unit. With 1000 units, it took approximately 30 seconds. With 100 units, it took about 3 seconds. With 10 units, it took about .5 seconds. There was a lot of variance with the results. Sometimes, with 10 units, it would take .12 seconds. Then it would take .8. Then maybe .5. Then .78. It was really all over the place. However, it seemed to average around half a second. In reality, though, I might only need 6 units at a time, and they each might only have 10 pictures, 5 links and 5 documents associated with them...so I think the "grab the data when you need it" approach is the best bet in a situation like this. If you needed to get all this data at once though, it would be worthwhile to come up with a single SQL statement to load all the data you need so you are only looping through the data one time (6700 units at a time took 217 seconds while the full 30,000 made PHP run out of memory).

A: 

the point of ORM is not to load entire objects every time. the point is to make it easy and transparent for your app to access object.

that being said, if you need the unit object, then load the unit object, and only the unit object. if you need the agent object, then load that when you need it, not when you load the unit object.

longneck
The biggest problem is that for the administration section, these pieces are segmented to allow granular management. But on the front-end, 1 page will be displaying a complex, all it's units, all related pictures, documents, events, and links as well as the assigned Agent and contact information...
SpaDusA
since you'll be accessing your data via indexed columns and actually using it all, this shouldn't be a problem. my answer was geared towards pulling back all of the data because the ORM said so, instead of because it will be used.
longneck
+2  A: 

If I properly index all my tables, is it REALLY a bad idea to execute 3-5 extra queries for each complex/unit?

In short, no. For each of the related tables, you should probably run a separate query. That's what most ORM (Object-Relational Mapping/Modelling) systems would do.

If performance is really a problem (and, based on what you've said, it won't be) then you might consider caching the results using something like APC, memcache or Xcache.

Rob Knight
A: 

Maybe you should think of breaking this up.

When you initiate your object, get only what details you need for that object to function. If and when you need more details, then go and get them. You distribute your load and processing this way: the object only gets the load and processing it needs to function, and when more is needed, it gets it then.

So, in your example - create the complex first. When you need to access a unit, then create that unit, when you need the agent, then get that agent, etc.

$complexDetails = array('id' => $id, etc);
$complexUnits = array();
.........
$complexUnits[] = new unit();
.........
$complexDetails['agent'] = new Agent();
Robert DeBoer
A: 

I had to address this issue a while back when I concocted my own MVC framework as an experiment. To limit the layers of data loaded from the DB, I passed an integer to the constructor. Each constructor would decrement this integer before passing it to the constructors of the objects it instantiated. When it got to 0, no more sub-objects would be instantiated. This meant, basically, the int passed was the number of layers loaded.

So if I only wanted an attribute of the unit object, I'd do this:

$myUnit = new Unit($unitId,1);
Lucas Oman
A: 

If you want to "store" the objects, meaning cache them, just load them into a PHP array and serialize it. Then you can store it back to the database, in memcache or anywhere else. Attaching a label to it would allow you to retrieve it, and include a time stamp so you know how old it is (i.e. needs to be refreshed).

If the data doesn't change, or changes infrequently, there really is no reason to run multiple complex queries every time. Simple ones, like getting a primary, you might as well just hit the database directly.

Brent Baisley
I don't want to store the objects, I don't think. What I'm trying to figure out is just how to load the information in the database efficiently in PHP.
SpaDusA