views: 1715
answers: 9

Hi,

I'm using CakePHP with $modelName->find(...) calls to select quite a number of rows (could be hundreds)

Normally, in PHP/MySQL this wouldn't be a problem, as you'd fetch the rows one at a time in a while loop. But CakePHP loads all rows into an array, which exhausts the memory limit.

Is there a way to use the $modelName->find(...) constructs but return an iterator to fetch each row on demand?

Thanks, David

A: 

If you're using a stored procedure to produce the query, it might be a good idea to mimic paging behaviour and fetch segments of the result at a time. In this case, you'd send two extra parameters to the stored procedure, a starting row index and a "page" size, and adjust the select statement to retrieve only the records between row_index + 1 and row_index + page_size.

This can all be put into an extra layer of looping: fetch a new segment in the outer loop, then go through each row of that segment in the inner loop.
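
A minimal sketch of that outer/inner loop in plain PHP with mysqli, assuming a hypothetical stored procedure fetch_rows(row_index, page_size) (the procedure name and connection details are placeholders, not from the question):

// Hypothetical paging loop; fetch_rows() and the credentials are assumptions.
$mysqli = new mysqli('localhost', 'user', 'pass', 'db');
$pageSize = 100;

for ($offset = 0; ; $offset += $pageSize) {
    $result = $mysqli->query(sprintf('CALL fetch_rows(%d, %d)', $offset, $pageSize));
    if (!$result || $result->num_rows === 0) {
        break; // no more segments
    }
    while ($row = $result->fetch_assoc()) {
        // process a single row; only one segment is in memory at a time
    }
    $result->free();
    // a CALL returns an extra result set that mysqli requires you to clear
    while ($mysqli->more_results() && $mysqli->next_result()) {
        if ($extra = $mysqli->store_result()) {
            $extra->free();
        }
    }
}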

notnot
Thanks, it's a solution I suppose. The only problem is that there's a trade-off between memory and queries. So, if I have 1,000 rows and memory for 100, then I have to do 10 queries, but I still have to store 100 rows at a time. It becomes worse at 10,000 or 100,000... Unfortunate really. Dave
Davidd
It depends on what you're doing with the data on the other side. If you're displaying it, then you can use AJAX to keep populating the page based on each new chunk that comes in, which would save time in loading the page initially. The programmer's bane: you have to trade time for space and vice versa.
notnot
I'm running through all the rows (they're metadata for rows in other tables... which are children of other rows...) and doing some stats to generate a summary array. I just need to "see" each row once, which should be fine even if it takes a while, but loading everything at once is the problem. Offset it is, it seems.
Davidd
A: 

You might also be getting so much data because of your model relationships. For your someModel->find() call, how many other models are related to someModel?

I've used at least 1000 rows in an array on a typical shared hosting server without memory issues.

UPDATE: You can simply use the Containable behavior if you do not want the related models to be returned. So, if you are trying to find all Posts but do not want the Comments, you can use the $this->Post->contain() syntax to return only Post records. Containable is a behavior that must be added in your model via the $actsAs property: var $actsAs = array('Containable');
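
A short sketch of that setup (CakePHP 1.2+ Containable; the Post and Author model names are just examples):

// In the model:
class Post extends AppModel {
    var $actsAs = array('Containable');
}

// In a controller action: fetch Posts with no associated models at all
$posts = $this->Post->find('all', array('contain' => false));

// Or keep just one association for this call
$this->Post->contain('Author');
$posts = $this->Post->find('all');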

A: 

I understand that you're looking for an iterator returned from the find call, but how about using a LIMIT clause with a moving offset and a fixed page size (the number of rows you want returned at once)? This could bring up some concurrency issues, but if you also include an ORDER BY id clause, you should see consistent behavior in the rows returned. Then, in your loop, just repeatedly issue the find(...) query.

This clearly isn't as elegant a solution as an iterator would be, but I'd imagine that the overhead of repeatedly issuing a request for more rows would roughly balance out against the savings from retrieving several rows at once (in Cake).
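
Something like this, using the standard 'limit'/'page'/'order' keys of CakePHP's find() (the Post model name is an assumption):

$pageSize = 100;
$page = 1;
do {
    $rows = $this->Post->find('all', array(
        'limit' => $pageSize,
        'page'  => $page,
        'order' => 'Post.id ASC', // keeps the paging deterministic
    ));
    foreach ($rows as $row) {
        // process one row; at most $pageSize rows are in memory at once
    }
    $page++;
} while (count($rows) === $pageSize);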

Finally, if you really are looking for performance, then I think CakePHP might not be your cup of tea. It is improving in speed with each release, but I believe it still lags behind other frameworks pretty significantly when it comes to performance.

Travis Leleu
I suppose using offset and limit is the best way. I was hoping not to have to resort to it, presuming that there would be a way to access CakePHP's fetching method manually. It's just unfortunate, because what I'm doing should be pretty efficient but there's so much overhead. Arg... Dave
Davidd
Overhead is the name of the game with Cake!
Travis Leleu
A: 

I guess this is not possible, because CakePHP dynamically builds a multidimensional array representing your database entity relationships. This can only be done after fetching all the query rows, so that CakePHP knows every related entity.

Example:

Three rows need to be fetched to build the corresponding multidimensional array:

Article 1
  |
  -- Comment 1
  |
  -- Comment 2
  |
  -- Comment 3

Query result (1..n):

Article | Comment
--------|--------
1       | 1
1       | 2
1       | 3
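
Roughly the structure CakePHP assembles from those three joined rows (field lists trimmed for brevity):

$article = array(
    'Article' => array('id' => 1 /* , ... */),
    'Comment' => array(
        array('id' => 1, 'article_id' => 1 /* , ... */),
        array('id' => 2, 'article_id' => 1 /* , ... */),
        array('id' => 3, 'article_id' => 1 /* , ... */),
    ),
);

Until the last joined row has been read, Cake can't know the Comment list is complete, which is why the whole result set gets materialized.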
knoopx
Not sure I understand. Usually, even with a complex hierarchy, you can use a constant number of queries, especially in CakePHP given the way it loads into memory. So, if you have two articles, each with fifty comments, you can first get the articles, then SELECT * FROM comments WHERE article_id IN (article_ids).
Davidd
This is inefficient and a bad practice. CakePHP uses *eager loading*, which prevents the dreaded 1+N problem, in which fetching 100 posts that each need to display their author triggers 101 database queries. Through the use of eager loading, the 101 queries can be reduced to 2.
knoopx
+2  A: 

No, out of the box CakePHP (and ActiveRecord in general) doesn't support iterating over a result set like that. If you have a use case where you actually need all the records of a table, you're probably better off using raw SQL (or rethinking your use case).

You could also (and have likely already thought to) use an offset of some kind and call ->find multiple times. If you take this approach, don't forget to order your result set by some field to ensure a deterministic result; databases only *appear* to return rows in a consistent order when you leave the ORDER BY off. I haven't personally tried this, and it seems inefficient from a multiple-queries standpoint, but it's something worth trying.
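
If you go the raw-SQL route, one option (a sketch outside Cake's model layer; the connection details and table name are placeholders) is an unbuffered PDO query, which hands you rows one at a time instead of materializing the whole result set:

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
// unbuffered: rows stay on the server until you fetch them
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

$stmt = $pdo->query('SELECT * FROM posts ORDER BY id');
foreach ($stmt as $row) {
    // one row in memory at a time
}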

Alan Storm
Actually, I'm implementing this stuff right now using PDO and buffered queries. I return an object from my find function that implements Iterator and holds a handle to the statement. http://stackoverflow.com/questions/531536/php-orm-query-results-arrays-vs-result-handle-wrapped-in-iterator-interface
SchizoDuckie
and people say cake is slow... huph!
Xeoncross
A: 

Thanks for all the advice.

Looks like it will either be a case of using direct MySQLi queries, which means total control, or using offset/limit queries and looping.

Cheers, Dave

Davidd
A: 

You can add a limit to your find request. I do not have time right now to write a complete answer; I'll update it later.

And no, from what I know, when you run a query against MySQL with the normal driver, it returns all the elements of your select anyway, so if you are hitting the memory limit the problem might be somewhere else. You could add a limit to restrict the number of rows. If your table has multiple dependencies but you don't need to load every foreign key, you could use the 'contain' key to load only what you need.

Could you give us a description of your table and what you want to select?

Sybiam
A: 

If your problem is caused by the relationships of your model, you can reduce the recursion this way:

$modelname->recursive = -1;

then you will only get the data of the current model, without any relations.

While iterating through the records, you can then obtain each one's relationships by querying again with recursive > 0.
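
A sketch of that pattern (the Post model and field names are assumptions):

$this->Post->recursive = -1;
$posts = $this->Post->find('all'); // bare rows, no associated data

foreach ($posts as $post) {
    // pull first-level associations for one record at a time
    $full = $this->Post->find('first', array(
        'conditions' => array('Post.id' => $post['Post']['id']),
        'recursive'  => 1,
    ));
    // work with $full here
}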

rossoft
A: 

You can either limit the find method call with the recursive param (see the API for Model::find), or you can unbind model associations on the fly and reduce the amount of data retrieved (see "Creating and Destroying Associations on the Fly" in the manual).
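
Sketches of both options (the Post model and the Comment association name are assumptions):

// Option 1: limit recursion for a single find()
$posts = $this->Post->find('all', array('recursive' => 0));

// Option 2: drop one association for the next find() only
$this->Post->unbindModel(array('hasMany' => array('Comment')));
$posts = $this->Post->find('all');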

RFelix