In Ruby on Rails, suppose an Actor model object represents Tom Hanks, and its "has_many" fans association covers 20,000 Fan objects. Then

actor.fans

returns an Array with 20,000 elements. Presumably the elements are not pre-populated with values? Otherwise, fetching a single Actor object from the DB could be extremely time-consuming.

So are the records loaded on a "need to know" basis?

So does it pull data when I access actor.fans[500], and again when I access actor.fans[0]? If it jumps from record to record, it cannot optimize performance with a sequential read, which can be faster on a hard disk because the records may sit in nearby sectors or on the same platter layer. For example, if the program touches only 2 random elements, it is faster to read just those 2 records. But if it touches all elements in random order, it may be faster to read all records sequentially first and then process them in random order. How will RoR know whether I am accessing only a few random elements or all of them in random order?

+1  A: 

Why would you want to fetch 20,000 records if you only use 2 of them? Fetch only those two from the DB instead. If you want to list the fans, you will probably use pagination, i.e. limit and offset in your query, or a pagination gem such as will_paginate.
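To make the limit/offset idea concrete, here is a minimal sketch in plain Ruby. The helper names page_window and fetch_fan_ids are hypothetical (they are not part of Rails), and the in-memory FAN_IDS array only stands in for the fans table so the example runs without a database.

```ruby
# Hypothetical helper (not part of Rails): turn a 1-based page number
# into the LIMIT/OFFSET pair a paginated query would use.
def page_window(page, per_page = 10)
  raise ArgumentError, "page must be >= 1" if page < 1
  { :limit => per_page, :offset => (page - 1) * per_page }
end

# Stand-in for the fans table: 20,000 ids held in memory, for illustration only.
FAN_IDS = (1..20_000).to_a

# Simulate what the database does with LIMIT/OFFSET on an id-ordered result.
# Returns an empty array when the page lies past the last record.
def fetch_fan_ids(window)
  FAN_IDS[window[:offset], window[:limit]] || []
end

# In a Rails app the same window would go into the query itself, for example:
#   actor.fans.find(:all, page_window(3).merge(:order => "id"))  # Rails 2 style
#   actor.fans.order(:id).limit(10).offset(20)                   # Rails 3+ style
```

Only one page of rows is read per request, so the size of the whole association no longer matters. With will_paginate, the equivalent call would be along the lines of actor.fans.paginate(:page => 3, :per_page => 10).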

I see no logical reason to do it the way you describe. Explain a real situation so we can help you.

However, there is one thing you need to know while loading many associated objects from the DB: use :include, like

Actor.all(:include => :fans)

This eager-loads all the fans, so there will be only 2 queries instead of N+1, where N is the number of actors.
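The query counts can be checked with a small simulation. This is only a sketch: FakeDb is a hypothetical in-memory stand-in that counts how many "queries" it receives, and the SQL in the comments shows roughly what each call corresponds to, not what Rails literally emits.

```ruby
# Hypothetical in-memory stand-in for the database; it only counts queries.
class FakeDb
  attr_reader :query_count

  def initialize(fans_by_actor)
    @fans_by_actor = fans_by_actor
    @query_count = 0
  end

  # Roughly: SELECT * FROM actors
  def all_actor_ids
    @query_count += 1
    @fans_by_actor.keys
  end

  # Roughly: SELECT * FROM fans WHERE actor_id = ?
  def fans_for(actor_id)
    @query_count += 1
    @fans_by_actor.fetch(actor_id, [])
  end

  # Roughly: SELECT * FROM fans WHERE actor_id IN (...)
  def fans_for_all(actor_ids)
    @query_count += 1
    actor_ids.each_with_object({}) do |id, result|
      result[id] = @fans_by_actor.fetch(id, [])
    end
  end
end

# Lazy pattern: one query for the actors, then one more per actor (N+1).
def lazy_fan_counts(db)
  db.all_actor_ids.map { |id| db.fans_for(id).size }
end

# Eager pattern: one query for the actors, one for all their fans (2 total).
def eager_fan_counts(db)
  ids = db.all_actor_ids
  fans = db.fans_for_all(ids)
  ids.map { |id| fans[id].size }
end
```

With three actors, lazy_fan_counts issues 4 queries and eager_fan_counts issues 2, while both return the same counts.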

Tadas Tamosauskas
How about if we want to show the 100,000 fans in random order, so that no fan is favored over another? To show the first 10, we would actually need to fetch all 100,000 and sort them by random(). Otherwise, if we get items 11 to 20 and sort by random() again for page 2, it can contain fans that already appeared on page 1. I was initially thinking of any type of access rather than one particular case.
動靜能量
You could generate, let's say, 30 random IDs and select users where id IN (generated_ids), with :limit => 10. You need more IDs than you display because, if a user has been deleted, the generated ID will not be available.
Tadas Tamosauskas
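The oversampling idea from the comment above can be sketched in plain Ruby. The helper names random_candidate_ids and pick_existing are hypothetical, ids are assumed to run from 1 to max_id, and a Set stands in for the table so the example runs without a database.

```ruby
require "set"

# Hypothetical helper: draw `sample_size` distinct candidate ids from 1..max_id.
# An explicit Random instance can be passed in to make the draw repeatable.
def random_candidate_ids(max_id, sample_size, rng = Random.new)
  (1..max_id).to_a.sample(sample_size, random: rng)
end

# Keep only the candidates that still exist, then cap the result at `limit`.
# `existing_ids` stands in for the database; in Rails 2 style the lookup
# might read: Fan.find(:all, :conditions => { :id => ids }, :limit => 10)
def pick_existing(candidates, existing_ids, limit)
  candidates.select { |id| existing_ids.include?(id) }.first(limit)
end
```

If many ids have been deleted, fewer than `limit` rows can come back, which is why the comment suggests generating more ids than are shown. This sketch does not by itself prevent a fan from reappearing on a later page; one option, not mentioned in the thread, would be to remember the ids already shown and exclude them from later draws.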
+1  A: 
stephenr