views:

49

answers:

2

I have this question on and of for a year or two, but it is so nonspecific, and my native language is not English, so I don't know how to ask to be precise and everybody will understand what is my dilemma.

It is even more valid if using objects but for the sake of simplicity I'll make a simple example (theoretical) with a function and database.

We all know functions are good because I don't need on every page to think about what it does, it simply does it. Let's say we have a member table in mysql that has many fields including ID, username, age, sex, avatarlink. For this example tat's enough...

Now I want to make a function named fWriteUser. That function would on many locations in a site write username, put a small picture of member gender (sex) and put a avatar image next to it and when you go with mouse over username it would say user's age. Ok, stupid example, but it is ok I guess...

On every page I deal with usernames of course I have a recordset with some of the parameters I need from them. For example in forum I also take username and ID out of this table and from forum tables I get topic and message of forum post...

Now I have 2 choices to call the function:

  1. I allways call the function with like fWriteUser(userID). The userID I have in session variable (I use ASP but the question is not ASP specific), but I could have a cookie or something else. I don't worry about calling this function, as I only feed userID and the function does the rest. On every page on the site I have same call and I get what the function is supposed to to. But this way I need in the function to take sex, avatarlink, username from the query. I HAVE to make another select from table member to take this data out. And if in forum I have 20 topics this would be called 20 times. ID is known of the member so I guess this is a fast function. If I would use objects I would call object method but in this method the query would also need to be run.

And sicer I take some member data on every page I in fact spend mysql time doing it in function again. For the commodity of having to call function only with ID, so I can in fact later change inside of function and add some new member info as I only change the function itself.

  1. Second way is I feed the function with all the required data from every page I need to call it like fWriteUser(userID, username, sex, avatarlink)... That way I can take all this data on every HTML page when I take other member data so the SELECT query runs only once. The big downside is that if I want to add new member variable I have to open all pages and in the SELECT statement add another variable and change function call to add another variable (or add new property if using objects).

For a big site approach 2 is certanly a pain to implement and sustain... I would prefer to use option 1, but that would add many more select statements on every page, because member function call is just example, I may have other functions.

So my question is is it wise to make 10 selects instead of one (every function it's own select) or is it better to feed every variable from the main page and make it hard to maintain...

I don't know how much processor time is neeed for one SELECT from members if I take just a few variabls, I have indexes and about 100.000 members in the table. Is this select cached somehow?

I could make like a 100 session variables for members so I can feed them when needed inside functions but I am not sure how many sessions I can have and it would work for members but not for some other data...

I know it is not a specific problems but more of a best practice problem, but this is preventing me to make functions "self sustained" and dependent on one parameter instead of 10 and I am having difficulty maintaining the site. I tried to use objects (I am now to OOP) but I see objects methods have the same problem - you have to SELECT in each method if you want to do different things with object. I mean, functions in objects that need something from database have same problem.

Thank you for everybody who made it till here... I know it was a long read :-( But it is a year or two on my mind...

Yours

Jerry

UPDATE - on all the answers there are, I think the point is still being missed. Let me try to make myself more clear with a practical example.

I have a forum. On the forum there is a list of 50 topics with titles and user data. Titles are in table topics and userdata comes from table members.

I have 2 choices to fetch the data:

  1. I can make a SELECT FROM topics JOIN members etc... and get in each row a title and all the user data I need.

  2. I can take only topics data and do not take member data except member ID at this point for every loop. Where in HTML I have to show username's their gender, avatar link etc. I can call function fWriteUser(varID) and the function in effect does everything, load from member table and does output with the data.

Obviously it is better to use method 1 because I call select 1 time! I have all the data I need for the page in this select. But because I don't want to fetch member data again from function fWriteUser I have to pass all the member data that the function uses like call fWriteUser(ID, avatarlink, gender, etc.).

I am using this approach now and it is fast of course. The second approach would not be so efficient as I would need 1 query to fetch topic data and then n-times make SELECT in the function to get user data for topic writer 1, topic writer 2 etc. I could do cache as was suggested, but that would only help if the authors on the topic list would be the same. If I have 50 different authors each with it's own topic, I would still need a recordset loop to get topic data and user ID and inside that loop I would call function 50 times which would make 50 SELECT statements for 50 topics to get the member data of the correcponding members.

But this second approach is nicer to maintain pages. I just call fWriteUser(varID) and I don't worry. Using first approach I worry. If I add another parameter I want to have manipulating print output from a member (like display age) I need to open ALL the html pages I have and add age to all the queries that I use on the pages where memebrs are involved (and this are of course different queries, not same, as once I join members with topics, once with recipes, once with photo albums etc...). That is hours work just because I wanted to add age. So extending the call to have one more parameter is possible automatically using arrays, but the problem is I have to modify all the join queries anyway.

I hope I am more clear now what bothers me, if not I reast my case ;-)

+1  A: 

Hello Jerry,

If I understand your problem correctly you need something that will do the same processing in multiple places on your site. That is a function. So you most definitely need a function (or method if you use OOP).

To avoid running the same query more than once you can implement some simple script cache. I will assume memory will not be a problem and you won't need to run the function for thousands of users in the same script execution. Correct me if I'm wrong.

To implement that cache keep a static member in that function that will play the role of your cache warehouse. An array where the keys are the user IDs. On each call if that is the first time the function is called for that ID do the query and then add to the cache. If not get the result directly from the cache.

function getData($id){
    static $cache = array();
    if(!isset($cache[$id])){
        $cache[$id] = /* get the data */
    }

    /* do any processing or output with data from $cache[$id] */   
}

What you need to do is basically modify your original function and cache it's output for some input values. To be able to cache something you need something that will persist between function calls. That would be a static variable. You could also use a global variable if you're not comfortable with how a static variable works.

Alin Purcaru
Yes I know I need function, but I was not sure if I make DB query in every function on that page or feed the function with 10 parameters I get on the main HTML page when querying that table.
Jerry2
About running for thousand of users, not on one page of course. But for example on forum I have 20 topics from different users at once, so I would need to have data for up to 20 users at the same time in this one HTML.
Jerry2
Your script is in PHP I guess, I am much more familiar with classic ASP. Could you tell me that kind of objects do you use here, variables? So whe users loads a page you make a array variable for each user that has all the things he need inside and in any function that uses this user and needs it's data fist I can getData? Is this my right understanding?
Jerry2
I say PHP in the tags and I assumed you are familiar with it. Unfortunately I can't instruct you in ASP specifics but the principles should be the same. The key here is the static variable that you use for caching. You could replace it with a global variable if your not familiar with static. Also 20 users will not be a problem and you can use caching.
Alin Purcaru
I just want to ask about your PHP script: You fetch to static array? I mean the one that is lost across pages? So I would fetch the user data to static variables on every page? at the beginning before functions are used? Is it any good to use your instructions but to fetch to session object so I would not need to fetch user data on every page? But user would carry to much data I guess... But for HIMSELF he could carry session data I think, not load his data on every page. For the other users that "write" to forum, using static data would be best. But how to find variables names in the array?
Jerry2
You could implement any level caching if you want, but note that the database also has caching so you shouldn't go too far with the caching level. Session data is stored on your server. Not on the client side. Session identification is stored by the client, but that is just a session ID.
Alin Purcaru
You should consider reading about functions and data types in your preferred language so you have a good understanding of how they work before going further. This shouldn't take you more than a few hours.
Alin Purcaru
So if I make 2 succesive calls to mysql for the same info it takes them from cache? What is the fatest way to fetch data? Database, session or static variable?
Jerry2
I have read this things about functions, I do have a big site with thousands of them, it is just a problem I'd like to use the fastest route because I have many visitors.
Jerry2
Yes, to the first question, that is the whole idea of a cache. Well definitely a variable in your script. That's why I suggested it in the first place.
Alin Purcaru
So before I need let's say any user data I put a script to fetch the 30+ variables for that user? Or even better, every function that handles users should have this check and fetches once if it is not there. But I would have to fetch all the variables, 40+ even if one particular function uses only 4. Is this overhead? Do statis variables take much of the processor time or memory (I have IIS7 8GB)
Jerry2
Variables take just memory. Processing takes processing time.
Alin Purcaru
S 8GB could accomodate having few thousand of variables I guess..
Jerry2
I have edited my post to clarify my need
Jerry2
A: 

Hi Jerry,

although I am not sure if I got all the aspects of your elaborate question right, I think it is about two problems actually:

1st: Minimizing database access 2nd: Minimizing maintenance overhead on extending the structure of your user (or whatever model) object.

The first point is answered by Alin properly I guess (I am not a PHP expert), and in other application server environments I would recommend a similar pattern (e. g. like a singleton class in Java).

Now for the 2nd point: To not maintain the list of all model attributes in all places one possible approach could be to de-normlaize the structure by storing the meta information of your attributes in the database. For example to have a second table with a n:1-relationship to your user table, like in (pseudo SQL):

CREATE user_attribs (
  id INTEGER NOT NULL PRIMARY KEY, -- artifical key
  user_id INTEGER NOT BULL FOREIGN KEY (users, id), -- referencing users table
  key CHAR(<approriate length>) NOT NULL, -- the name of the attribute
  value TEXT, -- the attribute's contents
);

Then you would build a collection of all user's attributes by issueing:

SELECT key, value FROM user_attribs WHERE user_id = <actual user_id>;

or querying them all together in a join-clause, and putting them in a dictionary-like structure like a HashMap, Map, Dictionary or whatever your program language of choice gives you. For every page you only access the dictionary entries per user which are needed.

But this approach has a severe drawback: Because your attributes are referenced by strings there is no interpreter/compiler automatism to warn you about trying to access non-existing attributes, or about storing some you won't find again, because e. g. you wrote "birthday" in one module and "day_of_birth" in the other. So it needs some considerable amount of self-discipline and documentation.

ofi
Yes, that is about it, 2 problems. But if I could call functions with only 1 parameter 8and the others would be taken from DB, session, variable) that the second problem of maintaining would not be needed as I would call functions only using userID and I would only change what is inside function with new items of a member.
Jerry2
Members are special problem becuase I need their data on every page across site. Thank you for the idea of referencing tables. I would still rather call function only with one parameter every time, but i am still not sure is it wise to call db 10 times or put member parameters all in session object and fetch them to session after session is lost from DB... I am not sure how much data a session objects can hold. The Alin solution, as I see it, fetches on each page member data to array and all the functions take this info when page is executed, so 1 call per page.
Jerry2
Using session there would be 1 call only when session is lost, so this would be best approach I think, but I fear to make 40 session variables for each user on the site. I am not sure how much server could handle session objects.
Jerry2
As you mentioned in a comment to Alin's answer your are working with ASP. Hence you probably should consider caching your member data in the application context and configure your application pool accordingly (have a look at http://www.developer.com/net/asp/article.php/2245511/IIS-and-ASPNET-The-Application-Pool.htm for example). But then you have to consider security aspects like not every user seeing all ohter members attributes and the like. Does this point to your intended direction more likely?
ofi
Yes, but I am using classic ASP. Maybe I should make a dictionary with all the members but that would be a huge dictionary for 40.000 members. I was more interested in how to make functions the fastest way, either to fetch them 10 variables or to fetch them ID only and in function I fetch the rest from DB, variable etc... I like the second approach but it scares me to talk to DB table 10 times on one page, doesn't feel right.
Jerry2
I have edited my post to be more clear what I need.
Jerry2