views:

104

answers:

8

How do you decide on which side to perform your data manipulation when you can do it either in the code or in the query?

For example, when you need to display a date in a specific format, do you retrieve the desired format directly in the SQL query, or do you retrieve the date and then format it in code?

What helps you decide: performance, best practice, preference for SQL vs. the code language, complexity of the task...?

+2  A: 

I would never (ever) specify any formatting in the query itself. That is up to the consumer to decide how to format. All data manipulation should be done at the client side, except for bulk operations.

Otávio Décio
+5  A: 

All things being equal, I prefer to do any manipulation in code. I try to return data as raw as possible so it's usable by a larger base of consumers. If it's very specialized, maybe a report, then I may do manipulation on the SQL side.

Another instance where I prefer to do manipulation on the SQL side is if it can be done in a set-based way.

If it's not set-based, and looping would be involved, then I would do the manipulation in code.

Basically, let the database do what it's good at; otherwise do it in code.
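
To illustrate the set-based vs. looping distinction, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its total_cents column and the flat 500-cent fee are hypothetical, invented only for this example. Both functions produce the same result, but the set-based one is a single statement the database can execute on its own, while the row-by-row one pulls every row into code first.

```python
import sqlite3

def make_db():
    # Hypothetical table: order totals stored as integer cents.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER)")
    conn.executemany(
        "INSERT INTO orders (total_cents) VALUES (?)",
        [(1500,), (2500,), (900,)],
    )
    return conn

def add_fee_set_based(conn):
    # One statement describes the whole change; the database applies it to the set.
    conn.execute(
        "UPDATE orders SET total_cents = total_cents + 500 WHERE total_cents < 2000"
    )

def add_fee_row_by_row(conn):
    # Same rule, but every row is fetched and updated individually from code.
    rows = conn.execute("SELECT id, total_cents FROM orders").fetchall()
    for order_id, total in rows:
        if total < 2000:
            conn.execute(
                "UPDATE orders SET total_cents = ? WHERE id = ?",
                (total + 500, order_id),
            )

def totals(conn):
    return [row[0] for row in conn.execute("SELECT total_cents FROM orders ORDER BY id")]

set_based_db = make_db()
add_fee_set_based(set_based_db)

looped_db = make_db()
add_fee_row_by_row(looped_db)

print(totals(set_based_db))  # [2000, 2500, 1400]
print(totals(set_based_db) == totals(looped_db))  # True
```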

Gratzy
A: 

In the case of the date column, I'd save the full date in the DB, and when I return it I'd specify in code how to show it to the user. This way you can ignore the time part or even change the order of the date parts when you show it in a datagrid, for example: mm/dd/yyyy, dd/mm/yyyy or only mm/yyyy.
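
As a rough sketch of formatting in code, the snippet below takes one stored value and renders it in the three layouts mentioned above. The sample date and the format_date helper with its style keys are hypothetical; only datetime.strftime is a real library call.

```python
from datetime import datetime

# Hypothetical value, as it might come back from a full date/time column.
stored = datetime(2010, 3, 5, 14, 30, 0)

def format_date(value: datetime, style: str) -> str:
    """Render the same stored value in whichever layout the UI asks for."""
    patterns = {
        "mdy": "%m/%d/%Y",  # mm/dd/yyyy
        "dmy": "%d/%m/%Y",  # dd/mm/yyyy
        "my": "%m/%Y",      # mm/yyyy, day and time ignored
    }
    return value.strftime(patterns[style])

print(format_date(stored, "mdy"))  # 03/05/2010
print(format_date(stored, "dmy"))  # 05/03/2010
print(format_date(stored, "my"))   # 03/2010
```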

Leniel Macaferi
+3  A: 

Formatting is a UI issue; it is not 'manipulation'.

DaveE
Good point...
Gratzy
'Manipulation' is probably an inappropriate translation of what I wanted to express. My question embraces all that can be done either in SQL or in code. Not sure this is clearer :s
DrDro
DaveE
...and no matter what you choose for a given situation, *somebody* will say you should have done it differently. Some places will have specific guidelines/straitjackets that you have to follow. The best you can do is find people to give you good advice (here is one place), and test different implementations *if* what you have is problematic. Over time, you'll get a feel for what works and what doesn't in your particular situation.
DaveE
"As with all things, it depends" Indeed, that was the point of my question. "somebody will say you should have done it differently" True, but that's probably true for anything. I wanted to know what you take into account to make the decision. From that perspective, +1 for your comment, which was much more instructive than the original answer.
DrDro
+2  A: 

Leniel Macaferi has a good point. If you have to support more than one localization, you don't want to do the presentation work in the database; although that's probably true in general.

Sorry, I don't have enough rep to leave a comment.

uncle brad
A: 

About the only thing that I do in a query that could probably be done in code also is converting the datetimes to the user's time zone.

MySQL's CONVERT_TZ() function is easy to use and accurate. I store all of my datetimes in UTC and retrieve them in the user's time zone. Daylight saving rules change, and this matters especially for client applications, since relying on the native library assumes that the user has kept their OS up to date.

Even for server-side code, like a web server, I only have to update a few tables to get the latest time zone data instead of updating the OS on the server.

Other than those types of issues, it's probably best to distribute most functions to the application server or client rather than making your database the bottleneck. Application servers are easier to scale than database servers.

If you can write a stored procedure or something similar that starts with a large dataset and does some inexpensive calculations or simple iteration to return a single row or value, then it probably makes sense to do it on the server to avoid sending large datasets over the wire. So, if the processing is inexpensive, why not have the database return just what you need?
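
A minimal sketch of that last point, using Python's built-in sqlite3 module in place of a real database server: the readings table and its contents are hypothetical. Both approaches compute the same total, but the first returns one value to the application while the second transfers every row before summing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding 1000 integer readings (1 through 1000).
conn.execute("CREATE TABLE readings (value INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?)", [(i,) for i in range(1, 1001)])

# Aggregate in the database: a single row comes back.
total_in_db = conn.execute("SELECT SUM(value) FROM readings").fetchone()[0]

# Aggregate in code: all 1000 rows are sent to the application first.
all_rows = conn.execute("SELECT value FROM readings").fetchall()
total_in_code = sum(row[0] for row in all_rows)

print(total_in_db)     # 500500
print(len(all_rows))   # 1000
print(total_in_db == total_in_code)  # True
```

With an in-memory database the transfer cost is negligible; the difference only becomes meaningful when the rows have to cross a network.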

Marcus Adams
+2  A: 

My answer is the reverse of everyone else's.

If you are going to have to apply the same formatting logic (the same holds true for calculation logic) in more than one place in your application, or in separate applications, I would encapsulate the formatting in a view inside the database and SELECT from the view. You do not need to hide the original data; that can remain available too. But by putting the logic into the database view, you make it trivially easy to have consistent formatting across modules and applications.

For instance, a Customer table would have an associated view CustomerEx with a MailingAddress derived column that would format the various parts of the address as required, combining city, state, and zip and compressing out blank lines, etc. My application code SELECTs against the CustomerEx view for addresses. If I extend my data model with, say, an Apt# field or to handle international addresses, I only need to change that single view. I do not need to change, or even recompile, my application.
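
Here is one way such a view might look, sketched with Python's built-in sqlite3 module so it can be run as-is. The Customer, CustomerEx and MailingAddress names come from the description above; the individual column names, the sample row and the SQLite dialect (|| for concatenation, char(10) for a line break) are assumptions, and other databases would spell these differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table; Street2 is optional.
    CREATE TABLE Customer (
        Id INTEGER PRIMARY KEY,
        Name TEXT, Street TEXT, Street2 TEXT,
        City TEXT, State TEXT, Zip TEXT
    );

    -- The view exposes the raw columns plus one derived, formatted column.
    CREATE VIEW CustomerEx AS
    SELECT Id, Name, Street, Street2, City, State, Zip,
           Name || char(10)
           || Street || char(10)
           -- Skip the second street line entirely when it is missing or empty.
           || CASE WHEN Street2 IS NULL OR Street2 = '' THEN ''
                   ELSE Street2 || char(10) END
           || City || ', ' || State || ' ' || Zip AS MailingAddress
    FROM Customer;
""")

conn.execute(
    "INSERT INTO Customer (Name, Street, Street2, City, State, Zip) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("Ada Lovelace", "1 Main St", None, "Springfield", "IL", "62701"),
)

# Application code only ever selects the derived column from the view.
address = conn.execute("SELECT MailingAddress FROM CustomerEx").fetchone()[0]
print(address)
```

If the address rules change, only the CREATE VIEW statement needs to be edited; the SELECT in the application stays the same.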

Larry Lustig
A: 

If it is just formatting and will not always need to be the same formatting, I'd do it in the application, which is likely to do this faster.

However, the fastest formatting is the one that is done only once, so if it is a standard format that I always want to use (say, displaying American phone numbers as (###)###-####), then I'll store the data in the database in that format (this may still involve the application code, but on the insert, not the select). This is especially true if you might need to reformat a million records for a report. If you have several formats, you might consider calculated columns (we have one for full name and one for lastname, firstname, and our raw data is firstname, middlename, lastname, suffix) or triggers to persist the data. In general, I say store the data the way you need to see it, as long as you can keep it in the appropriate data type for the real manipulations you need to do, such as date math or regular math for money values.
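
The "format once, on insert" idea could be sketched like this, again with Python's built-in sqlite3 module. The format_us_phone helper, the contacts table and the sample number are hypothetical; the target layout (###)###-#### is the one mentioned above.

```python
import re
import sqlite3

def format_us_phone(raw: str) -> str:
    """Normalize a 10-digit US number to (###)###-#### before it is stored."""
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return f"({digits[:3]}){digits[3:6]}-{digits[6:]}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")

# The formatting cost is paid once, at insert time.
conn.execute(
    "INSERT INTO contacts VALUES (?, ?)",
    ("Sample Contact", format_us_phone("555.123.4567")),
)

# Every later read gets the display-ready value with no further work.
stored_phone = conn.execute("SELECT phone FROM contacts").fetchone()[0]
print(stored_phone)  # (555)123-4567
```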

HLGEM