views: 431

answers: 8

It seems that the phrase "premature optimization" is the buzzword of the day. For some reason, iPhone programmers in particular seem to treat avoiding premature optimization as a proactive goal, rather than the natural result of simply avoiding distraction. The problem is, the term is increasingly being applied to cases where it is completely inappropriate.

For example, I've seen a growing number of people say not to worry about the complexity of an algorithm, because that's premature optimization (e.g. http://stackoverflow.com/questions/2190275/help-sorting-an-nsarray-across-two-properties-with-nssortdescriptor/2191720#2191720). Frankly, I think this is just laziness, and an affront to disciplined computer science.

But it has occurred to me that maybe considering the complexity and performance of algorithms is going the way of assembly loop unrolling, and other optimization techniques that are now considered unnecessary.

What do you think? Are we at the point now where deciding between an O(n^n) and O(n!) complexity algorithm is irrelevant? What about O(n) vs O(n*n)?

What do you consider "premature optimization"? What practical rules do you use to consciously or unconsciously avoid it?

EDIT

I know my description is a bit general, but I'm interested in specific, practical rules or best practices people use to avoid "premature optimization", particularly on the iPhone platform.

Answering this requires you to first answer the question "what is premature optimization?". Since that definition clearly varies so greatly, any meaningful answer requires the author to define the term. That's why I don't really think this is a CW question. Again, if people disagree, I'll change it.

+25  A: 

What is premature optimization?

Premature optimization is the process of optimizing your code (usually for performance) before you know whether or not it is worthwhile to do so. An example of premature optimization is optimizing the code before you have profiled it to find out where the performance bottleneck is. An even more extreme example of premature optimization is optimizing before you have run your program and established that it is running too slowly.

Are we at the point now where deciding between an O(n^n) and O(n!) complexity algorithm is irrelevant? What about O(n) vs O(n*n)?

It depends on the size of n and how often your code will get called.

If n is always less than 5 then the asymptotic performance is irrelevant. In this case the size of the constants will matter more. A simple O(n * n) algorithm could beat a more complicated O(n log n) algorithm for small n. Or the measurable difference could be so small that it doesn't matter.
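
To make the constant-factor point concrete, here is a minimal, purely illustrative Swift sketch (not from the original answer): for a tiny input, a hand-rolled O(n * n) insertion sort produces the same result as the library's O(n log n) sort, and the difference between them is unmeasurable in practice.

    import Foundation

    // Purely illustrative: a hand-rolled O(n * n) insertion sort vs the
    // library's O(n log n) sort. For tiny n the asymptotic difference is
    // irrelevant, and the simpler code's small constants can even win.
    func insertionSort(_ input: [Int]) -> [Int] {
        var a = input
        guard a.count > 1 else { return a }
        for i in 1..<a.count {
            let key = a[i]
            var j = i - 1
            while j >= 0 && a[j] > key {
                a[j + 1] = a[j]
                j -= 1
            }
            a[j + 1] = key
        }
        return a
    }

    let small = [5, 2, 4, 1, 3]      // n is tiny, so asymptotics don't matter here
    print(insertionSort(small))      // [1, 2, 3, 4, 5]
    print(small.sorted())            // same result from the O(n log n) library sort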

I still think there are too many people who spend time optimizing the 90% of code that doesn't matter instead of the 10% that does. No one cares if some code takes 10ms instead of 1ms if that code is hardly ever called. There are times when just doing something simple that works and moving on is a good choice, even though you know that the algorithmic complexity is not optimal.

Every hour you spend optimizing rarely called code is one hour less that you can spend on adding features people actually want.

Mark Byers
+1 for the first point. Plenty of code is automatically *fast enough* by virtue of its use cases.
grossvogel
Unless it is O(Ackermann function) :)
BlueRaja - Danny Pflughoeft
@grossvogel +1 for "fast enough"
drachenstern
+1 for reinstating common sense
Longpoke
Yeah, I debated over framing the question in terms of Big O, because it's obviously very context specific. Good advice. +1
DougW
Accepting this for now because it's good advice, and seems to come closest to answering the question. Still, it's a bit more abstract than I was hoping for. The answer might be that nobody has done an authoritative study on pragmatic approaches to the problem. If anyone knows of one, please post it.
DougW
A: 

I think this is a question of common sense. There's a need to understand the big picture, or even what's happening under the hood, to be able to consider when a so-called "premature" move is justified.

I've worked with solutions where web service calls were necessary to calculate new values based on the contents of a local list. The way this was implemented was by making a web request per value. It wouldn't be premature optimization to send several values at a time. The same thing goes for the use of database transactions for multiple operations, i.e. multiple inserts.
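
As a rough Swift sketch of that batching idea (the endpoint and payload shape are hypothetical, not from the original system), a single request carrying the whole list replaces one round trip per value:

    import Foundation

    // Hypothetical endpoint and payload shape: the point is that one request
    // carrying the whole batch pays the network latency once, instead of once
    // per value as in the original per-value implementation.
    func submitValues(_ values: [Double], completion: @escaping (Data?) -> Void) {
        var request = URLRequest(url: URL(string: "https://example.com/api/compute")!)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        request.httpBody = try? JSONSerialization.data(withJSONObject: ["values": values])

        URLSession.shared.dataTask(with: request) { data, _, _ in
            completion(data)         // one round trip for the entire list
        }.resume()
    }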

When it comes to algorithms, initially the most important thing is to get it right and as simple as possible. Worrying about stack vs heap issues at that point would be madness.

k_b
@k_b: I disagree: it _would_ still be premature. Maybe those "obvious" web service calls are not the bottleneck in the application. You could be wasting time solving the wrong problem.
John Saunders
My point is: why implement an inefficient module like that in the first place? It wouldn't take more time to implement sending multiple values per call.
k_b
@John Saunders: Agree if we are dealing with a LAN, don't agree if we are dealing with WAN. The network latency on each web service call is going to be noticeable immediately when services are called over a WAN.
Marjan Venema
@Marjan: and maybe the time taken by database queries is 10 times the network latency. The only thing that makes sense is to profile and find out what the next, worst problem is, then go optimize it. Otherwise you risk fixing the wrong problem, and not fixing the right problem.
John Saunders
The problem is, people's "common sense" on this subject varies enormously. In a team environment, extreme variations in coding practice are dangerous. So how do we quantify it in real terms? That's the question.
DougW
@DougW ~ Now you've come across something we can quantify. You keep variations in coding practice within agreed limits by using tools like ReSharper and enforcing their use across the team, along with code reviews. Note I'm not telling you that you have to use ReSharper, just suggesting a tool I've used that can be consistently updated via source control.
drachenstern
+3  A: 

Algorithm complexity, and even choice, is an issue that should be hidden behind an interface. For example, a List is an abstraction that can be implemented in various ways, with different efficiencies for different situations.

Sometimes avoiding premature optimization can aid design, because if you design with the idea that you will need to optimize later, then you are more inclined to develop at the abstract level (e.g. list) rather than the implementation level (e.g. array or linked list).

This can result in simpler, more readable code, in addition to avoiding distraction. If you program to the interface, different implementations can be swapped in later to optimize. Premature optimization carries the risk that implementation details are exposed too early and coupled with other software components that should not see those details.
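
A minimal Swift sketch of programming to an interface (the names are illustrative, not from the original answer): callers depend only on a protocol, so the backing data structure can be swapped later without touching client code.

    // Callers depend only on the protocol, so the backing data structure can be
    // swapped later (after profiling) without touching client code.
    protocol ItemStore {
        func add(_ item: String)
        func contains(_ item: String) -> Bool
    }

    // Simple array-backed store: contains(_:) is O(n), perfectly fine to start with.
    final class ArrayStore: ItemStore {
        private var items: [String] = []
        func add(_ item: String) { items.append(item) }
        func contains(_ item: String) -> Bool { items.contains(item) }
    }

    // If profiling later shows lookups dominate, swap in an O(1) set-backed store.
    final class SetStore: ItemStore {
        private var items: Set<String> = []
        func add(_ item: String) { items.insert(item) }
        func contains(_ item: String) -> Bool { items.contains(item) }
    }

    func process(using store: ItemStore) {
        store.add("widget")
        _ = store.contains("widget") // the caller never sees which implementation ran
    }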

Larry Watanabe
If you are writing a data container like a List, and you have complexities other than the well-known ones, you had better specify it in big bold letters, or some people may run into huge downtimes and server faults.
Thomas Ahle
the keyword is "premature" ... optimization is fine, at the right time, which would be before production.
Larry Watanabe
You need to be careful with over-abstracting though. Doing something like hiding a container class behind a plain iterator pair may slow the user code down by an order of magnitude if it requires random access. It's important to understand how code will be used rather than pursuing abstraction as some kind of dogma.
Alan
+9  A: 

My vote goes to this: most people optimize what they think is the weak point, but they don't profile.

So regardless of how well you know algorithms and regardless of how well you've written your code, you don't know what else is happening outside your module. What do the APIs you've called do behind the scenes? Can you always guarantee that a particular order of operations is the fastest?

This is what is meant by premature optimization. Any change you think of as an optimization that has not been rigorously verified with a profiler or other definitive tool is a premature optimization. (Counting clock cycles per operation is not a bad thing, but it only tells you performance characteristics; the number of actual calls usually matters more than the timing.)

@k_b says it well above me, and it's what I say too. Make it right, make it simple, then profile, then tweak. Repeat as necessary.
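
A rough Swift sketch of the measurement step (the helper and workload are illustrative, not from the original answer): time the suspect code, or better, run it under Instruments, before deciding it needs tweaking.

    import Foundation

    // Crude wall-clock timing; Instruments gives far better data, but even this
    // keeps the "tweak" step honest by measuring instead of guessing.
    func measure<T>(_ label: String, _ block: () -> T) -> T {
        let start = CFAbsoluteTimeGetCurrent()
        let result = block()
        let elapsed = CFAbsoluteTimeGetCurrent() - start
        print("\(label): \(elapsed * 1000) ms")
        return result
    }

    _ = measure("sort 10k random ints") {
        (0..<10_000).map { _ in Int.random(in: 0..<1_000_000) }.sorted()
    }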

drachenstern
I agree to some extent, but there's clearly a limit. I may choose to store an array in memory instead of having my grandmother mail it to me whenever I need it, but I don't think that's premature. So that's the question: where do you draw the line? And how do you draw the line in regular, repeatable ways that you can explain to other team members?
DougW
@DougW - I draw the line in my last post: "Make it right, make it simple, then profile, then tweak". I think some things are obvious (I'll need this on every run, so it makes more sense to index it by keeping it in memory rather than reading from disk every time, and it makes for easier business logic ~ as compared to ~ I need this every time, how do I force it to stay in L2 cache) whereas some things are not (quicksort vs heapsort vs hashsort).
drachenstern
@DougW ~ add a person's name as I've done yours to make sure they see the comment right away. I had an orange envelope with no notification, so had to search for who left me a message.
drachenstern
@drachenstern Thanks for the tip, didn't realize it actually grepped that. I totally hear what you're saying, but I'm trying to dig a little deeper. My recent experience has been that what is "obvious" to (especially iPhone) programmers is becoming more and more diverse. Certainly we can agree on many cases, but I'm interested in finding concrete techniques that can be applied to novel cases as they arise.
DougW
+5  A: 

For some people, optimization is part of the fun of writing code, premature or not. I like to optimize, and I restrain myself for the sake of legibility. The advice not to optimize so much is for the people who like to optimize.

iPhone programmers in particular seem to treat avoiding premature optimization as a proactive goal

The majority of iPhone code is UI related. There is not much need to optimize. There is a need not to choose a poor design that will result in bad performance, but once you start coding up a good design there is little need for optimization. So in that context, avoiding optimization is a reasonable goal.

drawnonward
I think the point I get from this is "most iPhone code is hobby code, written by hobbyists". That's not necessarily a bad thing I suppose, but it raises a lot of interesting questions. +1
DougW
My intent was more to say that it is portal code rather than hobby code. A lot of apps pass the heavy lifting off to a server, or there is no heavy lifting. There are some impressive games that must be highly optimized, but things like Facebook and Twitter are just as popular and leave little room for optimization. But there is a lot of hobby code out there too.
drawnonward
I find the phrase "hobby code, written by hobbyists" to be vaguely insulting. Excluding games, I doubt that I use any iPhone apps that require more than a few select optimizations. Most apps that use the frameworks as they were designed and push network calls into a background thread will perform acceptably well with a naive implementation.
kubi
I believe the idea that "iPhone programmers ... treat avoiding premature optimization as a proactive goal" comes from SO users who are new to iPhone development and think that they can only attain acceptable performance by rewriting Core Data or fast iteration. 90%+ of the time the default Apple methods are plenty fast for what those users need, hence the call to avoid premature optimization becoming a default answer.
kubi
+1  A: 

What practical rules do you use to consciously or unconsciously avoid it?

One way to avoid unnecessary optimization is to consider the relative cost benefit:

A) Cost for programmer to optimize code + cost to test said optimization + cost of maintaining more complex code resulting from said optimization

vs.

B) Cost to upgrade server on which software runs or simply buy another one (if scalable)

If A >> B, consider whether it's the right thing to do. [Ignoring for the moment the environmental cost of B, which may or may not be a concern to your organization.]
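
As a purely illustrative worked example (the numbers are made up): if the optimization costs roughly a week of developer time, say $4,000, plus extra testing and ongoing maintenance, while a server upgrade that buys the same headroom costs about $1,000, then A >> B and the hardware is probably the better purchase. On a fixed platform like the iPhone, where you can't buy your users faster hardware, B is effectively unavailable and the calculation tilts back toward spending the developer time.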

This applies more generally than just to premature optimization, but it can help instill in your developers a sense that optimization work is a cost and should not be undertaken unless it makes a real, measurable difference in something that actually matters, such as the number of servers required or customer satisfaction through improved response times.

If management can't see the benefit in reduced $ and customers can't see the benefit in better response times, ask yourself why you are doing it.

Hightechrider
There's a quote I always remember: "It's easier to optimise correct code than it is to correct optimised code"
Mark Simpson
I completely agree, but what I'm looking for is practical examples of how do you go about quantifying the values in your equation. And how do you make sure, practically, that all the team members have similar estimates? +1
DougW
@DougW I don't think there is an easy way to come up with those numbers, since it takes an individual's experience with estimating to come up with them. You really need programmers with years of experience estimating for themselves, and it helps if they've done similar optimization on a regular basis. You can almost always (note that I'm using always, obvious room for contradiction) purchase faster hardware. ~ but I've been wrong before, I'll be wrong again ;)
drachenstern
@drachenstern Agreed, it's not a problem with a single solution per se, but that doesn't mean people haven't figured out good heuristics for it. And in most cases you're right, but with embedded devices like the iPhone you often can't buy faster hardware.
DougW
@DougW ~ Too true re: iPhone, but most developers also haven't had 5+ years of development to really know their ability when it comes to the iPhone. I think the heuristic becomes even more important when it's iPhones...
drachenstern
+2  A: 

What do you consider "premature optimization"? What practical rules do you use to consciously or unconsciously avoid it?

Using the Agile approach (rapid iterations, with requirements refined through interactions with users) is helpful: the awareness that the current interface will probably change drastically after the next session with the users makes it easier to focus on developing the essential features of the application rather than on performance.

If not, a few iterations in which you spend a lot of time optimizing a feature that is then entirely discarded after the session with the users should get the message across.

Larry Watanabe
I think what you're saying is that before you try to "make the common case fast", you need to make sure you know what the common case is? Yeah I think that qualifies as a practical rule. +1
DougW
+5  A: 
egrunin
A bit more abstract than I'm looking for, but definitely a valuable rule. I don't think I've ever heard it put in quite those words. +1
DougW