views:

104

answers:

5

All,

Given a code that you are not at all knowledgeable about in terms of the functionality and implementation, how would you go about finding the performance bottlenecks in that code? Please list any specific tools / standard approaches that you might be using.

+2  A: 

You should use a profiling tool; which one depends on the platform:

  • .NET: Visual Studio performance tools, JetBrains dotTrace
  • Java: JProfiler

The above tools work very well for applications, but the features vary. For example, Visual Studio can summarize performance data based on tiers.

How to approach the problem is highly dependent on the type of the program, and the performance problem you're facing. But basically, you'll repeat the following cycle:

  • Record performance data (maybe change the settings for higher / lower granularity on recorded data)
  • Identify hot spots, where most of the application time is consumed
  • Maybe use reverse call tables to identify how the hot spot is invoked, and from where in the code
  • Try to refactor / optimize the hot spot
  • Start over, and check how effective your optimization was.

It might take several iterations of the above cycle to get you to a point that you have acceptable performance.

Note that these tools provide many different features and ways to look at performance data, or record them. Provided that you don't have any knowledge of the internal structure of the application, you should start playing with different features and reports that the tools provide, so that you can pinpoint where to optimize.
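When a full profiler is more than you need for a quick iteration of the cycle above, a crude timing harness can approximate the "record performance data" step. This is only a sketch with made-up names, not part of any profiler's API:

```java
// Crude timing harness for one iteration of the record -> optimize cycle.
// Runs a candidate hot spot repeatedly so per-call noise averages out.
public class TimingHarness {
    static long timeNanos(Runnable task, int iterations) {
        task.run(); // warm-up run so JIT compilation doesn't skew the samples
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / iterations; // average nanos per call
    }

    public static void main(String[] args) {
        long avg = timeNanos(() -> {
            double x = 0;
            for (int i = 1; i < 10_000; i++) x += Math.sqrt(i);
        }, 100);
        System.out.println("avg nanos per call: " + avg);
    }
}
```

Measure, optimize the task, measure again with the same iteration count, and compare the two averages.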

Iravanchi
You probably meant to type JetBrains dotTrace; ReSharper is not their profiling tool.
Cumbayah
Right, sorry. I was thinking dotTrace, don't know why I typed that! It's probably because I love ReSharper so much!
Iravanchi
For compiled code use Zoom or oprofile on Linux, Shark on Mac OS X, VTune on Windows.
Paul R
A: 

Without knowing the kind of system you are working with, here are a few pieces of general advice:

  • Try to build up knowledge on how the system scales: how are 10 times more users handled, how does it cope with 100 times more data, or with a 100 times slower network environment...

  • Find the proper 'probing' points in the system: a distributed system is, of course, harder to analyze than a desktop app.

  • Find proper technology to analyze the data received from the probes. Profilers do a great job visualizing bottleneck functions, but I can imagine they are of little help for a cloud service. Try to visualize your data graphically; your brain is much better at recognizing graphical patterns than numerical ones, let alone textual ones.

  • oh - find out what the expectations are! It's no use optimizing the boot time of your app if it's only booted three times a year.

xtofl
The programming language would be Java. I want to get a general sense of how bottlenecks are identified, given any piece of code. I am not tied to any specific system as such.
darkie15
A: 

I'd say the steps would be:

  1. Identify the actual functionality that is slow, based on use of the system or interviewing users. That should narrow down the problem areas (and if nobody is complaining, maybe there's no problem.)
  2. Run a code profiler (such as dotTrace / Compuware) and data layer profiler (e.g. SQL Profiler, NHibernate Profiler, depending on what you're using.) You should get some good results after a day or so of real use.
  3. If you can't get a good idea of the problems from this, add some extra stopwatch code to the next live build that logs the number of milliseconds in each operation.

That should give you a pretty good picture of the multiple database queries that should be combined into one, or code that can be moved out of an inner loop or pre-calculated, etc.
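The stopwatch code in step 3 can be as simple as a small helper that logs elapsed milliseconds around each operation. A minimal sketch (the `Stopwatch` name and label scheme are illustrative, not from any library):

```java
import java.util.function.Supplier;

// Minimal stopwatch helper: wrap each operation and log how long it took.
public class Stopwatch {
    static <T> T timed(String label, Supplier<T> op) {
        long start = System.currentTimeMillis();
        try {
            return op.get();
        } finally {
            // Logged even if the operation throws, so slow failures show up too.
            System.out.println(label + " took "
                    + (System.currentTimeMillis() - start) + " ms");
        }
    }

    public static void main(String[] args) {
        int result = timed("sum", () -> {
            int s = 0;
            for (int i = 0; i < 1_000_000; i++) s += i;
            return s;
        });
        System.out.println(result);
    }
}
```

Sprinkle `timed("loadOrders", ...)`-style wrappers around the suspect operations in the live build, then grep the logs for the largest numbers.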

realworldcoder
+1  A: 

Use differential analysis. Pick one part of the program and artificially slow it down (add a bunch of code that does nothing but waste time). Re-run your test and observe the results. Do this for a variety of aspects of your program. If adding the delays does not alter performance, then that aspect is not your bottleneck. The aspect that results in the largest performance hit might be the first place to look for bottlenecks.

This works even better if the severity of the delay code is adjustable while the program is running. You can increase and decrease the artificial delay and see how that affects the performance. If you encounter a test where the change in observed performance seems to follow the artificial delay linearly, then that aspect of the program might be your bottleneck.

This is just a poor man's way of doing it. The best method is probably to use a profiler. If you specify your language and platform, someone could probably recommend a good profiler.
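In Java, the artificial-delay idea above might look like the following sketch. The `probe.delay` system property is a hypothetical knob, shown just to make the delay adjustable per run:

```java
// Differential analysis: wrap a suspect section with an adjustable delay.
// If total runtime grows roughly linearly with the delay, the wrapped
// section is on the hot path; if runtime barely changes, it is not.
public class DelayProbe {
    // Delay in ms, adjustable at launch: java -Dprobe.delay=5 DelayProbe
    static final long DELAY_MS = Long.getLong("probe.delay", 0L);

    static void probe() {
        if (DELAY_MS > 0) {
            try {
                Thread.sleep(DELAY_MS); // deliberately waste time
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            probe();      // suspected hot section
            Math.sqrt(i); // the rest of the work
        }
        System.out.println("elapsed ms: " + (System.nanoTime() - start) / 1_000_000);
    }
}
```

Run the program several times with increasing `-Dprobe.delay` values and compare the elapsed times.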

bta
Thanks bta. I would be more interested in evaluating performance for Java.
darkie15
My aim with this question was that, in addition to the profiling tools, is there any 'unconventional' pattern followed to identify bottlenecks
darkie15
+2  A: 

I assume you have the source code, and that you can run it under a debugger, and that there is a "pause" button (or Ctrl-C, or Esc) with which you can simply stop it in its tracks.

I do that several times while it's making me wait, say 10 or 20 times, and each time I study the call stack, and maybe some other state information, so I can give a verbal explanation of what it is doing and why.

That's the important thing - to know why it's doing what it's doing.

Typically what I see is that on, say, 20%, or 50%, or 90% of samples, it is doing something, and often that thing could be done more efficiently or not at all. So fixing that thing reduces execution time by (roughly) that percent. The bigger a problem is, the quicker you see it. In the limit, you can diagnose an infinite loop in 1 sample.

This gets a lot of flak from profiler-aficionados, but people who try it know it works very well. It's based on different assumptions. If you're looking for the elephant in the room, you don't need to measure him. Here's a more detailed explanation, and a list of common myths.

The next best thing would be a wall-time stack sampler that reports percent at the line or instruction level, such as Zoom or LTProf, but they still leave you puzzling out the why.
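For Java specifically, a rough equivalent of the manual pause-and-inspect technique is to run `jstack <pid>` a few times, or to dump all thread stacks from inside the process. A minimal sketch of the in-process version:

```java
import java.util.Map;

// Poor man's wall-clock stack sampler: dump the stacks of all threads a few
// times and eyeball which frames keep recurring across samples.
public class StackSampler {
    public static void sample(int samples, long intervalMs) throws InterruptedException {
        for (int i = 0; i < samples; i++) {
            Thread.sleep(intervalMs);
            System.out.println("--- sample " + i + " ---");
            for (Map.Entry<Thread, StackTraceElement[]> e
                    : Thread.getAllStackTraces().entrySet()) {
                System.out.println(e.getKey().getName());
                for (StackTraceElement frame : e.getValue()) {
                    System.out.println("    at " + frame);
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                Math.atan2(Math.random(), Math.random()); // busy work to sample
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();
        sample(3, 100); // frames that appear in most samples are the hot spots
    }
}
```

This samples on wall-clock time, like the debugger-pause method, so threads blocked on I/O or locks show up too.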

Good luck.

Mike Dunlavey
+1 - I find that this is almost always the best way to do initial performance tuning.
Rex Kerr
@Rex: Tx. If there's an async protocol involved, or message-queueing, I have to switch techniques at some point. Otherwise, this method carries me through to the bitter end, because of the "magnification effect": if there's a series of "bottlenecks" costing, say, 50%, 25%, and 12.5%, removing the largest doubles the cost, and the obviousness, of the remaining ones.
Mike Dunlavey