I work on the maintenance team of a big project (7k+ classes), and my daily work is mainly fixing bugs. Sometimes, though, I have no bugs to work on. When this happens, I spend most of my time looking for performance gaps in the code. With more than 7 thousand classes to look through, these gaps are not easy to find.

So I'd like to know: what simple things should I look for when trying to improve the performance of the system?

I'm not asking about specific code techniques, but general ones. For example:

  • I already looked for all occurrences of code like String a = new String("") and changed them to StringBuilder a = new StringBuilder();
  • I already changed all database access, where applicable, to use PreparedStatement
  • All Debug logging was removed, and the Finest logging was removed where possible

As you can see, those changes were easy to make because they do not require measuring the system's performance -- all I needed to do was use the search tool inside Eclipse.

+50  A: 

Laudable goal, but you need to focus on the actual demonstrable performance problems -- not where you 'think' the performance problems are.

Spend time in a profiler to find the real issues...then go from there. Otherwise you're just churning code without any way of knowing whether you're making a measurable impact.

Even if you had a list of "things to change without measuring system performance" would you really trust them to be right for your circumstance?

In your situation I would suggest you spend time building test harnesses/performance instrumentation so you can see where to get the most bang for your buck.

EDIT: To address the downvote(s) and sentiment about "I know using a PreparedStatement is faster" -- rather than asking for silver bullets, a better question to ask when faced with this issue is "how should I most productively spend my free time to make things better?" The OP clearly wants to improve the situation -- which is great...but without measuring "where it hurts" he's literally shooting in the dark. Is a PreparedStatement faster? Sure -- but if the real performance gremlin is in some other spot, why spend time 'fixing' the DB code when you could make a REAL impact by going after the actual points-of-pain?

One other thing: in a stable system such as the one the OP is describing, making code changes without good quantifiable justification is often considered bad practice, due to the risks introduced. In such stable systems, the question of risk/reward must be considered for ANY code change. The risk is significant: many "simple, couldn't break anything" changes have slipped release schedules or introduced major problems. The reward? Uncertain, since you don't actually know if your change was responsible for a performance gain. Hence, we profile to make sure we're improving code which matters.
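As a rough illustration of the "test harness" idea above, here is a minimal timing sketch. The class name `TimingHarness` and its method are hypothetical, it assumes Java 8 or later, and it is no substitute for a real profiler: the JIT may optimize away work whose result is unused, so treat the numbers as indicative only.

```java
// Hypothetical helper (not from any library): times a workload after a warm-up phase.
final class TimingHarness {

    // Runs the task warmupRuns times untimed, then measuredRuns times timed,
    // and returns the average elapsed nanoseconds per measured run.
    static long averageNanos(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // gives class loading and JIT compilation a chance to settle
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / measuredRuns;
    }
}
```

Calling `TimingHarness.averageNanos(() -> someSuspectMethod(), 1_000, 1_000)` before and after a change (where `someSuspectMethod` stands for whatever code is under suspicion) gives at least a crude before/after comparison instead of a guess.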

DarkSquid
Beat me to the punch. Excellent answer.
McWafflestix
Yes! There's no point in making a piece of code faster if you spend only 0.0000000001% of your time in it.
Laurence Gonsalves
Hmmm. I do not really need to measure the impact of connecting to the database using PreparedStatement. I just know it's faster.
Paulo Guedes
A profiler will show you where the program is spending its time doing work. That will let you know where you should try to make improvements. That's what DarkSquid is proposing you do.
Eric
@Paulo Guedes: even prepared statements can reduce performance in extreme cases; they add extra load to the server, after all. For example, a seldom-used statement may be kept in cache to the detriment of other, more critical queries.
Javier
Exactly--Stay away from optimization unless you have specific metrics!
Bill K
You said what I was trying to say in a much better way. +1
Randolpho
+1  A: 

In general:

  1. Look at complicated sections of code and see if they can be cleaned up

  2. Pay particular attention to loops and see if any of them can be improved for performance

  3. Recursive calls are generally big hits in performance as well. Make sure recursion is being handled properly and make sure any bits of recursion are justified.

AlbertoPL
A: 

I'd say to look for the areas where the performance of the system is slow. I'm inherently a bit suspicious of attempts to naively improve performance for no sake other than for improving performance. I'd say you'd be far better off looking for appropriate locations in the code to refactor (which it sounds like some of yours already were; I'm not trying to be pedantic, though; refactoring is a very good thing, but it is NOT and should not be confused with performance improvements; refactoring can provide the opportunity for introducing performance improvement, but they're different things).

"Premature optimization is the root of all evil." - Donald Knuth

McWafflestix
Rather than saying refactoring can result in performance improvement, I'd say that it can provide the opportunity for introducing performance improvements. It's a bit of a gray area, but since refactoring's definition includes "without changing the behavior", I like to think of performance enhancements as different from refactoring. Helped, sometimes enabled, by refactoring, for sure - but not refactoring.
Carl Manaster
@CarlManaster: good point, I completely agree. I hope you don't mind, but I'm going to edit to reflect your suggested better wording.
McWafflestix
No objection at all; thank you.
Carl Manaster
+3  A: 

I suggest using a code-quality tool. My shop uses Sonar. It helps you:

  • find duplicate code
  • find complex code zones
  • find code rule violations and potential bugs
  • see code coverage
  • see undocumented code

http://sonar.codehaus.org/

Mercer Traieste
+5  A: 

In general, just looking at the code base and attempting to improve performance by looking for certain things does not guarantee that you'll get measurable performance gains.

Usually, it's more beneficial to use a profiler to find out which sections of code are run the most or, conversely, which take the most time to run, and then look at ways to optimize those areas. This will yield the greatest benefit.

Peter
+1  A: 

Rather than focussing on "slow" parts of the application, you should try to focus on "bad" parts of the application.

There is an automated tool that helps you find where your code behaves like CRAP.

CRAP really is the acronym for the tool! It does some useful things, like checking cyclomatic complexity, and it really does give you a 10-foot look at the code.

Also, a good Java profiler could help you find the bottlenecks, if there really are any.

Eric
+1  A: 

First, only attempt to provide performance enhancements in locations that you know are performance hogs. This can only really be determined using a profiler. There are a few nice tools that may help in that regard.

Second, just replacing Strings with StringBuilders does not a performance enhancement make. In fact, in many cases, that could cause a slowdown. You should only use stringbuilders when you are building a large string as part of a loop -- and even then only as part of a larger running loop. In all other cases, simple concatenation is usually faster.
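To make the distinction concrete, here is a small sketch with a hypothetical class and method names: a one-off concatenation left as plain `+`, next to a loop where a single StringBuilder is the better fit. Whether the loop version matters in practice still depends on how hot the code is.

```java
// Hypothetical example class: contrasts one-off concatenation with building in a loop.
final class StringBuilding {

    // One-off concatenation: the compiler already handles this efficiently,
    // so an explicit StringBuilder would add noise without a benefit.
    static String greeting(String name) {
        return "Hello, " + name + "!";
    }

    // Repeated appends in a loop: one StringBuilder avoids creating
    // a new intermediate String on every iteration.
    static String joinWithCommas(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```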

Randolpho
+1  A: 

I would change

String a = new String("");

to

String a = "";

Find all the places where you recreate an object and figure out whether there is a way to just return the same instance instead.

Read the relevant chapters of Effective Java, 2nd edition.
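A short sketch of what "return the same instance" means in practice. The class and method names are hypothetical; the behaviour shown (string literals are shared, `new String` always allocates, `Integer.valueOf` caches small values) is standard Java.

```java
// Hypothetical example class: fresh allocation versus reuse of an existing instance.
final class ObjectReuse {

    // Allocates a new String object on every call.
    static String freshEmpty() {
        return new String("");
    }

    // Returns the shared literal; no allocation per call.
    static String sharedEmpty() {
        return "";
    }

    // Integer.valueOf reuses cached instances for values in -128..127,
    // whereas a constructor call would allocate each time.
    static Integer boxed(int value) {
        return Integer.valueOf(value);
    }
}
```

Both string methods return equal values; only the first one creates garbage. As the comments under this answer point out, whether that garbage matters is something a profiler should confirm.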

Mykola Golubyev
Microoptimization. Much better to have a profiler pinpoint the hotspots first.
Thorbjørn Ravn Andersen
It is not. It is a simple rule and has nothing to do with micro-optimization: avoid creating objects inside loops, and so on.
Mykola Golubyev
+1  A: 

The Java compilers are also pretty good at sniffing for performance improvements, probably better than any single human. So while there are some obvious places you can improve, there is also a good possibility of making things harder for the compiler to optimize. It's much better to profile and identify the bottlenecks after compilation and focus on those. And then, the solution is probably going to be algorithmic rather than simply changing a class name. So in summary, look for a good profiler. :)

James M.
+4  A: 

If there are performance problems, then @DarkSquid's and @AlbertoPL's advice is right on. If not, though, perhaps your time would be better spent preparing the code for future modifications: analyzing test coverage (particularly unit-test coverage), assessing cyclomatic complexity, or simply looking at the classes with the most reported bugs (or the biggest classes, or some other simple metric). Proactive analyses like these can make maintenance easier when that time comes.

Carl Manaster
+9  A: 

Instead of focusing on performance, you may want to use some of the following static analysis tools to identify any bugs/potential bugs in code, and fix those. These will sometimes help identify performance issues:

Both of these include Eclipse plug-ins.

Jack Leow
+3  A: 

You can use static analysis tools such as FindBugs. The javac compiler already tries to optimize some things. String concatenation, for example, is already optimized by the compiler and converted to a StringBuilder.

When in doubt, nothing beats the profiler but beware of premature optimization.

+7  A: 

Don't blindly change things just because they "look" feasible.

Think about this:

 logger.debug("Initializing object state " + object.initialize() );

If you simply and blindly remove that statement the object won't get initialized.

Of course such a statement is wrong in the first place, but believe me, they exist! And they will make your life miserable if something like that happens to you.
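The trap can be sketched as follows. This is an assumption-laden illustration: it uses java.util.logging (`Logger.fine`) rather than the `logger.debug` call above so that it is self-contained, and `Widget` and the three factory methods are hypothetical names.

```java
import java.util.logging.Logger;

// Hypothetical example: a log statement that hides a required side effect.
final class LoggingSideEffect {
    private static final Logger LOG = Logger.getLogger("example");

    static final class Widget {
        private boolean initialized;

        String initialize() {
            initialized = true;
            return "ready";
        }

        boolean isInitialized() {
            return initialized;
        }
    }

    // Risky: initialization happens only as a side effect of building the log message.
    static Widget createRisky() {
        Widget w = new Widget();
        LOG.fine("Initializing object state " + w.initialize());
        return w;
    }

    // What createRisky() turns into once the log line is deleted by a blind search.
    static Widget createAfterBlindRemoval() {
        return new Widget();
    }

    // Safer: do the work first, then log the result; the log line can go at any time.
    static Widget createSafe() {
        Widget w = new Widget();
        String state = w.initialize();
        LOG.fine("Initializing object state " + state);
        return w;
    }
}
```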

It is better to use a profiler to find out which objects, methods, or calls consume the most time or memory, and to identify the bottlenecks from there.

With more than 7k classes, it is highly probable that you are fixing a bunch of code that is never used at all.

PROFILE!!!!

OscarRyz
+2  A: 

In performance, make sure you have a real problem first, and if you do, then use a profiler like TPTP or JMeter [edit: HPJMeter once was a general Java performance tool, but now it's HP/UX-specific]. Intuition is a very poor guide.

Be sure to profile a realistic test scenario. Then focus your attention on the methods that show up at the top of the statistics. Lather, rinse, and repeat. Also decide up front when to stop: when performance is satisfactory you don't want to waste time on micro-optimizations that make your code more obscure.

Look for algorithmic optimizations too, not just low-level Java coding tweaks. They can have huge impacts.

Be sure to read Java Performance Tuning for strategies and ideas.

Be aware that as your application warms up it will run faster (eg, it no longer has to do class loading, one time initialization, JIT compilation).

I once spent several months tripling the performance of a Java-based VoiceXML browser to lower hardware costs at sites using it. I was surprised time and again by where the hot spots were. So as @DarkSquid recommends, don't guess, measure.

Jim Ferrans
+1  A: 

Perhaps a good place to start is to try to see where your application might break or violate any SLAs. If there aren't any concrete complaints about performance, try to up the performance requirements and see which portions of code cause issues.

If you have time sensitive functionality, try to test that under greater system load or with stricter limits. If you have large space requirements, up your data size or limit your heap space. If you run into issues in these scenarios fix those hot spots.

While this may not have an impact on your usual daily performance, it will ensure your system stays available when system load or input peaks.

omerkudat
A: 

Although it's generally a good idea to only spend time optimising code that is actually a problem, there are a few things that you can do as a matter of course.

The canonical example is in C. Consider the loop

for (int ct=0; ct<strlen(str); ++ct) { ... }

Here strlen actually scans the char array looking for the NUL terminator character. This is an O(n) operation, making the whole loop O(n^2). Therefore, in C, take the strlen outside with const int len = strlen(str);. You could do the same thing in Java (I do), but it has an insignificant performance impact. Code according to taste.
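For what the Java version of that habit looks like, here is a minimal sketch (hypothetical class and method). Unlike C's strlen, `String.length()` is a constant-time call, which is why the hoisting below is a matter of taste rather than a real optimisation.

```java
// Hypothetical example: reading a loop bound once before the loop.
final class LoopHoisting {

    // Counts vowels in s; the length is read into a local before the loop.
    static int countVowels(String s) {
        final int len = s.length(); // constant time in Java, so this is stylistic
        int count = 0;
        for (int i = 0; i < len; i++) {
            if ("aeiouAEIOU".indexOf(s.charAt(i)) >= 0) {
                count++;
            }
        }
        return count;
    }
}
```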

Getting back to things that really apply to Java:

  • Write good code. Seriously, botched-together code tends to hide performance problems. If you optimise first, then you aren't going to be able to see the useful optimisations as easily.
  • Know and use the libraries. They are probably fast, and are more likely to be warmed up.
  • Take care selecting data structures (this isn't actually helpful advice, but it needs to be said).
  • Consider the amount of memory that your datastructure is using. This often gets hidden with microbenchmarks and affects system performance non-locally.
  • Use ArrayList instead of LinkedList. You can find details here and elsewhere, but LinkedList tends to be slow even when you think it'll be fast.
  • Consider in which part of the application performance is critical. For instance, responding to mouse moves will probably always be fine, but opening a dialog may have a noticeable delay (less important on shared servers).
  • Remember that the CPU is faster than cache, which is faster than main memory, which is faster than local network/disk, which is faster than the internet, which is faster than web services.
  • Consider the sad case as well as the happy case.
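For anyone who wants to try the ArrayList versus LinkedList question on their own data, here is a sketch of a sorted insert written against the `List` interface, so the same code runs on both. `SortedInsert` and its methods are hypothetical names. The trade-off it exposes: ArrayList has to shift elements on each insert, while LinkedList has to walk node by node to the insert position. Which one wins depends on list size and access pattern, so time it rather than assume.

```java
import java.util.List;
import java.util.ListIterator;

// Hypothetical example: one sorted-insert routine usable with any List implementation.
final class SortedInsert {

    // Inserts value into an already-sorted list, keeping it sorted.
    static void insertSorted(List<Integer> sorted, int value) {
        ListIterator<Integer> it = sorted.listIterator();
        while (it.hasNext()) {
            if (it.next() > value) {
                it.previous(); // step back so add() lands before the larger element
                break;
            }
        }
        it.add(value);
    }

    // Inserts every value in turn and returns the same list for convenience.
    static List<Integer> build(List<Integer> target, int[] values) {
        for (int v : values) {
            insertSorted(target, v);
        }
        return target;
    }
}
```

Passing `new ArrayList<>()` and `new LinkedList<>()` to `build` with the same input, and timing both, shows which structure suits a particular workload.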
Tom Hawtin - tackline
I'd say know when to use ArrayList vs LinkedList. Try doing a simple insertion sort into each to see what I mean! I once changed some code that was inserting TOO MANY IP addresses into a listbox backed by an ArrayList. When I changed it to a LinkedList, the time went from 6+ hours to a few minutes. (We later fixed the underlying problem of the user using a class B mask in a more general way!)
Bill K
Why would you do an insertion sort?
Tom Hawtin - tackline
+1  A: 

As some of the guys said before, you can use FindBugs to eliminate the most obvious performance related "bugs". You can quickly identify a lot of troublesome pieces of code.

You can look at a list on the FindBugs site http://findbugs.sourceforge.net/bugDescriptions.html.

Bogdan
+1  A: 

I agree with the others: only optimize code SHOWN to be slow by a profiler (with Java 6u10 and later, jvisualvm is very easy to get started with).

If you need other things to do, here are some things I'm pretty certain have not been done:

  • Good javadoc for all classes. This helps future maintainers to understand the code faster. Simple refactoring is allowed in this process.
  • An official place for javadoc pages (so the developers can link code to them, allowing for easy navigation like Shift-F2 in Eclipse)
  • Tests for library functions. Essentially this is ALSO documentation, as it demonstrates clearly how to use the library and which border cases can be expected.
  • Figure out a way to run tests and regenerate javadoc automatically.
  • Figure out a way to run/test/stress your application automatically. (Mouse clicking on GUI, invoke lots of requests on web servers etc).

Any of these bullets will improve the code base without actually changing code unnecessarily.

Thorbjørn Ravn Andersen
+4  A: 

Don't just look around the code and change things. As others say, don't fix performance problems until you've proven where they are.

I use this simple method.

With 7000+ classes, I would bet heavy money that your system is way over-designed, and that you have performance problems of the too-many-layers-of-abstraction phylum.

What happens is that simple function and method calls look innocent, and event-handling code is considered the "leading edge", but if you run it and wait until it's being slow, then "pause" it a few times, you see things like this:

  • modifying a database
  • due to initializing data structures
  • in the process of creating/destructing windows
  • in the process of changing connections in a tree-browser
  • due to rearranging elements in a list
  • due to a cut/paste operation
  • due to somebody setting a property
  • due to somebody setting a "modified" bit
  • blah, blah, blah ...

sometimes for 20-30 levels deep.

Any of these layers that appear on multiple samples, if they could be avoided, would save large percentages of execution time.
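The "pause it a few times" idea can also be approximated in code. The sketch below is an assumption, not the method linked above: `StackSampler` is a hypothetical helper that snapshots a thread's stack with `Thread.getStackTrace()` and counts how many samples each frame appears in. It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: a crude stand-in for pausing the program by hand.
final class StackSampler {

    // Takes the given number of stack snapshots of the target thread,
    // sleeping pauseMillis between snapshots.
    static List<StackTraceElement[]> sample(Thread target, int samples, long pauseMillis)
            throws InterruptedException {
        List<StackTraceElement[]> result = new ArrayList<>();
        for (int i = 0; i < samples; i++) {
            result.add(target.getStackTrace());
            Thread.sleep(pauseMillis);
        }
        return result;
    }

    // For each "Class.method" frame, counts the samples it appears in.
    // A frame counts once per sample, even if it recurs within that stack.
    static Map<String, Integer> tally(List<StackTraceElement[]> samples) {
        Map<String, Integer> counts = new TreeMap<>();
        for (StackTraceElement[] stack : samples) {
            Set<String> seen = new HashSet<>();
            for (StackTraceElement frame : stack) {
                seen.add(frame.getClassName() + "." + frame.getMethodName());
            }
            for (String key : seen) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

A frame that shows up in, say, 6 of 10 samples is on the stack roughly 60% of the time, which is the kind of layer worth asking "could this be avoided?" about.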

Mike Dunlavey
+2  A: 

Rules of optimization

  1. Don't do it.
  2. Measure twice.
  3. You shouldn't get here.
  4. Don't micro optimize. Look for algorithmic complexity like the containers you use.
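As an illustration of rule 4, here is a sketch (hypothetical class and method names) of the same duplicate check written against two containers. The first is quadratic because `List.contains` scans the list; the second is linear on average because `HashSet.add` is a hash lookup.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical example: container choice changes the algorithmic complexity.
final class DuplicateCheck {

    // O(n^2): every contains() call scans the list built so far.
    static boolean hasDuplicateSlow(List<String> items) {
        List<String> seen = new ArrayList<>();
        for (String item : items) {
            if (seen.contains(item)) {
                return true;
            }
            seen.add(item);
        }
        return false;
    }

    // O(n) on average: add() returns false if the element was already present.
    static boolean hasDuplicateFast(List<String> items) {
        Set<String> seen = new HashSet<>();
        for (String item : items) {
            if (!seen.add(item)) {
                return true;
            }
        }
        return false;
    }
}
```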
Hans Malherbe
Ahhh. So *that's* why programs running on my computer that's 1000 times more powerful than the one I had 10 years ago are just as slow as the equivalent programs from back then!
Software Monkey
+1  A: 

All of the best performance improvements will be algorithmic improvements.

+1  A: 

Use a profiler and let it tell you which code is run most frequently and where most of the time is spent. Then you know for sure what needs attention, and you can iteratively improve the targeted areas of the system.

mP
A: 

If your code uses several critical data structures, and especially maps/sets/anything that involves a lot of lookups, you may want to make sure that you are using them optimally.

For example, if you are using Hash based collections, how good is your hash function in terms of producing a uniform distribution? This is typically not a problem but in the few cases that it is, it can add up to a nice saving since your hash table may essentially be performing as a linked list.

In addition, remember that calculating hashes and equality is not free. Check how efficient calculating your hash is. Suppose that you have multiple numeric fields and multiple String fields; calculating a hash based solely on the numeric fields may not produce as nice a hashing, but may save you the cost of hashing entire strings. Similarly, see if you can reorder your equals checks to do the cheaper and more likely to fail tests first, especially if your equals was automatically generated.

The same issue extends to comparison-based collections (e.g., TreeSet): it might make sense to see if you can optimize your compare function. For example, can you make the cheaper comparisons first? (assuming you only use compare for functions).
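A sketch of the equals/hashCode points above, using a hypothetical `CustomerKey` class with one numeric and one String field. It assumes ids are mostly distinct, so hashing only the id still spreads keys well; if many keys shared an id, this hash would cluster and the String would need to be included. It assumes Java 8 or later.

```java
import java.util.Objects;

// Hypothetical key class: cheap checks first in equals, cheap fields only in hashCode.
final class CustomerKey {
    private final int id;
    private final String name;

    CustomerKey(int id, String name) {
        this.id = id;
        this.name = Objects.requireNonNull(name);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CustomerKey)) {
            return false;
        }
        CustomerKey that = (CustomerKey) other;
        // Cheap int comparison first; the String comparison runs only when ids match.
        return id == that.id && name.equals(that.name);
    }

    @Override
    public int hashCode() {
        // Hashes only the numeric field: cheaper than hashing the String,
        // and still consistent with equals (equal keys have equal ids).
        return Integer.hashCode(id);
    }
}
```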

Uri
+4  A: 
Robert Munteanu
A: 

I would first look to see if newer/faster hardware can solve your problems. As much as one may want to optimize code, it is often more economical to move the software to a faster server. I like to use the DaCapo benchmarking tool to compare Java performance on hardware. I also keep a running list of hardware which I've tested.

brianegge
A: 

You can look at tools like FindBugs, but they won't fix your code. I suggest you try IntelliJ Community Edition, which is free. It will find possible performance issues and also offer quick fixes.

However, the best approach is to use a performance profiler. Profile a realistic sample of your program, and this will often point to simple things you can do to improve performance. That is, only the top items reported by the profiler are worth optimising; the rest is not run often enough to be worth changing.

Peter Lawrey