We're looking to start a code-review process for a group of ~30 coders who are all relatively junior programmers.
What is your advice on how to train for code reviews? Are there code review pitfalls to be avoided? Essential things to look for?
Focus on the code, not the person. People will still take it personally, but it can be minimized if you help them understand why their approach has to be changed for the sake of maintainability and/or performance.
Use the sandwich principle: start with good things, address issues in the middle, end with good things.
Code reviews are a form of reading. The more you do, the better you get. Encourage them to read each other's code and mentor them to accept criticism without taking it personally.
You need to reward people for reviewing, both to prevent peers from simply "signing off" on each other's code and to encourage thorough review. For example, keep a ranking that puts the person who finds the most bugs at #1 on a visible leaderboard. Remember: you incentivize what you measure.
Keep in mind that code review is often perceived by the developer whose code is being reviewed as questioning their capability. Because of this, to avoid personnel issues, review everybody's code; don't make "exceptions" because somebody is "senior". Exceptions would be detrimental anyway, because you want the juniors to learn from the seniors.
The number one thing is to convince them that it's not personal. You might have them start by reviewing for concrete problems, such as security issues (SQL injection is easy to spot and demonstrate; see the sketch below). Once they get into the habit of performing code reviews, you can add deviations from standards, etc.
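As a hedged illustration (the table and function names are invented), here is the kind of defect a first review pass can be trained to spot, in Python with the standard sqlite3 module:

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Red flag for a reviewer: user input concatenated straight into SQL.
    # A username of  x' OR '1'='1  would match every row in the table.
    query = "SELECT id, email FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # What the review should ask for instead: a parameterized query,
    # so the driver handles escaping rather than the programmer.
    query = "SELECT id, email FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()
```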
The best way to learn to do code reviews is to sit in on code reviews with more experienced reviewers at the company. That way, the junior developer starts to learn what is expected during a code review (of both the reviewer and the reviewee) as well as the general tone of code reviews for your team/company. Another good way is to have your code review team review the junior developer's code (i.e. have them be a reviewee before they are a reviewer).
If you have no review process in place and you are trying to build one from scratch, I would recommend you get a small group of your most senior developers (maybe 4 or 5) and do a code review on some of your code. Make it clear that you expect the reviewers to be prepared (don't just show up -- they should be expected to have read and understood the code in question beforehand and to come with good, relevant questions and comments). Then, once you have that in place, slowly integrate your more junior developers.
Make certain to have a coding standard that everyone understands, as that will help when reviewing code.
Also, it helps if you have some source code that follows all the best practices that they can look at, as an example of what code should look like.
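What counts as "best practice" will vary from shop to shop, but as a sketch (the function and conventions here are invented for illustration), an exemplar file might demonstrate things like type hints, docstrings, and explicit error handling:

```python
from typing import Iterable

def average_order_value(order_totals: Iterable[float]) -> float:
    """Return the mean of the given order totals.

    Raises:
        ValueError: if no totals are supplied.
    """
    totals = list(order_totals)
    if not totals:
        # Fail loudly rather than silently returning 0 or NaN.
        raise ValueError("average_order_value() requires at least one total")
    return sum(totals) / len(totals)
```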
just a few thoughts...
I would advocate getting the free book that Smart Bear publishes on peer code review. The book is a gem and has a ton of good information on starting a code review process.
I am in no way affiliated with Smart Bear or its affiliates... It really is a good book!
Do you mean that in the group of 30, there is not a single experienced programmer? It must be a very special group! A school maybe?
If it is a private company, I suggest reviewing not the code, but their employment policy ;-)
Coding advice is available all over the place, in best-practice guides and so on.
Speaking of the reviewing process itself, here is what I suggest:
The group needs a common reference.
If it evolves as you go, it could be stored in a wiki, for example, where each page is a stable reference. Subscribing everyone to the wiki's RSS feed will send notifications about modifications, which covers the dynamics of change.
Each one must understand the reference. There must be discussions about it.
The written document alone is not enough; interpretations must also converge.
The writer of the code must review and correct it himself before the review takes place.
This takes care of a lot of obvious mistakes privately (without anybody else seeing them). It leaves only the interesting stuff to discuss, which speeds up the following steps.
The reviewer can read carefully alone, taking private notes to remember everything.
This gives him a chance to evolve his opinions privately. What may look like a mistake at first might be recognized as smart half an hour later, once the whole code unit has been understood. It also leaves room for setting priorities or grouping similar items. This speeds up the following steps.
Then the writer and the reviewer get together. Care must be taken not to offend the writer (point out positive aspects too; be nice). They talk through each item carefully. Opinions may change during the exchange; the focus is on learning.
Take the items in turn, following the grouping. Once they have finished discussing each item, they write down the final review notes, with three possible outcomes:
the code item is finally considered correct by both: Forget about this item, don't write it at all in the review notes.
the code item is finally considered incorrect by both: The review notes should mention that the writer will fix it.
there is no agreement on the code item: This is a learning opportunity for the team. The code item should be discussed in a meeting to let everyone learn (possibly without reference to the precise coder and reviewer, to prevent bad feelings).
Notes:
This process takes a lot of time the first time, then it speeds up considerably. The time spent is learning time, which is very valuable. To me, the main point of code reviews is to improve coding practices, until ultimately you don't need code reviews at all ;-)
As you can see, this process focuses on doing several steps privately, involving either a single person or only the writer-reviewer pair (only the final notes may be stored and shown to the rest of the team or to management). This is important for keeping pressure on the workers low, so the focus stays on learning and not on counting points.
My advice
Review Unit Tests
Start out reviewing the test code. There's less ego involved in writing tests so pointing out deficiencies there will be less painful for the author. Once you're comfortable with that, start reviewing the actual changes to the codebase.
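For instance, a reviewer looking only at tests can still find real gaps. A minimal sketch (the function and test names are invented):

```python
import unittest

def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))

class ClampTest(unittest.TestCase):
    def test_value_inside_range(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    # Typical test-review feedback: the happy path was covered,
    # but the boundary and out-of-range cases below were missing.
    def test_value_below_range(self):
        self.assertEqual(clamp(-1, 0, 10), 0)

    def test_value_above_range(self):
        self.assertEqual(clamp(11, 0, 10), 10)

if __name__ == "__main__":
    unittest.main()
```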
Keep It Simple
You won't know what bits of data to track up front, so don't try. Grow your process organically.
You need to convince the developers that their time is being used effectively.
One way to do this is to make sure that you've exhausted automated "code review" sorts of technologies before asking your team to spend their time sitting around and reading code.
This means having sufficient automated test coverage (e.g. unit tests) as well as having some static analysis tools looking at your code (assuming such tools exist for your platform).
When these tools are finding certain classes of bugs automatically, then you can make a stronger statement for asking your team to sit in a meeting and read code.
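As a hedged example (assuming a Python codebase; the snippet is invented), a type checker such as mypy will flag the defect below with no human involved, leaving reviewers free to focus on design:

```python
from typing import Optional

def lookup_price(catalog: dict[str, float], sku: str) -> Optional[float]:
    # Returns None when the SKU is unknown.
    return catalog.get(sku)

def order_total(catalog: dict[str, float], sku: str, quantity: int) -> float:
    # mypy flags this line: the result may be None, so multiplying it
    # by an int is unsafe. The None case gets handled before any
    # human reviewer ever reads the code.
    return lookup_price(catalog, sku) * quantity
```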
You see, the value proposition of code reviews is something like: if we spend our time pointing more eyeballs at some code, then we'll likely find some bugs before the end users do and save ourselves from fixing high-priority bugs at 3 a.m.
Now, free/open-source software shows that this sort of reasoning works... assuming that you have enough interested eyeballs.
One or two more sets of eyeballs might not be enough, even if they are interested and competent. On the other hand, even a fairly large number of eyeballs (10? 15?) might not be enough if they aren't sufficiently interested.
You're more likely to have interested people if they can be assured that certain checks have been done for some reasonably large class of bugs by automated tools. Use the human minds to find the sorts of things that today's automated tools cannot.
Also, consider tracking code review metrics - how else will you know whether the code reviews are valuable?
Let the team define the metrics: number of defects found per code review, number of "good ideas" that lead to significant refactoring, etc.
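A minimal sketch of how such tracking could start (the record fields and numbers are made up; a spreadsheet would work just as well):

```python
from dataclasses import dataclass

@dataclass
class ReviewRecord:
    review_id: str
    defects_found: int  # bugs caught before the code shipped
    good_ideas: int     # suggestions that led to significant refactoring

def defects_per_review(records: list[ReviewRecord]) -> float:
    """One team-chosen metric: average defects caught per review."""
    if not records:
        return 0.0
    return sum(r.defects_found for r in records) / len(records)

history = [
    ReviewRecord("PR-101", defects_found=2, good_ideas=1),
    ReviewRecord("PR-102", defects_found=0, good_ideas=0),
    ReviewRecord("PR-103", defects_found=2, good_ideas=1),
]
print(defects_per_review(history))  # 1.333...
```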