I took over some code that hadn't been developed since 2002 and I looked through the patches sent against it over time. All these patches were in unified diff format, which apparently is the de facto standard for submitting code improvements. Here's what one patch looked like:

@@ -365,7 +385,10 @@
     return () unless defined $op_sym;

     $a_or_b = $op->[OPCODE] ne "+" ? 0 : 1 unless defined $a_or_b;
-    return ( $op_sym, $seqs->[$a_or_b][$op->[$a_or_b]] );
+    my $line = $seqs->[$a_or_b][$op->[$a_or_b]];
+    my @ret = ( $op_sym, $line );
+    return @ret;
}

How exactly am I supposed to figure out what this change does in context? The patch doesn't tell me what subroutine it affects. I'd have to open the original file, go to line 365, and mentally replace the existing lines there that correspond to '-' lines in the patch file with the '+' lines in the patch file. WTF?

To preserve my sanity, I ended up creating a copy of the original file as file.orig, running patch on file, then using a visual diff tool on file.orig and file to actually see what the patch was doing.

My question is: why don't people spare each other the effort and just send the entire files they've changed?

Can most developers who accept patches figure out instantly what the patch refers to in their code? What if it removes a line that appears often in the file? Will they know which occurrence, in which subroutine, the patch affects?

Maybe patches made sense back in the day when bandwidth was precious, but come on: a source code file normally has under 10,000 lines of code, which means less than 50k. 50k of text gets transferred in seconds over even the crappiest connection, even without text compression. I agree that patch files are good for simple changes, like typo fixes.

However, I have yet to see a developer asking for significant code contributions via whole files. Everyone says "patches welcome" and they expect unified diff. Do they not end up comparing the files side-by-side in a visual diff tool? What about character-level intra-line differences? A patch file doesn't show those.

Am I the only one who prefers to visually diff?

+3  A: 

Patches are useful precisely because they affect only a small part of the code. Consider the case of a developer working away on an open source project. Somebody downloads the code one week, and then submits a change to the developer the next week. Chances are the developer has already been working on that source file, so other things have already changed. With a patch file, the developer can apply the patch to the existing source code, and the patch program will find where it applies and apply it.
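To illustrate how a hunk can still land in the right place after the file has shifted, here is a minimal sketch. It is not how the real patch program is implemented (that also handles fuzz and partial context matches); the function names `find_hunk` and `apply_hunk` are made up for this example. The idea is just to look for the hunk's old-side lines (context plus '-' lines) and pick the match closest to the line number given in the hunk header.

```python
def find_hunk(file_lines, old_lines, hinted_index):
    """Return the index where old_lines occurs in file_lines,
    choosing the occurrence nearest to hinted_index, or None."""
    n = len(old_lines)
    candidates = [
        i for i in range(len(file_lines) - n + 1)
        if file_lines[i:i + n] == old_lines
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda i: abs(i - hinted_index))


def apply_hunk(file_lines, old_lines, new_lines, hinted_index):
    """Replace the located old_lines with new_lines; raise if not found."""
    pos = find_hunk(file_lines, old_lines, hinted_index)
    if pos is None:
        raise ValueError("hunk does not apply")
    return file_lines[:pos] + new_lines + file_lines[pos + len(old_lines):]


# The maintainer has added two lines at the top since the patch was made,
# so the hunk's hinted position (index 1) is stale.
current = ["new1", "new2", "a", "b", "return x;", "}"]
old_side = ["b", "return x;", "}"]
new_side = ["b", "my $r = x;", "return $r;", "}"]
patched = apply_hunk(current, old_side, new_side, hinted_index=1)
```

The surrounding context lines are also what disambiguates a removed line that appears many times in the file: only the occurrence whose neighbours match is a candidate.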

If the contributor sent the whole file without identifying what changed, the developer would have no way of knowing which previous version of the code the submitter had changed. How would the developer apply the patch to the latest code in that case?

If the developer wants to review the changes submitted by the contributor, then for a small change it is certainly possible that he can say "oh, he just changed that bit of code there" and move on. For a more complex change, the developer might apply the patch to a temporary copy of the project, and use a visual diff program to show the changes in a way that's easier to browse.

Greg Hewgill
A: 

By sending a patch, it is possible to view the change in context, even if it doesn't match up exactly due to other changes. If you send the whole file, it is impossible for the person receiving the file to know which bits you have changed, and which bits are changes that you will be (inadvertently) reverting from other people. Visual diff tools and source control systems can be used to show the exact effect of the patch. In short, patches are more practical and easier to work with.

Modern source control systems send around changesets (essentially patches) rather than whole files for much the same reasons and also to save bandwidth.
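As for the character-level differences the question asks about: they can be recovered from a '-'/'+' pair of lines in the patch. A minimal sketch, assuming Python's standard difflib module; the helper name `intraline_changes` is made up for this example, and real visual diff tools may use different matching heuristics.

```python
import difflib


def intraline_changes(old_line, new_line):
    """List the (tag, old_text, new_text) pieces that differ between two lines."""
    matcher = difflib.SequenceMatcher(None, old_line, new_line, autojunk=False)
    return [
        (tag, old_line[i1:i2], new_line[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only the parts that changed
    ]


changes = intraline_changes("return foo(bar);", "return foo(baz);")
# changes == [("replace", "r", "z")]
```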

1800 INFORMATION
**If you send the whole file, it is impossible for the person receiving the file to know which bits you have changed, and which bits are changes that you will be (inadvertently) reverting from other people.** At least with the visual diff tool I'm using (CompareIt), I can instantly see the parts of my file that they are overwriting, even if chunks of code are moved around - these show connected by dotted lines. I have yet to see an example where a patch file would conceivably be clearer than a visual diff. Check some screenshots maybe? http://grigsoft.com/wc3pics.htm
dandv
A: 

On the purely technical side, I suppose that patches are much more compact than the entire file, and it saves storage and bandwidth if you're using a lot of them.

There is of course the human side to it as well. You're picking up code that hasn't been touched in 7 years, but if you were part of the development team that was working on it, you'd have enough knowledge of the code that a diff would give you exactly the information you needed. When discussing the proposed change in the code, the programmers would have known exactly what the context was, what function it changed, etc.

Basically, patches are better for the short term, but if you want to understand what happened a long time ago, you'll need the entire file.

sykora
A: 

I think it's sensible to use a visual diffing tool if the effect of the patch is not immediately apparent. So you need an efficient way of doing that.

"I ended up creating a copy of the original file as file.orig, running patch on file, then using a visual diff tool on file.orig and file to actually see what the patch was doing."

If you have your good copy committed to source-control, you can apply his patch to your file(s) directly, then use your favorite source-control diffing tool to view his changes.
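If no visual diff tool is at hand, a rough side-by-side view of the file before and after patching can be generated with Python's standard difflib module. This is only a sketch: the helper name `side_by_side_html` and the sample strings are made up for illustration, and the before/after text would normally be read from file.orig and the patched file.

```python
import difflib


def side_by_side_html(before_text, after_text):
    """Render a two-column HTML comparison of two versions of a file."""
    differ = difflib.HtmlDiff(wrapcolumn=80)
    return differ.make_file(
        before_text.splitlines(),
        after_text.splitlines(),
        fromdesc="file.orig",
        todesc="file (patched)",
        context=True,  # show only changed regions plus nearby lines
        numlines=3,    # number of context lines around each change
    )


# Hypothetical sample contents standing in for the two files.
before = "return () unless defined $op_sym;\nreturn ( $op_sym, $line );\n}\n"
after = (
    "return () unless defined $op_sym;\n"
    "my @ret = ( $op_sym, $line );\n"
    "return @ret;\n"
    "}\n"
)
html = side_by_side_html(before, after)
```

Writing `html` to a file and opening it in a browser shows the two versions in adjacent columns with the changed lines highlighted.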

joeytwiddle