I have a code base of about 500 files in about 10 directories. I need to refactor the directory structure, which includes changing the directory hierarchy and renaming some directories.

I am using SVN for version control. There are two ways to refactor: one that preserves SVN history (using the svn move command) and one that does not. I think refactoring while preserving SVN history is a lot easier with Eclipse CDT and an SVN plugin (Visual Studio is poorly suited to directory restructuring).

But right now, since the code has not been released, we have the option of not preserving history.

That still leaves the task of changing the #include directives wherever the header files are included. I am thinking of writing a small Python script that takes a map from current filename to new filename and rewrites the includes wherever needed (using something like sed). Has anyone done this kind of directory refactoring? Do you know of any good tools for it?
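A minimal sketch of the kind of script described above, assuming the map is keyed by the exact path text that appears inside each #include directive. The names rewrite_includes and rename_map are hypothetical, and the regular expression does not try to handle includes built from macros or hidden inside comments.

```python
import re

# Matches '#include "path"' or '#include <path>' at the start of a line,
# capturing the directive prefix, the delimiters, and the path between them.
INCLUDE_RE = re.compile(r'^(\s*#\s*include\s*)(["<])([^">]+)([">])', re.MULTILINE)


def rewrite_includes(source, rename_map):
    """Return source text with every mapped include path replaced.

    rename_map is a hypothetical dict of old include path -> new include path.
    Paths that are not in the map are left untouched.
    """
    def replace(match):
        prefix, open_delim, path, close_delim = match.groups()
        new_path = rename_map.get(path, path)
        return prefix + open_delim + new_path + close_delim

    return INCLUDE_RE.sub(replace, source)
```

To apply it to a tree, read each source file, pass its text through rewrite_includes, and write the file back only if the text changed; a failed compile afterwards points at any include the map missed.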

+4  A: 

If you have to rewrite the #includes to do this, you did it wrong. Change all your #includes to use a very simple directory structure, at most two levels deep, using the second level only to organize around architecture or OS dependencies (like sys/types.h).

Then change your makefiles to use -I include paths.

Voila. You'll never have to hack the code again for this, and compiles will blow up instantly if something goes wrong.

As for the history part, I personally find it easier to make a clean start when doing this sort of thing: archive the old repository, make a new repository v2, and go from there. The counterargument applies when there is a long history of changes or many open issues against the existing code.

Oh, and you do have good tests, and you're not doing this with a release coming right up, right?

Charlie Martin
Agree - structure belongs in build info not code. The Mac compilers all used to allow you to specify a directory tree so you didn't have to stuff around with so many explicit paths. It was a shock when I started using Visual Studio!
Andy Dent
"Then change your make files to use -I include paths"- Includes should not be too deep but I won't like to complicate -I path. There should be wrappers at top level of each directory providing "facades" for the directory. This is the way Boost works; using it just needs including just one directory.
Amit Kumar
I don't think it's too important which way you do it, but the wrapper file means you have to read code to understand how the build works; I'd rather be able to find everything in the Makefile.
Charlie Martin
Adding too many directories to the include path also causes pain. If you do #include "file.h", you are more likely to run into a filename collision when adding a dependency (or porting to a new OS) than if you do #include "dir/file.h", where 'dir' is something unique describing your project.
bk1e
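The filename-collision risk bk1e describes can be checked before committing to a set of -I directories. The sketch below is an illustration only: find_header_collisions and the list of header extensions are assumptions, and it simply reports any header whose path relative to one include directory also exists relative to another.

```python
import os
from collections import defaultdict

# Assumed set of header extensions; adjust to match the project.
HEADER_EXTENSIONS = (".h", ".hpp", ".hh")


def find_header_collisions(include_dirs):
    """Map each relative header path to the include dirs that provide it.

    Only paths provided by more than one include directory are returned,
    since those are the ones an #include could resolve ambiguously.
    """
    seen = defaultdict(list)
    for include_dir in include_dirs:
        for root, _dirs, files in os.walk(include_dir):
            for name in files:
                if name.endswith(HEADER_EXTENSIONS):
                    full_path = os.path.join(root, name)
                    rel = os.path.relpath(full_path, include_dir)
                    seen[rel.replace(os.sep, "/")].append(include_dir)
    return {rel: dirs for rel, dirs in seen.items() if len(dirs) > 1}
```

An empty result means every include path resolves to exactly one file for the directories given; it says nothing about system headers the compiler adds on its own.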
+2  A: 

I would preserve the history, even if it takes a small amount of extra time. There's a lot of value in being able to read through commit logs and understand why function X is written in a weird way, or that this really is an off-by-one error because it was written by Oliver, who always gets that wrong.

The case against preserving history can be made in the following situations:

  • your code might have embarrassing things, like profanity and fighting among developers
  • you don't care about the commit history of your code, because it's not going to change or be maintained in the future

I did some directory refactoring like this last year on our code base. If your code is reasonably structured to begin with, you can do about 75-90% of the work using scripts written in your language of choice (I used Perl). In my case, we were moving from a set of files all in one big directory to a series of nested directories based on namespaces. So, a file that declared the class protocols::serialization::SerializerBase was located in src/protocols/serialization/SerializerBase. The mapping from the old name to the new name was trivial, so a find-and-replace on the #includes in every source file in the tree was straightforward, although it was a big change. There were a couple of weird edge cases that we had to fix by hand, but that seemed a lot better than either doing everything by hand or writing our own C++ parser.
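The namespace-to-directory mapping described above can be sketched in a few lines. This is an illustration in Python, not the Perl script the answer refers to. The function names are hypothetical, and build_rename_map assumes each old file was named after its class with a .h extension, which the answer does not state.

```python
def path_for_class(qualified_name, root="src", extension=""):
    """Turn a C++ qualified name such as a::b::C into a path like src/a/b/C."""
    parts = [part for part in qualified_name.split("::") if part]
    if root:
        parts.insert(0, root)
    return "/".join(parts) + extension


def build_rename_map(qualified_names, extension=".h"):
    """Map old flat include names to new nested include paths.

    Assumption: the old header was named <ClassName><extension> and lived
    in a single flat directory.
    """
    rename_map = {}
    for name in qualified_names:
        old_include = name.split("::")[-1] + extension
        rename_map[old_include] = path_for_class(name, root="", extension=extension)
    return rename_map
```

Two classes with the same unqualified name in different namespaces would overwrite each other in this map; those are the sort of edge cases that still need fixing by hand.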

James Thompson
+2  A: 

Hacking up a shell script to do the svn moves is trivial. In tcsh, it's foreach F ( $FILES ) ... end to adjust a set of files. Perl and Python are more capable for this.
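As a rough Python counterpart to the shell loop, the sketch below turns a rename map into svn move commands. The function names and the map are hypothetical; --parents asks svn to create missing intermediate directories and assumes Subversion 1.5 or later.

```python
import shlex


def svn_move_commands(rename_map):
    """Build one 'svn move' argument list per file that actually changes path.

    rename_map is a hypothetical dict of old path -> new path, relative to
    the working copy root.
    """
    commands = []
    for old_path, new_path in sorted(rename_map.items()):
        if old_path != new_path:
            commands.append(["svn", "move", "--parents", old_path, new_path])
    return commands


def format_commands(commands):
    """Render the commands as shell-quoted lines for review before running."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]
```

Printing the formatted lines first gives a dry run; each argument list can then be passed to subprocess.run(cmd, check=True) from the working copy root.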

It really is worth saving the history. Especially when trying to track down some exotic bug. Those who do not learn from history are doomed to repeat it, or some such junk...

As for altering all the files... There was a similar question just the other day over at:

http://stackoverflow.com/questions/573430/c-include-header-path-change-windows-to-linux/573531#573531

Mr.Ree