views:

659

answers:

13

Imagine that 90% of your job is merely to triage issues on a very massive, very broken website. Imagine that this website is written in the most tightly coupled, least cohesive PHP code you've ever seen, the type of code that would add the original developers to your "slap on sight" list. Imagine that this web application is made up of 4 very disparate parts (1 commercial, 2 "repurposed", and 1 custom) and a crap-ton of virtual duct tape and shims. Imagine that it contains the type of programming practices in which major components of the website actually rely on things NOT working properly, and fixing these broken things usually breaks other things. Imagine that you know from too many bad experiences that changing one seemingly innocuous part of the website, such as splitting a "name" field into two separate "first" and "last" fields, will bring the site to its knees and require hours of rollbacks, merges and patches. Imagine pleading with the customer for years to just ditch the code and start all over but being met with Enterprise-Grade despair and hand wringing. Then imagine getting ASAP/EMERGENCY tickets to implement new features that in any other web site would take 4 hours but you know better with this site so you quote 40 hours, then blow right by that and bill 80 hours, but it's OK because the client is used to that with their website.

Here are some other things that you should also imagine:

  • there are no tests at all right now
  • there are googleteen different layers of logins. Some customers actually have 3 different accounts for different sections of the website
  • when I say "tightly coupled", I mean the loops of include/require statements would probably map out like a celtic knot
  • when I say "least cohesive" I mean some stuff is organized sort of like MVC, but it's not really MVC. In some cases it may take you several hours just to find out how URI A is mapped to file B
  • the UI was written like "obtrusive" and "inaccessible" were the buzzwords of the day

Imagining all that, is it even worth trying to achieve even a moderate level of test coverage? Or should you, in this imaginary scenario, just keep doing the best you can with what you've been given and hoping, praying, maybe even sacrificing, that the client will agree to a rewrite one of these days and THEN you can start writing tests?

ADDENDUM

Since many of you brought it up: I have approached the possibility of a re-write at every chance I've had to date. The marketing people I work with know that their code is crap, and they know it's the fault of the "lowest bid" firm they went with originally. I've probably overstepped my bounds as a contractor by pointing out that they spend a crap ton of money on me to provide hospice care for this site, and that by redeveloping it from scratch they would see an ROI very quickly. I've also said that I refuse to rewrite the site as-is, because it doesn't really do what they want it to do anyway. The plan is to rewrite it BDD style, but getting all the key players in one place is tough, and I'm still not sure they know what they need. In any case, I fully expect that to be A Very Big Project.

Thanks for all the feedback so far!

+11  A: 

Absolutely not.

If you say that multiple things rely on other things specifically not working then how can you even begin to test it?

Personally I would say scrap it and start over. Four hour features that take 80? I hope this is an exaggeration. The headaches you must have.

I would start with a very firm proposal to re-write the code base. Hand-wringing clients must be told the blunt truth some times. How many other developers will work with a broken base? Make some pretty cost / benefit charts.

By all means write tests for code you write. Don't neglect that. I'm saying I wouldn't try to write tests on the existing code base.

Josh K
Fair point.....
Skilldrick
I disagree slightly... For a trivially sized application, sure it's cheaper to scratch it and start over. But in the real world, economic and staffing pressures may prevent a clean rewrite (since they still have to maintain the current app until it's ready). I agree that it's better to start fresh, but it's not practically in all situations (and hence why refactoring may be a wise choice, especially if the application is sizable)...
ircmaxell
Rewriting such systems is nearly impossible. You have to keep maintaining the old while writing the new. Your best bet, is write what tests you can, and refactor, refactor, refactor. Slowly but surely you improve the codebase, and remove the bugs.
Chad
@Chad: I disagree, though I would not attempt to rewrite the codebase entirely in one shot. Break it down into logical sections and rewrite one, along with relinking everything to your new section. Moving around in chunks you can support the existing system while replacing it in chunks.
Josh K
@Josh K: Isn't that the very definition of refactoring? And wouldn't that therefore fit the mold of adding tests to the existing codebase (which will entail refactoring to make that code block at least testable)?
ircmaxell
@ircmaxell: I view refactoring and rewriting differently I guess. Refactoring means smaller pieces *as they come up*, rewriting is a concentrated effort to rewrite large pieces of code. By all means write tests for what you have written, but I wouldn't try to write tests for existing code.
Josh K
@Josh K: That's a fair point. When I think of rewriting, I think of dropping everything existing and starting from a clean sheet. The only "take away" from the original application would be the list of features. Refactoring on the other hand would be going thorough and rewriting parts of the original code base so that eventually there's a total rewrite down the road, but it's gradual and incremental... You say `tom-ate-oes`, I say `tom-at-oes`...
ircmaxell
@Josh K, *"but I wouldn't try to write tests for existing code."*, why not? What better way to document behavior, and make it testable, maintainable, code? I'm doing this very thing right now on a medium sized system that was a complete and utter mess. Duplicated code, 1000s and 1000s of lines of dead code, and a plethora of bad code practices making it a nightmare to maintain and very buggy. When I'm done, it ends up being nearly a rewrite. But the process isn't from scratch. It's a long series of test, refactor, test, refactor... until the app works, and is maintainible...with tests!
Chad
@Chad: It's my opinion. I feel time is better spent recoding and making sure your code has proper tests then to spend time trying to test broken code.
Josh K
@Josh K, I've never seen a rewrite of a system go live without lots of serious support and rushed dev work to get it working like the prior system. Missed edge cases, and other esoteric features that aren't known until users start *really* using it. Rewriting is almost always a mistake. Read: http://www.joelonsoftware.com/articles/fog0000000069.html
Chad
@Chad: I've read that and frankly, I disagree. In some cases a full re-write is warranted. What Joel says is true, it's harder reading code then writing it. However, if we are to trust the questionee and his evaluation of the code base then I would say *it's harder to fix a code base then to debug a new one that you have written* simply because it's easier to debug something you know the ins and outs of rather then debug something as clusterfucked as this.
Josh K
@Josh, conversely, how can you rewrite an app when you don't know how the app works in the first place? You can't re-write without knowing the specs of the existing app. And the only thing that contains the specs, is the code of the existing app...as clusterf'd as it is. Either way you're f'd, at least by refactoring step by step, you can work incrementally...a few changed lines at a time. Knowing that the entire app worked at least a few minutes ago...and how much can you really break in a few minutes of coding?
Chad
@Chad: You'll be surprised how much I've broken with *"just a few lines of code"*. I'm assuming that as broken as it is it *works* at least in the designed capacity. Take a chunk of the system and rework it. Plug it back in and connect everything up. Test. It's not as easy as that but that's the general idea when you're cleaning up someone elses mess.
Josh K
@Josh K - As ridiculous as it may seem, my example was not at all an exaggeration. Adding some upload/download functionality has take at least that long. The earlier example I cited (the one resulting in rollbacks, merges, etc.) came from another estimate that went like this: "1 hour to add the new fields to the form and related database tables. 3 hours to update both reports to match the current form." Total time ended up at 37 hours, PLUS about 20 hours of bug hunting, rollbacks and merges after that was rolled out to find out why all hell broke loose on production after rolling that out.
Chrisbloom7
@Chris: Then I stand even more firmly behind my answer. If it took you (est. 40 hr work week) almost a full week to perform a change that should have taken half a day it's useless.
Josh K
+7  A: 

Give it a go

Writing tests enables you to refactor. If you write your tests at a high-enough level, you might manage to make it so you can refactor without having to re-write the tests every time.

It's a least worth a go, on a small part of the site (I know you won't be able to fully isolate any part, but you can still target part).

Maybe set yourself a time budget (it's down to you to work out what's affordable/worth it), then have a go with some tests and some refactorings. If it doesn't work out, roll back. If it does, carry on.

Good luck!

Skilldrick
Congrats on hitting 9k with the upvote. ;)
Josh K
@Josh, awww thanks :)
Skilldrick
+2  A: 

This sounds like in order to make it testable at all, you'd have to rewrite parts of the system from scratch - unavoidably causing tons of bugs in the process.

From what you describe, the old system is not worth putting that kind of effort into.

I would under no circumstances try and introduce testing for this, but try to get permission to rewrite as soon as possible.

If your client doesn't see the light, consider whether refactoring the project is worth giving some time of your own: Working with clean code is so much better for one's well-being...

Pekka
@Pekka, *"you'd have to rewrite huge parts of the system from scratch - unavoidably causing tons of bugs in the process"*, rewriting the ENTIRE system from scratch will introduce even more bugs in the process.
Chad
@Chad: Why does new code instantly create new bugs? I've rewritten several systems "from scratch" and noticed a marked reduction in the amount of bugs that came up. Proper planning is key, better to spend a week planning and a week coding then a day planning, a week coding, and three weeks fixing.
Josh K
@Chad @Josh I mean rewriting system components while the system as a whole stays in place. In my experience, in monolithic, badly-documented, chaotic apps, this almost always leads to unintended consequences - "oh, I didn't know that global variable mustn't be changed" type of things. I am all in favour of writing from scratch, with top-notch planning, state-of-the art design patterns, etc. It's the only way to go. Trying to make a broken system testable is what Im categorically against.
Pekka
`Trying to make a broken system testable is what I'm categorically against`... While I know where you're coming from by saying that, I'm not sure I can agree. It all depends upon what your definition of broken is. Typically, the entire system isn't broken, it's just parts (even major parts) are horribly broken. So if you can refactor those parts, you can save yourself a whole lot of work at the expense of testing and code re-use. For the most part, you won't have to throw away more than maybe 10% or 20% of the code. The rest you should be able to intelligently refactor... (Some cases anyway)
ircmaxell
@Josh K, if you can plan in a week, code in a week, your app is incredibly small and shouldn't have many bugs in the first place. I'm talking about apps taking several devs, dbas, testers, etc, several months to plan, several to code, several to fully test... rewriting is a huge waste of all that investment.
Chad
I second that. In WTF cases like that it's best practice to start a **parallel running** rewrite. If at least the database is consistent, a secondary view/management implementation is a better prospect than trying to document/test a failing system.
mario
*'In WTF cases like that it's best practice to start a parallel running rewrite.'*, then you end up in a race between keeping the old system running, which often includes adding features, and getting the new one developed. Every time the new system thinks it is done, it has to implement the features that the old system had added to it while it was being developed. It's trying to hit a forever moving target.
Chad
@Chad: I'm giving a general ratio of 1:1 for planning to coding. ;) I don't think a parallel re-write is best, for the reasons you mention. Rather, break the system into managable chunks and plan an incremental re-write.
Josh K
@Josh K, but many of those "chunks" probably work fine. There's no business value to rewriting them for the sake of rewriting them. Refactor as you have to fix bugs, write tests around the areas you touch, so you have some confidence you didn't break the rest of the system. And the rest of the app, "if it ain't broke, don't fix it"
Chad
@Chad: That is possibly the *worst* mentality I've found when it comes to coding. If the bath room is covered is feces, but the toilet flushes do you not clean it?
Josh K
@Josh K, correct, I clean it (refactoring). I don't build a new bathroom (rewriting).
Chad
@Chad: I'm not referring to the rewriting, I'm referring to the *"if it ain't broke, don't fix it"*. And according to you, every time you go in there you should clean up a little bit and then eat off that little bit to ensure that that little bit is clean. I'm saying go in with a play to clean it all up (by rewriting chunks).
Josh K
@Josh while I largely share the "rewrite from scratch" view, I think @Chad's mentality comes from real-world experience, and I have respect for that. In environments where money has to be made with the tools available, there are more factors that play into these decisions, and sometimes you simply have no other option than applying duct tape here and there. Refactoring is *painful* because it costs time and money without making any. Sometimes, it is too painful. Still, in the case the OP describes, as far as can be diagnosed from a SO question, I too would opt for a total, parallel rewrite.
Pekka
@Pekka, *'Refactoring is painful because it costs time and money without making any.'*, it doesn't cost as much as taking 80 hours to make a 4 hour change! The point is, you refactor what you **need to change**. There could be a 10,000 line class that is a complete mess, but if you never touch it, and it works, you don't waste the time and money fixing it. You spend the time cleaning up the code you work with, when you work with it. It's not duct-tape, it's simply making the best use of time and money. Fix where needed, and is most beneficial.
Chad
@Chad "duct tape" is maybe too strong a word: What I mean comes actually closer to your 10,000 line class example. That should be rewritten at some point, too (because adding a feature to it would take thrice as much time as it would if it were fixed, and would be no fun at all) but that point may not be there yet, because there are other priorities.
Pekka
@Pekka, but, if you never need to add a feature or change that file, the time, money, and effort used to rewrite it prematurely was wasted.
Chad
@Chad true. ` ` ` `
Pekka
@Chad: If you build a house out of duct tape surrounding a rotting foundation it will eventually fall apart. Don't add to the mess, clean it up. If you have a 10k line class that is an absolute disaster it might be worth 40 hours to fix it if later on it cuts the work you have to do by 10%. **Never** assume you will never touch a file to add a feature.
Josh K
@Josh I see your point, but who is going to pay for the 40 hours? If the client can't be persuaded and you decide to put them in yourself: Who's going to pay your rent while you work 40 hours for free? Not trying to invalidate your point, just adding a different perspective... I've been in a very similar situation recently, and it can be *hard*.
Pekka
@John K, who's going to pay for the 40hours to fix the file when it already works?
Chad
@Chad: I though they were getting charged 80 hours for a 4 hour change? If you're spending two weeks when you should be spending 10% of *one* week, I think you would rather overbill one time and then underbill the next, saving yourself the headache in between.
Josh K
@Josh K, we're never going to agree on this. I've seen to many rewrites (even of just modules) cause hundreds and hundreds of man hours wasted re-fixing bugs, and re-implementing features that were overlooked in the rewrite because of a lack of documentation of how the original code worked. You may think they users know how their app works, but they very rarely do...until they use the new one and it's different than before. Then the ***beep*** flies!
Chad
@Josh K, @Pekka, I added this to my answer, but I **highly** recommend watching this presentation by *Uncle Bob* http://www.infoq.com/presentations/craftmanship-ethics as it touches on this type of scenario
Chad
A: 

In theory, definitely. The more tightly coupled, bug ridden the maintenance process then the more important the tests. In practise, walk away and live another day!

Paul Hadfield
+13  A: 

You'll likely not get full coverage for some time. But you can write tests for new code/features you implement. Start small. Don't try to do everything all at once.

And perhaps the book "Working effectively with Legacy Code" is worth a read?

Edit

I also would recommend watching this presentation from Uncle Bob which touches on this scenario and how to transform a bad code base, into a good one using "Progressive Widening"

Edit 2

Start by finding any self contained functions. Functions that don't reference anything but the arguments passed in. Move and organize them into helper classes. This is likely just temporary, as many will end up in different classes later, but this will help identify some duplicated code, and start getting things organized. Then, look at how these are used, write tests based on these uses. And pat yourself on the back. You've now started making your code maintainable.

Edit 3

With great timing InfoQ just posted another article How To Do Large Scale Refactoring which is specifically this sort of thing, and another, older article called Refactor or Rewrite? and there are techniques like the Mikado Method where you have to realize you can't always make the move you want in one step, you have to make other moves to setup and realize your goal.

Chad
@Josh K, I was just about to make the edit to clean up that link and when I clicked edit, and saw it was already done, I was very very confused... ha
Chad
@Chad: No problem, I just don't like plain links. ;)
Josh K
+1, that's an excellent book for dealing with a situation like this. For this particular case, I would recommend writing characterization tests, where you write a test case that fails, observe the way the system currently behaves and change the test to pass for the behavior. You're not testing whether the system runs correctly, but at least you're testing whether the system's behavior changes when you modify something.
Dan Bryant
Interesting Dan. I hadn't heard of that style of testing before. The touch bit here is that there are so many different components to test - making a change to the front end often breaks one particular admin tool. Changing stuff in the ecom part can break other 3rd party systems that try to automate certain things (and which I don't have access to). But perhaps a combination of yours and Chads suggestions would be not entirely impossible to implement.
Chrisbloom7
er, "tough", not "touch"
Chrisbloom7
@Chrisbloom7, Test and mock the third party systems. That way, you can plug them in, and hopefully know beforehand if you're changes will break them.
Chad
@Chrisbloom7, I definitely recommend you check out the Legacy Code book. It outlines a number of relatively low-risk techniques you can use to break dependencies and get code under test. It takes a very pragmatic approach and freely admits that some of the refactorings are quite ugly. The goal is to get things under test so that you safely and confidently make more significant refactorings as you improve your code to implement new features. That and you have to love a book that has a chapter named "My Application Has No Structure".
Dan Bryant
@Chrisbloom7, and I highly suggest watching Uncle Bob's presentation that I linked above. It's only 45 minutes, but explains why rewrites rarely work, and how to refactor pragmatically
Chad
@Chrisbloom7, I added some more links to articles you can read discussing the ways to go about improving a poor code-base.
Chad
+5  A: 

Absolutely write tests. Especially in a tight-coupled environment the tests are going to be even more critical (since a bug fix in one area can drastically affect other areas due to the tight coupling).

Now, realize that this will likely not be a trivial task. In order to write tests, you'll need to modify the code to be testable. In order to modify the code, you need to have tests. So you're caught in a dependency cycle...

However, look at the potential benefits. That should tell you if it is really worth it or not.

If you do start out, start small. Pick one tiny piece that looks loosely-coupled, and test that. Then find something else that's not that tangled. Test all the loosest pieces first (the low hanging fruit). Then, once you get to the really tight parts, you'll both feel more comfortable and (hopefully) have more insight as to what you really need to do.

Remember, you don't need 100% coverage to reap the benefits. Each test adds meaning...

ircmaxell
+1 for "low hanging fruit".
Chad
+1  A: 

Start by doing black box, functional testing, connected parts or bits and pieces here and there. This makes continued development and refactoring/rewriting much easier.

Been there, doing that.

Took a while till we could start adding unit testing, but got there eventually.

It's still far from bulletproof but all developers are much more confident to dare to change/fix things when you know that there is a test suite waiting to try to verify your code changes.

nicomen
can you explain what you mean by black box function testing?
Chrisbloom7
http://en.wikipedia.org/wiki/Black-box_testing
nicomen
Ah, thanks. Yes that might work, and I might even be able to get them to write the tests for me, or at least the stories that will become the tests.
Chrisbloom7
+5  A: 

First, if your customer is use to your estimates being half what it actually takes, fix your estimates! It is nice the customer is 'OK' with the estimates being off -- but it is critical you get your estimates more in line with effort actually needed. Without that, what customer would ever consent to a major refactoring -- let alone a rewrite. So get some history of being right with estimates, then move to rework the project.

As for writing tests. That is even more vital for what you describe than for a green-field project. In every piece of code you touch ask yourself if it is possible to decouple the behavior that should be there from the behavior that is there. Write the code the way it should be (with tests) and then add a layer of abstraction to make it the way it currently is (and test that too!). You will feel like your adding to the mess, and you will be -- but slowly, over time, the tests will give you confidence in these areas.

If it's anything like what I've been dealing with, it will be on the order of pulling a single method out into a helper class and patching it back into the existing code, hardly seems worth it -- but it does pay off every time you have to touch that part of the system again. Like they say -- "leave it better than you found it" and you'll start finding it in better shape each time you come back to it. Tests are the best way to leave it better than you found it.

But seriously, getting the client confident in the accuracy of your estimates is required before they will be fully confident in your ability to handle a rework.

David Culp
@David Culp; welcome to the site!
Dean J
Thanks David. Actually, what we've started doing is telling the client "we won't know until we finish". Having done this for nearly a decade, I'm pretty comfortable with quotes. But this system throws them way off.
Chrisbloom7
If the system is well-entangled, it is impossible to fix the estimates (before fixing the code). You should be able to make an estimate once in a while though (half the time, on average).
Stephan Eggermont
From reading the original question I took it to be understood there was a history of estimates on this project. Making the assumption (assuming, I know ... ) the project is mostly uniformly entangled, the history gives a good guage an offset to adjust his normal estimations with. I'm in his same situation by a large measure, and in my experience accurate estimates are one of the most important parts of client comfort.
David Culp
+3  A: 

You can't scrap it. The customer isn't going to let you, and it might not be the best path anyways.

So instead of quoting 40 hours for a fix that should have taken minutes... quote 60. The customer seems A-OK with that. Use 40 to fix, and 20 to refactor... and write tests on what you refactor. If the 60 runs to 100, then spend 120; 80 to fix, and 40 to refactor/test.

Build in time to improve the thing into your normal estimates, or find new work; the current situation, it sounds like, will drive you into hating our field.

Dean J
I really like this idea in general. Even if not to do testing with, then just to refactor, cleanup, consolidate, etc.
Chrisbloom7
+1  A: 

From your scenario, you should have a long list of fragile areas of the code that tend to be affected by innocuous changes (or at least areas that absolutely must work). If you can wright tests against this list, you have a quick way to find out when a change you're implementing has broken something.

Dana the Sane
A: 

If things behave reliably, you can test them, right? Your system works the majority of the time, so you can test for those success conditions.

..innocuous part of the website, such as splitting a "name" field into two separate "first" and "last" fields, will bring the site to its knees and require hours of rollbacks

Splitting a field apart such as first and last name sounds like a potential massive thing - but sounds like you've learned your lesson. At least try to get some funding for a full size test system and put the procedures in place to make moving production data to it automatic, so you can fully test this thing.

Sounds pretty horrible though. Time to dust of the ole resume?

Sam
+2  A: 

The most important thing (After buying Working efficiently with legacy code) is to start small. I work on several projects, each several thousand PHP lines long and often without a single function (and don't even think of objects) and whenever i have to change code i try to refactor the part into a function and write a test for it. This is combined with extensive manual testing of that part so i can be sure it works as before. When i have multiple functions for similar things i move them as static methods into a class and then, step by step, replace them with proper object-oriented code.

Every step from moving it into a function to changing it into a real class is surrounded by unit testing (not very good one as 90% of the code are SQL queries and it's nearly impossible to set up a reliable testing database, but i can still test the behaviour).

Since a lot of code repeats (i found a single SQL query repeated 13 times in a single file and many times more in the other 50 files of that project) i could change all other places, but i don't since those are neither tested nor can i be sure the surrounding code doesn't depend on that code in some wierd way (think global). That code can be changed as soon as i have to touch that code anyways.

It's a long and tedious work and every time i see the code i feel a step (or rather a leap) closer to mental breakdown, but the code quality will improve (slowly but mostly reliably).

Your situation seems to be quite similar, so maybe my ideas might help you with your code.

In short

Start small, change only what you work on and begin to write only limited unit tests and expand them the more you learn about the system.

dbemerlin
A: 

You might want to consider billing another 40 hours/iteration to create a nice BDD (domain) model of how the application works or better: should work. That creates a nice framework where you can document the needed features. When the model is complete enough, you can estimate how much time you'd need to convert it to a working application.

Stephan Eggermont