I've inherited a really poorly designed PHP spaghetti code project. It's been gaining a good bit of traffic recently and is starting to have performance issues on top of the poor monolithic code base. It's maxing out performance on a chunky 16GB dedicated machine when it really shouldn't be.

I'm planning on doing some performance tweaks right off the bat to help the performance issue, but this still won't really help the horrible code base. The team is small but expecting to grow very soon.

I've read Joel's article on the troubles of doing a complete rewrite and see the concerns. But how bad does the code base have to be before you consider a rewrite?

There is PHP handling logic interjected into what one would usually consider a "view". Even worse, in some places SQL statements are in these same files! The only real separation of presentation and logic is a few PHP scripts that serve as function libraries. These scripts do most of the ORM stuff... if you can even call it that. Trying to slowly refactor this seems like a nightmare.

Open to your thoughts and opinions... however not interested in hearing, "Run away, Run away!".

+24  A: 

I would say your code has to be pretty dang bad before you do a rewrite. In fact, I can't really think of a good reason why you would rewrite a huge codebase that is currently working (see what happened to Netscape).

Furthermore, I think you'd have a hard time convincing your management to spend a bunch of time/resources just to get what they already had (in their eyes, not yours) months ago (I've already fought that battle once).

That said, you should take the most important or slowest sections of your code and refactor them to make your system run more smoothly and to make it easier to maintain later.

I'm sorry, but dealing with legacy code is just part of the job. Just make sure you leave any code you touch better than it was before and you're golden.

I wish I had better news for you. Good luck.

Robert Greiner
Well, yes, MSIE won the browser war by NOT rewriting the codebase, but look at where that got us. NS/Mozilla at least gave us a more-or-less standards-compliant browser years before MSIE ever got anywhere close to passing the ACID tests. </rant>
Alan
+3  A: 

I'd start with a mind-map that documents how it works today (before modification).

Then tackle the performance "low hanging fruit", e.g. are images, scripts, and CSS being sent as cacheable files? Is there any copy/paste code that can be rolled up into generic functions?
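For the caching point: if any assets are served through PHP rather than straight from the web server, a minimal sketch of the headers involved might look like the following. The helper name `staticAssetHeaders` and the one-day default lifetime are assumptions for illustration, not something from the project in question.

```php
<?php
// Hypothetical helper: builds cache-friendly headers for a file on disk.
function staticAssetHeaders(string $path, int $maxAge = 86400): array
{
    $mtime = filemtime($path);
    return [
        'Cache-Control' => 'public, max-age=' . $maxAge,
        'Last-Modified' => gmdate('D, d M Y H:i:s', $mtime) . ' GMT',
        // Validator derived from the path and modification time.
        'ETag'          => '"' . md5($path . '|' . $mtime) . '"',
    ];
}

// Usage: emit the headers before sending the file body.
// foreach (staticAssetHeaders('/path/to/site.css') as $name => $value) {
//     header("$name: $value");
// }
```

If the web server serves the files directly, the same headers are usually set in its configuration instead.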

Then plan the "final" ideal design... and determine how many phases you need to get from here to there.

Then do phase 1. ;-)

scunliffe
+3  A: 

In my opinion, the first thing you should do is fire up Visio and analyze the whole process in a flow-chart. Then, when you have a very high-level view of what the program truly does, you can break out logic chunks into classes, or do whatever else you feel is necessary.

Zack
Good idea, I often do this at the start of projects (or in a particularly tough, or confusing, part) to help guide myself in where I'm going. There's no reason it wouldn't work for a refactor.
The Wicked Flea
Smart. As a bonus, you also get some nice documentation of your system that you probably didn't have before.
Robert Greiner
+5  A: 

First off, if I had anything to do with the original code I apologize. It won't happen again. I was young and naive.

That said, the approach to make this go a bit easier for you is to start with a nugget of good code and as new features or major bug fixes fall into your lap, build them out with your nugget. Eventually your nugget will be the application.

That's when you get a job somewhere else and anonymously answer your replacement's StackOverflow pleas for help.

Edit:

On a more serious note, grab a tool like PHPDoc and use that to start documenting the system. PHPDoc outputs a web site which you can search for duplicate items and things that generally serve the same purpose. It will also help to document your new code.

Rob Allen
sound advice... +1 for the humor :P
Robert Greiner
yep. already had PHPDoc in my plan of attack. Problem is no one has actually used the *Doc documenting syntax for any of the code so far. that will change though.
nategood
+2  A: 

I would start by analysing where you're taking hits on performance. In many cases the performance issues arise because of poor database tuning.

We've got a mission-critical legacy ASP app that's around 10 years old. Over the years the code has grown arms and legs and is a bit of a maintenance headache in places. That said, and whilst some of the code is as ugly as hell, most of our problems are caused by poor database tuning: lack of indexes, stale indexes, wrong indexes, that kind of thing.

Before diving into a full rewrite, identify where the problems are first, then make an informed decision about what remedial action to take. Your code may be a spaghetti monster nightmare, but the root of your problems may lie elsewhere.
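As a rough sketch of what checking for a missing index looks like, the snippet below compares the query plan before and after adding one. Everything in it is an assumption for illustration: the `orders` table, the column names, and the use of an in-memory SQLite database through PDO as a stand-in for the real server (on MySQL you would look at `EXPLAIN` output instead).

```php
<?php
// Illustrative only: an in-memory SQLite database stands in for the real one.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)');

// Fill the hypothetical table with some rows.
$insert = $db->prepare('INSERT INTO orders (customer_id, total) VALUES (?, ?)');
$db->beginTransaction();
for ($i = 0; $i < 1000; $i++) {
    $insert->execute([$i % 50, $i * 1.5]);
}
$db->commit();

// Returns the planner's description of how it would run the query.
function queryPlan(PDO $db, string $sql): string
{
    $rows = $db->query('EXPLAIN QUERY PLAN ' . $sql)->fetchAll(PDO::FETCH_ASSOC);
    return implode('; ', array_column($rows, 'detail'));
}

$sql = 'SELECT total FROM orders WHERE customer_id = 7';
$planBefore = queryPlan($db, $sql);   // expected to be a full table scan
$db->exec('CREATE INDEX idx_orders_customer ON orders (customer_id)');
$planAfter = queryPlan($db, $sql);    // expected to mention the new index

echo "before: $planBefore\n";
echo "after:  $planAfter\n";
```

The exact wording of the plan varies between SQLite versions; the thing to look for is a scan turning into an index search.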

Kev
+5  A: 

Baby steps.

Don't try to refactor all of the code at the same time - the problem is that the code will probably break in horrible ways, and it will still not be in any good debugging shape. If you take small steps at a time, you can verify that the changes you've made are working as you want them.

Find duplicate code (it is probably there) and refactor that. This will make the codebase smaller and easier to manage.
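A tiny sketch of what rolling up duplicate code means in practice; the price-formatting snippet and the name `formatPriceWithTax` are made up for illustration:

```php
<?php
// Hypothetical "before": the same expression pasted into several page scripts:
//   echo '$' . number_format($row['price'] * 1.2, 2);

// "After": one function that every page calls, with the tax rate in one place.
function formatPriceWithTax(float $price, float $taxRate = 0.2): string
{
    return '$' . number_format($price * (1 + $taxRate), 2);
}

echo formatPriceWithTax(10.0), "\n"; // prints $12.00
```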

Document. Take out a notepad, go through the code and get an idea of what the code is about and what it is supposed to do. That way you can get a decent idea of the overall project.

Make sure you have something else to do besides refactoring - it will get boring and tedious after a while, but if you keep at it the codebase will improve over time.

Last but not least: be patient - refactoring takes time.

ylebre
+2  A: 

I'd say the spaghetti is an architectural problem, not a functional one. For the architectural side, I'd probably say there's no way out... You've gotta rewrite at some point. If you expect the software to keep growing, why not plan for the rewrite by budgeting time to maintain both the current branch and the rewritten branch? It might be a lot of work, but I'd say it's an investment in your case.

On the other hand, the best strategy for you to adopt at this point is to run a profiler (Xdebug or Zend) to identify the slowest part, and then just tweak it. No refactoring; ignore the ugly business logic that has no business being in the view. You just want to get the performance tweaked.
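If a real profiler can't be installed on the box right away, a much cruder stand-in is to wrap suspect sections in a timer and sort the results. This is only a sketch: the `timed` helper and the labels are hypothetical, and `usleep` stands in for a slow query.

```php
<?php
// Hypothetical helper: accumulates wall-clock milliseconds per label.
function timed(string $label, callable $fn, array &$timings)
{
    $start = hrtime(true);                       // nanoseconds, PHP 7.3+
    $result = $fn();
    $elapsedMs = (hrtime(true) - $start) / 1e6;
    $timings[$label] = ($timings[$label] ?? 0.0) + $elapsedMs;
    return $result;
}

$timings = [];

$rows = timed('load_rows', function () {
    usleep(20000);                               // stand-in for a slow query
    return [3, 1, 2];
}, $timings);

timed('render', function () use ($rows) {
    sort($rows);
    return implode(',', $rows);
}, $timings);

// Slowest section first.
arsort($timings);
foreach ($timings as $label => $ms) {
    printf("%-10s %.1f ms\n", $label, $ms);
}
```

This only measures what you thought to wrap, which is why a profiler is still the better tool once it is available.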

So my recommendation is:

  1. Plan for the inevitable rewrite
  2. Just focus on optimization instead of re-architecting
kizzx2
+14  A: 
  1. Set up a source-control system if there isn't one already.
  2. Read through the code to understand its structure and flow, and write it down.
  3. Try to identify the problems well. Most of the time, the bigger performance problems do not come from PHP.
  4. Refactor away. Start small.
  5. Remember to take a break every now and then. We all know the amount of effort this stuff would require.
andyk
this was pretty much my train of thought. they actually need to have a lot of things put in place (scm, project management software, documentation, a real development process...). i took it bc I am young, like to dabble in everything, really enjoy the management, believe in the product, and see potential for serious growth. thanks for the itemized list.
nategood
you're welcome. Good luck and watch out for the clever traps ahead.
andyk
.. agreed on the serious growth. Negative things aside, I have to mention that you are lucky to get to handle a high traffic site in your early career.
andyk
+2  A: 

Good list, andyk. What also came to my mind was to write unit tests (PHPUnit) first to make sure you don't break stuff.
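For legacy code, that usually means pinning down what the code does today before touching it. A minimal sketch in plain PHP follows; `legacy_slug` is a hypothetical stand-in for a real function in the codebase, and the same input/expected pairs could be moved into PHPUnit test methods once it is set up.

```php
<?php
// Hypothetical legacy function, standing in for real code you are about to touch.
function legacy_slug($title)
{
    $t = strtolower(trim($title));
    $t = preg_replace('/[^a-z0-9]+/', '-', $t);
    return trim($t, '-');
}

// Expected values are recorded from the current behavior, right or wrong,
// so that a refactor that changes any of them gets noticed.
$cases = [
    'Hello World'      => 'hello-world',
    '  PHP & MySQL!  ' => 'php-mysql',
    ''                 => '',
];

$failures = [];
foreach ($cases as $input => $expected) {
    $actual = legacy_slug($input);
    if ($actual !== $expected) {
        $failures[] = sprintf('%s: expected %s, got %s',
            var_export($input, true), var_export($expected, true), var_export($actual, true));
    }
}

echo $failures ? implode("\n", $failures) . "\n" : "all cases pass\n";
```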

Alfred
A: 

I know, a lot of time has passed, but for the sake of others to come: we have a new tool which might help you with this task. It is called nWire for PHP. It is a code exploration plugin for Eclipse PDT and Zend Studio 7.x.

After analyzing your code, it can present all the associations like call hierarchy, invocations, file inclusions, etc. This can also be presented graphically. We have many customers using it on fairly large projects. It is the perfect tool for developers who get lost in a huge codebase.

zvikico