I've inherited a PHP project that's turning out to be a nightmare. Here are the salient points:

  1. All the original developers have left
  2. The code has no version control
  3. All development and testing was done on the live server by renaming and editing the PHP files. There are multiple copies of each file (index.php, index2.php, index3.php, etc.) and it's unclear which files are really being used
  4. There are multiple includes in each file to files that include other files that include other files, etc.
  5. There have been multiple developers on the project, and each had their own way of doing things. For example, there is a hodgepodge of JavaScript frameworks; some database queries use SQL, others an XML interface, and others call procedural functions in the database.

Because of all of these problems, development is frustratingly slow. Besides venting my frustrations to Stack Overflow, any recommendations on how to get started on this mess? I'm fairly new to PHP development myself, but it seems like setting up some kind of development environment so that changes can be tested without breaking the live server is the first step. Any tips on how to get started here? What is a typical way to do testing? Setting up a local version of the site on my desktop seems like a lot of work (server is Linux, but desktops here are Windows). Can I create a subdirectory on the live server for testing, or..? What about the database?

Secondly, is there some kind of profiling I can enable to track which files on the server are actually being used? I'd like to delete the renamed copies of things that aren't actually being included. Even better, is there a way to tell which parts of a file aren't being executed? There are lots of copied functions and garbage in there that I suspect aren't being used either. Similarly, for the includes, any tips on straightening out the mess?

Well, I'll stop venting here and throw myself at the mercy of everyone here. :)

+1  A: 

The first thing I would do is set up a testing environment using a virtual machine of some sort. VirtualBox or Virtual PC would be fine choices. That way you can start changing things without fear of breaking the production environment. No matter how much work this seems like it will be (with the database and web server and everything), in the end it will be worth it. One of the great benefits is you can copy the VM and give it to somebody else, if you find you need assistance.

Greg Hewgill
+1  A: 

You definitely need a development environment. If you don't want to mess with running the site on your Windows box, you could grab a VMware image of some Linux distro.

Kevin Tighe
Many of my coworkers did this for their development environments before they started to use Xen servers on a central server.
Sydius
+9  A: 
  1. Set up a development server (as Greg Hewgill mentioned, VirtualBox and Virtual PC are good choices for this).

  2. Put the current site files (including the relevant web server and PHP configurations!) into version control.

  3. Find out what files are being used - use your development server setup to test by removing all the fooN.php files and see if it still works.

  4. Pray...lots (OK, this isn't required, but it sounds like you'll need it).

Harper Shelby
I would put everything into version control and then begin deleting. One can't really know if something in this mess could be useful.
stesch
Yeah, switch 2 and 3.
Adam Jaskiewicz
John MacIntyre
How does it mess up version control? Once you delete them, they're gone from the current revision and you don't have to think about them anymore. If, for whatever reason, you find you still need one (maybe one was used, and your tests didn't catch it), you can then go back and retrieve it.
Adam Jaskiewicz
As I stated in my post, you need to version the system as it exists today. Don't worry about making your vc system dirty at first. Better to have index3.php versioned when you find out in three months that it was included in calctax73.php.
Scott Bevington
It doesn't "mess up" version control if you delete files insofar as the version control continues to work fine, but the deleted files are never really gone. Some people find it untidy to have files under version control that were never intended to be saved to begin with.
Bill Karwin
It's existing code of a working (if brittle) system. All of it should go in version control, Just In Case.
Adam Jaskiewicz
Harper Shelby
Forget my previous statement about messing up your source control database. I'm stuck in a VSS world, where you can't trust branching, and AFAIK you either leave your dead files to be continuously regurgitated, or delete them permanently.
John MacIntyre
I'm becoming more and more glad I've never had to deal with VSS.
Adam Jaskiewicz
One of the greatest days in my professional life was when I learned there was something other than VSS for version control.
Scott Bevington
A: 

Try to get detailed stats on the site and find out where the entry and exit points are. That's a decent way to find out which files are being hit first; you can then look into those files to see which includes are being pulled in.

Valien
+51  A: 
  1. Before all else, get the files in version control as is. Do not proceed past #1 until it is done.
  2. Establish a testing environment.
  3. Clean up the files
Scott Bevington
http://subversion.tigris.org/faq.html#in-place-import You might want to SVN pretty much everything you can think of: /etc, the pearInclude directory, anything else. If the developers weren't disciplined enough to keep the code clean, the server is probably fubar as well.
David
@David: good point, they even could have made edits to installed pear packages! Brr!
Bill Karwin
SVN? That is a good idea if it were 5 years ago.
jrockway
+1  A: 

First step of course would be to put it under version control. This way at least you can go back to the original working version. Secondly, it might be a good idea to overwrite the include, require, etc. functions to, for example, write the filename of the file that's being included to some log file. This way you can find out which files are actually being included (thus hopefully ruling out a lot of the index2.php, index3.php, etc.).

To find out, if necessary, whether some classes are used and some aren't, you can use get_declared_classes in conjunction with get_defined_vars and gettype to see which types are being instantiated.

As for issue 4 and 5, those are probably a bit harder to solve, but this should get you started hopefully.

Aistina
+3  A: 

I would:

  1. Sit down and take a deep breath;
  2. Decide if that's really where you want to work;
  3. Assuming yes, then I would roll up my sleeves, pick one mess to work on at a time and get to work.

I know that we can't limit ourselves to just one task at a time; however, you can limit your work to solving one mess at a time while working on the daily tasks that come in.

Chris Lively
+13  A: 

Well, first things first. I've been in the situation you're in, and it sucks. I think you're on the right track with wanting to get a development environment up and running.

Development environment

This will include a Webserver / script engine / database engine stack, and an IDE most likely.

For a LAMP stack installer, I recommend using one of these:

Further reading on the LAMP stack:

O'Reilly's OnLamp site

For a good PHP IDE, I recommend using one of these:

Article on IBM's Developer site comparing several IDEs

For Source control, you can use Team Foundation Server, SVN, or Git -- just use something that you know. I would recommend getting everything in source control first (for any emergency maintenance you might have), but then plan on doing a pretty big overhaul.

The Overhaul

You mentioned that you didn't even know what files are getting used, and that they used a file naming convention as a pseudo-version-control. You might want to start overhauling there, once you have a development environment up and running. There are a few things that can help you:

  • Your app customers/users
  • Meticulous and organized note taking
  • A good logging framework

Your customers/users are important, because it sounds like you're new to the project and they're going to know how the app should behave better than you (most likely).

Meticulous note taking is important, because you're going to be essentially re-writing any requirements / design / end-user documentation from the ground up. You need to understand the internals if you're going to do that. And if you're going to understand anything about this system, you'll need to write it down yourself (or you'd be perusing premade documentation right now instead of reading Stack Overflow) ;-)

And finally, a logging framework is important because you need to fix things, and you can't fix things that you don't know are broken. A logging framework gives you visibility into parts of the app that don't have any obvious UI. Inserting it into various parts of the app and then looking at the logs gives you a good idea of when code is executing and in what order.

You'll want to focus on capturing (on paper) how the app should work, and then slowly removing unnecessary files while trying to not break anything. Keep an eye on logs to help with debugging. Make sure your customers aren't screaming that something is broken. Make sure your notes agree with what is getting logged and what your customers are saying.

Preventing this in the future

Recheck everything back into source control. Hopefully you will have arrived at a newer, saner, better directory structure by this point.

Get a test structure in place. Even if this just means getting a basic unit test framework in place and doing some basic smoke tests after each deploy, it's better than nothing. Ideally, you should have a test engineer or a knowledgeable and trustworthy customer that can spend time testing after each deploy.

Put a deployment process in place if you grow more than one developer. Controlling change to your production environment should be your first priority. (The last thing you'd want to do is go through this again, right?) You should have a clear and simple process for moving between environment boundaries (like Dev -> Test then Test -> Production).

Dan Esparza
-1 for recommending visual sourcesafe!
Orion Edwards
Updated my recommendation to use TFS instead of Visual Sourcesafe. Sheesh.
Dan Esparza
+1  A: 

I think all 5 of your points hit on some classic ASP projects I've inherited, and a PHP one too...

I completely agree with the others about getting it into source control ASAP and using VMware, VirtualBox, etc. for a test environment.

Make sure to get your database versioned too, especially if the procedures have extra logic in them (not just straight insert, update, delete). DB versioning takes more attention than the PHP pages. You need to generate all of the objects to SQL scripts and put those scripts into source control. Then as you change DB structure, procedures, etc. you need to update the scripts so you have a history of those changes too.

As for figuring out what is using what on the database side I would suggest looking at ApexSQL Clean. I used this on a project with several hundred ASP files, 200+ tables and about 400 stored procedures. I was able to identify 20 or so tables that were not in use and about 25% of the stored procedures. With ApexSQL Clean you can add all of your php files into the dependency check along with the tables, views and stored procs. Grab the 30 day trial and check it out, it will save you a lot of time.

As for which files were in use for the website, I had web server logs for the previous month and ran searches against them for anything I was unsure of. I also like a variation on what Aistina suggested about modifying the files to log when they are accessed. Maybe have it go to a table in the database that you set up with a filename and an access count, and every time that file is loaded it increments the count. Then after a period of time you can look over the counts and determine what can go.
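The log-searching part can be scripted. Here's a rough sketch that counts requests per PHP script; it assumes the access log is in the usual Common/Combined Log Format, and the function name countPhpHits is made up for this example.

```php
<?php
// Count how often each PHP script appears in a web server access log.
// Assumes Common/Combined Log Format, where the request is logged as
// "METHOD /path?query HTTP/x.y" inside double quotes.
function countPhpHits(iterable $logLines): array
{
    $hits = [];
    foreach ($logLines as $line) {
        if (!preg_match('/"(?:GET|POST|HEAD) (\S+) HTTP\/[\d.]+"/', $line, $m)) {
            continue; // not a request line this sketch understands
        }
        // Drop the query string so /index.php?id=3 counts as /index.php.
        $path = parse_url($m[1], PHP_URL_PATH);
        if (is_string($path) && substr($path, -4) === '.php') {
            $hits[$path] = ($hits[$path] ?? 0) + 1;
        }
    }
    arsort($hits); // most-requested scripts first
    return $hits;
}
```

You could feed it the result of file() on your access log (the log location depends on your server setup). Note that a request for a bare directory such as / is not attributed to index.php here, and files that are only ever included will never show up in an access log at all.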

ManiacZX
+26  A: 

I've done this. You have my sympathy. If your passport isn't current or for some other reason you can't dodge doing this, here's how I'd approach it:

Step Zero is to get it into version control, no matter how crappy it is. If it even kind of works, and you break something, you need to be able to go back to the working state - or at least compare your changes to it to figure out what went wrong. Do frequent, small check-ins as you're refactoring, and you'll have less code to roll back when things mysteriously go wrong. (Things WILL mysteriously go wrong.)

After that, I'd start at the database. Make sure everything is relatively well-normalized, columns are clearly named, etc.

Do the PHP code next. If the code is really that much of a patchwork, I'd go ahead and fit it to a framework. Look into CakePHP or Symfony - their Rails-ish manner of separating concerns makes the question "where should this piece of code go?" easy to answer. It's not a small task, but once you've done it, you're probably better than half-way to having a sanely-constructed app. Also, the built-in test facilities of a good web framework make refactoring FAR easier - write a test to cover an existing piece of functionality before you change it, and you'll know whether you broke anything after the change.

Once you've got your database sorted and have the model code in the models and the controller code in the controllers, then you can worry about presentation-level stuff like standardizing on a single JS/AJAX library, cleaning up CSS, etc.

As for a dev environment: You should absolutely set up a local dev environment. There are turnkey WAMP packages out there, or you could install to a Linux box/VM (I recommend VirtualBox for virtualization). You should also have a separate integration test environment that mimics the live server. Nothing but live code should run on the live server.

As far as debug/profiling tools, I know that Symfony comes with a pretty slick set of tools, including a little JS toolbar that comes up on your pages (only in debug mode) with logging & profiling information.

Good luck.

bradheintz
Building a new app against an existing database in Symfony is actually pretty easy. If you haven't got a stake in the existing code then this might well be easier than fixing what's there.
Colonel Sponsz
A: 

Do what Harper Shelby said ...

But I'd also add that if you don't get management support to clean this up, you may want to accept the fact that it may be like this for a reason. ... just sayin. ;-)

John MacIntyre
+1  A: 

Here are some ideas:

  • PHP and Apache work just fine on Windows too. Perhaps you can do an all-Windows installation after all?
  • Try grep'ing (or some Windows alternative) for "include" and "require" in all PHP files. Then make a list of all included files found. Compare the list with files in the folder. You should be able to get rid of at least SOME unreferenced files.
  • Alternatively, make a list of all the file names and search all the files for them. You could make something like a dependency graph like this.
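
If you'd rather not do the grep-and-compare step by hand, here is a minimal sketch of it in PHP. The function name findUnreferencedPhpFiles is invented for this example, and the matching is deliberately crude: it compares base names only and only sees include/require statements that contain a quoted path ending in .php.

```php
<?php
// List PHP files under $root whose base name never appears as a quoted
// "*.php" path inside an include/require statement.
// Limitations: fully dynamic includes (include $var) are invisible to it,
// and entry points requested directly by URL are reported as unreferenced,
// so treat the result as candidates to investigate, not as a delete list.
function findUnreferencedPhpFiles(string $root): array
{
    $files = [];
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($it as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            $files[] = $file->getPathname();
        }
    }

    // First quoted string ending in .php between the keyword and the ';'.
    $pattern = '/\b(?:include|require)(?:_once)?\b[^;]*?[\'"]([^\'"]*\.php)[\'"]/i';
    $referenced = [];
    foreach ($files as $path) {
        if (preg_match_all($pattern, file_get_contents($path), $m)) {
            foreach ($m[1] as $target) {
                $referenced[basename($target)] = true;
            }
        }
    }

    $unreferenced = [];
    foreach ($files as $path) {
        if (!isset($referenced[basename($path)])) {
            $unreferenced[] = $path;
        }
    }
    sort($unreferenced);
    return $unreferenced;
}
```

Cross-check whatever it reports against the web server logs before deleting anything, since pages like index.php are requested rather than included.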
Vilx-
+1  A: 

This is indeed a mess. But start getting creative on where to cut off some of the tentacles on this thing:

  1. Get version control. I recommend Git.
  2. Set up a local development server. Find a WAMP, LAMP or MAMP package to get you started since you're new to this.
  3. Find the entry points (index.php, etc.). Check your server access logs to see what these are.
  4. Roll up your sleeves on some regular expression black magic and dump out an include/require tree on all the files. But beware of any include( $filename ) dynamic includes. If you have any of these you'll need to do some logging on $filename to find out what possibly gets included, although the code around it should give you clues. With a little luck you can cull all your unused files this way.
  5. Use more regex black magic to check whether functions and methods are being referenced elsewhere in the codebase. There may be an IDE that can help you with this. Try NetBeans (I used it to help me refactor a C++ project once, so it may help here.)
  6. As someone else replied, "find out, if necessary, if some classes are used and some aren't, you can use get_declared_classes in conjunction with get_defined_vars and gettype to see which types are being instantiated." You could also just write some code to find all new statements in the code base.
  7. And so on... just think about how you can whittle this monster down. And try to reorganize code where you can.
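
For the dynamic includes warned about in step 4, PHP's own tokenizer is an alternative to regular expressions. The sketch below is only an illustration (findDynamicIncludes is a made-up name, and it assumes the tokenizer extension is available, which it normally is): it returns the line numbers of include/require statements whose argument is anything other than a single string literal.

```php
<?php
// Report include/require statements whose argument is not one plain string
// literal, e.g. include($filename) or require_once($dir . '/db.php').
// Returns the line numbers of those statements within $code.
function findDynamicIncludes(string $code): array
{
    $includeTypes = [T_INCLUDE, T_INCLUDE_ONCE, T_REQUIRE, T_REQUIRE_ONCE];
    $ignored = [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT];
    $tokens = token_get_all($code);
    $n = count($tokens);
    $found = [];

    for ($i = 0; $i < $n; $i++) {
        if (!is_array($tokens[$i]) || !in_array($tokens[$i][0], $includeTypes, true)) {
            continue;
        }
        // Collect what makes up the argument, up to ';' or a closing tag.
        $parts = [];
        for ($j = $i + 1; $j < $n; $j++) {
            $tok = $tokens[$j];
            if ($tok === ';' || (is_array($tok) && $tok[0] === T_CLOSE_TAG)) {
                break;
            }
            if (is_array($tok)) {
                if (!in_array($tok[0], $ignored, true)) {
                    $parts[] = $tok[0];
                }
            } elseif ($tok !== '(' && $tok !== ')') {
                $parts[] = $tok; // operators such as "."
            }
        }
        if ($parts !== [T_CONSTANT_ENCAPSED_STRING]) {
            $found[] = $tokens[$i][2]; // line number of the include keyword
        }
    }
    return $found;
}
```

Run it over file_get_contents() of each file; the lines it reports are the places where you'll need logging to learn what actually gets included.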
pbhogan
+7  A: 

One thing you might consider is to install the PHP "xdebug" extension in a development environment, set it to trace all function calls, and then as fully as possible (possibly through automated UI testing) exercise the entire application. You will then be able to analyze/parse the xdebug trace files to find all the files/functions used by the application.

This was part of my plan for analyzing the PHP nightmare I inherited earlier this year. I had prior xdebug experience, so it would have been a little easier for me than for someone with none. In the end, though, I just bailed: it was a supplemental part-time gig and it was too stressful. I did, however, get myself replaced with a full-time person.

George Jempty
I'd suggest this. Running xdebug profiler output through kcachegrind/wincachegrind is also nice as a quick graphical way of seeing which code calls what.
Ant P.
xdebug is a good suggestion for getting overview over the code. Try the function traces feature for example, which is good for tracking which files are actually included.
troelskn
+5  A: 

Other folks on this thread have great advice. I've been in this situation too. Probably everyone at one time in their career walked into a project that looks like it was hit by a tornado.

One suggestion I'd add is that before you do any of the cleanup described by other folks, you need to get management buy-in.

  • Make a plan based on suggestions on this thread.
  • Describe any new hardware or software you'll need to create a development & testing environment, and price these out.
  • Figure out which new skills you need to be trained on to set up and use the development & testing environment. Estimate the time and expenses required for you to get those skills. E.g. books or paid training.
  • Estimate a work schedule for you to do the cleanup. How long to get the code under source control? How long to understand the database? How long to understand the PHP and javascript code?
  • Present this to your manager, and phrase the goal in terms of benefit to his bottom line. E.g. once everything is cleaned up, making changes or rolling out new functionality will be quicker, debugging errors will be more predictable, and ramping up new staff will be easier.

Naturally you need to continue to work with the current mess, because it's a live site. Managing the live site takes priority, so cleanup work must be a background task. That means it'll take even longer. In my experience, cleaning up a moderate-sized project as a background task usually takes six to twelve months. Since the site will continue to evolve over this period, some of your completed cleanup tasks may need to be revised or re-done. Make sure your manager understands all this too.

If the manager balks at your plan to clean up this mess, or doesn't value cleaning it up, at least then you'll know why all the other developers have left this company!

I have a few specific suggestions about how to proceed:

  • In addition to all the other great advice, I'd suggest using the Joel Test as a benchmark. Your plan for cleanup should result in a job environment that would score well on the Joel Test.
  • Read my answer to "What are the best ways to understand an unfamiliar database?"
  • Enable logging on the website so you can analyze which PHP pages are actually being called. At least that tells you which of index2.php, index3.php, index4.php, etc. are truly obsolete.
  • PHP has a function get_included_files() that returns an array of all files included during the current request. By logging this information, you can find out which PHP files are in use, even if they don't show up in the web server log.
  • You really do need to have a testing & development environment that matches your production server. It's no good to test on Windows and deploy on Linux. It's no good to use MySQL 5.0 during development and MySQL 4.0 in production. You can probably get away with a hardware platform that is more modest (though compatible).
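
To illustrate the get_included_files() suggestion, here is a minimal sketch. The function names and the log location are placeholders for this example, not anything from your codebase.

```php
<?php
// Append every file included during the current request to a log file.
// $logPath is a hypothetical location; use one the web server can write to.
function logIncludedFiles(string $logPath): void
{
    $lines = '';
    foreach (get_included_files() as $file) {
        $lines .= $file . "\n";
    }
    // LOCK_EX keeps concurrent requests from interleaving their writes.
    file_put_contents($logPath, $lines, FILE_APPEND | LOCK_EX);
}

// Summarise the log: in how many requests was each file included?
function summarizeIncludeLog(string $logPath): array
{
    $lines = file($logPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $counts = array_count_values($lines);
    arsort($counts); // most frequently included files first
    return $counts;
}
```

One way to hook it in is register_shutdown_function('logIncludedFiles', '/tmp/included-files.log') in a file that every page loads, or in a script named by the auto_prepend_file setting, so the list is written once the request has finished. After a few weeks of traffic, any file that never appears in the summary is a candidate for removal.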
Bill Karwin
+1  A: 

I know how you feel. I inherited the development of such a project. It stayed with me for a year and to be honest it made me the developer I am today. There is no better opportunity for personal advancement than working knee deep in shit.

Here are the things that helped me the most:

  • identify the key files of the system. You will find them because most of your work will be done in them
  • create a local version of the project (including the database) and put it under version control
  • work only on a small number of files with small changes
  • do not put anything inside the production version until you have it tested thoroughly, and then be ready to put the old version back
  • find out how the users of the system are handled (sessions, cookies). Create a super user, and then when you need to test your code live on the system, put it in a block like this:

if ($_POST['your_registered_user_name']) {
    // Your live code being tested, which will be visible only to you when you are logged in
}

Other users won't notice the changes. This technique helped me a lot when I was unable to reproduce the system state on my local machine.

  • write tests, and follow strict engineering guidelines for all the code you are writing
Nikola Stjelja
+15  A: 

Most of the time you can tell if a file is being used by using grep.

grep -r "index2.php" *

You can also use the PHP parser to help you cleanup. Here is an example script that prints out the functions that are declared and function calls:

#!/usr/bin/php
<?php
class Token {
    public $type;
    public $contents;

    public function __construct($rawToken) {
        if (is_array($rawToken)) {
            $this->type = $rawToken[0];
            $this->contents = $rawToken[1];
        } else {
            $this->type = -1;
            $this->contents = $rawToken;
        }
    }
}

$file = $argv[1];
$code = file_get_contents($file);

$rawTokens = token_get_all($code);
$tokens = array();
foreach ($rawTokens as $rawToken) {
    $tokens[] = new Token($rawToken);
}

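$lineNo = 1; // current line, advanced as whitespace tokens are skipped

// Advance $i to the next non-whitespace token, counting newlines on the way.
function skipWhitespace(&$tokens, &$i) {
    global $lineNo;
    $i++;
    while (isset($tokens[$i]) && $tokens[$i]->type == T_WHITESPACE) {
        // substr_count() counts the newlines; substr() was a bug here.
        $lineNo += substr_count($tokens[$i]->contents, "\n");
        $i++;
    }
}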

function nextToken(&$j) {
    global $tokens, $i;
    $j = $i;
    do {
        $j++;
        $token = $tokens[$j];
    } while ($token->type == T_WHITESPACE);
    return $token;
}

for ($i = 0, $n = count($tokens); $i < $n; $i++) {
    $token = $tokens[$i];
    if ($token->type == T_FUNCTION) {
        skipWhitespace($tokens, $i);
        $functionName = $tokens[$i]->contents;
        echo 'Function: ' . $functionName . "\n";
    } elseif ($token->type == T_STRING) {
        skipWhitespace($tokens, $i);
        $nextToken = $tokens[$i];
        if ($nextToken->contents == '(') {
            echo 'Call: ' . $token->contents . "\n";
        }
    }
}
grom
+1  A: 

1) make a back-up of the code now.

2) version control

3) create a test site. Is the site running under Apache? You can even install apache+php+mysql on your own computer, and use that for testing.

4) Deal with security issues. Make sure the site is protected from SQL injection and from email injection. At the very least, you can do a search for database calls and add calls to "mysql_real_escape_string()" (well, if it's using a MySQL database) ... you can do a real fix later once you understand the code better. For the email injection ... write a filter function that filters out spammer code, and make sure all form fields that are used in an email are filtered. (Yeah, it adds more spaghetti code, but it's going to take a while before you're ready to significantly refactor the code.)

5) After that, I suggest incremental upgrades. You're new and the code is a higgledy-piggledy mess, so it's going to take a while to understand it all ... and to fully understand the domain. So just go about your job for a bit, fixing what needs to be fixed, adding what needs to be added. As you're doing so, you're learning how the system is put together. Once you know how the code is organized (or not organized) a little better, you can start planning a major refactoring/rewriting of the system. Hopefully you can do it component by component so you've always got a new milestone in the offing.
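
For the email filter in point 4, something like the following could be a starting point. It's only a sketch: the function names are made up, and it covers header injection only, which works by smuggling line breaks into a field that ends up in a mail header (To, From, Subject and so on).

```php
<?php
// True if $value contains no CR or LF, i.e. it cannot start a new header
// line when placed in a To, From, Subject or similar field.
function isSafeHeaderValue(string $value): bool
{
    return !preg_match('/[\r\n]/', $value);
}

// Alternative for fields you would rather clean than reject:
// collapse any run of line breaks into a single space.
function stripHeaderBreaks(string $value): string
{
    return trim(preg_replace('/[\r\n]+/', ' ', $value));
}
```

Apply one of them to every form field that is used in a mail header before it reaches the mail call; the message body doesn't need this particular check.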

chaoticsynergy
+8  A: 

If it's the very worst case, where the code is all scrambled and all display is intermingled with the logic and the database calls, you might do what I had to do with one PHP project.

I gave it three starts on trying the refactoring approach. It was like hill-climbing on a motorcycle and getting 10% of the way each time. So I took another approach which ended up working out much better.

  1. I logged in as a user,
  2. and worked through every screen and every use-case I could find.
  3. I saved the html to static files,
  4. and took notes on the procedural operation and obvious business rules.

I did this for 3 solid days, and then took my notes and had a long conversation with the stakeholders.

After getting agreement on some first steps, I reimplemented all the html UI properly, using good consistent design and abstraction. After getting rolling, I could do a couple screens a day.

Then I took the result back to the stakeholders and ran through a bunch of use cases. (The stakeholders were immensely pleased at steps 1 and 2, because they didn't like the first implementation at all anyway (surprise), and now it looked like there was hope for improvement, not just recovery of the same-old-app.)

That turned out to be the end of the hard work (and also the end of the perceived project risk for the stakeholders.)

It turned out that the first crew had gotten so tied up in their own misbegotten spaghetti that there was actually comparatively little content to the work, so duplicating it had less scope than everyone suspected.

But the key decision was that the original code, both content and structure, was unrefactorable and I needed to work from an entirely exterior view with a new framework that was properly designed.

le dorfier
+2  A: 

Surprisingly, no one even mentioned this, as far as I can see, but there is another alternative: give up on the code and just use the functionality of the site itself as the basis for creating a new feature set specification (i.e., the first one ever for this project) and then re-build the site, based on those features, with an established framework such as CakePHP or Drupal.

Yes, there are those who would cringe at the evil word re-write ... but there are cases when this is actually a better way to go, and you hinted at some reasons:

  • you're fairly new to PHP development, yourself
  • you will probably be better off starting out with something clean instead of the pure crap code you've inherited
  • in the final analysis, most people (users) don't give a damn about the source code, and if it looks like it "works" to them, they may look at you like you're crazy if you try to tell them something is dreadfully wrong
  • you will have more fun and live a longer life if you pick up the practices of source revision control and database design within a unified framework that looks like someone actually cared enough to have their name attached to it

Sure, everyone in this position has had to work with code like this before, but sometimes enough is enough and it's better to scrap the spaghetti and start with a fresh plate. If you read Joel's article on why it's bad to do a re-write, you will notice almost none of the circumstances he cites apply to you here.

dreftymac
I would be a little concerned with a new developer biting off more than they can chew if it is a large and complex site, but I agree. In some instances, a rewrite will save a lot of time in the long run and sometimes even in the short term too.
Sydius
+1  A: 

Lots of useful posts about how to deal with this.

Without trying to repeat what everyone else has said:

  1. Get a copy of the prod environment running. It can be a virtual machine, or another real machine. But you need to be God on it. If the prod database is on another box, you'll need a dev version, too.
  2. Throw it all into version control. On another box. One that is backed up at least weekly.
  3. Make sure you know how branching works in your version control application. You'll probably need it.
  4. Get the prod server locked down. You don't want any further changes made to it that don't come out of version control.
  5. Create instructions for releasing code from version control to the prod server. The smallest unit of releasable change should be the whole code base.

The next steps depend on how attached to it the users are. If it can't be changed too much for whatever reason, you will need a progressive approach. If development and maintenance still need to happen, then this is probably your only option. Remember to use that branching feature to separate such mods away from your re-writing efforts.

To put sense into the structure, you have to basically create a new structure alongside what's there. A new DB handler is usually a good place to start, included from a generic include file that every page should load. The goal here is to create a minimal include structure that can be expanded later without telling every page to load additional files.

Now you need to start moving functionality over to your new include files. You will need a way to have several files open at once, such as a multi-file editor, or screen+vi (or emacs). Start with utility functions and code-blocks that are repeated in various places. Try not to get distracted into fixing a lot at once. Some types of problems are going to have to just move places as other problems get fixed. You'll come back to them later.

Don't feel you need to add a third-party framework. Adding such a thing quickly leads to a complete re-write. At this point, this will be a whole lot more work than just taming its include structure. So sort that out first.

As you move functionality over, you will need to have files use your new include file. The first few files you do this for you will be chasing conflicts for a while. It will feel disheartening and pointless but this is probably the hardest part. After a few files, it will get easier. There will be times when you can migrate half-a-dozen pages to the new include files by replacing a dozen includes with just one. The flip side of that action is that there will be files you can just delete.

If you stick at it, you will eventually get to the point where all the include files are the ones you've written and you'll be across the whole include layout. By that point, it will be significantly easier to do much more invasive changes, like putting in a third-party framework.

staticsan
A: 

I just went through this myself.

My number one tip is to not try to change everything on day one. You need friends if you really want to be able to fix this thing. You need your colleagues' respect before you suggest how to change everything they've been working on for months (years?).

First, get the code under version control as soon as possible. If that's not going to be easy for you, at least start making daily backups, even if it means just zipping up the files and naming the zip file with the date. If nobody there knows about version control, buy a Pragmatic Programmer's book on CVS or SVN, and set it up yourself. The books can be read in a day, and you can be up and running quickly. If nobody else wants to use version control, you can use it yourself... then when somebody loses a file you can save the day with a copy from your repo. Sooner or later the others will see the wisdom that is version control.

Second, dive into the code as hard as you possibly can. Live it and breathe it for a month. Show the people who are there that you are going to learn their code.

Third, as you go through the code, take copious notes. Write down everything that bothers you about the code. Just get your thoughts on paper. You can organize it later, after Month One.

Fourth, install a code profiler (such as xdebug). That'll tell you what files & functions are being called on each page, and how long each piece of code takes to run. You can use this to figure out your includes issues, and find slow bits of code. Optimize those first.

After your month of hard work, sifting through the code, and taking notes, turn your notes into a proper document. The different sections could range from security, to caching, to architecture, to whatever else is bothering you. For every criticism you make, offer a better solution, and an estimate of how long it would take to fix. This is where you get rid of all the competing JavaScript frameworks etc.

Revise this document as much as possible. I cannot stress that enough.

Make sure your audience can tell you're doing it for the good of the company, not just your personal preferences.

Present it to your boss, in person. Set up a time to discuss it.

They could fire you for having written it. If they do, you're better off without them, because they don't want to improve, and your career will stagnate.

They might want to implement all of your recommendations. It's not likely, but it is possible. Then you'd be happy (unless your recommendations fail).

Most likely they'll want to implement a few of your recommendations, and that's better than nothing. At the very least, it'll help ease your concerns.

As for testing, set up another "virtual host" in Apache (supported on both Windows & Linux). Virtual hosts let you run multiple sites on a single server. Most larger sites have at least 3 virtual hosts (or actual servers): dev.domain.com (for daily development), staging.domain.com (for QA people to do testing on just before a release), and www.domain.com (your production server). You should also set up dev, staging, and production versions of the database, with different logins & passwords so you don't accidentally confuse them.
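
One way to keep the three databases from getting mixed up is to pick the settings from the hostname the request came in on. This is only a rough sketch: the function name, database names, and logins are all made up for illustration, passwords are left out, and it assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: map each virtual host to its own database settings.
// Every hostname, database name, and login below is a placeholder.
function db_settings_for_host(string $host): array
{
    $environments = [
        'dev.domain.com'     => ['db' => 'app_dev',     'user' => 'app_dev_user'],
        'staging.domain.com' => ['db' => 'app_staging', 'user' => 'app_staging_user'],
        'www.domain.com'     => ['db' => 'app_prod',    'user' => 'app_prod_user'],
    ];

    $host = strtolower($host);
    if (!isset($environments[$host])) {
        // Refuse to guess: an unknown host must never fall through to production.
        throw new InvalidArgumentException("No database settings for host: $host");
    }

    return $environments[$host];
}
```

A page would then call something like `db_settings_for_host($_SERVER['HTTP_HOST'])`. Note that `HTTP_HOST` can include a port number, so you may need to strip that first.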

An alternative would be to give each developer their own virtual host on the Linux server, and they can work via FTP/SCP or a network share using Samba.

Good luck!

lo_fye
A: 

In addition to the great stuff other people have said, to get a first pass at what files are actively being used, you can install an opcode cache like APC or eAccelerator on your dev server (or even a production server; this won't break anything). Then, click around the web app on your dev server (or let the users do it on your production server).

Now look at the list of cached files in your cache admin page. If a file isn't listed as being cached by your opcode cache, there's a good chance it isn't being loaded by anything.

This isn't a whole solution, but if each directory has 10 index.php files (e.g. index.php, index2.php, etc.), at least you'll know which one is being used by your app.
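
If you can get the cached file list out as an array of paths, a small script can do the comparison against what is on disk for you. A minimal sketch, assuming PHP 7.1 or later; the function name is hypothetical, and where `$usedFiles` comes from depends on your cache (with OPcache, for instance, the keys of the `scripts` array from `opcache_get_status()` should be usable).

```php
<?php
// Hypothetical helper: list .php files under $root that never appear in
// $usedFiles (paths collected from an opcode cache listing or a similar log).
function find_unused_php_files(string $root, array $usedFiles): array
{
    // Normalise the used paths so symlinks and ".." segments compare equal.
    $used = [];
    foreach ($usedFiles as $path) {
        $real = realpath($path);
        if ($real !== false) {
            $used[$real] = true;
        }
    }

    $unused = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            $real = $file->getRealPath();
            if (!isset($used[$real])) {
                $unused[] = $real;
            }
        }
    }

    sort($unused);
    return $unused;
}
```

Treat the result as a list of candidates, not a delete list: a file that nobody happened to hit while the cache was filling will show up as unused too.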

bobbyh
+3  A: 

You can see a list of all the included/required files by putting this near the bottom of the page:

<?php var_dump(get_included_files()); ?>
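
If you would rather not dump the list into the page, the same call can write to a log at the end of every request. This is just a sketch, assuming PHP 7.1 or later; the helper name and the log location are made up, so pick a path the web server can write to.

```php
<?php
// Hypothetical helper: append every file loaded by this request to a log,
// so the log can later be compared with what is on disk.
function log_included_files(string $logPath): void
{
    $lines = '';
    foreach (get_included_files() as $file) {
        $lines .= $file . "\n";
    }
    // FILE_APPEND keeps earlier requests; LOCK_EX avoids interleaved writes.
    file_put_contents($logPath, $lines, FILE_APPEND | LOCK_EX);
}

// Run at the end of the request, after all includes have happened.
// The log path here is an assumption, not a recommendation.
register_shutdown_function(
    'log_included_files',
    sys_get_temp_dir() . '/included_files.log'
);
```

Put it in a file that every page already includes, let it run for a while, and then de-duplicate the log (for example with `sort -u`) to see which files are really in use.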
lo_fye
A: 

Yes, Version Control is definitely step #0.

I'd also recommend a good Code Search Tool.

Agent Ransack is pretty good (assuming you're on Windows) http://www.mythicsoft.com/agentransack/Page.aspx?page=download

I'd be flying blind without code search.

A: 
  1. Get it under revision control.

  2. Decide on naming conventions and file/directory structure.

  3. Make sure you have decent tools/IDE.

  4. Set up a separate development/testing environment if you haven't already.

THEN ...

  1. Unfortunately, you'll need to sift through all those 1, 2, 3 files and determine which ones are in use, and which can be disposed of. No other way besides a brute force grind through, file by file.

  2. Even though I have an RCS in place, I still often move what I think are unused scripts to a hidden location, say .mausoleum and then have the RCS ignore that location. Nice to be able to take a peek locally without going back to the repo.

  3. Separate HTML and PHP to the greatest extent possible. I cannot stress this enough! If this is done in each file, fine. Just so long as you have separate chunks of PHP and HTML. Of course, HTML will be peppered with echos here and there, but try to have all tests, switches, everything else moved out of the HTML block and into the PHP block. This alone can be HUGE when it comes to getting things sorted out.

  4. If the code is primarily procedural -- I assume in your case it is -- it's probably best to do some cleanup first, before doing any serious refactoring or refactoring into classes.

  5. As you find files/scripts that can logically be combined, do so. (I've seen projects -- probably not unlike yours -- where the total number of surviving files is about 1/4 of what we started with.)
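
To illustrate the point about separating HTML and PHP, here is roughly what one file might look like once the tests and switches have been moved out of the markup. It is only a sketch: `$users` is made-up sample data standing in for whatever your page really loads.

```php
<?php
// PHP block: all tests, switches, and data preparation happen here.
// $users is hypothetical sample data standing in for a database result.
$users = [
    ['name' => 'Alice', 'active' => true],
    ['name' => 'Bob <admin>', 'active' => false],
];

$rows = [];
foreach ($users as $user) {
    $rows[] = [
        // Escape once, here, so the HTML block can echo values directly.
        'name'   => htmlspecialchars($user['name'], ENT_QUOTES, 'UTF-8'),
        'status' => $user['active'] ? 'Active' : 'Disabled',
    ];
}
$heading = count($rows) . ' users';
?>
<!-- HTML block: only echoes and a simple loop, no decisions. -->
<h1><?php echo $heading; ?></h1>
<ul>
<?php foreach ($rows as $row): ?>
  <li><?php echo $row['name']; ?> (<?php echo $row['status']; ?>)</li>
<?php endforeach; ?>
</ul>
```

Once a file is in this shape, the top half is easy to lift into a function or an include later, and the bottom half can be handed to whoever works on the markup.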

Once you've gone this far, then you can begin a proper refactoring or refactoring into classes.

Bonne chance!

Clayton