I'm maintaining a library written for PHP 5.2 and I'd like to create a PHP 5.3-namespaced version of it. However, I'd also like to keep the non-namespaced version up to date until PHP 5.3 becomes so old that even Debian stable ships it ;)

I've got rather clean code: about 80 classes following the Project_Directory_Filename naming scheme (I'd change them to \Project\Directory\Filename, of course) and only a few functions and constants (also prefixed with the project name).

The question is: what's the best way to develop the namespaced and non-namespaced versions in parallel?

  • Should I just create a fork in the repository and keep merging changes between branches? Are there cases where backslash-sprinkled code becomes hard to merge?

  • Should I write a script that converts the 5.2 version to 5.3 or vice versa? Should I use the PHP tokenizer? sed? The C preprocessor?

  • Is there a better way to use namespaces where available and keep backwards compatibility with older PHP?

+3  A: 

I don't think preprocessing the 5.3 code is a great idea. If your code is functionally identical in both PHP 5.2 and 5.3 except for using namespaces instead of underscore-separated prefixes, why use namespaces at all? In that case it sounds to me like you want to use namespaces for the sake of using namespaces.

I do think you'll find that as you migrate to namespaces, you will start to 'think a bit differently' about organizing your code.

For this reason, I strongly agree with your first solution. Create a fork and do backports of features and bugfixes.

Good luck!

Evert
A: 

Well, I don't know if it is the "best" way, but in theory you could use a script to take your migrated 5.3 code and backport it to 5.2 (the script could even be written in PHP).

In your namespaced files you would want to convert something like:

namespace Project\Directory\Filename;

class MyClass {
  public $attribute;

  public function typedFunction(MyClass $child) {
    if ($child instanceof MyClass) {
      print 'Is MyClass';
    }
  }
}

To something like:

class Project_Directory_Filename_MyClass {
  public $attribute;

  public function typedFunction(Project_Directory_Filename_MyClass $child) {
    if ($child instanceof Project_Directory_Filename_MyClass) {
      print 'Is MyClass';
    }
  }
}

And in your namespace code you would need to convert from:

$myobject = new Project\Directory\Filename\MyClass();

To:

$myobject = new Project_Directory_Filename_MyClass();

While all your includes and requires would stay the same, I think you would almost need to keep some sort of cache of all your classes and namespaces to do the complex conversion around 'instanceof' and typed parameters, if you use them. That is the trickiest thing I can see.
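The name mapping itself is the easy part. A minimal sketch, assuming the name has already been resolved to its fully qualified form (the hard part mentioned above); flattenClassName is a hypothetical helper, not part of any library:

```php
<?php
// Hypothetical helper: map a fully qualified 5.3 class name to its
// underscore-separated 5.2 equivalent. Assumes $qualifiedName is already
// fully resolved (no relative names, no "use" aliases left to expand).
function flattenClassName($qualifiedName)
{
    // Drop the leading backslash of a fully qualified name, then
    // turn every namespace separator into an underscore.
    return str_replace('\\', '_', ltrim($qualifiedName, '\\'));
}
```

A global class such as \Exception simply loses its leading backslash, so built-in classes pass through unchanged as long as they were written fully qualified.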

Kitson
Yes, you've left the trickiest part out of your answer :) It needs to deal with built-in classes (\Exception) too. I've looked into it further and now I'm convinced that a tokenizer-based script is probably the best way to go, but it requires quite a large chunk of PHP syntax (AST) to be parsed :/
porneL
Yes, I admit I am lucky that I was able to move my stuff to 5.3 and not maintain backwards compatibility for 5.2. It is an excellent question. Obviously not one that was really considered widely.
Kitson
+1  A: 

Here's what I've found:

Doing this with regular expressions is a nightmare. You can get most of it done with just a few simple expressions, but then the edge cases are a killer. I ended up with a horrible, fragile mess that barely works with one codebase.

It's doable with the built-in tokenizer and a simple recursive descent parser that handles only a simplified subset of the language.

I ended up with a rather ugly design (parser and transformer in one, mostly just changing or re-emitting tokens), because it seemed like too much work to build a useful syntax tree with whitespace maintained (I wanted the resulting code to be human-readable).

I wanted to try phc for this, but couldn't convince its configure script that I had built the required version of the Boost library.

I haven't tried ANTLR for this yet, but it's probably the best tool for this kind of task.
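To give an idea of the "changing or re-emitting tokens" approach, here is a rough sketch for the 5.2 to 5.3 direction. It is not the script described above: rewriteClassReferences is a hypothetical function, it only touches identifiers that start with the given project prefix, and it does not convert class declarations into namespace declarations.

```php
<?php
// Hypothetical sketch: re-emit every token unchanged, except identifiers
// that start with "<prefix>_", which become fully qualified namespaced names.
// Strings and comments are separate token types, so they are left alone.
function rewriteClassReferences($source, $prefix)
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            // Single-character tokens such as ( ) ; = are plain strings.
            $out .= $token;
            continue;
        }
        list($id, $text) = $token;
        if ($id === T_STRING && strpos($text, $prefix . '_') === 0) {
            // Project_Directory_Filename -> \Project\Directory\Filename
            $text = '\\' . str_replace('_', '\\', $text);
        }
        $out .= $text;
    }
    return $out;
}
```

Because whitespace and comments are tokens too, the output stays as readable as the input. Prefixed functions and constants are also T_STRING tokens, so a real converter would need the small parser mentioned above to tell them apart from class names.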

porneL
Except for very limited special cases (e.g., your "simplified subset"), regexes are always a disaster when applied to carrying out massive, *reliable* source code changes, precisely because they can't capture the language structure.
Ira Baxter
That's why my answer is best.
hopeseekr
A: 

I haven't tested this myself, but you may want to take a look at this PHP 5.2 -> PHP 5.3 conversion script.

It's not the same as 5.3 -> 5.2, but maybe you will find some useful stuff there.

takeshin
Thanks, but it looks like a lint tool, not a converter.
porneL
+1  A: 

I am working on a project that emulates PHP 5.3 on PHP 5.2: prephp. It includes namespace support (not yet complete though.)

From the experience of writing it, I know of one ambiguity problem in namespace resolution: unqualified function calls and constant lookups fall back to the global namespace. So you could convert your code automatically only if you either qualified (or fully qualified) all your function calls and constant lookups, or never defined a function or constant in a namespace with the same name as a PHP built-in function.
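A small illustration of that fallback, with made-up namespace and function names (Demo\Shadowing, Demo\Plain), needs PHP 5.3 or later:

```php
<?php
namespace Demo\Shadowing {
    // This namespace declares its own strlen(), shadowing the built-in.
    function strlen($value)
    {
        return -1;
    }

    // Unqualified call: resolves to Demo\Shadowing\strlen().
    function unqualifiedCall()
    {
        return strlen('abc');
    }

    // Fully qualified call: always the built-in strlen().
    function fullyQualifiedCall()
    {
        return \strlen('abc');
    }
}

namespace Demo\Plain {
    // No local strlen() here, so the unqualified call falls back
    // to the global function.
    function unqualifiedCall()
    {
        return strlen('abc');
    }
}
```

The two unqualifiedCall() bodies are textually identical, yet they call different functions. A converter that looks at one file at a time cannot tell which one is meant unless one of the two conventions above is followed.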

If you strictly adhered to this practice (whichever of them you choose) it would be fairly easy to convert your code. It would be a subset of the code for emulating namespaces in prephp. If you need help with the implementation, feel free to ask me, I would be interested ;)

PS: The namespace emulation code of prephp isn't complete yet and may be buggy. But it may give you some insights.

nikic
You can feel free to use my answer below in your code, as long as you preserve the copyright tag ;-)
hopeseekr
+3  A: 

Here's the best answer I think you're going to be able to find:

Step 1: Create a directory called 5.3 inside every directory that has PHP 5.3 code, and put all 5.3-specific code in it.

Step 2: Take a class you want to put in a namespace and do this in 5.3/WebPage/Consolidator.inc.php:

namespace WebPage;
require_once 'WebPageConsolidator.inc.php';

class Consolidator extends \WebPageConsolidator
{
    // PHP's constructor method is __construct(), not __constructor().
    public function __construct()
    {
        echo "PHP 5.3 constructor.\n";

        parent::__construct();
    }
}

Step 3: Use a strategy function to use the new PHP 5.3 code. Place in non-PHP5.3 findclass.inc.php:

// Copyright 2010-08-10 Theodore R. Smith <phpexperts.pro>
// License: BSD License
function findProperClass($className)
{
    $namespaces = array('WebPage');

    // PHP_VERSION_ID is available as of PHP 5.2.7.
    if (PHP_VERSION_ID >= 50300)
    {
        // Search the namespaces first, e.g. WebPage\Consolidator.
        foreach ($namespaces as $namespace)
        {
            // Use a separate variable so $className is not overwritten between iterations.
            $candidate = "$namespace\\$className";
            if (class_exists($candidate))
            {
                return $candidate;
            }
        }
    }

    // It wasn't found in the namespaces (or we're using 5.2), so search the
    // global namespace for the prefixed name, e.g. WebPageConsolidator.
    foreach ($namespaces as $namespace)
    {
        $candidate = $namespace . $className;
        if (class_exists($candidate))
        {
            return $candidate;
        }
    }

    throw new RuntimeException("Could not find a suitable class named $className.");
}

Step 4: Rewrite your code to look like this:

<?php
require 'findclass.inc.php';

$includePrefix = '';
if (PHP_VERSION_ID >= 50300)
{
        $includePrefix = '5.3/';
}

require_once $includePrefix . 'WebPageConsolidator.inc.php';

$className = findProperClass('Consolidator');
$consolidator = new $className;

// PHP 5.2 output: PHP 5.2 constructor.
// PHP 5.3 output: PHP 5.3 constructor. PHP 5.2 constructor.

That will work for you. It is a kludge performance-wise, but just a little, and it can be done away with when you decide to stop supporting 5.2.

hopeseekr
OK why the downvote?
hopeseekr
My solution works without having to spend thousands on a non-commodity product no one can effectively demo and many of us can't afford.
hopeseekr
Though my solution works approximately as well as the "non-commodity product" doing the same thing. So please stop telling people your answer is best. It is one good possibility, but it annoys me that you go running around telling everybody how good it is. Thanks.
nikic
It's not relevant to my problem. It only shows how to load PHP 5.3/5.2 code, not how to create/maintain it. I also fail to understand why I would call this explicitly instead of setting up an autoloader.
porneL
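For reference, the autoloader route mentioned in the last comment could look roughly like this. It is only a sketch under assumptions: classNameToPath and projectAutoload are hypothetical names, and it assumes one class per file under a single base directory, with Project_Directory_Filename and Project\Directory\Filename both living in Project/Directory/Filename.php.

```php
<?php
// Hypothetical mapping: both naming styles resolve to the same relative path.
function classNameToPath($className)
{
    $className = ltrim($className, '\\');
    return str_replace(array('\\', '_'), '/', $className) . '.php';
}

// Hypothetical autoloader: assumes the class files sit next to this file.
function projectAutoload($className)
{
    $file = dirname(__FILE__) . '/' . classNameToPath($className);
    if (is_file($file)) {
        require_once $file;
    }
}

// A named callback (rather than a closure) keeps this runnable on PHP 5.2.
spl_autoload_register('projectAutoload');
```

With this in place, calling code can write new Project_Directory_Filename() or new \Project\Directory\Filename() directly, and the version-specific tree only has to be chosen once, when the base directory is decided.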
A: 

Our DMS Software Reengineering Toolkit can likely implement your solution pretty well. It is designed to carry out reliable source code transformations, by using AST to AST transforms coded in surface-syntax terms.

It has a PHP Front End which is a full, precise PHP parser, AST builder, and AST to PHP-code regenerator. DMS provides for AST prettyprinting, or fidelity printing ("preserve column numbers where possible").

This combination has been used to implement a variety of trustworthy PHP source code manipulation tools for PHP 4 and 5.

EDIT (in response to a somewhat disbelieving comment):

For the OP's solution, the following DMS transformation rule should do most of the work:

rule replace_underscored_identifier_with_namespace_path(namespace_path:N)
   :namespace_path->namespace_path
"\N" -> "\complex_namespace_path\(\N\)" 
if N=="NCLASS_OR_NAMESPACE_IDENTIFIER" && has_underscores(N);

This rule finds all "simple" identifiers that are used where namespace paths are allowed, and replaces those simple identifiers with the corresponding namespace path constructed by tearing the string for the identifier apart into constituent elements separated by underscores. One has to code some procedural help in DMS's implementation language, PARLANSE, to check that the identifier contains underscores ("has_underscores"), and to implement the tear-apart logic by building the corresponding namespace path subtree ("complex_namespace_path").

The rule works by abstractly identifying trees that correspond to language nonterminals (in this case, "namespace_path"), and replacing simple ones with more complex trees that represent the full namespace path. The rule is written as text, but the rule itself is parsed by DMS to construct the trees it needs to match PHP trees.

DMS rule application logic can trivially apply this rule everywhere throughout the AST produced by the PHP parser.

This answer may seem overly simple in the face of all the complicated stuff that makes up the PHP language, but all that other complexity is hidden in the PHP language definition used by DMS; that definition is some 10,000 lines of lexical and grammar definitions, but is already tested and working. All the DMS machinery, and these 10K lines, are indications of why simple regexes can't do the job reliably. (It is surprising how much machinery it takes to get this right; I've been working on DMS since 1995.)

If you want to see all the machinery that makes up how DMS defines/manipulates a language, you can see a nice simple example.

Ira Baxter
Could you please explain why *this* got picked as the answer? I mean, the DMS Reengineering Toolkit costs several hundred dollars and there's no example **at all** that it can do what was asked, whereas other answers on here actually give you working prototypes ... for free.
hopeseekr
There's also no demo, and nowhere to buy it. Heck, it looks like absolute vaporware from where I sit.
hopeseekr
Vaporware? No examples? Did you check out the PHP tools at the website? *You* might not believe it, but they are all implemented using DMS, carrying out "mass change" to the source code to achieve the effect they do. You might check out http://www.semanticdesigns.com/Products/Services/NorthropGrummanB2.html for heavy-duty application examples. You are right, it isn't free. In fact, it is considerably more expensive than the number you report. You and the OP can separately determine whether its cost is justified for what it does WRT your needs. You buy it from our company.
Ira Baxter
...EDIT: I modified the answer to provide some more detail about how DMS could accomplish the desired effect. I didn't originally include this, because the OP seemed to know (based on one of the answers he provided) about the value of abstract syntax trees for code modification purposes.
Ira Baxter
Thanks for the update. Would you please show me where one can buy or demo the reengineering toolkit?
hopeseekr
Contact [email protected]. It isn't sold as a commodity product.
Ira Baxter
+2  A: 

This is a followup to my previous answer:

The namespace simulation code has become quite stable. I can already get symfony2 to work (there are still some problems, but it basically works). Some things are still missing, though, such as variable namespace resolution for all cases apart from new $class.

I have now written a script which iterates recursively through a directory and processes all files: http://github.com/nikic/prephp/blob/master/prephp/namespacePortR.php


Usage Instructions

Requirements for your code to work

Your classnames mustn't contain the _ character. If they do, classnames could get ambiguous while converting.

Your code mustn't redeclare any global functions or constants within a namespace. Thus it is ensured that all your code may be resolved at compile-time.

Basically these are the only restrictions to your code. Though I should note that in a default configuration the namespacePortR will not resolve things like $className = 'Some\\NS\\Class'; new $className, because it would require inserting additional code. It's better that this is patched up later (either manually or using an automated patching system.)
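To see why underscores in class names are a problem, here is a small check you could run over your own class list before converting. It is only a sketch: findAmbiguousClassNames is a hypothetical helper, not part of prephp, and it assumes _ is used as the separator.

```php
<?php
// Hypothetical helper: group fully qualified class names by the flat name
// they would get with "_" as separator, and report every flat name that is
// claimed by more than one distinct class.
function findAmbiguousClassNames(array $classNames)
{
    $groups = array();
    foreach ($classNames as $name) {
        $flat = str_replace('\\', '_', ltrim($name, '\\'));
        $groups[$flat][$name] = true;
    }

    $ambiguous = array();
    foreach ($groups as $flat => $names) {
        if (count($names) > 1) {
            $ambiguous[$flat] = array_keys($names);
        }
    }
    return $ambiguous;
}
```

For example, Some\NS\Class_Name and Some\NS\Class\Name would both end up as Some_NS_Class_Name, so the check reports them together.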

Configuration

Since we have assumed that no global function or constant is redeclared in a namespace, you must set the assumeGlobal class constant in the namespace listener. In the same file, set the SEPARATOR constant to _.

In the namespacePortR change the configuration block to satisfy your needs.


PS: The script accepts a ?skip=int option, which tells it to skip the first int files. You should not need it if you have set the override mode to intelligent.

nikic