I am working on a script that compares version numbers for installed and available applications. Normally I would use simple comparison operators. Since I am building this application in a PHP 5.3 environment, I have considered version_compare(), but it doesn't suit my needs as cleanly as I would like.

The version strings I am comparing can follow many formats, but those I have encountered thus far are:

  • '2.6.18-164.6.1.el5' versus '2.6.18-92.1.13.el5'
  • '4.3p2' versus '5.1p1'
  • '5.1.6' versus '5.2.12'
  • '2.6.24.4-foo.bar.x.i386' versus '2.4.21-40'

As you can see, there really is no consistent format for me to work with.

The one thing I considered doing was splitting each version string on the non-numeric characters, then iterating the resulting arrays and comparing relative indices. However, I'm not sure that would be a good way of doing it, especially in the case of '2.6.24-4-foo.a.12.i386' versus '2.6.24-4-foo.b.12.i386'.

Are there any well-tested methods of comparing very loose version numbers such as this, specifically in a PHP environment?

A: 

Splitting by symbol (see preg_split) and comparing each element numerically (if both are numeric) or using string comparison (when both are alphanumeric) works for your examples:

    '2.6.18-164.6.1.el5' > '2.6.18-92.1.13.el5'
    2  6  18  164 6  1  el5 // higher
    2  6  18  92  1  13 el5
              ^

    '4.3p2' < '5.1p1'
    4 3p2
    5 1p1 // higher
    ^

    '5.1.6' < '5.2.12'
     5  1  6
     5  2  12 // higher
        ^

    '2.6.24.4-foo.bar.x.i386' > '2.4.21-40'
     2  6  24  4   foo  bar  x  i386 // higher
     2  4  21  40  ---  ---  -  ---- 
        ^

Where it potentially falls down is a version like 5.2-alpha-foo vs 5.2.49.4-beta-bar where you must compare a purely numeric sub-string with an alphanumeric sub-string:

    5.2-alpha-foo > 5.2.49.4-beta-bar
    5  2  alpha  foo ----  ---  // wrong - ASCII 97 (a) vs 52 (4)
    5  2  49     4   beta  bar
          ^

You could solve this by treating the alphanumeric field as 0 whenever a purely numeric sub-string is compared against an alphanumeric sub-string.
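As a rough illustration, the approach above might look like the following in PHP. The function name looseVersionCompare is made up for this sketch, and the rule that a missing field sorts below any present field is an assumption rather than something settled above. The syntax is kept compatible with PHP 5.3.

```php
<?php
// Hypothetical helper, not a built-in. Returns -1, 0 or 1, in the same
// spirit as version_compare().
function looseVersionCompare($a, $b)
{
    // Split on every run of non-alphanumeric characters.
    $partsA = preg_split('/[^0-9a-z]+/i', $a, -1, PREG_SPLIT_NO_EMPTY);
    $partsB = preg_split('/[^0-9a-z]+/i', $b, -1, PREG_SPLIT_NO_EMPTY);

    $count = max(count($partsA), count($partsB));
    for ($i = 0; $i < $count; $i++) {
        // Assumption: a missing field sorts below any present field.
        if (!isset($partsA[$i])) {
            return -1;
        }
        if (!isset($partsB[$i])) {
            return 1;
        }

        $x = $partsA[$i];
        $y = $partsB[$i];
        $xIsNum = preg_match('/^[0-9]+$/', $x) === 1;
        $yIsNum = preg_match('/^[0-9]+$/', $y) === 1;

        if ($xIsNum || $yIsNum) {
            // At least one side is purely numeric: compare as integers,
            // counting an alphanumeric field as 0.
            $nx = $xIsNum ? (int) $x : 0;
            $ny = $yIsNum ? (int) $y : 0;
            $cmp = ($nx < $ny) ? -1 : (($nx > $ny) ? 1 : 0);
        } else {
            // Both sides are alphanumeric: plain string comparison,
            // normalised to -1, 0 or 1.
            $raw = strcmp($x, $y);
            $cmp = ($raw < 0) ? -1 : (($raw > 0) ? 1 : 0);
        }

        if ($cmp !== 0) {
            return $cmp; // the first differing field decides
        }
    }

    return 0;
}
```

With this sketch, looseVersionCompare('2.6.18-164.6.1.el5', '2.6.18-92.1.13.el5') returns 1, and looseVersionCompare('5.2-alpha-foo', '5.2.49.4-beta-bar') returns -1 because 'alpha' is counted as 0 against 49.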

meagar
Hm, well, I tried a few different examples against the results of preg_split('/[^0-9a-z]/i', $foo) and preg_split('/[^0-9a-z]/i', $bar), and it seems to work, even with the possible pitfall you mentioned. I need to do some more testing, but this may be the route I end up taking.
Skudd
A: 

For reference, rpm compares version strings something like this:

  • Split on all non-alphanumeric characters
  • Group consecutive numeric characters together and consecutive non-numeric characters together (e.g. 1.12.ab002 is split into 1, 12, ab, 002)
  • Compare each group left to right
    • if both versions have a numeric group, they are compared as numbers (e.g. 1 = 001 and 12 > 5)
    • if either is a non-numeric group, a simple string comparison is performed
  • The first non-equal comparison is the result
  • Longer versions are considered greater (e.g. 1.2.3 < 1.2.3.0 and alp < alpha)

This has flaws: 1.2.3rc1 > 1.2.3 and 1.2.3alpha > 1.2.3, which may not be right.
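A minimal PHP sketch of the steps as listed above, for anyone who wants to experiment. It follows this description only, not rpm's actual source, and the function names rpmStyleTokens and rpmStyleCompare are invented for the example. It is written to run on PHP 5.3.

```php
<?php
// Hypothetical helper: break a version string into groups. Runs of digits
// and runs of letters each become one group; anything else is a separator.
function rpmStyleTokens($version)
{
    preg_match_all('/[0-9]+|[a-z]+/i', $version, $matches);
    return $matches[0];
}

// Hypothetical helper: returns -1, 0 or 1.
function rpmStyleCompare($a, $b)
{
    $ta = rpmStyleTokens($a);
    $tb = rpmStyleTokens($b);

    $shared = min(count($ta), count($tb));
    for ($i = 0; $i < $shared; $i++) {
        $x = $ta[$i];
        $y = $tb[$i];
        $bothNumeric = preg_match('/^[0-9]+$/', $x) === 1
            && preg_match('/^[0-9]+$/', $y) === 1;

        if ($bothNumeric) {
            // Compared as numbers, so 1 equals 001 and 12 beats 5.
            $raw = (int) $x - (int) $y;
        } else {
            // Either group is non-numeric: simple string comparison.
            $raw = strcmp($x, $y);
        }

        if ($raw !== 0) {
            return ($raw < 0) ? -1 : 1; // first non-equal group decides
        }
    }

    // All shared groups are equal: the version with more groups is greater.
    if (count($ta) === count($tb)) {
        return 0;
    }
    return (count($ta) < count($tb)) ? -1 : 1;
}
```

Here rpmStyleCompare('1.2.3', '1.2.3.0') returns -1, while rpmStyleCompare('1.2.3rc1', '1.2.3') returns 1, which reproduces the flaw mentioned above.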

Craig