Hi!

I need a tool, if one exists or if you can write it in under 5 minutes (I don't want to waste anyone's time).

The tool in question would resolve the include, require, include_once and require_once statements in a PHP script and actually hardcode the contents of the referenced files in their place, recursively.

This is needed to ship, as one big file, PHP scripts that actually use code and resources from multiple included files.

I know that PHP is not the best tool for CLI scripts, but as it is the language I'm most proficient in, I use it to write some personal or semi-personal tools. I don't want unhelpful answers or comments that tell me to use something other than PHP or to learn something else.

The idea of that approach is to be able to have a single file that would represent everything needed to put it in my personal ~/.bin/ directory and let it live there as a completely functional and self-contained script. I know I could set include paths in the script to something that would honor the XDG data directories standards or anything else, but I wanted to try that approach.

Anyway, I ask here because I don't want to reinvent the wheel and all my searches turned up nothing. If I don't get any insight here, I will continue the way I was going and write a tool myself that resolves the includes and requires.

Thanks for any help!

P.S.: I forgot to include examples and don't want to rephrase the message: Those two files
mainfile.php

<?php
    include('resource.php');
    include_once('resource.php');
    echo returnBeef();
?>

resource.php

<?php
    function returnBeef() {
        return "The beef!";
    }
?>

Would be "compiled" as (comments added for clarity)

<?php

    /* begin of include('resource.php'); */?><?php
    function returnBeef() {
        return "The beef!";
    }
    ?><?php /* end of include('resource.php'); */
    /*
    NOT INCLUDED BECAUSE resource.php WAS PREVIOUSLY INCLUDED 
    include_once('resource.php'); 
    */
    echo returnBeef();
?>

The script does not have to output explicit comments, but it could be nice if it did.

Thanks again for any help!

EDIT 1

I made a small change to the example output. Now that I have begun writing the tool myself, I noticed a mistake in the original: to keep the work to a minimum, the contents of an included file have to be spliced in between a closing and a re-opening tag (?> ... <?php), since the included file brings its own <?php ?> tags.

The resulting script example has been modified in consequence, but it has not been tested.

EDIT 2

The tool does not need to do heavy-duty, run-time-accurate parsing of the PHP script. Only simple includes have to be handled (like include('file.php');).

I started working on my script; it reads the files and parses them naively, so that includes are resolved only inside <?php ?> tags, and not inside comments or strings. A secondary goal is to also detect dirname(__FILE__)."" in an include directive and actually honor it.
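A minimal sketch of that kind of parsing, assuming PHP's built-in tokenizer (token_get_all) is acceptable instead of a hand-written scanner. The helper name find_simple_includes() is hypothetical. It only reports includes whose argument is a single plain string literal, so variable includes and dirname(__FILE__) expressions are deliberately skipped, and escape sequences inside the literal are not decoded.

```php
<?php
// Sketch: list simple include/require statements using PHP's tokenizer,
// so text inside comments, strings and inline HTML is never matched.
// find_simple_includes() is a hypothetical name, not a PHP built-in.
function find_simple_includes(string $source): array
{
    $kinds  = [T_INCLUDE, T_INCLUDE_ONCE, T_REQUIRE, T_REQUIRE_ONCE];
    $tokens = token_get_all($source);
    $count  = count($tokens);
    $found  = [];

    // Advance past whitespace tokens and the given parenthesis character.
    $skip = function (int $pos, string $paren) use ($tokens, $count): int {
        while ($pos < $count) {
            $t = $tokens[$pos];
            $isSpace = is_array($t) && $t[0] === T_WHITESPACE;
            if (!$isSpace && $t !== $paren) {
                break;
            }
            $pos++;
        }
        return $pos;
    };

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        if (!is_array($token) || !in_array($token[0], $kinds, true)) {
            continue;
        }

        // Expect: keyword, optional "(", one string literal, optional ")", ";".
        $j = $skip($i + 1, '(');
        if ($j >= $count || !is_array($tokens[$j])
            || $tokens[$j][0] !== T_CONSTANT_ENCAPSED_STRING) {
            continue; // variable or expression: not a "simple" include
        }
        $k = $skip($j + 1, ')');
        if ($k < $count && $tokens[$k] === ';') {
            $found[] = [
                'keyword' => strtolower($token[1]),
                'path'    => substr($tokens[$j][1], 1, -1), // strip the quotes
                'line'    => $token[2],
            ];
        }
    }
    return $found;
}
```

Calling find_simple_includes(file_get_contents('mainfile.php')) on the example above should report both the include and the include_once of resource.php, each with its line number; what to do with them is left to the merging step.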

A: 

You could use the built-in function get_included_files(), which returns an array of, you guessed it, all the included files.

Here's an example: drop this code at the END of mainfile.php and then run mainfile.php.

  $includes = get_included_files();

  $all = "";
  foreach($includes as $filename) {
    $all .= file_get_contents($filename);
  }
  file_put_contents('all.php',$all);

A few things to note:

  • Any include which is not actually processed (i.e. an include inside a function that is never called) will not be dumped into the final file. Only includes which have actually run are listed.
  • This will also leave a <?php ... ?> pair around each file, but you can have multiple blocks like that inside a single file with no issues.
  • This WILL include anything included within another include.
  • Yes, get_included_files will list the script actually running as well.
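The note about having several blocks in one file holds only when every dumped file ends outside PHP mode. A file that omits its closing tag, followed directly by the next file's opening tag, would produce a parse error. A hedged sketch of a safer concatenation follows; ends_in_php_mode() and join_php_sources() are hypothetical helper names, and the check relies on PHP's tokenizer.

```php
<?php
// Sketch: concatenate PHP sources, closing any file left in PHP mode.
// Both function names are hypothetical helpers, not PHP built-ins.

// Returns true if $source is still inside a <?php block at its end.
function ends_in_php_mode(string $source): bool
{
    $inPhp = false;
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            continue; // single-character tokens never change the mode
        }
        if ($token[0] === T_OPEN_TAG || $token[0] === T_OPEN_TAG_WITH_ECHO) {
            $inPhp = true;
        } elseif ($token[0] === T_CLOSE_TAG) {
            $inPhp = false;
        }
    }
    return $inPhp;
}

// Joins the given source strings so each one ends outside PHP mode.
function join_php_sources(array $sources): string
{
    $all = '';
    foreach ($sources as $source) {
        $all .= $source;
        if (ends_in_php_mode($source)) {
            // The newline keeps the tag off any trailing "//" comment line.
            $all .= "\n?>";
        }
    }
    return $all;
}
```

With the snippet above, this would replace the plain `.=` loop: collect file_get_contents($filename) for each entry into an array and pass it to join_php_sources().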

If this HAD to be a stand-alone tool instead of a drop-in, you could read the initial file in, append this code as text, then eval the entire thing (possibly dangerous).

Erik
> This will also have a around each file You probably meant the `<?php ... ?>` tags (with the angled brackets). Thanks for submitting this idea. Sadly, it would need quite a bit of modification to a script (includes done only if not run from the merged file). Anyway, thanks for the idea. I could see a way to use it in another scenario, like a packager for a live script, but it would not be helpful in my case. My needs are more about merging without actively changing the script. Thanks again.
samueldr
+1  A: 

An interesting problem, but one that's not really solvable without detailed runtime knowledge. Conditional includes would be nearly impossible to determine, but if you make enough simple assumptions, perhaps something like this will suffice:

<?php
  # import.php 
  #
  # Usage:
  # php import.php basefile.php
  if (!isset($argv[1])) die("Invalid usage.\n");

  $included_files = array();

  echo import_file($argv[1])."\n";

  function import_file($filename)
  {
    global $included_files;

    # this could fail because the file doesn't exist, or
    # if the include path contains a run time variable
    # like include($foo);
    $file = @file_get_contents($filename);
    if ($file === false) die("Error: Unable to open $filename\n");

    # trimming whitespace so that the str_replace() at the end of 
    # this routine works. however, this could cause minor problems if
    # the whitespace is considered significant
    $file = trim($file);

    # look for require/include statements. Note that this looks
    # everywhere, including non-PHP portions and comments!
    if (!preg_match_all('!((require|include)(_once)?)\\s*\\(?\\s*(\'|")(.+)\\4\\s*\\)?\\s*;!U', $file, $matches, PREG_SET_ORDER |  PREG_OFFSET_CAPTURE ))
    {
      # nothing found, so return file contents as-is
      return $file;
    }

    $new_file = "";
    $i = 0;
    foreach ($matches as $match)
    {
      # append the plain PHP code up to the include statement 
      $new_file .= substr($file, $i, $match[0][1] - $i);

      # make sure to honor "include once" files
      if ($match[3][0] != "_once" || !isset($included_files[$match[5][0]]))
      {
         # include this file
         $included_files[$match[5][0]] = true;
         $new_file .= ' ?>'.import_file($match[5][0]).'<?php ';
      }

      # update the index pointer to where the next plain chunk starts
      $i = $match[0][1] + strlen($match[0][0]);
    }

    # append the remainder of the source PHP code
    $new_file .= substr($file, $i);

    return str_replace('?><?php', '', $new_file);
  }
?>

There are many caveats to the above code, some of which can be worked around. (I leave that as an exercise for somebody else.) To name a few:

  • It doesn't honor <?php ?> blocks, so it will match inside HTML
  • It doesn't know about any PHP rules, so it will match inside PHP comments
  • It cannot handle variable includes (e.g., include $foo;)
  • It may introduce scope errors (e.g., if (true) include('foo.php'); should be if (true) { include('foo.php'); })
  • It doesn't check for infinitely recursive includes
  • It doesn't know about include paths
  • etc...

But even in such a primitive state, it may still be useful.
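Two of the caveats above, recursive includes and path handling, could be approached roughly as follows. This is only a sketch under stated assumptions: resolve_include_path() and enter_file() are hypothetical helpers, paths are assumed to be Unix-style, and relative paths are resolved against the including file's directory, which is a simplification of PHP's real lookup (include_path, then the calling script's directory, then the working directory).

```php
<?php
// Sketch only: both helpers are hypothetical and not part of import.php.

// Resolve $path as written in an include found inside $includingFile.
// Assumption: a leading "/" means absolute; everything else is taken
// relative to the directory of the including file.
function resolve_include_path(string $path, string $includingFile): string
{
    if ($path !== '' && $path[0] === '/') {
        return $path;
    }
    return dirname($includingFile) . '/' . $path;
}

// Push $file onto the stack of files currently being expanded.
// Throws if the file is unreadable or is already on the stack, which
// would mean the includes recurse forever.
function enter_file(string $file, array $stack): array
{
    $real = realpath($file);
    if ($real === false) {
        throw new RuntimeException("Unable to open $file");
    }
    if (in_array($real, $stack, true)) {
        throw new RuntimeException("Recursive include detected: $real");
    }
    $stack[] = $real;
    return $stack;
}
```

In import_file(), the idea would be to pass the stack down as an extra argument: call enter_file() on entry, and hand the returned stack to each nested import_file() call together with the path produced by resolve_include_path().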

konforce
Yes, thanks! This is, as you said, in a primitive state, but it gets partway there, which is good. I should have said that I wasn't aiming to have conditional includes work like they should, or includes with variables. This could actually fit the bill for my first couple of tests. As for matches inside comments and HTML: since this is for CLI script distribution, it isn't that bad. I would pass the scripts through `php -w` anyway, so whether that happens before or after the merging is not much of a problem. This is not the answer, but you get an orange triangle for it. Thanks!
samueldr