In order to refer to a local DTD when using PHP SimpleXML, I need to convert the absolute path into a file URI. For example, /home/sitename/www/xml/example.dtd should become file:///home/sitename/www/xml/example.dtd.

In this example it is easy enough: all that is required is to put the 'file' scheme in front of the path. But if, say, one of the directory names contains a space, that is not good enough. The mechanism should work on Windows or Linux, and allow for non-ASCII characters in the directory names.

The code devised so far is:

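 // Split the absolute path so that each segment can be percent-encoded separately.
 $absparts = explode('/', _ALIRO_ABSOLUTE_PATH);
 // On Windows the first segment is the drive letter (e.g. "C:"), which must stay unencoded.
 $driveletter = (0 == strncasecmp(PHP_OS, 'win', 3)) ? array_shift($absparts).'/' : '';
 $filename = $driveletter.implode('/', array_map('rawurlencode', $absparts)).'/xml/'.$filename.'.dtd';
 // Trim any leading slash so that the result has exactly three slashes after "file:".
 $href = 'file:///'.ltrim($filename, '/');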

where the defined symbol is the absolute path to the system root (always with forward slashes), the DTD is in the xml subdirectory, and its name is $filename followed by the extension .dtd.

Will this work correctly? Is there a better approach?

Let me explain a little more background. The aim is to parse an XML document using SimpleXML. This is done using the simplexml_load_string() function with the LIBXML_DTDVALID option. The XML document will contain a real URI pointing to a home web site, but I do not want to introduce the delays involved in reaching a distant web site, or burden the home web site with requests. The reference to the DTD is therefore edited so as to refer to the local machine. But it has to be embedded as a URI within a DOCTYPE inside the XML document. The constraints are not my choice; they are implicit in the rules for a DOCTYPE and are enforced by the SimpleXML function. I can work out where the file is located as an absolute path, but to put it into a DOCTYPE, it must be converted into a URI.
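
To make the DOCTYPE editing step concrete, here is a minimal sketch of one way it might be done. The function name rewriteDoctypeSystemId and the example.com URL are made up for illustration; the sketch only handles a DOCTYPE that uses a SYSTEM identifier and assumes the local URI has already been built.

```php
<?php
// Hypothetical helper (not part of SimpleXML): replaces the system identifier
// of the first <!DOCTYPE name SYSTEM "..."> declaration with the given URI.
function rewriteDoctypeSystemId(string $xml, string $localUri): string
{
    // Capture everything up to the quoted system identifier, then match the
    // identifier itself in either double or single quotes.
    $pattern = '/(<!DOCTYPE\s+\S+\s+SYSTEM\s+)(?:"[^"]*"|\'[^\']*\')/';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($localUri): string {
            // Keep the original prefix and substitute the local URI.
            return $m[1] . '"' . $localUri . '"';
        },
        $xml,
        1 // only the first DOCTYPE is rewritten
    );
}

// Illustrative input; the remote URL is an assumption, not the real home site.
$remote = '<?xml version="1.0"?>'
    . '<!DOCTYPE root SYSTEM "http://example.com/xml/example.dtd"><root/>';
$local = rewriteDoctypeSystemId($remote, 'file:///home/sitename/www/xml/example.dtd');
```

The rewritten string could then be passed to simplexml_load_string($local, 'SimpleXMLElement', LIBXML_DTDVALID), provided the DTD file really exists at that location.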

A: 

Do you really need to make it local? Can't you host the DTD on the web server? Obviously that makes it much easier and you don't need to worry about conversion. I think you would be better off trying to get it hosted rather than deal with the OS differences.

Arthur Frankel
The software is widely distributed, and I do not want either the bandwidth demand or the potential delays of referring to a specific home site. The file also exists on each instance of the software, but referring back to the same site using a URI will not always work because the resolving of a local URI may fail within the server.
A: 

Have you tried realpath()?

knittl
Yes, I'm afraid it just doesn't do what is required. It doesn't encode blanks.
+1  A: 

If you're running PHP locally, you must be running a server like Apache. So why the need for the file:// reference? Just use a regular reference.

If your script is on http://localhost/index.php you can refer to the file as /xml/example.dtd in your HTML. Or, if you mean to read the file from PHP, use $_SERVER['DOCUMENT_ROOT'] . '/xml/example.dtd'

In these cases the same code should work fine on your local machine and on the live server.


OK, I had a think based on your clarified question, and you're probably doing more than you need to. I'm not convinced you need to detect Windows; you just need to prefix your document root with file:// (and encode the URL). In other words, you'd end up with either file://C:/My Documents... or file:///home/site...
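
As a rough sketch of that idea, the following helper builds the URI by looking at the path itself rather than at PHP_OS. The name pathToFileUri is hypothetical, and the sketch assumes the path is absolute and that its bytes are UTF-8, so that rawurlencode() yields the usual percent-encoding for non-ASCII names. Note that the conventional form for a Windows drive path has three slashes (file:///C:/...), so the sketch puts a slash before the drive letter.

```php
<?php
// Hypothetical helper, not a built-in: converts an absolute filesystem path
// into a file: URI, percent-encoding each path segment separately.
function pathToFileUri(string $path): string
{
    // Normalise Windows backslashes to forward slashes.
    $path = str_replace('\\', '/', $path);

    $prefix = '';
    // A leading drive letter such as "C:" is kept unencoded and is
    // preceded by a slash, giving file:///C:/...
    if (preg_match('#^([A-Za-z]:)(/.*)?$#', $path, $m)) {
        $prefix = '/' . $m[1];
        $path = $m[2] ?? '';
    }

    // Encode each segment on its own so that the separators stay as "/".
    $segments = array_map('rawurlencode', explode('/', $path));

    return 'file://' . $prefix . implode('/', $segments);
}

$unixUri = pathToFileUri('/home/sitename/www/xml/example.dtd');
$winUri  = pathToFileUri('C:\\My Documents\\xml\\example.dtd');
```

Whether the parser then maps the percent-encoded names back to the right files on every platform is something that would need testing, particularly for non-ASCII directory names on Windows.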

Of course, you could still use an HTTP reference instead of file. As I said above, the user will be running on a server, so you should be able to use the various parts of the $_SERVER variable (e.g. $_SERVER['HTTP_HOST']) to piece together a more concrete URL.

DisgruntledGoat
Well, there is a possible solution there, thanks. It has limitations though. Please see the additions to the original question - we are not talking about HTML here, we are in a DOCTYPE within an XML document. I know where the file is (in terms of an absolute path) but I have to tell SimpleXML where it is via the DOCTYPE, which in turn requires a URI. If the software is running on http://mydomain.com it is unwise to refer to http://mydomain.com/xml/foo.dtd because it may be impossible to resolve mydomain.com from within the server. Maybe http://127.0.0.1/xml/foo.dtd would work.
Hmm, okay. What isn't working with your current approach? Where is the XML file coming from and why can't you just put the correct path in the file manually?
DisgruntledGoat
Nothing so far as I know, it's just a bit cumbersome. And I'm not certain it will work in all circumstances. The XML file is distributed with the software, but the path to it will vary with each web server that uses the software. This is open source software for distribution - I do not control its implementation in most cases.