domdocument

DOMDocument: Ignore Duplicate Element IDs

I'm putting some page content (which has been run through Tidy, but doesn't need to be if this is a source of problems) into DOMDocument using DOMDocument::loadHTML. It's coming up with various errors 'ID x already defined in Entity, line X'. Is there any way to make either DOMDocument (or Tidy) ignore or strip out duplicate element IDs,...

How to make DOMDocument write standalone=yes in PHP?

I'm using PHP5 to create XML files. I have code like this: $doc = new DOMDocument(); ... $xml_content = $doc->saveXML(); The problem is that created XML code starts with a root node like this one: <?xml version="1.0"?> But I want it to be like this: <?xml version="1.0" standalone="yes" ?> I guess I need to call some function on ...
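One possible approach, sketched under the assumption that the standard `xmlStandalone` property of DOMDocument is acceptable here; the element name `root` is a placeholder and not from the question:

```php
<?php
// Build a small document; "root" is a placeholder element name.
$doc = new DOMDocument('1.0');
$doc->appendChild($doc->createElement('root'));

// With xmlStandalone set, saveXML() writes standalone="yes" into the XML declaration.
$doc->xmlStandalone = true;

$xml_content = $doc->saveXML();
echo $xml_content;
```

The property only affects the declaration; the rest of the serialized document is unchanged.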

PHP XML Parsing

Which is the best way to parse an XML file in PHP? First Using the DOM object //code $dom = new DOMDocument(); $dom->load("xml.xml"); $root = $dom->getElementsByTagName("tag"); foreach($root as $tag) { $subChild = $tag->getElementsByTagName("child"); // extract values and loop again if needed } Second Using the simplexml_load Me...

Using DOMXPath to replace a node while maintaining its position ...

Ok, so I had this neat little idea the other night to create a helper class for DOMDocument that mimics, to some extent, jQuery's ability to manipulate the DOM of an HTML or XML-based string. Instead of CSS selectors, XPath is used. For example: $Xml->load($source) ->path('//root/items') ->each(function($Context) { e...

Debug a DOMDocument Object in PHP

I'm trying to debug a large and complex DOMDocument object in PHP. Ideally it'd be nice if I could get DOMDocument to output in an array-like format. DOMDocument: $dom = new DOMDocument(); $dom->loadHTML("<html><body><p>Hello World</p></body></html>"); var_dump($dom); //or something equivalent This outputs DOMDocument Object ( ) ...
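A rough sketch of one way to inspect such an object: serialize it back to markup rather than dumping it. It does not give an array-like view, only the markup, and it assumes `saveHTML()` with a node argument is available (PHP 5.3.6 or later):

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Hello World</p></body></html>');

// Serialize the whole document to see what the parser actually built.
$dump = $dom->saveHTML();
echo $dump;

// Serialize a single node to look at one branch of a large document.
$p = $dom->getElementsByTagName('p')->item(0);
$fragment = $dom->saveHTML($p);
echo $fragment, "\n";
```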

Indentation with DOMDocument in PHP

I'm using DOMDocument to generate a new XML file and I would like for the output of the file to be indented nicely so that it's easy to follow for a human reader. For example, when DOMDocument outputs this data: <?xml version="1.0"?> <this attr="that"><foo>lkjalksjdlakjdlkasd</foo><foo>lkjlkasjlkajklajslk</foo></this> I want the XML ...
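A minimal sketch of the usual approach, assuming the document is built in code as described; the text values `first` and `second` are placeholders:

```php
<?php
$doc = new DOMDocument('1.0');
// formatOutput asks saveXML() to add line breaks and indentation.
$doc->formatOutput = true;

$root = $doc->createElement('this');
$root->setAttribute('attr', 'that');
$root->appendChild($doc->createElement('foo', 'first'));
$root->appendChild($doc->createElement('foo', 'second'));
$doc->appendChild($root);

$formatted = $doc->saveXML();
echo $formatted;
```

When the XML is loaded from a string or file rather than built in code, setting `preserveWhiteSpace` to `false` before loading is generally needed as well, since existing whitespace text nodes otherwise prevent re-indentation.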

How do I prevent PHP's DOMDocument from encoding html entities?

I have a function that replaces anchors' href attribute in a string using PHP's DOMDocument. Here's a snippet: $doc = new DOMDocument('1.0', 'UTF-8'); $doc->loadHTML($text); $anchors = $doc->getElementsByTagName('a'); foreach($anchors as $a) { $a->setAttribute('href', 'http://google.com'); } return $doc->saveHTML(); The ...

PHP DOMDocument : loadHTMLFile choking on a mysterious character: RS

Using PHP's DOMDocument->loadHTMLFile('test.html'); keeps returning an error, reporting a problem in the content at line 36. Deleting character after character, it turned out that an apparently empty space was the culprit. Copying/pasting that sentence into another editor (Editra) showed a strange RS character. What is i...

DomDocument failing to add a "link" element for RSS feed

I am trying to create an RSS feed in PHP using DomDocument but every time I try to make a node like <link>http://domain.com</link> the script fails $oDomDocument = new DOMDocument( "1.0", "iso-8859-1" ); // Create the root now $oRootNode = $oDomDocument->createElement( "rss" ); $oRootNode->setAttribute( "version", "2.0" ); $oDomDocument->appendChil...

PHP and parsing XML question

I'm building a web form in which administrators on my site can add XML to a textarea and submit it to be stored in a database table, but I'm a little confused as to the best method of parsing the XML. The PHP script needs to parse the XML and if there are any parse errors it should return the error message and line/column where the pars...
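One way this could look, as a sketch: libxml's internal error buffer exposes a line, column, and message for each parse error. The `$submitted` string is a made-up example with a mismatched closing tag, not data from the question:

```php
<?php
// Hypothetical submitted XML with a mismatched closing tag on line 2.
$submitted = "<root>\n  <item>one</itm>\n</root>";

// Collect parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$ok = $doc->loadXML($submitted);

$messages = [];
if (!$ok) {
    foreach (libxml_get_errors() as $error) {
        // Each LibXMLError carries the position and a description.
        $messages[] = sprintf(
            'Line %d, column %d: %s',
            $error->line,
            $error->column,
            trim($error->message)
        );
    }
}
libxml_clear_errors();

print_r($messages);
```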

Parsing XML using PHP

Hiya, I've consistently had an issue with parsing XML with PHP and not really found "the right way" or at least a standardised way of parsing XML files. Firstly I'm trying to parse this: <item> <title>2884400</title> <description><![CDATA[ ><img width="126" alt="" src="http://userserve-ak.last.fm/serve/126/27319921.jpg" ...

Disable warnings when loading non-well-formed HTML by DomDocument (PHP)

I need to parse some HTML files; however, they are not well-formed and PHP prints out warnings. I want to avoid such debugging/warning behavior programmatically. Please advise. Thank you! Code: // create a DOM document and load the HTML data $xmlDoc = new DomDocument; // this dumps out the warnings $xmlDoc->loadHTML($fetchResult); ...
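A minimal sketch of one common way to silence these warnings, using `libxml_use_internal_errors()`; the `$fetchResult` value here is an invented stand-in for the real fetched HTML:

```php
<?php
// Hypothetical malformed input standing in for the fetched HTML.
$fetchResult = '<html><body><p>Unclosed paragraph<div>Stray</p></body></html>';

// Buffer libxml errors internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$xmlDoc = new DOMDocument();
$xmlDoc->loadHTML($fetchResult);

// The buffered errors remain available for inspection; clear them when done.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore whatever setting was active before.
libxml_use_internal_errors($previous);

echo count($errors), " parse problem(s) collected\n";
```

Compared with prefixing the call with `@`, this keeps the error details available if they are ever needed.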

How to remove an HTML element using the DOMDocument class

Is there a way of removing an HTML element by using the DOMDocument class? ...
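A short sketch of one way to do it, via `removeChild()` on the element's parent; the markup and the choice of `div` as the element to remove are illustrative assumptions:

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<html><body><p>Keep</p><div>Drop</div></body></html>');

// getElementsByTagName() returns a live list, so copy it to an array
// before removing nodes; otherwise items can be skipped while iterating.
$divs = iterator_to_array($doc->getElementsByTagName('div'));

foreach ($divs as $div) {
    // A node is removed through its parent.
    $div->parentNode->removeChild($div);
}

$html = $doc->saveHTML();
echo $html;
```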

loading DOMDocument in Frame/webView

Since there was no answer to this question http://stackoverflow.com/questions/1209586/iphone-tabbed-browsing-by-storing-webview-in-an-array-plist yet, I want to try to get the DOMDocument (the whole content of the webView (with JS and pictures), not only the html document) using the private WebKit framework for iPhone. My question is: How do I s...

domdocument formatting

I am trying to read in the body of a certain webpage to display on a separate webpage, but I am having a bit of trouble with it. Right now, I use the following code <?php @$doc = new DOMDocument(); @$doc->loadHTMLFile('http://foo.com'); @$tags = $doc->getElementsByTagName('body'); foreach ($tags as $tag) { $index_text .= $tag->nodeV...

DOMDocument & XPath - HTML Tag of each Node

Given the following PHP code using DOMDocument: $inputs = $xpath->query('//input | //select | //textarea', $form); if ($inputs->length > 0) { for ($j = 0; $j < $inputs->length; $j++) { $input = $inputs->item($j); $input->getAttribute('name'); // Returns the Attribute $input->getTag(); // How can I get the input,...
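As a sketch: the tag name of each matched node is available through the `nodeName` property (or `tagName` on a DOMElement). The form markup below is invented for illustration:

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body><form>'
    . '<input name="a"><select name="b"></select><textarea name="c"></textarea>'
    . '</form></body></html>'
);

$xpath = new DOMXPath($doc);
$form = $doc->getElementsByTagName('form')->item(0);

$tags = [];
foreach ($xpath->query('//input | //select | //textarea', $form) as $input) {
    // nodeName holds the element's tag name, e.g. "input".
    $tags[$input->getAttribute('name')] = $input->nodeName;
}

print_r($tags);
```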

Is there a way to get all of a DOMElement's attributes?

I'm reading some XML with PHP and currently using the DOMDocument class to do so. I need a way to grab the names and values of a tag's (instance of DOMElement) attributes, without knowing beforehand what any of them are. The documentation doesn't seem to offer anything like this. I know that I can get an attribute's value if I have its n...
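A minimal sketch, assuming the element is already at hand: the `attributes` property of a DOMElement is a DOMNamedNodeMap that can be iterated without knowing the attribute names in advance. The sample `<item>` element is made up:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<item id="42" lang="en" status="new"/>');
$element = $doc->documentElement;

$found = [];
foreach ($element->attributes as $attr) {
    // Each entry is a DOMAttr with name and value properties.
    $found[$attr->name] = $attr->value;
}

print_r($found);
```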

Same elements from multiple files DomDocument loadHTMLFile PHP

It seems that when I have multiple files within the '/var/www/cal/attach/' directory, it only extracts the elements from the first file over and over again. Do I need to clear out the elements somehow to get this to work properly? What I'm trying to do is have the script go through multiple *.htm files, and parse data from the files int...

Walking through html searching for first level blocks

Following up on my other question about analyzing HTML and searching for <p> and <ul>/<ol> tags, here is this question. $in = '<p>Bit\'s of text</p><p>another paragraph</p><ol><li>item1</li><li>item2</li></ol><p>paragraph</p>'; function geefParagrafen($in){ $dom = new domDocument; $dom->loadHTML($in); $x = $dom->documentElement; } ...

PHP DOMDocument with åäö (UTF-8)

Hi! I have an HTML/PHP5 page with a form; when it gets posted, it creates an XML file with the form input as data. But all åäö look as if I had used utf8_encode() on them. I can't utf8_decode() them, because then the "service" I send the XML files to complains that it is not UTF-8 (like it should be). Parser failed. Reason :2: parser ...