phpquery

Scraping Library for PHP - phpQuery?

I'm looking for a PHP library that allows me to scrape webpages and takes care of all the cookies and of prefilling the forms with their default values; that's what annoys me the most. I'm tired of having to match every single input element with XPath and I would love it if something better existed. I've come across phpQuery but the manual is...

jQuery filtering and traversing (phpQuery)

Hi, I am trying to parse data off a page using phpQuery (almost the same as jQuery). I need code to get these 2 things: B K Guda Association Hall and Main Road, C Type Colony, B K Guda. Other things can be left out. This is the HTML: <td> <a href="#..."> <span class="a12bl"> <u> <h2 class="bold">...

How do I prevent the query class from processing the question marks in my text strings?

Rather new to php, so sorry if this seems stupid. I'm really copying a lot of this from previously written code from other developers at my company. The way we run a query is basically like this: $qry = new SQLQuery; $sqlString = "SELECT * FROM database.table WHERE table.text = '" . $textVar . "' and table.text2 = '" . $...

Selecting peculiar XML tags with phpQuery

phpQuery is a really nice tool which has helped me tremendously in the past to parse well-formed XHTML and XML documents, but I have recently run into a problem trying to select elements that have colons in their tag name, such as the following: <isc:thumb><![CDATA[http://example.com/foo_thumb.jpg]]></isc:thumb> I've tried to use...

best way to filter out non-representational elements

Hi there, I am using phpQuery in order to apply some tweaks to editorial content. As its syntax is the same as jQuery's, I consider this a DOM scripting issue. To get rid of elements which are not representing anything, but only wrapping other stuff (like the div in <div><p>blah</p></div>), I came up with this selector: $doc['div:not([alig...

Syntax for phpquery scraping

Hello, I need to use a WordPress plugin, WP Web Scraper (http://wordpress.org/extend/plugins/wp-web-scrapper), to extract the link of an audio track on an iTunes web page. Here's the page I want to extract the link from: http://itunes.apple.com/us/album/guero/id52311104 Here’s the link I want to extract from this page: http://a1.pho...

Fix incorrectly displayed encoding on an html document with php

Is there a way to fix the characters that display improperly after running this HTML markup through phpquery::newDocument? There are slanted double quotes around -Classics with modern Woman- in the original document that end up displaying improperly after creating the new doc with phpQuery. //Original document is UTF-8 encoded $raw_h...

Replace element using phpquery (php version of jquery)

I want to replace all <span> tags with <p> using phpquery. What is wrong with my code? It finds the span but the replaceWith function is not doing anything. $event = phpQuery::newDocumentHTML(file_get_contents('event.html')); $formatted_event = $event->find('span')->replaceWith('<p>'); This documentation indicates this is possible: ...

Current line-/col-number in phpQuery?

Hi, how can I get the line/column number of the current element in phpQuery? I use the phpQuery framework for a validation tool with custom errors. Thanks! ...