I am writing a WordPress plugin and need to go through a number of posts, grab their data (mostly title, permalink, and content), and process it without displaying anything on the page.

What I've looked at:

I've looked at get_posts() for getting the posts, and then

getting title via the_title(),
content via the_content(),
and permalink via the_permalink()

Keep in mind that I need this data after all filters have been applied, so that I get exactly what would be displayed to the user. Each of the functions above seems to apply all necessary filters and do some postprocessing already, which is great.

The Problem:

The problem is that all of these functions, at least in WP 2.7.1 (the latest released version right now), just echo everything by default and don't return anything. the_title() actually supports a flag that says "do not print, return instead", like so:

the_title(null, null, false)

The other two, however, don't have such a flag, and this inconsistency is quite shocking to me.

I've looked at what each of the the_() functions does and tried to pull that code out so that I can call it without displaying the data (this is a hack in my book, as the behavior of the the_() functions can change at any time). This worked for the permalink, but for some reason get_the_content() returns NULL. There has to be a better way anyway, I believe.

So, what is the best way to pull out these values without printing them?

Some sample code

global $post;
$posts = get_posts(array('numberposts' => $limit));

foreach($posts as $post){
    $title = the_title(null, null, false); // the_title() actually supports a "do not print" flag
    $permalink = apply_filters('the_permalink', get_permalink()); // thanks, WP, for being so consistent in your functions - the_permalink() just prints /s
    $content = apply_filters('the_content', get_the_content()); // this doesn't even work - get_the_content() returns NULL for me
    print "<a href='$permalink'>$title</a><br>";
    print htmlentities($content, ENT_COMPAT, "UTF-8"). "<br>";
}

P.S. I've also looked at http://stackoverflow.com/questions/570152/what-is-the-best-method-for-creating-your-own-wordpress-loops and while it deals with an already obvious way to cycle through posts, the solution there just prints this data.

UPDATE: I've opened a ticket with WordPress about this: http://core.trac.wordpress.org/ticket/9868

+3  A: 

Most the_stuff() functions in WP that echo something have a get_the_stuff() counterpart that returns the value instead.

E.g. get_the_title(), get_permalink()...

Ozh
Ozh, I've already looked at them and given my reasons for why I think using them in their current form is a kind of a hack (plus, manual applying of filters needs to be done). For example, the_content() currently does additional things to the output of get_the_content(), in addition to simply returning NULL for me. And why is it that the_title() has a "return value and skip printing" flag and the others don't?
Artem Russakovskii
WordPress devs themselves seem to think that applying filters should be a plugin author's job: http://core.trac.wordpress.org/ticket/7166. Though I disagree, I'll probably end up doing it that way, once I figure out why get_the_content() returns NULL for me. Hack, hack, hack.
Artem Russakovskii
+1  A: 

If you can't find the exact way to do it, you can always use output buffering.

<?php
ob_start();
echo "World";
$world = ob_get_clean();
echo "Hello $world";
?>
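
The same idea can be wrapped in a small helper so that any echo-only function is captured the same way. This is only a sketch: capture_output() and print_greeting() are hypothetical names made up for illustration, not WordPress functions, and print_greeting() merely stands in for an echo-only template tag.

```php
<?php
// Hypothetical helper (not part of WordPress): run a callback that echoes
// and return what it printed instead of sending it to the browser.
function capture_output($callback, array $args = array()) {
    ob_start();
    try {
        call_user_func_array($callback, $args);
    } catch (Exception $e) {
        // Discard the partial buffer so it does not leak into the page.
        ob_end_clean();
        throw $e;
    }
    // Return the buffered text and close the buffer.
    return ob_get_clean();
}

// Stand-in for an echo-only function such as the_content().
function print_greeting($name) {
    echo "Hello $name";
}

$greeting = capture_output('print_greeting', array('World'));
```

Inside a WordPress loop, something like capture_output('the_content') should presumably work the same way, assuming the post data has been set up first.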
troynt
This is a viable solution, and I did think of it as a last resort, but it is indeed another hack. Voted up for the time being.
Artem Russakovskii
If you're going to do this, you might want to capture the output of a customized WordPress RSS/Atom feed and do: $rss = simplexml_load_string($data). Then you can do stuff like this: foreach($rss->item as $item) do_something($item->title);
rojoca
Well, actually, the get_posts() function already returns an array of posts, and then for each element in the array I can access $post->post_content, $post->post_title, etc. Now I'm not even sure what I'd need get_the_content() for in the first place. /confused.
Artem Russakovskii
A: 

Is there any reason you can't do your processing at the time each individual post is posted, or when it's being displayed?

WP plugins generally work on a single post at a time, so there are plenty of hooks for doing things that way.

Frank Farmer
Frank, this plugin doesn't operate on individual posts - it's an admin management interface of sorts. Otherwise, I'd just hook into one of the hooks.
Artem Russakovskii
A: 

OK, I got it all sorted now. Here is the final outcome, for whoever is interested:

  • Each post's data can be accessed via iterating through the array returned by get_posts(), but this data will just be whatever is in the database, without passing through any intermediate filters
  • The preferred way is to access the data using the get_the_ functions and then wrap them in a call to apply_filters() with the appropriate filter. This way, all intermediate filters will be applied.

apply_filters('the_permalink', get_permalink())

  • The reason get_the_content() was returning an empty string is that, apparently, a special call to setup_postdata($post) needs to be made first. After that, get_the_content() returns the data properly.
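
Putting the points above together, the loop might look roughly like the sketch below. It is meant to run inside WordPress, and collect_filtered_post_data() is a hypothetical name chosen for illustration. The str_replace() call is my assumption about the postprocessing that the_content() does after filtering, and get_the_title() is left unwrapped on the assumption that it already applies the the_title filter itself.

```php
<?php
// Hypothetical helper (not part of WordPress): collect the filtered title,
// permalink and content of the latest $limit posts without printing them.
function collect_filtered_post_data($limit) {
    global $post;
    $saved = $post; // remember the current post so it can be restored
    $rows = array();

    foreach (get_posts(array('numberposts' => $limit)) as $post) {
        // Without this call get_the_content() has nothing to work with.
        setup_postdata($post);

        $content = apply_filters('the_content', get_the_content());

        $rows[] = array(
            // Assumption: get_the_title() already runs the 'the_title' filter.
            'title'     => get_the_title(),
            'permalink' => apply_filters('the_permalink', get_permalink()),
            // Assumption: the_content() escapes "]]>" like this before echoing.
            'content'   => str_replace(']]>', ']]&gt;', $content),
        );
    }

    // Restore whatever post was current before the loop.
    $post = $saved;
    if ($saved) {
        setup_postdata($saved);
    }
    return $rows;
}
```

Each element of the returned array then holds the filtered values for one post, ready for further processing.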

Thanks everyone for suggestions.

Artem Russakovskii
A: 

I have to agree this is a notable downside of using WordPress.

The functions are not consistent: some return, some don't; some accept arguments, some don't. As much as I like WordPress (and develop for it), the consistency sucks!

Adrian