ansaurus

Question

How can I cut off an RSS feed description after 2 sentences using preg_split?

Answer 1

+1 A:

Splitting HTML properly is very tricky, and not worth attempting with regular expressions. If you need to keep the HTML, something like a DOM text-node iterator will be more useful.

Convert the description to plain text:

$text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

This takes the first 200 characters (200 words is a bit too much for a sentence, isn't it?) and then looks for the next end of a sentence:
```
// The s modifier lets "." match newlines, so multi-line text is handled too.
$text = preg_replace('/^(.{200}.*?[.!?]).*$/s','\1',$text);
```
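To show the two steps working together, here is a minimal sketch that wraps them in one function. The function name `truncate_after_sentence` and the `$minLength` parameter are made up for illustration, and the `u` modifier is an added assumption that the text is valid UTF-8; none of this comes from the original answer.

```php
<?php
// Hypothetical helper: name and $minLength parameter are illustrative only.
function truncate_after_sentence(string $html, int $minLength = 200): string
{
    // Strip tags first, then decode entities so "&amp;" counts as one character.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // s: "." also matches newlines; u: treat the subject as UTF-8 (assumption).
    $pattern = '/^(.{' . $minLength . '}.*?[.!?]).*$/su';
    $result = preg_replace($pattern, '$1', $text);

    // preg_replace() returns null on error (e.g. invalid UTF-8); fall back to the plain text.
    return $result ?? $text;
}

echo truncate_after_sentence('<p>One two. Three four! Five six? Seven.</p>', 12), "\n";
// prints: One two. Three four!
```

If the text is shorter than the minimum length, or has no sentence-ending punctuation after it, the pattern does not match and the text comes back unchanged.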

You could change [.!?] to something more sophisticated, e.g. require a space after the punctuation, or require that there is no other punctuation nearby:

  (?<=[^.!?]{5})[.!?](?=[^.!?]{5})

(?=…) is a positive lookahead assertion. (?<=…) is a positive lookbehind assertion, which checks the text before the current position; the negative forms are (?!…) and (?<!…). {5} means exactly 5 times.
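As a rough illustration of the "no punctuation nearby" idea, the sketch below uses a positive lookbehind and a positive lookahead, so a mark only counts when five non-punctuation characters sit on each side of it. The sample text is invented for the example.

```php
<?php
// Sample text is made up for illustration.
$text = 'Version 1.2.3 is out now. It fixes several bugs in the parser.';

// Match ., ! or ? only when exactly-checked runs of 5 non-punctuation
// characters appear directly before and directly after it.
$pattern = '/(?<=[^.!?]{5})[.!?](?=[^.!?]{5})/u';

preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as [$mark, $offset]) {
    echo "Found '$mark' at byte offset $offset\n";
}
// prints: Found '.' at byte offset 24
```

The dots inside "1.2.3" are skipped because other punctuation is too close. Note that the very last full stop is skipped as well, since nothing follows it for the lookahead to match; a real pattern would probably need to allow the end of the string too.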

I haven't tested it :)

porneL 2008-11-27 19:48:50

Answer 2

A:

Thanks! But it seems that there is an example post-processing script that uses preg_replace (for something entirely different), so I'd like to stick with something of the sort.

So if I could use preg_split, how can I take just the text in $the_content, find the end of the second sentence (or of the next sentence after maybe 100 or 200 words), and output the text up to that point? Either one works for me!

2008-11-27 23:57:47
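For the preg_split route asked about in the follow-up, a minimal sketch might look like the following. The helper name `first_sentences` and the sample text are assumptions for illustration; `$the_content` is the variable named in the question and is assumed to already hold plain text.

```php
<?php
// Hypothetical helper: the name and $count parameter are illustrative only.
function first_sentences(string $text, int $count = 2): string
{
    // Split on whitespace that follows ., ! or ? so the punctuation
    // stays attached to the sentence before it.
    $sentences = preg_split('/(?<=[.!?])\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if ($sentences === false) {
        return $text; // regex error, e.g. invalid UTF-8
    }
    return implode(' ', array_slice($sentences, 0, $count));
}

$the_content = 'First sentence here. Second one follows! A third is dropped. So is this.';
echo first_sentences($the_content), "\n";
// prints: First sentence here. Second one follows!
```

This is a naive sentence splitter: abbreviations such as "e.g." or "Dr." followed by a space are treated as sentence ends, so feed descriptions with many abbreviations may be cut earlier than expected.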
