I'm looking for a way to automatically produce an abstract, basically the first few sentences or paragraphs of a blog entry, to display in a list of articles (which are written in Markdown). Currently, I'm doing something like this:

    def abstract(article, paras=3):
        return '\n'.join(article.split('\n')[0:paras])

to just grab the first few lines of text, but I'm not totally happy with the results.
What I'm really looking for is about 1/3 of a screenful of formatted text to display in the list of entries. With the approach above, though, the amount of text pulled varies wildly: abstracts of only a line or two are frequently mixed in with more ideally sized ones.
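To make the goal more concrete, here is a rough sketch of the kind of behaviour I have in mind. It is not something I've settled on. It collects whole paragraphs (separated by blank lines) until a character budget is used up. The name `abstract_by_length` and the `max_chars=600` default are placeholders I made up, and a raw character count is only a crude stand-in for "1/3 of a screenful" of rendered text.

```python
import re


def abstract_by_length(article, max_chars=600):
    """Return leading paragraphs of `article` totalling roughly max_chars characters."""
    # Treat runs of blank lines as paragraph separators (the usual Markdown convention).
    paragraphs = [p.strip() for p in re.split(r'\n\s*\n', article) if p.strip()]

    picked = []
    total = 0
    for para in paragraphs:
        # Always keep the first paragraph; after that, stop before exceeding the budget.
        if picked and total + len(para) > max_chars:
            break
        picked.append(para)
        total += len(para)

    return '\n\n'.join(picked)
```

This always returns at least the first paragraph, even if that paragraph alone is longer than the budget, and it counts Markdown markup characters toward the length, so it would probably still need tuning.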
Is there a library that's good at this kind of thing? If not, do you have any suggestions for improving the output?