feedparser

How to detect if a page is an RSS or ATOM feed

Hello, I'm currently building a new online feed reader in PHP. One of the features I'm working on is feed auto-discovery. If a user enters a website URL, the script will detect that it's not a feed and look for the real feed URL by parsing the HTML for the proper tag. The problem is, the way I'm currently detecting if the URL is a feed ...
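One common way to tell a feed from an HTML page is to look at the root element of the response body. The following is only a minimal sketch in Python (the question itself is about PHP), using the standard library rather than feedparser. The function name `detect_feed_type` is hypothetical, and the check assumes the body is well-formed XML whenever it is a feed.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def detect_feed_type(body):
    """Return 'rss', 'atom', 'rdf', or None for a response body.

    Hypothetical helper: it only inspects the root element.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not well-formed XML, which is the case for most HTML pages.
        return None
    if root.tag == "rss":
        return "rss"  # RSS 0.9x / 2.0
    if root.tag == "{%s}feed" % ATOM_NS:
        return "atom"  # Atom 1.0
    if root.tag == "{%s}RDF" % RDF_NS:
        return "rdf"  # RSS 1.0 (RDF-based)
    return None
```

A `None` result would be the cue to fall back to auto-discovery, that is, to scan the HTML for the feed link tag. Feeds that are not well-formed XML would be missed by this check, so a tolerant parser may still be needed.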

How to parse the "<media:group>" using feedparser?

The RSS file is shown below. I want to get the content of the media:group section. I checked the feedparser documentation, but it does not seem to mention this. How can I do it? Any help is appreciated. <?xml version="1.0" encoding="UTF-8"?> <rss xmlns:ymusic="http://music.yahoo.com/rss/1.0/ymusic/" xmlns:media="http://search.yahoo.com/mrss/" xmlns...

Ruby - Feedzirra and updates

Hi, I'm trying to get my head around Feedzirra here. I have it all set up, and can even get results and updates, but something odd is going on. I came up with the following code: def initialize(feed_url) @feed_url = feed_url @rssObject = Feedzirra::Feed.fetch_and_parse(@feed_url) end def update_from_feed_cont...

Parsing <geo:lat>, <geo:long> tags value using feedparser in python!

Hello, I am using feedparser to parse an XML file, but I couldn't parse the <geo:lat> and <geo:long> tags from that file. Does anyone have any idea how I can parse those tags using feedparser in Python? Thanks in advance! ...
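As a point of comparison outside feedparser, namespaced elements such as these can be read with the standard library by mapping the prefix to its namespace URI. This is a sketch only: it assumes the feed is RSS with `item` elements and that `geo:` is bound to the W3C Basic Geo namespace, neither of which is shown in the question. The function name `item_coordinates` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed binds "geo" to the W3C Basic Geo vocabulary.
NS = {"geo": "http://www.w3.org/2003/01/geo/wgs84_pos#"}


def item_coordinates(rss_text):
    """Return a list of (lat, long) float pairs, one per item that has both."""
    root = ET.fromstring(rss_text)
    coords = []
    for item in root.iter("item"):
        lat = item.findtext("geo:lat", namespaces=NS)
        lon = item.findtext("geo:long", namespaces=NS)
        if lat is not None and lon is not None:
            coords.append((float(lat), float(lon)))
    return coords
```

Items that lack either tag are skipped rather than reported with a placeholder value.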

Trace/BPT trap when running feedparser inside a Thread object

Hello, I am trying to run a Thread to parse a list of links using the universal feed parser, but when I start the thread I get a Trace/BPT trap. Here's the code I am using: class parseRssFiles(Thread): def __init__ (self,rssLinks): Thread.__init__(self) self.rssLinks = rssLinks def run(self): self.rssContents =...

Twitter feed appears to be both RSS 2.0 and Atom?

I'm parsing various site feeds, and putting together a small library to help me do it. Looking at the Atom RFC and RSS 2.0 specification, feeds from Twitter seem to be a combination. Twitter specifies an Atom namespace in an RSS 2.0 structure? GitHub uses Atom, whereas Flickr (offers multiple but the default 'Latest' feed from user pro...

Correctly parsing an ATOM feed

I have set up a Python script that uses feedparser to read a feed and parse it. However, I have recently come across a problem with the date parsing. The feed I am reading contains <modified>2010-05-05T24:17:54Z</modified> - which comes up in Python as a datetime object - 2010-05-06 00:17:54. Notice the discrepancy: the feed ent...

adding the feedparser module to python

I recently downloaded and installed feedparser for Python. When I try to run it, NetBeans fails on import with: ImportError: No module named feedparser. I restarted NetBeans, but it still fails. ...

feedparser fails during script run, but can't reproduce in interactive python console

It's failing with this when I run it in Eclipse or when I run my script in IPython: 'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128). I don't know why, but when I simply execute the feedparser.parse(url) statement using the same URL, no error is thrown. This is stumping me big time. The code is as simple ...

python feedparser with yahoo weather rss

I'm trying to use feedparser to get some data from Yahoo's weather RSS. It looks like feedparser strips out the yweather namespace data: http://weather.yahooapis.com/forecastrss?w=24260013&u=c <yweather:condition text="Fair" code="34" temp="23" date="Wed, 19 May 2010 5:55 pm EDT" /> looks like feedparser is completely ignoring...

Is there a more up to date RSS feed API for Python than Feedparser?

Seems it hasn't been updated in a while, and lacks support for things like sy:updateFrequency. ...

Error installing FeedZirra

Hi, I am new to Ruby on Rails. I am excited about Feed parsing but when I install FeedZirra I am getting this error. I use Windows 7 and Ruby 1.8.7. Please help. Thanks in advance. C:\Ruby187>gem sources -a http://gems.github.com http://gems.github.com added to sources C:\Ruby187>gem install pauldix-feedzirra Building native extensio...

feedparser - various errors

I need feedparser (see http://www.feedparser.org) for a project, and want to keep third-party modules in a separate folder. I did this by adding a folder to my Python path, and putting relevant modules there, among them feedparser. This first attempt to import feedparser resulted in >>> import feedparser Traceback (most recent call last...

question regarding universal feed parser

Hey guys, I ran into a problem grabbing the content from a couple of blog feeds I have crawled. I'm not sure what the reason is, but parsing one or two of the blogs with feedparser gives me this particular error: results = feedparser.parse(url) ent = [] for entry in results.entries: e = {} e['title'] = entry.title ...

What module can I use to parse RSS feeds in a Perl CGI script?

I am trying to find an RSS parser that can be used with a Perl CGI script. I found SimplePie, which is a really easy parser to use in PHP scripting. Unfortunately, it doesn't work with a Perl CGI script. Please let me know if there is anything that's as easy to use as SimplePie. I came across this one, RssDisplay, but I am not sure about th...

How does Feedjack fetch historical feeds

I am building a news aggregation website and I am looking for a way to fetch old feeds (of any particular website) into the system. While looking, I stumbled on Feedjack. It is said to handle what I need, so I started diving into the source code. (I don't want to plug it into my Django project directly.) All I see is this lin...

Python feedparser not using atom/WordPress namespace?

I'm trying to use feedparser (an excellent library) to parse WordPress export files, and a (minor) inconsistency between WordPress versions is causing me a huge headache. WordPress 2.x doesn't include atom:link tags in the XML output (without_atom_tags.xml). When parsed, namespaced elements are available without the prefix: >>> feed = ...

how to get final redirected url

I am using Google App Engine to fetch feed URLs, but a few of the URLs return a 301 redirect. I want to get the final URL that returns the result. I am using the Universal Feed Parser to parse the URLs. Is there any function that can give me the final URL? ...

How do I declare a timeout using urllib2 on Google App Engine?

I'm aware that urllib2 is available on Google App Engine as a wrapper of Urlfetch and, as you know, Universal Feedparser uses urllib2. Do you know any method to set a timeout on urllib2? Has the timeout parameter of urllib2 been ported to the Google App Engine version? I'm not interested in methods like: rssurldata = urlfetch(rssurl, deadline=...

Python/feedparser script won't display on CGI/ character coding

#!/usr/bin/python # -*- coding: utf-8 -*- import sys import os import cgi import string import feedparser count = 0 print "Content-Type: text/html\n\n" print """<PRE><B>WORK MAINTENANCE/B></PRE>""" d = feedparser.parse("http://www.hep.hr/ods/rss/radovi.aspx?dp=zagreb") for opis in d: try: print """<B>Place/Time:</B> %s...