Hello, I am writing an RSS parser. Is there any way to find out whether a given URL is an RSS or Atom feed, using Java?
A:
You could use ROME (I suggest that first) for parsing RSS and Atom Feeds. Alternatively, you'll have to use a SAX parser or create a DOM tree and do the following:
For RSS:
In RSS, check that the root element is rss and that it has a channel child element. The channel can contain zero or more item elements (I might be wrong).
Example:
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>RSS Title</title>
<description>This is an example of an RSS feed</description>
<link>http://www.someexamplerssdomain.com/main.html</link>
<lastBuildDate>Mon, 06 Sep 2010 00:01:00 +0000 </lastBuildDate>
<pubDate>Mon, 06 Sep 2009 16:45:00 +0000 </pubDate>
<item>
<title>Example entry</title>
<description>Here is some text containing an interesting description of the thing to be described.</description>
<link>http://www.wikipedia.org/</link>
<guid>unique string per item</guid>
<pubDate>Mon, 06 Sep 2009 16:45:00 +0000 </pubDate>
</item>
</channel>
</rss>
For Atom:
In Atom, check that the root element is feed. The feed can contain zero or more entry elements (I might be wrong).
Example:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Example Feed</title>
<subtitle>A subtitle.</subtitle>
<link href="http://example.org/feed/" rel="self" />
<link href="http://example.org/" />
<id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
<updated>2003-12-13T18:30:02Z</updated>
<author>
<name>John Doe</name>
<email>johndoe@example.com</email>
</author>
<entry>
<title>Atom-Powered Robots Run Amok</title>
<link href="http://example.org/2003/12/13/atom03" />
<link rel="alternate" type="text/html" href="http://example.org/2003/12/13/atom03.html"/>
<link rel="edit" href="http://example.org/2003/12/13/atom03/edit"/>
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2003-12-13T18:30:02Z</updated>
<summary>Some text.</summary>
</entry>
</feed>
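The checks described above can be sketched with the JDK's built-in DOM parser. This is only a rough illustration, not a complete validator: the class name FeedTypeDetector, the detect methods, and the "RSS" / "ATOM" / "UNKNOWN" return values are made up for this example, and it assumes RSS 2.0-style feeds (an rss root) and Atom 1.0 feeds (a feed root in the http://www.w3.org/2005/Atom namespace). Anything that fails to parse is simply reported as "UNKNOWN".

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: guesses the feed type from the XML root element.
class FeedTypeDetector {
    // Namespace used by Atom 1.0 feeds.
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Returns "RSS", "ATOM", or "UNKNOWN". Parse and I/O errors map to "UNKNOWN".
    static String detect(InputStream in) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to read the Atom namespace
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder().parse(in);
            Element root = doc.getDocumentElement();
            String name = root.getLocalName();

            // RSS: an rss root with at least one channel element beneath it.
            if ("rss".equals(name)
                    && root.getElementsByTagName("channel").getLength() > 0) {
                return "RSS";
            }
            // Atom: a feed root in the Atom namespace.
            if ("feed".equals(name) && ATOM_NS.equals(root.getNamespaceURI())) {
                return "ATOM";
            }
            return "UNKNOWN";
        } catch (Exception e) {
            // Not well-formed XML, or the stream could not be read.
            return "UNKNOWN";
        }
    }

    // Convenience overload for XML that is already in memory.
    static String detect(String xml) {
        return detect(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        String rss = "<rss version=\"2.0\"><channel><title>t</title></channel></rss>";
        String atom = "<feed xmlns=\"" + ATOM_NS + "\"><title>t</title></feed>";
        System.out.println(detect(rss));
        System.out.println(detect(atom));
    }
}
```

For a real URL, you could pass new java.net.URL(address).openStream() to detect(InputStream). Note that older formats such as RSS 1.0 use a different root element and are not covered by this sketch.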
PS: I don't know which RSS version or Atom version you want to implement, but follow their guidelines.
The Elite Gentleman
2010-10-18 16:42:04