views:

383

answers:

2

I tried to install the Yahoo BOSS mashup framework, but am having trouble running the examples provided. Examples 1, 2, 5, and 6 work, but 3 & 4 give Expat errors. Here is the output from ex3.py:

gpython examples/ex3.py
    examples/ex3.py:33: Warning: 'as' will become a reserved keyword in Python 2.6
Traceback (most recent call last):
  File "examples/ex3.py", line 27, in <module>
    digg = db.select(name="dg", udf=titlef, url="http://digg.com/rss_search?search=google+android&amp;area=dig&amp;type=both&amp;section=news")
  File "/usr/lib/python2.5/site-packages/yos/yql/db.py", line 214, in select
    tb = create(name, data=data, url=url, keep_standards_prefix=keep_standards_prefix)
  File "/usr/lib/python2.5/site-packages/yos/yql/db.py", line 201, in create
    return WebTable(name, d=rest.load(url), keep_standards_prefix=keep_standards_prefix)
  File "/usr/lib/python2.5/site-packages/yos/crawl/rest.py", line 38, in load
    return xml2dict.fromstring(dl)
  File "/usr/lib/python2.5/site-packages/yos/crawl/xml2dict.py", line 41, in fromstring
    t = ET.fromstring(s)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 963, in XML
    parser.feed(text)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 1245, in feed
    self._parser.Parse(data, 0)
    xml.parsers.expat.ExpatError: syntax error: line 1, column 0

It looks like both examples are failing when trying to query Digg.com. Here is the query that is constructed in ex3.py's code:

diggf = lambda r: {"title": r["title"]["value"], "diggs": int(r["diggCount"]["value"])}
digg = db.select(name="dg", udf=diggf, url="http://digg.com/rss_search?search=google+android&amp;area=dig&amp;type=both&amp;section=news")

Any help is appreciated.

Thanks!

A: 

I believe that must be an error in the example: it's getting a JSON result (indeed if you copy and paste that URL in your browser, you'll download a file names search.json which starts with

{"results":[{"profile_image_url":
"http://a3.twimg.com/profile_images/255524395/KEN_OMALLEY_REVISED_normal.jpg",
"created_at":"Mon, 14 Sep 2009 14:52:07 +0000","from_user":"twilightlords",

i.e. perfectly normal JSON; but then instead of parsing it with modules such as json or simplejson, it tries to parse it as XML -- and obviously this attempt fails.

I believe the fix (which probably needs to be brought to the attention of whoever maintains that code so they can incorporate it) is either to ask for XML instead of JSON output, OR to parse the resulting JSON with appropriate means instead of trying to look at it as XML (not sure how to best implement either change, as I'm not familiar with that code).

Alex Martelli
It looks like it at first glance, but this is actually Yahoo's example code failing BEFORE getting the results back. I took a better look at the examples they provided, and it looks like their search on Digg.com is failing. I am trying to hunt down why, but I am also unfamiliar with their code.
Wraith
+1  A: 

The problem is the digg search string. It should be "s=". Not "search="

Huu Nguyen