I have created a GAE app that parses RSS feeds using cElementTree. It works fine when I test it on my local installation of GAE, but after uploading the app, testing it raises a SyntaxError.
The error is:
Traceback (most recent call last):
  File "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 509, in __call__
    handler.post(*groups)
  File "/base/data/home/apps/palmfeedparser/1-6.339910418736930444/pipes.py", line 285, in post
    tree = ET.parse(urlopen(URL))
  File "", line 45, in parse
  File "", line 32, in parse
SyntaxError: no element found: line 14039, column 45
I did what Mr. Alex Martelli suggested, and it printed the following on my local machine:
[' <ac:tag><![CDATA[Mobilit\xc3\xa4t]]></ac:tag>\n', ' </ac:tags>\n', ' <ac:images>\n', ' <ac:image ac:number="1">\n', ' <ac:asset_url ac:type="app">http://cdn.downloads.example.com/public/1198/de/images/1/A/01.png</ac:asset_url>\n']
I uploaded the app and it printed out:
[' <ac:tag><![CDATA[Mobilit\xc3\xa4t]]></ac:tag>\n', ' </ac:tags>\n', ' <ac:images>\n', ' <ac:image ac:number="1">\n', ' <ac:asset_url ac:type="app">http://cdn.downloads.example.com/public/1198/de/images/1/A/01.png</ac:asset_url>\n']
These lines correspond to the following lines in the RSS feed I am reading:
<ac:tags>
<ac:tag><![CDATA[Mobilität]]></ac:tag>
</ac:tags>
<ac:images>
<ac:image ac:number="1">
<ac:asset_url ac:type="app">http://cdn.downloads.example.com/public/1198/de/images/1/A/01.png</ac:asset_url>
I notice that there is a newline before the closing ac:tags tag. Line 14039 corresponds to this newline.
Update:
I use urllib.urlopen to access the URL of the feed. I displayed the content it fetches both locally and on GAE proper. Locally, nothing is truncated. On the uploaded app, however, the feed, which has 15289 lines, is truncated to 14185 lines.
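To illustrate why a truncated download would surface as a parse error, here is a minimal sketch. It is not my actual code: `parse_feed` and the tiny sample feed are hypothetical, and it uses `xml.etree.ElementTree` (cElementTree on Python 2 exposes the same `fromstring` call). It assumes the server sends a Content-Length that can be compared with the number of bytes actually received.

```python
import xml.etree.ElementTree as ET


def parse_feed(data, expected_length=None):
    """Parse feed bytes, reporting truncation explicitly.

    expected_length is the Content-Length value, if the server sent one
    (hypothetical helper, for illustration only).
    """
    if expected_length is not None and len(data) < expected_length:
        raise ValueError(
            "feed truncated: got %d of %d bytes" % (len(data), expected_length)
        )
    return ET.fromstring(data)


# A stand-in for the real feed, plus a copy cut off before the closing tags.
complete = b"<rss><channel><item>one</item></channel></rss>"
truncated = complete[:20]  # b"<rss><channel><item>"

root = parse_feed(complete, expected_length=len(complete))
print(root.tag)

try:
    ET.fromstring(truncated)
except SyntaxError as exc:  # ParseError is a subclass of SyntaxError
    print("parse failed:", exc)

try:
    parse_feed(truncated, expected_length=len(complete))
except ValueError as exc:
    print("length check failed:", exc)
```

Checking the length before parsing at least turns the confusing SyntaxError into a message that points at the download rather than at the XML.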
What method can I use to fetch this huge feed? Would urlfetch work?
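For reference, this is roughly what I imagine a urlfetch-based version would look like. It is only a sketch under assumptions: `fetch_feed` is a hypothetical helper, the `fetch` parameter exists purely so the function can be tried outside App Engine, and I do not know whether the URL Fetch service's response size limits would still cut off a feed this large. It relies on `urlfetch.fetch` accepting `allow_truncated` and `deadline`, and on the result exposing `status_code`, `content` and `content_was_truncated`.

```python
import xml.etree.ElementTree as ET


def fetch_feed(url, fetch=None, deadline=10):
    """Fetch a feed via App Engine's URL Fetch service and parse it.

    `fetch` is injectable for testing; by default it is
    google.appengine.api.urlfetch.fetch.
    """
    if fetch is None:
        # Only importable inside the App Engine runtime or SDK.
        from google.appengine.api import urlfetch
        fetch = urlfetch.fetch

    # allow_truncated=False asks the service to raise an error rather than
    # silently return a shortened body.
    result = fetch(url, allow_truncated=False, deadline=deadline)
    if result.status_code != 200:
        raise IOError("unexpected HTTP status %d" % result.status_code)
    if getattr(result, "content_was_truncated", False):
        raise IOError("response body was truncated")
    return ET.fromstring(result.content)
```

Inside a handler this would be called as `tree_root = fetch_feed(URL)`, where `URL` is the feed address already used in pipes.py.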
Thanks in advance for your help!
A_iyer