Hi, I have files like the following to parse with Python (they come from scraping):
some HTML and JS here...
SomeValue =
{
'calendar': [
{ 's0Date': new Date(2010, 9, 12),
'values': [
{ 's1Date': new Date(2010, 9, 17), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 18), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 19), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 20), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 21), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 22), 'price': 9900 },
{ 's1Date': new Date(2010, 9, 23), 'price': 9900 }]
},
'data': [{
index: 0,
serviceClass: 'Economy',
prices: [9900, 320.43, 253.27],
eTicketing: true,
segments: [{
indexSegment: 0,
stopsCount: 1,
flights: [{
index: 0,
... and a lot of nested data and again HTML and JS...
I need to parse it and extract all the JSON data. Currently I use a regex, strip all '\n' and '\t' characters, and then call eval() to convert the result to a Python dictionary. I really don't like this solution, eval() especially. I looked at BeautifulSoup and lxml, but didn't find anything in them that helps parse this.
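To make the problem concrete, here is a minimal sketch of what such a regex and eval() approach can look like. The names `extract_some_value` and `js_date` and the exact regexes are hypothetical, chosen for illustration, and the sketch assumes the object literal follows the name `SomeValue` and contains no braces, bare-looking keys, or the words true/false/null inside string values.

```python
import datetime
import re


def js_date(year, month, day):
    # JavaScript's Date constructor uses 0-based months; Python's date is 1-based.
    return datetime.date(year, month + 1, day)


def extract_some_value(page):
    # Find the first "{" after the variable name (assumed to be SomeValue).
    start = page.index("{", page.index("SomeValue"))

    # Walk forward counting braces to find the matching "}".
    # Assumption: no braces occur inside string values.
    depth = 0
    end = None
    for i in range(start, len(page)):
        ch = page[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                end = i + 1
                break
    if end is None:
        raise ValueError("unbalanced braces after SomeValue")

    literal = page[start:end].replace("\n", " ").replace("\t", " ")

    # "new Date(y, m, d)" becomes a call to the Date name supplied to eval().
    literal = re.sub(r"\bnew\s+Date\(", "Date(", literal)
    # Quote bare keys such as index: or serviceClass:
    literal = re.sub(r"([{,]\s*)([A-Za-z_]\w*)\s*:", r"\1'\2':", literal)
    # Map JavaScript literals to their Python equivalents.
    for js, py in (("true", "True"), ("false", "False"), ("null", "None")):
        literal = re.sub(r"\b%s\b" % js, py, literal)

    # eval() with emptied builtins; this is still not safe on untrusted input.
    return eval(literal, {"__builtins__": {}}, {"Date": js_date})
```

Even with the builtins emptied, eval() still executes arbitrary expressions from the scraped page, and the regexes break as soon as a string value contains a brace or something that looks like a key, which is why I would like to replace this.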
Can you suggest something better than regex and eval() for this task?
Page example: http://codepaste.ru/3830/