parsing

How can I parse this particular JSON data from LastFM using jQuery?

I'm attempting to parse a JSON feed using the LastFM API but there are certain elements returned in the JSON array that are prefixed with a # that I don't know how to reference. The feed URL is here and it can be seen visualised here. My jQuery code so far looks like this: $.getJSON('http://ws.audioscrobbler.com/2.0/?method=geo.geteve...
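
As a minimal sketch of the idea: a key that starts with # cannot be reached with dot syntax, but it is an ordinary string key, so it can be looked up by name (in JavaScript that means bracket notation such as obj['#text']). The payload below is hypothetical and only imitates the general shape of such a feed; the real field names may differ.

```python
import json

# Hypothetical payload: the structure and field names are assumptions,
# chosen only to show a key that begins with "#".
raw = (
    '{"events": {"event": [{"title": "Gig", "image": '
    '[{"#text": "http://example.com/a.png", "size": "small"}]}]}}'
)

def image_urls(payload):
    """Collect every "#text" value found under events -> event -> image."""
    data = json.loads(payload)
    urls = []
    for event in data["events"]["event"]:
        for image in event.get("image", []):
            # "#text" is just a string key, so index it by name.
            urls.append(image["#text"])
    return urls

print(image_urls(raw))
```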

A way to parse an S3 data array

If, for example, I had an array like this: array(34) { ["ahostel.lt/img/background.png"]=> array(4) { ["name"]=> string(29) "ahostel.lt/img/background.png" ["time"]=> int(1277819688) ["size"]=> int(36811) ["hash"]=> string(32) "2600e98e10aba543fb2637b701dec4f3" } ["ahostel.lt/img/body-navigation-bg.p...

What technology for large scale scraping/parsing?

We're designing a large scale web scraping/parsing project. Basically, the script needs to go through a list of web pages, extract the contents of a particular tag, and store it in a database. What language would you recommend for doing this on a large scale (tens of millions of pages)? We're using MongoDB for the database, so anythi...

Get source of a website which requires post login

Hello, I want to write a script to get the source of a website which requires a POST login. I need a shell script to do this. I want to parse some information. Any idea which language is the best choice for handling the HTTP request and maybe cookies? Thank you. ...

Get div element contents in C#

I have a moderately well-formatted HTML document. It is not XHTML, so it's not valid XML. Given an offset of the opening tag, I need to obtain the contents of this tag, considering that it can have multiple nested tags inside of it. What is the easiest way to solve this problem with a minimum amount of C# code that doesn't involve using non-...

How do I parse the XML document in the callback?

I get back the responseXml as a JavaScript XMLDocument object. How do I parse it to just return the body? Here is my code snippet: goog.net.XhrIo.send("/blogs/create?authenticity_token="+ goog.string.urlEncode(authtoken), function(e) { var xhr = /** @type {goog.net.XhrIo} */ (e.target); var responseXml = xhr...

Building Lisp/Scheme-like parse tree with flex/bison

Hello, I was trying to parse simple Lisp/Scheme-like code, e.g. (func a (b c d) ), and build a tree from it. I could do the parsing in C without using bison (i.e., using only flex to return tokens and building the tree with recursion). But with a bison grammar, I am not sure where to add the code to build the list (i.e., which rule to...
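
To make the target tree concrete, here is a small sketch of the recursive approach the question describes, written in Python rather than flex/bison. It is not a bison grammar; it only shows the nested-list shape a list rule's actions would need to produce. The names tokenize and parse are illustrative.

```python
import re

# A token is either a single parenthesis or a run of non-space,
# non-parenthesis characters (an atom).
TOKEN_RE = re.compile(r"[()]|[^\s()]+")

def tokenize(source):
    """Split the source text into a flat list of tokens."""
    return TOKEN_RE.findall(source)

def parse(tokens, pos=0):
    """Parse one expression starting at tokens[pos].

    Returns (tree, next_pos), where tree is an atom (str) or a list of trees.
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        items = []
        pos += 1
        # Collect sub-expressions until the matching ")".
        while pos < len(tokens) and tokens[pos] != ")":
            node, pos = parse(tokens, pos)
            items.append(node)
        if pos >= len(tokens):
            raise SyntaxError("missing ')'")
        return items, pos + 1  # skip the ")"
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return token, pos + 1

tree, _ = parse(tokenize("(func a (b c d) )"))
print(tree)
```

Each opening parenthesis starts a new list and each closing one finishes it, which is the same work a grammar rule for a parenthesised list would do in its action.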

How to change date's format from milliseconds

Hi, I am receiving the last modification date of a file using the code below: xmlUrl = new URL("http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html"); URLConnection urlconn = xmlUrl.openConnection(); urlDate = new Date(urlconn.getLastModified()); As a result I am getting the date in the format below: Tue Dec 18 05:11:33 Asia/Ka...
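
The general idea, sketched in Python rather than Java: a last-modified value expressed in milliseconds since the Unix epoch can be turned into a date object and then rendered with an explicit pattern instead of the default string form. The function name format_millis and the chosen pattern are assumptions made for illustration.

```python
from datetime import datetime, timezone

def format_millis(millis, pattern="%Y-%m-%d %H:%M:%S"):
    """Format a millisecond epoch timestamp as a UTC date string."""
    # Epoch timestamps are in seconds here, so divide milliseconds by 1000.
    moment = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return moment.strftime(pattern)

print(format_millis(86_400_000))
```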

How does a parser (for example, HTML) work?

For argument's sake, let's assume an HTML parser. I've read that it tokenizes everything first, and then parses it. What does tokenize mean? Does the parser read each character in turn, building up a multidimensional array to store the structure? For example, does it read a < and then begin to capture the element, and then once it meets ...
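
A toy illustration of the two phases, assuming a heavily simplified HTML subset (no attribute handling, comments, void elements, or error recovery, so this is far from what a real HTML parser does): tokenizing turns the character stream into a flat list of open-tag, close-tag, and text tokens, and parsing then builds a tree from that list using a stack.

```python
import re

# Either a tag such as <p> or </p> (attributes are skipped), or a run of text.
TOKEN_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")

def tokenize(html):
    """Phase 1: turn characters into a flat list of (kind, value) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(html):
        closing, name, text = match.groups()
        if name:
            tokens.append(("close" if closing else "open", name.lower()))
        elif text.strip():
            tokens.append(("text", text.strip()))
    return tokens

def build_tree(tokens):
    """Phase 2: turn the flat token list into a nested tree using a stack."""
    root = {"tag": "#root", "children": []}
    stack = [root]
    for kind, value in tokens:
        if kind == "open":
            node = {"tag": value, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)  # later tokens become children of this node
        elif kind == "close":
            if len(stack) > 1:
                stack.pop()  # finished with the current element
        else:
            stack[-1]["children"].append(value)
    return root

tokens = tokenize("<p>Hi <b>there</b></p>")
tree = build_tree(tokens)
print(tokens)
```

The structure is kept as a tree of nodes rather than a multidimensional array; the stack records which element is currently open.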

Parse content like XML, with jQuery

Hi everybody, I have this content from one input value: var xml_url = $("input").val(); alert(xml_url); Output: <trans> <result> <item1>1</item1> <item2>content</item2> <item3>NA</item3> <item4>0</item1> </result> </trans> The structure is like that of an XML file. I want to get this data. I have th...

What are the pros and cons of the leading Java HTML parsers?

Searching SO and Google, I've found that there are a few Java HTML parsers which are consistently recommended by various parties. Unfortunately it's hard to find any information on the strengths and weaknesses of the various libraries. I'm hoping that some people have spent some time comparing these libraries, and can share what they've learn...

Regex - PHP :: Regex pattern to parse links and images from html page

Possible Duplicates: How do I extract HTML content using Regex in PHP; RegEx match open tags except XHTML self-contained tags. Hi, I don't know regex. I'm working with PHP (possibly using Zend Framework). I need to get images and links from an HTML page. I think the best way to do it is with regex, regex pattern that insert images and ...

Lexical analysis or a series of regular expressions to parse unstructured text into structured form

I am trying to write some code that will function like Google Calendar's quick add feature. You know, the one where you can input any of the following: 1) 24th sep 2010 , Johns Birthday 2) John's Birthday , 24/9/10 3) 24 September 2010 , Birthday of John Doe 4) 24-9-2010 : John Does Birthday 5) John Does Birthday 24th of September 2010 ...

How does one parse and convert AutoCAD MText entity to raw text?

I would like to parse AutoCAD's MText entity and extract the raw text. I see a pattern in the way the text is formatted. If this has already been solved, then I would not need to reinvent the wheel. I have searched online, but have not found sufficient information. I am searching for any links or references on this subject. Edit: To f...

How to get parent node of element using tinyxml

Is there a way to get a parent node from a TiXmlElement? For example... TiXmlElement *parent = child->ParentElement( "someName" ); If you can't do this in tinyxml, are there any other xml parsers that allow this? ...

How to read the content of an XML tag

How to read the content of this XML? <Locations> <Location>4</Location> </Locations> I can parse it, and it works, but I can't access the value of Location (4). I'm using: ZoneCheck *azone; NSString *url1 = [NSString stringWithFormat:@"%@", azone.location]; ...

Python 2.6: parallel parsing with urllib2

Hi there, I'm currently retrieving and parsing pages from a website using urllib2. However, there are many of them (more than 1000), and processing them sequentially is painfully slow. I was hoping there was a way to retrieve and parse pages in a parallel fashion. If that's a good idea, is it possible, and how do I do it? Also, what...
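
One common way to overlap the waiting time of many downloads is a thread pool. The sketch below assumes Python 3 (concurrent.futures and urllib.request) rather than the Python 2.6 and urllib2 setup in the question. The names fetch and fetch_all and the worker count are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url, timeout=10):
    """Download one page and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()

def fetch_all(urls, fetcher=fetch, max_workers=8):
    """Run fetcher over all URLs in a thread pool; return {url: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetcher, urls)))

# Usage with a stand-in fetcher, so no network access is needed:
print(fetch_all(["a", "b"], fetcher=lambda url: url.upper()))
```

Threads suit this workload because most of the time is spent waiting on the network. If parsing itself dominates, a process pool may be the better fit.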

(JavaScript HTML parser) Can HTML be parsed with JavaScript without server-side languages?

Hi, can HTML be parsed with JavaScript without server-side languages? My webpage should extract all images from the URL that the user enters. Can I do it with JavaScript only, and if so, are there existing library functions that do this? Thanks, ...

How to differentiate XML tags that belong to layer1 from those that belong to layer2 if they use the same tag name?

In my XML file, I have the same tag name used in different places (layer1 and layer2). How can I differentiate tags named "<tile gid ="int">" from layer1 and layer2? I need to process them differently depending on whether they belong to layer1 or layer2... Here's a small sample of my parser and my XML file: // ================= // xml parser samp...

C# - Parsing partial dates in different formats

Hi there. For use in a search method, I would like to parse partial dates in different formats. I want to enable the user to search by day and month, by month and year, or by day, month and year. The more problematic part is that I want to do this in all possible date formats, depending on the country the user is in. It's an ASP.NET...