parsing

Simple custom tag parsing in Java

Hi, I have a client who wants to insert videos, images, form elements, etc. into his text while also keeping the HTML elements that TinyMCE generates. One thing that came to mind is to create special tags that let him do this, and then use a transformation engine that converts the input into the output. So for a video tag, it could inject t...

Is the parsing part of SqlConnectionStringBuilder available in the FCL?

I'm using a SqlConnectionStringBuilder instance to parse a connection string, but I don't want the key names checked for validity. By design, the builder throws an exception if it encounters an unsupported key in the string. For example, the exception for the unknown key "whatever" is: Keyword not supported: 'whatever'. What I wa...

Cannot parse double

I'm trying to parse values like $15,270.75 with the following code: double cost = 0; double.TryParse("$15,270.75", NumberStyles.AllowThousands | NumberStyles.AllowCurrencySymbol | NumberStyles.AllowDecimalPoint, CultureInfo.InvariantCulture, out cost); but without success ...

Does C# have a library for parsing multi-level cascading JSON?

Is there a library (C# preferred) to resolve what I would call multi-level cascading JSON? Here is an example of what I mean: (Pseudocode/C#) var json1 = @"{ ""firstName"": ""John"", ""lastName"": ""Smith"" }"; var json2 = @"{ ""firstName"": ""Albert"" }"; var json3 = @"{ ""phone"": ""12345"" }"; var casc...

C# Determine if Absolute or Relative URL

I have a relative or absolute url in a string. I first need to know whether it is absolute or relative. How do I do this? I then want to determine if the domain of the url is in an allow list. Here is my allow list, as an example: string[] Allowed = { "google.com", "yahoo.com", "espn.com" } Once I know whether its relativ...
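The two checks described above (absolute versus relative, then host against an allow list) can be sketched as follows. This is a minimal illustration in Python rather than C#, using the standard `urllib.parse` module; the helper names `is_absolute` and `host_allowed` are hypothetical, and treating subdomains such as `www.google.com` as allowed is an assumption the question does not state.

```python
from urllib.parse import urlsplit

# Allow list taken from the question's example.
ALLOWED = {"google.com", "yahoo.com", "espn.com"}


def is_absolute(url: str) -> bool:
    """Treat a URL as absolute only if it has both a scheme and a host.

    Protocol-relative URLs such as //google.com/x have no scheme,
    so this sketch reports them as not absolute.
    """
    parts = urlsplit(url)
    return bool(parts.scheme and parts.netloc)


def host_allowed(url: str) -> bool:
    """Check the URL's host against ALLOWED (assumption: subdomains count)."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    host = host.lower()
    # Compare on a dot boundary so "evilgoogle.com" does not match "google.com".
    return any(host == domain or host.endswith("." + domain) for domain in ALLOWED)
```

Matching on a dot boundary rather than with a plain substring test is the main design choice here: a substring check would also accept look-alike hosts.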

Unknown host exception while parsing an XML file

When I try to parse an XML file, I get the following exception: java.net.UnknownHostException: hibernate.sourceforge.net at java.net.AbstractPlainSocketImpl.connect(Unknown Source) at java.net.PlainSocketImpl.connect(Unknown Source) at java.net.Socket.connect(Unknown Source) at java.net.Socket.connect(Unknown Sourc...

Joda Time problem with Daylight Saving change and date time parsing

I have the following problem using Joda Time to parse and produce dates and times around the Daylight Saving change. Here is an example (please note that March 30th, 2008 is the Daylight Saving change in Italy): DateTimeFormatter dtf = DateTimeFormat.forPattern("dd/MM/yyyy HH:mm:ss"); DateTime x = dtf.parseDateTime("30/03/2008 03:00:00"); int...

Best way to parse a file in Objective-C

Hello everyone. I am trying to parse an Apache-like config file using Objective-C. Where is the best place to start? I haven't done a lot of file reading and writing on this platform. Thanks! ...

What are the parsing rules for expressions in C?

How can I understand the parsing of expressions like a = b+++++b---c--; in C? I just made up the expression above, and yes, I can check the results using any compiler, but what I want to know is the ground rule that I should know to understand the parsing of such expressions in C. ...
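The rule usually cited for expressions like this is "maximal munch": before any parsing happens, the C lexer repeatedly takes the longest run of characters that can form a token. The sketch below is a toy tokenizer in Python (not a C lexer) that covers only identifiers, integers, and the handful of operators in the example; the name `tokenize` is hypothetical.

```python
import re

# Two-character operators are listed before single characters, so the
# alternation picks "++" over "+" and "--" over "-" (maximal munch).
TOKEN_RE = re.compile(r"\s*(\+\+|--|[A-Za-z_]\w*|\d+|[-+=;])")


def tokenize(source: str) -> list[str]:
    """Split a tiny subset of C into tokens, longest match first."""
    source = source.strip()
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise ValueError(f"unexpected character at offset {position}")
        tokens.append(match.group(1))
        position = match.end()
    return tokens


print(tokenize("a = b+++++b---c--;"))
```

Under this rule `b+++++b` is split as `b ++ ++ + b`, never as `b++ + ++b`, which is why a compiler rejects the example: the result of `b++` is not a modifiable lvalue, so the second `++` cannot apply to it.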

Parsing big string (HTML code)

Hello! I'm looking to parse some information in my application. Let's say we have somewhere in that string: <tr class="tablelist_bg1"> <td>Beja</td> <td class="text_center">---</td> <td class="text_center">19.1</td> <td class="text_center">10.8</td> <td class="text_center">NW</td> <td class="text_center">50.9</td> <td class="text...

Problem when parsing PDF files

I use htmlparser 1.6 to parse web sites. The problem is that when I parse PDF files on web sites, I get strange characters in the output file, like ØÇÁÖÜ/:?ÖQØ?WÕWÏ This is a fragment of my code: try { parser = new Parser (); if (1 < args.length) filter = new TagNameFilter (args[1]); else { filter = n...

vb.NET WebRequest to read aspx page to string, access denied?

I'm trying to make an executable in VS2008 that will read a webpage source code using a vb.NET function into a string variable. The problem is that the page is not *.html but rather *.aspx. I need a way to execute the aspx and get the displayed html into a string. The page I want to read is any page of this type: http://www.realtor.ca/...

Code parsing C#

Dear all, I am researching ways, tools and techniques to parse code files in order to support syntax highlighting and IntelliSense in an editor written in C#. Does anyone have any ideas, patterns and practices, tools, or techniques for that? EDIT: A nice source of info for anyone interested: Parsing beyond Context-free grammars ISBN 978-3-642...

Read multiline text with values separated by whitespaces

I have the following test file: Jon Smith 1980-01-01 Matt Walker 1990-05-12 What is the best way to parse each line of this file, creating an object with (name, surname, birthdate)? Of course this is just a sample; the real file has many records. ...
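One possible approach is to split each line on whitespace and build a small record type. The sketch below is in Python and assumes one record per line with exactly three fields and ISO dates (YYYY-MM-DD), as in the sample; `Person` and `parse_people` are hypothetical names, and names containing spaces would need a different format.

```python
from datetime import date
from typing import NamedTuple


class Person(NamedTuple):
    name: str
    surname: str
    birthdate: date


def parse_people(text: str) -> list[Person]:
    """Parse 'name surname YYYY-MM-DD' records, one per line."""
    people = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # splits on any run of whitespace
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 fields, got {len(fields)}")
        name, surname, raw_date = fields
        people.append(Person(name, surname, date.fromisoformat(raw_date)))
    return people
```

For a file on disk, the same function can be fed `open(path).read()`; for very large files, iterating over the file object line by line avoids loading everything at once.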

strcspn() stopping at a period

Hi. I'm writing a function that should parse a string containing a description of a dice roll, for instance "2*1d8+2". I extract the four values OK when they are integers, but I want to be able to use floats as well for the multiplier and the addition at the end. Things get nasty when I try to parse such a string: "1.8*1d8+2.5". I have ...
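As an alternative to scanning with character sets, the whole description can be matched with one regular expression. This is a sketch in Python, not the C code the question is about; it assumes all four parts are always present in the order multiplier, dice count, sides, addition, that only the multiplier and the addition may be floats, and the name `parse_roll` is hypothetical.

```python
import re

# multiplier * count d sides + addition, e.g. "1.8*1d8+2.5"
ROLL_RE = re.compile(
    r"(\d+(?:\.\d+)?)"   # multiplier: integer or decimal
    r"\*(\d+)"           # number of dice
    r"d(\d+)"            # sides per die
    r"\+(\d+(?:\.\d+)?)" # addition: integer or decimal
)


def parse_roll(description: str) -> tuple[float, int, int, float]:
    """Return (multiplier, count, sides, addition) for a dice description."""
    match = ROLL_RE.fullmatch(description.strip())
    if match is None:
        raise ValueError(f"not a valid dice description: {description!r}")
    multiplier, count, sides, addition = match.groups()
    return float(multiplier), int(count), int(sides), float(addition)
```

Because the period is part of the numeric sub-pattern, it never acts as a delimiter, so the float forms parse the same way as the integer forms.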

ANTLR rule to consume fixed number of characters

I am trying to write an ANTLR grammar for the PHP serialize() format, and everything seems to work fine, except for strings. The problem is that the format of serialized strings is: s:6:"length"; In terms of regexes, a rule like s:(\d+):".{\1}"; would describe this format if only backreferences were allowed in the "number of matches"...

Handling Empty Nodes Using Java DOM

Hello everyone, I have a question concerning XML, Java's use of DOM, and empty nodes. I am currently working on a project wherein I take an XML descriptor file of abstract machines (for text parsing) and parse a series of input strings with them. The actual building and interpretation of these abstract machines is all done and working...

Loading a page that sometimes 'hangs' via PHP (Curl)

Hi, I'm trying to get information from a site by parsing/scraping it via PHP & Curl. But sometimes the current page doesn't finish loading, so the script runs without anything happening. It's a simple script like this... ... curl_setopt($curl, CURLOPT_URL, $url); $page = curl_exec($curl); ... Is there a way to simply retry the loa...

Medical information extraction using Python

Hello there, I am a nurse and I know Python, but I am not an expert; I have just used it to process DNA sequences. We have hospital records written in natural language, and I am supposed to insert this data into a database or CSV file, but there are more than 5000 lines and this can be so hard. All the data are written in a consistent format, let me s...

Parsing PDF files hosted in web servers

I have used iText to parse PDF files. It works well on local files, but I want to parse PDF files that are hosted on web servers, like this one: "http://protege.stanford.edu/publications/ontology_development/ontology101.pdf", and I don't know how. Could you please tell me how to do this using iText or other libraries? Thanks ...