parsing

HTML error checker.

I'm using ASP.NET. Assuming I allow users to post messages on my site with HTML tags, how do I ensure they have properly closed all the tags? Is there an HTML tag checker available that tries to parse the tags and reports any errors? Maybe something like the one Blogger has. ...

When should I use a parser?

I have had problems using regexes to divide code into functional components. They can break, or they can take a long time to finish. The experience raises a question: "When should I use a parser?" ...

How do I parse text into lists in Java?

I have the following file saved as a .txt: I Did It Your Way, 11.95 The History of Scotland, 14.50 Learn Calculus in One Day, 29.95 Feel the Stress, 18.50 Great Poems, 12.95 Europe on a Shoestring, 10.95 The Life of Mozart, 14.50 I need to display the titles of the books and the prices in different JLists in Java. How do I do that? A...
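
The splitting step this question asks about can be sketched independently of the GUI. The following is a minimal illustration in Python rather than the asker's Java, assuming each record is one line of the form `title, price` and that the price follows the last comma; the function name `split_books` is hypothetical.

```python
def split_books(lines):
    """Split 'title, price' lines into two parallel lists."""
    titles, prices = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the LAST comma so a title may itself contain commas.
        title, price = line.rsplit(",", 1)
        titles.append(title.strip())
        prices.append(float(price))
    return titles, prices


# A few records taken from the question's sample file.
sample = [
    "I Did It Your Way, 11.95",
    "The History of Scotland, 14.50",
    "Great Poems, 12.95",
]
titles, prices = split_books(sample)
```

Each resulting list could then back one list widget; splitting on the last comma is a design choice that keeps titles containing commas intact.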

Parsing a web page without broken strings.

I'm trying to parse some strings from a web page but I keep getting strings that happen to be broken up with no way to check if the string is complete or not. At the moment, I have a buffer of 1024 bytes that I'm receiving parts of the page with. What should I do to make sure I get the full string, preferably without an overly large buff...

How to Parse Lines With Differing Number of Fields in C++

I have data that looks like this: AAA 0.3 1.00 foo chr1,100 AAC 0.1 2.00 bar chr2,33 AAT 3.3 2.11 chr3,45 AAG 1.3 3.11 qux chr1,88 ACA 2.3 1.33 chr8,13 ACT 2.3 7.00 bux chr5,122 Note that the lines above are tab separated. Moreover, a line may contain either 5 fields or 4 fields. What I want to do is to capture 4th fields in...

Is Perl or C faster at parsing?

I have a few very large log files, and I need to parse them. Ease of implementation obviously points me to a Perl and regex combo (in which I am still a novice). But what about speed? Will it be faster to implement it in C? Each log file is on the order of 2 GB. ...

.NET HTTP parser

I am writing an application to sniff some HTTP traffic. I am using WinPcap to access the TCP/IP packets. Is there a library that will help me parse the HTTP messages? I have implemented a basic parser myself, but I would like something more mature: I keep running into new variations (chunked messages, gzip compression, etc.). The .NET fr...

How to extract data from a Cocoa iPhone SAX XML parsing routine

I'm trying to read in and parse an XML document in an iPhone app. I begin parsing and then use the override method: static void startElementSAX(void *ctx, const xmlChar *localname, const xmlChar *prefix, const xmlChar *URI, int nb_namespaces, const xmlChar **namespaces, int nb_attributes, int nb_defaulted, c...

Reading an input file in C++

I would like to read an input file in C++, for which the structure (or lack thereof) would be something like a series of lines with text = number, such as input1 = 10 input2 = 4 set1 = 1.2 set2 = 1.e3 I want to get the number out of the line, and throw the rest away. Numbers can be either integers or doubles, but I know when they are one o...
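
As a rough sketch of the extraction described here, written in Python rather than the asker's C++: the snippet assumes every line has the form `text = number` with a single `=` before the value, and the helper name `parse_value` is hypothetical.

```python
def parse_value(line):
    """Return the number to the right of '=' as int if possible, else float."""
    # partition() splits at the first '='; the text before it is discarded.
    _, _, rhs = line.partition("=")
    text = rhs.strip()
    try:
        return int(text)
    except ValueError:
        # Not a plain integer, so try a floating-point form such as 1.2 or 1.e3.
        return float(text)


values = [parse_value(s) for s in ("input1 = 10", "set1 = 1.2", "set2 = 1.e3")]
```

A line with no `=` or a non-numeric value raises `ValueError` here; whether to skip or report such lines is left open.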

Parse and add url from clipboard

I need a JavaScript bookmark to take the URL I have in the clipboard, parse out the 2 numbers, create a new URL, and add a link to the top of the page that, when clicked, adds the URL to my bookmark menu. Say I have URLs like these: http://www.website.com/frontpageeditor.jhtml?sectionID=2844&poolID=6276 javascript:getPoolPageUrl(9800...

How to generate random strings that match a given regexp?

Duplicate: Random string that matches a regexp No, it isn't. I'm looking for an easy and universal method, one that I could actually implement. That's far more difficult than randomly generating passwords. I want to create an application that takes a regular expression, and shows 10 randomly generated strings that match that exp...

Java String contains only...

Hi everyone, I'm new to Java and I'm trying to achieve something pretty simple, but I am not allowed to use regex, which is my favorite tool for that type of task. Basically I need to make sure a string contains only letters, digits, spaces and dashes. I found the class org.apache.commons.lang.StringUtils and the almost adequate meth...

Java: Traversed Tree to Tree

What is the most efficient way to solve this problem: I've traversed an XML file and created the following set of linked (String) lists: a > b > c a > b > d a > f > [i] and now I'm trying to rebuild the XML into its original structure: <a> <b> <c/><d/> </b> <f>i</f> </a> Any help would really be appreciated! ...

ParseStatementList not working on valid SQL statement

I'm trying to use the TSql100Parser.ParseStatementList method to programmatically parse SQL statements and pull out object names. This is from the Microsoft.Data.Schema.ScriptDom.Sql namespace. Here's the code: string sql = "CREATE VIEW testView AS SELECT * from testTable"; var parser = new TSql100Parser(false); StatementList parsedSta...

Using VBA to parse text in an MS Word document

Hi, I was hoping someone could help with an MS Word macro. Basically, I have an MS Word document which lists several text files and specific pages of interest in each file. The file format is similar to: textdocument1.txt P. 6, 12 - issue1 textdocument2.txt P. 5 - issue1 P. ...

What is a good JavaScript RDFa parser implementation?

I am looking to implement client-side RDFa-based formatting for a web application. This would be similar to Mark Birbeck's ubiquity-rdfa project. Mark's project looks fantastic but it has at least two drawbacks: It is slow. Adding RDFa formatting to a simple page causes a noticeable delay in page loading. It is complex. The ubiqu...

Integer representation for day of the week

I would like to convert a date object to its integer representation for the day of the week in C#. Right now, I am parsing an XML file in order to retrieve the date and storing that info in a string. It is in the following format: "2008-12-31T00:00:00.0000000+01:00" How can I take this and convert it into a number between 1 and 7 for the day o...
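
The conversion asked about here can be illustrated with a short Python sketch (not the asker's C#). It assumes the calendar date as written in the string is the one of interest, so the time and UTC offset are ignored, and it assumes Monday = 1 through Sunday = 7; the question does not say which numbering is wanted. The helper name `day_number` is hypothetical.

```python
from datetime import date


def day_number(timestamp):
    """Map an ISO-style timestamp string to 1..7 (assumed Monday=1 .. Sunday=7)."""
    # The first 10 characters are the 'YYYY-MM-DD' date portion.
    day = date.fromisoformat(timestamp[:10])
    return day.isoweekday()


n = day_number("2008-12-31T00:00:00.0000000+01:00")  # a Wednesday, so 3
```

For a Sunday = 1 numbering instead, `day.isoweekday() % 7 + 1` gives the equivalent value.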

Is there anything like Chronic available in PHP?

I am looking for a very robust datetime parser, similar to Ruby's Chronic, but for PHP. strtotime() isn't cutting it for a lot of the edge cases I'm seeing in my project. Anyone know of any good libraries? Ideally: - PHP 5 OOP - well documented - fast and stable Thanks! ...

Parse four bytes to floating-point in C

How do I take four received data bytes and assemble them into a floating-point number? Right now I have the bytes stored in an array, which would be, received_data[1] ... received_data[4]. I would like to store these four bytes as a single 32-bit single precision float. -Thanks I'm actually receiving a packet with 19 bytes in it and...
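
The reassembly described here amounts to reinterpreting four bytes as an IEEE 754 single-precision value. The following is a small Python sketch of that idea (not the asker's C); the byte order of the received data is not stated in the question, so little-endian is an assumption, and the function name `bytes_to_float` is hypothetical.

```python
import struct


def bytes_to_float(four_bytes, little_endian=True):
    """Reinterpret 4 raw bytes as a 32-bit IEEE 754 float."""
    if len(four_bytes) != 4:
        raise ValueError("expected exactly 4 bytes")
    # '<f' = little-endian float32, '>f' = big-endian float32.
    fmt = "<f" if little_endian else ">f"
    return struct.unpack(fmt, bytes(four_bytes))[0]


# 0x3F800000 is the bit pattern of 1.0, shown here in both byte orders.
a = bytes_to_float([0x00, 0x00, 0x80, 0x3F])
b = bytes_to_float([0x3F, 0x80, 0x00, 0x00], little_endian=False)
```

If the decoded values look wildly wrong, the byte order is the first thing to check.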

Quotation marks turn to question marks ...

So I have a Ruby script that parses HTML pages and saves the extracted string into a DB... but I'm getting weird characters (usually question marks) instead of plain text... Eg : ‘SOME TEXT’ instead of 'Some Text' I've tried HTML entities and CGI::unescape ... but to no avail... did some googling and set $KCODE = 'u' & require 'jcode...