parsing

How to parse a string into a nullable int in C# (.NET 3.5)

I want to parse a string into a nullable int in C#, i.e. I want to get back either the int value of the string or null if it can't be parsed. I was kind of hoping that this would work: int? val = stringVal as int?; But that won't work, so the way I'm doing it now is I've written this extension method: public static int? ParseNull...

How do you parse a filename in bash?

I have a filename in a format like: system-source-yyyymmdd.dat I'd like to be able to parse out the different bits of the filename using the "-" as a delimiter. ...
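
The split described above can be sketched as follows. This is a minimal illustration in Python rather than bash, and it assumes the name always has exactly three "-"-separated fields before the extension. The function name `parse_filename` and the sample filename are hypothetical, not taken from the question.

```python
from pathlib import Path


def parse_filename(name: str) -> tuple[str, str, str]:
    """Split a name shaped like system-source-yyyymmdd.dat into its three fields."""
    stem = Path(name).stem  # drop the trailing extension (".dat")
    parts = stem.split("-")
    if len(parts) != 3:
        # Assumption: exactly three fields; anything else is rejected.
        raise ValueError(f"expected system-source-yyyymmdd, got {name!r}")
    system, source, date = parts
    return system, source, date


if __name__ == "__main__":
    # Hypothetical sample name, used only for illustration.
    print(parse_filename("alpha-web-20080815.dat"))
```

Rejecting names with a different field count is a design choice here; a system or source name that itself contains "-" would need a different rule.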

What is a good way to format logs?

I'm designing an application which includes the need to log all incoming messages I receive from a Telnet connection. The text is largely plain, though it can include ANSI tags that provide text colour and formatting (16 colours, bold, underline, etc). I would like to format my logs to store the text with formatting, date/time and potenti...

Print stack trace information from C#

As part of some error handling in our product, we'd like to dump some stack trace information. However, we find that many users will simply take a screenshot of the error message dialog instead of sending us a copy of the full report available from the program, and thus I'd like to make some minimal stack trace information availabl...

BNF grammar test case generation

Does anyone have any experience with a tool that generates test strings from a BNF grammar that could then be fed into a unit test? ...

Getting international characters from a web page?

I want to scrape some information off a football (soccer) web page using simple python regexp's. The problem is that players such as the first chap, ÄÄRITALO, comes out as &#196;&#196;RITALO! That is, html uses escaped markup for the special characters, such as &#196; Is there a simple way of reading the html into the correct python st...
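
One possible way to turn escaped markup back into characters is sketched below. It assumes Python 3.4 or later, where the standard library provides `html.unescape`, and that the escapes are numeric character references such as `&#196;` (for Ä) or named entities. The wrapper name `decode_entities` is hypothetical.

```python
import html


def decode_entities(text: str) -> str:
    """Replace HTML character references (numeric or named) with their characters."""
    return html.unescape(text)


if __name__ == "__main__":
    # "&#196;" is the numeric character reference for "Ä".
    print(decode_entities("&#196;&#196;RITALO"))
```

Decoding the whole scraped string first and applying the regular expressions afterwards keeps the patterns free of entity handling.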

How to parse relative time?

This question is the other side of the question asking, "How do I calculate relative time?". Given some human input for a relative time, how can you parse it? By default you would offset from DateTime.Now(), but could optionally offset from another DateTime. (Prefer answers in C#) Example input: "in 20 minutes" "5 hours ago" "3h 2m...

What is the best way to parse html in C#?

I'm looking for a library/method to parse an html file with more html specific features than generic xml parsing libraries. ...

What's the best way of parsing strings?

We've got a scenario that requires us to parse lots of e-mail (plain text), each e-mail 'type' is the result of a script being run against various platforms. Some are tab delimited, some are space delimited, some we simply don't know yet. We'll need to support more 'formats' in the future too. Do we go for a solution using: Regex Sim...

Is there a way to parse a SQL query to pull out the column names and table names?

I have 150+ SQL queries in separate text files that I need to analyze (just the actual SQL code, not the data results) in order to identify all column names and table names used. Preferably with the number of times each column and table makes an appearance. Writing a brand new SQL parsing program is trickier than it seems, with nested SE...

C# Casting vs. Parse

This may seem rudimentary to some, but this question has been nagging at me and as I write some code, I figured I would ask. Which of the following is better code in c# and why? ((DateTime)g[0]["MyUntypedDateField"]).ToShortDateString() or DateTime.Parse(g[0]["MyUntypedDateField"].ToString()).ToShortDateString() Ultimately, is it ...

Parsing exact dates in C# shouldn't force you to create an IFormatProvider

Someone please correct me if I'm wrong, but parsing dates in yyyy/MM/dd (or another specific format) in C# should be as easy as DateTime.ParseExact(theDate, "yyyy/MM/dd"); but no, C# forces you to create an IFormatProvider. Is there an app.config friendly way of setting this so I don't need to do this each time? DateTime.ParseExact(t...

Parsing strings: extracting words and phrases [JavaScript]

I need to support exact phrases (enclosed in quotes) in an otherwise space-separated list of terms. Thus splitting the respective string by the space-character is not sufficient anymore. Example: input : 'foo bar "lorem ipsum" baz' output: ['foo', 'bar', 'lorem ipsum', 'baz'] I wonder whether this could be achieved with a single RegE...
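
The single-expression idea can be sketched as follows. The illustration is in Python rather than JavaScript, and it assumes phrases use plain double quotes with no escaped quotes inside them. The names `split_terms` and `_TERM` are hypothetical.

```python
import re

# Either a double-quoted phrase (captured without its quotes)
# or a run of non-whitespace characters.
_TERM = re.compile(r'"([^"]*)"|(\S+)')


def split_terms(query: str) -> list[str]:
    """Split a query into bare words and quoted phrases."""
    terms = []
    for match in _TERM.finditer(query):
        phrase, word = match.group(1), match.group(2)
        terms.append(phrase if phrase is not None else word)
    return terms


if __name__ == "__main__":
    print(split_terms('foo bar "lorem ipsum" baz'))
    # prints ['foo', 'bar', 'lorem ipsum', 'baz']
```

The same alternation pattern should carry over to a JavaScript global regular expression. A quote with no closing partner is treated as part of an ordinary word in this sketch.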

Error Tolerant HTML/XML/SGML parsing in PHP

I have a bunch of legacy documents that are HTML-like. As in, they look like HTML, but have additional made-up tags that aren't a part of HTML: <strong>This is an example of a <pseud-template>fake tag</pseud-template></strong> I need to parse these files. PHP is the only tool available. The documents don't come close to being we...

How to read values from numbers written as words?

As we all know numbers can be written either in numerics, or called by their names. While there are a lot of examples to be found that convert 123 into one hundred twenty three, I could not find good examples of how to convert it the other way around. Some of the caveats: cardinal/nominal or ordinal: "one" and "first" common spelling ...

Does C# have a String Tokenizer like Java's?

I'm doing simple string input parsing and I am in need of a string tokenizer. I am new to C# but have programmed Java, and it seems natural that C# should have a string tokenizer. Does it? Where is it? How do I use it? ...

Résumé parsing library for a .Net project

I need to extract information from hundreds of résumés. The ideal would be .doc, .docx, .pdf, .rtf --> hr-xml but since more than 90% of the résumés are .doc, the other formats are not a must have. I'm looking to buy a third-party tool or a component. Do you have any good/bad experience solving a similar problem? Clarification: I'm ...

Tool to parse a file

I'm trying to figure out the best way to parse a GE Logician MEL trace file to make it easier to read. It has segments like >{!gDYNAMIC_3205_1215032915_810 = (clYN)} execute>GDYNAMIC_3205_1215032915_810 = "Yes, No" results>"Yes, No" execute>end results>"Yes, No" >{!gDYNAMIC_3205_1215032893_294 = (clYN)} execute>GDYNAMIC_3205_12150...

Are there any Parsing Expression Grammar (PEG) libraries for Javascript or PHP?

I find myself drawn to the Parsing Expression Grammar formalism for describing domain specific languages, but so far the implementation code I've found has been written in languages like Java and Haskell that aren't web server friendly in the shared hosting environment that my organization has to live with. Does anyone know of any PEG l...

Read Firefox 3 bookmarks

Firefox 3 stores the bookmarks in a sqlite database. There are several hacked sqlite java libraries available. Is there a way to hack the sqlite database in Java (not using libraries) to read bookmarks reliably? Does someone know how the sqlite DB is stored and how to access it programmatically (from Java)? ...