structured-data

Does Log4j SyslogAppender support MDC and NDC?

Simple really: does Log4j SyslogAppender support MDC and NDC in the sense that the output is structured data, i.e. uses the structured data features of the protocol? Further, are there any limits on what can be put in the MDC and successfully appended to the log? ...

.NET Library For Fixed Length Text Files

I'm looking for a .NET library (preferably open source, in C#) for dealing with fixed length field text files. It wouldn't be too much work to write one, but existing, tested work is always a nicer place to start. I will be extracting data in fixed length fields from files produced by a PBX. Each PBX has its own file format, as well a se...
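A minimal sketch of the slicing idea behind fixed length field parsing, written in Python for illustration rather than as a substitute for a .NET library. The layout, field names, and sample line are all hypothetical; a real PBX format would supply its own offsets.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, start offset, length).
# Real PBX formats will differ; these values are for illustration only.
LAYOUT: List[Tuple[str, int, int]] = [
    ("extension", 0, 4),
    ("dialled", 4, 12),
    ("duration", 16, 6),
]


def parse_fixed_width(line: str, layout: List[Tuple[str, int, int]] = LAYOUT) -> Dict[str, str]:
    """Slice one record into named fields and trim the padding."""
    return {name: line[start:start + length].strip() for name, start, length in layout}


# Build a sample record padded to the widths declared above.
sample = "1234" + "5550100".ljust(12) + "000042"
record = parse_fixed_width(sample)
```

Keeping the layout as data rather than code means one parser can serve several file formats, each with its own table of offsets.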

Processing un-normalised text data.

With reference to an earlier question of mine on parsing fixed length field files, another question has come up. I will probably be using a customised version of the FileHelpers library, with field start and length attributes that dynamically read the values for their target properties from a config file. The text data...

Algorithm question: fuzzy matching of structured data

I have a fairly small corpus of structured records sitting in a database. Given a tiny fraction of the information contained in a single record, submitted via a web form (so structured in the same way as the table schema), which I will call the test record, I need to quickly draw up a list of the records that are the most likely matches fo...
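One possible way to rank candidates, sketched in Python with the standard library's difflib. This is an assumption about how such matching could be done, not the approach the question settles on: each field the test record supplies is compared as a string, and the per-field similarities are averaged. The record fields and sample data are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def field_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two field values, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(test_record: Dict[str, str], candidate: Dict[str, str]) -> float:
    """Average similarity over the fields the test record actually fills in."""
    filled = [key for key, value in test_record.items() if value]
    if not filled:
        return 0.0
    total = sum(field_similarity(test_record[k], candidate.get(k, "")) for k in filled)
    return total / len(filled)


def best_matches(test_record: Dict[str, str], records: List[Dict[str, str]], limit: int = 3) -> List[Dict[str, str]]:
    """Return the highest scoring records, best first."""
    return sorted(records, key=lambda r: score(test_record, r), reverse=True)[:limit]


# Hypothetical corpus and a sparse test record (only the name is supplied).
RECORDS = [
    {"name": "Jon Smith", "city": "Leeds"},
    {"name": "Joan Smythe", "city": "London"},
    {"name": "Alice Brown", "city": "Leeds"},
]
matches = best_matches({"name": "John Smith", "city": ""}, RECORDS, limit=2)
```

Scoring every record is a linear scan, which may be acceptable for a small corpus; a larger one would likely need an index or a pre-filter first.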

SubSonic 2.x now supports TVPs - SqlDbType.Structured / DataTables for SQL Server 2008

For those interested, I have now modified the SubSonic 2.x code to recognize and support DataTable parameter types. You can read more about SQL Server 2008 features here: http://download.microsoft.com/download/4/9/0/4906f81b-eb1a-49c3-bb05-ff3bcbb5d5ae/SQL%20SERVER%202008-RDBMS/T-SQL%20Enhancements%20with%20SQL%20Server%202008%20-%20Pra...

Unstructured Text to Structured Data

I am looking for references (tutorials, books, academic literature) concerning structuring unstructured text in a manner similar to the Google Calendar Quick Add button. I understand this may come under the NLP category, but I am interested only in the process of going from something like "Levi jeans size 32 A0b293" to: Brand: Levi, Si...
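A rough rule-based sketch of the kind of extraction described, using the sample string from the question. It relies on a lookup list and regular expressions rather than any NLP technique, and everything about it is an assumption: the brand list, the output keys, and the guess that a token mixing letters and digits is a product code are all hypothetical.

```python
import re
from typing import Dict, Union

# Hypothetical lookup list; a real system would need a much larger vocabulary.
KNOWN_BRANDS = {"levi", "wrangler"}

# A token containing at least one letter and one digit (assumed to be a code).
CODE_PATTERN = re.compile(r"(?=.*[A-Za-z])(?=.*\d)[A-Za-z0-9]+")


def structure(text: str) -> Dict[str, Union[str, int]]:
    """Pull out the fields that the simple rules below can recognise."""
    result: Dict[str, Union[str, int]] = {}
    for token in text.split():
        if token.lower() in KNOWN_BRANDS:
            result["brand"] = token
        elif CODE_PATTERN.fullmatch(token):
            result["code"] = token  # hypothetical field name
    size = re.search(r"\bsize\s+(\d+)\b", text, flags=re.IGNORECASE)
    if size:
        result["size"] = int(size.group(1))
    return result


parsed = structure("Levi jeans size 32 A0b293")
```

Rules like these are brittle on free-form input, which is where the statistical approaches in the NLP literature come in.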

What structured text format is the best supported in Python?

This question may be seen as subjective, but I'd like to ask SO users which common structured textual data format is best supported in Python. My initial choices are: XML, JSON and YAML. Which of these three is easiest to work with in Python (i.e. has the best library support / performance) ... or is there another format that I haven't...
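For a rough sense of what working with these formats looks like, here is a small sketch that round-trips one record through JSON and XML using only the standard library (json and xml.etree.ElementTree). YAML is left out because it needs a third-party package such as PyYAML. The record contents and element names are made up for illustration, and the sketch says nothing about relative performance.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record used for both formats.
record = {"name": "example", "tags": ["a", "b"], "count": 2}

# JSON: native dicts and lists map directly onto the format.
as_json = json.dumps(record, sort_keys=True)
from_json = json.loads(as_json)

# XML: the mapping has to be chosen by hand (attributes vs child elements),
# and every value comes back as a string.
root = ET.Element("record", name=record["name"], count=str(record["count"]))
for tag in record["tags"]:
    ET.SubElement(root, "tag").text = tag
as_xml = ET.tostring(root, encoding="unicode")

parsed = ET.fromstring(as_xml)
tags_back = [element.text for element in parsed.findall("tag")]
count_back = int(parsed.get("count"))
```

The JSON round trip preserves Python types with no extra work, while the XML version needs explicit conversion in both directions.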