Hi,

Our state government has opened its transport timetable data. The data is in the XML-based TransXChange standard format.

The problem is the data files are huge. The sample data file itself is 300 MB.

The good thing is that most of the data is redundant and I don't need it for my application. What options do I have for inserting/transforming only the data I need into SQL Server?

Thanks.

+2  A: 

You need an XML streaming (event based) parser to avoid loading the whole tree into memory. Most languages have several based on the SAX (Simple API for XML) standard.
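To illustrate the idea, here is a minimal sketch in Python using the standard library's `xml.sax` module. Python is used purely for illustration; the same pattern applies in other languages. The function name `stream_records`, the handler class, and the element names in the usage example (`Stop`, `Ref`, `Name`) are hypothetical and are not taken from the TransXChange schema. The handler keeps only the fields you ask for and hands each completed record to a callback, so memory use stays roughly constant regardless of file size.

```python
import io
import xml.sax


class RecordHandler(xml.sax.ContentHandler):
    """Streams through the XML and emits one dict per record element,
    keeping only the child elements named in wanted_fields."""

    def __init__(self, record_tag, wanted_fields, on_row):
        super().__init__()
        self.record_tag = record_tag          # hypothetical record element name
        self.wanted_fields = set(wanted_fields)
        self.on_row = on_row                  # callback, e.g. a batched DB insert
        self._row = None                      # dict for the record being read
        self._field = None                    # wanted field currently open
        self._text = []

    def startElement(self, name, attrs):
        if name == self.record_tag:
            self._row = {}
        elif self._row is not None and name in self.wanted_fields:
            self._field = name
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._field is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == self._field:
            self._row[name] = "".join(self._text).strip()
            self._field = None
        elif name == self.record_tag and self._row is not None:
            self.on_row(self._row)            # hand off, then forget the record
            self._row = None


def stream_records(source, record_tag, wanted_fields, on_row):
    """source may be a file name or a binary file-like object."""
    xml.sax.parse(source, RecordHandler(record_tag, wanted_fields, on_row))


if __name__ == "__main__":
    sample = (b"<Root><Stop><Ref>A1</Ref><Name>Main St</Name><Junk>x</Junk></Stop>"
              b"<Stop><Ref>B2</Ref><Name>High St</Name></Stop></Root>")
    stream_records(io.BytesIO(sample), "Stop", ["Ref", "Name"], print)
```

In a real import, the callback would presumably collect rows into batches and write them to the database instead of printing them; everything outside the wanted fields is simply skipped as the parser passes over it.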

shanna
I am going to be using C#. Does it have SAX support?
mob1lejunkie
mob1lejunkie: You didn't mention C# in your question. You asked "what options do I have?". What makes you think some random person answering your question uses C#? I'd use Perl (and a non-SAX but streaming XML parser) myself, but then, I'm biased. And I know Perl.
runrig
I asked what options I have in the hope of getting a recommendation for a product that already does what I need, rather than writing code to do it.
mob1lejunkie