expat

Need advice on getting a job in a foreign country

Hello, I am a citizen of the United States of America and I want to get a job in the United Kingdom. Sounds good on the surface but I am looking for some advice on what to expect in the way of taxes (and the salary planning associated with them), legal issues and anything else that might not be that obvious. If you have done the ExP...

How can a software developer become location independent?

I have been playing with the idea of working from wherever I happen, and want, to be. Every now and then there is a need to change the scenery. So far I have done that simply by finding a job and relocating from one country to another inside Europe. Luckily, this is pretty straightforward inside the EU for an IT professional with some experience...

What is the most efficient way of extracting information from a large number of xml files in python?

Hi, I have a directory full (~10^3, 10^4) of XML files from which I need to extract the contents of several fields. I've tested different xml parsers, and since I don't need to validate the contents (expensive) I was thinking of simply using xml.parsers.expat (the fastest one) to go through the files, one by one to extract the data. I...
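A minimal sketch of the approach the question describes: run xml.parsers.expat over each file in a directory and collect the text of a few named fields. The field names in `WANTED` and the helpers `extract_fields` and `extract_from_directory` are hypothetical, since the excerpt does not say which fields are needed or how the files are laid out.

```python
import glob
import os
from xml.parsers import expat

# Hypothetical field names; the question does not list the real ones.
WANTED = {"title", "author"}


def extract_fields(path, wanted=WANTED):
    """Return {field: [text, ...]} for the wanted element names in one file.

    Assumes wanted elements are not nested inside each other.
    """
    found = {name: [] for name in wanted}
    state = {"name": None, "chunks": []}

    def start(name, attrs):
        if name in wanted:
            state["name"] = name
            state["chunks"] = []

    def chars(data):
        # Expat may deliver one text node in several pieces, so collect them.
        if state["name"] is not None:
            state["chunks"].append(data)

    def end(name):
        if name == state["name"]:
            found[name].append("".join(state["chunks"]))
            state["name"] = None

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    with open(path, "rb") as fh:
        parser.ParseFile(fh)  # reads the file incrementally
    return found


def extract_from_directory(directory):
    """Parse every *.xml file in a directory, one fresh parser per file."""
    results = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.xml"))):
        results[path] = extract_fields(path)
    return results
```

An expat parser object cannot be reused once a document has been parsed, which is why the sketch creates a new one for each file.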

What is an XML parser? Using Expat

This might seem like a simple question. But I have been looking for an XML parser to use in one of my applications that is running on Linux. I am using Expat and have parsed my XML file by reading one in. However, the output is the same as the input. This is my file I am reading in: <?xml version="1.0" encoding="utf-8"?> <books> ...

Getting xml data using xml parser expat

Hello, I have managed to parse the file OK, but now I am having trouble getting the values that I need. I can get the element and the attributes, but I cannot get the values. I would like to get the value of frame; in this XML it is 20. /* track the current level in the xml tree */ static int depth = 0; /* first when start element is encountered *...

expat parser: memory consumption

Hi, I am using the expat parser to parse an XML file of around 15 GB. The problem is that it throws an "Out of Memory" error and the program aborts. I want to know whether anybody has faced a similar issue with the expat parser, or whether it is a known bug that has been rectified in later versions? ...
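For context, a sketch of feeding expat a document in fixed-size chunks so that the whole file is never held in memory at once. It does not diagnose the error in the question; it only illustrates the streaming pattern, here with Python's xml.parsers.expat binding rather than the C API the asker is likely using. The `count_elements` helper and the chunk size are assumptions.

```python
import io
from xml.parsers import expat


def count_elements(stream, chunk_size=64 * 1024):
    """Count element names in a binary stream without keeping the document."""
    counts = {}

    def start(name, attrs):
        # Keep only a small summary; storing every node would grow with the file.
        counts[name] = counts.get(name, 0) + 1

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)  # False: more data will follow
    parser.Parse(b"", True)         # True: signal end of document
    return counts


if __name__ == "__main__":
    sample = io.BytesIO(b"<r><a/><a/><b/></r>")
    print(count_elements(sample, chunk_size=4))
```

With this pattern, memory use is governed mainly by what the handlers choose to keep, so handlers that accumulate data for every element will still grow with the input.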

XML parsing expat in python handling data

I am attempting to parse an XML file using python expat. I have the following line in my XML file: <Action>&lt;fail/&gt;</Action> expat identifies the start and end tags but converts the & lt; to the less than character and the same for the greater than character and thus parses it like this: outcome: START 'Action' DATA '<' DATA 'f...
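The behaviour described above is expected: expat may report the text of one element through several character-data callbacks, typically splitting at entity references. A minimal sketch of joining those pieces before using them follows; the `parse_events` helper and its event tuples are hypothetical names chosen to mirror the START/DATA output in the question.

```python
from xml.parsers import expat


def parse_events(xml_text):
    """Return a list of (kind, value) events with adjacent text pieces merged."""
    events = []
    buffer = []

    def flush():
        # Emit the text collected so far as a single DATA event.
        if buffer:
            events.append(("DATA", "".join(buffer)))
            buffer.clear()

    def start(name, attrs):
        flush()
        events.append(("START", name))

    def end(name):
        flush()
        events.append(("END", name))

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = buffer.append  # just collect each piece
    parser.Parse(xml_text, True)
    return events


if __name__ == "__main__":
    print(parse_events("<Action>&lt;fail/&gt;</Action>"))
    # [('START', 'Action'), ('DATA', '<fail/>'), ('END', 'Action')]
```

Python's parser objects also have a `buffer_text` attribute that asks the binding to do similar merging itself; the manual buffer is shown here only to make the mechanism visible.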

Python xml.dom and bad XML

I'm trying to extract some data from various HTML pages using a python program. Unfortunately, some of these pages contain user-entered data which occasionally has "slight" errors - namely tag mismatching. Is there a good way to have python's xml.dom try to correct errors or something of the sort? Alternatively, is there a better way to...

Yahoo BOSS Python Library, ExpatError

I tried to install the Yahoo BOSS mashup framework, but am having trouble running the examples provided. Examples 1, 2, 5, and 6 work, but 3 & 4 give Expat errors. Here is the output from ex3.py: gpython examples/ex3.py examples/ex3.py:33: Warning: 'as' will become a reserved keyword in Python 2.6 Traceback (most recent call last):...

Can I enforce the order of XML attributes using a schema?

Our C++ application reads configuration data from XML files that look something like this: <data> <value id="FOO1" name="foo1" size="10" description="the foo" ... /> <value id="FOO2" name="foo2" size="10" description="the other foo" ... /> ... <value id="FOO300" name="foo300" size="10" description="the last foo" ... /> </data> Th...

How to fix noncompliant HTML so Expat will parse it (htmltidy not working)

I'm trying to scrape information from http://www.nfl.com/scores (in particular, find out when a game is over so my computer can stop recording it). I can download HTML easily enough, and it makes this claim about compliance with standards: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/D...

Losing whitespace around escaped symbols in CDATA using Expat XML parser in C++

I'm using XML to send project information between applications. One of the pieces of information is the project description. So I have: <ProjectDescription>Test &amp; spaces around&amp;some &amp; amps!</ProjectDescription> Or: "Test & spaces around&some & amps!" <-- GOOD! When I then use Expat to parse it, my data handler gets ju...

Parsing XML in Python using Expat

Background: I'm coming from C#-land, so I'm looking for something like being able to handle nodes and values by selecting via Xpath. Here's my code, so far: import urllib import sys from xml.parsers import expat url = 'http://SomeWebService.SomeDomain.com' u = urllib.urlopen(url) Parser = expat.ParserCreate() data = u.read() try: ...

Python.expat can't parse XML file with bad symbols. How to go around?

I'm trying to parse an XML file (OSM data) with expat, and there are lines with some Unicode characters that expat can't parse: <tag k="name" v="абвгдежзиклмнопр�?туфхцчшщьыъ�?ю�?�?БВГДЕЖЗИКЛМ�?ОПРСТУФХЦЧШЩЬЫЪЭЮЯ" /> <tag k="name" v="Cin\x8e? Rex" /> (XML file encoding in the opening line is "UTF-8") The file is quite old, and there...

Looking for streaming xml pretty printer in C/C++ using expat or libxml2

I'm looking for a streaming xml pretty printer for C/C++ that's either self contained or that uses libxml2 or expat and has a BSD-ish license. I've searched a bit and not found one. It seems like something that would be generally useful. Am I missing an obvious tool that does this? Background: I have a library that outputs xml without w...

What configuration should I use for compiling expat for MIPS running Debian Embedded Linux

Hi, I am trying to understand what flags I should use when running ./configure for expat, to compile for an embedded target running a Debian-based distribution on MIPS. For example: Do I have to tell it where the kernel source is? Do I have to give it the architecture I'm using? ... Thanks Schmil ...

getting expat to use .dtd for entity replacement in python

I'm trying to read in an xml file which looks like this <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE dblp SYSTEM "dblp.dtd"> <dblp> <incollection> <author>Jos&eacute; A. Blakeley</author> </incollection> </dblp> The point that creates the problem is the Jos&eacute; A. Blakeley part: The parser calls its character han...

Python + Expat: Error on &#0; entities

I have written a small function, which uses ElementTree and xpath to extract the text contents of certain elements in an xml file: #!/usr/bin/env python2.5 import doctest from xml.etree import ElementTree from StringIO import StringIO def parse_xml_etree(sin, xpath): """ Takes as input a stream containing XML and an XPath expression...
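As background, a short sketch showing that expat rejects the character reference &#0;, because U+0000 is not a legal XML character, together with one possible workaround that strips such references before parsing. The `strip_null_refs` helper is hypothetical and silently drops data, so it is an assumption about what the caller can tolerate, not a recommended fix.

```python
import re
from xml.parsers import expat

# Matches decimal and hexadecimal references to character zero, e.g. &#0; or &#x00;
NULL_REF = re.compile(r"&#(?:0+|[xX]0+);")


def strip_null_refs(xml_text):
    """Remove numeric character references that point at U+0000."""
    return NULL_REF.sub("", xml_text)


def parses_cleanly(xml_text):
    """Return True if expat accepts the document, False if it raises."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except expat.ExpatError:
        return False
    return True


if __name__ == "__main__":
    bad = "<a>x&#0;y</a>"
    print(parses_cleanly(bad))                   # False
    print(parses_cleanly(strip_null_refs(bad)))  # True
```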

Compile EXPAT to statically-linked .a on Windows

I am writing a C program on Windows with MinGW and want to use the EXPAT XML library. I want to compile my program statically, so I need a static .a library. Is there any way to compile EXPAT to a static, independent .a library on Windows? ...

libxml2 vs expat for an XMPP server

I'm trying to create an XMPP library (and later a server) from scratch in a new C-like programming language (although the language itself is irrelevant) as a means to learn what I can about the XMPP protocol and server software development in general. As many of you know, XMPP is a messaging protocol based on XML that depends on an enormo...