Is there a way to get the current position in the stream of the node under examination by the XmlReader?

I'd like to use the XmlReader to parse a document and save the position of certain elements so that I can seek to them later.

Addendum:

I'm getting XAML generated by a WPF control. The XAML should not change frequently. There are placeholders in the XAML where I need to replace items, sometimes in a loop. I thought it might be easier to do this in code rather than with a transform (I might be wrong about this). My idea was to parse it into a simple data structure recording what needs to be replaced and where, then use a StringBuilder to produce the final output by copying chunks from the XAML string.
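To make the idea concrete, here is a rough sketch of the chunk-copying step. The names XamlSplicer, Placeholder and Splice are made up for illustration, and it assumes the placeholder positions are character indexes into the string, sorted and non-overlapping:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class XamlSplicer
{
    // Hypothetical record of one placeholder: where it sits in the source
    // string and what should be written in its place.
    public sealed class Placeholder
    {
        public int Start;          // character index into the source string
        public int Length;         // number of characters to replace
        public string Replacement; // text to emit instead
    }

    // Copies the source in chunks, swapping each placeholder for its replacement.
    // Assumes the placeholders are sorted by Start and do not overlap.
    public static string Splice(string source, IEnumerable<Placeholder> placeholders)
    {
        StringBuilder sb = new StringBuilder(source.Length);
        int cursor = 0;
        foreach (Placeholder p in placeholders)
        {
            sb.Append(source, cursor, p.Start - cursor); // untouched chunk before the placeholder
            sb.Append(p.Replacement);
            cursor = p.Start + p.Length;                 // skip over the placeholder itself
        }
        sb.Append(source, cursor, source.Length - cursor); // remaining tail
        return sb.ToString();
    }
}
```

What I'm missing is a reliable way to fill in Start for each placeholder while reading with the XmlReader.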

+3  A: 

Just to head off one suggestion before it's made: you could keep a reference to the underlying stream you pass into XmlReader, and make a note of its position - but that will give you the wrong results, as the reader will almost certainly be buffering its input (i.e. it'll read the first 1024 characters or whatever - so your first node might "appear" to be at character 1024).
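If you want to see the buffering for yourself, something like this should do it. It's only a sketch: BufferingDemo and PositionAfterFirstElement are names I've invented, and the exact number you get back depends on the reader's internal buffer size, which I wouldn't rely on.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class BufferingDemo
{
    // Returns the underlying stream's Position right after the reader
    // has been positioned on the root element.
    public static long PositionAfterFirstElement(string xml)
    {
        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        using (XmlReader reader = XmlReader.Create(stream))
        {
            reader.MoveToContent(); // reader is now on the root element
            // The stream has been read ahead into the reader's buffer,
            // so this is not the offset of the root element.
            return stream.Position;
        }
    }
}
```

Calling PositionAfterFirstElement on a short document will typically report a position well past the root's start tag, often the end of the stream.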

If you use XmlTextReader instead of just XmlReader, then that implements IXmlLineInfo, which means you can ask for the LineNumber and LinePosition at any time - is that good enough for you? (You should probably check HasLineInfo() first, admittedly.)
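A minimal sketch of what I mean (LineInfoDemo and DescribeElements are just illustrative names, not anything from the framework):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class LineInfoDemo
{
    // Lists each element together with the line/column the reader reports for it.
    public static List<string> DescribeElements(string xml)
    {
        List<string> result = new List<string>();
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            IXmlLineInfo info = reader; // XmlTextReader implements IXmlLineInfo
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && info.HasLineInfo())
                {
                    result.Add(string.Format("{0}: line {1}, position {2}",
                        reader.Name, info.LineNumber, info.LinePosition));
                }
            }
        }
        return result;
    }
}
```

Both LineNumber and LinePosition are 1-based, and they count characters, not bytes.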

EDIT: I've just noticed that you want to be able to seek to that position later... in that case line information may not be terribly helpful. It's great for finding something in a text editor, but not so great for moving a file pointer. Could you give some more information about what you're trying to do? There may be a better way of approaching the problem.

Jon Skeet
dmo
It looks like XmlTextReader implements IXmlLineInfo.
dmo
A: 

I have the same problem and apparently there is no simple solution.

So I decided to use two read-only FileStreams: one for the XmlReader, the other to get the byte offset of each line:

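// Needs: using System; using System.IO; using System.Xml;
// Note: the sample XML mixes \n, \r and \r\n line endings and never closes <test>,
// so the reader throws at the end of the input and the catch block reports it.
private void ReadXmlWithLineOffset()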
{
    string malformedXml = "<test>\n<test2>\r   <test3><test4>\r\n<test5>Thi is\r\ra\ntest</test5></test4></test3></test2>";
    string fileName = "test.xml";
    File.WriteAllText(fileName, malformedXml);

    XmlTextReader xr = new XmlTextReader(new FileStream(fileName, FileMode.Open, FileAccess.Read));
    FileStream fs2 = new FileStream(fileName, FileMode.Open, FileAccess.Read);

    try
    {
        int currentLine = 1;
        while(xr.Read())
        {
            if (!string.IsNullOrEmpty(xr.Name))
            {
                for (;currentLine < xr.LineNumber; currentLine++)
                    ReadLine(fs2);
                Console.WriteLine("{0} : LineNum={1}, FileOffset={2}", xr.Name, xr.LineNumber, fs2.Position);
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine("Exception : " + ex.Message);
    }
    finally
    {
        xr.Close();
        fs2.Dispose();
    }
}

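// Advances fs just past the next line terminator (\n, \r or \r\n),
// so that fs.Position ends up at the start of the following line.
private void ReadLine(FileStream fs)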
{
    int b;
    while ((b = fs.ReadByte()) >= 0)
    {
        if (b == 10) // \n
            return;
        if (b == 13) // \r
        {
            if (fs.ReadByte() != 10) // if not \r\n, go back one byte
                fs.Seek(-1, SeekOrigin.Current);
            return;
        }
    }            
}

This is not the best way of doing it, because it opens the file twice. To avoid that, we could write a custom reader shared between the XmlReader and the line counter. Also, it only gives you the offset of the start of the line you're interested in. To get the exact offset of the tag we would also need LinePosition, but that is tricky because LinePosition counts characters while the file offset counts bytes, so the conversion depends on the encoding.
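For what it's worth, here is roughly how that conversion could look. It's a sketch only: OffsetHelper and ByteOffset are names I've made up, and it assumes you already have the decoded text of the line and know the file's encoding.

```csharp
using System;
using System.Text;

static class OffsetHelper
{
    // lineStartOffset: byte offset of the start of the line (e.g. fs2.Position above)
    // lineText:        decoded text of that line
    // linePosition:    1-based character column, as reported by LinePosition
    // encoding:        encoding of the file (an assumption the caller must supply)
    public static long ByteOffset(long lineStartOffset, string lineText,
                                  int linePosition, Encoding encoding)
    {
        int charsBefore = linePosition - 1; // characters that precede the column
        // Convert the preceding characters to a byte count in the file's encoding.
        return lineStartOffset + encoding.GetByteCount(lineText.Substring(0, charsBefore));
    }
}
```

For example, with the line "é<b/>" starting at byte 10, column 3 maps to byte 13 in UTF-8 but to byte 14 in UTF-16, because "é" takes two bytes in both encodings while "<" takes one byte in UTF-8 and two in UTF-16.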

Etienne Coumont