I am trying to extract blocks of JSON data from a data stream in the following format:

    Some-Header-Name:Value
    Content-Length:Value
    Some-Other-Header:Value

    {JSON data string of variable length}

The stream contains many instances of the above pattern, and the length of the JSON data differs in each instance, as indicated by the preceding Content-Length header.

I wish to create a Regex that matches each of the content length header values and uses it to match the associated content block. I envisage something like this ...

    Content-Length:(?<LENGTH>\d+).*?\r\n\r\n(?<CONTENT>.{$<LENGTH>})

... but I'm not sure how to specify the quantifier for the CONTENT group as a dynamic value.

Note: although the headers are on separate lines and the content is separated from the headers by a blank line, there is no linefeed after the content, so it is not possible to use this to determine the end of content.

Any suggestions would be appreciated.

Thanks, Tim

A: 

Regular expressions match text, not numbers, so a pattern can't capture part of the string, convert it to a number, and use that number as a quantifier within the same regex.

You'd have to do it in several steps:

  1. Match the header and extract the length value.
  2. Build a new regex such as @"(?<HEADER>...)(?<CONTENT>.{" + length + "})".
  3. Apply that regex and extract the content.
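
The three steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the whole stream is already available as a single string, Content-Length counts characters rather than bytes, and the names `BlockExtractor` and `Extract` are hypothetical, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class BlockExtractor
{
    // Step 1 pattern: find a Content-Length header, then skip lazily to the
    // blank line that ends the header section. Singleline lets '.' cross "\n".
    static readonly Regex Header = new Regex(
        @"Content-Length:(?<LENGTH>\d+).*?\r\n\r\n",
        RegexOptions.Singleline);

    // Hypothetical helper: returns every content block found in the stream.
    public static List<string> Extract(string stream)
    {
        var blocks = new List<string>();
        int pos = 0;
        while (pos < stream.Length)
        {
            // Step 1: match the header and extract the length value.
            Match header = Header.Match(stream, pos);
            if (!header.Success) break;

            int length = int.Parse(header.Groups["LENGTH"].Value);
            int contentStart = header.Index + header.Length;

            // Step 2: build a second regex whose quantifier is the captured length.
            // \G pins the match to the position passed to Match below.
            var content = new Regex(
                @"\G(?<CONTENT>.{" + length + "})",
                RegexOptions.Singleline);

            // Step 3: apply it where the headers ended and extract the content.
            Match body = content.Match(stream, contentStart);
            if (!body.Success) break; // fewer characters remain than announced

            blocks.Add(body.Groups["CONTENT"].Value);

            // Assumption: Content-Length counts characters, not bytes.
            pos = contentStart + length;
        }
        return blocks;
    }
}
```

Calling `BlockExtractor.Extract(data)` returns one string per block, in stream order. Once the length is known, `stream.Substring(contentStart, length)` would do the same job as steps 2 and 3 without compiling a new regex for each block. If Content-Length is a byte count and the JSON contains non-ASCII characters, the slicing would have to be done on the raw bytes before decoding.
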
Tim Pietzcker
Thanks - I guess I was expecting too much. I can see that your approach will work (hence I have accepted your answer), but I was hoping that Regex would offer something that would extract many matches in a single operation.
Tim Coulter