views: 85
answers: 2

Hi all

I'm coding in C# for the .NET Framework 3.5.

I am trying to parse some Json to a JObject.

The Json is as follows:

{
    "TBox": {
        "Name": "SmallBox",
        "Length": 1,
        "Width": 1,
        "Height": 2 },
    "TBox": {
        "Name": "MedBox",
        "Length": 5,
        "Width": 10,
        "Height": 10 },
    "TBox": {
        "Name": "LargeBox",
        "Length": 20,
        "Width": 20,
        "Height": 10 }
}

When I try to parse this Json to a JObject, the JObject only knows about LargeBox. The information for SmallBox and MedBox is lost. Obviously this is because it is interpreting "TBox" as a property, and that property is being overwritten.
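
For reference, this is roughly how I'm parsing it - nothing fancy, and json here is just the string above:

using System;
using Newtonsoft.Json.Linq;

// json holds the duplicate-key string shown above
JObject obj = JObject.Parse(json);

// only the last "TBox" survives - this prints "LargeBox"
Console.WriteLine(obj["TBox"]["Name"]);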

I am receiving this Json from a service that's coded in Delphi. I'm trying to create a C# proxy for that service. On the Delphi-side of things, the "TBox" is understood as the type of the object being returned. The inner properties ("Name", "Length", "Width", "Height") are then understood as regular properties.

I can serialize and deserialize a custom 'TBox' object that has Name, Length, Width, and Height properties. That's fine.
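
For what it's worth, the class is just a plain sketch along these lines (the exact shape doesn't matter much, and singleBoxJson below is just one box's Json on its own):

using Newtonsoft.Json;

public class TBox
{
    public string Name { get; set; }
    public int Length { get; set; }
    public int Width { get; set; }
    public int Height { get; set; }
}

// round-tripping a single box on its own is no problem
TBox box = JsonConvert.DeserializeObject<TBox>(singleBoxJson);
string backAgain = JsonConvert.SerializeObject(box);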

What I want to do is step through all the TBox sections in such a way as to extract the following three Json strings.

First:

{
    "Name": "SmallBox",
    "Length": 1,
    "Width": 1,
    "Height": 2 }

Second:

{
    "Name": "MedBox"
    "Length": 5,
    "Width": 10,
    "Height": 10 }

Third:

{
    "Name": "LargeBox"
    "Length": 20,
    "Width": 20,
    "Height": 10 }

Once I have these strings, I can serialize and deserialize to my heart's content.

I'm finding Newtonsoft.Json to be very good. I really don't want to go messing about with other frameworks if I can avoid it.

Any help would be greatly appreciated.

I have very limited input as to changes that can be made to the server.

+3  A: 
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// 'reader' is any TextReader over the raw JSON, e.g. new StringReader(json)
JsonTextReader jsonReader = new JsonTextReader(reader);
List<JObject> boxes = new List<JObject>();

jsonReader.Read();                 // consume the outer object's StartObject
while (jsonReader.Read())          // step through the remaining tokens
{
    if (jsonReader.TokenType == JsonToken.StartObject)
    {
        // loads one inner box and advances the reader past its EndObject
        JObject tbox = JObject.Load(jsonReader);
        boxes.Add(tbox);
    }
}
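
Each JObject loaded that way holds just the inner properties, so (assuming a TBox class like the one you describe) getting back to strings or typed objects is straightforward:

foreach (JObject tbox in boxes)
{
    string inner = tbox.ToString();                           // one of the three JSON strings you wanted
    TBox typed = JsonConvert.DeserializeObject<TBox>(inner);  // or go straight to your own type
}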

However, note that the JSON RFC (RFC 4627) says, "The names within an object SHOULD be unique", so if you can, recommend that the format be changed.

EDIT: Here's an alternate design that doesn't have duplicate keys:

[
    {
        "TBox": {
            "Width": 1,
            "Length": 1,
            "Name": "SmallBox",
            "Height": 2
        }
    },
    {
        "TBox": {
            "Width": 10,
            "Length": 5,
            "Name": "MedBox",
            "Height": 10
        }
    },
    {
        "TBox": {
            "Width": 20,
            "Length": 20,
            "Name": "LargeBox",
            "Height": 10
        }
    }
]
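
Consuming that shape is then straightforward - something like this (a sketch only, reusing the TBox class from the question; alternateJson is the array above):

JArray wrappers = JArray.Parse(alternateJson);
List<TBox> typedBoxes = new List<TBox>();
foreach (JObject wrapper in wrappers)
{
    // each element has a single "TBox" property holding the actual box
    typedBoxes.Add(JsonConvert.DeserializeObject<TBox>(wrapper["TBox"].ToString()));
}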
Matthew Flaschen
Perfect! That was fast. Thanks.
Ubiquitous Che
Sent through an RFC link and had a chat with one of the senior developers. The word SHOULD has a particular meaning in RFC 2119, which basically means that the policy SHOULD be followed unless there's a strong enough reason to break it. In this case there is - the implementation on the server involves sending around lists of generic types. The specific type to be serialized/deserialized is used as the 'name' of the top-level of the Json. It's annoying for me to code against, but they're still within the RFC.
Ubiquitous Che
@Ubiquitous, I never said it violated the RFC. But it is unintuitive, and I don't necessarily agree this is a reason to break it. There are other designs that provide the type information without duplicate keys.
Matthew Flaschen
In this case I would interpret the SHOULD as a direction to JSON-parser-developers that they shouldn't throw an error when they encounter duplicate names. However, since the JS in JSON stands for "JavaScript" and JavaScript data structures CANNOT have duplicate names, it seems clear to me that your Delphi guys are violating the spirit, if not the letter, of the spec. The Newtonsoft behavior is exactly correct, because it's the same thing a JavaScript parser would do.
Joel Mueller
Ha! I'll pass it on. My argument was that the way they've done it won't work for most parsers in the wild. Their response was that it was the fault of the parsers for not honoring the meaning of SHOULD correctly. ^_^ Thanks for the support and follow-up, it's much appreciated.
Ubiquitous Che
As the implementer of the service in question I will say only this: The JSON *specification* allows duplicate names. If you do not allow duplicate names in your structures then what you have is not, according to its own terms, JSON. I would also say that the service under development is embryonic. The duplicated values are currently not even used, but it is envisaged they may be. An alternate approach may be devised, but any alteration will be based on making the right decision for the server implementation, not simply to make lives easier for people wishing to use non-compliant parsers.
Deltics
@Matthew: you will have noticed that your alternate design introduces additional structure with NIL additional semantic contribution. It was exactly this sort of increase in the noise-to-signal ratio, familiar from the likes of XML, that JSON set out to relieve us from! The current, compliant, design is - in those terms - far more in keeping with the spirit of JSON, in addition to being entirely compliant with the letter of the specification! ;)
Deltics
An object in JSON is a name/value collection and this JSON is misusing the name. Json.NET's JObject in this example is doing exactly what a browser would do when provided with duplicate properties: uses the last value. The best solution is to change the JSON to either have a wrapper object with a property for the type name and a property for the value or add the type as a special property on the value object.
James Newton-King
The service is not for consumption by a web browser. I see little (no) point in writing it in a way for which it is not intended to be used when doing so is less efficient (not just in representation but in processing effort for the serialisation/deserialisation). Cup holders have no place in formula 1 cars. ;) If it violated the *specification* then you may have a point, but it doesn't. And that is the bottom line imho. I just don't see how "being right" can be "wrong".
Deltics
I used the web browser as an example because that is the most popular use of JSON and it defines how people expect a JSON object to work: a name/value collection. You aren't breaking the specification but you are breaking user expectations.
James Newton-King
Thank you for telling us what our users will expect. It is fascinating to me that you know what they expect when you don't know anything about the application (beyond choosing to ignore what I have already told you). Just in case the ACTUAL use is of ANY relevance (it may not be to you, but as the person implementing the thing, it sure is to me), I shall point out that JSON is being used in this case as a lightweight mechanism for passing data between two applications, NEITHER of which will be a web browser and NEITHER of which will be pumping the data through a JavaScript parser or engine.
Deltics
Calm - I'm giving my opinion, trying to help, not criticize you. Anyways, a user in this case is the developer. Developers are used to working with JSON objects that are name/value collections. The JSON home page - http://json.org/ - defines them as just that: a name/value collection. Again, you aren't breaking the spec but you are breaking user (developer) expectations. If you have a good reason to structure the JSON like you have then that is fine, just be aware that by doing something non-standard the consequence could be more questions for help like this one from the consumers of the JSON.
James Newton-King
Deltics - If you can find even one JSON parser not written at your company that supports duplicate names in a JSON name/value collection, I'll eat my hat. And if you tell me that means all JSON parsers don't comply with the spec, you're going to be laughed at...
Joel Mueller
@Joel - ANY JSON parser that doesn't throw an error when encountering duplicate names by definition SUPPORTS duplicate names. The Newtonsoft parser being used by my colleague is just such another example. Not only does it support duplicate names but it also allows code to WORK with JSON representations CONTAINING duplicate names. Do you want sauce with your hat? ROFL
Deltics
Deltics, James Newton-King's Newtonsoft parser does the same thing that JavaScript itself does - if it encounters duplicate keys, each successive duplicate key overwrites the previous value for that key. In the end, you have one key with one value. In this, it follows the spec, which does not require that multiple values be retained when duplicate keys are encountered. Matthew's workaround, above, involves working with the JSON token stream directly, specifically because of this issue. I'm afraid you're wrong on all counts. Nice try, though.
Joel Mueller
@Deltics "Any JSON parser that doesn't throw an error when encountering duplicate names by definition SUPPORTS duplicate names" Oh yeah? Do me a favor. Open up Firebug and run the following statement to parse your JSON and tell us all what happens. console.log(eval({ "TBox": { "Name": "SmallBox", "Length": 1, "Width": 1, "Height": 2 }, "TBox": { "Name": "MedBox", "Length": 5, "Width": 10, "Height": 10 }, "TBox": { "Name": "LargeBox", "Length": 20, "Width": 20, "Height": 10 }}))
Shawn Grigson
As I thought. You guys are confusing "JSON Parser" with "JavaScript engine". That's like saying "XML Parser" when you mean ".NET Framework" in connection with an .appdata configuration file. Yes, it is XML, but it is what you are DOING WITH IT that determines whether a particular USE of it is correct. Considered separately from what TYPE of document it is, whether XML is valid or not is defined by the XML specification. The same applies here. The JSON being produced is valid. It's not compatible with JavaScript EXECUTION, but that's OK because that's not how it is being used.
Deltics
@Shawn: That isn't exercising a JSON parser, it's executing JavaScript code. If you don't understand the difference then it's pointless discussing with you what you see, or trying to explain why it has not the slightest relevance to the issue at hand.
Deltics
@Joel: It does that if you use it for deserialising objects using the auto-deserialisation framework it provides, but if you treat it simply as a JSON PARSER it IS possible to access values with the same names as discrete values. I know you can, because I've seen it done (and I didn't even have to write the code myself).
Deltics
+1  A: 

If I'm not mistaken, the correct answer to this is that your input is not actually JSON. So no, getting a JSON parser to parse it probably isn't going to work.

You may not have any control over the source of the input, so I'd use a Regex or something to pre-filter the string. Turn it into something like:

{"TBoxes":
    [
        {
            "Name": "SmallBox",
            "Length": 1,
            "Width": 1,
            "Height": 2 
        },
        {
            "Name": "MedBox",
            "Length": 5,
            "Width": 10,
            "Height": 10 
        },
        {
            "Name": "LargeBox",
            "Length": 20,
            "Width": 20,
            "Height": 10 
        }
    ]
}

And treat it like the array that it is.
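
Once it's in that shape, a wrapper class makes deserialization trivial - roughly like this (TBoxList is just an illustrative name, TBox is your existing class, and filteredJson is the pre-filtered string above):

using System.Collections.Generic;
using Newtonsoft.Json;

// illustrative wrapper matching the pre-filtered JSON above
public class TBoxList
{
    public List<TBox> TBoxes { get; set; }
}

TBoxList result = JsonConvert.DeserializeObject<TBoxList>(filteredJson);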

Mike Ruhlin
I had a go at Regex, and it was proving difficult. In some of the real-world scenarios I'm going to have to handle objects within objects within objects. I tried it out just to be sure, and handling the nested curly-braces and the possible character sets quickly became a pain in the ass. I'm normally a huge fan of regex, but in this case I was hoping for something easier.
Ubiquitous Che
Turns out that they are *technically* still sending valid Json. See my second comment to Matthew above. It's annoying on my end, but I can handle it now.
Ubiquitous Che
You are mistaken - it is correct JSON.
Deltics
No, it's really not correct JSON. You're misinterpreting the spec, which is "based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999." There are no JavaScript parsers that will retain any but the very last value associated with multiple duplicate keys. Go ahead and tell Douglas Crockford, the inventor of JSON, that he's wrong.
Joel Mueller
@Joel: Wrong. There are at least TWO. The one I wrote and the one provided by Newtonsoft. Just because many parsers make the same invalid assumptions you do does not mean they are wrong. They correctly implement the spec. You don't have to like it, but persisting in the view that something that complies with the *specification* is wrong is just plain stupid. What you get when you "execute" the JS in a JSON object is not relevant to the question of what is a correct JSON structure, especially if the JSON structure is not intended to be executed as JS and is merely a data transport. End.
Deltics
I am correctly interpreting the spec - you are the one incorrectly interpreting it. The spec states that names SHOULD be unique. If it intended that names HAD TO BE unique then it would say that they MUST be unique. The spec clearly and deliberately uses the term SHOULD, not MUST, and refers to RFC 2119 which defines those terms. An interpretation which reads SHOULD as MUST is an incorrect interpretation. Fact.
Deltics
@Deltics - a parser that doesn't throw an error when it encounters duplicate keys, and also ends up producing an object that contains only one value per key, is following the spec. I am speaking, in this case, of all JSON parsers save yours, including Newtonsoft. If you want to keep using your own unique definition of the term "key-value pairs" by all means go ahead. But don't pretend that everyone else should rewrite their parsers to cater to you. But don't take my word for it. Here's a Delphi JSON parser. See what it does. http://goo.gl/UoKQ
Joel Mueller
I'm not pretending anything. I'm following the spec, not some imagined document that I think is a spec. shrug.
Deltics
I am not using my own definition of "name/value" pairs. NOTE: **NAME**/value, *NOT* **key**/value. The word "key" occurs only twice in the spec, both in connection with defining the difference between MUST and SHOULD. You have read the spec, haven't you? Even if you have, clearly you are using your own very unique definition of the word "specification", where what the specification says isn't what the specification means.
Deltics