ansaurus

Question

Interesting AS3 hash situation. Is it really using strict equality as the documentation says?

Answer 1

+2 A:

The description given by Adobe is neither exact nor correct to be honest, but it is simpler and covers most cases.

You should try the following:

for (var key:* in dic) trace(getQualifiedClassName(key));//will give you 'String'

this behaviour is true for Array and Object as well.

As a rule of thumb: int stays int and any other key is converted to its string representation (including boolean and float values, as well as null and undefined).

The Dictionary class is different in that it does not convert non-primitive objects to String, but uses them as a key directly. They way of handling all other values is "inherited" Object if you will.

Basically, an Object consists of 2 hashes, one for string and one for integer keys. And the Dictionary adds another hash, probably simply using the objects memory address as integer key.

edit: Not really an answer to the actual question, but a point I want to explain in detail, in response to Triynko's comment:

Hacky? You mean, making it work in a way it's not designed to work? Well... since it's not designed to handle 64-bit integers, of course it's a hack. But 64-bits is 64-bits, whether they're interpretted as an integer or floating point.

Using 64 Bit floats to represent 64 Bit ints is already a bad idea, because semantically, it's completely wrong (you can usually expect problems to arise from this kind of misuse).

But then using their String representation as key (which implicetely happens if you use a float as key) is plain suicide:

var f1:Number = 1000000000000003000000000.0;
var f2:Number = 1000000000000002000000000.0;
trace(f1 == f2);//false
trace(String(f1) == String(f2));//true ... kabooooom

You will have a hard time determining, when 2 64 bit ints will collide, since the prequisite is that the string representation of their values interpreted as floats must be equal. Also, different player versions might have different string conversion of floats, as may alternative runtimes such as LightSpark. I really wouldn't wanna rely on this. This yields the kind of bugs that seem to come out of nowhere, when there's enough data to cause collision. And you will not enjoy tracking them down.

Also, float keys perform worse, since they have to be converted to strings before used for hash look up. If you are really so concerned about size, then you'll have to transmit the data as 64 bit int and convert it to a hex String on flash side only.
None the less, I would point out, that many people are extremely happy using XML or JSON, which has far worse overheads.

Bandwidth and any other hardware resources are cheap. Developement is expensive. You should write your apps to be maintainable and robust, otherwise they cost you more in the long run.

greetz
back2dos

back2dos 2010-05-25 18:35:00

Will two slightly different IEEE-754 double-precision floating-point values ever resolve to the same string? For example, (0.3).toString() and (0.1+0.2).toString() are slightly different floating point values, which resolve to different strings. So I'm wondering if there are any exceptional values that are indeed slightly different, but resolve to the same string (unlike my example).

Triynko 2010-05-25 22:52:18

I use .NET Int64 IDs in SQL Server, and I'm marshalling them to Flash as doubles, using BitConverter.Int64BitsToDouble, since flash supports IEEE-754 doubles, but not 64-bit integers. I'd like to use them as dictionary keys, but I'm not convinced it'll work... since I'll be dealing with many keys in the dictionary that differ only by a single bit. If Dictionary is comparing strings, then there could be a problem. I will make my own Dictionary that writes the double's to a ByteArray, then reads each half as a 32-bit int key for comparison, and uses a two-level hash with 32-bit int keys.

Triynko 2010-05-25 23:02:36

Seems to work, using double's as keys at least through 100,000, so I may just use them as-is. I wrote 3 64-bit integers at a time to a byte array, with each one incremented by one, so (0,1,2) were there the first time... then I read them out as doubles and used them as keys, and tried to access key1 with keys 2 and 3. It was never able to access them, so I know it's treating numbers that differ only by a couple bits as different keys. That's good to know.

Triynko 2010-05-26 00:28:35

@Triynko: I think, what you're doing is very hacky. Rather than interpreting 64 bit ints as doubles, you should convert them to their hexadecimal string representation. It costs you twice the size at maximum and it is signifficantly safer

back2dos 2010-05-26 10:03:09

@back2dos: Hacky? You mean, making it work in a way it's not designed to work? Well... since it's not designed to handle 64-bit integers, of course it's a hack. But 64-bits is 64-bits, whether they're interpretted as an integer or floating point. Transmitting it as a string, would actually require more than double the space, because in addition to the eight bytes, it must transmit the length of the string. Furthermore, DOUBLING the data sent is not a good option. As long as .NET BitConverter and AS3 ByteArray both work with IEEE-754 doubles, this will work. And it's working fine now :)

Triynko 2010-05-26 17:24:25

@back2dos: Objects do not treat int keys differently. All keys get evaluated to strings, including ints. (That is, obj["1"] is the same as obj[1].)

fenomas 2010-05-26 17:31:10

@fenomas: wrong. my point was, that `int` keys are stored as `int`s, which is something you may feel free to verify. Use `var o:Object = { }; o[1] = 5; for (var key:* in o) trace(getQualifiedClassName(key));`

back2dos 2010-05-26 17:43:54

I think the take-away lesson here is that all of your keys should be the same type (to ensure uniqueness), and they ultimately resolve to a strict string equality test (for primitive types) or a strict integer equality test (for non-primitive objects), which makes a TON of sense, because you avoid strange comparison rules of floating point values, and you don't risk an (integer) non-primitive object hash (whose value you don't know) colliding with a primitive-value key (since it's converted to a string and stored separately). Otherwise, just imagine an object-as-key hashing to a common value!

Triynko 2010-05-26 21:52:17

@back2dos: I didn't say that object keys get converted to strings, but that they are *evaluated* as strings. In other words, object[1] and object["1"] always evaluate to the same object.

fenomas 2010-05-27 00:41:13

I think fenomas last point is correct in that obj[1] and obj["1"] produces a collision, which I kind of expected for Objects but comes as a total WTF for dictionaries. `var d:Dictionary = new Dictionary(); d["1"] = "stringKey goes here";d[1] = "integerKey goes here"; for (var dictKey:* in dict) trace(getQualifiedClassName(dictKey))`. Running he above code, the dictionary ends up with just one entry (the last one added), when it's supposed to have 2, one for the integer 1 and other for the string "1".

Juan Pablo Califano 2010-05-27 01:27:16

Yet, if you look at the trace, you'll see the above code traces "int", whether you do `dict[1] = "integerKey goes here";dict["1"] = "stringKey goes here";` or you invert the order. That suggests, perhaps, that before hashing, it first tries to parse the key as an integer (or Number?). If it succeeds, it treats the key as as an int.

Juan Pablo Califano 2010-05-27 01:31:53

@Triynko. Re: 'I think the take-away lesson here is that all of your keys should be the same type (to ensure uniqueness)'. I wholeheartedly agree about that after discovering all this mess!

Juan Pablo Califano 2010-05-27 01:35:23

@Triynko: updated my post to explain, why I think you shouldn't not use your little hack.

back2dos 2010-05-27 11:10:14

@back2dos: I don't think you realize the full extent of what I did. First, it's a hack either way. Comparing floats or floats-as-strings would be a VERY BAD hack. But I did a GOOD HACK. C# supports Int64, so comparisons are fine there. But on the Flash end, I'm not actually doing any comparison on the Number class (just using it for storage); rather, I'm calling the ByteArray.writeDouble method to write the bits, then using two readInt calls. This is the Flash equivalent of C# BitConverter methods I'm using. This ensures integer comparison only in Flash, rather than string or float.

Triynko 2010-05-27 18:55:56

@back2dos (continued...): I'm also not using Number directly as a key. I'm actually keeping a list of objects, which I scan to find the one with a matching ID (Number). The (Number) ID comparison is done by my "CompareNumberBits( num1:Number, num2:Number ) method, which calls ByteArray.writeDouble on each number, then reads them back as 32-bit integers for comparison. So like I said... as long as IEEE-754 standard is honored, and C#'s BitConverter and Flash's ByteArray work properly (which I must/should be able to assume), then I should be fine... and it is working fine :)

Triynko 2010-05-27 19:02:47

@back2dos (continued2...): I would concede, however, that (if bandwidth is not an issue) using a hex-string representation would be FAR simpler and maintainable, and the string could be used directly/efficiently as a Flash Dictionary key. In fact... now that we've established exactly how the Dictionary works... this is a very convincing argument. Yeah, I think hex string is optimal (bandwidth be damned! lol).

Triynko 2010-05-27 19:19:08

Answer 2

A:

I'm afraid you've tripped over something pretty tricky here. Dictionary works as advertised in the sense that it uses object identities for keys, and does not test objects by value. And in most cases, that works out to the same thing as using strict equality, because in most cases two references to the same object are strictly equal to each other.

The wrinkle is that the AS3 specification has a special case: all comparisons with NaN are considered false by definition - even comparisons between NaN and itself (which is what your sample code is doing). There is even a compile-time warning to the this effect if NaN appears explicitly in the comparison. Also, there are other special cases for some other primitives like 'true' and 'undefined'.

It should be easy to see what's going on if you try a different test value:

import flash.utils.Dictionary;
var num1:Number = Number.POSITIVE_INFINITY;
var num2:Number = 1 / 0;
var dic:Dictionary = new Dictionary( true );
trace(num1); // Infinity
trace(num2); // Infinity
dic[num1] = "A";
trace( num1 == num2 ); // true!!
trace( num1 === num2 ); // true!!
trace( dic[num1] ); //A
trace( dic[num2] ); //A

That might seem odd depending on what languages you're used to, but all it means is that when you create two different references to infinity, AS3 doesn't give you two different objects whose value is infinity, it gives you two references to the same infinity object. Hence the references are strictly equal and they key to the same value. And using NaN instead of Infinity works the same way in every respect, except the part where you compare the value to itself - when it returns false due to NaN being a special case.

fenomas 2010-05-26 17:28:59

a lot of unverified claims, including your comment on my post. 1. the code given does not generate warnings, simply because there's no explicit comparison with NaN. 2. floats are never referenced, but stored directly. there is a float value for infinity. BTW. all primitives (even String, which technically isn't one), are considered strictly equal, if they are of the same type and equal, thus infinity === infinity. NaN is the only exception. 3. The implementation of Dictionary does not use the === operator. Following keys: null, "null", true, "true", undefined, "undefined" are considered equal.

back2dos 2010-05-26 17:58:18

also, since this is by far the first time, I have to say I feel a little offended by your behaviour. You rarely seem to put effort or thought or the patience to verify your claims into your answers and always presume others don't do that as well, which simply is not the case. hell you even claim, code should produce warnings, when you only need to try compiling, to see it doesn't.

back2dos 2010-05-26 18:02:26

@fenomas: Yeah, I would expect positive infinities to compare equal. I know NaN is an exception, HENCE I specifically chose it as a test case, and HENCE my surprise and objection to the documentation, since such a key should not be retrievable at all (since NaN does not strictly equal anything, even NaN). So, clearly dictionary is NOT using strict equality (===), but rather something similar. It is definitely using a string comparison for PRIMITIVE values, as evidenced by (back2dos saying) getQualifiedClassName(key) returns "string", and I can retrieve a key of "NaN" using Number.NaN.

Triynko 2010-05-26 21:39:23

@back2dos: Regarding compile time warnings, thanks for the correction. About your point 2 I agree, but I'm not sure what you're disagreeing with. About whether Dictionary uses ===, I think the misleading thing is that the docs suggest there's a comparison taking place - naturally, Dictionary doesn't really test strict equality or any other kind of equality, it uses a lookup table.

fenomas 2010-05-27 01:07:49

@back2dos cont.: About your being offended, I'm not sure what you're referring to. I don't think I implied anything here about the amount of effort put into your answer, so I can only presume you're talking about an answer I wrote to some other question. If that's the case, I'm happy to make any corrections necessary but I need a hint about what you're taking issue with. Telling me here that my answers elsewhere were bad is not feedback I can do anything with!

fenomas 2010-05-27 01:16:33

@Triynko: As I said above, I think one travels down the wrong path when talking about Dictionary using a string comparison for this or a strict comparison for that - because of course it doesn't really traverse a list of keys doing comparisons, it simply stores and retrieves data from a hash. I think what the docs are trying to say is that Dictionary and === differentiate in the same way, though as you've shown that isn't always true. But I think that talking about which comparison operator gets used is bound to lead to inconsistencies, since no comparison really takes place.

fenomas 2010-05-27 01:37:55

@fenomas: Actually, a hash DOES traverse a list and compare keys. The hash function generally locates a hash bucket index... but the bucket contains a list of key/value pairs, which must be walked when hash collisions occur and keys end up in the same bucket. Right? But otherwise, yeah, we're all pretty much on the same page now.

Triynko 2010-05-27 19:23:41

@Triynko: That's true, but it doesn't happen in the domain we're talking about. By the time the table lookup occurs, the keys have already been converted to an internal type (which is a plain integer as far as c++ is concerned). So what's important for your purposes is not how the keys are compared, but how they are converted. Which is admittedly a question that back2dos and I are not really answering, we're just talking about mental models that approximate the actual answer. But you can always go source-diving! ;)

fenomas 2010-05-28 00:10:37

ansaurus

tags:

views:

answers:

Interesting AS3 hash situation. Is it really using strict equality as the documentation says?

related questions