I have a JavaScript string which is about 500K when sent from the server in UTF-8. How can I tell its size in JavaScript?

I know that JavaScript uses UCS-2, so does that mean 2 bytes per character? Or does it depend on the JavaScript implementation, on the page encoding, or maybe on the content type?

+1  A: 

String values are not implementation dependent. According to the ECMA-262 3rd Edition specification, each character represents a single 16-bit unit of UTF-16 text:

4.3.16 String Value

A string value is a member of the type String and is a finite ordered sequence of zero or more 16-bit unsigned integer values.

NOTE Although each value usually represents a single 16-bit unit of UTF-16 text, the language does not place any restrictions or requirements on the values except that they be 16-bit unsigned integers.
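
To make the quoted definition concrete, here is a minimal sketch of what `length` counts. The helper names `codeUnitCount` and `codePointCount` are made up for illustration, and `Array.from` assumes an ES2015 or later engine. Multiplying the code unit count by 2 gives the size of the string value as the specification describes it; how an engine actually lays strings out in memory may differ.

```javascript
// Hypothetical helpers, named here only for illustration.

// Number of 16-bit code units, which is what `length` reports.
function codeUnitCount(str) {
  return str.length;
}

// Number of Unicode code points (assumes ES2015+: Array.from iterates
// a string by code point, so a surrogate pair counts once).
function codePointCount(str) {
  return Array.from(str).length;
}

var bmpChar = "\u0444";          // ф, inside the Basic Multilingual Plane
var astralChar = "\uD834\uDD1E"; // U+1D11E, stored as a surrogate pair

console.log(codeUnitCount(bmpChar), codePointCount(bmpChar));
console.log(codeUnitCount(astralChar), codePointCount(astralChar));
```

A character outside the Basic Multilingual Plane occupies two 16-bit units, so `length` is neither a character count nor a UTF-8 byte count.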

CMS
My reading of that passage doesn't imply implementation independence.
Paul Biggar
UTF-16 is not guaranteed; the only guarantee is that strings are stored as sequences of 16-bit integers.
bjornl
+1  A: 

Paul, try combining encodeURIComponent with the JavaScript unescape function:

    var byteAmount = unescape(encodeURIComponent(yourString)).length;

Full encoding process example:


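    var s  = "1 a ф № @ ®";          // length is 11 (16-bit code units)
    var s2 = encodeURIComponent(s);  // length is 41 (UTF-8 bytes, percent-encoded)
    var s3 = unescape(s2);           // length is 15, one char per UTF-8 byte [1-1, a-1, ф-2, №-3, @-1, ®-2, plus 5 spaces]
    var s4 = escape(s3);             // length is 39
    var s5 = decodeURIComponent(s4); // length is 11 (back to the original string)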
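
The same trick can be wrapped in a small helper. This is only a sketch: the name `utf8ByteLength` is made up here, and the `TextEncoder` branch assumes a newer browser or Node.js runtime that provides that API; otherwise it falls back to the unescape/encodeURIComponent combination shown above.

```javascript
// Hypothetical helper: returns the number of bytes `str` occupies in UTF-8.
function utf8ByteLength(str) {
  // Assumption: TextEncoder is available (modern browsers, recent Node.js).
  if (typeof TextEncoder !== "undefined") {
    return new TextEncoder().encode(str).length;
  }
  // Fallback: percent-encode as UTF-8, then turn each %XX into one char.
  return unescape(encodeURIComponent(str)).length;
}

console.log(utf8ByteLength("1 a ф № @ ®")); // 15, matching s3.length above
```

Note that the two branches treat malformed input differently: encodeURIComponent throws a URIError on a lone surrogate, while TextEncoder substitutes U+FFFD.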

See the additional screenshot: http://dl.dropbox.com/u/2086213/%3Dcoding%3D/js_utf_byte_length.png (I am a new user, so I can't use the img tag)

Kinjeiro