Hi everyone

I would like to do the following: 1) Serialize my class. 2) Manually edit the serialization dump file to remove certain objects of my class that I find unnecessary.

I am currently using Python with simplejson. As you know, simplejson converts all characters to Unicode escapes. As a result, when I dump a particular object with simplejson, the Unicode characters become something like "\u00bd" for 好.

I would like to edit the simplejson output file by hand for convenience. Does anyone here know a workaround for this?

My requirements for this serialization format: 1) Easy to use (just dump and load - done). 2) Allows me to edit the files manually without much hassle. 3) Able to display Chinese characters.

I use Vim. Does anyone know a way to convert "\u00bd" to 好 in Vim?

+1  A: 

I don't know anything about simplejson or the Serialisation part of the question, but you asked about converting "\u00bd" to 好 in Vim. Here are some vim tips for working with unicode:

  • You'll need the correct encoding set up in vim, see:

    :help 'encoding'
    :help 'fileencoding'
    
  • Entering unicode characters by number is simply a case of going into insert mode, pressing Ctrl-V and then typing u followed by the four-digit hexadecimal code (or U followed by an 8-digit hexadecimal code). See:

    :help i_CTRL-V_digit
    
  • Also bear in mind that in order for the character to display correctly in Vim, you'll need a fixed-width font containing that character. It appears as a wide space in Envy Code R and as various boxes in Lucida Console, Consolas and Courier New.

  • To replace \uXXXX with unicode character XXXX (where X is any hexadecimal digit), type this when in normal mode (where <ENTER> means press the ENTER key, don't type it literally):

    :%s/\\u\x\{4\}/\=eval('"' . submatch(0) . '"')/g<ENTER>
    

Note however that u00bd appears to be unicode character ½ (1/2 in case that character doesn't display correctly on your screen), not the 好 character you mentioned (which is u597D I think). See this unicode table. Start vim and type these characters (where <Ctrl-V> is produced by holding CTRL, pressing V, releasing V and then releasing CTRL):

    i<Ctrl-V>u00bd

You should see a small character looking like 1/2, assuming your font supports that character.
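
For anyone who would rather do the same \uXXXX replacement outside Vim, here is a minimal Python 3 sketch of the same idea. The helper name unescape_u is made up for illustration. Like the Vim substitution above, it only looks at four hex digits after \u, so it does not combine surrogate pairs and it will also rewrite a \u that was meant as a literal backslash followed by u.

```python
import re

def unescape_u(text):
    """Replace each literal \\uXXXX sequence with the character it names.

    Hypothetical helper: mirrors the Vim substitution, nothing more.
    """
    return re.sub(
        r'\\u([0-9a-fA-F]{4})',
        lambda m: chr(int(m.group(1), 16)),  # hex digits -> code point -> character
        text,
    )

# The raw string keeps the backslashes literal, as they appear in the dump file.
print(unescape_u(r'"\u597d" and "\u00bd"'))
```

This should print the two characters 好 and ½ inside their quotes, provided the terminal uses a font and encoding that can show them.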

Al
Sweet. Thanks for this. I hadn't seen this. Also ^vx followed by 2 digits works.
jcdyer
A: 

If you want json/simplejson to produce unicode output instead of str output with Unicode escapes, pass ensure_ascii=False to dump()/dumps(). Then either encode the result before saving it or write it to a file-like object from codecs.
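
A rough sketch of what that looks like, assuming Python 3 and the standard-library json module standing in for simplejson (simplejson's dumps() takes the same ensure_ascii argument). The sample data and the file name dump.json are made up for illustration, and io.open with an explicit encoding is used here in place of a codecs file object.

```python
import io
import json
import os
import tempfile

# Made-up sample data containing non-ASCII characters.
data = {"greeting": "好", "half": "½"}

# Default behaviour: non-ASCII characters come out as \uXXXX escapes.
escaped = json.dumps(data)

# ensure_ascii=False keeps the characters themselves, so the file
# stays readable and editable by hand; indent is optional.
readable = json.dumps(data, ensure_ascii=False, indent=2)

# Write and read back with an explicit UTF-8 encoding.
path = os.path.join(tempfile.mkdtemp(), "dump.json")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(readable)

with io.open(path, encoding="utf-8") as f:
    restored = json.load(f)

print(escaped)
print(readable)
```

The file written this way can be opened in Vim with fileencoding set to utf-8, edited, and loaded again with json.load().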

Ignacio Vazquez-Abrams
that worked out great thx a lot man
sadawd
that did it. I never thought the answer would be so simple. THX!
sadawd