tags:

views:

67

answers:

2

Having a snippet like this:

import yaml
class User(object):
    def __init__(self, name, surname):
       self.name= name
       self.surname= surname

user = User('spam', 'eggs')
serialized_user = yaml.dump(user)
#Network
deserialized_user = yaml.load(serialized_user)
print "name: %s, sname: %s" % (deserialized_user.name, deserialized_user.surname)

Yaml docs says that it is not safe to call yaml.load with any data received from an untrusted source; so, what do i need to modify to my snippet\class to use safe_load method?
Is it possible?

+4  A: 

It appears that safe_load, by definition, does not let you deserialize your own classes. If you want it to be safe, I'd do something like this:

import yaml
class User(object):
    def __init__(self, name, surname):
       self.name= name
       self.surname= surname

    def yaml(self):
       return yaml.dump(self.__dict__)

    @staticmethod
    def load(data):
       values = yaml.safe_load(data)
       return User(values["name"], values["surname"])

user = User('spam', 'eggs')
serialized_user = user.yaml()
print "serialized_user:  %s" % serialized_user.strip()

#Network
deserialized_user = User.load(serialized_user)
print "name: %s, sname: %s" % (deserialized_user.name, deserialized_user.surname)

The advantage here is that you have absolute control over how your class is (de)serialized. That means that you won't get random executable code over the network and run it. The disadvantage is that you have absolute control over how your class is (de)serialized. That means you have to do a lot more work. ;-)

Benson
+1 Crystal clear.I have seen __dict__ also for dumps with JSON.
systempuntoout
Excellent! Basically you're just dumping the member namespace, and the best way to do that is a dict.
Benson
+2  A: 

Another way exists. From the PyYaml docs:

A python object can be marked as safe and thus be recognized by yaml.safe_load. To do this, derive it from yaml.YAMLObject [...] and explicitly set its class property yaml_loader to yaml.SafeLoader.

You also have to set the yaml_tag property to make it work.

YAMLObject does some metaclass magic to make the object loadable. Note that if you do this, the objects will only be loadable by the safe loader, not with regular yaml.load().

Working example:

import yaml

class User(yaml.YAMLObject):
    yaml_loader = yaml.SafeLoader
    yaml_tag = u'!User'

    def __init__(self, name, surname):
       self.name= name
       self.surname= surname

user = User('spam', 'eggs')
serialized_user = yaml.dump(user)

#Network

deserialized_user = yaml.safe_load(serialized_user)
print "name: %s, sname: %s" % (deserialized_user.name, deserialized_user.surname)

The advantage of this one is that it's prety easy to do; the disadvantages are that it only works with safe_load and clutters your class with serialization-related attributes and metaclass.

Petr Viktorin
nice solution.Thanks
systempuntoout