I wanted to use CSV-based export/import for large data sets on App Engine. My idea was simple:

  • The first column of the CSV holds the entity's key.
  • If it is not empty, the row represents an existing entity and should overwrite it.
  • Otherwise, the row represents a new entity and should create one.
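
The row rule above can be sketched independently of the bulkloader. This is only an illustration: split_rows is a hypothetical helper, and 'KEY1' is a placeholder standing in for an exported key string.

```python
import csv
import io

def split_rows(csv_text):
    """Split CSV rows into (updates, creates) based on the first column."""
    updates, creates = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        key, fields = row[0].strip(), row[1:]
        if key:
            # Non-empty first column: the row refers to an existing entity.
            updates.append((key, fields))
        else:
            # Empty first column: the row describes a new entity.
            creates.append(fields)
    return updates, creates

# 'KEY1' is a placeholder, not a real datastore key.
updates, creates = split_rows("KEY1,Alice\n,Bob\n")
```

Here updates collects the rows that should overwrite existing entities and creates collects the rows that should become new ones.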

I could export each entity's key by adding a __key__ column to the exporter.

class FrontExporter(bulkloader.Exporter):
    def __init__(self):
        bulkloader.Exporter.__init__(self, 'Front', [
        ('__key__', str, None),
        ('name', str, None),
        ])

But uploading the CSV failed, because bulkloader.Loader.generate_key() supplies only a 'key_name', not the key itself. That means every exported entity in the CSV must have a unique 'key_name' if I want to modify and re-upload it.

class FrontLoader(bulkloader.Loader):
    def __init__(self):
        bulkloader.Loader.__init__(self, 'Front', [
        ('_UNUSED', lambda x: None),
        ('name', lambda x: x.decode('utf-8')),
        ])
    def generate_key(self,i,values):
        # first column is key
        keystr = values[0]
        if len(keystr)==0:
            return None
        return keystr

I also tried to load the key directly as a column, without using generate_key(), but both variants failed.

class FrontLoader(bulkloader.Loader):
    def __init__(self):
        bulkloader.Loader.__init__(self, 'Front', [
        ('Key', db.Key), # not working; just creates a new entity
        ('__key__', db.Key), # same result
        ])

So, how can I overwrite an existing entity that has no 'key_name'? It would be horrible if I had to give a unique name to every entity.


Following the first answer, I was able to handle this problem. :)

def create_entity(self, values, key_name=None, parent=None):
  # if key_name is None:
  #     print 'key_name is None'
  # else:
  #     print 'key_name=<',key_name,'> : length=',len(key_name)
  Validate(values, (list, tuple))
  assert len(values) == len(self._Loader__properties), (
      'Expected %d columns, found %d.' %
      (len(self._Loader__properties), len(values)))

  model_class = GetImplementationClass(self.kind)

  properties = {
      'key_name': key_name,
      'parent': parent,
      }
  for (name, converter), val in zip(self._Loader__properties, values):
    if converter is bool and val.lower() in ('0', 'false', 'no'):
      val = False
    properties[name] = converter(val)

  if key_name is None:
      entity = model_class(**properties)
      #print 'create new one'
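  else:
      # key_name holds the encoded key string from the CSV's first column.
      entity = model_class.get(key_name)
      for key, value in properties.items():
          # 'key_name' and 'parent' are constructor arguments, not model properties.
          if key in ('key_name', 'parent'):
              continue
          setattr(entity, key, value)
      #print 'overwrite old one'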
  entities = self.handle_entity(entity)

  if entities:
    if not isinstance(entities, (list, tuple)):
      entities = [entities]

    for entity in entities:
      if not isinstance(entity, db.Model):
        raise TypeError('Expected a db.Model, received %s (a %s).' %
                        (entity, entity.__class__))

  return entities

def generate_key(self,i,values):
    # first column is key
    if values[0] is None or values[0] in ('',' ','-','.'):
        return None
    return values[0]
A: 

Your best option is probably to override create_entity. You'll need to copy most of the existing code there, but modify the constructor to supply a key argument instead of a key_name argument.
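
As a rough sketch of that idea, the constructor arguments could be assembled like this. constructor_kwargs and make_key are hypothetical names; in a real loader make_key would presumably be db.Key, which is assumed here rather than shown, so the sketch runs without the SDK.

```python
def constructor_kwargs(key_str, properties, make_key=str):
    """Build keyword arguments for a model constructor.

    make_key is a stand-in for a key constructor such as db.Key (assumption).
    """
    kwargs = dict(properties)
    if key_str:
        # A full key already encodes the entity's name or id and its parent,
        # so drop the arguments it makes redundant.
        kwargs.pop('key_name', None)
        kwargs.pop('parent', None)
        kwargs['key'] = make_key(key_str)
    return kwargs

props = {'key_name': None, 'parent': None, 'name': 'x'}
with_key = constructor_kwargs('K1', props, make_key=lambda s: ('Key', s))
without_key = constructor_kwargs('', props)
```

The overridden create_entity would then call model_class(**kwargs) instead of passing key_name.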

Nick Johnson
A: 

I get the following error when I try this:

NameError: global name 'Validate' is not defined

Any thoughts? Am I missing a library inclusion?

StephenW
from google.appengine.tools.bulkloader import Validate, GetImplementationClass :)
Ray Yun
Sorry, overwriting an existing entity's properties is not working now...
Ray Yun