One can deploy several versions of the same application on GAE/J, but how does GAE/J deal with the fact that different versions can use different (and possibly incompatible) Datastore schemas?

Example:

Suppose that in version 1 of my application I have a POJO like this (I've left out several details for the sake of simplicity):

public class User {

  private String key;

  private String username;

  private Integer phoneNumber;

}

Now suppose that on version 2 I want to use:

public class User {

  private String key;

  private String username;

  // in this version, 'phoneNumber' is replaced by:
  private String eMail;

}

Now two questions:

  1. If I deploy both versions on GAE/J, what schema will I see in the Datastore?

  2. What about the data itself? If I add a user in version 2, will I see its data in the Datastore from version 1?

+4  A: 

Quoting the docs,

Unlike relational databases, the App Engine datastore does not require that all entities of a given kind have the same properties. The application can specify and enforce its data model using libraries included with the SDK, or its own code.

This has also been referred to as "soft schema" -- the datastore doesn't really do schemas, but you can more or less simulate some soft kind of schema via application level code (your own, or in libraries).

So if you (via the library or in your own code) enforce a constraint that says "this attribute must be present", and a certain entity doesn't actually have that attribute (because it was inserted under a different "soft schema", for example by a different version of the app), then you'll get whatever exception your application-level code or library chooses to use to indicate a violation of this soft constraint, at the point at which the constraint is checked.
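To make the "soft constraint" idea concrete, here is a minimal sketch. It deliberately does not use the App Engine SDK: a plain `Map` stands in for a schemaless entity, and `requireString` is a hypothetical helper of my own, not a library call. The point is only that the exception comes from application code, at the moment the check runs.

```java
import java.util.HashMap;
import java.util.Map;

// Assumption: a Map<String, Object> stands in for a schemaless datastore entity.
class SoftConstraintDemo {

    // Hypothetical application-level check; the storage layer never raises this itself.
    static String requireString(Map<String, Object> entity, String property) {
        Object value = entity.get(property);
        if (!(value instanceof String)) {
            throw new IllegalStateException("missing mandatory property: " + property);
        }
        return (String) value;
    }

    public static void main(String[] args) {
        // Entity as version 1 would have written it: no eMail property at all.
        Map<String, Object> writtenByV1 = new HashMap<>();
        writtenByV1.put("username", "alice");
        writtenByV1.put("phoneNumber", 5551234);

        // Present property: the check passes.
        System.out.println(requireString(writtenByV1, "username"));

        // Absent property: the violation surfaces only here, when version 2 checks it.
        try {
            requireString(writtenByV1, "eMail");
        } catch (IllegalStateException e) {
            System.out.println("constraint violated: " + e.getMessage());
        }
    }
}
```

Storing the entity succeeded without complaint; only the reader that insists on `eMail` fails.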

If you express no such constraints, then an attribute that's missing will either have a default value supplied by your code or library, or else a "default default", which I believe is usually null in Java or None in Python.

Note that different versions of the app may use different runtimes (some may be Java and others may be Python) and the different runtimes will still use the same datastore, so the Java vs Python distinction is not crucial here.

In your specific example (with no default provided and no assertion made about mandatory presence) I would expect that adding a user from either version will make it visible from the other, with missing attributes seen as null (but there may be constraints that I'm not aware of, in which case an exception should result when a library tries to validate those constraints and sees they're being violated).
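As a rough illustration of that expectation, the sketch below simulates one stored entity being read by both versions of `User`. It is not App Engine code: the `Map` is an assumed stand-in for the stored entity, and the `UserV1` / `UserV2` names are mine, chosen only so both versions can live in one file.

```java
import java.util.HashMap;
import java.util.Map;

// Assumption: a Map<String, Object> stands in for one stored "User" entity.
class SharedEntityDemo {

    // Version 1's view: knows about phoneNumber, not eMail.
    static final class UserV1 {
        final String username;
        final Integer phoneNumber;

        UserV1(Map<String, Object> entity) {
            this.username = (String) entity.get("username");
            this.phoneNumber = (Integer) entity.get("phoneNumber"); // null when absent
        }
    }

    // Version 2's view: knows about eMail, not phoneNumber.
    static final class UserV2 {
        final String username;
        final String eMail;

        UserV2(Map<String, Object> entity) {
            this.username = (String) entity.get("username");
            this.eMail = (String) entity.get("eMail"); // null when absent
        }
    }

    public static void main(String[] args) {
        // A user added by version 2.
        Map<String, Object> stored = new HashMap<>();
        stored.put("username", "bob");
        stored.put("eMail", "bob@example.com");

        // Version 1 still sees the entity; the attribute it expects is simply null.
        UserV1 seenByV1 = new UserV1(stored);
        System.out.println(seenByV1.username);    // prints bob
        System.out.println(seenByV1.phoneNumber); // prints null

        // Version 2 sees the attribute it wrote.
        UserV2 seenByV2 = new UserV2(stored);
        System.out.println(seenByV2.eMail);       // prints bob@example.com
    }
}
```

Whether a real persistence library behaves exactly like this depends on the constraints it applies, which is the caveat above.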

In general terms, I would not worry about adding "optional" attributes: ones which may legitimately be missing/null/None, or which have an explicit default in those cases, so that entities written by the older version are still correctly readable. Other kinds of changes (making a previously missing or optional attribute mandatory, adding other constraints, etc.) may require a form of "database migration" (maybe via the Secure Data Connector), or "application level hacks for legacy compatibility" if a migration is not feasible.

Migration may not be feasible, in particular, if you require the ability to roll back to previous app versions. Indeed, in those cases other operations become problematic too: removing constraints becomes just as problematic as adding them, since the old version might be unable to deal with data entered in the new version that violates constraints the new version has removed.
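One possible shape for such a legacy-compatibility hack, sketched under assumptions: a `Map` again stands in for the stored entity, and `readForV2` and `UNKNOWN_EMAIL` are hypothetical names of mine. The reader fills in a default on a copy and leaves the stored properties alone, so an older version can still read what it wrote after a rollback.

```java
import java.util.HashMap;
import java.util.Map;

// Assumption: a Map<String, Object> stands in for a stored entity.
class LegacyReadDemo {

    // Hypothetical default used when an older version never wrote eMail.
    static final String UNKNOWN_EMAIL = "";

    // Returns a normalized copy for version 2; the stored entity is not modified,
    // so properties that only version 1 understands (phoneNumber) are preserved.
    static Map<String, Object> readForV2(Map<String, Object> stored) {
        Map<String, Object> view = new HashMap<>(stored);
        view.putIfAbsent("eMail", UNKNOWN_EMAIL);
        return view;
    }

    public static void main(String[] args) {
        // Entity written by version 1.
        Map<String, Object> legacy = new HashMap<>();
        legacy.put("username", "alice");
        legacy.put("phoneNumber", 5551234);

        Map<String, Object> view = readForV2(legacy);
        System.out.println(UNKNOWN_EMAIL.equals(view.get("eMail"))); // prints true
        System.out.println(legacy.containsKey("eMail"));             // prints false
        System.out.println(view.get("phoneNumber"));                 // prints 5551234
    }
}
```

The trade-off is that every reader has to carry this normalization for as long as old-style entities exist, which is why a real migration is preferable when rollback is not a requirement.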

So it's not necessarily a simple problem in practice, but it still helps to think of it this way: the datastore per se has no schema. Only the app and/or the libraries it chooses to use enforce whatever constraints are desired, at app level, on underlying entities which, per se, each really have an arbitrary set of attributes: a "soft schema", an application-level schema, with no "actual" schema in the underlying data layer.

Alex Martelli