Hey all,

As far as I understand from the App Engine tutorial, entity groups exist only for the purpose of transactions:

"Only use entity groups when they are needed for transactions" (from the tutorial)

The definition of being in the same entity group is to have the same root. In that case, what is the use of having more than one hierarchy level? That is, why should I use "A -> B -> C" (A is the root, B its child, C its grandchild) instead of "A -> B ; A -> C"? (A, B and C are still in the same entity group, since A is their root.)

If the only purpose of entity groups is to make transactions possible between entities, why should I use more than one hierarchy level (what do I gain from a root -> grandchild linkage)?

A: 

When you store A -> B -> C, A has many Bs, and a B has many Cs. When you store A -> B and A -> C, A has many Bs and many Cs. In other words, in the second layout a C doesn't belong to any particular B.

Which structure you use really depends on the data you're storing.
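To make the difference concrete, here is a minimal sketch that models a key as its full path from the root. This is a plain-Python stand-in, not the App Engine API: `make_key`, `root_of` and `parent_of` are hypothetical helpers invented for illustration.

```python
# Hypothetical model: a key is its full path from the root, written as a
# tuple of (kind, name) pairs. This only mimics datastore key paths.

def make_key(kind, name, parent=()):
    """Return the key path of a new entity stored under `parent`."""
    return parent + ((kind, name),)

def root_of(key):
    """The entity group is identified by the first element of the path."""
    return key[:1]

def parent_of(key):
    """Key path of the direct parent (empty tuple for a root entity)."""
    return key[:-1]

a = make_key("A", "a1")

# Layout 1: A -> B -> C
b = make_key("B", "b1", parent=a)
c_nested = make_key("C", "c1", parent=b)

# Layout 2: A -> B ; A -> C
c_flat = make_key("C", "c1", parent=a)

# Both layouts put everything in the same entity group...
same_group = root_of(b) == root_of(c_nested) == root_of(c_flat) == a

# ...but only the nested C records which B it belongs to.
nested_parent = parent_of(c_nested)  # key path of b
flat_parent = parent_of(c_flat)      # key path of a
```

In both layouts the root is A, so transactions can span all three entities; the only thing the extra level adds is that the C's key itself says which B it hangs off.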

When you have lots of write accesses, you might have to do unintuitive things to your entity groups; see Sharding Counters for an example of this.
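The idea behind a sharded counter can be sketched roughly as follows. This is an in-memory illustration only: the `shards` dict stands in for the datastore, and `NUM_SHARDS`, `increment` and `get_count` are assumed names, not taken from the article. In a real app each shard would be its own root entity, so writes to different shards don't contend with each other.

```python
import random

# Hypothetical in-memory stand-in for the datastore. In a real application
# each shard would be a separate root entity, i.e. its own entity group.
NUM_SHARDS = 20  # assumed value; pick it based on the expected write rate

shards = {"counter-shard-%d" % i: 0 for i in range(NUM_SHARDS)}

def increment(rng=random):
    """Add one to a randomly chosen shard, spreading writes across groups."""
    name = "counter-shard-%d" % rng.randrange(NUM_SHARDS)
    shards[name] += 1

def get_count():
    """The total is the sum over all shards, so a read touches every shard."""
    return sum(shards.values())

for _ in range(1000):
    increment()

total = get_count()
```

The trade-off is that writes get cheaper (less contention per entity group) while reads get more expensive, since they have to add up all the shards.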

Sander Rijken
Thanks for the quick response. I understand the idea that C "belongs" to B in the first case but not in the second (like a tree). But what does "belongs" mean in the case of entity groups? What use do I get from the fact that they are connected to each other? If the whole idea of entity groups is to allow transactions between entities, and in both cases I mentioned a transaction is allowed across A, B and C, what do I get from the fact that B and C are connected?
Joel
Same thing as James' answer actually :-). The simplest example is a collection of files as an archive, kinda like a zip. You could have Archive -> Files -> File Properties. Now each file can have properties (which can be accessed using ancestor). When you have Archive -> Files and Archive -> File Properties, there's no such thing.
Sander Rijken
Hey, thanks for the answer :) However, I still do not fully understand, since you can use ReferenceProperty for that purpose. Quote from the Google tutorial: "Only use entity groups when they are needed for transactions. For other relationships between entities, use ReferenceProperty properties and Key values, which can be used in queries." That's what bothers me :)
Joel
+2  A: 

When you're doing queries, you can use ancestor() to restrict the query to children of a particular entity - in your example, you could look for only descendants of B, which you couldn't do if they were all at the top level.

There's more on Ancestor Queries in Programming Google App Engine
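One way to picture the ancestor restriction is as a prefix match on key paths. The sketch below is a plain-Python model, not the App Engine API: the `store` dict and the `ancestor_query` helper are hypothetical, and keys are written as tuples of (kind, name) pairs from the root downwards.

```python
# Hypothetical in-memory model: each entity is stored under its key path,
# a tuple of (kind, name) pairs from the root downwards.

def ancestor_query(store, ancestor, kind=None):
    """Return sorted key paths that start with `ancestor`, optionally of one kind."""
    n = len(ancestor)
    return sorted(
        key for key in store
        if key[:n] == ancestor and (kind is None or key[-1][0] == kind)
    )

a = (("A", "a1"),)
b1 = a + (("B", "b1"),)
b2 = a + (("B", "b2"),)

store = {
    a: {}, b1: {}, b2: {},
    b1 + (("C", "c1"),): {},
    b1 + (("C", "c2"),): {},
    b2 + (("C", "c3"),): {},
}

# Only the Cs stored under b1, which needs the A -> B -> C layout.
under_b1 = ancestor_query(store, b1, kind="C")

# With a flat A -> C layout, the root is the only ancestor you can ask about.
under_a = ancestor_query(store, a, kind="C")
```

Here `under_b1` holds just c1 and c2, while `under_a` holds all three Cs; if every C were a direct child of A there would be no key-based way to ask for "the Cs of b1".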

The Keys and Entity Groups doc also says that:

Entity group relationships tell App Engine to store several entities in the same part of the distributed network ... All entities in a group are stored in the same datastore node

edit: The same document also lists some of the reasons why you don't want your entity groups to grow too large:

The more entity groups your application has—that is, the more root entities there are—the more efficiently the datastore can distribute the entity groups across datastore nodes. Better distribution improves the performance of creating and updating data. Also, multiple users attempting to update entities in the same entity group at the same time will cause some users to retry their transactions, possibly causing some to fail to commit changes. Do not put all of the application's entities under one root.

Any transaction on an entity in a group will cause other concurrent writes to the same entity group to fail. If you have a large entity group with lots of writes, this causes lots of contention, and your app then has to handle the expected write failures. Avoiding datastore contention goes into more detail on the strategies you can use to minimise the contention.
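As a rough illustration of why contention means retries, here is a toy model with one version counter per entity group. It is an assumption-laden sketch, not how the datastore is implemented: `EntityGroup`, `ContentionError` and `run_with_retries` are all hypothetical names.

```python
class ContentionError(Exception):
    """Raised when another write committed to the same entity group first."""

class EntityGroup:
    # Hypothetical model: a single version counter guards the whole group.
    def __init__(self):
        self.version = 0
        self.data = {}

    def commit(self, read_version, updates):
        """Apply updates only if nothing else committed since read_version."""
        if read_version != self.version:
            raise ContentionError("entity group changed since it was read")
        self.data.update(updates)
        self.version += 1

def run_with_retries(group, fn, retries=3):
    """Call fn(snapshot) -> updates; retry when the commit loses a race.

    Returns the number of retries that were needed.
    """
    for attempt in range(retries + 1):
        read_version = group.version
        updates = fn(dict(group.data))
        try:
            group.commit(read_version, updates)
            return attempt
        except ContentionError:
            continue
    raise ContentionError("gave up after %d attempts" % (retries + 1))

group = EntityGroup()
group.data["count"] = 0
calls = {"n": 0}

def bump(snapshot):
    calls["n"] += 1
    if calls["n"] == 1:
        # Simulate another user committing to the same group mid-transaction.
        group.commit(group.version, {"other": True})
    return {"count": snapshot["count"] + 1}

retries_needed = run_with_retries(group, bump)
```

The first attempt loses to the simulated competing write and has to be redone. The bigger and busier the entity group, the more often this happens, which is why everything under one root is a bad idea.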

James Polley
Ok, finally I got it! So basically the sentence from the tutorial, "Only use entity groups when they are needed for transactions", is not really correct, because they have other uses...
Joel
That's right, it's a bit misleading. I think the point is to really think before you use entity groups, rather than just treating them as a nice logical way to group data. There are some non-transactional benefits, such as performance. Brett Slatkin suggested something he called "Relation Index Entities" using entity groups here: http://code.google.com/events/io/2009/sessions/BuildingScalableComplexApps.html
Danny Tuppeny