I need to design a data model for an Amazon S3-like application. Let's simplify the problem into 3 key concepts - users, buckets and objects. There are many ways to design this model - I'll list two.
1. Three Kinds - User, Bucket and Object. Each Object has a Bucket as its parent. Each Bucket has a User as its parent. User is the root.
2. Dynamic Kinds - Users are stored in the User kind and buckets are stored in the Bucket kind - same as #1. However objects within a bucket are stored in a dynamic kind named "<BucketID>_Object". There is no parent / child relationship between bucket and object entities anymore. This relationship is established by the name of the object kind.
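To make the two layouts concrete, here is a minimal Python sketch of the key paths each model would produce. It uses plain tuples of (kind, id) pairs rather than any real datastore API, and the helper names and sample IDs are hypothetical, chosen only for illustration.

```python
# Toy illustration only: keys are modeled as tuples of (kind, id) pairs,
# not as real datastore keys. All function names and IDs are hypothetical.

def bucket_key(user_id, bucket_id):
    """Both models: a Bucket entity has a User as its parent."""
    return (("User", user_id), ("Bucket", bucket_id))

def model1_object_key(user_id, bucket_id, object_id):
    """Model #1: the Object sits under User -> Bucket in the ancestor path."""
    return bucket_key(user_id, bucket_id) + (("Object", object_id),)

def model2_object_kind(bucket_id):
    """Model #2: the kind name itself encodes which bucket the object is in."""
    return "%s_Object" % bucket_id

def model2_object_key(bucket_id, object_id):
    """Model #2: the object is a root entity in a per-bucket kind."""
    return ((model2_object_kind(bucket_id), object_id),)

k1 = model1_object_key("keyur", "photos", "a.jpg")
k2 = model2_object_key("photos", "a.jpg")
```

In this sketch the Model #1 key carries the full ancestor path, while the Model #2 key has a single element and the bucket can only be recovered from the kind name.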
#1 is of course the more intuitive and traditional model. Some may call #2 radical, while others may call it ridiculous.
Why am I thinking about #2? In my application, the properties defined on objects can vary from bucket to bucket. These properties are specified by the user at bucket creation time. Also, all properties on objects need to be queryable. A dynamic object kind per bucket allows me to support these requirements. Moreover, because my object kind is now a root kind, I no longer need to apply ancestor filters, which means I get an index on each object property for free. In Model #1 I am forced to apply ancestor filters, which, as I understand it, means that I need a custom index for every property I want to query against.
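As a rough illustration of what I mean by per-bucket properties and kind-scoped queries, here is a toy in-memory store in Python. It is not the datastore API and it does not model indexes at all; the class ToyStore, its methods, and the sample buckets and properties are all made up for this sketch.

```python
from collections import defaultdict

class ToyStore:
    """Hypothetical in-memory stand-in for a datastore, keyed by kind name."""

    def __init__(self):
        self._by_kind = defaultdict(list)

    def put(self, kind, entity_id, **props):
        # Each entity is a plain dict of its properties plus its id.
        self._by_kind[kind].append(dict(props, id=entity_id))

    def query(self, kind, **filters):
        # Equality filters only; scoping is by kind, with no ancestor filter.
        return [
            e for e in self._by_kind[kind]
            if all(e.get(name) == value for name, value in filters.items())
        ]

store = ToyStore()
# Objects in the "photos" bucket carry camera/iso properties...
store.put("photos_Object", "a.jpg", camera="X100", iso=200)
store.put("photos_Object", "b.jpg", camera="X100", iso=800)
# ...while objects in the "logs" bucket carry a different property set.
store.put("logs_Object", "1.log", level="error")

matches = store.query("photos_Object", camera="X100", iso=800)
```

Here each bucket's objects live in their own kind, so a query names the kind and the property filters and nothing else.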
I apologize for the convoluted explanation. I'll try to explain it better if anything is unclear.
My questions are - is #2 a totally outrageous model? With #2, the number of kinds can potentially run into the tens of thousands. Is that ok? I understand there's a limit on the number of custom indexes. However, I am not creating custom indexes on my dynamic kinds; I am relying only on the automatic indexes.
Thanks, Keyur