views:

79

answers:

1

I have a class:

class Transaction(db.Model):
    accountDebit = db.ReferenceProperty(reference_class=Account,
                                       collection_name="kontoDuguje")
    accountCredit = db.ReferenceProperty(reference_class=Account, 
                                       collection_name="kontoPotrazuje")
    amount = db.FloatProperty()
    Tran_date = db.DateProperty()
    comment = db.StringProperty()

here is the method of Account class by which I would like to get all the transactions for the particular account (transactions with accountDebit or accountCredit), but sorted by date:

def GetTransactions(self):
    transactions = []
    transactions_debit = db.GqlQuery('SELECT * FROM Transaction ' + 
                                    'WHERE accountDebit=:1',self)
    transactions_credit = db.GqlQuery('SELECT * FROM Transaction ' + 
                                    'WHERE accountCredit=:1',self)
    for x in transactions_debit:
        x.amount = -x.amount
        transactions.append(x)
    for x in transactions_credit:
        x.amount = x.amount
        transactions.append(x)
   return transactions

The aim is to make union with the sort of this two results, but with limit + offset. Consider the fact that you can not fetch more than 1000 rows in a single query ...

Please help

+2  A: 

You can do an OR (Python laboriously synthesizes it for you at application level), which takes care of the "union with sorty". However, if you need to worry about > 1000 transactions, that won't help (nor will offset and limit: the sum of offset + limit is what's limited to 1000!). You'll need to slice by something (presumably the same field you're sorting on, tran_date I imagine?) with a couple of < conditions there, and that of course can't guarantee you the exact limit and offset you desire, so you'll have to exceed them a bit and slice off the eccess at application level.

Edit: OR is not actually synthesized at application level (IN and != are the two operations that are), so you need to synthesize it yourself (also at application level of course), e.g.:

def GetTransactions(account):
    transactions = list(db.GqlQuery(
        'SELECT * FROM Transaction WHERE '
        'accountDebit = :1 ORDER BY Tran_date', account))
    transactions.extend(db.GqlQuery(
        'SELECT * FROM Transaction WHERE '
        'accountCredit = :1 ORDER BY Tran_date', account))
    transactions.sort(key=operator.attrgetter('Tran_date'))
    return transactions

But the big issues are still those outlined above.

So what are the numbers in play -- typical numbers of transactions for a user (say per week or per day), typical max total for a user, what order of magnitudes are you going to need in your offset and limit, etc, etc? Hard to suggest specific design choices without having any idea of the orders of magnitude of these numbers!-)

Edit: there is no solution that will be optimal, or even reasonable, for ANY order of magnitude of each of these parameters -- how you deal efficiently with many millions of transactions per user per day is just going to be deeply different from how you deal with a few transactions per user per day; I can't even imagine an architecture that would make sense in both cases (I might, perhaps, in a relational context, but not in a non-relational one such as we have here -- e.g., to decently deal with the case of millions of transactions per day, you really want a finer-grained timestamp on a transaction than just recording its date can provide!-).

Alex Martelli
I'm afraid that I haven't seen any example of 'OR which takes care of the "union with sorty"' Can you provide some?
manda
Indeed, in this moment, there are not many transacions. The app is still in the developement phase. However, I would like to make it work ok independently of number of transactions. I got an idea, but yet I haven't come to the realization:To make an extra class which would contain 3 properties: the reference to account, tran_date, and the reference to the transaction. Every time I put the transaction, I should also put 2 entities of this type. This way I would easily pick limit+offset records by account and get these transactions by reference.
manda
@manda, my bad --`IN` and `!=` are the two operation synthesized for you at app level, `OR` is not there (so you need to do it at app level yourself -- editing answer to show how).
Alex Martelli
OK, edited (and also pointed out how "independently of number of transactions" is not really an option;-). Your idea for denormalization does make sense, for example, but it will still break for over 1000 transactions per account per day (which IS part of what "any number" **means**!) -- so if you know it's less than 1000 transactions per day, express that limit clearly (so you or another maintainer know where to look when that breaks, for example); otherwise, use finer-grained timestamps, not just the date (and probably denormalize further recording transaction counts).
Alex Martelli