To start, I'm very new to Python, let alone Django and Piston.

Note: (I've updated this since the first two suggestions... you can view the old post in txt form here: http://bennyland.com/old-2554127.txt). The update I made was to better understand what was going wrong - and now I at least sort of know what's happening but I have no clue how to fix it.

Anyway, using Django and Piston, I've set up a new BaseHandler class named BaseApiHandler which does most of the work I was doing across all of my handlers. This worked great until I added the ability to limit the filters being applied to my results (for instance 'give me the first result only').

Examples (had to remove ":" because I can't submit more urls):

- http//localhost/api/hours_detail/empid/22 gives me all hours_detail rows from employee # 22
- http//localhost/api/hours_detail/empid/22/limit/first gives me the first hours_detail row from employee # 22

What's happening is that when I run /limit/first several times in succession, the first example is then broken, pretending it's a /limit/ url when it isn't.

Right now I'm storing whether or not it's a limit, and what the limit is, in a new class. Prior to this Stack Overflow edit, I was just using a list with two entries (limit = [] when initialized, limit = [0,1] when set). With the list version, once you spammed /limit/first, going to the first example would find 'limit' pre-set to [0,1], and the handler would then limit the query because of this. With the debug data I've added, I can say for certain that the list was pre-set and not getting set during the execution of the code.

I'm adding debug info into my response so I can see what's happening. Right now when you first ask for Example 1's url, you get this CORRECT statusmsg response:

"statusmsg": "2 hours_detail found with query: {'empid':'22','datestamp':'2009-03-02',}",

When you ask for Example 2's url, you get this CORRECT statusmsg response:

"statusmsg": "1 hours_detail found with query: {'empid':'22','datestamp':'2009-03-02','limit','first',with limit[0,1](limit,None... limit set 1 times),}",

However, if you refresh a bunch of times, the 'limit set' value starts increasing (incrementing this value was something a friend of mine suggested, to see whether this variable was somehow being kept around):

"statusmsg": "1 hours_detail found with query: {'empid':'22','datestamp':'2009-03-02','limit','first',with limit[0,1](limit,None... limit set 10 times),}",

Once that number goes above '1 times', you can start trying to get Example 1's url. Each time I now refresh Example 1, I get odd results. Here are 3 different status messages from different refreshes (notice that in each one, 'limit':'first' is CORRECTLY missing from the kwargs debug output, while the actual value of islimit hovers between 8 and 10):

"statusmsg": "1 hours_detail found with query: {'empid':'22','datestamp':'2009-03-02',with limit[0,1](limit,None... limit set 10 times),}",
"statusmsg": "1 hours_detail found with query: {'empid':'22','datestamp':'2009-03-02',with limit[0,1](limit,None... limit set 8 times),}",
"statusmsg": "1 hours_detail found with query: {'empid':'22','datestamp':'2009-03-02',with limit[0,1](limit,None... limit set 9 times),}",

So it would appear that this object is getting cached. Prior to changing 'limit' from a list to a class, it also appeared that the list version of 'limit' was getting cached, as after going to Example 2's url, I would sometimes have [0,1] as the limit.

Here are the updated snippets of the code (remember, you can view the first post here: bennyland.com/old-2554127.txt)

URLS.PY - inside 'urlpatterns = patterns('

    #hours_detail/id/{id}/empid/{empid}/projid/{projid}/datestamp/{datestamp}/daterange/{fromdate}to{todate}
    #empid is required
    url(r'^api/hours_detail/(?:' + \
        r'(?:[/]?id/(?P<id>\d+))?' + \
        r'(?:[/]?empid/(?P<empid>\d+))?' + \
        r'(?:[/]?projid/(?P<projid>\d+))?' + \
        r'(?:[/]?datestamp/(?P<datestamp>\d{4,}[-/\.]\d{2,}[-/\.]\d{2,}))?' + \
        r'(?:[/]?daterange/(?P<daterange>(?:\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})(?:to|/-)(?:\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})))?' + \
        r')+' + \
        r'(?:/limit/(?P<limit>(?:first|last)))?' + \
        r'(?:/(?P<exact>exact))?$', hours_detail_resource),

HANDLERS.PY

class ResponseLimit(object):
    def __init__(self):
        self._min = 0
        self._max = 0
        self._islimit = 0

    @property
    def min(self):
        if self.islimit == 0:
            raise LookupError("trying to access min when no limit has been set")
        return self._min

    @property
    def max(self):
        if self.islimit == 0:
            raise LookupError("trying to access max when no limit has been set")
        return self._max

    @property
    def islimit(self):
        return self._islimit

    def setlimit(self, min, max):
        self._min = min
        self._max = max
        # incrementing _islimit instead of using a bool so I can try and see why it's broken
        self._islimit += 1

class BaseApiHandler(BaseHandler):
    limit = ResponseLimit()
    def __init__(self):
        self._post_name = 'base'

    @property
    def post_name(self):
        return self._post_name

    @post_name.setter
    def post_name(self, value):
        self._post_name = value

    def process_kwarg_read(self, key, value, d_post, b_exact):
        """
        this should be overridden in the derived classes to process kwargs
        """
        pass

    # override 'read' so we can better handle our api's searching capabilities
    def read(self, request, *args, **kwargs):
        d_post = {'status':0,'statusmsg':'Nothing Happened'}
        try:
            # setup the named response object
            # select all employees then filter - querysets are lazy in django
            # the actual query is only done once data is needed, so this may
            # seem like some memory hog slow beast, but it's actually not.
            d_post[self.post_name] = self.queryset(request)
            s_query = ''

            b_exact = False
            if 'exact' in kwargs and kwargs['exact'] <> None:
                b_exact = True
                s_query = '\'exact\':True,'

            for key,value in kwargs.iteritems():
                # the regex url possibilities will push None into the kwargs dictionary
                # if not specified, so just continue looping through if that's the case
                if value is None or key == 'exact':
                    continue

                # write to the s_query string so we have a nice error message
                s_query = '%s\'%s\':\'%s\',' % (s_query, key, value)

                # now process this key/value kwarg
                self.process_kwarg_read(key=key, value=value, d_post=d_post, b_exact=b_exact)

            # end of the kwargs for loop
            else:
                if self.limit.islimit > 0:
                    s_query = '%swith limit[%s,%s](limit,%s... limit set %s times),' % (s_query, self.limit.min, self.limit.max, kwargs['limit'],self.limit.islimit)
                    d_post[self.post_name] = d_post[self.post_name][self.limit.min:self.limit.max]
                if d_post[self.post_name].count() == 0:
                    d_post['status'] = 0
                    d_post['statusmsg'] = '%s not found with query: {%s}' % (self.post_name, s_query)
                else:
                    d_post['status'] = 1
                    d_post['statusmsg'] = '%s %s found with query: {%s}' % (d_post[self.post_name].count(), self.post_name, s_query)
        except:
            e = sys.exc_info()[1]
            d_post['status'] = 0
            d_post['statusmsg'] = 'error: %s %s' % (e, traceback.format_exc())
            d_post[self.post_name] = []

        return d_post


class HoursDetailHandler(BaseApiHandler):
    #allowed_methods = ('GET', 'PUT', 'POST', 'DELETE',)
    model = HoursDetail
    exclude = ()

    def __init__(self):
        BaseApiHandler.__init__(self)
        self._post_name = 'hours_detail'

    def process_kwarg_read(self, key, value, d_post, b_exact):
        # each query is handled slightly differently... when keys are added
        # handle them in here.  python doesn't have switch statements, this
        # could theoretically be performed using a dictionary with lambda
        # expressions, however I was afraid it would mess with the way the
        # filters on the queryset work so I went for the less exciting
        # if/elif block instead

        # querying on a specific row
        if key == 'id':
            d_post[self.post_name] = d_post[self.post_name].filter(pk=value)

        # filter based on employee id - this is guaranteed to happen once
        # per query (see read(...))
        elif key == 'empid':
            d_post[self.post_name] = d_post[self.post_name].filter(emp__id__exact=value)

        # look for a specific project by id
        elif key == 'projid':
            d_post[self.post_name] = d_post[self.post_name].filter(proj__id__exact=value)

        elif key == 'datestamp' or key == 'daterange':
            d_from = None
            d_to = None
            # first, regex out the times in the case of range vs stamp
            if key == 'daterange':
                m = re.match('(?P<daterangefrom>\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})(?:to|/-)(?P<daterangeto>\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})', \
                             value)
                d_from = datetime.strptime(m.group('daterangefrom'), '%Y-%m-%d')
                d_to = datetime.strptime(m.group('daterangeto'), '%Y-%m-%d')
            else:
                d_from = datetime.strptime(value, '%Y-%m-%d')
                d_to = datetime.strptime(value, '%Y-%m-%d')

            # now min/max to get midnight on day1 through just before midnight on day2
            # note: this is a hack because as of the writing of this app,
            # __date doesn't yet exist as a queryable field thus any
            # timestamps not at midnight were incorrectly left out
            d_from = datetime.combine(d_from, time.min)
            d_to = datetime.combine(d_to, time.max)

            d_post[self.post_name] = d_post[self.post_name].filter(clock_time__gte=d_from)
            d_post[self.post_name] = d_post[self.post_name].filter(clock_time__lte=d_to)

        elif key == 'limit':
            order_by = 'clock_time'
            if value == 'last':
                order_by = '-clock_time'
            d_post[self.post_name] = d_post[self.post_name].order_by(order_by)
            self.limit.setlimit(0, 1)

        else:
            raise NameError

    def read(self, request, *args, **kwargs):
        # empid is required, so make sure it exists before running BaseApiHandler's read method
        if not('empid' in kwargs and kwargs['empid'] <> None and kwargs['empid'] >= 0):
            return {'status':0,'statusmsg':'empid cannot be empty'}
        else:
            return BaseApiHandler.read(self, request, *args, **kwargs)
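
The behavior described above is consistent with how Python treats class attributes: `limit = ResponseLimit()` sits in the class body of BaseApiHandler, so it is evaluated once, when the class is defined, and every handler instance sees the same object. A minimal sketch of that effect follows; `Counter`, `SharedHandler`, and `PerInstanceHandler` are hypothetical stand-ins, not the real Piston classes:

```python
class Counter(object):
    """Hypothetical stand-in for ResponseLimit: counts how often it was set."""
    def __init__(self):
        self.times_set = 0

    def set(self):
        self.times_set += 1


class SharedHandler(object):
    # Class attribute: created once and shared by all instances.
    limit = Counter()


class PerInstanceHandler(object):
    def __init__(self):
        # Instance attribute: a fresh Counter for each handler object.
        self.limit = Counter()


a, b = SharedHandler(), SharedHandler()
a.limit.set()
shared_seen_by_b = b.limit.times_set    # b sees the change made through a

c, d = PerInstanceHandler(), PerInstanceHandler()
c.limit.set()
separate_seen_by_d = d.limit.times_set  # d is unaffected by c
```

Note that a per-instance attribute only helps if a new handler object is created for each request; if one handler object serves many requests, instance state is shared between them as well.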
A: 

I would say there is a basic flaw in your code if has_limit() can return True when limit is a list of length 2, because this line will fail if limit has fewer than 3 elements:

s_query = '%swith limit[%s,%s](limit,%s > traceback:%s),' % (
    s_query, self.limit[0], self.limit[1], kwargs['limit'],
    self.limit[2])

Why are you initializing self.limit to an invalid length list? You could also make this code a little more defensive:

if self.has_limit():
    # take the first two entries; % needs a tuple, not a list, for two values
    s_query += 'with limit[%s,%s]' % tuple(self.limit[0:2])
    if 'limit' in kwargs and len(self.limit) > 2:
        s_query += '(limit,%s > traceback:%s),' % (
            kwargs['limit'], self.limit[2])
Paul McGuire
Sure, I've already changed this line, but it still doesn't solve the original problem. That code was only put in there to try to work out why limit is getting set to anything in the first place.
Anverc
A: 

I think you may be creating an alias to your internal limit list, via the get_limit property accessor. Try removing (or at least adding a print statement) inside this accessor. If you have code externally that binds a local list to get_limit, then it can update the contents using append, del, or assignment to slices, such as [:]. Or try this:

def get_limit(self): 
    return self._limit[:]

Instead of binding your internal list to an external name, it will make a copy of your internal list.
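
To illustrate the difference between handing out the internal list and handing out a copy, here is a small sketch. `LimitHolder` and its two accessor names are hypothetical; only the `self._limit[:]` idiom comes from the suggestion above:

```python
class LimitHolder(object):
    """Hypothetical class holding an internal list, as in the accessor above."""
    def __init__(self):
        self._limit = []

    def get_limit_alias(self):
        # Returns the internal list itself; callers can mutate it in place.
        return self._limit

    def get_limit_copy(self):
        # Returns a shallow copy; changes to it do not reach the internal list.
        return self._limit[:]


holder = LimitHolder()

copy_of_limit = holder.get_limit_copy()
copy_of_limit.extend([0, 1])
after_copy = list(holder._limit)    # internal list is still empty

alias = holder.get_limit_alias()
alias.extend([0, 1])
after_alias = list(holder._limit)   # internal list was changed through the alias
```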

Paul McGuire
I'll try this tonight. Remember, this was broken before I added the getter and setter - I added them later to try and debug it more (which I later found out didn't really work well). See the code prior to the "edit: I added more debug info to try and figure out what's happening" line, where I added additional information.
Anverc
Paul, thanks for your help so far... I've updated my code quite a bit to get better info from it, and it would seem that some things aren't getting initialized, or are otherwise getting cached. I updated the entire post, if you wouldn't mind sharing your thoughts.
Anverc
Assuming it was a cache issue, I've added the following at the start of BaseApiHandler.read:

    def read(self, request, *args, **kwargs):
        # initialize stuff:
        self.limit = ResponseLimit()

It now works - what bothers me about this is that I know the caching was happening... so does this mean that if two requests happen at the same time (one with /limit and one without) that the data in one of them could end up being incorrect?
Anverc
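
On the concurrency question in the last comment: if the same handler object really is shared between requests, as the caching behavior suggests, then resetting `self.limit` at the top of `read` still leaves a window in which two overlapping requests write to the same attribute. One way to sidestep that is to keep the limit in a local variable and pass it around instead of storing it on the handler. The sketch below assumes that approach; `SketchHandler` is hypothetical, and plain lists stand in for Django querysets:

```python
class SketchHandler(object):
    """Hypothetical handler that stores no per-request state on self."""

    def process_kwarg_read(self, key, value, rows):
        # Returns (rows, limit); limit is None or a (start, stop) tuple.
        if key == 'limit':
            rows = sorted(rows, reverse=(value == 'last'))
            return rows, (0, 1)
        return rows, None

    def read(self, rows, **kwargs):
        limit = None  # local to this call, so concurrent calls cannot share it
        for key, value in kwargs.items():
            if value is None:
                continue  # unmatched URL groups arrive as None
            rows, new_limit = self.process_kwarg_read(key, value, rows)
            if new_limit is not None:
                limit = new_limit
        if limit is not None:
            rows = rows[limit[0]:limit[1]]
        return rows


handler = SketchHandler()  # a single shared instance, reused for every call
first_only = handler.read([3, 1, 2], limit='first')
everything = handler.read([3, 1, 2], limit=None)
```

Because nothing is written to `self`, a call with a limit cannot leak into a later call without one, regardless of how many times the limited URL was requested before.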