I have a catalog of items like so:
class Items(models.Model):
name=models.CharField()
type=models.ForeignKey(ItemType)
quantity=models.IntegerField()
price=models.DecimalField() # Simplified
[... other fields]
and it has some attributes governed by:
class ItemAttributes(models.Model):
name=models.CharField()
type=modles.ForeignKey(ItemType)
class ItemAttributeRecords(models.Model):
attribute=ForeignKey(ItemAttributes)
item=ForeignKey(Items)
value=CharField() #Actual model stores ints/bools etc. simplified for SO
Different types of items have a different set of ItemAttributes eg. a Tshirt will have a size, design and colour but a Towel may only have a size and design.
I'd like to be able to generate a report by specifying specific attributes that I wish to summarize over and/or see details of and generate summary information such as total quantity, average price that would allow me to generate queries at a higher level view.
Let me illustrate with some examples using the clothing illustration above.
On a predefined queryset, I'd like to know what quantity of different coloured items were sold on a per attribute basis, in this case I would specify the order and attribute, [Type, Attr1, Attr2]
and details on Attr3
. I might also choose to ignore another attribute, say Attr4(Long/Short sleeve)
, ie. both long and short seleve types get aggregated. That probably a poor description but the example output is probably a better way to understand.
Lets define that as:
summarize_over= ['Type', 'Attr1', 'Attr2']
detail_on = ['Attr3']
output_values = ['quantity']
ignore = ['Attr4']
Type Attr1(Design) Attr2(Size) Attr3(Colour) Quantity
Tshirt Design A M Blue 10 (3 long/7short sleeve, aggregated)
" " " Brown 20 [....etc....]
" " " Yellow 30
Total 60
Tshirt Design A S Blue 20
" " " Green 20
Total 40
Tshirt Design B M Blue 30
" " " White 30
Total 60
Attributes such as 'name' (and some others not shown) are ignored in this summary.
A dictionary output would be fine eg. {'type': 'TShirt', 'Attr1': 'Design A', 'Attr2': 'M', 'Quantity':60, 'colour': {'blue':10,'brown':20','yellow':30}}
I'm quite new to Python so I'm not really familiar with the idioms, tricks, various methods and such that might make solving this quite easy.
I think this could either be done by 1)raw SQL (but I'm not too familiar with multiple GROUP_BY statements and rolling up the ignored columns), or 2)using the values() method and then looping through them looking for similar attributes. It also occurred to me it might be possible to use sets to solve this but I dont know how to approach it from that angle or if it is in fact better.
So far the pseudocode have come up with so far is along the lines:
for each item in list given by queryset
remove all unwanted attributes and rollup output_values, add to new list
for each attribute in ordered attribute list
group all items in new list that have the same attr into their own dict and add to a list
<repeat nesting the list of dicts>
for each dict in list of dict
flatten each dict and generate output_values
Thoughts? Suggestions? Code snipets? Better approaches?