views:

158

answers:

2

I was wondering if there are any ways to simplify the following piece of Code. As you can see, there are numerous dicts being used as well as condition statements to weed out bad input data. Note that the trip rate values are not all inputed yet, the dicts are just copied and pasted for now

EDIT

In any of the rates, (x,y):z . x and y are correct, the z values are not as they're just copy/pasted

this code works in case you want to copy, paste, and test it

import math


# step 1.4 return trip rates
def trip_rates( population_stratification, analysis_type, low_income, medium_income, high_income ):
  ''' this function returns the proper trip rate tuple to be used based on input 
    data 
    ADPT = Average Daily Person Trips per Household
    pph = person per household
    veh_hh = vehicles per household
    (param_1, param_2): ADPT
  '''
  li = low_income
  mi = medium_income
  hi = high_income
  # table 5 -
  if analysis_type == 1:
    if population_stratification == 1:
      rates = {( li, 1 ):3.6, ( li, 2 ):6.5, ( li, 3 ):9.1, ( li, 4 ):11.5, ( li, 5 ): 13.8,
               ( mi, 1 ):3.9, ( mi, 2 ):7.3, ( mi, 3 ):10.0, ( mi, 4 ):13.1, ( mi, 5 ): 15.9,
               ( hi, 1 ):4.5, ( mi, 2 ):9.2, ( mi, 3 ):12.2, ( mi, 4 ):14.8, ( mi, 5 ): 18.2}
      return rates
    if population_stratification == 2:
      rates = {
               ( li, 1 ):3.1, ( li, 2 ):6.3, ( li, 3 ):9.4, ( li, 4 ):12.5, ( li, 5 ): 14.7,
               ( mi, 1 ):4.8, ( mi, 2 ):7.2, ( mi, 3 ):10.1, ( mi, 4 ):13.3, ( mi, 5 ): 15.5,
               ( hi, 1 ):4.9, ( mi, 2 ):7.7, ( mi, 3 ):12.5, ( mi, 4 ):13.8, ( mi, 5 ): 16.7
              }
      return rates
    if population_stratification == 3: #TODO: input actual rate
      rates = {
               ( li, 1 ):3.6, ( li, 2 ):6.5, ( li, 3 ):9.1, ( li, 4 ):11.5, ( li, 5 ): 13.8,
               ( mi, 1 ):3.9, ( mi, 2 ):7.3, ( mi, 3 ):10.0, ( mi, 4 ):13.1, ( mi, 5 ): 15.9,
               ( hi, 1 ):4.5, ( mi, 2 ):9.2, ( mi, 3 ):12.2, ( mi, 4 ):14.8, ( mi, 5 ): 18.2
              }
      return rates
    if population_stratification == 4: #TODO: input actual rate
      rates = {
               ( li, 1 ):3.1, ( li, 2 ):6.3, ( li, 3 ):9.4, ( li, 4 ):12.5, ( li, 5 ): 14.7,
               ( mi, 1 ):4.8, ( mi, 2 ):7.2, ( mi, 3 ):10.1, ( mi, 4 ):13.3, ( mi, 5 ): 15.5,
               ( hi, 1 ):4.9, ( mi, 2 ):7.7, ( mi, 3 ):12.5, ( mi, 4 ):13.8, ( mi, 5 ): 16.7
              }
      return rates
  #table 6
  elif analysis_type == 2:
    if population_stratification == 1: #TODO: Change rates
      rates = {
               ( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
               ( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
               ( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
               ( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
              }
      return rates
    if population_stratification == 2: #TODO: Change rates
      rates = {
               ( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
               ( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
               ( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
               ( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
              }
      return rates
    if population_stratification == 3: #TODO: Change rates
      rates = {
               ( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
               ( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
               ( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
               ( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
              }
      return rates
    if population_stratification == 4: #TODO: Change rates
      rates = {
               ( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
               ( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
               ( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
               ( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
              }
      return rates
  # table 7
  elif analysis_type == 3:
    if population_stratification == 1: #TODO: input actual rate
      rates = {
               ( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
               ( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
               ( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
              }
      return rates
    if population_stratification == 2: #TODO: input actual rate
      rates = {
               ( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
               ( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
               ( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
              }
      return rates
    if population_stratification == 3: #TODO: input actual rate
      rates = {
               ( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
               ( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
               ( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
              }
      return rates
    if population_stratification == 4: #TODO: input actual rate
      rates = {
               ( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
               ( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
               ( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
              }
      return rates

def interpolate( population_stratification, analysis_type, low_income, medium_income, high_income, x, y ):
  #get rates dict
  rates = trip_rates( population_stratification, analysis_type, low_income, medium_income, high_income )


  # dealing with x parameters
  #when using income levels, x_1 and x_2 are li, mi, or hi
  if analysis_type == 1 or analysis_type == 2 or analsis_type == 4:
    if x < high_income and x >= medium_income:
      x_1 = medium_income
      x_2 = high_income
    elif x < medium_income:
      x_1 = low_income
      x_2 = medium_income
    else:
      x_1 = high_income
      x_2 = high_income
  if analysis_type == 3:
    if x >= 3:
      x_1 = 3
      x_2 = 3
    else:
      x_1 = int( math.floor( x ) )
      x_2 = int( math.ceil( x ) )

  # dealing with y parametrs
  #when using persons per household, max number y = 5
  if analysis_type == 1 or analysis_type == 4:
    if y >= 5:
      y_1 = 5
      y_2 = 5
    else:
      y_1 = int( math.floor( y ) )
      y_2 = int( math.ceil( y ) )
  elif analysis_type == 2 or analysis_type == 3:
    if y >= 5:
      y_1 = 5
      y_2 = 5
    else:
      y_1 = int( math.floor( y ) )
      y_2 = int( math.ceil( y ) )

  # denominator
  z = ( ( x_2 - x_1 ) * ( y_2 - y_1 ) )

  result = ( ( ( rates[( x_1, y_1 )] ) * ( ( x_2 - x ) * ( y_2 - y ) ) / ( z ) ) +
             ( ( rates[( x_2, y_1 )] ) * ( ( x - x_1 ) * ( y_2 - y ) ) / ( z ) ) +
             ( ( rates[( x_1, y_2 )] ) * ( ( x_2 - x ) * ( y - y_1 ) ) / ( z ) ) +
             ( ( rates[( x_2, y_2 )] ) * ( ( x - x_1 ) * ( y - y_1 ) ) / ( z ) ) )

  return result

#test
low_income = 20000 #this is calculated using exchange rates
medium_income = 40000 # this is calculated using exchange rates
high_income = 60000 # this is calculated using exchange rates
population_stratification = 1 #inputed by user
analysis_type = 1 #inputed by user
x = 35234.34 #test income
y = 3.5 # test pph

print interpolate( population_stratification, analysis_type, low_income, medium_income, high_income, x, y )
+5  A: 

Well, where to start? Here is just a first observation:

You have a lot of data there, and it seems code and data are mixed into each other.

Data and Code should be separate. Data is an external source, something you modify or read in. You could probably adapt your code to quickly parse Data from a good editable representation to a reprensentation useful for your algorithms. I suspect your code will be shorter, clearer, and less error prone (did you notice all of the 'rates' dictionaries have multiple keys, and you miss a lot of 'hi' keys?).

If you need better abstractions such as matrices and arrays of data, look into numpy


Edit 1

Did you count your number of dimensions? You have a many-dimensional matrix here with X dimensions: analysis_type, population_stratification, income_level, index

If I see right this is a 3x4x3x3 (= 108 entries) "matrix" or "lookup table". If this is the data your model builds on, fine. But can't you put those numbers in a file, or table that you read in? Your code would be next to trivial.


Edit 2

Ok, I'll bite for some minor python style: Testing for values in a Set or a Range.

Instead of:

if analysis_type == 1 or analysis_type == 2 or analsis_type == 4:

you can use

if analysis_type in (1, 2, 4):

or even using readable names as (CUBIC, ..) as suggested.

Instead of:

if x < high_income and x >= medium_income:

you can used chained conditions; Python is one of the few programming languages where conditions chain to make nautral if statements:

if medium_income <= x < high_income:


Edit 3

More imortant than small code figures is of course code design and refactoring. Edit 2 can only give you some polish.

You should learn to loathe duplicate code.

Also, you have quite a lot of branches in one function. That is a good sign you should break it up into multiple functions. It can also reduce duplication. For example, when one variable like analysis_type can totally change what the function does, why have two different behaviors in one function? You shouldn't have the whole program in one function. Perhaps analysis_type == 3 is better expressed in its own function (as an example)?

Do you understand that your fucntion trip_rates basically does an array lookup, where the array lookup is hardcoded as if ..: return .. if : return .., and the array is written out in full in the function? What if trip_rates could be implemented like this? Would it be possible?

data_model = compute_table(low_income, ...)
return data_model[analysis_type][population_stratification]
kaizer.se
i paired indexed to make interpolation easier.I'll fix the 'hi' issue, thanks for pointing it out
dassouki
I'm sure there is much to this code that I don't understand. For example, why you put the income levels in the dictionary keys. But I won't read much more of it, for the code is unreadable. Factor out the data from the code!
kaizer.se
well low/medium/high/income levels are generated based on user input, such as currency, exchange rate, and other factors.
dassouki
concerning the edit, i was thining of putting all of the data in a matrix, but wasn't sure how to interpolate between the values, and how to find the data for proper analysis type and population
dassouki
and thanks for the advice, each rates dict represents a different subtable with different x and y axes and units
dassouki
it would be possible to implement all the changes you have recommended. it might take some time and learning, but thank you for the detailed response
dassouki
+2  A: 

As well with kaizer's suggestion about data and code, here are some simple cleanups:

The code

if y >= 5:
      y_1 = 5
      y_2 = 5
    else:
      y_1 = int( math.floor( y ) )
      y_2 = int( math.ceil( y ) )

can be written as

min(5, int(math.floor(y))

or

int(math.floor(min(5, y))

or even made a function:

def limitedInt(v, maxV):
   return min(5, int(math.floor(y))

Also I would recommend that instead of saying analysis_type == 1 you say something like analysis_type = CUBIC (i.e., an name that describes the analysis type) and set the name to 1. This will not simplify so much as make the code easier to read.

You might find the book Refactoring by Martin Fowler or Refactoring Workbook by William Wake as a way to learn about cleaning up code (the website is also available, but without knowing about "code smells" described in the books, it is not as helpful.

Kathy Van Stone
thank you, i'll look into purchasing the book.
dassouki
and thanks for the snippets
dassouki
The `max` and `min` functions are in fact builtin, so call them `min(..)`
kaizer.se
Fixed (another language crept in there for a minute...)
Kathy Van Stone