I have a list of columns that I will be analyzing. Instead of referring back to the specific index such as data[1][2], I'd like to assign a name to a column and then loop through the rows of the columns for different tasks. How do I assign a name to a column and then is this the correct format to refer back to it?

for x in range (len(data)):  
    if [column_name][x] ....
A: 

The snippet in the question looks like Python.

In Python, you can usually avoid indexing lists, arrays, and other containers by numeric subscript.

Built-in and custom container classes support iteration, so you can loop over the items directly with syntax like:

data = [[1,2], [3,4], [5,6], [7,8]]
for x in data[2]:
   print(x)

The output is:
5
6
mjv
A: 

I believe what you want to do is store a column in a variable and then reference it, instead of always using the subscript. Simple enough:

varName=data[columnName]

For subsequent accesses to that column, varName[rowName] should do the trick.
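
As a minimal sketch of this idea, assuming data is a list of rows (the question does not say so explicitly) and using made-up values, a column can be pulled out once and given a name. The names AGE and ages are hypothetical, chosen only for illustration:

```python
# Hypothetical sample data: each inner list is one row.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

AGE = 1  # hypothetical name for column index 1

# Build the named column once, then index it by row number.
ages = [row[AGE] for row in data]

for x in range(len(ages)):
    if ages[x] > 4:
        print(x, ages[x])
```

Note that ages is a new list, so assigning to ages[x] would not change data.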

Aviral Dasgupta
+2  A: 

The easiest way to use names instead of integers to access your data is to use a dictionary:

data = {'pig':[1, 2], 'cow':[3, 4], 'dog':[5,6]}

for i in range(2):
    if data['dog'][i] == 4: ...

There are other ways as well. For example, you could make a class and override __getitem__, or you could just assign variable names to column numbers in a 2D array, like dog = 2. It all depends on exactly what you want to do.
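
A rough sketch of the class approach might look like the following. The Table class, its attributes, and the sample rows are all hypothetical; this is one possible shape, not the only way to write it:

```python
class Table:
    """Hypothetical wrapper that maps column names to positions in each row."""

    def __init__(self, names, rows):
        # Remember which position each column name refers to.
        self.index = {name: i for i, name in enumerate(names)}
        self.rows = rows

    def __getitem__(self, name):
        # Return the named column as a list, one value per row.
        col = self.index[name]
        return [row[col] for row in self.rows]


table = Table(["pig", "cow", "dog"], [[1, 3, 5], [2, 4, 6]])
print(table["dog"])     # [5, 6]
print(table["dog"][1])  # 6
```

Here table["dog"] builds a fresh list on each access, so it is suited to reading values rather than updating them.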

tom10
+3  A: 

There are a bunch of different ways of doing this. If you know the association between names and columns at the time you write your code, the easiest way by far is this:

for row in data:
   (foo, bar, baz, bat) = row

...assuming that you don't need to update row (and data).

If you need to update the row, copying the values to variables won't work. You need to manipulate the items via their indexes. In that case, aviraldg's approach is simplest:

(foo, bar, baz, bat) = range(4)
for row in data:
   row[foo] = row[bar]
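
To make the difference concrete, here is a small sketch with made-up sample rows. It assumes each row is a mutable list of four items; the upper-case names and the unchanged variable are hypothetical:

```python
data = [[1, 2, 3, 4], [5, 6, 7, 8]]  # made-up sample rows

# Unpacking binds new names; rebinding foo leaves the row untouched.
for row in data:
    foo, bar, baz, bat = row
    foo = bar
unchanged = [list(row) for row in data]  # snapshot after the first loop

# Index constants write through to the row itself.
FOO, BAR, BAZ, BAT = range(4)
for row in data:
    row[FOO] = row[BAR]

print(unchanged)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
print(data)       # [[2, 2, 3, 4], [6, 6, 7, 8]]
```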

If you want to reference the columns by strings that contain their names, you'll need to copy each row to a dictionary whose keys are the column names. Then, when you're done manipulating the dictionary, you need to update the original list, or replace it with a new one (which is what this code does):

columns = ("foo", "bar", "baz", "bat")
for i in range(len(data)):
   d = dict(zip(columns, data[i]))
   d["foo"] = d["bar"]
   data[i] = [d[col] for col in columns]
Robert Rossney