For entering and storing the data, I would use a relational database (SQLite, MySQL, or PostgreSQL). If you do it this way, you can easily write multiple programs that analyze the data in different ways. An SQLite database, for instance, can be accessed from a variety of programming languages and from GUI or CLI tools. Your data would remain language-agnostic (unlike data stored as numpy arrays).
Python has built-in support for SQLite through the sqlite3 module in the standard library.
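For instance, here is a minimal sketch of creating a table and entering a score with sqlite3; the schema (class, student, assignment, score) and the values are purely illustrative, not prescriptive:

import sqlite3

conn = sqlite3.connect('grades.db')  # creates the file if it doesn't exist
conn.execute("""
CREATE TABLE IF NOT EXISTS grades (
    class      TEXT,
    student    TEXT,
    assignment TEXT,
    score      REAL
)
""")
# sqlite3 uses ? as its parameter placeholder
conn.execute(
    "INSERT INTO grades (class, student, assignment, score) VALUES (?, ?, ?, ?)",
    ('class1', '123456', 'assign1', 88.5))
conn.commit()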
SQL provides a convenient, readable language for slicing and dicing your data (e.g. "What are all the scores for assignment1 from class1? Give a list of the 10 highest scores. Who had those scores? Did class1 have a higher average than class2?"). The database tables would accommodate multiple classes and multiple semesters easily.
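For a flavor of what those queries might look like, here are some sketches, again assuming the illustrative schema above (adapt the column names to whatever you actually use):

-- all the scores for assignment1 from class1
SELECT student, score FROM grades
WHERE class = 'class1' AND assignment = 'assign1';

-- the 10 highest scores, and who had them
SELECT student, score FROM grades
ORDER BY score DESC LIMIT 10;

-- did class1 have a higher average than class2?
SELECT class, AVG(score) FROM grades
WHERE class IN ('class1', 'class2')
GROUP BY class;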
For entering data, a GUI may be the most convenient. For SQLite there is sqlitebrowser (I don't have a lot of experience with it, though; there might be even better options). For MySQL I like phpMyAdmin, and for PostgreSQL, phpPgAdmin.
Once you have the data entered, you can use a Python module (e.g. sqlite3, MySQLdb, psycopg2) to access the database and issue SQL queries. The data can then be fed into a list or numpy array, and you can use numpy to compute statistics.
PS. For small datasets there is really no issue regarding speed or memory footprint. You do not have to store the data in a numpy array just to call numpy/scipy statistics functions.
You could, for example, draw the data out of the database and into a Python list, and feed the Python list to a numpy function:
import sqlite3
import numpy as np

cursor = sqlite3.connect('grades.db').cursor()

# sqlite3 uses ? as its placeholder (MySQLdb and psycopg2 use %s instead)
sql = 'SELECT score FROM grades WHERE assignment = ?'
cursor.execute(sql, ('assign1',))
data = cursor.fetchall()           # a list of 1-tuples, e.g. [(88.5,), (91.0,)]
scores = [row[0] for row in data]  # flatten to a plain list of scores
ave_score = np.mean(scores)
If grades is a numpy structured array, you'll never be able to access values this way:
grades['123456']['assign 2']
since columns (fields) are selected by name, while rows are selected by integer position, not by a key like a student ID.
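To make that concrete, here is a small sketch of how indexing a structured array actually works (the dtype and values are just for illustration):

import numpy as np

dt = [('student', 'U10'), ('assign1', 'f8'), ('assign2', 'f8')]
grades = np.array([('123456', 85.0, 92.0),
                   ('654321', 78.0, 88.0)], dtype=dt)

grades['assign2']     # a column, selected by field name
grades[0]             # a row, selected by integer position
grades[0]['assign2']  # a single value: row first, then field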
I don't think this poses much of an obstacle, however. Here's why: everything you want to do for one student (like finding the sum of all assignment points), you'll probably want to do for every student.
So the trick with numpy -- the way to leverage its power -- is to write vectorized equations or use numpy functions that apply to all rows simultaneously, instead of looping over rows individually. Instead of thinking on an individual scale (e.g. individual students, individual assignments), numpy encourages you to think on a grander scale (e.g. all students, all assignments) and to do calculations that apply to all of them at once.
As you've seen in your wrangling with views, you are actually better off not using a structured array, instead opting for a plain 2-axis numpy array. Let's imagine the columns represent (2) assignments and the rows represent (4) students:
In [36]: grades=np.random.random((4,2))
In [37]: grades
Out[37]:
array([[ 0.42951657, 0.81696305],
[ 0.2298493 , 0.05389136],
[ 0.12036423, 0.78142328],
[ 0.5029192 , 0.75186565]])
Here are some statistics:
In [38]: sum_of_all_assignments = grades.sum(axis=1)
In [39]: sum_of_all_assignments
Out[39]: array([ 1.24647962, 0.28374066, 0.90178752, 1.25478485])
In [40]: average_of_all_assignments = grades.mean(axis=1)
In [41]: average_of_all_assignments
Out[41]: array([ 0.62323981, 0.14187033, 0.45089376, 0.62739242])
In [42]: average_assignment_score = grades.mean(axis=0)
In [43]: average_assignment_score
Out[43]: array([ 0.32066233, 0.60103583])
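The same all-at-once style extends beyond sums and means. For example, a weighted final score for every student is a single matrix-vector product (the weights here are made up for illustration):

weights = np.array([0.4, 0.6])      # hypothetical per-assignment weights
final_scores = grades.dot(weights)  # one weighted score per student, no loop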
Now suppose these are the names of the students:
In [44]: student_names=['harold','harry','herb','humphrey']
To match student names with their average scores, you could create a dict:
In [45]: dict(zip(student_names,average_of_all_assignments))
Out[45]:
{'harold': 0.62323981076528523,
'harry': 0.14187032892653173,
'herb': 0.45089375919011698,
'humphrey': 0.62739242488169067}
And similarly, for assignments:
In [46]: assignment_names=['assign 1','assign 2']
In [47]: dict(zip(assignment_names,average_assignment_score))
Out[47]: {'assign 1': 0.32066232713749887, 'assign 2': 0.60103583474431344}
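And to answer ranking questions like the ones posed in SQL above (who had the highest scores?), numpy's argsort gives the ordering in one call:

order = np.argsort(average_of_all_assignments)[::-1]  # indices, best first
ranked = [(student_names[i], average_of_all_assignments[i]) for i in order]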