Hello,
My scenario is as follows: I have a table of data (handful of fields, less than a hundred rows) that I use extensively in my program. I also need this data to be persistent, so I save it as a CSV and load it on start-up. I choose not to use a database because every option (even SQLite) is an overkill for my humble requirement (also - I would like to be able to edit the values offline in a simple way, and nothing is simpler than notepad).
Assume my data looks as follows (in the file it's comma separated without titles, this is just an illustration):
Row | Name | Year | Priority
------------------------------------
1 | Cat | 1998 | 1
2 | Fish | 1998 | 2
3 | Dog | 1999 | 1
4 | Aardvark | 2000 | 1
5 | Wallaby | 2000 | 1
6 | Zebra | 2001 | 3
Notes:
- Row may be a "real" value written to the file or just an auto-generated value that represents the row number. Either way it exists in memory.
- Names are unique.
Things I do with the data:
- Look-up a row based on either ID (iteration) or name (direct access).
- Display the table in different orders based on multiple field: I need to sort it e.g. by Priority and then Year, or Year and then Priority, etc.
- I need to count instances based on sets of parameters, e.g. how many rows have their year between 1997 and 2002, or how many rows are in 1998 and priority > 2, etc.
I know this "cries" for SQL...
I'm trying to figure out what's the best choice for data structure. Following are several choices I see:
List of row lists:
a = []
a.append( [1, "Cat", 1998, 1] )
a.append( [2, "Fish", 1998, 2] )
a.append( [3, "Dog", 1999, 1] )
...
List of column lists (there will obviously be an API for add_row etc):
a = []
a.append( [1, 2, 3, 4, 5, 6] )
a.append( ["Cat", "Fish", "Dog", "Aardvark", "Wallaby", "Zebra"] )
a.append( [1998, 1998, 1999, 2000, 2000, 2001] )
a.append( [1, 2, 1, 1, 1, 3] )
Dictionary of columns lists (constants can be created to replace the string keys):
a = {}
a['ID'] = [1, 2, 3, 4, 5, 6]
a['Name'] = ["Cat", "Fish", "Dog", "Aardvark", "Wallaby", "Zebra"]
a['Year'] = [1998, 1998, 1999, 2000, 2000, 2001]
a['Priority'] = [1, 2, 1, 1, 1, 3]
Dictionary with keys being tuples of (Row, Field):
Create constants to avoid string searching
NAME=1
YEAR=2
PRIORITY=3
a={}
a[(1, NAME)] = "Cat"
a[(1, YEAR)] = 1998
a[(1, PRIORITY)] = 1
a[(2, NAME)] = "Fish"
a[(2, YEAR)] = 1998
a[(2, PRIORITY)] = 2
...
And I'm sure there are other ways... However each way has disadvantages when it comes to my requirements (complex ordering and counting).
What's the recommended approach?
EDIT:
To clarify, performance is not a major issue for me. Because the table is so small, I believe almost every operation will be in the range of milliseconds, which is not a concern for my application.