ansaurus

Question

Storing simulation results in a persistent manner for Python?

Answer 1

A:

Can't help you on the db stuff, as I'm a total newb and only know enough to query data from sqlite tables...

For persistence, though, could you use the pickle module to store your objects? Check the docs for the exact usage, but I think it's pretty much file(filename, 'wb').write(pickle.pickle(myobject)) to write it and myobject = pickle.unpickle(file(filename, 'rb')) to read.

Then you can read multiple tables/whatever into multiple variables and do whatever comparisons you want.

If you don't need/want to read it back in via Python, you could also just manually format it as tab delimited or something and load that file into the spreadsheet app of your choice (OpenOffice Calc is fantastic).

dash-tom-bang 2010-03-25 01:45:10

That isn't the correct API for pickle. (Also, pickle is very seldom the best serialization technique available for a given use case.)

Mike Graham 2010-03-25 02:36:01

Also, don't use `file` to open files—use `open`, which the documentation will confirm is the right way. (Additionally, always use a context manager, i.e. `with open(filename, mode) as f:`)

Mike Graham 2010-03-25 02:38:17

@Mike Graham: Is pickle worth using for this? How about `deepcopy`?

Az 2010-03-25 03:15:56

@Az, It looks like a database will both persist your data and be a good tool for accessing and manipulating it. I cannot think of a task for which I think pickle is really the right tool. `copy.deepcopy` does not aid in serialization and will not help with this problem at all.

Mike Graham 2010-03-25 03:28:40

@Mike Graham: Much appreciated! At least I know the direction I have to go down now and focus my energies trying to unravel `SQLAlchemy`.

Az 2010-03-25 03:41:40

Answer 2

+3 A:

A two-dimension display of data is a Relational database table with two keys.

In your example, the Student Key and the Session Key.

A "Composite key" is a piece of noise that you can ignore. It isn't helpful and isn't necessary. A composite key does not solve any problems well and create many difficulties. Pretend you never heard it.

What we do is introduce an additional column with a simple "identifier". It's an "autogenerated, surrogate key" for they row. Autogenerated unique keys for each row are a good thing. Composite keys are a bad thing.

S.Lott 2010-03-25 02:44:59

@S.Lott: Gotcha. So the Relational Database would be a good way of doing it? I've been told that if I wanted to check how many times a certain project appears for a certain student, I'd just have to run a 'query'? Any other advice/answers for the questions above? Ta in advance.

Az 2010-03-25 03:18:27

@Az: If you think "query" is a magical buzzword, you have much work to do. Find books. Read them.

S.Lott 2010-03-25 10:04:27

@S.Lott: Is there any particular literature you can recommend for the context of what I'm doing? I'm not looking for anything particularly advanced mainly because I'm on a time leash :S

Az 2010-03-25 20:35:55

@S.Lott: If I use an autogenerated key, I don't actually have to filter using that, right (it will number in the hundreds of thousands)?For example, I may want all the student-project pairs for a single session or all the student-project pairs for a particular student across multiple sessions, etc. This autogen key doesn't really affect it does it?Lastly, the `session_id` is linked with my Monte-Carlo simulation. So for each run across all the students, the `session_id` will remain static and increment for the next run. Does `session_id` have to be a class parameter then?

Az 2010-04-10 22:17:51

@Az: An autogenerated key is useless for filtering, since it's random, unique numbers. (Usually assigned sequentially, but there are no guarantees other than unique.) Session_id, like student_id and every other piece of data except the autogenerated key, simply follows whatever rules you define for the item and the class that maps the item to the database.

S.Lott 2010-04-11 02:55:49

@S.Lott: Right. So far, my student_id and allocated_project_id are mapped from the `Student` class. However, my session_id is to be linked with this Monte-Carlo class I'm creating. Is there a way to have this mapped to the table as well, without making it a part of the Student class?Also, in order to use an autogenerated key, do I just leave an `id` column and let it populate itself?

Az 2010-04-12 16:58:23

@Az: "session_id is to be linked with this Monte-Carlo class" Vague. "Link" means it links two things. Which two? It looks like it's just an attribute of each Project in the original question. In order to use an autogenerated key, you're going to have to break down and actually read the SQLAlchemy documentation; perhaps ask a distinct question.

S.Lott 2010-04-12 18:21:32

@S.Lott: Ah, basically what I meant was: the session_id comes into play for a Monte-Carlo simulation which basically allocates the students, in a randomised order (so everyone has a chance at getting their first choice), for a certain number of sessions (say, 5000.) As such, it would be a parameter for the Monte-Carlo class. Thus, for the first run, it's 1; for the second, 2; so on.Re: autogen keys: Checking up the SQLAlchemy documentation, it seems that I was right in that the 'id' column, if not mapped from a Class, will auto-increment itself as new records are added.

Az 2010-04-12 18:44:52

@Az: "a parameter for the Monte-Carlo class" does not sound like anything that belongs in a database. If it belongs in the database, it's an attribute of one thing or a link between two things.

S.Lott 2010-04-12 19:37:14

@S.Lott: Many apologies, used the wrong terminology. Basically, the session_id should be an attribute of the Monte-Carlo class since it is directly related to it (for *any* M-C simulation). I could make it an attribute of the Student class to make the mapping very simple. However, if someone were to take away the Student class and replace it, unless he/she had a session_id attribute in his/her new class, it wouldn't be part of the equation anymore. This is why I'd like to keep it in the M-C class but I'm not sure how I can map it to the database. I think that's what I've been meaning to say.

Az 2010-04-12 19:51:53

@Az: "if someone were to take away the Student class and replace it," That makes almost no sense at all. If they take away the Student class they totally break the model and create something entirely new. What **are** you talking about?

S.Lott 2010-04-12 23:28:17

@S.Lott: Let's say someone decided to replace the Student class with a similar class called Lecturer, and work out an algorithm that did something related to the Lecturer and Rooms. Also, let's assume that he/she makes sure the object mapping for the Lecturer worked. So, maybe some aspects have changed. There's a lecturer instead of a student, and maybe they're been assigned to Rooms at the beginning of an academic year.

Az 2010-04-13 03:48:50

`Continued` This is a different allocation algorithm, but my point is, the `session_id` should not care about that nor should it care about other columns. This `id` is something that belongs to a general Monte-Carlo simulation whose role is to run multiple, randomised sessions of a particular algorithm (whatever the algorithm is). This is why I don't want to keep the `session_id` as part of the Student or Lecturer classes. It is something that I need as part of a Monte-Carlo object. However that means my table has to be mapped from both the Student and the Monte-Carlo object. Is that possible?

Az 2010-04-13 03:53:34

@Az: "Let's say someone decided to replace the Student class with a similar class called Lecturer"... This is a *different* question. It's only marginally related to the current question, and certainly, not a topic to be covered in the clumsy, ineffective commenting system.

S.Lott 2010-04-13 09:58:54

@S.Lott: Okidokey. I'll post this as a separate question. Ta for the advice!

Az 2010-04-16 05:08:20

Answer 3

+1 A:

The logical structure you have can be considered a Ternary Relationship, where recommended to you table is corresponding to the Attendance relationship object. Therefore ideally you should also create you object model similar to this:

Ternary relationship .

Now, in your case one could argue why do you need more then 1 table, if you have only one field for each of the Entity tables. But I would still model it this way, as this model better represents the real world, and you still need to store somewhere the Project students prefer to work on, which would be another table with many-to-many relationship to Student table.

Working with entities is better and easier for you to understand sqlalchemy; whereas if you just keep one table, how much will you delve into the database really?

About composite keys: S.Lott gave you good reasons to avoid using them, and I fully agree with his take on the topic.

van 2010-03-25 22:05:26

Thanks for the diagram. I've updated the question to show what the `Student` class structure looks like.

Az 2010-03-25 22:44:59

ansaurus

tags:

views:

answers:

Storing simulation results in a persistent manner for Python?

related questions