I'm starting the development of a Python application that provides a GUI for editing various graphical and numerical entities (it's a configuration and management tool for mobile robots, if you must know). One of the questions is how to store the data for a project (meaning, for a robot). The data quantity will be low (about 10MB max) and quite heterogeneous (geometrical data of the robot, maps, missions, platform logs, recorded sensor data, project preferences, ...).

I don't want to develop my own storage layer. The project data should be stored in a single file, and easily accessible from Python. Storing updates should be cheap: I don't want to use an explicit "Save" operation, and changes should be stored as soon as they happen.

A single ZIP file is probably not practical, and would require writing a persistence layer on top to map the application objects to the storage. SQLite is an obvious candidate, possibly with SQLAlchemy as an object-relational layer. ZODB also looks interesting, but I have no experience with it so far.

Any recommendations?

EDIT: When I say an "application", I mean a program to be installed on a user's computer, not a web application.

EDIT: I will be opening data files created by other (not necessarily trusted) people, similar to what I would do with Word or PDF files. This must be a safe operation.

+4  A: 

shelve gives a mapping interface that allows you to store any pickleable type.
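
As a rough illustration of that mapping interface (a minimal sketch, not code from the original answer; the file name and the keys `robot_geometry` and `missions` are made up for the example):

```python
import os
import shelve
import tempfile

# Hypothetical project file location, placed in a temp directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "project_shelf")

with shelve.open(path) as shelf:
    # Any pickleable value can be stored under a string key.
    shelf["robot_geometry"] = {"wheel_radius_m": 0.1, "wheel_base_m": 0.45}
    shelf["missions"] = ["patrol", "dock"]

    # Without writeback=True, in-place mutation of a stored value is not
    # persisted, so read the value, change it, and assign it back.
    missions = shelf["missions"]
    missions.append("survey")
    shelf["missions"] = missions

# Reopen the shelf, as the application would on its next start.
with shelve.open(path) as shelf:
    stored_missions = shelf["missions"]
    stored_keys = sorted(shelf.keys())
```

Only the entries that are assigned get re-pickled, so a small change does not rewrite the whole project. The values are still pickles, though, so the trust concerns about pickle apply to a shelf as well.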

Ignacio Vazquez-Abrams
Touché; shelve is probably ideal.
Autopulated
IIUC, shelve uses pickle for serialization. How "stable" is the pickle format across OS / architectures? Will I be able to open a shelf stored on 32-bit Windows with 64-bit Windows? Cross-platform operation (Windows/OS X/Linux) is not required, but would be nice to have.
Remy Blank
If you want to mitigate cross-platform concerns then you could replace the appropriate pickling methods with jsonpickle instead. http://jsonpickle.github.com/
Ignacio Vazquez-Abrams
jsonpickle looks interesting, except for this part: "Loading a JSON string from an untrusted source represents a potential security vulnerability", and I remember reading the same about pickle. I'd prefer not having to worry about opening a project file from a (malicious) customer and have it wipe my hard disk :)
Remy Blank
That's nothing pickle itself doesn't suffer from.
Ignacio Vazquez-Abrams
@Remy Blank: The "security" hand-wringing is only if you're confronted with evil sociopaths who (a) know how to hack the pickle format and (b) feel it necessary to hack the pickle format instead of hacking your Python source. If they want to hack, they have your source. Don't worry needlessly. Use shelve with confidence.
S.Lott
You can look at http://pyyaml.org/wiki/PyYAMLDocumentation YAML
Ib33X
@S.Lott: Being able to hack the code is not the issue here, as I will install my own application instance from a trusted source (myself). The use case I want to avoid is getting a malicious project file from an "evil" customer (created with the application installed on his computer), and upon opening it on my computer, having all my files deleted. No need to hack the source (which they won't have anyway, BTW), if unpickling can trigger e.g. os.unlink(). We have had enough of that with Word files...
Remy Blank
@Remy Blank: Please **update** the question to include this use case. Without the details of this additional use case, no one has any idea what your **real** requirements are.
S.Lott
@S.Lott: I must be old-fashioned, but when I think of an "application", I see a program installed on a user's computer, not a web application. Word, not Google docs. But you are right, this was ambiguous.
Remy Blank
@S.Lott: But I fail to see how this affects the security issue: having a malicious data file trigger an evil payload is unacceptable in both cases, "traditional" application or web application.
Remy Blank
@Remy Blank: In a web application, no one hacks the database. You filter all inputs before putting them in the database. Since no one is allowed to tweak the database file, there's no security issue with a web application. A "desktop" application where the user can only corrupt their own data is none of your concern. The only thing that matters is **you accept data from malicious sociopaths** as part of your business model. Since you have that requirement -- which is very odd in my opinion -- you need to make that absolutely clear.
S.Lott
@S.Lott: So you never open a Microsoft Word or PDF file created by someone else?
Remy Blank
@Remy Blank: "Microsoft Word or PDF file created by someone else"? What? Are you talking about viruses? I use Mac OS X and Linux, so I'm aware that Windows has a problem with viruses, but I don't know precisely what you're talking about.
S.Lott
@S.Lott: Call them viruses if you like (and no, they don't seem to be limited to Windows), but the fact is that opening a Microsoft Word or PDF file (a file that is supposed to contain only data) can trigger a destructive action, and you need "extra protective measures" like macro virus protection to be able to open them safely. This will never happen with e.g. plain text files. I would like my data files to be as safe as plain text files, that is, not have to worry about opening any data file from any source. This is a pretty basic requirement, and pickling doesn't give me that.
Remy Blank
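
To make the concern in this thread concrete, here is a small and deliberately harmless sketch (not from any of the posters; the class name `Payload` is invented). A pickle can name any importable callable to be invoked at load time. The example uses `len`, but a hostile file could just as well name `os.unlink`:

```python
import json
import pickle


class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call len('hello')".
        # A malicious file could name a destructive callable here instead.
        return (len, ("hello",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it calls len("hello") and returns
# whatever that call returns.
unpickled = pickle.loads(blob)

# By contrast, json.loads only ever builds dicts, lists, strings, numbers,
# booleans and None; it never calls functions named in the data.
parsed = json.loads('{"wheel_radius_m": 0.1}')
```

This is why the `pickle` documentation warns against unpickling data from untrusted sources, and why a data-only format is closer to the "as safe as plain text" requirement.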
A: 

Pickle it!

Make up your data structures however you like in memory, then just zap them to disk with a pickle when you want to save.

Autopulated
I'd rather not pickle 10MB of data for every single change to the in-memory data.
Remy Blank
See shelve, @Ignacio, above ;) -- though, that said, do you really want to be trying to keep disk data current with in-memory data? Sounds like a job for a proper transactional database... much easier to just write once when your application closes?
Autopulated
@Autopulated: Yes, a transactional database is probably ideal, that's why I started looking at SQLite. Keeping disk data current with in-memory data is a design choice: I don't want the user to have to think about saving. He should be able to close the application at any time (or have it crash, for that matter) without losing any data and without delay.
Remy Blank
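
One way the "no explicit Save" idea could look with the standard library's `sqlite3` module (a sketch under assumptions: the `preferences` table, the `set_preference` helper and the file name are all hypothetical, not part of the question):

```python
import os
import sqlite3
import tempfile

# Hypothetical single-file project, placed in a temp directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "robot.project")


def set_preference(conn, key, value):
    # "with conn" runs the statement in a transaction: it commits on
    # success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO preferences (key, value) VALUES (?, ?)",
            (key, value),
        )


conn = sqlite3.connect(path)
with conn:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS preferences (key TEXT PRIMARY KEY, value TEXT)"
    )

# Every edit is committed as soon as it happens, so there is nothing to "save".
set_preference(conn, "units", "metric")
conn.close()

# Reopen the file, as after a restart or crash, and read the value back.
conn = sqlite3.connect(path)
stored = conn.execute(
    "SELECT value FROM preferences WHERE key = ?", ("units",)
).fetchone()[0]
conn.close()
```

Each change costs one small transaction rather than a rewrite of the whole 10MB, and the file holds plain rows rather than executable pickles.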
+4  A: 

I would encourage you to look at SQLAlchemy like you discussed in your question. You can map tables in SQLite to Python objects. Using the SQLAlchemy session ( http://www.sqlalchemy.org/docs/session.html#what-does-the-session-do ) you can run queries, add objects to the tables and issue a session.commit command straight away in order to auto save data to SQLite.

New data element:

ed_user = User('ed', 'Ed Jones', 'edspassword')  # User is the class you mapped the table to
session.add(ed_user)
session.commit() # basically auto saving here :)

That's what I'd use. I'm using SQLAlchemy for a project now and like what I see. For more see here: http://www.sqlalchemy.org/docs/ormtutorial.html#adding-new-objects

Edward Williams
A: 

Why not give CouchDB a try? Since your data is heterogeneous, it's the best approach. It's easy to use, and 10MB is nothing to it: http://pypi.python.org/pypi/CouchDB

boblefrag