I'm tasked with writing an application that acts as a central storage point for files (usually document formats) as provided by other applications. It also needs to take commands like "file 395 needs a copy in X format", at which point some work is offloaded to a 3rd party application. I'm having trouble coming up with a strategy for this.

I'd like to keep the design as simple as possible, so I'd like to avoid big extra frameworks or techniques like threads for as long as it makes sense.

The clients are expected to be web applications (for example, one is a django application that receives files from our customers; the others are not yet implemented). The platform it will be running on is likely going to be Python on Linux, unless I have a strong argument to use something else.

In the beginning I thought I could fit the information I wanted to communicate in the filenames, and let my application parse the filename to figure out what it needed to do, but this is proving too inflexible with the amount of information I'm realizing I need to make available.

Another idea is to pair FTP with a database used as a communication medium (client uploads a file and updates the database with a command as a row in a table) but I don't like this idea because adding commands (a known change) looks like it will require adding code as well as changing database schemas. It will also muddy up the interface my clients will have to use.

I looked into Pyro to let applications communicate more directly but I don't like the idea of running an extra nameserver for this one purpose. I also don't see a good way to do file transfer within this framework.

What I'm looking for is techniques and/or technologies applicable to my problem. At the simplest level, I need the ability to accept files and messages with them.

A: 

You should be able to implement this in a fairly RESTful way with HTTP PUT and GET operations. This would be very useful for several reasons:

  • Able to link to the storage directly from internal websites
  • Easy testing
  • Lots of libraries available to help you implement this
  • No need to worry about what platform the client application uses
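To make the PUT/GET idea concrete, here is a minimal sketch using only the standard library's WSGI conventions. The `/filestore/documents/` prefix, the in-memory `STORE` dict, and the `call` helper are assumptions made for illustration; a real service would persist files to disk or a database and would need authentication.

```python
import io

# In-memory store, for illustration only (hypothetical; not persistent).
STORE = {}

# Hypothetical URL layout for stored documents.
PREFIX = "/filestore/documents/"


def app(environ, start_response):
    """Tiny WSGI app: PUT stores a document, GET returns it."""
    path = environ.get("PATH_INFO", "")
    method = environ["REQUEST_METHOD"]

    # Anything outside the prefix, or with an empty name, is unknown.
    if not path.startswith(PREFIX) or len(path) == len(PREFIX):
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    name = path[len(PREFIX):]

    if method == "PUT":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        created = name not in STORE
        STORE[name] = environ["wsgi.input"].read(length)
        status = "201 Created" if created else "204 No Content"
        start_response(status, [("Content-Length", "0")])
        return [b""]

    if method == "GET":
        if name not in STORE:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]
        body = STORE[name]
        start_response("200 OK", [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    start_response("405 Method Not Allowed", [("Allow", "GET, PUT")])
    return [b""]


def call(method, path, body=b""):
    """Drive the app in-process, without opening a socket (handy for tests)."""
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    result = {}

    def start_response(status, headers, exc_info=None):
        result["status"] = status

    chunks = app(environ, start_response)
    return result["status"], b"".join(chunks)
```

To try it over real HTTP, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` should be enough for local experiments. Any client that can speak HTTP can then upload and fetch files, regardless of platform.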

I would suggest implementing it so that getting a file in a specific format is as simple as navigating to:

http://www.myserver.com/filestore/documents/docname?format=xxx

Within the server of course I would use a database of documents, file formats and cached versions of already converted files. I would invoke the third party translators on demand, i.e. when a request comes in for a document in a specific format and the document is not already in the cache.
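A rough sketch of that convert-on-demand cache follows. It assumes the requested format arrives as a `format` query-string parameter. The `DocumentStore` class, the `parse_request` helper, and the `converter` callable are hypothetical names; in practice the converter would wrap whatever third-party translator you use (for example via `subprocess.run`), and the dicts would be replaced by a database.

```python
from urllib.parse import urlsplit, parse_qs


class DocumentStore:
    """Keeps originals and lazily caches converted copies (illustrative only)."""

    def __init__(self, converter):
        # converter(data, src_format, dst_format) -> bytes; assumed to wrap
        # the third-party translation tool.
        self._converter = converter
        self._originals = {}  # name -> (format, data)
        self._cache = {}      # (name, format) -> converted data

    def add(self, name, fmt, data):
        self._originals[name] = (fmt, data)
        # A new upload invalidates conversions of the previous version.
        for key in [k for k in self._cache if k[0] == name]:
            del self._cache[key]

    def get(self, name, fmt=None):
        src_fmt, data = self._originals[name]  # KeyError if unknown document
        if fmt is None or fmt == src_fmt:
            return data
        key = (name, fmt)
        if key not in self._cache:
            # Cache miss: invoke the translator once and remember the result.
            self._cache[key] = self._converter(data, src_fmt, fmt)
        return self._cache[key]


def parse_request(url):
    """Split a document URL into (name, requested format or None)."""
    parts = urlsplit(url)
    name = parts.path.rsplit("/", 1)[-1]
    fmt = parse_qs(parts.query).get("format", [None])[0]
    return name, fmt
```

With this shape, a command like "file 395 needs a copy in X format" becomes an ordinary GET for document 395 with `format=X`, and the translator only runs the first time that combination is requested.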

mikera
This is all fine and good, but honestly it's the most obvious part, even trivial compared to the hard part: the rules engine that actually orchestrates all those REST calls.
fuzzy lollipop
I looked more into this solution and it seems to make the most sense. With regard to @fuzzy lollipop's suggestion, a rules engine seems a little too beefy for our needs at this time (although I can't conclusively rule it out as a future project).
phasetwenty
You can do both and use a rules engine inside the server to implement your business rules. I probably wouldn't bother with a rules engine though unless your rules a) are complex b) are subject to change and c) require management of long-lived processes.
mikera
The lesson to remember: when you start writing your own code and realize you are writing a rules engine, STOP and use an existing one.
fuzzy lollipop
+3  A: 

What you need to research is a BPEL rules engine. Here is a list of open source rules engines written in Java. There are alternatives in other languages as well, even Python. This is definitely not something you want to reinvent yourself. This problem domain gets very complicated very quickly; any "simple" solution will be naive about scalability and performance and will get thrown out sooner rather than later.

fuzzy lollipop
+1 for something I've never heard of. This is instantly usable though. Thanks.
chiggsy
Most people's problems are far from unique.
fuzzy lollipop
A: 

Your issues seem to cry out for a RESTful web app. As to what framework will let you implement it best, some people do like Django even for that (maybe with django-rest-interface), others prefer lighter-weight frameworks for this purpose -- see some discussion on this SO question. Another possible framework not mentioned there is RIP -- it doesn't appear to be currently maintained, unfortunately (indeed, its link to its SVN repository is dangling), but it may be worth downloading the sources from PyPI, having a look, and maybe adapting it.

Alex Martelli