I'm contemplating an application that is, at a glance, a "free form" database: a collection of notes and artifacts. At the same time, though, there are some higher-level structures within the system.
My 10 second back-of-napkin design entails storing individual "entries" in small files (perhaps XML), organized in directories, and then indexing the entire set using something like Lucene.
The premise is that it will be trivial for folks to "interface" with the system, since they merely need to "put files into the right places". And since the entries are simple text files, they can be generated by any program (a script in a scripting language, say) or, if necessary, even by hand in a text editor.
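To make the "any script can generate them" point concrete, here is a minimal Python sketch of what a contributing script might look like. The element names (`entry`, `title`, `body`), the `id` attribute, and the one-directory-per-category layout are all assumptions for illustration; no format has been decided.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_entry(root_dir, category, entry_id, title, body):
    """Write one entry as a small XML file under root_dir/category/.

    The XML layout here is hypothetical, not a settled format.
    """
    entry = ET.Element("entry", attrib={"id": entry_id})
    ET.SubElement(entry, "title").text = title
    ET.SubElement(entry, "body").text = body

    # "Putting the file in the right place": one directory per category.
    target_dir = Path(root_dir) / category
    target_dir.mkdir(parents=True, exist_ok=True)

    path = target_dir / f"{entry_id}.xml"
    ET.ElementTree(entry).write(str(path), encoding="utf-8", xml_declaration=True)
    return path
```

A call such as `write_entry("store", "notes", "note-1", "First note", "Hello")` would leave a file at `store/notes/note-1.xml` that the application could pick up on its next scan.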
The catch is maintaining the index, along with any other possible relationships between entries.
In theory, on startup, the program could scan the directory for changed files and update the index. It could even do this in the background. I don't expect this to be a horribly long process, as I don't anticipate having thousands of entries. But there could always be an option to have the system scan only when instructed, if the collection gets too large.
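The startup scan described above could be sketched roughly as follows. This assumes entries are `.xml` files, that a changed modification time is a good enough signal of a changed file, and that the previous scan's state lives in a small JSON file kept outside the entry tree; `find_changed` and the state-file format are hypothetical names, not part of any existing tool.

```python
import json
import os
from pathlib import Path


def find_changed(root_dir, state_file):
    """Return (changed, removed) relative paths since the last recorded scan."""
    state_path = Path(state_file)
    previous = json.loads(state_path.read_text()) if state_path.exists() else {}

    # Record the modification time of every entry file currently on disk.
    current = {}
    for dirpath, _dirs, files in os.walk(root_dir):
        for name in files:
            if name.endswith(".xml"):
                full = Path(dirpath) / name
                rel = full.relative_to(root_dir).as_posix()
                current[rel] = full.stat().st_mtime_ns

    # New or modified files need (re)indexing; missing ones need removal.
    changed = sorted(p for p, mtime in current.items() if previous.get(p) != mtime)
    removed = sorted(p for p in previous if p not in current)

    # Remember this scan so the next startup only sees later changes.
    state_path.write_text(json.dumps(current))
    return changed, removed
```

The two returned lists would then be handed to whatever maintains the index (Lucene, in the design above), which is not shown here. The same function could run on a background thread or only on demand.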
Or I could require that some special file be updated with "new files" or some such thing that the system can check on start.
The alternative is to use some other format rather than individual files: a database of some kind; they're a dime a dozen. But with that, the data is suddenly "opaque" to a casual user, which makes scripting and the like potentially more difficult.
Now, I could use something like SQLite, which has broad support, and publish a database schema. Or I was thinking I could create a service layer in the application.
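For the SQLite option, "publishing a schema" might look something like the sketch below, shown through Python's standard `sqlite3` module. The table and column names (`entries`, `links`, `kind`, and so on) are invented for illustration; the real schema would depend on what the higher-level structures turn out to be.

```python
import sqlite3

# Hypothetical published schema: entries plus typed links between them.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entries (
    id      TEXT PRIMARY KEY,
    title   TEXT NOT NULL,
    body    TEXT NOT NULL,
    updated TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS links (
    from_id TEXT NOT NULL REFERENCES entries(id),
    to_id   TEXT NOT NULL REFERENCES entries(id),
    kind    TEXT NOT NULL,
    PRIMARY KEY (from_id, to_id, kind)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_entry(conn, entry_id, title, body):
    """Insert one entry; the 'with' block commits or rolls back the change."""
    with conn:
        conn.execute(
            "INSERT INTO entries (id, title, body) VALUES (?, ?, ?)",
            (entry_id, title, body),
        )
```

Any language with SQLite bindings could read and write the same file, which recovers some of the scriptability of plain files, though the data still can't be edited in a text editor.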
If I write it in Java, I could publish a Java API that a tool could use, but only if it too was written in Java.
Or I could expose the API as, say, lightweight Web Services (POX over HTTP, or REST over HTTP). HTTP support is widespread today. But that would require that the application be running in order to use any utility.
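As a rough idea of how small such an HTTP layer could be, here is a read-only sketch built on Python's standard `http.server`. The `/entries/<id>` URL scheme, the JSON payload (rather than XML), and the in-memory `ENTRIES` dictionary standing in for the real store are all assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Stand-in for the real entry store (hypothetical data).
ENTRIES = {"note-1": {"title": "First note", "body": "Hello"}}


class EntryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumed URL scheme: GET /entries/<id> returns one entry as JSON.
        prefix = "/entries/"
        entry = None
        if self.path.startswith(prefix):
            entry = ENTRIES.get(self.path[len(prefix):])
        if entry is None:
            self.send_error(404, "No such entry")
            return
        payload = json.dumps(entry).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        # Keep the sketch quiet; a real service would log requests.
        pass


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), EntryHandler)
```

Calling `make_server(8080).serve_forever()` would serve entries to any HTTP client (a browser, `curl`, a script), but only while the process is up, which is exactly the trade-off noted above.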
As with everything, it's a balance. I think the file solution is simpler, though potentially less efficient, and it may limit how much internal complexity the system can handle.
The API can be much more powerful, but it is harder to use and certainly not useful for a casual user.
How do you think you might approach this kind of problem?