We need to take data from an arbitrary data source (a database, a flat file, really anything), morph it so it maps to our schema, and then have it sent to us so we can store it. The data will come from ~200 different physical locations around the country, and we'd like the transfer to run on a schedule (once a night or so). The people at those ~200 locations are not technical, so we want to make it as easy and hassle-free as possible for them.
Nothing is implemented yet; this is still in the design stage. Below is a preliminary design I came up with, and I'd like SO's opinion on it, any problems you can foresee, and any suggestions for a better way to do it.
What I came up with is a standard standalone application that accepts different plugins for reading data (from a database, flat file, etc.). The plugin reads the data and hands whatever is there to the main application, which serializes it to XML and sends it to us, either via a SOAP web service (described by a WSDL) or some kind of REST API (I haven't decided yet; SOAP seems like such a pain in the ass). The app can be scheduled via Windows Task Scheduler or run as a cron job; that part is easy enough to do.
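To make the plugin idea concrete, here is a rough sketch of what I have in mind. All names (SourceReader, InMemoryReader, Exporter) and the XML layout are made up for illustration, nothing is decided. The plugin only returns raw records; the main application turns them into XML using the StAX writer that ships with the JDK.

```java
import java.io.StringWriter;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical plugin contract: one implementation per kind of data source. */
interface SourceReader {
    /** Returns raw records as column-name -> value maps, with no mapping applied. */
    Iterator<Map<String, String>> read() throws Exception;
}

/** Toy reader standing in for a real database or flat-file plugin. */
class InMemoryReader implements SourceReader {
    private final List<Map<String, String>> rows;

    InMemoryReader(List<Map<String, String>> rows) {
        this.rows = rows;
    }

    @Override
    public Iterator<Map<String, String>> read() {
        return rows.iterator();
    }
}

/** The "main standard application" side: serializes whatever a plugin returns. */
class Exporter {
    static String toXml(SourceReader reader) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("records");
        Iterator<Map<String, String>> it = reader.read();
        while (it.hasNext()) {
            xml.writeStartElement("record");
            for (Map.Entry<String, String> field : it.next().entrySet()) {
                // Column names go into an attribute because they are not
                // guaranteed to be valid XML element names.
                xml.writeStartElement("field");
                xml.writeAttribute("name", field.getKey());
                String value = field.getValue();
                xml.writeCharacters(value == null ? "" : value); // writer escapes & and <
                xml.writeEndElement();
            }
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.close();
        return out.toString();
    }
}
```

A database plugin and a flat-file plugin would each implement SourceReader. For finding the plugins at runtime, java.util.ServiceLoader looks like one standard option, but I haven't tried it for this.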
This way the user only has to enter the location of the data and possibly how to get to it (username/password, host, or whatever other configuration is needed). The catch is that we have no idea what their data is going to look like, since there is no standard and every location does it their own way. What I was thinking is this: if it's a flat file, the only way to do it is to send the whole file. But if it's a database and no config file is present, the app sends us all the metadata (tables, column names, etc.). We can then build a config file that tells the application what data it needs to select, e-mail it to the user, and tell them where to put it.
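Roughly, the "config file present or not" decision for the database case could look like the sketch below. The class name ExportPlan, the properties format, and the key export.query are all assumptions of mine. The describe method uses the standard JDBC DatabaseMetaData.getColumns call; I have not run it against a real driver, and a real version would probably need to filter out system schemas.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Hypothetical: decides what a nightly run should send, based on the config file. */
class ExportPlan {
    enum Mode { SEND_METADATA, SEND_DATA }

    final Mode mode;
    final String query; // null when mode is SEND_METADATA

    private ExportPlan(Mode mode, String query) {
        this.mode = mode;
        this.query = query;
    }

    /** No config file (or no query in it) means we still need the site's metadata. */
    static ExportPlan fromConfig(Path configFile) throws IOException {
        if (!Files.exists(configFile)) {
            return new ExportPlan(Mode.SEND_METADATA, null);
        }
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(configFile)) {
            props.load(in);
        }
        String query = props.getProperty("export.query"); // assumed key name
        if (query == null || query.isBlank()) {
            return new ExportPlan(Mode.SEND_METADATA, null);
        }
        return new ExportPlan(Mode.SEND_DATA, query.trim());
    }

    /** Collects table name -> "column type" entries through plain JDBC metadata. */
    static Map<String, List<String>> describe(Connection conn) throws SQLException {
        Map<String, List<String>> tables = new LinkedHashMap<>();
        // "%" matches every table and column the driver exposes, which on some
        // databases includes system tables.
        try (ResultSet cols = conn.getMetaData().getColumns(null, null, "%", "%")) {
            while (cols.next()) {
                tables.computeIfAbsent(cols.getString("TABLE_NAME"), t -> new ArrayList<>())
                      .add(cols.getString("COLUMN_NAME") + " " + cols.getString("TYPE_NAME"));
            }
        }
        return tables;
    }
}
```

On the first run at a location there is no config file, so the app would send the output of describe. Once the e-mailed file is dropped in place, later runs would execute the configured query instead.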
I was thinking it would be easiest and probably best to do the actual morphing of the data on our side, so that if the mapping changes we don't have to send anything new out to the locations.
This will all be done in Java, so if there's already some obscure Apache project to do all this for me, please do let me know.
Also, what other storage solutions do you foresee non-technical people using besides a standard SQL database or a flat file?
Thanks in advance!