I'm looking for recommendations for an ETL system that collects data from 200+ distributed systems (Windows, AS400, Linux, etc.).
We collect data each month from all of our customers (regardless of system type), bring it back, process it all together, and send the aggregate solutions back to them. I've been tasked with automating this process. Any suggestions on how to do this robustly? I really don't want to reinvent the wheel. I don't own any of the systems I'm pulling data from, which makes the task more difficult, but I can install a client on them.
I've created a prototype client/server architecture in Java with FTP for transport, but it feels brittle to me. I should note that all of the extract/transform code for the different systems already exists in Java (albeit legacy).
I should mention that we currently pull data once per month, but we are working towards weekly.
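To make "brittle" a bit more concrete: as far as I know, plain FTP does not by itself confirm that a file arrived complete and unaltered, so one safeguard would be for the client to send a checksum with each file and for the server to verify it before processing. Below is a minimal sketch of that idea, assuming SHA-256. The names TransferCheck, sha256Hex and isIntact are hypothetical and not taken from the prototype, and the sample payload is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of the existing prototype.
class TransferCheck {

    // Returns the SHA-256 digest of the data as a lowercase hex string.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Server side: compare the received bytes against the checksum the client sent.
    static boolean isIntact(byte[] received, String expectedHex) {
        return sha256Hex(received).equalsIgnoreCase(expectedHex);
    }

    public static void main(String[] args) {
        // Made-up sample extract; a real client would read the extracted file.
        byte[] payload = "customer-042,2024-01,1234.50\n".getBytes(StandardCharsets.UTF_8);
        String checksumFromClient = sha256Hex(payload);

        // A complete transfer passes the check.
        System.out.println(isIntact(payload, checksumFromClient)); // prints "true"

        // A transfer cut short by five bytes fails it and could be retried.
        byte[] truncated = Arrays.copyOf(payload, payload.length - 5);
        System.out.println(isIntact(truncated, checksumFromClient)); // prints "false"
    }
}
```

A check like this only detects bad transfers; retrying and resuming them would still need to be handled by whatever transport or tool ends up being used.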
Any insight is appreciated.