I'm currently in charge of a process that works very closely with the database. My program/script/framework's goal is to produce uniform output from disparate data sources. Using a form of dependency injection, the process works fine at a very high level: the implementation of each data source type is hidden from the highest-level business abstraction of what's going on. Great. I have two questions.
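For context, here is roughly the shape I mean, condensed into a sketch (the package and method names are made up for illustration, not my real code):

    use strict;
    use warnings;

    # Hypothetical base "driver"; each concrete source type overrides load().
    package DataSource;
    sub new  { my ($class, %args) = @_; return bless { %args }, $class }
    sub load { die "subclass must implement load()" }

    # One concrete driver per source format; callers never see the details.
    package DataSource::CSV;
    our @ISA = ('DataSource');
    sub load {
        my ($self) = @_;
        # ...parse $self->{path} as CSV and return normalized records...
        return [];
    }

    # The high-level process receives a driver, so it stays format-agnostic.
    package main;
    my $source  = DataSource::CSV->new( path => 'league_stats.csv' );
    my $records = $source->load;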
1) I have a long block of code (and it's the length that's bothering me) that assembles an SQL statement in Perl-space to translate these different data sources into one homogeneous end format. The SQL string always depends on the type of data I'm working with: the WHERE clause depends on it, the FROM clause depends on it, the INSERT clause depends on it. It's this high degree of dependence that's confusing me. How do I model this process in an object-oriented way? MagicObject->buildSQL? That's essentially what I have now, but it feels like every part of the code knows too much, hence the length.
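To make the smell concrete, here's a condensed, hypothetical version of the kind of code I mean (table and column names invented for illustration):

    # Condensed illustration of the current approach; all names are placeholders.
    sub build_sql {
        my ($source_type) = @_;
        my ($insert, $select, $from, $where);
        if ($source_type eq 'flat_file') {
            $insert = 'INSERT INTO unified (name, city)';
            $select = 'SELECT full_name, city_name';
            $from   = 'FROM staging_flat';
            $where  = "WHERE status = 'raw'";
        }
        elsif ($source_type eq 'normalized') {
            $insert = 'INSERT INTO unified (name, city)';
            $select = 'SELECT p.name, c.name';
            $from   = 'FROM people p JOIN cities c ON c.id = p.city_id';
            $where  = 'WHERE p.active = 1';
        }
        # ...and so on, one branch per source type...
        else { die "unknown source type: $source_type" }
        return join ' ', $insert, $select, $from, $where;
    }

Every branch repeats the same stitching, and every branch has to know the entire target schema, which is why it keeps growing.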
2) If I have a function that does something (builds SQL, say), do I pass in the business objects whole and stringify them at the last minute? Or do I stringify them early and hand the function only the values it needs, instead of letting it render the objects itself?
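Put concretely, the two signatures I'm weighing look something like this (hypothetical names; I'm returning a clause plus bind values so real code could still use placeholders):

    # Option A: pass the whole business object; the function renders it itself.
    sub where_from_object {
        my ($source) = @_;        # this sub now knows the object's internals
        return ('WHERE state_code = ?', $source->state_code);
    }

    # Option B: stringify early; the function sees only the values it needs.
    sub where_from_value {
        my ($state_code) = @_;    # this sub knows nothing about the object
        return ('WHERE state_code = ?', $state_code);
    }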
Edit: While I don't doubt the importance of ORMs, I don't believe we're in ORM territory yet. Imagine baseball data for the American, National, and Fictional leagues were all stored in wildly different formats, with varying levels of normalization. It's the job of my process to read those sources and put them into one unified, normalized pool. The ORM work of acting on business objects happens after my process; I'm a sort of data janitor, if you will. There are essentially no business objects to act on yet, because the unified pool they'd live in is exactly what I'm creating.
Edit^2: It's been brought to my attention that maybe I haven't described the problem space in enough detail. Here's an example.
Imagine you had to make a master database of all the criminals in the United States. Your company sells a product that sits atop this data and provides access to it in a clean, unified format.
This data is provided publicly by the 50 states, but in wildly different formats. Some states publish one denormalized flat file. Others publish normalized tables in CSV format. Some provide Excel documents, others TSVs. Some records aren't even complete without manual intervention (other, manually created data sources).
The purpose of my project is to write a "driver" for each of the 50 states and make sure the end product of the process is a master database of criminals in a proper relational model: everything keyed correctly, the schema in perfect shape, etc.
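In sketch form, again with invented names and toy data, the end state I'm after is something like:

    use strict;
    use warnings;

    # Hypothetical per-state drivers, each normalizing its own raw format.
    package StateDriver::Ohio;        # say, one denormalized flat file
    sub new     { bless {}, shift }
    sub records {
        # ...parse the flat file...
        return [ { name => 'J. Doe', state => 'OH', offense => 'fraud' } ];
    }

    package StateDriver::Texas;       # say, normalized tables shipped as CSVs
    sub new     { bless {}, shift }
    sub records {
        # ...join the CSVs in memory...
        return [ { name => 'R. Roe', state => 'TX', offense => 'theft' } ];
    }

    # The loader only ever sees the unified record shape.
    package main;
    sub insert_criminal { my ($rec) = @_; print "$rec->{state}: $rec->{name}\n" }
    for my $driver (StateDriver::Ohio->new, StateDriver::Texas->new) {
        insert_criminal($_) for @{ $driver->records };
    }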