This is more of a system architecture preference/recommendation question (yeah, it looks like an exam question, but I'm just aiming to make myself a better developer).

Requirement: You have a raw data source that needs to be exported to a proper data store through a batch process (running at regular intervals).

Assume you have a reusable process for picking up a CSV file (or a web service, or any persistent temp storage; it doesn't matter for this argument) and sending the information to a data store. The operation is small in technical terms but stable. The process is abstracted from its source; it doesn't care where the data comes from.

Your raw data is in a format that needs conversion/massaging before it can go into the CSV file.

1. You could compose a process to convert the data and place it into a CSV file, letting the reusable process pick it up.
2. You could make a specialist process that performs the conversion and sends the result directly to the data store.

Option 1 splits the work into two batch jobs; both must now run for the whole process to complete, so stability is affected at the architecture level (you now have two jobs in your daily checks).

Option 2 keeps stability (only one job in your daily checks), but you're not utilizing a reusable, already-proven process.
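To make the two shapes concrete, here is a minimal sketch in Python. Every name in it (convert, load_into_store, the record layout) is hypothetical, standing in for whatever your real converter, reusable loader, and data store look like.

```python
import csv
from pathlib import Path

def convert(raw_record: dict) -> dict:
    """Hypothetical conversion/massaging of one raw record."""
    return {"id": raw_record["id"], "value": raw_record["value"].strip()}

def load_into_store(records: list) -> None:
    """Stand-in for the reusable, already-proven loading step."""
    for record in records:
        print(f"stored: {record}")  # in reality, an INSERT or an API call

# Option 1: two batch jobs, coupled through a CSV file.
def job_convert_to_csv(raw: list, csv_path: Path) -> None:
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "value"])
        writer.writeheader()
        writer.writerows(convert(r) for r in raw)

def job_load_from_csv(csv_path: Path) -> None:
    with csv_path.open(newline="") as f:
        load_into_store(list(csv.DictReader(f)))

# Option 2: one specialist job that converts and loads in a single run.
def job_convert_and_load(raw: list) -> None:
    load_into_store([convert(r) for r in raw])
```

The coupling point in option 1 is the CSV file on disk, and each of the two jobs has to be scheduled and checked; option 2 collapses everything into one run but re-implements the loading step instead of reusing it.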

Which is the better option: include the reusable process and fulfil the requirement in two steps, OR build a single process that does everything and improves stability?

How would your answer differ in the following cases:

- If the reusable process is in the form of a service (theoretically handling requests/deposits from multiple processes; sketched below).
- If the reusable process is standalone, so using it in the solution would mean spinning up another 'instance' of it, adding yet another item to your daily checks.
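For the service case, here is a rough sketch of what changes, again with hypothetical names only: the reusable process becomes one long-running, already-monitored endpoint, so your converter is just another caller rather than another scheduled job.

```python
from dataclasses import dataclass, field

@dataclass
class LoaderService:
    """Stand-in for the single, already-monitored reusable process."""
    store: list = field(default_factory=list)

    def deposit(self, records: list) -> None:
        # One place to monitor, no matter how many producers call in.
        self.store.extend(records)

service = LoaderService()                        # the one monitored instance
service.deposit([{"id": 1, "value": "from A"}])  # existing producer
service.deposit([{"id": 2, "value": "from B"}])  # your new converter
```

In the standalone case there is no shared instance to call, so option 1 really does add a second scheduled job to the daily checks, which tilts the trade-off toward option 2.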