I manage a research database with Ruby on Rails. The data that is entered is primarily used by scientists who prefer to have all the relevant information for a study in one single massive table for use in their statistics software of choice. I'm currently presenting it as CSV, as it's very straightforward to do and compatible with the tools people want to use.
I've written many views (the SQL kind, not the Rails HTML/ERB kind) to make the output they expect a reality. Some of these views are quite large and have a fair amount of complexity behind them. I wrote them in SQL because there are many calculations and comparisons that are more easily done with SQL. They're currently loaded into the database straight from a file named views.sql
. To get the requested data, I do a select * from my_view;
.
The views.sql
file is getting quite large. Part of the problem is that we're still figuring out what the data we collect means, so there's a lot of changes being made to the views all the time -- and a ton of them are being created. Many of them need to be repeatable.
I've recently run into issues organizing and testing these views. Rails works great for user interface stuff and business logic, but I'm not aware of much existing structure for handling the reporting we require.
Some options I've thought of:
- Should I move them into the most relevant models somehow? Several of the views interact with each other, which makes this situation more complex than just doing a single
find_by_sql
, so I don't know if they should only be part of the model. - Perhaps they should be treated as a "view" in the MVC sense? (That is, they could be moved into
app/views/
and live alongside the HTML, perhaps as files named something likemy_view.csv.sql
which return CSV.)
How would you deal with a complex reporting problem like this?
UPDATE for Mladen Jablanović
It started by having a couple of views for reporting purposes. My boss(es) decided they wanted more, so I started writing more. Some give couple hundred columns of data, based on the requirements I've been given.
I have a couple thousand lines of views all shoved in a single file now. I don't like that situation, so I want to reorganize/refactor the code. I'd also like an easy way of providing CSVs -- I'm currently running queries and emailing them by hand, which could easily be automated. Finally, I would like to be able to write some tests on the output of the views, since a couple of regressions have already popped up.