Hi, I'm looking at a potential piece of work involving a fairly complex data capture and management site and am wondering how best to approach it. I would be using a LAMP architecture.
The backend will involve:
Mass user creation via user uploaded csv file - I have done this before so am confident about this part.
Once registered, users will submit confidential information on a regular basis. Although the data is confidential, I would simply be planning on storing this in a MySQL database, which would be backup on a regular basis.
There would be different roles involved:
An administrator, who would monitor user activity and create new roles and user accounts where required.
End users - who would simply be supplying information via forms, with the ability to edit their profiles, view previous submissions and other basic stuff like that
Management users - who can run a series of pre-defined reports on the data and display this information within the browser. They may also carry out "free-form queries" on the data as well as extract selected results as spreadsheets. A number of these standard reports would also be made available as web services/feeds. The free-form queries part is
slightly concerning, as I don't know yet what the data actually is, plus there is a strong likelihood that new questions would be dynamically added to the data capture forms over time. The database structure would have to be flexible to accommodate this, plus I would have to provide this capacity for free-form queries, which I've never done before. Can anyone suggest a sensible approach to this?
There would also be some form of versioning, so that if a user updates/revises certain data, the changes would be tracked and the previous record(s) would still be available. I would be planning to incorporate this into the database design, so that instead of updating and overwriting data in the database, a new record would always be created, with the "overwritten" one simply being flagged as archived. This way, I think I would always be able to retrieve the live data row, as well as the archived rows (organised by date). Does that make sense?
Thanks in advance for any pointers, this is a bit more complex than what I've worked on before (mainly standard CMSs) - I think i know how to handle the above, but would be grateful for any advice a more experienced developer could provide.