So I’ve been working on this project at work where I’m coding a php website that interacts with a database I have no control over. The database was “designed” by a co-worker that has been with the company many more years then I have; so in the end decisions are left for them to decide.
When I was first pulled aboard on this project I went to co-worker and explained that the database schema seemed flawed. I explained the importance of normalizing the database to insure data integrity issues, disk space savings, and that it would make the programmer’s (me) job easier. I even gave examples of how insertion, deletion, and update anomalies could occur in the current design. Nevertheless the co-worker explained to me that they did not want to over complicate the project’s database, and that it would not change period.
So now I’m a couple months into the project and I’m pulling my hair out each time I have to join two tables to insert a value in an attribute that has a one to one relation with each other. (So the attribute should have just been an attribute of the main relation.) The database looks horrible, and I’m afraid that years down the road this will come back on me since I programmed the front end that uses the database.
Does anyone have any suggestions as to how to talk a “superior” co-worker into correctly designing a database? Or any suggestions on how to avoid getting patronized years down the road for a design I didn’t have any part of? Should I just refuse to work on projects like this in the future? Leave a comment in my code saying the database wasn’t my doing?
Thanks.
Edit: Additional information in response to comments...
I know that the de-normalization of a database can be useful for speed purposes, so I’m not overlooking this. For those reading who haven’t heard of this tactic I’ll illustrate an example. Often database designers have an address relation that lists a user’s street, city, state and zip code. While everyone knows that a zip code determines the city and state, therefore constituting a table indexing zip codes to city and states. Often database designers will combine the two tables, de-normalizing them with foresight that every query for a user’s address would require a join from the address table to the zip table. This ultimately speeds up the querying process, and is sound reasoning for de-normalization of portions of a database design.
To fill in some details here the database is designed for a Tour Request system, so the data within is related to visitor information, dates, etc. The schema that the current database uses is unpredictable from start to finish. From the simplest inconsistencies in variable naming patterns (example: num_of_visitors, arrivalMethod, etc) to having separate relations defined for a single state one-to-one attribute. Example: statusID represents the status of the tour request, it can only ever have one valid state selected from a group of possible states (Approved, denied, pending, canceled.) For some reason the database has a status table containing: tour_id(Primary key of tour relation), statusID. This allows for multiple states to be defined for each tour request. Which, by design a tour request should only be in one state at any given time. So it’s a flaw in the design not an oversight of mine.