I'm currently developing a PHP application that's using an Access database as a backend. Not by choice you understand... the database is what the client used originally and using it is part of the requirements.
One of the problems with this database is that the column names have the most insane naming convention you could possibly imagine. Uppercase, lowercase, underscores, spaces and the plain insane. For example the column "gender" holds a date. And so does column "User2". There's a lot more but you get the idea.
Faced with this I decided to create an array to map the database columns to PHP variables so we can isolate the code from the madness. However my colleague believes that I'm over-complicating things and we should use the database's column names for the corresponding PHP variables so we don't need to go through the mapping array to find what goes where.
So my question is this... am I doing the right thing or am I complicating things?
Absolutely you are on the right track. If you don't abstract away the madness you will eventually succumb to the madness yourself.
Your colleague has a valid point though, so I suggest you also code an easy way to determine the data to column mapping in PHP.
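As a rough sketch of what that could look like: one array holds the mapping, one helper renames the keys of a fetched row, and a second helper answers "which column is this really?". The column names "gender" and "User2" come from the question; the readable names and the function names are made up for illustration, since the question doesn't say what those dates mean.

```php
<?php
// Hypothetical mapping: real Access column name => readable name used in PHP.
// 'signup_date' and 'renewal_date' are invented placeholders.
const COLUMN_MAP = [
    'gender' => 'signup_date',
    'User2'  => 'renewal_date',
];

// Rename the keys of a fetched row from database names to readable names.
function toReadable(array $row, array $map = COLUMN_MAP): array
{
    $out = [];
    foreach ($row as $column => $value) {
        // Columns missing from the map keep their original name.
        $out[$map[$column] ?? $column] = $value;
    }
    return $out;
}

// Reverse lookup: readable name => real column name, or null if unknown.
function columnFor(string $readable, array $map = COLUMN_MAP): ?string
{
    $column = array_search($readable, $map, true);
    return $column === false ? null : $column;
}
```

With this, `columnFor('renewal_date')` gives back `'User2'`, so nobody has to read through the array by hand to find what goes where.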
This isn't about keeping it simple, it's about retrofitting a solid foundation to build upon.
The thing that would worry me is that this kind of random design often hides certain business rules, things like "...if the gender is a date then they must have purchased a widget at some point therefore they can't be allowed to fribbish the lubdub... " - crazy I know but more common than it should be.
Names are exceptionally important. If you want your application to be maintainable, fix them before the code base grows further.
This is a good question as it talks to the heart of coding IMHO.
I would go with you and abstract out the bad names into decent, readable names. The result is a little extra complication in exchange for code that is much easier to read and understand.
To play Devil's Advocate, there's something to be said for not having an unnecessary layer of indirection in your short-term memory load for working with the system. Once familiar with the code, you will know what goes in which variable, so the main benefit is to someone new picking up the code from scratch. However, fixing that problem properly would also require fixing up the database schema which would (a) be a significant body of work, and (b) largely make the problem go away.
There is no black-and-white answer to this question, and the lack of an obvious answer to your specific problem suggests that you may want to let sleeping dogs lie.
On the other hand, if a cleanup operation is within the bounds of possibility then you may want to do it on a re-factoring type basis, incrementally fixing up the DB column names as the opportunity arises.
You didn't say you can't rename the columns in Access, so... do that! Another possibility would be to create views for each table, and rename the columns in the view. Then instead of working with table Employees, you work with view vEmployees. If I recall correctly, Access lets you update views as well as select from them. If you are using an ORM with PHP, however, it may not support updating views.
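For the view idea, the renaming itself is just a SELECT with aliases (in Access, a saved query plays the role of a view). Here is a minimal sketch that builds such a statement from a mapping. The function name and the readable names are assumptions for illustration, and the map is expected to be developer-written config, not user input, because nothing is escaped.

```php
<?php
// Build a SELECT that aliases each real column to a readable name.
// Square brackets are how Access SQL quotes names that contain spaces.
// $map is trusted config: real column name => readable alias.
function aliasedSelect(string $table, array $map): string
{
    $parts = [];
    foreach ($map as $column => $alias) {
        $parts[] = sprintf('[%s] AS [%s]', $column, $alias);
    }
    return sprintf('SELECT %s FROM [%s]', implode(', ', $parts), $table);
}
```

Calling `aliasedSelect('Employees', ['gender' => 'signup_date', 'User2' => 'renewal_date'])` returns `SELECT [gender] AS [signup_date], [User2] AS [renewal_date] FROM [Employees]`, which could be saved as vEmployees or run directly from PHP.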
I wouldn't say you are complicating things.
Eric Evans's book Domain-Driven Design has a lovely term for this: Anti-Corruption Layer.
Hard coding table names and column names is never a good idea even when the names make sense.
I don't know if using arrays is the best solution though. I'm not really familiar with PHP but I would have gone with something like constant strings to store the table names. In the languages I work in this would lead to more readable code.
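In PHP the constant-strings idea could look something like this. It is only a sketch: the class name, the constant names and the table name are hypothetical, and only "gender" and "User2" are real column names from the question.

```php
<?php
// Hypothetical constants class: one readable constant per real name.
final class EmployeeColumns
{
    const TABLE        = 'Employees'; // assumed table name
    const SIGNUP_DATE  = 'gender';    // real column name, holds a date
    const RENEWAL_DATE = 'User2';     // real column name, holds a date
}

// A fetched row is still keyed by the real names, but the code reads sensibly.
$row = ['gender' => '2001-05-17', 'User2' => '2002-05-17'];
$signupDate = $row[EmployeeColumns::SIGNUP_DATE];
```

Compared with an array, a mistyped constant name fails loudly with an error instead of quietly producing an undefined index.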
You are very unlucky to be stuck with this database but I think on the whole a way of abstracting the field names into something more sensible is smarter.
I would perhaps create a data structure containing the database name, sanitised name, type and a field for the content when you're pulling the data out of the DB. That would give a convenient way of drawing things together so you're not only mapping away the crazy name scheme.
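A minimal sketch of such a structure, assuming a schema written by hand as a list of [real column, readable name, type] entries. `Field`, `hydrate` and the readable names are invented for the example.

```php
<?php
// One field of a row: real column name, readable name, declared type, value.
final class Field
{
    public $column;
    public $name;
    public $type;
    public $value;

    public function __construct(string $column, string $name, string $type, $value = null)
    {
        $this->column = $column;
        $this->name   = $name;
        $this->type   = $type;
        $this->value  = $value;
    }
}

// Turn a raw row (keyed by real column names) into Field objects keyed by
// readable name. Columns absent from the row get a null value.
function hydrate(array $schema, array $row): array
{
    $fields = [];
    foreach ($schema as [$column, $name, $type]) {
        $fields[$name] = new Field($column, $name, $type, $row[$column] ?? null);
    }
    return $fields;
}
```

Given a schema entry `['gender', 'signup_date', 'date']`, the hydrated `$fields['signup_date']` carries both the value and the fact that it really lives in "gender" and is a date.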
Absolutely you're doing the right thing. In my opinion it's better to implement some sanity there. Going forward, your logic wouldn't be thrown away if they decided to change that database or any of its column names. If you build your mapping the right way, it should be easy to just plug the new tables/columns right in.
If anything, what you're doing improves the agility of your overall solution.
Of course I would still say KISS applies to the method of your mapping!
Using proper column names in your end of the application is the best you can do. And you should do it, unless you want to be asking "what was that field supposed to be again?" every time you come back to the code after working on something else.
Your colleague's point is not to overcomplicate things. That's valid, too.
So encapsulate access to the fields in a method or methods and have that method do the translation. Using maps this shouldn't be a performance problem.
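Roughly, that could be a small wrapper whose single method does the translation. This is a sketch only: `MappedRow` and the readable field name are hypothetical, and the lookup is a plain array access.

```php
<?php
// Wraps a raw row and translates readable field names to real column names.
final class MappedRow
{
    private $row; // raw row keyed by real column names
    private $map; // readable name => real column name

    public function __construct(array $row, array $map)
    {
        $this->row = $row;
        $this->map = $map;
    }

    // Fetch a value by readable name; unknown names fail loudly.
    public function get(string $name)
    {
        if (!isset($this->map[$name])) {
            throw new InvalidArgumentException("Unknown field: $name");
        }
        return $this->row[$this->map[$name]] ?? null;
    }
}
```

Calling code then only ever writes `$row->get('signup_date')` and never sees "gender".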
In fact, putting all the mapping to the data source in one object might help you if your customer reconsiders and decides to use a real database. And customers love to change their minds.
Why not create a data layer with classes that map onto each table? Then you can define the class methods to access the columns and give the methods whatever names you want. Then the data layer's database access code is the only thing that needs to know about the real column names. I suspect that someone (perhaps several someones) has already developed a framework to do this. Google "php orm".
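Without any framework, a hand-rolled version of one such class might look like this. It is a sketch under assumptions: the class and method names are invented, and the row is whatever associative array your database layer returns for one record.

```php
<?php
// Hypothetical class for one table. Only this class knows the real column names.
final class Employee
{
    private $row; // raw row keyed by real column names

    public function __construct(array $row)
    {
        $this->row = $row;
    }

    // The "gender" column actually holds a date.
    public function signupDate(): ?string
    {
        return $this->row['gender'] ?? null;
    }

    // So does "User2".
    public function renewalDate(): ?string
    {
        return $this->row['User2'] ?? null;
    }
}
```

If the schema ever gets cleaned up, only the two array keys inside this class change; every caller of `signupDate()` stays as it is.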
You still need to maintain the database. One possible approach is to map the field names in application code, as you plan to do. But sooner or later you will have to deal with this naming madness and fix the field names themselves. It is not a good idea to just screen yourself from a problem and imagine that this is a safe solution and a good way to go. It is only a temporary workaround. Do not fool yourself about it.