views:

69

answers:

3

I'm working with a legacy database which due to poor management and design has had a wildgrowth of columns which never have been or are no longer beeing used.

Is it possible to some how query for column usage? As in how often a column is beeing selected (either specifically or with *, or joined on)?

Seems to me like this is something we should be able to somehow retrieve but i have been unable to find anything like this.

Greetings,

F.B. ten Kate

A: 

This article will give you a good idea of how to search all fixed code (prodedures, views, functions and triggers) for the columns that are used. The code in the article searches for a specific table/column combination. You could easily adapt it to run for all columns. For anything dynamically executed, you'd probably have to set up a profiler trace.

Joe Stefanelli
A: 

Even if you could determine whether a column had been used in the past X period of time, would that be good enough? There may be some obscure program out there that populates a column once a week, a month, a year; or once every time they click the mystery button that no one ever clicks, or to log the report that only Fred in accounting ever runs (he quit two years ago), or that gets logged to if that one rare bug happens (during daylight savings time, perhaps?)

My point is, the only way you can truly be certain that a column is absolutely not used by anything is to review everything -- every call, every line of code, every ad hoc Excel data dump, every possible contingency -- everything that references the database . As this may be all but unachievable, try to get a formally defined group of programs and procedures that must be supported, bend over backwards to make sure they are supported, and be prepared to fix things when some overlooked or forgotten piece of functionality turns up.

Philip Kelley
+2  A: 

Unfortunately, this analysis on the DB side isn't really going to be a full answer. I've seen a LOT of instances where application code only needed 3 columns of a 10+ column table, but selected them all anyway.

Your column would still show up on a usage report in any sort of trace or profiling you did, but it still may not ACTUALLY be in use.

You might have to either a) analyze the entire collection of apps that use this website or b) start drafting the a return-on-investment style doc on whether it's worth rebuilding.

Andy_Vulhop
@Andy_Vulhop, +1 Good point. Your best bet, create a copy of the database, install the app(s) on a test machine, and start deleting/renaming columns/tables and seeing what does or does not break... having some automated testing to determine if the app works or doesn't would be ideal.
Chad
Precisely. This would be the perfect time to create a full suite of integration tests, provided they aren't in place, and start testing the system(s) with suspicious columns dropped.
Andy_Vulhop
Yes i am aware of this, problem is that the database really is pretty big (25 gigs on data atm) and i want to start somewhere. I'd like to start by simply getting rid of all the columns that haven't been updated/inserted for 6 months things like that.Ideally i'd rewrite every single thing that uses the database and redesign it as a whole (can you believe there are no FK's..... :'( )Can anyone suggest some good articles about starting up with integration tests?
F.B. ten Kate