A similar question about collaboration tools was discussed before, but one point was never fully agreed upon. Now that we have all of these collaboration and documentation tools (wikis, SharePoint sites, blogs, etc.) to keep track of project plans, business requirements, technical documentation and so on, the question is: "Should we ever delete this data?" As organizations evolve and reorganize, and as people come and go, a lot of this data becomes out of date, irrelevant, or simply incorrect.

  • One thought is that there may be useful material in this data, so we should keep it around: it preserves the information as it stood at the time and provides historical context.

  • An opposing argument is that this data creates too much noise and makes it hard for people to find the latest, up-to-date information.

Thoughts?

A: 

I suppose a big part of the question is "can we afford to never delete?" as in, does the org have the drive space?

Memory is cheap, but drive space allocation can sometimes be conservative, probably to discourage projects and departments from being sloppy, etc.

I would say that if the space is there, always back up and version, because with enterprise material, having a paper trail and history is more likely to pay off than to be a waste of space. Among the terabytes of data that will never get seen again, there is a line of code or documentation or an email that will be priceless when it's needed.

Having said that, I also think redundancy should be avoided. If your wiki has seven articles on basically the same thing, that is not the same as a backup, because it means having to update seven places for every change, and this will lead to misinformation that could count against the value of a backup. If someone needs to know how something worked two years ago and pulls up the article that didn't get updated (or was just wrong), the entire backup system has become a risk instead of an asset.

Ironically, I do think that when fixing redundancy, the redundant copies should be part of the backup. This is where my viewpoints obviously clash, which is why I think it's important to a) always try to centralize sources and have things point to them, and b) fix redundancies early. If you can somehow tie them all together so that a search for the needed info ensures the seeker knows about the other six articles, that would be an ideal patch, so long as it didn't become a crutch.

Long story short, it's better to archive data that never gets used than to delete it and wish you hadn't.

Anthony
+3  A: 

We recently dealt with exactly this problem on our internal wiki. It's really important to keep the signal-to-noise ratio high, or you will find that users stop using the tool for content and find alternative channels. The vast majority of user searches on an internal knowledge base will be for current information. This strongly suggests that current information should be the easy-to-find default, and that out-of-date content should be dealt with or made less accessible.

For example, in our organisation, there was a widespread perception that 'most' of the information on our intranet was out of date, and therefore could not be relied on. This led to immense inefficiencies, as individuals felt there was no option other than to contact one another directly, call meetings, make personal notes etc., in order to obtain current information. The combined administrative burden on the organisation was huge.

We chose to explicitly deprecate content which was no longer relevant, but had historical value. These pages are prominently marked with a 'deprecated' box at the top of the wiki page, and archived. They are still linked from their logical wiki sections for reference, but are clearly mothballed, and can be easily ignored if not required.

This makes it very clear that the information is not up to date. Truly useless old docs (as determined by the original author or by the wiki maintainer, me) we delete. But even in these cases, the pages are not truly gone. We use MediaWiki, which preserves the full history of every deleted page. These pages are still available to administrators, but the benefit of deletion is that they don't appear in searches and can't be navigated to by ordinary users.
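To make the three tiers concrete, here is a rough sketch of the policy in Python. It is only a toy model, not how MediaWiki works internally: the `Page`, `Status`, `search` and `render_title` names are all made up for illustration, and ranking deprecated pages after current ones in search results is my own assumption about how "less accessible" could be implemented.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    CURRENT = "current"
    DEPRECATED = "deprecated"  # kept for history, clearly labelled
    DELETED = "deleted"        # hidden from ordinary users, history retained


@dataclass
class Page:
    title: str
    body: str
    status: Status = Status.CURRENT


def render_title(page):
    """Prefix non-current pages with their status so they are easy to ignore."""
    if page.status is Status.CURRENT:
        return page.title
    return f"[{page.status.value}] {page.title}"


def search(pages, term, is_admin=False):
    """Return labelled titles of matching pages, current pages first.

    Deleted pages are only returned when is_admin is True.
    """
    term = term.lower()
    hits = [
        p for p in pages
        if term in (p.title + " " + p.body).lower()
        and (p.status is not Status.DELETED or is_admin)
    ]
    # Assumed ranking: current, then deprecated, then deleted (admins only).
    order = {Status.CURRENT: 0, Status.DEPRECATED: 1, Status.DELETED: 2}
    hits.sort(key=lambda p: order[p.status])
    return [render_title(p) for p in hits]


pages = [
    Page("Old deploy guide", "deploy steps for v1", Status.DEPRECATED),
    Page("Deploy guide", "deploy steps for v2"),
    Page("Scratch deploy notes", "deploy junk", Status.DELETED),
]

print(search(pages, "deploy"))
print(search(pages, "deploy", is_admin=True))
```

The point of the sketch is that nothing is ever physically destroyed; each tier just changes who sees the page and how prominently.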

The result for us has been a clear win. We now have an intranet which is genuinely useful to actual users. In the end that's much more important than worrying about endless 'what if this obsolete information is somehow relevant in the future' questions. The vast majority of it will never be required, by anyone, ever.

In short, don't be afraid to rigorously prune old stuff. The signal-to-noise ratio is what really matters.

ire_and_curses