I work in an IT department that is divided into two groups. One group develops and manages applications; the other manages the company's infrastructure and servers. One of the problems we face is a breakdown in communication. I work in the application group, and I am not notified when infrastructure takes a server down or refreshes a database.

Does anyone have suggestions on how to improve communication between the two groups, or any ideas on how to keep a lightweight log across multiple systems (both Linux and Windows)? Ideally, our boxes could just tweet their statuses or something.

Thanks for the help,

Ben

+2  A: 

One thing you could do to communicate server status is to have your infrastructure group set up a network monitoring system like Nagios. This gives everyone in your application group a snapshot view of the status of every server in the system. Having this kind of status is invaluable when you are doing development.

Besides network monitoring, Nagios also lets you show scheduled downtime for a particular server in the system.
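To make the scheduled downtime idea concrete, here is a minimal sketch, assuming the infrastructure group lets a script write to the Nagios external command file. `SCHEDULE_HOST_DOWNTIME` is a standard Nagios external command; the function names, the host name `db01`, and the command file path are assumptions for illustration (the path differs between installs).

```python
import time


def downtime_command(host, start, duration_s, author, comment, now=None):
    """Build a Nagios SCHEDULE_HOST_DOWNTIME external command line.

    start is a Unix timestamp and duration_s is a length in seconds.
    Semicolons separate the fields, so the comment must not contain one.
    """
    now = int(time.time()) if now is None else int(now)
    end = start + duration_s
    # fixed=1: downtime runs exactly from start to end.
    # trigger_id=0: not triggered by another downtime entry.
    return (
        f"[{now}] SCHEDULE_HOST_DOWNTIME;"
        f"{host};{start};{end};1;0;{duration_s};{author};{comment}"
    )


def submit(command, cmd_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write one command to the Nagios command file (path is an assumption)."""
    with open(cmd_file, "w") as pipe:
        pipe.write(command + "\n")


if __name__ == "__main__":
    # Hypothetical example: one hour of downtime on db01, starting in 10 minutes.
    begin = int(time.time()) + 600
    print(downtime_command("db01", begin, 3600, "infra", "Database refresh"))
```

Printing the command first, before calling `submit`, is a cheap way to check the line without touching the monitoring server.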

Another thing your group could do to foster communication with the infrastructure group is to have your build system report which servers it is currently using for building and testing your products.

Setting up a regular meeting between stakeholders of both groups is probably a good idea too. If you are all talking to each other, even for 15 minutes a week, you'll probably see incidents like the one you described go down quite a bit.
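As a rough sketch of what such a report could look like, the build script could append one JSON line per event to a file both groups can read. This is only an illustration: the function names, the field names, and the idea of a shared log file are assumptions, not something your build system already provides. It uses only the Python standard library, so it should behave the same on Linux and Windows.

```python
import json
import socket
from datetime import datetime, timezone


def format_event(server, event, reporter=None, when=None):
    """Return one JSON line describing what is happening on a server."""
    record = {
        # Use UTC so entries from different machines sort consistently.
        "time": (when or datetime.now(timezone.utc)).isoformat(),
        # Default to the name of the machine writing the entry.
        "reporter": reporter or socket.gethostname(),
        "server": server,
        "event": event,
    }
    return json.dumps(record, sort_keys=True)


def append_event(log_path, line):
    """Append the line to a log file (for example on a shared drive)."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical server name and message.
    print(format_event("test-db02", "nightly build is running tests here"))
```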

Nick Haddad
A: 

I like the Nagios idea as well. If you want to set up something that's more of a communication tool, I would recommend a content management system like Drupal.

We use Drupal internally to communicate between teams. When one team takes a server down, they add an event in Drupal. The rest of us get it as an email, as an RSS item, or just by refreshing the page.
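If a full CMS is more than you need, the RSS part on its own is small. The following is a minimal sketch of a maintenance feed built with the Python standard library; it is not how Drupal does it, and the function name, the event fields, and the intranet URL are made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(events):
    """Return an RSS 2.0 document (as a string) with one item per event."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Server maintenance events"
    # Placeholder URL; point it at wherever the feed is published.
    ET.SubElement(channel, "link").text = "http://intranet.example/maintenance"
    ET.SubElement(channel, "description").text = "Planned downtime and refreshes"
    for ev in events:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"{ev['host']}: {ev['summary']}"
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ev["when"])
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical event.
    events = [{
        "host": "db01",
        "summary": "database refresh",
        "when": datetime(2024, 1, 6, 22, 0, tzinfo=timezone.utc),
    }]
    print(build_feed(events))
```

Anyone who cares about a server can then subscribe to the feed in a reader or mail client, and nobody has to fill out a form.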

Chase Seibert
A: 

Implement a change control process where changes are submitted, approved and scheduled for BOTH groups. This lets everyone know what is going on. This process can be as light or heavy-weight as you want.

DJ
We do have change control processes in place, but I think it might get too granular if we have to fill out a document each time a dev box gets backed up or rebooted. On the other hand, I suppose you could have something more lightweight?
bong
+1  A: 

I think this is a bigger issue of change control.

You should have hardware and software change control and an approval process.

Ultimately, infrastructure serves you - the purpose of IT infrastructure is to run applications.

In my current large financial data company, servers are not TOUCHED without proper authorization through the client and application groups. It seems like a huge pain, but every single server is there for a reason - to meet a specific business goal and run a specific application. There is simply no excuse for the infrastructure group to be changing things or upsetting servers of their own volition.

Response to critical hardware failure might be an exception.

Needed software and OS updates are handled through scheduled maintenance windows and an approved change process.

Joe Koberg
The infrastructure team does go through change control, and the people who 'need' to know are told about the changes. I'm looking for a solution that gives those who care to know a way to find out.
bong
If your server went down and you cared, you needed to know.
Joe Koberg