Monday, September 15, 2008

Buffering...

It's not really clear what I'm doing writing this post when I'm leaving for Ireland in an hour and a half, but I guess I have a buffer to waste...

Anyway, this post is about the single most important part of any IT (any?) project - the BUFFER. Whatever procedure you're planning, always take a big enough buffer on top of your estimate - we usually take 30%.
I'm not talking only about installation processes, but also about things like telling your manager when you'll have a new product checked, or a patch installed for the first time on a development environment. I'd rather get a raised eyebrow and maybe surprise my manager later than work under unnecessary pressure and make up excuses. The same goes for actual installations: I'd rather announce a long downtime and get complaints from everyone than have to decide in a rush whether I should roll back now or hope I'm lucky enough to shrink an hour's work into fifteen minutes.
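For what it's worth, the 30% rule is simple enough to sketch in a few lines of Python. This is only an illustration: the function name `with_buffer` and the two-hour estimate are made up for the example, and the only number taken from the post is the 30%.

```python
from datetime import timedelta


def with_buffer(estimate: timedelta, buffer_ratio: float = 0.30) -> timedelta:
    """Pad a raw estimate by buffer_ratio (0.30 means 30% on top)."""
    if buffer_ratio < 0:
        raise ValueError("buffer_ratio must be non-negative")
    # timedelta supports multiplication by a float; the result is rounded
    # to the nearest microsecond.
    return estimate + estimate * buffer_ratio


# Hypothetical example: a patch I honestly expect to take two hours.
raw_estimate = timedelta(hours=2)
announced_downtime = with_buffer(raw_estimate)
print(announced_downtime)  # 2:36:00
```

The number you announce is the padded one; the raw estimate stays in your head.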

Well, that was my little - written in blood - piece of advice before I drown myself in beer, whiskey and god knows what else...

Saturday, September 6, 2008

Ignoring My Issues

A big portion of being a sysadmin is solving problems. One of the first changes I noticed when I stopped being a sysadmin is that I suddenly have (much) more time on my hands, time previously consumed by solving - usually minor - issues. But I think a sysadmin's job is also to know when to walk away from an issue. Let me give you a couple of examples:
  • There's a client-side issue with the EBS system I used to administer. I was never able to reproduce it at will or find its cause, but I do know that resetting the client's Windows profile solves the issue every time. So that became my policy: if the workaround is so simple and finding a real solution is so complicated (believe me, I've tried), I'd rather just ignore the problem.
  • Last week the new EBS sysadmin was trying to solve an issue with one of our EBS-related custom applications, which had suddenly stopped working for no apparent reason. The application resides on an Oracle Application Server, so he was going over the logs trying to find the problem. When I saw it was taking too much time, I came over and just restarted the Process Manager and voilà, it worked. When we talked afterwards, he said he was aware of this solution, but as it was past working hours he thought he'd better explore the issue - well, he had a point. But my point is that an issue that happens twice a year or so and can simply be "killed" isn't worth wasting your time on, even if "killing" it means affecting some other applications running on the same AS as well.
So how do you know when to ignore a problem? There's no real answer to that, but you should probably consider the following: Is there a workaround? How bad is the issue? How often does it happen? How long would it take to work out a real solution?
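If you like to see things written down, here is a rough Python sketch of how those four questions could be weighed against each other. It's a rule of thumb, not a formula I actually use: the `Issue` fields, the two-year horizon and the hour figures in the example are all assumptions made up for illustration. Only the "twice a year or so" comes from the story above.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    has_workaround: bool         # Is there a workaround?
    severe: bool                 # How bad is it? (True = can't be lived with)
    occurrences_per_year: float  # How often does it happen?
    workaround_hours: float      # Time lost per occurrence using the workaround
    fix_estimate_hours: float    # How long to work out a real solution?


def worth_ignoring(issue: Issue, horizon_years: float = 2.0) -> bool:
    """Assumed heuristic: ignore the issue if living with the workaround
    over the horizon costs less time than finding a real fix."""
    if not issue.has_workaround or issue.severe:
        return False
    workaround_cost = (
        issue.occurrences_per_year * issue.workaround_hours * horizon_years
    )
    return workaround_cost < issue.fix_estimate_hours


# Hypothetical numbers for the Application Server case: about twice a year,
# a quarter of an hour to restart, a full working day to dig through logs.
as_restart = Issue(
    has_workaround=True,
    severe=False,
    occurrences_per_year=2,
    workaround_hours=0.25,
    fix_estimate_hours=8,
)
print(worth_ignoring(as_restart))  # True
```

With these made-up numbers the workaround costs one hour over two years against eight hours of investigation, so the restart wins. Change the frequency to once a week and the answer flips.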