Monday, July 26, 2010

History

Only last week I completed an upgrade I had been working on for a long, long time - working on a low-priority project while leading a team doesn't make for agile upgrading. At least I've learned a few things since I started.

Interestingly enough, the Apache .dll error didn't reproduce on the other development environments, right up until the moment of truth: the production upgrade. Naturally, I had no recollection of the solution. I remembered running into this error and had it documented in my notes, but not the solution (shame on me). Luckily, this time the error was documented in Metalink and the solution was much simpler: a Windows .dll was missing (msvcr70.dll, if I'm not mistaken).
But the more interesting issue is the one that prevented the OPMN startall CA from completing...

After opening an SR (and a lot of debugging) I discovered the issue was with "ldapbinding" on the SSL port taking a long, long time; in other words, some heavy load on this port. I was advised to start the oidldapd process with more dispatchers (I later learned that adding more workers worked better for me). That did solve the issue, but it is not what baffles me about this case.
The first OID server I upgraded didn't suffer from this issue. I used to attribute that to it being the only server that wasn't part of an OID cluster, although given the nature of the upgrade process that always seemed like a lame excuse.
The thing is... the first production server I upgraded was OK as well. There's now only one thing I can think of that both servers have in common: neither of them was installed by me. My memory fails me again here, since there might be an additional server I didn't install that did have the problem, but there's a good chance it was installed in a slightly different way; a way I don't think should affect oidldapd's function, but a different way nonetheless.
The thought that drives me crazy right now is what different installation steps my ancestors could have taken that matter so much. All I did was follow their documentation (which has evolved over time). Or maybe they did perform some secret configuration steps they left no trace of???
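
For anyone chasing a similar slowdown, one rough way to check whether an SSL port is slow to respond is to time a few TLS handshakes against it. The Python sketch below is only an illustration under stated assumptions: it measures the TCP connect plus TLS handshake, not a real LDAP bind, so it is no substitute for the bind timing mentioned above. The host name and port in the usage comment are placeholders, and the function names are made up for this example. A server of that era may also only offer protocol versions that a current Python build refuses, in which case the handshake fails instead of being timed.

```python
import socket
import ssl
import statistics
import time


def time_call(fn, repeats=5):
    """Run fn() `repeats` times and return the elapsed seconds of each run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def tls_connect(host, port, timeout=30.0):
    """Open a TCP connection, complete the TLS handshake, then close.

    Certificate verification is switched off because this is only a timing
    probe (an assumption of this sketch; do not reuse it for real traffic).
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()


def summarize(durations):
    """Return (fastest, median, slowest) in seconds."""
    return min(durations), statistics.median(durations), max(durations)


# Usage (hypothetical host and port, replace with your own SSL endpoint):
#   timings = time_call(lambda: tls_connect("oid.example.com", 636))
#   print(summarize(timings))
```

Running the same probe against a server that shows the problem and one that doesn't gives a crude side-by-side comparison before and after changing the number of dispatchers or workers.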

1 comment:

Unknown said...

If by using the word "ancestor" you're referring to me, then this makes me feel old :(

I must say that the only thing I did apart from what's written in the documentation I left behind was saying the words "abra cadabra, work for moses and not for his replacement", but I don't think it should have any effect.