Sunday, December 27, 2009

Randomness

Time for a break from professional stuff. The only thing common to this post and my professional posts is my tendency for voodoo crap (although I do consider myself a very reasonable guy).

Lately I've been subject to several events that made me wonder if everything is really a coincidence. What's more interesting is that these kinds of events are not something I've invented: I know many people (supporting comments please) who have experienced similar things, and it's not the first time I've experienced them either:

1. Useful Fact - Everyone has some relatively well-known facts that he's missed during his life, me included. What is strange is that many times you learn one of those facts and then it suddenly pops up everywhere. My example from this month: a lecturer was giving a probability example and mentioned the fact (unknown to me before) that penguins do not live near the North Pole. A day later this fact was mentioned in a movie.

2. Lucky Instinct - A few weeks ago I was checking out Google Dashboard and saw that my last chat conversation had been with someone I knew I hadn't talked to in a while. I started constructing my next blog post about how Google showed me totally wrong data, but decided to check my "All Mail" anyway. To my surprise I did have a conversation with this person. Apparently she sent me a message just as I closed Gmail, and unlike other times when I miss a chat, Gmail didn't put this conversation in my Inbox, so I had no real way of knowing the message was sent (that's actually a Gmail bug as well, but not as bad as showing totally wrong data). This person wished me a happy holiday, so this coincidence saved me from being rude. The whole story is a bit like when you forget something minor at home, come back, then realize you actually forgot something much more important and start wondering what would've happened if you hadn't remembered the minor thing. But there you can give credit to the subconscious, unlike in my example, where I just happened to check the Dashboard on the same day. On the other hand, I might have missed a lot of other chats.

3. Déjà vu month - During life you meet and interact with many people, and with most of them you later lose contact; you might occasionally meet one of them but that's it. So it strikes me as very unlikely that during the last month I've met/reconnected with three different people from my past (with whom I had significant interaction before). The average amount of time I hadn't had any contact with those people is 4 years! And no, those people don't have anything at all in common and don't know one another. I actually had a similar month a couple of years ago (interestingly, and improbably, I think it was at the same time of year).

Do you believe in voodoo now?

Saturday, December 12, 2009

Business Model

Every company has a business model. It can focus on selling a product, selling accessories for the product, selling support, etc.

One type of model is the "consultation" model. It's used when the product itself is sold for next to nothing but the company still has to make money somehow, so it presents you with experts that will help you through every phase of the implementation.
Now there are two options - the consultant gets paid either on a per project basis or on a per hour basis. The per project option is OK, but it's highly unprofitable for the company - it's never a good idea to commit yourself to an amount of work you don't know how to translate into time (==money).

The per hour option is problematic from the client's side. A "good" consultant from the company's perspective is one that can stretch the same work over the longest amount of time. To me that seems to contradict professionalism, but that's not my point here. That kind of consultant has no motivation to do his work to the point and as effectively as possible, or even to answer questions on the phone - every question can be seen as something that should be thoroughly discussed on the client's site (and at the client's expense).
Now, with support things are easier - if a client feels that dealing with a support issue is taking too long, he concludes there's a problem with the company, and that's obviously not so good for the company. But consulting that takes a long time can easily be painted as good and thorough work on the consultant's part, and since the client often really doesn't have a good idea about the product, he can be easily fooled.

Another pitfall you should beware of when taking on a software project.

Friday, October 30, 2009

High-End Software

Like I mentioned a few months ago, I'm currently working on an Enterprise Search project. Since then I discovered a few other annoying things about FAST ESP and other high-end search solutions.

I've tested two main types of search solutions:
1. Entry level solutions - by companies whose core business is something other than search.
2. High-end solutions - by companies whose core business is search.
Obviously, there are a lot of products somewhere in between, but that's always true. Actually, FAST was recently acquired by Microsoft, but for the purpose of this post it still falls under the second category.

Being a newcomer to the field of Enterprise Search, I was quite innocent and thought that entry level solutions would be simple, basic and easy to use - no disappointment here - but I also expected high-end solutions to be a complete search suite that can do great and cool stuff.
Well, the products I've examined can do great and cool stuff. I'll even exaggerate and say that it feels like everything is possible if you know your way around the product, but in no way are these products a suite. It almost seems (it actually sounds like a swell business model) that the products are intentionally designed to make you grit your teeth at every step so they can provide their business partners with work for their consultants. It seems like they try to make the product as naked as possible, leaving only the basics of indexing efficiently, providing customization tools and a few other abilities. As always, whining is not complete without a few examples. I'll rely mostly, but not only, on FAST ESP examples, simply because I know it best:
1. I would expect a high-end solution to include some security enforcement. Say I want to index a file system content source (obviously there are other examples as well). To the entry level products it's obvious that I want to index the ACLs as well; not so much to the high-end stuff. High-end software will require installing an additional module that I'll have to carefully configure. And that leads me straight to the next point...
2. To configure the security model for FAST ESP to enforce file system security I have to follow the documentation which, put in one simple word, sucks.
Fact: surprisingly the FAST guys realized I might want to index Windows based file systems.
What I'd expect to see in the documentation: a simple to-do list containing every step I should do in order to configure the whole thing.
What I actually got: every piece of the puzzle is in a different part of the documentation so before each of the following revelations I had to wonder why nothing works and why nothing is written (or at least not where I expect to find it):
  • When the module is initially installed no indexed items are searchable. There are a few ways to work around it, and they're written at the end of the documentation.
  • There's no built in authentication module (a pity) but at least there are a few ways to work around it as well.
  • To index securely you have to modify the processing pipeline (or whatever term the product uses). Next point, here I come.
3. The products I've examined have far too many encoding-related issues (Hebrew and computers should not co-exist). Some of them I've been able to overcome, but still, for now I can't securely index paths containing Hebrew in FAST ESP.

I have lots of other examples and I'll probably have a lot more as I go along, but I think I made my point. Totally high-end...

Wednesday, October 21, 2009

Development

I'm not a developer, but as the team leader of the sysadmins (a team often referred to as the "technological team") in my department I am interested in the platforms and methods that are used for development. After all, the bottom line is that I'm responsible for the stability of the systems, and a system is more stable when the correct practices are used.

Let's take for example our Oracle ERP system (which I used to administer so I'm more familiar with it than with other systems). Today we have web applications based on different technologies:
1. PL/SQL cartridge based applications - well, the programmers do know PL/SQL so it's kinda OK, but no new applications are written using this technology and in the next EBS version (R12) it's no longer supported.
2. Java based applications - no one really knows them and they're supposed to be re-written.
3. ApEx based applications - small modifications are still made to them but the knowledge level is pretty low.
4. .NET based applications - that's the technology that all the developers are familiar with and with which the latest applications were developed.
We have all those techniques since each of them fitted a different need at a different time.

One technology we don't currently use is the Oracle Applications Framework which is a pretty strong technology but...
We currently have about 3.5 EBS developers, none of whom really specializes in Java (and the Framework is Java based), and the turnover rate in the developers' team is pretty high.
These facts don't really get along well with the multiple development techniques we maintain.

So now seems like the time to make a strategic decision as to how EBS development will look in the next few years. I can think of two main approaches:
1. We choose one single technology that we'll use to write all the solutions. Obviously it won't perfectly meet all the needs, but the programmers will be specialized in this technology and training will be relatively simple.
2. We choose two technologies that will answer two major types of requirements - for example general user applications vs. specialized user applications. The programmers will be less specialized but we'll be able to provide better solutions.

Anything more than that will have too much training and maintenance overhead...

Friday, July 17, 2009

Disappointment

Sometimes you think you've just seen something amazing: an item in the news, a discount or some great technology. Since this blog has a certain orientation, I'll focus on the latter.

This week I was planning my upcoming trip to Slovenia, and the Doc and I got to the point of looking at transportation, specifically trains.
Google led us to this site. If you enter the site you'll notice it adapts to your localization settings - nice! We started typing "Venice" in the destination field, and not only did we get auto completion, we also saw support for different spellings of "Venice" - really nice compared to the dozens of other transportation sites we've seen (at some point we thought it could read Hebrew, but that's really an exaggerated expectation). And now to the most amazing feature (tam tam tam...): the prices appear in the local currency (ILS)! After being amazed for a couple of seconds (yes I know, it's not really that sophisticated a technology, but as mentioned above we did see other transportation sites), we decided to check what the British version looks like.
Apparently the two sites give different prices for the exact same trip. Yeah, that's right: three adults traveling from Ljubljana to Venice and buying tickets in Israel will pay 453 ILS, while the same three adults buying from Great Britain will pay only 42 pounds. European discount? Don't think so, since Australians pay less. I've tried long and hard to think what code could be behind those ridiculous results, but I still have no idea. Suggestions?

So, going back to the first paragraph the conclusion is that miracles do not exist, sorry kids.

Saturday, June 20, 2009

Making Me Sweat

For quite some time now, I've been working on an Enterprise Search project for my organization. Last week I installed a FAST ESP system as part of a POC. The system uses many third party software components, and since each has its own license they can't be provided as part of the ESP software and have to be downloaded as part of the installation process.

But here's the catch: I don't have any Internet access on my network. The installation guide provides a solution for this kind of scenario - install the ESP on a computer with Internet access, and after the components are downloaded you can abort the installation and transfer the components to your LAN any way you want.
This is a very strange solution. Why run an unnecessary installation? What if I have limited privileges on my Internet computer?
There's a much simpler solution - why not provide me with the links to the components? After all, what I ended up doing was harvesting the links from the manifest.xml the installation uses to download the components...
Why make me sweat?
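
The link harvesting described above can be sketched in a few lines of Python. This is only an illustration: the real layout of manifest.xml isn't shown in this post, so the sketch assumes nothing about it and simply collects anything that looks like an http(s) URL from attribute values and element text. The sample manifest, its element names and its URLs are made up.

```python
import re
import xml.etree.ElementTree as ET

# Anything that looks like an http(s) URL, up to whitespace or a quote/bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def harvest_links(manifest_xml: str) -> list[str]:
    """Return every distinct URL found in attributes or element text, in document order."""
    root = ET.fromstring(manifest_xml)
    links = []
    for element in root.iter():
        candidates = list(element.attrib.values())
        if element.text:
            candidates.append(element.text)
        for value in candidates:
            for url in URL_PATTERN.findall(value):
                if url not in links:
                    links.append(url)
    return links


# Hypothetical manifest, NOT the real FAST ESP format.
SAMPLE = """<manifest>
  <component name="example-a" url="http://downloads.example.com/a.zip"/>
  <component name="example-b"><location>http://downloads.example.com/b.tar.gz</location></component>
</manifest>"""

if __name__ == "__main__":
    for link in harvest_links(SAMPLE):
        print(link)
```

The resulting list can be downloaded on any machine with Internet access and carried over to the LAN, without running the installer there.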

Monday, June 8, 2009

Storming In Or Taking Baby Steps

When you have a problem (specifically, technology related) there are two main methods for dealing with it:
1. Storming In - throwing everything you have at the issue, e.g.: trying every performance related command you know, changing hardware, etc.
2. Taking Baby Steps - slowly and thoroughly analyzing the issue, e.g.: reproducing the issue on a development environment, checking all changes that have been made, etc.

Reality, like always, is some shade of gray so what you do in practice is usually some combination of the above two methods.
Many times it's not clear which approach to use. Storming in will sometimes solve the current crisis but might handle only the symptoms, leaving you unprepared for the next time the problem manifests itself (and this time the instant solution might not work). Generally, it's always better to understand the causes of a problem and to investigate them, but sometimes issues won't reproduce on development environments, the causes are just too voodooish, and some issues are just simpler to ignore. And after all, time IS a valuable resource.

Thursday, May 28, 2009

The Stage I Fear The Most

Since I became a team leader I've been working more and more with Oracle products other than EBS, mainly iAS and OID. I've noticed that the installation part of iAS-like installations never fails. The stage I've really learned to hate is the Configuration Assistants - each one that completes makes me sigh with relief, but too many times I'm cursing instead.

For instance, lately I've been trying to upgrade my 10.2.0.2 OID instance to the latest 10.1.4.3 version. The upgrade path looks like this: 10.2.0.2->10.1.4.0.1->upgrade MR->10.1.4.3.
10.1.4.0.1 - check. upgrade MR - check. 10.1.4.3 - uh oh...
Even before installation begins, Oracle tries to outsmart me - the documentation says to shut down OPMN, but when I do that I get an error during the initial steps saying it can't determine the running processes (like daaaa!). OK, so I start OPMN, but then I get a "dude, you're running services that bother me" message. "OK take it easy, I'm shutting it down" is my reply. Up until now I have the upper hand and the installation itself runs smoothly, as always.
But then comes the next-to-last CA - the DCM CA, which fails because of a problem with an Apache dll. So after trying to solve it for quite some time (replaced dlls, configuration files, etc.) without any positive results, I opened an SR. The action plan is kinda funny - install 10.1.2.3 on top of 10.1.4.0.1, then try again. Makes sense, don't you think? Well, it does (a bit): my previous attempt did include a 10.1.2.3 upgrade before 10.1.4.0.1, so maybe I should've thought about it myself.
So I did just that, but then the SSO CA took ages (more like two hours, but you get the idea) to fail. Some mixture of opmn stopall/startall got it through, but now the OPMN startall CA fails, so I didn't even get to the problematic DCM CA yet. I guess I have some frustrating days ahead...

With the CAs I always have the urge to just skip 'em all and run 'em later. On the other hand, if things are wrong with something as basic as OPMN startall, maybe I should handle things now. I'll probably have more updates regarding this issue in following posts.

Saturday, May 2, 2009

Linguistics Tip

OK, so we all know that when facing language issues, playing with NLS_LANG often does the trick. But a while ago we had an issue with a web app (running on iAS) that would show Hebrew characters normally for strings taken from configuration files but would turn any Hebrew data from the database into question marks. The interesting thing was that the same application running from a workstation worked perfectly.
We were starting to get desperate, as more and more people looked at it and didn't manage to find a solution. But when I showed it to the DBA on my team he solved it in five minutes. Luckily, he had had a similar issue the week before.

Here's the trick: sometimes you have to modify the language settings for the server itself. Actually it's not enough to change the language settings, you have to check this annoying check box as well:


We had obviously played with the Regional Settings before, we just never even looked at this tab and hence didn't see any changes.
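
For readers wondering where the question marks come from in the first place, here is a small Python sketch of one common mechanism: text gets converted into a character set that simply has no Hebrew letters, and every letter is replaced with "?". It is a general illustration only, not a diagnosis of what iAS or the database client did in this particular case; the sample word and the choice of character sets are assumptions made for the demo.

```python
# One way Hebrew turns into question marks: the text is pushed through
# a character set that cannot represent it, and each letter is replaced.
hebrew = "שלום"

# A character set that covers Hebrew keeps the text intact.
via_cp1255 = hebrew.encode("cp1255").decode("cp1255")

# UTF-8 covers everything, so the round trip is lossless as well.
via_utf8 = hebrew.encode("utf-8").decode("utf-8")

# ISO-8859-1 (Western European) has no Hebrew letters, so a "replace"
# conversion substitutes a question mark for every character.
via_latin1 = hebrew.encode("iso-8859-1", errors="replace").decode("iso-8859-1")

if __name__ == "__main__":
    print("cp1255:    ", via_cp1255)
    print("utf-8:     ", via_utf8)
    print("iso-8859-1:", via_latin1)
```

Once the substitution has happened the original letters are gone, which is why the fix has to be made wherever the conversion takes place rather than at the point of display.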


Friday, April 24, 2009

Linguistics

My working assumption has always been that about 33% of my duty as an administrator is language related. It might sound like a hell of an exaggeration, but experience shows I'm not that far off target. You can always count on Hebrew to supply some issue with printing, badly encoded databases or question marks instead of readable text. The problem with linguistic issues is that they tend to be hard to solve. You would expect most of these problems to have a common root cause, but the fact is that a new solution is required each time. You do collect useful tips as time goes by, but there's always some new issue waiting around the corner.
Heck, if I think about what I did over the last month, it seems like most of the time spent was somehow connected to the fact that English is not the only existing language. I even think I have material for a couple of posts with useful tips and a lot of desperation.

Usually when I encounter these issues I blame Hebrew for being so odd - wrong writing direction and weird characters don't make my job easier - but English seems to be pretty fucked up as well. Sometimes I wish I could find the guy who invented case (as in lower case and upper case). It really makes no sense. OK, so we'd write "english" instead of "English", who cares!

About a year ago I mentioned researching EUS. Well, we managed to implement the (rather cool) design we wanted, but not without trouble.
As part of the design we wanted to run command line esm commands to assign global roles to enterprise roles. The problem was that whenever we added a global role with the command line tool, the whole list of global roles would just disappear from the GUI. How come, you ask? Case sensitivity, I answer! The issue was that the record inserted into the OID had case issues. It took me a few months to stop waiting for Oracle's solution and a few days to decompile the code and find the problem; it took Oracle another couple of months to supply a working patch.
And all of this because of that stupid case invention.
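
To make the class of bug concrete, here is a tiny Python sketch of how a record written with one casing can become invisible to code that looks it up with another. It is a generic illustration, not the actual esm or OID code (which isn't shown in this post); the entry names and both helper functions are hypothetical.

```python
def find_exact(entries, wanted):
    """Case-sensitive lookup: only a character-for-character match counts."""
    return [entry for entry in entries if entry == wanted]


def find_ignoring_case(entries, wanted):
    """Case-insensitive lookup: compare case-folded copies of both sides."""
    key = wanted.casefold()
    return [entry for entry in entries if entry.casefold() == key]


# Hypothetical directory entries, one of them written with mixed case.
stored = ["cn=GlobalRoles,cn=Example", "cn=HR_ROLE,cn=Example"]

if __name__ == "__main__":
    # A reader that asks with different casing finds nothing...
    print(find_exact(stored, "cn=globalroles,cn=example"))
    # ...while a case-insensitive comparison finds the record.
    print(find_ignoring_case(stored, "cn=globalroles,cn=example"))
```

When the writer and the reader disagree about whether case matters, nothing fails loudly; the data is there but the lookup silently comes back empty.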

Sunday, April 12, 2009

Security Issues

A while ago the Apps DBA phoned me when I was away from the office and told me that production was experiencing an issue: concurrents were failing with an error message regarding access problems to .tmp files. It's not the first time I've encountered this kind of problem, and it's needless to say that it has never been the stated issue with file permissions or disk space. So I came to the office to help the DBA solve the issue (nothing trivial worked).
Solving the problem was odd. I had a good guess regarding the solution based on the previous cases, but it's really strange that it worked: if what I did fixed a real problem, I have no idea what triggered it or why the issue didn't manifest itself before, since nothing I know of changed (definitely not the problematic database parameter). Concurrents just started crashing for no apparent reason.

But that's not the main issue here (unless you're stuck with it and then you better know the solution or else you'll have some "fun" trying to solve the issue).
The really alarming thing about this case is that when concurrents crashed, some of them had the apps password (in cleartext) in their log files - the same log files every user can see for the concurrents they submit.
Really, I couldn't believe my eyes. It looked like the format usually present in those damn .tmp files, but I really don't understand what kind of leak can cause the apps password to be dumped into the log file. To tell the truth, it almost looks like an intentional (buffer overflow?) attack.
The implications are very unnerving. What if we hadn't happened to check a problematic log (not all of them had the issue), or what if we had just overlooked it? Everyone would have been able to see the password until those logs were purged, and we wouldn't have had a clue. How can we be sure it won't happen again, and who guarantees we will notice it the next time?
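
One partial answer to the "will we notice it next time" question is to stop relying on luck and scan the logs periodically. Below is a minimal Python sketch of that idea, under stated assumptions: the function name, the `*.log` pattern and the demo directory are all hypothetical, it does a plain substring search, and it says nothing about how EBS actually lays out its concurrent logs. In real use the secret should be read from a protected source rather than typed into a script.

```python
import tempfile
from pathlib import Path


def logs_containing(log_dir, secret, pattern="*.log"):
    """Return the log files under log_dir whose text contains the secret."""
    hits = []
    for path in sorted(Path(log_dir).rglob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on badly encoded logs.
        if secret in path.read_text(errors="replace"):
            hits.append(path)
    return hits


if __name__ == "__main__":
    # Self-contained demo on throwaway files with a fake password.
    with tempfile.TemporaryDirectory() as demo_dir:
        root = Path(demo_dir)
        (root / "clean.log").write_text("request completed normally\n")
        (root / "leaky.log").write_text("dump: pass=NOT_A_REAL_PASSWORD\n")
        for hit in logs_containing(root, "NOT_A_REAL_PASSWORD"):
            print("password found in:", hit.name)
```

A scan like this only detects a leak after it has happened, so it complements purging the affected logs and changing the password; it does not replace finding the root cause.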

Friday, February 13, 2009

Thinking With A Pen

Those of you who read previous posts of mine know that I love the connection between the human brain and computer science. One of the topics I learned in the Neural Networks course during my first degree was Associative Memory, and indeed I'm often amazed at how memories and thinking in general are association driven.

I may use all kinds of technologies to manage my data, but I still always keep a primitive notebook+pen set on my desk. You can say that there are things that are just easier to do with a notebook, and that's true, but in addition, this notebook helps me think. Sometimes I only need to grab my notebook and pen, without any scribbling, and I already get into "thinking mode". It's like the pretence of writing signals to my brain that it's time to work.
When I think about it, I actually can't perform heavy thinking without a pen in my hand. I remember last year when I had an Advanced Algorithms course: the homework wasn't so hard, so most of the questions I solved with a pen and a blank sheet of paper in my hand; whenever I tried to give up the pen it just felt awkward.
Another example of association: when I can't remember some fact, I try to think of a situation in which I used this fact, and it often helps (in other cases I just can't remember no matter what, but that's another issue).

Friday, January 2, 2009

Trust No One

One of the TV shows I like the most is House. As many people point out, each episode is pretty much the same and yet it's one of very few shows that improve as seasons go on. Anyways, one motif that appears in almost every episode is that people shouldn't be trusted under any circumstances, every patient House treats has some terrible secret lurking in the dark.

Users are sometimes like patients: they don't necessarily have a dark secret (although they can occasionally forget to tell you an important detail), but they definitely cannot be trusted. That's why the very first step of handling a problem is watching it reproduce with your own eyes. There are a couple of reasons I can think of for doing so. First there's the Sysadmin Effect - never underestimate it. And then there's the fact that the reported problem is often not the issue at all (or not at all an issue). Examples I've heard of and experienced myself range from "The system is down" when working on a computer disconnected from electricity to calling hysterically about "a critical application is down" when the application is neither important nor down. Obviously there are more examples in the mid-range, but this is just a blog post, not a thick book.
I have a special place in my heart for the guys that have some good advice, like the totally non-technical guy that advised me to add some memory to my servers to improve performance or the one that had an idea about how to fix the application (the OID) when the issue was with his OS.

So remember, always begin with the first step.