Saturday, May 3, 2008

Upgrading Part II: Bad Job

As with any upgrade or major installation I've ever done (at least as far as I can remember), after upgrading the database to 10g the application has demonstrated some interesting errors. One of the problems is related to concurrents and the executables they spawn crashing in mid-air - I promise to tell about this in more detail once I have any idea myself.
At some point I suspected that the problem was with custom code running cmd scripts. Since some of those scripts are called from within PL/SQL code using a Java stored procedure, I thought that maybe I should try a "less custom" way to do that.

Luckily (or unluckily), just the week before I had read about a new feature in 10g - the dbms_scheduler package, which is supposed to replace dbms_job and to be much more powerful; for instance, it enables you to run cmd scripts. So I thought I'd try it out, since it sounded exactly like the built-in method I was looking for.
Well, I have only one thing I can say about this: it's better to leave a feature out of the release than to keep it in when it sucks, totally.

Really, my keyboard is still soaked with sweat from my efforts to run a single script that echoes some text.
I've already written about this but I guess the message didn't get through: if I'm supposed to start the Oracle Scheduler service to run jobs, then that's exactly what I expect to be written in the error message I get when not doing so - certainly not a "file not found" error. I also expect this to be written in any (well, at least some) documentation describing the new feature; that's not the kind of thing one is supposed to dig up only in forums. By the way, while searching for a solution to my own issue I read about a hilarious problem with similar symptoms, and I actually tried the fix since at first I thought this was what I was experiencing: it appears that for some users (maybe in earlier 10g versions) just supplying a cmd script didn't work, and they had to run cmd.exe with parameters instead. I can't even begin to understand how you manage to create this bug and release it.
Well, after completing the POC (if at the cost of my health) I got to the real thing: running a script with parameters. In some document I saw something like the following example:
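For reference, the "run cmd.exe with parameters" workaround mentioned above can be sketched roughly as follows. This is a minimal sketch under assumptions, not a verified fix: the job name ECHO_TEST and both paths are hypothetical, and on Windows the OracleJobScheduler<SID> service is assumed to be already started.

```sql
BEGIN
  -- Hypothetical job name and paths; adjust to your own system.
  -- The job runs cmd.exe itself and passes the script as an argument.
  DBMS_SCHEDULER.create_job(
    job_name            => 'ECHO_TEST',
    job_type            => 'EXECUTABLE',
    job_action          => 'C:\WINDOWS\system32\cmd.exe',
    number_of_arguments => 2,
    enabled             => FALSE,
    auto_drop           => FALSE);

  -- Argument 1: /c tells cmd.exe to run the command that follows and exit.
  DBMS_SCHEDULER.set_job_argument_value(
    job_name          => 'ECHO_TEST',
    argument_position => 1,
    argument_value    => '/c');

  -- Argument 2: the script to run (hypothetical path).
  DBMS_SCHEDULER.set_job_argument_value(
    job_name          => 'ECHO_TEST',
    argument_position => 2,
    argument_value    => 'C:\scripts\echo_test.cmd');

  -- Run once, synchronously, in the current session.
  DBMS_SCHEDULER.run_job(
    job_name            => 'ECHO_TEST',
    use_current_session => TRUE);
END;
/
```

Running in the current session means any error should come back directly to the caller, which at least makes the cryptic messages visible right away.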

dbms_scheduler.create_job(job_name => 'JOB', job_type => 'EXECUTABLE',
    job_action => 'script.cmd', number_of_arguments => n);
dbms_scheduler.set_job_argument_value(job_name => 'JOB',
    argument_position => 1, argument_value => '...');
...
dbms_scheduler.set_job_argument_value(job_name => 'JOB',
    argument_position => n, argument_value => '...');
dbms_scheduler.enable(name => 'JOB');
dbms_scheduler.run_job(job_name => 'JOB');

Well, maybe it's just my system that is a freak, but apparently the enable procedure erases the job. Exactly: I create a job, I run dbms_scheduler.enable, and there's no job in the table - no error message, no nothing; I might as well have run drop_job instead. Well, apparently I don't need that line anyway. After some more struggle with cryptic error messages I got the package to do what I wanted it to do, but that's really not good enough.
I can't even start to imagine how much effort is needed to bootstrap all the advanced scheduling features - windows, chains, etc.
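One possible explanation for the vanishing job, not verified on this system: create_job defaults auto_drop to TRUE, and a job with no start_date and no repeat_interval runs as soon as it is enabled, so it may simply have run once and been dropped. A minimal sketch of how to check that, assuming a hypothetical job name JOB_KEEP, a hypothetical script path, and a single made-up argument:

```sql
BEGIN
  -- Same pattern as the example above, but with auto_drop => FALSE so the
  -- job definition should survive its first (and only) run.
  DBMS_SCHEDULER.create_job(
    job_name            => 'JOB_KEEP',               -- hypothetical name
    job_type            => 'EXECUTABLE',
    job_action          => 'C:\scripts\script.cmd',  -- hypothetical path
    number_of_arguments => 1,
    auto_drop           => FALSE);

  DBMS_SCHEDULER.set_job_argument_value(
    job_name          => 'JOB_KEEP',
    argument_position => 1,
    argument_value    => 'hello');                   -- made-up argument

  -- With no start_date and no repeat_interval, enabling the job
  -- schedules it to run immediately, once.
  DBMS_SCHEDULER.enable(name => 'JOB_KEEP');
END;
/

-- Is the job still there, and what state is it in?
SELECT job_name, enabled, state, run_count
  FROM user_scheduler_jobs
 WHERE job_name = 'JOB_KEEP';

-- What happened on each run, including any error details?
SELECT log_date, status, error#, additional_info
  FROM user_scheduler_job_run_details
 WHERE job_name = 'JOB_KEEP'
 ORDER BY log_date DESC;
```

If the job shows up in user_scheduler_jobs after this, the original disappearance was probably the auto-drop at work rather than enable deleting anything.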

I'm willing to bet money that most developers would have given up much earlier than I did, saying this stuff just doesn't work. You can develop useful and cool new features all you like, but if people can't cut and paste an example and see it just work, no one will use it; if all users get when something is wrong are misleading error messages, they'll just get frustrated (and you can see I am one such user).
I first read about this feature in a "10g Top 20 New Features" document, and indeed it sounded great, but if it's impossible to use, it's not really a new feature at all.
I'm sometimes not sure whether I should attribute all those funny errors I deal with to the fact that my system is on MS Windows - a less common platform. Maybe I should. But that's not a good enough reason: if I'm in possession of a disk labeled 10.2.0.3 for Windows, I expect it to work. I don't really mind having it released half a year later than the corresponding Linux version; I just want it to function.

As for me, I know about this feature, I can even make it work, but there's no chance I'll suggest it as a solution to any need except as a last resort. Too bad.
