Saturday, May 31, 2008

Three concepts which are shaking my foundations

1. HP's announcement of having made a memristor. I trained as an electrical engineer. The memristor is the 4th passive component (the others being the resistor, the capacitor and the inductor). One quarter of the fundamentals of electrical engineering was announced last week! A memristor can function as a memory cell but it's passive (i.e. no power to maintain state, no refresh cycles etc.). And to make matters worse the theory to explain how HP's memristor works requires a complete change in how electricity is understood (flux not voltage). It's been compared to Newton's discovery of F=ma.

2. Louis Savain's positing of a Universal Behaviour Machine rather than a Universal Computing Machine being the basis of computing theory. Parallelism is inherent in the UBM and serial processing has to be added as a constraint. Whereas for Turing's UCM, serialisation is inherent and parallelism is the constraint. Savain posits that we've got the whole theory of computing wrong. Thirty years as a programmer and now he tells me I don't know nuthin'. To make matters worse my training as an electrical engineer forces me to agree with him.

3. John Macmurray's positing that human beings are persons, not machines or animals but we don't have the philosophical apparatus to understand what a person is yet. Thinking is a by-product of what persons do. Blows the theoretical foundation out of most attempts at AI. A "thinking machine" is by definition an oxymoron.

This is God's/Allah's/Flying Spaghetti Monster's revenge for my complaint last week that there is nothing exciting happening in IT these days.

Tuesday, May 27, 2008

Making a boring job interesting

I found I needed to (re-)read a few online articles about making a job interesting because I was losing heart. (That's 'l-o-s-i-n-g' not 'l-o-o-s-i-n-g'. God, I hate it when people 'loose' something (as in set it free) when they mean 'lose' (as in misplace).)

Found a bit of inspiration in the bits that mention re-defining my job to include stuff I like doing. So I thought I'd better catalogue what I like doing. I like learning new things. I like learning things that other people are learning so I can talk about what I've learned, in other words, feel part of a community of learning. I like building new things. Usually that means new software artefacts, at least this is so with my skills (but my electrical engineering background means I still love the smell of solder resin and the ping of cut wires and the sheer delight of seeing the first LED light up). I also love teaching new stuff to people. I love preparing course materials (but hate PowerPoint).

Last night as I was about to leave I found a terabyte disk on the server full of VMware images. Ooh! Now that is interesting!

Yesterday I spoke to one of the developers about an error in one of the automatic builds. It was caused by a file having a different name to the one it had in the ClearCase Software Object declaration. That's a fatal flaw as far as I'm concerned. CC is out no matter what! I wouldn't touch it with a barge pole. A VCS is supposed to help not hinder software devel. And how did the developer's branch get into the main tree if it had errors in the compile? "Oh, we do that all the time. It takes too long to run a build in a branch so we just force the changed file into the main trunk and let the build engineer deal with it." (aka me!). Like bloody hell is that going to continue! Not sure if this contributed to the departure of the previous incumbent but it certainly will contribute to my departure if I can't get that changed in the next five months.

So here's the slowly forming picture of the future build system:
  • Set up virtualised copies of all the OS's we support. (That's probably done but I'm not sure if it's used yet.) Also need to set up some way of controlling the virtual server programmatically. (Need to learn about this! Yay!).
  • Set up a sandbox VM for myself. Probably RHEL (one of the supported OSs).
  • Install on my VM Apache2+mod_fcgid+Catalyst to allow web management of build system.
  • Speed-up display of build logs (gzip, caching etc.)
  • Install Bazaar(Bzr) or Mercurial(Hg) on VM and put in one of the product trunks
  • Install app to keep Bzr/Hg repository in step with Perforce repository (keep the mgmt happy)
  • Install SCons and see if it can handle a product build.
  • Prepare propaganda campaign: presentations, discussion papers, training courses. (But only if I can get an actual working example. Spin won't work here. Too many cynical developers :-)
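
The "gzip, caching" item in the list above could start out as something like the following. This is only a rough sketch in Python, not what the existing build system does: the function name `cached_gzip` and the choice of keeping the compressed copy next to the log as `<log>.gz` are my own assumptions. The idea is to compress each build log once and reuse the compressed copy until the log changes.

```python
import gzip
import os
import shutil


def cached_gzip(log_path):
    """Return the path of an up-to-date .gz copy of log_path, creating it if needed.

    Hypothetical helper: the '.gz next to the log' layout is an assumption.
    """
    gz_path = log_path + ".gz"
    # Reuse the cached copy unless it is missing or older than the log itself.
    if (not os.path.exists(gz_path)
            or os.path.getmtime(gz_path) < os.path.getmtime(log_path)):
        with open(log_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return gz_path
```

A web front end could then serve the `.gz` file directly with a `Content-Encoding: gzip` header instead of compressing the same multi-megabyte log on every request.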

Question: why is one product build taking 5 hours every day? It doesn't change that much. Why aren't previously compiled parts being used? Why does a full compile have to happen? (My guess is the VCS and build tool(s) haven't been reliable enough to risk using already compiled components. Another fatal flaw in CC (and P4?)?)
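
For what it's worth, the "reuse previously compiled parts" idea boils down to something like the sketch below. It is a toy illustration in Python, not how ClearCase, Perforce or the current build scripts work. The names `file_signature` and `stale_sources` and the JSON signature file are all made up for the example. It decides what to rebuild from a hash of each file's contents rather than from timestamps, which is the general approach tools like SCons can take.

```python
import hashlib
import json
import os


def file_signature(path):
    """Return a SHA-256 hex digest of the file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def stale_sources(sources, db_path):
    """Return the sources whose contents changed since the last recorded run.

    Assumption: the caller rebuilds everything returned, so the new
    signatures are recorded straight away.
    """
    old = {}
    if os.path.exists(db_path):
        with open(db_path) as f:
            old = json.load(f)
    new = {src: file_signature(src) for src in sources}
    changed = [src for src in sources if old.get(src) != new[src]]
    with open(db_path, "w") as f:
        json.dump(new, f)
    return changed
```

On the first run every source is reported as stale; after that only files whose contents really changed come back, so a day with three edited files would not need a five-hour full compile. A real tool also has to track header and library dependencies, which this sketch ignores.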


The good news about attempting this project is that if I fail I can walk in six months' time; no need to wear the current shit for any longer than necessary. The IT job market is so good at the moment I have to consider opportunity cost. I know I can earn 50% more at at least one job I was offered. I have a duty to my shareholders to maximise my company's profits :-). So if I can't get any commitment from the mgmt here to my project there'll be no hard feelings on my part when I leave.

Sunday, May 25, 2008

The hard part of a new job

One of the hardest parts about starting a new job is when the previous incumbent has left with no handover. In this case my predecessor had spent 8 years in the role and built layers upon layers of "obvious" decisions which might or might not be still valid. How do I find out? I'm tired of pestering people who have spent decades in their job with "trivial" questions. Two weeks now of pestering questions and I don't feel any closer to understanding why they are doing things in such a complicated manner. The tangled interconnections are almost completely opaque.

For instance, the main system is written in Perl on a Unix box. But it fires off processes onto a variety of machines running various OS's and the critical one runs on a Win box. OK, simple enough. Just fire up Perl in a CMD window and step it through. OK, simple enough. First section opens up a Lotus database. Say what?!! Is Lotus still alive? Database is actually a text file but it's maintained via a Lotus Notes application running on IE6. Say what?!! It's only a friggin' text file. Why is Lotus Notes involved? OK, simple enough. Let's log in to the Lotus app. "You don't have permission!" I'm the owner of this system but I don't have permission to access a critical section of the app because Lotus Notes' permissions don't synch with LDAP.

So now I'm in the middle of my usual crisis of confidence. Do I really want to take on eight years or more of crap and try to clean out the Augean stables? Or do I blow the whole lot away and start again? And how do I convince the rest of this sorry crew that the new way is better than what they've been using for close to a decade?

And the really hard bit is that my immediate supervisor doesn't seem to care what happens but he won't give me root permission on the main box so I can't install any sandboxes to try things out. The other problem is that this organisation sells proprietary products and seems very reluctant to use Free/Open source stuff. Case in point: they can't afford to buy everyone an Adobe Acrobat licence so I can't install PDFMaker to allow me to output PDFs from my Word docs. But instead of installing a freebie (like PDFCreator) I've been told I have to take my Word document to one of the designers who has a licence and he/she will output the PDF for me. Madness! And it seems typical of the attitude here: we can't afford a proprietary licence so we have to do it manually because we don't want to use free tools.

Well too bad. I installed PDFCreator, installed Sumatra, removed Adobe Reader and now I can create and view PDFs quickly and easily without all the Adobe Reader's constant prompts asking me to update it.

The next problem is 2-fold. How to get rid of Lotus Notes dependency? My immediate feeling is: "Install Catalyst and simply replace the web app". The problem is my employers sell a proprietary web app (which took me over a week to config) which could be squeezed into handling this task. Do I want to spend a couple of weeks beating it into shape or should I just quietly install the Cat app and not tell anyone? They've been quite happy to use Lotus Notes so it's not as though I'd be throwing out one of their products.

The 2nd part of the problem is that the need for a Lotus Notes database is purely to handle their way of tracking ClearCase tasks. And that is because of CC's peculiar way of handling branches. (Well that's not strictly true, it's more because of the way they have chosen to use CC). So if we are moving to Perforce why should I bother with a task tracking system?

All this previous raving assumes I am willing to continue with this job. I'm asking myself whether I want to take on the weeks and months of effort needed to complete this conversion. I'm not sure.

I find myself thinking of that book: "The Three Signs of a Miserable Job" (i.e. anonymity, unmeasurability, irrelevance) and this job is very close to three strikes and I'm out. "Anonymous" because my boss seems to be unwilling to volunteer any help or assistance. I don't know if he resents my taking this job (he apparently remains a good friend of the previous incumbent) and my paranoia makes me think he wants me to fail. I don't have anyone to talk with here in my little alcove. "Unmeasurability" because I have no way of assessing what "good work" is and I certainly haven't been given the tools nor the authority to do what I think necessary. And "irrelevance" because the reality is the developers could quite easily handle their build tasks themselves, and they resent having this conversion forced on them by management. And it also looks like the build system is way too complicated, with so many obscurities I get the impression the developers have given up trying to understand. I know I would have given up if I were in their position.

So I will let my doubts bubble away in the background for a bit longer. I go through this stage at every new job and it will pass if I stay. It took me over a month at the previous job so I'm not giving myself enough time.

Thursday, May 15, 2008

Converting choral music scores to MP3s

I sing in a choir. Apparently music and programming are closely related (can't find link) and nearly everywhere I've worked it didn't take long till I found other programmers who sang or played or wrote or painted or whatever. For many (like me) it's an absorbing hobby. For others programming has been the day job and their "real" life was the music etc.

I've come to classical choral music very late in life. I grew up with the popular music of the time, got into blues and jazz via guitar playing in my 20s and slowly my musical interests have expanded. But when I turned 50 I decided I wanted to learn to sing properly (tired of the sore throats) and my singing teacher suggested I join a choir to give me some focus. I'm kicking myself now for not having learned to sight read music and not persisting with my occasional attempts to learn keyboard. I insisted my children learned to read music as part of their readin', writin' and 'rithmetic but didn't apply the same logic to myself. D'oh!

So how do I learn to sing my part (tenor) in the choral pieces we perform? Software of course! I reasoned that there must by now be reasonable music OCRing software, Googled a bit and found only two products that continue to be supported. The various other attempts over the years seem to have died. Choral music tends to be "dead white man's music" and thankfully most of it is out of copyright. There is the wonderful Choral Public Domain Library which can often supply an electronic version of a work, often in MIDI format (my preference), sometimes in Sibelius or Finale format (which is useful with Finale's free player), but often only a PDF version is available.

Of course CPDL is only the tip of a vast repository of choral music sequestered in various libraries and archives around the world, probably never to be unearthed and performed. One of our choir members visited the Vatican and was allowed to hand transcribe a simple choral work he found in an archive there. Our choir was the first to perform the work outside of the Vatican. Another piece we perform annually was written 900 years ago. This is simply "mind-blowing baby" as Austin would say.

Back to music OCRing. The two products that seemed worth a look are PhotoScore and SharpEye. PhotoScore is the more heavily promoted and costs more but a Lite version is bundled with Sibelius so the cost is often hidden. The only problem is that when I tried PhotoScore a couple of years ago it was awful. I spent hours having to manually correct all the bits it failed to comprehend. In desperation I went looking again and found the wonderful SharpEye. It simply works. It's accurate and complete and quite inexpensive.

The main problem for me was that I am a Mac zealot and SharpEye only runs on Windoze. Initially I had to run it on Virtual PC and it was painfully slow but then Parallels was released and I ordered a MacBook the next day. SharpEye under Parallels is easy to use. So the procedure is to either download a PDF of the score or scan it in. Then I let SharpEye OCR the score. Usually there are some minor note value errors. (But it has even read hand-written scores and was reasonably accurate.) Compared to PhotoScore, this was heaven.

The final irony was to read a recent press release announcing that PhotoScore now uses the SharpEye engine. But the demo version I tried didn't seem any better than the older versions.

SharpEye exports in MusicXML format and both Finale and Sibelius can import MusicXML. MusicXML is actually the only 'full-content' format I can use to transfer files from SharpEye to Sibelius. The other common format, MIDI, doesn't handle lyrics.

So I import the MusicXML file into Sibelius and clean up the score a bit more. I simply play it and usually I can hear the errors (sharps/flats missing, sometimes (rarely) a note at the wrong pitch, more often timing errors (crotchets/quarter notes for minims/half notes or vice-versa)). I switch from Sibelius' default instruments for voices (very soppy oohs and aahs) to a nice piano and use a macro to set all the dynamics marks (mf, p, pp etc.) to 'f'. I want to hear the notes and ppp is usually too soft.

Then I export the file as a MIDI and import the MIDI into GarageBand. Now there ought to be scripts that would create four MP3 files for me, each one with a different part being played loudly and the other three parts being played quietly in the background. I tried a couple of Automator scripts but they hung. I tried manually creating an Automator script myself but it also hung at a crucial point so it makes me think the interface to GB might be faulty. Finally I FTP the MP3s to the website ready for choir members to listen and learn.
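
The "one part loud, the other three quiet" scheme is easy enough to describe in code even if GarageBand won't play along. Here is a tiny Python sketch that only works out the mixes and does not touch any audio. The function name `practice_mixes` and the 1.0 / 0.3 volume levels are placeholders I picked for the example.

```python
def practice_mixes(parts, loud=1.0, quiet=0.3):
    """For each part, return per-part volume levels with that part featured.

    The loud/quiet defaults are arbitrary illustration values.
    """
    return {
        featured: {p: (loud if p == featured else quiet) for p in parts}
        for featured in parts
    }


mixes = practice_mixes(["soprano", "alto", "tenor", "bass"])
# mixes["tenor"] has the tenor at 1.0 and the other three voices at 0.3
```

Each of the four entries would then drive one MP3 export, whatever tool ends up doing the rendering.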

Now this whole tedious, time-consuming process ought to be largely automated. Obviously manual correction needs to be done at some point but I keep wondering what a more automated process would look like. I wonder if such a Score-to-MP3 system exists (more Googling required I think). Maybe if I learn a bit more about MusicXML I could use a Perl script to extract the four parts and push them through some sort of MP3 creator. It's a thought.
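
As a first step towards that, here is a minimal sketch of pulling the parts out of a MusicXML file, written in Python rather than Perl. It assumes the file uses the `score-partwise` layout, where a `part-list` names each part and each `part` element holds that part's measures. The `list_parts` function and the tiny `SAMPLE` document are invented for illustration and are not real SharpEye output.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for an exported score (assumption, not real output).
SAMPLE = """<score-partwise>
  <part-list>
    <score-part id="P1"><part-name>Tenor</part-name></score-part>
    <score-part id="P2"><part-name>Bass</part-name></score-part>
  </part-list>
  <part id="P1"><measure number="1"/><measure number="2"/></part>
  <part id="P2"><measure number="1"/></part>
</score-partwise>"""


def list_parts(musicxml_text):
    """Return {part_id: (part_name, measure_count)} for a score-partwise document."""
    root = ET.fromstring(musicxml_text)
    # Map each part id to its human-readable name from the part list.
    names = {
        sp.get("id"): (sp.findtext("part-name") or "").strip()
        for sp in root.findall("part-list/score-part")
    }
    return {
        part.get("id"): (names.get(part.get("id"), ""), len(part.findall("measure")))
        for part in root.findall("part")
    }


parts = list_parts(SAMPLE)
```

Once the parts can be identified like this, writing each one out with its own dynamics, or feeding them to a MIDI renderer, becomes a scripting job rather than a clicking job.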

Wednesday, May 14, 2008

New job: surveying the land

So the job...

I don't know if it's just my way of wasting time because I don't want to be disappointed when I get to the nub of a task but I tend to look around a lot when I undertake a new task. Maybe it comes from using Perl so long. I usually Google for relevant topics. A couple of people mentioned the Unix/GNU make utility and I wanted to look at build/release systems to see what open-source and proprietary offerings are around. Found a good review on Freshmeat which led me to ClearCase and Perforce for proprietary offerings and cons and SCons for FOSS offerings. (Yes I know they address different problem domains, this is just my meanderings around the issue. I'm also aware of Subversion, Git, SVK, CVS, RCS and SCCS. And I'm aware of Ant and Maven. All of them address various aspects of the issue. Not all are relevant any more.)

Company uses ClearCase and Perforce for version control, wants to standardise on Perforce and wants legacy code moved from CC to P4. Current in-house build system uses make and Ant depending on code base. (And use of make would have to be mainly historical. Ant seems so much better, at least in my use of it.)

cons and SCons as replacements for make deserve a closer look. cons seems to have run out of interested developers. Written in Perl, parent of SCons, languishing from lack of publicity and support on an obscure Indian manufacturer's site, occasional pleas from users asking for such-and-such feature or bugfix but no one seems willing to put up a hand to help. (Is this a job for EC?)

SCons on the other hand, while based on cons, is written in Python, thus gaining some flavour-of-the-month interest but more importantly, is visible to Google (and has thus qualified for Summer of Code support). Also seems to have an active and interested community behind it. And a web page designer! (Pleasant, clean, modern home page. Makes it look alive cf cons.)

Don't ever tell anyone :-), but I don't always obey my boss when told to develop something. I listen carefully to what he/she asks for and then I attempt to do two things. One is I try to work out what they really want/need. Not just in the political sense. I've worked with sociopathic bosses who think they can control me by not telling all the details of the requirements or who's paying for the job or who gets rewarded if it succeeds. One guy told me I had six weeks to complete a task but I accidentally (really!) read an email to his boss where he mentioned the 12 weeks he had allocated to the task. Turns out he was paid a bonus for every week under 12 he could extract. Bastard.

Politics aside, sometimes they can only think in terms of existing systems so the request will be 'one of them but this bit changed'. I have to look at the wider picture. 'One of them' might be so old and poorly maintained and so slow and resource hungry that a second one might destroy the machine or require a new one (not budgeted for of course).

There's also the much harder task: how do I stay interested? I can think of better things to do than spend 8 hours a day doing boring, repetitive, code monkey stuff. I like singing. I like playing guitar. I like talking to my wife and kids. I like walking the dog. I don't like boring shitwork. I'm also aware that there are a few million University graduates scattered around various "developing nations" perfectly capable of churning out a couple hundred lines of Java crap for less than half of what I get. So my code has to be compact but maintainable, precise but not obscure, correct but not overkill. And I have to stay interested. The only way I know how to do this is to find a way to turn a boring, churn-the-handle task into something exciting.

One way to do this is to use a new technology. This is how I discovered jQuery. A webpage needed some JavaScript but my first attempt clashed with the existing JS. And of course it didn't work correctly in IE6. Googled and found jQuery. Oh frabjous day! Suddenly JavaScript became interesting. So much power in such a compact notation. Sounds like Perl! And it automatically handled the PITA differences in IE6.

Succinct code

So the job...

Existing build system written in very clean and clear Perl. One of the advantages of Perl is that code is usually all the doco one needs if it's written by a reasonably competent Perl programmer (i.e. not a C or Java monkey in Perl clothing). Interestingly, a reasonably competent Perl programmer usually also adds useful/relevant comments. Java programmers OTOH love to put in comments like "the following adds 1 and 1 to make 2", followed by '$result = 1 + 1;'. Life is too short for this type of idiocy. I recently looked at a 600-line Perl script which did no more than ten lines of xsh code would. I also remember a few years ago a Java columnist demonstrating that now that Java had a regular expression library he could write his particular script in only 100 lines! The equivalent in Perl is a one-liner. 100 lines of code rots the brain if there's only one relevant line. Which is the relevant one? How much energy do I have to expend to process the other 99 lines to work out they are not the nub of the code? Of course, Java and many other languages aren't as succinct so one's thoughts are never as clear or focussed; they are always diffused over the other 99 lines.

New job

This blog is probably going to be more of a Twitter-like stream of consciousness rather than a considered, pondered assessment and evaluation of my situation (i.e. it's like every other blog).

Started new job as Build/release engineer for commercial organisation. (Spelled with an 's' immediately tells you which country. Job title immediately tells you the city and the organisation.) I'm a Perl programmer. Been one since 1990.

I spent 15 years prior to 1990 writing C code and then this new 'dynamic scripting' language Perl came along and said 'let me take care of the garbage (i.e. memory management) and you take care of the interesting part (i.e. writing code)'. Suddenly 90% of my coding time was freed. Oh, and Perl also let me forget about the intricacies of awk, sed, grep, join, cut, bash/csh/tcsh/ksh/sh (sure I occasionally use them but never for anything complicated). Suddenly one ring ruled them all.

Over the years since then I've looked long and hard at a lot of languages including VB, Java, C++, Smalltalk, Eiffel, Python, Ruby, Erlang, Haskell (the list goes on and on). All of them have attractive aspects. I remember reading once that each language is designed to solve a particular problem. (Maybe in that sense all programming languages are Domain-specific languages.) Fortran (Formula Translator) and COBOL (COmmon Business-Oriented Language) were early examples of this. Perl was invented supposedly because Larry Wall was too lazy to learn how to use all the Unix utilities and too impatient to wait for them even when he could get them to link together to process his report data.

Steve Yegge's talk on Dynamic Languages seems to confirm my own experience. It's just too hard to switch languages these days. In my case the Perl CPAN has over 10,000 downloadable plugin modules. It takes a lot of time to sift through 10,000 modules to find the ones which I am comfortable with. I can learn the syntax of a language in a couple of days but how long does it take to understand the subtleties of the Schwartzian Transform and why one would use it? Each language has such 'deep learning' aspects and they take a long time to grok.
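
For anyone who hasn't met it, the Schwartzian Transform is the Perl idiom for sorting by an expensive key: pair every item with its computed key (map), sort the pairs (sort), then throw the keys away (map again). The point is that the key is computed once per item instead of once per comparison. Here is the same decorate-sort-undecorate shape sketched in Python. The function name `sort_by_expensive_key` is mine, and Python's own `sorted(items, key=...)` already does the equivalent internally.

```python
def sort_by_expensive_key(items, key):
    """Sort items by key(item), computing the key exactly once per item."""
    # Decorate: pair each item with its key; the index breaks ties so the
    # items themselves are never compared and equal keys keep their order.
    decorated = [(key(item), i, item) for i, item in enumerate(items)]
    decorated.sort()
    # Undecorate: drop the keys and indices, keeping the sorted items.
    return [item for _, _, item in decorated]


words = ["pear", "fig", "banana", "kiwi"]
by_length = sort_by_expensive_key(words, len)
# by_length == ["fig", "pear", "kiwi", "banana"]
```

With a cheap key like `len` the saving is nil; it starts to matter when the key means a stat call, a regex match or a database lookup for every item. Knowing when that is the case is the "why one would use it" part that takes time to grok.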

I omitted JavaScript from my list above because I am finding I get more and more enjoyment from writing JS each day. jQuery has got to be something akin to manna from heaven.

Yegge's comments about Java compilers and language support tools effectively running the program and looking inside to see what it actually does rather than trust to the static declarations gives me great hope that a Perl programmer somewhere (won't be me: see blog title) will finally bite the bullet and run with the fact that only Perl can truly understand a Perl script and start writing IDEs and refactoring tools which are actually useful.