Posts Tagged ‘programming’

Promoting Code from R&D to Engineering: The 3 Decimal Place Barrier

August 23rd, 2011

One of the watershed events that mark a successful R&D project is the handover to engineering.

This step generally involves a fair amount of integration effort, and occasionally requires that the code be rewritten to employ databases or web services, or to conform to data contracts with other modules. Sometimes the logic must also be rewritten in a different programming language, perhaps translated from Python to C# or from Perl to Java.

I’ve noticed that in most such projects, even after the most faithful reimplementation of the code, the output only matches to three decimal places. Beyond that, the results invariably vary (pun intended).

For most decision support applications (which dominate the artificial intelligence field), this is close enough for all intents and purposes. However, for applications in finance, certain life sciences and space technology, this would be considered a minor crisis!
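One common contributor to this kind of drift is that floating-point addition is not associative, so a port that accumulates values in a different order can differ in the trailing digits. The Python sketch below shows the effect and a tolerance-based way to compare two implementations. The helper name `matches_to_places` is made up for illustration, and the sketch does not claim to explain the mismatch in any particular project.

```python
import math

# Floating-point addition is not associative: grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

def matches_to_places(a, b, places=3):
    """Hypothetical helper: True if a and b differ by less than half a unit
    in the given decimal place."""
    return abs(a - b) < 0.5 * 10 ** -places

# Compare the two results with an explicit tolerance instead of ==.
print(matches_to_places(left, right, places=3))
print(math.isclose(left, right, rel_tol=1e-9))

# math.fsum returns a correctly rounded sum, so the total does not depend
# on the order in which the values are visited.
values = [0.1, 0.2, 0.3]
print(math.fsum(values) == math.fsum(reversed(values)))
```

For the decision support case, a check like `matches_to_places` is usually all the acceptance test needs. For the finance case, the comparison would more likely have to be exact, which typically means agreeing on the arithmetic (for example decimal types and summation order) before the port starts.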

Technical Debt — Defined

February 7th, 2011

Designing software that is meant to be used requires that you put the user experience front-and-center when coming up with the design.

However, when you have functionality that you need to add to your system, you have two ways to do it:

  1. Quick and messy – you are sure that it will make further changes harder in the future. This involves actions like hardcoding parameters or bringing in libraries that you don’t completely understand.
  2. The other results in a cleaner design, but will take longer to put in place.
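As a small illustration of the two options, here is a Python sketch of a hardcoded parameter next to a cleaner, configurable version. The names (`needs_review_quick`, `ReviewPolicy`, `min_score`) and the 0.75 threshold are invented for the example.

```python
from dataclasses import dataclass

# Option 1, quick and messy: the threshold is buried in the logic.
def needs_review_quick(score):
    return score < 0.75  # hardcoded; changing it means editing this function

# Option 2, cleaner: the parameter lives in one named, overridable place.
@dataclass(frozen=True)
class ReviewPolicy:
    min_score: float = 0.75

def needs_review(score, policy=ReviewPolicy()):
    return score < policy.min_score

# Callers can now tighten the rule without touching the logic.
print(needs_review(0.8))                               # default policy
print(needs_review(0.8, ReviewPolicy(min_score=0.9)))  # stricter policy
```

The second version takes a little longer to write, which is exactly the trade-off described above.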

Ward Cunningham coined a wonderful metaphor (Technical Debt) to help us think about this problem.

In this metaphor, doing things the quick and dirty way sets us up with a technical debt, which is similar to a financial debt. Like a financial debt, the technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice. We can choose to continue paying the interest, or we can pay down the principal by refactoring the quick and dirty design into the better design. Although it costs to pay down the principal, we gain by reduced interest payments in the future.

The metaphor also explains why it may be sensible to do the quick and dirty approach. Just as a business incurs some debt to take advantage of a market opportunity developers may incur technical debt to hit an important deadline. The all too common problem is that development organizations let their debt get out of control and spend most of their future development effort paying crippling interest payments.

I’ve made some choices during my career where hard deadlines, or the limited-maintenance nature of the project, meant that the effort for very clean code and architecture was not justified. However, one practice that I would advocate is to keep a copy of Bugzilla around where you can log all the ‘todos’ required to refactor, clean up and enhance robustness in your project.

When you have debt, you have to keep track of it so that you can pay it off. Any other alternative is reckless and irresponsible. The metaphor holds equally well in the domain of software engineering as it does in the field of personal (or corporate) finance.

Programmer Pitfalls I

December 29th, 2009

A quick list of the classic errors I’ve seen programmers make (and which I’ve made as well at some point in time!).

Reinvent the wheel
Most programming assignments are not there to provide you with a sandbox where you can play. While you’re putting together your super-optimized implementation of the linked-list or quick-sort algorithm, someone is paying for your time and waiting on you to reap the benefits of their investment. If you must insist on reinventing the wheel, consider an academic career, do it in your spare time or take part in the greatest adventure ever: become part of a new startup!

Reuse blindly
You cannot assume that an API available in the wild will solve your problem completely. Unless it is a mature component for a well-understood problem (e.g. the Apache Web Server), you’ll probably exert a lot of effort to shape it to fit your need. Even if you manage to do what you set out to, you’ll have a lot of functionality in your code that shouldn’t really be there. This excess functionality will not only slow down your system, it will also make it difficult to bring other developers in (without a lot of hand-holding on your part) and can, in some instances, represent a serious security risk.

I should know. I just finished a project where we started by trying to leverage the Apache Nutch web crawler, and in the end we decided to roll our own simpler crawler rather than modify core components of Nutch to serve our needs. Prior to that, I was writing a scraper, and our first attempt was to piggy-back on a headless (i.e. without GUI) instance of the Mozilla XUL code-base. Even though it was theoretically possible to scrape through AJAX-driven menus with that approach, in the end we decided to employ simpler regular-expression-driven logic applied to the HTML.

Too much alphabet soup? The lesson here is simple: there is nothing wrong with trying to reuse, but develop a method to quickly map out the capabilities of the candidate component, compare them against your requirements and have an idea of how much effort a simpler ‘re-implement the wheel’ strategy would involve.

If this flies in the face of the first rule-of-thumb, consider that you should not reinvent the wheel for a mature, well-understood component. However, when you have a core innovative piece in your system, and there is nothing out there that immediately matches it, you then have license to build a working component that does exactly what is necessary (and no more).
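The capability-versus-requirement comparison suggested above can be as simple as a pair of sets. The Python sketch below is one hypothetical way to do it; the function name `fit_report` and the feature labels are invented and do not describe any real crawler.

```python
def fit_report(required, provided):
    """Compare what you need against what a candidate component offers."""
    required, provided = set(required), set(provided)
    return {
        "covered": sorted(required & provided),  # needs the component meets
        "missing": sorted(required - provided),  # work you still have to do
        "excess": sorted(provided - required),   # baggage you would inherit
    }

# Made-up feature labels, purely for illustration.
required = {"fetch pages", "follow links", "respect robots.txt"}
provided = {"fetch pages", "follow links", "distributed indexing", "plugin system"}

report = fit_report(required, provided)
for bucket, features in report.items():
    print(bucket, features)
```

A long "missing" or "excess" list is a hint that the simpler build-it-yourself route deserves an estimate of its own before you commit to reuse.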

Estimate based on a deadline or a desired date
This is a classic recipe for a death-march project. Your estimates should always be based on a complete list of the steps required to get from the beginning of the project to the end of it, aggregated from the cost (in time and money) associated with each sub-task. It is basically dishonest to promise to hit deadlines that you do not have a plan for.
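A bottom-up estimate of this kind is just a sum over sub-tasks. The Python sketch below shows the arithmetic; the task names, hours, hourly rate and contingency factor are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    hours: float

def estimate(tasks, hourly_rate, contingency=0.0):
    """Return (total_hours, total_cost), padded by a contingency fraction."""
    base_hours = sum(task.hours for task in tasks)
    total_hours = base_hours * (1 + contingency)
    return total_hours, total_hours * hourly_rate

# Hypothetical breakdown of a documentation project.
tasks = [
    Task("interview developers", 40),
    Task("write reference chapters", 24),
    Task("review and revise", 16),
]

hours, cost = estimate(tasks, hourly_rate=100, contingency=0.25)
print(hours, cost)
```

The point is less the code than the discipline: the total comes out of the task list, not out of the date someone would like to hear.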

I recently lost a project due to this policy of mine. The clients had asked me for some documentation, and instead of agreeing blindly, I decided to map out the full effort required. The final project became a multi-month activity, and required a budget of 60K+. When the client saw the bill, they balked.

I am actually quite happy that we did not go forward with this contract, as the alternatives were not very appealing. Consider them:

Scenario A: If I had billed by the hour (which the clients were willing to accept BTW), the clients would have received their project at the end of month X, but would have probably been very unhappy (as it would have cost them more than they expected).

Scenario B: If I had given a fixed quote, I’d have been working for many weeks beyond what was reasonably expected for this project.

This way, the contract was not finalized, but neither I nor the client was unhappy :-)

Some people would argue that it is rational to have billed by the hour in this case (as in scenario A). That may be the case in a Machiavellian sort of ends-justify-the-means type of world where profits are supreme. After all, you’re shifting all the risk to the clients!

However, I am not a faceless multinational, and am not beholden to callous stockholders to keep my job in my firm. I have to live with myself, and to be happy with the person in the mirror, I’d rather not benefit at someone else’s expense.