Archive for October, 2009

Donations for Senegal Food Aid

October 30th, 2009 1 comment

I know a few friends at CTIF, an Ottawa-based non-profit, who travel every year to Senegal, in Africa, with the goal of providing food aid.

When they first told me about this charitable work, I was a bit nonplussed. After all, I’ve never heard much about Senegal (except that it is in Africa). There has not been any tsunami, war, massive outbreak of disease or anything like that on BBC or the other news channels that I follow. I figured that it was time to do some research.

According to the BBC, Senegal is one of the few African countries without any significant conflict, and with a democracy that has been working quite well. So I asked myself, why are my friends going to Senegal this year to give out food aid? The answers were really remarkable.

The economic stats painted a dire picture; the work that my friends are doing is very much the need of the day. 54% of the population of Senegal lives below the poverty line. The UN estimates the poverty threshold at 1.08 dollars per day (http://en.wikipedia.org/wiki/Poverty_threshold). Thus, more than half of Senegal lives on about a dollar a day or less.

So the next time you go with your mates to Tim Hortons, Starbucks or grab a quick pizza, spare a thought for the people who really need this money. That $15 – $20 you just spent will pay for a family of 4 for 5 days.

I’ve learned two lessons from this. Firstly, we should be grateful for what we have been given. Secondly, there is an opportunity to support the good work being done by CTIF. Any money you provide will be used 100% for the intended purpose. There is absolutely zero overhead. All travel and other expenses are already covered by the individuals going to Senegal this year. Any money you or I provide will go directly to feed those who are in need.

My friends were quite emotional when they talked about their experience. Last year, the queue for the food aid was two kilometers long.

Some people waited two days in line to get the meat being given away.

Two days is a lot of time. However, let’s put this into perspective. This was probably the only meat that this person would have tasted in the entire year. At $1 per day, you can barely afford the staple food items, much less ‘luxury’ items like meat.

If you want to help support this effort, please write a cheque to CTIF c/o Northern Lights Educational Services at 26 Thorncliff Place, Ottawa, ONT K2H 6L2. Please specify if you need a receipt for tax purposes.

I know that some of you may be planning to make donations over the coming Eid festival. I’ve calculated that the rate for a share acceptable for an Eid donation is CAD 125. Seven such shares can be pooled together to purchase a bull for the food aid (please specify the name of the donating person(s) as well).

However, any help that you can provide will be most welcome. If you cannot send 125, at least send the cost of a pizza that you can skip for lunch today.

If writing a cheque is too much work, drop me a quick email or give me a shout over the phone (613.263.8009) and I’ll be happy to pick up the donation from you to pass on to CTIF.

Nervous!

October 23rd, 2009 4 comments

OK, we’re almost here. After 5 months of effort we are about to go live. The first 3 months were layered on top of a VERY demanding 60-hour-work-week consultancy, and the last 2 months were a flat-out, fully-focused, I-am-going-to-burn-out-or-get-this-done-even-if-it-kills-me rollercoaster ride.

Almost there. Almost.

The code works in testing. Everything is perfect so far.

The crawlers perform as per spec, the custom heuristics we’ve created to analyse blogs test out fine (they give VERY sane results), and the machine learning components give us over 96% accuracy. We can tell a ton about bloggers just by what (and how) they write, and the structure of their blogs (which are automagically reverse-engineered.. eat your heart out Harry Potter).

The entire flow is rock-solid, and I’m grateful that I chose the more robust option of Java EE to express the logic in, rather than a ‘quicker’ language like Perl, PHP or Ruby.

There are some ‘routine’ elements (i.e. subscription, registration etc) to take care of. Nothing associated with any risk.. stuff that we’ve done dozens of times before. It’s all about polish right now.

There may be some important pieces missing, or more likely in infancy (some amazing heuristics we can put in, but that’s for Q3 now), but then, this is a prototype. For those who would appreciate an analogy, it’s like building the airplane while riding in it. You have a high probability of experiencing crashes (which thankfully are not fatal in this scenario! .. yet).

I am feeling really giddy as I take the system through the last steps. Got to make sure that everything works.. have to keep an eye on security, batten down the hatches, set up the corporate presence (yes, emails, accounting systems etc), set the analytics to capture every nuance of user-interaction data on the site.

Perhaps once I get the site up, open it up to the world and unleash the dogs (sorry, Shakespeare’s Julius Caesar always gets me), I’ll feel better. Meanwhile, things to do… things to do !!!

I’m nervous! Have to look at patents and funding grants now.

We’re almost past the prototype stage (stage 2)! Now to delight the customers and scale up (Got an eye on you Amazon EC2… spare a smile for me, and a server or a few!).

Wish me luck guys!

Parameter Passing in Java

October 14th, 2009 3 comments

The way that Java actually handles the passing of parameters in methods is something that a lot of people are confused by. Since I’ve already been down this route, I thought that I’d lay some markers to help others who could benefit from this knowledge.

Java has much in common with C++, but the lack of pointers can complicate things. You don’t actually lose much of the power of pointers though. This is because Java passes every parameter by value, and for objects the value that gets copied is a reference to the object (much like passing a pointer by value in C++, rather than prepending the & sign to get a true reference parameter).

Note: This post is a follow-on from the previous one that discussed the proper way to remove elements from Java Collections when a live iterator is present.

Consider the code snippet given below:

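import java.util.Hashtable;

public class ParameterPassing {

    public static void main(String[] args) {
        Hashtable<Integer, String> ht = new Hashtable<>();
        ht.put(1, "Curly");
        ht.put(2, "Larry");
        first(ht);
        System.out.println(ht.values()); // now holds Curly, Larry and Moe
        second(ht);
        System.out.println(ht.values()); // unchanged: no Leo in here
    }

    // Mutates the object that the caller's reference points to.
    public static void first(Hashtable<Integer, String> ht_inside) {
        ht_inside.put(3, "Moe");
    }

    // Rebinds only the local copy of the reference; the caller's table is untouched.
    public static void second(Hashtable<Integer, String> ht_inside) {
        ht_inside = new Hashtable<>();
        ht_inside.put(3, "Leo");
    }
}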

Before first() is called, ht contains ‘Curly’ and ‘Larry’.

After first() is called, ht contains ‘Curly’, ‘Larry’ and ‘Moe’. Note that this change has not only taken place in the local ht in the scope of first(), but also in the ht in the scope of main().

When second() is called, only the local ht_inside in second() is modified (to contain ‘Leo’). This is because the local variable has been pointed at a newly created Hashtable, which sits in a different memory spot from the ht in main(). The reference itself was passed by value, so reassigning the copy has no effect on the caller.

Thus reassigning the parameter in second() does not affect the ht in main(), while changes made through the parameter in first() always change the ht in main(), as both references point to the same object.

java.util.ConcurrentModificationException

October 14th, 2009 2 comments

java.util.ConcurrentModificationException… it always worries me when I come across an error that I do not understand, and which is completely unexpected.

You’d expect a concurrent error exception to only occur when you are using threading, or have multiple applications accessing the same resource(s). However, this was not the case this morning.

I had some simple, single-threaded logic, running through a list of URLs and removing some instances based on regex patterns. Standard, plain-vanilla Java operations.

The error that occurred took place at this line of code:

String nextCheckSum = i_safeDivs.next().toString();

i_safeDivs is an iterator over a Java HashMap (SafeDivs) data structure. This one was a PITA to figure out, as the error was occurring during a call to the ‘next()’ method, while it was actually due to an element being deleted a few lines down!

Apparently, this exception can occur any time you are working with maps and other Collection utilities in Java:

The API documentation of ConcurrentModificationException says:

“This exception may be thrown by methods that have detected concurrent modification of an object when such modification is not permissible. For example, it is not generally permissible for one thread to modify a Collection while another thread is iterating over it…”

The problem (and solution) has very little to do with concurrent access. The issue is that when you remove elements from a Map (or other Collection) on which you have a live Iterator, you need to remove them via the Iterator’s own iterator.remove() method rather than by calling the HashMap’s hashMap.remove(key) method directly.

If you have an Iterator open when you remove the key from the key set, the iterator’s bookkeeping no longer matches the map, and the next call to next() throws the ConcurrentModificationException. To avoid this, use the same Iterator (keySetItr) both for iteration and to remove the keys.
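To make this concrete, here is a minimal sketch of both approaches. The class name, the method names and the sample URLs are all made up for illustration; this is not the code from my crawler. Note that the fail-fast check is best-effort: the unsafe version is only certain to blow up when more entries remain after the removal, which is why the demo removes every key.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class SafeRemovalDemo {

    // Hypothetical sample data: three URLs mapped to dummy checksums.
    public static Map<String, String> sample() {
        Map<String, String> urls = new HashMap<>();
        urls.put("http://example.com/post/1", "a1");
        urls.put("http://example.com/ads/banner", "b2");
        urls.put("http://example.com/post/2", "c3");
        return urls;
    }

    // Safe: removes matching keys through the iterator itself,
    // so the iterator and the map stay in sync.
    public static void removeMatching(Map<String, String> map, String regex) {
        Iterator<String> keySetItr = map.keySet().iterator();
        while (keySetItr.hasNext()) {
            String key = keySetItr.next();
            if (key.matches(regex)) {
                keySetItr.remove(); // removes the key last returned by next()
            }
        }
    }

    // Unsafe: removes through the map while a for-each iterator is live.
    // Returns true if the iterator detected the change and threw.
    public static boolean removeMatchingUnsafely(Map<String, String> map, String regex) {
        try {
            for (String key : map.keySet()) {
                if (key.matches(regex)) {
                    map.remove(key); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> urls = sample();
        removeMatching(urls, ".*/ads/.*");
        System.out.println(urls.keySet()); // the two /post/ URLs remain

        // Every key matches here, so entries remain after the first removal
        // and the following next() call throws.
        System.out.println(removeMatchingUnsafely(sample(), ".*")); // prints true
    }
}
```

The only difference between the two methods is who performs the removal; the single-threaded loop is otherwise identical.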

My Ideal Employee: On Initiative

October 12th, 2009 6 comments

Over the years, I’ve had the pleasure to work with many individuals. Most have been above average, a few made me cringe (waiting for the next crisis), and a select few have never failed to delight me. Today, I thought a bit about what separates the stars from the dogs (using the BCG framework). There are a couple of traits that I value over others. The one that I will discuss in this post is initiative.

The question ‘what should I do next’ really irritates me. It reminds me of 5-year-olds whining ‘I’m bored!’.

Micro-managing is something that I like about as much as going to the dentist (which is NOT one of my favorite pastimes).

I much prefer delegating responsibility and authority. If you want to impress me, keep a vigilant eye on the priorities of the organization, and set several priority activities or areas. Occasionally meet me for coffee and discuss how things are going, and make sure that we both have the same idea about the context, the goals and the constraints. Come up with your own plan and execute it, making use of your own initiative, chasing up the required stakeholders and using the God-given faculties, abilities and resources at your disposal.

Generally, the type of projects that I work on require a high tolerance for failure, lots of adaptability, the ability to budget time and effort and to identify milestones, and truck-loads of persistence. In industrial research and technology transfer, you are working on strategic projects, but you will most likely fail many times as hypotheses do not pan out, and you’ll need a plan B, and then a plan C (and so on). You’ve got to be a one (wo)man army, able to do the impossible. We expect miracles daily in industrial research.

You may need to get to your objectives by very unusual routes. It is absolutely critical that you know how to roll with the punches and set up a sustainable system that allows you to keep improving the ‘solution’ for a (most likely) not completely understood problem. The projects you are working on will probably require you to become a world authority in a very specialized area of knowledge which hardly anyone else knows about.

Nothing beats initiative, confidence, persistence and adaptability in allowing you to succeed at tasks that 90% of the other practitioners will fail at.

If you cannot think, and cannot exercise initiative, you’re in the wrong corner of the organization.

Weaving Intangible Software Value

October 8th, 2009 4 comments

I am quite pleased with myself this morning. Starting at 9:15 am and working through 11:29 am, I wrote around 200 lines of code. You may scoff at this, and say that you can write better, more elegant, and superior code than me, and you’d probably be right.

However, the code that I wrote does something special. It leverages available data repositories of people’s names and, through some elegant magic of statistics and frequencies, can tell you with very high precision whether an encountered first name is that of a male, a female, or unknown (which I’ll put in a queue for further analysis).
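For the curious, the general idea can be sketched in a few lines. This is not the 200 lines from this morning, just a minimal illustration under my own assumptions: the class name NameGenderGuesser, the 0.9 confidence threshold and the counts in main() are all invented, and a real system would load its counts from actual name repositories.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class NameGenderGuesser {

    public enum Gender { MALE, FEMALE, UNKNOWN }

    // For each normalized first name: counts[0] = male, counts[1] = female.
    private final Map<String, int[]> counts = new HashMap<>();
    private final double threshold;

    // threshold is the minimum share of observations one gender must hold.
    public NameGenderGuesser(double threshold) {
        this.threshold = threshold;
    }

    public void addObservation(String firstName, Gender gender, int count) {
        int[] c = counts.computeIfAbsent(normalize(firstName), k -> new int[2]);
        if (gender == Gender.MALE) {
            c[0] += count;
        } else if (gender == Gender.FEMALE) {
            c[1] += count;
        }
    }

    public Gender guess(String firstName) {
        int[] c = counts.get(normalize(firstName));
        if (c == null || c[0] + c[1] == 0) {
            return Gender.UNKNOWN; // never seen: queue for further analysis
        }
        double maleShare = (double) c[0] / (c[0] + c[1]);
        if (maleShare >= threshold) {
            return Gender.MALE;
        }
        if (1.0 - maleShare >= threshold) {
            return Gender.FEMALE;
        }
        return Gender.UNKNOWN; // ambiguous name
    }

    private static String normalize(String name) {
        return name.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        NameGenderGuesser guesser = new NameGenderGuesser(0.9);
        // Invented counts, purely for illustration.
        guesser.addObservation("James", Gender.MALE, 990);
        guesser.addObservation("James", Gender.FEMALE, 10);
        guesser.addObservation("Jordan", Gender.MALE, 600);
        guesser.addObservation("Jordan", Gender.FEMALE, 400);
        System.out.println(guesser.guess("james"));  // MALE
        System.out.println(guesser.guess("Jordan")); // UNKNOWN
    }
}
```

Raising the threshold trades coverage for precision: more names land in the unknown queue, but fewer are labelled wrongly.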

Laugh you may. Ha ha.
I’ll laugh along with you.

After all, why would someone want the computer to tell them whether a name is mostly associated with males or females? Shahzad, come on man.. don’t waste your time on stuff like this!

However, think for a second…. and wipe that grin off your face my friend. This (gender identification via names) is the type of capability that makes business people very happy. Associating demographics with individuals can boost the value of targeted advertising by 100 times! Carry out a quick survey of CPM costs online, and you’ll notice a huge difference between the cost of adverts on those sites that can qualify the users and viewers via demographic information, and those that cannot. It’s the difference between 7 cents per thousand impressions and 14 dollars per thousand impressions (which is what my business partner’s very astute observation and equally diligent survey have demonstrated to me).

.. and why not? If I was paying someone to show an advertisement, I’d consider the money much better spent if I knew the profile of the individuals that the advertisements were shown to. Even better, if you could only show adverts to those people who meet a pre-selected profile, you’d be my bestest (sic) advertisement-based content-serving site ever!

Pardon my (hopefully infectious) good humour. I’m going to grab a nice cup of tea to celebrate!

Computational Linguistics in the Real World

October 6th, 2009 2 comments

I was recently invited to present my thoughts on how students from the University of Ottawa Linguistics Department could benefit career-wise from the knowledge of computational linguistics. I gave an overview of computational approaches to linguistics, listed a number of areas where viable products can be created, and described three technologies that have already been successfully commercialized, or are ready for it.

Computational Linguistics in the Real World SlidePack [PDF format download]

I also provided an overview of machine translation and speech recognition/transcription, even though these are not reflected in the slides.

After the presentation, I was asked a very interesting question (among others) that I’d like to elaborate on a bit. The question concerned a statement I made that statistical methods in computational linguistics have become very popular: what does this imply for the symbolic-logic (rationalist) methods?

There has always been a bit of back and forth between the empiricists and the rationalists in computational linguistics. However, with the cheap computational resources and abundance of data available nowadays, it makes a lot of sense to run some [empirical] exploratory data analysis experiments, and carry out some collocation/correlation analysis before getting really deep into the problem. This way, you can get some results within two weeks, rather than finding out the viability of your hypothesis after six months of intense [rationale-based] study and experimentation. If the initial results are promising, and the research problem is worthy of further study, only then should you commit yourself to studying this particular issue deeper.
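As a taste of how cheap such an exploratory step can be, here is a minimal sketch that counts adjacent word pairs (bigrams), which is about the simplest starting point for collocation analysis. The class name BigramCounter and the sample sentence are my own assumptions, and the tokenization is deliberately crude: it ignores sentence boundaries and does no statistical scoring.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class BigramCounter {

    // Counts how often each pair of adjacent words occurs in the text.
    public static Map<String, Integer> countBigrams(String text) {
        // Crude tokenization: lowercase, split on anything that is not a letter.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        Map<String, Integer> counts = new HashMap<>();
        String previous = null;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            if (previous != null) {
                counts.merge(previous + " " + token, 1, Integer::sum);
            }
            previous = token;
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countBigrams("New York is big. New York is old.");
        System.out.println(counts.get("new york")); // 2
        System.out.println(counts.get("is old"));   // 1
    }
}
```

Pairs that turn up far more often than their individual word frequencies would suggest are the candidates worth a deeper, hypothesis-driven look.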