Archive for the ‘search technologies’ Category

Historian is on Hiatus

June 12th, 2009 No comments

I want to thank all my friends who took the time to test Firefox Historian and give me their feedback. In particular, a big thank you to Treena, Kenton, Glenn and Waleed. As everyone is probably aware by now, I’ve started a new project, and it is not Historian; I’m parking Historian for now. The client-side code is downloadable from the original site; rename the .xpi file to .zip to access it.

If you want access to the server-side code, please email me. If there is enough interest, I may even open-source it.

My email? It’s shahzad AT-THE-SPECIAL-PLACE the [There, that should defeat the spam-slingers]

Firefox Historian Released

May 12th, 2009 1 comment

I’ve been working on a personal search engine project in my spare time for the past few weeks. This is a Firefox add-on which records your browsing history, and builds a full-text search index.
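To give a feel for what "builds a full-text search index" means, here is a minimal sketch of an inverted index over visited pages. This is not the Historian server code; the class name `HistoryIndex`, its methods, and the example URLs are all made up for illustration, and a real index would also handle ranking, stemming and persistence.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class HistoryIndex:
    """Hypothetical in-memory inverted index: term -> set of page URLs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_page(self, url, text):
        # Record the URL under every term that appears in the page text.
        for term in tokenize(text):
            self.postings[term].add(url)

    def search(self, query):
        # Return the URLs that contain every term in the query (AND semantics).
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = HistoryIndex()
index.add_page("http://example.com/a", "Faceted search combines browsing and direct search")
index.add_page("http://example.com/b", "Open source search tools")
print(index.search("faceted search"))
```

The add-on's job, in this picture, would be to supply the (URL, text) pairs; everything else happens on the server.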

Basically, the value proposition of this site is that you’ll never need to remember individual sites or rediscover them through ‘public’ search engines. I only index publicly accessible sites that you visit, so your email and other private pages will never be indexed (the server has neither your login information nor an implementation of the login protocol).

As far as search engines go, this is quite spartan for now. However, if there is enough interest, I’ll be happy to add enhancements and make the search better and more intuitive. The index updates every five minutes for now. If you need faster indexing, drop me a line and we can work out a way to make this possible for you.

For now, you can get the Firefox add-on from here

Feedback is most welcome, but please be gentle and expect bugs; this is an early beta!


Faceted Search — The Superior Search Method

March 3rd, 2009 No comments

One of the most intuitive methods available to searchers is faceted search. It builds on the strengths of both direct search and browsing.

The two paradigms that search professionals are most familiar with are:

  • Navigational search (browsing) uses a topic hierarchy that lets users iteratively narrow the scope of their quest by digging into the hierarchy. This hierarchy is usually predetermined, and may be either hand-created or automatically generated. Good examples can be seen in the classic Yahoo! Directory and DMOZ. This is useful for those with some search proficiency, as they can examine the set of topics available at the current node in the hierarchy to improve the description of their ‘information need’.
  • Direct search allows users to simply write their queries as a set of keywords in a text (search) box. Most people are much more familiar with this interface, as it has no learning curve. However, no attempt is made to give the user a sense of what is available in the content. The responsibility lies with the searcher to examine the returned set of documents and improve the search by choosing relevant keywords. Most search engines (including the one on this site) employ this metaphor.
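The difference between the two paradigms can be sketched in a few lines. The `DIRECTORY` data, the document titles and the function names below are invented for illustration; a real directory or search engine is of course far larger and matches against full text rather than titles.

```python
# Hypothetical topic hierarchy: nested dicts are topics, lists are documents.
DIRECTORY = {
    "Computers": {
        "Search": ["Intro to inverted indexes", "Faceted search basics"],
        "Open Source": ["LAMP stack overview"],
    },
    "Autos": {"Used Cars": ["Buying a used car"]},
}


def browse(tree, path):
    """Navigational search: follow a path of topics and list what is there."""
    node = tree
    for topic in path:
        node = node[topic]
    # At a topic node, show the sub-topics; at a leaf, show the documents.
    return sorted(node) if isinstance(node, dict) else list(node)


def iter_documents(tree):
    """Yield every document in the hierarchy, ignoring the topic structure."""
    for child in tree.values():
        if isinstance(child, dict):
            yield from iter_documents(child)
        else:
            yield from child


def direct_search(tree, query):
    """Direct search: return documents whose title contains every keyword."""
    keywords = query.lower().split()
    return [doc for doc in iter_documents(tree)
            if all(keyword in doc.lower() for keyword in keywords)]


print(browse(DIRECTORY, ["Computers"]))
print(direct_search(DIRECTORY, "search"))
```

Browsing shows the user what exists at each step, while direct search returns matches without revealing anything about the rest of the collection.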

There is, however, a third metaphor that combines the best features of both paradigms discussed above: faceted search.

Faceted search essentially starts out as a direct search; however, as soon as the results are returned, the user is also presented with a set of filters that they can use to ‘dig’ down through the results. Three excellent examples are available at Ebay, Yahoo Mail’s search feature and Autocatch.

Ebay allows you to carry out a search and narrow it down by geographic location, price, product classification and other ‘facets’. They have done a commendable job of making it possible for people to navigate through a constantly changing and diverse set of auction items. Faceted search is an important aspect of their strategy to provide easy-to-use product access for their customers.

Ebay Faceted Classification Example

Autocatch has also provided a number of different ways to navigate through the large repository of cars that are available for sale. After the initial search, the user can narrow down based on location, make, model, year, price and whether the seller is a private party or a dealer.

Autocatch Faceted Classification Example

Note that in both examples, the faceted search is based on the actual content and on the ‘natural’ steps that users would take when fulfilling their information access task. It is essential that the faceted search strategy is aligned with the content and the users’ needs; otherwise it will drive your users away and not have the intended benefit.
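For the technically inclined, the mechanics of faceted search can be sketched roughly as follows: run a keyword search, count the values of each facet in the results, then let the user filter on those values. The listings, facet names and function names are made up for illustration and are not taken from Ebay or Autocatch.

```python
from collections import Counter

# Hypothetical car listings; the facets are derived from the content itself.
LISTINGS = [
    {"title": "Honda Civic sedan", "make": "Honda", "location": "Ottawa", "seller": "dealer"},
    {"title": "Honda Accord sedan", "make": "Honda", "location": "Toronto", "seller": "private"},
    {"title": "Toyota Corolla sedan", "make": "Toyota", "location": "Ottawa", "seller": "private"},
    {"title": "Ford F-150 truck", "make": "Ford", "location": "Ottawa", "seller": "dealer"},
]
FACETS = ("make", "location", "seller")


def keyword_search(listings, query):
    """Step 1: a plain direct search over the listing titles."""
    keywords = query.lower().split()
    return [item for item in listings
            if all(keyword in item["title"].lower() for keyword in keywords)]


def facet_counts(results):
    """Step 2: count how many results fall under each value of each facet."""
    return {facet: Counter(item[facet] for item in results) for facet in FACETS}


def apply_filters(results, filters):
    """Step 3: keep only the results that match every chosen facet value."""
    return [item for item in results
            if all(item[facet] == value for facet, value in filters.items())]


results = keyword_search(LISTINGS, "sedan")
counts = facet_counts(results)
narrowed = apply_filters(results, {"location": "Ottawa", "seller": "private"})
print(counts["make"])
print([item["title"] for item in narrowed])
```

The counts are what the interface would display next to each filter, so the user can see what is available before clicking, which is exactly what plain direct search fails to offer.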


Seminar on Open Source Search Tools

February 25th, 2009 No comments

I was the speaker at the Ottawa LAMP meetup for February. I must confess that I really enjoyed giving this presentation, as it touched on three of my passions: (1) open source tools, (2) search theory and technology, and (3) using technology to help people and organizations work more effectively and efficiently.

The video of my presentation, titled ‘Scalable techniques for large-text repositories: Google in a box with LAMP tools’ is available at

Special thanks to Andrew Ross for recording such a high quality video.