Wednesday, April 9, 2008

Gartner Emerging Trends - The Future of Search in the Post-Google Era

Today was my first day at the Gartner Emerging Trends conference. The first session of the day was titled "The Future of Search in the Post-Google Era." Overall, it was an informative presentation. One of the themes coming out of the talk was that IT shops should recognize that there may very well be a need for multiple search engines within the organization. Picking a single search engine to fit all your needs can often increase time to market for new applications requiring search capabilities. Additionally, a single search engine may force you into some corners and require compromises in the resulting applications. Instead, by spending some more money initially on a couple of search engines that complement each other, you'll reap the benefits down the road and ultimately provide better search results to your users.

Another interesting idea presented was the metaphor of a search engine working like an ant, rather than a spider. A spider (in the context of web search) determines which information is relevant to a query term by "crawling" over the content on the web. The ant metaphor focuses instead on how ants (the insects) communicate with each other by leaving trails of information along a particular path. For instance, if an ant finds a trail to a food source, it will leave chemical traces along the path so that subsequent ants can find it and know that the path leads to food. Cutting-edge search engines are doing similar things with the data they find. For instance, if a user searches for "firm" and ultimately clicks a link related to a mattress sale (ya know, a firm mattress), the search engine will capture this and remember, for future searches, that the user may be more interested in mattresses than in law firms.
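To make the ant-trail idea a little more concrete, here's a minimal sketch in Python of how click feedback could nudge future rankings. To be clear, this is my own illustration, not anything shown in the session: the ClickFeedbackRanker class, the boost weight, the document ids, and the base scores are all made-up assumptions.

```python
from collections import defaultdict


class ClickFeedbackRanker:
    """Hypothetical re-ranker: clicks act like the ants' chemical trail."""

    def __init__(self, boost=0.5):
        # boost is an assumed weight: how much one click adds to a score.
        self.boost = boost
        # query -> doc_id -> number of clicks recorded so far
        self.clicks = defaultdict(lambda: defaultdict(int))

    def record_click(self, query, doc_id):
        """Leave a 'trail': remember that this result was chosen for this query."""
        self.clicks[query.lower()][doc_id] += 1

    def rerank(self, query, scored_docs):
        """Return doc ids ordered by base score plus the click-trail bonus."""
        trail = self.clicks.get(query.lower(), {})
        ordered = sorted(
            scored_docs,
            key=lambda pair: pair[1] + self.boost * trail.get(pair[0], 0),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ordered]


# Made-up base scores standing in for whatever the engine's text ranking returns.
results = [("law-firm-directory", 1.0), ("firm-mattress-sale", 0.8)]

ranker = ClickFeedbackRanker(boost=0.5)
before = ranker.rerank("firm", results)   # the law firm page ranks first

ranker.record_click("firm", "firm-mattress-sale")
after = ranker.rerank("firm", results)    # the mattress page now ranks first
```

A real engine would presumably track this per user or per group of similar users and let old trails fade over time (just as the ants' chemical traces do), but the basic loop of "observe the click, adjust the next ranking" is the same.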

Lucene, an open source search tool, was briefly recognized as useful, but the recommendation was to adopt it with caution. The reasoning was that it's nothing more than a library of APIs to help index content, so application teams usually have to do a lot of work around it to get to the desired endpoint.

Another random metric that was new to me: on average, search engines receive only 1.7 words per query. Incredible, right?