Tuesday, September 16, 2008

We've Moved!

Please redirect your readers to http://thenoisychannel.com! The RSS feed is available at http://thenoisychannel.com/?feed=rss2.

See you all there...

Migrating Tonight!

At long last, this blog will migrate over to a hosted WordPress platform at http://thenoisychannel.com/. Thanks to Andy Milk (and to Endeca for lending me his services) and especially to Noisy Channel regular David Fauth for making this promised migration a reality!

As of midnight EST, please visit the new site. My goal is to redirect all incoming Blogger traffic to the new hosted site. This will be the last post here at Blogger.

p.s. Please note that I'll be manually migrating any content (posts and comments) from the past 5 days, i.e., since I performed an import on September 12th. My apologies if anything is lost in translation.

Quick Bites: Search Evaluation at Google

Original post is here; Jeff's commentary is here. Not surprisingly, my reaction is that Google should consider a richer notion of "results" than an ordering of matching pages, perhaps a faceted approach that reflects the "several dimensions to 'good' results."

Quick Bites: Is Wikipedia Production Slowing Down?

Thanks to Sérgio for tweeting this post by Peter Pirolli at PARC: Is Wikipedia Production Slowing Down?

Here's the chart showing how growth in the number of Wikipedia editors has slowed over time:



Interesting material and commentary at Augmented Social Cognition and Peter Pirolli's blog. Are people running out of things to write about?

Monday, September 15, 2008

Information Accountability

The recent United Airlines stock fiasco triggered an expected wave of finger pointing. For those who didn't follow the event, here is the executive summary:

    In the wee hours of Sunday, September 7th, The South Florida Sun-Sentinel (a subsidiary of the Tribune Company) included a link to an article entitled "UAL Files for Bankruptcy." The link was legitimate, but the linked article, which dated from 2002, didn't carry its publication date. Google's news bot picked up the article and automatically assigned it a current date, and Google then sent the link to anyone with an alert set up for news about United. On Monday, September 8th, someone at Income Security Advisors saw the article in the results for a Google News search and sent it out on Bloomberg. The results are in the picture below, courtesy of Bloomberg by way of the New York Times.



    For anyone who wants all of the gory details, Google's version of the story is here; the Tribune Company's version is here.

I've spent the past week wondering about this event from an information access perspective. And then today I saw two interesting articles:
  • The first was a piece in BBC News about a speech by Sir Tim Berners-Lee expressing concern that the internet needs a way to help people separate rumor from real science. His examples included the fears about the Large Hadron Collider at CERN creating a black hole that would swallow up the earth (which isn't quite the premise of Dan Brown's Angels and Demons), and rumors that a vaccine given to children in Britain was harmful.

  • The second was a column in the New York Times about the dynamics of the US presidential campaign, where Adam Nagourney notes that "senior campaign aides say they are no longer sure what works, as they stumble through what has become a daily campaign fog, struggling to figure out what voters are paying attention to and, not incidentally, what they are even believing."
I see a common thread here that I'd like to call "information accountability." I don't mean this term in the sense of a recent CACM article about information privacy and sensitivity, but rather in the sense of information provenance and responsibility.

Whether we're worrying about Google bombing, Google bowling, or what Gartner analyst Whit Andrews calls "denial-of-insight" attacks, our concern is that information often arrives with implicit authority. Despite the aphorism telling us "don't believe everything you read," most of us select news and information sources with some hope that they will be authoritative. Whether the motto is "all the news that's fit to print" or "don't be evil," our choice of trusted information sources is a necessary heuristic that spares us from subjecting everything we read to endless skeptical inquiry.

But sometimes the most reputable news sources get it wrong. Or perhaps "wrong" is the wrong word. When newspapers reported that the FBI was treating Richard Jewell as a "person of interest" in the Centennial Olympic Park bombing (cf. "Olympic Park Bomber" Eric Robert Rudolph), they weren't lying, but rather were communicating information from what they believed to be a reliable source. And, in turn, the FBI may have been correctly doing its job, given the information it had. But there's no question that Jewell suffered tremendously from his "trial by media" before his name was ultimately cleared.

It's tempting to react to these information breakdowns with finger-pointing, to figure out who is accountable and, in as litigious a society as the United States, bring on the lawyers. Moreover, there clearly are cases where willful misinformation constitutes criminal defamation or fraud. But I think we need to be careful, especially in a world where information flows in a highly connected--and not necessarily acyclic--social graph. Anyone who has played the children's game of telephone knows that small communication errors can blow up rapidly, and that it's difficult to partition blame fairly.

The simplest answer is that we are accountable for how we consume information: caveat lector. But this model seems overly simplistic, since our daily lives hinge on our ability to consume information without subjecting it to so much skeptical scrutiny that we can accept nothing at face value. Besides, shouldn't we hold information providers responsible for living up to the reputations they cultivate and promote?

There are no easy answers here. But the bad news is that we cannot ignore the questions of information accountability. If terms like "social media" and "web 2.0" mean anything, they surely tell us that the game of telephone will only grow in the number of participants and in the complexity of the communication chains. As a society, we will have to learn to live with and mitigate the fallout.

Sunday, September 14, 2008

Is Blog Search Different?

Alerted by Jeff and Iadh, I recently read What Should Blog Search Look Like?, a position paper by Marti Hearst, Matt Hurst, and Sue Dumais. For those readers unfamiliar with this triumvirate, I suggest you take some time to read their work, as they are heavyweights in some of the areas most often covered by this blog.

The position paper suggests focusing on three kinds of search tasks:
  1. Find out what are people thinking or feeling about X over time.
  2. Find good blogs/authors to read.
  3. Find useful information that was published in blogs sometime in the past.
The authors generally recommend the use of faceted navigation interfaces--something I'd hope would be uncontroversial by now for search in general.

But I'm more struck by their criticism that existing blog search engines fail to leverage the special properties of blog data, and by their discussion, based on work by Mishne and de Rijke, of how blog search queries differ substantially from web search queries. I don't doubt the data they've collected, but I'm curious whether their results account for the rapid proliferation and mainstreaming of blogs. The lines between blogs, news articles, and informational web pages seem increasingly blurred.

So I'd like to turn the question around: what should blog search look like that is not applicable to search in general?

Saturday, September 13, 2008

Progress on the Migration

Please check out http://thenoisychannel.com/ to see the future of The Noisy Channel in progress. I'm using WordPress hosted on GoDaddy and did the minimum work to port all posts and comments (not including this one).

Here is my current list of tasks that I'd like to get done before we move.
  • Design! I'm currently using the default WordPress theme, which is pretty lame. I'm inclined to use a clean but stylish two-column theme that is widget-friendly. Maybe Cutline. In any case, I'd like the new site to be a tad less spartan before we move into it.

  • Internal Links. My habit of linking back to previous posts now means I have to map those links to the new posts. I suspect I'll do it manually, since I don't see an easy way to automate it.

  • Redirects. Unfortunately I don't think I can actually get Blogger to redirect traffic automatically. So my plan is to post signage throughout this blog making it clear that the blog has moved.
I'd love help, particularly in the form of advice on the design side. And I'll happily give administration access to anyone who has the cycles to help implement any of these or other ideas. Please let me know by posting here or by emailing me: dtunkelang@{endeca,gmail}.com.

Friday, September 12, 2008

Quick Bites: Probably Irrelevant. (Not!)

Thanks to Jeff Dalton for spreading the word about a new information retrieval blog: Probably Irrelevant. It's a group blog, currently listing Fernando Diaz and Jon Elsas as contributors. Given the authors and the blog name's anagram of "Re-plan IR revolt, baby!", I expect great things!

Wednesday, September 10, 2008

Fun with Twitter

I recently joined Twitter and asked the twitterverse for opinions about DreamHost vs. GoDaddy as a platform to host this blog on WordPress. I was shocked when I noticed today that I'd gotten this response from the President / COO of GoDaddy (or perhaps a sales rep posing as such).

Seems like a lot of work for customer acquisition!

Quick Bites: Email becomes a Dangerous Distraction

Just read this article citing a number of studies to the effect that email is a major productivity drain. Nothing surprising to me--a lot of us have learned the hard way that the only way to be productive is to not check email constantly.

But I am curious if anyone has made progress on tools that alert you to emails that do call for immediate attention. I'm personally a fan of attention bonds approaches, but I imagine that the machine learning folks have at least thought about this as a sort of inverse spam filtering problem.
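To make the idea concrete, here is a minimal sketch of such an "inverse spam filter": a toy Naive Bayes classifier that flags messages as urgent or deferrable. The training examples and labels are invented; a real system would learn from which messages the user actually acted on right away.

```python
# A toy "inverse spam filter": classify messages as urgent vs. deferrable.
# The training examples below are made up purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = [
    "server down, need help now",              # urgent
    "meeting moved to 3pm today",              # urgent
    "newsletter: top ten productivity tips",   # deferrable
    "your weekly project digest",              # deferrable
]
labels = ["urgent", "urgent", "deferrable", "deferrable"]

vectorizer = CountVectorizer()
features = vectorizer.fit_transform(messages)

classifier = MultinomialNB()
classifier.fit(features, labels)

new_message = ["production database is down"]
prediction = classifier.predict(vectorizer.transform(new_message))
print(prediction[0])  # likely "urgent", given the toy training data
```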

Tuesday, September 9, 2008

Quick Bites: The Clickwheel Must Die

As someone who's long felt that the iPod's clickwheel violates Fitts's law, I was delighted to read this Gizmodo article asserting that the iPod's clickwheel must die. My choice quote:
Quite simply, the clickwheel hasn't scaled to handle the long, modern day menus in powerful iPods.
Fortunately Apple recognized its mistake on this one and fixed the problem in its touch interface. Though, to be clear, the problem was not inherent in the choice of a wheel interface, but rather in the requirement to make gratuitously precise selections.

Now I'm waiting to see someone fix the tiny minimize/maximize/close buttons in the upper right corner on Windows, which I suspect have become the textbook example of violating Fitts's law.
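For readers who haven't run into it, Fitts's law models the time to acquire a target as a function of the distance to the target and its size. In the common Shannon formulation:

```latex
MT = a + b \log_2\left(1 + \frac{D}{W}\right)
```

where MT is the movement time, D is the distance to the center of the target, W is the width of the target along the axis of motion, and a and b are empirically fitted constants. Tiny close buttons and gratuitously precise wheel selections both shrink W, which is exactly what drives the predicted acquisition time up.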

Monday, September 8, 2008

Incentives for Active Users

Some of the most successful web sites today are social networks, such as Facebook and LinkedIn. These are not only popular web sites; they are also remarkably effective people search tools. For example, I can use LinkedIn to find the 163 people in my network who mention "information retrieval" in their profiles and live within 50 miles of my ZIP code (I can't promise you'll see the same results!).

A couple of observations about social networking sites (I'll focus on LinkedIn) are in order.

First, this functionality is a very big deal, and it's something Google, Yahoo, and Microsoft have not managed to provide, even though their own technology is largely built on a social network--citation ranking.

Second, the "secret sauce" for sites like LinkedIn is hardly their technology (a search engine built on Lucene and a good implementation of breadth-first search), but rather the way they have incented users to be active participants, in everything from virally marketing the site to their peers to inputting high-quality semi-structured profiles that make the site useful. In other words, active users ensure both the quantity and quality of information on the site.

Many people have noted the network effect that drove the runaway success of Microsoft Office and eBay. But I think that social networking sites are taking this idea further: users not only flock to the crowds, but also become personally invested, both in the success of the site generally and especially in the quality and accuracy of their personal information.

Enterprises need to learn from these consumer-oriented success stories. Some have already. For example, a couple of years ago, IBM established a Professional Marketplace, powered by Endeca, to maintain a skills and availability inventory of IBM employees. This effort was a runaway success, saving IBM $500M in its first year. But there's more: IBM employees have reacted to the success of the system by being more active in maintaining their own profiles. I spent the day with folks at the ACM, and they're seeing great uptake in their author profile pages.

I've argued before that there's no free lunch when it comes to enterprise search and information access. The good news, however, is that, if you create the right incentives, you can get other folks to happily pay for lunch.

Quick Bites: Taxonomy Directed Folksonomies

Props to Gwen Harris at Taxonomy Watch for posting a paper by Sarah Hayman and Nick Lothian on Taxonomy Directed Folksonomies.

The paper asks whether folksonomies and formal taxonomies can be used together and answers in the affirmative. The work is in the spirit of some of our recent efforts at Endeca to bootstrap from vocabularies (though not necessarily controlled vocabularies) to address the inconsistency and sparsity of tagging in folksonomies.
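To give a flavor of what bootstrapping from a vocabulary can look like (this is my own toy illustration, not the authors' method), here is a sketch that folds free-form tags onto a small controlled vocabulary via a synonym table, keeping unmapped tags as ordinary folksonomy terms. The vocabulary and synonyms are invented:

```python
# Toy taxonomy-directed tag cleanup; the vocabulary and synonym table are invented.
taxonomy = {"information retrieval", "machine learning", "user experience"}
synonyms = {
    "ir": "information retrieval",
    "info retrieval": "information retrieval",
    "ml": "machine learning",
    "ux": "user experience",
    "usability": "user experience",
}

def normalize_tag(raw_tag):
    """Map a free-form tag onto the taxonomy when possible; otherwise keep it as-is."""
    tag = raw_tag.strip().lower().replace("-", " ").replace("_", " ")
    tag = synonyms.get(tag, tag)                # fold known variants onto taxonomy terms
    return tag if tag in taxonomy else raw_tag  # unmapped tags stay folksonomy terms

print([normalize_tag(t) for t in ["IR", "Machine-Learning", "UX", "waffles"]])
# ['information retrieval', 'machine learning', 'user experience', 'waffles']
```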

I'm personally excited to see the walls coming down between the two approaches, which many people seem to think of as mutually exclusive approaches to the tagging problem.

Sunday, September 7, 2008

Quick Bites: Is Search Really 90% Solved?

Props to Michael Arrington for calling out this snippet in an interview with Marissa Mayer, Google Vice President of Search Product and User Experience on the occasion of Google's 10th birthday:
Search is an unsolved problem. We have a good 90 to 95% of the solution, but there is a lot to go in the remaining 10%.
I agree with Michael that search isn't even close to being solved yet. I've criticized the way many web search start-ups--and even the giants Yahoo and Microsoft--are going about trying to dethrone Google through incremental improvements or technologies that don't address any need that Google does not already adequately (if not optimally) address. But there is no lack of open problems in search for those ambitious enough to tackle them.

Quick Bites: Applying Turing's Ideas to Search

A colleague of mine at Endeca recently pointed me to a post by John Ferrara at Boxes and Arrows entitled Applying Turing's Ideas to Search.

One of the points he makes echoes the "computers aren't mind readers" theme I've been hammering at for a while:
If the user has not phrased her search clearly enough for another person to understand what she’s trying to find, then it’s not reasonable to expect that a comparatively "dumb" machine could do better. In a Turing test, the response to a question incomprehensible even to humans would prove nothing, because it wouldn’t provide any distinction between person and machine.
While I'm not convinced that search engine designers should be aspiring to pass the Turing test, I agree wholeheartedly with the vision John puts forward:
It describes an ideal form of human-computer interaction in which people express their information needs in their own words, and the system understands and responds to their requests as another human being would. During my usability test, it became clear that this was the very standard to which my test participants held search engines.
It's not about the search engine convincing the user that another human being is producing the answers, but rather engaging users in a conversation that helps them articulate and elaborate their information needs. Or, as we like to call it around here, HCIR.

Saturday, September 6, 2008

Migrating Soon

Just another reminder that I expect to migrate this blog to a hosted WordPress platform in the next few days. If you have opinions about hosting platforms, please let me know by commenting here. Right now, I'm debating between DreamHost and GoDaddy, but I'm very open to suggestions.

I will do everything in my power to minimize disruption--I'm not sure how easy Blogger will make it to redirect users to the new site. I'll probably post here for a while after the move to try to direct traffic.

I do expect the new site to be under a domain name I've already reserved: http://thenoisychannel.com. It currently forwards to Blogger.

Back from the Endeca Government Summit

I spent Thursday at the Endeca Government Summit, where I had the privilege to chat face-to-face with some Noisy Channel readers. Mostly, I was there to learn more about the sorts of information seeking problems people are facing in the public sector in general, and in the intelligence agencies in particular.

While I can't go into much detail, the key concern was exploration of information availability. This problem is the antithesis of known-item search: rather than trying to retrieve information you know exists (and know how to specify), you are trying to determine whether there is information available that would help you with a particular task.

Despite being lost in a sea of TLAs, I came away with a deepened appreciation of both the problems the intelligence agencies are trying to address and the relevance of exploratory search approaches to those problems.

Thursday, September 4, 2008

Query Elaboration as a Dialogue

I ended my post on transparency in information retrieval with a teaser: if users aren't great at composing queries for set retrieval, which I argue is more transparent than ranked retrieval, then how will we ever deliver an information retrieval system that offers both usefulness and transparency?

The answer is that the system needs to help the user elaborate the query. Specifically, the process of composing a query should be a dialogue between the user and the system that allows the user to progressively articulate and explore an information need.

Those of you who have been reading this blog for a while or who are familiar with what I do at Endeca shouldn't be surprised to see dialogue as the punch line. But I want to emphasize that the dialogue I'm describing isn't just a back-and-forth between the user and the system. After all, there are query suggestion mechanisms that operate in the context of ranked retrieval algorithms--algorithms which do not offer the user transparency. While such mechanisms sometimes work, they risk doing more harm than good. Any interactive approach requires the user to do more work; if this added work does not result in added effectiveness, users will be frustrated.

That is why the dialogue has to be based on a transparent retrieval model--one where the system responds to queries in a way that is intuitive to users. Then, as users navigate in query space, transparency ensures that they can make informed choices about query refinement and thus make progress. I'm partial to set retrieval models, though I'm open to probabilistic ones. 
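One concrete way to ground such a dialogue in a set retrieval model is to summarize the current matching set and offer its facet values, with counts, as refinement choices; the counts are what make the refinements transparent. Here is a minimal sketch of that idea, with invented documents and facets (a real system would obviously need to rank and prune the suggestions):

```python
from collections import Counter

# Toy result set: each matching document carries facet metadata (all invented).
matching_docs = [
    {"title": "Faceted search in ecommerce", "topic": "HCIR", "year": 2007},
    {"title": "Query suggestion from logs", "topic": "IR", "year": 2008},
    {"title": "Exploratory search interfaces", "topic": "HCIR", "year": 2008},
    {"title": "Learning to rank", "topic": "IR", "year": 2007},
]

def refinement_options(docs, facets=("topic", "year")):
    """Count facet values over the current matching set; each (facet, value, count)
    is a candidate refinement, and the count tells the user exactly what applying
    it will do to the result set."""
    options = []
    for facet in facets:
        counts = Counter(doc[facet] for doc in docs if facet in doc)
        options.extend((facet, value, count) for value, count in counts.items())
    return sorted(options, key=lambda option: -option[2])

for facet, value, count in refinement_options(matching_docs):
    print(f"{facet} = {value} ({count} matching documents)")
```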

But of course we've just shifted the problem. How do we decide what query refinements to offer to a user in order to support this progressive refinement process? Stay tuned...

Tuesday, September 2, 2008

Migrating to WordPress

Just a quick note to let folks know that I'll be migrating to WordPress in the next few days. I'll make every effort to have the move be seamless. I have secured the domain name http://thenoisychannel.com, which currently forwards to Blogger but will shift to wherever the blog is hosted. I apologize in advance for any disruption.

Quick Bites: Google Chrome

For those of you who thought that no major technology news would come out during the Labor Day weekend, check out the prematurely released comic book hailing Google Chrome, Google's long-rumored entry into the browser wars. By the time you read this, the (Windows-only) beta may even be available for download. The official Google announcement is here.

If the software lives up to the promise of the comic book, then Google may have a real shot of taking market share from IE and Firefox. More significantly, if they can supplant the operating system with the browser, then they'll have a much more credible opportunity to take on desktop software with their web-based applications.

Interestingly, even though all of the search blogs are reporting about Chrome, I haven't seen any analysis on what this might mean for web search.

Monday, September 1, 2008

Quick Bites: E-Discovery and Transparency

One change I'm thinking of making to this blog is to introduce "quick bites" as a way of mentioning interesting sites or articles I've come across without going into deep analysis. Here's a first one to give you a flavor of the concept. Let me know what you think.

I just read, by way of Curt Monash, an article on how courts will tolerate search inaccuracies in e-Discovery. It reminded me of our recent discussion of transparency in information retrieval. I agree that "explanations of [search] algorithms are of questionable value" for convincing a court of the relevance and accuracy of the results. But that's because those algorithms aren't sufficiently intuitive for those explanations to be meaningful except in a theoretical sense to an information retrieval researcher.

I realize that user-entered Boolean queries (the traditional approach to e-Discovery) aren't effective because users aren't great at composing queries for set retrieval. But that's why machines need to help users with query elaboration--a topic for an upcoming post.

POLL: Blogging Platform

I've gotten a fair amount of feedback suggesting that I switch blogging platforms. Since I'd plan to make such changes infrequently, I'd like to get input from readers before doing so, especially since migration may have hiccups.

I've just posted a poll on the home page to ask if folks here have a preference as to which blogging platform I use. Please vote this week, and feel free to post comments here.

Friday, August 29, 2008

Improving The Noisy Channel: A Call for Ideas

Over the past five months, this blog has grown from a suggestion Jeff Dalton put in my ear to a community to which I'm proud to belong.

Some milestones:
  • Over 70 posts to date.
  • 94 subscribers, as reported by Google Reader.
  • 100 unique visitors on a typical day.
To be honest, I thought I'd struggle to keep up with posting weekly, and that I'd need to convince my mom to read this blog so that I wouldn't be speaking to an empty room. The results so far have wildly exceeded the expectations I came in with.

But now that I've seen the potential of this blog, I'd like to "take it to the next level," as the MBA types say.

My goals:
  • Increase the readership. My motive isn't (only) to inflate my own ego. I've seen that this blog succeeds most when it stimulates conversation, and a conversation needs participants.

  • Increase participation. Given the quantity and quality of comments on recent posts, it's clear that readers here contribute the most valuable content. I'd like to step that up a notch by having readers guest-blog, and perhaps go as far as turning The Noisy Channel into a group blog about information seeking that transcends my personal take on the subject. I'm very open to suggestions here.

  • Add some style. Various folks have offered suggestions for improving the blog, such as changing platforms to WordPress, modifying the layout to better use screen real estate, adding more images, etc. I'm the first to admit that I am not a designer, and I'd really appreciate ideas from you all on how to make this site more attractive and usable.
In short, I'm asking you to help me help you make The Noisy Channel a better and noisier place. Please post your comments here or email me if you'd prefer to make suggestions privately.

Wednesday, August 27, 2008

Transparency in Information Retrieval

It's been hard to find time to write another post while keeping up with the comment stream on my previous post about set retrieval! I'm very happy to see this level of interest, and I hope to continue catalyzing such discussions.

Today, I'd like to discuss transparency in the context of information retrieval. Transparency is an increasingly popular term these days in the context of search--perhaps not surprising, since users are finally starting to question the idea of search as a black box.

The idea of transparency is simple: users should know why a search engine returns a particular response to their query. Note the emphasis on "why" rather than "how". Most users don't care what algorithms a search engine uses to compute a response. What they do care about is how the engine ultimately "understood" their query--in other words, what question the engine thinks it's answering.

Some of you might find this description too anthropomorphic. But a recent study reported that most users expect search engines to read their minds--never mind that the general case goes beyond AI-complete (should we create a new class of ESP-complete problems?). What frustrates users most is when a search engine not only fails to read their minds, but gives no indication of where the communication broke down, let alone how to fix it. In short, a failure to provide transparency.

What does this have to do with set retrieval vs. ranked retrieval? Plenty!

Set retrieval predates the Internet by a few decades, and was the first approach used to implement search engines. These search engines allowed users to enter queries by stringing together search terms with Boolean operators (AND, OR, etc.). Today, Boolean retrieval seems arcane, and most people see set retrieval as suitable for querying databases rather than search engines.

The biggest problem with set retrieval is that users find it extremely difficult to compose effective Boolean queries. Nonetheless, there is no question that set retrieval offers transparency: what you ask is what you get. And, if you prefer a particular sort order for your results, you can specify it.

In contrast, ranked retrieval makes it much easier for users to compose queries: users simply enter a few top-of-mind keywords. And for many use cases (in particular, known-item search), a state-of-the-art implementation of ranked retrieval yields results that are good enough.

But ranked retrieval approaches generally shed transparency. At best, they employ standard information retrieval models that, although published in all of their gory detail, are opaque to their users--who are unlikely to be SIGIR regulars. At worst, they employ secret, proprietary models, either to protect their competitive differentiation or to thwart spammers.

Either way, the only clues that most ranked retrieval engines provide to users are text snippets from the returned documents. Those snippets may validate the relevance of the results that are shown, but the user does not learn what distinguishes the top-ranked results from other documents that contain some or all of the query terms.

If the user is satisfied with one of the top results, then transparency is unlikely to even come up. Even if the selected result isn't optimal, users may do well to satisfice. But when the search engine fails to read the user's mind, transparency offers the best hope of recovery.

But, as I mentioned earlier, users aren't great at composing queries for set retrieval, which was how ranked retrieval became so popular in the first place despite its lack of transparency. How do we resolve this dilemma?

To be continued...

Sunday, August 24, 2008

Set Retrieval vs. Ranked Retrieval

After last week's post about a racially targeted web search engine, you'd think I'd avoid controversy for a while. To the contrary, I now feel bold enough to bring up what I have found to be my most controversial position within the information retrieval community: my preference for set retrieval over ranked retrieval.

This will be the first of several posts along this theme, so I'll start by introducing the terms.
  • In a ranked retrieval approach, the system responds to a search query by ranking all documents in the corpus based on its estimate of their relevance to the query.

  • In a set retrieval approach, the system partitions the corpus into two subsets of documents: those it considers relevant to the search query, and those it does not.
An information retrieval system can combine set retrieval and ranked retrieval by first determining a set of matching documents and then ranking the matching documents. Most industrial search engines, such as Google, take this approach, at least in principle. But, because the set of matching documents is typically much larger than the set of documents displayed to a user, these approaches are, in practice, ranked retrieval.
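To make the distinction concrete, here is a toy sketch of that combined approach: a Boolean match step defines the retrieved set, and a ranking step orders only what is inside it. The corpus and the deliberately crude term-frequency score are invented for illustration:

```python
# Toy match-then-rank pipeline: set retrieval defines the result set,
# ranked retrieval only orders what's inside it.
corpus = {
    1: "faceted navigation for exploratory search",
    2: "ranked retrieval models for web search",
    3: "set retrieval versus ranked retrieval for search",
    4: "cooking with chicken and waffles",
}

def matches(doc_text, required_terms):
    """Boolean AND semantics: the document is in the set iff it contains every term."""
    words = set(doc_text.split())
    return all(term in words for term in required_terms)

def score(doc_text, terms):
    """Crude relevance score: total occurrences of the query terms."""
    words = doc_text.split()
    return sum(words.count(term) for term in terms)

query = ["retrieval", "search"]
result_set = {doc_id for doc_id, text in corpus.items() if matches(text, query)}
ranked = sorted(result_set, key=lambda doc_id: score(corpus[doc_id], query), reverse=True)

print(f"{len(result_set)} matching documents")  # a meaningful count, per set retrieval
print(ranked)                                   # ordered for display, per ranked retrieval
```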

What is set retrieval in practice? In my view, a set retrieval approach satisfies two expectations:
  • The number of documents reported to match my search should be meaningful--or at least should be a meaningful estimate. More generally, any summary information reported about this set should be useful.

  • Displaying a random subset of the set of matching documents to the user should be a plausible behavior, even if it is not as good as displaying the top-ranked matches. In other words, relevance ranking should help distinguish more relevant results from less relevant results, rather than distinguishing relevant results from irrelevant results.
Despite its popularity, the ranked retrieval model suffers because it does not provide a clear split between relevant and irrelevant documents. This weakness makes it impossible to obtain even basic analysis of the query results, such as the number of relevant documents, let alone more sophisticated analysis, such as result quality. In contrast, a set retrieval model does not rank the retrieved documents; instead, it establishes a clear split between documents that are in and out of the retrieved set. As a result, set retrieval models enable rich analysis of query results, which can then be applied to improve the user experience.

Saturday, August 23, 2008

Back from the Cone of Silence

Regular readers may have noticed the lack of posts this week. My apologies to anyone who was waiting by the RSS feed. Yesterday was the submission deadline for HCIR '08, which means that today is a new day! So please stay tuned for your regularly scheduled programming.

Saturday, August 16, 2008

Thinking Outside the Black Box

I was reading Techmeme today, and I noticed an LA Times article about RushmoreDrive, described on its About Us page as "a first-of-its-kind search engine for the Black community." My first reaction, blogged by others already, was that this idea was dumb and racist. In fact, it took some work to find positive commentary about RushmoreDrive.

But I've learned from the way the blogosphere handled the Cuil launch not to trust anyone who evaluates a search engine without having tried it, myself included. My wife and I have been the only white people at Amy Ruth's, and the service was as gracious as the chicken and waffles were delicious, so I decided I'd try my luck on a search engine not targeted at my racial profile.

The search quality is solid, comparable to that of Google, Yahoo, and Microsoft. In fact, the site looks a lot like a re-skinning (no pun intended) of Ask.com, a corporate sibling of IAC-owned RushmoreDrive. Like Ask.com, RushmoreDrive emphasizes search refinement through narrowing and broadening refinements.

What I find ironic is that the whole controversy about racial bias in relevance ranking reveals the much bigger problem--that relevance ranking should not be a black box (ok, maybe this time I'll take responsibility for the pun). I've been beating this drum at The Noisy Channel ever since I criticized Amit Singhal for Google's lack of transparency. I think that sites like RushmoreDrive are inevitable if search engines refuse to cede more control of search results to users.

I don't know how much information race provides as a prior to influence statistical ranking approaches, but I'm skeptical that the effects are useful or even noticeable beyond a few well-chosen examples. I'm more inclined to see RushmoreDrive as a marketing ploy by the folks at IAC--and perhaps a successful one. I doubt that Google is running scared, but I think this should be a wake-up call to folks who are convinced that personalized relevance ranking is the end goal of user experience for search engines.

Friday, August 15, 2008

New Information Retrieval Book Available Online

Props to Jeff Dalton for alerting me about the new book on information retrieval by Christopher Manning, Prabhakar Raghavan, and Hinrich Schütze. You can buy a hard copy, but you can also access it online for free at the book website.

Wednesday, August 13, 2008

David Huynh's Freebase Parallax

One of the perks of working in HCIR is that you get to meet some of the coolest people in academic and industrial research. I met David Huynh a few years ago, while he was a graduate student at MIT, working in the Haystack group and on the Simile project. You've probably seen some of his work: his Timeline project has been deployed all over the web.

Despite efforts by me and others to persuade David to stay in the Northeast, he went out west a few months ago to join Metaweb, a company with ambitions "to build a better infrastructure for the Web." While I (and others) am not persuaded by Freebase, Metaweb's "open database of the world’s information," I am happy to see that David is still doing great work.

I encourage you to check out David's latest project: Freebase Parallax. In it, he does something I've never seen outside Endeca (excepting David's earlier work on a Nested Faceted Browser): he allows you to navigate using the facets of multiple entity types, joining between sets of entities through their relationships. At Endeca, we call this "record relationship navigation"--we presented it at HCIR '07, showing how it can enable social navigation.

David includes a video where he eloquently demonstrates how Parallax works, and the interface is quite compelling. I'm not sure how well it scales with large data sets, but David's focus has been on interfaces rather than systems. My biggest complaint--which isn't David's fault--is that the Freebase content is a bit sparse. But his interface strikes me as a great fit for exploratory search.

Conversation with Seth Grimes

I had a great conversation with Intelligent Enterprise columnist Seth Grimes today. Apparently there's an upside to writing critical commentary on Google's aspirations in the enterprise!

One of the challenges in talking about enterprise search is that no one seems to agree on what it is. Indeed, as I've been discussing with Ryan Shaw, I use the term broadly to describe information access scenarios distinct from web search, where an organization has some ownership or control of the content (in contrast to the somewhat adversarial relationship that web search companies have with the content they index). But I realize that many folks define enterprise search more narrowly as a search box hooked up to the intranet.

Perhaps a better way to think about enterprise search is as a problem rather than solution. Many people expect a search box because they're familiar with searching the web using Google. I don't blame anyone for expecting that the same interface will work for enterprise information collections. Unfortunately, wishful thinking and clever advertising notwithstanding, it doesn't.

I've blogged about this subject from several different perspectives over the past weeks, so I'll refer recent readers to earlier posts on the subject rather than bore the regulars.

But I did want to mention a comment Seth made that I found particularly insightful. He defined enterprise search even more broadly than I do, suggesting that it encompassed any information seeking performed in the pursuit of enterprise-centric needs. In that context, he does see Google as the leader in enterprise search--not because of their enterprise offerings, but rather because of the web search they offer for free.

I'm not sure how I feel about his definition, but I think he raises a point that enterprise vendors often neglect. No matter how much information an enterprise controls, there will always be valuable information outside the enterprise. I find today's APIs to that information woefully inadequate; for example, I can't even choose a sort order through any of the web search APIs. But I am optimistic that those APIs will evolve, and that we will see "federated" information seeking that goes beyond merging ranked lists from different sources.

Indeed, I look forward to the day that web search providers take a cue from the enterprise and drop the focus on black box relevance ranking in favor of an approach that offers users control and interaction.

Monday, August 11, 2008

Position papers for NSF IS3 Workshop

I just wanted to let folks know that the position papers for the NSF Information Seeking Support Systems Workshop are now available at this link.

Here is a listing to whet your curiosity:
  • Supporting Interaction and Familiarity
    James Allan, University of Massachusetts Amherst, USA

  • From Web Search to Exploratory Search: Can we get there from here?
    Peter Anick, Yahoo! Inc., USA

  • Complex and Exploratory Web Search (with Daniel Russell)
    Anne Aula, Google, USA

  • Really Supporting Information Seeking: A Position Paper
    Nicholas J. Belkin, Rutgers University, USA

  • Transparent and User-Controllable Personalization For Information Exploration
    Peter Brusilovsky, University of Pittsburgh, USA

  • Faceted Exploratory Search Using the Relation Browser
    Robert Capra, UNC, USA

  • Towards a Model of Understanding Social Search
    Ed Chi, Palo Alto Research Center, USA

  • Building Blocks For Rapid Development of Information Seeking Support Systems
    Gary Geisler, University of Texas at Austin, USA

  • Collaborative Information Seeking in Electronic Environments
    Gene Golovchinsky, FX Palo Alto Laboratory, USA

  • NeoNote: User Centered Design Suggestions for a Global Shared Scholarly Annotation System
    Brad Hemminger, UNC, USA

  • Speaking the Same Language About Exploratory Information Seeking
    Bill Kules, The Catholic University of America, USA

  • Musings on Information Seeking Support Systems
    Michael Levi, U.S. Bureau of Labor Statistics, USA

  • Social Bookmarking and Information Seeking
    David Millen, IBM Research, USA

  • Making Sense of Search Result Pages
    Jan Pedersen, Yahoo, USA

  • A Multilevel Science of Social Information Foraging and Sensemaking
    Peter Pirolli, XEROX PARC, USA

  • Characterizing, Supporting and Evaluating Exploratory Search
    Edie Rasmussen, University of British Columbia, Canada

  • The Information-Seeking Funnel
    Daniel Rose, A9.com, USA

  • Complex and Exploratory Web Search (with Anne Aula)
    Daniel Russell, Google, USA

  • Research Agenda: Visual Overviews for Exploratory Search
    Ben Shneiderman, University of Maryland, USA

  • Five Challenges for Research to Support IS3
    Elaine Toms, Dalhousie University, Canada

  • Resolving the Battle Royale between Information Retrieval and Information Science
    Daniel Tunkelang, Endeca, USA

Sunday, August 10, 2008

Why Enterprise Search Will Never Be Google-y

As I prepared to end my trilogy of Google-themed posts, I ran into two recently published items. They provide an excellent context for what I intended to talk about: the challenges and opportunities of enterprise search.

The first is Google's announcement of an upgrade to their search appliance that allows one box to index 10 million documents and offers improved search quality and personalization.

The second is an article by Chris Sherman in the Enterprise Search Sourcebook 2008 entitled Why Enterprise Search Will Never Be Google-y.

First, the Google announcement. These are certainly improvements for the GSA, and Google does seem to be aiming to compete with the Big Three: Autonomy, Endeca, and FAST (now a subsidiary of Microsoft). But these improvements should be seen in the context of the state of the art. In particular, Google's scalability claims, while impressive, still fall short of the market leaders in enterprise search. Moreover, the bottleneck in enterprise search hasn't been the scale of document indexing, but rather the effectiveness with which people can access and interact with the indexed content. Interestingly, Google's strongest selling point for the GSA--the claim that it works "out of the box"--is also its biggest weakness: even with the new set of features, the GSA does not offer the flexibility or rich functionality that enterprises have come to expect.

Second, the Chris Sherman piece. Here is an excerpt:
Enterprise search and web search are fundamentally different animals, and I'd argue that enterprise search won't--and shouldn't--be Google-y any time soon....Like web search, Google's enterprise search is easy to use--if you're willing to go along with how Google's algorithms view and present your business information....Ironically, enterprises, with all of their highly structured and carefully organized silos of information, require a very different and paradoxically more complex approach.
I highly recommend you read the whole article (it's only 2 pages), not only because it is informative and well written, but also because the author isn't working for one of the Big Three.

The upshot? There is no question that Google is raising the bar for simple search in the enterprise. I wouldn't recommend that anyone try to compete with the GSA on its turf.

But information needs in the enterprise go far beyond known-item search. What enterprises want when they ask for "enterprise search" is not just a search box, but an interactive tool that helps them (or their customers) work through the process of articulating and fulfilling their information needs, for tasks as diverse as customer segmentation, knowledge management, and e-discovery.

If you're interested in search and want to be on the cutting edge of innovation, I suggest you think about the enterprise.

Thursday, August 7, 2008

Where Google Isn't Good Enough

My last post, Is Google Good Enough?, challenged would-be Google killers to identify and address clear consumer needs for which Google isn't good enough as a solution. I like helping my readers, so here are some ideas.
  • Shopping. Google Product Search (fka Froogle) is not one of Google's crown jewels. At best, it works well when you know the exact name of the product you are looking for. But it pales in comparison to any modern ecommerce site, such as Amazon or Home Depot. What makes a shopping site successful? Put simply, it helps users find what they want, even when they didn't know exactly what they wanted when they started.

  • Finding a job. Google has not thrown its hat into the ring of job search, and even the page they offer for finding jobs at Google could use some improvement. The two biggest job sites, Monster and Careerbuilder, succeed in terms of the number of jobs posted, but aren't exactly optimized for user experience. Dice does better, but only for technology jobs. Interestingly, the best job finding site may be LinkedIn--not because of their search implementation (which is adequate but not innovative), but because of their success in getting millions of professionals to provide high-quality data.

  • Finding employees. Again, LinkedIn has probably come closest to providing a good employee finding site. The large job sites (all of which I've used at some point) not only fail to support exploratory search, but also suffer from a skew towards ineligible candidates and a nuisance of recruiters posing as job seekers. Here again, Google has not tried to compete.

  • Planning a trip. Sure, you can use Expedia, Travelocity, or Kayak to find a flight, hotel, and car rental. But there's a lot of room for improvement when it comes to planning a trip, whether for business or pleasure. The existing tools do a poor job of putting together a coordinated itinerary (e.g., meals, activities), and also don't integrate with relevant information sources, such as local directories and reviews. This is another area where Google has not tried to play.
Note two general themes here. The first is thinking beyond the mechanics of search and focusing on the ability to meet user needs at the task level. The second is the need for exploratory search. These only scratch the surface of opportunities in consumer-facing "search" applications. The opportunities within the enterprise are even greater, but I'll save that for my next post.

Tuesday, August 5, 2008

Is Google Good Enough?

As Chief Scientist of Endeca, I spend a lot of my time explaining to people why they should not be satisfied with an information seeking interface that only offers them keyword search as an input mechanism and a ranked list of results as output. I tell them about query clarification dialogs, faceted navigation, and set analysis. More broadly, I evangelize exploratory search and human computer information retrieval as critical to addressing the inherent weakness of conventional ranked retrieval. If you haven't heard me expound on the subject, feel free to check out this slide show on Is Search Broken?.

But today I wanted to put my ideology aside and ask the simple question: Is Google good enough? Here is a good faith attempt to make the case for the status quo. I'll focus on web search, since, as I've discussed before on this blog, enterprise search is different.

1) Google does well enough on result quality, enough of the time.

While Google doesn't publish statistics about user satisfaction, it's common knowledge that Google usually succeeds in returning results that users find relevant. Granted, so do all of the major search engines: you can compare Google and Yahoo results graphically at this site. But the question is not whether other search engines are also good enough--or even whether they are better. The point is that Google is good enough.

2) Google doesn't support exploratory search. But it often leads you to a tool that does.

The classic instance of this synergy is when Google leads you to a Wikipedia entry. For example, I look up Daniel Kahneman on Google. The top result is his Wikipedia entry. From there, I can traverse links to learn about his research areas, his colleagues, etc.

3) Google is a benign monopoly that mitigates choice overload.

Many people, myself included, have concerns about Google's increasing role in mediating our access to information. But it's hard to ignore the upside of a single portal that gives you access to everything in one place: web pages, blogs, maps, email, etc. And it's all "free"--at least in so far as ad-supported services can be said to be free.

In summary, Google sets the bar pretty high. There are places where Google performs poorly (e.g., shopping) or doesn't even try to compete (e.g., travel). But when I see the series of companies lining up to challenge Google, I have to wonder how many of them have identified and addressed clear consumer needs for which Google isn't good enough as a solution. Given Google's near-monopoly in web search, parity or even incremental advantage isn't enough.

Monday, July 28, 2008

Not as Cuil as I Expected

Today's big tech news is the launch of Cuil, the latest challenger to Google's hegemony in Web search. Given the impressive team of Xooglers that put it together, I had high expectations for the launch.

My overall reaction: not bad, but not good enough to take seriously as a challenge to Google. They may be "The World's Biggest Search Engine" based on the number of pages indexed, but they return zero results for a number of queries where Google does just fine, including noisy channel blog (compare to Google). But I'm not taking it personally--after all, their own site doesn't show up when you search for their name (again, compare to Google). As for their interface features (column display, explore by category, query suggestions), they're fine, but neither the concepts nor the quality of their implementation strike me as revolutionary.

Perhaps I'm expecting too much on day 1. But they're not just trying to beat Gigablast; they're trying to beat Google, and they surely expected to get lots of critical attention the moment they launched. Regardless of the improvements they've made in indexing, they clearly need to do more work on their crawler. It's hard to judge the quality of results when it's clear that at least some of the problem is that the most relevant documents simply aren't in their index. I'm also surprised to not see Wikipedia documents showing up much for my searches--particularly for searches when I'm quite sure the most relevant document is in Wikipedia. Again, it's hard to tell if this is an indexing or results quality issue.

I wish them luck--I speak for many in my desire to see Google face worthy competition in web search.

Sunday, July 27, 2008

Catching up on SIGIR '08

Now that SIGIR '08 is over, I hope to see more folks blogging about it. I'm jealous of everyone who had the opportunity to attend, not only because of the culinary delights of Singapore, but because the program seems to reflect an increasing interest of the academic community in real-world IR problems.

Some notes from looking over the proceedings:
  • Of the 27 paper sessions, 2 include the word "user" in their titles, 2 include the word "social", 2 focus on Query Analysis & Models, and 1 is about exploratory search. Compared to the last few SIGIR conferences, this is a significant increase in focus on users and interaction.

  • A paper on whether test collections predict users' effectiveness offers an admirable defense of the Cranfield paradigm, much along the lines I've been advocating.

  • A nice paper from Microsoft Research looks at the problem of whether to personalize results for a query, recognizing that not all queries benefit from personalization. This approach may well be able to reap the benefits of personalization while avoiding much of its harm.

  • Two papers on tag prediction: Real-time Automatic Tag Recommendation (ACM Digital Library subscription required) and Social Tag Prediction. Semi-automated tagging tools are one of the best ways to leverage the best of both human and machine capabilities.
And I haven't even gotten to the posters! I'm sad to see that they dropped the industry day, but perhaps they'll bring it back next year in Boston.

Wednesday, July 23, 2008

Knol: Google takes on Wikipedia

Just a few days ago, I was commenting on a New York Times article about Wikipedia's new approval system that the biggest problem with Wikipedia is anonymous authorship. By synchronous coincidence, Google unveiled Knol today, which is something of a cross between Wikipedia and Squidoo. Its most salient feature is that each entry will have a clearly identified author. They even allow authors to verify their identities using credit cards or phone directories.

It's a nice idea, since anonymous authorship is a major factor in the adversarial nature of information retrieval on the web. Not only does the accountability of authorship inhibit vandalism and edit wars, but it also allows readers to decide for themselves whom to trust--at least to the extent that readers are able and willing to obtain reliable information about the authors. Without question, they are addressing Wikipedia's biggest weakness.

But it's too little, too late. Wikipedia is already there. And, despite complaints about its inaccuracy and bias, Wikipedia is a fantastic, highly utilized resource. The only way I see for Knol to supplant Wikipedia in a reasonable time frame is through a massive cut-and-paste to make up for the huge difference in content.

Interestingly, Wikipedia does not seem to place any onerous restrictions on verbatim copying. However, unless a single author is 100% responsible for authoring a Wikipedia entry, it isn't clear that anyone can simply copy the entry into Knol.

I know that it's dangerous to bet against Google. But I'm really skeptical about this latest effort. It's a pity, because I think their emphasis is the right one. But for once I wish they'd been a bit more humble and accepted that they aren't going to build a better Wikipedia from scratch.

Saturday, July 19, 2008

Predictably Irrational

As regular readers have surely noticed by now, I've been on a bit of a behavioral psychology kick lately. Some of this reflects long-standing personal interest and my latest reading. But I also feel increasingly concerned that researchers in information seeking--especially those working on tools--have neglected the impact of cognitive bias.

For those who are unfamiliar with the last few decades of research in this field, I highly recommend a recent lecture by behavioral economist Dan Ariely on predictable irrationality. Not only is he a very informative and entertaining speaker, but he chooses very concrete and credible examples, starting with his contemplating how we experience pain, based on his own experience of suffering third-degree burns over 70 percent of his body. I promise you, the lecture is an hour well spent, and the time will fly by.

A running theme through this and my other posts on cognitive bias is that the way information is presented to us has dramatic effects on how we interpret that information.

This is great news for anyone who wants to manipulate people. In fact, I once asked Dan about the relative importance of people's inherent preferences vs. those induced by presentation on retail web sites, and he all but dismissed the former (i.e., you can sell ice cubes to Eskimos if you can manipulate their cognitive biases appropriately). But it's sobering news for those of us who want to empower users to evaluate information objectively to support decision making.

Friday, July 18, 2008

Call to Action - A Follow-Up

The call to action I sent out a couple of weeks ago has generated healthy interest.

One of the several people who responded is the CTO of one of Endeca's competitors, whom I laud for understanding that the need to better articulate and communicate the technology of information access transcends competition among vendors. While we have differences on how to achieve this goal, I at least see hope from his responsiveness.

The rest were analysts representing some of the leading firms in the space. They not only expressed interest, but also contributed their own ideas on how to make this effort successful. Indeed, I met with two analysts this week to discuss next steps.

Here is where I see this going.

In order for any efforts to communicate the technology of information access to be effective, the forum has to establish credibility as a vendor-neutral and analyst-neutral forum. Ideally, that means having at least two major vendors and two major analysts on board. What we want to avoid is having only one major vendor or analyst, since that will create a reasonable perception of bias.

I'd also like to involve academics in information retrieval and library and information science. As one of the analysts suggested, we could reach out to the leading iSchools, who have expressed an open interest in engaging the broader community.

What I'd like to see come together is a forum, probably a one-day workshop, that brings together credible representatives from the vendor, analyst, and academic communities. With a critical mass of participants and enough diversity to assuage concerns of bias, we can start making good on this call to action.

Tuesday, July 15, 2008

Beyond a Reasonable Doubt

In Psychology of Intelligence Analysis, Richards Heuer advocates that we quantify expressions of uncertainty: "To avoid ambiguity, insert an odds ratio or probability range in parentheses after expressions of uncertainty in key judgments."

His suggestion reminds me of my pet peeve about the unquantified notion of reasonable doubt in the American justice system. I've always wanted (but never had the opportunity) to ask a judge what probability of innocence constitutes a reasonable doubt.

Unfortunately, as Heuer himself notes elsewhere in his book, we human beings are really bad at estimating probabilities. I suspect (with a confidence of 90 to 95%) that quantifying our uncertainties as probability ranges will only convey a false sense of precision.

So, what can we do to better communicate uncertainty? Here are a couple of thoughts:
  • We can calibrate estimates based on past performance (see the sketch after this list). It's unclear what will happen if people realize that their estimates are being translated, but, at worst, it feels like good fodder for research in judgment and decision making.

  • We can ask people to express relative probability judgments. While these are also susceptible to bias, at least they don't demand as much precision. And we can always vary the framing of questions to try to factor out the cognitive biases they induce.
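
To make the calibration idea in the first bullet concrete, here is a minimal sketch in Python. The judgment history is entirely invented; the point is simply to map an analyst's stated confidence levels to the hit rates actually observed for those levels in the past.

    # Minimal calibration sketch. The judgment history below is invented.
    from collections import defaultdict

    # Past judgments: (stated probability, whether the judgment proved correct).
    history = [
        (0.9, True), (0.9, True), (0.9, False), (0.9, True),
        (0.7, True), (0.7, False), (0.7, False),
        (0.5, True), (0.5, False),
    ]

    def build_calibration_table(history):
        """For each stated confidence level, compute the observed hit rate."""
        outcomes = defaultdict(list)
        for stated, correct in history:
            outcomes[stated].append(correct)
        return {stated: sum(results) / len(results)
                for stated, results in outcomes.items()}

    table = build_calibration_table(history)

    def calibrate(stated):
        """Translate a stated probability into its empirically observed rate."""
        return table.get(stated, stated)  # fall back to the stated value

    print(calibrate(0.9))  # 0.75: this analyst's "90%" has meant 75% in the past
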
Also, when we talk about uncertainty, it is important that we distinguish between aleatory and epistemic uncertainty.

When I flip a coin, I am certain it has a 50% chance of landing heads, because I know the probability distribution of the event space. This is aleatory uncertainty, and forms the basis of probability and statistics.

But when I reason about less contrived uncertain events, such as estimating the likelihood that my bank will collapse this year, the challenge is my ignorance of the probability distribution. This is epistemic uncertainty, and it's a lot messier.

If you'd like to learn more about aleatory and epistemic uncertainty, I recommend Nassim Nicholas Taleb's Fooled by Randomness (which is a better read than his better-known Black Swan).

In summary, we have to accept the bad news that the real world is messy. As a mathematician and computer scientist, I've learned to pursue theoretical rigor as an ideal. Like me, you may find it very disconcerting to not be able to treat all real-world uncertainty in terms of probability spaces. Tell it to the judge!

Sunday, July 13, 2008

Small is Beautiful

Today's New York Times has an article by John Markoff called On a Small Screen, Just the Salient Stuff. It argues that the design constraints of the iPhone (and of mobile devices in general) lead to an improved user experience, since site designers do a better job of focusing on the information that users will find relevant.

Of course, on a blog entitled The Noisy Channel, I can't help praising approaches that strive to improve the signal-to-noise ratio in information seeking applications. And I'm glad to see them quoting Ben Shneiderman, a colleague of mine at the University of Maryland who has spent much of his career focusing on HCIR issues.

Still, I think they could have taken the idea much further. Their discussion of more efficient or ergonomic use of real estate boils down to stripping extraneous content (a good idea, but hardly novel), and making sites vertically oriented (i.e., no horizontal scrolling). They don't consider the question of what information is best to present in the limited space--which, in my mind, is the most important question to consider as we optimize interaction. Indeed, many of the questions raised by small screens also apply to other interfaces, such as voice.

Perhaps I am asking too much to expect them to call out the extreme inefficiency of ranked lists, compared to summarization-oriented approaches. Certainly the mobile space opens great opportunities for someone to get this right on the web.

Friday, July 11, 2008

Psychology of Intelligence Analysis

In the course of working with some of Endeca's more interesting clients, I started reading up on how the intelligence agencies address the challenges of making decisions, especially in the face of incomplete and contradictory evidence. I ran into a book called Psychology of Intelligence Analysis by former CIA analyst Richards Heuer. The entire book is available online, or you can hunt down a hard copy of the out-of-print book from your favorite used book seller.

Given the mixed record of the intelligence agencies over the past few decades, you might be wondering if the CIA is the best source for learning how to analyze intelligence. But this book is a gem. Even if the agencies don't always practice what they preach (and the book makes a good case as to why), the book is an excellent tour through the literature on judgment and decision making.

If you're already familiar with work by Herb Simon, Danny Kahneman, and Amos Tversky, then a lot of the ground he covers will be familiar--especially the third of the book that enumerates cognitive biases. I'm a big fan of the judgment and decision making literature myself. But I still found some great nuggets, particularly Chapter 8 on Analysis of Competing Hypotheses. Unlike most of the literature that focuses exclusively on demonstrating our systematic departures from rationality, Heuer hopes to offer at least some constructive advice.

As someone who builds tools to help people make decisions using information that may be not only incomplete and contradictory, but also challenging to find in the first place, I'm very sensitive to how people's cognitive biases affect their ability to use these tools effectively. One of the HCIR '07 presentations by Jolie Martin and Michael Norton (who have worked with Max Bazerman) showed how the manner in which information was partitioned on retail web sites drove decisions, i.e., re-organizing the same information affected consumers' decision processes.

It may be tempting for us on the software side to wash our hands of our users' cognitive biases. But such an approach would be short-sighted. As Heuer shows in his well-researched book, people not only have cognitive biases, but are unable to counter those biases simply by being made aware of them. Hence, if software tools are to help people make effective decisions, it is the job of us tool builders to build with those biases in mind, and to support processes like Analysis of Competing Hypotheses that try to compensate for human bias.

Thursday, July 10, 2008

Nice Selection of Machine Learning Papers

John Langford just posted a list of seven ICML '08 papers that he found interesting. I appreciate his taste in papers, and I particularly liked a paper on Learning Diverse Rankings with Multi-Armed Bandits that addresses learning a diverse ranking of documents based on users' clicking behavior. If you liked the Less is More work that Harr Chen and David Karger presented at SIGIR '06, then I recommend you check this one out.
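
For readers curious how clicks can drive a ranking, here is a drastically simplified sketch of the general idea: one epsilon-greedy bandit per rank position, rewarded when its document is the one clicked. This is not the algorithm from the paper--just a toy with invented documents and simulated users.

    import random

    DOCS = ["d1", "d2", "d3", "d4", "d5"]
    K = 3          # number of rank positions to learn
    EPSILON = 0.1  # exploration rate

    # Per-position value estimates and pull counts for each document.
    value = [{d: 0.0 for d in DOCS} for _ in range(K)]
    pulls = [{d: 0 for d in DOCS} for _ in range(K)]

    def select_ranking():
        ranking = []
        for position in range(K):
            candidates = [d for d in DOCS if d not in ranking]
            if random.random() < EPSILON:
                choice = random.choice(candidates)                          # explore
            else:
                choice = max(candidates, key=lambda d: value[position][d])  # exploit
            ranking.append(choice)
        return ranking

    def update(ranking, clicked_position):
        # Reward 1 at the clicked position, 0 elsewhere (a crude simplification).
        for position, doc in enumerate(ranking):
            reward = 1.0 if position == clicked_position else 0.0
            pulls[position][doc] += 1
            value[position][doc] += (reward - value[position][doc]) / pulls[position][doc]

    def simulate_click(ranking, relevant):
        # A simulated user clicks the first document they consider relevant.
        for position, doc in enumerate(ranking):
            if doc in relevant:
                return position
        return None

    for _ in range(5000):
        relevant = random.choice([{"d1"}, {"d2"}, {"d1", "d3"}])  # diverse user intents
        ranking = select_ranking()
        clicked = simulate_click(ranking, relevant)
        if clicked is not None:
            update(ranking, clicked)

    print(select_ranking())  # the learned ranking tends to cover the different intents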

Tuesday, July 8, 2008

Librarian 2.0

Many of the words that mark milestones in the history of technology, such as calculator and word processor, originally corresponded to people. Calculating had at least two lives as a technology breakthrough--first as a process, and then as an automatic means for executing that process. Thanks to inventions like calculators and computers, human beings have moved up the value chain to become scientists and engineers who take low-level details for granted.

Similarly, the advances in information science and retrieval have dramatically changed the role of a reference librarian.

Hopefully some of you are old enough to remember card catalogs. They were certainly functional if you knew the exact title or author you were looking for, assuming the title wasn't too generic or the author too prolific. Where card catalogs fell short was in supporting exploratory search. In many cases, your best bet was to quite literally explore the stacks and hope that locality within the Dewey Decimal system sufficed to support your information seeking needs. Alternatively, you could follow citation paths--the dead-tree precursor of surfing a hypertext collection.

For exploratory tasks, library patrons would turn to reference librarians, who would clarify the patrons' needs through a process called the reference interview. According to Wikipedia:
A reference interview is composed of two segments:

1. An initial segment in which the librarian encourages the user to fully discuss the request.
2. A final segment in which the librarian asks questions to relate the request to the materials available in the library.

A reference interview is structured (ideally) according to the following series of steps. First the library user states a question or describes a problem. The librarian then clarifies the user's information need, sometimes leading him or her back from a request for a specific resource (which may not be the best one for the problem at hand) to the actual information need as it manifests in the library user's life. Following that, the librarian suggests information resources that address the user's information need, explaining the nature and scope of information they contain and soliciting feedback. The reference interview closes when the librarian has provided the appropriate information or a referral to an outside resource where it can be found, and the user confirms that he or she has received the information needed.
Fast forward to the present day. Thanks to modern search engines, title and author search are no longer tedious processes. Moreover, search engines are somewhat forgiving of users, offering spelling correction and inexact query matching. Libraries are still catching up with advances in technology, but the evolution is clearly under way.

However, search engines have not obviated the need for a reference interview. Excepting the simple cases of known-item search, the typical information seeker needs help translating an information need into one or more search queries. And that information need may change as the seeker learns from the process.

But it should come as no surprise that information seeking support systems need to be more than search engines. The ideal information seeking support system emulates a reference librarian, stepping users through a structured process of clarification. Indeed, this is exactly what my colleagues and I at Endeca are trying to do in our work with libraries and more broadly in pursuing a vision of human computer information retrieval.

What then becomes of librarians? Much as calculators and computers did not obviate the need for mathematicians, I don't see technology obviating the need for information scientists. Library schools have already evolved into information schools, and I have no doubt that their graduates will help establish the next generation of information seeking technology that makes today's search engines seem as quaint as card catalogs.

Sunday, July 6, 2008

Resolving the Battle Royale between Information Retrieval and Information Science

The following is the position paper I submitted to the NSF Information Seeking Support Systems Workshop last month. The workshop report is still being assembled, but I wanted to share my own contribution to the discussion, since it is particularly appropriate to the themes of The Noisy Channel.


Resolving the Battle Royale between Information Retrieval and Information Science


Daniel Tunkelang

Endeca

ABSTRACT

We propose an approach to help resolve the “battle royale” between the information retrieval and information science communities. The information retrieval side favors the Cranfield paradigm of batch evaluation, criticized by the information science side for its neglect of the user. The information science side favors user studies, criticized by the information retrieval side for their scale and repeatability challenges. Our approach aims to satisfy the primary concerns of both sides.

Categories and Subject Descriptors

H.1.2 [Human Factors]: Human information processing.

H.3.3 [Information Systems]: Information Search and Retrieval - Information Filtering, Retrieval Models

H.5.2 [Information Systems]: Information Interfaces and Presentation - User Interfaces

General Terms

Design, Experimentation, Human Factors

Keywords

Information science, information retrieval, information seeking, evaluation, user studies

1. INTRODUCTION

Over the past few decades, a growing community of researchers has called for the information retrieval community to think outside the Cranfield box. Perhaps the most vocal advocate is Nick Belkin, whose "grand challenges" in his keynote at the 2008 European Conference on Information Retrieval [1] all pertained to the interactive nature of information seeking he claims the Cranfield approach neglects. Belkin cited similar calls to action going back as far as Karen Spärck Jones, in her 1988 acceptance speech for the Gerald Salton award [2], and again from Tefko Saracevic, when he received the same award in 1997 [3]. More recently, we have the Information Seeking and Retrieval research program proposed by Peter Ingwersen and Kalervo Järvelin in The Turn, published in 2005 [4].

2. IMPASSE BETWEEN IR AND IS

Given the advocacy of Belkin and others, why hasn't there been more progress? As Ellen Voorhees noted in defense of Cranfield at the 2006 Workshop on Adaptive Information Retrieval, "changing the abstraction slightly to include just a bit more characterization of the user will result in a dramatic loss of power or increase in cost of retrieval experiments" [5]. Despite user studies that have sought to challenge the Cranfield emphasis on batch information retrieval measures like mean average precision—such as those of Andrew Turpin and Bill Hersh [6]—the information retrieval community, on the whole, remains unconvinced by these experiments because they are smaller in scale and less repeatable than the TREC evaluations.

As Tefko Saracevic has said, there is a "battle royale" between the information retrieval community, which favors the Cranfield paradigm of batch evaluation despite its neglect of the user, and the information science community, which favors user studies despite their scale and repeatability challenges [7]. How do we move forward?

3. PRIMARY CONCERNS OF IR AND IS

Both sides have compelling arguments. If an evaluation procedure is not repeatable and cost-effective, it has little practical value. Nonetheless, it is essential that an evaluation procedure measure the interactive nature of information seeking.

If we are to find common ground to resolve this dispute, we need to satisfy the primary concerns of both sides:

· Real information seeking tasks are interactive, so the results of the evaluation procedure must be meaningful in an interactive context.

· The evaluation procedure must be repeatable and cost-effective.

In order to move beyond the battle royale and resolve the impasse between the IR and IS communities, we need to address both of these concerns.

4. PROPOSED APPROACH


A key point of contention in the battle royale is whether we should evaluate systems by studying individual users or measuring system performance against test collections.

The short answer is that we need to do both. In order to ground the results of evaluation in realistic contexts, we need to conduct user studies that relate proposed measures to success in interactive information seeking tasks. Otherwise, we optimize under the artificial constraint that a task involves only a single user query.

Such an approach presumes that we have a characterization of information seeking tasks. This characterization is an open problem that is beyond the scope of this position paper but has been addressed by other information seeking researchers, including Ingwersen and Järvelin [4]. We presume access to a set of tasks that, if not exhaustive, at least applies to a valuable subset of real information seeking problems.

Consider, as a concrete example, the task of a researcher who, given a comprehensive digital library of technical publications, wants to determine with confidence whether his or her idea is novel. In other words, the researcher wants either to discover prior art that anticipates the idea, or to state with confidence that there is no such art. Patent inventors and lawyers performing e-discovery perform analogous tasks. We can measure task performance objectively as a combination of accuracy and efficiency, and we can also consider subjective measures like user confidence and satisfaction. Let us assume that we are able to quantify a task success measure that incorporates these factors.

Given this task and success measure, we would like to know how well an information retrieval system supports the user performing it. As the information scientists correctly argue, user studies are indispensable. But, as we employ user studies to determine which systems are most helpful to users, we need to go a step further and correlate user success to one or more system measures. We can then evaluate these system measures in a repeatable, cost-effective process that does not require user involvement.

For example, let us hypothesize that mean average precision (MAP) on a given TREC collection is such a measure. We hypothesize that users pursuing the prior art search task are more successful using a system with higher MAP than those using a system with lower MAP. In order to test this hypothesis, we can present users with a family of systems that, insofar as possible, vary only in MAP, and see how well user success correlates to the system’s MAP. If the correlation is strong, then we validate the utility of MAP as a system measure and invest in evaluating systems using MAP against the specified collection in order to predict their utility for the prior art task.
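
As a minimal sketch of that validation step--with hypothetical systems and entirely invented numbers--we can check how strongly per-system MAP correlates with measured user success on the task:

    # Hypothetical validation of MAP as a proxy for task success (invented data).
    from statistics import correlation  # Pearson correlation, Python 3.10+

    map_scores   = [0.18, 0.24, 0.31, 0.35, 0.42]  # MAP of systems A-E on the collection
    task_success = [0.52, 0.58, 0.61, 0.70, 0.74]  # mean user success per system

    r = correlation(map_scores, task_success)
    print(f"correlation between MAP and task success: {r:.2f}")

    # A strong correlation would justify using MAP against this collection as a
    # cheap, repeatable proxy for user success on the prior art task.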

The principle here is a general one: it can be used not only to compare different algorithms, but also to evaluate more sophisticated interfaces, such as document clustering [8] or faceted search [9]. The only requirement is that we hypothesize and validate system measures that correlate to user success.

5. WEAKNESSES OF APPROACH

Our proposed approach has two major weaknesses.

The first weakness is that, in a realistic interactive information retrieval context, distinct queries are not independent. Rather, a typical user executes a sequence of queries in pursuit of an information need, each query informed by the results of the previous ones.

In a batch test, we must decide the query sequence in advance, and cannot model how the user’s queries depend on system response. Hence, we are limited to computing measures that can be evaluated for each query independently. Nonetheless, we can choose measures which correlate to effectiveness in realistic settings. Hopefully these measures are still meaningful, even when we remove the test queries from their realistic context.

The second challenge is that we do not envision a way to compare different interfaces in a batch setting. It seems that testing the relative merits of different interfaces requires real—or at least simulated—users.

If, however, we hold the interface constant, then we can define performance measures that apply to those interfaces. For example, we can develop standardized versions of well-studied interfaces, such as faceted search and clustering. We can then compare the performance of different systems that use these interfaces, e.g., different clustering algorithms.

6. AN ALTERNATIVE APPROACH

An alternative way to tackle the evaluation problem leverages the “human computation” approach championed by Luis Von Ahn [10]. This approach uses “games with a purpose” to motivate people to perform information-related tasks, such as image tagging and optical character recognition (OCR).

A particularly interesting "game" in our present context is Phetch, in which one or more "Seekers" compete to find an image based on a text description provided by a "Describer" [11]. The Describer's goal is to help the Seekers succeed, while the Seekers compete with one another to find the target image within a fixed time limit, using a search engine that has indexed the images based on tagging results from the ESP Game. In order to discourage a shotgun approach, the game penalizes Seekers for wrong guesses.

This game goes quite far in capturing the essence of interactive information retrieval. If we put aside the competition among the Seekers, then we see that an individual Seeker, aided by the human Describer and the algorithmic--but human-indexed--search engine, is pursuing an information retrieval task. Moreover, the Seeker is incented to be both effective and efficient.

How can we leverage this framework for information retrieval evaluation? Even though the game envisions both Describers and Seekers to be human beings, there is no reason we cannot allow computers to play too--in either or both roles. Granted, the game, as currently designed, focuses on image retrieval without giving the human players direct access to the image tags, but we could imagine a framework that is more amenable to machine participation, e.g., providing a machine player with a set of tags derived from those in the index when that player is presented with an image. Alternatively, there may be a domain more suited than image retrieval to incorporating computer players.

The main appeal of the game framework is that it allows all participants to be judged based on an objective criterion that reflects the effectiveness and efficiency of the interactive information retrieval process. A good Describer should, on average, outscore a bad Describer over the long term; likewise, a good Seeker should outscore a bad one. We can even vary the search engine available to Seekers, in order to compare competing search engine algorithms or interfaces.

7. CONCLUSION

Our goal is ambitious: we aspire towards an evaluation framework that satisfies information scientists as relevant to real-world information seeking, but nonetheless offers the practicality of the Cranfield paradigm that dominates information retrieval. The near absence of collaboration between the information science and information retrieval communities has been a greatly missed opportunity not only for both researcher communities but also for the rest of the world who could benefit from practical advances in our understanding of information seeking. We hope that the approach we propose takes at least a small step towards resolving this battle royale.

8. REFERENCES

[1] Belkin, N. J., 2008. Some(What) Grand Challenges for Information Retrieval. ACM SIGIR Forum 42, 1 (June 2008), 47-54.

[2] Spärck Jones, K. 1988. A look back and a look forward. In Proceedings of the 11th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval, 13-29.

[3] Saracevic, T. 1997. Users lost: reflections of the past, future and limits of information science. ACM SIGIR Forum 31, 2 (July 1997), 16-27.

[4] Ingwersen, P. and Järvelin, K. 2005. The turn. Integration of information seeking and retrieval in context. Springer.

[5] Voorhees, E. 2006. Building Test Collections for Adaptive Information Retrieval: What to Abstract for What cost? In First International Workshop on Adaptive Information Retrieval (AIR).

[6] Turpin, A. and Scholer, F. 2006. User performance versus precision measures for simple search tasks. In Proceedings of the 29th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval, 11-18.

[7] Saracevic, T. (2007). Relevance: A review of the literature and a framework for thinking on the notion in information science. Part II: nature and manifestations of relevance. Journal of the American Society for Information Science and Technology 58(3), 1915-1933.

[8] Cutting, D., Karger, D., Pedersen, J., and Tukey, J. 1992. Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections. In Proceedings of the 15th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval, 318-329.

[9] Workshop on Faceted Search. 2006. In Proceedings of the 29th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval.

[10] Von Ahn, L. 2006. Games with a Purpose. IEEE Computer 39, 6 (June 2006), 92-94.

[11] Von Ahn, L., Ginosar, S., Kedia, M., Liu, R., and Blum, M. 2006. Improving accessibility of the web with a computer game. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 79-82.


Monday, September 15, 2008

Information Accountability

The recent United Airlines stock fiasco triggered an expected wave of finger pointing. For those who didn't follow the event, here is the executive summary:

    In the wee hours of Sunday, September 7th, The South Florida Sun-Sentinel (a subsidiary of the Tribune Company) included a link to an article entitled "UAL Files for Bankruptcy." The link was legit, but the linked article didn't carry its publication date in 2002. Then Google's news bot picked up the article and automatically assigned it a current date. Furthermore, Google sent the link to anyone with an alert set up for news about United. Then, on Monday, September 8th, someone at Income Security Advisors saw the article in the results for a Google News search and sent it out on Bloomberg. The results are in the picture below, courtesy of Bloomberg by way of the New York Times.



    For anyone who wants all of the gory details, Google's version of the story is here; the Tribune Company's version is here.

I've spent the past week wondering about this event from an information access perspective. And then today I saw two interesting articles:
  • The first was a piece in BBC News about a speech by Sir Tim Berners-Lee expressing concern that the internet needs a way to help people separate rumor from real science. His examples included the fears about the Large Hadron Collider at CERN creating a black hole that would swallow up the earth (which isn't quite the premise of Dan Brown's Angels and Demons), and rumors that a vaccine given to children in Britain was harmful.

  • The second was a column in the New York Times about the dynamics of the US presidential campaign, where Adam Nagourney notes that "senior campaign aides say they are no longer sure what works, as they stumble through what has become a daily campaign fog, struggling to figure out what voters are paying attention to and, not incidentally, what they are even believing."
I see a common thread here is that I'd like to call "information accountability." I don't mean this term in the sense of a recent CACM article about information privacy and sensitivity, but rather in a sense of information provenance and responsibility.

Whether we're worrying about Google bombing, Google bowling, or what Gartner analyst Whit Andrews calls "denial-of-insight" attacks, our concern is that information often arrives with implicit authority. Despite the aphorism telling us "don't believe everything you read," most of us select news and information sources with some hope that they will be authoritative. Whether the motto is "all the news that's fit to print" or "don't be evil", our choice of sources we believe to be authoritative is a necessary heuristic to avoid subjecting everything we read to endless skeptical inquiry.

But sometimes the most reputable news sources get it wrong. Or perhaps "wrong" is the wrong word. When newspapers reported that the FBI was treating Richard Jewell as a "person of interest" in the Centennial Olympic Park bombing (cf. "Olympic Park Bomber" Eric Robert Rudolph), they weren't lying, but rather were communicating information from what they believed to be a reliable source. And, in turn, the FBI may have been correctly doing its job, given the information it had. But there's no question that Jewell suffered tremendously from his "trial by media" before his name was ultimately cleared.

It's tempting to react to these information breakdowns with finger-pointing, to figure out who is accountable and, in as litigious a society as the United States, bring on the lawyers. Moreover, there clearly are cases where willful misinformation constitutes criminal defamation or fraud. But I think we need to be careful, especially in a world where information flows in a highly connected--and not necessarily acyclic--social graph. Anyone who has played the children's game of telephone knows that small communication errors can blow up rapidly, and that it's difficult to partition blame fairly.

The simplest answer is that we are accountable for how we consume information: caveat lector. But this model seems overly simplistic, since our daily lives hinge on our ability to consume information without subjecting everything to skeptical scrutiny; we have to take at least some of it at face value. Besides, shouldn't we hold information providers responsible for living up to the reputations they cultivate and promote?

There are no easy answers here. But the bad news is that we cannot ignore the questions of information accountability. If terms like "social media" and "web 2.0" mean anything, they surely tell us that the game of telephone will only grow in the number of participants and in the complexity of the communication chains. As a society, we will have to learn to live with and mitigate the fallout.

Sunday, September 14, 2008

Is Blog Search Different?

Alerted by Jeff and Iadh, I recently read What Should Blog Search Look Like?, a position paper by Marti Hearst, Matt Hurst, and Sue Dumais. For those readers unfamiliar with this triumvirate, I suggest you take some time to read their work, as they are heavyweights in some of the areas most often covered by this blog.

The position paper suggests focusing on three kinds of search tasks:
  1. Find out what people are thinking or feeling about X over time.
  2. Find good blogs/authors to read.
  3. Find useful information that was published in blogs sometime in the past.
The authors generally recommend the use of faceted navigation interfaces--something I'd hope would be uncontroversial by now for search in general.

But I'm more struck by their criticism that existing blog search engines fail to leverage the special properties of blog data, and by their discussion, based on work by Mishne and de Rijke, of how blog search queries differ substantially from web search queries. I don't doubt the data they've collected, but I'm curious whether their results account for the rapid proliferation and mainstreaming of blogs. The lines between blogs, news articles, and informational web pages seem increasingly blurred.

So I'd like to turn the question around: what should blog search look like that is not applicable to search in general?

Saturday, September 13, 2008

Progress on the Migration

Please check out http://thenoisychannel.com/ to see the future of The Noisy Channel in progress. I'm using WordPress hosted on GoDaddy and did the minimum work to port all posts and comments (not including this one).

Here is my current list of tasks that I'd like to get done before we move.
  • Design! I'm currently using the default WordPress theme, which is pretty lame. I'm inclined to use a clean but stylish two-column theme that is widget-friendly. Maybe Cutline. In any case, I'd like the new site to be a tad less spartan before we move into it.

  • Internal Links. My habit of linking back to previous posts now means I have to map those links to the new posts. I suspect I'll do it manually, since I don't see an easy way to automate it.

  • Redirects. Unfortunately I don't think I can actually get Blogger to redirect traffic automatically. So my plan is to post signage throughout this blog making it clear that the blog has moved.
I'd love help, particularly in the form of advice on the design side. And I'll happily give administration access to anyone who has the cycles to help implement any of these or other ideas. Please let me know by posting here or by emailing me: dtunkelang@{endeca,gmail}.com.

Friday, September 12, 2008

Quick Bites: Probably Irrelevant. (Not!)

Thanks to Jeff Dalton for spreading the word about a new information retrieval blog: Probably Irrelevant. It's a group blog, currently listing Fernando Diaz and Jon Elsas as contributors. Given the authors and the blog name's anagram of "Re-plan IR revolt, baby!", I expect great things!

Wednesday, September 10, 2008

Fun with Twitter

I recently joined Twitter and asked the twitterverse for opinions about DreamHost vs. GoDaddy as a platform to host this blog on WordPress. I was shocked when I noticed today that I'd gotten this response from the President / COO of GoDaddy (or perhaps a sales rep posing as such).

Seems like a lot of work for customer acquisition!

Quick Bites: Email becomes a Dangerous Distraction

Just read this article citing a number of studies to the effect that email is a major productivity drain. Nothing surprising to me--a lot of us have learned the hard way that the only way to be productive is to not check email constantly.

But I am curious if anyone has made progress on tools that alert you to emails that do call for immediate attention. I'm personally a fan of attention bonds approaches, but I imagine that the machine learning folks have at least thought about this as a sort of inverse spam filtering problem.
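
To sketch what the "inverse spam filtering" framing might look like--purely as a toy, with a few invented training emails and scikit-learn's off-the-shelf text classification tools--one could train a classifier on which past emails actually demanded immediate attention:

    # Toy urgency classifier: treat "needs immediate attention" as a text
    # classification problem, the mirror image of spam filtering.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    emails = [
        "server down customers affected please respond now",
        "quarterly newsletter and upcoming events",
        "contract signature needed by end of day",
        "lunch menu for next week",
    ]
    urgent = [1, 0, 1, 0]  # hand-labeled: did this email demand immediate attention?

    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(emails, urgent)

    new_email = "production outage please call me now"
    print(model.predict_proba([new_email])[0][1])  # estimated probability of urgency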

Tuesday, September 9, 2008

Quick Bites: The Clickwheel Must Die

As someone who's long felt that the iPod's clickwheel violates Fitts's law, I was delighted to read this Gizmodo article asserting that the iPod's clickwheel must die. My choice quote:
Quite simply, the clickwheel hasn't scaled to handle the long, modern day menus in powerful iPods.
Fortunately Apple recognized its mistake on this one and fixed the problem in its touch interface. Though, to be clear, the problem was not inherent in the choice of a wheel interface, but rather in the requirement to make gratuitously precise selections.

Now I'm waiting to see someone fix the tiny minimize/maximize/close buttons in the upper right corner on Windows, which I suspect have become the textbook example of violating Fitts's law.
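
For readers who haven't run into Fitts's law, the Shannon formulation predicts movement time from target distance and width; the constants in the sketch below are illustrative, not measured:

    import math

    def fitts_movement_time(distance, width, a=0.2, b=0.1):
        """Fitts's law (Shannon formulation): MT = a + b * log2(D / W + 1).
        The constants a and b are illustrative, not empirically measured."""
        return a + b * math.log2(distance / width + 1)

    # A tiny, distant target (e.g., a close button in a far corner) is predicted
    # to take noticeably longer to hit than a large, nearby one.
    print(fitts_movement_time(distance=800, width=16))   # ~0.77
    print(fitts_movement_time(distance=200, width=100))  # ~0.36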

Monday, September 8, 2008

Incentives for Active Users

Some of the most successful web sites today are social networks, such as Facebook and LinkedIn. These are not only popular web sites; they are also remarkably effective people search tools. For example, I can use LinkedIn to find the 163 people in my network who mention "information retrieval" in their profiles and live within 50 miles of my ZIP code (I can't promise you'll see the same results!).

A couple of observations about social networking sites (I'll focus on LinkedIn) are in order.

First, this functionality is a very big deal, and it's something Google, Yahoo, and Microsoft have not managed to provide, even though their own technology is largely built on a social network--citation ranking.

Second, the "secret sauce" for sites like LinkedIn is hardly their technology (a search engine built on Lucene and a good implementation of breadth-first search), but rather the way they have incented users to be active participants, in everything from virally marketing the site to their peers to inputting high-quality semi-structured profiles that make the site useful. In other words, active users ensure both the quantity and quality of information on the site.

Many people have noted the network effect that drove the runaway success of Microsoft Office and eBay. But I think that social networking sites are taking this idea further: users not only flock to the crowds, but also become personally invested in the success of the site generally, and especially in the quality and accuracy of their own personal information.

Enterprises need to learn from these consumer-oriented success stories. Some have already. For example, a couple of years ago, IBM established a Professional Marketplace, powered by Endeca, to maintain a skills and availability inventory of IBM employees. This effort was a runaway success, saving IBM $500M in its first year. But there's more: IBM employees have reacted to the success of the system by being more active in maintaining their own profiles. I spent the day with folks at the ACM, and they're seeing great uptake in their author profile pages.

I've argued before that there's no free lunch when it comes to enterprise search and information access. The good news, however, is that, if you create the right incentives, you can get other folks to happily pay for lunch.

Quick Bites: Taxonomy Directed Folksonomies

Props to Gwen Harris at Taxonomy Watch for posting a paper by Sarah Hayman and Nick Lothian on Taxonomy Directed Folksonomies.

The paper asks whether folksonomies and formal taxonomy can be used together and answers in the affirmative. The work is in the spirit of some of our recent work at Endeca to bootstrap from vocabularies (though not necessarily controlled vocabularies) to address the inconsistency and sparsity of tagging in folksonomies.

I'm personally excited to see the walls coming down between the two approaches, which many people seem to think of as mutually exclusive ways of tackling the tagging problem.

Sunday, September 7, 2008

Quick Bites: Is Search Really 90% Solved?

Props to Michael Arrington for calling out this snippet in an interview with Marissa Mayer, Google Vice President of Search Product and User Experience, on the occasion of Google's 10th birthday:
Search is an unsolved problem. We have a good 90 to 95% of the solution, but there is a lot to go in the remaining 10%.
I agree with Michael that search isn't even close to being solved yet. I've criticized the way many web search start-ups--and even the giants Yahoo and Microsoft--are going about trying to dethrone Google through incremental improvements or technologies that don't address any need that Google does not already adequately (if not optimally) address. But there is no lack of open problems in search for those ambitious enough to tackle them.

Quick Bites: Applying Turing's Ideas to Search

A colleague of mine at Endeca recently pointed me to a post by John Ferrara at Boxes and Arrows entitled Applying Turing's Ideas to Search.

One of the points he makes echoes the "computers aren't mind readers" theme I've been hammering at for a while:
If the user has not phrased her search clearly enough for another person to understand what she’s trying to find, then it’s not reasonable to expect that a comparatively "dumb" machine could do better. In a Turing test, the response to a question incomprehensible even to humans would prove nothing, because it wouldn’t provide any distinction between person and machine.
While I'm not convinced that search engine designers should be aspiring to pass the Turing test, I agree wholeheartedly with the vision John puts forward:
It describes an ideal form of human-computer interaction in which people express their information needs in their own words, and the system understands and responds to their requests as another human being would. During my usability test, it became clear that this was the very standard to which my test participants held search engines.
It's not about the search engine convincing the user that another human being is producing the answers, but rather engaging users in a conversation that helps them articulate and elaborate their information needs. Or, as we like to call it around here, HCIR.

Saturday, September 6, 2008

Migrating Soon

Just another reminder that I expect to migrate this blog to a hosted WordPress platform in the next days. If you have opinions about hosting platforms, please let me know by commenting here. Right now, I'm debating between DreamHost and GoDaddy, but I'm very open to suggestions.

I will do everything in my power to minimize disruption--I'm not sure how easy Blogger will make it to redirect users to the new site. I'll probably post here for a while after the move to try to direct traffic.

I do expect the new site to be under a domain name I've already reserved: http://thenoisychannel.com. It currently forwards to Blogger.

Back from the Endeca Government Summit

I spent Thursday at the Endeca Government Summit, where I had the privilege to chat face-to-face with some Noisy Channel readers. Mostly, I was there to learn more about the sorts of information seeking problems people are facing in the public sector in general, and in the intelligence agencies in particular.

While I can't go into much detail, the key concern was exploration of information availability. This problem is the antithesis of known-item search: rather than trying to retrieve information you know exists (and which you know how to specify), you are trying to determine whether there is information available that would help you with a particular task.

Despite being lost in a sea of TLAs, I came away with a deepened appreciation of both the problems the intelligence agencies are trying to address and the relevance of exploratory search approaches to those problems.

Thursday, September 4, 2008

Query Elaboration as a Dialogue

I ended my post on transparency in information retrieval with a teaser: if users aren't great at composing queries for set retrieval, which I argue is more transparent than ranked retrieval, then how will we ever deliver an information retrieval system that offers both usefulness and transparency?

The answer is that the system needs to help the user elaborate the query. Specifically, the process of composing a query should be a dialogue between the user and the system that allows the user to progressively articulate and explore an information need.

Those of you who have been reading this blog for a while or who are familiar with what I do at Endeca shouldn't be surprised to see dialogue as the punch line. But I want to emphasize that the dialogue I'm describing isn't just a back-and-forth between the user and the system. After all, there are query suggestion mechanisms that operate in the context of ranked retrieval algorithms--algorithms which do not offer the user transparency. While such mechanisms sometimes work, they risk doing more harm than good. Any interactive approach requires the user to do more work; if this added work does not result in added effectiveness, users will be frustrated.

That is why the dialogue has to be based on a transparent retrieval model--one where the system responds to queries in a way that is intuitive to users. Then, as users navigate in query space, transparency ensures that they can make informed choices about query refinement and thus make progress. I'm partial to set retrieval models, though I'm open to probabilistic ones. 

But of course we've just shifted the problem. How do we decide what query refinements to offer to a user in order to support this progressive refinement process? Stay tuned...
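
As a toy illustration of the mechanics of such a dialogue over a transparent, set-based model--leaving entirely open the harder question of which refinements to prioritize--here is a minimal sketch with invented documents and facet names that counts facet values within the current result set and offers them as candidate refinements:

    from collections import Counter

    # Invented documents with a couple of facets.
    documents = [
        {"title": "doc1", "topic": "databases", "year": 2007},
        {"title": "doc2", "topic": "databases", "year": 2008},
        {"title": "doc3", "topic": "retrieval", "year": 2008},
        {"title": "doc4", "topic": "retrieval", "year": 2008},
    ]

    def refinements(matching_docs, facet):
        """Count each value of a facet within the current matching set."""
        return Counter(doc[facet] for doc in matching_docs).most_common()

    # Suppose the current query matched all four documents; the counts tell the
    # user exactly what each candidate refinement would leave them with.
    print(refinements(documents, "topic"))  # [('databases', 2), ('retrieval', 2)]
    print(refinements(documents, "year"))   # [(2008, 3), (2007, 1)]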

Tuesday, September 2, 2008

Migrating to WordPress

Just a quick note to let folks know that I'll be migrating to WordPress in the next days. I'll make every effort to make the move seamless. I have secured the domain name http://thenoisychannel.com, which currently forwards to Blogger, but will shift to wherever the blog is hosted. I apologize in advance for any disruption.

Quick Bites: Google Chrome

For those of you who thought that no major technology news would come out during the Labor Day weekend, check out the prematurely released comic book hailing Google Chrome, Google's long-rumored entry into the browser wars. By the time you are reading this, the (Windows only) beta may even be available for download. The official Google announcement is here.

If the software lives up to the promise of the comic book, then Google may have a real shot of taking market share from IE and Firefox. More significantly, if they can supplant the operating system with the browser, then they'll have a much more credible opportunity to take on desktop software with their web-based applications.

Interestingly, even though all of the search blogs are reporting about Chrome, I haven't seen any analysis on what this might mean for web search.

Monday, September 1, 2008

Quick Bites: E-Discovery and Transparency

One change I'm thinking of making to this blog is to introduce "quick bites" as a way of mentioning interesting sites or articles I've come across without going into deep analysis. Here's a first one to give you a flavor of the concept. Let me know what you think.

I just read, by way of Curt Monash, an article on how courts will tolerate search inaccuracies in e-Discovery. It reminded me of our recent discussion of transparency in information retrieval. I agree that "explanations of [search] algorithms are of questionable value" for convincing a court of the relevance and accuracy of the results. But that's because those algorithms aren't sufficiently intuitive for those explanations to be meaningful except in a theoretical sense to an information retrieval researcher.

I realize that user-entered Boolean queries (the traditional approach to e-Discovery) aren't effective because users aren't great at composing queries for set retrieval. But that's why machines need to help users with query elaboration--a topic for an upcoming post.

POLL: Blogging Platform

I've gotten a fair amount of feedback suggesting that I switch blogging platforms. Since I'd plan to make such changes infrequently, I'd like to get input from readers before doing so, especially since migration may have hiccups.

I've just posted a poll on the home page to ask if folks here have a preference as to which blogging platform I use. Please vote this week, and feel free to post comments here.

Friday, August 29, 2008

Improving The Noisy Channel: A Call for Ideas

Over the past five months, this blog has grown from a suggestion Jeff Dalton put in my ear to a community to which I'm proud to belong.

Some milestones:
  • Over 70 posts to date.
  • 94 subscribers, as reported by Google Reader.
  • 100 unique visitors on a typical day.
To be honest, I thought I'd struggle to keep up with posting weekly, and that I'd need to convince my mom to read this blog so that I wouldn't be speaking to an empty room. The results so far have wildly exceeded the expectations I came in with.

But now that I've seen the potential of this blog, I'd like to "take it to the next level," as the MBA types say.

My goals:
  • Increase the readership. My motive isn't (only) to inflate my own ego. I've seen that this blog succeeds most when it stimulates conversation, and a conversation needs participants.

  • Increase participation. Given the quantity and quality of comments on recent posts, it's clear that readers here contribute the most valuable content. I'd like to step that up a notch by having readers guest-blog, and perhaps go as far as turning The Noisy Channel into a group blog about information seeking that transcends my personal take on the subject. I'm very open to suggestions here.

  • Add some style. Various folks have offered suggestions for improving the blog, such as changing platforms to WordPress, modifying the layout to better use screen real estate, adding more images, etc. I'm the first to admit that I am not a designer, and I'd really appreciate ideas from you all on how to make this site more attractive and usable.
In short, I'm asking you to help me help you make The Noisy Channel a better and noisier place. Please post your comments here or email me if you'd prefer to make suggestions privately.

Wednesday, August 27, 2008

Transparency in Information Retrieval

It's been hard to find time to write another post while keeping up with the comment stream on my previous post about set retrieval! I'm very happy to see this level of interest, and I hope to continue catalyzing such discussions.

Today, I'd like to discuss transparency in the context of information retrieval. Transparency is an increasingly popular term these days in the context of search--perhaps not surprising, since users are finally starting to question the idea of search as a black box.

The idea of transparency is simple: users should know why a search engine returns a particular response to their query. Note the emphasis on "why" rather than "how". Most users don't care what algorithms a search engine uses to compute a response. What they do care about is how the engine ultimately "understood" their query--in other words, what question the engine thinks it's answering.

Some of you might find this description too anthropomorphic. But a recent study reported that most users expect search engines to read their minds--never mind that the general case goes beyond AI-complete (should we create a new class of ESP-complete problems?). But what frustrates users most is when a search engine not only fails to read their minds, but gives no indication of where the communication broke down, let alone how to fix it. In short, a failure to provide transparency.

What does this have to do with set retrieval vs. ranked retrieval? Plenty!

Set retrieval predates the Internet by a few decades, and was the first approach used to implement search engines. These search engines allowed users to enter queries by stringing together search terms with Boolean operators (AND, OR, etc.). Today, Boolean retrieval seems arcane, and most people see set retrieval as suitable for querying databases, rather than for querying search engines.

The biggest problem with set retrieval is that users find it extremely difficult to compose effective Boolean queries. Nonetheless, there is no question that set retrieval offers transparency: what you ask is what you get. And, if you prefer a particular sort order for your results, you can specify it.

In contrast, ranked retrieval makes it much easier for users to compose queries: users simply enter a few top-of-mind keywords. And for many use cases (in particular, known-item search), a state-of-the-art implementation of ranked retrieval yields results that are good enough.

But ranked retrieval approaches generally shed transparency. At best, they employ standard information retrieval models that, although published in all of their gory detail, are opaque to their users--who are unlikely to be SIGIR regulars. At worst, they employ secret, proprietary models, either to protect their competitive differentiation or to thwart spammers.

Either way, the only clues that most ranked retrieval engines provide to users are text snippets from the returned documents. Those snippets may validate the relevance of the results that are shown, but the user does not learn what distinguishes the top-ranked results from other documents that contain some or all of the query terms.

If the user is satisfied with one of the top results, then transparency is unlikely to even come up. Even if the selected result isn't optimal, users may do well to satisfice. But when the search engine fails to read the user's mind, transparency offers the best hope of recovery.

But, as I mentioned earlier, users aren't great at composing queries for set retrieval, which was how ranked retrieval became so popular in the first place despite its lack of transparency. How do we resolve this dilemma?

To be continued...

Sunday, August 24, 2008

Set Retrieval vs. Ranked Retrieval

After last week's post about a racially targeted web search engine, you'd think I'd avoid controversy for a while. To the contrary, I now feel bold enough to bring up what I have found to be my most controversial position within the information retrieval community: my preference for set retrieval over ranked retrieval.

This will be the first of several posts along this theme, so I'll start by introducing the terms.
  • In a ranked retrieval approach, the system responds to a search query by ranking all documents in the corpus based on its estimate of their relevance to the query.

  • In a set retrieval approach, the system partitions the corpus into two subsets of documents: those it considers relevant to the search query, and those it does not.
An information retrieval system can combine set retrieval and ranked retrieval by first determining a set of matching documents and then ranking the matching documents. Most industrial search engines, such as Google, take this approach, at least in principle. But, because the set of matching documents is typically much larger than the set of documents displayed to a user, these approaches are, in practice, ranked retrieval.
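
Here is a minimal sketch, over a three-document toy corpus, contrasting the two approaches and the combined approach just described (the scoring function is deliberately crude and purely illustrative):

    # Minimal sketch contrasting set retrieval and ranked retrieval over a toy
    # corpus, and the combined approach of ranking within the matching set.
    corpus = {
        "d1": "information retrieval evaluation",
        "d2": "user studies in information seeking",
        "d3": "cooking with seasonal vegetables",
    }

    def set_retrieve(query_terms):
        """Set retrieval: partition the corpus into matching and non-matching."""
        return {doc_id for doc_id, text in corpus.items()
                if all(term in text.split() for term in query_terms)}

    def score(doc_id, query_terms):
        """A crude relevance score: count of query term occurrences (illustrative)."""
        words = corpus[doc_id].split()
        return sum(words.count(term) for term in query_terms)

    def ranked_retrieve(query_terms):
        """Ranked retrieval: order the whole corpus by estimated relevance."""
        return sorted(corpus, key=lambda d: score(d, query_terms), reverse=True)

    def combined(query_terms):
        """First determine the matching set, then rank within it."""
        matches = set_retrieve(query_terms)
        return sorted(matches, key=lambda d: score(d, query_terms), reverse=True)

    print(set_retrieve(["information"]))     # {'d1', 'd2'}
    print(ranked_retrieve(["information"]))  # ['d1', 'd2', 'd3'] -- d3 still appears
    print(combined(["information"]))         # ['d1', 'd2']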

What is set retrieval in practice? In my view, a set retrieval approach satisfies two expectations:
  • The number of documents reported to match my search should be meaningful--or at least should be a meaningful estimate. More generally, any summary information reported about this set should be useful.

  • Displaying a random subset of the set of matching documents to the user should be a plausible behavior, even if it is not as good as displaying the top-ranked matches. In other words, relevance ranking should help distinguish more relevant results from less relevant results, rather than distinguishing relevant results from irrelevant results.
Despite its popularity, the ranked retrieval model suffers because it does not provide a clear split between relevant and irrelevant documents. This weakness makes it impossible to obtain even basic analysis of the query results, such as the number of relevant documents, let alone more sophisticated analysis, such as result quality. In contrast, a set retrieval model establishes a clear split between the documents that are in and out of the retrieved set. As a result, set retrieval models enable rich analysis of query results, which can then be applied to improve the user experience.

Saturday, August 23, 2008

Back from the Cone of Silence

Regular readers may have noticed the lack of posts this week. My apologies to anyone who was waiting by the RSS feed. Yesterday was the submission deadline for HCIR '08, which means that today is a new day! So please stay tuned for your regularly scheduled programming.

Saturday, August 16, 2008

Thinking Outside the Black Box

I was reading Techmeme today, and I noticed an LA Times article about RushmoreDrive, described on its About Us page as "a first-of-its-kind search engine for the Black community." My first reaction, blogged by others already, was that this idea was dumb and racist. In fact, it took some work to find positive commentary about RushmoreDrive.

But I've learned from the way the blogosphere handled the Cuil launch not to trust anyone who evaluates a search engine without having tried it, myself included. My wife and I have been the only white people at Amy Ruth's, and the service was as gracious as the chicken and waffles were delicious, so I decided I'd try my luck on a search engine not targeted at my racial profile.

The search quality is solid, comparable to that of Google, Yahoo, and Microsoft. In fact, the site looks a lot like a re-skinning (no pun intended) of Ask.com, a corporate sibling of IAC-owned RushmoreDrive. Like Ask.com, RushmoreDrive emphasizes search refinement through narrowing and broadening refinements.

What I find ironic is that the whole controversy about racial bias in relevance ranking reveals the much bigger problem--that relevance ranking should not be a black box (ok, maybe this time I'll take responsibility for the pun). I've been beating this drum at The Noisy Channel ever since I criticized Amit Singhal for Google's lack of transparency. I think that sites like RushmoreDrive are inevitable if search engines refuse to cede more control of search results to users.

I don't know how much information race provides as a prior to influence statistical ranking approaches, but I'm skeptical that the effects are useful or even noticeable beyond a few well-chosen examples. I'm more inclined to see RushmoreDrive as a marketing ploy by the folks at IAC--and perhaps a successful one. I doubt that Google is running scared, but I think this should be a wake-up call to folks who are convinced that personalized relevance ranking is the end goal of user experience for search engines.

Friday, August 15, 2008

New Information Retrieval Book Available Online

Props to Jeff Dalton for alerting me about the new book on information retrieval by Christopher Manning, Prabhakar Raghavan, and Hinrich Schütze. You can buy a hard copy, but you can also access it online for free at the book website.

Wednesday, August 13, 2008

David Huynh's Freebase Parallax

One of the perks of working in HCIR is that you get to meet some of the coolest people in academic and industrial research. I met David Huynh a few years ago, while he was a graduate student at MIT, working in the Haystack group and on the Simile project. You've probably seen some of his work: his Timeline project has been deployed all over the web.

Despite efforts by me and others to persuade David to stay in the Northeast, he went out west a few months ago to join Metaweb, a company with ambitions "to build a better infrastructure for the Web." While I (and others) am not persuaded by Freebase, Metaweb's "open database of the world’s information," I am happy to see that David is still doing great work.

I encourage you to check out David's latest project: Freebase Parallax. In it, he does something I've never seen outside Endeca (excepting David's earlier work on a Nested Faceted Browser): he allows you to navigate using the facets of multiple entity types, joining between sets of entities through their relationships. At Endeca, we call this "record relationship navigation"--we presented it at HCIR '07, showing how it can enable social navigation.

David includes a video where he eloquently demonstrates how Parallax works, and the interface is quite compelling. I'm not sure how well it scales with large data sets, but David's focus has been on interfaces rather than systems. My biggest complaint--which isn't David's fault--is that the Freebase content is a bit sparse. But his interface strikes me as a great fit for exploratory search.

Conversation with Seth Grimes

I had a great conversation with Intelligent Enterprise columnist Seth Grimes today. Apparently there's an upside to writing critical commentary on Google's aspirations in the enterprise!

One of the challenges in talking about enterprise search is that no one seems to agree on what it is. Indeed, as I've been discussing with Ryan Shaw, I use the term broadly to describe information access scenarios distinct from web search where an organization has some ownership or control of the content (in contrast to the somewhat adversarial relationship that web search companies have with the content they index). But I realize that many folks define enterprise search more narrowly to be a search box hooked up to the intranet.

Perhaps a better way to think about enterprise search is as a problem rather than solution. Many people expect a search box because they're familiar with searching the web using Google. I don't blame anyone for expecting that the same interface will work for enterprise information collections. Unfortunately, wishful thinking and clever advertising notwithstanding, it doesn't.

I've blogged about this subject from several different perspectives over the past weeks, so I'll refer recent readers to earlier posts on the subject rather than bore the regulars.

But I did want to mention a comment Seth made that I found particularly insightful. He defined enterprise search even more broadly than I do, suggesting that it encompassed any information seeking performed in the pursuit of enterprise-centric needs. In that context, he does see Google as the leader in enterprise search--not because of their enterprise offerings, but rather because of the web search they offer for free.

I'm not sure how I feel about his definition, but I think he raises a point that enterprise vendors often neglect. No matter how much information an enterprise controls, there will always be valuable information outside the enterprise. I find today's APIs to that information woefully inadequate; for example, I can't even choose a sort order through any of the web search APIs. But I am optimistic that those APIs will evolve, and that we will see "federated" information seeking that goes beyond merging ranked lists from different sources.

Indeed, I look forward to the day that web search providers take a cue from the enterprise and drop the focus on black box relevance ranking in favor of an approach that offers users control and interaction.

Monday, August 11, 2008

Position papers for NSF IS3 Workshop

I just wanted to let folks know that the position papers for the NSF Information Seeking Support Systems Workshop are now available at this link.

Here is a listing to whet your curiosity:
  • Supporting Interaction and Familiarity
    James Allan, University of Massachusetts Amherst, USA

  • From Web Search to Exploratory Search: Can we get there from here?
    Peter Anick, Yahoo! Inc., USA

  • Complex and Exploratory Web Search (with Daniel Russell)
    Anne Aula, Google, USA

  • Really Supporting Information Seeking: A Position Paper
    Nicholas J. Belkin, Rutgers University, USA

  • Transparent and User-Controllable Personalization For Information Exploration
    Peter Brusilovsky, University of Pittsburgh, USA

  • Faceted Exploratory Search Using the Relation Browser
    Robert Capra, UNC, USA

  • Towards a Model of Understanding Social Search
    Ed Chi, Palo Alto Research Center, USA

  • Building Blocks For Rapid Development of Information Seeking Support Systems
    Gary Geisler, University of Texas at Austin, USA

  • Collaborative Information Seeking in Electronic Environments
    Gene Golovchinsky, FX Palo Alto Laboratory, USA

  • NeoNote: User Centered Design Suggestions for a Global Shared Scholarly Annotation System
    Brad Hemminger, UNC, USA

  • Speaking the Same Language About Exploratory Information Seeking
    Bill Kules, The Catholic University of America, USA

  • Musings on Information Seeking Support Systems
    Michael Levi, U.S. Bureau of Labor Statistics, USA

  • Social Bookmarking and Information Seeking
    David Millen, IBM Research, USA

  • Making Sense of Search Result Pages
    Jan Pedersen, Yahoo, USA

  • A Multilevel Science of Social Information Foraging and Sensemaking
    Peter Pirolli, Xerox PARC, USA

  • Characterizing, Supporting and Evaluating Exploratory Search
    Edie Rasmussen, University of British Columbia, Canada

  • The Information-Seeking Funnel
    Daniel Rose, A9.com, USA

  • Complex and Exploratory Web Search (with Anne Aula)
    Daniel Russell, Google, USA

  • Research Agenda: Visual Overviews for Exploratory Search
    Ben Shneiderman, University of Maryland, USA

  • Five Challenges for Research to Support IS3
    Elaine Toms, Dalhousie University, Canada

  • Resolving the Battle Royale between Information Retrieval and Information Science
    Daniel Tunkelang, Endeca, USA

Sunday, August 10, 2008

Why Enterprise Search Will Never Be Google-y

As I prepared to end my trilogy of Google-themed posts, I ran into two recently published items. They provide an excellent context for what I intended to talk about: the challenges and opportunities of enterprise search.

The first is Google's announcement of an upgrade to their search appliance that allows one box to index 10 million documents and offers improved search quality and personalization.

The second is an article by Chris Sherman in the Enterprise Search Sourcebook 2008 entitled Why Enterprise Search Will Never Be Google-y.

First, the Google announcement. These are certainly improvements for the GSA, and Google does seem to be aiming to compete with the Big Three: Autonomy, Endeca, and FAST (now a subsidiary of Microsoft). But these improvements should be seen in the context of the state of the art. In particular, Google's scalability claims, while impressive, still fall short of the market leaders in enterprise search. Moreover, the bottleneck in enterprise search hasn't been the scale of document indexing, but rather the effectiveness with which people can access and interact with the indexed content. Interestingly, Google's strongest selling point for the GSA, the claim that it works "out of the box," is also its biggest weakness: even with the new set of features, the GSA does not offer the flexibility or rich functionality that enterprises have come to expect.

Second, the Chris Sherman piece. Here is an excerpt:
Enterprise search and web search are fundamentally different animals, and I'd argue that enterprise search won't--and shouldn't--be Google-y any time soon....Like web search, Google's enterprise search is easy to use--if you're willing to go along with how Google's algorithms view and present your business information....Ironically, enterprises, with all of their highly structured and carefully organized silos of information, require a very different and paradoxically more complex approach.
I highly recommend you read the whole article (it's only 2 pages), not only because it is informative and well written, but also because the author isn't working for one of the Big Three.

The upshot? There is no question that Google is raising the bar for simple search in the enterprise. I wouldn't recommend that anyone try to compete with the GSA on its turf.

But information needs in the enterprise go far beyond known-item search. What enterprises want when they ask for "enterprise search" is not just a search box, but an interactive tool that helps them (or their customers) work through the process of articulating and fulfilling their information needs, for tasks as diverse as customer segmentation, knowledge management, and e-discovery.

If you're interested in search and want to be on the cutting edge of innovation, I suggest you think about the enterprise.

Thursday, August 7, 2008

Where Google Isn't Good Enough

My last post, Is Google Good Enough?, challenged would-be Google killers to identify and address clear consumer needs for which Google isn't good enough as a solution. I like helping my readers, so here are some ideas.
  • Shopping. Google Product Search (fka Froogle) is not one of Google's crown jewels. At best, it works well when you know the exact name of the product you are looking for. But it pales in comparison to any modern ecommerce site, such as Amazon or Home Depot. What makes a shopping site successful? Put simply, it helps users find what they want, even when they didn't know exactly what they wanted when they started.

  • Finding a job. Google has not thrown its hat into the ring of job search, and even the page they offer for finding jobs at Google could use some improvement. The two biggest job sites, Monster and Careerbuilder, succeed in terms of the number of jobs posted, but aren't exactly optimized for user experience. Dice does better, but only for technology jobs. Interestingly, the best job finding site may be LinkedIn--not because of their search implementation (which is adequate but not innovative), but because of their success in getting millions of professionals to provide high-quality data.

  • Finding employees. Again, LinkedIn has probably come closest to providing a good employee finding site. The large job sites (all of which I've used at some point) not only fail to support exploratory search, but also suffer from a skew towards ineligible candidates and a nuisance of recruiters posing as job seekers. Here again, Google has not tried to compete.

  • Planning a trip. Sure, you can use Expedia, Travelocity, or Kayak to find a flight, hotel, and car rental. But there's a lot of room for improvement when it comes to planning a trip, whether for business or pleasure. The existing tools do a poor job of putting together a coordinated itinerary (e.g., meals, activities), and also don't integrate with relevant information sources, such as local directories and reviews. This is another area where Google has not tried to play.
Note two general themes here. The first is thinking beyond the mechanics of search and focusing on the ability to meet user needs at the task level. The second is the need for exploratory search. These only scratch the surface of opportunities in consumer-facing "search" applications. The opportunities within the enterprise are even greater, but I'll save that for my next post.

Tuesday, August 5, 2008

Is Google Good Enough?

As Chief Scientist of Endeca, I spend a lot of my time explaining to people why they should not be satisfied with an information seeking interface that only offers them keyword search as an input mechanism and a ranked list of results as output. I tell them about query clarification dialogs, faceted navigation, and set analysis. More broadly, I evangelize exploratory search and human computer information retrieval as critical to addressing the inherent weakness of conventional ranked retrieval. If you haven't heard me expound on the subject, feel free to check out my slide show, Is Search Broken?

But today I wanted to put my ideology aside and ask the simple question: Is Google good enough? Here is a good-faith attempt to make the case for the status quo. I'll focus on web search, since, as I've discussed before on this blog, enterprise search is different.

1) Google does well enough on result quality, enough of the time.

While Google doesn't publish statistics about user satisfaction, it's widely accepted that Google usually succeeds in returning results that users find relevant. Granted, so do all of the major search engines: you can compare Google and Yahoo graphically at this site. But the question is not whether other search engines are also good enough--or even whether they are better. The point is that Google is good enough.

2) Google doesn't support exploratory search. But it often leads you to a tool that does.

The classic instance of this synergy is when Google leads you to a Wikipedia entry. For example, I look up Daniel Kahneman on Google. The top result is his Wikipedia entry. From there, I can traverse links to learn about his research areas, his colleagues, etc.

3) Google is a benign monopoly that mitigates choice overload.

Many people, myself included, have concerns about Google's increasing role in mediating our access to information. But it's hard to ignore the upside of a single portal that gives you access to everything in one place: web pages, blogs, maps, email, etc. And it's all "free"--at least in so far as ad-supported services can be said to be free.

In summary, Google sets the bar pretty high. There are places where Google performs poorly (e.g., shopping) or doesn't even try to compete (e.g., travel). But when I see the series of companies lining up to challenge Google, I have to wonder how many of them have identified and addressed clear consumer needs for which Google isn't good enough as a solution. Given Google's near-monopoly in web search, parity or even incremental advantage isn't enough.

Monday, July 28, 2008

Not as Cuil as I Expected

Today's big tech news is the launch of Cuil, the latest challenger to Google's hegemony in Web search. Given the impressive team of Xooglers that put it together, I had high expectations for the launch.

My overall reaction: not bad, but not good enough to take seriously as a challenge to Google. They may be "The World's Biggest Search Engine" based on the number of pages indexed, but they return zero results for a number of queries where Google does just fine, including noisy channel blog (compare to Google). But I'm not taking it personally--after all, their own site doesn't show up when you search for their name (again, compare to Google). As for their interface features (column display, explore by category, query suggestions), they're fine, but neither the concepts nor the quality of their implementation strike me as revolutionary.

Perhaps I'm expecting too much on day 1. But they're not just trying to beat Gigablast; they're trying to beat Google, and they surely expected to get lots of critical attention the moment they launched. Regardless of the improvements they've made in indexing, they clearly need to do more work on their crawler. It's hard to judge the quality of results when it's clear that at least some of the problem is that the most relevant documents simply aren't in their index. I'm also surprised to not see Wikipedia documents showing up much for my searches--particularly for searches when I'm quite sure the most relevant document is in Wikipedia. Again, it's hard to tell if this is an indexing or results quality issue.

I wish them luck--I speak for many in my desire to see Google face worthy competition in web search.

Sunday, July 27, 2008

Catching up on SIGIR '08

Now that SIGIR '08 is over, I hope to see more folks blogging about it. I'm jealous of everyone who had the opportunity to attend, not only because of the culinary delights of Singapore, but because the program seems to reflect an increasing interest of the academic community in real-world IR problems.

Some notes from looking over the proceedings:
  • Of the 27 paper sessions, 2 include the word "user" in their titles, 2 include the word "social", 2 focus on Query Analysis & Models, and 1 is about exploratory search. Compared to the last few SIGIR conferences, this is a significant increase in focus on users and interaction.

  • A paper on whether test collections predict users' effectiveness offers an admirable defense of the Cranfield paradigm, much along the lines I've been advocating.

  • A nice paper from Microsoft Research looks at the problem of whether to personalize results for a query, recognizing that not all queries benefit from personalization. This approach may well be able to reap the benefits of personalization while avoiding much of its harm.

  • Two papers on tag prediction: Real-time Automatic Tag Recommendation (ACM Digital Library subscription required) and Social Tag Prediction. Semi-automated tagging tools are one of the best ways to leverage the best of both human and machine capabilities.
And I haven't even gotten to the posters! I'm sad to see that they dropped the industry day, but perhaps they'll bring it back next year in Boston.

Wednesday, July 23, 2008

Knol: Google takes on Wikipedia

Just a few days ago, I was commenting, apropos of a New York Times article about Wikipedia's new approval system, that the biggest problem with Wikipedia is anonymous authorship. By synchronous coincidence, Google unveiled Knol today, which is something of a cross between Wikipedia and Squidoo. Its most salient feature is that each entry will have a clearly identified author. They even allow authors to verify their identities using credit cards or phone directories.

It's a nice idea, since anonymous authorship is a major factor in the adversarial nature of information retrieval on the web. Not only does the accountability of authorship inhibit vandalism and edit wars, but it also allows readers to decide for themselves whom to trust--at least to the extent that readers are able and willing to obtain reliable information about the authors. Without question, they are addressing Wikipedia's biggest weakness.

But it's too little, too late. Wikipedia is already there. And, despite complaints about its inaccuracy and bias, Wikipedia is a fantastic, highly utilized resource. The only way I see for Knol to supplant Wikipedia in a reasonable time frame is through a massive cut-and-paste to make up for the huge difference in content.

Interestingly, Wikipedia does not seem to place any onerous restrictions on verbatim copying. However, unless a single author is 100% responsible for authoring a Wikipedia entry, it isn't clear that anyone can simply copy the entry into Knol.

I know that it's dangerous to bet against Google. But I'm really skeptical about this latest effort. It's a pity, because I think their emphasis is the right one. But for once I wish they'd been a bit more humble and accepted that they aren't going to build a better Wikipedia from scratch.

Saturday, July 19, 2008

Predictably Irrational

As regular readers have surely noticed by now, I've been on a bit of a behavioral psychology kick lately. Some of this reflects long-standing personal interest and my latest reading. But I also feel increasingly concerned that researchers in information seeking--especially those working on tools--have neglected the impact of cognitive bias.

For those who are unfamiliar with the last few decades of research in this field, I highly recommend a recent lecture by behavioral economist Dan Ariely on predictable irrationality. Not only is he a very informative and entertaining speaker, but he chooses very concrete and credible examples, starting with his contemplating how we experience pain, based on his own experience of suffering third-degree burns over 70 percent of his body. I promise you, the lecture is an hour well spent, and the time will fly by.

A running theme through this and my other posts on cognitive bias is that the way information is presented to us has dramatic effects on how we interpret that information.

This is great news for anyone who wants to manipulate people. In fact, I once asked Dan about the relative importance of people's inherent preferences vs. those induced by presentation on retail web sites, and he all but dismissed the former (i.e., you can sell ice cubes to Eskimos, if you can manipulate their cognitive biases appropriately). But it's sobering news for those of us who want to empower users to evaluate information objectively to support decision making.

Friday, July 18, 2008

Call to Action - A Follow-Up

The call to action I sent out a couple of weeks ago has generated healthy interest.

One of the several people who responded is the CTO of one of Endeca's competitors, whom I laud for understanding that the need to better articulate and communicate the technology of information access transcends competition among vendors. While we have differences on how to achieve this goal, I at least see hope from his responsiveness.

The rest were analysts representing some of the leading firms in the space. They not only expressed interest, but also contributed their own ideas on how to make this effort successful. Indeed, I met with two analysts this week to discuss next steps.

Here is where I see this going.

In order for any efforts to communicate the technology of information access to be effective, the forum has to establish credibility as a vendor-neutral and analyst-neutral forum. Ideally, that means having at least two major vendors and two major analysts on board. What we want to avoid is having only one major vendor or analyst, since that will create a reasonable perception of bias.

I'd also like to involve academics in information retrieval and library and information science. As one of the analysts suggested, we could reach out to the leading iSchools, who have expressed an open interest in engaging the broader community.

What I'd like to see come together is a forum, probably a one-day workshop, that brings together credible representatives from the vendor, analyst, and academic communities. With a critical mass of participants and enough diversity to assuage concerns of bias, we can start making good on this call to action.

Tuesday, July 15, 2008

Beyond a Reasonable Doubt

In Psychology of Intelligence Analysis, Richards Heuer advocates that we quantify expressions of uncertainty: "To avoid ambiguity, insert an odds ratio or probability range in parentheses after expressions of uncertainty in key judgments."

His suggestion reminds me of my pet peeve about the unquantified notion of reasonable doubt in the American justice system. I've always wanted (but never had the opportunity) to ask a judge what probability of innocence constitutes a reasonable doubt.

Unfortunately, as Heuer himself notes elsewhere in his book, we human beings are really bad at estimating probabilities. I suspect (with a confidence of 90 to 95%) that quantifying our uncertainties as probability ranges will only create a false sense of precision.

So, what can we do to better communicate uncertainty? Here are a couple of thoughts:
  • We can calibrate estimates based on past performance (see the sketch after this list). It's unclear what will happen if people realize that their estimates are being translated, but, at worst, it feels like good fodder for research in judgment and decision making.

  • We can ask people to express relative probability judgments. While these are also susceptible to bias, at least they don't demand as much precision. And we can always vary the framing of questions to try to factor out the cognitive biases they induce.
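As a sketch of the first idea, suppose we keep a log of past judgments, each recorded as a stated probability and an eventual outcome; we can then translate a new stated probability into the empirical hit rate of similar past claims. The history and the bucketing scheme below are invented for illustration, not a prescribed method.

```python
# A minimal sketch, assuming we keep a log of past judgments as (stated probability,
# eventual outcome) pairs, of translating a new stated probability into the empirical
# hit rate of similar past claims. The history and bucketing are invented illustrations.
from collections import defaultdict

history = [(0.9, True), (0.9, True), (0.9, False), (0.7, True),
           (0.7, False), (0.7, False), (0.5, True), (0.5, False)]

def calibrated(stated_probability):
    """Map a stated probability to the historical accuracy of similar statements."""
    buckets = defaultdict(list)
    for p, correct in history:
        buckets[round(p, 1)].append(correct)
    outcomes = buckets.get(round(stated_probability, 1))
    if not outcomes:
        return stated_probability  # no history for this bucket: take it at face value
    return sum(outcomes) / len(outcomes)

# An analyst who says "90%" has historically been right about two times in three.
print(calibrated(0.9))  # 0.666...
```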
Also, when we talk about uncertainty, it is important to distinguish between aleatory and epistemic uncertainty.

When I flip a coin, I am certain it has a 50% chance of landing heads, because I know the probability distribution of the event space. This is aleatory uncertainty, and forms the basis of probability and statistics.

But when I reason about less contrived uncertain events, such as estimating the likelihood that my bank will collapse this year, the challenge is my ignorance of the probability distribution. This is epistemic uncertainty, and it's a lot messier.

If you'd like to learn more about aleatory and epistemic uncertainty, I recommend Nassim Nicholas Taleb's Fooled by Randomness (which is a better read than his better-known Black Swan).

In summary, we have to accept the bad news that the real world is messy. As a mathematician and computer scientist, I've learned to pursue theoretical rigor as an ideal. Like me, you may find it very disconcerting to not be able to treat all real-world uncertainty in terms of probability spaces. Tell it to the judge!

Sunday, July 13, 2008

Small is Beautiful

Today's New York Times has an article by John Markoff called On a Small Screen, Just the Salient Stuff. It argues that the design constraints of the iPhone (and of mobile devices in general) lead to an improved user experience, since site designers do a better job of focusing on the information that users will find relevant.

Of course, on a blog entitled The Noisy Channel, I can't help praising approaches that strive to improve the signal-to-noise ratio in information seeking applications. And I'm glad to see them quoting Ben Shneiderman, a colleague of mine at the University of Maryland who has spent much of his career focusing on HCIR issues.

Still, I think they could have taken the idea much further. Their discussion of more efficient or ergonomic use of real estate boils down to stripping extraneous content (a good idea, but hardly novel), and making sites vertically oriented (i.e., no horizontal scrolling). They don't consider the question of what information is best to present in the limited space--which, in my mind, is the most important question to consider as we optimize interaction. Indeed, many of the questions raised by small screens also apply to other interfaces, such as voice.

Perhaps I am asking too much to expect them to call out the extreme inefficiency of ranked lists, compared to summarization-oriented approaches. Certainly the mobile space opens great opportunities for someone to get this right on the web.

Friday, July 11, 2008

Psychology of Intelligence Analysis

In the course of working with some of Endeca's more interesting clients, I started reading up on how the intelligence agencies address the challenges of making decisions, especially in the face of incomplete and contradictory evidence. I ran into a book called Psychology of Intelligence Analysis by former CIA analyst Richards Heuer. The entire book is available online, or you can hunt down a hard copy of the out-of-print book from your favorite used book seller.

Given the mixed record of the intelligence agencies over the past few decades, you might be wondering if the CIA is the best source for learning how to analyze intelligence. But this book is a gem. Even if the agencies don't always practice what they preach (and the book makes a good case as to why), the book is an excellent tour through the literature on judgment and decision making.

If you're already familiar with work by Herb Simon, Danny Kahneman, and Amos Tversky, then a lot of the ground he covers will be familiar--especially the third of the book that enumerates cognitive biases. I'm a big fan of the judgment and decision making literature myself. But I still found some great nuggets, particularly Chapter 8 on Analysis of Competing Hypotheses. Unlike most of the literature that focuses exclusively on demonstrating our systematic departures from rationality, Heuer hopes to offer at least some constructive advice.

As someone who builds tools to help people make decisions using information that not only may be incomplete and contradictory, but also challenging to find in the first place, I'm very sensitive to how people's cognitive biases affect their ability to use these tools effectively. One of the HCIR '07 presentations by Jolie Martin and Michael Norton (who have worked with Max Bazerman) showed how the manner in which information was partitioned on retail web sites drove decisions, i.e., re-organizing the same information affected consumers' decision processes.

It may be tempting for us on the software side to wash our hands of our users' cognitive biases. But such an approach would be short-sighted. As Heuer shows in his well-researched book, people not only have cognitive biases, but are unable to counter those biases simply by being made aware of them. Hence, if software tools are to help people make effective decisions, it is the job of us tool builders to build with those biases in mind, and to support processes like Analysis of Competing Hypotheses that try to compensate for human bias.

Thursday, July 10, 2008

Nice Selection of Machine Learning Papers

John Langford just posted a list of seven ICML '08 papers that he found interesting. I appreciate his taste in papers, and I particularly liked a paper on Learning Diverse Rankings with Multi-Armed Bandits that addresses learning a diverse ranking of documents based on users' clicking behavior. If you liked the Less is More work that Harr Chen and David Karger presented at SIGIR '06, then I recommend you check this one out.
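For readers who want the flavor of the approach, here is a simplified epsilon-greedy sketch of the ranked-bandit idea--one bandit per rank position, each learning from simulated clicks--with invented documents and a toy user model. It illustrates the general technique, not the actual algorithm from the ICML '08 paper.

```python
# A simplified epsilon-greedy sketch of the ranked-bandit idea: one bandit per rank
# position, each learning from (simulated) clicks, so that the learned ranking ends up
# covering different user intents. Invented documents and user model; this illustrates
# the general technique, not the algorithm from the ICML '08 paper.
import random

docs = ["d1", "d2", "d3", "d4"]
k = 2          # length of the ranking we present
epsilon = 0.1  # exploration rate
value = [{d: 0.0 for d in docs} for _ in range(k)]   # estimated click value per (rank, doc)
counts = [{d: 0 for d in docs} for _ in range(k)]

def first_click(ranking, wanted):
    """Simulated user: clicks the wanted document if it appears, else no click."""
    for pos, doc in enumerate(ranking):
        if doc == wanted:
            return pos
    return None

for _ in range(2000):
    wanted = random.choice(["d1", "d3"])  # two user populations with different intents
    ranking = []
    for i in range(k):
        candidates = [d for d in docs if d not in ranking]
        if random.random() < epsilon:
            ranking.append(random.choice(candidates))
        else:
            ranking.append(max(candidates, key=lambda d: value[i][d]))
    clicked = first_click(ranking, wanted)
    for i, doc in enumerate(ranking):
        reward = 1.0 if clicked == i else 0.0
        counts[i][doc] += 1
        value[i][doc] += (reward - value[i][doc]) / counts[i][doc]  # incremental mean

greedy = []
for i in range(k):
    candidates = [d for d in docs if d not in greedy]
    greedy.append(max(candidates, key=lambda d: value[i][d]))
print(greedy)  # typically covers both intents, e.g. ['d1', 'd3'] or ['d3', 'd1']
```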

Tuesday, July 8, 2008

Librarian 2.0

Many of the words that mark milestones in the history of technology, such as calculator and word processor, originally corresponded to people. Calculating had at least two lives as a technology breakthrough--first as a process, and then as an automatic means for executing that process. Thanks to inventions like calculators and computers, human beings have moved up the value chain to become scientists and engineers who take low-level details for granted.

Similarly, the advances in information science and retrieval have dramatically changed the role of a reference librarian.

Hopefully some of you are old enough to remember card catalogs. They were certainly functional if you knew the exact title or author you were looking for, assuming the title wasn't too generic or the author too prolific. Where card catalogs fell short was in supporting exploratory search. In many cases, your best bet was quite literally to explore the stacks and hope that locality within the Dewey Decimal system sufficed to support your information seeking needs. Alternatively, you could follow citation paths--the dead-tree precursor of surfing a hypertext collection.

For exploratory tasks, library patrons would turn to reference librarians, who would clarify the patrons' needs through a process called the reference interview. According to Wikipedia:
A reference interview is composed of two segments:

1. An initial segment in which the librarian encourages the user to fully discuss the request.
2. A final segment in which the librarian asks questions to relate the request to the materials available in the library

A reference interview is structured (ideally) according to the following series of steps. First the library user states a question or describes a problem. The librarian then clarifies the user's information need, sometimes leading him or her back from a request for a specific resource (which may not be the best one for the problem at hand) to the actual information need as it manifests in the library user's life. Following that, the librarian suggests information resources that address the user's information need, explaining the nature and scope of information they contain and soliciting feedback. The reference interview closes when the librarian has provided the appropriate information or a referral to an outside resource where it can be found, and the user confirms that he or she has received the information needed.
Fast forward to the present day. Thanks to modern search engines, title and author search are no longer tedious processes. Moreover, search engines are somewhat forgiving of users, offering spelling correction and inexact query matching. Libraries are still catching up with advances in technology, but the evolution is clearly under way.

However, search engines have not obviated the need for a reference interview. Excepting the simple cases of known item search, the typical information seeker needs help translating an information need into one or more search queries. And that information need may change as the seeker learns from the process.

But it should come as no surprise that information seeking support systems need to be more than search engines. The ideal information seeking support system emulates a reference librarian, stepping users through a structured process of clarification. Indeed, this is exactly what my colleagues and I at Endeca are trying to do in our work with libraries and more broadly in pursuing a vision of human computer information retrieval.

What then becomes of librarians? Much as calculators and computers did not obviate the need for mathematicians, I don't see technology obviating the need for information scientists. Library schools have already evolved into information schools, and I have no doubt that their graduates will help establish the next generation of information seeking technology that makes today's search engines seem as quaint as card catalogs.

Sunday, July 6, 2008

Resolving the Battle Royale between Information Retrieval and Information Science

The following is the position paper I submitted to the NSF Information Seeking Support Systems Workshop last month. The workshop report is still being assembled, but I wanted to share my own contribution to the discussion, since it is particularly appropriate to the themes of The Noisy Channel.


Resolving the Battle Royale between Information Retrieval and Information Science


Daniel Tunkelang

Endeca

ABSTRACT

We propose an approach to help resolve the “battle royale” between the information retrieval and information science communities. The information retrieval side favors the Cranfield paradigm of batch evaluation, criticized by the information science side for its neglect of the user. The information science side favors user studies, criticized by the information retrieval side for their scale and repeatability challenges. Our approach aims to satisfy the primary concerns of both sides.

Categories and Subject Descriptors

H.1.2 [Human Factors]: Human information processing.

H.3.3 [Information Systems]: Information Search and Retrieval - Information Filtering, Retrieval Models

H.5.2 [Information Systems]: Information Interfaces and Presentation - User Interfaces

General Terms

Design, Experimentation, Human Factors

Keywords

Information science, information retrieval, information seeking, evaluation, user studies

1. INTRODUCTION

Over the past few decades, a growing community of researchers has called for the information retrieval community to think outside the Cranfield box. Perhaps the most vocal advocate is Nick Belkin, whose "grand challenges" in his keynote at the 2008 European Conference on Information Retrieval [1] all pertained to the interactive nature of information seeking he claims the Cranfield approach neglects. Belkin cited similar calls to action going back as far as Karen Spärck Jones, in her 1988 acceptance speech for the Gerald Salton award [2], and again from Tefko Saracevic, when he received the same award in 1997 [3]. More recently, we have the Information Seeking and Retrieval research program proposed by Peter Ingwersen and Kalervo Järvelin in The Turn, published in 2005 [4].

2. IMPASSE BETWEEN IR AND IS

Given the advocacy of Belkin and others, why hasn't there been more progress? As Ellen Voorhees noted in defense of Cranfield at the 2006 Workshop on Adaptive Information Retrieval, "changing the abstraction slightly to include just a bit more characterization of the user will result in a dramatic loss of power or increase in cost of retrieval experiments" [5]. Despite user studies that have sought to challenge the Cranfield emphasis on batch information retrieval measures like mean average precision—such as those of Andrew Turpin and Bill Hersh [6]—the information retrieval community, on the whole, remains unconvinced by these experiments because they are smaller in scale and less repeatable than the TREC evaluations.

As Tefko Saracevic has said, there is a "battle royale" between the information retrieval community, which favors the Cranfield paradigm of batch evaluation despite its neglect of the user, and the information science community, which favors user studies despite their scale and repeatability challenges [7]. How do we move forward?

3. PRIMARY CONCERNS OF IR AND IS

Both sides have compelling arguments. If an evaluation procedure is not repeatable and cost-effective, it has little practical value. Nonetheless, it is essential that an evaluation procedure measure the interactive nature of information seeking.

If we are to find common ground to resolve this dispute, we need to satisfy the primary concerns of both sides:

· Real information seeking tasks are interactive, so the results of the evaluation procedure must be meaningful in an interactive context.

· The evaluation procedure must be repeatable and cost-effective.

In order to move beyond the battle royale and resolve the impasse between the IR and IS communities, we need to address both of these concerns.

4. PROPOSED APPROACH


A key point of contention in the battle royale is whether we should evaluate systems by studying individual users or measuring system performance against test collections.

The short answer is that we need to do both. In order to ground the results of evaluation in realistic contexts, we need to conduct user studies that relate proposed measures to success in interactive information seeking tasks. Otherwise, we optimize under the artificial constraint that a task involves only a single user query.

Such an approach presumes that we have a characterization of information seeking tasks. This characterization is an open problem that is beyond the scope of this position paper but has been addressed by other information seeking researchers, including Ingwersen and Järvelin [4]. We presume access to a set of tasks that, if not exhaustive, at least applies to a valuable subset of real information seeking problems.

Consider, as a concrete example, the task of a researcher who, given a comprehensive digital library of technical publications, wants to determine with confidence whether his or her idea is novel. In other words, the researcher wants either to discover prior art that anticipates the idea, or to state with confidence that there is no such art. Patent inventors and lawyers performing e-discovery perform analogous tasks. We can measure task performance objectively as a combination of accuracy and efficiency, and we can also consider subjective measures like user confidence and satisfaction. Let us assume that we are able to quantify a task success measure that incorporates these factors.

Given this task and success measure, we would like to know how well an information retrieval system supports the user performing it. As the information scientists correctly argue, user studies are indispensable. But, as we employ user studies to determine which systems are most helpful to users, we need to go a step further and correlate user success to one or more system measures. We can then evaluate these system measures in a repeatable, cost-effective process that does not require user involvement.

For example, let us hypothesize that mean average precision (MAP) on a given TREC collection is such a measure. We hypothesize that users pursuing the prior art search task are more successful using a system with higher MAP than those using a system with lower MAP. In order to test this hypothesis, we can present users with a family of systems that, insofar as possible, vary only in MAP, and see how well user success correlates to the system’s MAP. If the correlation is strong, then we validate the utility of MAP as a system measure and invest in evaluating systems using MAP against the specified collection in order to predict their utility for the prior art task.
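As a concrete illustration of that validation step, here is a minimal sketch that computes MAP for a few hypothetical systems over a toy collection and then checks how strongly those MAP scores correlate with assumed user-success scores; every number below is an invented placeholder, not a real TREC run or user study.

```python
# A minimal sketch of the proposed validation: compute MAP for several hypothetical
# systems over a toy collection, then check how strongly MAP correlates with an assumed
# user-success measure for the prior-art task. All numbers are invented placeholders.
from math import sqrt

def average_precision(ranking, relevant):
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(runs, judgments):
    return sum(average_precision(runs[q], rel) for q, rel in judgments.items()) / len(judgments)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

judgments = {"q1": {"d1", "d3"}, "q2": {"d2"}}  # relevance judgments per query
systems = {                                      # ranked results per system
    "A": {"q1": ["d1", "d3", "d2"], "q2": ["d2", "d1", "d3"]},
    "B": {"q1": ["d2", "d1", "d3"], "q2": ["d1", "d2", "d3"]},
    "C": {"q1": ["d3", "d2", "d1"], "q2": ["d3", "d1", "d2"]},
}
user_success = {"A": 0.85, "B": 0.60, "C": 0.55}  # hypothetical task success per system

maps = {name: mean_average_precision(runs, judgments) for name, runs in systems.items()}
names = sorted(systems)
print(maps)
print("MAP vs. user success correlation:",
      pearson([maps[n] for n in names], [user_success[n] for n in names]))
```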

The principle here is a general one, and can even be used not only to compare different algorithms, but also to evaluate more sophisticated interfaces, such as document clustering [8] or faceted search [9]. The only requirement is that we hypothesize and validate system measures that correlate to user success.

5. WEAKNESSES OF APPROACH

Our proposed approach has two major weaknesses.

The first weakness is that, in a realistic interactive information retrieval context, distinct queries are not independent. Rather, a typical user executes a sequence of queries in pursuit of an information need, each query informed by the results of the previous ones.

In a batch test, we must decide the query sequence in advance, and cannot model how the user’s queries depend on system response. Hence, we are limited to computing measures that can be evaluated for each query independently. Nonetheless, we can choose measures which correlate to effectiveness in realistic settings. Hopefully these measures are still meaningful, even when we remove the test queries from their realistic context.

The second weakness is that we do not envision a way to compare different interfaces in a batch setting. It seems that testing the relative merits of different interfaces requires real—or at least simulated—users.

If, however, we hold the interface constant, then we can define performance measures that apply to those interfaces. For example, we can develop standardized versions of well-studied interfaces, such as faceted search and clustering. We can then compare the performance of different systems that use these interfaces, e.g., different clustering algorithms.

6. AN ALTERNATIVE APPROACH

An alternative way to tackle the evaluation problem leverages the “human computation” approach championed by Luis Von Ahn [10]. This approach uses “games with a purpose” to motivate people to perform information-related tasks, such as image tagging and optical character recognition (OCR).

A particularly interesting "game" in our present context is Phetch, in which one or more "Seekers" compete to find an image based on a text description provided by a "Describer" [11]. The Describer’s goal is to help the Seekers succeed, while the Seekers compete with one another to find the target image within a fixed time limit, using a search engine that has indexed the images based on tagging results from the ESP Game. In order to discourage a shotgun approach, the game penalizes Seekers for wrong guesses.

This game goes quite far in capturing the essence of interactive information retrieval. If we put aside the competition among the Seekers, then we see that an individual Seeker, aided by the human Describer and the algorithmic (but human-indexed) search engine, is pursuing an information retrieval task. Moreover, the Seeker is incented to be both effective and efficient.

How can we leverage this framework for information retrieval evaluation? Even though the game envisions both Describers and Seekers to be human beings, there is no reason we cannot allow computers to play too--in either or both roles. Granted, the game, as currently designed, focuses on image retrieval without giving the human players direct access to the image tags, but we could imagine a framework that is more amenable to machine participation, e.g., providing a machine player with a set of tags derived from those in the index when that player is presented with an image. Alternatively, there may be a domain more suited than image retrieval to incorporating computer players.

The main appeal of the game framework is that it allows all participants to be judged based on an objective criterion that reflects the effectiveness and efficiency of the interactive information retrieval process. A good Describer should, on average, outscore a bad Describer over the long term; likewise, a good Seeker should outscore a bad one. We can even vary the search engine available to Seekers, in order to compare competing search engine algorithms or interfaces.
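To make the objective criterion concrete, here is a hypothetical scoring sketch for a single Seeker round, combining effectiveness (finding the target, avoiding wrong guesses) with efficiency (speed); the constants are invented for illustration and are not Phetch's actual scoring rules.

```python
# A hypothetical scoring sketch for a single Seeker round: reward finding the target,
# penalize wrong guesses, and reward speed. The constants are invented for illustration
# and are not Phetch's actual scoring rules.
def seeker_score(found, wrong_guesses, seconds_elapsed, time_limit=120):
    if not found:
        return 0
    base = 100                                          # credit for retrieving the target
    penalty = 10 * wrong_guesses                        # discourage shotgun guessing
    speed_bonus = max(0, time_limit - seconds_elapsed)  # faster retrieval scores higher
    return max(0, base - penalty + speed_bonus)

# Averaging such scores over many rounds lets us compare Describers, Seekers, or the
# search engines (human or algorithmic) that the Seekers are given.
print(seeker_score(found=True, wrong_guesses=2, seconds_elapsed=45))  # 155
```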

7. CONCLUSION

Our goal is ambitious: we aspire towards an evaluation framework that satisfies information scientists as relevant to real-world information seeking, but nonetheless offers the practicality of the Cranfield paradigm that dominates information retrieval. The near absence of collaboration between the information science and information retrieval communities has been a greatly missed opportunity, not only for both research communities, but also for everyone who could benefit from practical advances in our understanding of information seeking. We hope that the approach we propose takes at least a small step towards resolving this battle royale.

8. REFERENCES

[1] Belkin, N. J., 2008. Some(What) Grand Challenges for Information Retrieval. ACM SIGIR Forum 42, 1 (June 2008), 47-54.

[2] Spärck Jones, K. 1988. A look back and a look forward. In Proceedings of the 11th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 13-29.

[3] Saracevic, T. 1997. Users lost: reflections of the past, future and limits of information science. ACM SIGIR Forum 31, 2 (July 1997), 16-27.

[4] Ingwersen, P. and Järvelin, K. 2005. The turn. Integration of information seeking and retrieval in context. Springer.

[5] Voorhees, E. 2006. Building Test Collections for Adaptive Information Retrieval: What to Abstract for What cost? In First International Workshop on Adaptive Information Retrieval (AIR).

[6] Turpin, A. and Scholer, F. 2006. User performance versus precision measures for simple search tasks. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 11-18.

[7] Saracevic, T. (2007). Relevance: A review of the literature and a framework for thinking on the notion in information science. Part II: nature and manifestations of relevance. Journal of the American Society for Information Science and Technology 58(3), 1915-1933.

[8] Cutting, D., Karger, D., Pedersen, J., and Tukey, J. 1992. Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections. In Proceedings of the 15th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval, 318-329.

[9] Workshop on Faceted Search. 2006. In Proceedings of the 29th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval.

[10] Von Ahn, L. 2006. Games with a Purpose. IEEE Computer 39, 6 (June 2006), 92-94.

[11] Von Ahn, L., Ginosar, S., Kedia, M., Liu, R., and Blum, M. 2006. Improving accessibility of the web with a computer game. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 79-82.



Monday, September 15, 2008

Information Accountability

The recent United Airlines stock fiasco triggered an expected wave of finger pointing. For those who didn't follow the event, here is the executive summary:

    In the wee hours of Sunday, September 7th, The South Florida Sun-Sentinel (a subsidiary of the Tribune Company) included a link to an article entitled "UAL Files for Bankruptcy." The link was legit, but the linked article, which dated from 2002, didn't carry its publication date. Then Google's news bot picked up the article and automatically assigned it a current date. Furthermore, Google sent the link to anyone with an alert set up for news about United. Then, on Monday, September 8th, someone at Income Security Advisors saw the article in the results for a Google News search and sent it out on Bloomberg. The results are in the picture below, courtesy of Bloomberg by way of the New York Times.



    For anyone who wants all of the gory details, Google's version of the story is here; the Tribune Company's version is here.

I've spent the past week wondering about this event from an information access perspective. And then today I saw two interesting articles:
  • The first was a piece in BBC News about a speech by Sir Tim Berners-Lee expressing concern that the internet needs a way to help people separate rumor from real science. His examples included the fears about the Large Hadron Collider at CERN creating a black hole that would swallow up the earth (which isn't quite the premise of Dan Brown's Angels and Demons), and rumors that a vaccine given to children in Britain was harmful.

  • The second was a column in the New York Times about the dynamics of the US presidential campaign, where Adam Nagourney notes that "senior campaign aides say they are no longer sure what works, as they stumble through what has become a daily campaign fog, struggling to figure out what voters are paying attention to and, not incidentally, what they are even believing."
I see a common thread here that I'd like to call "information accountability." I don't mean this term in the sense of a recent CACM article about information privacy and sensitivity, but rather in a sense of information provenance and responsibility.

Whether we're worrying about Google bombing, Google bowling, or what Gartner analyst Whit Andrews calls "denial-of-insight" attacks, our concern is that information often arrives with implicit authority. Despite the aphorism telling us "don't believe everything you read," most of us select news and information sources with some hope that they will be authoritative. Whether the motto is "all the news that's fit to print" or "don't be evil," our trust in the sources we choose is a necessary heuristic that spares us from subjecting everything we read to endless skeptical inquiry.

But sometimes the most reputable news sources get it wrong. Or perhaps "wrong" is the wrong word. When newspapers reported that the FBI was treating Richard Jewell as a "person of interest" in the Centennial Olympic Park bombing (cf. "Olympic Park Bomber" Eric Robert Rudolph), they weren't lying, but rather were communicating information from what they believed to be a reliable source. And, in turn, the FBI may have been correctly doing its job, given the information they had. But there's no question that Jewell suffered tremendously from his "trial by media" before his name was ultimately cleared.

It's tempting to react to these information breakdowns with finger-pointing, to figure out who is accountable and, in as litigious a society as the United States, bring on the lawyers. Moreover, there clearly are cases where willful misinformation constitutes criminal defamation or fraud. But I think we need to be careful, especially in a world where information flows in a highly connected--and not necessarily acyclic--social graph. Anyone who has played the children's game of telephone knows that small communication errors can blow up rapidly, and that it's difficult to partition blame fairly.

The simplest answer is that we are accountable for how we consume information: caveat lector. But this model seems overly simplistic, since our daily lives hinge on our ability to consume information without so skeptical an eye that we could accept nothing at face value. Besides, shouldn't we hold information providers responsible for living up to the reputations they cultivate and promote?

There are no easy answers here. But the bad news is that we cannot ignore the questions of information accountability. If terms like "social media" and "web 2.0" mean anything, they surely tell us that the game of telephone will only grow in the number of participants and in the complexity of the communication chains. As a society, we will have to learn to live with and mitigate the fallout.

Sunday, September 14, 2008

Is Blog Search Different?

Alerted by Jeff and Iadh, I recently read What Should Blog Search Look Like?, a position paper by Marti Hearst, Matt Hurst, and Sue Dumais. For those readers unfamiliar with this triumvirate, I suggest you take some time to read their work, as they are heavyweights in some of the areas most often covered by this blog.

The position paper suggests focusing on three kinds of search tasks:
  1. Find out what are people thinking or feeling about X over time.
  2. Find good blogs/authors to read.
  3. Find useful information that was published in blogs sometime in the past.
The authors generally recommend the use of faceted navigation interfaces--something I'd hope would be uncontroversial by now for search in general.

But I'm more struck by their criticism that existing blog search engines fail to leverage the special properties of blog data, and by their discussion, based on work by Mishne and de Rijke, of how blog search queries differ substantially from web search queries. I don't doubt the data they've collected, but I'm curious whether their results account for the rapid proliferation and mainstreaming of blogs. The lines between blogs, news articles, and informational web pages seem increasingly blurred.

So I'd like to turn the question around: what should blog search look like that is not applicable to search in general?

Saturday, September 13, 2008

Progress on the Migration

Please check out http://thenoisychannel.com/ to see the future of The Noisy Channel in progress. I'm using WordPress hosted on GoDaddy and did the minimum work to port all posts and comments (not including this one).

Here is my current list of tasks that I'd like to get done before we move.
  • Design! I'm currently using the default WordPress theme, which is pretty lame. I'm inclined to use a clean but stylish two-column theme that is widget-friendly. Maybe Cutline. In any case, I'd like the new site to be a tad less spartan before we move into it.

  • Internal Links. My habit of linking back to previous posts now means I have to map those links to the new posts. I suspect I'll do it manually, since I don't see an easy way to automate it.

  • Redirects. Unfortunately I don't think I can actually get Blogger to redirect traffic automatically. So my plan is to post signage throughout this blog making it clear that the blog has moved.
I'd love help, particularly in the form of advice on the design side. And I'll happily give administration access to anyone who has the cycles to help implement any of these or other ideas. Please let me know by posting here or by emailing me: dtunkelang@{endeca,gmail}.com.

Friday, September 12, 2008

Quick Bites: Probably Irrelevant. (Not!)

Thanks to Jeff Dalton for spreading the word about a new information retrieval blog: Probably Irrelevant. It's a group blog, currently listing Fernando Diaz and Jon Elsas as contributors. Given the authors and the blog name's anagram of "Re-plan IR revolt, baby!", I expect great things!

Wednesday, September 10, 2008

Fun with Twitter

I recently joined Twitter and asked the twitterverse for opinions about DreamHost vs. GoDaddy as a platform to host this blog on WordPress. I was shocked when I noticed today that I'd gotten this response from the President / COO of GoDaddy (or perhaps a sales rep posing as such).

Seems like a lot of work for customer acquisition!

Quick Bites: Email becomes a Dangerous Distraction

Just read this article citing a number of studies to the effect that email is a major productivity drain. Nothing surprising to me--a lot of us have learned the hard way that the only way to be productive is to not check email constantly.

But I am curious if anyone has made progress on tools that alert you to emails that do call for immediate attention. I'm personally a fan of attention bonds approaches, but I imagine that the machine learning folks have at least thought about this as a sort of inverse spam filtering problem.
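
To sketch what I mean by inverse spam filtering: treat urgency detection as garden-variety text classification, with "urgent" rather than "spam" as the positive class. The toy example below assumes scikit-learn is available and uses made-up training messages; a real system would need labeled mail and, more importantly, features beyond the text itself (sender, thread history, time of day).

    # A minimal sketch of urgency classification, assuming scikit-learn.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Invented training data, purely for illustration.
    train_messages = [
        "server down, customers affected, please call asap",
        "production outage in the data center",
        "weekly newsletter: ten tips for better slides",
        "lunch menu for friday",
    ]
    train_labels = ["urgent", "urgent", "not_urgent", "not_urgent"]

    urgency_model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    urgency_model.fit(train_messages, train_labels)

    print(urgency_model.predict(["the build is broken and the demo is in an hour"]))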

Tuesday, September 9, 2008

Quick Bites: The Clickwheel Must Die

As someone who's long felt that the iPod's clickwheel violates Fitts's law, I was delighted to read this Gizmodo article asserting that the iPod's clickwheel must die. My choice quote:
Quite simply, the clickwheel hasn't scaled to handle the long, modern day menus in powerful iPods.
Fortunately Apple recognized its mistake on this one and fixed the problem in its touch interface. Though, to be clear, the problem was not inherent in the choice of a wheel interface, but rather in the requirement to make gratuitously precise selections.

Now I'm waiting to see someone fix the tiny minimize/maximize/close buttons in the upper right corner on Windows, which I suspect have become the textbook example of violating Fitts's law.
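
For readers unfamiliar with it, Fitts's law predicts the time to acquire a target from the distance to the target and the target's width--commonly in the Shannon formulation T = a + b * log2(D/W + 1). The constants a and b are device-dependent; the values below are placeholders, just to show how shrinking the effective target (or lengthening the travel) blows up the predicted time, which is exactly the clickwheel's problem with long menus.

    import math

    def fitts_time(distance, width, a=0.1, b=0.15):
        """Shannon formulation of Fitts's law; a and b are device-dependent placeholders."""
        return a + b * math.log2(distance / width + 1)

    # Same travel distance, very different target sizes.
    print(fitts_time(distance=50, width=1))   # long menu, gratuitously precise selection
    print(fitts_time(distance=50, width=10))  # same menu with coarser targets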

Monday, September 8, 2008

Incentives for Active Users

Some of the most successful web sites today are social networks, such as Facebook and LinkedIn. These are not only popular web sites; they are also remarkably effective people search tools. For example, I can use LinkedIn to find the 163 people in my network who mention "information retrieval" in their profiles and live within 50 miles of my ZIP code (I can't promise you'll see the same results!).

A couple of observations about social networking sites (I'll focus on LinkedIn) are in order.

First, this functionality is a very big deal, and it's something Google, Yahoo, and Microsoft have not managed to provide, even though their own technology is largely built on a social network--citation ranking.

Second, the "secret sauce" for sites like LinkedIn is hardly their technology (a search engine built on Lucene and a good implementation of breadth-first search), but rather the way they have incented users to be active participants, in everything from virally marketing the site to their peers to inputting high-quality semi-structured profiles that make the site useful. In other words, active users ensure both the quantity and quality of information on the site.

Many people have noted the network effect that drove the run-away success of Microsoft Office and eBay. But I think that social networking sites are taking this idea further: users not only flock to the crowds, but also become personally invested in the success of the site generally, and especially in the quality and accuracy of their personal information.

Enterprises need to learn from these consumer-oriented success stories. Some have already. For example, a couple of years ago, IBM established a Professional Marketplace, powered by Endeca, to maintain a skills and availability inventory of IBM employees. This effort was a run-away success, saving IBM $500M in its first year. But there's more: IBM employees have reacted to the success of the system by being more active in maintaining their own profiles. I spent the day with folks at the ACM, and they're seeing great uptake in their author profile pages.

I've argued before that there's no free lunch when it comes to enterprise search and information access. The good news, however, is that, if you create the right incentives, you can get other folks to happily pay for lunch.

Quick Bites: Taxonomy Directed Folksonomies

Props to Gwen Harris at Taxonomy Watch for posting a paper by Sarah Hayman and Nick Lothian on Taxonomy Directed Folksonomies.

The paper asks whether folksonomies and formal taxonomy can be used together and answers in the affirmative. The work is in the spirit of some of our recent work at Endeca to bootstrap from vocabularies (though not necessarily controlled vocabularies) to address the inconsistency and sparsity of tagging in folksonomies.
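
To give a flavor of what bootstrapping from a vocabulary can look like--this is an illustrative sketch, not the method from the paper or from our work at Endeca--one can match free-form tags against a seed vocabulary using simple string similarity, so that near-duplicate and misspelled tags collapse onto shared terms.

    import difflib

    # A seed vocabulary and some free-form user tags (all invented for illustration).
    vocabulary = ["information retrieval", "machine learning", "user interface", "taxonomy"]
    user_tags = ["infomation retreival", "taxonomies", "ml", "ui design"]

    def suggest_vocabulary_term(tag, vocabulary, cutoff=0.6):
        """Suggest the closest vocabulary term for a free-form tag, if any is close enough."""
        matches = difflib.get_close_matches(tag.lower(), vocabulary, n=1, cutoff=cutoff)
        return matches[0] if matches else None

    for tag in user_tags:
        print(tag, "->", suggest_vocabulary_term(tag, vocabulary))

String similarity alone obviously won't map "ml" to "machine learning"; that's where co-occurrence statistics or the taxonomy structure itself would have to do the heavy lifting.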

I'm personally excited to see the walls coming down between the two approaches, which many people seem to think of as mutually exclusive approaches to the tagging problem.

Sunday, September 7, 2008

Quick Bites: Is Search Really 90% Solved?

Props to Michael Arrington for calling out this snippet in an interview with Marissa Mayer, Google Vice President of Search Product and User Experience on the occasion of Google's 10th birthday:
Search is an unsolved problem. We have a good 90 to 95% of the solution, but there is a lot to go in the remaining 10%.
I agree with Michael that search isn't even close to being solved yet. I've criticized the way many web search start-ups--and even the giants Yahoo and Microsoft--are going about trying to dethrone Google through incremental improvements or technologies that don't address any need that Google does not already adequately (if not optimally) address. But there is no lack of open problems in search for those ambitious enough to tackle them.

Quick Bites: Applying Turing's Ideas to Search

A colleague of mine at Endeca recently pointed me to a post by John Ferrara at Boxes and Arrows entitled Applying Turing's Ideas to Search.

One of the points he makes echoes the "computers aren't mind readers" theme I've been hammering at for a while:
If the user has not phrased her search clearly enough for another person to understand what she’s trying to find, then it’s not reasonable to expect that a comparatively "dumb" machine could do better. In a Turing test, the response to a question incomprehensible even to humans would prove nothing, because it wouldn’t provide any distinction between person and machine.
While I'm not convinced that search engine designers should be aspiring to pass the Turing test, I agree wholeheartedly with the vision John puts forward:
It describes an ideal form of human-computer interaction in which people express their information needs in their own words, and the system understands and responds to their requests as another human being would. During my usability test, it became clear that this was the very standard to which my test participants held search engines.
It's not about the search engine convincing the user that another human being is producing the answers, but rather engaging users in a conversation that helps them articulate and elaborate their information needs. Or, as we like to call it around here, HCIR.

Saturday, September 6, 2008

Migrating Soon

Just another reminder that I expect to migrate this blog to a hosted WordPress platform in the next few days. If you have opinions about hosting platforms, please let me know by commenting here. Right now, I'm debating between DreamHost and GoDaddy, but I'm very open to suggestions.

I will do everything in my power to minimize disruption--I'm not sure how easy Blogger will make it to redirect users to the new site. I'll probably post here for a while after the move to try to direct traffic.

I do expect the new site to be under a domain name I've already reserved: http://thenoisychannel.com. It currently forwards to Blogger.

Back from the Endeca Government Summit

I spent Thursday at the Endeca Government Summit, where I had the privilege to chat face-to-face with some Noisy Channel readers. Mostly, I was there to learn more about the sorts of information seeking problems people are facing in the public sector in general, and in the intelligence agencies in particular.

While I can't go into much detail, the key concern was exploration of information availability. This problem is the antithesis of known-item search: rather than trying to retrieve information you know exists (and which you know how to specify), you are trying to determine whether there is information available that would help you with a particular task.

Despite being lost in a sea of TLAs, I came away with a deepened appreciation of both the problems the intelligence agencies are trying to address and the relevance of exploratory search approaches to those problems.

Thursday, September 4, 2008

Query Elaboration as a Dialogue

I ended my post on transparency in information retrieval with a teaser: if users aren't great at composing queries for set retrieval, which I argue is more transparent than ranked retrieval, then how will we ever deliver an information retrieval system that offers both usefulness and transparency?

The answer is that the system needs to help the user elaborate the query. Specifically, the process of composing a query should be a dialogue between the user and the system that allows the user to progressively articulate and explore an information need.

Those of you who have been reading this blog for a while or who are familiar with what I do at Endeca shouldn't be surprised to see dialogue as the punch line. But I want to emphasize that the dialogue I'm describing isn't just a back-and-forth between the user and the system. After all, there are query suggestion mechanisms that operate in the context of ranked retrieval algorithms--algorithms which do not offer the user transparency. While such mechanisms sometimes work, they risk doing more harm than good. Any interactive approach requires the user to do more work; if this added work does not result in added effectiveness, users will be frustrated.

That is why the dialogue has to be based on a transparent retrieval model--one where the system responds to queries in a way that is intuitive to users. Then, as users navigate in query space, transparency ensures that they can make informed choices about query refinement and thus make progress. I'm partial to set retrieval models, though I'm open to probabilistic ones. 
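
Here is a minimal sketch of a single turn of such a dialogue over a set retrieval model: compute the set of matching documents, then summarize that set with facet counts that the user can refine by. The corpus and facet names are invented for illustration.

    from collections import Counter

    # Toy corpus; the facets (type, year) are invented for illustration.
    docs = [
        {"id": 1, "terms": {"jaguar", "car"}, "facets": {"type": "review", "year": "2008"}},
        {"id": 2, "terms": {"jaguar", "car"}, "facets": {"type": "listing", "year": "2007"}},
        {"id": 3, "terms": {"jaguar", "animal"}, "facets": {"type": "article", "year": "2008"}},
    ]

    def dialogue_turn(docs, query_terms):
        """Set retrieval plus refinement suggestions: the matching set and, for each
        facet, the counts a user could refine by on the next turn."""
        matches = [d for d in docs if query_terms <= d["terms"]]
        refinements = {}
        for doc in matches:
            for facet, value in doc["facets"].items():
                refinements.setdefault(facet, Counter())[value] += 1
        return matches, refinements

    matches, refinements = dialogue_turn(docs, {"jaguar"})
    print(len(matches), "matches")
    print(refinements)

Because the counts describe the actual matching set, the user can see exactly what each refinement will do before committing to it--which is the transparency I'm after.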

But of course we've just shifted the problem. How do we decide what query refinements to offer to a user in order to support this progressive refinement process? Stay tuned...

Tuesday, September 2, 2008

Migrating to WordPress

Just a quick note to let folks know that I'll be migrating to WordPress in the next few days. I'll make every effort to make the move seamless. I have secured the domain name http://thenoisychannel.com, which currently forwards to Blogger but will shift to wherever the blog is hosted. I apologize in advance for any disruption.

Quick Bites: Google Chrome

For those of you who thought that no major technology news would come out during the Labor Day weekend, check out the prematurely released comic book hailing Google Chrome, Google's long-rumored entry into the browser wars. By the time you read this, the (Windows-only) beta may even be available for download. The official Google announcement is here.

If the software lives up to the promise of the comic book, then Google may have a real shot of taking market share from IE and Firefox. More significantly, if they can supplant the operating system with the browser, then they'll have a much more credible opportunity to take on desktop software with their web-based applications.

Interestingly, even though all of the search blogs are reporting about Chrome, I haven't seen any analysis on what this might mean for web search.

Monday, September 1, 2008

Quick Bites: E-Discovery and Transparency

One change I'm thinking of making to this blog is to introduce "quick bites" as a way of mentioning interesting sites or articles I've come across without going into deep analysis. Here's a first one to give you a flavor of the concept. Let me know what you think.

I just read an article on how courts will tolerate search inaccuracies in e-Discovery by way of Curt Monash. It reminded me of our recent discussion of transparency in information retrieval. I agree that "explanations of [search] algorithms are of questionable value" for convincing a court of the relevance and accuracy of the results. But that's because those algorithms aren't sufficiently intuitive for those explanations to be meaningful except in a theoretical sense to an information retrieval researcher.

I realize that user-entered Boolean queries (the traditional approach to e-Discovery) aren't effective because users aren't great at composing queries for set retrieval. But that's why machines need to help users with query elaboration--a topic for an upcoming post.

POLL: Blogging Platform

I've gotten a fair amount of feedback suggesting that I switch blogging platforms. Since I'd plan to make such changes infrequently, I'd like to get input from readers before doing so, especially since migration may have hiccups.

I've just posted a poll on the home page to ask if folks here have a preference as to which blogging platform I use. Please vote this week, and feel free to post comments here.

Friday, August 29, 2008

Improving The Noisy Channel: A Call for Ideas

Over the past five months, this blog has grown from a suggestion Jeff Dalton put in my ear to a community to which I'm proud to belong.

Some milestones:
  • Over 70 posts to date.
  • 94 subscribers, as reported by Google Reader.
  • 100 unique visitors on a typical day.
To be honest, I thought I'd struggle to keep up with posting weekly, and that I'd need to convince my mom to read this blog so that I wouldn't be speaking to an empty room. The results so far have wildly exceeded the expectations I came in with.

But now that I've seen the potential of this blog, I'd like to "take it to the next level," as the MBA types say.

My goals:
  • Increase the readership. My motive isn't (only) to inflate my own ego. I've seen that this blog succeeds most when it stimulates conversation, and a conversation needs participants.

  • Increase participation. Given the quantity and quality of comments on recent posts, it's clear that readers here contribute the most valuable content. I'd like to step that up a notch by having readers guest-blog, and perhaps go as far as turning The Noisy Channel into a group blog about information seeking that transcends my personal take on the subject. I'm very open to suggestions here.

  • Add some style. Various folks have offered suggestions for improving the blog, such as changing platforms to WordPress, modifying the layout to better use screen real estate, adding more images, etc. I'm the first to admit that I am not a designer, and I'd really appreciate ideas from you all on how to make this site more attractive and usable.
In short, I'm asking you to help me help you make The Noisy Channel a better and noisier place. Please post your comments here or email me if you'd prefer to make suggestions privately.

Wednesday, August 27, 2008

Transparency in Information Retrieval

It's been hard to find time to write another post while keeping up with the comment stream on my previous post about set retrieval! I'm very happy to see this level of interest, and I hope to continue catalyzing such discussions.

Today, I'd like to discuss transparency in the context of information retrieval. Transparency is an increasingly popular term these days in the context of search--perhaps not surprising, since users are finally starting to question the idea of search as a black box.

The idea of transparency is simple: users should know why a search engine returns a particular response to their query. Note the emphasis on "why" rather than "how". Most users don't care what algorithms a search engine uses to compute a response. What they do care about is how the engine ultimately "understood" their query--in other words, what question the engine thinks it's answering.

Some of you might find this description too anthropomorphic. But a recent study reported that most users expect search engines to read their minds--never mind that the general case goes beyond AI-complete (should we create a new class of ESP-complete problems?). What frustrates users most is when a search engine not only fails to read their minds, but also gives no indication of where the communication broke down, let alone how to fix it. In short, a failure to provide transparency.

What does this have to do with set retrieval vs. ranked retrieval? Plenty!

Set retrieval predates the Internet by a few decades, and was the first approach used to implement search engines. These search engines allowed users to enter queries by stringing together search terms with Boolean operators (AND, OR, etc.). Today, Boolean retrieval seems arcane, and most people see set retrieval as suitable for querying databases, rather than for querying search engines.

The biggest problem with set retrieval is that users find it extremely difficult to compose effective Boolean queries. Nonetheless, there is no question that set retrieval offers transparency: what you ask is what you get. And, if you prefer a particular sort order for your results, you can specify it.

In contrast, ranked retrieval makes it much easier for users to compose queries: users simply enter a few top-of-mind keywords. And for many use cases (in particular, known-item search), a state-of-the-art implementation of ranked retrieval yields results that are good enough.

But ranked retrieval approaches generally shed transparency. At best, they employ standard information retrieval models that, although published in all of their gory detail, are opaque to their users--who are unlikely to be SIGIR regulars. At worst, they employ secret, proprietary models, either to protect their competitive differentiation or to thwart spammers.

Either way, the only clues that most ranked retrieval engines provide to users are text snippets from the returned documents. Those snippets may validate the relevance of the results that are shown, but the user does not learn what distinguishes the top-ranked results from other documents that contain some or all of the query terms.

If the user is satisfied with one of the top results, then transparency is unlikely to even come up. Even if the selected result isn't optimal, users may do well to satisfice. But when the search engine fails to read the user's mind, transparency offers the best hope of recovery.

But, as I mentioned earlier, users aren't great at composing queries for set retrieval, which was how ranked retrieval became so popular in the first place despite its lack of transparency. How do we resolve this dilemma?

To be continued...

Sunday, August 24, 2008

Set Retrieval vs. Ranked Retrieval

After last week's post about a racially targeted web search engine, you'd think I'd avoid controversy for a while. To the contrary, I now feel bold enough to bring up what I have found to be my most controversial position within the information retrieval community: my preference for set retrieval over ranked retrieval.

This will be the first of several posts along this theme, so I'll start by introducing the terms.
  • In a ranked retrieval approach, the system responds to a search query by ranking all documents in the corpus based on its estimate of their relevance to the query.

  • In a set retrieval approach, the system partitions the corpus into two subsets of documents: those it considers relevant to the search query, and those it does not.
An information retrieval system can combine set retrieval and ranked retrieval by first determining a set of matching documents and then ranking the matching documents. Most industrial search engines, such as Google, take this approach, at least in principle. But, because the set of matching documents is typically much larger than the set of documents displayed to a user, these approaches are, in practice, ranked retrieval.
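
For concreteness, here is a toy sketch of that combination: a Boolean conjunction determines the matching set, and a scoring function orders it. The scoring function below is a placeholder for illustration, not any particular engine's formula.

    # Toy corpus: document id -> text.
    docs = {
        1: "the quick brown fox",
        2: "quick brown foxes jump over lazy dogs",
        3: "the lazy dog sleeps",
    }

    def matching_set(docs, query_terms):
        """Set retrieval: documents containing every query term."""
        return {doc_id for doc_id, text in docs.items()
                if all(term in text.split() for term in query_terms)}

    def rank(docs, doc_ids, query_terms):
        """Ranked retrieval within the set; the score is a crude placeholder."""
        def score(doc_id):
            words = docs[doc_id].split()
            return sum(words.count(term) for term in query_terms) / len(words)
        return sorted(doc_ids, key=score, reverse=True)

    retrieved = matching_set(docs, ["quick", "brown"])
    print(retrieved)                                   # the set: meaningful to count and summarize
    print(rank(docs, retrieved, ["quick", "brown"]))   # the ranking: an ordering within that set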

What is set retrieval in practice? In my view, a set retrieval approach satisfies two expectations:
  • The number of documents reported to match my search should be meaningful--or at least should be a meaningful estimate. More generally, any summary information reported about this set should be useful.

  • Displaying a random subset of the set of matching documents to the user should be a plausible behavior, even if it is not as good as displaying the top-ranked matches. In other words, relevance ranking should help distinguish more relevant results from less relevant results, rather than distinguishing relevant results from irrelevant results.
Despite its popularity, the ranked retrieval model suffers because it does not provide a clear split between relevant and irrelevant documents. That weakness makes it impossible to obtain even basic analysis of the query results, such as the number of relevant documents, let alone anything more sophisticated, such as an assessment of result quality. A set retrieval model, by establishing a clear split between documents that are in and out of the retrieved set, enables rich analysis of query results, which can then be applied to improve the user experience.

Saturday, August 23, 2008

Back from the Cone of Silence

Regular readers may have noticed the lack of posts this week. My apologies to anyone who was waiting by the RSS feed. Yesterday was the submission deadline for HCIR '08, which means that today is a new day! So please stay tuned for your regularly scheduled programming.

Saturday, August 16, 2008

Thinking Outside the Black Box

I was reading Techmeme today, and I noticed an LA Times article about RushmoreDrive, described on its About Us page as "a first-of-its-kind search engine for the Black community." My first reaction, blogged by others already, was that this idea was dumb and racist. In fact, it took some work to find positive commentary about RushmoreDrive.

But I've learned from the way the blogosphere handled the Cuil launch not to trust anyone who evaluates a search engine without having tried it, myself included. My wife and I have been the only white people at Amy Ruth's, and the service was as gracious as the chicken and waffles were delicious; so I decided to try my luck on a search engine not targeted at my racial profile.

The search quality is solid, comparable to that of Google, Yahoo, and Microsoft. In fact, the site looks a lot like a re-skinning (no pun intended) of Ask.com, a corporate sibling of IAC-owned RushmoreDrive. Like Ask.com, RushmoreDrive emphasizes search refinement through narrowing and broadening refinements.

What I find ironic is that the whole controversy about racial bias in relevance ranking reveals the much bigger problem--that relevance ranking should not be a black box (ok, maybe this time I'll take responsibility for the pun). I've been beating this drum at The Noisy Channel ever since I criticized Amit Singhal for Google's lack of transparency. I think that sites like RushmoreDrive are inevitable if search engines refuse to cede more control of search results to users.

I don't know how much information race provides as a prior to influence statistical ranking approaches, but I'm skeptical that the effects are useful or even noticeable beyond a few well-chosen examples. I'm more inclined to see RushmoreDrive as a marketing ploy by the folks at IAC--and perhaps a successful one. I doubt that Google is running scared, but I think this should be a wake-up call to folks who are convinced that personalized relevance ranking is the end goal of user experience for search engines.

Friday, August 15, 2008

New Information Retrieval Book Available Online

Props to Jeff Dalton for alerting me about the new book on information retrieval by Christopher Manning, Prabhakar Raghavan, and Hinrich Schütze. You can buy a hard copy, but you can also access it online for free at the book website.

Wednesday, August 13, 2008

David Huynh's Freebase Parallax

One of the perks of working in HCIR is that you get to meet some of the coolest people in academic and industrial research. I met David Huynh a few years ago, while he was a graduate student at MIT, working in the Haystack group and on the Simile project. You've probably seen some of his work: his Timeline project has been deployed all over the web.

Despite efforts by me and others to persuade David to stay in the Northeast, he went out west a few months ago to join Metaweb, a company with ambitions "to build a better infrastructure for the Web." While I (and others) remain unpersuaded by Freebase, Metaweb's "open database of the world’s information," I am happy to see that David is still doing great work.

I encourage you to check out David's latest project: Freebase Parallax. In it, he does something I've never seen outside Endeca (excepting David's earlier work on a Nested Faceted Browser): he allows you to navigate using the facets of multiple entity types, joining between sets of entities through their relationships. At Endeca, we call this "record relationship navigation"--we presented it at HCIR '07, showing how it can enable social navigation.

David includes a video where he eloquently demonstrates how Parallax works, and the interface is quite compelling. I'm not sure how well it scales with large data sets, but David's focus has been on interfaces rather than systems. My biggest complaint--which isn't David's fault--is that the Freebase content is a bit sparse. But his interface strikes me as a great fit for exploratory search.

Conversation with Seth Grimes

I had a great conversation with Intelligent Enterprise columnist Seth Grimes today. Apparently there's an upside to writing critical commentary on Google's aspirations in the enterprise!

One of the challenges in talking about enterprise search is that no one seems to agree on what it is. Indeed, as I've been discussing with Ryan Shaw, I use the term broadly to describe information access scenarios distinct from web search where an organization has some ownership or control of the content (in contrast to the somewhat adversarial relationship that web search companies have with the content they index). But I realize that many folks define enterprise search more narrowly to be a search box hooked up to the intranet.

Perhaps a better way to think about enterprise search is as a problem rather than solution. Many people expect a search box because they're familiar with searching the web using Google. I don't blame anyone for expecting that the same interface will work for enterprise information collections. Unfortunately, wishful thinking and clever advertising notwithstanding, it doesn't.

I've blogged about this subject from several different perspectives over the past weeks, so I'll refer recent readers to earlier posts on the subject rather than bore the regulars.

But I did want to mention a comment Seth made that I found particularly insightful. He defined enterprise search even more broadly than I do, suggesting that it encompassed any information seeking performed in the pursuit of enterprise-centric needs. In that context, he does see Google as the leader in enterprise search--not because of their enterprise offerings, but rather because of the web search they offer for free.

I'm not sure how I feel about his definition, but I think he raises a point that enterprise vendors often neglect. No matter how much information an enterprise controls, there will always be valuable information outside the enterprise. I find today's APIs to that information woefully inadequate; for example, I can't even choose a sort order through any of the web search APIs. But I am optimistic that those APIs will evolve, and that we will see "federated" information seeking that goes beyond merging ranked lists from different sources.

Indeed, I look forward to the day that web search providers take a cue from the enterprise and drop the focus on black box relevance ranking in favor of an approach that offers users control and interaction.

Monday, August 11, 2008

Position papers for NSF IS3 Workshop

I just wanted to let folks know that the position papers for the NSF Information Seeking Support Systems Workshop are now available at this link.

Here is a listing to whet your curiosity:
  • Supporting Interaction and Familiarity
    James Allan, University of Massachusetts Amherst, USA

  • From Web Search to Exploratory Search: Can we get there from here?
    Peter Anick, Yahoo! Inc., USA

  • Complex and Exploratory Web Search (with Daniel Russell)
    Anne Aula, Google, USA

  • Really Supporting Information Seeking: A Position Paper
    Nicholas J. Belkin, Rutgers University, USA

  • Transparent and User-Controllable Personalization For Information Exploration
    Peter Brusilovsky, University of Pittsburgh, USA

  • Faceted Exploratory Search Using the Relation Browser
    Robert Capra, UNC, USA

  • Towards a Model of Understanding Social Search
    Ed Chi, Palo Alto Research Center, USA

  • Building Blocks For Rapid Development of Information Seeking Support Systems
    Gary Geisler, University of Texas at Austin, USA

  • Collaborative Information Seeking in Electronic Environments
    Gene Golovchinsky, FX Palo Alto Laboratory, USA

  • NeoNote: User Centered Design Suggestions for a Global Shared Scholarly Annotation System
    Brad Hemminger, UNC, USA

  • Speaking the Same Language About Exploratory Information Seeking
    Bill Kules, The Catholic University of America, USA

  • Musings on Information Seeking Support Systems
    Michael Levi, U.S. Bureau of Labor Statistics, USA

  • Social Bookmarking and Information Seeking
    David Millen, IBM Research, USA

  • Making Sense of Search Result Pages
    Jan Pedersen, Yahoo, USA

  • A Multilevel Science of Social Information Foraging and Sensemaking
    Peter Pirolli, Xerox PARC, USA

  • Characterizing, Supporting and Evaluating Exploratory Search
    Edie Rasmussen, University of British Columbia, Canada

  • The Information-Seeking Funnel
    Daniel Rose, A9.com, USA

  • Complex and Exploratory Web Search (with Anne Aula)
    Daniel Russell, Google, USA

  • Research Agenda: Visual Overviews for Exploratory Search
    Ben Shneiderman, University of Maryland, USA

  • Five Challenges for Research to Support IS3
    Elaine Toms, Dalhousie University, Canada

  • Resolving the Battle Royale between Information Retrieval and Information Science
    Daniel Tunkelang, Endeca, USA

Sunday, August 10, 2008

Why Enterprise Search Will Never Be Google-y

As I prepared to end my trilogy of Google-themed posts, I ran into two recently published items. They provide an excellent context for what I intended to talk about: the challenges and opportunities of enterprise search.

The first is Google's announcement of an upgrade to their search appliance that allows one box to index 10 million documents and offers improved search quality and personalization.

The second is an article by Chris Sherman in the Enterprise Search Sourcebook 2008 entitled Why Enterprise Search Will Never Be Google-y.

First, the Google announcement. These are certainly improvements for the GSA, and Google does seem to be aiming to compete with the Big Three: Autonomy, Endeca, and FAST (now a subsidiary of Microsoft). But these improvements should be seen in the context of the state of the art. In particular, Google's scalability claims, while impressive, still fall short of the market leaders in enterprise search. Moreover, the bottleneck in enterprise search hasn't been the scale of document indexing, but rather the effectiveness with which people can access and interact with the indexed content. Interestingly, Google's strongest selling point for the GSA, their claim that it works "out of the box", is also its biggest weakness: even with the new set of features, the GSA does not offer the flexibility or rich functionality that enterprises have come to expect.

Second, the Chris Sherman piece. Here is an excerpt:
Enterprise search and web search are fundamentally different animals, and I'd argue that enterprise search won't--and shouldn't--be Google-y any time soon....Like web search, Google's enterprise search is easy to use--if you're willing to go along with how Google's algorithms view and present your business information....Ironically, enterprises, with all of their highly structured and carefully organized silos of information, require a very different and paradoxically more complex approach.
I highly recommend you read the whole article (it's only 2 pages), not only because it is informative and well written, but also because the author isn't working for one of the Big Three.

The upshot? There is no question that Google is raising the bar for simple search in the enterprise. I wouldn't recommend that anyone try to compete with the GSA on its turf.

But information needs in the enterprise go far beyond known-item search. What enterprises want when they ask for "enterprise search" is not just a search box, but an interactive tool that helps them (or their customers) work through the process of articulating and fulfilling their information needs, for tasks as diverse as customer segmentation, knowledge management, and e-discovery.

If you're interested in search and want to be on the cutting edge of innovation, I suggest you think about the enterprise.

Thursday, August 7, 2008

Where Google Isn't Good Enough

My last post, Is Google Good Enough?, challenged would-be Google killers to identify and address clear consumer needs for which Google isn't good enough as a solution. I like helping my readers, so here are some ideas.
  • Shopping. Google Product Search (fka Froogle) is not one of Google's crown jewels. At best, it works well when you know the exact name of the product you are looking for. But it pales in comparison to any modern ecommerce site, such as Amazon or Home Depot. What makes a shopping site successful? Put simply, it helps users find what they want, even when they didn't know exactly what they wanted when they started.

  • Finding a job. Google has not thrown its hat into the ring of job search, and even the page they offer for finding jobs at Google could use some improvement. The two biggest job sites, Monster and Careerbuilder, succeed in terms of the number of jobs posted, but aren't exactly optimized for user experience. Dice does better, but only for technology jobs. Interestingly, the best job finding site may be LinkedIn--not because of their search implementation (which is adequate but not innovative), but because of their success in getting millions of professionals to provide high-quality data.

  • Finding employees. Again, LinkedIn has probably come closest to providing a good employee finding site. The large job sites (all of which I've used at some point) not only fail to support exploratory search, but also suffer from a skew towards ineligible candidates and a nuisance of recruiters posing as job seekers. Here again, Google has not tried to compete.

  • Planning a trip. Sure, you can use Expedia, Travelocity, or Kayak to find a flight, hotel, and car rental. But there's a lot of room for improvement when it comes to planning a trip, whether for business or pleasure. The existing tools do a poor job of putting together a coordinated itinerary (e.g., meals, activities), and also don't integrate with relevant information sources, such as local directories and reviews. This is another area where Google has not tried to play.
Note two general themes here. The first is thinking beyond the mechanics of search and focusing on the ability to meet user needs at the task level. The second is the need for exploratory search. These only scratch the surface of opportunities in consumer-facing "search" applications. The opportunities within the enterprise are even greater, but I'll save that for my next post.

Tuesday, August 5, 2008

Is Google Good Enough?

As Chief Scientist of Endeca, I spend a lot of my time explaining to people why they should not be satisfied with an information seeking interface that only offers them keyword search as an input mechanism and a ranked list of results as output. I tell them about query clarification dialogs, faceted navigation, and set analysis. More broadly, I evangelize exploratory search and human computer information retrieval as critical to addressing the inherent weakness of conventional ranked retrieval. If you haven't heard me expound on the subject, feel free to check out this slide show on Is Search Broken?

But today I wanted to put my ideology aside and ask the simple question: Is Google good enough? Here is a good faith attempt to make the case for the status quo. I'll focus on web search, since, as I've discussed before on this blog, enterprise search is different.

1) Google does well enough on result quality, enough of the time.

While Google doesn't publish statistics about user satisfaction, it's common knowledge that Google usually succeeds in returning results that users find relevant. Granted, so do all of the major search engines: you can compare Google and Yahoo graphically at this site. But the question is not whether other search engines are also good enough--or even whether they are better. The point is that Google is good enough.

2) Google doesn't support exploratory search. But it often leads you to a tool that does.

The classic instance of this synergy is when Google leads you to a Wikipedia entry. For example, I look up Daniel Kahneman on Google. The top result is his Wikipedia entry. From there, I can traverse links to learn about his research areas, his colleagues, etc.

3) Google is a benign monopoly that mitigates choice overload.

Many people, myself included, have concerns about Google's increasing role in mediating our access to information. But it's hard to ignore the upside of a single portal that gives you access to everything in one place: web pages, blogs, maps, email, etc. And it's all "free"--at least insofar as ad-supported services can be said to be free.

In summary, Google sets the bar pretty high. There are places where Google performs poorly (e.g., shopping) or doesn't even try to compete (e.g., travel). But when I see the series of companies lining up to challenge Google, I have to wonder how many of them have identified and addressed clear consumer needs for which Google isn't good enough as a solution. Given Google's near-monopoly in web search, parity or even incremental advantage isn't enough.

Monday, July 28, 2008

Not as Cuil as I Expected

Today's big tech news is the launch of Cuil, the latest challenger to Google's hegemony in Web search. Given the impressive team of Xooglers that put it together, I had high expectations for the launch.

My overall reaction: not bad, but not good enough to take seriously as a challenge to Google. They may be "The World's Biggest Search Engine" based on the number of pages indexed, but they return zero results for a number of queries where Google does just fine, including noisy channel blog (compare to Google). But I'm not taking it personally--after all, their own site doesn't show up when you search for their name (again, compare to Google). As for their interface features (column display, explore by category, query suggestions), they're fine, but neither the concepts nor the quality of their implementation strike me as revolutionary.

Perhaps I'm expecting too much on day 1. But they're not just trying to beat Gigablast; they're trying to beat Google, and they surely expected to get lots of critical attention the moment they launched. Regardless of the improvements they've made in indexing, they clearly need to do more work on their crawler. It's hard to judge the quality of results when it's clear that at least some of the problem is that the most relevant documents simply aren't in their index. I'm also surprised not to see Wikipedia documents showing up much for my searches--particularly for searches where I'm quite sure the most relevant document is in Wikipedia. Again, it's hard to tell if this is an indexing or results quality issue.

I wish them luck--I speak for many in my desire to see Google face worthy competition in web search.

Sunday, July 27, 2008

Catching up on SIGIR '08

Now that SIGIR '08 is over, I hope to see more folks blogging about it. I'm jealous of everyone who had the opportunity to attend, not only because of the culinary delights of Singapore, but because the program seems to reflect an increasing interest of the academic community in real-world IR problems.

Some notes from looking over the proceedings:
  • Of the 27 paper sessions, 2 include the word "user" in their titles, 2 include the word "social", 2 focus on Query Analysis & Models, and 1 is about exploratory search. Compared to the last few SIGIR conferences, this is a significant increase in focus on users and interaction.

  • A paper on whether test collections predict users' effectiveness offers an admirable defense of the Cranfield paradigm, much along the lines I've been advocating.

  • A nice paper from Microsoft Research looks at the problem of whether to personalize results for a query, recognizing that not all queries benefit from personalization. This approach may well be able to reap the benefits of personalization while avoiding much of its harm.

  • Two papers on tag prediction: Real-time Automatic Tag Recommendation (ACM Digital Library subscription required) and Social Tag Prediction. Semi-automated tagging tools are one of the best ways to leverage the best of both human and machine capabilities.
And I haven't even gotten to the posters! I'm sad to see that they dropped the industry day, but perhaps they'll bring it back next year in Boston.

Wednesday, July 23, 2008

Knol: Google takes on Wikipedia

Just a few days ago, commenting on a New York Times article about Wikipedia's new approval system, I noted that the biggest problem with Wikipedia is anonymous authorship. By synchronous coincidence, Google unveiled Knol today, which is something of a cross between Wikipedia and Squidoo. Its most salient feature is that each entry will have a clearly identified author. They even allow authors to verify their identities using credit cards or phone directories.

It's a nice idea, since anonymous authorship is a major factor in the adversarial nature of information retrieval on the web. Not only does the accountability of authorship inhibit vandalism and edit wars, but it also allows readers to decide for themselves whom to trust--at least to the extent that readers are able and willing to obtain reliable information about the authors. Without question, they are addressing Wikipedia's biggest weakness.

But it's too little, too late. Wikipedia is already there. And, despite complaints about its inaccuracy and bias, Wikipedia is a fantastic, highly utilized resource. The only way I see for Knol to supplant Wikipedia in a reasonable time frame is through a massive cut-and-paste to make up for the huge difference in content.

Interestingly, Wikipedia does not seem to place any onerous restrictions on verbatim copying. However, unless a single author is 100% responsible for authoring a Wikipedia entry, it isn't clear that anyone can simply copy the entry into Knol.

I know that it's dangerous to bet against Google. But I'm really skeptical about this latest effort. It's a pity, because I think their emphasis is the right one. But for once I wish they'd been a bit more humble and accepted that they aren't going to build a better Wikipedia from scratch.

Saturday, July 19, 2008

Predictably Irrational

As regular readers have surely noticed by now, I've been on a bit of a behavioral psychology kick lately. Some of this reflects long-standing personal interest and my latest reading. But I also feel increasingly concerned that researchers in information seeking--especially those working on tools--have neglected the impact of cognitive bias.

For those who are unfamiliar with the last few decades of research in this field, I highly recommend a recent lecture by behavioral economist Dan Ariely on predictable irrationality. Not only is he a very informative and entertaining speaker, but he chooses very concrete and credible examples, starting with his contemplating how we experience pain, based on his own experience of suffering third-degree burns over 70 percent of his body. I promise you, the lecture is an hour well spent, and the time will fly by.

A running theme through this and my other posts on cognitive bias is that the way information is presented to us has dramatic effects on how we interpret that information.

This is great news for anyone who wants to manipulate people. In fact, I once asked Dan about the relative importance of people's inherent preferences vs. those induced by presentation on retail web sites, and he all but dismissed the former (i.e., you can sell ice cubes to Eskimos, if you can manipulate their cognitive biases appropriately). But it's sobering news for those of us who want to empower users to evaluate information objectively to support decision making.

Friday, July 18, 2008

Call to Action - A Follow-Up

The call to action I sent out a couple of weeks ago has generated healthy interest.

One of the several people who responded is the CTO of one of Endeca's competitors, whom I laud for understanding that the need to better articulate and communicate the technology of information access transcends competition among vendors. While we have differences on how to achieve this goal, I at least see hope from his responsiveness.

The rest were analysts representing some of the leading firms in the space. They not only expressed interest, but also contributed their own ideas on how to make this effort successful. Indeed, I met with two analysts this week to discuss next steps.

Here is where I see this going.

In order for any efforts to communicate the technology of information access to be effective, the forum has to establish credibility as a vendor-neutral and analyst-neutral forum. Ideally, that means having at least two major vendors and two major analysts on board. What we want to avoid is having only one major vendor or analyst, since that will create a reasonable perception of bias.

I'd also like to involve academics in information retrieval and library and information science. As one of the analysts suggested, we could reach out to the leading iSchools, who have expressed an open interest in engaging the broader community.

What I'd like to see come together is a forum, probably a one-day workshop, that brings together credible representatives from the vendor, analyst, and academic communities. With a critical mass of participants and enough diversity to assuage concerns of bias, we can start making good on this call to action.

Tuesday, July 15, 2008

Beyond a Reasonable Doubt

In Psychology of Intelligence Analysis, Richards Heuer advocates that we quantify expressions of uncertainty: "To avoid ambiguity, insert an odds ratio or probability range in parentheses after expressions of uncertainty in key judgments."

His suggestion reminds me of my pet peeve about the unquantified notion of reasonable doubt in the American justice system. I've always wanted (but never had the opportunity) to ask a judge what probability of innocence constitutes a reasonable doubt.

Unfortunately, as Heuer himself notes elsewhere in his book, we human beings are really bad at estimating probabilities. I suspect (with a confidence of 90 to 95%) that quantifying our uncertainties as probability ranges will only suggest a false sense of precision.

So, what can we do to better communicate uncertainty? Here are a couple of thoughts:
  • We can calibrate estimates based on past performance (a rough sketch follows this list). It's unclear what will happen if people realize that their estimates are being translated, but, at worst, it feels like good fodder for research in judgment and decision making.

  • We can ask people to express relative probability judgments. While these are also susceptible to bias, at least they don't demand as much precision. And we can always vary the framing of questions to try to factor out the cognitive biases they induce.
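
Here is a rough sketch of what calibrating against past performance might look like; the track record below is invented purely for illustration.

    # Past judgments: (stated confidence, whether the judgment turned out to be true).
    # Invented data, purely for illustration.
    track_record = [
        (0.9, True), (0.9, True), (0.9, False), (0.9, False),
        (0.7, True), (0.7, False), (0.7, True), (0.7, True),
    ]

    def calibrated(stated_confidence, track_record, tolerance=0.05):
        """Translate a stated confidence into the observed hit rate at that level."""
        outcomes = [correct for stated, correct in track_record
                    if abs(stated - stated_confidence) <= tolerance]
        return sum(outcomes) / len(outcomes) if outcomes else stated_confidence

    print(calibrated(0.9, track_record))  # this analyst's "90%" has historically meant about 50%
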
Also, when we talk about uncertainty, it is important that we distinguish between aleatory and epistemic uncertainty.

When I flip a coin, I am certain it has a 50% chance of landing heads, because I know the probability distribution of the event space. This is aleatory uncertainty, and forms the basis of probability and statistics.

But when I reason about less contrived uncertain events, such as estimating the likelihood that my bank will collapse this year, the challenge is my ignorance of the probability distribution. This is epistemic uncertainty, and it's a lot messier.
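
To make the distinction concrete, here is a toy simulation (all numbers invented). For the fair coin, the 50% describes the process itself, and no amount of flipping changes what I believe about it; for the coin of unknown bias, both my estimate and my uncertainty about that estimate shrink as evidence accumulates.

    import random

    random.seed(0)

    # Aleatory: the coin is known to be fair; p = 0.5 by assumption.
    heads = sum(random.random() < 0.5 for _ in range(1000))
    print(f"known fair coin: {heads}/1000 heads (p = 0.5 regardless)")

    # Epistemic: the bias p is unknown; the estimate tightens with evidence.
    true_p = 0.37  # hidden from the "analyst" in this toy example
    for n in (10, 100, 1000):
        heads = sum(random.random() < true_p for _ in range(n))
        estimate = heads / n
        stderr = (estimate * (1 - estimate) / n) ** 0.5
        print(f"after {n} flips: p is roughly {estimate:.2f} +/- {2 * stderr:.2f}")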

If you'd like to learn more about aleatory and epistemic uncertainty, I recommend Nassim Nicholas Taleb's Fooled by Randomness (which is a better read than his better-known Black Swan).

In summary, we have to accept the bad news that the real world is messy. As a mathematician and computer scientist, I've learned to pursue theoretical rigor as an ideal. Like me, you may find it very disconcerting to not be able to treat all real-world uncertainty in terms of probability spaces. Tell it to the judge!

Sunday, July 13, 2008

Small is Beautiful

Today's New York Times has an article by John Markoff called On a Small Screen, Just the Salient Stuff. It argues that the design constraints of the iPhone (and of mobile devices in general) lead to an improved user experience, since site designers do a better job of focusing on the information that users will find relevant.

Of course, on a blog entitled The Noisy Channel, I can't help praising approaches that strive to improve the signal-to-noise ratio in information seeking applications. And I'm glad to see them quoting Ben Shneiderman, a colleague of mine at the University of Maryland who has spent much of his career focusing on HCIR issues.

Still, I think they could have taken the idea much further. Their discussion of more efficient or ergonomic use of real estate boils down to stripping extraneous content (a good idea, but hardly novel), and making sites vertically oriented (i.e., no horizontal scrolling). They don't consider the question of what information is best to present in the limited space--which, in my mind, is the most important question to consider as we optimize interaction. Indeed, many of the questions raised by small screens also apply to other interfaces, such as voice.

Perhaps I am asking too much to expect them to call out the extreme inefficiency of ranked lists, compared to summarization-oriented approaches. Certainly the mobile space opens great opportunities for someone to get this right on the web.

Friday, July 11, 2008

Psychology of Intelligence Analysis

In the course of working with some of Endeca's more interesting clients, I started reading up on how the intelligence agencies address the challenges of making decisions, especially in the face of incomplete and contradictory evidence. I ran into a book called Psychology of Intelligence Analysis by former CIA analyst Richards Heuer. The entire book is available online, or you can hunt down a hard copy of the out-of-print book from your favorite used book seller.

Given the mixed record of the intelligence agencies over the past few decades, you might be wondering if the CIA is the best source for learning how to analyze intelligence. But this book is a gem. Even if the agencies don't always practice what they preach (and the book makes a good case as to why), the book is an excellent tour through the literature on judgment and decision making.

If you're already familiar with work by Herb Simon, Danny Kahneman, and Amos Tversky, then a lot of the ground he covers will be familiar--especially the third of the book that enumerates cognitive biases. I'm a big fan of the judgment and decision making literature myself. But I still found some great nuggets, particularly Chapter 8 on Analysis of Competing Hypotheses. Unlike most of the literature, which focuses exclusively on demonstrating our systematic departures from rationality, Heuer hopes to offer at least some constructive advice.

As someone who builds tools to help people make decisions using information that may be not only incomplete and contradictory, but also challenging to find in the first place, I'm very sensitive to how people's cognitive biases affect their ability to use these tools effectively. One of the HCIR '07 presentations by Jolie Martin and Michael Norton (who have worked with Max Bazerman) showed how the manner in which information was partitioned on retail web sites drove decisions, i.e., re-organizing the same information affected consumers' decision processes.

It may be tempting for us on the software side to wash our hands of our users' cognitive biases. But such an approach would be short-sighted. As Heuer shows in his well-researched book, people not only have cognitive biases, but are unable to counter those biases simply by being made aware of them. Hence, if software tools are to help people make effective decisions, it is the job of us tool builders to build with those biases in mind, and to support processes like Analysis of Competing Hypotheses that try to compensate for human bias.

Thursday, July 10, 2008

Nice Selection of Machine Learning Papers

John Langford just posted a list of seven ICML '08 papers that he found interesting. I appreciate his taste in papers, and I particularly liked a paper on Learning Diverse Rankings with Multi-Armed Bandits that addresses learning a diverse ranking of documents based on users' clicking behavior. If you liked the Less is More work that Harr Chen and David Karger presented at SIGIR '06, then I recommend you check this one out.
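
For those who haven't read it, the ranked-bandits idea (as I understand the paper) is to run one bandit per rank position, each learning from clicks which document to place at its rank, so that the top positions end up covering distinct user intents. Below is a minimal epsilon-greedy sketch with a simulated user; it follows the spirit of the approach, not the authors' exact algorithm, and all the data is invented.

    import random

    DOCS = ["d1", "d2", "d3", "d4", "d5"]
    K = 3  # length of the ranking we learn

    class EpsilonGreedyBandit:
        """One bandit per rank position; each arm is a document."""
        def __init__(self, arms, epsilon=0.1):
            self.epsilon = epsilon
            self.counts = {arm: 0 for arm in arms}
            self.values = {arm: 0.0 for arm in arms}

        def select(self, exclude):
            candidates = [arm for arm in self.counts if arm not in exclude]
            if random.random() < self.epsilon:
                return random.choice(candidates)
            return max(candidates, key=lambda arm: self.values[arm])

        def update(self, arm, reward):
            self.counts[arm] += 1
            self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

    def simulated_user(ranking):
        """Toy user with one of two interests; clicks the first relevant result, if any."""
        interests = random.choice([{"d1"}, {"d3", "d4"}])
        for rank, doc in enumerate(ranking):
            if doc in interests:
                return rank
        return None

    bandits = [EpsilonGreedyBandit(DOCS) for _ in range(K)]
    for _ in range(5000):
        ranking = []
        for bandit in bandits:
            ranking.append(bandit.select(exclude=set(ranking)))
        clicked = simulated_user(ranking)
        for rank, (bandit, doc) in enumerate(zip(bandits, ranking)):
            bandit.update(doc, 1.0 if clicked == rank else 0.0)

    print(ranking)  # over many rounds, the top ranks learn to cover both interest profiles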

Tuesday, July 8, 2008

Librarian 2.0

Many of the words that mark milestones in the history of technology, such as calculator and word processor, originally corresponded to people. Calculating had at least two lives as a technology breakthrough--first as a process, and then as an automatic means for executing that process. Thanks to inventions like calculators and computers, human beings have moved up the value chain to become scientists and engineers who take low-level details for granted.

Similarly, the advances in information science and retrieval have dramatically changed the role of a reference librarian.

Hopefully some of you are old enough to remember card catalogs. They were certainly functional if you knew the exact title or author you were looking for, assuming the title wasn't too generic or the author too prolific. Where card catalogs fell short was in supporting exploratory search. In many cases, your best bet was quite literally to explore the stacks and hope that locality within the Dewey Decimal system sufficed to support your information seeking needs. Alternatively, you could follow citation paths--the dead-tree precursor of surfing a hypertext collection.

For exploratory tasks, library patrons would turn to reference librarians, who would clarify the patrons' needs through a process called the reference interview. According to Wikipedia:
A reference interview is composed of two segments:

1. An initial segment in which the librarian encourages the user to fully discuss the request.
2. A final segment in which the librarian asks questions to relate the request to the materials available in the library

A reference interview is structured (ideally) according to the following series of steps. First the library user states a question or describes a problem. The librarian then clarifies the user's information need, sometimes leading him or her back from a request for a specific resource (which may not be the best one for the problem at hand) to the actual information need as it manifests in the library user's life. Following that, the librarian suggests information resources that address the user's information need, explaining the nature and scope of information they contain and soliciting feedback. The reference interview closes when the librarian has provided the appropriate information or a referral to an outside resource where it can be found, and the user confirms that he or she has received the information needed.
Fast forward to the present day. Thanks to modern search engines, title and author search are no longer tedious processes. Moreover, search engines are somewhat forgiving of users, offering spelling correction and inexact query matching. Libraries are still catching up with advances in technology, but the evolution is clearly under way.

However, search engines have not obviated the need for a reference interview. Excepting the simple cases of known item search, the typical information seeker needs help translating an information need into one or more search queries. And that information need may change as the seeker learns from the process.

But it should come as no surprise that information seeking support systems need to be more than search engines. The ideal information seeking support system emulates a reference librarian, stepping users through a structured process of clarification. Indeed, this is exactly what my colleagues and I at Endeca are trying to do in our work with libraries and more broadly in pursuing a vision of human computer information retrieval.
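
To illustrate what "stepping users through a structured process of clarification" might look like in code, here is a toy clarification loop over a hand-made facet schema. It is not any product's implementation, just a sketch of the interaction pattern:

    # Toy sketch of a reference-interview-style clarification loop:
    # instead of returning a flat result list, the system proposes a
    # facet that narrows the information need, then repeats.

    BOOKS = [
        {"title": "Intro to Algorithms",    "subject": "computing", "format": "print"},
        {"title": "Information Retrieval",  "subject": "computing", "format": "ebook"},
        {"title": "The Art of War",         "subject": "history",   "format": "print"},
    ]

    def facet_counts(results, field):
        counts = {}
        for r in results:
            counts[r[field]] = counts.get(r[field], 0) + 1
        return counts

    def clarify(results):
        # Ask about the facet with the most distinct values among the
        # current results: a crude stand-in for a clarifying question.
        return max(("subject", "format"),
                   key=lambda f: len(facet_counts(results, f)))

    results = BOOKS
    while len(results) > 1:
        field = clarify(results)
        print(f"Which {field}? {facet_counts(results, field)}")
        choice = input("> ").strip()   # e.g. "computing", then "ebook"
        results = [r for r in results if r[field] == choice]

    print("Found:", results[0]["title"] if results else "nothing")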

What then becomes of librarians? Much as calculators and computers did not obviate the need for mathematicians, I don't see technology obviating the need for information scientists. Library schools have already evolved into information schools, and I have no doubt that their graduates will help establish the next generation of information seeking technology that makes today's search engines seem as quaint as card catalogs.

Sunday, July 6, 2008

Resolving the Battle Royale between Information Retrieval and Information Science

The following is the position paper I submitted to the NSF Information Seeking Support Systems Workshop last month. The workshop report is still being assembled, but I wanted to share my own contribution to the discussion, since it is particularly appropriate to the themes of The Noisy Channel.


Resolving the Battle Royale between Information Retrieval and Information Science


Daniel Tunkelang

Endeca

ABSTRACT

We propose an approach to help resolve the “battle royale” between the information retrieval and information science communities. The information retrieval side favors the Cranfield paradigm of batch evaluation, criticized by the information science side for its neglect of the user. The information science side favors user studies, criticized by the information retrieval side for their scale and repeatability challenges. Our approach aims to satisfy the primary concerns of both sides.

Categories and Subject Descriptors

H.1.2 [Human Factors]: Human information processing.

H.3.3 [Information Systems]: Information Search and Retrieval - Information Filtering, Retrieval Models

H.5.2 [Information Systems]: Information Interfaces and Presentation - User Interfaces

General Terms

Design, Experimentation, Human Factors

Keywords

Information science, information retrieval, information seeking, evaluation, user studies

1. INTRODUCTION

Over the past few decades, a growing community of researchers has called for the information retrieval community to think outside the Cranfield box. Perhaps the most vocal advocate is Nick Belkin, whose "grand challenges" in his keynote at the 2008 European Conference on Information Retrieval [1] all pertained to the interactive nature of information seeking he claims the Cranfield approach neglects. Belkin cited similar calls to action going back as far as Karen Spärck Jones, in her 1988 acceptance speech for the Gerald Salton award [2], and again from Tefko Saracevic, when he received the same award in 1997 [3]. More recently, we have the Information Seeking and Retrieval research program proposed by Peter Ingwersen and Kalervo Järvelin in The Turn, published in 2005 [4].

2. IMPASSE BETWEEN IR AND IS

Given the advocacy of Belkin and others, why hasn't there been more progress? As Ellen Voorhees noted in defense of Cranfield at the 2006 Workshop on Adaptive Information Retrieval, "changing the abstraction slightly to include just a bit more characterization of the user will result in a dramatic loss of power or increase in cost of retrieval experiments" [5]. Despite user studies that have sought to challenge the Cranfield emphasis on batch information retrieval measures like mean average precision—such as those of Andrew Turpin and Bill Hersh [6]—the information retrieval community, on the whole, remains unconvinced by these experiments because they are smaller in scale and less repeatable than the TREC evaluations.

As Tefko Saracevic has said, there is a "battle royale" between the information retrieval community, which favors the Cranfield paradigm of batch evaluation despite its neglect of the user, and the information science community, which favors user studies despite their scale and repeatability challenges [7]. How do we move forward?

3. PRIMARY CONCERNS OF IR AND IS

Both sides have compelling arguments. If an evaluation procedure is not repeatable and cost-effective, it has little practical value. Nonetheless, it is essential that an evaluation procedure measure the interactive nature of information seeking.

If we are to find common ground to resolve this dispute, we need to satisfy the primary concerns of both sides:

· Real information seeking tasks are interactive, so the results of the evaluation procedure must be meaningful in an interactive context.

· The evaluation procedure must be repeatable and cost-effective.

In order to move beyond the battle royale and resolve the impasse between the IR and IS communities, we need to address both of these concerns.

4. PROPOSED APPROACH


A key point of contention in the battle royale is whether we should evaluate systems by studying individual users or measuring system performance against test collections.

The short answer is that we need to do both. In order to ground the results of evaluation in realistic contexts, we need to conduct user studies that relate proposed measures to success in interactive information seeking tasks. Otherwise, we optimize under the artificial constraint that a task involves only a single user query.

Such an approach presumes that we have a characterization of information seeking tasks. This characterization is an open problem that is beyond the scope of this position paper but has been addressed by other information seeking researchers, including Ingwersen and Järvelin [4]. We presume access to a set of tasks that, if not exhaustive, at least applies to a valuable subset of real information seeking problems.

Consider, as a concrete example, the task of a researcher who, given a comprehensive digital library of technical publications, wants to determine with confidence whether his or her idea is novel. In other words, the researcher wants either to discover prior art that anticipates the idea, or to state with confidence that there is no such art. Patent inventors and lawyers performing e-discovery perform analogous tasks. We can measure task performance objectively as a combination of accuracy and efficiency, and we can also consider subjective measures like user confidence and satisfaction. Let us assume that we are able to quantify a task success measure that incorporates these factors.
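
For illustration only, one hypothetical way to quantify such a success measure is a weighted combination of accuracy, efficiency, and the subjective ratings; the weights, time budget, and normalization below are mine, not part of the proposal:

    # Hypothetical composite task-success measure; all weights are illustrative.
    def task_success(accuracy, minutes, confidence, satisfaction,
                     time_budget=30.0, weights=(0.4, 0.2, 0.2, 0.2)):
        # Efficiency: 1.0 for an instant answer, 0.0 at or beyond the budget.
        efficiency = max(0.0, 1.0 - minutes / time_budget)
        w_acc, w_eff, w_conf, w_sat = weights
        # All inputs are assumed to lie in [0, 1] except minutes.
        return (w_acc * accuracy + w_eff * efficiency +
                w_conf * confidence + w_sat * satisfaction)

    print(task_success(accuracy=0.9, minutes=12, confidence=0.8, satisfaction=0.7))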

Given this task and success measure, we would like to know how well an information retrieval system supports the user performing it. As the information scientists correctly argue, user studies are indispensable. But, as we employ user studies to determine which systems are most helpful to users, we need to go a step further and correlate user success to one or more system measures. We can then evaluate these system measures in a repeatable, cost-effective process that does not require user involvement.

For example, let us hypothesize that mean average precision (MAP) on a given TREC collection is such a measure. We hypothesize that users pursuing the prior art search task are more successful using a system with higher MAP than those using a system with lower MAP. In order to test this hypothesis, we can present users with a family of systems that, insofar as possible, vary only in MAP, and see how well user success correlates to the system’s MAP. If the correlation is strong, then we validate the utility of MAP as a system measure and invest in evaluating systems using MAP against the specified collection in order to predict their utility for the prior art task.
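
A minimal sketch of that validation step, with made-up numbers standing in for the user study: given each system variant's MAP on the collection and the mean task success of the users assigned to it, we check how strongly the two correlate.

    from statistics import mean

    # Made-up study data: MAP of each system variant on the test collection,
    # and the mean task-success score of the users assigned to that variant.
    systems = {
        "A": {"map": 0.21, "success": 0.52},
        "B": {"map": 0.27, "success": 0.61},
        "C": {"map": 0.33, "success": 0.64},
        "D": {"map": 0.40, "success": 0.71},
    }

    def pearson(xs, ys):
        mx, my = mean(xs), mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    maps = [s["map"] for s in systems.values()]
    succ = [s["success"] for s in systems.values()]
    print(f"correlation between MAP and user success: r = {pearson(maps, succ):.2f}")
    # A strong correlation would justify relying on MAP alone for cheaper,
    # repeatable batch evaluation on this task and collection.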

The principle here is a general one: it can be used not only to compare different algorithms, but also to evaluate more sophisticated interfaces, such as document clustering [8] or faceted search [9]. The only requirement is that we hypothesize and validate system measures that correlate to user success.

5. WEAKNESSES OF APPROACH

Our proposed approach has two major weaknesses.

The first weakness is that, in a realistic interactive information retrieval context, distinct queries are not independent. Rather, a typical user executes a sequence of queries in pursuit of an information need, each query informed by the results of the previous ones.

In a batch test, we must decide the query sequence in advance, and we cannot model how the user's queries depend on system response. Hence, we are limited to computing measures that can be evaluated for each query independently. Nonetheless, we can choose measures that correlate with effectiveness in realistic settings, in the hope that they remain meaningful even when we remove the test queries from their realistic context.

The second weakness is that we do not envision a way to compare different interfaces in a batch setting. It seems that testing the relative merits of different interfaces requires real--or at least simulated--users.

If, however, we hold the interface constant, then we can define performance measures that apply to those interfaces. For example, we can develop standardized versions of well-studied interfaces, such as faceted search and clustering. We can then compare the performance of different systems that use these interfaces, e.g., different clustering algorithms.

6. AN ALTERNATIVE APPROACH

An alternative way to tackle the evaluation problem leverages the “human computation” approach championed by Luis Von Ahn [10]. This approach uses “games with a purpose” to motivate people to perform information-related tasks, such as image tagging and optical character recognition (OCR).

A particularly interesting "game" in our present context is Phetch, in which in which one or more "Seekers" compete to find an image based on a text description provided by a "Describer" [11]. The Describer’s goal is to help the Seekers succeed, while the Seekers compete with one another to find the target image within a fixed time limit, using search engine that has indexed the images based on tagging results from the ESP Game. In order to discourage a shotgun approach, the game penalizes Seekers for wrong guesses.

This game goes quite far in capturing the essence of interactive information retrieval. If we put aside the competition among the Seekers, then we see that an individual Seeker, aided by the human Describer and the algorithmic (but human-indexed) search engine, is pursuing an information retrieval task. Moreover, the Seeker is incented to be both effective and efficient.

How can we leverage this framework for information retrieval evaluation? Even though the game envisions both Describers and Seekers to be human beings, there is no reason we cannot allow computers to play too--in either or both roles. Granted, the game, as currently designed, focuses on image retrieval without giving the human players direct access to the image tags, but we could imagine a framework that is more amenable to machine participation, e.g., providing a machine player with a set of tags derived from those in the index when that player is presented with an image. Alternatively, there may be a domain more suited than image retrieval to incorporating computer players.

The main appeal of the game framework is that it allows all participants to be judged based on an objective criterion that reflects the effectiveness and efficiency of the interactive information retrieval process. A good Describer should, on average, outscore a bad Describer over the long term; likewise, a good Seeker should outscore a bad one. We can even vary the search engine available to Seekers, in order to compare competing search engine algorithms or interfaces.
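
As a sketch of how such scoring could serve as an evaluation instrument, the simulation below charges a simulated Seeker for elapsed time and wrong guesses and compares the average score achieved with two hypothetical engines. The scoring rule and the engine parameters are stand-ins of my own, not Phetch's actual values:

    import random

    # Sketch of using a Phetch-like scoring rule to compare two engines.
    # All numbers are illustrative stand-ins, not the real game's parameters.
    TIME_LIMIT = 60.0         # seconds per round
    WRONG_GUESS_PENALTY = 20  # points deducted per wrong guess
    BASE_SCORE = 100

    def play_round(p_hit_per_query, mean_query_time):
        """Simulate one Seeker round against an engine characterized by its
        chance of surfacing the target per query and its time per query."""
        t, wrong_guesses = 0.0, 0
        while True:
            t += random.expovariate(1.0 / mean_query_time)
            if t > TIME_LIMIT:
                return 0  # ran out of time
            if random.random() < p_hit_per_query:
                return max(0, BASE_SCORE - WRONG_GUESS_PENALTY * wrong_guesses)
            wrong_guesses += 1

    def average_score(engine, rounds=5000):
        return sum(play_round(**engine) for _ in range(rounds)) / rounds

    engines = {
        "engine X": {"p_hit_per_query": 0.4, "mean_query_time": 12.0},
        "engine Y": {"p_hit_per_query": 0.3, "mean_query_time": 8.0},
    }
    for name, engine in engines.items():
        print(name, round(average_score(engine), 1))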

7. CONCLUSION

Our goal is ambitious: we aspire towards an evaluation framework that satisfies information scientists as relevant to real-world information seeking, but nonetheless offers the practicality of the Cranfield paradigm that dominates information retrieval. The near absence of collaboration between the information science and information retrieval communities has been a greatly missed opportunity, not only for both research communities but also for everyone else who could benefit from practical advances in our understanding of information seeking. We hope that the approach we propose takes at least a small step towards resolving this battle royale.

8. REFERENCES

[1] Belkin, N. J., 2008. Some(What) Grand Challenges for Information Retrieval. ACM SIGIR Forum 42, 1 (June 2008), 47-54.

[2] Spärck Jones, K. 1988. A look back and a look forward. In Proceedings of the 11th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval, 13-29.

[3] Saracevic, T. 1997. Users lost: reflections of the past, future and limits of information science. ACM SIGIR Forum 31, 2 (July 1997), 16-27.

[4] Ingwersen, P. and Järvelin, K. 2005. The turn. Integration of information seeking and retrieval in context. Springer.

[5] Voorhees, E. 2006. Building Test Collections for Adaptive Information Retrieval: What to Abstract for What Cost? In First International Workshop on Adaptive Information Retrieval (AIR).

[6] Turpin, A. and Scholer, F. 2006. User performance versus precision measures for simple search tasks. In Proceedings of the 29th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval, 11-18.

[7] Saracevic, T. 2007. Relevance: A review of the literature and a framework for thinking on the notion in information science. Part II: Nature and manifestations of relevance. Journal of the American Society for Information Science and Technology 58(13), 1915-1933.

[8] Cutting, D., Karger, D., Pedersen, J., and Tukey, J. 1992. Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections. In Proceedings of the 15th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval, 318-329.

[9] Workshop on Faceted Search. 2006. In Proceedings of the 29th Annual ACM SIGIR International Conference on Research and Development in Information Retrieval.

[10] Von Ahn, L. 2006. Games with a Purpose. IEEE Computer 39, 6 (June 2006), 92-94.

[11] Von Ahn, L., Ginosar, S., Kedia, M., Liu, R., and Blum, M. 2006. Improving accessibility of the web with a computer game. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 79-82.