Friday, May 30, 2008

Is Search Broken?

Last night, I had the privilege of speaking to fellow CMU School of Computer Science alumni at Fidelity's Center for Advanced Technology in Boston. Dean Randy Bryant, Associate Director of Corporate Relations Dan Jenkins, and Director of Alumni Relations Tina Carr organized the event, and they encouraged me to pick a provocative subject.

Thus encouraged, I decided to ask the question: Is Search Broken?

Slides are here as a PowerPoint show for anyone interested, or use the embedded SlideShare show below.



Wednesday, May 28, 2008

Another HCIR Game

I just received an announcement from the SIG-IRList about the flickling challenge, a "game" designed around known-item image retrieval from Flickr. The user is given an image (not annotated), and the goal is to find that image again on Flickr using the system.

I'm not sure how well it will catch on with casual gamers--but that is hardly its primary motivation. Rather, the challenge was designed to help provide a foundation for evaluating interactive information retrieval--in a cross-language setting, no less. Details available at the iCLEF 2008 site or in this paper.

I'm thrilled to see efforts like these emerging to evaluate interactive retrieval--indeed, this feels like a solitaire version of Phetch.

Tuesday, May 27, 2008

The Magic Shelf

I generally shy away from pimping Endeca's customers here at The Noisy Channel, but occasionally I have to make an exception. As some of you may remember, Borders made a deal several years ago to have Amazon operate their web site. Last year, they decided to reclaim their site. And today they are live, powered by Endeca! For more details, visit http://blog.endeca.com.

Now back to our commercial-free programming...

Monday, May 26, 2008

Your Input is Relevant!

The following is a public service announcement.

As some of you may know, I am the primary author of the Human Computer Information Retrieval entry on Wikipedia. I created this entry last November, shortly after the HCIR '07 workshop. One of the ideas we've tossed around for HCIR '08 is to collaboratively edit the page. But why wait? With apologies to Isaac Asimov, I/you/we are Wikipedia, so let's improve the entry now!

And, while you've got Wikipedia on the brain, please take a look at the Relevance (Information Retrieval) entry. After an unsuccessful attempt to have this entry folded into the main Information Retrieval entry, I've tried to rewrite it to conform to what I perceive as Wikipedia's standards of quality and non-partisanship. While I tried my best, I'm sure there's still room for improving it, and I suspect that some of you reading this are among the best qualified folks to do so!

As Lawrence Lessig says, it's a read-write society. So readers, please help out a bit with the writing.

Saturday, May 24, 2008

Games With an HCIR Purpose?

A couple of weeks ago, my colleague Luis Von Ahn at CMU launched Games With a Purpose.

Here is a brief explanation from the site:

When you play a game at Gwap, you aren't just having fun. You're helping the world become a better place. By playing our games, you're training computers to solve problems for humans all over the world.

Von Ahn has made a career (and earned a MacArthur Fellowship) from his work on such games, most notably the ESP Game and reCAPTCHA. His games emphasize tagging tasks that are difficult for machines but easy for human beings, such as labeling images with high-level descriptors.

I've been interested in Von Ahn's work for several years, most particularly in Phetch, a game that never quite made it out of beta but strikes me as one of the most ambitious examples of "human computation". Here is a description from the Phetch site:

Quick! Find an image of Michael Jackson wearing a sailor hat.
Phetch is like a treasure hunt -- you must find or help find an image from the Web.

One of the players is the Describer and the others are Seekers. Only the Describer can see the hidden image, and has to help the Seekers find it by giving them descriptions.

If the image is found, the Describer wins 200 points. The first to find it wins 100 points and becomes the new Describer.

A few important details that this description leaves out:

  • The Seeker (but not the Describer) has access to a search engine that has indexed the images based on results from the ESP Game.
  • A Seeker loses points (I can't recall how many) for wrong guesses.
  • The game has a time limit (hence the "Quick!").

Now, let's unpack the game description and analyze it in terms of the Human-Computer Information Retrieval (HCIR) paradigm. First, let us simplify the game, so that there is only one Seeker. In that case, we have a cooperative information retrieval game, where the Describer is trying to describe a target document (specifically, an image) as informatively as possible, while the Seeker is trying to execute clever algorithms in his or her wetware to retrieve it. If we think in terms of a traditional information retrieval setup, that makes the Describer the user and the Seeker the information retrieval system. Sort of.

A full analysis of this game is beyond the scope of a single blog post, but let's look at the game from the Seeker's perspective, keeping our assumption that there is only one Seeker and adding the assumption that the Describer's input is static and supplied before the Seeker starts trying to find the image.

Assuming these simplifications, here is how a Seeker plays Phetch:

  • Reads the description provided by the Describer and uses it to compose a search.
  • Scans the results sequentially, interrupting either to make a guess or to reformulate the search.

The key observation is that Phetch is about interactive information retrieval. A good Seeker recognizes when it is better to try reformulating the search than to keep scanning.

Returning to our theme of evaluation, we can envision modifying Phetch to create a system for evaluating interactive information retrieval. In fact, I persuaded my colleague Shiry Ginosar, who worked with Von Ahn on Phetch and is now a software engineer at Endeca, to elaborate such an approach at HCIR '07. There are a lot of details to work out, but I find this vision very compelling and perhaps a route to addressing Nick Belkin's grand challenge.
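
To make that a bit more concrete, here is a minimal sketch (in Python, over an entirely hypothetical session log; nothing here reflects Phetch's actual implementation) of how a simplified, single-Seeker session might be scored for evaluation: count reformulations, results scanned, and wrong guesses, and record whether and when the target was found.

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Action:
        kind: str              # "query", "scan", or "guess" (hypothetical log format)
        correct: bool = False  # only meaningful for "guess" actions

    def score_session(actions: List[Action], max_actions: int = 60) -> dict:
        """Summarize one single-Seeker session: effort spent and whether the target was found."""
        queries = sum(1 for a in actions if a.kind == "query")
        scanned = sum(1 for a in actions if a.kind == "scan")
        wrong_guesses = sum(1 for a in actions if a.kind == "guess" and not a.correct)
        # position (1-based) of the first correct guess, if any
        found_at: Optional[int] = next(
            (i + 1 for i, a in enumerate(actions) if a.kind == "guess" and a.correct), None)
        success = found_at is not None and found_at <= max_actions
        return {
            "success": success,
            "reformulations": max(queries - 1, 0),
            "results_scanned": scanned,
            "wrong_guesses": wrong_guesses,
            "actions_to_success": found_at if success else None,
        }

    # a made-up session: one reformulation, three results scanned, then a correct guess
    session = [Action("query"), Action("scan"), Action("scan"),
               Action("query"), Action("scan"), Action("guess", correct=True)]
    print(score_session(session))
    # {'success': True, 'reformulations': 1, 'results_scanned': 3, 'wrong_guesses': 0, 'actions_to_success': 6}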

Thursday, May 22, 2008

Back from Orlando

I'm back from Endeca Discover '08: two and a half days of presentations, superheroic attractions, and, in the best tradition of The Noisy Channel, karaoke. A bunch of us tried our best to blog the presentations at http://blog.endeca.com/.

All in all, a fun, exhausting time, but it's good to be back home. So, for those who have noticed the lack of posts in your RSS feeds, I promise I'll start making it up to you in the next few days.

Friday, May 16, 2008

Attending Endeca Discover '08

I'll be attending Endeca Discover '08, Endeca's annual user conference, from Sunday, May 18th to Wednesday, May 21st, so you might see a bit of a lull in my verbiage here while I live blog at http://blog.endeca.com and hang out in sunny Orlando with Endeca customers and partners.

If you're attending Discover, please give me a shout and come to my sessions:
Otherwise, I'll do my best to sneak in a post or comment, and I'll be back in full force later next week.

A Utilitarian View of IR Evaluation

In many information retrieval papers that propose new techniques, the authors validate those techniques by demonstrating improved mean average precision over a standard test collection. The value of such results--at least to a practitioner--hinges on whether mean average precision correlates to utility for users. Not only do user studies place this correlation in doubt, but I have yet to see an empirical argument defending the utility of average precision as an evaluation measure. Please send me any references if you are aware of them!
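
For readers who haven't computed the measure by hand, here is a small sketch of average precision for a single ranked list, and of mean average precision over a set of queries. The rankings and relevance judgments are made up for illustration; this is just the textbook definition, not any particular evaluation toolkit.

    def average_precision(ranked_ids, relevant_ids):
        """Average of the precision values at the rank of each relevant document retrieved."""
        relevant_ids = set(relevant_ids)
        hits, precisions = 0, []
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                hits += 1
                precisions.append(hits / rank)
        # relevant documents that are never retrieved contribute zero
        return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

    def mean_average_precision(runs):
        """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
        return sum(average_precision(r, q) for r, q in runs) / len(runs)

    # toy example: two queries with made-up judgments
    print(mean_average_precision([
        (["d3", "d1", "d7", "d2"], ["d1", "d2"]),  # AP = (1/2 + 2/4) / 2 = 0.5
        (["d5", "d6"], ["d6"]),                    # AP = 1/2
    ]))  # MAP = 0.5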

Of course, user studies are fraught with complications, the most practical one being their expense. I'm not suggesting that we need to replace Cranfield studies with user studies wholesale. Rather, I see the purpose of user studies as establishing the utility of measures that can then be evaluated by Cranfield studies. As with any other science, we need to work with simplified, abstract models to achieve progress, but we also need to ground those models by validating them in the real world.

For example, consider the scenario where a collection contains no documents that match a user's need. In this case, it is ideal for the user to reach this conclusion as accurately, quickly, and confidently as possible. Holding the interface constant, are there evaluation measures that correlate to how well users perform on these three criteria? Alternatively, can we demonstrate that some interfaces lead to better user performance than others? If so, can we establish measures suitable for those interfaces?
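
To make the question concrete, here is one way such a study might be summarized, over hypothetical observations of users searching a collection with no matching documents. The record fields (whether the user concluded nothing matched, how long it took, and a self-reported confidence) are my own invention for illustration, not an established protocol.

    def summarize_no_match_study(observations):
        """observations: list of dicts with 'concluded_no_match' (bool),
        'seconds' (float), and 'confidence' (float in [0, 1], self-reported)."""
        n = len(observations)
        correct = [o for o in observations if o["concluded_no_match"]]
        return {
            # how accurately: fraction of users who correctly concluded nothing matched
            "accuracy": len(correct) / n,
            # how quickly: mean time among those who reached the correct conclusion
            "mean_seconds_to_conclusion": (
                sum(o["seconds"] for o in correct) / len(correct) if correct else None),
            # how confidently: mean self-reported confidence across all users
            "mean_confidence": sum(o["confidence"] for o in observations) / n,
        }

    print(summarize_no_match_study([
        {"concluded_no_match": True,  "seconds": 45.0,  "confidence": 0.9},
        {"concluded_no_match": True,  "seconds": 90.0,  "confidence": 0.6},
        {"concluded_no_match": False, "seconds": 300.0, "confidence": 0.4},
    ]))
    # accuracy = 2/3, mean_seconds_to_conclusion = 67.5, mean_confidence ≈ 0.63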

The "no documents" case is just one of many real-world scenarios, and I don't mean to suggest we should study it at the expense of all others. That said, I think it's a particularly valuable scenario that, as far as I can tell, has been neglected by the information retreival community. I use it to drive home the argument that practical use cases should drive our process of defining evaluation measures.

Tuesday, May 13, 2008

Thinking about IR Evaluation

I just read the recent Information Processing & Management special issue on Evaluation of Interactive Information Retrieval Systems. The articles were a worthwhile read, and yet they weren't exactly what I was looking for. Let me explain.

In fact, let's start by going back to Cranfield. The Cranfield paradigm offers us a quantitative, repeatable means to evaluate information retrieval systems. Its proponents make a strong case that it is effective and cost-effective. Its critics object that it measures the wrong thing because it neglects the user.

But let's look a bit harder at the proponents' case. The primary measure in use today is average precision--indeed, most authors of SIGIR papers validate their proposed approaches by demonstrating increased mean average precision (MAP) over a standard test collection of queries. The dominance of average precision as a measure is no accident: it has been shown to be the best single predictor of the precision-recall graph.

So why are folks like me complaining? There are the various user studies asserting that MAP does not predict user performance on search tasks. Those have me at hello, but the studies are controversial in the information retrieval community, and in any case not constructive.

Instead, consider a paper by Harr Chen and David Karger (both at MIT) entitled "Less is more." Here is a snippet from the abstract:
Traditionally, information retrieval systems aim to maximize the number of relevant documents returned to a user within some window of the top. For that goal, the probability ranking principle, which ranks documents in decreasing order of probability of relevance, is provably optimal. However, there are many scenarios in which that ranking does not optimize for the user's information need.
Let me rephrase that: the precision-recall graph, which indicates how well a ranked retrieval algorithm does at ranking relevant documents ahead of irrelevant ones, does not necessarily characterize how well a system meets a user's information need.

One of Chen and Karger's examples is the case where the user is only interested in retrieving one relevant document. In this case, a system does well to return a diverse set of results that hedges against different possible query interpretations or query processing strategies. The authors also discuss more general scenarios, along with heuristics to address them.
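
To illustrate the difference, and only as a toy rather than Chen and Karger's actual models, suppose each result carries an estimated relevance probability and a hypothetical query-interpretation label. Ranking purely by probability stacks the top of the list with documents from the most likely interpretation; if the user needs only one relevant document, a greedy strategy that covers distinct interpretations first hedges better against a wrong guess about intent.

    def top_k_by_probability(docs, k):
        # probability-ranking-principle style: sort purely by estimated relevance probability
        return sorted(docs, key=lambda d: d["p_rel"], reverse=True)[:k]

    def top_k_with_interpretation_coverage(docs, k):
        # greedy hedge: prefer the best document from an as-yet-uncovered interpretation,
        # falling back to raw probability once every interpretation is covered
        remaining = sorted(docs, key=lambda d: d["p_rel"], reverse=True)
        chosen, covered = [], set()
        while remaining and len(chosen) < k:
            pick = next((d for d in remaining if d["interp"] not in covered), remaining[0])
            chosen.append(pick)
            covered.add(pick["interp"])
            remaining.remove(pick)
        return chosen

    # made-up results for an ambiguous query like "jaguar"
    docs = [
        {"id": 1, "interp": "car",    "p_rel": 0.9},
        {"id": 2, "interp": "car",    "p_rel": 0.8},
        {"id": 3, "interp": "car",    "p_rel": 0.7},
        {"id": 4, "interp": "animal", "p_rel": 0.6},
        {"id": 5, "interp": "os",     "p_rel": 0.3},
    ]
    print([d["id"] for d in top_k_by_probability(docs, 3)])                # [1, 2, 3]
    print([d["id"] for d in top_k_with_interpretation_coverage(docs, 3)])  # [1, 4, 5]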

But the main contribution of this paper, at least in my eyes, is a philosophical one. The authors consider the diversity of user needs and offer a quantitative, repeatable way to evaluate information retrieval systems with respect to different needs. Granted, they do not even consider the challenge of evaluating interactive information retrieval. But they do set a good example.

Stay tuned for more musings on this theme...

Monday, May 12, 2008

A Lofty Goal

The blogosphere is all atwitter with Powerset's public launch last night. Over at Techcrunch, Michael Arrington refers to their approach as a lofty goal.

But I'd like us to dream bigger. In the science fiction stories that inspired me to study computer and information science, the human-computer interface is not just natural language input. It's dialogue. The authors do not treat machine understanding of unambiguous requests as a wonder, but instead take it for granted as an artifact of technical progress. Indeed, the human-computer interface only becomes relevant to the plot when communication breaks down (aka "that does not compute").

Ever since I hacked a BASIC version of ELIZA on a Commodore 64, I've felt the visceral appeal of natural language input as an interface. Conversely, the progress of speech synthesis attests to our desire to humanize the machine's output. It is as if we want to reduce the Turing Test to a look-and-feel.

But the essence of dialogue lies beneath the surface. The conversations we have with machines are driven by our information needs, and should be optimized to that end. Even we humans drop natural language among ourselves when circumstances call for more efficient communication. Consider an example as mundane as Starbucks baristas eliciting and delegating a latte order.

In short, let's remember that we want to talk with our computers, not just at them. Today's natural language input may be a step towards that end, or it may be just a detour.

Sunday, May 11, 2008

Powerset: Public Launch Later Today

As a member of the Powerset private beta, I just received this announcement:

Greetings Powerlabbers,

Later today, Powerset is going to launch the first publicly available version of our product. Since you've been active in the Powerlabs community, we wanted to give you a special heads-up to look for our release. Your suggestions, help, feedback, bug reports, and conversation have helped us immensely in creating an innovative and useful product. We hope that you'll continue to be active in Powerlabs and make more great suggestions.

More information will be posted on Powerset's blog later today, so keep your eye out for updates. Also, consider following us on Twitter or becoming a fan of Powerset on Facebook.

If you have a blog, we'd especially appreciate it if you'd write a blog post about your experience with this first Powerset product. Since you've been on the journey with us, your insight will be helpful in showing other people all of the amazing features in this release.

Again, we want to extend special thanks to you for sticking with us. We hope you feel almost as invested in this release as we are.

Thanks!

The Powerset Team


As loyal readers know, I've posted my impressions in the past. Now that the beta will be publicly available, I'm curious to hear impressions from you all.

Saturday, May 10, 2008

Special Issues of Information Processing & Management

My colleague Max Wilson at the University of Southampton recently called my attention to a pair of special issues of Information Processing & Management. The first is on Evaluation of Interactive Information Retrieval Systems; the second is on Evaluating Exploratory Search Systems. Both are available online at ScienceDirect. The interactive IR papers can be downloaded for free; the exploratory search papers are available for purchase to folks who don't have access through their institutions.

I'm behind on my reading, but the titles look promising. Stay tuned!

Friday, May 9, 2008

A Harmonic Convergence

This week, Forrester released a report entitled "Search + BI = Unified Information Access". The authors assert the convergence of search and business intelligence, a case that Forrester has been developing for quite some time.

The executive summary:
Search and business intelligence (BI) really are two sides of the same coin. Enterprise search enables people to access unstructured content like documents, blog and wiki entries, and emails stored in repositories across their organizations. BI surfaces structured data in reports and dashboards. As both technologies mature, the boundary between them is beginning to blur. Search platforms are beginning to perform BI functions like data visualization and reporting, and BI vendors have begun to incorporate simple to use search experiences into their products. Information and knowledge management professionals should take advantage of this convergence, which will have the same effect from both sides: to give businesspeople better context and information for the decisions they make every day.
It's hard to find any fault here. In fact, the convergence of search and BI is a corollary to the fact that people (yes, businesspeople are people too) use these systems, and that the same people have no desire to distinguish between "structured" and "unstructured" content as they pursue their information needs.

That said, I do have some quibbles with how the authors expect the convergence to play out. The authors make two assertions that I have a hard time accepting at face value:
    • People will be able to execute data queries via a search box using natural language.
    Sure, but will they want to? Natural language is fraught with communication challenges, and I'm no more persuaded by natural language queries for BI than I am by natural language queries for search.
    • Visual data representations will increase understanding of linkages among concepts.
    We've all heard the cliché that a picture is worth a thousand words. I know this better than most, as I earned my PhD by producing visual representations of networks. But I worry that people overestimate the value of these visualizations. Data visualization is simply a way to represent data analytics. I see more value in making analytics interactive (e.g., supporting and guiding incremental refinement) than in emphasizing visual representations.

But I quibble. I strongly agree with most of their points, including:
    • BI interfaces will encourage discovery of additional data dimensions.
    • BI and search tools will provide proactive suggestions.
    • BI and search will continue to borrow techniques from each other.
And it doesn't hurt that the authors express a very favorable view of Endeca. I can only hope they won't change their minds after reading this post!

Thursday, May 8, 2008

This Conversation is Public

An interesting implication of blogging and other social media is that conversations once conducted privately have become public. The most common examples are conversations that take place through the comment areas for posts, rather than through private email.

My initial reaction to this phenomenon was to bemoan the loss of boundaries. But, in keeping with my recent musings about privacy, I increasingly see the virtues of public conversations. After all, a synonym for privacy, albeit with a somewhat different connotation, is secrecy. Near-antonyms include transparency and openness.

I can't promise to always serve personally as an open, transparent information access provider. But I'll do so where possible. Here at The Noisy Channel, the conversation is public.

Wednesday, May 7, 2008

Business, Technology, and Information

I was fortunate to attend the Tri-State CIO Forum these last couple of days, and I thought I'd change the pace a bit by posting some reflections about it.

In his keynote speech last night, George Colony, Chairman and CEO of Forrester Research, called on the business community to drop the name "information technology" (IT) in favor of "business technology" (BT). His reasoning, in a nutshell, was that such nomenclature would reflect the centrality of technology's role for businesses.

Following similar reasoning but reaching a different conclusion, Julia King, an Executive Editor for Computerworld and one of today's speakers, noted that IT titles are being "techno-scrubbed", and that there is a shift from managing technology to managing information.

While I can't get excited about a naming debate, I do feel there's an important point overlooked in this discussion. Even though we've achieved consensus on the importance of technology, we need a sharper focus on information. It is a cliché that we live in an information age, but expertise about information is scarce. Information scientists struggle to influence technology development, and information theory is mostly confined to areas like cryptography and compression.

We have no lack of information technology. Search engines, databases, and applications built on top of them are ubiquitous. But we are still just learning how to work with information.

Monday, May 5, 2008

Saracevic on Relevance and Interaction

There is no Nobel Prize in computer science, despite computer science having done more than any other discipline in the past fifty years to change the world. Instead, there is the Turing Award, which serves as a Nobel Prize of computing.

But the Turing Award has never been given to anyone in information retrieval. Instead, there is the Gerard Salton Award, which serves as a Turing Award of information retrieval. Its recipients represent an A-list of information retrieval researchers.

Last week, I had the opportunity to talk with Salton Award recipient Tefko Saracevic. If you are not familiar with Saracevic, I suggest you take an hour to watch his 2007 lecture on "Relevance in information science".

I won't try to capture an hour of conversation in a blog post, but here are a few highlights:
• We learn from philosophers, particularly Alfred Schütz, that we cannot reduce relevance to a single concept, but rather have to consider a system of interdependent relevancies, such as topical relevance, interpretational relevance, and motivational relevance.

• When we talk about relevance measures, such as precision and recall, we evaluate results from the perspective of a user. But information retrieval approaches necessarily take a systems perspective, making assumptions about what people will want and encoding those assumptions in models and algorithms.

• A major challenge in information retrieval is that users--particularly web search users--often formulate queries that are ineffective, particularly because they are too short. Studies have shown that reference interviews can lead to improved retrieval effectiveness (typically through longer, more informative queries). He said that automated systems could help too, but he wasn't aware of any that had achieved traction.

• A variety of factors affect interactive information retrieval, including task context, intent, and expertise. Moreover, people react to certain relevance clues more than others, and more within some populations than others.

As I expected, I walked away with more questions than answers. But I did walk away reassured that my colleagues and I at Endeca, along with others in the HCIR community, are attacking the right problem: helping users formulate better queries.

I'd like to close with an anecdote that Saracevic recounts in his 2007 lecture. Bruce Croft had just delivered an information retrieval talk, and Nick Belkin raised the objection that users need to be incorporated into the study. Croft's conversation-ending response: "Tell us what to do, and we will do it."

We're halfway there. We've built interactive information retrieval systems, and we see from deployment after deployment that they work. Not that there isn't plenty of room for improvement, but the unmet challenge, as Ellen Voorhees makes clear, is evaluation. We need to address Nick Belkin's grand challenge and establish a paradigm suitable for evaluation of interactive IR systems.

Friday, May 2, 2008

Guided Summarization

I'm still waiting for the ECIR organizers to post the slides from the Industry Day. I particularly liked Nick Craswell's presentation on A Brief Tour of "Query Space". Until his slides are up, I recommend this SIGIR '07 paper to give you an idea of his approach.

Slides are here as a PowerPoint show for anyone interested, or use the embedded SlideShare show below.

List of Findability Solutions

Dan Keldsen has posted a list of findability-related solutions at BizTechTalk. The 80 or so solutions that he lists are certainly an attempt to err on the side of recall, by including search, taxonomies, interfaces, and visualization as aspects of findability. Definitely a useful resource for anyone interested in enterprise information access.

Thursday, May 1, 2008

Privacy through Difficulty

I had lunch today with Harr Chen, a graduate student at MIT, and we were talking about the consequences of information efficiency for privacy.

A nice example is the company pages on LinkedIn. No company, to my knowledge, publishes statistics on:
• the schools their employees attended.
• the companies where their employees previously worked.
• the companies where their ex-employees work next.
If a company maintains these statistics, it surely considers them to be sensitive and confidential. Nonetheless, by aggregating information from member profiles, LinkedIn computes best guesses at these statistics and makes them public.
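
As a toy illustration of how little machinery this kind of aggregation requires (the profiles and field names below are invented, and this is certainly not LinkedIn's code), a few lines of Python turn individual member profiles into exactly the company-level statistics listed above:

    from collections import Counter

    # hypothetical member profiles: each lists a school and an employment history (oldest first)
    profiles = [
        {"school": "CMU", "employers": ["Acme Corp", "Endeca"]},
        {"school": "MIT", "employers": ["Acme Corp", "Globex"]},
        {"school": "CMU", "employers": ["Initech", "Acme Corp"]},
    ]

    company = "Acme Corp"
    current = [p for p in profiles if p["employers"][-1] == company]   # current employees
    alumni = [p for p in profiles if company in p["employers"][:-1]]   # ex-employees

    schools_of_employees = Counter(p["school"] for p in current)
    previous_employers = Counter(e for p in current for e in p["employers"][:-1])
    where_alumni_went = Counter(p["employers"][-1] for p in alumni)

    print(schools_of_employees)  # Counter({'CMU': 1})
    print(previous_employers)    # Counter({'Initech': 1})
    print(where_alumni_went)     # Counter({'Endeca': 1, 'Globex': 1})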

Arguably, information like this was never truly private, but was simply so difficult to aggregate that nobody bothered. As Harr aptly put it, they practiced "privacy through difficulty"--a privacy analog to security through obscurity.

Some people are terrified by the increasing efficiency of the information market and look for legal remedies as a last-ditch attempt to protect their privacy. I am inclined towards the other extreme (see my previous post on privacy and information theory): let's assume that information flow is efficient and confront the consequences honestly. Then we can have an informed conversation about information privacy.

    Friday, May 30, 2008

    Is Search Broken?

    Last night, I had the privilege of speaking to fellow CMU School of Computer Science alumni at Fidelity's Center for Advanced Technology in Boston. Dean Randy Bryant, Associate Director of Corporate Relations Dan Jenkins, and Director of Alumni Relations Tina Carr, organized the event, and they encouraged me to pick a provocative subject.

    Thus encouraged, I decided to ask the question: Is Search Broken?

    Slides are here as a PowerPoint show for anyone interested, or use the embedded SlideShare show below.



    Wednesday, May 28, 2008

    Another HCIR Game

    I just received an announcement from the SIG-IRList about the flickling challenge, a "game" designed around known-item image retrieval from Flickr. The user is given an image (not annotated) and the goal is to find the image again from Flickr using the system.

    I'm not sure how well it will catch on with casual gamers--but that is hardly its primary motivation. Rather, the challenge was designed to help provide a foundation for evaluating interactive information retrieval--in a cross-language setting, no less. Details available at the iCLEF 2008 site or in this paper.

    I'm thrilled to see efforts like these emerging to evaluate interactive retrieval--indeed, this feels like a solitaire version of Phetch.

    Tuesday, May 27, 2008

    The Magic Shelf

    I generally shy away from pimping Endeca's customers here at The Noisy Channel, but occasionally I have to make an exception. As some of you may remember, Borders made a deal several years ago to have Amazon operate their web site. Last year, they decided to reclaim their site. And today they are live, powered by Endeca! For more details, visit http://blog.endeca.com.

    Now back to our commercial-free programming...

    Monday, May 26, 2008

    Your Input is Relevant!

    The following is a public service announcement.

    As some of you may know, I am the primary author of the Human Computer Information Retrieval entry on Wikipedia. I created this entry last November, shortly after the HCIR '07 workshop. One of the ideas we've tossed around for HCIR '08 is to collaboratively edit the page. But why wait? With apologies to Isaac Asimov, I/you/we are Wikipedia, so let's improve the entry now!

    And, while you've got Wikipedia on the brain, please take a look at the Relevance (Information Retrieval) entry. After an unsuccessful attempt to have this entry folded into the main Information Retrieval entry, I've tried to rewrite it to conform to what I perceive as Wikipedia's standards of quality and non-partisanship. While I tried my best, I'm sure there's still room for improving it, and I suspect that some of you reading this are among the best qualified folks to do so!

    As Lawrence Lessig says, it's a read-write society. So readers, please help out a bit with the writing.

    Saturday, May 24, 2008

    Games With an HCIR Purpose?

    A couple of weeks ago, my colleague Luis Von Ahn at CMU launched Games With a Purpose,

    Here is a brief explanation from the site:

    When you play a game at Gwap, you aren't just having fun. You're helping the world become a better place. By playing our games, you're training computers to solve problems for humans all over the world.

    Von Ahn has made a career (and earned a MacArthur Fellowship) from his work on such games, most notably the ESP Game and reCAPTCHA. His games emphasize tagging tasks that are difficult for machines but easy for human beings, such as labeling images with high-level descriptors.

    I've been interested in Von Ahn's work for several years, and most particularly in a game called Phetch, a game which never quite made it out of beta but strikes me as one of the most ambitious examples of "human computation". Here is a description from the Phetch site:

    Quick! Find an image of Michael Jackson wearing a sailor hat.
    Phetch is like a treasure hunt -- you must find or help find an image from the Web.

    One of the players is the Describer and the others are Seekers. Only the Describer can see the hidden image, and has to help the Seekers find it by giving them descriptions.

    If the image is found, the Describer wins 200 points. The first to find it wins 100 points and becomes the new Describer.

    A few important details that this description leaves out:

    • The Seeker (but not the Describer) has access to search engine that has indexed the images based on results from the ESP Game.
    • A Seeker loses points (I can't recall how many) for wrong guesses.
    • The game has a time limit (hence the "Quick!").

    Now, let's unpack the game description and analyze it in terms of the Human-Computer Information Retrieval (HCIR) paradigm. First, let us simplify the game, so that there is only one Seeker. In that case, we have a cooperative information retrieval game, where the Describer is trying to describe a target document (specifically, an image) as informatively as possible, while the Seeker is trying to execute clever algorithms in his or her wetware to retrieve it. If we think in terms of a traditional information retrieval setup, that makes the Describer the user and the Seeker the information retrieval system. Sort of.

    A full analysis of this game is beyond the scope of a single blog post, but let's look at the game from the Seeker's perspective, holding our assumption that there is only one Seeker, and adding the additional assumption that the Describer's input is static and supplied before the Seeker starts trying to find the image.

    Assuming these simplifications, here is how a Seeker plays Phetch:

    • Read the description provided by the Describer and uses it to compose a search.
    • Scan the results sequentially, interrupting either to make a guess or to reformulate the search.

    The key observation is that Phetch is about interactive information retrieval. A good Seeker recognizes when it is better to try reformulating the search than to keep scanning.

    Returning to our theme of evaluation, we can envision modifying Phetch to create a system for evaluating interactive information retrieval. In fact, I persuaded my colleague Shiry Ginosar, who worked with Von Ahn on Phetch and is now a software engineer at Endeca, to elaborate such an approach at HCIR '07. There are a lot of details to work out, but I find this vision very compelling and perhaps a route to addressing Nick Belkin's grand challenge.

    Thursday, May 22, 2008

    Back from Orlando

    I'm back from Endeca Discover '08: two and a half days of presentations, superheroic attractions, and, in the best tradition of The Noisy Channel, karaoke. A bunch of us tried our best to blog the presentations at http://blog.endeca.com/.

    All in all, a fun exhausting time, but it's good to be back home. So, for those who have noticed the lack of posts in your RSS feeds, I promise I'll start making it up to you in the next few days.

    Friday, May 16, 2008

    Attending Endeca Discover '08

    I'll be attending Endeca Discover '08, Endeca's annual user conference, from Sunday, May 18th to Wednesday, May 21st, so you might see a bit of a lull in my verbiage here while I live blog at http://blog.endeca.com and hang out in sunny Orlando with Endeca customers and partners.

    If you're attending Discover, please give me a shout and come to my sessions:
    Otherwise, I'll do my best to sneak in a post or comment, and I'll be back in full force later next week.

    A Utilitarian View of IR Evaluation

    In many information retrieval papers that propose new techniques, the authors validate those techniques by demonstrating improved mean average precision over a standard test collection. The value of such results--at least to a practitioner--hinges on whether mean average precision correlates to utility for users. Not only do user studies place this correlation in doubt, but I have yet to see an empirical argument defending the utility of average precision as an evaluation measure. Please send me any references if you are aware of them!

    Of course, user studies are fraught with complications, the most practical one being their expense. I'm not suggesting that we need to replace Cranfield studies with user studies wholesale. Rather, I see the purpose of user studies as establishing the utility of measures that can then be evaluated by Cranfield studies. As with any other science, we need to work with simplified, abstract models to achieve progress, but we also need to ground those models by validating them in the real world.

    For example, consider the scenario where a collection contains no documents that match a user's need. In this case, it is ideal for the user to reach this conclusion as accurately, quickly, and confidently as possible. Holding the interface constant, are there evaluation measures that correlate to how well users perform on these three criteria? Alternatively, can we demonstrate that some interfaces lead to better user performance than others? If so, can we establish measures suitable for those interfaces?

    The "no documents" case is just one of many real-world scenarios, and I don't mean to suggest we should study it at the expense of all others. That said, I think it's a particularly valuable scenario that, as far as I can tell, has been neglected by the information retreival community. I use it to drive home the argument that practical use cases should drive our process of defining evaluation measures.

    Tuesday, May 13, 2008

    Thinking about IR Evaluation

    I just read the recent Information Processing & Management special issue on Evaluation of Interactive Information Retrieval Systems. The articles were a worthwhile read, and yet they weren't exactly what I was looking for. Let me explain.

    In fact, let's start by going back to Cranfield. The Cranfield paradigm offers us a quantitative, repeatable means to evaluate information retrieval systems. Its proponents make a strong case that it is effective and cost-effective. Its critics object that it measures the wrong thing because it neglects the user.

    But let's look a bit harder at the proponents' case. The primary measure in use today is average precision--indeed, most authors of SIGIR papers validate their proposed approaches by demonstrating increased mean average precision (MAP) over a standard test collection of queries. The dominance of average precision as a measure is no accident: it has been shown to be the best single predictor of the precision-recall graph.

    So why are folks like me complaining? There are the various user studies asserting that MAP does not predict user performance on search tasks. Those have me at hello, but the studies are controversial in the information retrieval community, and in any case not constructive.

    Instead, consider a paper by Harr Chen and David Karger (both at MIT) entitled "Less is more." Here is a snippet from the abstract:
    Traditionally, information retrieval systems aim to maximize the number of relevant documents returned to a user within some window of the top. For that goal, the probability ranking principle, which ranks documents in decreasing order of probability of relevance, is provably optimal. However, there are many scenarios in which that ranking does not optimize for the user's information need.
    Let me rephrase that: the precision-recall graph, which indicates how well a ranked retrieval algorithms does at ranking relevant documents ahead of irrelevant ones, does not necessarily characterize how well a system meets a user's information need.

    One of Chen and Karger's examples is the case where the user is only interested in retrieving one relevant document. In this case, a system does well to return a diverse set of results that hedges against different possible query interpretations or query processing strategies. The authors also discuss more general scenarios, along with heuristics to address them.

    But the main contribution of this paper, at least in my eyes, is a philosophical one. The authors consider the diversity of user needs and offer quantitative, repeatable way to evaluate information retrieval systems with respect to different needs. Granted, they do not even consider the challenge of evaluating interactive information retrieval. But they do set a good example.

    Stay tuned for more musings on this theme...

    Monday, May 12, 2008

    A Lofty Goal

    The blogosphere is all atwitter with Powerset's public launch last night. Over at Techcrunch, Michael Arrington refers to their approach as a lofty goal.

    But I'd like us to dream bigger. In the science fiction stories that inspired me to study computer and information science, the human-computer interface is not just natural language input. It's dialogue. The authors do not treat machine understanding of unambiguous requests as a wonder, but instead take it for granted as an artifact of technical progress. Indeed, the human-computer interface only becomes relevant to the plot when communication breaks down (aka "that does not compute").

    Ever since I hacked a BASIC version of ELIZA on a Commodore 64, I've felt the visceral appeal of natural language input as an interface. Conversely, the progress of speech synthesis attests to our desire to humanize the machine's output. It is as if we want to reduce the Turing Test to a look-and-feel.

    But the essence of dialogue lies beneath the surface. The conversations we have with machines are driven by our information needs, and should be optimized to that end. Even we human drop natural language among ourselves when circumstances call for more efficient communication. Consider an example as mundane as Starbucks baristas eliciting and delegating a latte order.

    In short, let's remember that we want to talk with our computers, not just at them. Today's natural language input may be a step towards that end, or it may be just a detour.

    Sunday, May 11, 2008

    Powerset: Public Launch Later Today

    As a member of the Powerset private beta, I just received this announcement:

    Greetings Powerlabbers,

    Later today, Powerset is going to launch the first publicly available version of our product. Since you've been active in the Powerlabs community, we wanted to give you a special heads-up to look for our release. Your suggestions, help, feedback, bug reports, and conversation have helped us immensely in creating an innovative and useful product. We hope that you'll continue to be active in Powerlabs and make more great suggestions.

    More information will be posted on Powerset's blog later today, so keep your eye out for updates. Also, consider following us on Twitter or becoming a fan of Powerset on Facebook.

    If you have a blog, we'd especially appreciate it if you'd write a blog post about your experience with this first Powerset product. Since you've been on the journey with us, your insight will be helpful in showing other people all of the amazing features in this release.

    Again, we want to extend special thanks to you for sticking with us. We hope you feel almost as invested in this release as we are.

    Thanks!

    The Powerset Team


    As loyal readers know, I've posted my impressions in the past. Now that the beta will be publicly available, I'm curious to hear impressions from you all.

    Saturday, May 10, 2008

    Special Issues of Information Processing & Management

    My colleague Max Wilson at the University of Southampton recently called my attention to a pair of special issues of Information Processing & Management. The first is on Evaluation of Interactive Information Retrieval Systems; the second is on Evaluating Exploratory Search Systems. Both are available online at ScienceDirect. The interactive IR papers can be downloaded for free; the exploratory search papers are available for purchase to folks who don't have access through their institutions.

    I'm behind on my reading, but the titles look promising. Stay tuned!

    Friday, May 9, 2008

    A Harmonic Convergence

    This week, Forrester released a report entitled "Search + BI = Unified Information Access". The authors assert the convergence of search and business intelligence, a case that Forrester has been developing for quite some time.

    The executive summary:
    Search and business intelligence (BI) really are two sides of the same coin. Enterprise search enables people to access unstructured content like documents, blog and wiki entries, and emails stored in repositories across their organizations. BI surfaces structured data in reports and dashboards. As both technologies mature, the boundary between them is beginning to blur. Search platforms are beginning to perform BI functions like data visualization and reporting, and BI vendors have begun to incorporate simple to use search experiences into their products. Information and knowledge management professionals should take advantage of this convergence, which will have the same effect from both sides: to give businesspeople better context and information for the decisions they make every day.
    It's hard to find any fault here. In fact, the convergence of search and BI is a corollary to the fact that people (yes, businesspeople are people too) use these systems, and that the same people have no desire to distinguish between "structured" and "unstructured" content as they pursue their information needs.

    That said, I do have some quibbles with how the authors expect the convergence to play out. The authors make two assertions that I have a hard time accepting at face value:
      • People will be able to execute data queries via a search box using natural language.
      Sure, but will they want to? Natural language is fraught with communication challenges, and I'm no more persuaded by natural language queries for BI than I am by natural language queries for search.
      • Visual data representations will increase understanding of linkages among concepts.
      We've all heard the cliché that a picture is worth a thousand words. I know this better than most, as I earned my PhD by producing visual representations of networks. But I worry that people overestimate the value of these visualizations. Data visualization is simply a way to represent data analytics. I see more value in making analytics interactive (e.g., supporting and guiding incremental refinement) than in emphasizing visual representations.

      But I quibble. I strongly agree with most of their points, including:
      • BI interfaces will encourage discovery of additional data dimensions.
      • BI and search tools will provide proactive suggestions.
      • BI and search will continue to borrow techniques from each other.
      And it doesn't hurt that the authors express a very favorable view of Endeca. I can only hope they won't change their minds after reading this post!

      Thursday, May 8, 2008

      This Conversation is Public

      An interesting implication of blogging and other social media is that conversations once conducted privately have become public. The most common examples are conversations that take place through the comment areas for posts, rather than through private email.

      My initial reaction to this phenomenon was to bemoan the loss of boundaries. But, in keeping with my recent musings about privacy, I increasingly see the virtues of public conversations. After all, a synonym for privacy, albeit with a somewhat different connotation, is secrecy. Near-antonyms include transparency and openness.

      I can't promise to always serve personally as an open, transparent information access provider. But I'll do so where possible. Here at The Noisy Channel, the conversation is public.

      Wednesday, May 7, 2008

      Business, Technology, and Information

      I was fortunate to attend the Tri-State CIO Forum these last couple of days, and I thought I'd change the pace a bit by posting some reflections about it.

      In his keynote speech last night, George Colony, Chairman and CEO of Forrester Research, called on the business community to drop the name "information technology" (IT) in favor of "business technology" (BT). His reasoning, in a nutshell, was that such nomenclature would reflect the centrality of technology's role for businesses.

      Following similar reasoning but reaching a different conclusion, Julia King, an Executive Editor for Computerworld and one of of today's speakers, noted that IT titles are being "techno-scrubbed", and that there is a shift from managing technology to managing information.

      While I can't get excited about a naming debate, I do feel there's an important point overlooked in this discussion. Even though we've achieved consensus on the importance of technology, we need a sharper focus on information. It is a cliché that we live in an information age, but expertise about information is scarce. Information scientists struggle to influence technology development, and information theory is mostly confined to areas like cryptography and compression.

      We have no lack of information technology. Search engines, databases, and applications built on top of them are ubiquitous. But we still just learning how to work with information.

      Monday, May 5, 2008

      Saracevic on Relevance and Interaction

      There is no Nobel Prize in computer science, despite computer science having done more than any other discipline in the past fifty years to change the world. Instead, there is the Turing Award, which serves as a Nobel Prize of computing.

      But the Turing Award has never been given to anyone in information retrieval. Instead, there is the Gerald Salton Award, which serves as a Turing Award of information retrieval. Its recipients represent an A-list of information retrieval researchers.

      Last week, I had the opportunity to talk with Salton Award recipient Tefko Saracevic. If you are not familiar with Saracevic, I suggest you take an hour to watch his 2007 lecture on "Relevance in information science".

      I won't try to capture an hour of conversation in a blog post, but here are a few highlights:
      • We learn from philosophers, particularly Alfred Schütz, that we cannot reduce relevance to a single concept, but rather have to consider a system of interdependent relevancies, such as topical relevance, interpretational relevance, and motivational relevance.

      • When we talk about relevance measures, such as precision and recall, we evaluate results from the perspective of a user. But information retrieval approaches necessarily take a systems perspective, making assumptions about what people will want and encoding those assumptions in models and algorithms.

      • A major challenge in the information retrieval is that users--particularly web search users--often formulate queries that are ineffective, particularly because they are too short. Studies have shown that reference interviews can lead to improved retrieval effectiveness (typically through longer, more informative queries). He said that automated systems could help too, but he wasn't aware of any that had achieved traction.

      • A variety of factors affect interactive information retrieval, including task context, intent, expertise. Moreover, people react to certain relevance clues more than others, and more within some populations than others.
      As I expected, I walked away with more questions than answers. But I did walk away reassured that my colleagues and I at Endeca , along with others in the HCIR community, are attacking the right problem: helping users formulate better queries.

      I'd like to close with an anecdote that Saracevic recounts in his 2007 lecture. Bruce Croft had just delivered an information retrieval talk, and Nick Belkin raised the objection that users need to be incorporated into the study. Croft's conversation-ending response: "Tell us what to do, and we will do it."

      We're halfway there. We've built interactive information retrieval systems, and we see from deployment after deployment that they work. Not that there isn't plenty of room for improvement, but the unmet challenge, as Ellen Voorhees makes clear, is evaluation. We need to address Nick Belkin's grand challenge and establish a paradigm suitable for evaluation of interactive IR systems.

      Friday, May 2, 2008

      Guided Summarization

      I'm still waiting for the ECIR organizers to post the slides from the Industry Day. I particularly liked Nick Craswell's presentation on A Brief Tour of "Query Space". Until his slides are up, I recommend this SIGIR '07 paper to give you an idea of his approach.

      Slides are here as a PowerPoint show for anyone interested, or use the embedded SlideShare show below.

      List of Findability Solutions

      Dan Keldsen has posted a list of findability-related solutions at BizTechTalk. The 80 or so solutions that he lists are certainly an attempt to err on the side of recall, by including search, taxonomies, interfaces, and visualization as aspects of findability. Definitely a useful resource for anyone interested in enterprise information access.

      Thursday, May 1, 2008

      Privacy through Difficulty

      I had lunch today with Harr Chen, a graduate student at MIT, and we were talking about the consequences of information efficiency for privacy.

      A nice example is the company pages on LinkedIn. No company, to my knowledge, publishes statistics on:
      • the schools their employees attended.
      • the companies where their employees previously worked.
      • the companies where their ex-employees work next.
      If a company maintains these statistics, it surely considers them to be sensitive and confidential. Nonetheless, by aggregating information from member profiles, LinkedIn computes best guesses at these statistics and makes them public.

      Arguably, information like this was never truly private, but was simply so difficult to aggregate that nobody bothered. As Harr aptly put it, they practiced "privacy through difficulty"--a privacy analog to security through obscurity.

      Some people are terrified by the increasing efficiency of the information market and look for legal remedies as a last ditch attempt to protect their privacy. I am inclined towards the other extreme (see my previous post on privacy and information theory): let's assume that information flow is efficient and confront the consequences honestly. Then we can have an informed conversation about information privacy.

      Friday, May 30, 2008

      Is Search Broken?

      Last night, I had the privilege of speaking to fellow CMU School of Computer Science alumni at Fidelity's Center for Advanced Technology in Boston. Dean Randy Bryant, Associate Director of Corporate Relations Dan Jenkins, and Director of Alumni Relations Tina Carr, organized the event, and they encouraged me to pick a provocative subject.

      Thus encouraged, I decided to ask the question: Is Search Broken?

      Slides are here as a PowerPoint show for anyone interested, or use the embedded SlideShare show below.



      Wednesday, May 28, 2008

      Another HCIR Game

      I just received an announcement from the SIG-IRList about the flickling challenge, a "game" designed around known-item image retrieval from Flickr. The user is given an image (not annotated) and the goal is to find the image again from Flickr using the system.

      I'm not sure how well it will catch on with casual gamers--but that is hardly its primary motivation. Rather, the challenge was designed to help provide a foundation for evaluating interactive information retrieval--in a cross-language setting, no less. Details available at the iCLEF 2008 site or in this paper.

      I'm thrilled to see efforts like these emerging to evaluate interactive retrieval--indeed, this feels like a solitaire version of Phetch.

      Tuesday, May 27, 2008

      The Magic Shelf

      I generally shy away from pimping Endeca's customers here at The Noisy Channel, but occasionally I have to make an exception. As some of you may remember, Borders made a deal several years ago to have Amazon operate their web site. Last year, they decided to reclaim their site. And today they are live, powered by Endeca! For more details, visit http://blog.endeca.com.

      Now back to our commercial-free programming...

      Monday, May 26, 2008

      Your Input is Relevant!

      The following is a public service announcement.

      As some of you may know, I am the primary author of the Human Computer Information Retrieval entry on Wikipedia. I created this entry last November, shortly after the HCIR '07 workshop. One of the ideas we've tossed around for HCIR '08 is to collaboratively edit the page. But why wait? With apologies to Isaac Asimov, I/you/we are Wikipedia, so let's improve the entry now!

      And, while you've got Wikipedia on the brain, please take a look at the Relevance (Information Retrieval) entry. After an unsuccessful attempt to have this entry folded into the main Information Retrieval entry, I've tried to rewrite it to conform to what I perceive as Wikipedia's standards of quality and non-partisanship. While I tried my best, I'm sure there's still room for improving it, and I suspect that some of you reading this are among the best qualified folks to do so!

      As Lawrence Lessig says, it's a read-write society. So readers, please help out a bit with the writing.

      Saturday, May 24, 2008

      Games With an HCIR Purpose?

      A couple of weeks ago, my colleague Luis Von Ahn at CMU launched Games With a Purpose,

      Here is a brief explanation from the site:

      When you play a game at Gwap, you aren't just having fun. You're helping the world become a better place. By playing our games, you're training computers to solve problems for humans all over the world.

      Von Ahn has made a career (and earned a MacArthur Fellowship) from his work on such games, most notably the ESP Game and reCAPTCHA. His games emphasize tagging tasks that are difficult for machines but easy for human beings, such as labeling images with high-level descriptors.

      I've been interested in Von Ahn's work for several years, and most particularly in a game called Phetch, a game which never quite made it out of beta but strikes me as one of the most ambitious examples of "human computation". Here is a description from the Phetch site:

      Quick! Find an image of Michael Jackson wearing a sailor hat.
      Phetch is like a treasure hunt -- you must find or help find an image from the Web.

      One of the players is the Describer and the others are Seekers. Only the Describer can see the hidden image, and has to help the Seekers find it by giving them descriptions.

      If the image is found, the Describer wins 200 points. The first to find it wins 100 points and becomes the new Describer.

      A few important details that this description leaves out:

• The Seeker (but not the Describer) has access to a search engine that has indexed the images based on results from the ESP Game.
      • A Seeker loses points (I can't recall how many) for wrong guesses.
      • The game has a time limit (hence the "Quick!").

      Now, let's unpack the game description and analyze it in terms of the Human-Computer Information Retrieval (HCIR) paradigm. First, let us simplify the game, so that there is only one Seeker. In that case, we have a cooperative information retrieval game, where the Describer is trying to describe a target document (specifically, an image) as informatively as possible, while the Seeker is trying to execute clever algorithms in his or her wetware to retrieve it. If we think in terms of a traditional information retrieval setup, that makes the Describer the user and the Seeker the information retrieval system. Sort of.

A full analysis of this game is beyond the scope of a single blog post, but let's look at the game from the Seeker's perspective, holding our assumption that there is only one Seeker and adding the further assumption that the Describer's input is static and supplied before the Seeker starts trying to find the image.

      Assuming these simplifications, here is how a Seeker plays Phetch:

• Reads the description provided by the Describer and uses it to compose a search.
• Scans the results sequentially, interrupting either to make a guess or to reformulate the search.

      The key observation is that Phetch is about interactive information retrieval. A good Seeker recognizes when it is better to try reformulating the search than to keep scanning.
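To make that loop concrete, here is a minimal Python sketch of the simplified, single-Seeker game described above. Everything in it is a made-up stand-in (the toy image index, the term-overlap scorer, the drop-a-term reformulation), not anything from the actual Phetch implementation, which indexes images using labels from the ESP Game.

# A hypothetical sketch of the simplified one-Seeker game loop described above.
def toy_search(index, query_terms):
    """Rank images by term overlap with the query (a stand-in scorer)."""
    scores = {img: len(set(query_terms) & set(labels)) for img, labels in index.items()}
    return sorted(scores, key=scores.get, reverse=True)

def play_seeker(index, description_terms, target, scan_budget=3, max_queries=5):
    """One Seeker's loop: query, scan a few results, then guess or reformulate."""
    query = list(description_terms)                  # initial query from the Describer's text
    for _ in range(max_queries):
        for img in toy_search(index, query)[:scan_budget]:  # scan sequentially
            if img == target:                        # in the real game, a wrong guess costs points
                return img
        query = query[:-1] or list(description_terms)        # naive reformulation: drop a term
    return None                                      # attempts ran out

# Toy example
index = {"img1": ["man", "sailor", "hat"], "img2": ["dog", "beach"], "img3": ["hat", "parade"]}
print(play_seeker(index, ["michael", "sailor", "hat"], target="img1"))  # img1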

      Returning to our theme of evaluation, we can envision modifying Phetch to create a system for evaluating interactive information retrieval. In fact, I persuaded my colleague Shiry Ginosar, who worked with Von Ahn on Phetch and is now a software engineer at Endeca, to elaborate such an approach at HCIR '07. There are a lot of details to work out, but I find this vision very compelling and perhaps a route to addressing Nick Belkin's grand challenge.

      Thursday, May 22, 2008

      Back from Orlando

      I'm back from Endeca Discover '08: two and a half days of presentations, superheroic attractions, and, in the best tradition of The Noisy Channel, karaoke. A bunch of us tried our best to blog the presentations at http://blog.endeca.com/.

All in all, a fun, exhausting time, but it's good to be back home. So, for those who have noticed the lack of posts in their RSS feeds, I promise I'll start making it up to you in the next few days.

      Friday, May 16, 2008

      Attending Endeca Discover '08

      I'll be attending Endeca Discover '08, Endeca's annual user conference, from Sunday, May 18th to Wednesday, May 21st, so you might see a bit of a lull in my verbiage here while I live blog at http://blog.endeca.com and hang out in sunny Orlando with Endeca customers and partners.

If you're attending Discover, please give me a shout and come to my sessions.
      Otherwise, I'll do my best to sneak in a post or comment, and I'll be back in full force later next week.

      A Utilitarian View of IR Evaluation

      In many information retrieval papers that propose new techniques, the authors validate those techniques by demonstrating improved mean average precision over a standard test collection. The value of such results--at least to a practitioner--hinges on whether mean average precision correlates to utility for users. Not only do user studies place this correlation in doubt, but I have yet to see an empirical argument defending the utility of average precision as an evaluation measure. Please send me any references if you are aware of them!
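For readers who haven't worked with the measure, here is a small Python sketch (my own illustration with made-up judgments, not something taken from any particular paper) of average precision for a single query and mean average precision over a set of queries, assuming binary relevance judgments.

def average_precision(ranked_ids, relevant_ids):
    """AP: mean of precision@k taken at the rank of each relevant document."""
    relevant_ids = set(relevant_ids)
    hits, precisions = 0, []
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    """MAP over (ranking, relevant set) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy example: two queries
runs = [(["d3", "d1", "d7"], {"d1"}), (["d2", "d5", "d9"], {"d5", "d9"})]
print(mean_average_precision(runs))  # 0.5 * (1/2) + 0.5 * ((1/2 + 2/3) / 2) = 0.541...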

Of course, user studies are fraught with complications, the most practical one being their expense. I'm not suggesting that we need to replace Cranfield studies with user studies wholesale. Rather, I see the purpose of user studies as establishing the utility of measures that can then be applied in Cranfield studies. As with any other science, we need to work with simplified, abstract models to achieve progress, but we also need to ground those models by validating them in the real world.

      For example, consider the scenario where a collection contains no documents that match a user's need. In this case, it is ideal for the user to reach this conclusion as accurately, quickly, and confidently as possible. Holding the interface constant, are there evaluation measures that correlate to how well users perform on these three criteria? Alternatively, can we demonstrate that some interfaces lead to better user performance than others? If so, can we establish measures suitable for those interfaces?
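To be clear about what I'm imagining, here is one purely hypothetical way such a measure might be operationalized. The session fields, the time budget, and the equal weighting are all assumptions I'm inventing for the sake of illustration, not a proposal I'm prepared to defend.

# Purely illustrative: one way a "no relevant documents" measure *might* look.
from dataclasses import dataclass

@dataclass
class Session:
    said_no_results: bool   # did the user conclude "nothing relevant here"?
    seconds_elapsed: float  # time taken to reach that conclusion
    confidence: float       # self-reported confidence in [0, 1]

def no_results_score(session, time_budget=120.0):
    """Reward a correct 'no relevant documents' verdict reached quickly and confidently."""
    if not session.said_no_results:
        return 0.0  # wrong verdict on a collection that contains no relevant documents
    speed = max(0.0, 1.0 - session.seconds_elapsed / time_budget)
    return (1.0 + speed + session.confidence) / 3.0  # equal weights: correctness, speed, confidence

print(no_results_score(Session(True, 45.0, 0.8)))   # ~0.81
print(no_results_score(Session(False, 30.0, 0.9)))  # 0.0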

      The "no documents" case is just one of many real-world scenarios, and I don't mean to suggest we should study it at the expense of all others. That said, I think it's a particularly valuable scenario that, as far as I can tell, has been neglected by the information retreival community. I use it to drive home the argument that practical use cases should drive our process of defining evaluation measures.

      Tuesday, May 13, 2008

      Thinking about IR Evaluation

      I just read the recent Information Processing & Management special issue on Evaluation of Interactive Information Retrieval Systems. The articles were a worthwhile read, and yet they weren't exactly what I was looking for. Let me explain.

      In fact, let's start by going back to Cranfield. The Cranfield paradigm offers us a quantitative, repeatable means to evaluate information retrieval systems. Its proponents make a strong case that it is effective and cost-effective. Its critics object that it measures the wrong thing because it neglects the user.

      But let's look a bit harder at the proponents' case. The primary measure in use today is average precision--indeed, most authors of SIGIR papers validate their proposed approaches by demonstrating increased mean average precision (MAP) over a standard test collection of queries. The dominance of average precision as a measure is no accident: it has been shown to be the best single predictor of the precision-recall graph.
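As a reminder of what average precision is summarizing, here is a small sketch (again my own toy example, not taken from any of the papers discussed) that traces out the precision-recall points for a ranked list under binary relevance.

def precision_recall_points(ranked_ids, relevant_ids):
    """Precision and recall after each rank position."""
    relevant_ids = set(relevant_ids)
    points, hits = [], 0
    for k, doc_id in enumerate(ranked_ids, start=1):
        hits += doc_id in relevant_ids
        points.append((hits / k, hits / len(relevant_ids)))  # (precision@k, recall@k)
    return points

# Toy example: made-up ranking and judgments
ranking = ["d4", "d1", "d8", "d2", "d6"]
relevant = {"d1", "d2", "d6"}
for precision, recall in precision_recall_points(ranking, relevant):
    print(f"precision={precision:.2f}  recall={recall:.2f}")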

So why are folks like me complaining? There are various user studies asserting that MAP does not predict user performance on search tasks. Those have me at hello, but the studies are controversial in the information retrieval community and, in any case, they are not constructive.

      Instead, consider a paper by Harr Chen and David Karger (both at MIT) entitled "Less is more." Here is a snippet from the abstract:
      Traditionally, information retrieval systems aim to maximize the number of relevant documents returned to a user within some window of the top. For that goal, the probability ranking principle, which ranks documents in decreasing order of probability of relevance, is provably optimal. However, there are many scenarios in which that ranking does not optimize for the user's information need.
Let me rephrase that: the precision-recall graph, which indicates how well a ranked retrieval algorithm does at ranking relevant documents ahead of irrelevant ones, does not necessarily characterize how well a system meets a user's information need.

      One of Chen and Karger's examples is the case where the user is only interested in retrieving one relevant document. In this case, a system does well to return a diverse set of results that hedges against different possible query interpretations or query processing strategies. The authors also discuss more general scenarios, along with heuristics to address them.
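If memory serves, Chen and Karger evaluate with measures along the lines of k-call at rank n. Either way, here is a toy sketch (mine, not theirs, with a made-up ambiguous query) of why a diverse ranking wins when the user needs only one relevant document.

def one_call_at_n(ranked_ids, relevant_ids, n=10):
    """1 if at least one of the top n documents is relevant, else 0."""
    relevant = set(relevant_ids)
    return int(any(doc_id in relevant for doc_id in ranked_ids[:n]))

def expected_one_call(ranked_ids, interpretations, n=10):
    """Average 1-call@n over equally likely query interpretations,
    each with its own relevant set. A diverse ranking hedges across them."""
    return sum(one_call_at_n(ranked_ids, rel, n) for rel in interpretations) / len(interpretations)

# Two interpretations of an ambiguous query ("jaguar": car vs. animal)
interpretations = [{"car1", "car2"}, {"cat1", "cat2"}]
focused = ["car1", "car2", "car3"]  # all eggs in one basket
diverse = ["car1", "cat1", "car2"]  # hedges across interpretations
print(expected_one_call(focused, interpretations, n=3))  # 0.5
print(expected_one_call(diverse, interpretations, n=3))  # 1.0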

But the main contribution of this paper, at least in my eyes, is a philosophical one. The authors consider the diversity of user needs and offer a quantitative, repeatable way to evaluate information retrieval systems with respect to different needs. Granted, they do not even consider the challenge of evaluating interactive information retrieval. But they do set a good example.

      Stay tuned for more musings on this theme...

      Monday, May 12, 2008

      A Lofty Goal

The blogosphere is all atwitter with Powerset's public launch last night. Over at TechCrunch, Michael Arrington refers to their approach as a lofty goal.

      But I'd like us to dream bigger. In the science fiction stories that inspired me to study computer and information science, the human-computer interface is not just natural language input. It's dialogue. The authors do not treat machine understanding of unambiguous requests as a wonder, but instead take it for granted as an artifact of technical progress. Indeed, the human-computer interface only becomes relevant to the plot when communication breaks down (aka "that does not compute").

      Ever since I hacked a BASIC version of ELIZA on a Commodore 64, I've felt the visceral appeal of natural language input as an interface. Conversely, the progress of speech synthesis attests to our desire to humanize the machine's output. It is as if we want to reduce the Turing Test to a look-and-feel.

But the essence of dialogue lies beneath the surface. The conversations we have with machines are driven by our information needs, and should be optimized to that end. Even we humans drop natural language among ourselves when circumstances call for more efficient communication. Consider an example as mundane as Starbucks baristas eliciting and delegating a latte order.

      In short, let's remember that we want to talk with our computers, not just at them. Today's natural language input may be a step towards that end, or it may be just a detour.

      Sunday, May 11, 2008

      Powerset: Public Launch Later Today

      As a member of the Powerset private beta, I just received this announcement:

      Greetings Powerlabbers,

      Later today, Powerset is going to launch the first publicly available version of our product. Since you've been active in the Powerlabs community, we wanted to give you a special heads-up to look for our release. Your suggestions, help, feedback, bug reports, and conversation have helped us immensely in creating an innovative and useful product. We hope that you'll continue to be active in Powerlabs and make more great suggestions.

      More information will be posted on Powerset's blog later today, so keep your eye out for updates. Also, consider following us on Twitter or becoming a fan of Powerset on Facebook.

      If you have a blog, we'd especially appreciate it if you'd write a blog post about your experience with this first Powerset product. Since you've been on the journey with us, your insight will be helpful in showing other people all of the amazing features in this release.

      Again, we want to extend special thanks to you for sticking with us. We hope you feel almost as invested in this release as we are.

      Thanks!

      The Powerset Team


      As loyal readers know, I've posted my impressions in the past. Now that the beta will be publicly available, I'm curious to hear impressions from you all.

      Saturday, May 10, 2008

      Special Issues of Information Processing & Management

      My colleague Max Wilson at the University of Southampton recently called my attention to a pair of special issues of Information Processing & Management. The first is on Evaluation of Interactive Information Retrieval Systems; the second is on Evaluating Exploratory Search Systems. Both are available online at ScienceDirect. The interactive IR papers can be downloaded for free; the exploratory search papers are available for purchase to folks who don't have access through their institutions.

      I'm behind on my reading, but the titles look promising. Stay tuned!

      Friday, May 9, 2008

      A Harmonic Convergence

      This week, Forrester released a report entitled "Search + BI = Unified Information Access". The authors assert the convergence of search and business intelligence, a case that Forrester has been developing for quite some time.

      The executive summary:
      Search and business intelligence (BI) really are two sides of the same coin. Enterprise search enables people to access unstructured content like documents, blog and wiki entries, and emails stored in repositories across their organizations. BI surfaces structured data in reports and dashboards. As both technologies mature, the boundary between them is beginning to blur. Search platforms are beginning to perform BI functions like data visualization and reporting, and BI vendors have begun to incorporate simple to use search experiences into their products. Information and knowledge management professionals should take advantage of this convergence, which will have the same effect from both sides: to give businesspeople better context and information for the decisions they make every day.
      It's hard to find any fault here. In fact, the convergence of search and BI is a corollary to the fact that people (yes, businesspeople are people too) use these systems, and that the same people have no desire to distinguish between "structured" and "unstructured" content as they pursue their information needs.

      That said, I do have some quibbles with how the authors expect the convergence to play out. The authors make two assertions that I have a hard time accepting at face value:
        • People will be able to execute data queries via a search box using natural language.
        Sure, but will they want to? Natural language is fraught with communication challenges, and I'm no more persuaded by natural language queries for BI than I am by natural language queries for search.
        • Visual data representations will increase understanding of linkages among concepts.
        We've all heard the cliché that a picture is worth a thousand words. I know this better than most, as I earned my PhD by producing visual representations of networks. But I worry that people overestimate the value of these visualizations. Data visualization is simply a way to represent data analytics. I see more value in making analytics interactive (e.g., supporting and guiding incremental refinement) than in emphasizing visual representations.

        But I quibble. I strongly agree with most of their points, including:
        • BI interfaces will encourage discovery of additional data dimensions.
        • BI and search tools will provide proactive suggestions.
        • BI and search will continue to borrow techniques from each other.
        And it doesn't hurt that the authors express a very favorable view of Endeca. I can only hope they won't change their minds after reading this post!

        Thursday, May 8, 2008

        This Conversation is Public

        An interesting implication of blogging and other social media is that conversations once conducted privately have become public. The most common examples are conversations that take place through the comment areas for posts, rather than through private email.

        My initial reaction to this phenomenon was to bemoan the loss of boundaries. But, in keeping with my recent musings about privacy, I increasingly see the virtues of public conversations. After all, a synonym for privacy, albeit with a somewhat different connotation, is secrecy. Near-antonyms include transparency and openness.

        I can't promise to always serve personally as an open, transparent information access provider. But I'll do so where possible. Here at The Noisy Channel, the conversation is public.

        Wednesday, May 7, 2008

        Business, Technology, and Information

        I was fortunate to attend the Tri-State CIO Forum these last couple of days, and I thought I'd change the pace a bit by posting some reflections about it.

        In his keynote speech last night, George Colony, Chairman and CEO of Forrester Research, called on the business community to drop the name "information technology" (IT) in favor of "business technology" (BT). His reasoning, in a nutshell, was that such nomenclature would reflect the centrality of technology's role for businesses.

Following similar reasoning but reaching a different conclusion, Julia King, an Executive Editor for Computerworld and one of today's speakers, noted that IT titles are being "techno-scrubbed" and that there is a shift from managing technology to managing information.

        While I can't get excited about a naming debate, I do feel there's an important point overlooked in this discussion. Even though we've achieved consensus on the importance of technology, we need a sharper focus on information. It is a cliché that we live in an information age, but expertise about information is scarce. Information scientists struggle to influence technology development, and information theory is mostly confined to areas like cryptography and compression.

We have no lack of information technology. Search engines, databases, and applications built on top of them are ubiquitous. But we are still just learning how to work with information.

        Monday, May 5, 2008

        Saracevic on Relevance and Interaction

        There is no Nobel Prize in computer science, despite computer science having done more than any other discipline in the past fifty years to change the world. Instead, there is the Turing Award, which serves as a Nobel Prize of computing.

        But the Turing Award has never been given to anyone in information retrieval. Instead, there is the Gerald Salton Award, which serves as a Turing Award of information retrieval. Its recipients represent an A-list of information retrieval researchers.

        Last week, I had the opportunity to talk with Salton Award recipient Tefko Saracevic. If you are not familiar with Saracevic, I suggest you take an hour to watch his 2007 lecture on "Relevance in information science".

        I won't try to capture an hour of conversation in a blog post, but here are a few highlights:
        • We learn from philosophers, particularly Alfred Schütz, that we cannot reduce relevance to a single concept, but rather have to consider a system of interdependent relevancies, such as topical relevance, interpretational relevance, and motivational relevance.

        • When we talk about relevance measures, such as precision and recall, we evaluate results from the perspective of a user. But information retrieval approaches necessarily take a systems perspective, making assumptions about what people will want and encoding those assumptions in models and algorithms.

• A major challenge in information retrieval is that users--particularly web search users--often formulate ineffective queries, largely because they are too short. Studies have shown that reference interviews can lead to improved retrieval effectiveness (typically through longer, more informative queries). He said that automated systems could help too, but he wasn't aware of any that had achieved traction; a toy sketch of one such approach follows this list.

• A variety of factors affect interactive information retrieval, including task context, intent, and expertise. Moreover, people react to certain relevance clues more than others, and more within some populations than others.
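As promised above, here is a toy sketch of one automated technique for lengthening a short query: pseudo-relevance feedback, which expands the query with frequent terms from the top-ranked results. The index, the scorer, and the parameters are stand-ins I made up for illustration; this is not a system Saracevic referred to.

from collections import Counter

def toy_search(index, query_terms):
    """Rank documents by term overlap with the query (a stand-in scorer)."""
    scores = {doc_id: len(set(query_terms) & set(terms)) for doc_id, terms in index.items()}
    return sorted(scores, key=scores.get, reverse=True)

def expand_query(index, query_terms, feedback_docs=2, new_terms=2):
    """Add the most common terms from the top-ranked documents to the query."""
    top = toy_search(index, query_terms)[:feedback_docs]
    counts = Counter(term for doc_id in top for term in index[doc_id] if term not in query_terms)
    return list(query_terms) + [term for term, _ in counts.most_common(new_terms)]

index = {
    "d1": ["jaguar", "car", "engine", "speed"],
    "d2": ["jaguar", "car", "dealer"],
    "d3": ["jaguar", "cat", "jungle"],
}
print(expand_query(index, ["jaguar"]))  # e.g., ['jaguar', 'car', 'engine']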
As I expected, I walked away with more questions than answers. But I did walk away reassured that my colleagues and I at Endeca, along with others in the HCIR community, are attacking the right problem: helping users formulate better queries.

        I'd like to close with an anecdote that Saracevic recounts in his 2007 lecture. Bruce Croft had just delivered an information retrieval talk, and Nick Belkin raised the objection that users need to be incorporated into the study. Croft's conversation-ending response: "Tell us what to do, and we will do it."

        We're halfway there. We've built interactive information retrieval systems, and we see from deployment after deployment that they work. Not that there isn't plenty of room for improvement, but the unmet challenge, as Ellen Voorhees makes clear, is evaluation. We need to address Nick Belkin's grand challenge and establish a paradigm suitable for evaluation of interactive IR systems.

        Friday, May 2, 2008

        Guided Summarization

        I'm still waiting for the ECIR organizers to post the slides from the Industry Day. I particularly liked Nick Craswell's presentation on A Brief Tour of "Query Space". Until his slides are up, I recommend this SIGIR '07 paper to give you an idea of his approach.

        Slides are here as a PowerPoint show for anyone interested, or use the embedded SlideShare show below.

        List of Findability Solutions

        Dan Keldsen has posted a list of findability-related solutions at BizTechTalk. The 80 or so solutions that he lists are certainly an attempt to err on the side of recall, by including search, taxonomies, interfaces, and visualization as aspects of findability. Definitely a useful resource for anyone interested in enterprise information access.

        Thursday, May 1, 2008

        Privacy through Difficulty

        I had lunch today with Harr Chen, a graduate student at MIT, and we were talking about the consequences of information efficiency for privacy.

A nice example is LinkedIn's company pages. No company, to my knowledge, publishes statistics on:
        • the schools their employees attended.
        • the companies where their employees previously worked.
        • the companies where their ex-employees work next.
        If a company maintains these statistics, it surely considers them to be sensitive and confidential. Nonetheless, by aggregating information from member profiles, LinkedIn computes best guesses at these statistics and makes them public.
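To make the mechanics concrete, here is a toy sketch of that kind of aggregation. The profiles and fields are entirely made up, and this is my own illustration rather than anything LinkedIn has described.

from collections import Counter

# Made-up public profiles (the kind of data members choose to share)
profiles = [
    {"name": "A", "company": "Acme", "school": "CMU", "previous": "Initech"},
    {"name": "B", "company": "Acme", "school": "MIT", "previous": "Globex"},
    {"name": "C", "company": "Acme", "school": "CMU", "previous": "Initech"},
    {"name": "D", "company": "Hooli", "school": "MIT", "previous": "Acme"},
]

def company_stats(profiles, company):
    """Estimate school and prior-employer distributions for a company's employees."""
    employees = [p for p in profiles if p["company"] == company]
    return {
        "schools": Counter(p["school"] for p in employees),
        "previous_employers": Counter(p["previous"] for p in employees),
    }

print(company_stats(profiles, "Acme"))
# {'schools': Counter({'CMU': 2, 'MIT': 1}), 'previous_employers': Counter({'Initech': 2, 'Globex': 1})}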

        Arguably, information like this was never truly private, but was simply so difficult to aggregate that nobody bothered. As Harr aptly put it, they practiced "privacy through difficulty"--a privacy analog to security through obscurity.

Some people are terrified by the increasing efficiency of the information market and look for legal remedies as a last-ditch attempt to protect their privacy. I am inclined towards the other extreme (see my previous post on privacy and information theory): let's assume that information flow is efficient and confront the consequences honestly. Then we can have an informed conversation about information privacy.