Tuesday, July 8, 2008

Librarian 2.0

Many of the words that mark milestones in the history of technology, such as calculator and word processor, originally corresponded to people. Calculating had at least two lives as a technology breakthrough--first as a process, and then as an automatic means for executing that process. Thanks to inventions like calculators and computers, human beings have moved up the value chain to become scientists and engineers who take low-level details for granted.

Similarly, the advances in information science and retrieval have dramatically changed the role of a reference librarian.

Hopefully some of you are old enough to remember card catalogs. They were certainly functional if you knew the exact title or author you were looking for, assuming the title wasn't too generic or the author too prolific. Where card catalogs fell short was in supporting exploratory search. In many cases, your best bet was to quite literally explore the stacks and hope that locality within the Dewey Decimal system sufficed to support your information seeking needs. Alternatively, you could follow citation paths--the dead-tree precursor of surfing a hypertext collection.

For exploratory tasks, library patrons would turn to reference librarians, who would clarify the patrons' needs through a process called the reference interview. According to Wikipedia:

A reference interview is composed of two segments:

1. An initial segment in which the librarian encourages the user to fully discuss the request.
2. A final segment in which the librarian asks questions to relate the request to the materials available in the library.

A reference interview is structured (ideally) according to the following series of steps. First the library user states a question or describes a problem. The librarian then clarifies the user's information need, sometimes leading him or her back from a request for a specific resource (which may not be the best one for the problem at hand) to the actual information need as it manifests in the library user's life. Following that, the librarian suggests information resources that address the user's information need, explaining the nature and scope of information they contain and soliciting feedback. The reference interview closes when the librarian has provided the appropriate information or a referral to an outside resource where it can be found, and the user confirms that he or she has received the information needed.

Fast forward to the present day. Thanks to modern search engines, title and author search are no longer tedious processes. Moreover, search engines are somewhat forgiving of users, offering spelling correction and inexact query matching. Libraries are still catching up with advances in technology, but the evolution is clearly under way.

However, search engines have not obviated the need for a reference interview. Excepting the simple cases of known item search, the typical information seeker needs help translating an information need into one or more search queries. And that information need may change as the seeker learns from the process.

But it should come as no surprise that information seeking support systems need to be more than search engines. The ideal information seeking support system emulates a reference librarian, stepping users through a structured process of clarification. Indeed, this is exactly what my colleagues and I at Endeca are trying to do in our work with libraries and more broadly in pursuing a vision of human-computer information retrieval.

What then becomes of librarians? Much as calculators and computers did not obviate the need for mathematicians, I don't see technology obviating the need for information scientists. Library schools have already evolved into information schools, and I have no doubt that their graduates will help establish the next generation of information seeking technology that makes today's search engines seem as quaint as card catalogs.

5 comments:

stefanoq said...

Nice post, Daniel. It's wise to look at technology in its historical context.

I believe there are two important ways great librarians can help. Unlike automated systems, human assistants can help the searcher refine his or her question. Arriving at the right question is not a simple matter of getting the right query syntax: it's about deepening your understanding of what it is you really want to know. We can all get better at that!

In addition, librarians bring a lot of practical context to prioritizing results. Like so many of the other mediators who are being "disintermediated" by the Internet (editors, publishers, teachers, journalists, producers, and so forth), librarians used to help us understand which sources we could trust and which were most likely to help us answer our specific question. It is much more than a calculation of relevance. It depends as much on a deep understanding of our questions and the authors and institutions which produced the text as it depends on the text itself.

If I am right about this, and I suspect that you agree, Daniel, both people and technology will be working together on this problem for a long time. And librarians can expect to play a major role in the "re-intermediation" process.

David Fauth said...

One of the problems with the initial search is that the user may not know what they really want. They may not know what is available for them to search on. On the majority of sites, keyword or phrase searches are all that is available. My wife is going through that now as she searches for academic documents for her master's-level classes.

The context around the source of data is so valuable. Understanding the sources of the data, their validity, how current the information is, and whether that data will answer the question is as important as, if not more important than, simple relevance ranking.

These problems will keep us busy for quite some time.

Daniel Tunkelang said...

Thanks for the comments.

Stefano, I realized after posting this that I might have offended librarians by suggesting that technology will make them superfluous. I do think that technology should automate much of what consumes librarians' time today. For example, I was reading about the Hutchins heuristic, and it's clear that librarians should not be executing algorithms that could be implemented in software. That said, it will be a while until our expert systems are competitive with human experts. And here I speak of experts in information seeking--figuring out better ways to leverage collective content expertise is a topic in and of itself.

David, I feel your wife's pain. I look at libraries like the Triangle Research Libraries Network as a starting point for how online catalogs should work. My colleagues and I at Endeca are working on even better support for exploratory search. As I've blogged here before, I think exploratory search is where it's at.

David Fauth said...

Daniel,

Librarians have a standard taxonomy (Dewey Decimal) to help them get started in information retrieval. If all of the books were thrown into a huge pile, it would be a pain for the librarian to find the best book for the user. Unfortunately, a lot of the web has only a limited taxonomy or reference to help someone do an exploratory search.

I haven't yet read all of your exploratory search writings but agree with that approach. I'm interested in the problems you are approaching and how taxonomy (static or possibly dynamic) would help drive faceted search and exploratory search.

Daniel Tunkelang said...

David,

Taxonomies are nice but far too rigid. The colon classification system proposed by Ranganathan in 1933 was the first serious attempt to propose faceted classification as an alternative, but the technology for faceted search took a while to develop (check out some early technology in the space).

Fast forward to the present day, and you have systems like WorldCat or the Endeca-powered system at the Triangle Research Libraries Network. These systems leverage facets to facilitate exploratory search. There are still taxonomies in there, but there is less emphasis on achieving one static hierarchy to rule them all, and instead a focus on leveraging context dynamically to propose useful query refinement options.
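
To make "leveraging context dynamically" a bit more concrete, here is a minimal Python sketch of the general idea behind faceted refinement. It is only an illustration, not how WorldCat or any Endeca-powered system is actually implemented: the toy catalog, the facet names, and the functions `select` and `refinements` are all invented for this example. The point it tries to show is that refinement options are computed from the current result set rather than read off a fixed hierarchy.

```python
from collections import Counter

# Hypothetical toy catalog: titles and facet values are invented for illustration.
CATALOG = [
    {"title": "Search book", "subject": "search", "format": "book", "decade": "2000s"},
    {"title": "Classification book", "subject": "classification", "format": "book", "decade": "1930s"},
    {"title": "Search video", "subject": "search", "format": "video", "decade": "2000s"},
    {"title": "Catalog handbook", "subject": "classification", "format": "book", "decade": "1970s"},
]
FACETS = ("subject", "format", "decade")  # assumed facet fields


def select(records, selections):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r[f] == v for f, v in selections.items())]


def refinements(records, selections):
    """Propose refinements from the current results.

    For each facet not yet selected, count its values within `records` and
    keep only the values that would actually narrow the result set.
    """
    options = {}
    for facet in FACETS:
        if facet in selections:
            continue
        counts = Counter(r[facet] for r in records)
        narrowing = {v: n for v, n in counts.items() if n < len(records)}
        if narrowing:
            options[facet] = narrowing
    return options


selections = {"subject": "search"}
results = select(CATALOG, selections)
print(refinements(results, selections))
```

With the toy data above, selecting the subject "search" leaves two records, and the sketch prints `{'format': {'book': 1, 'video': 1}}`. The decade facet is not offered because both remaining records share the same decade, so choosing it would not narrow anything. A static taxonomy would have shown the same branches regardless of what the seeker had already chosen.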
