FXPAL Paper on Exploratory Search

Posted on September 22, 2011


I read a paper from FXPAL on a new framework and system for exploratory search. It's been months since I read it, yet I want to make a connection with my recent work here. They characterize exploratory search as follows:

 Exploratory search is often characterized by an evolving information need and the likelihood that the information sought is distributed across multiple documents. Thus the goal of the search process is not to formulate the perfect query or to find the ideal document, but to collect information through a variety of means, and to combine the discovered information to achieve a coherent understanding of some topic.


The paper defines several categories of objects (document, document set, query, query set) which users create and judge during a search session. For instance, starting with a query and its results, a user judges documents, issues new queries based on previous judgments, and so on. Given these definitions, the task of exploratory search becomes a sequence of transitions between these objects, at the end of which the user has a set of queries and documents about the topic of interest.
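The object types above can be sketched as a small data model; this is my own minimal reading of the paper's categories, with all class and method names being illustrative assumptions rather than anything from the paper:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the paper's object categories: queries,
# documents, and the sets the user accumulates during a session.

@dataclass(frozen=True)
class Query:
    text: str

@dataclass(frozen=True)
class Document:
    doc_id: str

@dataclass
class Session:
    queries: list = field(default_factory=list)    # query set built so far
    documents: list = field(default_factory=list)  # document set built so far

    def issue_query(self, query: Query, results: list):
        """Query -> Document Set transition: record the query and its results."""
        self.queries.append(query)
        self.documents.extend(results)

# One step of an exploratory session: issue a query, collect its results.
session = Session()
session.issue_query(Query("exploratory search"), [Document("d1"), Document("d2")])
print(len(session.queries), len(session.documents))  # 1 2
```

A full session would then be a sequence of such transitions, each adding to or refining the query set and document set.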

I think the paper is interesting for several reasons. First of all, in defining the transitions between objects, they introduce several new ones, such as meta-search (a query set leads to a document set) and relevance feedback (a document set leads to a document set). While these techniques are not new in themselves, I think the model of exploratory search that incorporates them is novel.

Secondly, they stress that the context (queries and documents seen so far) should play an important role in the exploratory search task. The prototype system they implemented (SACK: Selective Application of Contextual Knowledge) supports reviewing and selecting a subset of the session context to make further progress in the search task.

For instance, while the system displays the list of documents retrieved so far, it also displays the contribution of each query in retrieving each document, and the user can select a subset of queries based on what seems useful for retrieving the current set of relevant documents. Combined with the meta-search method mentioned above, this can provide a powerful mechanism for refining the user's expression of information needs to the system.
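The query-contribution idea can be illustrated in a few lines; this is a sketch under my own assumptions (the data and function names are hypothetical, not from SACK): track which queries retrieved each document, then keep only the documents supported by a user-selected query subset.

```python
# Hypothetical sketch: map each retrieved document to the set of
# queries that contributed to retrieving it.
retrieved = {
    "d1": {"q1", "q2"},  # d1 was retrieved by queries q1 and q2
    "d2": {"q2"},
    "d3": {"q3"},
}

def filter_by_queries(retrieved, selected):
    """Keep documents retrieved by at least one user-selected query."""
    return {doc for doc, queries in retrieved.items() if queries & selected}

# The user decides q2 best captures the information need:
print(sorted(filter_by_queries(retrieved, {"q2"})))  # ['d1', 'd2']
```

The selected query subset could then feed the meta-search transition to retrieve further documents.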


The evaluation method employed in this paper is example-based, showing how a user can find documents on a TREC topic. While the example is quite illustrative, I think a user study will be necessary to further verify the value of this approach. The study could compare the system with a traditional IR system on a set of well-defined tasks.

For instance, users can be given a set of TREC topics and asked to find documents using the suggested system. A control group of users can be given the same set of tasks and a traditional search engine. In the end, the amount of effort and the quality of results can be compared to evaluate the system against the traditional search system.

Since we would want to evaluate each session as a whole, we can use usage-based evaluation measures like the one suggested by Azzopardi et al. The experimental conditions can be further refined by allowing users to perform different types of transitions, and observing how these variations affect users' performance.

Another possibility is using a simulation technique, based on a reasonable model of a user interacting with the system. If the role of the user is to move between the states shown in the figure below, we can have an agent with a reasonable model of the user's knowledge and behavior do the job. Since some parts of the interaction would be performed by the system algorithmically (e.g. the retrieval model), we only need to model the user's part (formulation and evaluation of queries).

State transitions in the suggested model of exploratory search. (A dashed line denotes a user action, while a solid line is an action fulfilled by the system algorithmically.)

For instance, the user model can issue a query by selecting terms from each TREC topic. Given the initial results, it can make relevance judgments (probably based on TREC judgments), and either retrieve further documents based on the current set of documents, or use them to select a subset of queries for retrieving more documents, which in turn can be judged. This process can be repeated until some stopping criterion is met, and the resulting interaction can be evaluated in the same way as actual user logs.
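The simulation loop described above might look roughly as follows. This is a toy sketch under stated assumptions: `search` stands in for the retrieval model, `RELEVANT` stands in for TREC judgments, and the tiny document pool exists only to make the example self-contained.

```python
import random

random.seed(0)

TOPIC_TERMS = ["exploratory", "search", "evaluation", "framework"]
RELEVANT = {"d1", "d3"}  # stand-in for TREC relevance judgments

def search(query):
    """Hypothetical retrieval model: returns document ids for a query term."""
    pool = {"exploratory": ["d1", "d2"], "search": ["d2", "d3"],
            "evaluation": ["d3", "d4"], "framework": ["d1", "d4"]}
    return pool.get(query, [])

def simulate(max_steps=4):
    seen, judged_relevant = set(), set()
    for _ in range(max_steps):
        query = random.choice(TOPIC_TERMS)   # query formulation (user side)
        for doc in search(query):            # retrieval (system side)
            if doc not in seen:
                seen.add(doc)
                if doc in RELEVANT:          # relevance judgment (user side)
                    judged_relevant.add(doc)
        if judged_relevant >= RELEVANT:      # stopping criterion
            break
    return seen, judged_relevant

seen, judged_relevant = simulate()
# Every judged-relevant document must have been seen first.
print(judged_relevant.issubset(seen))  # True
```

The resulting `(seen, judged_relevant)` trace plays the role of a user log, which session-level measures could then score.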

This kind of simulation approach certainly would not substitute for a user study, yet it provides an efficient way of tuning the many parameters of the system (e.g. the retrieval model) before an actual user study. More importantly, it enables the system to be evaluated under various assumptions about the user, given that we can parameterize the user model based on such assumptions.

For instance, we can expect that some users will be more inclined to rely on the Document Set to Document Set transition, while others will tend to use the Document to Query Set transition more often. Users will also vary in how many queries they issue before they start using other types of interaction. By parameterizing the user model to control these crucial aspects of user behavior, we can evaluate the system under each of these conditions, and in the end assess the effectiveness of the user's interaction under variations of such conditions.
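One way to parameterize such behavioral variation is to give each simulated user a distribution over transition types; a sketch, where all parameter names and probability values are illustrative assumptions of mine:

```python
import random

random.seed(42)

def choose_transition(params):
    """Sample the next transition type from the user's preference distribution."""
    r, acc = random.random(), 0.0
    for name, prob in params.items():
        acc += prob
        if r < acc:
            return name
    return name  # guard against floating-point rounding

# Two hypothetical user profiles: an "explorer" leans on relevance
# feedback, while a "querier" mostly issues fresh queries.
explorer = {"doc_set_to_doc_set": 0.6, "doc_to_query_set": 0.2, "new_query": 0.2}
querier  = {"doc_set_to_doc_set": 0.1, "doc_to_query_set": 0.2, "new_query": 0.7}

counts = {}
for _ in range(1000):
    transition = choose_transition(explorer)
    counts[transition] = counts.get(transition, 0) + 1

# The explorer profile should choose relevance feedback most often.
print(counts["doc_set_to_doc_set"] > counts["new_query"])
```

Running the full simulation once per profile would then show how the system's effectiveness varies with each behavioral assumption.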

As a similar case, my recent HCIR paper evaluates a user's interaction with a known-item finding system which supports both term-based search and associative browsing between documents. Based on experiments using a simulated user model, we studied how the user's interaction with the system depends on the user's level of knowledge and pattern of behavior.

State transitions in known-item finding.

The figure above shows the model of the user's interaction with the system (more in the paper), and I think a variant of this kind of model is equally applicable to the evaluation of more complex interactions, such as exploratory search. I also compared the simulation results with user study results, which further validated the simulation method we used.

Most search tasks are exploratory…

In my view, most search tasks, except for known-item and navigational ones, are exploratory in nature, and thus are likely to benefit from the proposed framework. Users will be able to better exploit their previous search experience, and build a concrete form of knowledge over time. I can't wait to see what the authors, as well as others, will come up with next!

Posted in: HCI, IR