Marc Krellenstein, Ph.D.
Technology for Integrated Information Access and Discovery

Search technology has improved over the last several years, and end-users have become more expert. For many sorts of searches, currently available technologies, if properly deployed, are good enough to satisfy common or easily defined information requests against a particular set of documents. The most significant current challenges, especially for the academic and professional communities, are (1) simplifying access to needed materials by integrating ever-larger separate repositories while simultaneously increasing the granularity (or specificity) of search and retrieval, and (2) providing tools for more difficult information needs and genuine discovery, which larger, integrated data stores make even harder. The key to meeting the integration and granularity challenge is scalable, XML-based distributed search, supplemented as necessary by standards-based federated searching (i.e., meta-searching). For satisfying more complex information needs, there are both existing technologies that are not yet widely deployed and newer ones that could have a significant impact today. These include document clustering, descriptive or "long query" searching, and text-based data mining, the last of which reflects the increased importance of natural language processing and marks the real beginning of post-search approaches to text-based knowledge discovery.

Bielefeld University Library - last update: 01/20/2004