Defense Date

11-17-2006

Graduation Date

2006

Availability

Immediate Access

Submission Type

thesis

Degree Name

MS

Department

Computational Mathematics

School

McAnulty College and Graduate School of Liberal Arts

Committee Chair

Patrick Juola

Committee Member

John C. Kern

Committee Member

Kathleen Taylor

Keywords

back-of-the-book indexing, hierarchical cluster analysis, humanities computing, latent semantic analysis, singular value decomposition, word sense disambiguation

Abstract

Back-of-the-book indexing is the process of generating a list of relevant terms, sub-terms and cross-references from a corpus and providing the user with corresponding page references.

Producing a good index requires several cognitive tasks, most of which are performed by a human indexer. Computer applications have automated parts of the process, but at best they generate a concordance and serve only to reduce its mundane portions. None of these tools determines which terms to index, nor do they capture context-sensitive information about terms and their relationships. These time-consuming tasks still fall to human indexers.

The challenge is to develop software that bridges the gap between computerized concordances and manual indexing. The prototype application described herein is unique in its ability to incorporate the intelligent portions of the process. As a result, it provides a robust draft index that a human indexer can refine in a fraction of the time that fully manual indexing would take.

Format

PDF

Language

English
