McAnulty College and Graduate School of Liberal Arts
John C. Kern
back-of-the-book indexing, hierarchical cluster analysis, humanities computing, latent semantic analysis, singular value decomposition, word sense disambiguation
Back-of-the-book indexing is the process of generating a list of relevant terms, sub-terms and cross-references from a corpus and providing the user with corresponding page references.
Producing a good index requires several cognitive tasks, most of which are performed by the human indexer. Computer applications have automated parts of the process, but at best they generate a concordance and serve only to reduce its mundane portions. None of these tools determines which terms to index, and none captures context-sensitive information about terms and their relationships. Human indexers still perform these time-consuming tasks.
The challenge is to develop software that bridges the gap between computerized concordances and manual indexing. The prototype application described herein is unique in its ability to incorporate the intelligent portions of the process. As a result, it provides a robust draft index that a human indexer can refine in a fraction of the time that manual indexing requires.
Lukon, S. (2006). A Machine-Aided Approach to Intelligent Index Generation (Master's thesis, Duquesne University). Retrieved from https://dsc.duq.edu/etd/842