Identification and Resolution of Entity Mentions in Text: A Novel Pipelined Approach
Detecting the entities mentioned in a text is an essential part of understanding that text. Entities represent the main concepts of the discourse, what the document "is all about"; without them, the text is just a succession of words. Entity detection has applications in many areas of natural language processing, such as machine translation, summarization, information retrieval, and question answering, all of which depend on a thorough understanding of the conceptual structure of discourse. This book proposes a novel method for detecting entities and their mentions in natural language text. The process is divided into two successive steps: first detecting all the mentions in a text, then grouping together the mentions that refer to the same entity. Its novel contributions are the use of the semantic hierarchies of the WordNet lexical database to determine the types of mentions, and a top-down, graph-based approach to mention clustering. This book should be of interest to students of natural language processing and to anyone else who would like some insight into the automatic understanding of text.