Textual Data Mining
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:168-174, 1995.
Most automated or semi-automated techniques for extracting novel information from data have concentrated on analyzing simple tables of numeric or atomic symbolic values. A related (but much more complex) problem, that of inferring new facts or knowledge from textual databases, has been addressed most effectively by the library and information retrieval research communities. This paper incorporates several ad hoc search strategies proposed by those communities into a single search methodology that guides the search process and provides a graphical framework for presenting the facts gleaned from the search. This representation of search results is semi-formal: it captures the structure of the search results formally, while their contents are represented informally. The methodology is intended as an aid to "mining" new scientific information from textual/bibliographic databases, rather than as an automated proof system.