Learning from Data by Guiding the Analyst: On the Representation, Use and Creation of Visual Statistical Strategies
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:531-539, 1995.
The concept of statistical strategy is introduced and used to develop a structured graphical user interface for guiding data analysts so that they can learn about the structure of their data. The interface visually represents statistical strategies that are designed by expert data analysts to guide novices. The representation is an abstraction of the expert’s concepts of the essence of a data analysis. The interface consists of two interacting windows: the guidemap and the workmap. An example is shown in Figure 1 (a screen image from ViSta (Young, 1994), software that implements the ideas in this paper). Each window contains a graph of nodes and edges. The guidemap graph represents the statistical strategy for a specific statistical task (such as describing data). Its nodes represent potential data-analysis actions that can be taken by the system, and its edges represent potential actions that can be taken by the analyst. The guidemap graph exists prior to the data-analysis session, having been created by an expert. The workmap graph represents the complete history of all steps taken by the data analyst. It is constructed during the data-analysis session as a result of the analyst’s actions. Workmap nodes represent datasets, data models, or data-analysis procedures that have been created or used by the analyst. Workmap edges represent the chronological sequence of the analyst’s actions. One workmap node is highlighted to show which statistical object is the current focus of the strategy.
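The two graphs described above can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not the cited software's implementation: the class names `Guidemap` and `Workmap`, the helper `take_step`, and the example action and object labels are all hypothetical. The guidemap is modelled as a fixed graph prepared before the session, and the workmap as a history that grows with each analyst step and tracks one focus node.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Guidemap:
    """Expert-authored strategy graph; it exists before the session starts."""
    task: str
    # Hypothetical encoding: edges[a] lists the actions reachable from action a.
    edges: Dict[str, List[str]] = field(default_factory=dict)

    def next_actions(self, action: str) -> List[str]:
        return self.edges.get(action, [])


@dataclass
class Workmap:
    """History graph; it is built up during the session."""
    nodes: List[str] = field(default_factory=list)               # datasets, models, procedures
    edges: List[Tuple[str, str]] = field(default_factory=list)   # chronological order
    focus: Optional[str] = None                                  # the highlighted node

    def record(self, obj: str) -> None:
        # Link the previous focus to the new object, then move the focus.
        if self.focus is not None:
            self.edges.append((self.focus, obj))
        if obj not in self.nodes:
            self.nodes.append(obj)
        self.focus = obj


def take_step(guide: Guidemap, work: Workmap,
              current: str, chosen: str, produced: str) -> str:
    """Follow one guidemap edge and log the resulting object in the workmap."""
    if chosen not in guide.next_actions(current):
        raise ValueError(f"{chosen!r} is not offered after {current!r}")
    work.record(produced)
    return chosen


# Hypothetical strategy for the task "describing data".
guide = Guidemap(
    task="describe data",
    edges={"start": ["summarize", "plot"], "summarize": ["plot"], "plot": []},
)
work = Workmap()
work.record("dataset: survey")
current = take_step(guide, work, "start", "summarize", "procedure: summary statistics")
current = take_step(guide, work, current, "plot", "procedure: histogram")
```

In this sketch the guidemap is never modified during the session, while every accepted step adds a node and a chronological edge to the workmap and moves its focus, mirroring the division of roles between the two windows.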