- title: 'Models for Conditional Probability Tables in Educational Assessment'
  abstract: 'Experts in educational assessment can often identify the skills needed to provide a solution for a test item and which patterns of those skills produce better expected performance. The method described here combines judgements about the structure of the conditional probability table (e.g., conjunctive or compensatory) with Item Response Theory methods for partial credit scoring (Samejima, 1969) to produce a conditional probability table or a prior distribution for a learning algorithm. The structural judgements induce a projection of each configuration of parent skill variables onto a single latent response-propensity $\theta$. This is then used to calculate a probability for each cell in the table.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/almond01a.html
  PDF: http://proceedings.mlr.press/r3/almond01a/almond01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-almond01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Russell G.
    family: Almond
  - given: Lou
    family: DiBello
  - given: Frank
    family: Jenkins
  - given: Deniz
    family: Senturk
  - given: Robert J.
    family: Mislevy
  - given: Linda S.
    family: Steinberg
  - given: Duanli
    family: Yan
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 1-7
  id: almond01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 1
  lastpage: 7
  published: 2001-01-04 00:00:00 +0000
- title: 'Learning in high dimensions: Modular Mixture Models'
  abstract: 'We present a new approach to learning probabilistic models for high dimensional data. This approach divides the data dimensions into low dimensional subspaces, and learns a separate mixture model for each subspace. The models combine in a principled manner to form a flexible modular network that produces a total density estimate. We derive and demonstrate an iterative learning algorithm that uses only local information.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/attias01a.html
  PDF: http://proceedings.mlr.press/r3/attias01a/attias01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-attias01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Hagai
    family: Attias
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 8-12
  id: attias01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 8
  lastpage: 12
  published: 2001-01-04 00:00:00 +0000
- title: 'Learning Bayesian networks with mixed variables'
  abstract: 'The paper considers conditional Gaussian networks. As conjugate local priors, we use the Dirichlet distribution for discrete variables and the Gaussian-inverse Gamma distribution for continuous variables, given a configuration of the discrete parents. We assume parameter independence and complete data. Further, the network-score is calculated. We then develop a local master prior procedure, for deriving parameter priors in CG networks. The local master procedure satisfies parameter independence, parameter modularity and likelihood equivalence.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/bottcher01a.html
  PDF: http://proceedings.mlr.press/r3/bottcher01a/bottcher01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-bottcher01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Susanne
    family: Bottcher
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 13-20
  id: bottcher01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 13
  lastpage: 20
  published: 2001-01-04 00:00:00 +0000
- title: 'Products of Hidden Markov Models'
  abstract: 'We present products of hidden Markov models (PoHMM’s), a way of combining HMM’s to form a distributed state time series model. Inference in a PoHMM is tractable and efficient. Learning of the parameters, although intractable, can be effectively done using the Product of Experts learning rule. The distributed state helps the model to explain data which has multiple causes, and the fact that each model need only explain part of the data means a PoHMM can capture longer range structure than an HMM is capable of. We show some results on modelling character strings, a simple language task and the symbolic family trees problem, which highlight these advantages.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/brown01a.html
  PDF: http://proceedings.mlr.press/r3/brown01a/brown01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-brown01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Andrew D.
    family: Brown
  - given: Geoffrey E.
    family: Hinton
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 21-28
  id: brown01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 21
  lastpage: 28
  published: 2001-01-04 00:00:00 +0000
- title: 'Information-Theoretic Advisors in Invisible Chess'
  abstract: 'Making decisions under uncertainty remains a central problem in AI research. Unfortunately, most uncertain real-world problems are so complex that progress in them is extremely difficult. Games model some elements of the real world, and offer a more controlled environment for exploring methods for dealing with uncertainty. Chess and chesslike games have long been used as a strategically complex test-bed for general AI research, and we extend that tradition by introducing an imperfect information variant of chess with some useful properties such as the ability to scale the amount of uncertainty in the game. We discuss the complexity of this game which we call invisible chess, and present results outlining the basic game. We motivate and describe the implementation and application of two information-theoretic advisors, and describe our decision-theoretic approach to combining these information-theoretic advisors with a basic strategic advisor. Finally we discuss promising preliminary results that we have obtained with these advisors.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/bud01a.html
  PDF: http://proceedings.mlr.press/r3/bud01a/bud01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-bud01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Ariel E.
    family: Bud
  - given: David W.
    family: Albrecht
  - given: Ann E.
    family: Nicholson
  - given: Ingrid
    family: Zukerman
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 29-34
  id: bud01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 29
  lastpage: 34
  published: 2001-01-04 00:00:00 +0000
- title: 'A Non-Parametric EM-Style Algorithm for Imputing Missing Values'
  abstract: 'We present an iterative non-parametric algorithm for imputing missing values. The algorithm is similar to EM except that it uses non-parametric models such as k-nearest neighbor or kernel regression instead of the parametric models used with EM. An interesting feature of the algorithm is that the E and M steps collapse into a single step because the data being filled in is the model - updating the filled-in values updates the model at the same time. The main advantages of this approach compared to parametric EM methods are that: 1) it is more efficient for moderate size data sets, and 2) it is less susceptible to errors that parametric methods make when the parametric models do not fit the data well. The robustness to model failure makes the non-parametric method more accurate when models of the data are not known a priori and cannot be determined reliably. We evaluate the method using a real medical data set that has many missing values.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/caruana01a.html
  PDF: http://proceedings.mlr.press/r3/caruana01a/caruana01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-caruana01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Rich
    family: Caruana
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 35-40
  id: caruana01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 35
  lastpage: 40
  published: 2001-01-04 00:00:00 +0000
- title: 'Managing Multiple Models'
  abstract: 'Recent research in model selection and adaptive modeling has produced an embarrassment of riches. By using any one of several different techniques, an analyst is able to generate a number of models that describe the same data set well. Examples include multiple tree models generated by bootstrapping or stochastic searches, and different subsets of variables in linear regression models identified by stochastic or exhaustive searches. While model averaging can use these models to improve prediction accuracy, interpretation of the resultant models becomes difficult. We seek a compromise, developing measures of dissimilarity between different models and using these to select good models which may reveal different aspects of the data. Data on housing prices in Boston are used to illustrate this in the context of treed regression models.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/chipman01a.html
  PDF: http://proceedings.mlr.press/r3/chipman01a/chipman01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-chipman01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Hugh A.
    family: Chipman
  - given: Edward I.
    family: George
  - given: Robert E.
    family: McCulloch
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 41-48
  id: chipman01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 41
  lastpage: 48
  published: 2001-01-04 00:00:00 +0000
- title: 'Solving Hidden-Mode Markov Decision Problems'
  abstract: 'Hidden-mode Markov decision processes (HM-MDPs) are a novel mathematical framework for a subclass of nonstationary reinforcement learning problems where environment dynamics change over time according to a Markov process. HM-MDPs are a special case of partially observable Markov decision processes (POMDPs), and therefore nonstationary problems of this type can in principle be addressed indirectly via existing POMDP algorithms. However, previous research has shown that such an indirect approach is inefficient compared with a direct HM-MDP approach in terms of the model learning time. In this paper, we investigate how to solve HM-MDP problems efficiently by using a direct approach. We exploit the HM-MDP structure and derive an equation for dynamic programming update. Our equation decomposes the value function into a number of components and as a result, substantially reduces the amount of computation in finding optimal policies. Based on the incremental pruning and point-based improvement techniques, a value iteration algorithm is also implemented. Empirical results show that the HM-MDP approach outperforms the POMDP one by several orders of magnitude with respect to both space requirement and speed.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/choi01a.html
  PDF: http://proceedings.mlr.press/r3/choi01a/choi01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-choi01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Samuel Ping-Man
    family: Choi
  - given: Nevin Lianwen
    family: Zhang
  - given: Dit-Yan
    family: Yeung
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 49-56
  id: choi01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 49
  lastpage: 56
  published: 2001-01-04 00:00:00 +0000
- title: 'Bagging and the Bayesian Bootstrap'
  abstract: 'Bagging is a method of obtaining more robust predictions when the model class under consideration is unstable with respect to the data, i.e., small changes in the data can cause the predicted values to change significantly. In this paper, we introduce a Bayesian version of bagging based on the Bayesian bootstrap. The Bayesian bootstrap resolves a theoretical problem with ordinary bagging and often results in more efficient estimators. We show how model averaging can be combined within the Bayesian bootstrap and illustrate the procedure with several examples.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/clyde01a.html
  PDF: http://proceedings.mlr.press/r3/clyde01a/clyde01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-clyde01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Merlise
    family: Clyde
  - given: Herbert
    family: Lee
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 57-62
  id: clyde01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 57
  lastpage: 62
  published: 2001-01-04 00:00:00 +0000
- title: 'Hyperparameters for Soft Bayesian Model Selection'
  abstract: 'Mixture models, in which a probability distribution is represented as a linear superposition of component distributions, are widely used in statistical modeling and pattern recognition. One of the key tasks in the application of mixture models is the determination of a suitable number of components. Conventional approaches based on cross-validation are computationally expensive, are wasteful of data, and give noisy estimates for the optimal number of components. A fully Bayesian treatment, based on Markov chain Monte Carlo methods for instance, will return a posterior distribution over the number of components. However, in practical applications it is generally convenient, or even computationally essential, to select a single, most appropriate model. Recently it has been shown, in the context of linear latent variable models, that the use of hierarchical priors governed by continuous hyperparameters whose values are set by type-II maximum likelihood, can be used to optimize model complexity. In this paper we extend this framework to mixture distributions by considering the classical task of density estimation using mixtures of Gaussians. We show that, by setting the mixing coefficients to maximize the marginal log-likelihood, unwanted components can be suppressed, and the appropriate number of components for the mixture can be determined in a single training run without recourse to cross-validation. Our approach uses a variational treatment based on a factorized approximation to the posterior distribution.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/corduneanu01a.html
  PDF: http://proceedings.mlr.press/r3/corduneanu01a/corduneanu01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-corduneanu01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Adrian
    family: Corduneanu
  - given: Christopher M.
    family: Bishop
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 63-70
  id: corduneanu01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 63
  lastpage: 70
  published: 2001-01-04 00:00:00 +0000
- title: 'On searching for optimal classifiers among Bayesian networks'
  abstract: 'There is much interest in constructing from datasets Bayesian networks which are efficient, or even optimal, for classification purposes. Most search strategies usually discriminate between networks by comparing their marginal likelihood score, but recently it has been suggested that search strategies for classifiers should instead select among models using alternative scores. This paper contributes to this discussion by presenting the results of simulations on the sets of all directed acyclic graphs on four and five nodes. Our results add evidence to earlier indications that the marginal likelihood is likely to be a poor criterion to use for classifier selection.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/cowell01a.html
  PDF: http://proceedings.mlr.press/r3/cowell01a/cowell01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-cowell01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Robert G.
    family: Cowell
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 71-76
  id: cowell01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 71
  lastpage: 76
  published: 2001-01-04 00:00:00 +0000
- title: 'Statistical Aspects of Stochastic Logic Programs'
  abstract: 'Stochastic logic programs (SLPs) and the various distributions they define are presented with a stress on their characterisation in terms of Markov chains. Sampling, parameter estimation and structure learning for SLPs are discussed. Applications of SLPs to Bayesian learning, computational linguistics and computational biology are considered. Lafferty’s Gibbs-Markov models are compared and contrasted with SLPs.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/cussens01a.html
  PDF: http://proceedings.mlr.press/r3/cussens01a/cussens01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-cussens01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: James
    family: Cussens
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 77-82
  id: cussens01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 77
  lastpage: 82
  published: 2001-01-04 00:00:00 +0000
- title: 'Some variations on variation independence.'
  abstract: 'Variation independence of functions is a simple natural ‘irrelevance’ property arising in a number of applications in Artificial Intelligence and Statistics. We show how it can be alternatively expressed in terms of two other representations of the same underlying structure: equivalence relations and $\tau$-fields.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/dawid01a.html
  PDF: http://proceedings.mlr.press/r3/dawid01a/dawid01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-dawid01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: A. Philip
    family: Dawid
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 83-86
  id: dawid01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 83
  lastpage: 86
  published: 2001-01-04 00:00:00 +0000
- title: 'Are they really neighbors? A statistical analysis of the SOM algorithm output'
  abstract: 'One of the attractive features of Self-Organizing Maps (SOM) is the so-called "topological preservation property": observations that are close to each other in the input space (at least locally) remain close to each other in the SOM. In this work, we propose the use of a bootstrap scheme to construct a statistical significance test of the observed proximity among individuals in the SOM.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/bodt01a.html
  PDF: http://proceedings.mlr.press/r3/bodt01a/bodt01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-bodt01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Eric
    prefix: de
    family: Bodt
  - given: Marie
    family: Cottrell
  - given: Michel
    family: Verleysen
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 87-92
  id: bodt01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 87
  lastpage: 92
  published: 2001-01-04 00:00:00 +0000
- title: 'Monte-Carlo Algorithms for the Improvement of Finite-State Stochastic Controllers: Application to Bayes-Adaptive Markov Decision Processes'
  abstract: 'We consider the problem of "optimal learning" for Markov decision processes with uncertain transition probabilities. Motivated by the correspondence between these processes and partially-observable Markov decision processes, we adopt policies expressed as finite-state stochastic automata, and we propose policy improvement algorithms that utilize Monte-Carlo techniques for gradient estimation and ascent.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/duff01a.html
  PDF: http://proceedings.mlr.press/r3/duff01a/duff01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-duff01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Michael O.
    family: Duff
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 93-97
  id: duff01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 93
  lastpage: 97
  published: 2001-01-04 00:00:00 +0000
- title: 'Why Averaging Classifiers can Protect Against Overfitting'
  abstract: 'We study a simple learning algorithm for binary classification. Instead of predicting with the best hypothesis in the hypothesis class, this algorithm predicts with a weighted average of all hypotheses, weighted exponentially with respect to their training error. We show that the prediction of this algorithm is much more stable than the prediction of an algorithm that predicts with the best hypothesis. By allowing the algorithm to abstain from predicting on some examples, we show that the predictions it makes when it does not abstain are very reliable. Finally, we show that the probability that the algorithm abstains is comparable to the generalization error of the best hypothesis in the class.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/freund01a.html
  PDF: http://proceedings.mlr.press/r3/freund01a/freund01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-freund01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Yoav
    family: Freund
  - given: Yishay
    family: Mansour
  - given: Robert E.
    family: Schapire
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 98-105
  id: freund01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 98
  lastpage: 105
  published: 2001-01-04 00:00:00 +0000
- title: 'Dual perturb and combine algorithm'
  abstract: 'In this paper, a dual perturb and combine algorithm is proposed which consists in producing the perturbed predictions at the prediction stage using only one model. To this end, the attribute vector of a test case is perturbed several times by an additive random noise, the model is applied to each of these perturbed vectors and the resulting predictions are aggregated. An analytical version of this algorithm is described in the context of decision tree induction. From experiments on several datasets, it appears that this simple algorithm yields significant improvements on several problems, sometimes comparable to those obtained with bagging. When combined with decision tree bagging, this algorithm also improves accuracy in many problems.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/geurts01a.html
  PDF: http://proceedings.mlr.press/r3/geurts01a/geurts01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-geurts01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Pierre
    family: Geurts
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 106-111
  id: geurts01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 106
  lastpage: 111
  published: 2001-01-04 00:00:00 +0000
- title: 'Handling Missing and Unreliable Information in Speech Recognition'
  abstract: 'In this work, techniques for classification with missing or unreliable data are applied to the problem of noise-robustness in Automatic Speech Recognition (ASR). The primary advantage of this viewpoint is that it makes minimal assumptions about any noise background. As motivation, we review evidence that the auditory system is capable of dealing with incomplete data and, indeed, does so in normal listening conditions. We formulate the unreliable classification problem and show how it can be expressed in the framework of Continuous Density Hidden Markov Models for statistical ASR. We describe experiments on connected digit recognition in noise in which encouraging results are obtained. Results are improved by ‘softening’ the missing data decision. We argue that if the noise background is unpredictable it is necessary to integrate primitive processes which identify coherent spectral-temporal regions likely to be dominated by a single source with a generalised recognition decoder which searches for the best sub-set of regions which match a speech source. We describe an implementation of a multi-source decoder using missing data recognition and show how it improves recognition results for non-stationary noises.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/green01a.html
  PDF: http://proceedings.mlr.press/r3/green01a/green01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-green01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Phil D.
    family: Green
  - given: Jon
    family: Barker
  - given: Martin
    family: Cooke
  - given: Ljubomir
    family: Josifovski
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 112-116
  id: green01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 112
  lastpage: 116
  published: 2001-01-04 00:00:00 +0000
- title: 'Discriminant Analysis on Dissimilarity Data : a New Fast Gaussian like Algorithm'
  abstract: 'Classifying objects according to their proximity is the fundamental task of pattern recognition and arises as a classification problem or discriminant analysis in experimental sciences. Here we consider a particular point of view on discriminant analysis from a dissimilarity data table. We develop a new approach, inspired by the Gaussian model in discriminant analysis, which defines a set of decision rules from simple statistics on the dissimilarity matrix between observations. This matrix may be only sparse when dealing with huge databases. Numerical experiments on artificial and real data (protein classification) show interesting behaviour compared to a $K$NN classifier: (i) equivalent error rate, (ii) dramatically lower CPU times and (iii) more robustness with sparse dissimilarity structure up to $40\%$ of actual dissimilarity measures.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/guerin-dugue01a.html
  PDF: http://proceedings.mlr.press/r3/guerin-dugue01a/guerin-dugue01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-guerin-dugue01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Anne
    family: Guérin-Dugué
  - given: Gilles
    family: Celeux
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 117-122
  id: guerin-dugue01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 117
  lastpage: 122
  published: 2001-01-04 00:00:00 +0000
- title: 'Profile Likelihood in Directed Graphical Models from BUGS Output'
  abstract: 'This paper presents a method for using output of the computer program BUGS to obtain approximate profile likelihood functions of parameters or functions of parameters in directed graphical models with incomplete data. The method also provides a tool to approximate integrated likelihood functions. The prior distributions specified in BUGS do not have a significant impact on the profile likelihood functions and we consider the method as a desirable supplement to BUGS that enables us to do both Bayesian and likelihood based analyses in directed graphical models.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/hojbjerre01a.html
  PDF: http://proceedings.mlr.press/r3/hojbjerre01a/hojbjerre01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-hojbjerre01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Malene
    family: Højbjerre
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 123-128
  id: hojbjerre01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 123
  lastpage: 128
  published: 2001-01-04 00:00:00 +0000
- title: 'Is regularization unnecessary for boosting?'
  abstract: 'Boosting algorithms are often observed to be resistant to overfitting, to a degree that one may wonder whether it is harmless to run the algorithms forever, and whether regularization in one way or another is unnecessary [see, e.g., Schapire (1999); Friedman, Hastie and Tibshirani (1999); Grove and Schuurmans (1998); Mason, Baxter, Bartlett and Frean (1999)]. One may also wonder whether it is possible to adapt the boosting ideas to regression, and whether or not it is possible to avoid the need for regularization by just adopting the boosting device. In this paper we present examples where ’boosting forever’ leads to suboptimal predictions, while some regularization method, on the other hand, can achieve asymptotic optimality, at least in theory. We conjecture that this can be true in more general situations, and for some other regularization methods as well. Therefore the emerging literature on regularized variants of boosting is not unnecessary, but should be encouraged instead. The results of this paper are obtained from an analogy between some boosting algorithms that are used in regression and classification.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/jiang01a.html
  PDF: http://proceedings.mlr.press/r3/jiang01a/jiang01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-jiang01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Wenxin
    family: Jiang
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 129-136
  id: jiang01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 129
  lastpage: 136
  published: 2001-01-04 00:00:00 +0000
- title: 'Learning mixtures of smooth, nonuniform deformation models for probabilistic image matching'
  abstract: 'By representing images and image prototypes by linear subspaces spanned by "tangent vectors" (derivatives of an image with respect to translation, rotation, etc.), impressive invariance to known types of uniform distortion can be built into feedforward discriminators. We describe a new probability model that can jointly cluster data and learn mixtures of nonuniform, smooth deformation fields. Our fields are based on low-frequency wavelets, so they use very few parameters to model a wide range of smooth deformations (unlike, e.g., factor analysis, which uses a large number of parameters to model deformations). We give results on handwritten digit recognition and face recognition.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/jojic01a.html
  PDF: http://proceedings.mlr.press/r3/jojic01a/jojic01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-jojic01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Nebojsa
    family: Jojic
  - given: Patrice Y.
    family: Simard
  - given: Brendan J.
    family: Frey
  - given: David
    family: Heckerman
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 137-142
  id: jojic01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 137
  lastpage: 142
  published: 2001-01-04 00:00:00 +0000
- title: 'Predicting with Variables Constructed from Temporal Sequences'
  abstract: 'In this study, we applied the local learning paradigm and conditional independence assumptions to control the rapid growth of the dimensionality introduced by multivariate time series. We also combined various univariate time series with different stationarity assumptions in temporal models. These techniques are applied to learn simple Bayesian networks from temporal data and to predict survival probabilities of ICU patients on every day of their ICU stay.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/kayaalp01a.html
  PDF: http://proceedings.mlr.press/r3/kayaalp01a/kayaalp01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-kayaalp01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Mehmet
    family: Kayaalp
  - given: Gregory F.
    family: Cooper
  - given: Gilles
    family: Clermont
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 143-148
  id: kayaalp01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 143
  lastpage: 148
  published: 2001-01-04 00:00:00 +0000
- title: 'Another look at sensitivity of Bayesian networks to imprecise probabilities'
  abstract: 'Empirical study of sensitivity analysis on a Bayesian network examines the effects of varying the network’s probability parameters on the posterior probabilities of the true hypothesis. One appealing approach to modeling the uncertainty of the probability parameters is to add normal noise to the log-odds of the nominal probabilities. However, the paper argues that differences in sensitivities found on the true hypothesis may only be valid in the range of standard deviations where the log-odds normal distribution is unimodal. The paper also shows that using average posterior probabilities as a criterion to measure the sensitivity may not be the most indicative, especially when the distribution is very asymmetric, as is the case at nominal values close to zero or one. It is proposed, instead, to use the partial ordering of the most probable causes of diagnosis, measured by a suitable lower confidence bound. The paper also presents the preliminary results of our sensitivity analysis experiments with three Bayesian networks built for diagnosis of airplane systems. Our results show that some networks are more sensitive to imprecision in probabilities than previously believed.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/kipersztok01a.html
  PDF: http://proceedings.mlr.press/r3/kipersztok01a/kipersztok01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-kipersztok01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Oscar
    family: Kipersztok
  - given: Haiqin
    family: Wang
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 149-155
  id: kipersztok01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 149
  lastpage: 155
  published: 2001-01-04 00:00:00 +0000
- title: 'Comparing Prequential Model Selection Criteria in Supervised Learning of Mixture Models'
  abstract: 'In this paper we study prequential model selection criteria in supervised learning domains. The main problem with this approach is that the criterion is sensitive to the order in which the data are processed. We discuss several approaches for addressing the ordering problem, and empirically compare their performance in real-world supervised model selection tasks. The empirical results demonstrate that with the prequential approach it is quite easy to find predictive models that are significantly more accurate classifiers than the models found by the standard unsupervised marginal likelihood criterion. The results also suggest that averaging over random orderings may be a more sensible strategy for solving the ordering problem than trying to find the ordering optimizing the prequential model selection criterion.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/kontkanen01a.html
  PDF: http://proceedings.mlr.press/r3/kontkanen01a/kontkanen01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-kontkanen01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Petri
    family: Kontkanen
  - given: Petri
    family: Myllymäki
  - given: Henry
    family: Tirri
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 156-161
  id: kontkanen01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 156
  lastpage: 161
  published: 2001-01-04 00:00:00 +0000
- title: 'Bayesian Support Vector Regression'
  abstract: 'We show that the Bayesian evidence framework can be applied to both $\epsilon$-support vector regression ($\epsilon$-SVR) and $\nu$-support vector regression ($\nu$-SVR) algorithms. Standard SVR training can be regarded as performing level one inference of the evidence framework, while levels two and three allow automatic adjustments of the regularization and kernel parameters respectively, without the need of a validation set.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/law01a.html
  PDF: http://proceedings.mlr.press/r3/law01a/law01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-law01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Martin H. C.
    family: Law
  - given: James Tin-Yau
    family: Kwok
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 162-167
  id: law01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 162
  lastpage: 167
  published: 2001-01-04 00:00:00 +0000
- title: 'Variational Learning for Multi-Layer Networks of Linear Threshold Units'
  abstract: 'Linear threshold units (LTUs) were originally proposed as models of biological neurons. They were widely studied in the context of the perceptron (Rosenblatt, 1962). Due to the difficulties of finding a general algorithm for networks with hidden nodes, they never passed into general use. In this work we derive an algorithm in the context of probabilistic models and show how it may be applied in multi-layer networks of LTUs. We demonstrate the performance of the algorithm on three data-sets.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/lawrence01a.html
  PDF: http://proceedings.mlr.press/r3/lawrence01a/lawrence01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-lawrence01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Neil D.
    family: Lawrence
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 168-175
  id: lawrence01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 168
  lastpage: 175
  published: 2001-01-04 00:00:00 +0000
- title: 'On the effectiveness of the skew divergence for statistical language analysis'
  abstract: 'Estimating word co-occurrence probabilities is a problem underlying many applications in statistical natural language processing. Distance-weighted (or similarity-weighted) averaging has been shown to be a promising approach to the analysis of novel co-occurrences. Many measures of distributional similarity have been proposed for use in the distance-weighted averaging framework; here, we empirically study their stability properties, finding that similarity-based estimation appears to make more efficient use of more reliable portions of the training data. We also investigate properties of the skew divergence, a weighted version of the Kullback-Leibler (KL) divergence; our results indicate that the skew divergence yields better results than the KL divergence even when the KL divergence is applied to more sophisticated probability estimates.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/lee01a.html
  PDF: http://proceedings.mlr.press/r3/lee01a/lee01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-lee01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Lillian
    family: Lee
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 176-183
  id: lee01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 176
  lastpage: 183
  published: 2001-01-04 00:00:00 +0000
- title: 'A Simulation Study of Three Related Causal Data Mining Algorithms'
  abstract: 'In all scientific domains causality plays a significant role. This study focused on evaluating and refining efficient algorithms to learn causal relationships from observational data. Evaluation of learned causal output is difficult, due to lack of a gold standard in real-world domains. Therefore, we used simulated data from a known causal network in a medical domain, the Alarm network. For causal discovery we used three variants of the Local Causal Discovery (LCD) algorithm, referred to as LCDa, LCDb and LCDc. These algorithms use the framework of causal Bayesian Networks to represent causal relationships among model variables. LCDa, LCDb and LCDc take as input a dataset and a partial node ordering, and output purported causes of the form variable $Y$ causally influences variable $Z$. Using the simulated Alarm dataset as input, LCDa had a false positive rate of $0.09$, LCDb $0.08$ and LCDc $0.04$. All the algorithms had a true positive rate of about $0.27$. Most of the false positives occurred when a causal relationship was confounded. LCDc output as causal only those causally confounded pairs that had very weak confounding. We identify and discuss the causally confounded relationships that often seem to induce false positive results.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/mani01a.html
  PDF: http://proceedings.mlr.press/r3/mani01a/mani01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-mani01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Subramani
    family: Mani
  - given: Gregory F.
    family: Cooper
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 184-191
  id: mani01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 184
  lastpage: 191
  published: 2001-01-04 00:00:00 +0000
- title: 'Finding a path is harder than finding a tree'
  abstract: 'This note shows that the problem of learning an optimal chain graphical model from data is NP-hard for the Bayesian, maximum likelihood, and minimum description length approaches. This hardness result holds despite the fact that the problem is a restriction of the polynomially solvable problem of finding the optimal tree graphical model.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/meek01a.html
  PDF: http://proceedings.mlr.press/r3/meek01a/meek01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-meek01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Christopher
    family: Meek
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 192-195
  id: meek01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 192
  lastpage: 195
  published: 2001-01-04 00:00:00 +0000
- title: 'The Learning Curve Method Applied to Clustering'
  abstract: 'We describe novel fast learning curve methods—methods for scaling inductive methods to large data sets—and their application to clustering. We describe the decision theoretic underpinnings of the approach and demonstrate significant performance gains on two real-world data sets.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/meek01b.html
  PDF: http://proceedings.mlr.press/r3/meek01b/meek01b.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-meek01b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Christopher
    family: Meek
  - given: Bo
    family: Thiesson
  - given: David
    family: Heckerman
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 196-202
  id: meek01b
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 196
  lastpage: 202
  published: 2001-01-04 00:00:00 +0000
- title: 'A Random Walks View of Spectral Segmentation'
  abstract: 'We present a new view of clustering and segmentation by pairwise similarities. We interpret the similarities as edge flows in a Markov random walk and study the eigenvalues and eigenvectors of the walk’s transition matrix. This view shows that spectral methods for clustering and segmentation have a probabilistic foundation. We prove that the Normalized Cut method arises naturally from our framework and we provide a complete characterization of the cases when the Normalized Cut algorithm is exact. Then we discuss other spectral segmentation and clustering methods showing that several of them are essentially the same as NCut.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/meila01a.html
  PDF: http://proceedings.mlr.press/r3/meila01a/meila01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-meila01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Marina
    family: Meilă
  - given: Jianbo
    family: Shi
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 203-208
  id: meila01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 203
  lastpage: 208
  published: 2001-01-04 00:00:00 +0000
- title: 'An improved training algorithm for kernel Fisher discriminants'
  abstract: 'We present a fast training algorithm for the kernel Fisher discriminant classifier. It uses a greedy approximation technique and has an empirical scaling behavior which improves upon the state of the art by more than an order of magnitude, thus rendering the kernel Fisher algorithm a viable option also for large datasets.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/mika01a.html
  PDF: http://proceedings.mlr.press/r3/mika01a/mika01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-mika01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Sebastian
    family: Mika
  - given: Alexander J.
    family: Smola
  - given: Bernhard
    family: Schölkopf
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 209-215
  id: mika01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 209
  lastpage: 215
  published: 2001-01-04 00:00:00 +0000
- title: 'Message Length as an Effective Ockham’s Razor in Decision Tree Induction'
  abstract: 'The validity of the Ockham’s Razor principle is a topic of much debate. A series of empirical investigations have sought to discredit the principle by the application of decision trees to learning tasks using node cardinality as the objective function. As a response to these papers, we suggest that the message length of a hypothesis can be used as an effective interpretation of Ockham’s Razor, resulting in positive empirical support for the principle. The theoretical justification for this Bayesian interpretation is also investigated.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/needham01a.html
  PDF: http://proceedings.mlr.press/r3/needham01a/needham01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-needham01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Scott
    family: Needham
  - given: David L.
    family: Dowe
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 216-223
  id: needham01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 216
  lastpage: 223
  published: 2001-01-04 00:00:00 +0000
- title: 'Using Unsupervised Learning to Guide Resampling in Imbalanced Data Sets'
  abstract: 'The class imbalance problem causes a classifier to over-fit the data belonging to the class with the greatest number of training examples. The purpose of this paper is to argue that methods that equalize class membership are not as effective as possible when applied blindly and that improvements can be obtained by adjusting for the within-class imbalance. A guided resampling technique is proposed and tested within a simpler letter recognition domain and a more difficult text classification domain. A fast unsupervised clustering technique, Principal Direction Divisive Partitioning (PDDP), is used to determine the internal characteristics of each class. Performance in categories that suffer from a large between-class imbalance (few positive examples) is shown to improve when using the guided resampling method.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/nickerson01a.html
  PDF: http://proceedings.mlr.press/r3/nickerson01a/nickerson01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-nickerson01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Adam
    family: Nickerson
  - given: Nathalie
    family: Japkowicz
  - given: Evangelos E.
    family: Milios
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 224-228
  id: nickerson01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 224
  lastpage: 228
  published: 2001-01-04 00:00:00 +0000
- title: 'Online Bagging and Boosting'
  abstract: 'Bagging and boosting are well-known ensemble learning methods. They combine multiple learned base models with the aim of improving generalization performance. To date, they have been used primarily in batch mode, and no effective online versions have been proposed. We present simple online bagging and boosting algorithms that we claim perform as well as their batch counterparts.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/oza01a.html
  PDF: http://proceedings.mlr.press/r3/oza01a/oza01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-oza01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Nikunj C.
    family: Oza
  - given: Stuart J.
    family: Russell
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 229-236
  id: oza01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 229
  lastpage: 236
  published: 2001-01-04 00:00:00 +0000
- title: 'Geographical Clustering of Cancer Incidence by Means of Bayesian Networks and Conditional Gaussian Networks'
  abstract: 'With the aim of improving knowledge on the geographical distribution and characterization of malignant tumors in the Autonomous Community of the Basque Country (Spain), age-standardized cancer incidence rates of the 6 most frequent cancer types for patients of each sex between 1986 and 1994 are analyzed, in relation to the towns of the Community. Concretely, we perform a geographical clustering of the towns of the Community by means of Bayesian networks and conditional Gaussian networks. We present several maps that show the clusterings encoded by the learnt models. In addition to this, we outline the cancer incidence profile for each of the obtained clusters.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/pena01a.html
  PDF: http://proceedings.mlr.press/r3/pena01a/pena01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-pena01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: José M.
    family: Peña
  - given: I.
    family: Izarzugaza
  - given: José Antonio
    family: Lozano
  - given: E.
    family: Aldasoro
  - given: Pedro
    family: Larrañaga
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 237-242
  id: pena01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 237
  lastpage: 242
  published: 2001-01-04 00:00:00 +0000
- title: 'Stochastic System Monitoring and Control'
  abstract: 'In this article we propose a new technique for efficiently solving a specialized instance of a finite state sequential decision process. This specialized task requires keeping a system within a set of nominal states, introducing control actions only when forbidden states are entered. Instead of assuming that the process evolves only due to control actions, we assume that system evolution occurs due to both internal system dynamics and control actions, referred to as endogenous and exogenous evolution respectively. Since controls are needed only for exogenous evolution, we separate inference for the case of endogenous and exogenous evolution, obtaining an inference method that is computationally simpler than using a standard POMDP framework for solving this task. We summarize the problem framework and the algorithm for performing sequential decision-making.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/provan01a.html
  PDF: http://proceedings.mlr.press/r3/provan01a/provan01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-provan01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Gregory M.
    family: Provan
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 243-250
  id: provan01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 243
  lastpage: 250
  published: 2001-01-04 00:00:00 +0000
- title: 'Can the Computer Learn to Play Music Expressively?'
  abstract: 'A computer system is described that provides a real-time musical accompaniment for a live soloist in a piece of non-improvised music. A Bayesian belief network is developed that represents the joint distribution on the times at which the solo and accompaniment notes are played as well as many hidden variables. The network models several important sources of information including the information contained in the score and the rhythmic interpretations of the soloist and accompaniment which are learned from examples. The network is used to provide a computationally efficient decision-making engine that utilizes all available information while producing a flexible and musical accompaniment.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/raphael01a.html
  PDF: http://proceedings.mlr.press/r3/raphael01a/raphael01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-raphael01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Christopher
    family: Raphael
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 251-258
  id: raphael01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 251
  lastpage: 258
  published: 2001-01-04 00:00:00 +0000
- title: 'On Parameter Priors for Discrete DAG Models'
  abstract: 'We investigate parameter priors for discrete DAG models. It was shown in previous works that a Dirichlet prior on the parameters of a discrete DAG model is inevitable assuming global and local parameter independence for all possible complete DAG structures. A similar result for Gaussian DAG models hinted that the assumption of local independence may be redundant. Herein, we prove that the local independence assumption is necessary in order to dictate a Dirichlet prior on the parameters of a discrete DAG model. We explicate the minimal set of assumptions needed to dictate a Dirichlet prior, and we derive the functional form of prior distributions that arise under the global independence assumption alone.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/rusakov01a.html
  PDF: http://proceedings.mlr.press/r3/rusakov01a/rusakov01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-rusakov01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Dmitry
    family: Rusakov
  - given: Dan
    family: Geiger
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 259-264
  id: rusakov01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 259
  lastpage: 264
  published: 2001-01-04 00:00:00 +0000
- title: 'Piecewise Linear Instrumental Variable Estimation of Causal Influence'
  abstract: 'Instrumental Variable (IV) estimation is a powerful strategy for estimating causal influence, even in the presence of confounding. Standard IV estimation requires that the relationships between variables be linear. Here we relax the linearity requirement by constructing a piecewise linear IV estimator. Simulation studies show that when the causal influence of $X$ on $Y$ is non-linear, the piecewise linear estimator is an improvement.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/scheines01a.html
  PDF: http://proceedings.mlr.press/r3/scheines01a/scheines01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-scheines01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Richard
    family: Scheines
  - given: Gregory F.
    family: Cooper
  - given: Changwon
    family: Yoo
  - given: Tianjiao
    family: Chu
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 265-271
  id: scheines01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 265
  lastpage: 271
  published: 2001-01-04 00:00:00 +0000
- title: 'The Efficient Propagation of Arbitrary Subsets of Beliefs in Discrete-Valued Bayesian Networks'
  abstract: 'The paper describes an approach for propagating arbitrary subsets of beliefs in Bayesian Belief Networks. The method is based on a multiple message passing scheme in junction trees. A hybrid tree structure is introduced, both for the propagation of evidence and as an efficiently permutable representation of a decomposable graph. The use of maximal prime subgraph decompositions and tree permutations to reduce computational cost is demonstrated.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/smith01a.html
  PDF: http://proceedings.mlr.press/r3/smith01a/smith01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-smith01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Duncan
    family: Smith
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 272-277
  id: smith01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 272
  lastpage: 277
  published: 2001-01-04 00:00:00 +0000
- title: 'An Anytime Algorithm for Causal Inference'
  abstract: 'The Fast Causal Inference (FCI) algorithm searches for features common to observationally equivalent sets of causal directed acyclic graphs. It is correct in the large sample limit with probability one even if there is a possibility of hidden variables and selection bias. In the worst case, the number of conditional independence tests performed by the algorithm grows exponentially with the number of variables in the data set. This affects both the speed of the algorithm and the accuracy of the algorithm on small samples, because tests of independence conditional on large numbers of variables have very low power. In this paper, I prove that the FCI algorithm can be interrupted at any stage and asked for output. The output from the interrupted algorithm is still correct with probability one in the large sample limit, although possibly less informative (in the sense that it answers "Can’t tell" for a larger number of questions) than if the FCI algorithm had been allowed to continue uninterrupted.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/spirtes01a.html
  PDF: http://proceedings.mlr.press/r3/spirtes01a/spirtes01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-spirtes01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Peter
    family: Spirtes
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 278-285
  id: spirtes01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 278
  lastpage: 285
  published: 2001-01-04 00:00:00 +0000
- title: 'Dynamic Positional Trees for Structural Image Analysis'
  abstract: 'Dynamic positional trees are a significant extension of dynamic trees, incorporating movable nodes. This addition makes sequence tracking viable within the model, but requires a new formulation to incorporate the prior over positions. The model is implemented using a structured variational procedure, and is illustrated on synthetic raytraced images and image sequences.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/storkey01a.html
  PDF: http://proceedings.mlr.press/r3/storkey01a/storkey01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-storkey01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Amos J.
    family: Storkey
  - given: Christopher K. I.
    family: Williams
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 286-292
  id: storkey01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 286
  lastpage: 292
  published: 2001-01-04 00:00:00 +0000
- title: 'Temporal Matching under Uncertainty'
  abstract: 'Temporal matching is the problem of matching observations to predefined temporal patterns or templates. This problem arises in many applications including medical and model-based diagnosis, plan-recognition, and temporal databases. This work examines the sources of uncertainty in temporal matching and presents a probabilistic technique to perform temporal matching under uncertainty. This technique is then applied to the problem of finding the onset of infection with \emph{Toxoplasma gondii}.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/tawfik01a.html
  PDF: http://proceedings.mlr.press/r3/tawfik01a/tawfik01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-tawfik01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Ahmed Y.
    family: Tawfik
  - given: Greg
    family: Scott
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 293-297
  id: tawfik01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 293
  lastpage: 297
  published: 2001-01-04 00:00:00 +0000
- title: 'A Kernel Approach for Vector Quantization with Guaranteed Distortion Bounds'
  abstract: 'We propose a kernel method for vector quantization and clustering. Our approach allows a priori specification of the maximally allowed distortion, and it automatically finds a sufficient representative subset of the data to act as codebook vectors (or cluster centres). It does not find the minimal number of such vectors, which would amount to a combinatorial problem; however, we find a ‘good’ quantization through linear programming.'
  note: 'Reissued by PMLR on 31 March 2021.'
  volume: R3
  URL: https://proceedings.mlr.press/r3/tipping01a.html
  PDF: http://proceedings.mlr.press/r3/tipping01a/tipping01a.pdf
  edit: https://github.com/mlresearch//r3/edit/gh-pages/_posts/2001-01-04-tipping01a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics'
  publisher: 'PMLR'
  author: 
  - given: Michael E.
    family: Tipping
  - given: Bernhard
    family: Schölkopf
  editor: 
  - given: Thomas S.
    family: Richardson
  - given: Tommi S.
    family: Jaakkola
  page: 298-303
  id: tipping01a
  issued:
    date-parts: 
      - 2001
      - 1
      - 4
  firstpage: 298
  lastpage: 303
  published: 2001-01-04 00:00:00 +0000