- title: 'A Bayesian Approach to Bergman’s Minimal Model' abstract: 'The classical minimal model of glucose disposal was proposed as a powerful modeling approach to estimating the insulin sensitivity and the glucose effectiveness, which are very useful in the study of diabetes. The minimal model is a highly ill-posed inverse problem and most often the reconstruction of the glucose kinetics has been done by deterministic iterative numerical algorithms. However, these algorithms do not consider the severe ill-posedness inherent in the minimal model and may only be efficient when a good initial estimate is provided. In this work we adopt graphical models as a powerful and flexible modeling framework for regularizing the problem and thereby allow for estimation of the insulin sensitivity and glucose effectiveness. We illustrate how the reconstruction algorithm may be efficiently implemented in a Bayesian approach where posterior sampling is made through the use of Markov chain Monte Carlo techniques. We demonstrate the method on simulated data.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/andersen03a.html PDF: http://proceedings.mlr.press/r4/andersen03a/andersen03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-andersen03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Kim E. family: Andersen - given: Malene family: Højbjerre editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 1-8 id: andersen03a issued: date-parts: - 2003 - 1 - 3 firstpage: 1 lastpage: 8 published: 2003-01-03 00:00:00 +0000 - title: 'Planning by Probabilistic Inference' abstract: 'This paper presents and demonstrates a new approach to the problem of planning under uncertainty. 
Actions are treated as hidden variables, with their own prior distributions, in a probabilistic generative model involving actions and states. Planning is done by computing the posterior distribution over actions, conditioned on reaching the goal state within a specified number of steps. Under the new formulation, the toolbox of inference techniques can be brought to bear on the planning problem. This paper focuses on problems with discrete actions and states, and discusses some extensions.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/attias03a.html PDF: http://proceedings.mlr.press/r4/attias03a/attias03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-attias03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Hagai family: Attias editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 9-16 id: attias03a issued: date-parts: - 2003 - 1 - 3 firstpage: 9 lastpage: 16 published: 2003-01-03 00:00:00 +0000 - title: 'Quick Training of Probabilistic Neural Nets by Importance Sampling' abstract: 'Our previous work on statistical language modeling introduced the use of probabilistic feedforward neural networks to help deal with the curse of dimensionality. Training this model by maximum likelihood, however, requires performing, for each example, as many network passes as there are words in the vocabulary. Inspired by the contrastive divergence model, we propose and evaluate sampling-based methods which require network passes only for the observed "positive example" and a few sampled negative example words. A very significant speed-up is obtained with an adaptive importance sampling.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/bengio03a.html PDF: http://proceedings.mlr.press/r4/bengio03a/bengio03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-bengio03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Yoshua family: Bengio - given: Jean-Sébastien family: Senecal editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 17-24 id: bengio03a issued: date-parts: - 2003 - 1 - 3 firstpage: 17 lastpage: 24 published: 2003-01-03 00:00:00 +0000 - title: 'Super-resolution Enhancement of Video' abstract: 'We consider the problem of enhancing the resolution of video through the addition of perceptually plausible high frequency information. Our approach is based on a learned data set of image patches capturing the relationship between the middle and high spatial frequency bands of natural images. By introducing an appropriate prior distribution over such patches we can ensure consistency of static image regions across successive frames of the video, and also take account of object motion. A key concept is the use of the previously enhanced frame to provide part of the training set for super-resolution enhancement of the current frame. Our results show that a marked improvement in video quality can be achieved at reasonable computational cost.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/bishop03a.html PDF: http://proceedings.mlr.press/r4/bishop03a/bishop03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-bishop03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Christopher M. 
family: Bishop - given: Andrew family: Blake - given: Bhaskara family: Marthi editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 25-32 id: bishop03a issued: date-parts: - 2003 - 1 - 3 firstpage: 25 lastpage: 32 published: 2003-01-03 00:00:00 +0000 - title: 'Structured Variational Distributions in VIBES' abstract: 'Variational methods are becoming increasingly popular for the approximate solution of complex probabilistic models in machine learning, computer vision, information retrieval and many other fields. Unfortunately, for every new application it is necessary first to derive the specific forms of the variational update equations for the particular probabilistic model being used, and then to implement these equations in application-specific software. Each of these steps is both time consuming and error prone. We have therefore recently developed a general purpose inference engine called VIBES [1] (’Variational Inference for Bayesian Networks’) which allows a wide variety of probabilistic models to be implemented and solved variationally without recourse to coding. New models are specified as a directed acyclic graph using an interface analogous to a drawing package, and VIBES then automatically generates and solves the variational equations. The original version of VIBES assumed a fully factorized variational posterior distribution. In this paper we present an extension of VIBES in which the variational posterior distribution corresponds to a sub-graph of the full probabilistic model. Such structured distributions can produce much closer approximations to the true posterior distribution. We illustrate this approach using an example based on Bayesian hidden Markov models.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/bishop03b.html PDF: http://proceedings.mlr.press/r4/bishop03b/bishop03b.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-bishop03b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Christopher M. family: Bishop - given: John M. family: Winn editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 33-40 id: bishop03b issued: date-parts: - 2003 - 1 - 3 firstpage: 33 lastpage: 40 published: 2003-01-03 00:00:00 +0000 - title: 'A Unifying Theorem for Spectral Embedding and Clustering' abstract: 'Spectral methods use selected eigenvectors of a data affinity matrix to obtain a data representation that can be trivially clustered or embedded in a low-dimensional space. We present a theorem that explains, for broad classes of affinity matrices and eigenbases, why this works: For successively smaller eigenbases (i.e., using fewer and fewer of the affinity matrix’s dominant eigenvalues and eigenvectors), the angles between "similar" vectors in the new representation shrink while the angles between "dissimilar" vectors grow. Specifically, the sum of the squared cosines of the angles is strictly increasing as the dimensionality of the representation decreases. Thus spectral methods work because the truncated eigenbasis amplifies structure in the data so that any heuristic post-processing is more likely to succeed. We use this result to construct a nonlinear dimensionality reduction (NLDR) algorithm for data sampled from manifolds whose intrinsic coordinate system has linear and cyclic axes, and a novel clustering-by-projections algorithm that requires no post-processing and gives superior performance on "challenge problems" from the recent literature.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/brand03a.html PDF: http://proceedings.mlr.press/r4/brand03a/brand03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-brand03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Matthew family: Brand - given: Kun family: Huang editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 41-48 id: brand03a issued: date-parts: - 2003 - 1 - 3 firstpage: 41 lastpage: 48 published: 2003-01-03 00:00:00 +0000 - title: 'The Sound of an Album Cover: A Probabilistic Approach to Multimedia' abstract: 'We present a novel, flexible, statistical approach to modeling music, images and text jointly. The technique is based on multi-modal mixture models and efficient computation using online EM. The learned models can be used to browse multimedia databases, to query on a multimedia database using any combination of music, images and text (lyrics and other contextual information), to annotate documents with music and images, and to find documents in a database similar to input text, music and/or graphics files.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/brochu03a.html PDF: http://proceedings.mlr.press/r4/brochu03a/brochu03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-brochu03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Eric family: Brochu - given: Nando prefix: de family: Freitas - given: Kejie family: Bao editor: - given: Christopher M. family: Bishop - given: Brendan J. 
family: Frey page: 49-56 id: brochu03a issued: date-parts: - 2003 - 1 - 3 firstpage: 49 lastpage: 56 published: 2003-01-03 00:00:00 +0000 - title: 'Is Multinomial PCA Multi-faceted Clustering or Dimensionality Reduction?' abstract: 'Discrete analogues to Principal Components Analysis (PCA) are intended to handle discrete or positive-only data, for instance sets of documents. The class of methods is appropriately called multinomial PCA because it replaces the Gaussian in the probabilistic formulation of PCA with a multinomial. Experiments to date, however, have been on small data sets, for instance, from early information retrieval collections. This paper demonstrates the method on two large data sets and considers two extremes of behaviour: (1) dimensionality reduction where the feature set (i.e., bag of words) is considerably reduced, and (2) multi-faceted clustering (or aspect modelling) where clustering is done but items can now belong in several clusters at once.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/buntine03a.html PDF: http://proceedings.mlr.press/r4/buntine03a/buntine03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-buntine03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Wray L. family: Buntine - given: Sami family: Perttu editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 57-64 id: buntine03a issued: date-parts: - 2003 - 1 - 3 firstpage: 57 lastpage: 64 published: 2003-01-03 00:00:00 +0000 - title: 'Expectation Maximization of Forward Decoding Kernel Machines' abstract: 'Forward Decoding Kernel Machines (FDKM) combine large-margin kernel classifiers with Hidden Markov Models (HMM) for Maximum a Posteriori (MAP) adaptive sequence estimation. 
This paper proposes a variant on FDKM training using Expectation-Maximization (EM). Parameterization of the expectation step controls the temporal extent of the context used in correcting noisy and missing labels in the training sequence. Experiments with EM-FDKM on TIMIT phone sequence data demonstrate up to $10\%$ improvement in classification performance over FDKM trained with hard transitions between labels.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/chakrabartty03a.html PDF: http://proceedings.mlr.press/r4/chakrabartty03a/chakrabartty03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-chakrabartty03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Shantanu family: Chakrabartty - given: Gert family: Cauwenberghs editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 65-71 id: chakrabartty03a issued: date-parts: - 2003 - 1 - 3 firstpage: 65 lastpage: 71 published: 2003-01-03 00:00:00 +0000 - title: 'Model Averaging with Bayesian Network Classifiers' abstract: 'This paper considers the problem of performing classification by model-averaging over a class of discrete Bayesian network structures consistent with a partial ordering and with bounded in-degree $k$. We show that for $N$ nodes this class contains in the worst-case at least $\Omega\left(\left(\begin{array}{c}N/2 \\{k}\end{array}\right)^{N / 2} \right)$ distinct network structures, but we show that this summation can be performed in $O\left(\left(\begin{array}{c}N \\{k}\end{array}\right) \cdot N\right)$ time. 
We use this fact to show that it is possible to efficiently construct a single directed acyclic graph (DAG) whose predictions approximate those of exact model-averaging over this class, allowing approximate model-averaged predictions to be performed in $O(N)$ time. We evaluate the procedure in a supervised classification context, and show empirically that this technique can be beneficial for classification even when the generating distribution is not a member of the class being averaged over, and we characterize the performance over several parameters on simulated and real-world data.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/dash03a.html PDF: http://proceedings.mlr.press/r4/dash03a/dash03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-dash03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Denver family: Dash - given: Gregory F. family: Cooper editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 72-79 id: dash03a issued: date-parts: - 2003 - 1 - 3 firstpage: 72 lastpage: 79 published: 2003-01-03 00:00:00 +0000 - title: 'An object-oriented Bayesian network for estimating mutation rates' abstract: 'We describe the use of the object-oriented HUGIN 6 probabilistic expert system software to structure the problem of estimating mutation rates on the basis of family data when paternity can not be regarded as certain.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/dawid03a.html PDF: http://proceedings.mlr.press/r4/dawid03a/dawid03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-dawid03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: A. Philip family: Dawid editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 80-84 id: dawid03a issued: date-parts: - 2003 - 1 - 3 firstpage: 80 lastpage: 84 published: 2003-01-03 00:00:00 +0000 - title: 'Document Retrieval and Clustering: from Principal Component Analysis to Self-aggregation Networks' abstract: 'We first extend Hopfield networks to clustering bipartite graphs (words-to-document association) and show that the solution is the principal component analysis. We then generalize this via the min-max clustering principle into self-aggregation networks, which are composed of scaled PCA components via the Hebb rule. Clustering amounts to an updating process where connections between different clusters are automatically suppressed while connections within the same clusters are enhanced. This framework combines dimension reduction with clustering via neural networks and PCA. Self-aggregation networks can also improve information retrieval performance. Applications are presented.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/ding03a.html PDF: http://proceedings.mlr.press/r4/ding03a/ding03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-ding03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Chris family: Ding editor: - given: Christopher M. family: Bishop - given: Brendan J. 
family: Frey page: 85-92 id: ding03a issued: date-parts: - 2003 - 1 - 3 firstpage: 85 lastpage: 92 published: 2003-01-03 00:00:00 +0000 - title: 'On the Naive Bayes Model for Text Categorization' abstract: 'This paper empirically compares the performance of four probabilistic models for text classification - Poisson, Bernoulli, Multinomial and Negative Binomial. We examine the "naive Bayes" assumption in the four models and show that the multinomial model is a modified naive Bayes Poisson model that assumes independence of document length and document class. Despite the fact that this last assumption might not be correct in many situations, we find that, in general, relaxing it does not change the performance of the classifier. Finally we propose and evaluate an ad-hoc method for incorporating document length.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/eyheramendy03a.html PDF: http://proceedings.mlr.press/r4/eyheramendy03a/eyheramendy03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-eyheramendy03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Susana family: Eyheramendy - given: David D. family: Lewis - given: David family: Madigan editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 93-100 id: eyheramendy03a issued: date-parts: - 2003 - 1 - 3 firstpage: 93 lastpage: 100 published: 2003-01-03 00:00:00 +0000 - title: 'Curve Clustering with Random Effects Regression Mixtures' abstract: 'In this paper we address the problem of clustering sets of curve or trajectory data generated by groups of objects or individuals. The focus is to model curve data directly using a set of model-based curve clustering algorithms referred to as mixtures of regressions or regression mixtures. 
The proposed methodology is based on an extension to regression mixtures that we call random effects regression mixtures, which combines linear random effects models with standard regression mixtures. We develop a general expectation-maximization (EM) algorithm using maximum a posteriori (MAP) estimation for random effects regression mixtures and demonstrate how this technique can be applied to the problem of clustering cyclone data.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/gaffney03a.html PDF: http://proceedings.mlr.press/r4/gaffney03a/gaffney03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-gaffney03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Scott family: Gaffney - given: Padhraic family: Smyth editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 101-108 id: gaffney03a issued: date-parts: - 2003 - 1 - 3 firstpage: 101 lastpage: 108 published: 2003-01-03 00:00:00 +0000 - title: 'Clustering Markov States into Equivalence Classes using SVD and Heuristic Search Algorithms' abstract: 'This paper investigates the problem of finding a $K$-state first-order Markov chain that approximates an $M$-state first-order Markov chain, where $K$ is typically much smaller than $M$. A variety of greedy heuristic search algorithms that maximize the data likelihood are investigated and found to work well empirically. The proposed algorithms are demonstrated on two applications: learning user models from traces of Unix commands, and word segmentation in language modeling.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/ge03a.html PDF: http://proceedings.mlr.press/r4/ge03a/ge03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-ge03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Xianping family: Ge - given: Sridevi family: Parise - given: Padhraic family: Smyth editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 109-116 id: ge03a issued: date-parts: - 2003 - 1 - 3 firstpage: 109 lastpage: 116 published: 2003-01-03 00:00:00 +0000 - title: 'Rapid Evaluation of Multiple Density Models' abstract: 'When highly-accurate and/or assumption-free density estimation is needed, nonparametric methods are often called upon - most notably the popular kernel density estimation (KDE) method. However, the practitioner is instantly faced with the formidable computational cost of KDE for appreciable dataset sizes, which becomes even more prohibitive when many models with different kernel scales (bandwidths) must be evaluated - this is necessary for finding the optimal model, among other reasons. In previous work we presented an algorithm for fast KDE which addresses large dataset sizes and large dimensionalities, but assumes only a single bandwidth. In this paper we present a generalization of that algorithm allowing multiple models with different bandwidths to be computed simultaneously, in substantially less time than either running the single-bandwidth algorithm for each model independently, or running the standard exhaustive method. We show examples of computing the likelihood curve for 100,000 data and 100 models ranging across 3 orders of magnitude in scale, in minutes or seconds.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/gray03a.html PDF: http://proceedings.mlr.press/r4/gray03a/gray03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-gray03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Alexander G. family: Gray - given: Andrew W. family: Moore editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 117-123 id: gray03a issued: date-parts: - 2003 - 1 - 3 firstpage: 117 lastpage: 123 published: 2003-01-03 00:00:00 +0000 - title: 'Bayesian Feature Weighting for Unsupervised Learning, with Application to Object Recognition' abstract: 'We present a method for variable selection/weighting in an unsupervised learning context using Bayesian shrinkage. The basis for the model is a finite mixture of multivariate Gaussian distributions. We demonstrate how the model parameters and cluster assignments can be computed simultaneously using an efficient EM algorithm. Applying our Bayesian shrinkage model to a complex problem in object recognition (Duygulu, Barnard, de Freitas and Forsyth 2002), our experiments yield good results.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/gustafson03a.html PDF: http://proceedings.mlr.press/r4/gustafson03a/gustafson03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-gustafson03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Paul family: Gustafson - given: Peter family: Carbonetto - given: Natalie family: Thompson - given: Nando prefix: de family: Freitas editor: - given: Christopher M. family: Bishop - given: Brendan J. 
family: Frey page: 124-131 id: gustafson03a issued: date-parts: - 2003 - 1 - 3 firstpage: 124 lastpage: 131 published: 2003-01-03 00:00:00 +0000 - title: 'Generalized belief propagation for approximate inference in hybrid Bayesian networks' abstract: 'We apply generalized belief propagation to approximate inference in hybrid Bayesian networks. In essence, in the algorithms developed for discrete networks we only have to change "strong marginalization" (exact) into "weak marginalization" (same moments) or, equivalently, the "sum" operation in the (generalized) sum-product algorithm into a "collapse" operation. We describe both a message-free single-loop algorithm based on fixed-point iteration and a more tedious double-loop algorithm guaranteed to converge to a minimum of the Kikuchi free energy. With the cluster variation method we can interpolate between the minimal Kikuchi approximation and the (strong) junction tree algorithm. Simulations on the emission network of [7], extended in [13], indicate that the Kikuchi approximation in practice often works really well, even in the difficult case of discrete children of continuous parents.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/heskes03a.html PDF: http://proceedings.mlr.press/r4/heskes03a/heskes03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-heskes03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Tom family: Heskes - given: Onno family: Zoeter editor: - given: Christopher M. family: Bishop - given: Brendan J. 
family: Frey page: 132-140 id: heskes03a issued: date-parts: - 2003 - 1 - 3 firstpage: 132 lastpage: 140 published: 2003-01-03 00:00:00 +0000 - title: 'Learning Bayesian Networks From Dependency Networks: A Preliminary Study' abstract: 'In this paper we describe how to learn Bayesian networks from a summary of complete data in the form of a dependency network rather than from data directly. This method allows us to gain the advantages of both representations: scalable algorithms for learning dependency networks and convenient inference with Bayesian networks. Our approach is to use a dependency network as an "oracle" for the statistics needed to learn a Bayesian network. We show that the general problem is NP-hard and develop a greedy search algorithm. We conduct a preliminary experimental evaluation and find that the prediction accuracy of the Bayesian networks constructed from our algorithm almost equals that of Bayesian networks learned directly from the data.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/hulten03a.html PDF: http://proceedings.mlr.press/r4/hulten03a/hulten03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-hulten03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Geoff family: Hulten - given: David Maxwell family: Chickering - given: David family: Heckerman editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 141-148 id: hulten03a issued: date-parts: - 2003 - 1 - 3 firstpage: 141 lastpage: 148 published: 2003-01-03 00:00:00 +0000 - title: 'Convex Invariance Learning' abstract: 'Invariance and representation learning are important precursors to modeling and classification tools, particularly for non-Euclidean spaces such as images, strings and non-vectorial data. 
This article proposes a method for learning invariances in data while jointly estimating a model. The technique results in a convex programming problem with a consistent and unique solution. Representation variables are considered as affine transformations confined by multiple equality and inequality constraints. These interact individually with each datum yet maintain the overall solvability of the model estimation process while uniquely solving for the representational variables themselves. The method is applicable to various types of modeling, including maximum likelihood estimation, principal components analysis, and discriminative methods. Starting from affine invariance, several types of invariances are proposed and implemented as convex programs including clustering, permutation, selection, rotation, and translation. Experiments on non-vectorial data such as images and collections of tuples provide promising results.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/jebara03a.html PDF: http://proceedings.mlr.press/r4/jebara03a/jebara03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-jebara03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Tony family: Jebara editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 149-156 id: jebara03a issued: date-parts: - 2003 - 1 - 3 firstpage: 149 lastpage: 156 published: 2003-01-03 00:00:00 +0000 - title: 'Refining Kernels for Regression and Uneven Classification Problems' abstract: 'Kernel alignment has recently been proposed as a method for measuring the degree of agreement between a kernel and a classification learning task. In this paper we extend the notion of kernel alignment to two other common learning problems: regression and classification with uneven data. 
We present a modified definition of alignment together with a novel theoretical justification for why improving alignment will lead to better performance in the regression case. Experimental evidence is provided to show that improving the alignment leads to a reduction in generalization error of standard regressors and classifiers.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/kandola03a.html PDF: http://proceedings.mlr.press/r4/kandola03a/kandola03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-kandola03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Jaz S. family: Kandola - given: John family: Shawe-Taylor editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 157-162 id: kandola03a issued: date-parts: - 2003 - 1 - 3 firstpage: 157 lastpage: 162 published: 2003-01-03 00:00:00 +0000 - title: 'Fast Robust Logistic Regression for Large Sparse Datasets with Binary Outputs' abstract: 'Although popular and extremely well established in mainstream statistical data analysis, logistic regression is strangely absent in the field of data mining. There are two possible explanations of this phenomenon. First, there might be an assumption that any tool which can only produce linear classification boundaries is likely to be trumped by more modern nonlinear tools. Second, there is a legitimate fear that logistic regression cannot practically scale up to the massive dataset sizes to which modern data mining tools are applied. This paper consists of an empirical examination of the first assumption, and surveys, implements and compares techniques by which logistic regression can be scaled to data with millions of attributes and records. 
Our results, on a large life sciences dataset, indicate that logistic regression can perform surprisingly well, both statistically and computationally, when compared with an array of more recent classification algorithms.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/komarek03a.html PDF: http://proceedings.mlr.press/r4/komarek03a/komarek03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-komarek03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Paul family: Komarek - given: Andrew W. family: Moore editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 163-170 id: komarek03a issued: date-parts: - 2003 - 1 - 3 firstpage: 163 lastpage: 170 published: 2003-01-03 00:00:00 +0000 - title: 'Efficient Computing of Stochastic Complexity' abstract: 'Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as model selection or data clustering. Unfortunately, computing the modern version of stochastic complexity, defined as the Normalized Maximum Likelihood (NML) criterion, requires computing a sum with an exponential number of terms. Therefore, in order to be able to apply the stochastic complexity measure in practice, in most cases it has to be approximated. In this paper, we show that for some interesting and important cases with multinomial data sets, the exponentiality can be removed without loss of accuracy. We also introduce a new computationally efficient approximation scheme based on analytic combinatorics and assess its accuracy, together with earlier approximations, by comparing them to the exact form. 
The results suggest that due to its accuracy and efficiency, the new sharper approximation will be useful for a wide class of problems with discrete data.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/kontkanen03a.html PDF: http://proceedings.mlr.press/r4/kontkanen03a/kontkanen03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-kontkanen03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Petri family: Kontkanen - given: Wray L. family: Buntine - given: Petri family: Myllymäki - given: Jorma family: Rissanen - given: Henry family: Tirri editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 171-178 id: kontkanen03a issued: date-parts: - 2003 - 1 - 3 firstpage: 171 lastpage: 178 published: 2003-01-03 00:00:00 +0000 - title: 'The Joint Causal Effect in Linear Structural Equation Model and Its Application to Process Analysis' abstract: 'Consider a case where cause-effect relationships among variables can be described by a causal diagram and the corresponding linear structural equation model. In order to bring a response variable close to a target, this paper proposes a statistical method for inferring a joint causal effect of a conditional plan on the variance of a response variable from nonexperimental data. Moreover, based on this method, this paper formulates a conditional plan, which can cancel the influence of covariates on a response variable. The results of this paper could enable us to select an effective plan in linear conditional plans.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/kuroki03a.html PDF: http://proceedings.mlr.press/r4/kuroki03a/kuroki03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-kuroki03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Manabu family: Kuroki - given: Zhihong family: Cai editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 179-186 id: kuroki03a issued: date-parts: - 2003 - 1 - 3 firstpage: 179 lastpage: 186 published: 2003-01-03 00:00:00 +0000 - title: 'Bayesian Inference in the Presence of Determinism' abstract: 'In this paper, we consider the problem of performing inference on Bayesian networks which exhibit a substantial degree of determinism. We improve upon the determinism-exploiting inference algorithm presented in [4], showing that the information brought to light by constraint propagation may be exploited to a much greater extent than has been previously possible. This is confirmed with theoretical and empirical studies.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/larkin03a.html PDF: http://proceedings.mlr.press/r4/larkin03a/larkin03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-larkin03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: David family: Larkin - given: Rina family: Dechter editor: - given: Christopher M. family: Bishop - given: Brendan J.
family: Frey page: 187-194 id: larkin03a issued: date-parts: - 2003 - 1 - 3 firstpage: 187 lastpage: 194 published: 2003-01-03 00:00:00 +0000 - title: 'Reduced Rank Approximations of Transition Matrices' abstract: 'We present various latent variable models for the reduced rank approximation of transition matrices. Two main categories of models, termed Latent Markov Analysis (LMA) models, are introduced. We first address the case where the transition matrix is consistent with a reversible random walk. A more general case is subsequently addressed. Iterative EM-type algorithms are presented for all models. LMA is applied to clustering based on pairwise similarities, where similarities between objects are described probabilistically. In the model, relationships between the inferred clusters are again described probabilistically by the reduced rank transition matrix. LMA simultaneously infers the clusters and abstracts the relationships between them, which can be represented in the form of a weighted graph. Finally, a "targeted" LMA model is introduced where a prior specification of the transition between latent cluster states is incorporated. This provides an algorithm which searches for clusters satisfying pre-specified relationships.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/lin03a.html PDF: http://proceedings.mlr.press/r4/lin03a/lin03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-lin03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Juan family: Lin editor: - given: Christopher M. family: Bishop - given: Brendan J.
family: Frey page: 195-202 id: lin03a issued: date-parts: - 2003 - 1 - 3 firstpage: 195 lastpage: 202 published: 2003-01-03 00:00:00 +0000 - title: 'On Retrieval Properties of Samples of Large Collections' abstract: 'We consider text retrieval applications that assign query-specific relevance scores to documents drawn from particular collections. Such applications represent a primary focus of the annual Text Retrieval Conference (TREC) where the participants compare the empirical performance of different approaches. $P@K$, the proportion of the top $K$ documents that are relevant, is a popular measure of retrieval effectiveness. Participants in the TREC Very Large Corpus track have observed that $P@K$ increases substantially when moving from a sample to the full collection. Hawking et al. (1999) posed as an open research question the cause of this phenomenon and proposed five possible explanatory hypotheses. In this paper we present a mathematical analysis of the phenomenon. We will also introduce "contamination at $K$," the number of irrelevant documents amongst the top $K$ relevant documents, and describe its properties. Our analysis shows that while $P@K$ typically will increase with collection size, the phenomenon is not universal. That is, there exist score distributions for which $P@K$ (and $C@K$) approach a constant limit as collection size increases.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/madigan03a.html PDF: http://proceedings.mlr.press/r4/madigan03a/madigan03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-madigan03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: David family: Madigan - given: Yehuda family: Vardi - given: Ishay family: Weissman editor: - given: Christopher M. family: Bishop - given: Brendan J.
family: Frey page: 203-208 id: madigan03a issued: date-parts: - 2003 - 1 - 3 firstpage: 203 lastpage: 208 published: 2003-01-03 00:00:00 +0000 - title: 'Data Centering in Feature Space' abstract: 'This paper presents a family of methods for data translation in feature space, to be used in conjunction with kernel machines. The translations are performed using only kernel evaluations in input space. We use the methods to improve the numerical properties of kernel machines. Experiments with synthetic and real data demonstrate the effectiveness of data centering and highlight other interesting aspects of translation in feature space.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/meila03a.html PDF: http://proceedings.mlr.press/r4/meila03a/meila03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-meila03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Marina family: Meilă editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 209-216 id: meila03a issued: date-parts: - 2003 - 1 - 3 firstpage: 209 lastpage: 216 published: 2003-01-03 00:00:00 +0000 - title: 'A Blessing of Dimensionality: Measure Concentration and Probabilistic Inference' abstract: 'This paper proposes an efficient sampling method for inference in probabilistic graphical models. The method exploits a blessing of dimensionality known as the concentration of measure phenomenon in order to derive analytic expressions for proposal distributions. The method can also be interpreted in a variational setting, where one minimises an upper bound on the estimator variance. The results on simple settings are very promising. We believe this method has great potential in graphical models used for diagnosis.' note: 'Reissued by PMLR on 01 April 2021.'
volume: R4 URL: https://proceedings.mlr.press/r4/muyan03a.html PDF: http://proceedings.mlr.press/r4/muyan03a/muyan03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-muyan03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Pinar family: Muyan - given: Nando prefix: de family: Freitas editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 217-224 id: muyan03a issued: date-parts: - 2003 - 1 - 3 firstpage: 217 lastpage: 224 published: 2003-01-03 00:00:00 +0000 - title: 'Real-time On-line Learning of Transformed Hidden Markov Models from Video' abstract: 'The transformed hidden Markov model is a temporal model that captures three typical causes of variability in video - scene/object class, appearance variability within the class, and image motion. In our previous work, we showed that an exact EM algorithm can jointly learn the appearances of multiple objects and/or poses of an object, and track the objects or camera motion in video, starting simply from random initialization. As such, this model can serve as a basis for both video clustering and object tracking applications. However, the original algorithm requires a significant amount of computation that renders it impractical for video clustering and its off-line nature makes it unsuitable for real-time tracking applications. In this paper, we propose a new, significantly faster, on-line learning algorithm that enables real-time clustering and tracking. We demonstrate that the algorithm can extract objects using the constraints on their motion and also perform tracking while the appearance models are learned. We also demonstrate the clustering results on an example of typical unrestricted personal media - the vacation video.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/petrovic03a.html PDF: http://proceedings.mlr.press/r4/petrovic03a/petrovic03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-petrovic03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Nemanja family: Petrovic - given: Nebojsa family: Jojic - given: Brendan J. family: Frey - given: Thomas S. family: Huang editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 225-232 id: petrovic03a issued: date-parts: - 2003 - 1 - 3 firstpage: 225 lastpage: 232 published: 2003-01-03 00:00:00 +0000 - title: 'Ensemble Coupled Hidden Markov Models for Joint Characterisation of Dynamic Signals' abstract: 'How does one model data with the aid of labels, when the labels themselves are noisy, unreliable and have their own dynamics? How does one measure interactions between variables that are so different in their nature that a direct comparison using, say, cross-correlations, is meaningless? In this paper these problems are approached using Coupled Hidden Markov Models which are estimated in the Variational Bayesian framework. Signals can be diverse since each chain has its own observation model. Signals can have their own dynamics and may temporally lag or lead one another by allowing linking edges in the network topology to be estimated and chosen according to the most probable posterior model. Integrated feature extraction and modelling is accomplished by providing the Markov models with linear observation models. We derive Coupled Hidden Markov Models estimators, apply and compare them with sampling based approaches found in the literature.' note: 'Reissued by PMLR on 01 April 2021.'
volume: R4 URL: https://proceedings.mlr.press/r4/rezek03a.html PDF: http://proceedings.mlr.press/r4/rezek03a/rezek03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-rezek03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Iead family: Rezek - given: Stephen J. family: Roberts - given: Peter family: Sykacek editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 233-239 id: rezek03a issued: date-parts: - 2003 - 1 - 3 firstpage: 233 lastpage: 239 published: 2003-01-03 00:00:00 +0000 - title: 'A Generalized Linear Model for Principal Component Analysis of Binary Data' abstract: 'We investigate a generalized linear model for dimensionality reduction of binary data. The model is related to principal component analysis (PCA) in the same way that logistic regression is related to linear regression. Thus we refer to the model as logistic PCA. In this paper, we derive an alternating least squares method to estimate the basis vectors and generalized linear coefficients of the logistic PCA model. The resulting updates have a simple closed form and are guaranteed at each iteration to improve the model’s likelihood. We evaluate the performance of logistic PCA—as measured by reconstruction error rates—on data sets drawn from four real world applications. In general, we find that logistic PCA is much better suited to modeling binary data than conventional PCA.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/schein03a.html PDF: http://proceedings.mlr.press/r4/schein03a/schein03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-schein03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Andrew I. family: Schein - given: Lawrence K. family: Saul - given: Lyle H. family: Ungar editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 240-247 id: schein03a issued: date-parts: - 2003 - 1 - 3 firstpage: 240 lastpage: 247 published: 2003-01-03 00:00:00 +0000 - title: 'Combining Conjugate Direction Methods with Stochastic Approximation of Gradients' abstract: 'The method of conjugate directions provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore ideas from conjugate gradient in the stochastic (online) setting, using fast Hessian-gradient products to set up low-dimensional Krylov subspaces within individual mini-batches. In our benchmark experiments the resulting online learning algorithms converge orders of magnitude faster than ordinary stochastic gradient descent.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/schraudolph03a.html PDF: http://proceedings.mlr.press/r4/schraudolph03a/schraudolph03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-schraudolph03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Nicol N. family: Schraudolph - given: Thore family: Graepel editor: - given: Christopher M. family: Bishop - given: Brendan J. 
family: Frey page: 248-253 id: schraudolph03a issued: date-parts: - 2003 - 1 - 3 firstpage: 248 lastpage: 253 published: 2003-01-03 00:00:00 +0000 - title: 'Fast Forward Selection to Speed Up Sparse Gaussian Process Regression' abstract: 'We present a method for the sparse greedy approximation of Bayesian Gaussian process regression, featuring a novel heuristic for very fast forward selection. Our method is essentially as fast as an equivalent one which selects the "support" patterns at random, yet it can outperform random selection on hard curve fitting tasks. More importantly, it leads to a sufficiently stable approximation of the log marginal likelihood of the training data, which can be optimised to adjust a large number of hyperparameters automatically. We demonstrate the model selection capabilities of the algorithm in a range of experiments. In line with the development of our method, we present a simple view on sparse approximations for GP models and their underlying assumptions and show relations to other methods.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/seeger03a.html PDF: http://proceedings.mlr.press/r4/seeger03a/seeger03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-seeger03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Matthias W. family: Seeger - given: Christopher K. I. family: Williams - given: Neil D. family: Lawrence editor: - given: Christopher M. family: Bishop - given: Brendan J. 
family: Frey page: 254-261 id: seeger03a issued: date-parts: - 2003 - 1 - 3 firstpage: 254 lastpage: 261 published: 2003-01-03 00:00:00 +0000 - title: 'On Improving the Efficiency of the Iterative Proportional Fitting Procedure' abstract: 'Iterative proportional fitting (IPF) on junction trees is an important tool for learning in graphical models. We identify the propagation and IPF updates on the junction tree as fixed point equations of a single constrained entropy maximization problem. This allows a more efficient message updating protocol than the well known effective IPF of Jiroušek and Preučil (1995). When the junction tree has an intractably large maximum clique size we propose to maximize an approximate constrained entropy based on region graphs (Yedidia et al., 2002). To maximize the new objective we propose a "loopy" version of IPF. We show that this yields accurate estimates of the weights of undirected graphical models in a simple experiment.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/teh03a.html PDF: http://proceedings.mlr.press/r4/teh03a/teh03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-teh03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Yee Whye family: Teh - given: Max family: Welling editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 262-269 id: teh03a issued: date-parts: - 2003 - 1 - 3 firstpage: 262 lastpage: 269 published: 2003-01-03 00:00:00 +0000 - title: 'Discriminative Model Selection for Density Models' abstract: 'Density models are a popular tool for building classifiers. When using density models to build a classifier, one typically learns a separate density model for each class of interest. 
These density models are then combined to make a classifier through the use of Bayes’ rule utilizing the prior distribution over the classes. In this paper, we provide a discriminative method for choosing among alternative density models for each class to improve classification accuracy.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/thiesson03a.html PDF: http://proceedings.mlr.press/r4/thiesson03a/thiesson03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-thiesson03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Bo family: Thiesson - given: Christopher family: Meek editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 270-275 id: thiesson03a issued: date-parts: - 2003 - 1 - 3 firstpage: 270 lastpage: 275 published: 2003-01-03 00:00:00 +0000 - title: 'Fast Marginal Likelihood Maximisation for Sparse Bayesian Models' abstract: 'The ’sparse Bayesian’ modelling approach, as exemplified by the ’relevance vector machine’, enables sparse classification and regression functions to be obtained by linearly weighting a small number of fixed basis functions from a large dictionary of potential candidates. Such a model conveys a number of advantages over the related and very popular ’support vector machine’, but the necessary ’training’ procedure - optimisation of the marginal likelihood function - is typically much slower. We describe a new and highly accelerated algorithm which exploits recently-elucidated properties of the marginal likelihood function to enable maximisation via a principled and efficient sequential addition and deletion of candidate basis functions.' note: 'Reissued by PMLR on 01 April 2021.'
volume: R4 URL: https://proceedings.mlr.press/r4/tipping03a.html PDF: http://proceedings.mlr.press/r4/tipping03a/tipping03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-tipping03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Michael E. family: Tipping - given: Anita C. family: Faul editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 276-283 id: tipping03a issued: date-parts: - 2003 - 1 - 3 firstpage: 276 lastpage: 283 published: 2003-01-03 00:00:00 +0000 - title: 'Sequential Importance Sampling for Visual Tracking Reconsidered' abstract: 'We consider the task of filtering dynamical systems observed in noise by means of sequential importance sampling when the proposal is restricted to the innovation components of the state. It is argued that the unmodified sequential importance sampling/resampling (SIR) algorithm may yield high variance estimates of the posterior in this case, resulting in poor performance when e.g. in visual tracking one tries to build a SIR algorithm on the top of the output of a color blob detector. A new method that associates the innovations sampled from the proposal and the particles in a separate computational step is proposed. The method is shown to outperform the unmodified SIR algorithm in a series of vision based object tracking experiments, both in terms of accuracy and robustness.' note: 'Reissued by PMLR on 01 April 2021.' 
volume: R4 URL: https://proceedings.mlr.press/r4/torma03a.html PDF: http://proceedings.mlr.press/r4/torma03a/torma03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-torma03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Péter family: Torma - given: Csaba family: Szepesvári editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 284-291 id: torma03a issued: date-parts: - 2003 - 1 - 3 firstpage: 284 lastpage: 291 published: 2003-01-03 00:00:00 +0000 - title: 'Solving Markov Random Fields using Semi Definite Programming' abstract: 'This paper explores a new generic method for matching, when there are conditional dependencies between the matches. It allows different sorts of features to be matched in the same global optimization framework. The method is based on a binary Markov random field model which is defined on the product space of matches, and is shown to be equivalent to $0-1$ quadratic programming, and the MAXCUT graph problem. In general these problems are $NP$-complete. However our approach takes inspiration from the celebrated result of Goemans and Williamson (1995) that finds a polynomial time 0.879 approximation to several $NP$-complete problems, using semidefinite programming. The method is demonstrated for the problem of curve matching.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/torr03a.html PDF: http://proceedings.mlr.press/r4/torr03a/torr03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-torr03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Philip H. S. family: Torr editor: - given: Christopher M.
family: Bishop - given: Brendan J. family: Frey page: 292-299 id: torr03a issued: date-parts: - 2003 - 1 - 3 firstpage: 292 lastpage: 299 published: 2003-01-03 00:00:00 +0000 - title: 'Towards Principled Feature Selection: Relevancy, Filters and Wrappers' abstract: 'In an influential paper Kohavi and John [7] presented a number of disadvantages of the filter approach to the feature selection problem, steering research towards algorithms adopting the wrapper approach. We show here that neither approach is inherently better and that any practical feature selection algorithm needs to at least consider the learner used for classification and the metric used for evaluating the learner’s performance. In the process we formally define the feature selection problem, re-examine the relationship between relevancy and filter algorithms, and establish a connection between Kohavi and John’s definition of relevancy to the Markov Blanket of a target variable in a Bayesian Network faithful to some data distribution. The theoretical results lead to principled ways of designing optimal filter algorithms of which we present one example.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/tsamardinos03a.html PDF: http://proceedings.mlr.press/r4/tsamardinos03a/tsamardinos03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-tsamardinos03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Ioannis family: Tsamardinos - given: Constantin F. family: Aliferis editor: - given: Christopher M. family: Bishop - given: Brendan J. 
family: Frey page: 300-307 id: tsamardinos03a issued: date-parts: - 2003 - 1 - 3 firstpage: 300 lastpage: 307 published: 2003-01-03 00:00:00 +0000 - title: 'Tree-reweighted Belief Propagation Algorithms and Approximate ML Estimation by Pseudo-Moment Matching' abstract: 'In previous work [10] we presented a class of upper bounds on the log partition function of an arbitrary undirected graphical model based on solving a convex variational problem. Here we develop a class of local message-passing algorithms, which we call tree-reweighted belief propagation, for efficiently computing the value of these upper bounds, as well as the associated pseudomarginals. We also consider the uses of our bounds for the problem of maximum likelihood (ML) parameter estimation. For a completely observed model, our analysis gives rise to a concave lower bound on the log likelihood of the data. Maximizing this lower bound yields an approximate ML estimate which, in analogy to the moment-matching of exact ML estimation, can be interpreted in terms of pseudo-moment-matching. We present preliminary results illustrating the behavior of this approximate ML estimator.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/wainwright03a.html PDF: http://proceedings.mlr.press/r4/wainwright03a/wainwright03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-wainwright03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Martin J. family: Wainwright - given: Tommi S. family: Jaakkola - given: Alan S. family: Willsky editor: - given: Christopher M. family: Bishop - given: Brendan J. 
family: Frey page: 308-315 id: wainwright03a issued: date-parts: - 2003 - 1 - 3 firstpage: 308 lastpage: 315 published: 2003-01-03 00:00:00 +0000 - title: 'Latent Maximum Entropy Approach for Semantic $N$-gram Language Modeling' abstract: 'In this paper, we describe a unified probabilistic framework for statistical language modeling - the latent maximum entropy principle - which can effectively incorporate various aspects of natural language, such as local word interaction, syntactic structure and semantic document information. Unlike previous work on maximum entropy methods for language modeling, which only allow explicit features to be modeled, our framework also allows relationships over hidden features to be captured, resulting in a more expressive language model. We describe efficient algorithms for marginalization, inference and normalization in our extended models. We then present promising experimental results for our approach on the Wall Street Journal corpus.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/wang03a.html PDF: http://proceedings.mlr.press/r4/wang03a/wang03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-wang03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Shaojun family: Wang - given: Dale family: Schuurmans - given: Fuchun family: Peng editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 316-322 id: wang03a issued: date-parts: - 2003 - 1 - 3 firstpage: 316 lastpage: 322 published: 2003-01-03 00:00:00 +0000 - title: 'On Boosting and the Exponential Loss' abstract: 'Boosting algorithms in general and AdaBoost in particular, initially baffled the statistical world by posing two questions: (1) Why is it that AdaBoost performs so well?
and (2) What makes Boosting methods resistant to overfitting? In response to question (1), Hastie, Tibshirani and Friedman (2000) take a statistical view of Boosting by recasting it as a stagewise approach to the minimization of an exponential loss function by means of an additive model in a process similar to additive logistic regression. This characterization has since been well integrated in the statistics and computer science communities as the best statistical answer to question (1). In this paper, we argue that this well assimilated view is questionable and that perhaps Boosting’s success has nothing to do with the minimization of an exponential criterion or indeed any optimization at all. Our argument rests on a constructive theorem that states that for any sequence of classifiers there exists a linear combination for which the exponential criterion equals one. Furthermore, we present a Boosting algorithm which performs empirically like AdaBoost while stabilizing the exponential loss to a constant.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/wyner03a.html PDF: http://proceedings.mlr.press/r4/wyner03a/wyner03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-wyner03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Abraham J. family: Wyner editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 323-329 id: wyner03a issued: date-parts: - 2003 - 1 - 3 firstpage: 323 lastpage: 329 published: 2003-01-03 00:00:00 +0000 - title: 'An Active Approach to Collaborative Filtering' abstract: 'Collaborative filtering allows the preferences of multiple users to be pooled in a principled way in order to make recommendations about products, services or information unseen by a specific user.
We consider here the problem of online and interactive collaborative filtering: given the current ratings and recommendations associated with a user, what queries (new ratings) would most improve the quality of the recommendations made? This can be cast in a straightforward fashion in terms of expected value of information; but the online computational cost of computing optimal queries is prohibitive. We show how offline precomputation of bounds on value of information, and of prototypes in query space, can be used to dramatically reduce the required online computation. The framework we develop is quite general, but we derive detailed bounds for the multiple-cause vector quantization model, and empirically demonstrate the value of our active approach using this model.' note: 'Reissued by PMLR on 01 April 2021.' volume: R4 URL: https://proceedings.mlr.press/r4/zemel03a.html PDF: http://proceedings.mlr.press/r4/zemel03a/zemel03a.pdf edit: https://github.com/mlresearch//r4/edit/gh-pages/_posts/2003-01-03-zemel03a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics' publisher: 'PMLR' author: - given: Richard S. family: Zemel - given: Craig family: Boutilier editor: - given: Christopher M. family: Bishop - given: Brendan J. family: Frey page: 330-337 id: zemel03a issued: date-parts: - 2003 - 1 - 3 firstpage: 330 lastpage: 337 published: 2003-01-03 00:00:00 +0000