- title: 'Preface'
abstract: 'Preface to the Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics May 13-15, 2010, Chia Laguna Resort, Sardinia, Italy.'
volume: 9
URL: http://proceedings.mlr.press/v9/teh10a.html
PDF: http://proceedings.mlr.press/v9/teh10a/teh10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-teh10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: i-v
id: teh10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: i
lastpage: v
published: 2010-03-31 00:00:00 +0000
- title: 'Learning the Structure of Deep Sparse Graphical Models'
abstract: 'Deep belief networks are a powerful way to model complex probability distributions. However, it is difficult to learn the structure of a belief network, particularly one with hidden units. The Indian buffet process has been used as a nonparametric Bayesian prior on the structure of a directed belief network with a single infinitely wide hidden layer. Here, we introduce the cascading Indian buffet process (CIBP), which provides a prior on the structure of a layered, directed belief network that is unbounded in both depth and width, yet allows tractable inference. We use the CIBP prior with the nonlinear Gaussian belief network framework to allow each unit to vary its behavior between discrete and continuous representations. We use Markov chain Monte Carlo for inference in this model and explore the structures learned on image data.'
volume: 9
URL: http://proceedings.mlr.press/v9/adams10a.html
PDF: http://proceedings.mlr.press/v9/adams10a/adams10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-adams10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Adams
given: Ryan
- family: Wallach
given: Hanna
- family: Ghahramani
given: Zoubin
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 1-8
id: adams10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 1
lastpage: 8
published: 2010-03-31 00:00:00 +0000
- title: 'Optimal Allocation Strategies for the Dark Pool Problem'
abstract: 'We study the problem of allocating stocks to dark pools. We propose and analyze an optimal approach for allocations, if continuous-valued allocations are allowed. We also propose a modification for the case when only integer-valued allocations are possible. We extend the previous work on this problem by Ganchev et al. (UAI 2009) to adversarial scenarios, while also improving over their results in the i.i.d. setup. The resulting algorithms are efficient, and are tested on extensive simulations under stochastic and adversarial inputs. Our work also has consequences for other perishable inventory control problems, extending their analyses to adversarial models too.'
volume: 9
URL: http://proceedings.mlr.press/v9/agarwal10a.html
PDF: http://proceedings.mlr.press/v9/agarwal10a/agarwal10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-agarwal10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Agarwal
given: Alekh
- family: Bartlett
given: Peter
- family: Dama
given: Max
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 9-16
id: agarwal10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 9
lastpage: 16
published: 2010-03-31 00:00:00 +0000
- title: 'Multitask Learning for Brain-Computer Interfaces'
abstract: 'Brain-computer interfaces (BCIs) are limited in their applicability in everyday settings by the current necessity to record subject-specific calibration data prior to actual use of the BCI for communication. In this paper, we utilize the framework of multitask learning to construct a BCI that can be used without any subject-specific calibration process. We discuss how this out-of-the-box BCI can be further improved in a computationally efficient manner as subject-specific data becomes available. The feasibility of the approach is demonstrated on two sets of experimental EEG data recorded during a standard two-class motor imagery paradigm from a total of 19 healthy subjects. Specifically, we show that satisfactory classification results can be achieved with zero training data, and combining prior recordings with subject-specific calibration data substantially outperforms using subject-specific data only. Our results further show that transfer between recordings under slightly different experimental setups is feasible.'
volume: 9
URL: http://proceedings.mlr.press/v9/alamgir10a.html
PDF: http://proceedings.mlr.press/v9/alamgir10a/alamgir10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-alamgir10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Alamgir
given: Morteza
- family: Grosse-Wentrup
given: Moritz
- family: Altun
given: Yasemin
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 17-24
id: alamgir10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 17
lastpage: 24
published: 2010-03-31 00:00:00 +0000
- title: 'Efficient Multioutput Gaussian Processes through Variational Inducing Kernels'
abstract: 'Interest in multioutput kernel methods is increasing, whether under the guise of multitask learning, multisensor networks or structured output data. From the Gaussian process perspective a multioutput Mercer kernel is a covariance function over correlated output functions. One way to construct such kernels is based on convolution processes (CP). A key problem for this approach is efficient inference. Alvarez and Lawrence recently presented a sparse approximation for CPs that enabled efficient inference. In this paper, we extend this work in two directions: we introduce the concept of variational inducing functions to handle potential non-smooth functions involved in the kernel CP construction and we consider an alternative approach to approximate inference based on variational methods, extending the work by Titsias (2009) to the multiple output case. We demonstrate our approaches on prediction of school marks, compiler performance and financial time series.'
volume: 9
URL: http://proceedings.mlr.press/v9/alvarez10a.html
PDF: http://proceedings.mlr.press/v9/alvarez10a/alvarez10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-alvarez10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Álvarez
given: Mauricio
- family: Luengo
given: David
- family: Titsias
given: Michalis
- family: Lawrence
given: Neil D.
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 25-32
id: alvarez10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 25
lastpage: 32
published: 2010-03-31 00:00:00 +0000
- title: 'Learning with Blocks: Composite Likelihood and Contrastive Divergence'
abstract: 'Composite likelihood methods provide a wide spectrum of computationally efficient techniques for statistical tasks such as parameter estimation and model selection. In this paper, we present a formal connection between the optimization of composite likelihoods and the well-known contrastive divergence algorithm. In particular, we show that composite likelihoods can be stochastically optimized by performing a variant of contrastive divergence with random-scan blocked Gibbs sampling. By using higher-order composite likelihoods, our proposed learning framework makes it possible to trade off computation time for increased accuracy. Furthermore, one can choose composite likelihood blocks that match the model’s dependence structure, making the optimization of higher-order composite likelihoods computationally efficient. We empirically analyze the performance of blocked contrastive divergence on various models, including visible Boltzmann machines, conditional random fields, and exponential random graph models, and we demonstrate that using higher-order blocks improves both the accuracy of parameter estimates and the rate of convergence.'
volume: 9
URL: http://proceedings.mlr.press/v9/asuncion10a.html
PDF: http://proceedings.mlr.press/v9/asuncion10a/asuncion10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-asuncion10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Asuncion
given: Arthur
- family: Liu
given: Qiang
- family: Ihler
given: Alexander
- family: Smyth
given: Padhraic
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 33-40
id: asuncion10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 33
lastpage: 40
published: 2010-03-31 00:00:00 +0000
- title: 'Deterministic Bayesian inference for the p* model'
abstract: 'The p* model is widely used in social network analysis. The likelihood of a network under this model is impossible to calculate for all but trivially small networks. Various approximations have been presented in the literature, and the pseudolikelihood approximation is the most popular. The aim of this paper is to introduce two likelihood approximations which have the pseudolikelihood estimator as a special case. We show, for the examples that we have considered, that both approximations result in improved estimation of model parameters with respect to the standard methodological approaches. We provide a deterministic approach and also illustrate how Bayesian model choice can be carried out in this setting.'
volume: 9
URL: http://proceedings.mlr.press/v9/austad10a.html
PDF: http://proceedings.mlr.press/v9/austad10a/austad10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-austad10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Austad
given: Haakon
- family: Friel
given: Nial
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 41-48
id: austad10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 41
lastpage: 48
published: 2010-03-31 00:00:00 +0000
- title: 'Half Transductive Ranking'
abstract: 'We study the standard retrieval task of ranking a fixed set of items given a previously unseen query and pose it as the half transductive ranking problem. The task is transductive as the set of items is fixed. Transductive representations (where the vector representation of each example is learned) allow the generation of highly nonlinear embeddings that capture object relationships without relying on a specific choice of features, and require only relatively simple optimization. Unfortunately, they have no direct out-of-sample extension. Inductive approaches on the other hand allow for the representation of unknown queries. We describe algorithms for this setting which have the advantages of both transductive and inductive approaches, and can be applied in unsupervised (either reconstruction-based or graph-based) and supervised ranking setups. We show empirically that our methods give strong performance on all three tasks.'
volume: 9
URL: http://proceedings.mlr.press/v9/bai10a.html
PDF: http://proceedings.mlr.press/v9/bai10a/bai10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-bai10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Bai
given: Bing
- family: Weston
given: Jason
- family: Grangier
given: David
- family: Collobert
given: Ronan
- family: Cortes
given: Corinna
- family: Mohri
given: Mehryar
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 49-56
id: bai10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 49
lastpage: 56
published: 2010-03-31 00:00:00 +0000
- title: 'Kernel Partial Least Squares is Universally Consistent'
abstract: 'We prove the statistical consistency of kernel Partial Least Squares Regression applied to a bounded regression learning problem on a reproducing kernel Hilbert space. Partial Least Squares stands out from well-known classical approaches such as Ridge Regression or Principal Components Regression, as it is not defined as the solution of a global cost minimization procedure over a fixed model, nor is it a linear estimator. Instead, approximate solutions are constructed by projections onto a nested set of data-dependent subspaces. To prove consistency, we exploit the known fact that Partial Least Squares is equivalent to the conjugate gradient algorithm in combination with early stopping. The choice of the stopping rule (number of iterations) is a crucial point. We study two empirical stopping rules. The first one monitors the estimation error in each iteration step of Partial Least Squares, and the second one estimates the empirical complexity in terms of a condition number. Both stopping rules lead to universally consistent estimators provided the kernel is universal.'
volume: 9
URL: http://proceedings.mlr.press/v9/blanchard10a.html
PDF: http://proceedings.mlr.press/v9/blanchard10a/blanchard10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-blanchard10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Blanchard
given: Gilles
- family: Krämer
given: Nicole
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 57-64
id: blanchard10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 57
lastpage: 64
published: 2010-03-31 00:00:00 +0000
- title: 'Towards Understanding Situated Natural Language'
abstract: 'We present a general framework and learning algorithm for the task of concept labeling: each word in a given sentence has to be tagged with the unique physical entity (e.g. person, object or location) or abstract concept it refers to. Our method allows both world knowledge and linguistic information to be used during learning and prediction. We show experimentally that we can learn to use world knowledge to resolve ambiguities in language, such as word senses or reference resolution, without the use of handcrafted rules or features.'
volume: 9
URL: http://proceedings.mlr.press/v9/bordes10a.html
PDF: http://proceedings.mlr.press/v9/bordes10a/bordes10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-bordes10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Bordes
given: Antoine
- family: Usunier
given: Nicolas
- family: Collobert
given: Ronan
- family: Weston
given: Jason
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 65-72
id: bordes10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 65
lastpage: 72
published: 2010-03-31 00:00:00 +0000
- title: 'Using Descendants as Instrumental Variables for the Identification of Direct Causal Effects in Linear SEMs'
abstract: 'In this paper, we present an extended set of graphical criteria for the identification of direct causal effects in linear Structural Equation Models (SEMs). Previous methods of graphical identification of direct causal effects in linear SEMs include methods such as the single-door criterion, the instrumental variable and the IV-pair, and the accessory set. However, there remain graphical models where a direct causal effect can be identified and these graphical criteria all fail. As a result, we introduce a new set of graphical criteria which uses descendants of either the cause variable or the effect variable as “path-specific instrumental variables” for the identification of the direct causal effect as long as certain conditions are satisfied. These conditions are based on edge removal and the existing graphical criteria of instrumental variables, and the identifiability of certain other total effects, and thus can be easily checked.'
volume: 9
URL: http://proceedings.mlr.press/v9/chan10a.html
PDF: http://proceedings.mlr.press/v9/chan10a/chan10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-chan10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Chan
given: Hei
- family: Kuroki
given: Manabu
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 73-80
id: chan10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 73
lastpage: 80
published: 2010-03-31 00:00:00 +0000
- title: 'Why are DBNs sparse?'
abstract: 'Real stochastic processes operate in continuous time and can be modeled by sets of stochastic differential equations. On the other hand, several popular model families, including hidden Markov models and dynamic Bayesian networks (DBNs), use discrete time steps. This paper explores methods for converting DBNs with infinitesimal time steps into DBNs with finite time steps, to enable efficient simulation and filtering over long periods. An exact conversion—summing out all intervening time slices between two steps—results in a completely connected DBN, yet nearly all human-constructed DBNs are sparse. We show how this sparsity arises from well-founded approximations resulting from differences among the natural time scales of the variables in the DBN. We define an automated procedure for constructing a provably accurate, approximate DBN model for any desired time step. We illustrate the method by generating a series of approximations to a simple pH model for the human body, demonstrating speedups of several orders of magnitude compared to the original model.'
volume: 9
URL: http://proceedings.mlr.press/v9/chatterjee10a.html
PDF: http://proceedings.mlr.press/v9/chatterjee10a/chatterjee10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-chatterjee10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Chatterjee
given: Shaunak
- family: Russell
given: Stuart
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 81-88
id: chatterjee10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 81
lastpage: 88
published: 2010-03-31 00:00:00 +0000
- title: 'Focused Belief Propagation for Query-Specific Inference'
abstract: 'With the increasing popularity of large-scale probabilistic graphical models, even “lightweight” approximate inference methods are becoming infeasible. Fortunately, often large parts of the model are of no immediate interest to the end user. Given the variable that the user actually cares about, we show how to quantify edge importance in graphical models and to significantly speed up inference by focusing computation on important parts of the model. Our algorithm empirically demonstrates convergence speedup by multiple times over state of the art.'
volume: 9
URL: http://proceedings.mlr.press/v9/chechetka10a.html
PDF: http://proceedings.mlr.press/v9/chechetka10a/chechetka10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-chechetka10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Chechetka
given: Anton
- family: Guestrin
given: Carlos
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 89-96
id: chechetka10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 89
lastpage: 96
published: 2010-03-31 00:00:00 +0000
- title: 'Parametric Herding'
abstract: 'A parametric version of herding is formulated. The nonlinear mapping between consecutive time slices is learned by a form of self-supervised training. The resulting dynamical system generates pseudo-samples that resemble the original data. We show how this parametric herding can be successfully used to compress a dataset consisting of binary digits. It is also verified that high compression rates translate into good prediction performance on unseen test data.'
volume: 9
URL: http://proceedings.mlr.press/v9/chen10a.html
PDF: http://proceedings.mlr.press/v9/chen10a/chen10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-chen10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Chen
given: Yutian
- family: Welling
given: Max
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 97-104
id: chen10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 97
lastpage: 104
published: 2010-03-31 00:00:00 +0000
- title: 'Mass Fatality Incident Identification based on nuclear DNA evidence'
abstract: 'This paper focuses on the use of nuclear DNA Short Tandem Repeat traits for the identification of the victims of a Mass Fatality Incident. The goal of the analysis is the assessment of the identification probabilities concerning the recovered victims. Identification hypotheses are evaluated conditionally on the DNA evidence observed both on the recovered victims and on the relatives of the persons who went missing in the tragic event. After specifying a set of conditional independence assertions suitable for the problem, an inference strategy is provided, treating some points to achieve computational efficiency. Finally, the proposal is tested through the simulation of a Mass Fatality Incident and the results are examined in detail.'
volume: 9
URL: http://proceedings.mlr.press/v9/corradi10a.html
PDF: http://proceedings.mlr.press/v9/corradi10a/corradi10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-corradi10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Corradi
given: Fabio
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 105-112
id: corradi10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 105
lastpage: 112
published: 2010-03-31 00:00:00 +0000
- title: 'On the Impact of Kernel Approximation on Learning Accuracy'
abstract: 'Kernel approximation is commonly used to scale kernel-based algorithms to applications containing as many as several million instances. This paper analyzes the effect of such approximations in the kernel matrix on the hypothesis generated by several widely used learning algorithms. We give stability bounds based on the norm of the kernel approximation for these algorithms, including SVMs, KRR, and graph Laplacian-based regularization algorithms. These bounds help determine the degree of approximation that can be tolerated in the estimation of the kernel matrix. Our analysis is general and applies to arbitrary approximations of the kernel matrix. However, we also give a specific analysis of the Nystrom low-rank approximation in this context and report the results of experiments evaluating the quality of the Nystrom low-rank kernel approximation when used with ridge regression.'
volume: 9
URL: http://proceedings.mlr.press/v9/cortes10a.html
PDF: http://proceedings.mlr.press/v9/cortes10a/cortes10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-cortes10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Cortes
given: Corinna
- family: Mohri
given: Mehryar
- family: Talwalkar
given: Ameet
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 113-120
id: cortes10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 113
lastpage: 120
published: 2010-03-31 00:00:00 +0000
- title: 'Improving posterior marginal approximations in latent Gaussian models'
abstract: 'We consider the problem of correcting the posterior marginal approximations computed by expectation propagation and Laplace approximation in latent Gaussian models and propose correction methods that are similar in spirit to the Laplace approximation of Tierney and Kadane (1986). We show that in the case of sparse Gaussian models, the computational complexity of expectation propagation can be made comparable to that of the Laplace approximation by using a parallel updating scheme. In some cases, expectation propagation gives excellent estimates, where the Laplace approximation fails. Inspired by bounds on the marginal corrections, we arrive at factorized approximations, which can be applied on top of both expectation propagation and Laplace. These give nearly indistinguishable results from the non-factorized approximations in a fraction of the time.'
volume: 9
URL: http://proceedings.mlr.press/v9/cseke10a.html
PDF: http://proceedings.mlr.press/v9/cseke10a/cseke10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-cseke10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Cseke
given: Botond
- family: Heskes
given: Tom
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 121-128
id: cseke10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 121
lastpage: 128
published: 2010-03-31 00:00:00 +0000
- title: 'Impossibility Theorems for Domain Adaptation'
abstract: 'The domain adaptation problem in machine learning occurs when the test data generating distribution differs from the one that generates the training data. It is clear that the success of learning under such circumstances depends on similarities between the two data distributions. We study assumptions about the relationship between the two distributions that are needed for domain adaptation learning to succeed. We analyze the assumptions in an agnostic PAC-style learning model for the setting in which the learner can access a labeled training data sample and an unlabeled sample generated by the test data distribution. We focus on three assumptions: (i) Similarity between the unlabeled distributions, (ii) Existence of a classifier in the hypothesis class with low error on both training and testing distributions, and (iii) The covariate shift assumption, i.e., the assumption that the conditioned label distribution (for each data point) is the same for both the training and test distributions. We show that without either assumption (i) or (ii), the combination of the remaining assumptions is not sufficient to guarantee successful learning. Our negative results hold with respect to any domain adaptation learning algorithm, as long as it does not have access to target labeled examples. In particular, we provide formal proofs that the popular covariate shift assumption is rather weak and does not relieve the necessity of the other assumptions. We also discuss the intuitively appealing paradigm of reweighing the labeled training sample according to the target unlabeled distribution. We show that, somewhat counterintuitively, that paradigm cannot be trusted in the following sense. There are DA tasks that are indistinguishable, as far as the input training data goes, but in which reweighing leads to significant improvement in one task, while causing dramatic deterioration of the learning success in the other.'
volume: 9
URL: http://proceedings.mlr.press/v9/david10a.html
PDF: http://proceedings.mlr.press/v9/david10a/david10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-david10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: David
given: Shai Ben
- family: Lu
given: Tyler
- family: Luu
given: Teresa
- family: Pal
given: David
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 129-136
id: david10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 129
lastpage: 136
published: 2010-03-31 00:00:00 +0000
- title: 'Multiclass-Multilabel Classification with More Classes than Examples'
abstract: 'We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.'
volume: 9
URL: http://proceedings.mlr.press/v9/dekel10a.html
PDF: http://proceedings.mlr.press/v9/dekel10a/dekel10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-dekel10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Dekel
given: Ofer
- family: Shamir
given: Ohad
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 137-144
id: dekel10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 137
lastpage: 144
published: 2010-03-31 00:00:00 +0000
- title: 'Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines'
abstract: 'Alternating Gibbs sampling is the most common scheme used for sampling from Restricted Boltzmann Machines (RBM), a crucial component in deep architectures such as Deep Belief Networks. However, we find that it often does a very poor job of rendering the diversity of modes captured by the trained model. We suspect that this hinders the advantage that could in principle be brought by training algorithms relying on Gibbs sampling for uncovering spurious modes, such as the Persistent Contrastive Divergence algorithm. To alleviate this problem, we explore the use of tempered Markov Chain Monte-Carlo for sampling in RBMs. We find both through visualization of samples and measures of likelihood on a toy dataset that it helps both sampling and learning.'
volume: 9
URL: http://proceedings.mlr.press/v9/desjardins10a.html
PDF: http://proceedings.mlr.press/v9/desjardins10a/desjardins10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-desjardins10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Desjardins
given: Guillaume
- family: Courville
given: Aaron
- family: Bengio
given: Yoshua
- family: Vincent
given: Pascal
- family: Delalleau
given: Olivier
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 145-152
id: desjardins10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 145
lastpage: 152
published: 2010-03-31 00:00:00 +0000
- title: 'Feature Selection using Multiple Streams'
abstract: 'Feature selection for supervised learning can be greatly improved by making use of the fact that features often come in classes. For example, in gene expression data, the genes which serve as features may be divided into classes based on their membership in gene families or pathways. When labeling words with senses for word sense disambiguation, features fall into classes including adjacent words, their parts of speech, and the topic and venue of the document the word is in. We present a streamwise feature selection method that allows dynamic generation and selection of features, while taking advantage of the different feature classes, and the fact that they are of different sizes and have different (but unknown) fractions of good features. Experimental results show that our approach provides significant improvement in performance and is computationally less expensive than comparable “batch” methods that do not take advantage of the feature classes and expect all features to be known in advance.'
volume: 9
URL: http://proceedings.mlr.press/v9/dhillon10a.html
PDF: http://proceedings.mlr.press/v9/dhillon10a/dhillon10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-dhillon10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Dhillon
given: Paramveer
- family: Foster
given: Dean
- family: Ungar
given: Lyle
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 153-160
id: dhillon10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 153
lastpage: 160
published: 2010-03-31 00:00:00 +0000
- title: 'Bayesian variable order Markov models'
abstract: 'We present a simple, effective generalisation of variable order Markov models to full online Bayesian estimation. The mechanism used is close to that employed in context tree weighting. The main contribution is the addition of a prior, conditioned on context, on the Markov order. The resulting construction uses a simple recursion and can be updated efficiently. This allows the model to make predictions using more complex contexts, as more data is acquired, if necessary. In addition, our model can be alternatively seen as a mixture of tree experts. Experimental results show that the predictive model exhibits consistently good performance in a variety of domains.'
volume: 9
URL: http://proceedings.mlr.press/v9/dimitrakakis10a.html
PDF: http://proceedings.mlr.press/v9/dimitrakakis10a/dimitrakakis10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-dimitrakakis10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Dimitrakakis
given: Christos
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 161-168
id: dimitrakakis10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 161
lastpage: 168
published: 2010-03-31 00:00:00 +0000
- title: 'Nonparametric Bayesian Matrix Factorization by Power-EP'
abstract: 'Many real-world applications can be modeled by matrix factorization. By approximating an observed data matrix as the product of two latent matrices, matrix factorization can reveal hidden structures embedded in data. A common challenge in using matrix factorization is determining the dimensionality of the latent matrices from data. Indian Buffet Processes (IBPs) enable us to apply the nonparametric Bayesian machinery to address this challenge. However, it remains a difficult task to learn nonparametric Bayesian matrix factorization models. In this paper, we propose a novel variational Bayesian method based on new equivalence classes of infinite matrices for learning these models. Furthermore, inspired by the success of nonnegative matrix factorization on many learning problems, we impose nonnegativity constraints on the latent matrices and mix variational inference with expectation propagation. This mixed inference method is unified in a power expectation propagation framework. Experimental results on image decomposition demonstrate the superior computational efficiency and the higher prediction accuracy of our methods compared to alternative Monte Carlo and variational inference methods for IBP models. We also apply the new methods to collaborative filtering and role mining and show the improved predictive performance over other matrix factorization methods.'
volume: 9
URL: http://proceedings.mlr.press/v9/ding10a.html
PDF: http://proceedings.mlr.press/v9/ding10a/ding10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-ding10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Ding
given: Nan
- family: Qi
given: Yuan
- family: Xiang
given: Rongjing
- family: Molloy
given: Ian
- family: Li
given: Ninghui
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 169-176
id: ding10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 169
lastpage: 176
published: 2010-03-31 00:00:00 +0000
- title: 'Neural conditional random fields'
abstract: 'We propose a non-linear graphical model for structured prediction. It combines the power of deep neural networks to extract high level features with the graphical framework of Markov networks, yielding a powerful and scalable probabilistic model that we apply to signal labeling tasks.'
volume: 9
URL: http://proceedings.mlr.press/v9/do10a.html
PDF: http://proceedings.mlr.press/v9/do10a/do10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-do10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Do
given: Trinh-Minh-Tri
- family: Artieres
given: Thierry
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 177-184
id: do10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 177
lastpage: 184
published: 2010-03-31 00:00:00 +0000
- title: 'Combining Experiments to Discover Linear Cyclic Models with Latent Variables'
abstract: 'We present an algorithm to infer causal relations between a set of measured variables on the basis of experiments on these variables. The algorithm assumes that the causal relations are linear, but is otherwise completely general: It provides consistent estimates when the true causal structure contains feedback loops and latent variables, while the experiments can involve surgical or ‘soft’ interventions on one or multiple variables at a time. The algorithm is ‘online’ in the sense that it combines the results from any set of available experiments, can incorporate background knowledge and resolves conflicts that arise from combining results from different experiments. In addition we provide a necessary and sufficient condition that (i) determines when the algorithm can uniquely return the true graph, and (ii) can be used to select the next best experiment until this condition is satisfied. We demonstrate the method by applying it to simulated data and the flow cytometry data of Sachs et al. (2005).'
volume: 9
URL: http://proceedings.mlr.press/v9/eberhardt10a.html
PDF: http://proceedings.mlr.press/v9/eberhardt10a/eberhardt10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-eberhardt10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Eberhardt
given: Frederick
- family: Hoyer
given: Patrik
- family: Scheines
given: Richard
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 185-192
id: eberhardt10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 185
lastpage: 192
published: 2010-03-31 00:00:00 +0000
- title: 'Graphical Gaussian modelling of multivariate time series with latent variables'
abstract: 'In time series analysis, inference about cause-effect relationships among multiple time series is commonly based on the concept of Granger causality, which exploits temporal structure to achieve causal ordering of dependent variables. One major problem in the application of Granger causality for the identification of causal relationships is the possible presence of latent variables that affect the measured components and thus lead to so-called spurious causalities. In this paper, we describe a new graphical approach for modelling the dependence structure of multivariate stationary time series that are affected by latent variables. To this end, we introduce dynamic maximal ancestral graphs (dMAGs), in which each time series is represented by a single vertex. For Gaussian processes, this approach leads to vector autoregressive models with errors that are not independent but correlated according to the dashed edges in the graph. We discuss identifiability of the parameters and show that these models can be viewed as graphical ARMA models that satisfy the Granger causality restrictions encoded by the associated dynamic maximal ancestral graph.'
volume: 9
URL: http://proceedings.mlr.press/v9/eichler10a.html
PDF: http://proceedings.mlr.press/v9/eichler10a/eichler10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-eichler10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Eichler
given: Michael
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 193-200
id: eichler10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 193
lastpage: 200
published: 2010-03-31 00:00:00 +0000
- title: 'Why Does Unsupervised Pre-training Help Deep Learning?'
abstract: 'Much recent research has been devoted to learning algorithms for deep architectures such as Deep Belief Networks and stacks of auto-encoder variants with impressive results being obtained in several areas, mostly on vision and language datasets. The best results obtained on supervised learning tasks often involve an unsupervised learning component, usually in an unsupervised pre-training phase. The main question investigated here is the following: why does unsupervised pre-training work so well? Through extensive experimentation, we explore several possible explanations discussed in the literature including its action as a regularizer (Erhan et al. 2009) and as an aid to optimization (Bengio et al. 2007). Our results build on the work of Erhan et al. 2009, showing that unsupervised pre-training appears to play predominantly a regularization role in subsequent supervised training. However, our results in an online setting, with a virtually unlimited data stream, point to a somewhat more nuanced interpretation of the roles of optimization and regularization in the unsupervised pre-training effect.'
volume: 9
URL: http://proceedings.mlr.press/v9/erhan10a.html
PDF: http://proceedings.mlr.press/v9/erhan10a/erhan10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-erhan10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Erhan
given: Dumitru
- family: Courville
given: Aaron
- family: Bengio
given: Yoshua
- family: Vincent
given: Pascal
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 201-208
id: erhan10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 201
lastpage: 208
published: 2010-03-31 00:00:00 +0000
- title: 'Semi-Supervised Learning via Generalized Maximum Entropy'
abstract: 'Various supervised inference methods can be analyzed as convex duals of the generalized maximum entropy (MaxEnt) framework. Generalized MaxEnt aims to find a distribution that maximizes an entropy function while respecting prior information represented as potential functions in miscellaneous forms of constraints and/or penalties. We extend this framework to semi-supervised learning by incorporating unlabeled data via modifications to these potential functions reflecting structural assumptions on the data geometry. The proposed approach leads to a family of discriminative semi-supervised algorithms, that are convex, scalable, inherently multi-class, easy to implement, and that can be kernelized naturally. Experimental evaluation of special cases shows the competitiveness of our methodology.'
volume: 9
URL: http://proceedings.mlr.press/v9/erkan10a.html
PDF: http://proceedings.mlr.press/v9/erkan10a/erkan10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-erkan10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Erkan
given: Ayse
- family: Altun
given: Yasemin
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 209-216
id: erkan10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 209
lastpage: 216
published: 2010-03-31 00:00:00 +0000
- title: 'Model-Free Monte Carlo-like Policy Evaluation'
abstract: 'We propose an algorithm for estimating the finite-horizon expected return of a closed loop control policy from an a priori given (off-policy) sample of one-step transitions. It averages cumulated rewards along a set of “broken trajectories” made of one-step transitions selected from the sample on the basis of the control policy. Under some Lipschitz continuity assumptions on the system dynamics, reward function and control policy, we provide bounds on the bias and variance of the estimator that depend only on the Lipschitz constants, on the number of broken trajectories used in the estimator, and on the sparsity of the sample of one-step transitions.'
volume: 9
URL: http://proceedings.mlr.press/v9/fonteneau10a.html
PDF: http://proceedings.mlr.press/v9/fonteneau10a/fonteneau10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-fonteneau10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Fonteneau
given: Raphael
- family: Murphy
given: Susan
- family: Wehenkel
given: Louis
- family: Ernst
given: Damien
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 217-224
id: fonteneau10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 217
lastpage: 224
published: 2010-03-31 00:00:00 +0000
- title: 'A Weighted Multi-Sequence Markov Model For Brain Lesion Segmentation'
abstract: 'We propose a technique for fusing the output of multiple Magnetic Resonance (MR) sequences to robustly and accurately segment brain lesions. It is based on an augmented multi-sequence Hidden Markov model that includes additional weight variables to account for the relative importance and control the impact of each sequence. The augmented framework has the advantage of allowing 1) the incorporation of expert knowledge on the a priori relevant information content of each sequence and 2) a weighting scheme which is modified adaptively according to the data and the segmentation task under consideration. The model, applied to the detection of multiple sclerosis and stroke lesions, shows promising results.'
volume: 9
URL: http://proceedings.mlr.press/v9/forbes10a.html
PDF: http://proceedings.mlr.press/v9/forbes10a/forbes10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-forbes10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Forbes
given: Florence
- family: Doyle
given: Senan
- family: Garcia-Lorenzo
given: Daniel
- family: Barillot
given: Christian
- family: Dojat
given: Michel
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 225-232
id: forbes10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 225
lastpage: 232
published: 2010-03-31 00:00:00 +0000
- title: 'Posterior distributions are computable from predictive distributions'
abstract: 'As we devise more complicated prior distributions, will inference algorithms keep up? We highlight a negative result in computable probability theory by Ackerman, Freer, and Roy (2010) that shows that there exist computable priors with noncomputable posteriors. In addition to providing a brief survey of computable probability theory geared towards the A.I. and statistics community, we give a new result characterizing when conditioning is computable in the setting of exchangeable sequences, and provide a computational perspective on work by Orbanz (2010) on conjugate nonparametric models. In particular, using a computable extension of de Finetti’s theorem (Freer and Roy 2009), we describe how to transform a posterior predictive rule for generating an exchangeable sequence into an algorithm for computing the posterior distribution of the directing random measure.'
volume: 9
URL: http://proceedings.mlr.press/v9/freer10a.html
PDF: http://proceedings.mlr.press/v9/freer10a/freer10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-freer10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Freer
given: Cameron
- family: Roy
given: Daniel
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 233-240
id: freer10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 233
lastpage: 240
published: 2010-03-31 00:00:00 +0000
- title: 'Variational methods for Reinforcement Learning'
abstract: 'We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transition matrix is obtained from which the optimal decision policy is formed. The classical maximum likelihood point estimate of the transition model does not reflect the uncertainty in the estimate of the transition model and the resulting policies may consequently lack a sufficient degree of exploration. We consider a Bayesian alternative that maintains a distribution over the transition so that the resulting policy takes into account the limited experience of the environment. The resulting algorithm is formally intractable and we discuss two approximate solution methods, Variational Bayes and Expectation Propagation.'
volume: 9
URL: http://proceedings.mlr.press/v9/furmston10a.html
PDF: http://proceedings.mlr.press/v9/furmston10a/furmston10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-furmston10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Furmston
given: Thomas
- family: Barber
given: David
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 241-248
id: furmston10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 241
lastpage: 248
published: 2010-03-31 00:00:00 +0000
- title: 'Understanding the difficulty of training deep feedforward neural networks'
abstract: 'Whereas before 2006 it appears that deep multi-layer neural networks were not successfully trained, since then several algorithms have been shown to successfully train them, with experimental results showing the superiority of deeper vs less deep architectures. All these experimental results were obtained with new initialization or training mechanisms. Our objective here is to understand better why standard gradient descent from random initialization is doing so poorly with deep neural networks, to better understand these recent relative successes and help design better algorithms in the future. We first observe the influence of the non-linear activation functions. We find that the logistic sigmoid activation is unsuited for deep networks with random initialization because of its mean value, which can drive especially the top hidden layer into saturation. Surprisingly, we find that saturated units can move out of saturation by themselves, albeit slowly, explaining the plateaus sometimes seen when training neural networks. We find that a new non-linearity that saturates less can often be beneficial. Finally, we study how activations and gradients vary across layers and during training, with the idea that training may be more difficult when the singular values of the Jacobian associated with each layer are far from 1. Based on these considerations, we propose a new initialization scheme that brings substantially faster convergence.'
volume: 9
URL: http://proceedings.mlr.press/v9/glorot10a.html
PDF: http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-glorot10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Glorot
given: Xavier
- family: Bengio
given: Yoshua
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 249-256
id: glorot10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 249
lastpage: 256
published: 2010-03-31 00:00:00 +0000
- title: 'On Combining Graph-based Variance Reduction schemes'
abstract: 'In this paper, we consider two variance reduction schemes that exploit the structure of the primal graph of the graphical model: Rao-Blackwellised w-cutset sampling and AND/OR sampling. We show that the two schemes are orthogonal and can be combined to further reduce the variance. Our combination yields a new family of estimators which trade time and space with variance. We demonstrate experimentally that the new estimators are superior, often yielding an order of magnitude improvement over previous schemes on several benchmarks.'
volume: 9
URL: http://proceedings.mlr.press/v9/gogate10a.html
PDF: http://proceedings.mlr.press/v9/gogate10a/gogate10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-gogate10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Gogate
given: Vibhav
- family: Dechter
given: Rina
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 257-264
id: gogate10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 257
lastpage: 264
published: 2010-03-31 00:00:00 +0000
- title: 'Locally Linear Denoising on Image Manifolds'
abstract: 'We study the problem of image denoising where images are assumed to be samples from low dimensional (sub)manifolds. We propose the algorithm of locally linear denoising. The algorithm approximates manifolds with locally linear patches by constructing nearest neighbor graphs. Each image is then locally denoised within its neighborhoods. A global optimal denoising result is then identified by aligning those local estimates. The algorithm has a closed-form solution that is efficient to compute. We evaluated and compared the algorithm to alternative methods on two image data sets. We demonstrated the effectiveness of the proposed algorithm, which yields visually appealing denoising results, incurs smaller reconstruction errors and results in lower error rates when the denoised data are used in supervised learning tasks.'
volume: 9
URL: http://proceedings.mlr.press/v9/gong10a.html
PDF: http://proceedings.mlr.press/v9/gong10a/gong10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-gong10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Gong
given: Dian
- family: Sha
given: Fei
- family: Medioni
given: Gérard
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 265-272
id: gong10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 265
lastpage: 272
published: 2010-03-31 00:00:00 +0000
- title: 'Regret Bounds for Gaussian Process Bandit Problems'
abstract: 'Bandit algorithms are concerned with trading exploration with exploitation where a number of options are available but we can only learn their quality by experimenting with them. We consider the scenario in which the reward distribution for arms is modeled by a Gaussian process and there is no noise in the observed reward. Our main result is to bound the regret experienced by algorithms relative to the a posteriori optimal strategy of playing the best arm throughout based on benign assumptions about the covariance function defining the Gaussian process. We further complement these upper bounds with corresponding lower bounds for particular covariance functions demonstrating that in general there is at most a logarithmic looseness in our upper bounds.'
volume: 9
URL: http://proceedings.mlr.press/v9/grunewalder10a.html
PDF: http://proceedings.mlr.press/v9/grunewalder10a/grunewalder10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-grunewalder10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Grünewälder
given: Steffen
- family: Audibert
given: Jean-Yves
- family: Opper
given: Manfred
- family: Shawe-Taylor
given: John
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 273-280
id: grunewalder10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 273
lastpage: 280
published: 2010-03-31 00:00:00 +0000
- title: 'Sufficient covariates and linear propensity analysis'
abstract: 'Working within the decision-theoretic framework for causal inference, we study the properties of “sufficient covariates”, which support causal inference from observational data, and possibilities for their reduction. In particular we illustrate the role of a propensity variable by means of a simple model, and explain why such a reduction typically does not increase (and may reduce) estimation efficiency.'
volume: 9
URL: http://proceedings.mlr.press/v9/guo10a.html
PDF: http://proceedings.mlr.press/v9/guo10a/guo10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-guo10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Guo
given: Hui
- family: Dawid
given: Philip
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 281-288
id: guo10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 281
lastpage: 288
published: 2010-03-31 00:00:00 +0000
- title: 'Real-time Multiattribute Bayesian Preference Elicitation with Pairwise Comparison Queries'
abstract: 'Preference elicitation (PE) is an important component of interactive decision support systems that aim to make optimal recommendations to users by actively querying their preferences. In this paper, we outline five principles important for PE in real-world problems: (1) real-time, (2) multiattribute, (3) low cognitive load, (4) robust to noise, and (5) scalable. In light of these requirements, we introduce an approximate PE framework based on TrueSkill for performing efficient closed-form Bayesian updates and query selection for a multiattribute utility belief state — a novel PE approach that naturally facilitates the efficient evaluation of value of information (VOI) heuristics for use in query selection strategies. Our best VOI query strategy satisfies all five principles (in contrast to related work) and performs on par with the most accurate (and often computationally intensive) algorithms on experiments with synthetic and real-world datasets.'
volume: 9
URL: http://proceedings.mlr.press/v9/guo10b.html
PDF: http://proceedings.mlr.press/v9/guo10b/guo10b.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-guo10b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Guo
given: Shengbo
- family: Sanner
given: Scott
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 289-296
id: guo10b
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 289
lastpage: 296
published: 2010-03-31 00:00:00 +0000
- title: 'Noise-contrastive estimation: A new estimation principle for unnormalized statistical models'
abstract: 'We present a new estimation principle for parameterized statistical models. The idea is to perform nonlinear logistic regression to discriminate between the observed data and some artificially generated noise, using the model log-density function in the regression nonlinearity. We show that this leads to a consistent (convergent) estimator of the parameters, and analyze the asymptotic variance. In particular, the method is shown to directly work for unnormalized models, i.e. models where the density function does not integrate to one. The normalization constant can be estimated just like any other parameter. For a tractable ICA model, we compare the method with other estimation methods that can be used to learn unnormalized models, including score matching, contrastive divergence, and maximum-likelihood where the normalization constant is estimated with importance sampling. Simulations show that noise-contrastive estimation offers the best trade-off between computational and statistical efficiency. The method is then applied to the modeling of natural images: We show that the method can successfully estimate a large-scale two-layer model and a Markov random field.'
volume: 9
URL: http://proceedings.mlr.press/v9/gutmann10a.html
PDF: http://proceedings.mlr.press/v9/gutmann10a/gutmann10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-gutmann10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Gutmann
given: Michael
- family: Hyvärinen
given: Aapo
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 297-304
id: gutmann10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 297
lastpage: 304
published: 2010-03-31 00:00:00 +0000
- title: 'Boosted Optimization for Network Classification'
abstract: 'In this paper we propose a new classification algorithm designed for application on complex networks motivated by algorithmic similarities between boosting learning and message passing. We consider a network classifier as a logistic regression where the variables define the nodes and the interaction effects define the edges. From this definition we represent the problem as a factor graph of local exponential loss functions. Using the factor graph representation it is possible to interpret the network classifier as an ensemble of individual node classifiers. We then combine ideas from boosted learning with network optimization algorithms to define two novel algorithms, Boosted Expectation Propagation (BEP) and Boosted Message Passing (BMP). These algorithms optimize the global network classifier performance by locally weighting each node classifier by the error of the surrounding network structure. We compare the performance of BEP and BMP to logistic regression as well as state-of-the-art penalized logistic regression models on simulated grid structured networks. The results show that using local boosting to optimize the performance of a network classifier increases classification performance and is especially powerful in cases when the whole network structure must be considered for accurate classification.'
volume: 9
URL: http://proceedings.mlr.press/v9/hancock10a.html
PDF: http://proceedings.mlr.press/v9/hancock10a/hancock10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-hancock10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Hancock
given: Timothy
- family: Mamitsuka
given: Hiroshi
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 305-312
id: hancock10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 305
lastpage: 312
published: 2010-03-31 00:00:00 +0000
- title: 'Dirichlet Process Mixtures of Generalized Linear Models'
abstract: 'We propose Dirichlet Process mixtures of Generalized Linear Models (DP-GLMs), a new method of nonparametric regression that accommodates continuous and categorical inputs and models a response variable locally by a generalized linear model. We give conditions for the existence and asymptotic unbiasedness of the DP-GLM regression mean function estimate; we then give a practical example for when those conditions hold. We evaluate DP-GLM on several data sets, comparing it to modern methods of nonparametric regression including regression trees and Gaussian processes.'
volume: 9
URL: http://proceedings.mlr.press/v9/hannah10a.html
PDF: http://proceedings.mlr.press/v9/hannah10a/hannah10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-hannah10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Hannah
given: Lauren
- family: Blei
given: David
- family: Powell
given: Warren
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 313-320
id: hannah10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 313
lastpage: 320
published: 2010-03-31 00:00:00 +0000
- title: 'Negative Results for Active Learning with Convex Losses'
abstract: 'We study the problem of active learning with convex loss functions. We prove that even under bounded noise constraints, the minimax rates for proper active learning are often no better than passive learning.'
volume: 9
URL: http://proceedings.mlr.press/v9/hanneke10a.html
PDF: http://proceedings.mlr.press/v9/hanneke10a/hanneke10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-hanneke10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Hanneke
given: Steve
- family: Yang
given: Liu
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 321-325
id: hanneke10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 321
lastpage: 325
published: 2010-03-31 00:00:00 +0000
- title: 'Coherent Inference on Optimal Play in Game Trees'
abstract: 'Round-based games are an instance of discrete planning problems. Some of the best contemporary game tree search algorithms use random roll-outs as data. Relying on a good policy, they learn on-policy values by propagating information upwards in the tree, but not between sibling nodes. Here, we present a generative model and a corresponding approximate message passing scheme for inference on the optimal, off-policy value of nodes in smooth AND/OR trees, given random roll-outs. The crucial insight is that the distribution of values in game trees is not completely arbitrary. We define a generative model of the on-policy values using a latent score for each state, representing the value under the random roll-out policy. Inference on the values under the optimal policy separates into an inductive, pre-data step and a deductive, post-data part. Both can be solved approximately with Expectation Propagation, allowing off-policy value inference for any node in the (exponentially big) tree in linear time.'
volume: 9
URL: http://proceedings.mlr.press/v9/hennig10a.html
PDF: http://proceedings.mlr.press/v9/hennig10a/hennig10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-hennig10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Hennig
given: Philipp
- family: Stern
given: David
- family: Graepel
given: Thore
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 326-333
id: hennig10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 326
lastpage: 333
published: 2010-03-31 00:00:00 +0000
- title: 'Collaborative Filtering via Rating Concentration'
abstract: 'While most popular collaborative filtering methods use low-rank matrix factorization and parametric density assumptions, this article proposes an approach based on distribution-free concentration inequalities. Using agnostic hierarchical sampling assumptions, functions of observed ratings are provably close to their expectations over query ratings, on average. A joint probability distribution over queries of interest is estimated using maximum entropy regularization. The distribution resides in a convex hull of allowable candidate distributions which satisfy concentration inequalities that stem from the sampling assumptions. The method accurately estimates rating distributions on synthetic and real data and is competitive with low rank and parametric methods which make more aggressive assumptions about the problem.'
volume: 9
URL: http://proceedings.mlr.press/v9/huang10a.html
PDF: http://proceedings.mlr.press/v9/huang10a/huang10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-huang10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Huang
given: Bert
- family: Jebara
given: Tony
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 334-341
id: huang10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 334
lastpage: 341
published: 2010-03-31 00:00:00 +0000
- title: 'Maximum-likelihood learning of cumulative distribution functions on graphs'
abstract: 'For many applications, a probability model can be easily expressed as a cumulative distribution function (CDF) as compared to the use of probability density or mass functions (PDF/PMFs). Cumulative distribution networks (CDNs) have recently been proposed as a class of graphical models for CDFs. One advantage of CDF models is the simplicity of representing multivariate heavy-tailed distributions. Examples of fields that can benefit from the use of graphical models for CDFs include climatology and epidemiology, where data may follow extreme value statistics and exhibit spatial correlations so that dependencies between model variables must be accounted for. The problem of learning from data in such settings may nevertheless consist of optimizing the log-likelihood function with respect to model parameters where we are required to optimize a log-PDF/PMF and not a log-CDF. We present a message-passing algorithm called the gradient-derivative-product (GDP) algorithm that allows us to learn the model in terms of the log-likelihood function whereby messages correspond to local gradients of the likelihood with respect to model parameters. We will demonstrate the GDP algorithm on real-world rainfall and H1N1 mortality data and we will show that CDNs provide a natural choice of parameterizations for the heavy-tailed multivariate distributions that arise in these problems.'
volume: 9
URL: http://proceedings.mlr.press/v9/huang10b.html
PDF: http://proceedings.mlr.press/v9/huang10b/huang10b.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-huang10b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Huang
given: Jim
- family: Jojic
given: Nebojsa
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 342-349
id: huang10b
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 342
lastpage: 349
published: 2010-03-31 00:00:00 +0000
- title: 'Learning Nonlinear Dynamic Models from Non-sequenced Data'
abstract: 'Virtually all methods of learning dynamic systems from data start from the same basic assumption: the learning algorithm will be given a sequence, or trajectory, of data generated from the dynamic system. We consider the case where the data is not sequenced. The training data points come from the system’s operation but with no temporal ordering. The data are simply drawn as individual disconnected points. While making this assumption may seem absurd at first glance, many scientific modeling tasks have exactly this property. Previous work proposed methods for learning linear, discrete time models under these assumptions by optimizing approximate likelihood functions. In this paper, we extend those methods to nonlinear models using kernel methods. We go on to propose a new approach to solving the problem that focuses on achieving temporal smoothness in the learned dynamics. The result is a convex criterion that can be easily optimized and often outperforms the earlier methods. We test these methods on several synthetic data sets including one generated from the Lorenz attractor.'
volume: 9
URL: http://proceedings.mlr.press/v9/huang10c.html
PDF: http://proceedings.mlr.press/v9/huang10c/huang10c.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-huang10c.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Huang
given: Tzu-Kuo
- family: Song
given: Le
- family: Schneider
given: Jeff
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 350-357
id: huang10c
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 350
lastpage: 357
published: 2010-03-31 00:00:00 +0000
- title: 'Learning Bayesian Network Structure using LP Relaxations'
abstract: 'We propose to solve the combinatorial problem of finding the highest scoring Bayesian network structure from data. This structure learning problem can be viewed as an inference problem where the variables specify the choice of parents for each node in the graph. The key combinatorial difficulty arises from the global constraint that the graph structure has to be acyclic. We cast the structure learning problem as a linear program over the polytope defined by valid acyclic structures. In relaxing this problem, we maintain an outer bound approximation to the polytope and iteratively tighten it by searching over a new class of valid constraints. If an integral solution is found, it is guaranteed to be the optimal Bayesian network. When the relaxation is not tight, the fast dual algorithms we develop remain useful in combination with a branch and bound method. Empirical results suggest that the method is competitive with or faster than alternative exact methods based on dynamic programming.'
volume: 9
URL: http://proceedings.mlr.press/v9/jaakkola10a.html
PDF: http://proceedings.mlr.press/v9/jaakkola10a/jaakkola10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-jaakkola10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Jaakkola
given: Tommi
- family: Sontag
given: David
- family: Globerson
given: Amir
- family: Meila
given: Marina
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 358-365
id: jaakkola10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 358
lastpage: 365
published: 2010-03-31 00:00:00 +0000
- title: 'Structured Sparse Principal Component Analysis'
abstract: 'We present an extension of sparse PCA, or sparse dictionary learning, where the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes. This structured sparse PCA is based on a structured regularization recently introduced by Jenatton et al. (2009). While classical sparse priors only deal with cardinality, the regularization we use encodes higher-order information about the data. We propose an efficient and simple optimization procedure to solve this problem. Experiments with two practical tasks, the denoising of sparse structured signals and face recognition, demonstrate the benefits of the proposed structured approach over unstructured approaches.'
volume: 9
URL: http://proceedings.mlr.press/v9/jenatton10a.html
PDF: http://proceedings.mlr.press/v9/jenatton10a/jenatton10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-jenatton10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Jenatton
given: Rodolphe
- family: Obozinski
given: Guillaume
- family: Bach
given: Francis
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 366-373
id: jenatton10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 366
lastpage: 373
published: 2010-03-31 00:00:00 +0000
- title: 'Nonlinear functional regression: a functional RKHS approach'
abstract: 'This paper deals with functional regression, in which the input attributes as well as the response are functions. To deal with this problem, we develop a functional reproducing kernel Hilbert space approach; here, a kernel is an operator acting on a function and yielding a function. We demonstrate basic properties of these functional RKHS, as well as a representer theorem for this setting; we investigate the construction of kernels; we provide some experimental insight.'
volume: 9
URL: http://proceedings.mlr.press/v9/kadri10a.html
PDF: http://proceedings.mlr.press/v9/kadri10a/kadri10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-kadri10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Kadri
given: Hachem
- family: Duflos
given: Emmanuel
- family: Preux
given: Philippe
- family: Canu
given: Stéphane
- family: Davy
given: Manuel
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 374-380
id: kadri10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 374
lastpage: 380
published: 2010-03-31 00:00:00 +0000
- title: 'Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity'
abstract: 'The versatility of exponential families, along with their attendant convexity properties, makes them a popular and effective statistical model. A central issue is learning these models in high dimensions when the optimal parameter vector is sparse. This work characterizes a certain strong convexity property of general exponential families, which allows their generalization ability to be quantified. In particular, we show how this property can be used to analyze generic exponential families under L1 regularization.'
volume: 9
URL: http://proceedings.mlr.press/v9/kakade10a.html
PDF: http://proceedings.mlr.press/v9/kakade10a/kakade10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-kakade10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Kakade
given: Sham
- family: Shamir
given: Ohad
- family: Sridharan
given: Karthik
- family: Tewari
given: Ambuj
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 381-388
id: kakade10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 381
lastpage: 388
published: 2010-03-31 00:00:00 +0000
- title: 'Collaborative Filtering on a Budget'
abstract: 'Matrix factorization is a successful technique for building collaborative filtering systems. While it works well on a large range of problems, it is also known for requiring significant amounts of storage for each user or item to be added to the database. This is a problem whenever the collaborative filtering task is larger than the medium-sized Netflix Prize data. In this paper, we propose a new model for representing and compressing matrix factors via hashing. This allows for essentially unbounded storage (at a graceful storage / performance trade-off) for users and items to be represented in a pre-defined memory footprint. It allows us to scale recommender systems to very large numbers of users or, conversely, to obtain very good performance even for tiny models (e.g. 400kB of data suffice for a representation of the EachMovie problem). We provide both experimental results and approximation bounds for our compressed representation and we show how this approach can be extended to multipartite problems.'
volume: 9
URL: http://proceedings.mlr.press/v9/karatzoglou10a.html
PDF: http://proceedings.mlr.press/v9/karatzoglou10a/karatzoglou10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-karatzoglou10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Karatzoglou
given: Alexandros
- family: Smola
given: Alex
- family: Weimer
given: Markus
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 389-396
id: karatzoglou10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 389
lastpage: 396
published: 2010-03-31 00:00:00 +0000
- title: 'Fast Active-set-type Algorithms for L1-regularized Linear Regression'
abstract: 'In this paper, we investigate new active-set-type methods for l1-regularized linear regression that overcome some difficulties of existing active set methods. By showing a relationship between l1-regularized linear regression and the linear complementarity problem with bounds, we present a fast active-set-type method, called block principal pivoting. This method accelerates computation by allowing exchanges of several variables among working sets. We further provide an improvement of this method, discuss its properties, and also explain a connection to the structure learning of Gaussian graphical models. Experimental comparisons on synthetic and real data sets show that the proposed method is significantly faster than existing active set methods and competitive against recently developed iterative methods.'
volume: 9
URL: http://proceedings.mlr.press/v9/kim10a.html
PDF: http://proceedings.mlr.press/v9/kim10a/kim10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-kim10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Kim
given: Jingu
- family: Park
given: Haesun
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 397-404
id: kim10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 397
lastpage: 404
published: 2010-03-31 00:00:00 +0000
- title: 'Online Anomaly Detection under Adversarial Impact'
abstract: 'Security analysis of learning algorithms is gaining increasing importance, especially since they have become the target of deliberate obstruction in certain applications. Some security-hardened algorithms have been previously proposed for supervised learning; however, very little is known about the behavior of anomaly detection methods in such scenarios. In this contribution, we analyze the performance of a particular method (online centroid anomaly detection) in the presence of adversarial noise. Our analysis addresses three key security-related issues: derivation of an optimal attack, analysis of its efficiency and constraints. Experimental evaluation carried out on real HTTP and exploit traces confirms the tightness of our theoretical bounds.'
volume: 9
URL: http://proceedings.mlr.press/v9/kloft10a.html
PDF: http://proceedings.mlr.press/v9/kloft10a/kloft10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-kloft10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Kloft
given: Marius
- family: Laskov
given: Pavel
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 405-412
id: kloft10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 405
lastpage: 412
published: 2010-03-31 00:00:00 +0000
- title: 'Ultra-high Dimensional Multiple Output Learning With Simultaneous Orthogonal Matching Pursuit: Screening Approach'
abstract: 'We propose a novel application of the Simultaneous Orthogonal Matching Pursuit (S-OMP) procedure to perform variable selection in ultra-high dimensional multiple output regression problems, which is the first attempt to utilize multiple outputs to perform fast removal of the irrelevant variables. As our main theoretical contribution, we show that the S-OMP can be used to reduce an ultra-high number of variables to below the sample size, without losing relevant variables. We also provide formal evidence that the modified Bayesian information criterion (BIC) can be used to efficiently select the number of iterations in the S-OMP. Once the number of variables has been reduced to a manageable size, we show that a more computationally demanding procedure can be used to identify the relevant variables for each of the regression outputs. We further provide evidence on the benefit of variable selection using the regression outputs jointly, as opposed to performing variable selection for each output separately. The finite sample performance of the S-OMP is demonstrated in extensive simulation studies.'
volume: 9
URL: http://proceedings.mlr.press/v9/kolar10a.html
PDF: http://proceedings.mlr.press/v9/kolar10a/kolar10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-kolar10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Kolar
given: Mladen
- family: Xing
given: Eric
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 413-420
id: kolar10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 413
lastpage: 420
published: 2010-03-31 00:00:00 +0000
- title: 'Semi-Supervised Learning with Max-Margin Graph Cuts'
abstract: 'This paper proposes a novel algorithm for semi-supervised learning. This algorithm learns graph cuts that maximize the margin with respect to the labels induced by the harmonic function solution. We motivate the approach, compare it to existing work, and prove a bound on its generalization error. The quality of our solutions is evaluated on a synthetic problem and three UCI ML repository datasets. In most cases, we outperform manifold regularization of support vector machines, which is a state-of-the-art approach to semi-supervised max-margin learning.'
volume: 9
URL: http://proceedings.mlr.press/v9/kveton10a.html
PDF: http://proceedings.mlr.press/v9/kveton10a/kveton10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-kveton10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Kveton
given: Branislav
- family: Valko
given: Michal
- family: Rahimi
given: Ali
- family: Huang
given: Ling
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 421-428
id: kveton10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 421
lastpage: 428
published: 2010-03-31 00:00:00 +0000
- title: 'Solving the Uncapacitated Facility Location Problem Using Message Passing Algorithms'
abstract: 'The Uncapacitated Facility Location Problem (UFLP) is one of the most widely studied discrete location problems, whose applications arise in a variety of settings. We tackle the UFLP using probabilistic inference in a graphical model, an approach that has received little attention in the past. We show that the fixed points of max-product linear programming (MPLP), a convexified version of the max-product algorithm, can be used to construct a solution with a 3-approximation guarantee for metric UFLP instances. In addition, we characterize some scenarios under which the MPLP solution is guaranteed to be globally optimal. We evaluate the performance of both max-sum and MPLP empirically on metric and non-metric problems, demonstrating the advantages of the 3-approximation construction and the applicability of the algorithm to non-metric instances.'
volume: 9
URL: http://proceedings.mlr.press/v9/lazic10a.html
PDF: http://proceedings.mlr.press/v9/lazic10a/lazic10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-lazic10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Lazic
given: Nevena
- family: Frey
given: Brendan
- family: Aarabi
given: Parham
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 429-436
id: lazic10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 429
lastpage: 436
published: 2010-03-31 00:00:00 +0000
- title: 'Relating Function Class Complexity and Cluster Structure in the Function Domain with Applications to Transduction'
abstract: 'We relate function class complexity to structure in the function domain. This facilitates risk analysis relative to cluster structure in the input space, which is particularly effective in semi-supervised learning. In particular, we quantify the complexity of function classes defined over a graph in terms of the graph structure.'
volume: 9
URL: http://proceedings.mlr.press/v9/lever10a.html
PDF: http://proceedings.mlr.press/v9/lever10a/lever10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-lever10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Lever
given: Guy
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 437-444
id: lever10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 437
lastpage: 444
published: 2010-03-31 00:00:00 +0000
- title: 'The Feature Selection Path in Kernel Methods'
abstract: 'The problem of automatic feature selection/weighting in kernel methods is examined. We work on a formulation that optimizes both the weights of features and the parameters of the kernel model simultaneously, using L_1 regularization for feature selection. Under quite general choices of kernels, we prove that there exists a unique regularization path for this problem, that runs from 0 to a stationary point of the non-regularized problem. We propose an ODE-based homotopy method to follow this trajectory. By following the path, our algorithm is able to automatically discard irrelevant features and to automatically go back and forth to avoid local optima. Experiments on synthetic and real datasets show that the method achieves low prediction error and is efficient in separating relevant from irrelevant features.'
volume: 9
URL: http://proceedings.mlr.press/v9/li10a.html
PDF: http://proceedings.mlr.press/v9/li10a/li10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-li10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Li
given: Fuxin
- family: Sminchisescu
given: Cristian
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 445-452
id: li10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 445
lastpage: 452
published: 2010-03-31 00:00:00 +0000
- title: 'Simple Exponential Family PCA'
abstract: 'Bayesian principal component analysis (BPCA), a probabilistic reformulation of PCA with Bayesian model selection, is a systematic approach to determining the number of essential principal components (PCs) for data representation. However, it assumes that data are Gaussian distributed and thus it cannot handle all types of practical observations, e.g. integers and binary values. In this paper, we propose simple exponential family PCA (SePCA), a generalised family of probabilistic principal component analysers. SePCA employs exponential family distributions to handle general types of observations. By using Bayesian inference, SePCA also automatically discovers the number of essential PCs. We discuss techniques for fitting the model, develop the corresponding mixture model, and show the effectiveness of the model based on experiments.'
volume: 9
URL: http://proceedings.mlr.press/v9/li10b.html
PDF: http://proceedings.mlr.press/v9/li10b/li10b.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-li10b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Li
given: Jun
- family: Tao
given: Dacheng
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 453-460
id: li10b
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 453
lastpage: 460
published: 2010-03-31 00:00:00 +0000
- title: 'The Group Dantzig Selector'
abstract: 'We introduce a new method – the group Dantzig selector – for high dimensional sparse regression with group structure, which has a convincing theory about why utilizing the group structure can be beneficial. Under a group restricted isometry condition, we obtain a significantly improved nonasymptotic L2-norm bound over the basis pursuit or the Dantzig selector which ignores the group structure. To gain more insight, we also introduce a surprisingly simple and intuitive “sparsity oracle condition” to obtain a block L1-norm bound, which is easily accessible to a broad audience in the machine learning community. Encouraging numerical results are also provided to support our theory.'
volume: 9
URL: http://proceedings.mlr.press/v9/liu10a.html
PDF: http://proceedings.mlr.press/v9/liu10a/liu10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-liu10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Liu
given: Han
- family: Zhang
given: Jian
- family: Jiang
given: Xiaoye
- family: Liu
given: Jun
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 461-468
id: liu10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 461
lastpage: 468
published: 2010-03-31 00:00:00 +0000
- title: 'Descent Methods for Tuning Parameter Refinement'
abstract: 'This paper addresses multidimensional tuning parameter selection in the context of “train-validate-test” and K-fold cross validation. A coarse grid search over tuning parameter space is used to initialize a descent method which then jointly optimizes over variables and tuning parameters. We study four regularized regression methods and develop the update equations for the corresponding descent algorithms. Experiments on both simulated and real-world datasets show that the method results in significant tuning parameter refinement.'
volume: 9
URL: http://proceedings.mlr.press/v9/lorbert10a.html
PDF: http://proceedings.mlr.press/v9/lorbert10a/lorbert10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-lorbert10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Lorbert
given: Alexander
- family: Ramadge
given: Peter
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 469-476
id: lorbert10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 469
lastpage: 476
published: 2010-03-31 00:00:00 +0000
- title: 'Exploiting Covariate Similarity in Sparse Regression via the Pairwise Elastic Net'
abstract: 'A new approach to regression regularization called the Pairwise Elastic Net is proposed. Like the Elastic Net, it simultaneously performs automatic variable selection and continuous shrinkage. In addition, the Pairwise Elastic Net encourages the grouping of strongly correlated predictors based on a pairwise similarity measure. We give examples of how the Pairwise Elastic Net can be used to achieve the objectives of Ridge regression, the Lasso, the Elastic Net, and Group Lasso. Finally, we present a coordinate descent algorithm to solve the Pairwise Elastic Net.'
volume: 9
URL: http://proceedings.mlr.press/v9/lorbert10b.html
PDF: http://proceedings.mlr.press/v9/lorbert10b/lorbert10b.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-lorbert10b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Lorbert
given: Alexander
- family: Eis
given: David
- family: Kostina
given: Victoria
- family: Blei
given: David
- family: Ramadge
given: Peter
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 477-484
id: lorbert10b
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 477
lastpage: 484
published: 2010-03-31 00:00:00 +0000
- title: 'Contextual Multi-Armed Bandits'
abstract: 'We study contextual multi-armed bandit problems where the context comes from a metric space and the payoff satisfies a Lipschitz condition with respect to the metric. Abstractly, a contextual multi-armed bandit problem models a situation where, in a sequence of independent trials, an online algorithm chooses, based on a given context (side information), an action from a set of possible actions so as to maximize the total payoff of the chosen actions. The payoff depends on both the action chosen and the context. In contrast, context-free multi-armed bandit problems, a focus of much previous research, model situations where no side information is available and the payoff depends only on the action chosen. Our problem is motivated by sponsored web search, where the task is to display ads to a user of an Internet search engine based on her search query so as to maximize the click-through rate (CTR) of the ads displayed. We cast this problem as a contextual multi-armed bandit problem where queries and ads form metric spaces and the payoff function is Lipschitz with respect to both the metrics. For any ε > 0 we present an algorithm with regret O(T^{\frac{a+b+1}{a+b+2} + ε}) where a, b are the covering dimensions of the query space and the ad space respectively. We prove a lower bound Ω(T^{\frac{\tilde{a}+\tilde{b}+1}{\tilde{a}+\tilde{b}+2} - ε}) for the regret of any algorithm where \tilde{a}, \tilde{b} are packing dimensions of the query space and the ad space respectively. For finite spaces or convex bounded subsets of Euclidean spaces, this gives an almost matching upper and lower bound.'
volume: 9
URL: http://proceedings.mlr.press/v9/lu10a.html
PDF: http://proceedings.mlr.press/v9/lu10a/lu10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-lu10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Lu
given: Tyler
- family: Pal
given: David
- family: Pal
given: Martin
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 485-492
id: lu10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 485
lastpage: 492
published: 2010-03-31 00:00:00 +0000
- title: 'Exploiting Feature Covariance in High-Dimensional Online Learning'
abstract: 'Some online algorithms for linear classification model the uncertainty in their weights over the course of learning. Modeling the full covariance structure of the weights can provide a significant advantage for classification. However, for high-dimensional, large-scale data, even though there may be many second-order feature interactions, it is computationally infeasible to maintain this covariance structure. To extend second-order methods to high-dimensional data, we develop low-rank approximations of the covariance structure. We evaluate our approach on both synthetic and real-world data sets using the confidence-weighted online learning framework. We show improvements over diagonal covariance matrices for both low and high-dimensional data.'
volume: 9
URL: http://proceedings.mlr.press/v9/ma10a.html
PDF: http://proceedings.mlr.press/v9/ma10a/ma10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-ma10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Ma
given: Justin
- family: Kulesza
given: Alex
- family: Dredze
given: Mark
- family: Crammer
given: Koby
- family: Saul
given: Lawrence
- family: Pereira
given: Fernando
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 493-500
id: ma10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 493
lastpage: 500
published: 2010-03-31 00:00:00 +0000
- title: 'Supervised Dimension Reduction Using Bayesian Mixture Modeling'
abstract: 'We develop a Bayesian framework for supervised dimension reduction using a flexible nonparametric Bayesian mixture modeling approach. Our method retrieves the dimension reduction or d.r. subspace by utilizing a dependent Dirichlet process that allows for natural clustering for the data in terms of both the response and predictor variables. Formal probabilistic models with likelihoods and priors are given and efficient posterior sampling of the d.r. subspace can be obtained by a Gibbs sampler. As the posterior draws are linear subspaces which are points on a Grassmann manifold, we output the posterior mean d.r. subspace with respect to geodesics on the Grassmannian. The utility of our approach is illustrated on a set of simulated and real examples. Some Key Words: supervised dimension reduction, inverse regression, Dirichlet process, factor models, Grassmann manifold.'
volume: 9
URL: http://proceedings.mlr.press/v9/mao10a.html
PDF: http://proceedings.mlr.press/v9/mao10a/mao10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-mao10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Mao
given: Kai
- family: Liang
given: Feng
- family: Mukherjee
given: Sayan
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 501-508
id: mao10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 501
lastpage: 508
published: 2010-03-31 00:00:00 +0000
- title: 'Inductive Principles for Restricted Boltzmann Machine Learning'
abstract: 'Recent research has seen the proposal of several new inductive principles designed specifically to avoid the problems associated with maximum likelihood learning in models with intractable partition functions. In this paper, we study learning methods for binary restricted Boltzmann machines (RBMs) based on ratio matching and generalized score matching. We compare these new RBM learning methods to a range of existing learning methods including stochastic maximum likelihood, contrastive divergence, and pseudo-likelihood. We perform an extensive empirical evaluation across multiple tasks and data sets.'
volume: 9
URL: http://proceedings.mlr.press/v9/marlin10a.html
PDF: http://proceedings.mlr.press/v9/marlin10a/marlin10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-marlin10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Marlin
given: Benjamin
- family: Swersky
given: Kevin
- family: Chen
given: Bo
- family: Freitas
given: Nando
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 509-516
id: marlin10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 509
lastpage: 516
published: 2010-03-31 00:00:00 +0000
- title: 'Parallelizable Sampling of Markov Random Fields'
abstract: 'Markov Random Fields (MRFs) are an important class of probabilistic models which are used for density estimation, classification, denoising, and for constructing Deep Belief Networks. Every application of an MRF requires addressing its inference problem, which can be done using deterministic inference methods or using stochastic Markov Chain Monte Carlo methods. In this paper we introduce a new Markov Chain transition operator that updates all the variables of a pairwise MRF in parallel by using auxiliary Gaussian variables. The proposed MCMC operator is extremely simple to implement and to parallelize. This is achieved by a formal equivalence result between arbitrary pairwise MRFs and a particular type of Restricted Boltzmann Machine. This result also implies that the latter can be learned in place of the former without any loss of modeling power, a possibility we explore in experiments.'
volume: 9
URL: http://proceedings.mlr.press/v9/martens10a.html
PDF: http://proceedings.mlr.press/v9/martens10a/martens10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-martens10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Martens
given: James
- family: Sutskever
given: Ilya
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 517-524
id: martens10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 517
lastpage: 524
published: 2010-03-31 00:00:00 +0000
- title: 'Exploiting Within-Clique Factorizations in Junction-Tree Algorithms'
abstract: 'We show that the expected computational complexity of the Junction-Tree Algorithm for maximum a posteriori inference in graphical models can be improved. Our results apply whenever the potentials over maximal cliques of the triangulated graph are factored over subcliques. This is common in many real applications, as we illustrate with several examples. The new algorithms are easily implemented, and experiments show substantial speed-ups over the classical Junction-Tree Algorithm. This enlarges the class of models for which exact inference is efficient.'
volume: 9
URL: http://proceedings.mlr.press/v9/mcauley10a.html
PDF: http://proceedings.mlr.press/v9/mcauley10a/mcauley10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-mcauley10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: McAuley
given: Julian
- family: Caetano
given: Tiberio
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 525-532
id: mcauley10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 525
lastpage: 532
published: 2010-03-31 00:00:00 +0000
- title: 'Discriminative Topic Segmentation of Text and Speech'
abstract: 'We explore automated discovery of topically-coherent segments in speech or text sequences. We give two new discriminative topic segmentation algorithms which employ a new measure of text similarity based on word co-occurrence. Both algorithms function by finding extrema in the similarity signal over the text, with the latter algorithm using a compact support-vector based description of a window of text or speech observations in word similarity space to overcome noise introduced by speech recognition errors and off-topic content. In experiments over speech and text news streams, we show that these algorithms outperform previous methods. We observe that topic segmentation of speech recognizer output is a more difficult problem than that of text streams; however, we demonstrate that by using a lattice of competing hypotheses rather than just the one-best hypothesis as input to the segmentation algorithm, the performance of the algorithm can be improved.'
volume: 9
URL: http://proceedings.mlr.press/v9/mohri10a.html
PDF: http://proceedings.mlr.press/v9/mohri10a/mohri10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-mohri10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Mohri
given: Mehryar
- family: Moreno
given: Pedro
- family: Weinstein
given: Eugene
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 533-540
id: mohri10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 533
lastpage: 540
published: 2010-03-31 00:00:00 +0000
- title: 'Elliptical slice sampling'
abstract: 'Many probabilistic models introduce strong dependencies between variables using a latent multivariate Gaussian distribution or a Gaussian process. We present a new Markov chain Monte Carlo algorithm for performing inference in models with multivariate Gaussian priors. Its key properties are: 1) it has simple, generic code applicable to many models, 2) it has no free parameters, 3) it works well for a variety of Gaussian process based models. These properties make our method ideal for use while model building, removing the need to spend time deriving and tuning updates for more complex algorithms.'
volume: 9
URL: http://proceedings.mlr.press/v9/murray10a.html
PDF: http://proceedings.mlr.press/v9/murray10a/murray10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-murray10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Murray
given: Iain
- family: Adams
given: Ryan
- family: MacKay
given: David
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 541-548
id: murray10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 541
lastpage: 548
published: 2010-03-31 00:00:00 +0000
- title: 'Near-Optimal Evasion of Convex-Inducing Classifiers'
abstract: 'Classifiers are often used to detect miscreant activities. We study how an adversary can efficiently query a classifier to elicit information that allows the adversary to evade detection at near-minimal cost. We generalize results of Lowd and Meek (2005) to convex-inducing classifiers. We present algorithms that construct undetected instances of near-minimal cost using only polynomially many queries in the dimension of the space and without reverse engineering the decision boundary.'
volume: 9
URL: http://proceedings.mlr.press/v9/nelson10a.html
PDF: http://proceedings.mlr.press/v9/nelson10a/nelson10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-nelson10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Nelson
given: Blaine
- family: Rubinstein
given: Benjamin
- family: Huang
given: Ling
- family: Joseph
given: Anthony
- family: Lau
given: Shing-hon
- family: Lee
given: Steven
- family: Rao
given: Satish
- family: Tran
given: Anthony
- family: Tygar
given: Doug
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 549-556
id: nelson10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 549
lastpage: 556
published: 2010-03-31 00:00:00 +0000
- title: 'Incremental Sparsification for Real-time Online Model Learning'
abstract: 'Online model learning in real-time is required by many applications, for example, robot tracking control. It poses a difficult problem, as fast and incremental online regression with large data sets is the essential component and cannot be realized by straightforward usage of off-the-shelf machine learning methods such as Gaussian process regression or support vector regression. In this paper, we propose a framework for online, incremental sparsification with a fixed budget designed for large scale real-time model learning. The proposed approach combines a sparsification method based on an independency measure with a large scale database. In combination with an incremental learning approach such as sequential support vector regression, we obtain a regression method which is applicable in real-time online learning. It exhibits competitive learning accuracy when compared with standard regression techniques. Implementation on a real robot emphasizes the applicability of the proposed approach in real-time online model learning for real world systems.'
volume: 9
URL: http://proceedings.mlr.press/v9/nguyen_tuong10a.html
PDF: http://proceedings.mlr.press/v9/nguyen_tuong10a/nguyen_tuong10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-nguyen_tuong10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Nguyen-Tuong
given: Duy
- family: Peters
given: Jan
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 557-564
id: nguyen_tuong10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 557
lastpage: 564
published: 2010-03-31 00:00:00 +0000
- title: 'Fluid Dynamics Models for Low Rank Discriminant Analysis'
abstract: 'We consider the problem of reducing the dimensionality of labeled data for classification. Unfortunately, the optimal approach of finding the low-dimensional projection with minimal Bayes classification error is intractable, so most standard algorithms optimize a tractable heuristic function in the projected subspace. Here, we investigate a physics-based model where we consider the labeled data as interacting fluid distributions. We derive the forces arising in the fluids from information theoretic potential functions, and consider appropriate low rank constraints on the resulting acceleration and velocity flow fields. We show how to apply the Gauss principle of least constraint in fluids to obtain tractable solutions for low rank projections. Our fluid dynamic approach is demonstrated to better approximate the Bayes optimal solution on Gaussian systems, including infinite dimensional Gaussian processes.'
volume: 9
URL: http://proceedings.mlr.press/v9/noh10a.html
PDF: http://proceedings.mlr.press/v9/noh10a/noh10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-noh10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Noh
given: Yung-Kyun
- family: Zhang
given: Byoung-Tak
- family: Lee
given: Daniel
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 565-572
id: noh10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 565
lastpage: 572
published: 2010-03-31 00:00:00 +0000
- title: 'Approximation of hidden Markov models by mixtures of experts with application to particle filtering'
abstract: 'Conveniently selecting the proposal kernel and the adjustment multiplier weights of the auxiliary particle filter may significantly increase the accuracy and computational efficiency of the method. However, in practice the optimal proposal kernel and multiplier weights are seldom known. In this paper we present a simulation-based method for constructing offline an approximation of these quantities that makes the filter close to fully adapted at a reasonable computational cost. The approximation is constructed as a mixture of experts optimised through an efficient stochastic approximation algorithm. The method is illustrated on two simulated examples.'
volume: 9
URL: http://proceedings.mlr.press/v9/olsson10a.html
PDF: http://proceedings.mlr.press/v9/olsson10a/olsson10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-olsson10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Olsson
given: Jimmy
- family: Ströjby
given: Jonas
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 573-580
id: olsson10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 573
lastpage: 580
published: 2010-03-31 00:00:00 +0000
- title: 'A generalization of the Multiple-try Metropolis algorithm for Bayesian estimation and model selection'
abstract: 'We propose a generalization of the Multiple-try Metropolis (MTM) algorithm of Liu et al. (2000), which is based on drawing several proposals at each step and randomly choosing one of them on the basis of weights that may be arbitrarily chosen. In particular, for Bayesian estimation we also introduce a method based on weights depending on a quadratic approximation of the posterior distribution. The resulting algorithm cannot be reformulated as an MTM algorithm and leads to a comparable gain of efficiency with a lower computational effort. We also outline the extension of the proposed strategy, and then of the MTM strategy, to Bayesian model selection, casting it in a Reversible Jump framework. The approach is illustrated by real examples.'
volume: 9
URL: http://proceedings.mlr.press/v9/pandolfi10a.html
PDF: http://proceedings.mlr.press/v9/pandolfi10a/pandolfi10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-pandolfi10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Pandolfi
given: Silvia
- family: Bartolucci
given: Francesco
- family: Friel
given: Nial
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 581-588
id: pandolfi10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 581
lastpage: 588
published: 2010-03-31 00:00:00 +0000
- title: 'Bayesian structure discovery in Bayesian networks with less space'
abstract: 'Current exact algorithms for score-based structure discovery in Bayesian networks on n nodes run in time and space within a polynomial factor of 2^n. For practical use, the space requirement is the bottleneck, which motivates trading space against time. Here, previous results on finding an optimal network structure in less space are extended in two directions. First, we consider the problem of computing the posterior probability of a given arc set. Second, we operate with the general partial order framework and its specialization to bucket orders, introduced recently for related permutation problems. The main technical contribution is the development of a fast algorithm for a novel zeta transform variant, which may be of independent interest.'
volume: 9
URL: http://proceedings.mlr.press/v9/parviainen10a.html
PDF: http://proceedings.mlr.press/v9/parviainen10a/parviainen10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-parviainen10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Parviainen
given: Pekka
- family: Koivisto
given: Mikko
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 589-596
id: parviainen10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 589
lastpage: 596
published: 2010-03-31 00:00:00 +0000
- title: 'Identifying Cause and Effect on Discrete Data using Additive Noise Models'
abstract: 'Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. Whenever the joint distribution P^{(X,Y)} admits such a model in one direction, e.g. Y=f(X)+N, N independent of X, it does not admit the reversed model X=g(Y)+N’, N’ independent of Y as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. We show that this algorithm works both on synthetic and real data sets.'
volume: 9
URL: http://proceedings.mlr.press/v9/peters10a.html
PDF: http://proceedings.mlr.press/v9/peters10a/peters10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-peters10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Peters
given: Jonas
- family: Janzing
given: Dominik
- family: Schölkopf
given: Bernhard
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 597-604
id: peters10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 597
lastpage: 604
published: 2010-03-31 00:00:00 +0000
- title: 'REGO: Rank-based Estimation of Renyi Information using Euclidean Graph Optimization'
abstract: 'We propose a new method for non-parametric estimation of Renyi and Shannon information for a multivariate distribution using a corresponding copula, a multivariate distribution over normalized ranks of the data. As the information of the distribution is the same as the negative entropy of its copula, our method estimates this information by solving a Euclidean graph optimization problem on the empirical estimate of the distribution’s copula. Owing to the properties of the copula, we show that the resulting estimator of Renyi information is strongly consistent and robust. Further, we demonstrate its applicability to image registration in addition to simulated experiments.'
volume: 9
URL: http://proceedings.mlr.press/v9/poczos10a.html
PDF: http://proceedings.mlr.press/v9/poczos10a/poczos10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-poczos10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Poczos
given: Barnabas
- family: Kirshner
given: Sergey
- family: Szepesvári
given: Csaba
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 605-612
id: poczos10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 605
lastpage: 612
published: 2010-03-31 00:00:00 +0000
- title: 'Infinite Predictor Subspace Models for Multitask Learning'
abstract: 'Given several related learning tasks, we propose a nonparametric Bayesian model that captures task relatedness by assuming that the task parameters (i.e., predictors) share a latent subspace. More specifically, the intrinsic dimensionality of the task subspace is not assumed to be known a priori. We use an infinite latent feature model to automatically infer this number (depending on and limited by only the number of tasks). Furthermore, our approach is applicable when the underlying task parameter subspace is inherently sparse, drawing parallels with l1 regularization and LASSO-style models. We also propose an augmented model which can make use of (labeled, and additionally unlabeled if available) inputs to assist learning this subspace, leading to further improvements in the performance. Experimental results demonstrate the efficacy of both the proposed approaches, especially when the number of examples per task is small. Finally, we discuss an extension of the proposed framework where a nonparametric mixture of linear subspaces can be used to learn a manifold over the task parameters, and also deal with the issue of negative transfer from unrelated tasks.'
volume: 9
URL: http://proceedings.mlr.press/v9/rai10a.html
PDF: http://proceedings.mlr.press/v9/rai10a/rai10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-rai10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Rai
given: Piyush
- family: Daumé III
given: Hal
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 613-620
id: rai10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 613
lastpage: 620
published: 2010-03-31 00:00:00 +0000
- title: 'Factored 3-Way Restricted Boltzmann Machines For Modeling Natural Images'
abstract: 'Deep belief nets have been successful in modeling handwritten characters, but it has proved more difficult to apply them to real images. The problem lies in the restricted Boltzmann machine (RBM) which is used as a module for learning deep belief nets one layer at a time. The Gaussian-Binary RBMs that have been used to model real-valued data are not a good way to model the covariance structure of natural images. We propose a factored 3-way RBM that uses the states of its hidden units to represent abnormalities in the local covariance structure of an image. This provides a probabilistic framework for the widely used simple/complex cell architecture. Our model learns binary features that work very well for object recognition on the “tiny images” data set. Even better features are obtained by then using standard binary RBM’s to learn a deeper model.'
volume: 9
URL: http://proceedings.mlr.press/v9/ranzato10a.html
PDF: http://proceedings.mlr.press/v9/ranzato10a/ranzato10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-ranzato10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Ranzato
given: Marc’Aurelio
- family: Krizhevsky
given: Alex
- family: Hinton
given: Geoffrey
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 621-628
id: ranzato10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 621
lastpage: 628
published: 2010-03-31 00:00:00 +0000
- title: 'Nonparametric prior for adaptive sparsity'
abstract: 'For high-dimensional problems various parametric priors have been proposed to promote sparse solutions. While parametric priors have shown considerable success, they are not very robust in adapting to varying degrees of sparsity. In this work we propose a discrete mixture prior which is partially nonparametric. The right structure for the prior and the amount of sparsity are estimated directly from the data. Our experiments show that the proposed prior adapts to sparsity much better than its parametric counterparts. We apply the proposed method to classification of high-dimensional microarray datasets.'
volume: 9
URL: http://proceedings.mlr.press/v9/raykar10a.html
PDF: http://proceedings.mlr.press/v9/raykar10a/raykar10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-raykar10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Raykar
given: Vikas
- family: Zhao
given: Linda
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 629-636
id: raykar10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 629
lastpage: 636
published: 2010-03-31 00:00:00 +0000
- title: 'Convexity of Proper Composite Binary Losses'
abstract: 'A composite loss assigns a penalty to a real-valued prediction by associating the prediction with a probability via a link function then applying a class probability estimation (CPE) loss. If the risk for a composite loss is always minimised by predicting the value associated with the true class probability the composite loss is proper. We provide a novel, explicit and complete characterisation of the convexity of any proper composite loss in terms of its link and its “weight function” associated with its proper CPE loss.'
volume: 9
URL: http://proceedings.mlr.press/v9/reid10a.html
PDF: http://proceedings.mlr.press/v9/reid10a/reid10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-reid10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Reid
given: Mark
- family: Williamson
given: Robert
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 637-644
id: reid10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 637
lastpage: 644
published: 2010-03-31 00:00:00 +0000
- title: 'Gaussian processes with monotonicity information'
abstract: 'A method for using monotonicity information in multivariate Gaussian process regression and classification is proposed. Monotonicity information is introduced with virtual derivative observations, and the resulting posterior is approximated with expectation propagation. Behaviour of the method is illustrated with artificial regression examples, and the method is used in a real world health care classification problem to include monotonicity information with respect to one of the covariates.'
volume: 9
URL: http://proceedings.mlr.press/v9/riihimaki10a.html
PDF: http://proceedings.mlr.press/v9/riihimaki10a/riihimaki10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-riihimaki10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Riihimäki
given: Jaakko
- family: Vehtari
given: Aki
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 645-652
id: riihimaki10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 645
lastpage: 652
published: 2010-03-31 00:00:00 +0000
- title: 'A Regularization Approach to Nonlinear Variable Selection'
abstract: 'In this paper we consider a regularization approach to variable selection when the regression function depends nonlinearly on a few input variables. The proposed method is based on a regularized least square estimator penalizing large values of the partial derivatives. An efficient iterative procedure is proposed to solve the underlying variational problem, and its convergence is proved. The empirical properties of the obtained estimator are tested both for prediction and variable selection. The algorithm compares favorably to more standard ridge regression and L1 regularization schemes.'
volume: 9
URL: http://proceedings.mlr.press/v9/rosasco10a.html
PDF: http://proceedings.mlr.press/v9/rosasco10a/rosasco10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-rosasco10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Rosasco
given: Lorenzo
- family: Santoro
given: Matteo
- family: Mosci
given: Sofia
- family: Verri
given: Alessandro
- family: Villa
given: Silvia
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 653-660
id: rosasco10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 653
lastpage: 660
published: 2010-03-31 00:00:00 +0000
- title: 'Efficient Reductions for Imitation Learning'
abstract: 'Imitation Learning, while applied successfully on many large real-world problems, is typically addressed as a standard supervised learning problem, where it is assumed the training and testing data are i.i.d. This is not true in imitation learning as the learned policy influences the future test inputs (states) upon which it will be tested. We show that this leads to compounding errors and a regret bound that grows quadratically in the time horizon of the task. We propose two alternative algorithms for imitation learning where training occurs over several episodes of interaction. These two approaches share in common that the learner’s policy is slowly modified from executing the expert’s policy to the learned policy. We show that this leads to stronger performance guarantees and demonstrate the improved performance on two challenging problems: training a learner to play 1) a 3D racing game (Super Tux Kart) and 2) Mario Bros., given input images from the games and corresponding actions taken by a human expert and near-optimal planner, respectively.'
volume: 9
URL: http://proceedings.mlr.press/v9/ross10a.html
PDF: http://proceedings.mlr.press/v9/ross10a/ross10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-ross10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Ross
given: Stephane
- family: Bagnell
given: Drew
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 661-668
id: ross10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 661
lastpage: 668
published: 2010-03-31 00:00:00 +0000
- title: 'Approximate parameter inference in a stochastic reaction-diffusion model'
abstract: 'We present an approximate inference approach to parameter estimation in a spatio-temporal stochastic process of the reaction-diffusion type. The continuous space limit of an inference method for Markov jump processes leads to an approximation which is related to a spatial Gaussian process. An efficient solution in feature space using a Fourier basis is applied to inference on simulational data.'
volume: 9
URL: http://proceedings.mlr.press/v9/ruttor10a.html
PDF: http://proceedings.mlr.press/v9/ruttor10a/ruttor10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-ruttor10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Ruttor
given: Andreas
- family: Opper
given: Manfred
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 669-676
id: ruttor10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 669
lastpage: 676
published: 2010-03-31 00:00:00 +0000
- title: 'Active Sequential Learning with Tactile Feedback'
abstract: 'We consider the problem of tactile discrimination, with the goal of estimating an underlying state parameter in a sequential setting. If the data is continuous and high-dimensional, collecting enough representative data samples becomes difficult. We present a framework that uses active learning to help with the sequential gathering of data samples, using information-theoretic criteria to find optimal actions at each time step. We consider two approaches to recursively update the state parameter belief: an analytical Gaussian approximation and a Monte Carlo sampling method. We show how both active frameworks improve convergence, demonstrating results on a real robotic hand-arm system that estimates the viscosity of liquids from tactile feedback data.'
volume: 9
URL: http://proceedings.mlr.press/v9/saal10a.html
PDF: http://proceedings.mlr.press/v9/saal10a/saal10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-saal10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Saal
given: Hannes
- family: Ting
given: Jo-Anne
- family: Vijayakumar
given: Sethu
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 677-684
id: saal10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 677
lastpage: 684
published: 2010-03-31 00:00:00 +0000
- title: 'Reducing Label Complexity by Learning From Bags'
abstract: 'We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.'
volume: 9
URL: http://proceedings.mlr.press/v9/sabato10a.html
PDF: http://proceedings.mlr.press/v9/sabato10a/sabato10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-sabato10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Sabato
given: Sivan
- family: Srebro
given: Nathan
- family: Tishby
given: Naftali
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 685-692
id: sabato10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 685
lastpage: 692
published: 2010-03-31 00:00:00 +0000
- title: 'Efficient Learning of Deep Boltzmann Machines'
abstract: 'We present a new approximate inference algorithm for Deep Boltzmann Machines (DBM’s), a generative model with many layers of hidden variables. The algorithm learns a separate “recognition” model that is used to quickly initialize, in a single bottom-up pass, the values of the latent variables in all hidden layers. We show that using such a recognition model, followed by a combined top-down and bottom-up pass, it is possible to efficiently learn a good generative model of high-dimensional highly-structured sensory input. We show that the additional computations required by incorporating top-down feedback play a critical role in the performance of a DBM, both as a generative and discriminative model. Moreover, inference is at most three times slower than the approximate inference in a Deep Belief Network (DBN), making large-scale learning of DBM’s practical. Finally, we demonstrate that the DBM’s trained using the proposed approximate inference algorithm perform well compared to DBN’s and SVM’s on the MNIST handwritten digit, OCR English letters, and NORB visual object recognition tasks.'
volume: 9
URL: http://proceedings.mlr.press/v9/salakhutdinov10a.html
PDF: http://proceedings.mlr.press/v9/salakhutdinov10a/salakhutdinov10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-salakhutdinov10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Salakhutdinov
given: Ruslan
- family: Larochelle
given: Hugo
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 693-700
id: salakhutdinov10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 693
lastpage: 700
published: 2010-03-31 00:00:00 +0000
- title: 'Factorized Orthogonal Latent Spaces'
abstract: 'Existing approaches to multi-view learning are particularly effective when the views are either independent (i.e., multi-kernel approaches) or fully dependent (i.e., shared latent spaces). However, in real scenarios, these assumptions are almost never truly satisfied. Recently, two methods have attempted to tackle this problem by factorizing the information and learning separate latent spaces for modeling the shared (i.e., correlated) and private (i.e., independent) parts of the data. However, these approaches are very sensitive to parameter settings or initialization. In this paper we propose a robust approach to factorizing the latent space into shared and private spaces by introducing orthogonality constraints, which penalize redundant latent representations. Furthermore, unlike previous approaches, we simultaneously learn the structure and dimensionality of the latent spaces by relying on a regularizer that encourages the latent space of each data stream to be low dimensional. To demonstrate the benefits of our approach, we apply it to two existing shared latent space models that assume full dependence of the views, the sGPLVM and the sKIE, and show that our constraints improve the performance of these models on the task of pose estimation from monocular images.'
volume: 9
URL: http://proceedings.mlr.press/v9/salzmann10a.html
PDF: http://proceedings.mlr.press/v9/salzmann10a/salzmann10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-salzmann10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Salzmann
given: Mathieu
- family: Ek
given: Carl Henrik
- family: Urtasun
given: Raquel
- family: Darrell
given: Trevor
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 701-708
id: salzmann10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 701
lastpage: 708
published: 2010-03-31 00:00:00 +0000
- title: 'Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials'
abstract: 'Previous work has examined structure learning in log-linear models with L1-regularization, largely focusing on the case of pairwise potentials. In this work we consider the case of models with potentials of arbitrary order, but that satisfy a hierarchical constraint. We enforce the hierarchical constraint using group L1-regularization with overlapping groups, and an active set method that enforces hierarchical inclusion allows us to tractably consider the exponential number of higher-order potentials. We use a spectral projected gradient method as a sub-routine for solving the overlapping group L1-regularization problem, and make use of a sparse version of Dykstra’s algorithm to compute the projection. Our experiments indicate that this model gives equal or better test set likelihood compared to previous models.'
volume: 9
URL: http://proceedings.mlr.press/v9/schmidt10a.html
PDF: http://proceedings.mlr.press/v9/schmidt10a/schmidt10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-schmidt10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Schmidt
given: Mark
- family: Murphy
given: Kevin
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 709-716
id: schmidt10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 709
lastpage: 716
published: 2010-03-31 00:00:00 +0000
- title: 'Polynomial-Time Exact Inference in NP-Hard Binary MRFs via Reweighted Perfect Matching'
abstract: 'We develop a new form of reweighting (Wainwright et al., 2005) to leverage the relationship between Ising spin glasses and perfect matchings into a novel technique for the exact computation of MAP states in hitherto intractable binary Markov random fields. Our method solves an n by n lattice with external field and random couplings much faster, and for larger n, than the best competing algorithms. It empirically scales as O(n³) even though this problem is NP-hard and non-approximable in polynomial time. We discuss limitations of our current implementation and propose ways to overcome them.'
volume: 9
URL: http://proceedings.mlr.press/v9/schraudolph10a.html
PDF: http://proceedings.mlr.press/v9/schraudolph10a/schraudolph10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-schraudolph10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Schraudolph
given: Nic
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 717-724
id: schraudolph10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 717
lastpage: 724
published: 2010-03-31 00:00:00 +0000
- title: 'Dense Message Passing for Sparse Principal Component Analysis'
abstract: 'We describe a novel inference algorithm for sparse Bayesian PCA with a zero-norm prior on the model parameters. Bayesian inference is very challenging in probabilistic models of this type. MCMC procedures are too slow to be practical in a very high-dimensional setting and standard mean-field variational Bayes algorithms are ineffective. We adopt a dense message passing algorithm similar to algorithms developed in the statistical physics community and previously applied to inference problems in coding and sparse classification. The algorithm achieves near-optimal performance on synthetic data for which a statistical mechanics theory of optimal learning can be derived. We also study two gene expression datasets used in previous studies of sparse PCA. We find our method performs better than one published algorithm and comparably to a second.'
volume: 9
URL: http://proceedings.mlr.press/v9/sharp10a.html
PDF: http://proceedings.mlr.press/v9/sharp10a/sharp10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-sharp10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Sharp
given: Kevin
- family: Rattray
given: Magnus
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 725-732
id: sharp10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 725
lastpage: 732
published: 2010-03-31 00:00:00 +0000
- title: 'Empirical Bernstein Boosting'
abstract: 'Concentration inequalities that incorporate variance information (such as Bernstein’s or Bennett’s inequality) are often significantly tighter than counterparts (such as Hoeffding’s inequality) that disregard variance. Nevertheless, many state-of-the-art machine learning algorithms for classification problems like AdaBoost and support vector machines (SVMs) extensively use Hoeffding’s inequalities to justify empirical risk minimization and its variants. This article proposes a novel boosting algorithm based on a recently introduced principle, sample variance penalization, which is motivated by an empirical version of Bernstein’s inequality. This framework leads to an efficient algorithm that is as easy to implement as AdaBoost while producing a strict generalization. Experiments on a large number of datasets show significant performance gains over AdaBoost. This paper shows that sample variance penalization could be a viable alternative to empirical risk minimization.'
volume: 9
URL: http://proceedings.mlr.press/v9/shivaswamy10a.html
PDF: http://proceedings.mlr.press/v9/shivaswamy10a/shivaswamy10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-shivaswamy10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Shivaswamy
given: Pannagadatta
- family: Jebara
given: Tony
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 733-740
id: shivaswamy10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 733
lastpage: 740
published: 2010-03-31 00:00:00 +0000
- title: 'Reduced-Rank Hidden Markov Models'
abstract: 'Hsu et al. (2009) recently proposed an efficient, accurate spectral learning algorithm for Hidden Markov Models (HMMs). In this paper we relax their assumptions and prove a tighter finite-sample error bound for the case of Reduced-Rank HMMs, i.e., HMMs with low-rank transition matrices. Since rank-k RR-HMMs are a larger class of models than k-state HMMs while being equally efficient to work with, this relaxation greatly increases the learning algorithm’s scope. In addition, we generalize the algorithm and bounds to models where multiple observations are needed to disambiguate state, and to models that emit multivariate real-valued observations. Finally we prove consistency for learning Predictive State Representations, an even larger class of models. Experiments on synthetic data and a toy video, as well as on difficult robot vision data, yield accurate models that compare favorably with alternatives in simulation quality and prediction accuracy.'
volume: 9
URL: http://proceedings.mlr.press/v9/siddiqi10a.html
PDF: http://proceedings.mlr.press/v9/siddiqi10a/siddiqi10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-siddiqi10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Siddiqi
given: Sajid
- family: Boots
given: Byron
- family: Gordon
given: Geoffrey
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 741-748
id: siddiqi10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 741
lastpage: 748
published: 2010-03-31 00:00:00 +0000
- title: 'Detecting Weak but Hierarchically-Structured Patterns in Networks'
abstract: 'The ability to detect weak distributed activation patterns in networks is critical to several applications, such as identifying the onset of anomalous activity or incipient congestion in the Internet, or faint traces of a biochemical spread by a sensor network. This is a challenging problem since weak distributed patterns can be invisible in per node statistics as well as a global network-wide aggregate. Most prior work considers situations in which the activation/non-activation of each node is statistically independent, but this is unrealistic in many problems. In this paper, we consider structured patterns arising from statistical dependencies in the activation process. Our contributions are three-fold. First, we propose a sparsifying transform that succinctly represents structured activation patterns that conform to a hierarchical dependency graph. Second, we establish that the proposed transform facilitates detection of very weak activation patterns that cannot be detected with existing methods. Third, we show that the structure of the hierarchical dependency graph governing the activation process, and hence the network transform, can be learnt from very few (logarithmic in network size) independent snapshots of network activity.'
volume: 9
URL: http://proceedings.mlr.press/v9/singh10a.html
PDF: http://proceedings.mlr.press/v9/singh10a/singh10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-singh10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Singh
given: Aarti
- family: Nowak
given: Robert
- family: Calderbank
given: Robert
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 749-756
id: singh10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 749
lastpage: 756
published: 2010-03-31 00:00:00 +0000
- title: 'Inference of Sparse Networks with Unobserved Variables. Application to Gene Regulatory Networks'
abstract: 'Networks are becoming a unifying framework for modeling complex systems and network inference problems are frequently encountered in many fields. Here, I develop and apply a generative approach to network inference (RCweb) for the case when the network is sparse and the latent (not observed) variables affect the observed ones. From all possible factor analysis (FA) decompositions explaining the variance in the data, RCweb selects the FA decomposition that is consistent with a sparse underlying network. The sparsity constraint is imposed by a novel method that significantly outperforms (in terms of accuracy, robustness to noise, complexity scaling and computational efficiency) methods using l1 norm relaxation such as K-SVD and l1-based sparse principal component analysis (PCA). Results from simulated models demonstrate that RCweb recovers exactly the model structures for sparsity as low (as non-sparse) as 50% and with ratio of unobserved to observed variables as high as 2. RCweb is robust to noise, with gradual decrease in the parameter ranges as the noise level increases.'
volume: 9
URL: http://proceedings.mlr.press/v9/slavov10a.html
PDF: http://proceedings.mlr.press/v9/slavov10a/slavov10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-slavov10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Slavov
given: Nikolai
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 757-764
id: slavov10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 757
lastpage: 764
published: 2010-03-31 00:00:00 +0000
- title: 'Nonparametric Tree Graphical Models'
abstract: 'We introduce a nonparametric representation for graphical models on trees which expresses marginals as Hilbert space embeddings and conditionals as embedding operators. This formulation allows us to define a graphical model solely on the basis of the feature space representation of its variables. Thus, this nonparametric model can be applied to general domains where kernels are defined, handling challenging cases such as discrete variables whose domains are huge, or very complex, non-Gaussian continuous distributions. We also derive kernel belief propagation, a Hilbert-space algorithm for performing inference in our model. We show that our method outperforms state-of-the-art techniques in a cross-lingual document retrieval task and a camera rotation estimation problem.'
volume: 9
URL: http://proceedings.mlr.press/v9/song10a.html
PDF: http://proceedings.mlr.press/v9/song10a/song10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-song10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Song
given: Le
- family: Gretton
given: Arthur
- family: Guestrin
given: Carlos
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 765-772
id: song10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 765
lastpage: 772
published: 2010-03-31 00:00:00 +0000
- title: 'On the relation between universality, characteristic kernels and RKHS embedding of measures'
abstract: 'Universal kernels have been shown to play an important role in the achievability of the Bayes risk by many kernel-based algorithms that include binary classification, regression, etc. In this paper, we propose a notion of universality that generalizes the notions introduced by Steinwart and Micchelli et al. and study the necessary and sufficient conditions for a kernel to be universal. We show that all these notions of universality are closely linked to the injective embedding of a certain class of Borel measures into a reproducing kernel Hilbert space (RKHS). By exploiting this relation between universality and the embedding of Borel measures into an RKHS, we establish the relation between universal and characteristic kernels. The latter have been proposed in the context of the RKHS embedding of probability measures, used in statistical applications like homogeneity testing, independence testing, etc.'
volume: 9
URL: http://proceedings.mlr.press/v9/sriperumbudur10a.html
PDF: http://proceedings.mlr.press/v9/sriperumbudur10a/sriperumbudur10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-sriperumbudur10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Sriperumbudur
given: Bharath
- family: Fukumizu
given: Kenji
- family: Lanckriet
given: Gert
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 773-780
id: sriperumbudur10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 773
lastpage: 780
published: 2010-03-31 00:00:00 +0000
- title: 'Conditional Density Estimation via Least-Squares Density Ratio Estimation'
abstract: 'Estimating the conditional mean of an input-output relation is the goal of regression. However, regression analysis is not sufficiently informative if the conditional distribution has multi-modality, is highly asymmetric, or contains heteroscedastic noise. In such scenarios, estimating the conditional distribution itself would be more useful. In this paper, we propose a novel method of conditional density estimation that is suitable for multi-dimensional continuous variables. The basic idea of the proposed method is to express the conditional density in terms of the density ratio and the ratio is directly estimated without going through density estimation. Experiments using benchmark and robot transition datasets illustrate the usefulness of the proposed approach.'
volume: 9
URL: http://proceedings.mlr.press/v9/sugiyama10a.html
PDF: http://proceedings.mlr.press/v9/sugiyama10a/sugiyama10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-sugiyama10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Sugiyama
given: Masashi
- family: Takeuchi
given: Ichiro
- family: Suzuki
given: Taiji
- family: Kanamori
given: Takafumi
- family: Hachiya
given: Hirotaka
- family: Okanohara
given: Daisuke
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 781-788
id: sugiyama10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 781
lastpage: 788
published: 2010-03-31 00:00:00 +0000
- title: 'On the Convergence Properties of Contrastive Divergence'
abstract: 'Contrastive Divergence (CD) is a popular method for estimating the parameters of Markov Random Fields (MRFs) by rapidly approximating an intractable term in the gradient of the log probability. Despite CD’s empirical success, little is known about its theoretical convergence properties. In this paper, we analyze the CD-1 update rule for Restricted Boltzmann Machines (RBMs) with binary variables. We show that this update is not the gradient of any function, and construct a counterintuitive “regularization function” that causes CD learning to cycle indefinitely. Nonetheless, we show that the regularized CD update has a fixed point for a large class of regularization functions using Brouwer’s fixed point theorem.'
volume: 9
URL: http://proceedings.mlr.press/v9/sutskever10a.html
PDF: http://proceedings.mlr.press/v9/sutskever10a/sutskever10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-sutskever10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Sutskever
given: Ilya
- family: Tieleman
given: Tijmen
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 789-795
id: sutskever10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 789
lastpage: 795
published: 2010-03-31 00:00:00 +0000
- title: 'Inference and Learning in Networks of Queues'
abstract: 'Probabilistic models of the performance of computer systems are useful both for predicting system performance in new conditions, and for diagnosing past performance problems. The most popular performance models are networks of queues. However, no current methods exist for parameter estimation or inference in networks of queues with missing data. In this paper, we present a novel viewpoint that combines queueing networks and graphical models, allowing Markov chain Monte Carlo to be applied. We demonstrate the effectiveness of our sampler on real-world data from a benchmark Web application.'
volume: 9
URL: http://proceedings.mlr.press/v9/sutton10a.html
PDF: http://proceedings.mlr.press/v9/sutton10a/sutton10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-sutton10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Sutton
given: Charles
- family: Jordan
given: Michael I.
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 796-803
id: sutton10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 796
lastpage: 803
published: 2010-03-31 00:00:00 +0000
- title: 'Sufficient Dimension Reduction via Squared-loss Mutual Information Estimation'
abstract: 'The goal of sufficient dimension reduction in supervised learning is to find the low dimensional subspace of input features that is "sufficient" for predicting output values. In this paper, we propose a novel sufficient dimension reduction method using a squared-loss variant of mutual information as a dependency measure. We utilize an analytic approximator of squared-loss mutual information based on density ratio estimation, which is shown to possess suitable convergence properties. We then develop a natural gradient algorithm for sufficient subspace search. Numerical experiments show that the proposed method compares favorably with existing dimension reduction approaches.'
volume: 9
URL: http://proceedings.mlr.press/v9/suzuki10a.html
PDF: http://proceedings.mlr.press/v9/suzuki10a/suzuki10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-suzuki10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Suzuki
given: Taiji
- family: Sugiyama
given: Masashi
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 804-811
id: suzuki10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 804
lastpage: 811
published: 2010-03-31 00:00:00 +0000
- title: 'HOP-MAP: Efficient Message Passing with High Order Potentials'
abstract: 'There is a growing interest in building probabilistic models with high order potentials (HOPs), or interactions, among discrete variables. Message passing inference in such models generally takes time exponential in the size of the interaction, but in some cases maximum a posteriori (MAP) inference can be carried out efficiently. We build upon such results, introducing two new classes, including composite HOPs that allow us to flexibly combine tractable HOPs using simple logical switching rules. We present efficient message update algorithms for the new HOPs, and we improve upon the efficiency of message updates for a general class of existing HOPs. Importantly, we present both new and existing HOPs in a common representation; performing inference with any combination of these HOPs requires no change of representations or new derivations.'
volume: 9
URL: http://proceedings.mlr.press/v9/tarlow10a.html
PDF: http://proceedings.mlr.press/v9/tarlow10a/tarlow10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-tarlow10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Tarlow
given: Daniel
- family: Givoni
given: Inmar
- family: Zemel
given: Richard
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 812-819
id: tarlow10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 812
lastpage: 819
published: 2010-03-31 00:00:00 +0000
- title: 'Hartigan’s Method: k-means Clustering without Voronoi'
abstract: 'Hartigan’s method for k-means clustering is the following greedy heuristic: select a point, and optimally reassign it. This paper develops two other formulations of the heuristic, one leading to a number of consistency properties, the other showing that the data partition is always quite separated from the induced Voronoi partition. A characterization of the volume of this separation is provided. Empirical tests verify not only good optimization performance relative to Lloyd’s method, but also good running time.'
volume: 9
URL: http://proceedings.mlr.press/v9/telgarsky10a.html
PDF: http://proceedings.mlr.press/v9/telgarsky10a/telgarsky10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-telgarsky10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Telgarsky
given: Matus
- family: Vattani
given: Andrea
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 820-827
id: telgarsky10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 820
lastpage: 827
published: 2010-03-31 00:00:00 +0000
- title: 'Learning Policy Improvements with Path Integrals'
abstract: 'With the goal to generate more scalable algorithms with higher efficiency and fewer open parameters, reinforcement learning (RL) has recently moved towards combining classical techniques from optimal control and dynamic programming with modern learning techniques from statistical estimation theory. In this vein, this paper suggests using the framework of stochastic optimal control with path integrals to derive a novel approach to RL with parametrized policies. While solidly grounded in value function estimation and optimal control based on the stochastic Hamilton-Jacobi-Bellman (HJB) equations, policy improvements can be transformed into an approximation problem of a path integral which has no open parameters other than the exploration noise. The resulting algorithm can be conceived of as model-based, semi-model-based, or even model free, depending on how the learning problem is structured. Our new algorithm demonstrates interesting similarities with previous RL research in the framework of probability matching and provides intuition for why the slightly heuristically motivated probability matching approach can actually perform well. Empirical evaluations demonstrate significant performance improvements over gradient-based policy learning and scalability to high-dimensional control problems. We believe that Policy Improvement with Path Integrals (PI^2) currently offers one of the most efficient, numerically robust, and easy to implement algorithms for RL based on trajectory roll-outs.'
volume: 9
URL: http://proceedings.mlr.press/v9/theodorou10a.html
PDF: http://proceedings.mlr.press/v9/theodorou10a/theodorou10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-theodorou10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Theodorou
given: Evangelos
- family: Buchli
given: Jonas
- family: Schaal
given: Stefan
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 828-835
id: theodorou10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 828
lastpage: 835
published: 2010-03-31 00:00:00 +0000
- title: 'Unsupervised Aggregation for Classification Problems with Large Numbers of Categories'
abstract: 'Classification problems with a very large or unbounded set of output categories are common in many areas such as natural language and image processing. In order to improve accuracy on these tasks, it is natural for a decision-maker to combine predictions from various sources. However, supervised data needed to fit an aggregation model is often difficult to obtain, especially if needed for multiple domains. Therefore, we propose a generative model for unsupervised aggregation which exploits the agreement signal to estimate the expertise of individual judges. Due to the large output space size, this aggregation model cannot encode expertise of constituent judges with respect to every category for all problems. Consequently, we extend it by incorporating the notion of category types to account for variability of the judge expertise depending on the type. The viability of our approach is demonstrated both on synthetic experiments and on a practical task of syntactic parser aggregation.'
volume: 9
URL: http://proceedings.mlr.press/v9/titov10a.html
PDF: http://proceedings.mlr.press/v9/titov10a/titov10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-titov10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Titov
given: Ivan
- family: Klementiev
given: Alexandre
- family: Small
given: Kevin
- family: Roth
given: Dan
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 836-843
id: titov10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 836
lastpage: 843
published: 2010-03-31 00:00:00 +0000
- title: 'Bayesian Gaussian Process Latent Variable Model'
abstract: 'We introduce a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction. This method allows us to variationally integrate out the input variables of the Gaussian process and compute a lower bound on the exact marginal likelihood of the nonlinear latent variable model. The maximization of the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear latent space. We demonstrate our method on real world datasets. The focus in this paper is on dimensionality reduction problems, but the methodology is more general. For example, our algorithm is immediately applicable for training Gaussian process models in the presence of missing or uncertain inputs.'
volume: 9
URL: http://proceedings.mlr.press/v9/titsias10a.html
PDF: http://proceedings.mlr.press/v9/titsias10a/titsias10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-titsias10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Titsias
given: Michalis
- family: Lawrence
given: Neil D.
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 844-851
id: titsias10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 844
lastpage: 851
published: 2010-03-31 00:00:00 +0000
- title: 'A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping'
abstract: 'A Markov-Chain Monte Carlo based algorithm is provided to solve the simultaneous localization and mapping (SLAM) problem with general dynamical and observation models under open-loop control and provided that the map-representation is finite dimensional. To our knowledge this is the first provably consistent yet (close-to) practical solution to this problem. The superiority of our algorithm over alternative SLAM algorithms is demonstrated in a difficult loop closing situation.'
volume: 9
URL: http://proceedings.mlr.press/v9/torma10a.html
PDF: http://proceedings.mlr.press/v9/torma10a/torma10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-torma10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Torma
given: Peter
- family: György
given: András
- family: Szepesvári
given: Csaba
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 852-859
id: torma10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 852
lastpage: 859
published: 2010-03-31 00:00:00 +0000
- title: 'Learning Causal Structure from Overlapping Variable Sets'
abstract: 'We present an algorithm named cSAT+ for learning the causal structure in a domain from datasets measuring different variable sets. The algorithm outputs a graph with edges corresponding to all possible pairwise causal relations between two variables, named Pairwise Causal Graph (PCG). Examples of interesting inferences include the induction of the absence or presence of some causal relation between two variables never measured together. cSAT+ converts the problem to a series of SAT problems, obtaining leverage from the efficiency of state-of-the-art solvers. In our empirical evaluation, it is shown to outperform ION, the first algorithm solving a similar but more general problem, by two orders of magnitude.'
volume: 9
URL: http://proceedings.mlr.press/v9/triantafillou10a.html
PDF: http://proceedings.mlr.press/v9/triantafillou10a/triantafillou10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-triantafillou10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Triantafillou
given: Sofia
- family: Tsamardinos
given: Ioannis
- family: Tollis
given: Ioannis
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 860-867
id: triantafillou10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 860
lastpage: 867
published: 2010-03-31 00:00:00 +0000
- title: 'State-Space Inference and Learning with Gaussian Processes'
abstract: 'State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model.'
volume: 9
URL: http://proceedings.mlr.press/v9/turner10a.html
PDF: http://proceedings.mlr.press/v9/turner10a/turner10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-turner10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Turner
given: Ryan
- family: Deisenroth
given: Marc
- family: Rasmussen
given: Carl
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 868-875
id: turner10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 868
lastpage: 875
published: 2010-03-31 00:00:00 +0000
- title: 'Sequential Monte Carlo Samplers for Dirichlet Process Mixtures'
abstract: 'In this paper, we develop a novel online algorithm based on the Sequential Monte Carlo (SMC) samplers framework for posterior inference in Dirichlet Process Mixtures (DPM). Our method generalizes many sequential importance sampling approaches. It provides a computationally efficient improvement to particle filtering that is less prone to getting stuck in isolated modes. The proposed method is a particular SMC sampler that enables us to design sophisticated clustering update schemes, such as updating past trajectories of the particles in light of recent observations, and still ensures convergence to the true DPM target distribution asymptotically. Performance has been evaluated in a Bayesian Infinite Gaussian mixture density estimation problem and it is shown that the proposed algorithm outperforms conventional Monte Carlo approaches in terms of estimation variance and average log-marginal likelihood.'
volume: 9
URL: http://proceedings.mlr.press/v9/ulker10a.html
PDF: http://proceedings.mlr.press/v9/ulker10a/ulker10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-ulker10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Ulker
given: Yener
- family: Günsel
given: Bilge
- family: Cemgil
given: Taylan
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 876-883
id: ulker10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 876
lastpage: 883
published: 2010-03-31 00:00:00 +0000
- title: 'Guarantees for Approximate Incremental SVMs'
abstract: 'Assume a teacher provides examples one by one. An approximate incremental SVM computes a sequence of classifiers that are close to the true SVM solutions computed on the successive incremental training sets. We show that simple algorithms can satisfy an averaged accuracy criterion with a computational cost that scales as well as the best SVM algorithms with the number of examples. Finally, we exhibit some experiments highlighting the benefits of joining fast incremental optimization and curriculum and active learning (Schohn and Cohn, 2000; Bordes et al., 2005; Bengio et al., 2009).'
volume: 9
URL: http://proceedings.mlr.press/v9/usunier10a.html
PDF: http://proceedings.mlr.press/v9/usunier10a/usunier10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-usunier10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Usunier
given: Nicolas
- family: Bordes
given: Antoine
- family: Bottou
given: Léon
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 884-891
id: usunier10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 884
lastpage: 891
published: 2010-03-31 00:00:00 +0000
- title: 'An Alternative Prior Process for Nonparametric Bayesian Clustering'
abstract: 'Prior distributions play a crucial role in Bayesian approaches to clustering. Two commonly-used prior distributions are the Dirichlet and Pitman-Yor processes. In this paper, we investigate the predictive probabilities that underlie these processes, and the implicit “rich-get-richer” characteristic of the resulting partitions. We explore an alternative prior for nonparametric Bayesian clustering, the uniform process, for applications where the “rich-get-richer” property is undesirable. We also explore the cost of this new process: partitions are no longer exchangeable with respect to the ordering of variables. We present new asymptotic and simulation-based results for the clustering characteristics of the uniform process and compare these with known results for the Dirichlet and Pitman-Yor processes. Finally, we compare performance on a real document clustering task, demonstrating the practical advantage of the uniform process despite its lack of exchangeability over orderings.'
volume: 9
URL: http://proceedings.mlr.press/v9/wallach10a.html
PDF: http://proceedings.mlr.press/v9/wallach10a/wallach10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-wallach10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Wallach
given: Hanna
- family: Jensen
given: Shane
- family: Dicker
given: Lee
- family: Heller
given: Katherine
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 892-899
id: wallach10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 892
lastpage: 899
published: 2010-03-31 00:00:00 +0000
- title: 'A Potential-based Framework for Online Multi-class Learning with Partial Feedback'
abstract: 'We study the problem of online multi-class learning with partial feedback: in each trial of online learning, instead of providing the true class label for a given instance, the oracle will only reveal to the learner if the predicted class label is correct. We present a general framework for online multi-class learning with partial feedback that adapts the potential-based gradient descent approaches. The generality of the proposed framework is verified by the fact that Banditron is indeed a special case of our work if the potential function is set to be the squared L_2 norm of the weight vector. We propose an exponential gradient algorithm for online multi-class learning with partial feedback. Compared to the Banditron algorithm, the exponential gradient algorithm is advantageous in that its mistake bound is independent of the dimension of data, making it suitable for classifying high dimensional data. Our empirical study with four data sets shows that the proposed algorithm for online learning with partial feedback is more effective than the Banditron algorithm.'
volume: 9
URL: http://proceedings.mlr.press/v9/wang10a.html
PDF: http://proceedings.mlr.press/v9/wang10a/wang10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-wang10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Wang
given: Shijun
- family: Jin
given: Rong
- family: Valizadegan
given: Hamed
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 900-907
id: wang10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 900
lastpage: 907
published: 2010-03-31 00:00:00 +0000
- title: 'Online Passive-Aggressive Algorithms on a Budget'
abstract: 'In this paper a kernel-based online learning algorithm, which has both constant space and update time, is proposed. The approach is based on the popular online Passive-Aggressive (PA) algorithm. When used in conjunction with a kernel function, the number of support vectors in PA grows without bound when learning from noisy data streams. This implies unlimited memory and ever increasing model update and prediction time. To address this issue, the proposed budgeted PA algorithm maintains only a fixed number of support vectors. By introducing an additional constraint to the original PA optimization problem, a closed-form solution was derived for the support vector removal and model update. Using the hinge loss we developed several budgeted PA algorithms that can trade between accuracy and update cost. We also developed the ramp loss versions of both original and budgeted PA and showed that the resulting algorithms can be interpreted as the combination of active learning and hinge loss PA. All proposed algorithms were comprehensively tested on 7 benchmark data sets. The experiments showed that they are superior to the existing budgeted online algorithms. Even with modest budgets, the budgeted PA achieved very competitive accuracies to the non-budgeted PA and kernel perceptron algorithms.'
volume: 9
URL: http://proceedings.mlr.press/v9/wang10b.html
PDF: http://proceedings.mlr.press/v9/wang10b/wang10b.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-wang10b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Wang
given: Zhuang
- family: Vucetic
given: Slobodan
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 908-915
id: wang10b
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 908
lastpage: 915
published: 2010-03-31 00:00:00 +0000
- title: 'Structured Prediction Cascades'
abstract: 'Structured prediction tasks pose a fundamental trade-off between the need for model complexity to increase predictive power and the limited computational resources for inference in the exponentially-sized output spaces such models require. We formulate and develop structured prediction cascades: a sequence of increasingly complex models that progressively filter the space of possible outputs. We represent an exponentially large set of filtered outputs using max marginals and propose a novel convex loss function that balances filtering error with filtering efficiency. We provide generalization bounds for these loss functions and evaluate our approach on handwriting recognition and part-of-speech tagging. We find that the learned cascades are capable of reducing the complexity of inference by up to five orders of magnitude, enabling the use of models which incorporate higher order features and yield higher accuracy.'
volume: 9
URL: http://proceedings.mlr.press/v9/weiss10a.html
PDF: http://proceedings.mlr.press/v9/weiss10a/weiss10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-weiss10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Weiss
given: David
- family: Taskar
given: Benjamin
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 916-923
id: weiss10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 916
lastpage: 923
published: 2010-03-31 00:00:00 +0000
- title: 'Dependent Indian Buffet Processes'
abstract: 'Latent variable models represent hidden structure in observational data. To account for the distribution of the observational data changing over time, space or some other covariate, we need generalizations of latent variable models that explicitly capture this dependency on the covariate. A variety of such generalizations has been proposed for latent variable models based on the Dirichlet process. We address dependency on covariates in binary latent feature models, by introducing a dependent Indian buffet process. The model generates, for each value of the covariate, a binary random matrix with an unbounded number of columns. Evolution of the binary matrices over the covariate set is controlled by a hierarchical Gaussian process model. The choice of covariance functions controls the dependence structure and exchangeability properties of the model. We derive a Markov Chain Monte Carlo sampling algorithm for Bayesian inference, and provide experiments on both synthetic and real-world data. The experimental results show that explicit modeling of dependencies significantly improves accuracy of predictions.'
volume: 9
URL: http://proceedings.mlr.press/v9/williamson10a.html
PDF: http://proceedings.mlr.press/v9/williamson10a/williamson10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-williamson10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Williamson
given: Sinead
- family: Orbanz
given: Peter
- family: Ghahramani
given: Zoubin
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 924-931
id: williamson10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 924
lastpage: 931
published: 2010-03-31 00:00:00 +0000
- title: 'Modeling annotator expertise: Learning when everybody knows a bit of something'
abstract: 'Supervised learning from multiple labeling sources is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this problem when annotators may be unreliable (labels are noisy), but also their expertise varies depending on the data they observe (annotators may have knowledge about different parts of the input space). That is, an annotator may not be consistently accurate (or inaccurate) across the task domain. The presented approach produces classification and annotator models that allow us to provide estimates of the true labels and annotator variable expertise. We provide an analysis of the proposed model under various scenarios and show experimentally that annotator expertise can indeed vary in real tasks and that the presented approach provides clear advantages over previously introduced multi-annotator methods, which only consider general annotator characteristics.'
volume: 9
URL: http://proceedings.mlr.press/v9/yan10a.html
PDF: http://proceedings.mlr.press/v9/yan10a/yan10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-yan10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Yan
given: Yan
- family: Rosales
given: Romer
- family: Fung
given: Glenn
- family: Schmidt
given: Mark
- family: Hermosillo
given: Gerardo
- family: Bogoni
given: Luca
- family: Moy
given: Linda
- family: Dy
given: Jennifer
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 932-939
id: yan10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 932
lastpage: 939
published: 2010-03-31 00:00:00 +0000
- title: 'A highly efficient blocked Gibbs sampler reconstruction of multidimensional NMR spectra'
abstract: 'Projection Reconstruction Nuclear Magnetic Resonance (PR-NMR) is a new technique to generate multi-dimensional NMR spectra, which have discrete features that are relatively sparsely distributed in space. A small number of projections from lower dimensional NMR spectra are used to reconstruct the multi-dimensional NMR spectra. We propose an efficient algorithm which employs a blocked Gibbs sampler to accurately reconstruct NMR spectra. This statistical method generates samples in a Bayesian scheme. Our proposed algorithm is tested on a set of six projections derived from the three-dimensional 700 MHz HNCO spectrum of HasA, a 187-residue heme binding protein.'
volume: 9
URL: http://proceedings.mlr.press/v9/yoon10a.html
PDF: http://proceedings.mlr.press/v9/yoon10a/yoon10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-yoon10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Yoon
given: Ji Won
- family: Wilson
given: Simon
- family: Mok
given: K. Hun
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 940-947
id: yoon10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 940
lastpage: 947
published: 2010-03-31 00:00:00 +0000
- title: 'Risk Bounds for Levy Processes in the PAC-Learning Framework'
abstract: 'Levy processes play an important role in stochastic process theory. However, since samples are non-i.i.d., statistical learning results based on the i.i.d. scenarios cannot be utilized to study the risk bounds for Levy processes. In this paper, we present risk bounds for non-i.i.d. samples drawn from Levy processes in the PAC-learning framework. In particular, by using a concentration inequality for infinitely divisible distributions, we first prove that the function of risk error is Lipschitz continuous with high probability, and then by using a specific concentration inequality for Levy processes, we obtain the risk bounds for non-i.i.d. samples drawn from Levy processes without Gaussian components. Based on the resulting risk bounds, we analyze the factors that affect the convergence of the risk bounds and then prove the convergence.'
volume: 9
URL: http://proceedings.mlr.press/v9/zhang10a.html
PDF: http://proceedings.mlr.press/v9/zhang10a/zhang10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-zhang10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Zhang
given: Chao
- family: Tao
given: Dacheng
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 948-955
id: zhang10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 948
lastpage: 955
published: 2010-03-31 00:00:00 +0000
- title: 'Bayesian Online Learning for Multi-label and Multi-variate Performance Measures'
abstract: 'Many real world applications employ multi-variate performance measures and each example can belong to multiple classes. The currently most popular approaches train an SVM for each class, followed by ad hoc thresholding. Probabilistic models using Bayesian decision theory are also commonly adopted. In this paper, we propose a Bayesian online multi-label classification framework (BOMC) which learns a probabilistic linear classifier. The likelihood is modeled by a graphical model similar to TrueSkill^TM, and inference is based on Gaussian density filtering with expectation propagation. Using samples from the posterior, we label the testing data by maximizing the expected F_1-score. Our experiments on the Reuters1-v2 dataset show that BOMC compares favorably to the state-of-the-art online learners in macro-averaged F_1-score and training time.'
volume: 9
URL: http://proceedings.mlr.press/v9/zhang10b.html
PDF: http://proceedings.mlr.press/v9/zhang10b/zhang10b.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-zhang10b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Zhang
given: Xinhua
- family: Graepel
given: Thore
- family: Herbrich
given: Ralf
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 956-963
id: zhang10b
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 956
lastpage: 963
published: 2010-03-31 00:00:00 +0000
- title: 'Multi-Task Learning using Generalized t Process'
abstract: 'Multi-task learning seeks to improve the generalization performance of a learning task with the help of other related learning tasks. Among the multi-task learning methods proposed thus far, Bonilla et al.’s method provides a novel multi-task extension of the Gaussian process (GP) by using a task covariance matrix to model the relationships between tasks. However, learning the task covariance matrix directly has both computational and representational drawbacks. In this paper, we propose a Bayesian extension by modeling the task covariance matrix as a random matrix with an inverse-Wishart prior and integrating it out to achieve Bayesian model averaging. To make the computation feasible, we first give an alternative weight-space view of Bonilla et al.’s multi-task GP model and then integrate out the task covariance matrix in the model, leading to a multi-task generalized t process (MTGTP). For the likelihood, we use a generalized t noise model which, together with the generalized t process prior, brings about the robustness advantage as well as an analytical form for the marginal likelihood. In order to specify the inverse-Wishart prior, we use the maximum mean discrepancy (MMD) statistic to estimate the parameter matrix of the inverse-Wishart prior. Moreover, we investigate some theoretical properties of MTGTP, such as its asymptotic analysis and learning curve. Comparative experimental studies on two common multi-task learning applications show very promising results.'
volume: 9
URL: http://proceedings.mlr.press/v9/zhang10c.html
PDF: http://proceedings.mlr.press/v9/zhang10c/zhang10c.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-zhang10c.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Zhang
given: Yu
- family: Yeung
given: Dit-Yan
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 964-971
id: zhang10c
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 964
lastpage: 971
published: 2010-03-31 00:00:00 +0000
- title: 'Bayesian Generalized Kernel Models'
abstract: 'We propose a fully Bayesian approach for generalized kernel models (GKMs), which are extensions of generalized linear models in the feature space induced by a reproducing kernel. We place a mixture of a point-mass distribution and Silverman’s g-prior on the regression vector of GKMs. This mixture prior allows a fraction of the regression vector to be zero. Thus, it serves for sparse modeling and Bayesian computation. For inference, we exploit data augmentation methodology to develop a Markov chain Monte Carlo (MCMC) algorithm in which the reversible jump method is used for model selection and a Bayesian model averaging method is used for posterior prediction.'
volume: 9
URL: http://proceedings.mlr.press/v9/zhang10d.html
PDF: http://proceedings.mlr.press/v9/zhang10d/zhang10d.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-zhang10d.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Zhang
given: Zhihua
- family: Dai
given: Guang
- family: Wang
given: Donghui
- family: Jordan
given: Michael I.
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 972-979
id: zhang10d
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 972
lastpage: 979
published: 2010-03-31 00:00:00 +0000
- title: 'Matrix-Variate Dirichlet Process Mixture Models'
abstract: 'We are concerned with a multivariate response regression problem where the interest is in considering correlations both across response variates and across response samples. In this paper we develop a new Bayesian nonparametric model for such a setting based on Dirichlet process priors. Building on an additive kernel model, we allow each sample to have its own regression matrix. Although this overcomplete representation could in principle suffer from severe overfitting problems, we are able to provide effective control over the model via a matrix-variate Dirichlet process prior on the regression matrices. Our model is able to share statistical strength among regression matrices due to the clustering property of the Dirichlet process. We make use of a Markov chain Monte Carlo algorithm for inference and prediction. Compared with other Bayesian kernel models, our model has advantages in both computational and statistical efficiency.'
volume: 9
URL: http://proceedings.mlr.press/v9/zhang10e.html
PDF: http://proceedings.mlr.press/v9/zhang10e/zhang10e.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-zhang10e.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Zhang
given: Zhihua
- family: Dai
given: Guang
- family: Jordan
given: Michael I.
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 980-987
id: zhang10e
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 980
lastpage: 987
published: 2010-03-31 00:00:00 +0000
- title: 'Exclusive Lasso for Multi-task Feature Selection'
abstract: 'We propose a novel group regularization which we call exclusive lasso. Unlike the group lasso regularizer that assumes co-varying variables in groups, the proposed exclusive lasso regularizer models the scenario when variables in the same group compete with each other. Analysis is presented to illustrate the properties of the proposed regularizer. We present a framework for kernel-based multi-task feature selection built on the proposed exclusive lasso regularizer. An efficient algorithm is derived to solve the related optimization problem. Experiments with document categorization show that our approach outperforms state-of-the-art algorithms for multi-task feature selection.'
volume: 9
URL: http://proceedings.mlr.press/v9/zhou10a.html
PDF: http://proceedings.mlr.press/v9/zhou10a/zhou10a.pdf
edit: https://github.com/mlresearch/v9/edit/gh-pages/_posts/2010-03-31-zhou10a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- family: Zhou
given: Yang
- family: Jin
given: Rong
- family: Hoi
given: Steven Chu-Hong
editor:
- family: Teh
given: Yee Whye
- family: Titterington
given: Mike
address: Chia Laguna Resort, Sardinia, Italy
page: 988-995
id: zhou10a
issued:
date-parts:
- 2010
- 3
- 31
firstpage: 988
lastpage: 995
published: 2010-03-31 00:00:00 +0000