- title: 'Bayesian learning of joint distributions of objects'
abstract: 'There is increasing interest in broad application areas in defining flexible joint models for data having a variety of measurement scales, while also allowing data of complex types, such as functions, images and documents. We consider a general framework for nonparametric Bayes joint modeling through mixture models that incorporate dependence across data types through a joint mixing measure. The mixing measure is assigned a novel infinite tensor factorization (ITF) prior that allows flexible dependence in cluster allocation across data types. The ITF prior is formulated as a tensor product of stick-breaking processes. Focusing on a convenient special case corresponding to a Parafac factorization, we provide basic theory justifying the flexibility of the proposed prior. Focusing on ITF mixtures of product kernels, we develop a new Gibbs sampling algorithm for routine implementation relying on slice sampling. The methods are compared with alternative joint mixture models based on Dirichlet processes and related approaches through simulations and real data applications.'
note: 'Notable paper award'
volume: 31
URL: https://proceedings.mlr.press/v31/banerjee13a.html
PDF: http://proceedings.mlr.press/v31/banerjee13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-banerjee13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Anjishnu
family: Banerjee
- given: Jared
family: Murray
- given: David
family: Dunson
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 1-9
id: banerjee13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 1
lastpage: 9
published: 2013-04-29 00:00:00 +0000
- title: 'Permutation estimation and minimax rates of identifiability'
abstract: 'The problem of matching two sets of features appears in various tasks of computer vision and can often be formalized as a problem of permutation estimation. We address this problem from a statistical point of view and provide a theoretical analysis of the accuracy of several natural estimators. To this end, the notion of the minimax matching threshold is introduced and its expression is obtained as a function of the sample size, noise level and dimensionality. We consider the cases of homoscedastic and heteroscedastic noise and derive, in each case, upper bounds on the matching threshold of several estimators. These upper bounds are shown to be unimprovable in the homoscedastic setting. We also discuss the computational aspects of the estimators and provide some empirical evidence of their consistency on synthetic data sets.'
note: 'Notable paper award'
volume: 31
URL: https://proceedings.mlr.press/v31/collier13a.html
PDF: http://proceedings.mlr.press/v31/collier13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-collier13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Olivier
family: Collier
- given: Arnak
family: Dalalyan
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 10-19
id: collier13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 10
lastpage: 19
published: 2013-04-29 00:00:00 +0000
- title: 'A unifying representation for a class of dependent random measures'
abstract: 'We present a general construction for dependent random measures based on thinning Poisson processes on an augmented space. The framework is not restricted to dependent versions of a specific nonparametric model, but can be applied to all models that can be represented using completely random measures. Several existing dependent random measures can be seen as specific cases of this framework. Interesting properties of the resulting measures are derived and the efficacy of the framework is demonstrated by constructing a covariate-dependent latent feature model and topic model that obtain superior predictive performance.'
note: 'Notable paper award'
volume: 31
URL: https://proceedings.mlr.press/v31/foti13a.html
PDF: http://proceedings.mlr.press/v31/foti13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-foti13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Nicholas
family: Foti
- given: Joseph
family: Futoma
- given: Daniel
family: Rockmore
- given: Sinead
family: Williamson
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 20-28
id: foti13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 20
lastpage: 28
published: 2013-04-29 00:00:00 +0000
- title: 'Diagonal Orthant Multinomial Probit Models'
abstract: 'Bayesian classification commonly relies on probit models, with data augmentation algorithms used for posterior computation. By imputing latent Gaussian variables, one can often trivially adapt computational approaches used in Gaussian models. However, MCMC for multinomial probit (MNP) models can be inefficient in practice due to high posterior dependence between latent variables and parameters, and to difficulties in efficiently sampling latent variables when there are more than two categories. To address these problems, we propose a new class of diagonal orthant (DO) multinomial models. The key characteristics of these models include conditional independence of the latent variables given model parameters, avoidance of arbitrary identifiability restrictions, and simple expressions for category probabilities. We show substantially improved computational efficiency and comparable predictive performance to MNP. '
note: 'Notable paper award'
volume: 31
URL: https://proceedings.mlr.press/v31/johndrow13a.html
PDF: http://proceedings.mlr.press/v31/johndrow13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-johndrow13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: James
family: Johndrow
- given: David
family: Dunson
- given: Kristian
family: Lum
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 29-38
id: johndrow13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 29
lastpage: 38
published: 2013-04-29 00:00:00 +0000
- title: 'Distributed Learning of Gaussian Graphical Models via Marginal Likelihoods'
abstract: 'We consider distributed estimation of the inverse covariance matrix, also called the concentration matrix, in Gaussian graphical models. Traditional centralized estimation often requires iterative and expensive global inference and is therefore difficult in large distributed networks. In this paper, we propose a general framework for distributed estimation based on a maximum marginal likelihood (MML) approach. Each node independently computes a local estimate by maximizing a marginal likelihood defined with respect to data collected from its local neighborhood. Due to the non-convexity of the MML problem, we derive and consider solving a convex relaxation. The local estimates are then combined into a global estimate without the need for iterative message-passing between neighborhoods. We prove that this relaxed MML estimator is asymptotically consistent. Through numerical experiments on several synthetic and real-world data sets, we demonstrate that the two-hop version of the proposed estimator is significantly better than the one-hop version, and nearly closes the gap to the centralized maximum likelihood estimator in many situations.'
note: 'Notable paper award'
volume: 31
URL: https://proceedings.mlr.press/v31/meng13a.html
PDF: http://proceedings.mlr.press/v31/meng13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-meng13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Zhaoshi
family: Meng
- given: Dennis
family: Wei
- given: Ami
family: Wiesel
- given: Alfred
family: Hero
suffix: III
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 39-47
id: meng13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 39
lastpage: 47
published: 2013-04-29 00:00:00 +0000
- title: 'Sparse Principal Component Analysis for High Dimensional Multivariate Time Series'
abstract: 'We study sparse principal component analysis (sparse PCA) for high dimensional multivariate vector autoregressive (VAR) time series. By treating the transition matrix as a nuisance parameter, we show that sparse PCA can be applied directly to multivariate time series as if the data were i.i.d. generated. Under a double asymptotic framework in which both the length of the sample period T and dimensionality d of the time series can increase (with possibly d≫T), we provide explicit rates of convergence of the angle between the estimated and population leading eigenvectors of the time series covariance matrix. Our results suggest that the spectral norm of the transition matrix plays a pivotal role in determining the final rates of convergence. Implications of such a general result are further illustrated using concrete examples. The results of this paper have implications for a range of applications, including financial time series, biomedical imaging, and social media.'
note: 'Notable paper award'
volume: 31
URL: https://proceedings.mlr.press/v31/wang13a.html
PDF: http://proceedings.mlr.press/v31/wang13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-wang13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Zhaoran
family: Wang
- given: Fang
family: Han
- given: Han
family: Liu
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 48-56
id: wang13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 48
lastpage: 56
published: 2013-04-29 00:00:00 +0000
- title: 'A Competitive Test for Uniformity of Monotone Distributions'
abstract: 'We propose a test that takes random samples drawn from a monotone distribution and decides whether or not the distribution is uniform. The test is nearly optimal in that it uses at most $O(n\sqrt{\log n})$ samples, where $n$ is the number of samples that a genie who knew all but one bit about the underlying distribution would need for the same task. Furthermore, we show that any such test would require $\Omega(n\sqrt{\log n})$ samples for some distributions.'
volume: 31
URL: https://proceedings.mlr.press/v31/acharya13a.html
PDF: http://proceedings.mlr.press/v31/acharya13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-acharya13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Jayadev
family: Acharya
- given: Ashkan
family: Jafarpour
- given: Alon
family: Orlitsky
- given: Ananda
family: Suresh
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 57-65
id: acharya13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 57
lastpage: 65
published: 2013-04-29 00:00:00 +0000
- title: 'Clustering Oligarchies'
abstract: 'We investigate the extent to which clustering algorithms are robust to the addition of a small, potentially adversarial, set of points. Our analysis reveals radical differences in the robustness of popular clustering methods. k-means and several related techniques are robust when data is clusterable, and we provide a quantitative analysis capturing the precise relationship between clusterability and robustness. In contrast, common linkage-based algorithms and several standard objective-function-based clustering methods can be highly sensitive to the addition of a small set of points even when the data is highly clusterable. We call such sets of points oligarchies. Lastly, we show that the behavior with respect to oligarchies of the popular Lloyd’s method changes radically with the initialization technique.'
volume: 31
URL: https://proceedings.mlr.press/v31/ackerman13a.html
PDF: http://proceedings.mlr.press/v31/ackerman13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-ackerman13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Margareta
family: Ackerman
- given: Shai
family: Ben-David
- given: David
family: Loker
- given: Sivan
family: Sabato
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 66-74
id: ackerman13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 66
lastpage: 74
published: 2013-04-29 00:00:00 +0000
- title: 'Reconstructing ecological networks with hierarchical Bayesian regression and Mondrian processes'
abstract: 'Ecological systems consist of complex sets of interactions among species and their environment, the understanding of which has implications for predicting environmental response to perturbations such as invading species and climate change. However, the revelation of these interactions is not straightforward, nor are the interactions necessarily stable across space. Machine learning can enable the recovery of such complex, spatially varying interactions from relatively easily obtained species abundance data. Here, we describe a novel Bayesian regression and Mondrian process model (BRAMP) for reconstructing species interaction networks from observed field data. BRAMP enables robust inference of species interactions considering autocorrelation in species abundances and allowing for variation in the interactions across space. We evaluate the model on spatially explicit simulated data, produced using a trophic niche model combined with stochastic population dynamics. We compare the model’s performance against L1-penalized sparse regression (LASSO) and non-linear Bayesian networks with the BDe scoring scheme. Finally, we apply BRAMP to real ecological data.'
volume: 31
URL: https://proceedings.mlr.press/v31/aderhold13a.html
PDF: http://proceedings.mlr.press/v31/aderhold13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-aderhold13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Andrej
family: Aderhold
- given: Dirk
family: Husmeier
- given: V. Anne
family: Smith
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 75-84
id: aderhold13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 75
lastpage: 84
published: 2013-04-29 00:00:00 +0000
- title: 'Nyström Approximation for Large-Scale Determinantal Processes'
abstract: 'Determinantal point processes (DPPs) are appealing models for subset selection problems where diversity is desired. They offer surprisingly efficient inference, including sampling in $O(N^3)$ time and $O(N^2)$ space, where $N$ is the number of base items. However, in some applications, $N$ may grow so large that sampling from a DPP becomes computationally infeasible. This is especially true in settings where the DPP kernel matrix cannot be represented by a linear decomposition of low-dimensional feature vectors. In these cases, we propose applying the Nyström approximation to project the kernel matrix into a low-dimensional space. While theoretical guarantees for the Nyström approximation in terms of standard matrix norms have been previously established, we are concerned with probabilistic measures, like total variation distance between the DPP generated by a kernel matrix and the one generated by its Nyström approximation, that behave quite differently. In this paper we derive new error bounds for the Nyström-approximated DPP and present empirical results to corroborate them. We then demonstrate the Nyström-approximated DPP by applying it to a motion capture summarization task.'
volume: 31
URL: https://proceedings.mlr.press/v31/affandi13a.html
PDF: http://proceedings.mlr.press/v31/affandi13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-affandi13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Raja Hafiz
family: Affandi
- given: Alex
family: Kulesza
- given: Emily
family: Fox
- given: Ben
family: Taskar
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 85-98
id: affandi13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 85
lastpage: 98
published: 2013-04-29 00:00:00 +0000
- title: 'Further Optimal Regret Bounds for Thompson Sampling'
abstract: 'Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after several studies demonstrated it to have empirical performance comparable to or better than the state-of-the-art methods. In this paper, we provide a novel regret analysis for Thompson Sampling that proves the first near-optimal problem-independent bound of $O(\sqrt{NT\ln T})$ on the expected regret of this algorithm. Our novel martingale-based analysis techniques are conceptually simple, and easily extend to distributions other than the Beta distribution. For the version of Thompson Sampling that uses Gaussian priors, we prove a problem-independent bound of $O(\sqrt{NT\ln N})$ on the expected regret, and demonstrate the optimality of this bound by providing a matching lower bound. This lower bound of $\Omega(\sqrt{NT\ln N})$ is the first lower bound on the performance of a natural version of Thompson Sampling that is away from the general lower bound of $\Omega(\sqrt{NT})$ for the multi-armed bandit problem. Our near-optimal problem-independent bounds for Thompson Sampling solve a COLT 2012 open problem of Chapelle and Li. Additionally, our techniques simultaneously provide the optimal problem-dependent bound of $(1+\epsilon)\sum_i \frac{\ln T}{d(\mu_i, \mu_1)}+O(\frac{N}{\epsilon^2})$ on the expected regret. The optimal problem-dependent regret bound for this problem was first proven recently by Kaufmann et al. [2012].'
volume: 31
URL: https://proceedings.mlr.press/v31/agrawal13a.html
PDF: http://proceedings.mlr.press/v31/agrawal13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-agrawal13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Shipra
family: Agrawal
- given: Navin
family: Goyal
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 99-107
id: agrawal13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 99
lastpage: 107
published: 2013-04-29 00:00:00 +0000
- title: 'Distributed and Adaptive Darting Monte Carlo through Regenerations'
abstract: 'Darting Monte Carlo (DMC) is an MCMC procedure designed to effectively mix between multiple modes of a probability distribution. We propose an adaptive and distributed version of this method by using regenerations. This allows us to run multiple chains in parallel and adapt the shape of the jump regions as well as all other aspects of the Markov chain on the fly. We show that this significantly improves the performance of DMC because 1) a population of chains has a higher chance of finding the modes in the distribution, 2) jumping between modes becomes easier due to the adaptation of their shapes, 3) computation is much more efficient due to parallelization across multiple processors. While the curse of dimensionality is a challenge for both DMC and regeneration, we find that their combination ameliorates this issue slightly.'
volume: 31
URL: https://proceedings.mlr.press/v31/ahn13a.html
PDF: http://proceedings.mlr.press/v31/ahn13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-ahn13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Sungjin
family: Ahn
- given: Yutian
family: Chen
- given: Max
family: Welling
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 108-116
id: ahn13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 108
lastpage: 116
published: 2013-04-29 00:00:00 +0000
- title: 'Consensus Ranking with Signed Permutations'
abstract: 'Signed permutations (also known as the hyperoctahedral group) are used in modeling genome rearrangements. The algorithmic problems they raise are computationally demanding when not NP-hard. This paper presents a tractable algorithm for learning consensus ranking between signed permutations under the inversion distance. This can be extended to estimate a natural class of exponential models over the group of signed permutations. We investigate experimentally the efficiency of our algorithm for modeling data generated by random reversals.'
volume: 31
URL: https://proceedings.mlr.press/v31/arora13a.html
PDF: http://proceedings.mlr.press/v31/arora13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-arora13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Raman
family: Arora
- given: Marina
family: Meilă
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 117-125
id: arora13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 117
lastpage: 125
published: 2013-04-29 00:00:00 +0000
- title: 'Ultrahigh Dimensional Feature Screening via RKHS Embeddings'
abstract: 'Feature screening is a key step in handling ultrahigh dimensional data sets that are ubiquitous in modern statistical problems. Over the last decade, convex relaxation based approaches (e.g., Lasso/sparse additive model) have been extensively developed and analyzed for feature selection in the high dimensional regime. But in the ultrahigh dimensional regime, these approaches suffer from several problems, both computationally and statistically. To overcome these issues, in this paper, we propose a novel Hilbert space embedding based approach to independence screening for ultrahigh dimensional data sets. The proposed approach is model-free (i.e., no model assumption is made between response and predictors) and can handle non-standard (e.g., graphs) and multivariate outputs directly. We establish the sure screening property of the proposed approach in the ultrahigh dimensional regime, and experimentally demonstrate its advantages over other approaches on several synthetic and real data sets.'
volume: 31
URL: https://proceedings.mlr.press/v31/balasubramanian13a.html
PDF: http://proceedings.mlr.press/v31/balasubramanian13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-balasubramanian13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Krishnakumar
family: Balasubramanian
- given: Bharath
family: Sriperumbudur
- given: Guy
family: Lebanon
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 126-134
id: balasubramanian13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 126
lastpage: 134
published: 2013-04-29 00:00:00 +0000
- title: 'Meta-Transportability of Causal Effects: A Formal Approach'
abstract: 'This paper considers the problem of transferring experimental findings learned from multiple heterogeneous domains to a different environment, in which only passive observations can be collected. Pearl and Bareinboim (2011) established a complete characterization for such transfer between two domains, a source and a target, and this paper generalizes their results to multiple heterogeneous domains. It establishes a necessary and sufficient condition for deciding when effects in the target domain are estimable from both statistical and causal information transferred from the experiments in the source domains. The paper further provides a complete algorithm for computing the transport formula, that is, a way of fusing observational and experimental information to synthesize an unbiased estimate of the desired effects. '
volume: 31
URL: https://proceedings.mlr.press/v31/bareinboim13a.html
PDF: http://proceedings.mlr.press/v31/bareinboim13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-bareinboim13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Elias
family: Bareinboim
- given: Judea
family: Pearl
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 135-143
id: bareinboim13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 135
lastpage: 143
published: 2013-04-29 00:00:00 +0000
- title: 'Convex Collective Matrix Factorization'
abstract: 'In many applications, multiple interlinked sources of data are available and they cannot be represented by a single adjacency matrix, to which large-scale factorization methods could be applied. Collective matrix factorization is a simple yet powerful approach to jointly factorize multiple matrices, each of which represents a relation between two entity types. Existing algorithms to estimate parameters of collective matrix factorization models are based on non-convex formulations of the problem; in this paper, a convex formulation of this approach is proposed. This enables the derivation of large-scale algorithms to estimate the parameters, including an iterative eigenvalue thresholding algorithm. Numerical experiments illustrate the benefits of this new approach.'
volume: 31
URL: https://proceedings.mlr.press/v31/bouchard13a.html
PDF: http://proceedings.mlr.press/v31/bouchard13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-bouchard13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Guillaume
family: Bouchard
- given: Dawei
family: Yin
- given: Shengbo
family: Guo
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 144-152
id: bouchard13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 144
lastpage: 152
published: 2013-04-29 00:00:00 +0000
- title: 'Efficiently Sampling Probabilistic Programs via Program Analysis'
abstract: 'Probabilistic programs are intuitive and succinct representations of complex probability distributions. A natural approach to performing inference over these programs is to execute them and compute statistics over the resulting samples. Indeed, this approach has been taken before in a number of probabilistic programming tools. In this paper, we address two key challenges of this paradigm: (i) ensuring samples are well distributed in the combinatorial space of the program, and (ii) efficiently generating samples with minimal rejection. We present a new sampling algorithm Qi that addresses these challenges using concepts from the field of program analysis. To solve the first challenge (getting diverse samples), we use a technique called symbolic execution to systematically explore all the paths in a program. In the case of programs with loops, we systematically explore all paths up to a given depth, and present theorems on error bounds on the estimates as a function of the path bounds used. To solve the second challenge (efficient samples with minimal rejection), we propagate observations backward through the program using the notion of Dijkstra’s weakest preconditions and hoist these propagated conditions to condition elementary distributions during sampling. We present theorems explaining the mathematical properties of Qi, as well as empirical results from an implementation of the algorithm.'
volume: 31
URL: https://proceedings.mlr.press/v31/chaganty13a.html
PDF: http://proceedings.mlr.press/v31/chaganty13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-chaganty13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Arun
family: Chaganty
- given: Aditya
family: Nori
- given: Sriram
family: Rajamani
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 153-160
id: chaganty13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 153
lastpage: 160
published: 2013-04-29 00:00:00 +0000
- title: 'Computing the M Most Probable Modes of a Graphical Model'
abstract: 'We introduce the M-modes problem for graphical models: predicting the M label configurations of highest probability that are at the same time local maxima of the probability landscape. M-modes have multiple possible applications: because they are intrinsically diverse, they provide a principled alternative to non-maximum suppression techniques for structured prediction, they can act as codebook vectors for quantizing the configuration space, or they can form component centers for mixture model approximation. We present two algorithms for solving the M-modes problem. The first algorithm solves the problem in polynomial time when the underlying graphical model is a simple chain. The second algorithm solves the problem for junction chains. On synthetic and real datasets, we demonstrate how M-modes can improve the performance of prediction. We also use the generated modes as a tool to understand the topography of the probability distribution of configurations, for example in relation to the training set size and amount of noise in the data.'
volume: 31
URL: https://proceedings.mlr.press/v31/chen13a.html
PDF: http://proceedings.mlr.press/v31/chen13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-chen13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Chao
family: Chen
- given: Vladimir
family: Kolmogorov
- given: Yan
family: Zhu
- given: Dimitris
family: Metaxas
- given: Christoph
family: Lampert
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 161-169
id: chen13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 161
lastpage: 169
published: 2013-04-29 00:00:00 +0000
- title: 'A simple criterion for controlling selection bias'
abstract: 'Controlling selection bias, a statistical error caused by preferential sampling of data, is a fundamental problem in machine learning and statistical inference. This paper presents a simple criterion for controlling selection bias in the odds ratio, a widely used measure of association between variables, that connects the nature of selection bias with the graph modeling the selection mechanism. If the graph contains certain paths, we show that the odds ratio cannot be expressed using data with selection bias. Otherwise, we show that a d-separability test can determine whether the odds ratio can be recovered and, when the answer is affirmative, output an unbiased estimand of the odds ratio. The criterion can be tested in linear time and enhances the power of the estimand.'
volume: 31
URL: https://proceedings.mlr.press/v31/chen13b.html
PDF: http://proceedings.mlr.press/v31/chen13b.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-chen13b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Eunice Yuh-Jie
family: Chen
- given: Judea
family: Pearl
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 170-177
id: chen13b
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 170
lastpage: 177
published: 2013-04-29 00:00:00 +0000
- title: 'Evidence Estimation for Bayesian Partially Observed MRFs'
abstract: 'Bayesian estimation in Markov random fields is very hard due to the intractability of the partition function. The introduction of hidden units makes the situation even worse due to the presence of potentially very many modes in the posterior distribution. For the first time we propose a comprehensive procedure to address one of the Bayesian estimation problems, approximating the evidence of partially observed MRFs based on the Laplace approximation. We also introduce a number of approximate MCMC-based methods for comparison but find that the Laplace approximation significantly outperforms these.'
volume: 31
URL: https://proceedings.mlr.press/v31/chen13c.html
PDF: http://proceedings.mlr.press/v31/chen13c.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-chen13c.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Yutian
family: Chen
- given: Max
family: Welling
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 178-186
id: chen13c
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 178
lastpage: 186
published: 2013-04-29 00:00:00 +0000
- title: 'Why Steiner-tree type algorithms work for community detection'
abstract: 'We consider the problem of reconstructing a specific connected community S ⊂V in a graph G = (V, E), where each node v is associated with a signal whose strength grows with the likelihood that v belongs to S. This problem appears in social and protein interaction networks; in the latter it is also referred to as the signaling pathway reconstruction problem. We study this community reconstruction problem under several natural generative models, and make the following two contributions. First, in the context of social networks, where the signals are modeled as bounded-support random variables, we design an efficient algorithm for recovering most members in S with well-controlled false positive overhead, by utilizing the network structure for a large family of “homogeneous” generative models. This positive result is complemented by an information theoretic lower bound for the case where the network structure is unknown or the network is heterogeneous. Second, for the case in which the graph represents a protein interaction network, where it is customary to consider signals with unbounded support, we generalize our first contribution to give the first theoretical justification of why existing Steiner-tree type heuristics work well in practice.'
volume: 31
URL: https://proceedings.mlr.press/v31/chiang13a.html
PDF: http://proceedings.mlr.press/v31/chiang13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-chiang13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Mung
family: Chiang
- given: Henry
family: Lam
- given: Zhenming
family: Liu
- given: Vincent
family: Poor
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 187-195
id: chiang13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 187
lastpage: 195
published: 2013-04-29 00:00:00 +0000
- title: 'A simple sketching algorithm for entropy estimation over streaming data'
abstract: 'We consider the problem of approximating the empirical Shannon entropy of a high-frequency data stream under the relaxed strict-turnstile model, when space limitations make exact computation infeasible. An equivalent measure of entropy is the Renyi entropy that depends on a constant α. This quantity can be estimated efficiently and unbiasedly from a low-dimensional synopsis called an α-stable data sketch via the method of compressed counting. An approximation to the Shannon entropy can be obtained from the Renyi entropy by taking α sufficiently close to 1. However, practical guidelines for parameter calibration with respect to α are lacking. We avoid this problem by showing that the random variables used in estimating the Renyi entropy can be transformed to have a proper distributional limit as α approaches 1: the maximally skewed, strictly stable distribution with α = 1 defined on the entire real line. We propose a family of asymptotically unbiased log-mean estimators of the Shannon entropy, indexed by a constant ζ > 0, that can be computed in a single-pass algorithm to provide an additive approximation. We recommend the log-mean estimator with ζ = 1 that has exponentially decreasing tail bounds on the error probability, asymptotic relative efficiency of 0.932, and near-optimal computational complexity.'
volume: 31
URL: https://proceedings.mlr.press/v31/clifford13a.html
PDF: http://proceedings.mlr.press/v31/clifford13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-clifford13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Peter
family: Clifford
- given: Ioana
family: Cosma
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 196-206
id: clifford13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 196
lastpage: 206
published: 2013-04-29 00:00:00 +0000
- title: 'Deep Gaussian Processes'
abstract: 'In this paper we introduce deep Gaussian process (GP) models. Deep GPs are a deep belief network based on Gaussian process mappings. The data is modeled as the output of a multivariate GP. The inputs to that Gaussian process are then governed by another GP. A single layer model is equivalent to a standard GP or the GP latent variable model (GP-LVM). We perform inference in the model by approximate variational marginalization. This results in a strict lower bound on the marginal likelihood of the model which we use for model selection (number of layers and nodes per layer). Deep belief networks are typically applied to relatively large data sets using stochastic gradient descent for optimization. Our fully Bayesian treatment allows for the application of deep models even when data is scarce. Model selection by our variational bound shows that a five layer hierarchy is justified even when modelling a digit data set containing only 150 examples.'
volume: 31
URL: https://proceedings.mlr.press/v31/damianou13a.html
PDF: http://proceedings.mlr.press/v31/damianou13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-damianou13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Andreas
family: Damianou
- given: Neil D.
family: Lawrence
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 207-215
id: damianou13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 207
lastpage: 215
published: 2013-04-29 00:00:00 +0000
- title: 'ODE parameter inference using adaptive gradient matching with Gaussian processes'
abstract: 'Parameter inference in mechanistic models based on systems of coupled differential equations is a topical yet computationally challenging problem, due to the need to follow each parameter adaptation with a numerical integration of the differential equations. Techniques based on gradient matching, which aim to minimize the discrepancy between the slope of a data interpolant and the derivatives predicted from the differential equations, offer a computationally appealing shortcut to the inference problem. The present paper discusses a method based on nonparametric Bayesian statistics with Gaussian processes due to Calderhead et al. (2008), and shows how inference in this model can be substantially improved by consistently sampling from the joint distribution of the ODE parameters and GP hyperparameters. We demonstrate the efficiency of our adaptive gradient matching technique on three benchmark systems, and perform a detailed comparison with the method in Calderhead et al. (2008) and the explicit ODE integration approach, both in terms of parameter inference accuracy and in terms of computational efficiency.'
volume: 31
URL: https://proceedings.mlr.press/v31/dondelinger13a.html
PDF: http://proceedings.mlr.press/v31/dondelinger13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-dondelinger13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Frank
family: Dondelinger
- given: Dirk
family: Husmeier
- given: Simon
family: Rogers
- given: Maurizio
family: Filippone
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 216-228
id: dondelinger13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 216
lastpage: 228
published: 2013-04-29 00:00:00 +0000
- title: 'Uncover Topic-Sensitive Information Diffusion Networks'
abstract: 'Analyzing the spreading patterns of memes with respect to their topic distributions and the underlying diffusion network structures is an important task in social network analysis. This task in many cases becomes very challenging since the underlying diffusion networks are often hidden and the topic-specific transmission rates are also unknown. In this paper, we propose a continuous time model, TopicCascade, for topic-sensitive information diffusion networks, and infer the hidden diffusion networks and the topic-dependent transmission rates from the observed time stamps and contents of cascades. One attractive property of the model is that its parameters can be estimated via a convex optimization, which we solve with an efficient proximal gradient based block coordinate descent (BCD) algorithm. On both synthetic and real-world data, we show that our method significantly improves over the previous state-of-the-art models in terms of both recovering the hidden diffusion networks and predicting the transmission times of memes.'
volume: 31
URL: https://proceedings.mlr.press/v31/du13a.html
PDF: http://proceedings.mlr.press/v31/du13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-du13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Nan
family: Du
- given: Le
family: Song
- given: Hyenkyun
family: Woo
- given: Hongyuan
family: Zha
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 229-237
id: du13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 229
lastpage: 237
published: 2013-04-29 00:00:00 +0000
- title: 'Stochastic blockmodeling of relational event dynamics'
abstract: 'Several approaches have recently been proposed for modeling of continuous-time network data via dyadic event rates conditioned on the observed history of events and nodal or dyadic covariates. In many cases, however, interaction propensities – and even the underlying mechanisms of interaction – vary systematically across subgroups whose identities are unobserved. For static networks such heterogeneity has been treated via methods such as stochastic blockmodeling, which operate by assuming latent groups of individuals with similar tendencies in their group-wise interactions. Here we combine ideas from stochastic blockmodeling and continuous-time network models by positing a latent partition of the node set such that event dynamics within and between subsets evolve in potentially distinct ways. We illustrate the use of our model family by application to several forms of dyadic interaction data, including email communication and Twitter direct messages. Parameter estimates from the fitted models clearly reveal heterogeneity in the dynamics among groups of individuals. We also find that the fitted models have better predictive accuracy than both baseline models and relational event models that lack latent structure. '
volume: 31
URL: https://proceedings.mlr.press/v31/dubois13a.html
PDF: http://proceedings.mlr.press/v31/dubois13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-dubois13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Christopher
family: DuBois
- given: Carter
family: Butts
- given: Padhraic
family: Smyth
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 238-246
id: dubois13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 238
lastpage: 246
published: 2013-04-29 00:00:00 +0000
- title: 'Dynamic Copula Networks for Modeling Real-valued Time Series'
abstract: 'Probabilistic modeling of temporal phenomena is of central importance in a variety of fields ranging from neuroscience to economics to speech recognition. While the task has received extensive attention in recent decades, learning temporal models for multivariate real-valued data that is non-Gaussian is still a formidable challenge. Recently, the power of copulas, a framework for representing complex multi-modal and heavy-tailed distributions, was fused with the formalism of Bayesian networks to allow for flexible modeling of high-dimensional distributions. In this work we introduce Dynamic Copula Bayesian Networks, a generalization aimed at capturing the distribution of rich temporal sequences. We apply our model to three markedly different real-life domains and demonstrate substantial quantitative and qualitative advantage.'
volume: 31
URL: https://proceedings.mlr.press/v31/eban13a.html
PDF: http://proceedings.mlr.press/v31/eban13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-eban13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Elad
family: Eban
- given: Gideon
family: Rothschild
- given: Adi
family: Mizrahi
- given: Israel
family: Nelken
- given: Gal
family: Elidan
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 247-255
id: eban13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 247
lastpage: 255
published: 2013-04-29 00:00:00 +0000
- title: 'Data-driven covariate selection for nonparametric estimation of causal effects'
abstract: 'The estimation of causal effects from non-experimental data is a fundamental problem in many fields of science. One of the main obstacles concerns confounding by observed or latent covariates, an issue which is typically tackled by adjusting for some set of observed covariates. In this contribution, we analyze the problem of inferring whether a given variable has a causal effect on another and, if it does, inferring an adjustment set of covariates that yields a consistent and unbiased estimator of this effect, based on the (conditional) independence and dependence relationships among the observed variables. We provide two elementary rules that we show to be both sound and complete for this task, and compare the performance of a straightforward application of these rules with standard alternative procedures for selecting adjustment sets.'
volume: 31
URL: https://proceedings.mlr.press/v31/entner13a.html
PDF: http://proceedings.mlr.press/v31/entner13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-entner13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Doris
family: Entner
- given: Patrik
family: Hoyer
- given: Peter
family: Spirtes
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 256-264
id: entner13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 256
lastpage: 264
published: 2013-04-29 00:00:00 +0000
- title: 'Learning to Top-K Search using Pairwise Comparisons'
abstract: 'Given a collection of N items with some unknown underlying ranking, we examine how to use pairwise comparisons to determine the top ranked items in the set. Resolving the top items from pairwise comparisons has application in diverse fields ranging from recommender systems to image-based search to protein structure analysis. In this paper we introduce techniques to resolve the top ranked items using significantly fewer than all the possible pairwise comparisons, using both random and adaptive sampling methodologies. Using randomly-chosen comparisons, a graph-based technique is shown to efficiently resolve the top O(\log N) items when there are no comparison errors. In terms of adaptively-chosen comparisons, we show how the top O(\log N) items can be found, even in the presence of corrupted observations, using a voting methodology that only requires O(N\log^2 N) pairwise comparisons.'
volume: 31
URL: https://proceedings.mlr.press/v31/eriksson13a.html
PDF: http://proceedings.mlr.press/v31/eriksson13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-eriksson13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Brian
family: Eriksson
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 265-273
id: eriksson13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 265
lastpage: 273
published: 2013-04-29 00:00:00 +0000
- title: 'Predictive Correlation Screening: Application to Two-stage Predictor Design in High Dimension'
abstract: 'We introduce a new approach to variable selection, called Predictive Correlation Screening, for predictor design. Predictive Correlation Screening (PCS) implements false positive control on the selected variables, is well suited to small sample sizes, and is scalable to high dimensions. We establish asymptotic bounds for Familywise Error Rate (FWER), and resultant mean square error of a linear predictor on the selected variables. We apply Predictive Correlation Screening to the following two-stage predictor design problem. An experimenter wants to learn a multivariate predictor of gene expressions based on successive biological samples assayed on mRNA arrays. She assays the whole genome on a few samples and from these assays she selects a small number of variables using Predictive Correlation Screening. To reduce assay cost, she subsequently assays only the selected variables on the remaining samples, to learn the predictor coefficients. We show superiority of Predictive Correlation Screening relative to LASSO and correlation learning (sometimes popularly referred to in the literature as marginal regression or simple thresholding) in terms of performance and computational complexity.'
volume: 31
URL: https://proceedings.mlr.press/v31/firouzi13a.html
PDF: http://proceedings.mlr.press/v31/firouzi13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-firouzi13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Hamed
family: Firouzi
- given: Bala
family: Rajaratnam
- given: Alfred
family: Hero III
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 274-288
id: firouzi13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 274
lastpage: 288
published: 2013-04-29 00:00:00 +0000
- title: 'Mixed LICORS: A Nonparametric Algorithm for Predictive State Reconstruction'
abstract: 'We introduce mixed LICORS, an algorithm for learning nonlinear, high-dimensional dynamics from spatio-temporal data, suitable for both prediction and simulation. Mixed LICORS extends the recent LICORS algorithm (Goerg and Shalizi, 2012) from hard clustering of predictive distributions to a non-parametric, EM-like soft clustering. This retains the asymptotic predictive optimality of LICORS, but, as we show in simulations, greatly improves out-of-sample forecasts with limited data. The new method is implemented in the publicly-available R package LICORS.'
volume: 31
URL: https://proceedings.mlr.press/v31/goerg13a.html
PDF: http://proceedings.mlr.press/v31/goerg13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-goerg13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Georg
family: Goerg
- given: Cosma
family: Shalizi
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 289-297
id: goerg13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 289
lastpage: 297
published: 2013-04-29 00:00:00 +0000
- title: 'Unsupervised Link Selection in Networks'
abstract: 'Real-world networks are often noisy, and the existing linkage structure may not be reliable. For example, a link which connects nodes from different communities may affect the group assignment of nodes in a negative way. In this paper, we study a new problem called link selection, which can be seen as the network equivalent of the traditional feature selection problem in machine learning. More specifically, we investigate unsupervised link selection as follows: given a network, select a subset of informative links from the original network that enhances the quality of its community structure. To achieve this goal, we use the Ratio Cut size of a network as the quality measure. The resulting link selection approach can be formulated as a semi-definite programming problem. In order to solve it efficiently, we propose a backward elimination algorithm using sequential optimization. Experiments on benchmark network datasets illustrate the effectiveness of our method.'
volume: 31
URL: https://proceedings.mlr.press/v31/gu13a.html
PDF: http://proceedings.mlr.press/v31/gu13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-gu13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Quanquan
family: Gu
- given: Charu
family: Aggarwal
- given: Jiawei
family: Han
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 298-306
id: gu13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 298
lastpage: 306
published: 2013-04-29 00:00:00 +0000
- title: 'Clustered Support Vector Machines'
abstract: 'In many problems of machine learning, the data are distributed nonlinearly. One way to address this kind of data is training a nonlinear classifier such as the kernel support vector machine (kernel SVM). However, the computational burden of kernel SVM limits its application to large scale datasets. In this paper, we propose a Clustered Support Vector Machine (CSVM), which tackles the data in a divide and conquer manner. More specifically, CSVM groups the data into several clusters, after which it trains a linear support vector machine in each cluster to separate the data locally. Meanwhile, CSVM has an additional global regularization, which requires the weight vector of each local linear SVM to align with a global weight vector. The global regularization leverages the information from one cluster to another, and avoids over-fitting in each cluster. We derive a data-dependent generalization error bound for CSVM, which explains the advantage of CSVM over linear SVM. Experiments on several benchmark datasets show that the proposed method outperforms linear SVM and some other related locally linear classifiers. It is also comparable to a fine-tuned kernel SVM in terms of prediction performance, while being more efficient than kernel SVM.'
volume: 31
URL: https://proceedings.mlr.press/v31/gu13b.html
PDF: http://proceedings.mlr.press/v31/gu13b.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-gu13b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Quanquan
family: Gu
- given: Jiawei
family: Han
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 307-315
id: gu13b
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 307
lastpage: 315
published: 2013-04-29 00:00:00 +0000
- title: 'DivMCuts: Faster Training of Structural SVMs with Diverse M-Best Cutting-Planes'
abstract: 'Training of Structural SVMs involves solving a large Quadratic Program (QP). One popular method for solving this QP is a cutting-plane approach, where the most violated constraint is iteratively added to a working-set of constraints. Unfortunately, training models with a large number of parameters remains a time consuming process. This paper shows that significant computational savings can be achieved by adding multiple diverse and highly violated constraints at every iteration of the cutting-plane algorithm. We show that generation of such diverse cutting-planes involves extracting diverse M-Best solutions from the loss-augmented score of the training instances. To find these diverse M-Best solutions, we employ a recently proposed algorithm [4]. Our experiments on image segmentation and protein side-chain prediction show that the proposed approach can lead to significant computational savings, e.g., ∼28% reduction in training time.'
volume: 31
URL: https://proceedings.mlr.press/v31/guzman-rivera13a.html
PDF: http://proceedings.mlr.press/v31/guzman-rivera13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-guzman-rivera13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Abner
family: Guzman-Rivera
- given: Pushmeet
family: Kohli
- given: Dhruv
family: Batra
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 316-324
id: guzman-rivera13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 316
lastpage: 324
published: 2013-04-29 00:00:00 +0000
- title: 'Recursive Karcher Expectation Estimators And Geometric Law of Large Numbers'
abstract: 'This paper studies a form of law of large numbers on Pn, the space of n×n symmetric positive-definite matrices equipped with the Fisher-Rao metric. Specifically, we propose a recursive algorithm for estimating the Karcher expectation of an arbitrary distribution defined on Pn, and we show that the estimates computed by the recursive algorithm asymptotically converge in probability to the correct Karcher expectation. The steps in the recursive algorithm mainly consist of making appropriate moves on geodesics in Pn; the algorithm is simple to implement and offers a tremendous gain in computation time of several orders of magnitude over existing non-recursive algorithms. We elucidate the connection between the more familiar law of large numbers for real-valued random variables and the asymptotic convergence of the proposed recursive algorithm, and our result provides an example of a new form of law of large numbers for random variables taking values in a Riemannian manifold. From the practical side, the computation of the mean of a collection of symmetric positive-definite (SPD) matrices is a fundamental ingredient in many algorithms in machine learning, computer vision and medical imaging applications. We report an experiment using the proposed recursive algorithm for K-means clustering, demonstrating the algorithm’s efficiency, accuracy and stability.'
volume: 31
URL: https://proceedings.mlr.press/v31/ho13a.html
PDF: http://proceedings.mlr.press/v31/ho13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-ho13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Jeffrey
family: Ho
- given: Guang
family: Cheng
- given: Hesamoddin
family: Salehian
- given: Baba
family: Vemuri
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 325-332
id: ho13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 325
lastpage: 332
published: 2013-04-29 00:00:00 +0000
- title: 'DYNACARE: Dynamic Cardiac Arrest Risk Estimation'
abstract: 'Cardiac arrest is a deadly condition caused by a sudden failure of the heart with an in-hospital mortality rate of ∼80%. Therefore, the ability to accurately estimate patients at high risk of cardiac arrest is crucial for improving the survival rate. Existing research generally fails to utilize a patient’s temporal dynamics. In this paper, we present two dynamic cardiac risk estimation models, focusing on different temporal signatures in a patient’s risk trajectory. These models can track a patient’s risk trajectory in real time, allow interpretability and predictability of a cardiac arrest event, provide an intuitive visualization to medical professionals, offer a personalized dynamic hazard function, and estimate the risk for a new patient.'
volume: 31
URL: https://proceedings.mlr.press/v31/ho13b.html
PDF: http://proceedings.mlr.press/v31/ho13b.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-ho13b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Joyce
family: Ho
- given: Yubin
family: Park
- given: Carlos
family: Carvalho
- given: Joydeep
family: Ghosh
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 333-341
id: ho13b
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 333
lastpage: 341
published: 2013-04-29 00:00:00 +0000
- title: 'Active Learning for Interactive Visualization'
abstract: 'Many automatic visualization methods have been proposed. However, a visualization that is automatically generated may differ from how a user wants to arrange the objects in visualization space. By allowing users to re-locate objects in the embedding space of the visualization, they can adjust the visualization to their preference. We propose an active learning framework for interactive visualization which selects objects for the user to re-locate so that they can obtain their desired visualization by re-locating as few objects as possible. The framework is based on an information theoretic criterion, which favors objects that reduce the uncertainty of the visualization. We present a concrete application of the proposed framework to the Laplacian eigenmap visualization method. We demonstrate experimentally that the proposed framework yields the desired visualization with fewer user interactions than existing methods.'
volume: 31
URL: https://proceedings.mlr.press/v31/iwata13a.html
PDF: http://proceedings.mlr.press/v31/iwata13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-iwata13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Tomoharu
family: Iwata
- given: Neil
family: Houlsby
- given: Zoubin
family: Ghahramani
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 342-350
id: iwata13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 342
lastpage: 350
published: 2013-04-29 00:00:00 +0000
- title: 'A Parallel, Block Greedy Method for Sparse Inverse Covariance Estimation for Ultra-high Dimensions'
abstract: 'Discovering the graph structure of a Gaussian Markov Random Field is an important problem in application areas such as computational biology and atmospheric sciences. This task, which translates to estimating the sparsity pattern of the inverse covariance matrix, has been extensively studied in the literature. However, the existing approaches are unable to handle ultra-high dimensional datasets and there is a crucial need to develop methods that are both highly scalable and memory-efficient. In this paper, we present GINCO, a blocked greedy method for sparse inverse covariance matrix estimation. We also present a detailed description of a highly-scalable and memory-efficient implementation of GINCO, which is able to operate on both shared- and distributed-memory architectures. Our implementation is able to recover the sparsity pattern of 25,000-vertex random and chain graphs with 87% and 84% accuracy in \le 5 minutes using \le 10GB of memory on a single 8-core machine. Furthermore, our method is statistically consistent in recovering the sparsity pattern of the inverse covariance matrix, which we demonstrate through extensive empirical studies.'
volume: 31
URL: https://proceedings.mlr.press/v31/kambadur13a.html
PDF: http://proceedings.mlr.press/v31/kambadur13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-kambadur13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Prabhanjan
family: Kambadur
- given: Aurelie
family: Lozano
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 351-359
id: kambadur13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 351
lastpage: 359
published: 2013-04-29 00:00:00 +0000
- title: 'Beyond Sentiment: The Manifold of Human Emotions'
abstract: 'Sentiment analysis predicts the presence of positive or negative emotions in a text document. In this paper we consider higher dimensional extensions of the sentiment concept, which represent a richer set of human emotions. Our approach goes beyond previous work in that our model contains a continuous manifold rather than a finite set of human emotions. We investigate the resulting model, compare it to psychological observations, and explore its predictive capabilities. Besides obtaining significant improvements over a baseline without manifold, we are also able to visualize different notions of positive sentiment in different domains.'
volume: 31
URL: https://proceedings.mlr.press/v31/kim13a.html
PDF: http://proceedings.mlr.press/v31/kim13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-kim13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Seungyeon
family: Kim
- given: Fuxin
family: Li
- given: Guy
family: Lebanon
- given: Irfan
family: Essa
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 360-369
id: kim13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 360
lastpage: 369
published: 2013-04-29 00:00:00 +0000
- title: 'Exact Learning of Bounded Tree-width Bayesian Networks'
abstract: 'Inference in Bayesian networks is known to be NP-hard, but if the network has bounded tree-width, then inference becomes tractable. Not surprisingly, learning networks that closely match the given data and have a bounded tree-width has recently attracted some attention. In this paper we aim to lay groundwork for future research on the topic by studying the exact complexity of this problem. We give the first non-trivial exact algorithm for the NP-hard problem of finding an optimal Bayesian network of tree-width at most w, with running time 3^n n^{w + O(1)}, and provide an implementation of this algorithm. Additionally, we propose a variant of Bayesian network learning with “super-structures”, and show that finding a Bayesian network consistent with a given super-structure is fixed-parameter tractable in the tree-width of the super-structure.'
volume: 31
URL: https://proceedings.mlr.press/v31/korhonen13a.html
PDF: http://proceedings.mlr.press/v31/korhonen13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-korhonen13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Janne
family: Korhonen
- given: Pekka
family: Parviainen
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 370-378
id: korhonen13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 370
lastpage: 378
published: 2013-04-29 00:00:00 +0000
- title: 'Structural Expectation Propagation (SEP): Bayesian structure learning for networks with latent variables'
abstract: 'Learning the structure of discrete Bayesian networks has been the subject of extensive research in machine learning, with most Bayesian approaches focusing on fully observed networks. One of the few methods that can handle networks with latent variables is the "structural EM algorithm", which interleaves greedy structure search with the estimation of latent variables and parameters, maintaining a single best network at each step. We introduce Structural Expectation Propagation (SEP), an extension of EP which can infer the structure of Bayesian networks having latent variables and missing data. SEP performs variational inference in a joint model of structure, latent variables, and parameters, offering two advantages: (i) it accounts for uncertainty in structure and parameter values when making local distribution updates, and (ii) it returns a variational distribution over network structures rather than a single network. We demonstrate the performance of SEP both on synthetic problems and on real-world clinical data.'
volume: 31
URL: https://proceedings.mlr.press/v31/lazic13a.html
PDF: http://proceedings.mlr.press/v31/lazic13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-lazic13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Nevena
family: Lazic
- given: Christopher
family: Bishop
- given: John
family: Winn
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 379-387
id: lazic13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 379
lastpage: 387
published: 2013-04-29 00:00:00 +0000
- title: 'Structure Learning of Mixed Graphical Models'
abstract: 'We consider the problem of learning the structure of a pairwise graphical model over continuous and discrete variables. We present a new pairwise model for graphical models with both continuous and discrete variables that is amenable to structure learning. In previous work, authors have considered structure learning of Gaussian graphical models and structure learning of discrete models. Our approach is a natural generalization of these two lines of work to the mixed case. The penalization scheme is new and follows naturally from a particular parametrization of the model.'
volume: 31
URL: https://proceedings.mlr.press/v31/lee13a.html
PDF: http://proceedings.mlr.press/v31/lee13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-lee13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Jason
family: Lee
- given: Trevor
family: Hastie
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 388-396
id: lee13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 388
lastpage: 396
published: 2013-04-29 00:00:00 +0000
- title: 'Dynamic Scaled Sampling for Deterministic Constraints'
abstract: 'Deterministic and near-deterministic relationships among subsets of random variables in multivariate systems are known to cause serious problems for Monte Carlo algorithms. We examine the case in which the relationship Z = f(X_1,...,X_k) holds, where each X_i has a continuous prior pdf and we wish to obtain samples from the conditional distribution P(X_1,...,X_k | Z= s). When f is addition, the problem is NP-hard even when the X_i are independent. In more restricted cases — for example, i.i.d. Boolean or categorical X_i — efficient exact samplers have been obtained previously. For the general continuous case, we propose a dynamic scaling algorithm (DYSC), and prove that it has O(k) expected running time and finite variance. We discuss generalizations of DYSC to functions f described by binary operation trees. We evaluate the algorithm on several examples. '
volume: 31
URL: https://proceedings.mlr.press/v31/li13a.html
PDF: http://proceedings.mlr.press/v31/li13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-li13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Lei
family: Li
- given: Bharath
family: Ramsundar
- given: Stuart
family: Russell
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 397-405
id: li13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 397
lastpage: 405
published: 2013-04-29 00:00:00 +0000
- title: 'Learning Markov Networks With Arithmetic Circuits'
abstract: 'Markov networks are an effective way to represent complex probability distributions. However, learning their structure and parameters or using them to answer queries is typically intractable. One approach to making learning and inference tractable is to use approximations, such as pseudo-likelihood or approximate inference. An alternate approach is to use a restricted class of models where exact inference is always efficient. Previous work has explored low treewidth models, models with tree-structured features, and latent variable models. In this paper, we introduce ACMN, the first ever method for learning efficient Markov networks with arbitrary conjunctive features. The secret to ACMN’s greater flexibility is its use of arithmetic circuits, a linear-time inference representation that can handle many high treewidth models by exploiting local structure. ACMN uses the size of the corresponding arithmetic circuit as a learning bias, allowing it to trade off accuracy and inference complexity. In experiments on 12 standard datasets, the tractable models learned by ACMN are more accurate than both tractable models learned by other algorithms and approximate inference in intractable models. '
volume: 31
URL: https://proceedings.mlr.press/v31/lowd13a.html
PDF: http://proceedings.mlr.press/v31/lowd13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-lowd13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Daniel
family: Lowd
- given: Amirmohammad
family: Rooshenas
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 406-414
id: lowd13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 406
lastpage: 414
published: 2013-04-29 00:00:00 +0000
- title: 'Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions'
abstract: 'We apply the spike-and-slab Restricted Boltzmann Machine (ssRBM) to texture modeling. The ssRBM with tiled-convolution weight sharing (TssRBM) achieves or surpasses the state-of-the-art on texture synthesis and inpainting by parametric models. We also develop a novel RBM model with a spike-and-slab visible layer and binary variables in the hidden layer. This model is designed to be stacked on top of the ssRBM. We show the resulting deep belief network (DBN) is a powerful generative model that improves on single-layer models and is capable of modeling not only single high-resolution and challenging textures but also multiple textures with fixed-size filters in the bottom layer.'
volume: 31
URL: https://proceedings.mlr.press/v31/luo13a.html
PDF: http://proceedings.mlr.press/v31/luo13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-luo13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Heng
family: Luo
- given: Pierre Luc
family: Carrier
- given: Aaron
family: Courville
- given: Yoshua
family: Bengio
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 415-423
id: luo13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 415
lastpage: 423
published: 2013-04-29 00:00:00 +0000
- title: 'Fast Near-GRID Gaussian Process Regression'
abstract: 'Gaussian process regression (GPR) is a powerful non-linear technique for Bayesian inference and prediction. One drawback is its O(N^3) computational complexity for both prediction and hyperparameter estimation for N input points, which has led to much work in sparse GPR methods. When the covariance function is expressible as a tensor product kernel (TPK) and the inputs form a multidimensional grid, it was shown that the costs for exact GPR can be reduced to a sub-quadratic function of N. We extend these exact fast algorithms to sparse GPR and remark on a connection to Gaussian process latent variable models (GPLVMs). In practice, the inputs may also violate the multidimensional grid constraints, so we pose and efficiently solve missing and extra data problems for both exact and sparse grid GPR. We demonstrate our method on synthetic, text scan, and magnetic resonance imaging (MRI) data reconstructions.'
volume: 31
URL: https://proceedings.mlr.press/v31/luo13b.html
PDF: http://proceedings.mlr.press/v31/luo13b.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-luo13b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Yuancheng
family: Luo
- given: Ramani
family: Duraiswami
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 424-432
id: luo13b
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 424
lastpage: 432
published: 2013-04-29 00:00:00 +0000
- title: 'Estimating the Partition Function of Graphical Models Using Langevin Importance Sampling '
abstract: 'Graphical models are powerful in modeling a variety of applications. Computing the partition function of a graphical model is a typical inference problem and is known to be NP-hard for general graphs. A few sampling algorithms, such as MCMC, Simulated Annealing Sampling (SAS) and Annealed Importance Sampling (AIS), have been developed to address this challenging problem. This paper describes a Langevin Importance Sampling (LIS) algorithm to compute the partition function of a graphical model. LIS first performs a random walk in the configuration-temperature space guided by the Langevin equation and then estimates the partition function using all the samples generated during the random walk, as opposed to the other configuration-temperature sampling methods, which use only the samples at a specific temperature. Experimental results show that LIS obtains much more accurate partition function estimates than the other methods tested on several different types of graphical models. LIS performs especially well on relatively large graphical models or those with a large number of local optima.'
volume: 31
URL: https://proceedings.mlr.press/v31/ma13a.html
PDF: http://proceedings.mlr.press/v31/ma13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-ma13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Jianzhu
family: Ma
- given: Jian
family: Peng
- given: Sheng
family: Wang
- given: Jinbo
family: Xu
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 433-441
id: ma13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 433
lastpage: 441
published: 2013-04-29 00:00:00 +0000
- title: 'Thompson Sampling in Switching Environments with Bayesian Online Change Detection'
abstract: 'Thompson Sampling has recently been shown to achieve the lower bound on regret in the Bernoulli Multi-Armed Bandit setting. This bandit problem assumes stationary distributions for the rewards. It is often unrealistic to model the real world as a stationary distribution. In this paper we derive and evaluate algorithms using Thompson Sampling for a Switching Multi-Armed Bandit Problem. We propose a Thompson Sampling strategy equipped with a Bayesian change point mechanism to tackle this problem. We develop algorithms for a variety of cases with constant switching rate: when a switch causes all arms to change (Global Switching), when switching occurs independently for each arm (Per-Arm Switching), and when the switching rate is known or must be inferred from data. This leads to a family of algorithms we collectively term Change-Point Thompson Sampling (CTS). We show empirical results in 4 artificial environments and 2 derived from real world data (news click-through and foreign exchange data), comparing them to other bandit algorithms. On the real world data CTS is the most effective.'
volume: 31
URL: https://proceedings.mlr.press/v31/mellor13a.html
PDF: http://proceedings.mlr.press/v31/mellor13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-mellor13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Joseph
family: Mellor
- given: Jonathan
family: Shapiro
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 442-450
id: mellor13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 442
lastpage: 450
published: 2013-04-29 00:00:00 +0000
- title: 'A Last-Step Regression Algorithm for Non-Stationary Online Learning'
abstract: 'The goal of a learner in standard online learning is to maintain an average loss close to the loss of the best-performing single function in some class. In many real-world problems, such as rating or ranking items, there is no single best target function during the runtime of the algorithm; instead, the best (local) target function is drifting over time. We develop a novel last-step minmax optimal algorithm in the context of drift. We analyze the algorithm in the worst-case regret framework and show that it maintains an average loss close to that of the best slowly changing sequence of linear functions, as long as the total drift is sublinear. In some situations, our bound improves over existing bounds, and additionally the algorithm suffers logarithmic regret when there is no drift. We also build on the H_∞ filter and its bound, and develop and analyze a second algorithm for the drifting setting. Synthetic simulations demonstrate the advantages of our algorithms in a worst-case constant drift setting.'
volume: 31
URL: https://proceedings.mlr.press/v31/moroshko13a.html
PDF: http://proceedings.mlr.press/v31/moroshko13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-moroshko13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Edward
family: Moroshko
- given: Koby
family: Crammer
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 451-462
id: moroshko13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 451
lastpage: 462
published: 2013-04-29 00:00:00 +0000
- title: 'Competing with an Infinite Set of Models in Reinforcement Learning'
abstract: 'We consider a reinforcement learning setting where the learner also has to deal with the problem of finding a suitable state-representation function from a given set of models. This has to be done while interacting with the environment in an online fashion (no resets), and the goal is to have small regret with respect to any Markov model in the set. For this setting, recently the BLB algorithm has been proposed, which achieves regret of order T^2/3, provided that the given set of models is finite. Our first contribution is to extend this result to a countably infinite set of models. Moreover, the BLB regret bound suffers from an additive term that can be exponential in the diameter of the MDP involved, since the diameter has to be guessed. The algorithm we propose avoids guessing the diameter, thus improving the regret bound.'
volume: 31
URL: https://proceedings.mlr.press/v31/nguyen13a.html
PDF: http://proceedings.mlr.press/v31/nguyen13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-nguyen13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Phuong
family: Nguyen
- given: Odalric-Ambrym
family: Maillard
- given: Daniil
family: Ryabko
- given: Ronald
family: Ortner
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 463-471
id: nguyen13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 463
lastpage: 471
published: 2013-04-29 00:00:00 +0000
- title: 'Efficient Variational Inference for Gaussian Process Regression Networks'
abstract: 'In multi-output regression applications the correlations between the response variables may vary with the input space and can be highly non-linear. Gaussian process regression networks (GPRNs) are flexible and effective models to represent such complex adaptive output dependencies. However, inference in GPRNs is intractable. In this paper we propose two efficient variational inference methods for GPRNs. The first method, GPRN-MF, adopts a mean-field approach with full Gaussians over the GPRN’s parameters as its factorizing distributions. The second method, GPRN-NPV, uses a nonparametric variational inference approach. We derive analytical forms for the evidence lower bound for both methods, which we use to learn the variational parameters and the hyper-parameters of the GPRN model. We obtain closed-form updates for the parameters of GPRN-MF and show that, while having relatively complex approximate posterior distributions, our approximate methods require the estimation of O(N) variational parameters rather than O(N^2) for the parameters’ covariances. Our experiments on real data sets show that GPRN-NPV may give a better approximation to the posterior distribution compared to GPRN-MF, in terms of both predictive performance and stability.'
volume: 31
URL: https://proceedings.mlr.press/v31/nguyen13b.html
PDF: http://proceedings.mlr.press/v31/nguyen13b.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-nguyen13b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Trung
family: Nguyen
- given: Edwin
family: Bonilla
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 472-480
id: nguyen13b
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 472
lastpage: 480
published: 2013-04-29 00:00:00 +0000
- title: 'High-dimensional Inference via Lipschitz Sparsity-Yielding Regularizers'
abstract: 'Non-convex regularizers are more and more applied to high-dimensional inference with sparsity prior knowledge. In general, a non-convex regularizer is superior to convex ones in inference, but it suffers from the difficulties brought by local optima and heavy computation. A "good" regularizer should perform well in both inference and optimization. In this paper, we prove that some non-convex regularizers can be such "good" regularizers. They are a family of sparsity-yielding penalties with proper Lipschitz subgradients. These regularizers keep the superiority of non-convex regularizers in inference. Their estimation conditions based on sparse eigenvalues are weaker than those of convex regularizers. Meanwhile, if properly tuned, they behave like convex regularizers since standard proximal methods are guaranteed to give stationary solutions. These stationary solutions, if sparse enough, are identical to the global solutions. If the solution sequence provided by proximal methods is along a sparse path, the convergence rate to the global optimum is on the order of 1/k, where k is the number of iterations.'
volume: 31
URL: https://proceedings.mlr.press/v31/pan13a.html
PDF: http://proceedings.mlr.press/v31/pan13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-pan13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Zheng
family: Pan
- given: Changshui
family: Zhang
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 481-488
id: pan13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 481
lastpage: 488
published: 2013-04-29 00:00:00 +0000
- title: 'Bayesian Structure Learning for Functional Neuroimaging'
abstract: 'Predictive modeling of functional neuroimaging data has become an important tool for analyzing cognitive structures in the brain. Brain images are high-dimensional and exhibit large correlations, and imaging experiments provide a limited number of samples. Therefore, capturing the inherent statistical properties of the imaging data is critical for robust inference. Previous methods tackle this problem by exploiting either spatial sparsity or smoothness, which does not fully exploit the structure in the data. Here we develop a flexible, hierarchical model designed to simultaneously capture spatial block sparsity and smoothness in neuroimaging data. We exploit a function domain representation for the high-dimensional small-sample data and develop efficient inference, parameter estimation, and prediction procedures. Empirical results with simulated and real neuroimaging data suggest that simultaneously capturing the block sparsity and smoothness properties can significantly improve structure recovery and predictive modeling performance.'
volume: 31
URL: https://proceedings.mlr.press/v31/park13a.html
PDF: http://proceedings.mlr.press/v31/park13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-park13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Mijung
family: Park
- given: Oluwasanmi
family: Koyejo
- given: Joydeep
family: Ghosh
- given: Russell
family: Poldrack
- given: Jonathan
family: Pillow
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 489-497
id: park13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 489
lastpage: 497
published: 2013-04-29 00:00:00 +0000
- title: 'Random Projections for Support Vector Machines'
abstract: 'Let X be a data matrix of rank ρ, representing n points in d-dimensional space. The linear support vector machine constructs a hyperplane separator that maximizes the 1-norm soft margin. We develop a new oblivious dimension reduction technique which is precomputed and can be applied to any input matrix X. We prove that, with high probability, the margin and minimum enclosing ball in the feature space are preserved to within ε-relative error, ensuring generalization comparable to that in the original space. We present extensive experiments with real and synthetic data to support our theory.'
volume: 31
URL: https://proceedings.mlr.press/v31/paul13a.html
PDF: http://proceedings.mlr.press/v31/paul13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-paul13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Saurabh
family: Paul
- given: Christos
family: Boutsidis
- given: Malik
family: Magdon-Ismail
- given: Petros
family: Drineas
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 498-506
id: paul13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 498
lastpage: 506
published: 2013-04-29 00:00:00 +0000
- title: 'Distribution-Free Distribution Regression'
abstract: 'Distribution regression refers to the situation where a response Y depends on a covariate P where P is a probability distribution. The model is Y=f(P) + e where f is an unknown regression function and e is a random error. Typically, we do not observe P directly, but rather, we observe a sample from P. In this paper we develop theory and methods for distribution-free versions of distribution regression. This means that we do not make strong distributional assumptions about the error term e and covariate P. We prove that when the effective dimension is small enough (as measured by the doubling dimension), then the excess prediction risk converges to zero with a polynomial rate.'
volume: 31
URL: https://proceedings.mlr.press/v31/poczos13a.html
PDF: http://proceedings.mlr.press/v31/poczos13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-poczos13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Barnabas
family: Poczos
- given: Aarti
family: Singh
- given: Alessandro
family: Rinaldo
- given: Larry
family: Wasserman
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 507-515
id: poczos13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 507
lastpage: 515
published: 2013-04-29 00:00:00 +0000
- title: 'Localization and Adaptation in Online Learning'
abstract: 'We introduce a formalism of localization for online learning problems, which, similarly to statistical learning theory, can be used to obtain fast rates. In particular, we introduce local sequential Rademacher complexities and other local measures. Based on the idea of relaxations for deriving algorithms, we provide a template method that takes advantage of localization. Furthermore, we build a general adaptive method that can take advantage of the suboptimality of the observed sequence. We illustrate the utility of the introduced concepts on several problems. Among them is a novel upper bound on regret in terms of classical Rademacher complexity when the data are i.i.d.'
volume: 31
URL: https://proceedings.mlr.press/v31/rakhlin13a.html
PDF: http://proceedings.mlr.press/v31/rakhlin13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-rakhlin13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Alexander
family: Rakhlin
- given: Ohad
family: Shamir
- given: Karthik
family: Sridharan
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 516-526
id: rakhlin13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 516
lastpage: 526
published: 2013-04-29 00:00:00 +0000
- title: 'A recursive estimate for the predictive likelihood in a topic model'
abstract: 'We consider the problem of evaluating the predictive log likelihood of a previously unseen document under a topic model. This task arises when cross-validating for a model hyperparameter, when testing a model on a hold-out set, and when comparing the performance of different fitting strategies. Yet it is known to be very challenging, as it is equivalent to estimating a marginal likelihood in Bayesian model selection. We propose a fast algorithm for approximating this likelihood, one whose computational cost is linear both in document length and in the number of topics. The method is a first-order approximation to the algorithm of Carvalho et al. (2010), and can also be interpreted as a one-particle, Rao-Blackwellized version of the "left-to-right" method of Wallach et al. (2009). On our test examples, the proposed method gives similar answers to these other methods, but at lower computational cost.'
volume: 31
URL: https://proceedings.mlr.press/v31/scott13a.html
PDF: http://proceedings.mlr.press/v31/scott13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-scott13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: James
family: Scott
- given: Jason
family: Baldridge
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 527-535
id: scott13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 527
lastpage: 535
published: 2013-04-29 00:00:00 +0000
- title: 'Detecting Activations over Graphs using Spanning Tree Wavelet Bases'
abstract: 'We consider the detection of clusters of activation over graphs under Gaussian noise. This problem appears in many real world scenarios, such as detecting contamination or seismic activity with sensor networks, viruses in human and computer networks, and groups with anomalous behavior in social and biological networks. Despite the wide applicability of such a detection algorithm, there has been little success in the development of computationally feasible methods with provable theoretical guarantees. To this end, we introduce the spanning tree wavelet basis over a graph, a localized basis that reflects the topology of the graph. We first provide a necessary condition for asymptotic distinguishability of the null and alternative hypotheses. Then we prove that for any spanning tree, we can hope to correctly detect signals in a low signal-to-noise regime using spanning tree wavelets. We propose a randomized test, in which we use a uniform spanning tree in the basis construction. Using electrical network theory, we show that the uniform spanning tree provides strong guarantees that in many cases match our necessary condition. We prove that for edge transitive graphs, k-nearest neighbor graphs, and ε-graphs we obtain nearly optimal performance with the uniform spanning tree wavelet detector.'
volume: 31
URL: https://proceedings.mlr.press/v31/sharpnack13a.html
PDF: http://proceedings.mlr.press/v31/sharpnack13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-sharpnack13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: James
family: Sharpnack
- given: Aarti
family: Singh
- given: Akshay
family: Krishnamurthy
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 536-544
id: sharpnack13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 536
lastpage: 544
published: 2013-04-29 00:00:00 +0000
- title: 'Changepoint Detection over Graphs with the Spectral Scan Statistic'
abstract: 'We consider the change-point detection problem of deciding, based on noisy measurements, whether an unknown signal over a given graph is constant or is instead piecewise constant over two induced subgraphs of relatively low cut size. We analyze the corresponding generalized likelihood ratio (GLR) statistic and relate it to the problem of finding a sparsest cut in a graph. We develop a tractable relaxation of the GLR statistic based on the combinatorial Laplacian of the graph, which we call the spectral scan statistic, and analyze its properties. We show how its performance as a testing procedure depends directly on the spectrum of the graph, and use this result to explicitly derive its asymptotic properties on a few graph topologies. Finally, we demonstrate both theoretically and by simulations that the spectral scan statistic can outperform naive testing procedures based on edge thresholding and χ^2 testing.'
volume: 31
URL: https://proceedings.mlr.press/v31/sharpnack13b.html
PDF: http://proceedings.mlr.press/v31/sharpnack13b.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-sharpnack13b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: James
family: Sharpnack
- given: Aarti
family: Singh
- given: Alessandro
family: Rinaldo
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 545-553
id: sharpnack13b
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 545
lastpage: 553
published: 2013-04-29 00:00:00 +0000
- title: 'Central Limit Theorems for Conditional Markov Chains'
abstract: 'This paper studies Central Limit Theorems for real-valued functionals of Conditional Markov Chains. Using a classical result by Dobrushin (1956) for non-stationary Markov chains, a conditional Central Limit Theorem for fixed sequences of observations is established. The asymptotic variance can be estimated by resampling the latent states conditional on the observations. If the conditional means themselves are asymptotically normally distributed, an unconditional Central Limit Theorem can be obtained. The methodology is used to construct a statistical hypothesis test which is applied to synthetically generated environmental data.'
volume: 31
URL: https://proceedings.mlr.press/v31/sinn13a.html
PDF: http://proceedings.mlr.press/v31/sinn13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-sinn13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Mathieu
family: Sinn
- given: Bei
family: Chen
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 554-562
id: sinn13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 554
lastpage: 562
published: 2013-04-29 00:00:00 +0000
- title: 'Statistical Tests for Contagion in Observational Social Network Studies'
abstract: 'Current tests for contagion in social network studies are vulnerable to the confounding effects of latent homophily (i.e., ties form preferentially between individuals with similar hidden traits). We demonstrate a general method to lower bound the strength of causal effects in observational social network studies, even in the presence of arbitrary, unobserved individual traits. Our tests require no parametric assumptions and each test is associated with an algebraic proof. We demonstrate the effectiveness of our approach by correctly deducing the causal effects for examples previously shown to expose defects in existing methodology. Finally, we discuss preliminary results on data taken from the Framingham Heart Study.'
volume: 31
URL: https://proceedings.mlr.press/v31/steeg13a.html
PDF: http://proceedings.mlr.press/v31/steeg13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-steeg13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Greg
family: Ver Steeg
- given: Aram
family: Galstyan
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 563-571
id: steeg13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 563
lastpage: 571
published: 2013-04-29 00:00:00 +0000
- title: 'Completeness Results for Lifted Variable Elimination'
abstract: 'Lifting aims at improving the efficiency of probabilistic inference by exploiting symmetries in the model. Various methods for lifted probabilistic inference have been proposed, but our understanding of these methods and the relationships between them is still limited, compared to their propositional counterparts. The only existing theoretical characterization of lifting is a completeness result for weighted first-order model counting. This paper addresses the question whether the same completeness result holds for other lifted inference algorithms. We answer this question positively for lifted variable elimination (LVE). Our proof relies on introducing a novel inference operator for LVE.'
volume: 31
URL: https://proceedings.mlr.press/v31/taghipour13a.html
PDF: http://proceedings.mlr.press/v31/taghipour13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-taghipour13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Nima
family: Taghipour
- given: Daan
family: Fierens
- given: Guy
family: Van den Broeck
- given: Jesse
family: Davis
- given: Hendrik
family: Blockeel
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 572-580
id: taghipour13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 572
lastpage: 580
published: 2013-04-29 00:00:00 +0000
- title: 'Supervised Sequential Classification Under Budget Constraints'
abstract: 'In this paper we develop a framework for sequential decision making under budget constraints for multi-class classification. In many classification systems, such as medical diagnosis and homeland security, sequential decisions are often warranted. For each instance, a sensor is first chosen for acquiring measurements and then, based on the available information, one decides (rejects) to seek more measurements from a new sensor/modality or to terminate by classifying the example based on the available information. Different sensors have varying costs for acquisition, and these costs account for delay, throughput or monetary value. Consequently, we seek methods for maximizing performance of the system subject to budget constraints. We formulate a multi-stage multi-class empirical risk objective and learn sequential decision functions from training data. We show that the reject decision at each stage can be posed as supervised binary classification. We derive bounds for the VC dimension of the multi-stage system to quantify the generalization error. We compare our approach to alternative strategies on several multi-class real world datasets.'
volume: 31
URL: https://proceedings.mlr.press/v31/trapeznikov13a.html
PDF: http://proceedings.mlr.press/v31/trapeznikov13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-trapeznikov13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Kirill
family: Trapeznikov
- given: Venkatesh
family: Saligrama
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 581-589
id: trapeznikov13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 581
lastpage: 589
published: 2013-04-29 00:00:00 +0000
- title: 'On the Asymptotic Optimality of Maximum Margin Bayesian Networks'
abstract: 'Maximum margin Bayesian networks (MMBNs) are Bayesian networks with discriminatively optimized parameters. They have shown good classification performance in various applications. However, there has not been any theoretic analysis of their asymptotic performance, e.g. their Bayes consistency. For specific classes of MMBNs, i.e. MMBNs with fully connected graphs and discrete-valued nodes, we show Bayes consistency for binary-class problems and a sufficient condition for Bayes consistency in the multi-class case. We provide simple examples showing that MMBNs in their current formulation are not Bayes consistent in general. These examples are especially interesting, as the model used for the MMBNs can represent the assumed true distributions. This indicates that the current formulations of MMBNs may be deficient. Furthermore, experimental results on the generalization performance are presented.'
volume: 31
URL: https://proceedings.mlr.press/v31/tschiatschek13a.html
PDF: http://proceedings.mlr.press/v31/tschiatschek13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-tschiatschek13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Sebastian
family: Tschiatschek
- given: Franz
family: Pernkopf
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 590-598
id: tschiatschek13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 590
lastpage: 598
published: 2013-04-29 00:00:00 +0000
- title: 'Collapsed Variational Bayesian Inference for Hidden Markov Models'
abstract: 'Approximate inference for Bayesian models is dominated by two approaches, variational Bayesian inference and Markov Chain Monte Carlo. Both approaches have their own advantages and disadvantages, and they can complement each other. Recently researchers have proposed collapsed variational Bayesian inference to combine the advantages of both. Such inference methods have been successful in several models whose hidden variables are conditionally independent given the parameters. In this paper we propose two collapsed variational Bayesian inference algorithms for hidden Markov models, a popular framework for representing time series data. We validate our algorithms on the natural language processing task of unsupervised part-of-speech induction, showing that they are both more computationally efficient than sampling, and more accurate than standard variational Bayesian inference for HMMs.'
volume: 31
URL: https://proceedings.mlr.press/v31/wang13b.html
PDF: http://proceedings.mlr.press/v31/wang13b.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-wang13b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Pengyu
family: Wang
- given: Phil
family: Blunsom
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 599-607
id: wang13b
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 599
lastpage: 607
published: 2013-04-29 00:00:00 +0000
- title: 'Block Regularized Lasso for Multivariate Multi-Response Linear Regression'
abstract: 'The multivariate multi-response (MVMR) linear regression problem is investigated, in which design matrices can be distributed differently across K linear regressions. The support union of K p-dimensional regression vectors is recovered via block regularized Lasso, which uses the l_1/l_2 norm for regression vectors across K tasks. Sufficient and necessary conditions to guarantee successful recovery of the support union are characterized. More specifically, it is shown that under certain conditions on the distributions of design matrices, if n > c_p1 ψ(B^*,Σ^(1:K))\log(p-s) where c_p1 is a constant and s is the size of the support set, then the l_1/l_2 regularized Lasso correctly recovers the support union; and if n < c_p2 ψ(B^*,Σ^(1:K))\log(p-s) where c_p2 is a constant, then the l_1/l_2 regularized Lasso fails to recover the support union. In particular, ψ(B^*,Σ^(1:K)) captures the sparsity of K regression vectors and the statistical properties of the design matrices. Numerical results are provided to demonstrate the advantages of joint support union recovery using the multi-task Lasso over studying each problem individually.'
volume: 31
URL: https://proceedings.mlr.press/v31/wang13c.html
PDF: http://proceedings.mlr.press/v31/wang13c.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-wang13c.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Weiguang
family: Wang
- given: Yingbin
family: Liang
- given: Eric
family: Xing
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 608-617
id: wang13c
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 608
lastpage: 617
published: 2013-04-29 00:00:00 +0000
- title: 'Bethe Bounds and Approximating the Global Optimum'
abstract: 'Inference in general Markov random fields (MRFs) is NP-hard, though identifying the maximum a posteriori (MAP) configuration of pairwise MRFs with submodular cost functions is efficiently solvable using graph cuts. Marginal inference, however, even for this restricted class, is #P-hard. Restricting to binary pairwise models, we prove new formulations of derivatives of the Bethe free energy, provide bounds on the derivatives and bracket the locations of stationary points. Several results apply whether the model is associative or not. Applying these to discretized pseudo-marginals in the associative case, we present a polynomial time approximation scheme for global optimization of the Bethe free energy provided the maximum degree ∆=O(\log n), where n is the number of variables. Runtime is guaranteed O(ε^-3/2 n^6 Σ^3/4 Ω^3/2), where Σ=O(∆/n) is the fraction of possible edges present and Ω is a function of MRF parameters. We examine use of the algorithm in practice, demonstrating runtime that is typically much faster, and discuss several extensions.'
volume: 31
URL: https://proceedings.mlr.press/v31/weller13a.html
PDF: http://proceedings.mlr.press/v31/weller13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-weller13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Adrian
family: Weller
- given: Tony
family: Jebara
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 618-631
id: weller13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 618
lastpage: 631
published: 2013-04-29 00:00:00 +0000
- title: 'Dual Decomposition for Joint Discrete-Continuous Optimization'
abstract: 'We analyse convex formulations for combined discrete-continuous MAP inference using the dual decomposition method. As a consequence we can provide a more intuitive derivation for the resulting convex relaxation than presented in the literature. Further, we show how to strengthen the relaxation by reparametrizing the potentials; hence, convex relaxations for discrete-continuous inference do not share an important feature of LP relaxations for discrete labeling problems: incorporating unary potentials into higher order ones affects the quality of the relaxation. We argue that the convex model for discrete-continuous inference is very general and can be used as an alternative to alternation-based methods often employed for such joint inference tasks.'
volume: 31
URL: https://proceedings.mlr.press/v31/zach13a.html
PDF: http://proceedings.mlr.press/v31/zach13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-zach13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Christopher
family: Zach
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 632-640
id: zach13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 632
lastpage: 640
published: 2013-04-29 00:00:00 +0000
- title: 'Learning Social Infectivity in Sparse Low-rank Networks Using Multi-dimensional Hawkes Processes'
abstract: 'How will the behaviors of individuals in a social network be influenced by their neighbors, the authorities and the communities? Such knowledge is often hidden from us and we only observe its manifestation in the form of recurrent and time-stamped events occurring at the individuals involved. It is an important yet challenging problem to infer the network of social influence based on the temporal patterns of these historical events. We propose a convex optimization approach to discover the hidden network of social influence by modeling the recurrent events at different individuals as multi-dimensional Hawkes processes. Furthermore, our estimation procedure, using nuclear and \ell_1 norm regularization simultaneously on the parameters, is able to take into account the prior knowledge of the presence of neighbor interaction, authority influence, and community coordination. To efficiently solve the problem, we also design an algorithm ADM4 which combines techniques of the alternating direction method of multipliers and majorization minimization. We experimented with both synthetic and real world data sets, and showed that the proposed method can discover the hidden network more accurately and produce a better predictive model.'
volume: 31
URL: https://proceedings.mlr.press/v31/zhou13a.html
PDF: http://proceedings.mlr.press/v31/zhou13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-zhou13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Ke
family: Zhou
- given: Hongyuan
family: Zha
- given: Le
family: Song
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 641-649
id: zhou13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 641
lastpage: 649
published: 2013-04-29 00:00:00 +0000
- title: 'Greedy Bilateral Sketch, Completion & Smoothing'
abstract: 'Recovering a large low-rank matrix from highly corrupted, incomplete or sparse outlier overwhelmed observations is the crux of various intriguing statistical problems. We explore the power of the "greedy bilateral (GreB)" paradigm in reducing both time and sample complexities for solving these problems. GreB models a low-rank variable as a bilateral factorization, and updates the left and right factors in a mutually adaptive and greedy incremental manner. We detail how to model and solve low-rank approximation, matrix completion and robust PCA in GreB’s paradigm. On their MATLAB implementations, approximating a noisy 10000x10000 matrix of rank 500 with SVD accuracy takes 6s; the MovieLens10M matrix of size 69878x10677 can be completed in 10s from 30% of 10^7 ratings with RMSE 0.86 on the remaining 70%; the low-rank background and sparse moving outliers in a 120x160 video of 500 frames are accurately separated in 1s. This brings 30 to 100 times acceleration in solving these popular statistical problems.'
volume: 31
URL: https://proceedings.mlr.press/v31/zhou13b.html
PDF: http://proceedings.mlr.press/v31/zhou13b.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-zhou13b.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Tianyi
family: Zhou
- given: Dacheng
family: Tao
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 650-658
id: zhou13b
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 650
lastpage: 658
published: 2013-04-29 00:00:00 +0000
- title: 'Scoring anomalies: a M-estimation formulation'
abstract: 'It is the purpose of this paper to formulate the issue of scoring multivariate observations depending on their degree of abnormality/novelty as an unsupervised learning task. In the 1-d situation, this problem can be dealt with by means of tail estimation techniques, observations being viewed as all the more “abnormal” as they are located far in the tail(s) of the underlying probability distribution. In a wide variety of applications, it is desirable to have a scalar-valued “scoring” function allowing for comparing the degree of abnormality of multivariate observations. Here we formulate the issue of scoring anomalies as an M-estimation problem. A (functional) performance criterion is proposed, whose optimal elements are, as expected, nondecreasing transforms of the density. The question of empirical estimation of this criterion is tackled and preliminary statistical results related to the accuracy of partition-based techniques for optimizing empirical estimates of the performance measure are established.'
volume: 31
URL: https://proceedings.mlr.press/v31/clemencon13a.html
PDF: http://proceedings.mlr.press/v31/clemencon13a.pdf
edit: https://github.com/mlresearch//v31/edit/gh-pages/_posts/2013-04-29-clemencon13a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics'
publisher: 'PMLR'
author:
- given: Stéphan
family: Clémençon
- given: Jérémie
family: Jakubowicz
editor:
- given: Carlos M.
family: Carvalho
- given: Pradeep
family: Ravikumar
address: Scottsdale, Arizona, USA
page: 659-667
id: clemencon13a
issued:
date-parts:
- 2013
- 4
- 29
firstpage: 659
lastpage: 667
published: 2013-04-29 00:00:00 +0000