- title: 'The 35th Uncertainty in Artificial Intelligence Conference: Preface' abstract: 'The Conference on Uncertainty in Artificial Intelligence (UAI) is the premier international conference on research related to representation, inference, learning and decision making in the presence of uncertainty within the field of Artificial Intelligence. This volume contains all papers that were accepted for the 35th UAI Conference, held in Tel Aviv, Israel from July 22 to 25, 2019.' volume: 115 URL: https://proceedings.mlr.press/v115/adams20a.html PDF: http://proceedings.mlr.press/v115/adams20a/adams20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-adams20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Ryan family: Adams - given: Vibhav family: Gogate editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1-17 id: adams20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1 lastpage: 17 published: 2020-08-06 00:00:00 +0000 - title: 'Personalized Peer Truth Serum for Eliciting Multi-Attribute Personal Data' abstract: 'Several peer consistency mechanisms have been proposed to incentivize agents to honestly solve crowdsourcing tasks. These game-theoretic mechanisms evaluate the answers provided by an agent based on the correlation with answers provided by other agents ("peers") who solve the same tasks. In this paper, we consider the problem of eliciting personal attributes (e.g., body measurements) of the agents. Since attributes are personal in nature, the tasks cannot be shared between two agents. We show for the first time how to extend a peer consistency incentive mechanism, the Logarithmic Peer Truth Serum, to this setting for collecting personal attributes. When individuals report combinations of multiple personal data attributes, the correlation between them can be exploited to find peers. This new mechanism applies, for example, to collecting personal health records and other multi-attribute measurements in private settings such as smart homes. We provide a theoretical analysis of the incentive properties of the new mechanism and evaluate its performance on several public datasets; the results confirm the theoretical analysis.' volume: 115 URL: https://proceedings.mlr.press/v115/goel20a.html PDF: http://proceedings.mlr.press/v115/goel20a/goel20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-goel20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Naman family: Goel - given: Boi family: Faltings editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 18-27 id: goel20a issued: date-parts: - 2020 - 8 - 6 firstpage: 18 lastpage: 27 published: 2020-08-06 00:00:00 +0000 - title: 'Conditional Expectation Propagation' abstract: 'Expectation propagation (EP) is a powerful approximate inference algorithm. However, a critical barrier in applying EP is that the moment matching in message updates can be intractable. Handcrafting approximations is usually tricky and lacks generalizability. Importance sampling is very expensive. While Laplace propagation offers an excellent solution, it has to run numerical optimizations to find Laplace approximations in every update, which is still quite inefficient.
To overcome these practical barriers, we propose conditional expectation propagation (CEP), which performs conditional moment matching with the variables outside each message held fixed, and then takes the expectation w.r.t. their approximate posterior. The conditional moments are often analytical and much easier to derive. In the most general case, we can use (fully) factorized messages so that the conditional moments can be represented by quadrature formulas. We then compute the expectation of the conditional moments via Taylor approximations when necessary. In this way, our algorithm can always conduct efficient, analytical fixed-point iterations. Experiments on several popular models for which standard EP is available or unavailable demonstrate the advantages of CEP in both inference quality and computational efficiency.' volume: 115 URL: https://proceedings.mlr.press/v115/wang20a.html PDF: http://proceedings.mlr.press/v115/wang20a/wang20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-wang20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Zheng family: Wang - given: Shandian family: Zhe editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 28-37 id: wang20a issued: date-parts: - 2020 - 8 - 6 firstpage: 28 lastpage: 37 published: 2020-08-06 00:00:00 +0000 - title: 'A Sparse Representation-Based Approach to Linear Regression with Partially Shuffled Labels' abstract: 'Several recent papers have discussed a modification of linear regression in which the correspondence between input variables and labels is missing or erroneous, referred to as "Linear Regression with Unknown Permutation" or "Linear Regression with Shuffled Data". Prior studies of this setup have shed light on the associated statistical limits. However, practical and well-founded approaches that overcome the computational challenges of the setup are still scarce. In this paper, we translate the problem of linear regression with unknown permutation to a robust subspace recovery problem, or, alternatively, an outlier recovery problem. Outliers correspond to mismatched data, and a specific sparse representation of the data is used to detect the outliers. The proposed approach requires at least a reasonably large fraction (but potentially significantly less than 50%) of the response-predictor pairs to be in correct correspondence. This assumption is often justified, e.g., if record linkage based on additional contextual information is sufficiently accurate to enable correct matching of such a fraction of the data. It turns out that in this situation, estimation of the regression parameter on the one hand and recovery of the underlying permutation on the other hand can be decoupled, so that the computational hardness associated with the latter can be sidestepped.' volume: 115 URL: https://proceedings.mlr.press/v115/slawski20a.html PDF: http://proceedings.mlr.press/v115/slawski20a/slawski20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-slawski20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Martin family: Slawski - given: Mostafa family: Rahmani - given: Ping family: Li editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 38-48 id: slawski20a issued: date-parts: - 2020 - 8 - 6 firstpage: 38 lastpage: 48 published: 2020-08-06 00:00:00 +0000 - title: 'On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don’t Worry About its Nonsmooth Loss Function' abstract: 'Many machine learning techniques sacrifice convenient computational structures to gain estimation robustness and modeling flexibility. However, by exploring the modeling structures, we find these “sacrifices” do not always require more computational effort. To shed light on such a “free-lunch” phenomenon, we study the square-root-Lasso (SQRT-Lasso) type regression problem. Specifically, we show that the nonsmooth loss functions of SQRT-Lasso type regression ease tuning effort and gain adaptivity to inhomogeneous noise, but are not necessarily more challenging than Lasso in computation. We can directly apply proximal algorithms (e.g. proximal gradient descent, proximal Newton, and proximal quasi-Newton algorithms) without worrying about the nonsmoothness of the loss function. Theoretically, we prove that the proximal algorithms combined with the pathwise optimization scheme enjoy fast convergence guarantees with high probability. Numerical results are provided to support our theory.' volume: 115 URL: https://proceedings.mlr.press/v115/li20a.html PDF: http://proceedings.mlr.press/v115/li20a/li20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-li20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Xinguo family: Li - given: Haoming family: Jiang - given: Jarvis family: Haupt - given: Raman family: Arora - given: Han family: Liu - given: Mingyi family: Hong - given: Tuo family: Zhao editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 49-59 id: li20a issued: date-parts: - 2020 - 8 - 6 firstpage: 49 lastpage: 59 published: 2020-08-06 00:00:00 +0000 - title: 'Correlated Learning for Aggregation Systems' abstract: 'Aggregation systems (e.g., Uber, Lyft, FoodPanda, Deliveroo) have been increasingly used to improve efficiency in numerous environments, including transportation, logistics, and food and grocery delivery. In these systems, a centralized entity (e.g., Uber) aggregates supply and assigns it to demand so as to optimize a central metric such as profit, number of requests, delay, etc. Since the system optimizes a metric of importance to the centralized entity, the interests of individuals (e.g., drivers, delivery boys) can be sacrificed. Therefore, in this paper, we focus on the problem of serving individual interests, i.e., learning revenue-maximizing policies for individuals in the presence of a self-interested central entity. Since there is a large number of homogeneous learning agents, we represent the problem as an Anonymous Multi-Agent Reinforcement Learning (AyMARL) problem. By using the self-interested centralized entity as a correlation entity, we provide a novel learning mechanism that helps individual agents to maximize their individual revenue. Our Correlated Learning (CL) algorithm is able to outperform existing mechanisms on a generic simulator for aggregation systems and multiple other benchmark Multi-Agent Reinforcement Learning (MARL) problems.'
volume: 115 URL: https://proceedings.mlr.press/v115/verma20a.html PDF: http://proceedings.mlr.press/v115/verma20a/verma20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-verma20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Tanvi family: Verma - given: Pradeep family: Varakantham editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 60-70 id: verma20a issued: date-parts: - 2020 - 8 - 6 firstpage: 60 lastpage: 70 published: 2020-08-06 00:00:00 +0000 - title: 'Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias' abstract: 'We prove the main rules of causal calculus (also called do-calculus) for i/o structural causal models (ioSCMs), a generalization of a recently proposed general class of non-linear structural causal models that allow for cycles, latent confounders and arbitrary probability distributions. We also generalize adjustment criteria and formulas from the acyclic setting to the general one (i.e. ioSCMs). Such criteria then allow one to estimate (conditional) causal effects from observational data that was (partially) gathered under selection bias and cycles. This generalizes the backdoor criterion, the selection-backdoor criterion and extensions of these to arbitrary ioSCMs. Together, our results thus enable causal reasoning in the presence of cycles, latent confounders and selection bias. Finally, we extend the ID algorithm for the identification of causal effects to ioSCMs.' volume: 115 URL: https://proceedings.mlr.press/v115/forre20a.html PDF: http://proceedings.mlr.press/v115/forre20a/forre20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-forre20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Patrick family: Forré - given: Joris M. family: Mooij editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 71-80 id: forre20a issued: date-parts: - 2020 - 8 - 6 firstpage: 71 lastpage: 80 published: 2020-08-06 00:00:00 +0000 - title: 'Variational Regret Bounds for Reinforcement Learning' abstract: 'We consider undiscounted reinforcement learning in Markov decision processes (MDPs) where \textit{both} the reward functions and the state-transition probabilities may vary (gradually or abruptly) over time. For this problem setting, we propose an algorithm and provide performance guarantees for the regret evaluated against the optimal non-stationary policy. The upper bound on the regret is given in terms of the total variation in the MDP. This is the first variational regret bound for the general reinforcement learning setting.' volume: 115 URL: https://proceedings.mlr.press/v115/ortner20a.html PDF: http://proceedings.mlr.press/v115/ortner20a/ortner20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-ortner20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Ronald family: Ortner - given: Pratik family: Gajane - given: Peter family: Auer editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 81-90 id: ortner20a issued: date-parts: - 2020 - 8 - 6 firstpage: 81 lastpage: 90 published: 2020-08-06 00:00:00 +0000 - title: 'Recommendation from Raw Data with Adaptive Compound Poisson Factorization' abstract: 'Count data are often used in recommender systems: they are widespread (song play counts, product purchases, clicks on web pages) and can reveal user preference without any explicit rating from the user. Such data are known to be sparse, over-dispersed and bursty, which makes their direct use in recommender systems challenging, often leading to pre-processing steps such as binarization. The aim of this paper is to build recommender systems from these raw data, by means of the recently proposed compound Poisson Factorization (cPF). The paper''s contributions are three-fold: we present a unified framework for discrete data (dcPF), leading to an adaptive and scalable algorithm; we show that our framework achieves a trade-off between Poisson Factorization (PF) applied to raw and binarized data; we study four specific instances that are relevant to recommendation and exhibit new links with combinatorics. Experiments with three different datasets show that dcPF is able to effectively adjust to over-dispersion, leading to better recommendation scores when compared with PF on either raw or binarized data.' volume: 115 URL: https://proceedings.mlr.press/v115/gouvert20a.html PDF: http://proceedings.mlr.press/v115/gouvert20a/gouvert20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-gouvert20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Olivier family: Gouvert - given: Thomas family: Oberlin - given: Cédric family: Févotte editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 91-101 id: gouvert20a issued: date-parts: - 2020 - 8 - 6 firstpage: 91 lastpage: 101 published: 2020-08-06 00:00:00 +0000 - title: 'One-Shot Marginal MAP Inference in Markov Random Fields' abstract: 'Statistical inference in Markov random fields (MRFs) is NP-hard in all but the simplest of cases. As a result, many algorithms, particularly in the case of discrete random variables, have been developed to perform approximate inference in practice. However, most of these methods scale poorly, cannot be applied to continuous random variables, or are too slow to be used in situations that call for repeated statistical inference on the same model. In this work, we propose a novel variational inference strategy that is flexible enough to handle both continuous and discrete random variables, efficient enough to handle repeated statistical inferences, and scalable enough, via modern GPUs, to be practical on MRFs with hundreds of thousands of random variables. We prove that our approach overcomes weaknesses of the current approaches and demonstrate the efficacy of our approach on both synthetic models and real-world applications.'
volume: 115 URL: https://proceedings.mlr.press/v115/xiong20a.html PDF: http://proceedings.mlr.press/v115/xiong20a/xiong20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-xiong20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Hao family: Xiong - given: Yuanzhen family: Guo - given: Yibo family: Yang - given: Nicholas family: Ruozzi editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 102-112 id: xiong20a issued: date-parts: - 2020 - 8 - 6 firstpage: 102 lastpage: 112 published: 2020-08-06 00:00:00 +0000 - title: 'Truly Proximal Policy Optimization' abstract: 'Proximal policy optimization (PPO) is one of the most successful deep reinforcement learning methods, achieving state-of-the-art performance across a wide range of challenging tasks. However, its optimization behavior is still far from being fully understood. In this paper, we show that PPO can neither strictly restrict the probability ratio as it attempts to do nor enforce a well-defined trust region constraint, which means that it may still suffer from the risk of performance instability. To address this issue, we present an enhanced PPO method, named Trust Region-based PPO with Rollback (TR-PPO-RB). Two critical improvements are made in our method: 1) it adopts a new clipping function to support a rollback behavior to restrict the ratio between the new policy and the old one; 2) the triggering condition for clipping is replaced with a trust region-based one, which is theoretically justified according to the trust region theorem. By adhering more truly to the “proximal” property, i.e., restricting the policy within the trust region, the new algorithm appears to improve the original PPO in both stability and sample efficiency.' volume: 115 URL: https://proceedings.mlr.press/v115/wang20b.html PDF: http://proceedings.mlr.press/v115/wang20b/wang20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-wang20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Yuhui family: Wang - given: Hao family: He - given: Xiaoyang family: Tan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 113-122 id: wang20b issued: date-parts: - 2020 - 8 - 6 firstpage: 113 lastpage: 122 published: 2020-08-06 00:00:00 +0000 - title: 'Learning Factored Markov Decision Processes with Unawareness' abstract: 'Methods for learning and planning in sequential decision problems often assume the learner is aware of all possible states and actions in advance. This assumption is sometimes untenable. In this paper, we give a method to learn factored Markov decision processes from both domain exploration and expert assistance, which guarantees convergence to near-optimal behaviour, even when the agent begins unaware of factors critical to success. Our experiments show our agent learns optimal behaviour on both small and large problems, and that conserving information upon discovering new possibilities results in faster convergence.'
volume: 115 URL: https://proceedings.mlr.press/v115/innes20a.html PDF: http://proceedings.mlr.press/v115/innes20a/innes20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-innes20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Craig family: Innes - given: Alex family: Lascarides editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 123-133 id: innes20a issued: date-parts: - 2020 - 8 - 6 firstpage: 123 lastpage: 133 published: 2020-08-06 00:00:00 +0000 - title: 'Expressive Priors in Bayesian Neural Networks: Kernel Combinations and Periodic Functions' abstract: 'A simple, flexible approach to creating expressive priors in Gaussian process (GP) models makes new kernels from a combination of basic kernels, e.g., summing a periodic and a linear kernel can capture seasonal variation with a long-term trend. Despite a well-studied link between GPs and Bayesian neural networks (BNNs), the BNN analogue of this has not yet been explored. This paper derives BNN architectures mirroring such kernel combinations. Furthermore, it shows how BNNs can produce periodic kernels, which are often useful in this context. These ideas provide a principled approach to designing BNNs that incorporate prior knowledge about a function. We showcase the practical value of these ideas with illustrative experiments in supervised and reinforcement learning settings.' volume: 115 URL: https://proceedings.mlr.press/v115/pearce20a.html PDF: http://proceedings.mlr.press/v115/pearce20a/pearce20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-pearce20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Tim family: Pearce - given: Russell family: Tsuchida - given: Mohamed family: Zaki - given: Alexandra family: Brintrup - given: Andy family: Neely editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 134-144 id: pearce20a issued: date-parts: - 2020 - 8 - 6 firstpage: 134 lastpage: 144 published: 2020-08-06 00:00:00 +0000 - title: 'Countdown Regression: Sharp and Calibrated Survival Predictions' abstract: 'Probabilistic survival predictions (i.e. personalized survival curves) from models trained with Maximum Likelihood Estimation (MLE) can have high, and sometimes unacceptably high, variance. The field of meteorology, where the paradigm of maximizing sharpness subject to calibration is popular, has addressed this problem by using scoring rules beyond MLE, such as the Continuous Ranked Probability Score (CRPS). In this paper we present the \emph{Survival-CRPS}, a generalization of the CRPS to the survival prediction setting, with right-censored and interval-censored variants. We evaluate our ideas on the mortality prediction task using two different Electronic Health Record (EHR) data sets (STARR and MIMIC-III) covering millions of patients, with suitable deep neural network architectures: a Recurrent Neural Network (RNN) for STARR and a Fully Connected Network (FCN) for MIMIC-III. We compare results between the two different scoring rules while keeping the network architecture and data fixed, and show that models trained with Survival-CRPS result in sharper predictive distributions compared to those trained by MLE, while still maintaining calibration.'
volume: 115 URL: https://proceedings.mlr.press/v115/avati20a.html PDF: http://proceedings.mlr.press/v115/avati20a/avati20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-avati20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Anand family: Avati - given: Tony family: Duan - given: Sharon family: Zhou - given: Kenneth family: Jung - given: Nigam H. family: Shah - given: Andrew Y. family: Ng editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 145-155 id: avati20a issued: date-parts: - 2020 - 8 - 6 firstpage: 145 lastpage: 155 published: 2020-08-06 00:00:00 +0000 - title: 'Reducing Exploration of Dying Arms in Mortal Bandits' abstract: 'Mortal bandits have proven to be extremely useful for providing news article recommendations, running automated online advertising campaigns, and for other applications where the set of available options changes over time. Previous work on this problem showed how to regulate exploration of new arms when they have recently appeared, but it does not adapt when the arms are about to disappear. Since in most applications we can determine either exactly or approximately when arms will disappear, we can leverage this information to improve performance: we should not be exploring arms that are about to disappear. We provide adapted algorithms, regret bounds, and experiments, showing a clear benefit from regulating greed (exploration/exploitation) for arms that will soon disappear. We illustrate numerical performance on the Yahoo! Front Page Today Module User Click Log Dataset.' volume: 115 URL: https://proceedings.mlr.press/v115/traca20a.html PDF: http://proceedings.mlr.press/v115/traca20a/traca20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-traca20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Stefano family: Tracà - given: Cynthia family: Rudin - given: Weiyu family: Yan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 156-163 id: traca20a issued: date-parts: - 2020 - 8 - 6 firstpage: 156 lastpage: 163 published: 2020-08-06 00:00:00 +0000 - title: 'Comparing EM with GD in Mixture Models of Two Components' abstract: 'The expectation-maximization (EM) algorithm has been widely used in minimizing the negative log likelihood (also known as cross entropy) of mixture models. However, little is understood about the goodness of the fixed points it converges to. In this paper, we study the regions where one component is missing in two-component mixture models, which we call one-cluster regions. We analyze the propensity of such regions to trap EM and gradient descent (GD) for mixtures of two Gaussians and mixtures of two Bernoullis. In the case of Gaussian mixtures, EM escapes one-cluster regions exponentially fast, while GD escapes them linearly fast. In the case of mixtures of Bernoullis, we find that there exist one-cluster regions that are stable for GD and therefore trap GD, but those regions are unstable for EM, allowing EM to escape. Those regions are local minima that appear universally in experiments and can be arbitrarily bad. This work implies that EM is less likely than GD to converge to certain bad local optima in mixture models.
' volume: 115 URL: https://proceedings.mlr.press/v115/zhang20a.html PDF: http://proceedings.mlr.press/v115/zhang20a/zhang20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-zhang20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Guojun family: Zhang - given: Pascal family: Poupart - given: George family: Trimponias editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 164-174 id: zhang20a issued: date-parts: - 2020 - 8 - 6 firstpage: 164 lastpage: 174 published: 2020-08-06 00:00:00 +0000 - title: 'Efficient Search-Based Weighted Model Integration' abstract: 'Weighted model integration (WMI) extends weighted model counting (WMC) to the integration of functions over mixed discrete-continuous domains. It has shown tremendous promise for solving inference problems in graphical models and probabilistic programming. Yet, state-of-the-art tools for WMI are limited in terms of performance and ignore the independence structure that is crucial to improving efficiency. To address this limitation, we propose an efficient model integration algorithm for theories with tree primal graphs. We exploit the sparse graph structure by using search to perform integration. Our algorithm greatly improves the computational efficiency on such problems and exploits context-specific independence between variables. Experimental results show dramatic speedups compared to existing WMI solvers on problems with tree-shaped dependencies.' volume: 115 URL: https://proceedings.mlr.press/v115/zeng20a.html PDF: http://proceedings.mlr.press/v115/zeng20a/zeng20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-zeng20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Zhe family: Zeng - given: Guy prefix: Van den family: Broeck editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 175-185 id: zeng20a issued: date-parts: - 2020 - 8 - 6 firstpage: 175 lastpage: 185 published: 2020-08-06 00:00:00 +0000 - title: 'Causal Discovery with General Non-Linear Relationships using Non-Linear ICA' abstract: 'We consider the problem of inferring causal relationships between two or more passively observed variables. While the problem of such causal discovery has been extensively studied, especially in the bivariate setting, the majority of current methods assume a linear causal relationship, and the few methods which consider non-linear dependencies usually make the assumption of additive noise. Here, we propose a framework through which we can perform causal discovery in the presence of general non-linear relationships. The proposed method is based on recent progress in non-linear independent component analysis and exploits the non-stationarity of observations in order to recover the underlying sources or latent disturbances. We show rigorously that in the case of bivariate causal discovery, such non-linear ICA can be used to infer the causal direction via a series of independence tests. We further propose an alternative measure of causal direction based on asymptotic approximations to the likelihood ratio, as well as an extension to multivariate causal discovery.
We demonstrate the capabilities of the proposed method via a series of simulation studies and conclude with an application to neuroimaging data. ' volume: 115 URL: https://proceedings.mlr.press/v115/monti20a.html PDF: http://proceedings.mlr.press/v115/monti20a/monti20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-monti20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Ricardo Pio family: Monti - given: Kun family: Zhang - given: Aapo family: Hyvärinen editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 186-195 id: monti20a issued: date-parts: - 2020 - 8 - 6 firstpage: 186 lastpage: 195 published: 2020-08-06 00:00:00 +0000 - title: 'BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback' abstract: 'In this paper, we study the problem of safe online learning to re-rank, where user feedback is used to improve the quality of displayed lists. Learning to rank has traditionally been studied in two settings. In the offline setting, rankers are typically learned from relevance labels created by judges. This approach has generally become standard in industrial applications of ranking, such as search. However, this approach lacks exploration and thus is limited by the information content of the offline training data. In the online setting, an algorithm can experiment with lists and learn from feedback on them in a sequential fashion. Bandit algorithms are well-suited for this setting but they tend to learn user preferences from scratch, which results in a high initial cost of exploration. This poses an additional challenge of safe exploration in ranked lists. We propose BubbleRank, a bandit algorithm for safe re-ranking that combines the strengths of both the offline and online settings. The algorithm starts with an initial base list and improves it online by gradually exchanging higher-ranked less attractive items for lower-ranked more attractive items. We prove an upper bound on the n-step regret of BubbleRank that degrades gracefully with the quality of the initial base list. Our theoretical findings are supported by extensive experiments on a large-scale real-world click dataset.' volume: 115 URL: https://proceedings.mlr.press/v115/li20b.html PDF: http://proceedings.mlr.press/v115/li20b/li20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-li20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Chang family: Li - given: Branislav family: Kveton - given: Tor family: Lattimore - given: Ilya family: Markov - given: Maarten family: de Rijke - given: Csaba family: Szepesvári - given: Masrour family: Zoghi editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 196-206 id: li20b issued: date-parts: - 2020 - 8 - 6 firstpage: 196 lastpage: 206 published: 2020-08-06 00:00:00 +0000 - title: 'Coordinating Users of Shared Facilities via Data-driven Predictive Assistants and Game Theory' abstract: 'We study data-driven assistants that provide congestion forecasts to users of shared facilities (roads, cafeterias, etc.), to support coordination between them, and increase efficiency of such collective systems. 
Key questions are: (1) when and how much can (accurate) predictions help for coordination, and (2) which assistant algorithms reach optimal predictions? First, we lay the conceptual groundwork for this setting, where user preferences are a priori unknown and predictions influence outcomes. Addressing (1), we establish conditions under which self-fulfilling prophecies, i.e., “perfect” (probabilistic) predictions of what will happen, solve the coordination problem in the game-theoretic sense of selecting a Bayesian Nash equilibrium (BNE). Next, we prove that such prophecies exist even in large-scale settings where only aggregated statistics about users are available. This entails a new (nonatomic) BNE existence result. Addressing (2), we propose two assistant algorithms that sequentially learn from users’ reactions, together with optimality/convergence guarantees. We validate one of them in a large real-world experiment.' volume: 115 URL: https://proceedings.mlr.press/v115/geiger20a.html PDF: http://proceedings.mlr.press/v115/geiger20a/geiger20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-geiger20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Philipp family: Geiger - given: Michel family: Besserve - given: Justus family: Winkelmann - given: Claudius family: Proissl - given: Bernhard family: Schölkopf editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 207-216 id: geiger20a issued: date-parts: - 2020 - 8 - 6 firstpage: 207 lastpage: 216 published: 2020-08-06 00:00:00 +0000 - title: 'The Incomplete Rosetta Stone problem: Identifiability results for Multi-view Nonlinear ICA' abstract: 'We consider the problem of recovering a common latent source with independent components from multiple views. This applies to settings in which a variable is measured with multiple experimental modalities, and where the goal is to synthesize the disparate measurements into a single unified representation. We consider the case in which the observed views are a nonlinear mixing of component-wise corruptions of the sources. When the views are considered separately, this reduces to nonlinear Independent Component Analysis (ICA), for which it is provably impossible to undo the mixing. We present novel identifiability proofs that this is possible when the multiple views are considered jointly, showing that the mixing can theoretically be undone using function approximators such as deep neural networks. In contrast to known identifiability results for nonlinear ICA, we prove that independent latent sources with arbitrary mixing can be recovered as long as multiple, sufficiently different noisy views are available.' volume: 115 URL: https://proceedings.mlr.press/v115/gresele20a.html PDF: http://proceedings.mlr.press/v115/gresele20a/gresele20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-gresele20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Luigi family: Gresele - given: Paul K. family: Rubenstein - given: Arash family: Mehrjou - given: Francesco family: Locatello - given: Bernhard family: Schölkopf editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 217-227 id: gresele20a issued: date-parts: - 2020 - 8 - 6 firstpage: 217 lastpage: 227 published: 2020-08-06 00:00:00 +0000 - title: 'Random Clique Covers for Graphs with Local Density and Global Sparsity' abstract: 'Large real-world graphs tend to be sparse, but they often contain many densely connected subgraphs and exhibit high clustering coefficients. While recent random graph models can capture this sparsity, they ignore the local density, or vice versa. We develop a Bayesian nonparametric graph model based on random edge clique covers, and show that this model can capture a power-law degree distribution, global sparsity and a non-vanishing local clustering coefficient. This distribution can be used directly as a prior on observed graphs, or as part of a hierarchical Bayesian model for inferring latent graph structures.' volume: 115 URL: https://proceedings.mlr.press/v115/williamson20a.html PDF: http://proceedings.mlr.press/v115/williamson20a/williamson20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-williamson20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Sinead A. family: Williamson - given: Mauricio family: Tec editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 228-238 id: williamson20a issued: date-parts: - 2020 - 8 - 6 firstpage: 228 lastpage: 238 published: 2020-08-06 00:00:00 +0000 - title: 'Randomized Iterative Algorithms for Fisher Discriminant Analysis' abstract: 'Fisher discriminant analysis (FDA) is a widely used method for classification and dimensionality reduction. When the number of predictor variables greatly exceeds the number of observations, one of the alternatives to conventional FDA is regularized Fisher discriminant analysis (RFDA). In this paper, we present a simple, iterative, sketching-based algorithm for RFDA that comes with provable accuracy guarantees when compared to the conventional approach. Our analysis builds upon two simple structural results that boil down to randomized matrix multiplication, a fundamental and well-understood primitive of randomized linear algebra. We analyze the behavior of RFDA when leverage scores and ridge leverage scores are used to select predictor variables, and prove that accurate approximations can be achieved by a sample whose size depends on the effective degrees of freedom of the RFDA problem. Our results yield significant improvements over existing approaches and our empirical evaluations support our theoretical analyses.' volume: 115 URL: https://proceedings.mlr.press/v115/chowdhury20a.html PDF: http://proceedings.mlr.press/v115/chowdhury20a/chowdhury20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-chowdhury20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Agniva family: Chowdhury - given: Jiasen family: Yang - given: Petros family: Drineas editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 239-249 id: chowdhury20a issued: date-parts: - 2020 - 8 - 6 firstpage: 239 lastpage: 249 published: 2020-08-06 00:00:00 +0000 - title: 'Dynamic Trip-Vehicle Dispatch with Scheduled and On-Demand Requests' abstract: 'Transportation service providers that dispatch drivers and vehicles to riders have started to support both on-demand ride requests posted in real time and rides scheduled in advance, leading to new challenges which, to the best of our knowledge, have not been addressed by existing work. To fill the gap, we design novel trip-vehicle dispatch algorithms to handle both types of requests while taking into account an estimated request distribution of on-demand requests. At the core of the algorithms is the newly proposed Constrained Spatio-Temporal value function (CST-function), which is polynomial-time computable and represents the expected value a vehicle could gain with the constraint that it needs to arrive at a specific location at a given time. Built upon CST-function, we design a randomized best-fit algorithm for scheduled requests and an online planning algorithm for on-demand requests given the scheduled requests as constraints. We evaluate the algorithms through extensive experiments on a real-world dataset of an online ride-hailing platform.' volume: 115 URL: https://proceedings.mlr.press/v115/huang20a.html PDF: http://proceedings.mlr.press/v115/huang20a/huang20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-huang20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Taoan family: Huang - given: Bohui family: Fang - given: Xiaohui family: Bei - given: Fei family: Fang editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 250-260 id: huang20a issued: date-parts: - 2020 - 8 - 6 firstpage: 250 lastpage: 260 published: 2020-08-06 00:00:00 +0000 - title: 'Fall of Empires: Breaking Byzantine-tolerant SGD by Inner Product Manipulation' abstract: 'Recently, new defense techniques have been developed to tolerate Byzantine failures for distributed machine learning. The Byzantine model captures workers that behave arbitrarily, including malicious and compromised workers. In this paper, we break two prevailing Byzantine-tolerant techniques. Specifically, we show that two robust aggregation methods for synchronous SGD, namely coordinate-wise median and Krum, can be broken using new attack strategies based on inner product manipulation. We prove our results theoretically and validate them empirically.' volume: 115 URL: https://proceedings.mlr.press/v115/xie20a.html PDF: http://proceedings.mlr.press/v115/xie20a/xie20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-xie20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Cong family: Xie - given: Oluwasanmi family: Koyejo - given: Indranil family: Gupta editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 261-270 id: xie20a issued: date-parts: - 2020 - 8 - 6 firstpage: 261 lastpage: 270 published: 2020-08-06 00:00:00 +0000 - title: 'Adaptive Hashing for Model Counting' abstract: 'Randomized hashing algorithms have seen recent success in providing bounds on the model count of a propositional formula.
These methods repeatedly check the satisfiability of a formula subject to increasingly stringent random constraints. Key to these approaches is the choice of a fixed family of hash functions that strikes a good balance between computational efficiency and statistical guarantees for a hypothetical worst case formula. In this paper we propose a scheme where the family of hash functions is chosen adaptively, based on properties of the specific input formula. Akin to adaptive importance sampling, we use solutions to the formula (found during the bounding procedure of current methods) to estimate properties of the solution set, which guides the construction of random constraints. Additionally, we introduce an orthogonal variance reduction technique that is broadly applicable to hashing based methods. We empirically show that, when combined, these approaches lead to better lower bounds on existing benchmarks, with a median improvement factor of 2^13 over 1,198 propositional formulas.' volume: 115 URL: https://proceedings.mlr.press/v115/kuck20a.html PDF: http://proceedings.mlr.press/v115/kuck20a/kuck20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-kuck20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Jonathan family: Kuck - given: Tri family: Dao - given: Shengjia family: Zhao - given: Burak family: Bartan - given: Ashish family: Sabharwal - given: Stefano family: Ermon editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 271-280 id: kuck20a issued: date-parts: - 2020 - 8 - 6 firstpage: 271 lastpage: 280 published: 2020-08-06 00:00:00 +0000 - title: 'Towards a Better Understanding and Regularization of GAN Training Dynamics' abstract: 'Generative adversarial networks (GANs) are notoriously difficult to train and the reasons underlying their (non-)convergence behaviors are still not completely understood. By first considering a simple yet representative GAN example, we mathematically analyze its local convergence behavior in a non-asymptotic way. Furthermore, the analysis is extended to general GANs under certain assumptions. We find that in order to ensure a good convergence rate, two factors of the Jacobian in the GAN training dynamics should be simultaneously avoided, which are (i) the Phase Factor, i.e., the Jacobian has complex eigenvalues with a large imaginary-to-real ratio, and (ii) the Conditioning Factor, i.e., the Jacobian is ill-conditioned. Previous methods of regularizing the Jacobian can only alleviate one of these two factors, while making the other more severe. Thus we propose a new JAcobian REgularization (JARE) for GANs, which simultaneously addresses both factors by construction. Finally, we conduct experiments that confirm our theoretical analysis and demonstrate the advantages of JARE over previous methods in stabilizing GANs.' volume: 115 URL: https://proceedings.mlr.press/v115/nie20a.html PDF: http://proceedings.mlr.press/v115/nie20a/nie20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-nie20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Weili family: Nie - given: Ankit B. family: Patel editor: - given: Ryan P. 
family: Adams - given: Vibhav family: Gogate page: 281-291 id: nie20a issued: date-parts: - 2020 - 8 - 6 firstpage: 281 lastpage: 291 published: 2020-08-06 00:00:00 +0000 - title: 'Domain Generalization via Multidomain Discriminant Analysis' abstract: 'Domain generalization (DG) aims to incorporate knowledge from multiple source domains into a single model that could generalize well on unseen target domains. This problem is ubiquitous in practice since the distributions of the target data may rarely be identical to those of the source data. In this paper, we propose Multidomain Discriminant Analysis (MDA) to address DG of classification tasks in general situations. MDA learns a domain-invariant feature transformation that aims to achieve appealing properties, including a minimal divergence among domains within each class, a maximal separability among classes, and overall maximal compactness of all classes. Furthermore, we provide bounds on the excess risk and generalization error via learning-theoretic analysis. Comprehensive experiments on synthetic and real benchmark datasets demonstrate the effectiveness of MDA.' volume: 115 URL: https://proceedings.mlr.press/v115/hu20a.html PDF: http://proceedings.mlr.press/v115/hu20a/hu20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-hu20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Shoubo family: Hu - given: Kun family: Zhang - given: Zhitang family: Chen - given: Laiwan family: Chan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 292-302 id: hu20a issued: date-parts: - 2020 - 8 - 6 firstpage: 292 lastpage: 302 published: 2020-08-06 00:00:00 +0000 - title: 'Efficient Planning Under Uncertainty with Incremental Refinement' abstract: 'Online planning under uncertainty for robots and similar agents has very strict performance requirements in order to achieve reasonable behavior in complex domains with limited resources. The underlying process of decision-making and information gathering is correctly modeled by POMDPs, but their complexity makes many interesting and challenging problems virtually intractable. We address this issue by introducing a method to estimate relevance values for elements of a planning domain, allowing an agent to focus on promising features. This approach reduces the effective dimensionality of problems, allowing an agent to plan faster and collect higher rewards. Experimental validation was performed on two challenging POMDPs that resemble real-world robotic task planning, where it is crucial to interleave planning and acting in an efficient manner.' volume: 115 URL: https://proceedings.mlr.press/v115/saborio20a.html PDF: http://proceedings.mlr.press/v115/saborio20a/saborio20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-saborio20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Juan Carlos family: Saborío - given: Joachim family: Hertzberg editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 303-312 id: saborio20a issued: date-parts: - 2020 - 8 - 6 firstpage: 303 lastpage: 312 published: 2020-08-06 00:00:00 +0000 - title: 'Cubic Regularization with Momentum for Nonconvex Optimization' abstract: 'Momentum is a popular technique to accelerate convergence in practical training, and its impact on convergence guarantees has been well studied for first-order algorithms. However, such a successful acceleration technique has not yet been proposed for second-order algorithms in nonconvex optimization. In this paper, we apply the momentum scheme to cubic regularized (CR) Newton’s method and explore the potential for acceleration. Our numerical experiments on various nonconvex optimization problems demonstrate that the momentum scheme can substantially facilitate the convergence of cubic regularization, and perform even better than Nesterov’s acceleration scheme for CR. Theoretically, we prove that CR under momentum achieves the best possible convergence rate to a second-order stationary point for nonconvex optimization. Moreover, we study the proposed algorithm for solving problems satisfying an error bound condition and establish a local quadratic convergence rate. Then, particularly for finite-sum problems, we show that the proposed algorithm can allow computational inexactness that reduces the overall sample complexity without degrading the convergence rate.' volume: 115 URL: https://proceedings.mlr.press/v115/wang20c.html PDF: http://proceedings.mlr.press/v115/wang20c/wang20c.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-wang20c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Zhe family: Wang - given: Yi family: Zhou - given: Yingbin family: Liang - given: Guanghui family: Lan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 313-322 id: wang20c issued: date-parts: - 2020 - 8 - 6 firstpage: 313 lastpage: 322 published: 2020-08-06 00:00:00 +0000 - title: 'Stability of Linear Structural Equation Models of Causal Inference' abstract: 'We consider numerical stability of the parameter recovery problem in the Linear Structural Equation Model (LSEM) of causal inference. Numerical stability is essential for the recovered parameters to be reliable. A long line of work starting from Wright (1920) has focused on understanding which sub-classes of LSEM allow for efficient parameter recovery. Despite decades of study, this question is not yet fully resolved. The goal of the present paper is complementary to this line of work: we want to understand the stability of the recovery problem in the cases when efficient recovery is possible. Numerical stability of Pearl’s notion of causality was first studied in Schulman and Srivastava (2016) using the concept of condition number, where they provide ill-conditioned examples. In this work, we provide a condition number analysis for the LSEM. First, we prove that under a sufficient condition, for a certain sub-class of LSEMs that are bow-free (Brito and Pearl (2002)), parameter recovery is numerically stable. We further prove that randomly chosen input parameters for this family satisfy the condition with a substantial probability. Hence for this family, on a large subset of parameter space, recovery is stable. Next, we construct an example of an LSEM on four vertices with unbounded condition number.
We then corroborate our theoretical findings via simulations as well as real-world experiments for a sociology application. Finally, we provide a general heuristic for estimating the condition number of any LSEM instance.' volume: 115 URL: https://proceedings.mlr.press/v115/sankararaman20a.html PDF: http://proceedings.mlr.press/v115/sankararaman20a/sankararaman20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-sankararaman20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Karthik Abhinav family: Sankararaman - given: Anand family: Louis - given: Navin family: Goyal editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 323-333 id: sankararaman20a issued: date-parts: - 2020 - 8 - 6 firstpage: 323 lastpage: 333 published: 2020-08-06 00:00:00 +0000 - title: 'Random Sum-Product Networks: A Simple and Effective Approach to Probabilistic Deep Learning' abstract: 'Sum-product networks (SPNs) are expressive probabilistic models with a rich set of exact and efficient inference routines. However, in order to guarantee exact inference, they require specific structural constraints, which complicate learning SPNs from data. As a result, most SPN structure learners proposed so far are tedious to tune, do not scale easily, and are not easily integrated with deep learning frameworks. In this paper, we follow a simple “deep learning” approach: we generate unspecialized random structures, scalable to millions of parameters, and subsequently apply GPU-based optimization. Somewhat surprisingly, our models often perform on par with state-of-the-art SPN structure learners and deep neural networks on a diverse range of generative and discriminative scenarios. At the same time, our models yield well-calibrated uncertainties, and stand out among most deep generative and discriminative models in being robust to missing features and being able to detect anomalies.' volume: 115 URL: https://proceedings.mlr.press/v115/peharz20a.html PDF: http://proceedings.mlr.press/v115/peharz20a/peharz20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-peharz20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Robert family: Peharz - given: Antonio family: Vergari - given: Karl family: Stelzner - given: Alejandro family: Molina - given: Xiaoting family: Shao - given: Martin family: Trapp - given: Kristian family: Kersting - given: Zoubin family: Ghahramani editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 334-344 id: peharz20a issued: date-parts: - 2020 - 8 - 6 firstpage: 334 lastpage: 344 published: 2020-08-06 00:00:00 +0000 - title: 'Towards Robust Relational Causal Discovery' abstract: 'We consider the problem of learning causal relationships from relational data. Existing approaches rely on queries to a relational conditional independence (RCI) oracle to establish and orient causal relations in such a setting. In practice, queries to an RCI oracle have to be replaced by reliable tests for RCI against available data. Relational data present several unique challenges in testing for RCI. We study the conditions under which traditional iid-based CI tests yield reliable answers to RCI queries against relational data.
We show how to conduct CI tests against relational data to robustly recover the underlying relational causal structure. Results of our experiments demonstrate the effectiveness of our proposed approach.' volume: 115 URL: https://proceedings.mlr.press/v115/lee20a.html PDF: http://proceedings.mlr.press/v115/lee20a/lee20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-lee20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Sanghack family: Lee - given: Vasant family: Honavar editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 345-355 id: lee20a issued: date-parts: - 2020 - 8 - 6 firstpage: 345 lastpage: 355 published: 2020-08-06 00:00:00 +0000 - title: 'The Role of Memory in Stochastic Optimization' abstract: 'The choice of how to retain information about past gradients dramatically affects the convergence properties of state-of-the-art stochastic optimization methods, such as Heavy-ball, Nesterov’s momentum, RMSprop and Adam. Building on this observation, we use stochastic differential equations (SDEs) to explicitly study the role of memory in gradient-based algorithms. We first derive a general continuous-time model that can incorporate arbitrary types of memory, for both deterministic and stochastic settings. We provide convergence guarantees for this SDE for weakly-quasi-convex and quadratically growing functions. We then demonstrate how to discretize this SDE to get a flexible discrete-time algorithm that can implement a broad spectrum of memories ranging from short- to long-term. Not only does this algorithm increase the degrees of freedom in algorithmic choice for practitioners, but it also comes with better stability properties than classical momentum in the convex stochastic setting. In particular, no iterate averaging is needed for convergence. Interestingly, our analysis also provides a novel interpretation of Nesterov’s momentum as stable gradient amplification and highlights a possible reason for its unstable behavior in the (convex) stochastic setting. Furthermore, we discuss the use of long-term memory for second-moment estimation in adaptive methods, such as Adam and RMSprop. Finally, we provide an extensive experimental study of the effect of different types of memory in both convex and nonconvex settings.' volume: 115 URL: https://proceedings.mlr.press/v115/orvieto20a.html PDF: http://proceedings.mlr.press/v115/orvieto20a/orvieto20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-orvieto20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Antonio family: Orvieto - given: Jonas family: Kohler - given: Aurelien family: Lucchi editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 356-366 id: orvieto20a issued: date-parts: - 2020 - 8 - 6 firstpage: 356 lastpage: 366 published: 2020-08-06 00:00:00 +0000 - title: 'Random Search and Reproducibility for Neural Architecture Search' abstract: 'Neural architecture search (NAS) is a promising research direction that has the potential to replace expert-designed networks with learned, task-specific architectures.
In order to help ground the empirical results in this field, we propose new NAS baselines that build on the following observations: (i) NAS is a specialized hyperparameter optimization problem; and (ii) random search is a competitive baseline for hyperparameter optimization. Leveraging these observations, we evaluate both random search with early-stopping and a novel random search with weight-sharing algorithm on two standard NAS benchmarks—PTB and CIFAR-10. Our results show that random search with early-stopping is a competitive NAS baseline, e.g., it performs at least as well as ENAS, a leading NAS method, on both benchmarks. Additionally, random search with weight-sharing outperforms random search with early-stopping, achieving a state-of-the-art NAS result on PTB and a highly competitive result on CIFAR-10. Finally, we explore the existing reproducibility issues of published NAS results.' volume: 115 URL: https://proceedings.mlr.press/v115/li20c.html PDF: http://proceedings.mlr.press/v115/li20c/li20c.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-li20c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Liam family: Li - given: Ameet family: Talwalkar editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 367-377 id: li20c issued: date-parts: - 2020 - 8 - 6 firstpage: 367 lastpage: 377 published: 2020-08-06 00:00:00 +0000 - title: 'Joint Nonparametric Precision Matrix Estimation with Confounding' abstract: 'We consider the problem of precision matrix estimation where, due to extraneous confounding of the underlying precision matrix, the data are independent but not identically distributed. While such confounding occurs in many scientific problems, our approach is inspired by recent neuroscientific research suggesting that brain function, as measured using functional magnetic resonance imaging (fMRI), is susceptible to confounding by physiological noise such as breathing and subject motion. Following the scientific motivation, we propose a graphical model, which in turn motivates a joint nonparametric estimator. We provide theoretical guarantees for the consistency and the convergence rate of the proposed estimator. In addition, we demonstrate that the optimization of the proposed estimator can be transformed into a series of linear programming problems, and thus be efficiently solved in parallel. Empirical results are presented using simulated and real brain imaging data, which suggest that our approach improves precision matrix estimation, as compared to baselines, when confounding is present.' volume: 115 URL: https://proceedings.mlr.press/v115/geng20a.html PDF: http://proceedings.mlr.press/v115/geng20a/geng20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-geng20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Sinong family: Geng - given: Mladen family: Kolar - given: Oluwasanmi family: Koyejo editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 378-388 id: geng20a issued: date-parts: - 2020 - 8 - 6 firstpage: 378 lastpage: 388 published: 2020-08-06 00:00:00 +0000 - title: 'General Identifiability with Arbitrary Surrogate Experiments' abstract: 'We study the problem of causal identification from an arbitrary collection of observational and experimental distributions, and substantive knowledge about the phenomenon under investigation, which usually comes in the form of a causal graph. We call this problem \textit{g-identifiability}, or gID for short. The gID setting encompasses two well-known problems in causal inference, namely, identifiability [Pearl, 1995] and z-identifiability [Bareinboim and Pearl, 2012] – the former assumes that an observational distribution is necessarily available, and no experiments can be performed, conditions that are both relaxed in the gID setting; the latter assumes that \textit{all} combinations of experiments are available, i.e., the power set of the experimental set $\mathbf{Z}$, which gID does not require a priori. In this paper, we introduce a general strategy to prove non-gID based on \textit{hedgelets} and \textit{thickets}, which leads to a necessary and sufficient graphical condition for the corresponding decision problem. We further develop a procedure for systematically computing the target effect, and prove that it is sound and complete for gID instances. In other words, failure of the algorithm in returning an expression implies that the target effect is not computable from the available distributions. Finally, as a corollary of these results, we show that do-calculus is complete for the task of g-identifiability.' volume: 115 URL: https://proceedings.mlr.press/v115/lee20b.html PDF: http://proceedings.mlr.press/v115/lee20b/lee20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-lee20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Sanghack family: Lee - given: Juan D. family: Correa - given: Elias family: Bareinboim editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 389-398 id: lee20b issued: date-parts: - 2020 - 8 - 6 firstpage: 389 lastpage: 398 published: 2020-08-06 00:00:00 +0000 - title: 'Differentiable Probabilistic Models of Scientific Imaging with the Fourier Slice Theorem' abstract: 'Scientific imaging techniques such as optical and electron microscopy and computed tomography (CT) scanning are used to study the 3D structure of an object through 2D observations. These observations are related to the original 3D object through orthogonal integral projections. For common 3D reconstruction algorithms, computational efficiency requires the modeling of the 3D structures to take place in Fourier space by applying the Fourier slice theorem. At present, it is unclear how to differentiate through the projection operator, and hence current learning algorithms can not rely on gradient based methods to optimize 3D structure models. In this paper we show how back-propagation through the projection operator in Fourier space can be achieved. We demonstrate the validity of the approach with experiments on 3D reconstruction of proteins. We further extend our approach to learning probabilistic models of 3D objects. This allows us to predict regions of low sampling rates or estimate noise. 
A higher sample efficiency can be reached by utilizing the learned uncertainties of the 3D structure as an unsupervised estimate of the model fit. Finally, we demonstrate how the reconstruction algorithm can be extended with an amortized inference scheme on unknown attributes such as object pose. Through empirical studies we show that joint inference of the 3D structure and the object pose becomes more difficult when the ground truth object contains more symmetries. Due to the presence of, for instance, (approximate) rotational symmetries, the pose estimation can easily get stuck in local optima, inhibiting a fine-grained high-quality estimate of the 3D structure. ' volume: 115 URL: https://proceedings.mlr.press/v115/ullrich20a.html PDF: http://proceedings.mlr.press/v115/ullrich20a/ullrich20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-ullrich20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Karen family: Ullrich - given: Rianne prefix: van den family: Berg - given: Marcus family: Brubaker - given: David family: Fleet - given: Max family: Welling editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 399-411 id: ullrich20a issued: date-parts: - 2020 - 8 - 6 firstpage: 399 lastpage: 411 published: 2020-08-06 00:00:00 +0000 - title: 'Approximate Inference in Structured Instances with Noisy Categorical Observations' abstract: 'We study the problem of recovering the latent ground truth labeling of a structured instance with categorical random variables in the presence of noisy observations. We present a new approximate algorithm for graphs with categorical variables that achieves low Hamming error in the presence of noisy vertex and edge observations. Our main result shows a logarithmic dependence of the Hamming error on the number of categories of the random variables. Our approach draws connections to correlation clustering with a fixed number of clusters. Our results generalize the works of Globerson et al. (2015) and Foster et al. (2018), who study the hardness of structured prediction under binary labels, to the case of categorical labels.' volume: 115 URL: https://proceedings.mlr.press/v115/heidari20a.html PDF: http://proceedings.mlr.press/v115/heidari20a/heidari20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-heidari20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Alireza family: Heidari - given: Ihab F. family: Ilyas - given: Theodoros family: Rekatsinas editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 412-421 id: heidari20a issued: date-parts: - 2020 - 8 - 6 firstpage: 412 lastpage: 421 published: 2020-08-06 00:00:00 +0000 - title: 'Randomized Value Functions via Multiplicative Normalizing Flows' abstract: 'Randomized value functions offer a promising approach towards the challenge of efficient exploration in complex environments with high-dimensional state and action spaces. Unlike traditional point estimate methods, randomized value functions maintain a posterior distribution over action-space values. This prevents the agent’s behavior policy from prematurely exploiting early estimates and falling into local optima.
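The Fourier slice theorem underlying the ullrich20a abstract above is easy to verify numerically in its simplest discrete form (2-D object, axis-aligned projection): the 1-D FFT of a projection equals a central slice of the 2-D FFT.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(size=(32, 32))      # toy 2-D "object"

projection = img.sum(axis=0)         # integrate along one axis: a 1-D projection
central_slice = np.fft.fft2(img)[0]  # zero-frequency row of the 2-D FFT

# Fourier slice theorem: the FFT of the projection equals the central slice.
assert np.allclose(np.fft.fft(projection), central_slice)
```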
In this work, we leverage recent advances in variational Bayesian neural networks and combine these with traditional Deep Q-Networks (DQN) and Deep Deterministic Policy Gradient (DDPG) to achieve randomized value functions for high-dimensional domains. In particular, we augment DQN and DDPG with multiplicative normalizing flows in order to track a rich approximate posterior distribution over the parameters of the value function. This allows the agent to perform approximate Thompson sampling in a computationally efficient manner via stochastic gradient methods. We demonstrate the benefits of our approach through an empirical comparison in high-dimensional environments. ' volume: 115 URL: https://proceedings.mlr.press/v115/touati20a.html PDF: http://proceedings.mlr.press/v115/touati20a/touati20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-touati20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Ahmed family: Touati - given: Harsh family: Satija - given: Joshua family: Romoff - given: Joelle family: Pineau - given: Pascal family: Vincent editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 422-432 id: touati20a issued: date-parts: - 2020 - 8 - 6 firstpage: 422 lastpage: 432 published: 2020-08-06 00:00:00 +0000 - title: 'A Fast Proximal Point Method for Computing Exact Wasserstein Distance' abstract: 'Wasserstein distance plays increasingly important roles in machine learning, stochastic programming and image processing. Major efforts have been under way to address its high computational complexity, some leading to approximate or regularized variations such as Sinkhorn distance. However, as we will demonstrate, regularized variations with a large regularization parameter will degrade performance in several important machine learning applications, while a small regularization parameter will fail due to numerical stability issues with existing algorithms. We address this challenge by developing an Inexact Proximal point method for exact Optimal Transport problem (IPOT) with the proximal operator approximately evaluated at each iteration using projections to the probability simplex. The algorithm (a) converges to exact Wasserstein distance with a theoretical guarantee and robust regularization parameter selection, (b) alleviates numerical stability issues, (c) has similar computational complexity to Sinkhorn, and (d) avoids the shrinking problem when applied to generative models. Furthermore, a new algorithm is proposed based on IPOT to obtain sharper Wasserstein barycenters.' volume: 115 URL: https://proceedings.mlr.press/v115/xie20b.html PDF: http://proceedings.mlr.press/v115/xie20b/xie20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-xie20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Yujia family: Xie - given: Xiangfeng family: Wang - given: Ruijia family: Wang - given: Hongyuan family: Zha editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 433-453 id: xie20b issued: date-parts: - 2020 - 8 - 6 firstpage: 433 lastpage: 453 published: 2020-08-06 00:00:00 +0000 - title: 'Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks' abstract: 'Latent dynamics discovery, the extraction of complex dynamics from high-dimensional, noisy neural data, is challenging. Many dimensionality reduction methods have been widely adopted to extract low-dimensional, smooth and time-evolving latent trajectories. However, simple state transition structures, linear embedding assumptions, or inflexible inference networks impede the accurate recovery of dynamic portraits. In this paper, we propose a novel latent dynamic model that is capable of capturing nonlinear, non-Markovian, long short-term time-dependent dynamics via recurrent neural networks and tackling complex nonlinear embedding via a non-parametric Gaussian process. Due to the complexity and intractability of the model and its inference, we also provide a powerful inference network with bi-directional long short-term memory networks that encode both past and future information into posterior distributions. In our experiments, we show that our model outperforms other state-of-the-art methods in reconstructing insightful latent dynamics from both simulated and experimental neural datasets with either Gaussian or Poisson observations, especially in the low-sample scenario. Our codes and additional materials are available at https://github.com/sheqi/GP-RNN_UAI2019.' volume: 115 URL: https://proceedings.mlr.press/v115/she20a.html PDF: http://proceedings.mlr.press/v115/she20a/she20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-she20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Qi family: She - given: Anqi family: Wu editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 454-464 id: she20a issued: date-parts: - 2020 - 8 - 6 firstpage: 454 lastpage: 464 published: 2020-08-06 00:00:00 +0000 - title: 'Fisher-Bures Adversary Graph Convolutional Networks' abstract: 'In a graph convolutional network, we assume that the graph $G$ is generated w.r.t. some observation noise. During learning, we make small random perturbations $\Delta{}G$ of the graph and try to improve generalization. Based on quantum information geometry, $\Delta{}G$ can be characterized by the eigendecomposition of the graph Laplacian matrix. We try to minimize the loss w.r.t. the perturbed $G+\Delta{G}$ while making $\Delta{G}$ effective in terms of the Fisher information of the neural network. Our proposed model can consistently improve graph convolutional networks on semi-supervised node classification tasks with reasonable computational overhead. We present three different geometries on the manifold of graphs: the intrinsic geometry measures the information theoretic dynamics of a graph; the extrinsic geometry characterizes how such dynamics can affect externally a graph neural network; the embedding geometry is for measuring node embeddings. These new analytical tools are useful in developing a good understanding of graph neural networks and fostering new techniques.'
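The proximal-point scheme in the xie20b (IPOT) abstract admits a compact sketch: each outer iteration re-centers an entropic kernel at the current transport plan and performs a few Sinkhorn-type scalings. The loop structure below follows the abstract's description; the step counts and inner-iteration budget are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def ipot(mu, nu, C, beta=1.0, outer=200, inner=1):
    """Proximal point iteration for OT: re-center an entropic kernel at the
    current plan, then take a few Sinkhorn-type scalings per outer step."""
    G = np.exp(-C / beta)
    T = np.ones_like(C) / C.size
    b = np.ones(C.shape[1])
    for _ in range(outer):
        Q = G * T                          # kernel re-centered at current plan
        for _ in range(inner):
            a = mu / (Q @ b)
            b = nu / (Q.T @ a)
        T = a[:, None] * Q * b[None, :]
    return T

x = np.linspace(0, 1, 50)
C = (x[:, None] - x[None, :]) ** 2         # squared-distance cost
mu = nu = np.ones(50) / 50
T = ipot(mu, nu, C)
print("transport cost:", np.sum(T * C))    # ~0 for identical marginals
```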
volume: 115 URL: https://proceedings.mlr.press/v115/sun20a.html PDF: http://proceedings.mlr.press/v115/sun20a/sun20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-sun20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Ke family: Sun - given: Piotr family: Koniusz - given: Zhen family: Wang editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 465-475 id: sun20a issued: date-parts: - 2020 - 8 - 6 firstpage: 465 lastpage: 475 published: 2020-08-06 00:00:00 +0000 - title: 'Epsilon-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning' abstract: 'Resolving the exploration-exploitation trade-off remains a fundamental problem in the design and implementation of reinforcement learning (RL) algorithms. In this paper, we focus on model-free RL using the epsilon-greedy exploration policy, which, despite its simplicity, remains one of the most frequently used forms of exploration. However, a key limitation of this policy is the specification of epsilon. In this paper, we provide a novel Bayesian perspective of epsilon as a measure of the uncertainty (and hence convergence) in the Q-value function. Based on this new perspective, we introduce a closed-form Bayesian model update built on Bayesian model combination (BMC), which allows us to adapt epsilon using experiences from the environment in constant time with monotone convergence guarantees. We demonstrate that our proposed algorithm, epsilon-BMC, efficiently balances exploration and exploitation on different problems, performing comparably to or outperforming the best tuned fixed annealing schedules and an alternative data-dependent epsilon adaptation scheme proposed in the literature.' volume: 115 URL: https://proceedings.mlr.press/v115/gimelfarb20a.html PDF: http://proceedings.mlr.press/v115/gimelfarb20a/gimelfarb20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-gimelfarb20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Michael family: Gimelfarb - given: Scott family: Sanner - given: Chi-Guhn family: Lee editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 476-485 id: gimelfarb20a issued: date-parts: - 2020 - 8 - 6 firstpage: 476 lastpage: 485 published: 2020-08-06 00:00:00 +0000 - title: 'Periodic Kernel Approximation by Index Set Fourier Series Features' abstract: 'Periodicity is often studied in time-series modelling with autoregressive methods but is less popular in the kernel literature, particularly for multi-dimensional problems such as in textures, crystallography, quantum mechanics, and robotics. Large datasets often make modelling periodicity untenable for otherwise powerful non-parametric methods like Gaussian Processes (GPs), which typically incur an $\mathcal{O}(N^3)$ computational cost, while approximate feature methods are impeded by their approximate accuracy. We introduce a method that efficiently decomposes multi-dimensional periodic kernels into a set of basis functions by exploiting multivariate Fourier series.
We show that our approximation, termed \emph{Index Set Fourier Series Features}, produces significantly less predictive generalisation error on regression problems with periodic data than alternative approximations such as those based on random and deterministic Fourier features. ' volume: 115 URL: https://proceedings.mlr.press/v115/tompkins20a.html PDF: http://proceedings.mlr.press/v115/tompkins20a/tompkins20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-tompkins20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Anthony family: Tompkins - given: Fabio family: Ramos editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 486-496 id: tompkins20a issued: date-parts: - 2020 - 8 - 6 firstpage: 486 lastpage: 496 published: 2020-08-06 00:00:00 +0000 - title: 'Efficient Neural Network Verification with Exactness Characterization' abstract: 'Remarkable progress has been made on verification of neural networks, i.e., showing that neural networks are provably consistent with specifications encoding properties like adversarial robustness. Recent methods developed for scalable neural network verification are based on computing an upper bound on the worst-case violation of the specification. Semidefinite programming (SDP) has been proposed as a means to obtain tight upper bounds. However, SDP solvers do not scale to large neural networks. We introduce a Lagrangian relaxation based on the SDP formulation and a novel algorithm to solve the relaxation that scales to networks that are two orders of magnitude larger than those handled by off-the-shelf SDP solvers. Although verification of neural networks is known to be NP-hard in general, we develop the first known sufficient conditions under which a polynomial time verification algorithm (based on the above relaxation) is guaranteed to perform exact verification (i.e., either verify a property or establish it is untrue). The algorithm can be implemented using primitives available readily in common deep learning frameworks. Experiments show that the algorithm is fast, and is able to compute tight upper bounds on the error rates under adversarial attacks of convolutional networks trained on MNIST and CIFAR-10.' volume: 115 URL: https://proceedings.mlr.press/v115/dvijotham20a.html PDF: http://proceedings.mlr.press/v115/dvijotham20a/dvijotham20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-dvijotham20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Krishnamurthy (Dj) family: Dvijotham - given: Robert family: Stanforth - given: Sven family: Gowal - given: Chongli family: Qin - given: Soham family: De - given: Pushmeet family: Kohli editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 497-507 id: dvijotham20a issued: date-parts: - 2020 - 8 - 6 firstpage: 497 lastpage: 507 published: 2020-08-06 00:00:00 +0000 - title: 'Augmenting and Tuning Knowledge Graph Embeddings' abstract: 'Knowledge graph embeddings rank among the most successful methods for link prediction in knowledge graphs, i.e., the task of completing an incomplete collection of relational facts.
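The tompkins20a idea of deterministically decomposing a periodic kernel into Fourier-series features can be illustrated in one dimension; the paper's index-set construction handles multi-dimensional kernels, and the coefficients a_m = 1/(1+m^2) below are an arbitrary choice for the demo.

```python
import numpy as np

def fourier_series_features(x, M=10, omega=2 * np.pi):
    """Features whose inner product reproduces the truncated periodic kernel
    k(x, x') = sum_{m=0}^{M} a_m cos(m * omega * (x - x'))."""
    a = lambda m: 1.0 if m == 0 else 1.0 / (1.0 + m ** 2)
    feats = [np.sqrt(a(0)) * np.ones_like(x)]
    for m in range(1, M + 1):
        feats += [np.sqrt(a(m)) * np.cos(m * omega * x),
                  np.sqrt(a(m)) * np.sin(m * omega * x)]
    return np.stack(feats, axis=-1)

x = np.linspace(0, 1, 5)
Phi = fourier_series_features(x)
K_feat = Phi @ Phi.T
K_true = sum((1.0 if m == 0 else 1.0 / (1.0 + m ** 2))
             * np.cos(m * 2 * np.pi * (x[:, None] - x[None, :]))
             for m in range(11))
assert np.allclose(K_feat, K_true)   # exact for the truncated series
```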
A downside of these models is their strong sensitivity to model hyperparameters, in particular regularizers, which have to be extensively tuned to reach good performance [Kadlec et al., 2017]. We propose an efficient method for large scale hyperparameter tuning by interpreting these models in a probabilistic framework. After a model augmentation that introduces per-entity hyperparameters, we use a variational expectation-maximization approach to tune thousands of such hyperparameters with minimal additional cost. Our approach is agnostic to details of the model and results in a new state of the art in link prediction on standard benchmark data.' volume: 115 URL: https://proceedings.mlr.press/v115/bamler20a.html PDF: http://proceedings.mlr.press/v115/bamler20a/bamler20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-bamler20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Robert family: Bamler - given: Farnood family: Salehi - given: Stephan family: Mandt editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 508-518 id: bamler20a issued: date-parts: - 2020 - 8 - 6 firstpage: 508 lastpage: 518 published: 2020-08-06 00:00:00 +0000 - title: 'A Tighter Analysis of Randomised Policy Iteration' abstract: 'Policy Iteration (PI) is a popular family of algorithms to compute an optimal policy for a given Markov Decision Problem (MDP). Starting from an arbitrary initial policy, PI repeatedly performs locally-improving switches until an optimal policy is found. The exact form of the switching rule gives rise to different variants of PI. Two decades ago, Mansour and Singh [1999] provided the first non-trivial “strong” upper bound on the number of iterations taken by “Howard’s PI” (HPI), a widely-used variant of PI (strong bounds depend only on the number of states and actions in the MDP). They also proposed a randomised variant (RPI) and showed an even tighter strong upper bound. Their bounds for HPI and RPI have not been improved subsequently. We revisit the algorithms and analysis of Mansour and Singh [1999]. We prove a novel result on the structure of the policy space for $k$-action MDPs, $k\geq 2$, which generalises a result known for $k = 2$. Also proposing a new counting argument, we obtain a strong bound of $(O(\sqrt{k \log k}))^n$ iterations for an algorithm akin to RPI, improving significantly upon Mansour and Singh’s original bound of roughly $O((k/2)^n)$. Similar analysis of a randomised variant of HPI also yields a strong upper bound of $(O(\sqrt{k \log k}))^n$ iterations, registering the first exponential improvement for HPI over the trivial bound of $k^n$. Our other contributions include a lower bound of $\Omega(n)$ iterations for RPI and an upper bound of $1.6001^n$ iterations for a randomised variant of “Batch-Switching PI” [Kalyanakrishnan et al., 2016a] on 2-action MDPs—the tightest strong upper bound shown yet for the PI family.'
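For reference alongside the taraviya20a analysis, here is the textbook Howard's PI baseline on a finite MDP, which switches every improvable state to its greedy action; the randomised variants studied in the paper modify this improvement step. A minimal sketch, not the paper's algorithms:

```python
import numpy as np

def howard_pi(P, R, gamma=0.9):
    """Howard's PI: exact policy evaluation, then switch every improvable
    state to its greedy action, until no switch remains.
    P: (A, S, S) transitions, R: (S, A) rewards."""
    A, S, _ = P.shape
    pi = np.zeros(S, dtype=int)
    while True:
        P_pi = P[pi, np.arange(S), :]                  # (S, S) under pi
        r_pi = R[np.arange(S), pi]                     # (S,)
        v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
        pi_new = (R.T + gamma * P @ v).argmax(axis=0)  # greedy improvement
        if np.array_equal(pi_new, pi):
            return pi, v
        pi = pi_new

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(2, 3))   # 2 actions, 3 states
R = rng.normal(size=(3, 2))
print(howard_pi(P, R))
```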
volume: 115 URL: https://proceedings.mlr.press/v115/taraviya20a.html PDF: http://proceedings.mlr.press/v115/taraviya20a/taraviya20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-taraviya20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Meet family: Taraviya - given: Shivaram family: Kalyanakrishnan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 519-529 id: taraviya20a issued: date-parts: - 2020 - 8 - 6 firstpage: 519 lastpage: 529 published: 2020-08-06 00:00:00 +0000 - title: 'Perturbed-History Exploration in Stochastic Linear Bandits' abstract: 'We propose a new online algorithm for cumulative regret minimization in a stochastic linear bandit. The algorithm pulls the arm with the highest estimated reward in a linear model trained on its perturbed history. Therefore, we call it perturbed-history exploration in a linear bandit (LinPHE). The perturbed history is a mixture of observed rewards and randomly generated i.i.d. pseudo-rewards. We derive a $\tilde{O}(d \sqrt{n})$ gap-free bound on the $n$-round regret of LinPHE, where $d$ is the number of features. The key steps in our analysis are new concentration and anti-concentration bounds on the weighted sum of Bernoulli random variables. To show the generality of our design, we generalize LinPHE to a logistic model. We evaluate our algorithms empirically and show that they are practical.' volume: 115 URL: https://proceedings.mlr.press/v115/kveton20a.html PDF: http://proceedings.mlr.press/v115/kveton20a/kveton20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-kveton20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Branislav family: Kveton - given: Csaba family: Szepesvári - given: Mohammad family: Ghavamzadeh - given: Craig family: Boutilier editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 530-540 id: kveton20a issued: date-parts: - 2020 - 8 - 6 firstpage: 530 lastpage: 540 published: 2020-08-06 00:00:00 +0000 - title: 'An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient' abstract: 'We revisit the stochastic variance-reduced policy gradient (SVRPG) method proposed by \citet{papini2018stochastic} for reinforcement learning. We provide an improved convergence analysis of SVRPG and show that it can find an $\epsilon$-approximate stationary point of the performance function within $O(1/\epsilon^{5/3})$ trajectories. This sample complexity improves upon the best known result $O(1/\epsilon^2)$ by a factor of $O(1/\epsilon^{1/3})$. At the core of our analysis is (i) a tighter upper bound for the variance of importance sampling weights, where we prove that the variance can be controlled by the parameter distance between different policies; and (ii) a fine-grained analysis of the epoch length and batch size parameters such that we can significantly reduce the number of trajectories required in each iteration of SVRPG. We also empirically validate our theoretical claims about batch sizes on reinforcement learning benchmark tasks.
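The perturbed-history principle in the kveton20a abstract reduces to "fit least squares on the history plus random pseudo-rewards, then act greedily". The sketch below is a loose illustration under simplifying assumptions of my own (round-robin warm start, Bernoulli(1/2) pseudo-rewards resampled every round); the paper specifies the exact perturbation schedule.

```python
import numpy as np

def linphe_sketch(arms, reward_fn, n_rounds=500, a=1.0, seed=0):
    """Each round: refit least squares on the observed history augmented with
    random Bernoulli pseudo-rewards, then pull the greedy arm."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for t in range(n_rounds):
        if t < arms.shape[1]:                    # round-robin warm start
            idx = t % len(arms)
        else:
            Xh, yh = np.array(X), np.array(y)
            n_pseudo = int(a * len(yh))          # `a` pseudo-rewards per sample
            Xp = Xh[rng.integers(0, len(yh), size=n_pseudo)]
            yp = rng.integers(0, 2, size=n_pseudo).astype(float)
            theta = np.linalg.lstsq(np.vstack([Xh, Xp]),
                                    np.concatenate([yh, yp]), rcond=None)[0]
            idx = int(np.argmax(arms @ theta))
        X.append(arms[idx]); y.append(reward_fn(arms[idx]))
    return np.array(y)

rng = np.random.default_rng(1)
arms = rng.normal(size=(10, 4))
theta_star = rng.normal(size=4)
print(linphe_sketch(arms, lambda x: x @ theta_star + 0.1 * rng.normal()).mean())
```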
' volume: 115 URL: https://proceedings.mlr.press/v115/xu20a.html PDF: http://proceedings.mlr.press/v115/xu20a/xu20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-xu20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Pan family: Xu - given: Felicia family: Gao - given: Quanquan family: Gu editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 541-551 id: xu20a issued: date-parts: - 2020 - 8 - 6 firstpage: 541 lastpage: 551 published: 2020-08-06 00:00:00 +0000 - title: 'Deep Mixture of Experts via Shallow Embedding' abstract: 'Larger networks generally have greater representational power at the cost of increased computational complexity. Sparsifying such networks has been an active area of research but has been generally limited to static regularization or dynamic approaches using reinforcement learning. We explore a mixture of experts (MoE) approach to deep dynamic routing, which activates certain experts in the network on a per-example basis. Our novel DeepMoE architecture increases the representational power of standard convolutional networks by adaptively sparsifying and recalibrating channel-wise features in each convolutional layer. We employ a multi-headed sparse gating network to determine the selection and scaling of channels for each input, leveraging exponential combinations of experts within a single convolutional network. Our proposed architecture is evaluated on four benchmark datasets and tasks, and we show that Deep-MoEs are able to achieve higher accuracy with lower computation than standard convolutional networks.' volume: 115 URL: https://proceedings.mlr.press/v115/wang20d.html PDF: http://proceedings.mlr.press/v115/wang20d/wang20d.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-wang20d.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Xin family: Wang - given: Fisher family: Yu - given: Lisa family: Dunlap - given: Yi-An family: Ma - given: Ruth family: Wang - given: Azalia family: Mirhoseini - given: Trevor family: Darrell - given: Joseph E. family: Gonzalez editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 552-562 id: wang20d issued: date-parts: - 2020 - 8 - 6 firstpage: 552 lastpage: 562 published: 2020-08-06 00:00:00 +0000 - title: 'Sampling-Free Variational Inference of Bayesian Neural Networks by Variance Backpropagation' abstract: ' We propose a new Bayesian Neural Net formulation that affords variational inference for which the evidence lower bound is analytically tractable subject to a tight approximation. We achieve this tractability by (i) decomposing ReLU nonlinearities into the product of an identity and a Heaviside step function, (ii) introducing a separate path that decomposes the neural net expectation from its variance. We demonstrate formally that introducing separate latent binary variables to the activations allows representing the neural network likelihood as a chain of linear operations. Performing variational inference on this construction enables a sampling-free computation of the evidence lower bound which is a more effective approximation than the widely applied Monte Carlo sampling and CLT related techniques. 
We evaluate the model on a range of regression and classification tasks against BNN inference alternatives, showing performance competitive with or better than the current state of the art. ' volume: 115 URL: https://proceedings.mlr.press/v115/haussmann20a.html PDF: http://proceedings.mlr.press/v115/haussmann20a/haussmann20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-haussmann20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Manuel family: Haußmann - given: Fred A. family: Hamprecht - given: Melih family: Kandemir editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 563-573 id: haussmann20a issued: date-parts: - 2020 - 8 - 6 firstpage: 563 lastpage: 573 published: 2020-08-06 00:00:00 +0000 - title: 'Sliced Score Matching: A Scalable Approach to Density and Score Estimation' abstract: 'Score matching is a popular method for estimating unnormalized statistical models. However, it has so far been limited to simple, shallow models or low-dimensional data, due to the difficulty of computing the Hessian of log-density functions. We show this difficulty can be mitigated by projecting the scores onto random vectors before comparing them. This objective, called sliced score matching, only involves Hessian-vector products, which can be easily implemented using reverse-mode automatic differentiation. Therefore, sliced score matching is amenable to more complex models and higher dimensional data compared to score matching. Theoretically, we prove the consistency and asymptotic normality of sliced score matching estimators. Moreover, we demonstrate that sliced score matching can be used to learn deep score estimators for implicit distributions. In our experiments, we show sliced score matching can learn deep energy-based models effectively, and can produce accurate score estimates for applications such as variational inference with implicit distributions and training Wasserstein Auto-Encoders.' volume: 115 URL: https://proceedings.mlr.press/v115/song20a.html PDF: http://proceedings.mlr.press/v115/song20a/song20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-song20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Yang family: Song - given: Sahaj family: Garg - given: Jiaxin family: Shi - given: Stefano family: Ermon editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 574-584 id: song20a issued: date-parts: - 2020 - 8 - 6 firstpage: 574 lastpage: 584 published: 2020-08-06 00:00:00 +0000 - title: 'Beyond Structural Causal Models: Causal Constraints Models' abstract: 'Structural Causal Models (SCMs) provide a popular causal modeling framework. In this work, we show that SCMs are not flexible enough to give a complete causal representation of dynamical systems at equilibrium. Instead, we propose a generalization of the notion of an SCM, that we call Causal Constraints Model (CCM), and prove that CCMs do capture the causal semantics of such systems. We show how CCMs can be constructed from differential equations and initial conditions and we illustrate our ideas further on a simple but ubiquitous (bio)chemical reaction. Our framework also allows one to model functional laws, such as the ideal gas law, in a sensible and intuitive way.'
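The sliced score matching objective of song20a can be estimated with nothing more than score evaluations. In the sketch below the Hessian-vector term is approximated by a central finite difference standing in for the reverse-mode autodiff the paper uses, and projection vectors are drawn uniformly on the sphere; both are simplifying assumptions of this demo.

```python
import numpy as np

def ssm_loss(score, xs, eps=1e-4, seed=0):
    """Monte Carlo estimate of E_x E_v [v.d/dx(v.s(x)) + 0.5 (v.s(x))^2],
    with the Hessian-vector term taken by a central finite difference."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for x in xs:
        v = rng.normal(size=x.shape)
        v /= np.linalg.norm(v)               # random slicing direction
        hvp = (v @ score(x + eps * v) - v @ score(x - eps * v)) / (2 * eps)
        total += hvp + 0.5 * (v @ score(x)) ** 2
    return total / len(xs)

# sanity check with the true score of N(0, I), s(x) = -x: the first term is
# -1 and the second averages to 0.5, so the loss should be close to -0.5
xs = np.random.default_rng(0).normal(size=(2000, 3))
print(ssm_loss(lambda x: -x, xs))
```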
volume: 115 URL: https://proceedings.mlr.press/v115/blom20a.html PDF: http://proceedings.mlr.press/v115/blom20a/blom20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-blom20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Tineke family: Blom - given: Stephan family: Bongers - given: Joris M. family: Mooij editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 585-594 id: blom20a issued: date-parts: - 2020 - 8 - 6 firstpage: 585 lastpage: 594 published: 2020-08-06 00:00:00 +0000 - title: 'Be Greedy: How Chromatic Number meets Regret Minimization in Graph Bandits' abstract: 'We study the classical linear bandit problem on \emph{graphs} modelling arm rewards through an underlying graph structure $G(N, E)$ such that rewards of neighboring nodes are similar. Previous attempts along this line have primarily considered the arm rewards to be a smooth function over the graph Laplacian, which however failed to characterize the inherent problem complexity in terms of the graph structure. We bridge this gap by showing a regret guarantee of $\tilde{O}(\chi(\overline{G})\sqrt{T})$ ($\tilde{O}(\cdot)$ notation hides dependencies on $\log T$), that scales only with the chromatic number of the complement graph $\chi(\overline{G})$, assuming the rewards to be a smooth function over a general class of graph embeddings—\emph{Orthonormal Representations}. Our proposed algorithms yield a regret guarantee of $\tilde{O}(r\sqrt{T})$ for any general embedding of rank $r$. Furthermore, if the rewards correspond to a minimum rank embedding, the regret boils down to $O(\chi(\overline{G})\sqrt{T})$—none of the existing works were able to bring out such influences of graph structures on arm rewards. Finally, noting that computing the above minimum rank embedding is NP-Hard, we also propose an alternative $O(N + |E|)$ time computable embedding scheme—{\it Greedy Embeddings}—based on greedy graph coloring, on which our algorithms perform optimally on a large family of graphs, e.g., unions of cliques, complements of $k$-colorable graphs, regular graphs, etc., and are also shown to outperform the state-of-the-art methods on real datasets. Our findings open up new roads for exploiting graph structures on regret performance.' volume: 115 URL: https://proceedings.mlr.press/v115/s20a.html PDF: http://proceedings.mlr.press/v115/s20a/s20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-s20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Shreyas family: S - given: Aadirupa family: Saha - given: Chiranjib family: Bhattacharyya editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 595-605 id: s20a issued: date-parts: - 2020 - 8 - 6 firstpage: 595 lastpage: 605 published: 2020-08-06 00:00:00 +0000 - title: 'Approximate Causal Abstractions' abstract: 'Scientific models describe natural phenomena at different levels of abstraction. Abstract descriptions can provide the basis for interventions on the system and explanation of observed phenomena at a level of granularity that is coarser than the most fundamental account of the system. Beckers and Halpern (2019), building on prior work of Rubinstein et al. (2017), developed an account of abstraction for causal models that is exact.
Here we extend this account to the more realistic case where an abstract causal model only offers an approximation of the underlying system. We show how the resulting account handles the discrepancy that can arise between low- and high-level causal models of the same system, and in the process provide an account of how one causal model approximates another, a topic of independent interest. Finally, we extend the account of approximate abstractions to probabilistic causal models, indicating how and where uncertainty can enter into an approximate abstraction.' volume: 115 URL: https://proceedings.mlr.press/v115/beckers20a.html PDF: http://proceedings.mlr.press/v115/beckers20a/beckers20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-beckers20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Sander family: Beckers - given: Frederick family: Eberhardt - given: Joseph Y. family: Halpern editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 606-615 id: beckers20a issued: date-parts: - 2020 - 8 - 6 firstpage: 606 lastpage: 615 published: 2020-08-06 00:00:00 +0000 - title: 'The Sensitivity of Counterfactual Fairness to Unmeasured Confounding' abstract: 'Causal approaches to fairness have seen substantial recent interest, both from the machine learning community and from wider parties interested in ethical prediction algorithms. In no small part, this has been due to the fact that causal models allow one to simultaneously leverage data and expert knowledge to remove discriminatory effects from predictions. However, one of the primary assumptions in causal modeling is that you know the causal graph. This introduces a new opportunity for bias, caused by misspecifying the causal model. One common way for misspecification to occur is via unmeasured confounding: the true causal effect between variables is partially described by unobserved quantities. In this work we design tools to assess the sensitivity of fairness measures to this confounding for the popular class of non-linear additive noise models (ANMs). Specifically, we give a procedure for computing the maximum difference between two counterfactually fair predictors, where one has become biased due to confounding. For the case of bivariate confounding our technique can be swiftly computed via a sequence of closed-form updates. For multivariate confounding we give an algorithm that can be efficiently solved via automatic differentiation. We demonstrate our new sensitivity analysis tools in real-world fairness scenarios to assess the bias arising from confounding.' volume: 115 URL: https://proceedings.mlr.press/v115/kilbertus20a.html PDF: http://proceedings.mlr.press/v115/kilbertus20a/kilbertus20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-kilbertus20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Niki family: Kilbertus - given: Philip J. family: Ball - given: Matt J. family: Kusner - given: Adrian family: Weller - given: Ricardo family: Silva editor: - given: Ryan P. 
family: Adams - given: Vibhav family: Gogate page: 616-626 id: kilbertus20a issued: date-parts: - 2020 - 8 - 6 firstpage: 616 lastpage: 626 published: 2020-08-06 00:00:00 +0000 - title: 'Belief Propagation: Accurate Marginals or Accurate Partition Function – Where is the Difference?' abstract: 'We analyze belief propagation on patch potential models – these are attractive models with varying local potentials – obtain all of the possibly many fixed points, and gather novel insights into belief propagation’s properties. In particular, we observe and theoretically explain several regions in the parameter space that behave fundamentally differently. We specify and elaborate on one specific region that, despite the existence of multiple fixed points, is relatively well behaved and provides insights into the relationship between the accuracy of the marginals and the partition function. We demonstrate that no principled relationship exists between the two quantities and provide sufficient conditions for a fixed point to be optimal with respect to approximating both the marginals and the partition function.' volume: 115 URL: https://proceedings.mlr.press/v115/knoll20a.html PDF: http://proceedings.mlr.press/v115/knoll20a/knoll20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-knoll20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Christian family: Knoll - given: Franz family: Pernkopf editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 627-636 id: knoll20a issued: date-parts: - 2020 - 8 - 6 firstpage: 627 lastpage: 636 published: 2020-08-06 00:00:00 +0000 - title: 'Finding Minimal d-separators in Linear Time and Applications' abstract: 'The study of graphical causal models is fundamentally the study of separations and conditional independences. We provide linear time algorithms for two graphical primitives: to test if a given set is a minimal d-separator, and to find a minimal d-separator in directed acyclic graphs (DAGs), completed partially directed acyclic graphs (CPDAGs) and restricted chain graphs (RCGs) as well as minimal m-separators in ancestral graphs (AGs). These algorithms improve the runtime of the best previously known algorithms for minimal separators that are based on moralization and thus require quadratic time to construct and handle the moral graph. (Minimal) separating sets have important applications like finding (minimal) covariate adjustment sets or conditional instrumental variables.' volume: 115 URL: https://proceedings.mlr.press/v115/van-der-zander20a.html PDF: http://proceedings.mlr.press/v115/van-der-zander20a/van-der-zander20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-van-der-zander20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Benito prefix: van der family: Zander - given: Maciej family: Liśkiewicz editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 637-647 id: van-der-zander20a issued: date-parts: - 2020 - 8 - 6 firstpage: 637 lastpage: 647 published: 2020-08-06 00:00:00 +0000 - title: 'A Bayesian Approach to Robust Reinforcement Learning' abstract: 'Robust Markov Decision Processes (RMDPs) aim to ensure robustness with respect to changing or adversarial system behavior.
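For context on the van-der-zander20a result, the classic moralization route to testing d-separation, the quadratic-time baseline their linear-time algorithms improve upon, fits in a short function:

```python
from collections import deque

def d_separated(parents, X, Y, Z):
    """Classic test: restrict to ancestors of X|Y|Z, moralize, delete Z,
    then X and Y are d-separated iff they are disconnected.
    `parents` maps a node to its set of parents; X, Y, Z are disjoint sets."""
    anc, stack = set(), list(X | Y | Z)
    while stack:                                  # ancestral closure
        n = stack.pop()
        if n not in anc:
            anc.add(n)
            stack.extend(parents.get(n, ()))
    adj = {n: set() for n in anc}
    for n in anc:                                 # moralization
        ps = [p for p in parents.get(n, ()) if p in anc]
        for p in ps:
            adj[n].add(p); adj[p].add(n)
        for i in range(len(ps)):                  # marry co-parents
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j]); adj[ps[j]].add(ps[i])
    seen, q = set(X), deque(X)                    # search avoiding Z
    while q:
        n = q.popleft()
        if n in Y:
            return False
        for nb in adj[n] - Z:
            if nb not in seen:
                seen.add(nb); q.append(nb)
    return True

parents = {"C": {"X", "Y"}}                       # collider X -> C <- Y
assert d_separated(parents, {"X"}, {"Y"}, set())
assert not d_separated(parents, {"X"}, {"Y"}, {"C"})
```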
In this framework, transitions are modeled as arbitrary elements of a known and properly structured uncertainty set and a robust optimal policy can be derived under the worst-case scenario. In this study, we address the issue of learning in RMDPs using a Bayesian approach. We introduce the Uncertainty Robust Bellman Equation (URBE) which encourages safe exploration for adapting the uncertainty set to new observations while preserving robustness. We propose a URBE-based algorithm, DQN-URBE, that scales this method to higher dimensional domains. Our experiments show that the derived URBE-based strategy leads to a better trade-off between less conservative solutions and robustness in the presence of model misspecification. In addition, we show that the DQN-URBE algorithm can adapt significantly faster to changing dynamics online compared to existing robust techniques with fixed uncertainty sets.' volume: 115 URL: https://proceedings.mlr.press/v115/derman20a.html PDF: http://proceedings.mlr.press/v115/derman20a/derman20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-derman20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Esther family: Derman - given: Daniel family: Mankowitz - given: Timothy family: Mann - given: Shie family: Mannor editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 648-658 id: derman20a issued: date-parts: - 2020 - 8 - 6 firstpage: 648 lastpage: 658 published: 2020-08-06 00:00:00 +0000 - title: 'Adaptivity and Optimality: A Universal Algorithm for Online Convex Optimization' abstract: 'In this paper, we study adaptive online convex optimization, and aim to design a universal algorithm that achieves optimal regret bounds for multiple common types of loss functions. Existing universal methods are limited in the sense that they are optimal for only a subclass of loss functions. To address this limitation, we propose a novel online algorithm, namely Maler, which enjoys the optimal $O(\sqrt{T})$, $O(d\log T)$ and $O(\log T)$ regret bounds for general convex, exponentially concave, and strongly convex functions respectively. The essential idea is to run multiple types of learning algorithms with different learning rates in parallel, and utilize a meta-algorithm to track the best on the fly. Empirical results demonstrate the effectiveness of our method.' volume: 115 URL: https://proceedings.mlr.press/v115/wang20e.html PDF: http://proceedings.mlr.press/v115/wang20e/wang20e.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-wang20e.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Guanghui family: Wang - given: Shiyin family: Lu - given: Lijun family: Zhang editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 659-668 id: wang20e issued: date-parts: - 2020 - 8 - 6 firstpage: 659 lastpage: 668 published: 2020-08-06 00:00:00 +0000 - title: 'Evacuate or Not? A POMDP Model of the Decision Making of Individuals in Hurricane Evacuation Zones' abstract: 'Recent hurricanes in the Atlantic region of the southern United States triggered a series of evacuation orders in the coastal cities of Florida, Georgia, and Texas. While some of these urged voluntary evacuations, most were mandatory orders. 
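The "experts plus meta-algorithm" recipe in the wang20e (Maler) abstract can be caricatured as follows: several online-gradient-descent experts with different step sizes run in parallel, and a Hedge-style meta-learner tracks the best one. This is a schematic sketch only; Maler's surrogate losses and its optimal rates differ from this toy.

```python
import numpy as np

def universal_oco(grad, loss, T=500, d=2, etas=(0.01, 0.1, 1.0), meta_lr=0.1):
    """Experts = online gradient descent with different step sizes;
    meta = exponential weights over the experts' own losses."""
    K = len(etas)
    xs = np.zeros((K, d))                 # one iterate per expert
    w = np.ones(K) / K                    # meta weights
    total = 0.0
    for t in range(1, T + 1):
        total += loss(w @ xs)             # play the weighted combination
        for k, eta in enumerate(etas):    # expert updates
            xs[k] -= eta / np.sqrt(t) * grad(xs[k])
        w *= np.exp(-meta_lr * np.array([loss(x) for x in xs]))
        w /= w.sum()
    return total

g = lambda x: 2 * (x - 1.0)               # fixed quadratic loss each round
f = lambda x: float(np.sum((x - 1.0) ** 2))
print(universal_oco(g, f))                # cumulative loss stays small
```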
Despite governments asking people to vacate their homes for their own safety, many do not. We aim to understand the observable and hidden variables involved in the decision-making process and model these in a partially observable Markov decision process, which predicts whether a person will evacuate or not given his or her current situation. We consider the features of the particular hurricane, the dynamic situation that the individual is experiencing, and demographic factors that influence the decision making of individuals. The process model is represented as a dynamic influence diagram and evaluated on data collected via a comprehensive survey of hurricane-impacted individuals.' volume: 115 URL: https://proceedings.mlr.press/v115/sankar20a.html PDF: http://proceedings.mlr.press/v115/sankar20a/sankar20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-sankar20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Adithya Raam family: Sankar - given: Prashant family: Doshi - given: Adam family: Goodie editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 669-678 id: sankar20a issued: date-parts: - 2020 - 8 - 6 firstpage: 669 lastpage: 678 published: 2020-08-06 00:00:00 +0000 - title: 'Probabilistic Programming for Birth-Death Models of Evolution Using an Alive Particle Filter with Delayed Sampling' abstract: 'We consider probabilistic programming for birth-death models of evolution and introduce a new widely-applicable inference method that combines an extension of the alive particle filter (APF) with automatic Rao-Blackwellization via delayed sampling. Birth-death models of evolution are an important family of phylogenetic models of the diversification processes that lead to evolutionary trees. Probabilistic programming languages (PPLs) give phylogeneticists a new and exciting tool: their models can be implemented as probabilistic programs with just a basic knowledge of programming. The general inference methods in PPLs reduce the need for external experts, allow quick prototyping and testing, and accelerate the development and deployment of new models. We show how these birth-death models can be implemented as simple programs in existing PPLs, and demonstrate the usefulness of the proposed inference method for such models. For the popular BiSSE model the method yields an increase of the effective sample size and the conditional acceptance rate by a factor of 30 in comparison with a standard bootstrap particle filter. Although concentrating on phylogenetics, the extended APF is a general inference method that shows its strength in situations where particles are often assigned zero weight. In the case when the weights are always positive, the extra cost of using the APF rather than the bootstrap particle filter is negligible, making our method a suitable drop-in replacement for the bootstrap particle filter in probabilistic programming inference.' volume: 115 URL: https://proceedings.mlr.press/v115/kudlicka20a.html PDF: http://proceedings.mlr.press/v115/kudlicka20a/kudlicka20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-kudlicka20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Jan family: Kudlicka - given: Lawrence M. 
family: Murray - given: Fredrik family: Ronquist - given: Thomas B. family: Schön editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 679-689 id: kudlicka20a issued: date-parts: - 2020 - 8 - 6 firstpage: 679 lastpage: 689 published: 2020-08-06 00:00:00 +0000 - title: 'Variational Sparse Coding' abstract: 'Unsupervised discovery of interpretable features and controllable generation with high-dimensional data are currently major challenges in machine learning, with applications in data visualisation, clustering and artificial data synthesis. We propose a model based on variational auto-encoders (VAEs) in which interpretation is induced through latent space sparsity with a mixture of Spike and Slab distributions as prior. We derive an evidence lower bound for this model and propose a specific training method for recovering disentangled features as sparse elements in latent vectors. In our experiments, we demonstrate superior disentanglement performance to standard VAE approaches when an estimate of the number of true sources of variation is not available and objects display different combinations of attributes. Furthermore, the new model provides unique capabilities, such as recovering feature exploitation, synthesising samples that share attributes with a given input object and controlling both discrete and continuous features upon generation.' volume: 115 URL: https://proceedings.mlr.press/v115/tonolini20a.html PDF: http://proceedings.mlr.press/v115/tonolini20a/tonolini20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-tonolini20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Francesco family: Tonolini - given: Bjørn Sand family: Jensen - given: Roderick family: Murray-Smith editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 690-700 id: tonolini20a issued: date-parts: - 2020 - 8 - 6 firstpage: 690 lastpage: 700 published: 2020-08-06 00:00:00 +0000 - title: 'Learning with Non-Convex Truncated Losses by SGD' abstract: 'Learning with a convex loss function has been a dominating paradigm for many years. It remains an interesting question how non-convex loss functions can help improve the generalization of learning with broad applicability. In this paper, we study a family of objective functions formed by truncating traditional loss functions, which is applicable to both shallow learning and deep learning. Truncated loss functions have the potential to be less vulnerable and more robust to large noise in observations that could be adversarial. More importantly, truncation is a generic technique that does not assume knowledge of the noise distribution. To justify non-convex learning with truncated losses, we establish excess risk bounds of empirical risk minimization based on truncated losses for heavy-tailed output, and the statistical error of an approximate stationary point found by the stochastic gradient descent (SGD) method. Our experiments on shallow and deep learning for regression with outliers, corrupted data and heavy-tailed noise further justify the proposed method.'
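As a rough illustration of the truncation idea described in the abstract above, here is a minimal sketch of SGD on a hard-truncated squared loss, min(0.5(w·x − y)², τ); the paper's actual truncation family, step sizes, and analysis may differ, and all names here are illustrative.

```python
import numpy as np

def truncated_sq_loss_grad(w, x, y, tau):
    """Gradient of min(0.5 * (x @ w - y)**2, tau) with respect to w.
    Points whose loss exceeds the truncation level tau sit on the flat
    part of the objective and contribute no gradient, so large
    (possibly adversarial) outliers are effectively ignored."""
    r = x @ w - y
    if 0.5 * r ** 2 >= tau:
        return np.zeros_like(w)
    return r * x

def sgd_truncated(X, Y, tau=1.0, lr=0.01, epochs=20, seed=0):
    """Plain SGD over the truncated objective."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(Y)):
            w -= lr * truncated_sq_loss_grad(w, X[i], Y[i], tau)
    return w
```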
volume: 115 URL: https://proceedings.mlr.press/v115/xu20b.html PDF: http://proceedings.mlr.press/v115/xu20b/xu20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-xu20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Yi family: Xu - given: Shenghuo family: Zhu - given: Sen family: Yang - given: Chi family: Zhang - given: Rong family: Jin - given: Tianbao family: Yang editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 701-711 id: xu20b issued: date-parts: - 2020 - 8 - 6 firstpage: 701 lastpage: 711 published: 2020-08-06 00:00:00 +0000 - title: 'Active Multi-Information Source Bayesian Quadrature' abstract: 'Bayesian quadrature (BQ) is a sample-efficient probabilistic numerical method to solve integrals of expensive-to-evaluate black-box functions, yet so far, active BQ learning schemes focus merely on the integrand itself as the information source, and do not allow for information transfer from cheaper, related functions. Here, we set the scene for active learning in BQ when multiple related information sources of variable cost (in input and source) are accessible. This setting arises for example when evaluating the integrand requires a complex simulation to be run that can be approximated by simulating at lower levels of sophistication and at lesser expense. We construct meaningful cost-sensitive multi-source acquisition rates as an extension to common utility functions from vanilla BQ (VBQ), and discuss pitfalls that arise from blindly generalizing. In proof-of-concept experiments, we scrutinize the behavior of our generalized acquisition functions. On an epidemiological model, we demonstrate that active multi-source BQ (AMS-BQ) is more cost-efficient than VBQ in learning the integral to good accuracy.' volume: 115 URL: https://proceedings.mlr.press/v115/gessner20a.html PDF: http://proceedings.mlr.press/v115/gessner20a/gessner20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-gessner20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Alexandra family: Gessner - given: Javier family: Gonzalez - given: Maren family: Mahsereci editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 712-721 id: gessner20a issued: date-parts: - 2020 - 8 - 6 firstpage: 712 lastpage: 721 published: 2020-08-06 00:00:00 +0000 - title: 'Cascading Linear Submodular Bandits: Accounting for Position Bias and Diversity in Online Learning to Rank' abstract: 'Online learning, position bias, and diversified retrieval are three crucial aspects in designing ranking systems based on user clicks. One simple click model which explains the position bias is the cascade model. Many online learning variants of the cascade model have been proposed, but none so far captures diversity. Similarly, online learning to rank methods which capture diversity do not handle position bias. Motivated by these limitations, we propose a novel click model and the associated online learning variant to address both position bias and diversified retrieval in a unified framework. Despite the objective function not being a standard submodular set function, we prove that a greedy approach is a reasonable approximation.
We then propose CascadeLSB, an algorithm whose goal is to recommend the $K$ most attractive items from a large set of $L$ items, with the attractiveness of items modeled as a submodular utility function. We analyze the algorithm and prove a gap-free upper bound on its $n$-step regret. We comprehensively evaluate CascadeLSB on synthetic and real datasets, where it outperforms all the baselines.' volume: 115 URL: https://proceedings.mlr.press/v115/hiranandani20a.html PDF: http://proceedings.mlr.press/v115/hiranandani20a/hiranandani20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-hiranandani20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Gaurush family: Hiranandani - given: Harvineet family: Singh - given: Prakhar family: Gupta - given: Iftikhar Ahamath family: Burhanuddin - given: Zheng family: Wen - given: Branislav family: Kveton editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 722-732 id: hiranandani20a issued: date-parts: - 2020 - 8 - 6 firstpage: 722 lastpage: 732 published: 2020-08-06 00:00:00 +0000 - title: 'Sinkhorn AutoEncoders' abstract: 'Optimal transport offers an alternative to maximum likelihood for learning generative autoencoding models. We show that minimizing the $p$-Wasserstein distance between the generator and the true data distribution is equivalent to the unconstrained min-min optimization of the $p$-Wasserstein distance between the encoder aggregated posterior and the prior in latent space, plus a reconstruction error. We also identify the role of its trade-off hyperparameter as the capacity of the generator: its Lipschitz constant. Moreover, we prove that optimizing the encoder over any class of universal approximators, such as deterministic neural networks, is enough to come arbitrarily close to the optimum. We therefore advertise this framework, which holds for any metric space and prior, as a sweet spot of current generative autoencoding objectives. We then introduce the Sinkhorn autoencoder (SAE), which approximates and minimizes the $p$-Wasserstein distance in latent space via backpropagation through the Sinkhorn algorithm. SAE directly works on samples, i.e. it models the aggregated posterior as an implicit distribution, with no need for a reparameterization trick for gradient estimation. SAE is thus able to work with different metric spaces and priors with minimal adaptations. We demonstrate the flexibility of SAE on latent spaces with different geometries and priors and compare with other methods on benchmark data sets.' volume: 115 URL: https://proceedings.mlr.press/v115/patrini20a.html PDF: http://proceedings.mlr.press/v115/patrini20a/patrini20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-patrini20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Giorgio family: Patrini - given: Rianne prefix: van den family: Berg - given: Patrick family: Forré - given: Marcello family: Carioni - given: Samarth family: Bhargav - given: Max family: Welling - given: Tim family: Genewein - given: Frank family: Nielsen editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 733-743 id: patrini20a issued: date-parts: - 2020 - 8 - 6 firstpage: 733 lastpage: 743 published: 2020-08-06 00:00:00 +0000 - title: 'How to Exploit Structure while Solving Weighted Model Integration Problems' abstract: 'Weighted model counting (WMC) is a state-of-the-art technique for probabilistic inference in discrete domains. WMC has recently been extended towards weighted model integration (WMI) in order to handle discrete and continuous distributions alike. While a number of WMI solvers have been introduced, their relationships, strengths and weaknesses are not yet well understood. WMI solving consists of two sub-problems: 1) finding convex polytopes; and 2) integrating over them efficiently. We formalize the first step as $\lambda$-SMT and discuss what strategies solvers apply to solve both the $\lambda$-SMT and the integration problem. This formalization allows us to compare state-of-the-art solvers and their behaviour across different types of WMI problems. Moreover, we identify factorizability of WMI problems as a key property that emerges in the context of probabilistic programming. Problems that can be factorized can be solved more efficiently. However, current solvers exploiting this property restrict themselves to WMI problems with univariate conditions and fully factorizable weight functions. We introduce a new algorithm, F-XSDD, that lifts these restrictions and can exploit factorizability in WMI problems with multivariate conditions and partially factorizable weight functions. Through an empirical evaluation, we show the effectiveness of our approach.' volume: 115 URL: https://proceedings.mlr.press/v115/kolb20a.html PDF: http://proceedings.mlr.press/v115/kolb20a/kolb20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-kolb20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Samuel family: Kolb - given: Pedro Zuidberg Dos family: Martires - given: Luc family: De Raedt editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 744-754 id: kolb20a issued: date-parts: - 2020 - 8 - 6 firstpage: 744 lastpage: 754 published: 2020-08-06 00:00:00 +0000 - title: 'Multi-Class Gaussian Process Classification Made Conjugate: Efficient Inference via Data Augmentation' abstract: 'We propose a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. The new likelihood has two benefits: it leads to well-calibrated uncertainty estimates and allows for an efficient latent variable augmentation. The augmented model has the advantage that it is conditionally conjugate, leading to a fast variational inference method via block coordinate ascent updates. Previous approaches suffered from a trade-off between uncertainty calibration and speed. Our experiments show that our method leads to well-calibrated uncertainty estimates and competitive predictive performance while being up to two orders of magnitude faster than the state of the art.'
volume: 115 URL: https://proceedings.mlr.press/v115/galy-fajou20a.html PDF: http://proceedings.mlr.press/v115/galy-fajou20a/galy-fajou20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-galy-fajou20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Théo family: Galy-Fajou - given: Florian family: Wenzel - given: Christian family: Donner - given: Manfred family: Opper editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 755-765 id: galy-fajou20a issued: date-parts: - 2020 - 8 - 6 firstpage: 755 lastpage: 765 published: 2020-08-06 00:00:00 +0000 - title: 'A Flexible Framework for Multi-Objective Bayesian Optimization using Random Scalarizations' abstract: 'Many real-world applications can be framed as multi-objective optimization (MOO) problems, where we wish to simultaneously optimize for multiple criteria. Bayesian optimization techniques for the multi-objective setting are pertinent when the evaluation of the functions in question is expensive. Traditional methods for multi-objective optimization, both Bayesian and otherwise, are aimed at recovering the Pareto front of these objectives. However, in certain cases a practitioner might desire to identify Pareto optimal points only in a subset of the Pareto front due to external considerations. In this work, we propose a strategy based on random scalarizations of the objectives that addresses this problem. Our approach is able to flexibly sample from desired regions of the Pareto front and, computationally, is considerably cheaper than most approaches for MOO. We also study a notion of regret in the multi-objective setting and show that our strategy achieves sublinear regret. We experiment with both synthetic and real-life problems, and demonstrate superior performance of our proposed algorithm in terms of both flexibility and regret.' volume: 115 URL: https://proceedings.mlr.press/v115/paria20a.html PDF: http://proceedings.mlr.press/v115/paria20a/paria20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-paria20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Biswajit family: Paria - given: Kirthevasan family: Kandasamy - given: Barnabás family: Póczos editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 766-776 id: paria20a issued: date-parts: - 2020 - 8 - 6 firstpage: 766 lastpage: 776 published: 2020-08-06 00:00:00 +0000 - title: 'Efficient Multitask Feature and Relationship Learning' abstract: 'We consider a multitask learning problem, in which several predictors are learned jointly. Prior research has shown that learning the relations between tasks, and between the input features, together with the predictor, can lead to better generalization and interpretability, which has proved useful for applications in many domains. In this paper, we consider a formulation of multitask learning that learns the relationships both between tasks and between features, represented through a task covariance and a feature covariance matrix, respectively. First, we demonstrate that existing methods proposed for this problem present an issue that may lead to ill-posed optimization. We then propose an alternative formulation, as well as an efficient algorithm to optimize it.
Using ideas from optimization and graph theory, we propose an efficient coordinate-wise minimization algorithm that has a closed form solution for each block subproblem. Our experiments show that the proposed optimization method is orders of magnitude faster than its competitors. We also provide a nonlinear extension that is able to achieve better generalization than existing methods.' volume: 115 URL: https://proceedings.mlr.press/v115/zhao20a.html PDF: http://proceedings.mlr.press/v115/zhao20a/zhao20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-zhao20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Han family: Zhao - given: Otilia family: Stretcu - given: Alexander J. family: Smola - given: Geoffrey J. family: Gordon editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 777-787 id: zhao20a issued: date-parts: - 2020 - 8 - 6 firstpage: 777 lastpage: 787 published: 2020-08-06 00:00:00 +0000 - title: 'Practical Multi-fidelity Bayesian Optimization for Hyperparameter Tuning' abstract: 'Bayesian optimization is popular for optimizing time-consuming black-box objectives. Nonetheless, for hyperparameter tuning in deep neural networks, the time required to evaluate the validation error for even a few hyperparameter settings remains a bottleneck. Multi-fidelity optimization promises relief using cheaper proxies to such objectives — for example, validation error for a network trained using a subset of the training points or fewer iterations than required for convergence. We propose a highly flexible and practical approach to multi-fidelity Bayesian optimization, focused on efficiently optimizing hyperparameters for iteratively trained supervised learning models. We introduce a new acquisition function, the trace-aware knowledge-gradient, which efficiently leverages both multiple continuous fidelity controls and trace observations — values of the objective at a sequence of fidelities, available when varying fidelity using training iterations. We provide a provably convergent method for optimizing our acquisition function and show it outperforms state-of-the-art alternatives for hyperparameter tuning of deep neural networks and large-scale kernel learning.' volume: 115 URL: https://proceedings.mlr.press/v115/wu20a.html PDF: http://proceedings.mlr.press/v115/wu20a/wu20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-wu20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Jian family: Wu - given: Saul family: Toscano-Palmerin - given: Peter I. family: Frazier - given: Andrew Gordon family: Wilson editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 788-798 id: wu20a issued: date-parts: - 2020 - 8 - 6 firstpage: 788 lastpage: 798 published: 2020-08-06 00:00:00 +0000 - title: 'Adaptively Truncating Backpropagation Through Time to Control Gradient Bias' abstract: 'Truncated backpropagation through time (TBPTT) is a popular method for learning in recurrent neural networks (RNNs) that saves computation and memory at the cost of bias by truncating backpropagation after a fixed number of lags. 
In practice, choosing the optimal truncation length is difficult: TBPTT will not converge if the truncation length is too small, or will converge slowly if it is too large. We propose an adaptive TBPTT scheme that converts the problem from choosing a temporal lag to one of choosing a tolerable amount of gradient bias. For many realistic RNNs, the TBPTT gradients decay geometrically in expectation for large lags; under this condition, we can control the bias by varying the truncation length adaptively. For RNNs with smooth activation functions, we prove that this bias controls the convergence rate of SGD with biased gradients for our non-convex loss. Using this theory, we develop a practical method for adaptively estimating the truncation length during training. We evaluate our adaptive TBPTT method on synthetic data and language modeling tasks and find that it ameliorates the computational pitfalls of fixed TBPTT.' volume: 115 URL: https://proceedings.mlr.press/v115/aicher20a.html PDF: http://proceedings.mlr.press/v115/aicher20a/aicher20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-aicher20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Christopher family: Aicher - given: Nicholas J. family: Foti - given: Emily B. family: Fox editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 799-808 id: aicher20a issued: date-parts: - 2020 - 8 - 6 firstpage: 799 lastpage: 808 published: 2020-08-06 00:00:00 +0000 - title: 'Sampling-free Uncertainty Estimation in Gated Recurrent Units with Applications to Normative Modeling in Neuroimaging' abstract: 'There has recently been a concerted effort to derive mechanisms in vision and machine learning systems to offer uncertainty estimates of the predictions they make. Clearly, there are enormous benefits to a system that is not only accurate but also has a sense for when it is not. Existing proposals center around Bayesian interpretations of modern deep architectures; these are effective but can often be computationally demanding. We show how classical ideas in the literature on exponential families on probabilistic networks provide an excellent starting point to derive uncertainty estimates in Gated Recurrent Units (GRU). Our proposal directly quantifies uncertainty deterministically, without the need for costly sampling-based estimation. While uncertainty is quite useful by itself in computer vision and machine learning, we also demonstrate that it can play a key role in enabling statistical analysis with deep networks in neuroimaging studies with normative modeling methods. To our knowledge, this is the first result describing sampling-free uncertainty estimation for powerful sequential models such as GRUs.' volume: 115 URL: https://proceedings.mlr.press/v115/hwang20a.html PDF: http://proceedings.mlr.press/v115/hwang20a/hwang20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-hwang20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Seong Jae family: Hwang - given: Ronak R. family: Mehta - given: Hyunwoo J. family: Kim - given: Sterling C. family: Johnson - given: Vikas family: Singh editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 809-819 id: hwang20a issued: date-parts: - 2020 - 8 - 6 firstpage: 809 lastpage: 819 published: 2020-08-06 00:00:00 +0000 - title: 'Online Factorization and Partition of Complex Networks by Random Walk' abstract: 'Finding the reduced-dimensional structure is critical to understanding complex networks. Existing approaches such as spectral clustering are applicable only when the full network is explicitly observed. In this paper, we focus on the online factorization and partition of implicit large lumpable networks based on observations from an associated random walk. We formulate this as a nonconvex stochastic factorization problem and propose an efficient and scalable stochastic generalized Hebbian algorithm (GHA). The algorithm is able to process random walk data in a streaming fashion and learn a low-dimensional representation for each vertex. By applying a diffusion approximation analysis, we show that the continuous-time limiting process of the stochastic algorithm converges globally to the “principal components” of the Markov chain. We also establish a finite-sample error bound that matches the non-improvable state-of-the-art result for online factorization. Once the low-dimensional state representations are learned, we further apply clustering techniques to recover the network partition. We show that when the associated Markov process is lumpable, one can recover the partition exactly with high probability given sufficient data. We apply the proposed approach to model the traffic flow of Manhattan as city-wide random walks. By using our algorithm to analyze the taxi trip data, we discover a latent partition of Manhattan that closely matches the traffic dynamics.' volume: 115 URL: https://proceedings.mlr.press/v115/yang20a.html PDF: http://proceedings.mlr.press/v115/yang20a/yang20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-yang20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Lin family: Yang - given: Zheng family: Yu - given: Vladimir family: Braverman - given: Tuo family: Zhao - given: Mengdi family: Wang editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 820-830 id: yang20a issued: date-parts: - 2020 - 8 - 6 firstpage: 820 lastpage: 830 published: 2020-08-06 00:00:00 +0000 - title: 'On Densification for Minwise Hashing' abstract: 'One Permutation Hashing (OPH) is a significantly more efficient alternative to the popular minwise hashing. To produce a sketch of size $k$, OPH requires just one hash function whereas the classical minwise hashing requires $k$ hash functions. However, OPH does not have the desirable locality sensitive hashing (LSH) property that is important for indexing. [Srivastava and Li 2014, ICML] proposed the novel idea of densification to produce LSH sketches from OPH sketches, and gave the first densification routine. In this paper, we give a necessary and sufficient condition for a densification routine to result in LSH sketches when applied to OPH sketches. Furthermore, we give a novel densification routine that for every input takes $O(k \log k)$ time in expectation and achieves better variance than the previous best bound obtained by [Srivastava 2017]. The running time of the densification routine given in [Srivastava 2017] for worst-case inputs is $O(k^2)$ in expectation.'
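For context on the densification step discussed above, the following is a minimal sketch of one permutation hashing followed by a simple rotation-style densification in the spirit of Srivastava and Li [2014]; the routine proposed in this paper is different (it runs in $O(k \log k)$ expected time and achieves better variance), and the hash construction below is only illustrative.

```python
import random

def oph_sketch(items, k, seed=0):
    """One Permutation Hashing: hash every item once, split the hash
    range into k bins, and keep the minimum hashed value per bin.
    Bins that receive no item are left as None."""
    salt = random.Random(seed).getrandbits(32)
    sketch = [None] * k
    for it in items:
        h = hash((salt, it)) & 0xFFFFFFFF  # illustrative 32-bit hash
        b, v = h % k, h // k               # bin index, value within the bin
        if sketch[b] is None or v < sketch[b]:
            sketch[b] = v
    return sketch

def densify(sketch):
    """Fill each empty bin with the value of the nearest non-empty bin
    to its right (circularly), so the sketch regains the LSH property.
    Assumes at least one bin is non-empty."""
    k = len(sketch)
    dense = list(sketch)
    for b in range(k):
        j = b
        while sketch[j] is None:
            j = (j + 1) % k
        dense[b] = sketch[j]
    return dense
```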
volume: 115 URL: https://proceedings.mlr.press/v115/mai20a.html PDF: http://proceedings.mlr.press/v115/mai20a/mai20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-mai20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Tung family: Mai - given: Anup family: Rao - given: Matt family: Kapilevich - given: Ryan family: Rossi - given: Yasin family: Abbasi-Yadkori - given: Ritwik family: Sinha editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 831-840 id: mai20a issued: date-parts: - 2020 - 8 - 6 firstpage: 831 lastpage: 840 published: 2020-08-06 00:00:00 +0000 - title: 'N-GCN: Multi-scale Graph Convolution for Semi-supervised Node Classification' abstract: 'Graph Convolutional Networks (GCNs) have shown significant improvements in semi-supervised learning on graph-structured data. Concurrently, unsupervised learning of graph embeddings has benefited from the information contained in random walks. In this paper, we propose a model: Network of GCNs (N-GCN), which marries these two lines of work. At its core, N-GCN trains multiple instances of GCNs over node pairs discovered at different distances in random walks, and learns a combination of the instance outputs which optimizes the classification objective. Our experiments show that our proposed N-GCN model improves state-of-the-art baselines on all of the challenging node classification tasks we consider: Cora, Citeseer, Pubmed, and PPI. In addition, our proposed method has other desirable properties, including generalization to recently proposed semi-supervised learning methods such as GraphSAGE, allowing us to propose N-SAGE, and resilience to adversarial input perturbations.' volume: 115 URL: https://proceedings.mlr.press/v115/abu-el-haija20a.html PDF: http://proceedings.mlr.press/v115/abu-el-haija20a/abu-el-haija20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-abu-el-haija20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Sami family: Abu-El-Haija - given: Amol family: Kapoor - given: Bryan family: Perozzi - given: Joonseok family: Lee editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 841-851 id: abu-el-haija20a issued: date-parts: - 2020 - 8 - 6 firstpage: 841 lastpage: 851 published: 2020-08-06 00:00:00 +0000 - title: 'Problem-dependent Regret Bounds for Online Learning with Feedback Graphs' abstract: 'This paper addresses the stochastic multi-armed bandit problem with an undirected feedback graph. We devise a UCB-based algorithm, UCB-NE, to provide a problem-dependent regret bound that depends on a clique covering. Our algorithm obtains regret which provably scales linearly with the clique covering number. Additionally, we provide problem-dependent regret bounds for a Thompson Sampling-based algorithm, TS-N, where again the bounds are linear in the clique covering number. Finally, we present experimental results to see how UCB-NE, TS-N, and a few related algorithms perform practically.' 
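The clique-covering index of UCB-NE is developed in the paper; as a generic illustration of the side-observation setting it targets, here is a minimal sketch of UCB-style learning with an undirected feedback graph, where playing an arm also reveals reward samples for its neighbors (the exploration constant and interfaces are illustrative assumptions, not the paper's algorithm).

```python
import math
import random

def ucb_with_graph_feedback(arms, neighbors, sample, horizon, c=2.0):
    """arms: list of arm ids; neighbors[a]: arms adjacent to a;
    sample(a): draws a stochastic reward in [0, 1] for arm a.
    Each play of an arm reveals rewards for it and all its neighbors,
    which is what lets regret scale with a clique covering of the
    graph rather than with the number of arms."""
    counts = {a: 0 for a in arms}
    means = {a: 0.0 for a in arms}
    for t in range(1, horizon + 1):
        unseen = [a for a in arms if counts[a] == 0]
        if unseen:
            a = random.choice(unseen)
        else:
            a = max(arms, key=lambda b: means[b] + math.sqrt(c * math.log(t) / counts[b]))
        for b in {a, *neighbors[a]}:  # graph feedback: free side observations
            r = sample(b)
            counts[b] += 1
            means[b] += (r - means[b]) / counts[b]
    return means
```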
volume: 115 URL: https://proceedings.mlr.press/v115/hu20b.html PDF: http://proceedings.mlr.press/v115/hu20b/hu20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-hu20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Bingshan family: Hu - given: Nishant A. family: Mehta - given: Jianping family: Pan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 852-861 id: hu20b issued: date-parts: - 2020 - 8 - 6 firstpage: 852 lastpage: 861 published: 2020-08-06 00:00:00 +0000 - title: 'Wasserstein Fair Classification' abstract: 'We propose an approach to fair classification that enforces independence between the classifier outputs and sensitive information by minimizing Wasserstein-1 distances. The approach has desirable theoretical properties and is robust to specific choices of the threshold used to obtain class predictions from model outputs. We introduce different methods that enable hiding sensitive information at test time or have a simple and fast implementation. We show empirical performance against different fairness baselines on several benchmark fairness datasets.' volume: 115 URL: https://proceedings.mlr.press/v115/jiang20a.html PDF: http://proceedings.mlr.press/v115/jiang20a/jiang20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-jiang20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Ray family: Jiang - given: Aldo family: Pacchiano - given: Tom family: Stepleton - given: Heinrich family: Jiang - given: Silvia family: Chiappa editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 862-872 id: jiang20a issued: date-parts: - 2020 - 8 - 6 firstpage: 862 lastpage: 872 published: 2020-08-06 00:00:00 +0000 - title: 'Variational Training for Large-Scale Noisy-OR Bayesian Networks' abstract: 'We propose a stochastic variational inference algorithm for training large-scale Bayesian networks, where noisy-OR conditional distributions are used to capture higher-order relationships. One application is to the learning of hierarchical topic models for text data. While previous work has focused on two-layer networks popular in applications like medical diagnosis, we develop scalable algorithms for deep networks that capture a multi-level hierarchy of interactions. Our key innovation is a family of constrained variational bounds that only explicitly optimize posterior probabilities for the sub-graph of topics most related to the sparse observations in a given document. These constrained bounds have comparable accuracy but dramatically reduced computational cost. Using stochastic gradient updates based on our variational bounds, we learn noisy-OR Bayesian networks orders of magnitude faster than was possible with prior Monte Carlo learning algorithms, and provide a new tool for understanding large-scale binary data.'
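The noisy-OR conditional distribution underlying the model above has a simple closed form, which the following sketch evaluates; the leak probability and weight names are illustrative, not the paper's notation.

```python
import numpy as np

def noisy_or_prob(parent_states, weights, leak=0.01):
    """P(child = 1 | parents) under a noisy-OR CPD: each active parent i
    independently fails to turn the child on with probability 1 - w_i,
    so P(child = 0 | x) = (1 - leak) * prod_i (1 - w_i)**x_i."""
    x = np.asarray(parent_states, dtype=float)
    w = np.asarray(weights, dtype=float)
    p_off = (1.0 - leak) * np.prod((1.0 - w) ** x)
    return 1.0 - p_off

# Two active causes with weights 0.8 and 0.5 and a 1% leak:
# noisy_or_prob([1, 1, 0], [0.8, 0.5, 0.9]) == 1 - 0.99 * 0.2 * 0.5 == 0.901
```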
volume: 115 URL: https://proceedings.mlr.press/v115/ji20a.html PDF: http://proceedings.mlr.press/v115/ji20a/ji20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-ji20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Geng family: Ji - given: Dehua family: Cheng - given: Huazhong family: Ning - given: Changhe family: Yuan - given: Hanning family: Zhou - given: Liang family: Xiong - given: Erik B. family: Sudderth editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 873-882 id: ji20a issued: date-parts: - 2020 - 8 - 6 firstpage: 873 lastpage: 882 published: 2020-08-06 00:00:00 +0000 - title: 'Guaranteed Scalable Learning of Latent Tree Models' abstract: 'We present an integrated approach to structure and parameter estimation in latent tree graphical models, where some nodes are hidden. Our overall approach follows a “divide-and-conquer” strategy that learns models over small groups of variables and iteratively merges them into a global solution. The structure learning involves combinatorial operations such as minimum spanning tree construction and local recursive grouping; the parameter learning is based on the method of moments and on tensor decompositions. Our method is guaranteed to correctly recover the unknown tree structure and the model parameters with low sample complexity for the class of linear multivariate latent tree models, which includes discrete and Gaussian distributions, and Gaussian mixtures. Our bulk-asynchronous parallel algorithm scales logarithmically with the number of variables and linearly with the dimensionality of each variable.' volume: 115 URL: https://proceedings.mlr.press/v115/huang20b.html PDF: http://proceedings.mlr.press/v115/huang20b/huang20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-huang20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Furong family: Huang - given: Niranjan Uma family: Naresh - given: Ioakeim family: Perros - given: Robert family: Chen - given: Jimeng family: Sun - given: Anima family: Anandkumar editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 883-893 id: huang20b issued: date-parts: - 2020 - 8 - 6 firstpage: 883 lastpage: 893 published: 2020-08-06 00:00:00 +0000 - title: 'On First-Order Bounds, Variance and Gap-Dependent Bounds for Adversarial Bandits' abstract: 'We make three contributions to the theory of $k$-armed adversarial bandits. First, we prove a first-order bound for a modified variant of the INF strategy by Audibert and Bubeck [2009], without sacrificing worst-case optimality or modifying the loss estimators. Second, we provide a variance analysis for algorithms based on follow the regularised leader, showing that without adaptation the variance of the regret is typically $\Omega(n^2)$, where $n$ is the horizon. Finally, we study bounds that depend on the degree of separation of the arms, generalising the results of Cowan and Katehakis [2015] from the stochastic setting to the adversarial one, and improving the result of Seldin and Slivkins [2014] by a factor of $\log(n)/\log(\log(n))$.'
volume: 115 URL: https://proceedings.mlr.press/v115/pogodin20a.html PDF: http://proceedings.mlr.press/v115/pogodin20a/pogodin20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-pogodin20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Roman family: Pogodin - given: Tor family: Lattimore editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 894-904 id: pogodin20a issued: date-parts: - 2020 - 8 - 6 firstpage: 894 lastpage: 904 published: 2020-08-06 00:00:00 +0000 - title: 'Noise Contrastive Priors for Functional Uncertainty' abstract: 'Obtaining reliable uncertainty estimates of neural network predictions is a long-standing challenge. Bayesian neural networks have been proposed as a solution, but it remains open how to specify their prior. In particular, the common practice of an independent normal prior in weight space imposes relatively weak constraints on the function posterior, allowing it to generalize in unforeseen ways on inputs outside of the training distribution. We propose noise contrastive priors (NCPs) to obtain reliable uncertainty estimates. The key idea is to train the model to output high uncertainty for data points outside of the training distribution. NCPs do so using an input prior, which adds noise to the inputs of the current mini-batch, and an output prior, which is a wide distribution given these inputs. NCPs are compatible with any model that can output uncertainty estimates, are easy to scale, and yield reliable uncertainty estimates throughout training. Empirically, we show that NCPs prevent overfitting outside of the training distribution and result in uncertainty estimates that are useful for active learning. We demonstrate the scalability of our method on the flight delays data set, where we significantly improve upon previously published results.' volume: 115 URL: https://proceedings.mlr.press/v115/hafner20a.html PDF: http://proceedings.mlr.press/v115/hafner20a/hafner20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-hafner20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Danijar family: Hafner - given: Dustin family: Tran - given: Timothy family: Lillicrap - given: Alex family: Irpan - given: James family: Davidson editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 905-914 id: hafner20a issued: date-parts: - 2020 - 8 - 6 firstpage: 905 lastpage: 914 published: 2020-08-06 00:00:00 +0000 - title: 'Fake It Till You Make It: Learning-Compatible Performance Support' abstract: 'A longstanding goal of artificial intelligence is to develop technologies that augment or assist humans. Current approaches to developing agents that can assist humans focus on adapting the behavior of the assistant, and do not consider the potential for assistants to support human learning. We argue that in many cases it is worthwhile to provide assistance in a manner that also promotes task learning or skill maintenance. We term such assistance Learning-Compatible Performance Support, and present the Stochastic Q Bumpers algorithm for greatly improving learning outcomes while still providing high levels of performance support.
We demonstrate the effectiveness of our approach in multiple domains, including a complex flight control task.' volume: 115 URL: https://proceedings.mlr.press/v115/bragg20a.html PDF: http://proceedings.mlr.press/v115/bragg20a/bragg20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-bragg20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Jonathan family: Bragg - given: Emma family: Brunskill editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 915-924 id: bragg20a issued: date-parts: - 2020 - 8 - 6 firstpage: 915 lastpage: 924 published: 2020-08-06 00:00:00 +0000 - title: 'Literal or Pedagogic Human? Analyzing Human Model Misspecification in Objective Learning' abstract: 'It is incredibly easy for a system designer to misspecify the objective for an autonomous system ("robot"), thus motivating the desire to have the robot learn the objective from human behavior instead. Recent work has suggested that people have an interest in the robot performing well, and will thus behave pedagogically, choosing actions that are informative to the robot. In turn, robots benefit from interpreting the behavior by accounting for this pedagogy. In this work, we focus on misspecification: we argue that robots might not know whether people are being pedagogic or literal and that it is important to ask which assumption is safer to make. We cast objective learning into the more general form of a common-payoff game between the robot and human, and prove that in any such game literal interpretation is more robust to misspecification. Experiments with human data support our theoretical results and point to the sensitivity of the pedagogic assumption.' volume: 115 URL: https://proceedings.mlr.press/v115/milli20a.html PDF: http://proceedings.mlr.press/v115/milli20a/milli20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-milli20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Smitha family: Milli - given: Anca D. family: Dragan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 925-934 id: milli20a issued: date-parts: - 2020 - 8 - 6 firstpage: 925 lastpage: 934 published: 2020-08-06 00:00:00 +0000 - title: 'Convergence Analysis of Gradient-Based Learning in Continuous Games' abstract: 'Considering a class of gradient-based multi-agent learning algorithms in non-cooperative settings, we provide convergence guarantees to a neighborhood of a stable Nash equilibrium. In particular, we consider continuous games where agents learn in 1) deterministic settings with oracle access to their gradient and 2) stochastic settings with an unbiased estimator of their gradient. We also study the effects of non-uniform learning rates, which cause a distortion of the vector field that can alter which equilibrium the agents converge to and the path they take. We support the analysis with numerical examples that provide insight into how one might synthesize games to achieve desired equilibria.'
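As a concrete rendering of the deterministic oracle-gradient setting analyzed above, here is a minimal sketch of simultaneous individual gradient play with non-uniform learning rates; the cost functions, step sizes, and stopping rule are illustrative assumptions.

```python
import numpy as np

def gradient_play(grad_fns, x0, lrs, steps=1000):
    """Each agent i descends the gradient of its own cost with its own
    learning rate lrs[i]; non-uniform rates distort the combined vector
    field and can change which equilibrium the dynamics settle on."""
    x = [np.asarray(xi, dtype=float) for xi in x0]
    for _ in range(steps):
        grads = [g(x) for g in grad_fns]  # oracle access to each agent's gradient
        x = [xi - lr * gi for xi, lr, gi in zip(x, lrs, grads)]
    return x
```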
volume: 115 URL: https://proceedings.mlr.press/v115/chasnov20a.html PDF: http://proceedings.mlr.press/v115/chasnov20a/chasnov20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-chasnov20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Benjamin family: Chasnov - given: Lillian family: Ratliff - given: Eric family: Mazumdar - given: Samuel family: Burden editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 935-944 id: chasnov20a issued: date-parts: - 2020 - 8 - 6 firstpage: 935 lastpage: 944 published: 2020-08-06 00:00:00 +0000 - title: 'End-to-end Training of Deep Probabilistic CCA on Paired Biomedical Observations' abstract: 'Medical pathology images are visually evaluated by experts for disease diagnosis, but the connection between image features and the state of the cells in an image is typically unknown. To understand this relationship, we develop a multimodal modeling and inference framework that estimates shared latent structure of joint gene expression levels and medical image features. Our method is built around probabilistic canonical correlation analysis (PCCA), which is fit to image embeddings that are learned using convolutional neural networks and linear embeddings of paired gene expression data. Using a differentiable take on the EM algorithm, we train the model end-to-end so that the PCCA and neural network parameters are estimated simultaneously. We demonstrate the utility of this method in constructing image features that are predictive of gene expression levels on simulated data and the Genotype-Tissue Expression data. We demonstrate that the latent variables are interpretable by disentangling the latent subspace through shared and modality-specific views.' volume: 115 URL: https://proceedings.mlr.press/v115/gundersen20a.html PDF: http://proceedings.mlr.press/v115/gundersen20a/gundersen20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-gundersen20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Gregory family: Gundersen - given: Bianca family: Dumitrascu - given: Jordan T. family: Ash - given: Barbara E. family: Engelhardt editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 945-955 id: gundersen20a issued: date-parts: - 2020 - 8 - 6 firstpage: 945 lastpage: 955 published: 2020-08-06 00:00:00 +0000 - title: 'Approximate Relative Value Learning for Average-reward Continuous State MDPs' abstract: 'In this paper, we propose an approximate relative value learning (ARVL) algorithm for nonparametric MDPs with continuous state space, finite actions, and the average-reward criterion. It is a sampling-based algorithm combined with kernel density estimation and function approximation via nearest neighbors. The theoretical analysis is done via a random contraction operator framework and a stochastic dominance argument. To the best of our knowledge, this is the first such algorithm for continuous state space MDPs with the average-reward criterion that has these provable properties and does not require any discretization of the state space. We then evaluate the proposed algorithm on a benchmark problem numerically.'
volume: 115 URL: https://proceedings.mlr.press/v115/sharma20a.html PDF: http://proceedings.mlr.press/v115/sharma20a/sharma20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-sharma20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Hiteshi family: Sharma - given: Mehdi family: Jafarnia-Jahromi - given: Rahul family: Jain editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 956-964 id: sharma20a issued: date-parts: - 2020 - 8 - 6 firstpage: 956 lastpage: 964 published: 2020-08-06 00:00:00 +0000 - title: 'Exact Sampling of Directed Acyclic Graphs from Modular Distributions' abstract: 'We consider the problem of sampling directed acyclic graphs (DAGs) from a given distribution. We assume the sampling distribution is modular, i.e., the probability of a DAG is a product of local factors, each of which only depends on a node and its parents in the DAG. Using inclusion–exclusion recurrence relations, we give an exact sampler that requires $\tilde{O}(3^n)$ time for preprocessing and $\tilde{O}(2^n)$ per sample, where $n$ is the number of nodes and $\tilde{O}$ suppresses polylogarithmic factors. We also consider the symmetric special case where the factors only depend on the size of the parent set—this covers uniform sampling under indegree constraints. In this case, our exact sampler requires $O(n^3)$ time for preprocessing and $O(n^2)$ per sample; this outperforms the previous best bound even for the uniform distribution. We also demonstrate the performance of both samplers empirically.' volume: 115 URL: https://proceedings.mlr.press/v115/talvitie20a.html PDF: http://proceedings.mlr.press/v115/talvitie20a/talvitie20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-talvitie20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Topi family: Talvitie - given: Aleksis family: Vuoksenmaa - given: Mikko family: Koivisto editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 965-974 id: talvitie20a issued: date-parts: - 2020 - 8 - 6 firstpage: 965 lastpage: 974 published: 2020-08-06 00:00:00 +0000 - title: 'Intervening on Network Ties' abstract: 'A foundational tool for making causal inferences is the emulation of randomized control trials via variable interventions. This approach has been applied to a wide variety of contexts, from health to economics [3, 6]. Variable interventions have long been studied in independent and identically distributed (iid) data contexts, but recently non-iid settings, such as networks with interacting agents [8, 17, 28], have attracted interest. In this paper, we propose a type of structural intervention [12] relevant in network contexts: the network intervention. Rather than estimating the effect of changing variables, we consider changes to social network structure resulting from creation or severance of ties between agents. We define the individual participant and average bystander effects for these interventions and describe identification criteria. We then prove a series of theoretical results that show that existing identification theory obtains minimally KL-divergent distributions corresponding to network interventions. Finally, we demonstrate estimation of effects of network interventions via a simulation study.'
volume: 115 URL: https://proceedings.mlr.press/v115/sherman20a.html PDF: http://proceedings.mlr.press/v115/sherman20a/sherman20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-sherman20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Eli family: Sherman - given: Ilya family: Shpitser editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 975-984 id: sherman20a issued: date-parts: - 2020 - 8 - 6 firstpage: 975 lastpage: 984 published: 2020-08-06 00:00:00 +0000 - title: 'Generating and Sampling Orbits for Lifted Probabilistic Inference' abstract: 'A key goal in the design of probabilistic inference algorithms is identifying and exploiting properties of the distribution that make inference tractable. Lifted inference algorithms identify symmetry as a property that enables efficient inference and seek to scale with the degree of symmetry of a probability model. A limitation of existing exact lifted inference techniques is that they do not apply to non-relational representations like factor graphs. In this work, we provide the first example of an exact lifted inference algorithm for arbitrary discrete factor graphs. In addition, we describe a lifted Markov chain Monte Carlo algorithm that provably mixes rapidly in the degree of symmetry of the distribution.' volume: 115 URL: https://proceedings.mlr.press/v115/holtzen20a.html PDF: http://proceedings.mlr.press/v115/holtzen20a/holtzen20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-holtzen20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Steven family: Holtzen - given: Todd family: Millstein - given: Guy family: Van den Broeck editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 985-994 id: holtzen20a issued: date-parts: - 2020 - 8 - 6 firstpage: 985 lastpage: 994 published: 2020-08-06 00:00:00 +0000 - title: 'Real-Time Robotic Search using Structural Spatial Point Processes' abstract: 'Aerial robots hold great potential for aiding Search and Rescue (SAR) efforts over large areas, such as during natural disasters. Traditional approaches typically search an area exhaustively, thereby ignoring that the density of victims varies based on predictable factors, such as the terrain, population density and the type of disaster. We present a probabilistic model to automate SAR planning, with explicit minimization of the expected time to discovery. The proposed model is a spatial point process with three interacting spatial fields for i) the point patterns of persons in the area, ii) the probability of detecting persons and iii) the probability of injury. This structure allows inclusion of informative priors from e.g. geographic or cell phone traffic data, while falling back to latent Gaussian processes when priors are missing or inaccurate. To solve this problem in real-time, we propose a combination of fast approximate inference using Integrated Nested Laplace Approximation (INLA), and a novel Monte Carlo tree search tailored to the problem. Experiments using data simulated from real-world Geographic Information System (GIS) maps show that the framework outperforms competing approaches, finding many more injured in the crucial first hours.'
volume: 115 URL: https://proceedings.mlr.press/v115/andersson20a.html PDF: http://proceedings.mlr.press/v115/andersson20a/andersson20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-andersson20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Olov family: Andersson - given: Per family: Sidén - given: Johan family: Dahlin - given: Patrick family: Doherty - given: Mattias family: Villani editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 995-1005 id: andersson20a issued: date-parts: - 2020 - 8 - 6 firstpage: 995 lastpage: 1005 published: 2020-08-06 00:00:00 +0000 - title: 'Social Reinforcement Learning to Combat Fake News Spread' abstract: 'In this work, we develop a social reinforcement learning approach to combat the spread of fake news. Specifically, we aim to learn an intervention model to promote the spread of true news in a social network—in order to mitigate the impact of fake news. We model news diffusion as a Multivariate Hawkes Process (MHP) and make interventions that are learnt via policy optimization. The key insight is to estimate the response a user will get from the social network upon sharing a post, as it indicates her impact on diffusion, and will thus help in the efficient allocation of incentives. User responses also depend on political bias and peer influence, which we model as a second MHP, interleaving it with the news diffusion process. We evaluate our model on semi-synthetic and real-world data. The results demonstrate that our proposed model outperforms other alternatives that do not consider estimates of user responses and political bias when learning how to allocate incentives.' volume: 115 URL: https://proceedings.mlr.press/v115/goindani20a.html PDF: http://proceedings.mlr.press/v115/goindani20a/goindani20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-goindani20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Mahak family: Goindani - given: Jennifer family: Neville editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1006-1016 id: goindani20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1006 lastpage: 1016 published: 2020-08-06 00:00:00 +0000 - title: 'P3O: Policy-on Policy-off Policy Optimization' abstract: 'On-policy reinforcement learning (RL) algorithms have high sample complexity while off-policy algorithms are difficult to tune. Merging the two holds the promise of developing efficient algorithms that generalize across diverse environments. It is, however, challenging in practice to find suitable hyper-parameters that govern this trade-off. This paper develops a simple algorithm named P3O that interleaves off-policy updates with on-policy updates. P3O uses the effective sample size between the behavior policy and the target policy to control how far they can be from each other and does not introduce any additional hyper-parameters. Extensive experiments on the Atari-2600 and MuJoCo benchmark suites show that this simple technique is effective in reducing the sample complexity of state-of-the-art algorithms. Code to reproduce experiments in this paper is at https://github.com/rasoolfa/P3O.'
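P3O's interleaving is governed by the effective sample size between the behavior and target policies; a minimal sketch of the standard normalized ESS estimate computed from importance weights follows (how P3O folds this quantity into its updates is specified in the paper).

```python
import numpy as np

def effective_sample_size(logp_target, logp_behavior):
    """Normalized ESS in (0, 1] of the importance weights
    w_i = pi_target(a_i | s_i) / pi_behavior(a_i | s_i):
        ESS = (sum_i w_i)**2 / (n * sum_i w_i**2).
    Values near 1 indicate the two policies are close, so off-policy
    samples remain informative; small values signal weight degeneracy."""
    w = np.exp(np.asarray(logp_target) - np.asarray(logp_behavior))
    return float(w.sum() ** 2 / (len(w) * (w ** 2).sum()))
```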
volume: 115 URL: https://proceedings.mlr.press/v115/fakoor20a.html PDF: http://proceedings.mlr.press/v115/fakoor20a/fakoor20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-fakoor20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Rasool family: Fakoor - given: Pratik family: Chaudhari - given: Alexander J. family: Smola editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1017-1027 id: fakoor20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1017 lastpage: 1027 published: 2020-08-06 00:00:00 +0000 - title: 'Causal Inference Under Interference And Network Uncertainty' abstract: 'Classical causal and statistical inference methods typically assume the observed data consists of independent realizations. However, in many applications this assumption is inappropriate due to a network of dependences between units in the data. Methods for estimating causal effects have been developed in the setting where the structure of dependence between units is known exactly, but in practice there is often substantial uncertainty about the precise network structure. This is true, for example, in trial data drawn from vulnerable communities where social ties are difficult to query directly. In this paper we combine techniques from the structure learning and interference literatures in causal inference, proposing a general method for estimating causal effects under data dependence when the structure of this dependence is not known a priori. We demonstrate the utility of our method on synthetic datasets which exhibit network dependence.' volume: 115 URL: https://proceedings.mlr.press/v115/bhattacharya20a.html PDF: http://proceedings.mlr.press/v115/bhattacharya20a/bhattacharya20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-bhattacharya20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Rohit family: Bhattacharya - given: Daniel family: Malinsky - given: Ilya family: Shpitser editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1028-1038 id: bhattacharya20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1028 lastpage: 1038 published: 2020-08-06 00:00:00 +0000 - title: 'Revisiting Reweighted Wake-Sleep for Models with Stochastic Control Flow' abstract: 'Stochastic control-flow models (SCFMs) are a class of generative models that involve branching on choices from discrete random variables. Amortized gradient-based learning of SCFMs is challenging, as most approaches targeting discrete variables rely on their continuous relaxations, which can be intractable in SCFMs because branching on relaxations requires evaluating all (exponentially many) branching paths. Tractable alternatives mainly combine REINFORCE with complex control-variate schemes to reduce the variance of naive estimators. Here, we revisit the reweighted wake-sleep (RWS) [5] algorithm, and through extensive evaluations, show that it outperforms current state-of-the-art methods in learning SCFMs. Further, in contrast to the importance weighted autoencoder, we observe that RWS learns better models and inference networks with increasing numbers of particles. Our results suggest that RWS is a competitive, often preferable, alternative for learning SCFMs.'
volume: 115 URL: https://proceedings.mlr.press/v115/le20a.html PDF: http://proceedings.mlr.press/v115/le20a/le20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-le20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Tuan Anh family: Le - given: Adam R. family: Kosiorek - given: N. family: Siddharth - given: Yee Whye family: Teh - given: Frank family: Wood editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1039-1049 id: le20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1039 lastpage: 1049 published: 2020-08-06 00:00:00 +0000 - title: 'Learnability for the Information Bottleneck' abstract: 'The Information Bottleneck (IB) method (Tishby et al. (2000)) provides an insightful and principled approach for balancing compression and prediction for representation learning. The IB objective I(X;Z) − βI(Y;Z) employs a Lagrange multiplier β to tune this trade-off. However, in practice, not only is β chosen empirically without theoretical guidance, but there is also a lack of theoretical understanding of the relationship between β, learnability, the intrinsic nature of the dataset, and model capacity. In this paper, we show that if β is improperly chosen, learning cannot happen: the trivial representation P(Z|X) = P(Z) becomes the global minimum of the IB objective. We show how this can be avoided by identifying a sharp phase transition between the unlearnable and the learnable regimes that arises as β is varied. This phase transition defines the concept of IB-Learnability. We prove several sufficient conditions for IB-Learnability, which provide theoretical guidance for choosing a good β. We further show that IB-Learnability is determined by the largest confident, typical, and imbalanced subset of the examples (the conspicuous subset), and discuss its relation to model capacity. We give practical algorithms to estimate the minimum β for a given dataset. We also empirically demonstrate our theoretical conditions with analyses of synthetic datasets, MNIST, and CIFAR10.' volume: 115 URL: https://proceedings.mlr.press/v115/wu20b.html PDF: http://proceedings.mlr.press/v115/wu20b/wu20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-wu20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Tailin family: Wu - given: Ian family: Fischer - given: Isaac L. family: Chuang - given: Max family: Tegmark editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1050-1060 id: wu20b issued: date-parts: - 2020 - 8 - 6 firstpage: 1050 lastpage: 1060 published: 2020-08-06 00:00:00 +0000 - title: 'Learning Belief Representations for Imitation Learning in POMDPs' abstract: 'We consider the problem of imitation learning from expert demonstrations in partially observable Markov decision processes (POMDPs). Belief representations, which characterize the distribution over the latent states in a POMDP, have been modeled using recurrent neural networks and probabilistic latent variable models, and shown to be effective for reinforcement learning in POMDPs. In this work, we investigate the belief representation learning problem for generative adversarial imitation learning in POMDPs.
Instead of training the belief module and the policy separately as suggested in prior work, we learn the belief module jointly with the policy, using a task-aware imitation loss to ensure that the representation is more aligned with the policy’s objective. To improve the robustness of the representation, we introduce several informative belief regularization techniques, including multi-step prediction of dynamics and action sequences. Evaluated on various partially observable continuous-control locomotion tasks, our belief-module imitation learning approach (BMIL) substantially outperforms several baselines, including the original GAIL algorithm and the task-agnostic belief learning algorithm. Extensive ablation analysis indicates the effectiveness of task-aware belief learning and belief regularization. Code for the project is available online.' volume: 115 URL: https://proceedings.mlr.press/v115/gangwani20a.html PDF: http://proceedings.mlr.press/v115/gangwani20a/gangwani20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-gangwani20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Tanmay family: Gangwani - given: Joel family: Lehman - given: Qiang family: Liu - given: Jian family: Peng editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1061-1071 id: gangwani20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1061 lastpage: 1071 published: 2020-08-06 00:00:00 +0000 - title: 'Object Conditioning for Causal Inference' abstract: 'We describe and analyze a form of conditioning that is widely applied within social science and applied statistics but that is virtually unknown within causal graphical models. This approach, which we term object conditioning, can adjust for the effects of latent confounders and yet avoid the pitfall of conditioning on colliders. We describe object conditioning using plate models and show how its probabilistic implications can be explained using the property of exchangeability. We show that several seemingly obvious interpretations of object conditioning are insufficient to describe its probabilistic implications. Finally, we use object conditioning to describe and unify key aspects of a diverse set of techniques for causal inference, including within-subjects designs, difference-in-differences designs, and interrupted time-series designs.' volume: 115 URL: https://proceedings.mlr.press/v115/jensen20a.html PDF: http://proceedings.mlr.press/v115/jensen20a/jensen20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-jensen20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: David family: Jensen - given: Javier family: Burroni - given: Matthew family: Rattigan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1072-1082 id: jensen20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1072 lastpage: 1082 published: 2020-08-06 00:00:00 +0000 - title: 'CCMI : Classifier based Conditional Mutual Information Estimation' abstract: 'Conditional Mutual Information (CMI) is a measure of conditional dependence between random variables X and Y, given another random variable Z.
It can be used to quantify conditional dependence among variables in many data-driven inference problems such as graphical models, causal learning, feature selection and time-series analysis. While k-nearest neighbor (kNN) based estimators as well as kernel-based methods have been widely used for CMI estimation, they suffer severely from the curse of dimensionality. In this paper, we leverage advances in classifiers and generative models to design methods for CMI estimation. Specifically, we introduce an estimator for the KL divergence based on the likelihood ratio, by training a classifier to distinguish the observed joint distribution from the product distribution. We then show how to construct several CMI estimators using this basic divergence estimator by drawing ideas from conditional generative models. We demonstrate that the estimates from our proposed approaches do not degrade in performance with increasing dimension and obtain a significant improvement over the widely used KSG estimator. Finally, as an application of accurate CMI estimation, we use our best estimator for conditional independence testing and outperform the state-of-the-art tester on both simulated and real datasets.' volume: 115 URL: https://proceedings.mlr.press/v115/mukherjee20a.html PDF: http://proceedings.mlr.press/v115/mukherjee20a/mukherjee20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-mukherjee20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Sudipto family: Mukherjee - given: Himanshu family: Asnani - given: Sreeram family: Kannan editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1083-1093 id: mukherjee20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1083 lastpage: 1093 published: 2020-08-06 00:00:00 +0000 - title: 'Empirical Mechanism Design: Designing Mechanisms from Data' abstract: 'We introduce a methodology for the design of parametric mechanisms, which are multiagent systems inhabited by strategic agents, with knobs that can be adjusted to achieve specific goals. We assume agents play approximate equilibria, which we estimate using the probably approximately correct learning framework. Under this assumption, we further learn approximately optimal mechanism parameters. We do this theoretically, assuming a finite design space, and heuristically, using Bayesian optimization (BO). Our BO algorithm incorporates the noise associated with modern concentration inequalities, such as Hoeffding’s, into the underlying Gaussian process. We show experimentally that our search techniques outperform standard baselines in a stylized but rich model of advertisement exchanges.' volume: 115 URL: https://proceedings.mlr.press/v115/viqueira20a.html PDF: http://proceedings.mlr.press/v115/viqueira20a/viqueira20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-viqueira20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Enrique Areyan family: Viqueira - given: Cyrus family: Cousins - given: Yasser family: Mohammad - given: Amy family: Greenwald editor: - given: Ryan P.
family: Adams - given: Vibhav family: Gogate page: 1094-1104 id: viqueira20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1094 lastpage: 1104 published: 2020-08-06 00:00:00 +0000 - title: 'On the Relationship Between Satisfiability and Markov Decision Processes' abstract: 'Stochastic satisfiability (SSAT) and decision-theoretic planning in finite horizon partially observable Markov decision processes (POMDPs) are both PSPACE-complete. We describe constructive reductions between SSAT and flat POMDPs that open the door to comparisons and future cross-fertilization between the solution techniques of those problems. We also propose a new SSAT solver called Prime that incorporates recent advances from the SAT and #SAT literature. Using our reduction from POMDP to SSAT, we demonstrate the competitiveness of Prime on finite horizon POMDP problems.' volume: 115 URL: https://proceedings.mlr.press/v115/salmon20a.html PDF: http://proceedings.mlr.press/v115/salmon20a/salmon20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-salmon20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Ricardo family: Salmon - given: Pascal family: Poupart editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1105-1115 id: salmon20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1105 lastpage: 1115 published: 2020-08-06 00:00:00 +0000 - title: 'Interpretable Almost Matching Exactly With Instrumental Variables' abstract: 'Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linking treatments and outcomes. Instrumental Variables (IV) are commonly used to reduce the effects of unmeasured confounding. Existing methods for IV estimation either require strong parametric assumptions, use arbitrary distance metrics, or do not scale well to large datasets. We propose a matching framework for IV in the presence of observed categorical confounders that addresses these weaknesses. Our method first matches units exactly, and then consecutively drops variables to approximately match the remaining units on as many variables as possible. We show that our algorithm constructs better matches than other existing methods on simulated datasets, and we produce interesting results in an application to political canvassing.' volume: 115 URL: https://proceedings.mlr.press/v115/awan20a.html PDF: http://proceedings.mlr.press/v115/awan20a/awan20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-awan20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: M. Usaid family: Awan - given: Yameng family: Liu - given: Marco family: Morucci - given: Sudeepa family: Roy - given: Cynthia family: Rudin - given: Alexander family: Volfovsky editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1116-1126 id: awan20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1116 lastpage: 1126 published: 2020-08-06 00:00:00 +0000 - title: 'Low Frequency Adversarial Perturbation' abstract: 'Adversarial images aim to change a target model’s decision by minimally perturbing a target image.
In the black-box setting, the absence of gradient information often renders this search problem costly in terms of query complexity. In this paper we propose to restrict the search for adversarial images to a low frequency domain. This approach is readily compatible with many existing black-box attack frameworks and consistently reduces their query cost by 2 to 4 times. Further, we can circumvent image transformation defenses even when both the model and the defense strategy are unknown. Finally, we demonstrate the efficacy of this technique by fooling the Google Cloud Vision platform with an unprecedentedly low number of model queries.' volume: 115 URL: https://proceedings.mlr.press/v115/guo20a.html PDF: http://proceedings.mlr.press/v115/guo20a/guo20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-guo20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Chuan family: Guo - given: Jared S. family: Frank - given: Kilian Q. family: Weinberger editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1127-1137 id: guo20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1127 lastpage: 1137 published: 2020-08-06 00:00:00 +0000 - title: 'Markov Logic Networks for Knowledge Base Completion: A Theoretical Analysis Under the MCAR Assumption' abstract: 'We study the following question. We are given a knowledge base in which some facts are missing. We learn the weights of a Markov logic network using maximum likelihood estimation on this knowledge base and then use the learned Markov logic network to predict the missing facts. Assuming that the facts are missing independently and with the same probability, can we say that this approach is consistent in some precise sense? This is a non-trivial question because we are learning from only one training example. In this paper we show that the answer to this question is positive.' volume: 115 URL: https://proceedings.mlr.press/v115/kuzelka20a.html PDF: http://proceedings.mlr.press/v115/kuzelka20a/kuzelka20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-kuzelka20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Ondřej family: Kuželka - given: Jesse family: Davis editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1138-1148 id: kuzelka20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1138 lastpage: 1148 published: 2020-08-06 00:00:00 +0000 - title: 'Identification In Missing Data Models Represented By Directed Acyclic Graphs' abstract: 'Missing data is a pervasive problem in data analyses, resulting in datasets that contain censored realizations of a target distribution. Many approaches to inference on the target distribution using censored observed data rely on missing data models represented as a factorization with respect to a directed acyclic graph. In this paper we consider the identifiability of the target distribution within this class of models, and show that the most general identification strategies proposed so far retain a significant gap in that they fail to identify a wide class of identifiable distributions.
To address this gap, we propose a new algorithm that significantly generalizes the types of manipulations used in the ID algorithm [14, 16], developed in the context of causal inference, in order to obtain identification.' volume: 115 URL: https://proceedings.mlr.press/v115/bhattacharya20b.html PDF: http://proceedings.mlr.press/v115/bhattacharya20b/bhattacharya20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-bhattacharya20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Rohit family: Bhattacharya - given: Razieh family: Nabi - given: Ilya family: Shpitser - given: James M. family: Robins editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1149-1158 id: bhattacharya20b issued: date-parts: - 2020 - 8 - 6 firstpage: 1149 lastpage: 1158 published: 2020-08-06 00:00:00 +0000 - title: 'A Weighted Mini-Bucket Bound for Solving Influence Diagrams' abstract: 'Influence diagrams provide a modeling and inference framework for sequential decision problems, representing the probabilistic knowledge by a Bayesian network and the preferences of an agent by utility functions over the random variables and decision variables. The time and space complexity of computing the maximum expected utility (MEU) and its maximizing policy are exponential in the induced width of the underlying graphical model, which is often prohibitively large. In this paper, we develop a weighted mini-bucket approach for bounding the MEU. These bounds can be used as a stand-alone approximation that can be improved as a function of a controlling i-bound parameter, or as a heuristic function to guide subsequent search. We evaluate the scheme empirically against the current state of the art, illustrating its potential.' volume: 115 URL: https://proceedings.mlr.press/v115/lee20c.html PDF: http://proceedings.mlr.press/v115/lee20c/lee20c.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-lee20c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Junkyu family: Lee - given: Radu family: Marinescu - given: Alexander family: Ihler - given: Rina family: Dechter editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1159-1168 id: lee20c issued: date-parts: - 2020 - 8 - 6 firstpage: 1159 lastpage: 1168 published: 2020-08-06 00:00:00 +0000 - title: 'Subspace Inference for Bayesian Deep Learning' abstract: 'Bayesian inference was once a gold standard for learning with neural networks, providing accurate full predictive distributions and well-calibrated uncertainty. However, scaling Bayesian inference techniques to deep neural networks is challenging due to the high dimensionality of the parameter space. In this paper, we construct low-dimensional subspaces of parameter space, such as the first principal components of the stochastic gradient descent (SGD) trajectory, which contain diverse sets of high-performing models. In these subspaces, we are able to apply elliptical slice sampling and variational inference, which struggle in the full parameter space. We show that Bayesian model averaging over the induced posterior in these subspaces produces accurate predictions and well-calibrated predictive uncertainty for both regression and image classification.'
volume: 115 URL: https://proceedings.mlr.press/v115/izmailov20a.html PDF: http://proceedings.mlr.press/v115/izmailov20a/izmailov20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-izmailov20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Pavel family: Izmailov - given: Wesley J. family: Maddox - given: Polina family: Kirichenko - given: Timur family: Garipov - given: Dmitry family: Vetrov - given: Andrew Gordon family: Wilson editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1169-1179 id: izmailov20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1169 lastpage: 1179 published: 2020-08-06 00:00:00 +0000 - title: 'Off-Policy Policy Gradient with Stationary Distribution Correction' abstract: 'We study the problem of off-policy policy optimization in Markov decision processes, and develop a novel off-policy policy gradient method. Prior off-policy policy gradient approaches have generally ignored the mismatch between the distribution of states visited under the behavior policy used to collect data and the distribution of states that would be visited under the learned policy. Here we build on recent progress in estimating the ratio of the state distributions under behavior and evaluation policies for policy evaluation, and present an off-policy policy gradient optimization technique that can account for this mismatch in distributions. We present an illustrative example of why this is important and a theoretical convergence guarantee for our approach. Empirically, we compare our method in simulations to several strong baselines which do not correct for this mismatch, significantly improving the quality of the discovered policy.' volume: 115 URL: https://proceedings.mlr.press/v115/liu20a.html PDF: http://proceedings.mlr.press/v115/liu20a/liu20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-liu20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Yao family: Liu - given: Adith family: Swaminathan - given: Alekh family: Agarwal - given: Emma family: Brunskill editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1180-1190 id: liu20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1180 lastpage: 1190 published: 2020-08-06 00:00:00 +0000 - title: 'Co-training for Policy Learning' abstract: 'We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training framework for classification, we study the problem of co-training for policy learning. We present sufficient conditions under which learning from two views can improve upon learning from a single view alone. Motivated by these theoretical insights, we present a meta-algorithm for co-training for sequential decision making. Our framework is compatible with both reinforcement learning and imitation learning. We validate the effectiveness of our approach across a wide range of tasks, including discrete/continuous control and combinatorial optimization.'
volume: 115 URL: https://proceedings.mlr.press/v115/song20b.html PDF: http://proceedings.mlr.press/v115/song20b/song20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-song20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Jialin family: Song - given: Ravi family: Lanka - given: Yisong family: Yue - given: Masahiro family: Ono editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1191-1201 id: song20b issued: date-parts: - 2020 - 8 - 6 firstpage: 1191 lastpage: 1201 published: 2020-08-06 00:00:00 +0000 - title: 'Variational Inference of Penalized Regression with Submodular Functions' abstract: 'Various regularizers inducing structured sparsity are constructed as Lovász extensions of submodular functions. In this paper, we consider a hierarchical probabilistic model of linear regression and its kernel extension with this type of regularization, and develop a variational inference scheme for the posterior estimate on this model. We derive an upper bound on the partition function with an approximation guarantee, and then show that minimizing this bound is equivalent to the minimization of a quadratic function over the polyhedron determined by the corresponding submodular function, which can be solved efficiently by the proximal gradient algorithm. Our scheme gives a natural extension of the Bayesian Lasso model for the maximum a posteriori (MAP) estimation to a variety of regularizers inducing structured sparsity, and thus this work provides a principled way to transfer the advantages of the Bayesian formulation into those models. Finally, we investigate the empirical performance of our scheme with several Bayesian variants of widely known models such as Lasso, generalized fused Lasso, and non-overlapping group Lasso.' volume: 115 URL: https://proceedings.mlr.press/v115/takeuchi20a.html PDF: http://proceedings.mlr.press/v115/takeuchi20a/takeuchi20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-takeuchi20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Koh family: Takeuchi - given: Yuichi family: Yoshida - given: Yoshinobu family: Kawahara editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1202-1211 id: takeuchi20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1202 lastpage: 1211 published: 2020-08-06 00:00:00 +0000 - title: 'Probability Distillation: A Caveat and Alternatives' abstract: 'Due to Van den Oord et al. (2018), probability distillation has recently been of interest to deep learning practitioners: as a practical workaround for deploying autoregressive models in real-time applications, a student network is used to obtain quality samples in parallel. We identify a pathological optimization issue with the adopted stochastic minimization of the reverse-KL divergence: the curse of dimensionality results in a skewed gradient distribution that renders training inefficient. This means that KL-based "evaluative" training can be susceptible to poor exploration if the target distribution is highly structured. We then explore alternative principles for distillation, including one with an "instructive" signal, and show that it is possible to achieve qualitatively better results than with KL minimization.'
volume: 115 URL: https://proceedings.mlr.press/v115/huang20c.html PDF: http://proceedings.mlr.press/v115/huang20c/huang20c.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-huang20c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Chin-Wei family: Huang - given: Faruk family: Ahmed - given: Kundan family: Kumar - given: Alexandre family: Lacoste - given: Aaron family: Courville editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1212-1221 id: huang20c issued: date-parts: - 2020 - 8 - 6 firstpage: 1212 lastpage: 1221 published: 2020-08-06 00:00:00 +0000 - title: 'Bayesian Optimization with Binary Auxiliary Information' abstract: 'This paper presents novel mixed-type Bayesian optimization (BO) algorithms to accelerate the optimization of a target objective function by exploiting correlated auxiliary information of binary type that can be more cheaply obtained, such as in policy search for reinforcement learning and hyperparameter tuning of machine learning models with early stopping. To achieve this, we first propose a mixed-type multi-output Gaussian process (MOGP) to jointly model the continuous target function and binary auxiliary functions. Then, we propose information-based acquisition functions such as mixed-type entropy search (MT-ES) and mixed-type predictive ES (MT-PES) for mixed-type BO based on the MOGP predictive belief of the target and auxiliary functions. The exact acquisition functions of MT-ES and MT-PES cannot be computed in closed form and need to be approximated. We derive an efficient approximation of MT-PES via a novel mixed-type random features approximation of the MOGP model, whose cross-correlation structure between the target and auxiliary functions can be exploited for improving the belief of the global target maximizer using the observations from evaluating these functions. We also propose new practical constraints to relate the global target maximizer to the binary auxiliary functions. We empirically evaluate the performance of MT-ES and MT-PES with synthetic and real-world experiments.' volume: 115 URL: https://proceedings.mlr.press/v115/zhang20b.html PDF: http://proceedings.mlr.press/v115/zhang20b/zhang20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-zhang20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Yehong family: Zhang - given: Zhongxiang family: Dai - given: Bryan Kian Hsiang family: Low editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1222-1232 id: zhang20b issued: date-parts: - 2020 - 8 - 6 firstpage: 1222 lastpage: 1232 published: 2020-08-06 00:00:00 +0000 - title: 'On Open-Universe Causal Reasoning' abstract: 'We extend two kinds of causal models, structural equation models and simulation models, to infinite variable spaces. This enables a semantics of counterfactuals, a calculus of intervention, and an axiomatization of causal reasoning for rich, expressive generative models (including those in which a causal representation exists only implicitly) in an open-universe setting. Further, we show that under suitable restrictions the two kinds of models are equivalent, perhaps surprisingly, since their conditional logics differ substantially in the general case.
We give a series of complete axiomatizations in which the open-universe nature of the setting is seen to be essential.' volume: 115 URL: https://proceedings.mlr.press/v115/ibeling20a.html PDF: http://proceedings.mlr.press/v115/ibeling20a/ibeling20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-ibeling20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Duligur family: Ibeling - given: Thomas family: Icard editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1233-1243 id: ibeling20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1233 lastpage: 1243 published: 2020-08-06 00:00:00 +0000 - title: 'Embarrassingly Parallel MCMC using Deep Invertible Transformations' abstract: 'While MCMC methods have become a main workhorse for Bayesian inference, scaling them to large distributed datasets is still a challenge. Embarrassingly parallel MCMC strategies take a divide-and-conquer stance to achieve this by writing the target posterior as a product of subposteriors, running MCMC for each of them in parallel and subsequently combining the results. The challenge then lies in devising efficient aggregation strategies. Current strategies trade off between approximation quality and the costs of communication and computation. In this work, we introduce a novel method that addresses these issues simultaneously. Our key insight is to introduce a deep invertible transformation to approximate each of the subposteriors. These approximations can be made accurate even for complex distributions and serve as intermediate representations, keeping the total communication cost limited. Moreover, they enable us to sample from the product of the subposteriors using an efficient and stable importance sampling scheme. We demonstrate that the approach outperforms available state-of-the-art methods in a range of challenging scenarios, including high-dimensional and heterogeneous subposteriors.' volume: 115 URL: https://proceedings.mlr.press/v115/mesquita20a.html PDF: http://proceedings.mlr.press/v115/mesquita20a/mesquita20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-mesquita20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Diego family: Mesquita - given: Paul family: Blomstedt - given: Samuel family: Kaski editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1244-1252 id: mesquita20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1244 lastpage: 1252 published: 2020-08-06 00:00:00 +0000 - title: 'Fast Proximal Gradient Descent for A Class of Non-convex and Non-smooth Sparse Learning Problems' abstract: 'Non-convex and non-smooth optimization problems are important for statistics and machine learning. However, solving such problems is always challenging. In this paper, we propose fast proximal gradient descent based methods to solve a class of non-convex and non-smooth sparse learning problems, i.e., the $\ell^0$ regularization problems. We prove an improved convergence rate of proximal gradient descent on the $\ell^0$ regularization problems, and propose two accelerated versions by support projection.
The proposed accelerated proximal gradient descent methods by support projection have convergence rates that match Nesterov’s optimal convergence rate for first-order methods on smooth, convex objective functions with Lipschitz continuous gradients. Experimental results demonstrate the effectiveness of the proposed algorithms. We also propose feedforward neural networks as fast encoders to approximate the optimization results generated by the proposed accelerated algorithms.' volume: 115 URL: https://proceedings.mlr.press/v115/yang20b.html PDF: http://proceedings.mlr.press/v115/yang20b/yang20b.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-yang20b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Yingzhen family: Yang - given: Jiahui family: Yu editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1253-1262 id: yang20b issued: date-parts: - 2020 - 8 - 6 firstpage: 1253 lastpage: 1262 published: 2020-08-06 00:00:00 +0000 - title: 'Block Neural Autoregressive Flow' abstract: 'Normalising flows (NFs) map two density functions via a differentiable bijection whose Jacobian determinant can be computed efficiently. Recently, as an alternative to hand-crafted bijections, Huang et al. (2018) proposed neural autoregressive flow (NAF), which is a universal approximator for density functions. Their flow is a neural network (NN) whose parameters are predicted by another NN. The latter grows quadratically with the size of the former, and thus an efficient technique for parametrization is needed. We propose block neural autoregressive flow (B-NAF), a much more compact universal approximator of density functions, where we model a bijection directly using a single feed-forward network. Invertibility is ensured by carefully designing each affine transformation with block matrices that make the flow autoregressive and (strictly) monotone. We compare B-NAF to NAF and other established flows on density estimation and approximate inference for latent variable models. Our proposed flow is competitive across datasets while using orders of magnitude fewer parameters.' volume: 115 URL: https://proceedings.mlr.press/v115/de-cao20a.html PDF: http://proceedings.mlr.press/v115/de-cao20a/de-cao20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-de-cao20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Nicola family: De Cao - given: Wilker family: Aziz - given: Ivan family: Titov editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1263-1273 id: de-cao20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1263 lastpage: 1273 published: 2020-08-06 00:00:00 +0000 - title: 'Exclusivity Graph Approach to Instrumental Inequalities' abstract: 'Instrumental variables allow the estimation of cause and effect relations even in the presence of unobserved latent factors, thus providing a powerful tool for any science wherein causal inference plays an important role.
More recently, the instrumental scenario has also attracted increasing attention in quantum physics, since it is related to Bell’s seminal theorem and in fact allows the detection of even stronger quantum effects, thus enhancing our current capabilities to process information and becoming a valuable tool in quantum cryptography. In this work, we further explore this bridge between causality and quantum theory and apply a technique, originally developed in the field of quantum foundations, to express the constraints implied by causal relations in the language of graph theory. This new approach can be applied to any causal model containing a latent variable. Here, by focusing on the instrumental scenario, it allows us to easily reproduce known results as well as obtain new ones and gain new insights into the connections and differences between the instrumental and the Bell scenarios.' volume: 115 URL: https://proceedings.mlr.press/v115/poderini20a.html PDF: http://proceedings.mlr.press/v115/poderini20a/poderini20a.pdf edit: https://github.com/mlresearch//v115/edit/gh-pages/_posts/2020-08-06-poderini20a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 35th Uncertainty in Artificial Intelligence Conference' publisher: 'PMLR' author: - given: Davide family: Poderini - given: Rafael family: Chaves - given: Iris family: Agresti - given: Gonzalo family: Carvacho - given: Fabio family: Sciarrino editor: - given: Ryan P. family: Adams - given: Vibhav family: Gogate page: 1274-1283 id: poderini20a issued: date-parts: - 2020 - 8 - 6 firstpage: 1274 lastpage: 1283 published: 2020-08-06 00:00:00 +0000