
- title: 'Mutation-driven follow the regularized leader for last-iterate convergence in zero-sum games'
  abstract: 'In this study, we consider a variant of the Follow the Regularized Leader (FTRL) dynamics in two-player zero-sum games. FTRL is guaranteed to converge to a Nash equilibrium when time-averaging the strategies, while many variants suffer from limit-cycling behavior, i.e., they lack a last-iterate convergence guarantee. To this end, we propose mutant FTRL (M-FTRL), an algorithm that introduces mutation for the perturbation of action probabilities. We then investigate the continuous-time dynamics of M-FTRL and provide strong convergence guarantees toward stationary points that approximate Nash equilibria under full-information feedback. Furthermore, our simulation demonstrates that M-FTRL can enjoy faster convergence rates than FTRL and optimistic FTRL under full-information feedback and surprisingly exhibits clear convergence under bandit feedback.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/abe22a.html
  PDF: https://proceedings.mlr.press/v180/abe22a/abe22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-abe22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Kenshi
    family: Abe
  - given: Mitsuki
    family: Sakamoto
  - given: Atsushi
    family: Iwasaki
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1-10
  id: abe22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1
  lastpage: 10
  published: 2022-08-17 00:00:00 +0000
- title: 'NeuroBE: Escalating neural network approximations of Bucket Elimination'
  abstract: 'A major limiting factor in graphical model inference is the complexity of computing the partition function. Exact message-passing algorithms such as Bucket Elimination (BE) require exponential memory to compute the partition function; therefore, approximations are necessary. In this paper, we build upon a recently introduced methodology called Deep Bucket Elimination (DBE) that uses classical Neural Networks to approximate messages generated by BE for large buckets. The main feature of our new scheme, renamed NeuroBE, is that it customizes the architecture of the neural networks, their learning process and in particular, adapts the loss function to the internal form or distribution of messages. Our experiments demonstrate significant improvements in accuracy and time compared with the earlier DBE scheme.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/agarwal22a.html
  PDF: https://proceedings.mlr.press/v180/agarwal22a/agarwal22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-agarwal22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sakshi
    family: Agarwal
  - given: Kalev
    family: Kask
  - given: Alex
    family: Ihler
  - given: Rina
    family: Dechter
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 11-21
  id: agarwal22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 11
  lastpage: 21
  published: 2022-08-17 00:00:00 +0000
- title: 'Regret guarantees for model-based reinforcement learning with long-term average constraints'
  abstract: 'We consider the problem of constrained Markov Decision Process (CMDP) where an agent interacts with an ergodic Markov Decision Process. At every interaction, the agent obtains a reward and incurs $K$ costs. The agent aims to maximize the long-term average reward while simultaneously keeping the $K$ long-term average costs lower than a certain threshold. In this paper, we propose \NAM, a posterior-sampling-based algorithm with which the agent can learn optimal policies to interact with the CMDP. We show that with the assumption of slackness, characterized by $\kappa$, the optimization problem is feasible for the sampled MDPs. Further, for an MDP with $S$ states, $A$ actions, and mixing time $T_M$, we prove that following the \NAM{} algorithm, the agent can bound the regret of not accumulating rewards from an optimal policy by $\Tilde{O}(T_MS\sqrt{AT})$. Further, we show that the violation of any of the $K$ constraints is also bounded by $\Tilde{O}(T_MS\sqrt{AT})$. To the best of our knowledge, this is the first work that obtains $\Tilde{O}(\sqrt{T})$ regret bounds for ergodic MDPs with long-term average constraints using a posterior sampling method.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/agarwal22b.html
  PDF: https://proceedings.mlr.press/v180/agarwal22b/agarwal22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-agarwal22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Mridul
    family: Agarwal
  - given: Qinbo
    family: Bai
  - given: Vaneet
    family: Aggarwal
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 22-31
  id: agarwal22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 22
  lastpage: 31
  published: 2022-08-17 00:00:00 +0000
- title: 'GNN2GNN: Graph neural networks to generate neural networks'
  abstract: 'The success of neural networks (NNs) is tightly linked with their architectural design, a complex problem by itself. We here introduce a novel framework leveraging Graph Neural Networks to Generate Neural Networks (GNN2GNN) where powerful NN architectures can be learned out of a set of available architecture-performance pairs. GNN2GNN relies on a three-way adversarial training of GNN, to optimise a generator model capable of producing predictions about powerful NN architectures. Unlike Neural Architecture Search (NAS) techniques proposing efficient searching algorithms over a set of NN architectures, GNN2GNN relies on learning NN architectural design criteria. GNN2GNN learns to propose NN architectures in a single step (i.e., training of the generator), overcoming the recursive approach characterising NAS. Therefore, GNN2GNN avoids the expensive and inflexible search of efficient structures typical of NAS approaches. Extensive experiments over two state-of-the-art datasets prove the strength of our framework, showing that it can generate powerful architectures with high probability. Moreover, GNN2GNN outperforms possible counterparts for generating NN architectures, and shows flexibility against dataset quality degradation. Finally, GNN2GNN paves the way towards generalisation between datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/agiollo22a.html
  PDF: https://proceedings.mlr.press/v180/agiollo22a/agiollo22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-agiollo22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Andrea
    family: Agiollo
  - given: Andrea
    family: Omicini
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 32-42
  id: agiollo22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 32
  lastpage: 42
  published: 2022-08-17 00:00:00 +0000
- title: 'Neuro-symbolic entropy regularization'
  abstract: 'In structured output prediction, the goal is to jointly predict several output variables that together encode a structured object – a path in a graph, an entity-relation triple, or an ordering of objects. Such a large output space makes learning hard and requires vast amounts of labeled data. Different approaches leverage alternate sources of supervision. One approach – entropy regularization – posits that decision boundaries should lie in low-probability regions. It extracts supervision from unlabeled examples, but remains agnostic to the structure of the output space. Conversely, neuro-symbolic approaches exploit the knowledge that not every prediction corresponds to a valid structure in the output space. Yet, they do not further restrict the learned output distribution. This paper introduces a framework that unifies both approaches. We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object. It is obtained by restricting entropy regularization to the distribution over only the valid structures. This loss can be computed efficiently when the output constraint is expressed as a tractable logic circuit. Moreover, it seamlessly integrates with other neuro-symbolic losses that eliminate invalid predictions. We demonstrate the efficacy of our approach on a series of semi-supervised and fully-supervised structured-prediction experiments, where it leads to models whose predictions are more accurate as well as more likely to be valid.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ahmed22a.html
  PDF: https://proceedings.mlr.press/v180/ahmed22a/ahmed22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ahmed22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Kareem
    family: Ahmed
  - given: Eric
    family: Wang
  - given: Kai-Wei
    family: Chang
  - given: Guy
    prefix: Van den
    family: Broeck
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 43-53
  id: ahmed22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 43
  lastpage: 53
  published: 2022-08-17 00:00:00 +0000
- title: 'Non-parametric inference of relational dependence'
  abstract: 'Independence testing plays a central role in statistical and causal inference from observational data. Standard independence tests assume that the data samples are independent and identically distributed (i.i.d.) but that assumption is violated in many real-world datasets and applications centered on relational systems. This work examines the problem of estimating independence in data drawn from relational systems by defining sufficient representations for the sets of observations influencing individual instances. Specifically, we define marginal and conditional independence tests for relational data by considering the kernel mean embedding as a flexible aggregation function for relational variables. We propose a consistent, non-parametric, scalable kernel test to operationalize the relational independence test for non-i.i.d. observational data under a set of structural assumptions. We empirically evaluate our proposed method on a variety of synthetic and semi-synthetic networks and demonstrate its effectiveness compared to state-of-the-art kernel-based independence tests.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ahsan22a.html
  PDF: https://proceedings.mlr.press/v180/ahsan22a/ahsan22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ahsan22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Ragib
    family: Ahsan
  - given: Zahra
    family: Fatemi
  - given: David
    family: Arbour
  - given: Elena
    family: Zheleva
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 54-63
  id: ahsan22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 54
  lastpage: 63
  published: 2022-08-17 00:00:00 +0000
- title: 'Data dependent randomized smoothing'
  abstract: 'Randomized smoothing is a recent technique that achieves state-of-the-art performance in training certifiably robust deep neural networks. While the smoothing family of distributions is often connected to the choice of the norm used for certification, the parameters of these distributions are always set as global hyperparameters independent of the input data on which a network is certified. In this work, we revisit Gaussian randomized smoothing and show that the variance of the Gaussian distribution can be optimized at each input so as to maximize the certification radius for the construction of the smooth classifier. Since the data dependent classifier does not directly enjoy sound certification with existing approaches, we propose a memory-enhanced data dependent smooth classifier that is certifiable by construction. This new approach is generic, parameter-free, and easy to implement. In fact, we show that our data dependent framework can be seamlessly incorporated into 3 randomized smoothing approaches, leading to consistently improved certified accuracy. When this framework is used in the training routine of these approaches followed by a data dependent certification, we achieve 9% and 6% improvement over the certified accuracy of the strongest baseline for a radius of 0.5 on CIFAR10 and ImageNet.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/alfarra22a.html
  PDF: https://proceedings.mlr.press/v180/alfarra22a/alfarra22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-alfarra22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Motasem
    family: Alfarra
  - given: Adel
    family: Bibi
  - given: Philip H. S.
    family: Torr
  - given: Bernard
    family: Ghanem
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 64-74
  id: alfarra22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 64
  lastpage: 74
  published: 2022-08-17 00:00:00 +0000
- title: 'Multi-winner approval voting goes epistemic'
  abstract: 'Epistemic voting interprets votes as noisy signals about a ground truth. We consider contexts where the truth consists of a set of objective winners, knowing a lower and upper bound on its cardinality. A prototypical problem for this setting is the aggregation of multi-label annotations with prior knowledge on the size of the ground truth. We posit noise models, for which we define rules that output an optimal set of winners. We report on experiments on multi-label annotations (which we collected).'
  volume: 180
  URL: https://proceedings.mlr.press/v180/allouche22a.html
  PDF: https://proceedings.mlr.press/v180/allouche22a/allouche22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-allouche22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tahar
    family: Allouche
  - given: Jérôme
    family: Lang
  - given: Florian
    family: Yger
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 75-84
  id: allouche22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 75
  lastpage: 84
  published: 2022-08-17 00:00:00 +0000
- title: 'Inductive synthesis of finite-state controllers for POMDPs'
  abstract: 'We present a novel learning framework to obtain finite-state controllers (FSCs) for partially observable Markov decision processes and illustrate its applicability for indefinite-horizon specifications. Our framework builds on oracle-guided inductive synthesis to explore a design space compactly representing available FSCs. The inductive synthesis approach consists of two stages: The outer stage determines the design space, i.e., the set of FSC candidates, while the inner stage efficiently explores the design space. This framework is easily generalisable and shows promising results when compared to existing approaches. Experiments indicate that our technique (i) is competitive with state-of-the-art belief-based approaches for indefinite-horizon properties, (ii) yields smaller FSCs than existing methods for several POMDP models, and (iii) naturally treats multi-objective specifications.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/andriushchenko22a.html
  PDF: https://proceedings.mlr.press/v180/andriushchenko22a/andriushchenko22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-andriushchenko22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Roman
    family: Andriushchenko
  - given: Milan
    family: Češka
  - given: Sebastian
    family: Junges
  - given: Joost-Pieter
    family: Katoen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 85-95
  id: andriushchenko22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 85
  lastpage: 95
  published: 2022-08-17 00:00:00 +0000
- title: 'Discovery of extended summary graphs in time series'
  abstract: 'This study addresses the problem of learning an extended summary causal graph from time series. The algorithms we propose fit within the well-known constraint-based framework for causal discovery and make use of information-theoretic measures to determine (in)dependencies between time series. We first introduce generalizations of the causation entropy measure to any lagged or instantaneous relations, prior to using this measure to construct extended summary causal graphs by adapting two well-known algorithms, namely PC and FCI. The behaviour of our method is illustrated through several experiments.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/assaad22a.html
  PDF: https://proceedings.mlr.press/v180/assaad22a/assaad22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-assaad22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Charles K.
    family: Assaad
  - given: Emilie
    family: Devijver
  - given: Eric
    family: Gaussier
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 96-106
  id: assaad22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 96
  lastpage: 106
  published: 2022-08-17 00:00:00 +0000
- title: 'Asymmetric DQN for partially observable reinforcement learning'
  abstract: 'Offline training in simulated partially observable environments allows reinforcement learning methods to exploit privileged state information through a mechanism known as asymmetry. Such privileged information has the potential to greatly improve the optimal convergence properties, if used appropriately. However, current research in asymmetric reinforcement learning is often heuristic in nature, with few connections to underlying theory or theoretical guarantees, and is primarily tested through empirical evaluation. In this work, we develop the theory of Asymmetric Policy Iteration, an exact model-based dynamic programming solution method, and then apply relaxations which eventually result in Asymmetric DQN, a model-free deep reinforcement learning algorithm. Our theoretical findings are complemented and validated by empirical experimentation performed in environments which exhibit significant amounts of partial observability, and require both information gathering strategies and memorization.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/baisero22a.html
  PDF: https://proceedings.mlr.press/v180/baisero22a/baisero22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-baisero22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Andrea
    family: Baisero
  - given: Brett
    family: Daley
  - given: Christopher
    family: Amato
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 107-117
  id: baisero22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 107
  lastpage: 117
  published: 2022-08-17 00:00:00 +0000
- title: 'Physics guided neural networks for spatio-temporal super-resolution of turbulent flows'
  abstract: 'Direct numerical simulation (DNS) of turbulent flows is computationally expensive and cannot be applied to flows with large Reynolds numbers. Low-resolution large eddy simulation (LES) is a popular alternative, but it is unable to capture all of the scales of turbulent transport accurately. Reconstructing DNS from low-resolution LES is critical for large-scale simulation in many scientific and engineering disciplines, but it poses many challenges to existing super-resolution methods due to the complexity of turbulent flows and computational cost of generating frequent LES data. We propose a physics-guided neural network for reconstructing frequent DNS from sparse LES data by enhancing its spatial resolution and temporal frequency. Our proposed method consists of a partial differential equation (PDE)-based recurrent unit for capturing underlying temporal processes and a physics-guided super-resolution model that incorporates additional physical constraints. We demonstrate the effectiveness of both components in reconstructing the Taylor-Green Vortex using sparse LES data. Moreover, we show that the proposed recurrent unit can preserve the physical characteristics of turbulent flows by leveraging the physical relationships in the Navier-Stokes equation.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/bao22a.html
  PDF: https://proceedings.mlr.press/v180/bao22a/bao22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-bao22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tianshu
    family: Bao
  - given: Shengyu
    family: Chen
  - given: Taylor T
    family: Johnson
  - given: Peyman
    family: Givi
  - given: Shervin
    family: Sammak
  - given: Xiaowei
    family: Jia
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 118-128
  id: bao22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 118
  lastpage: 128
  published: 2022-08-17 00:00:00 +0000
- title: 'Byzantine-tolerant distributed multiclass sparse linear discriminant analysis'
  abstract: 'Communication cost and security issues are both important in large-scale distributed machine learning. In this paper, we investigate a multiclass sparse classification problem under two distributed systems. We propose two distributed multiclass sparse discriminant analysis algorithms based on mean-aggregation and median-aggregation under the normal distributed system or Byzantine failure system. Both of them are computation and communication efficient. Several theoretical results, including estimation error bounds, and support recovery, are established. With moderate initial estimators, our iterative estimators achieve a (near-)optimal rate and exact support recovery after a constant number of rounds. Experiments on both synthetic and real datasets are provided to demonstrate the effectiveness of our proposed methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/bao22b.html
  PDF: https://proceedings.mlr.press/v180/bao22b/bao22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-bao22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yajie
    family: Bao
  - given: Weidong
    family: Liu
  - given: Xiaojun
    family: Mao
  - given: Weijia
    family: Xiong
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 129-138
  id: bao22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 129
  lastpage: 138
  published: 2022-08-17 00:00:00 +0000
- title: 'Equilibrium aggregation: encoding sets via optimization'
  abstract: 'Processing sets or other unordered, potentially variable-sized inputs in neural networks is usually handled by aggregating a number of input tensors into a single representation. While a number of aggregation methods already exist, from simple sum pooling to multi-head attention, they are limited in their representational power from both theoretical and empirical perspectives. In search of a principally more powerful aggregation strategy, we propose an optimization-based method called Equilibrium Aggregation. We show that many existing aggregation methods can be recovered as special cases of Equilibrium Aggregation and that it is provably more efficient in some important cases. Equilibrium Aggregation can be used as a drop-in replacement in many existing architectures and applications. We validate its efficiency on three different tasks: median estimation, class counting, and molecular property prediction. In all experiments, Equilibrium Aggregation achieves higher performance than the other aggregation techniques we test.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/bartunov22a.html
  PDF: https://proceedings.mlr.press/v180/bartunov22a/bartunov22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-bartunov22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sergey
    family: Bartunov
  - given: Fabian B.
    family: Fuchs
  - given: Timothy P.
    family: Lillicrap
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 139-149
  id: bartunov22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 139
  lastpage: 149
  published: 2022-08-17 00:00:00 +0000
- title: 'Empirical Bayes approach to truth discovery problems'
  abstract: 'When aggregating information from conflicting sources, one’s goal is to find the truth. Most real-value truth discovery (TD) algorithms try to achieve this goal by estimating the competence of each source and then aggregating the conflicting information by weighing each source’s answer proportionally to her competence. However, each of those algorithms requires more than a single source for such estimation and usually does not consider different estimation methods other than a weighted mean. Therefore, in this work we formulate, prove, and empirically test the conditions for an Empirical Bayes Estimator (EBE) to dominate the weighted mean aggregation. Our main result demonstrates that EBE, under mild conditions, can be used as a second step of any TD algorithm in order to reduce the expected error.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ben-shabat22a.html
  PDF: https://proceedings.mlr.press/v180/ben-shabat22a/ben-shabat22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ben-shabat22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tsviel
    family: Ben Shabat
  - given: Reshef
    family: Meir
  - given: David
    family: Azriel
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 150-158
  id: ben-shabat22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 150
  lastpage: 158
  published: 2022-08-17 00:00:00 +0000
- title: 'On early extinction and the effect of travelling in the SIR model'
  abstract: 'We consider a population protocol version of the SIR model. In every round, an individual is chosen uniformly at random. If the individual is susceptible, then it becomes infected w.p. $\beta I_t/N$, where $I_t$ is the number of infections at time $t$ and $N$ is the total number of individuals. If the individual is infected, then it recovers w.p. $\gamma$, whereas, if the individual is already recovered, nothing happens. We prove sharp bounds on the probability of the disease becoming pandemic vs extinguishing early (dying out quickly). The probability of extinguishing early, $\Pr[\mathcal{E}_{ext}]$, is typically neglected in prior work since most use (deterministic) differential equations. Building on this, using $\Pr[\mathcal{E}_{ext}]$, we proceed by bounding the expected size of the population that contracts the disease $\mathbf{E}\left[R_\infty\right]$. Prior work only calculated $\mathbf{E}\left[R_\infty | \overline{\mathcal{E}_{ext}}\right]$, or obtained non-closed form solutions. We then study the two-country model also accounting for the role of $\Pr[\mathcal{E}_{ext}]$. We assume that both countries have different infection rates $\beta^{(i)}$, but share the same recovery rate $\gamma$. In this model, each round has two steps: First, an individual is chosen u.a.r. and travels w.p. $p_{travel}$ to the other country. Afterwards, the process continues as before with the respective infection rates. Finally, using simulations, we characterise the influence of $p_{travel}$ on the total number of infections. Our simulations show that, depending on the $\beta^{(i)}$, increasing $p_{travel}$ can decrease or increase the expected total number of infections $\mathbf{E}\left[R_\infty\right]$.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/berenbrink22a.html
  PDF: https://proceedings.mlr.press/v180/berenbrink22a/berenbrink22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-berenbrink22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Petra
    family: Berenbrink
  - given: Colin
    family: Cooper
  - given: Cristina
    family: Gava
  - given: David
    family: Kohan Marzagão
  - given: Frederik
    family: Mallmann-Trenn
  - given: Tomasz
    family: Radzik
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 159-169
  id: berenbrink22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 159
  lastpage: 169
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning soft interventions in complex equilibrium systems'
  abstract: 'Complex systems often contain feedback loops that can be described as cyclic causal models. Intervening in such systems may lead to counterintuitive effects, which cannot be inferred directly from the graph structure. After establishing a framework for differentiable soft interventions based on Lie groups, we take advantage of modern automatic differentiation techniques and their application to implicit functions in order to optimize interventions in cyclic causal models. We illustrate the use of this framework by investigating scenarios of transition to sustainable economies.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/besserve22a.html
  PDF: https://proceedings.mlr.press/v180/besserve22a/besserve22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-besserve22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Michel
    family: Besserve
  - given: Bernhard
    family: Schölkopf
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 170-180
  id: besserve22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 170
  lastpage: 180
  published: 2022-08-17 00:00:00 +0000
- title: 'Identifying near-optimal decisions in linear-in-parameter bandit models with continuous decision sets'
  abstract: 'We consider an online optimization problem in a bandit setting in which a learner chooses decisions from a continuous decision set at discrete decision epochs, and receives noisy rewards from the environment in response. While the noise samples are assumed to be independent and sub-Gaussian, the mean reward at each epoch is a fixed but unknown linear function of a feature vector, which depends on the decision through a known (and possibly nonlinear) feature map. We study the problem within the framework of best-arm identification with fixed confidence, and provide a template algorithm for approximately learning the optimal decision in a probably approximately correct (PAC) setting. More precisely, the template algorithm samples the decision space till a stopping condition is met, and returns a subset of decisions such that, with the required confidence, every element of the subset is approximately optimal for the unknown mean reward function. We provide a sample complexity bound for the template algorithm and then specialize it to the case where the mean-reward function is a univariate polynomial of a single decision variable. We provide an implementable algorithm for this case by explicitly instantiating all the steps in the template algorithm. Finally, we provide experimental results to demonstrate the efficacy of our algorithms.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/bhat22a.html
  PDF: https://proceedings.mlr.press/v180/bhat22a/bhat22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-bhat22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sanjay P.
    family: Bhat
  - given: Chaitanya
    family: Amballa
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 181-190
  id: bhat22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 181
  lastpage: 190
  published: 2022-08-17 00:00:00 +0000
- title: 'Offline change detection under contamination'
  abstract: 'In this work, we propose a non-parametric and robust change detection algorithm to detect multiple change points in time series data under non-adversarial contamination. The algorithm is designed for the offline setting, where the objective is to detect changes when all data are received. We only make weak moment assumptions on the inliers (uncorrupted data) to handle a large class of distributions. The robust scan statistic in the change detection algorithm is fashioned using mean estimators based on influence functions. We establish the consistency of the estimated change point indexes as the number of samples increases, and provide empirical evidence to support the consistency results.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/bhatt22a.html
  PDF: https://proceedings.mlr.press/v180/bhatt22a/bhatt22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-bhatt22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sujay
    family: Bhatt
  - given: Guanhua
    family: Fang
  - given: Ping
    family: Li
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 191-201
  id: bhatt22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 191
  lastpage: 201
  published: 2022-08-17 00:00:00 +0000
- title: 'On testability of the front-door model via Verma constraints'
  abstract: 'The front-door criterion can be used to identify and compute causal effects despite the existence of unmeasured confounders between a treatment and outcome. However, the key assumptions – (i) the existence of a variable (or set of variables) that fully mediates the effect of the treatment on the outcome, and (ii) which simultaneously does not suffer from similar issues of confounding as the treatment-outcome pair – are often deemed implausible. This paper explores the testability of these assumptions. We show that under mild conditions involving an auxiliary variable, the assumptions encoded in the front-door model (and simple extensions of it) may be tested via generalized equality constraints, a.k.a. Verma constraints. We propose two goodness-of-fit tests based on this observation, and evaluate the efficacy of our proposal on real and synthetic data. We also provide theoretical and empirical comparisons to instrumental variable approaches to handling unmeasured confounding.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/bhattacharya22a.html
  PDF: https://proceedings.mlr.press/v180/bhattacharya22a/bhattacharya22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-bhattacharya22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Rohit
    family: Bhattacharya
  - given: Razieh
    family: Nabi
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 202-212
  id: bhattacharya22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 202
  lastpage: 212
  published: 2022-08-17 00:00:00 +0000
- title: 'Robustness of model predictions under extension'
  abstract: 'Mathematical models of the real world are simplified representations of complex systems. A caveat to using mathematical models is that predicted causal effects and conditional independences may not be robust under model extensions, limiting applicability of such models. In this work, we consider conditions under which qualitative model predictions are preserved when two models are combined. Under mild assumptions, we show how to use the technique of causal ordering to efficiently assess the robustness of qualitative model predictions. We also characterize a large class of model extensions that preserve qualitative model predictions. For dynamical systems at equilibrium, we demonstrate how novel insights help to select appropriate model extensions and to reason about the presence of feedback loops. We illustrate our ideas with a viral infection model with immune responses.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/blom22a.html
  PDF: https://proceedings.mlr.press/v180/blom22a/blom22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-blom22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tineke
    family: Blom
  - given: Joris M.
    family: Mooij
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 213-222
  id: blom22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 213
  lastpage: 222
  published: 2022-08-17 00:00:00 +0000
- title: 'Information theoretic approach to detect collusion in multi-agent games'
  abstract: 'Collusion in a competitive multi-agent game occurs when two or more agents co-operate covertly to the disadvantage of others. Most competitive multi-agent games do not allow players to share information and explicitly prohibit collusion. In this paper, we present a novel way of detecting collusion using a domain-independent information-theoretic approach. Specifically, we show that the use of mutual information between actions of the agents provides a good indication of collusive behavior. Our experiments show that our method can detect varying levels of collusion in repeated simultaneous games like iterated Rock Paper Scissors. We further extend the detection to partially observable sequential games like poker and show the effectiveness of our methodology.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/bonjour22a.html
  PDF: https://proceedings.mlr.press/v180/bonjour22a/bonjour22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-bonjour22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Trevor
    family: Bonjour
  - given: Vaneet
    family: Aggarwal
  - given: Bharat
    family: Bhargava
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 223-232
  id: bonjour22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 223
  lastpage: 232
  published: 2022-08-17 00:00:00 +0000
- title: 'Lifting in multi-agent systems under uncertainty'
  abstract: 'A decentralised partially observable Markov decision problem (DecPOMDP) formalises collaborative multi-agent decision making. A solution to a DecPOMDP is a joint policy for the agents, fulfilling an optimality criterion such as maximum expected utility. A crux is that the problem is intractable regarding the number of agents. Inspired by lifted inference, this paper examines symmetries within the agent set for potential tractability. Specifically, this paper contributes (i) specifications of counting and isomorphic symmetries, (ii) a compact encoding of symmetric DecPOMDPs as partitioned DecPOMDPs, and (iii) a formal analysis of complexity and tractability. This work allows tractability in terms of agent numbers and a new query type for isomorphic DecPOMDPs.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/braun22a.html
  PDF: https://proceedings.mlr.press/v180/braun22a/braun22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-braun22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tanya
    family: Braun
  - given: Marcel
    family: Gehrke
  - given: Florian
    family: Lau
  - given: Ralf
    family: Möller
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 233-243
  id: braun22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 233
  lastpage: 243
  published: 2022-08-17 00:00:00 +0000
- title: 'On-the-fly adaptation of patrolling strategies in changing environments'
  abstract: 'We consider the problem of efficient patrolling strategy adaptation in a changing environment where the topology of Defender’s moves and the importance of guarded targets change unpredictably. The Defender must instantly switch to a new strategy optimized for the new environment, not disrupting the ongoing patrolling task, and the new strategy must be computed promptly under all circumstances. Since strategy switching may cause unintended security risks compromising the achieved protection, our solution includes mechanisms for detecting and mitigating this problem. The efficiency of our framework is evaluated experimentally.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/brazdil22a.html
  PDF: https://proceedings.mlr.press/v180/brazdil22a/brazdil22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-brazdil22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tomáš
    family: Brázdil
  - given: David
    family: Klaška
  - given: Antonín
    family: Kučera
  - given: Vít
    family: Musil
  - given: Petr
    family: Novotný
  - given: Vojtěch
    family: Řehák
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 244-254
  id: brazdil22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 244
  lastpage: 254
  published: 2022-08-17 00:00:00 +0000
- title: 'On the inductive bias of neural networks for learning read-once DNFs'
  abstract: 'Learning functions over Boolean variables is a fundamental problem in machine learning. But not much is known about learning such functions using neural networks. Here we focus on learning read-once disjunctive normal forms (DNFs) under the uniform distribution with a convex neural network and gradient methods. We first observe empirically that gradient methods converge to compact solutions with neurons that are aligned with the terms of the DNF. This is despite the fact that there are many zero training error networks that do not have this property. Thus, the learning process has a clear inductive bias towards such logical formulas. Following recent results which connect the inductive bias of gradient flow (GF) to Karush-Kuhn-Tucker (KKT) points of minimum norm problems, we study these KKT points in our setting. We prove that zero training error solutions that memorize training points are not KKT points and therefore GF cannot converge to them. On the other hand, we prove that globally optimal KKT points correspond exactly to networks that are aligned with the DNF terms. These results suggest a strong connection between the inductive bias of GF and solutions that align with the DNF. We conclude with extensive experiments which verify our findings.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/bronstein22a.html
  PDF: https://proceedings.mlr.press/v180/bronstein22a/bronstein22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-bronstein22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Ido
    family: Bronstein
  - given: Alon
    family: Brutzkus
  - given: Amir
    family: Globerson
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 255-265
  id: bronstein22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 255
  lastpage: 265
  published: 2022-08-17 00:00:00 +0000
- title: 'AUTM flow: atomic unrestricted time machine for monotonic normalizing flows'
  abstract: 'Nonlinear monotone transformations are used extensively in normalizing flows to construct invertible triangular mappings from simple distributions to complex ones. In existing literature, monotonicity is usually enforced by restricting function classes or model parameters and the inverse transformation is often approximated by root-finding algorithms as a closed-form inverse is unavailable. In this paper, we introduce a new integral-based approach termed Atomic Unrestricted Time Machine (AUTM), equipped with unrestricted integrands and an easy-to-compute explicit inverse. AUTM offers a versatile and efficient way to design normalizing flows with explicit inverse and unrestricted function classes or parameters. Theoretically, we present a constructive proof that AUTM is universal: all monotonic normalizing flows can be viewed as limits of AUTM flows. We provide a concrete example to show how to approximate any given monotonic normalizing flow using AUTM flows with guaranteed convergence. The result implies that AUTM can be used to transform an existing flow into a new one equipped with explicit inverse and unrestricted parameters. The performance of the new approach is evaluated on high dimensional density estimation, variational inference and image generation. Experiments demonstrate superior speed and memory efficiency of AUTM.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/cai22a.html
  PDF: https://proceedings.mlr.press/v180/cai22a/cai22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-cai22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Difeng
    family: Cai
  - given: Yuliang
    family: Ji
  - given: Huan
    family: He
  - given: Qiang
    family: Ye
  - given: Yuanzhe
    family: Xi
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 266-274
  id: cai22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 266
  lastpage: 274
  published: 2022-08-17 00:00:00 +0000
- title: 'Active approximately metric-fair learning'
  abstract: 'Existing studies on individual fairness focus on the passive setting and typically require $O(\frac{1}{\varepsilon^2})$ labeled instances to achieve an $\varepsilon$ bias budget. In this paper, we build on the elegant Approximately Metric-Fair (AMF) learning framework and propose an active AMF learner that can provably achieve the same budget with only $O(\log \frac{1}{\varepsilon})$ labeled instances. To our knowledge, this is a first and substantial improvement of the existing sample complexity for achieving individual fairness. Through experiments on three data sets, we show the proposed active AMF learner improves fairness on linear and non-linear models more efficiently than its passive counterpart as well as state-of-the-art active learners, while maintaining a comparable accuracy. To facilitate algorithm design and analysis, we also design a provably equivalent form of the approximate metric fairness based on uniform continuity instead of the existing almost Lipschitz continuity.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/cao22a.html
  PDF: https://proceedings.mlr.press/v180/cao22a/cao22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-cao22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yiting
    family: Cao
  - given: Chao
    family: Lan
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 275-285
  id: cao22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 275
  lastpage: 285
  published: 2022-08-17 00:00:00 +0000
- title: 'Capturing actionable dynamics with structured latent ordinary differential equations'
  abstract: 'End-to-end learning of dynamical systems with black-box models, such as neural ordinary differential equations (ODEs), provides a flexible framework for learning dynamics from data without prescribing a mathematical model for the dynamics. Unfortunately, this flexibility comes at the cost of understanding the dynamical system, for which ODEs are used ubiquitously. Further, experimental data are collected under various conditions (inputs), such as treatments, or grouped in some way, such as part of sub-populations. Understanding the effects of these system inputs on system outputs is crucial to have any meaningful model of a dynamical system. To that end, we propose a structured latent ODE model that explicitly captures system input variations within its latent representation. Building on a static latent variable specification, our model learns (independent) stochastic factors of variation for each input to the system, thus separating the effects of the system inputs in the latent space. This approach provides actionable modeling through the controlled generation of time-series data for novel input combinations (or perturbations). Additionally, we propose a flexible approach for quantifying uncertainties, leveraging a quantile regression formulation. Results on challenging biological datasets show consistent improvements over competitive baselines in the controlled generation of observational data and inference of biologically meaningful system inputs.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chapfuwa22a.html
  PDF: https://proceedings.mlr.press/v180/chapfuwa22a/chapfuwa22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chapfuwa22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Paidamoyo
    family: Chapfuwa
  - given: Sherri
    family: Rose
  - given: Lawrence
    family: Carin
  - given: Edward
    family: Meeds
  - given: Ricardo
    family: Henao
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 286-295
  id: chapfuwa22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 286
  lastpage: 295
  published: 2022-08-17 00:00:00 +0000
- title: 'Privacy-aware compression for federated data analysis'
  abstract: 'Federated data analytics is a framework for distributed data analysis where a server compiles noisy responses from a group of distributed low-bandwidth user devices to estimate aggregate statistics. Two major challenges in this framework are privacy, since user data is often sensitive, and compression, since the user devices have low network bandwidth. Prior work has addressed these challenges separately by combining standard compression algorithms with known privacy mechanisms. In this work, we take a holistic look at the problem and design a family of privacy-aware compression mechanisms that work for any given communication budget. We first propose a mechanism for transmitting a single real number that has optimal variance under certain conditions. We then show how to extend it to metric differential privacy for location privacy use-cases, as well as vectors, for application to federated learning. Our experiments illustrate that our mechanism can lead to better utility vs. compression trade-offs for the same privacy loss in a number of settings.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chaudhuri22a.html
  PDF: https://proceedings.mlr.press/v180/chaudhuri22a/chaudhuri22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chaudhuri22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Kamalika
    family: Chaudhuri
  - given: Chuan
    family: Guo
  - given: Mike
    family: Rabbat
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 296-306
  id: chaudhuri22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 296
  lastpage: 306
  published: 2022-08-17 00:00:00 +0000
- title: 'The optimal noise in noise-contrastive learning is not what you think'
  abstract: 'Learning a parametric model of a data distribution is a well-known statistical problem that has seen renewed interest as it is brought to scale in deep learning. Framing the problem as a self-supervised task, where data samples are discriminated from noise samples, is at the core of state-of-the-art methods, beginning with Noise-Contrastive Estimation (NCE). Yet, such contrastive learning requires a good noise distribution, which is hard to specify; domain-specific heuristics are therefore widely used. While a comprehensive theory is missing, it is widely assumed that the optimal noise should in practice be made equal to the data, both in distribution and proportion. This setting underlies Generative Adversarial Networks (GANs) in particular. Here, we empirically and theoretically challenge this assumption on the optimal noise. We show that deviating from this assumption can actually lead to better statistical estimators, in terms of asymptotic variance. In particular, the optimal noise distribution is different from the data’s and even from a different family.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chehab22a.html
  PDF: https://proceedings.mlr.press/v180/chehab22a/chehab22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chehab22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Omar
    family: Chehab
  - given: Alexandre
    family: Gramfort
  - given: Aapo
    family: Hyvärinen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 307-316
  id: chehab22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 307
  lastpage: 316
  published: 2022-08-17 00:00:00 +0000
- title: 'A competitive analysis of online failure-aware assignment'
  abstract: 'Motivated by a new generation of Internet advertising that has emerged in the live streaming e-commerce markets (e.g., TikTok) over the past five years, we study a variant of the online bipartite matching problem: advertisers send ad requests to influencers (aka, key opinion leaders) on a social media platform. Each influencer has a maximum number of ad requests she can accommodate. We assign a fixed number of influencers to an advertiser when she enters the platform. The advertiser then matches with each of the assigned influencers with a probability, which can be thought of as a set of negotiations between the advertiser and the set of assigned influencers. Unlike the standard online assignment problems, the outcome of any of these matches is not revealed throughout the session (negotiations take time). Our goal is to maximize the expected number of matches between advertisers and influencers. We put forward a new deterministic algorithm with a competitive ratio of $1/2$ and prove that no deterministic algorithm can achieve a better competitive ratio. We also show that the competitive ratio can be improved when randomness is allowed. We then study a setting where a match is successful with either probability 0 or a fixed $p$. We present an optimal randomized algorithm that achieves a competitive ratio of $1-1/e$ in this setting.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chen22a.html
  PDF: https://proceedings.mlr.press/v180/chen22a/chen22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chen22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Mengjing
    family: Chen
  - given: Pingzhong
    family: Tang
  - given: Zihe
    family: Wang
  - given: Shenke
    family: Xiao
  - given: Xiwang
    family: Yang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 317-325
  id: chen22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 317
  lastpage: 325
  published: 2022-08-17 00:00:00 +0000
- title: 'StackMix: a complementary mix algorithm'
  abstract: 'Techniques combining multiple images as input/output have proven to be effective data augmentations for training convolutional neural networks. In this paper, we present StackMix: each input is presented as a concatenation of two images, and the label is the mean of the two one-hot labels. On its own, StackMix rivals other widely used methods in the “Mix” line of work. More importantly, unlike previous work, significant gains across a variety of benchmarks are achieved by combining StackMix with existing Mix augmentation, effectively mixing more than two images. E.g., by combining StackMix with CutMix, test error in the supervised setting is improved across a variety of settings over CutMix, including 0.8% on ImageNet, 3% on Tiny ImageNet, 2% on CIFAR-100, 0.5% on CIFAR-10, and 1.5% on STL-10. Similar results are achieved with Mixup. We further show that gains hold for robustness to common input corruptions and perturbations at varying severities with a 0.7% improvement on CIFAR-100-C, by combining StackMix with AugMix over AugMix. On its own, improvements with StackMix hold across different numbers of labeled samples on CIFAR-100, maintaining approximately a 2% gap in test accuracy (down to using only 5% of the whole dataset), and StackMix is effective in the semi-supervised setting with a 2% improvement with the standard benchmark Pi-model. Finally, we perform an extensive ablation study to better understand the proposed methodology.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chen22b.html
  PDF: https://proceedings.mlr.press/v180/chen22b/chen22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chen22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: John
    family: Chen
  - given: Samarth
    family: Sinha
  - given: Anastasios
    family: Kyrillidis
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 326-335
  id: chen22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 326
  lastpage: 335
  published: 2022-08-17 00:00:00 +0000
- title: 'Knowledge representation combining quaternion path integration and depth-wise atrous circular convolution'
  abstract: 'Knowledge models endeavor to improve representation and feature extraction capabilities while keeping low computational cost. Firstly, existing embedding models in hypercomplex spaces of non-Abelian groups are optimized. Then a method for fast quaternion multiplication is proposed with proof, with which path semantics are computed and further integrated with the attention mechanism based on the idea that semantic extraction of relation sequences could be regarded as a multiple rotational blending problem. A depth-wise atrous circular convolution framework is set up for better feature extraction. Experiments including Link Prediction and Path Query are conducted on benchmark datasets, verifying that our model holds advantages over state-of-the-art models like Rotate3D. Moreover, the model is tested on a biomedical dataset simulating real-world applications. An ablation study is also performed to explore the effectiveness of different components.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chen22c.html
  PDF: https://proceedings.mlr.press/v180/chen22c/chen22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chen22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Xinyuan
    family: Chen
  - given: Zhongmei
    family: Zhou
  - given: Meichun
    family: Gao
  - given: Daya
    family: Shi
  - given: Mohd N.
    family: Husen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 336-345
  id: chen22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 336
  lastpage: 345
  published: 2022-08-17 00:00:00 +0000
- title: 'Sublinear time algorithms for greedy selection in high dimensions'
  abstract: 'Greedy selection is a widely used idea for solving many machine learning problems. But greedy selection algorithms often have high complexities and thus may be prohibitive for large-scale data. In this paper, we consider two fundamental optimization problems in machine learning: k-center clustering and convex hull approximation, both of which can be solved via greedy selection. We propose sublinear time algorithms for them by combining the strategies of randomization and greedy selection. Our results are similar in spirit to the linear time stochastic greedy selection algorithms for submodular maximization, but with several important differences. Our runtimes are independent of the number of input data items n. In particular, our runtime for k-center clustering significantly improves upon that of the uniform sampling approach, especially when the dimensionality is high. Our sublinear algorithms can also reduce the computational complexities for various applications, such as data selection and compression, active learning, and topic modeling.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chen22d.html
  PDF: https://proceedings.mlr.press/v180/chen22d/chen22d.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chen22d.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Qi
    family: Chen
  - given: Kai
    family: Liu
  - given: Ruilong
    family: Yao
  - given: Hu
    family: Ding
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 346-356
  id: chen22d
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 346
  lastpage: 356
  published: 2022-08-17 00:00:00 +0000
- title: 'Shoring up the foundations: fusing model embeddings and weak supervision'
  abstract: 'Foundation models offer an exciting new paradigm for constructing models with out-of-the-box embeddings and a few labeled examples. However, it is not clear how to best apply foundation models without labeled data. A potential approach is to fuse foundation models with weak supervision frameworks, which use weak label sources—pre-trained models, heuristics, crowd-workers—to construct pseudolabels. The challenge is building a combination that best exploits the signal available in both foundation models and weak sources. We propose LIGER, a combination that uses foundation model embeddings to improve two crucial elements of existing weak supervision techniques. First, we produce finer estimates of weak source quality by partitioning the embedding space and learning per-part source accuracies. Second, we improve source coverage by extending source votes in embedding space. Despite the black-box nature of foundation models, we prove results characterizing how our approach improves performance and show that lift scales with the smoothness of label distributions in embedding space. On six benchmark NLP and video tasks, LIGER outperforms vanilla weak supervision by 14.1 points, weakly-supervised kNN and adapters by 11.8 points, and kNN and adapters supervised by traditional hand labels by 7.2 points.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chen22e.html
  PDF: https://proceedings.mlr.press/v180/chen22e/chen22e.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chen22e.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Mayee F.
    family: Chen
  - given: Daniel Y.
    family: Fu
  - given: Dyah
    family: Adila
  - given: Michael
    family: Zhang
  - given: Frederic
    family: Sala
  - given: Kayvon
    family: Fatahalian
  - given: Christopher
    family: Ré
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 357-367
  id: chen22e
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 357
  lastpage: 367
  published: 2022-08-17 00:00:00 +0000
- title: 'On the definition and computation of causal treewidth'
  abstract: 'Causal treewidth is a recently introduced notion allowing one to speed up Bayesian network inference and to bound its complexity in the presence of functional dependencies (causal mechanisms) whose identities are unknown. Causal treewidth is no greater than treewidth and can be bounded even when treewidth is unbounded. The utility of causal treewidth has been illustrated recently in the context of causal inference and model-based supervised learning. However, the current definition of causal treewidth is descriptive rather than prescriptive, therefore limiting its full exploitation in a practical setting. We provide an extensive study of causal treewidth in this paper which moves us closer to realizing the full computational potential of this notion both theoretically and practically.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chen22f.html
  PDF: https://proceedings.mlr.press/v180/chen22f/chen22f.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chen22f.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yizuo
    family: Chen
  - given: Adnan
    family: Darwiche
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 368-377
  id: chen22f
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 368
  lastpage: 377
  published: 2022-08-17 00:00:00 +0000
- title: 'Offline reinforcement learning under value and density-ratio realizability: The power of gaps'
  abstract: 'We consider a challenging theoretical problem in offline reinforcement learning (RL): obtaining sample-efficiency guarantees with a dataset lacking sufficient coverage, under only realizability-type assumptions for the function approximators. While the existing theory has addressed learning under realizability and under non-exploratory data separately, no work has been able to address both simultaneously (except for a concurrent work which we compare in detail). Under an additional gap assumption, we provide guarantees to a simple pessimistic algorithm based on a version space formed by marginalized importance sampling (MIS), and the guarantee only requires the data to cover the optimal policy and the function classes to realize the optimal value and density-ratio functions. While similar gap assumptions have been used in other areas of RL theory, our work is the first to identify the utility and the novel mechanism of gap assumptions in offline RL with weak function approximation.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chen22g.html
  PDF: https://proceedings.mlr.press/v180/chen22g/chen22g.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chen22g.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Jinglin
    family: Chen
  - given: Nan
    family: Jiang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 378-388
  id: chen22g
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 378
  lastpage: 388
  published: 2022-08-17 00:00:00 +0000
- title: 'Greedy modality selection via approximate submodular maximization'
  abstract: 'Multimodal learning considers learning from multi-modality data, aiming to fuse heterogeneous sources of information. However, it is not always feasible to leverage all available modalities due to memory constraints. Further, training on all the modalities may be inefficient when redundant information exists within data, such as different subsets of modalities providing similar performance. In light of these challenges, we study modality selection, intending to efficiently select the most informative and complementary modalities under certain computational constraints. We formulate a theoretical framework for optimizing modality selection in multimodal learning and introduce a utility measure to quantify the benefit of selecting a modality. For this optimization problem, we present efficient algorithms when the utility measure exhibits monotonicity and approximate submodularity. We also connect the utility measure with existing Shapley-value-based feature importance scores. Last, we demonstrate the efficacy of our algorithm on synthetic (Patch-MNIST) and real-world (PEMS-SF, CMU-MOSI) datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/cheng22a.html
  PDF: https://proceedings.mlr.press/v180/cheng22a/cheng22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-cheng22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Runxiang
    family: Cheng
  - given: Gargi
    family: Balasubramaniam
  - given: Yifei
    family: He
  - given: Yao-Hung Hubert
    family: Tsai
  - given: Han
    family: Zhao
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 389-399
  id: cheng22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 389
  lastpage: 399
  published: 2022-08-17 00:00:00 +0000
- title: 'Feature selection for discovering distributional treatment effect modifiers'
  abstract: 'Finding the features relevant to the difference in treatment effects is essential to unveil the underlying causal mechanisms. Existing methods seek such features by measuring how greatly the feature attributes affect the degree of the conditional average treatment effect (CATE). However, these methods may overlook important features because CATE, a measure of the average treatment effect, cannot detect differences in distribution parameters other than the mean (e.g., variance). To resolve this weakness of existing methods, we propose a feature selection framework for discovering distributional treatment effect modifiers. We first formulate a feature importance measure that quantifies how strongly the feature attributes influence the discrepancy between potential outcome distributions. Then we derive its computationally efficient estimator and develop a feature selection algorithm that can control the type I error rate to the desired level. Experimental results show that our framework successfully discovers important features and outperforms the existing mean-based method.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chikahara22a.html
  PDF: https://proceedings.mlr.press/v180/chikahara22a/chikahara22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chikahara22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yoichi
    family: Chikahara
  - given: Makoto
    family: Yamada
  - given: Hisashi
    family: Kashima
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 400-410
  id: chikahara22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 400
  lastpage: 410
  published: 2022-08-17 00:00:00 +0000
- title: 'Combating the instability of mutual information-based losses via regularization'
  abstract: 'Notable progress has been made in numerous fields of machine learning based on neural network-driven mutual information (MI) bounds. However, utilizing the conventional MI-based losses is often challenging due to their practical and mathematical limitations. In this work, we first identify the symptoms behind their instability: (1) the neural network not converging even after the loss seemed to converge, and (2) saturating neural network outputs causing the loss to diverge. We mitigate both issues by adding a novel regularization term to the existing losses. We theoretically and experimentally demonstrate that added regularization stabilizes training. Finally, we present a novel benchmark that evaluates MI-based losses on both the MI estimation power and its capability on the downstream tasks, closely following the pre-existing supervised and contrastive learning settings. We evaluate six different MI-based losses and their regularized counterparts on multiple benchmarks to show that our approach is simple yet effective.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/choi22a.html
  PDF: https://proceedings.mlr.press/v180/choi22a/choi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-choi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Kwanghee
    family: Choi
  - given: Siyeong
    family: Lee
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 411-421
  id: choi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 411
  lastpage: 421
  published: 2022-08-17 00:00:00 +0000
- title: 'A geometric method for improved uncertainty estimation in real-time'
  abstract: 'Machine learning classifiers are probabilistic in nature, and thus inevitably involve uncertainty. Predicting the probability of a specific input to be correct is called uncertainty (or confidence) estimation and is crucial for risk management. Post-hoc model calibrations can improve models’ uncertainty estimations without the need for retraining, and without changing the model. Our work puts forward a geometric-based approach for uncertainty estimation. Roughly speaking, we use the geometric distance of the current input from the existing training inputs as a signal for estimating uncertainty and then calibrate that signal (instead of the model’s estimation) using standard post-hoc calibration techniques. We show that our method yields better uncertainty estimations than recently proposed approaches by extensively evaluating multiple datasets and models. In addition, we also demonstrate the possibility of performing our approach in near real-time applications. Our code is available on GitHub: https://github.com/NoSleepDeveloper/Geometric-Calibrator'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chouraqui22a.html
  PDF: https://proceedings.mlr.press/v180/chouraqui22a/chouraqui22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chouraqui22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Gabriella
    family: Chouraqui
  - given: Liron
    family: Cohen
  - given: Gil
    family: Einziger
  - given: Liel
    family: Leman
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 422-432
  id: chouraqui22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 422
  lastpage: 432
  published: 2022-08-17 00:00:00 +0000
- title: 'Cyclic test time augmentation with entropy weight method'
  abstract: 'In recent studies of data augmentation for neural networks, test time augmentation has been studied as a way to extract optimal transformation policies that enhance performance at minimum cost. The policy search method with the best level of input data dependency involves training a loss predictor network to estimate suitable transformations for each given input image independently, resulting in instance-level transformation extraction. In this work, we propose a method to utilize and modify the loss prediction pipeline to further improve the performance with the cyclic search for suitable transformations and the use of the entropy weight method. The cyclic usage of the loss predictor allows refining each input image with multiple transformations with a more flexible transformation magnitude. For cases where multiple augmentations are generated, we implement the entropy weight method to reflect the data uncertainty of each augmentation to force the final result to focus on augmentations with low uncertainty. The experimental results show convincing qualitative outcomes and robust performance for the corrupted conditions of data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/chun22a.html
  PDF: https://proceedings.mlr.press/v180/chun22a/chun22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-chun22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sewhan
    family: Chun
  - given: Jae Young
    family: Lee
  - given: Junmo
    family: Kim
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 433-442
  id: chun22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 433
  lastpage: 442
  published: 2022-08-17 00:00:00 +0000
- title: 'Greedy equivalence search in the presence of latent confounders'
  abstract: 'We investigate Greedy PAG Search (GPS) for score-based causal discovery over equivalence classes, similar to the famous Greedy Equivalence Search algorithm, except now in the presence of latent confounders. It is based on a novel characterization of Markov equivalence classes for MAGs, that not only improves state-of-the-art identification of Markov equivalence between MAGs to linear time complexity for sparse graphs, but also allows for efficient traversal over equivalence classes in the space of all MAGs. The resulting GPS algorithm is evaluated against several existing alternatives and found to show promising performance, both in terms of speed and accuracy.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/claassen22a.html
  PDF: https://proceedings.mlr.press/v180/claassen22a/claassen22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-claassen22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tom
    family: Claassen
  - given: Ioan G.
    family: Bucur
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 443-452
  id: claassen22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 443
  lastpage: 452
  published: 2022-08-17 00:00:00 +0000
- title: 'Counterfactual inference of second opinions'
  abstract: 'Automated decision support systems that are able to infer second opinions from experts can potentially facilitate a more efficient allocation of resources—they can help decide when and from whom to seek a second opinion. In this paper, we look at the design of this type of support systems from the perspective of counterfactual inference. We focus on a multiclass classification setting and first show that, if experts make predictions on their own, the underlying causal mechanism generating their predictions needs to satisfy a desirable set invariant property. Further, we show that, for any causal mechanism satisfying this property, there exists an equivalent mechanism where the predictions by each expert are generated by independent sub-mechanisms governed by a common noise. This motivates the design of a set invariant Gumbel-Max structural causal model where the structure of the noise governing the sub-mechanisms underpinning the model depends on an intuitive notion of similarity between experts which can be estimated from data. Experiments on both synthetic and real data show that our model can be used to infer second opinions more accurately than its non-causal counterpart.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/corvelo-benz22a.html
  PDF: https://proceedings.mlr.press/v180/corvelo-benz22a/corvelo-benz22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-corvelo-benz22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Nina L.
    family: Corvelo Benz
  - given: Manuel
    family: Gomez Rodriguez
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 453-463
  id: corvelo-benz22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 453
  lastpage: 463
  published: 2022-08-17 00:00:00 +0000
- title: 'Variational message passing neural network for Maximum-A-Posteriori (MAP) inference'
  abstract: 'Maximum-A-Posteriori (MAP) inference is a fundamental task in probabilistic inference and belief propagation (BP) is a widely used algorithm for MAP inference. Though BP has been applied successfully to many different fields, it offers no performance guarantee and often performs poorly on loopy graphs. To improve the performance on loopy graphs and to scale up to large graphs, we propose a variational message passing neural network (V-MPNN), where we leverage both the power of neural networks in modeling complex functions and the well-established algorithmic theories on variational belief propagation. Instead of relying on a hand-crafted variational assumption, we propose a neural-augmented free energy where a general variational distribution is parameterized through a neural network. A message passing neural network is utilized for the minimization of neural-augmented free energy. Training of the MPNN is thus guided by neural-augmented free energy, without requiring exact MAP configurations as annotations. We empirically demonstrate the effectiveness of the proposed V-MPNN by comparing against both state-of-the-art training-free methods and training-based methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/cui22a.html
  PDF: https://proceedings.mlr.press/v180/cui22a/cui22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-cui22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zijun
    family: Cui
  - given: Hanjing
    family: Wang
  - given: Tian
    family: Gao
  - given: Kartik
    family: Talamadupula
  - given: Qiang
    family: Ji
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 464-474
  id: cui22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 464
  lastpage: 474
  published: 2022-08-17 00:00:00 +0000
- title: 'On provably robust meta-Bayesian optimization'
  abstract: 'Bayesian optimization (BO) has become popular for sequential optimization of black-box functions. When BO is used to optimize a target function, we often have access to previous evaluations of potentially related functions. This begs the question as to whether we can leverage these previous experiences to accelerate the current BO task through meta-learning (meta-BO), while ensuring robustness against potentially harmful dissimilar tasks that could sabotage the convergence of BO. This paper introduces two scalable and provably robust meta-BO algorithms: robust meta-Gaussian process-upper confidence bound (RM-GP-UCB) and RM-GP-Thompson sampling (RM-GP-TS). We prove that both algorithms are asymptotically no-regret even when some or all previous tasks are dissimilar to the current task, and show that RM-GP-UCB enjoys a better theoretical robustness than RM-GP-TS. We also exploit the theoretical guarantees to optimize the weights assigned to individual previous tasks through regret minimization via online learning, which diminishes the impact of dissimilar tasks and hence further enhances the robustness. Empirical evaluations show that (a) RM-GP-UCB performs effectively and consistently across various applications, and (b) RM-GP-TS, despite being less robust than RM-GP-UCB both in theory and in practice, performs competitively in some scenarios with less dissimilar tasks and is more computationally efficient.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/dai22a.html
  PDF: https://proceedings.mlr.press/v180/dai22a/dai22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-dai22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zhongxiang
    family: Dai
  - given: Yizhou
    family: Chen
  - given: Haibin
    family: Yu
  - given: Bryan Kian Hsiang
    family: Low
  - given: Patrick
    family: Jaillet
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 475-485
  id: dai22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 475
  lastpage: 485
  published: 2022-08-17 00:00:00 +0000
- title: 'Individual fairness in feature-based pricing for monopoly markets'
  abstract: 'We study fairness in the context of feature-based price discrimination in monopoly markets. We propose a new notion of individual fairness, namely, $\alpha$-fairness, which guarantees that individuals with similar features face similar prices. First, we study discrete valuation space and give an analytical solution for optimal fair feature-based pricing. We show that the cost of fair pricing (CoF), defined as the ratio of expected revenue in an optimal feature-based pricing to the expected revenue in an optimal fair feature-based pricing, can be arbitrarily large in general. When the revenue function is continuous and concave with respect to the prices, we show that one can achieve CoF strictly less than 2, irrespective of the model parameters. Finally, we provide an algorithm to compute a fair feature-based pricing strategy that achieves this CoF.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/das22a.html
  PDF: https://proceedings.mlr.press/v180/das22a/das22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-das22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Shantanu
    family: Das
  - given: Swapnil
    family: Dhamal
  - given: Ganesh
    family: Ghalme
  - given: Shweta
    family: Jain
  - given: Sujit
    family: Gujar
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 486-495
  id: das22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 486
  lastpage: 495
  published: 2022-08-17 00:00:00 +0000
- title: 'Faster non-convex federated learning via global and local momentum'
  abstract: 'We propose \texttt{FedGLOMO}, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(\epsilon^{-1.5})$ to converge to an $\epsilon$-stationary point (i.e., $\mathbb{E}[\|\nabla f(x)\|^2] \leq \epsilon$) for smooth non-convex functions – under arbitrary client heterogeneity and compressed communication – compared to the $\mathcal{O}(\epsilon^{-2})$ complexity of most prior works. Our key algorithmic idea that enables achieving this improved complexity is based on the observation that the convergence in FL is hampered by two sources of high variance: (i) the global server aggregation step with multiple local updates, exacerbated by client heterogeneity, and (ii) the noise of the local client-level stochastic gradients. The first issue is particularly detrimental to FL algorithms that perform plain averaging at the server. By modeling the server aggregation step as a generalized gradient-type update, we propose a variance-reducing momentum-based global update at the server, which when applied in conjunction with variance-reduced local updates at the clients, enables \texttt{FedGLOMO} to enjoy an improved convergence rate. Our experiments illustrate the intrinsic variance reduction effect of \texttt{FedGLOMO}, which implicitly suppresses client-drift in heterogeneous data distribution settings and promotes communication efficiency.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/das22b.html
  PDF: https://proceedings.mlr.press/v180/das22b/das22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-das22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Rudrajit
    family: Das
  - given: Anish
    family: Acharya
  - given: Abolfazl
    family: Hashemi
  - given: Sujay
    family: Sanghavi
  - given: Inderjit S.
    family: Dhillon
  - given: Ufuk
    family: Topcu
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 496-506
  id: das22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 496
  lastpage: 506
  published: 2022-08-17 00:00:00 +0000
- title: 'Multi-objective Bayesian optimization over high-dimensional search spaces'
  abstract: 'Many real-world scientific and industrial applications require optimizing multiple competing black-box objectives. When the objectives are expensive to evaluate, multi-objective Bayesian optimization (BO) is a popular approach because of its high sample efficiency. However, even with recent methodological advances, most existing multi-objective BO methods perform poorly on search spaces with more than a few dozen parameters and rely on global surrogate models that scale cubically with the number of observations. In this work we propose MORBO, a scalable method for multi-objective BO over high-dimensional search spaces. MORBO identifies diverse globally optimal solutions by performing BO in multiple local regions of the design space in parallel using a coordinated strategy. We show that MORBO significantly advances the state of the art in sample efficiency for several high-dimensional synthetic problems and real-world applications, including an optical display design problem and a vehicle design problem with 146 and 222 parameters, respectively. On these problems, where existing BO algorithms fail to scale and perform well, MORBO provides practitioners with order-of-magnitude improvements in sample efficiency over the current approach.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/daulton22a.html
  PDF: https://proceedings.mlr.press/v180/daulton22a/daulton22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-daulton22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Samuel
    family: Daulton
  - given: David
    family: Eriksson
  - given: Maximilian
    family: Balandat
  - given: Eytan
    family: Bakshy
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 507-517
  id: daulton22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 507
  lastpage: 517
  published: 2022-08-17 00:00:00 +0000
- title: 'Bayesian structure learning with generative flow networks'
  abstract: 'In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) structure of Bayesian networks, from data. Defining such a distribution is very challenging, due to the combinatorially large sample space, and approximations based on MCMC are often required. Recently, a novel class of probabilistic models, called Generative Flow Networks (GFlowNets), have been introduced as a general framework for generative modeling of discrete and composite objects, such as graphs. In this work, we propose to use a GFlowNet as an alternative to MCMC for approximating the posterior distribution over the structure of Bayesian networks, given a dataset of observations. Generating a sample DAG from this approximate distribution is viewed as a sequential decision problem, where the graph is constructed one edge at a time, based on learned transition probabilities. Through evaluation on both simulated and real data, we show that our approach, called DAG-GFlowNet, provides an accurate approximation of the posterior over DAGs, and it compares favorably against other methods based on MCMC or variational inference.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/deleu22a.html
  PDF: https://proceedings.mlr.press/v180/deleu22a/deleu22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-deleu22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tristan
    family: Deleu
  - given: António
    family: Góis
  - given: Chris
    family: Emezue
  - given: Mansi
    family: Rankawat
  - given: Simon
    family: Lacoste-Julien
  - given: Stefan
    family: Bauer
  - given: Yoshua
    family: Bengio
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 518-528
  id: deleu22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 518
  lastpage: 528
  published: 2022-08-17 00:00:00 +0000
- title: 'Bayesian spillover graphs for dynamic networks'
  abstract: 'We present Bayesian Spillover Graphs (BSG), a novel method for learning temporal relationships, identifying critical nodes, and quantifying uncertainty for multi-horizon spillover effects in a dynamic system. BSG leverages both an interpretable framework via forecast error variance decompositions (FEVD) and comprehensive uncertainty quantification via Bayesian time series models to contextualize temporal relationships in terms of systemic risk and prediction variability. Forecast horizon hyperparameter h allows for learning both short-term and equilibrium state network behaviors. Experiments for identifying source and sink nodes under various graph and error specifications show significant performance gains against state-of-the-art Bayesian Networks and deep-learning baselines. Applications to real-world systems also showcase BSG as an exploratory analysis tool for uncovering indirect spillovers and quantifying systemic risk.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/deng22a.html
  PDF: https://proceedings.mlr.press/v180/deng22a/deng22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-deng22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Grace
    family: Deng
  - given: David S.
    family: Matteson
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 529-538
  id: deng22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 529
  lastpage: 538
  published: 2022-08-17 00:00:00 +0000
- title: 'Multiclass classification for Hawkes processes'
  abstract: 'We investigate the multiclass classification problem where the features are event sequences. More precisely, the data are assumed to be generated by a mixture of simple linear Hawkes processes. In this new setting, the classes are discriminated by various triggering kernels. A challenge is then to build an efficient classification procedure. We derive the optimal Bayes rule and provide a two-step estimation procedure of the Bayes classifier. In the first step, the weights of the mixture are estimated; in the second step, an empirical risk minimization procedure is performed to estimate the parameters of the Hawkes processes. We establish the consistency of the resulting procedure and derive rates of convergence. Finally, the numerical properties of the data-driven algorithm are illustrated through a simulation study where the triggering kernels are assumed to belong to the popular parametric exponential family. It highlights the accuracy and the robustness of the proposed algorithm. In particular, even if the underlying kernels are misspecified, the procedure exhibits good performance.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/denis22a.html
  PDF: https://proceedings.mlr.press/v180/denis22a/denis22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-denis22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Christophe
    family: Denis
  - given: Charlotte
    family: Dion-Blanc
  - given: Laure
    family: Sansonnet
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 539-547
  id: denis22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 539
  lastpage: 547
  published: 2022-08-17 00:00:00 +0000
- title: 'Quantification of Credal Uncertainty in Machine Learning: A Critical Analysis and Empirical Comparison'
  abstract: 'The representation and quantification of uncertainty has received increasing attention in machine learning in the recent past. The formalism of credal sets provides an interesting alternative in this regard, especially as it combines the representation of epistemic (lack of knowledge) and aleatoric (statistical) uncertainty in a rather natural way. In this paper, we elaborate on uncertainty measures for credal sets from the perspective of machine learning. More specifically, we provide an overview of proposals, discuss existing measures in a critical way, and also propose a new measure that is more tailored to the machine learning setting. Based on an experimental study, we conclude that theoretically well-justified measures also lead to better performance in practice. Besides, we corroborate the difficulty of the disaggregation problem, that is, of decomposing the amount of total uncertainty into aleatoric and epistemic uncertainty in a sound manner, thereby complementing theoretical findings with empirical evidence.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hullermeier22a.html
  PDF: https://proceedings.mlr.press/v180/hullermeier22a/hullermeier22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hullermeier22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Eyke
    family: Hüllermeier
  - given: Sébastien
    family: Destercke
  - given: Mohammad Hossein
    family: Shaker
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 548-557
  id: hullermeier22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 548
  lastpage: 557
  published: 2022-08-17 00:00:00 +0000
- title: 'Balancing adaptability and non-exploitability in repeated games'
  abstract: 'We study the problem of adaptability in repeated games: simultaneously guaranteeing low regret for several classes of opponents. We add the constraint that our algorithm is non-exploitable, in that the opponent lacks an incentive to use an algorithm against which we cannot achieve rewards exceeding some “fair” value. Our solution is an expert algorithm (LAFF), which searches within a set of sub-algorithms that are optimal for each opponent class, and punishes evidence of exploitation by switching to a policy that enforces a fair solution. With benchmarks that depend on the opponent class, we first show that LAFF has sublinear regret uniformly over these classes. Second, we show that LAFF discourages exploitation, because exploitative opponents have linear regret. To our knowledge, this work is the first to provide guarantees for both regret and non-exploitability in multi-agent learning.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/digiovanni22a.html
  PDF: https://proceedings.mlr.press/v180/digiovanni22a/digiovanni22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-digiovanni22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Anthony
    family: DiGiovanni
  - given: Ambuj
    family: Tewari
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 559-568
  id: digiovanni22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 559
  lastpage: 568
  published: 2022-08-17 00:00:00 +0000
- title: 'Variational- and metric-based deep latent space for out-of-distribution detection'
  abstract: 'One popular deep-learning approach for the task of Out-Of-Distribution (OOD) detection is based on thresholding the values of per-class Gaussian likelihood of deep features. However, two issues arise with that approach: first, the distributions are often far from being Gaussian; second, many OOD data points fall within the effective support of the known classes’ Gaussians. Thus, either way it is hard to find a good threshold. In contrast, our proposed solution for OOD detection is based on a new latent space where: 1) each known class is well captured by a nearly-isotropic Gaussian; 2) those Gaussians are far from each other and from the origin of the space (together, these properties effectively leave the area around the origin free for OOD data). Concretely, given a (possibly-trained) backbone deep net of choice, we use it to train a conditional variational model via a Kullback-Leibler loss, a triplet loss, and a new distancing loss that pushes classes away from each other. During inference, the class-dependent log-likelihood values of a deep feature ensemble of the test point are also weighted based on reconstruction errors, improving further the decision rule. Experiments on popular benchmarks show that our method yields state-of-the-art results, a feat achieved despite the fact that, unlike some competitors, we make no use of OOD data for training or hyperparameter tuning. Our code is available at https://github.com/BGU-CS-VIL/vmdls.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/dinari22a.html
  PDF: https://proceedings.mlr.press/v180/dinari22a/dinari22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-dinari22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Or
    family: Dinari
  - given: Oren
    family: Freifeld
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 569-578
  id: dinari22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 569
  lastpage: 578
  published: 2022-08-17 00:00:00 +0000
- title: 'Revisiting DP-Means: fast scalable algorithms via parallelism and delayed cluster creation'
  abstract: 'DP-means, a nonparametric generalization of K-means, extends the latter to the case where the number of clusters is unknown. Unlike K-means, however, DP-means is hard to parallelize, a limitation hindering its usage in large-scale tasks. This work bridges this practicality gap by rendering the DP-means approach a viable, fast, and highly-scalable solution. First, we study the strengths and weaknesses of previous attempts to parallelize the DP-means algorithm. Next, we propose a new parallel algorithm, called PDC-DP-Means (Parallel Delayed Cluster DP-Means), based in part on delayed creation of clusters. Compared with DP-Means, PDC-DP-Means provides not only a major speedup but also performance gains. Finally, we propose two extensions of PDC-DP-Means. The first combines it with an existing method, leading to further speedups. The second extends PDC-DP-Means to a Mini-Batch setting (with an optional support for an online mode), allowing for another major speedup. We verify the utility of the proposed methods on multiple datasets. We also show that the proposed methods outperform other nonparametric methods (e.g., DBSCAN). Our highly-efficient code can be used to reproduce our experiments and is available at https://github.com/BGU-CS-VIL/pdc-dp-means'
  volume: 180
  URL: https://proceedings.mlr.press/v180/dinari22b.html
  PDF: https://proceedings.mlr.press/v180/dinari22b/dinari22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-dinari22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Or
    family: Dinari
  - given: Oren
    family: Freifeld
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 579-588
  id: dinari22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 579
  lastpage: 588
  published: 2022-08-17 00:00:00 +0000
- title: 'X-MEN: guaranteed XOR-maximum entropy constrained inverse reinforcement learning'
  abstract: 'Inverse Reinforcement Learning (IRL) is a powerful way of learning from demonstrations. In this paper, we address IRL problems with the availability of prior knowledge that optimal policies will never violate certain constraints. Conventional approaches ignoring these constraints need many demonstrations to converge. We propose XOR-Maximum Entropy Constrained Inverse Reinforcement Learning (X-MEN), which is guaranteed to converge to the global optimal reward function at a linear rate w.r.t. the number of learning iterations. X-MEN embeds XOR-sampling – a provable sampling approach which transforms the #P-complete sampling problem into queries to NP oracles – into the framework of maximum entropy IRL. X-MEN also guarantees the learned IRL agent will never generate trajectories that violate constraints. Empirical results in navigation demonstrate that X-MEN converges faster to the optimal rewards compared to baseline approaches and always generates trajectories that satisfy multi-state combinatorial constraints.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ding22a.html
  PDF: https://proceedings.mlr.press/v180/ding22a/ding22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ding22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Fan
    family: Ding
  - given: Yexiang
    family: Xue
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 589-598
  id: ding22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 589
  lastpage: 598
  published: 2022-08-17 00:00:00 +0000
- title: 'Improving sign-random-projection via count sketch'
  abstract: 'Computing the angular similarity between pairs of vectors is a core part of various machine learning algorithms. The seminal work of Charikar (a.k.a. Sign-Random-Projection (SRP) or SimHash) provides an unbiased estimate for the same. However, SRP suffers from the following limitations: (i) large variance in the similarity estimation, and (ii) high running time while computing the sketch. There are improved variants that address these limitations. However, they are known to improve on only one aspect in their proposal, e.g., Yu et al. suggest a faster algorithm, while Ji et al. and Kang and Wong provide estimates with a smaller variance. In this work, we propose a sketching algorithm that addresses both aspects in one algorithm – a faster algorithm along with a smaller variance in the similarity estimation. Moreover, our algorithm is space-efficient as well. We present a rigorous theoretical analysis of our proposal and complement it via experiments on synthetic and real-world datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/dubey22a.html
  PDF: https://proceedings.mlr.press/v180/dubey22a/dubey22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-dubey22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Punit Pankaj
    family: Dubey
  - given: Bhisham Dev
    family: Verma
  - given: Rameshwar
    family: Pratap
  - given: Keegan
    family: Kang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 599-609
  id: dubey22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 599
  lastpage: 609
  published: 2022-08-17 00:00:00 +0000
- title: 'ResIST: Layer-wise decomposition of ResNets for distributed training'
  abstract: 'We propose ResIST, a novel distributed training protocol for Residual Networks (ResNets). ResIST randomly decomposes a global ResNet into several shallow sub-ResNets that are trained independently in a distributed manner for several local iterations, before having their updates synchronized and aggregated into the global model. In the next round, new sub-ResNets are randomly generated and the process repeats until convergence. By construction, per iteration, ResIST communicates only a small portion of network parameters to each machine and never uses the full model during training. Thus, ResIST reduces the per-iteration communication, memory, and time requirements of ResNet training to only a fraction of the requirements of full-model training. In comparison to common protocols, like data-parallel training and data-parallel training with local SGD, ResIST yields a decrease in communication and compute requirements, while being competitive with respect to model performance.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/dun22a.html
  PDF: https://proceedings.mlr.press/v180/dun22a/dun22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-dun22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Chen
    family: Dun
  - given: Cameron R.
    family: Wolfe
  - given: Christopher M.
    family: Jermaine
  - given: Anastasios
    family: Kyrillidis
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 610-620
  id: dun22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 610
  lastpage: 620
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning explainable templated graphical models'
  abstract: 'Templated graphical models (TGMs) encode model structure using rules that capture recurring relationships between multiple random variables. While the rules in TGMs are interpretable, it is not clear how they can be used to generate explanations for the individual predictions of the model. Further, learning these rules from data comes with high computational costs: it typically requires an expensive combinatorial search over the space of rules and repeated optimization over rule weights. In this work, we propose a new structure learning algorithm, Explainable Structured Model Search (ESMS), that learns a templated graphical model and an explanation framework for its predictions. ESMS uses a novel search procedure to efficiently search the space of models and discover models that trade off predictive accuracy and explainability. We introduce the notion of relational stability and prove that our proposed explanation framework is stable. Further, our proposed piecewise pseudolikelihood (PPLL) objective does not require re-optimizing the rule weights across models during each iteration of the search. In our empirical evaluation on three real-world datasets, we show that our proposed approach not only discovers models that are explainable, but also significantly outperforms existing state-of-the-art structure learning approaches.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/embar22a.html
  PDF: https://proceedings.mlr.press/v180/embar22a/embar22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-embar22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Varun
    family: Embar
  - given: Sriram
    family: Srinivasa
  - given: Lise
    family: Getoor
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 621-630
  id: embar22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 621
  lastpage: 630
  published: 2022-08-17 00:00:00 +0000
- title: 'SENTINEL: taming uncertainty with ensemble based distributional reinforcement learning'
  abstract: 'In this paper, we consider risk-sensitive sequential decision-making in Reinforcement Learning (RL). Our contributions are two-fold. First, we introduce a novel and coherent quantification of risk, namely composite risk, which quantifies the joint effect of aleatory and epistemic risk during the learning process. Existing works considered either aleatory or epistemic risk individually, or as an additive combination. We prove that the additive formulation is a particular case of the composite risk when the epistemic risk measure is replaced with expectation. Thus, the composite risk is more sensitive to both aleatory and epistemic uncertainty than the individual and additive formulations. We also propose an algorithm, SENTINEL-K, based on ensemble bootstrapping and distributional RL for representing epistemic and aleatory uncertainty respectively. The ensemble of K learners uses Follow The Regularised Leader (FTRL) to aggregate the return distributions and obtain the composite risk. We experimentally verify that SENTINEL-K estimates the return distribution better and, when used with composite risk estimates, demonstrates higher risk-sensitive performance than state-of-the-art risk-sensitive and distributional RL algorithms.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/eriksson22a.html
  PDF: https://proceedings.mlr.press/v180/eriksson22a/eriksson22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-eriksson22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Hannes
    family: Eriksson
  - given: Debabrota
    family: Basu
  - given: Mina
    family: Alibeigi
  - given: Christos
    family: Dimitrakakis
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 631-640
  id: eriksson22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 631
  lastpage: 640
  published: 2022-08-17 00:00:00 +0000
- title: 'Temporal abstractions-augmented temporally contrastive learning: An alternative to the Laplacian in RL'
  abstract: 'In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting, with applications ranging from skill discovery to reward shaping. Recently, learning the Laplacian representation has been framed as the optimization of a temporally-contrastive objective to overcome its computational limitations in large (or continuous) state spaces. However, this approach requires uniform access to all states in the state space, overlooking the exploration problem that emerges during the representation learning process. In this work, we propose an alternative method that is able to recover, in a non-uniform-prior setting, the expressiveness and the desired properties of the Laplacian representation. We do so by combining the representation learning with a skill-based covering policy, which provides a better training distribution to extend and refine the representation. We also show that a simple augmentation of the representation objective with the learned temporal abstractions improves dynamics-awareness and helps exploration. We find that our method succeeds as an alternative to the Laplacian in the non-uniform setting and scales to challenging continuous control environments. Finally, even if our method is not optimized for skill discovery, the learned skills can successfully solve difficult continuous navigation tasks with sparse rewards, where standard skill discovery approaches are not so effective.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/erraqabi22a.html
  PDF: https://proceedings.mlr.press/v180/erraqabi22a/erraqabi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-erraqabi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Akram
    family: Erraqabi
  - given: Marlos C.
    family: Machado
  - given: Mingde
    family: Zhao
  - given: Sainbayar
    family: Sukhbaatar
  - given: Alessandro
    family: Lazaric
  - given: Ludovic
    family: Denoyer
  - given: Yoshua
    family: Bengio
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 641-651
  id: erraqabi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 641
  lastpage: 651
  published: 2022-08-17 00:00:00 +0000
- title: 'Implicit kernel meta-learning using kernel integral forms'
  abstract: 'Meta-learning algorithms have made significant progress in the context of meta-learning for image classification but less attention has been given to the regression setting. In this paper we propose to learn the probability distribution representing a random feature kernel that we wish to use within kernel ridge regression (KRR). We introduce two instances of this meta-learning framework, learning a neural network pushforward for a translation-invariant kernel and an affine pushforward for a neural network random feature kernel, both mapping from a Gaussian latent distribution. We learn the parameters of the pushforward by minimizing a meta-loss associated to the KRR objective. Since the resulting kernel does not admit an analytical form, we adopt a random feature sampling approach to approximate it. We call the resulting method Implicit Kernel Meta-Learning (IKML). We derive a meta-learning bound for IKML, which shows the role played by the number of tasks $T$, the task sample size $n$, and the number of random features $M$. In particular the bound implies that $M$ can be chosen independently of $T$ and only mildly dependent on $n$. We introduce one synthetic and two real-world meta-learning regression benchmark datasets. Experiments on these datasets show that IKML'
  volume: 180
  URL: https://proceedings.mlr.press/v180/falk22a.html
  PDF: https://proceedings.mlr.press/v180/falk22a/falk22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-falk22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: John Isak Texas
    family: Falk
  - given: Carlo
    family: Ciliberto
  - given: Massimiliano
    family: Pontil
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 652-662
  id: falk22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 652
  lastpage: 662
  published: 2022-08-17 00:00:00 +0000
- title: 'Self-distribution distillation: efficient uncertainty estimation'
  abstract: 'Deep learning is increasingly being applied in safety-critical domains. For these scenarios it is important to know the level of uncertainty in a model’s prediction to ensure appropriate decisions are made by the system. Deep ensembles are the de-facto standard approach to obtaining various measures of uncertainty. However, ensembles often significantly increase the resources required in the training and/or deployment phases. Approaches have been developed that typically address the costs in one of these phases. In this work we propose a novel training approach, self-distribution distillation (S2D), which is able to efficiently train a single model that can estimate uncertainties. Furthermore it is possible to build ensembles of these models and apply hierarchical ensemble distillation approaches. Experiments on CIFAR-100 showed that S2D models outperformed standard models and Monte-Carlo dropout. Additional out-of-distribution detection experiments on LSUN, Tiny ImageNet, SVHN showed that even a standard deep ensemble can be outperformed using S2D based ensembles and novel distilled models.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/fathullah22a.html
  PDF: https://proceedings.mlr.press/v180/fathullah22a/fathullah22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-fathullah22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yassir
    family: Fathullah
  - given: Mark J. F.
    family: Gales
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 663-673
  id: fathullah22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 663
  lastpage: 673
  published: 2022-08-17 00:00:00 +0000
- title: 'Sequential algorithmic modification with test data reuse'
  abstract: 'After initial release of a machine learning algorithm, the model can be fine-tuned by retraining on subsequently gathered data, adding newly discovered features, or more. Each modification introduces a risk of deteriorating performance and must be validated on a test dataset. It may not always be practical to assemble a new dataset for testing each modification, especially when most modifications are minor or are implemented in rapid succession. Recent work has shown how one can repeatedly test modifications on the same dataset and protect against overfitting by (i) discretizing test results along a grid and (ii) applying a Bonferroni correction to adjust for the total number of modifications considered by an adaptive developer. However, the standard Bonferroni correction is overly conservative when most modifications are beneficial and/or highly correlated. This work investigates more powerful approaches using alpha-recycling and sequentially-rejective graphical procedures (SRGPs). We introduce two novel extensions that account for correlation between adaptively chosen algorithmic modifications: the first leverages the correlation between consecutive modifications using flexible fixed sequence tests, and the second leverages the correlation between the proposed modifications and those generated by a hypothetical prespecified model updating procedure. In empirical analyses, both SRGPs control the error rate of approving deleterious modifications and approve significantly more beneficial modifications than previous approaches.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/feng22a.html
  PDF: https://proceedings.mlr.press/v180/feng22a/feng22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-feng22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Jean
    family: Feng
  - given: Gene
    family: Pennello
  - given: Nicholas
    family: Petrick
  - given: Berkman
    family: Sahiner
  - given: Romain
    family: Pirracchio
  - given: Alexej
    family: Gossmann
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 674-684
  id: feng22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 674
  lastpage: 684
  published: 2022-08-17 00:00:00 +0000
- title: 'Estimating transfer entropy under long ranged dependencies'
  abstract: 'Estimating Transfer Entropy (TE) between time series is a highly impactful problem in fields such as finance and neuroscience. The well-known nearest neighbor estimator of TE potentially fails if temporal dependencies are noisy and long ranged, primarily because it estimates TE indirectly relying on the estimation of joint entropy terms in high dimensions, which is a hard problem in itself. Other estimators, such as those based on Copula entropy or conditional mutual information, have similar limitations. Leveraging the successes of modern discriminative models that operate in high dimensional (noisy) feature spaces, we express TE as a difference of two conditional entropy terms, which we directly estimate from conditional likelihoods computed in-sample from any discriminator (timeseries forecaster) trained per the maximum likelihood principle. To ensure that the in-sample log likelihood estimates are not overfit to the data, we propose a novel perturbation model based on locality sensitive hash (LSH) functions, which regularizes a discriminative model to have smooth functional outputs within local neighborhoods of the input space. Our estimator is consistent, and its variance reduces linearly in sample size. We also demonstrate its superiority w.r.t. state-of-the-art estimators through empirical evaluations on synthetic as well as real-world datasets from the neuroscience and finance domains.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/garg22a.html
  PDF: https://proceedings.mlr.press/v180/garg22a/garg22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-garg22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sahil
    family: Garg
  - given: Umang
    family: Gupta
  - given: Yu
    family: Chen
  - given: Syamantak Datta
    family: Gupta
  - given: Yeshaya
    family: Adler
  - given: Anderson
    family: Schneider
  - given: Yuriy
    family: Nevmyvaka
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 685-695
  id: garg22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 685
  lastpage: 695
  published: 2022-08-17 00:00:00 +0000
- title: 'Mitigating statistical bias within differentially private synthetic data'
  abstract: 'Increasing interest in privacy-preserving machine learning has led to new and evolved approaches for generating private synthetic data from undisclosed real data. However, mechanisms of privacy preservation can significantly reduce the utility of synthetic data, which in turn impacts downstream tasks such as learning predictive models or inference. We propose several re-weighting strategies using privatised likelihood ratios that not only mitigate statistical bias of downstream estimators but also have general applicability to differentially private generative models. Through large-scale empirical evaluation, we show that private importance weighting provides simple and effective privacy-compliant augmentation for general applications of synthetic data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ghalebikesabi22a.html
  PDF: https://proceedings.mlr.press/v180/ghalebikesabi22a/ghalebikesabi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ghalebikesabi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sahra
    family: Ghalebikesabi
  - given: Harry
    family: Wilde
  - given: Jack
    family: Jewson
  - given: Arnaud
    family: Doucet
  - given: Sebastian
    family: Vollmer
  - given: Chris
    family: Holmes
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 696-705
  id: ghalebikesabi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 696
  lastpage: 705
  published: 2022-08-17 00:00:00 +0000
- title: 'Neural-progressive hedging: Enforcing constraints in reinforcement learning with stochastic programming'
  abstract: 'We propose a framework, called neural-progressive hedging (NP), that leverages stochastic programming during the online phase of executing a reinforcement learning (RL) policy. The goal is to ensure feasibility with respect to constraints and risk-based objectives such as conditional value-at-risk (CVaR) during the execution of the policy, using probabilistic models of the state transitions to guide policy adjustments. The framework is particularly amenable to the class of sequential resource allocation problems since feasibility with respect to typical resource constraints cannot be enforced in a scalable manner. The NP framework provides an alternative that adds modest overhead during the online phase. Experimental results demonstrate the efficacy of the NP framework on two continuous real-world tasks: (i) the portfolio optimization problem with liquidity constraints for financial planning, characterized by non-stationary state distributions; and (ii) the dynamic repositioning problem in bike sharing systems, that embodies the class of supply-demand matching problems. We show that the NP framework produces policies that are better than deep RL and other baseline approaches, adapting to non-stationarity, whilst satisfying structural constraints and accommodating risk measures in the resulting policies. Additional benefits of the NP framework are ease of implementation and better explainability of the policies.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ghosh22a.html
  PDF: https://proceedings.mlr.press/v180/ghosh22a/ghosh22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ghosh22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Supriyo
    family: Ghosh
  - given: Laura
    family: Wynter
  - given: Shiau Hong
    family: Lim
  - given: Duc Thien
    family: Nguyen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 707-717
  id: ghosh22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 707
  lastpage: 717
  published: 2022-08-17 00:00:00 +0000
- title: 'Do Bayesian variational autoencoders know what they don’t know?'
  abstract: 'The problem of detecting the Out-of-Distribution (OoD) inputs is of paramount importance for Deep Neural Networks. It has been previously shown that even Deep Generative Models that allow estimating the density of the inputs may not be reliable and often tend to make over-confident predictions for OoDs, assigning to them a higher density than to the in-distribution data. This over-confidence in a single model can be potentially mitigated with Bayesian inference over the model parameters that take into account epistemic uncertainty. This paper investigates three approaches to Bayesian inference: stochastic gradient Markov chain Monte Carlo, Bayes by Backpropagation, and Stochastic Weight Averaging-Gaussian. The inference is implemented over the weights of the deep neural networks that parameterize the likelihood of the Variational Autoencoder. We empirically evaluate the approaches against several benchmarks that are often used for OoD detection: estimation of the marginal likelihood utilizing sampled model ensemble, typicality test, disagreement score, and Watanabe-Akaike Information Criterion. Finally, we introduce two simple scores that demonstrate the state-of-the-art performance.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/glazunov22a.html
  PDF: https://proceedings.mlr.press/v180/glazunov22a/glazunov22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-glazunov22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Misha
    family: Glazunov
  - given: Apostolis
    family: Zarras
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 718-727
  id: glazunov22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 718
  lastpage: 727
  published: 2022-08-17 00:00:00 +0000
- title: 'Robust expected information gain for optimal Bayesian experimental design using ambiguity sets'
  abstract: 'The ranking of experiments by expected information gain (EIG) in Bayesian experimental design is sensitive to changes in the model’s prior distribution, and the approximation of EIG yielded by sampling will have errors similar to the use of a perturbed prior. We define and analyze Robust Expected Information Gain (REIG), a modification of the objective in EIG maximization by minimizing an affine relaxation of EIG over an ambiguity set of distributions that are close to the original prior in KL-divergence. We show that, when combined with a sampling-based approach to estimating EIG, REIG corresponds to a "log-sum-exp" stabilization of the samples used to estimate EIG, meaning that it can be efficiently implemented in practice. Numerical tests combining REIG with variational nested Monte Carlo (VNMC), adaptive contrastive estimation (ACE) and mutual information neural estimation (MINE) suggest that in practice REIG also compensates for the variability of under-sampled estimators.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/go22a.html
  PDF: https://proceedings.mlr.press/v180/go22a/go22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-go22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Jinwoo
    family: Go
  - given: Tobin
    family: Isaac
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 728-737
  id: go22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 728
  lastpage: 737
  published: 2022-08-17 00:00:00 +0000
- title: 'Efficient and transferable adversarial examples from Bayesian neural networks'
  abstract: 'An established way to improve the transferability of black-box evasion attacks is to craft the adversarial examples on an ensemble-based surrogate to increase diversity. We argue that transferability is fundamentally related to uncertainty. Based on a state-of-the-art Bayesian Deep Learning technique, we propose a new method to efficiently build a surrogate by sampling approximately from the posterior distribution of neural network weights, which represents the belief about the value of each parameter. Our extensive experiments on ImageNet, CIFAR-10 and MNIST show that our approach improves the success rates of four state-of-the-art attacks significantly (up to 83.2 percentage points), in both intra-architecture and inter-architecture transferability. On ImageNet, our approach can reach a 94% success rate while reducing training computations from 11.6 to 2.4 exaflops, compared to an ensemble of independently trained DNNs. In 87.5% of cases, our vanilla surrogate achieves higher transferability than three test-time techniques designed for this purpose. Our work demonstrates that the way to train a surrogate has been overlooked, although it is an important element of transfer-based attacks. We are, therefore, the first to review the effectiveness of several training methods in increasing transferability. We provide new directions to better understand the transferability phenomenon and offer a simple but strong baseline for future work.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/gubri22a.html
  PDF: https://proceedings.mlr.press/v180/gubri22a/gubri22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-gubri22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Martin
    family: Gubri
  - given: Maxime
    family: Cordy
  - given: Mike
    family: Papadakis
  - given: Yves
    family: Le Traon
  - given: Koushik
    family: Sen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 738-748
  id: gubri22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 738
  lastpage: 748
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning a neural Pareto manifold extractor with constraints'
  abstract: 'Multi-objective optimization (MOO) problems require balancing competing objectives, often under constraints. The Pareto optimal solution set defines all possible optimal trade-offs over such objectives. In this work, we present a novel method for Pareto-front learning: inducing the full Pareto manifold at train-time so users can pick any desired optimal trade-off point at run-time. Our key insight is to exploit Fritz-John Conditions for a novel guided double gradient descent strategy. Evaluation on synthetic benchmark problems allows us to vary MOO problem difficulty in a controlled fashion and measure accuracy vs. known analytic solutions. We further test scalability and generalization in learning optimal neural model parameterizations for Multi-Task Learning (MTL) on image classification. Results show consistent improvement in accuracy and efficiency over prior MTL methods as well as techniques from operations research.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/gupta22a.html
  PDF: https://proceedings.mlr.press/v180/gupta22a/gupta22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-gupta22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Soumyajit
    family: Gupta
  - given: Gurpreet
    family: Singh
  - given: Raghu
    family: Bollapragada
  - given: Matthew
    family: Lease
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 749-758
  id: gupta22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 749
  lastpage: 758
  published: 2022-08-17 00:00:00 +0000
- title: 'Modeling extremes with $d$-max-decreasing neural networks'
  abstract: 'We propose a neural network architecture that enables non-parametric calibration and generation of multivariate extreme value distributions (MEVs). MEVs arise from Extreme Value Theory (EVT) as the necessary class of models when extrapolating a distributional fit over large spatial and temporal scales based on data observed in intermediate scales. In turn, EVT dictates that $d$-max-decreasing, a stronger form of convexity, is an essential shape constraint in the characterization of MEVs. As far as we know, our proposed architecture provides the first class of non-parametric estimators for MEVs that preserve these essential shape constraints. We show that the architecture approximates the dependence structure encoded by MEVs at parametric rate. Moreover, we present a new method for sampling high-dimensional MEVs using a generative model. We demonstrate our methodology on a wide range of experimental settings, ranging from environmental sciences to financial mathematics, and verify that the structural properties of MEVs are retained compared to existing methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hasan22a.html
  PDF: https://proceedings.mlr.press/v180/hasan22a/hasan22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hasan22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Ali
    family: Hasan
  - given: Khalil
    family: Elkhalil
  - given: Yuting
    family: Ng
  - given: João M.
    family: Pereira
  - given: Sina
    family: Farsiu
  - given: Jose
    family: Blanchet
  - given: Vahid
    family: Tarokh
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 759-768
  id: hasan22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 759
  lastpage: 768
  published: 2022-08-17 00:00:00 +0000
- title: 'Generalizing off-policy learning under sample selection bias'
  abstract: 'Learning personalized decision policies that generalize to the target population is of great relevance. Since training data is often not representative of the target population, standard policy learning methods may yield policies that do not generalize to the target population. To address this challenge, we propose a novel framework for learning policies that generalize to the target population. For this, we characterize the difference between the training data and the target population as a sample selection bias using a selection variable. Over an uncertainty set around this selection variable, we optimize the minimax value of a policy to achieve the best worst-case policy value on the target population. In order to solve the minimax problem, we derive an efficient algorithm based on a convex-concave procedure and prove convergence for parametrized spaces of policies such as logistic policies. We prove that, if the uncertainty set is well-specified, our policies generalize to the target population as they cannot do worse than on the training data. Using simulated data and a clinical trial, we demonstrate that, compared to standard policy learning methods, our framework improves the generalizability of policies substantially.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hatt22a.html
  PDF: https://proceedings.mlr.press/v180/hatt22a/hatt22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hatt22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tobias
    family: Hatt
  - given: Daniel
    family: Tschernutter
  - given: Stefan
    family: Feuerriegel
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 769-779
  id: hatt22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 769
  lastpage: 779
  published: 2022-08-17 00:00:00 +0000
- title: 'Reinforcement learning in many-agent settings under partial observability'
  abstract: 'Recent renewed interest in multi-agent reinforcement learning (MARL) has generated an impressive array of techniques that leverage deep RL, primarily actor-critic architectures, and can be applied to a limited range of settings in terms of observability and communication. However, a continuing limitation of much of this work is the curse of dimensionality when it comes to representations based on joint actions, which grow exponentially with the number of agents. In this paper, we squarely focus on this challenge of scalability. We apply the key insight of action anonymity to a recently presented actor-critic based MARL algorithm, interactive A2C. We introduce a Dirichlet-multinomial model for maintaining beliefs over the agent population when agents’ actions are not perfectly observable. We show that the posterior is a mixture of Dirichlet distributions that we approximate as a single component for tractability. We also show that the prediction accuracy of this method increases with more agents. Finally, we show empirically that our method can learn optimal behaviors in two recently introduced pragmatic domains with large agent populations, and that it is robust in partially observable environments.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/he22a.html
  PDF: https://proceedings.mlr.press/v180/he22a/he22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-he22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Keyang
    family: He
  - given: Prashant
    family: Doshi
  - given: Bikramjit
    family: Banerjee
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 780-789
  id: he22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 780
  lastpage: 789
  published: 2022-08-17 00:00:00 +0000
- title: 'Variational multiple shooting for Bayesian ODEs with Gaussian processes'
  abstract: 'Recent machine learning advances have proposed black-box estimation of unknown continuous-time system dynamics directly from data. However, earlier works are based on approximative solutions or point estimates. We propose a novel Bayesian nonparametric model that uses Gaussian processes to infer posteriors of unknown ODE systems directly from data. We derive sparse variational inference with decoupled functional sampling to represent vector field posteriors. We also introduce a probabilistic shooting augmentation to enable efficient inference from arbitrarily long trajectories. The method demonstrates the benefit of computing vector field posteriors, with predictive uncertainty scores outperforming alternative methods on multiple ODE learning tasks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hegde22a.html
  PDF: https://proceedings.mlr.press/v180/hegde22a/hegde22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hegde22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Pashupati
    family: Hegde
  - given: Çağatay
    family: Yıldız
  - given: Harri
    family: Lähdesmäki
  - given: Samuel
    family: Kaski
  - given: Markus
    family: Heinonen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 790-799
  id: hegde22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 790
  lastpage: 799
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning sparse representations of preferences within Choquet expected utility theory'
  abstract: 'This paper deals with preference elicitation within Choquet Expected Utility (CEU) theory for decision making under uncertainty. We consider Savage’s framework with a finite set of states and assume that preferences of the Decision Maker over acts are observable. The CEU model involves two parameters that must be tuned to the value system of the decision maker: a set function (capacity) modeling weights attached to events, of size exponential in the number of states, and a utility function defined on the space of outcomes. Our aim is to learn a sparse representation of the CEU model from preference data. We propose and test a preference learning approach based on a spline representation of utilities and the sparse learning of capacities to obtain CEU models achieving a good tradeoff between the aim of sparsity and the expressivity required by preference data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/herin22a.html
  PDF: https://proceedings.mlr.press/v180/herin22a/herin22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-herin22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Margot
    family: Herin
  - given: Patrice
    family: Perny
  - given: Nataliya
    family: Sokolovska
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 800-810
  id: herin22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 800
  lastpage: 810
  published: 2022-08-17 00:00:00 +0000
- title: 'Quadratic metric elicitation for fairness and beyond'
  abstract: 'Metric elicitation is a recent framework for eliciting classification performance metrics that best reflect implicit user preferences based on the task and context. However, available elicitation strategies have been limited to linear (or quasi-linear) functions of predictive rates, which can be practically restrictive for many applications including fairness. This paper develops a strategy for eliciting more flexible multiclass metrics defined by quadratic functions of rates, designed to reflect human preferences better. We show its application in eliciting quadratic violation-based group-fair metrics. Our strategy requires only relative preference feedback, is robust to noise, and achieves near-optimal query complexity. We further extend this strategy to eliciting polynomial metrics – thus broadening the use cases for metric elicitation.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hiranandani22a.html
  PDF: https://proceedings.mlr.press/v180/hiranandani22a/hiranandani22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hiranandani22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Gaurush
    family: Hiranandani
  - given: Jatin
    family: Mathur
  - given: Harikrishna
    family: Narasimhan
  - given: Oluwasanmi
    family: Koyejo
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 811-821
  id: hiranandani22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 811
  lastpage: 821
  published: 2022-08-17 00:00:00 +0000
- title: 'Fast predictive uncertainty for classification with Bayesian deep networks'
  abstract: 'In Bayesian Deep Learning, distributions over the output of classification neural networks are often approximated by first constructing a Gaussian distribution over the weights, then sampling from it to receive a distribution over the softmax outputs. This is costly. We reconsider old work (Laplace Bridge) to construct a Dirichlet approximation of this softmax output distribution, which yields an analytic map between Gaussian distributions in logit space and Dirichlet distributions (the conjugate prior to the Categorical distribution) in the output space. Importantly, the vanilla Laplace Bridge comes with certain limitations. We analyze those and suggest a simple solution that compares favorably to other commonly used estimates of the softmax-Gaussian integral. We demonstrate that the resulting Dirichlet distribution has multiple advantages, in particular, more efficient computation of the uncertainty estimate and scaling to large datasets and networks like ImageNet and DenseNet. We further demonstrate the usefulness of this Dirichlet approximation by using it to construct a lightweight uncertainty-aware output ranking for ImageNet.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hobbhahn22a.html
  PDF: https://proceedings.mlr.press/v180/hobbhahn22a/hobbhahn22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hobbhahn22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Marius
    family: Hobbhahn
  - given: Agustinus
    family: Kristiadi
  - given: Philipp
    family: Hennig
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 822-832
  id: hobbhahn22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 822
  lastpage: 832
  published: 2022-08-17 00:00:00 +0000
- title: 'CIGMO: Categorical invariant representations in a deep generative framework'
  abstract: 'Data of general object images have two common structures: (1) each object of a given shape can be rendered in multiple different views, and (2) shapes of objects can be categorized in such a way that the diversity of shapes is much larger across categories than within a category. Existing deep generative models can typically capture either structure, but not both. In this work, we introduce a novel deep generative model, called CIGMO, that can learn to represent category, shape, and view factors from image data. The model comprises multiple modules of shape representations that are each specialized to a particular category and disentangled from view representation, and can be learned using a group-based weakly supervised learning method. By empirical investigation, we show that our model can effectively discover categories of object shapes despite large view variation and quantitatively outperform various previous methods including the state-of-the-art invariant clustering algorithm. Further, we show that our approach using category-specialization can enhance the learned shape representation to better perform down-stream tasks such as one-shot object identification as well as shape-view disentanglement.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hosoya22a.html
  PDF: https://proceedings.mlr.press/v180/hosoya22a/hosoya22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hosoya22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Haruo
    family: Hosoya
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 833-843
  id: hosoya22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 833
  lastpage: 843
  published: 2022-08-17 00:00:00 +0000
- title: 'Near-optimal Thompson sampling-based algorithms for differentially private stochastic bandits'
  abstract: 'We address differentially private stochastic bandits. We present two (near)-optimal Thompson Sampling-based learning algorithms: DP-TS and Lazy-DP-TS. The core idea in achieving optimality is the principle of optimism in the face of uncertainty. We reshape the posterior distribution in an optimistic way as compared to the non-private Thompson Sampling. Our DP-TS achieves a $\sum\limits_{j \in \mathcal{A}: \Delta_j > 0} O \left(\frac{\log(T)}{\min \left\{\epsilon, \Delta_j \right\}} \log \left(\frac{\log(T)}{\epsilon \cdot \Delta_j} \right) \right)$ regret bound, where $\mathcal{A}$ is the arm set, $\Delta_j$ is the sub-optimality gap of a sub-optimal arm $j$, and $\epsilon$ is the privacy parameter. Our Lazy-DP-TS gets rid of the extra $\log$ factor by using the idea of dropping observations. The regret of Lazy-DP-TS is $\sum\limits_{j \in \mathcal{A}: \Delta_j > 0} O \left(\frac{\log(T)}{\min \left\{\epsilon, \Delta_j \right\}} \right)$, which matches the regret lower bound. Additionally, we conduct experiments to compare the empirical performance of our proposed algorithms with the existing optimal algorithms for differentially private stochastic bandits.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hu22a.html
  PDF: https://proceedings.mlr.press/v180/hu22a/hu22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hu22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Bingshan
    family: Hu
  - given: Nidhi
    family: Hegde
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 844-852
  id: hu22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 844
  lastpage: 852
  published: 2022-08-17 00:00:00 +0000
- title: 'Uncertainty-aware pseudo-labeling for quantum calculations'
  abstract: 'Machine learning models have recently shown promise in predicting molecular quantum chemical properties. However, the path to real-life adoption requires (1) learning under low-resource constraints and (2) out-of-distribution generalization to unseen, structurally diverse molecules. We observe that these two challenges can be addressed via abundant labels, which is often not the case in quantum chemistry. We hypothesize that pseudo-labeling on a vast array of unlabeled molecules can serve as gold-label proxies to expand the training labeled dataset significantly. The challenge in pseudo-labeling is to prevent the bad pseudo-labels from biasing the model. Motivated by the entropy minimization framework, we develop a simple and effective strategy Pseudo that can assign pseudo-labels, detect bad pseudo-labels through evidential uncertainty, and prevent them from biasing the model using adaptive weighting. Empirically, Pseudo improves the accuracy of quantum calculations in full data, low data, and out-of-distribution settings.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/huang22a.html
  PDF: https://proceedings.mlr.press/v180/huang22a/huang22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-huang22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Kexin
    family: Huang
  - given: Vishnu
    family: Sresht
  - given: Brajesh
    family: Rai
  - given: Mykola
    family: Bordyuh
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 853-862
  id: huang22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 853
  lastpage: 862
  published: 2022-08-17 00:00:00 +0000
- title: 'A mutually exciting latent space Hawkes process model for continuous-time networks'
  abstract: 'Networks and temporal point processes serve as fundamental building blocks for modeling complex dynamic relational data in various domains. We propose the latent space Hawkes (LSH) model, a novel generative model for continuous-time networks of relational events, using a latent space representation for nodes. We model relational events between nodes using mutually exciting Hawkes processes with baseline intensities dependent upon the distances between the nodes in the latent space and sender and receiver specific effects. We demonstrate that our proposed LSH model can replicate many features observed in real temporal networks including reciprocity and transitivity, while also achieving superior prediction accuracy and providing more interpretable fits than existing models.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/huang22b.html
  PDF: https://proceedings.mlr.press/v180/huang22b/huang22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-huang22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zhipeng
    family: Huang
  - given: Hadeel
    family: Soliman
  - given: Subhadeep
    family: Paul
  - given: Kevin S.
    family: Xu
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 863-873
  id: huang22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 863
  lastpage: 873
  published: 2022-08-17 00:00:00 +0000
- title: 'Binary independent component analysis: a non-stationarity-based approach'
  abstract: 'We consider independent component analysis of binary data. While fundamental in practice, this case has been much less developed than ICA for continuous data. We start by assuming a linear mixing model in a continuous-valued latent space, followed by a binary observation model. Importantly, we assume that the sources are non-stationary; this is necessary since any non-Gaussianity would essentially be destroyed by the binarization. Interestingly, the model allows for closed-form likelihood by employing the cumulative distribution function of the multivariate Gaussian distribution. In stark contrast to the continuous-valued case, we prove non-identifiability of the model with few observed variables; our empirical results imply identifiability when the number of observed variables is higher. We present a practical method for binary ICA that uses only pairwise marginals, which are faster to compute than the full multivariate likelihood. Experiments give insight into the requirements for the number of observed variables, segments, and latent sources that allow the model to be estimated.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/hyttinen22a.html
  PDF: https://proceedings.mlr.press/v180/hyttinen22a/hyttinen22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-hyttinen22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Antti
    family: Hyttinen
  - given: Vitória
    family: Barin Pacela
  - given: Aapo
    family: Hyvärinen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 874-884
  id: hyttinen22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 874
  lastpage: 884
  published: 2022-08-17 00:00:00 +0000
- title: 'Balancing utility and scalability in metric differential privacy'
  abstract: 'Metric differential privacy (mDP) is a modification of differential privacy that is more suitable when records can be represented in a general metric space, such as text data represented as word embeddings or geographical coordinates on a map. We consider the task of releasing elements of the metric space under metric differential privacy where utility is measured as the distance of the released element to the original element. Linear programming (LP) can be used to construct a mechanism that achieves the optimal utility for a particular mDP constraint. However, these LPs suffer from a polynomial explosion of variables and constraints that render them impractical for solving real-world problems. An important question is how to design rigorous mDP mechanisms that balance the utility-scalability tradeoff. Our main contribution is a new method for reducing the LP size used to generate mDP mechanisms by constraining the search space such that certain input and output pairs have transition probabilities derived from the exponential mechanism. Our method produces mDP mechanisms whose LPs are smaller than all prior work in this area. We also provide a lower bound on the best possible mechanism utility. Our experiments on real-world metric spaces highlight the superior utility-scalability tradeoff of our mechanism.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/imola22a.html
  PDF: https://proceedings.mlr.press/v180/imola22a/imola22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-imola22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Jacob
    family: Imola
  - given: Shiva
    family: Kasiviswanathan
  - given: Stephen
    family: White
  - given: Abhinav
    family: Aggarwal
  - given: Nathanael
    family: Teissier
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 885-894
  id: imola22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 885
  lastpage: 894
  published: 2022-08-17 00:00:00 +0000
- title: 'Towards painless policy optimization for constrained MDPs'
  abstract: 'We study policy optimization in an infinite horizon, $\gamma$-discounted constrained Markov decision process (CMDP). Our objective is to return a policy that achieves large expected reward with a small constraint violation. We consider the online setting with linear function approximation and assume global access to the corresponding features. We propose a generic primal-dual framework that allows us to bound the reward sub-optimality and constraint violation for arbitrary algorithms in terms of their primal and dual regret on online linear optimization problems. We instantiate this framework to use coin-betting algorithms and propose the \textbf{Coin Betting Politex (CBP)} algorithm. Assuming that the action-value functions are $\epsilon_{\text{\tiny{b}}}$-close to the span of the $d$-dimensional state-action features and no sampling errors, we prove that $T$ iterations of CBP result in an $O\left(\frac{1}{(1 - \gamma)^3 \sqrt{T}} + \frac{\epsilon_{\text{\tiny{b}}} \sqrt{d}}{(1 - \gamma)^2} \right)$ reward sub-optimality and an $O\left(\frac{1}{(1 - \gamma)^2 \sqrt{T}} + \frac{\epsilon_{\text{\tiny{b}}} \sqrt{d}}{1 - \gamma} \right)$ constraint violation. Importantly, unlike gradient descent-ascent and other recent methods, CBP does not require extensive hyperparameter tuning. Via experiments on synthetic and Cartpole environments, we demonstrate the effectiveness and robustness of CBP.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/jain22a.html
  PDF: https://proceedings.mlr.press/v180/jain22a/jain22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-jain22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Arushi
    family: Jain
  - given: Sharan
    family: Vaswani
  - given: Reza
    family: Babanezhad
  - given: Csaba
    family: Szepesvári
  - given: Doina
    family: Precup
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 895-905
  id: jain22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 895
  lastpage: 905
  published: 2022-08-17 00:00:00 +0000
- title: 'FedVARP: Tackling the variance due to partial client participation in federated learning'
  abstract: 'Data-heterogeneous federated learning (FL) systems suffer from two significant sources of convergence error: 1) client drift error caused by performing multiple local optimization steps at clients, and 2) partial client participation error caused by the fact that only a small subset of the edge clients participate in every training round. We find that among these, only the former has received significant attention in the literature. To remedy this, we propose FedVARP, a novel variance reduction algorithm applied at the server that eliminates error due to partial client participation. To do so, the server simply maintains in memory the most recent update for each client and uses these as surrogate updates for the non-participating clients in every round. Further, to alleviate the memory requirement at the server, we propose a novel clustering-based variance reduction algorithm ClusterFedVARP. Unlike previously proposed methods, both FedVARP and ClusterFedVARP do not require additional computation at clients or communication of additional optimization parameters. Through extensive experiments, we show that FedVARP outperforms state-of-the-art methods, and ClusterFedVARP achieves performance comparable to FedVARP with much lower memory requirements.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/jhunjhunwala22a.html
  PDF: https://proceedings.mlr.press/v180/jhunjhunwala22a/jhunjhunwala22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-jhunjhunwala22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Divyansh
    family: Jhunjhunwala
  - given: Pranay
    family: Sharma
  - given: Aushim
    family: Nagarkatti
  - given: Gauri
    family: Joshi
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 906-916
  id: jhunjhunwala22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 906
  lastpage: 916
  published: 2022-08-17 00:00:00 +0000
- title: 'Orthogonal Gromov-Wasserstein discrepancy with efficient lower bound'
  abstract: 'Comparing structured data from possibly different metric-measure spaces is a fundamental task in machine learning, with applications in, e.g., graph classification. The Gromov-Wasserstein (GW) discrepancy formulates a coupling between the structured data based on optimal transportation, tackling the incomparability between different structures by aligning the intra-relational geometries. Although efficient local solvers such as conditional gradient and Sinkhorn are available, the inherent non-convexity still prevents a tractable evaluation, and the existing lower bounds are not tight enough for practical use. To address this issue, we take inspiration from the connection with the quadratic assignment problem, and propose the orthogonal Gromov-Wasserstein (OGW) discrepancy as a surrogate of GW. It admits an efficient and closed-form lower bound with O(n^3) complexity, and directly extends to the fused Gromov-Wasserstein distance, incorporating node features into the coupling. Extensive experiments on both the synthetic and real-world datasets show the tightness of our lower bounds, and both OGW and its lower bounds efficiently deliver accurate predictions and satisfactory barycenters for graph sets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/jin22a.html
  PDF: https://proceedings.mlr.press/v180/jin22a/jin22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-jin22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Hongwei
    family: Jin
  - given: Zishun
    family: Yu
  - given: Xinhua
    family: Zhang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 917-927
  id: jin22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 917
  lastpage: 927
  published: 2022-08-17 00:00:00 +0000
- title: 'If you’ve trained one you’ve trained them all: inter-architecture similarity increases with robustness'
  abstract: 'Previous work has shown that commonly-used metrics for comparing representations between neural networks overestimate similarity due to correlations between data points. We show that intra-example feature correlations also cause significant overestimation of network similarity and propose an image inversion technique to analyze only the features used by a network. With this technique, we find that similarity across architectures is significantly lower than commonly understood, but we surprisingly find that similarity between models with different architectures increases as the adversarial robustness of the models increases. Our findings indicate that robust networks tend toward a universal set of representations, regardless of architecture, and that the robust training criterion is a strong prior constraint on the functions that can be learned by diverse modern architectures. We also find that the representations learned by a robust network of any architecture have an asymmetric overlap with non-robust networks of many architectures, indicating that the representations used by robust neural networks are highly entangled with the representations used by non-robust networks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/jones22a.html
  PDF: https://proceedings.mlr.press/v180/jones22a/jones22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-jones22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Haydn T.
    family: Jones
  - given: Jacob M.
    family: Springer
  - given: Garrett T.
    family: Kenyon
  - given: Juston S.
    family: Moore
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 928-937
  id: jones22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 928
  lastpage: 937
  published: 2022-08-17 00:00:00 +0000
- title: 'Decision-theoretic planning with communication in open multiagent systems'
  abstract: 'In open multiagent systems, the set of agents operating in the environment changes over time and in ways that are nontrivial to predict. For example, if collaborative robots were tasked with fighting wildfires, they may run out of suppressants and be temporarily unavailable to assist their peers. Because an agent’s optimal action depends on the actions of others, each agent must not only predict the actions of its peers, but, before that, reason whether they are even present to perform an action. Addressing openness thus requires agents to model each other’s presence, which can be enhanced through agents communicating about their presence in the environment. At the same time, communicative acts can also incur costs (e.g., consuming limited bandwidth), and thus an agent must trade off the benefits of enhanced coordination with the costs of communication. We present a new principled, decision-theoretic method in the context provided by the recent communicative interactive POMDP framework for planning in open agent settings that balances this tradeoff. Simulations of multiagent wildfire suppression problems demonstrate how communication can improve planning in open agent environments, as well as how agents trade off the benefits and costs of communication under different scenarios.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/kakarlapudi22a.html
  PDF: https://proceedings.mlr.press/v180/kakarlapudi22a/kakarlapudi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-kakarlapudi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Anirudh
    family: Kakarlapudi
  - given: Gayathri
    family: Anil
  - given: Adam
    family: Eck
  - given: Prashant
    family: Doshi
  - given: Leen-Kiat
    family: Soh
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 938-948
  id: kakarlapudi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 938
  lastpage: 948
  published: 2022-08-17 00:00:00 +0000
- title: 'Optimal control of partially observable Markov decision processes with finite linear temporal logic constraints'
  abstract: 'Autonomous agents often operate in environments where the state is partially observed. In addition to maximizing their cumulative reward, agents must execute complex tasks with rich temporal and logical structures. These tasks can be expressed using temporal logic languages like finite linear temporal logic. This paper, for the first time, provides a structured framework for designing agent policies that maximize the reward while ensuring that the probability of satisfying the temporal logic specification is sufficiently high. We reformulate the problem as a constrained partially observable Markov decision process (POMDP) and provide a novel approach that can leverage off-the-shelf unconstrained POMDP solvers for solving it. Our approach guarantees approximate optimality and constraint satisfaction with high probability. We demonstrate its effectiveness by implementing it on several models of interest.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/kalagarla22a.html
  PDF: https://proceedings.mlr.press/v180/kalagarla22a/kalagarla22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-kalagarla22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Krishna C.
    family: Kalagarla
  - given: Kartik
    family: Dhruva
  - given: Dongming
    family: Shen
  - given: Rahul
    family: Jain
  - given: Ashutosh
    family: Nayyar
  - given: Pierluigi
    family: Nuzzo
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 949-958
  id: kalagarla22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 949
  lastpage: 958
  published: 2022-08-17 00:00:00 +0000
- title: 'Test for non-negligible adverse shifts'
  abstract: 'Statistical tests for dataset shift are susceptible to false alarms: they are sensitive to minor differences when there is in fact adequate sample coverage and predictive performance. We propose instead a framework to detect adverse shifts based on outlier scores, D-SOS for short. D-SOS holds that the new (test) sample is not substantively worse than the reference (training) sample, and not that the two are equal. The key idea is to reduce observations to outlier scores and compare contamination rates at varying weighted thresholds. Users can define what worse means in terms of relevant notions of outlyingness, including proxies for predictive performance. Compared to tests of equal distribution, our approach is uniquely tailored to serve as a robust metric for model monitoring and data validation. We show how versatile and practical D-SOS is on a wide range of real and simulated data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/kamulete22a.html
  PDF: https://proceedings.mlr.press/v180/kamulete22a/kamulete22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-kamulete22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Vathy M
    family: Kamulete
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 959-968
  id: kamulete22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 959
  lastpage: 968
  published: 2022-08-17 00:00:00 +0000
- title: 'Improved feature importance computation for tree models based on the Banzhaf value'
  abstract: 'The Shapley value – a fundamental game-theoretic solution concept – has recently become one of the main tools used to explain predictions of tree ensemble models. Another well-known game-theoretic solution concept is the Banzhaf value. Although the Banzhaf value is closely related to the Shapley value, its properties w.r.t. feature attribution have not been understood equally well. This paper shows that, for tree ensemble models, the Banzhaf value offers some crucial advantages over the Shapley value while providing similar feature attributions. In particular, we first give an optimal O(TL + n) time algorithm for computing the Banzhaf value-based attribution of a tree ensemble model’s output. Here, T is the number of trees, L is the maximum number of leaves in a tree, and n is the number of features. In comparison, the state-of-the-art Shapley value-based algorithm runs in O(TLD^2 + n) time, where D denotes the maximum depth of a tree in the ensemble. Next, we experimentally compare the Banzhaf and Shapley values for tree ensemble models. Both methods deliver essentially the same average importance scores for the studied datasets using two different tree ensemble models (the sklearn implementation of Decision Trees or xgboost implementation of Gradient Boosting Decision Trees). However, our results indicate that, on top of being faster to compute, the Banzhaf value is more numerically robust than the Shapley value.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/karczmarz22a.html
  PDF: https://proceedings.mlr.press/v180/karczmarz22a/karczmarz22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-karczmarz22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Adam
    family: Karczmarz
  - given: Tomasz
    family: Michalak
  - given: Anish
    family: Mukherjee
  - given: Piotr
    family: Sankowski
  - given: Piotr
    family: Wygocki
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 969-979
  id: karczmarz22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 969
  lastpage: 979
  published: 2022-08-17 00:00:00 +0000
- title: 'Dynamic relocation in ridesharing via fixpoint construction'
  abstract: 'To address spatial imbalances in the supply and demand of drivers, ridesharing platforms can make use of policies to direct driver relocation. We study a simple model of this problem, which allows us to give a constructive characterization of the unique fixpoint of system dynamics. Using this construction, we design a dynamic policy that provides stronger guarantees than previous work about its rate of convergence to the fixpoint. Simulations demonstrate the benefits of our approach.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/kash22a.html
  PDF: https://proceedings.mlr.press/v180/kash22a/kash22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-kash22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Ian A.
    family: Kash
  - given: Zhongkai
    family: Wen
  - given: Lenore D.
    family: Zuck
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 980-989
  id: kash22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 980
  lastpage: 989
  published: 2022-08-17 00:00:00 +0000
- title: 'Restless and uncertain: Robust policies for restless bandits via deep multi-agent reinforcement learning'
  abstract: 'We introduce robustness in \textit{restless multi-armed bandits} (RMABs), a popular model for constrained resource allocation among independent stochastic processes (arms). Nearly all RMAB techniques assume stochastic dynamics are precisely known. However, in many real-world settings, dynamics are estimated with significant uncertainty, e.g., via historical data, which can lead to bad outcomes if ignored. To address this, we develop an algorithm to compute minimax regret–robust policies for RMABs. Our approach uses a double oracle framework (oracles for \textit{agent} and \textit{nature}), which is often used for single-process robust planning but requires significant new techniques to accommodate the combinatorial nature of RMABs. Specifically, we design a deep reinforcement learning (RL) algorithm, DDLPO, which tackles the combinatorial challenge by learning an auxiliary “$\lambda$-network” in tandem with policy networks per arm, greatly reducing sample complexity, with guarantees on convergence. DDLPO, of general interest, implements our reward-maximizing agent oracle. We then tackle the challenging regret-maximizing nature oracle, a non-stationary RL challenge, by formulating it as a multi-agent RL problem between a policy optimizer and adversarial nature. This formulation is of general interest—we solve it for RMABs by creating a multi-agent extension of DDLPO with a shared critic. We show our approaches work well in three experimental domains.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/killian22a.html
  PDF: https://proceedings.mlr.press/v180/killian22a/killian22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-killian22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Jackson A.
    family: Killian
  - given: Lily
    family: Xu
  - given: Arpita
    family: Biswas
  - given: Milind
    family: Tambe
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 990-1000
  id: killian22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 990
  lastpage: 1000
  published: 2022-08-17 00:00:00 +0000
- title: 'Combinatorial Bayesian optimization with random mapping functions to convex polytopes'
  abstract: 'Bayesian optimization is a popular method for solving the problem of global optimization of an expensive-to-evaluate black-box function. It relies on a probabilistic surrogate model of the objective function, upon which an acquisition function is built to determine where next to evaluate the objective function. In general, Bayesian optimization with Gaussian process regression operates on a continuous space. When input variables are categorical or discrete, extra care is needed. A common approach is to use a one-hot encoded or Boolean representation for categorical variables, which might yield a combinatorial explosion problem. In this paper we present a method for Bayesian optimization in a combinatorial space, which can operate well in a large combinatorial space. The main idea is to use a random mapping which embeds the combinatorial space into a convex polytope in a continuous space, on which all essential processing is performed to determine a solution to the black-box optimization in the combinatorial space. We describe our combinatorial Bayesian optimization algorithm and present its regret analysis. Numerical experiments demonstrate that our method shows satisfactory performance compared to existing methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/kim22a.html
  PDF: https://proceedings.mlr.press/v180/kim22a/kim22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-kim22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Jungtaek
    family: Kim
  - given: Seungjin
    family: Choi
  - given: Minsu
    family: Cho
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1001-1011
  id: kim22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1001
  lastpage: 1011
  published: 2022-08-17 00:00:00 +0000
- title: 'On the effectiveness of adversarial training against common corruptions'
  abstract: 'The literature on robustness towards common corruptions shows no consensus on whether adversarial training can improve the performance in this setting. First, we show that, when used with an appropriately selected perturbation radius, Lp adversarial training can serve as a strong baseline against common corruptions improving both accuracy and calibration. Then we explain why adversarial training performs better than data augmentation with simple Gaussian noise which has been observed to be a meaningful baseline on common corruptions. Related to this, we identify the sigma-overfitting phenomenon when Gaussian augmentation overfits to a particular standard deviation used for training which has a significant detrimental effect on common corruption accuracy. We discuss how to alleviate this problem and then how to further enhance Lp adversarial training by introducing an efficient relaxation of adversarial training with learned perceptual image patch similarity as the distance metric. Through experiments on CIFAR-10 and ImageNet-100, we show that our approach does not only improve the Lp adversarial training baseline but also has cumulative gains with data augmentation methods such as AugMix, DeepAugment, ANT, and SIN, leading to state-of-the-art performance on common corruptions.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/kireev22a.html
  PDF: https://proceedings.mlr.press/v180/kireev22a/kireev22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-kireev22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Klim
    family: Kireev
  - given: Maksym
    family: Andriushchenko
  - given: Nicolas
    family: Flammarion
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1012-1021
  id: kireev22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1012
  lastpage: 1021
  published: 2022-08-17 00:00:00 +0000
- title: 'Revisiting the general identifiability problem'
  abstract: 'We revisit the problem of general identifiability originally introduced in [Lee et al., 2019] for causal inference and note that it is necessary to add a positivity assumption on the observational distribution to the original definition of the problem. We show that without such an assumption the rules of do-calculus and consequently the proposed algorithm in [Lee et al., 2019] are not sound. Moreover, adding the assumption will cause the completeness proof in [Lee et al., 2019] to fail. Under the positivity assumption, we present a new algorithm that is provably both sound and complete. A nice property of this new algorithm is that it establishes a connection between general identifiability and classical identifiability by Pearl [1995] through decomposing the general identifiability problem into a series of classical identifiability sub-problems.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/kivva22a.html
  PDF: https://proceedings.mlr.press/v180/kivva22a/kivva22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-kivva22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yaroslav
    family: Kivva
  - given: Ehsan
    family: Mokhtarian
  - given: Jalal
    family: Etesami
  - given: Negar
    family: Kiyavash
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1022-1030
  id: kivva22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1022
  lastpage: 1030
  published: 2022-08-17 00:00:00 +0000
- title: 'Hitting times for continuous-time imprecise-Markov chains'
  abstract: 'We study the problem of characterizing the expected hitting times for a robust generalization of continuous-time Markov chains. This generalization is based on the theory of imprecise probabilities, and the models with which we work essentially constitute sets of stochastic processes. Their inferences are tight lower and upper bounds with respect to variation within these sets. We consider three distinct types of these models, corresponding to different levels of generality and structural independence assumptions on the constituent processes. Our main results are twofold; first, we demonstrate that the hitting times for all three types are equivalent. Moreover, we show that these inferences are described by a straightforward generalization of a well-known linear system of equations that characterizes expected hitting times for traditional time-homogeneous continuous-time Markov chains.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/krak22a.html
  PDF: https://proceedings.mlr.press/v180/krak22a/krak22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-krak22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Thomas
    family: Krak
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1031-1040
  id: krak22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1031
  lastpage: 1040
  published: 2022-08-17 00:00:00 +0000
- title: 'Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift'
  abstract: 'We often see undesirable tradeoffs in robust machine learning where out-of-distribution (OOD) accuracy is at odds with in-distribution (ID) accuracy. A robust classifier obtained via specialized techniques such as removing spurious features often has better OOD but worse ID accuracy compared to a standard classifier trained via vanilla ERM. In this paper, we find that a simple approach of ensembling the standard and robust models, after calibrating on only ID data, outperforms prior state-of-the-art both ID and OOD. On ten natural distribution shift datasets, ID-calibrated ensembles get the best of both worlds: strong ID accuracy of the standard model and OOD accuracy of the robust model. We analyze this method in stylized settings, and identify two important conditions for ensembles to perform well on both ID and OOD: (1) standard and robust models should be calibrated (on ID data, because OOD data is unavailable), (2) OOD has no anticorrelated spurious features.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/kumar22a.html
  PDF: https://proceedings.mlr.press/v180/kumar22a/kumar22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-kumar22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Ananya
    family: Kumar
  - given: Tengyu
    family: Ma
  - given: Percy
    family: Liang
  - given: Aditi
    family: Raghunathan
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1041-1051
  id: kumar22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1041
  lastpage: 1051
  published: 2022-08-17 00:00:00 +0000
- title: 'Greedy relaxations of the sparsest permutation algorithm'
  abstract: 'There has been an increasing interest in methods that exploit permutation reasoning to search for directed acyclic causal models, including the “Ordering Search” of Teyssier and Kohler and GSP of Solus, Wang and Uhler. We extend the methods of the latter by a permutation-based operation tuck, and develop a class of algorithms, namely GRaSP, that are computationally efficient and pointwise consistent under increasingly weaker assumptions than faithfulness. The most relaxed form of GRaSP outperforms many state-of-the-art causal search algorithms in simulation, allowing efficient and accurate search even for dense graphs and graphs with more than 100 variables.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/lam22a.html
  PDF: https://proceedings.mlr.press/v180/lam22a/lam22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-lam22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Wai-Yin
    family: Lam
  - given: Bryan
    family: Andrews
  - given: Joseph
    family: Ramsey
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1052-1062
  id: lam22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1052
  lastpage: 1062
  published: 2022-08-17 00:00:00 +0000
- title: 'Interpolating between sampling and variational inference with infinite stochastic mixtures'
  abstract: 'Sampling and Variational Inference (VI) are two large families of methods for approximate inference that have complementary strengths. Sampling methods excel at approximating arbitrary probability distributions, but can be inefficient. VI methods are efficient, but may misrepresent the true distribution. Here, we develop a general framework where approximations are stochastic mixtures of simple component distributions. Both sampling and VI can be seen as special cases: in sampling, each mixture component is a delta-function and is chosen stochastically, while in standard VI a single component is chosen to minimize divergence. We derive a practical method that interpolates between sampling and VI by analytically solving an optimization problem over a mixing distribution. Intermediate inference methods then arise by varying a single parameter. Our method provably improves on sampling (reducing variance) and on VI (reducing bias+variance despite increasing variance). We demonstrate our method’s bias/variance trade-off in practice on reference problems, and we compare outcomes to commonly used sampling and VI methods. This work takes a step towards a highly flexible yet simple family of inference methods that combines the complementary strengths of sampling and VI.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/lange22a.html
  PDF: https://proceedings.mlr.press/v180/lange22a/lange22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-lange22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Richard D.
    family: Lange
  - given: Ari S.
    family: Benjamin
  - given: Ralf M.
    family: Haefner
  - given: Xaq
    family: Pitkow
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1063-1073
  id: lange22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1063
  lastpage: 1073
  published: 2022-08-17 00:00:00 +0000
- title: 'Systematized event-aware learning for multi-object tracking'
  abstract: 'We propose an end-to-end online multi-object tracking (MOT) framework with a systematized event-aware loss, which is designed to control possible occurrences in an online MOT situation and compel the tracker to take appropriate actions when such events occur. Training samples from real candidates using a simulation tracker are generated, and a systematized event-aware association matrix is constructed for every frame to enable the tracker to learn the ideal action in a running environment. Several experiments, including ablation studies on various public MOT benchmark datasets, are conducted. The experimental results verify that each event affecting the tracking measure can be controlled, and the proposed method presents optimal results compared with recent state-of-the-art MOT methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/lee22a.html
  PDF: https://proceedings.mlr.press/v180/lee22a/lee22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-lee22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Hyemin
    family: Lee
  - given: Daijin
    family: Kim
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1074-1084
  id: lee22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1074
  lastpage: 1084
  published: 2022-08-17 00:00:00 +0000
- title: 'Fixing the Bethe approximation: How structural modifications in a graph improve belief propagation'
  abstract: 'Belief propagation is an iterative method for inference in probabilistic graphical models. Its well-known relationship to a classical concept from statistical physics, the Bethe free energy, puts it on a solid theoretical foundation. If belief propagation fails to approximate the marginals, then this is often due to a failure of the Bethe approximation. In this work, we show how modifications in a graphical model can be a great remedy for fixing the Bethe approximation. Specifically, we analyze how the removal of edges influences and improves belief propagation, and demonstrate that this positive effect is particularly distinct for dense graphs.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/leisenberger22a.html
  PDF: https://proceedings.mlr.press/v180/leisenberger22a/leisenberger22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-leisenberger22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Harald
    family: Leisenberger
  - given: Franz
    family: Pernkopf
  - given: Christian
    family: Knoll
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1085-1095
  id: leisenberger22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1085
  lastpage: 1095
  published: 2022-08-17 00:00:00 +0000
- title: 'Recursive Monte Carlo and variational inference with auxiliary variables'
  abstract: 'A key design constraint when implementing Monte Carlo and variational inference algorithms is that it must be possible to cheaply and exactly evaluate the marginal densities of proposal distributions and variational families. This takes many interesting proposals off the table, such as those based on involved simulations or stochastic optimization. This paper broadens the design space, by presenting a framework for applying Monte Carlo and variational inference algorithms when proposal densities cannot be exactly evaluated. Our framework, recursive auxiliary-variable inference (RAVI), instead approximates the necessary densities using meta-inference: an additional layer of Monte Carlo or variational inference, that targets the proposal, rather than the model. RAVI generalizes and unifies several existing methods for inference with expressive approximating families, which we show correspond to specific choices of meta-inference algorithm, and provides new theory for analyzing their bias and variance. We illustrate RAVI’s design framework and theorems by using them to analyze and improve upon Salimans et al.’s Markov Chain Variational Inference, and to design a novel sampler for Dirichlet process mixtures, achieving state-of-the-art results on a standard benchmark dataset from astronomy and on a challenging data-cleaning task with Medicare hospital data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/lew22a.html
  PDF: https://proceedings.mlr.press/v180/lew22a/lew22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-lew22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Alexander K.
    family: Lew
  - given: Marco
    family: Cusumano-Towner
  - given: Vikash K.
    family: Mansinghka
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1096-1106
  id: lew22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1096
  lastpage: 1106
  published: 2022-08-17 00:00:00 +0000
- title: 'Solving structured hierarchical games using differential backward induction'
  abstract: 'From large-scale organizations to decentralized political systems, hierarchical strategic decision making is commonplace. We introduce a novel class of structured hierarchical games (SHGs) that formally capture such hierarchical strategic interactions. In an SHG, each player is a node in a tree, and strategic choices of players are sequenced from root to leaves, with root moving first, followed by its children, then followed by their children, and so on until the leaves. A player’s utility in an SHG depends on its own decision, and on the choices of its parent and all the tree leaves. SHGs thus generalize simultaneous-move games, as well as Stackelberg games with many followers. We leverage the structure of both the sequence of player moves as well as payoff dependence to develop a gradient-based backpropagation-style algorithm, which we call Differential Backward Induction (DBI), for approximating equilibria of SHGs. We provide a sufficient condition for convergence of DBI and demonstrate its efficacy in finding approximate equilibrium solutions to several SHG models of hierarchical policy-making problems.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/li22a.html
  PDF: https://proceedings.mlr.press/v180/li22a/li22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-li22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zun
    family: Li
  - given: Feiran
    family: Jia
  - given: Aditya
    family: Mate
  - given: Shahin
    family: Jabbari
  - given: Mithun
    family: Chakraborty
  - given: Milind
    family: Tambe
  - given: Yevgeniy
    family: Vorobeychik
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1107-1117
  id: li22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1107
  lastpage: 1117
  published: 2022-08-17 00:00:00 +0000
- title: 'PDQ-Net: Deep probabilistic dual quaternion network for absolute pose regression on $SE(3)$'
  abstract: 'Accurate absolute pose regression is one of the key challenges in robotics and computer vision. Existing direct regression methods suffer from two limitations. First, some noisy scenarios such as poor illumination conditions are likely to result in the uncertainty of pose estimation. Second, the output n-dimensional feature vector in the Euclidean space $\mathbb{R}^n$ cannot be well mapped to $SE(3)$ manifold. In this work, we propose a deep dual quaternion network that performs the absolute pose regression on $SE(3)$. We first develop an antipodally symmetric probability distribution over the unit dual quaternion on $SE(3)$ to model uncertainties and then propose an intermediary differential representation space to replace the final output pose, which avoids the mapping problem from $\mathbb{R}^n$ to $SE(3)$. In addition, we introduce a backpropagation method that considers the continuousness and differentiability of the proposed intermediary space. Extensive experiments on the camera re-localization task on the Cambridge Landmarks and 7-Scenes datasets demonstrate that our method greatly improves the accuracy of the pose as well as the robustness in dealing with uncertainty and ambiguity, compared to the state-of-the-art.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/li22b.html
  PDF: https://proceedings.mlr.press/v180/li22b/li22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-li22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Wenjie
    family: Li
  - given: Wasif
    family: Naeem
  - given: Jia
    family: Liu
  - given: Dequan
    family: Zheng
  - given: Wei
    family: Hao
  - given: Lijun
    family: Chen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1118-1127
  id: li22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1118
  lastpage: 1127
  published: 2022-08-17 00:00:00 +0000
- title: 'Accelerating training of batch normalization: A manifold perspective'
  abstract: 'Batch normalization (BN) has become a critical component across diverse deep neural networks. A network with BN is invariant to positive linear re-scaling transformations of its weights, so there exist infinitely many functionally equivalent networks with different scales of weights. However, optimizing these equivalent networks with a first-order method such as stochastic gradient descent yields a series of iterates converging to different local optima owing to their different gradients across training. To obviate this, we propose a quotient manifold, the PSI manifold, in which all the equivalent weights of the network with BN are regarded as the same element. Next, we construct gradient descent and stochastic gradient descent on the proposed PSI manifold to train the network with BN. The two algorithms guarantee that every group of equivalent weights (caused by positive re-scaling) converges to equivalent optima. Besides that, we give convergence rates of the proposed algorithms on the PSI manifold. The results show that our methods accelerate training compared with the algorithms on the Euclidean weight space. Finally, empirical results verify that our algorithms consistently improve on the existing methods in both convergence rate and generalization ability under various experimental settings.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yi22a.html
  PDF: https://proceedings.mlr.press/v180/yi22a/yi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Mingyang
    family: Yi
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1128-1137
  id: yi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1128
  lastpage: 1137
  published: 2022-08-17 00:00:00 +0000
- title: 'Deep Dirichlet process mixture models'
  abstract: 'In this paper we propose the deep Dirichlet process mixture (DDPM) model, which is an unsupervised method that simultaneously performs clustering and feature learning. The traditional Dirichlet process mixture model can infer the number of mixture components, but its flexibility is restricted since the clustering is performed in the raw feature space. Our method alleviates this limitation by using the flow-based deep neural network to learn more expressive features. DDPM unifies Dirichlet processes and the flow-based model with Monte Carlo expectation-maximization, and uses Gibbs sampling to sample from the posterior. This combination allows our method to exploit the mutually beneficial relation between clustering and feature learning. The effectiveness of DDPM is demonstrated by thorough experiments in various synthetic and real-world datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/li22c.html
  PDF: https://proceedings.mlr.press/v180/li22c/li22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-li22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Naiqi
    family: Li
  - given: Wenjie
    family: Li
  - given: Yong
    family: Jiang
  - given: Shu-Tao
    family: Xia
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1138-1147
  id: li22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1138
  lastpage: 1147
  published: 2022-08-17 00:00:00 +0000
- title: 'Proportional allocation of indivisible resources under ordinal and uncertain preferences'
  abstract: 'We study a fair resource allocation problem with indivisible items. The agents’ preferences over items are assumed to be ordinal and have uncertainties. We adopt stochastic dominance proportionality as our fairness notion and study a sequence of problems related to finding allocations that are fair with a high probability. We provide complexity analysis for each problem and efficient algorithms for some problems. Finally, we propose several heuristic algorithms to find an allocation that is fair with the highest probability. We thoroughly evaluate the performance of the algorithms on both synthetic and real datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/li22d.html
  PDF: https://proceedings.mlr.press/v180/li22d/li22d.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-li22d.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zihao
    family: Li
  - given: Xiaohui
    family: Bei
  - given: Zhenzhen
    family: Yan
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1148-1157
  id: li22d
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1148
  lastpage: 1157
  published: 2022-08-17 00:00:00 +0000
- title: 'Efficient resource allocation with fairness constraints in restless multi-armed bandits'
  abstract: 'Restless Multi-Armed Bandits (RMAB) is an apt model to represent decision-making problems in public health interventions (e.g., tuberculosis, maternal, and child care), anti-poaching planning, sensor monitoring, personalized recommendations and many more. Existing research in RMAB has contributed mechanisms and theoretical results to a wide variety of settings, where the focus is on maximizing expected value. In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value. In the context of public health settings, this would ensure that different people and/or communities are fairly represented while making public health intervention decisions. To achieve this goal, we formally define the fairness constraints in RMAB and provide planning and learning methods to solve RMAB in a fair manner. We demonstrate key theoretical properties of fair RMAB and experimentally demonstrate that our proposed methods handle fairness constraints without sacrificing significantly on solution quality.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/li22e.html
  PDF: https://proceedings.mlr.press/v180/li22e/li22e.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-li22e.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Dexun
    family: Li
  - given: Pradeep
    family: Varakantham
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1158-1167
  id: li22e
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1158
  lastpage: 1167
  published: 2022-08-17 00:00:00 +0000
- title: 'A label efficient two-sample test'
  abstract: 'Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are easily measured whereas sample labels are unknown and costly to obtain. Accordingly, we devise a three-stage framework in service of performing an effective two-sample test with only a small number of sample label queries: first, a classifier is trained with samples uniformly labeled to model the posterior probabilities of the labels; second, a novel query scheme dubbed bimodal query is used to query labels of samples from both classes, and last, the classical Friedman-Rafsky (FR) two-sample test is performed on the queried samples. Theoretical analysis and extensive experiments performed on several datasets demonstrate that the proposed test controls the Type I error and has decreased Type II error relative to uniform querying and certainty-based querying. Source code for our algorithms and experimental results is available at https://github.com/wayne0908/Label-Efficient-Two-Sample.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/li22f.html
  PDF: https://proceedings.mlr.press/v180/li22f/li22f.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-li22f.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Weizhi
    family: Li
  - given: Gautam
    family: Dasarathy
  - given: Karthikeyan Natesan
    family: Ramamurthy
  - given: Visar
    family: Berisha
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1168-1177
  id: li22f
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1168
  lastpage: 1177
  published: 2022-08-17 00:00:00 +0000
- title: '$\ell_{\infty}$-Bounds of the MLE in the BTL Model under General Comparison Graphs'
  abstract: 'The Bradley-Terry-Luce (BTL) model is a popular statistical approach for estimating the global ranking of a collection of items using pairwise comparisons. To ensure accurate ranking, it is essential to obtain precise estimates of the model parameters in the $\ell_{\infty}$-loss. The difficulty of this task depends crucially on the topology of the pairwise comparison graph over the given items. However, beyond very few well-studied cases, such as the complete and Erdős-Rényi comparison graphs, little is known about the performance of the maximum likelihood estimator (MLE) of the BTL model parameters in the $\ell_{\infty}$-loss under more general graph topologies. In this paper, we derive novel, general upper bounds on the $\ell_{\infty}$ estimation error of the BTL MLE that depend explicitly on the algebraic connectivity of the comparison graph, the maximal performance gap across items and the sample complexity. We demonstrate that the derived bounds perform well and in some cases are sharper compared to known results obtained using different loss functions and more restricted assumptions and graph topologies. We carefully compare our results to Yan et al. (2012), which is closest in spirit to our work. We further provide minimax lower bounds under $\ell_{\infty}$-error that nearly match the upper bounds over a class of sufficiently regular graph topologies. Finally, we study the implications of our $\ell_{\infty}$-bounds for efficient (offline) tournament design. We illustrate and discuss our findings through various examples and simulations.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/li22g.html
  PDF: https://proceedings.mlr.press/v180/li22g/li22g.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-li22g.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Wanshan
    family: Li
  - given: Shamindra
    family: Shrotriya
  - given: Alessandro
    family: Rinaldo
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1178-1187
  id: li22g
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1178
  lastpage: 1187
  published: 2022-08-17 00:00:00 +0000
- title: 'AdaCat: Adaptive categorical discretization for autoregressive models'
  abstract: 'Autoregressive generative models can estimate complex continuous data distributions, like trajectory rollouts in an RL environment, image intensities, and audio. Most state-of-the-art models discretize continuous data into several bins and use categorical distributions over the bins to approximate the continuous data distribution. The advantage is that categorical distributions can easily express multiple modes and are straightforward to optimize. However, such an approximation cannot express sharp changes in density without using significantly more bins, which makes it parameter inefficient. We propose an efficient, expressive, multimodal parameterization called Adaptive Categorical Discretization (AdaCat). AdaCat discretizes each dimension of an autoregressive model adaptively, which allows the model to allocate density to fine intervals of interest, improving parameter efficiency. AdaCat generalizes both categoricals and quantile-based regression. AdaCat is a simple add-on to any discretization-based distribution estimator. In experiments, AdaCat improves density estimation for real-world tabular data, images, audio, and trajectories, and improves planning in model-based offline RL.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/li22h.html
  PDF: https://proceedings.mlr.press/v180/li22h/li22h.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-li22h.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Qiyang
    family: Li
  - given: Ajay
    family: Jain
  - given: Pieter
    family: Abbeel
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1188-1198
  id: li22h
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1188
  lastpage: 1198
  published: 2022-08-17 00:00:00 +0000
- title: 'Laplace approximated Gaussian process state-space models'
  abstract: 'Gaussian process state-space models describe time series data in a probabilistic and non-parametric manner using a Gaussian process transition function. As inference is intractable, recent methods use variational inference and either rely on simplifying independence assumptions on the approximate posterior or learn the temporal states iteratively. The latter hampers optimization since the posterior over the present can only be learned once the posterior governing the past has converged. We present a novel inference scheme that applies stochastic variational inference for the Gaussian process posterior and the Laplace approximation on the temporal states. This approach respects the conditional dependencies in the model and, through the Laplace approximation, treats the temporal states jointly, thereby avoiding their sequential learning. Our method is computationally efficient and leads to better calibrated predictions compared to state-of-the-art alternatives on synthetic data and on a range of benchmark datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/lindinger22a.html
  PDF: https://proceedings.mlr.press/v180/lindinger22a/lindinger22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-lindinger22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Jakob
    family: Lindinger
  - given: Barbara
    family: Rakitsch
  - given: Christoph
    family: Lippert
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1199-1209
  id: lindinger22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1199
  lastpage: 1209
  published: 2022-08-17 00:00:00 +0000
- title: 'Dimension reduction for high-dimensional small counts with KL divergence'
  abstract: 'Dimension reduction for high-dimensional count data with a large proportion of zeros is an important task in various applications. As a large number of dimension reduction methods rely on a proximity measure, we develop a dissimilarity measure that is well-suited for small counts based on the Kullback-Leibler divergence. We compare the proposed measure with other widely used dissimilarity measures and show that the proposed one has superior discriminative ability when applied to high-dimensional count data having an excess of zeros. Extensive empirical results, on both simulated and publicly available real-world datasets that contain many zeros, demonstrate that the proposed dissimilarity measure can improve a wide range of dimension reduction methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ling22a.html
  PDF: https://proceedings.mlr.press/v180/ling22a/ling22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ling22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yurong
    family: Ling
  - given: Jing-Hao
    family: Xue
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1210-1220
  id: ling22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1210
  lastpage: 1220
  published: 2022-08-17 00:00:00 +0000
- title: 'Federated online clustering of bandits'
  abstract: 'Contextual multi-armed bandit (MAB) is an important sequential decision-making problem in recommendation systems. A line of work, called the clustering of bandits (CLUB), utilizes the collaborative effect over users and dramatically improves the recommendation quality. Owing to the increasing application scale and public concerns about privacy, there is a growing demand to keep user data decentralized and push bandit learning to the local server side. Existing CLUB algorithms, however, are designed under the centralized setting where data are available at a central server. We focus on studying the federated online clustering of bandits (FCLUB) problem, which aims to minimize the total regret while satisfying privacy and communication considerations. We design a new phase-based scheme for cluster detection and a novel asynchronous communication protocol for cooperative bandit learning for this problem. Because previous differential privacy (DP) definitions are not well suited to protecting users’ privacy in this setting, we propose a new DP notion that acts on the user cluster level. We provide rigorous proofs to show that our algorithm simultaneously achieves (clustered) DP, sublinear communication complexity and sublinear regret. Finally, experimental evaluations show that our algorithm outperforms benchmark algorithms.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/liu22a.html
  PDF: https://proceedings.mlr.press/v180/liu22a/liu22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-liu22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Xutong
    family: Liu
  - given: Haoru
    family: Zhao
  - given: Tong
    family: Yu
  - given: Shuai
    family: Li
  - given: John C.S.
    family: Lui
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1221-1231
  id: liu22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1221
  lastpage: 1231
  published: 2022-08-17 00:00:00 +0000
- title: 'PathFlow: A normalizing flow generator that finds transition paths'
  abstract: 'Sampling from a Boltzmann distribution to calculate important macro statistics is one of the central tasks in the study of large atomic and molecular systems. Recently, a one-shot configuration sampler, the Boltzmann generator [Noé et al., 2019], was introduced. Though a Boltzmann generator can directly generate independent metastable states, it lacks the ability to find transition pathways and describe the whole transition process. In this paper, we propose PathFlow, which can function as a one-shot generator as well as a transition pathfinder. More specifically, a normalizing flow model is constructed to map the base distribution and a linearly interpolated path in the latent space to the Boltzmann distribution and a minimum (free) energy path in the configuration space simultaneously. PathFlow can be trained by standard gradient-based optimizers using the proposed gradient estimator with a theoretical guarantee. PathFlow, validated on extensively studied examples including a synthetic Müller potential and alanine dipeptide, shows remarkable performance.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/liu22b.html
  PDF: https://proceedings.mlr.press/v180/liu22b/liu22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-liu22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tianyi
    family: Liu
  - given: Weihao
    family: Gao
  - given: Zhirui
    family: Wang
  - given: Chong
    family: Wang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1232-1242
  id: liu22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1232
  lastpage: 1242
  published: 2022-08-17 00:00:00 +0000
- title: 'SASH: Efficient secure aggregation based on SHPRG for federated learning'
  abstract: 'To prevent private training data leakage in Federated Learning systems, we propose a novel secure aggregation scheme based on a seed homomorphic pseudo-random generator (SHPRG), named SASH. SASH leverages the homomorphic property of SHPRG to simplify the masking and demasking scheme, which, for each client and for the server, entails an overhead linear w.r.t. model size and constant w.r.t. the number of clients. We prove that even against worst-case colluding adversaries, SASH preserves training data privacy, while being resilient to dropouts without extra overhead. We experimentally demonstrate that SASH significantly improves efficiency, by up to 20× over the baseline, especially in the more realistic case where the number of clients and the model size become large, and a certain percentage of clients drop out from the system.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/liu22c.html
  PDF: https://proceedings.mlr.press/v180/liu22c/liu22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-liu22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zizhen
    family: Liu
  - given: Si
    family: Chen
  - given: Jing
    family: Ye
  - given: Junfeng
    family: Fan
  - given: Huawei
    family: Li
  - given: Xiaowei
    family: Li
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1243-1252
  id: liu22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1243
  lastpage: 1252
  published: 2022-08-17 00:00:00 +0000
- title: 'Offline policy optimization with eligible actions'
  abstract: 'Offline policy optimization could have a large impact on many real-world decision-making problems, as online learning may be infeasible in many applications. Importance sampling and its variants are a commonly used type of estimator in offline policy evaluation, and such estimators typically do not require assumptions on the properties and representational capabilities of value function or decision process model function classes. In this paper, we identify an important overfitting phenomenon in optimizing the importance weighted return, in which it may be possible for the learned policy to essentially avoid making aligned decisions for part of the initial state space. We propose an algorithm to avoid this overfitting through a new per-state-neighborhood normalization constraint, and provide a theoretical justification of the proposed algorithm. We also show the limitations of previous attempts at this approach. We test our algorithm in a healthcare-inspired simulator, a logged dataset collected from real hospitals and continuous control tasks. These experiments show the proposed method yields less overfitting and better test performance compared to state-of-the-art batch reinforcement learning algorithms.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/liu22d.html
  PDF: https://proceedings.mlr.press/v180/liu22d/liu22d.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-liu22d.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yao
    family: Liu
  - given: Yannis
    family: Flet-Berliac
  - given: Emma
    family: Brunskill
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1253-1263
  id: liu22d
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1253
  lastpage: 1263
  published: 2022-08-17 00:00:00 +0000
- title: 'Data poisoning attacks on off-policy policy evaluation methods'
  abstract: 'Off-policy Evaluation (OPE) methods are a crucial tool for evaluating policies in high-stakes domains such as healthcare, where exploration is often infeasible, unethical, or expensive. However, the extent to which such methods can be trusted under adversarial threats to data quality is largely unexplored. In this work, we make the first attempt at investigating the sensitivity of OPE methods to marginal adversarial perturbations to the data. We design a generic data poisoning attack framework leveraging influence functions from robust statistics to carefully construct perturbations that maximize error in the policy value estimates. We carry out extensive experimentation with multiple healthcare and control datasets. Our results demonstrate that many existing OPE methods are highly prone to generating value estimates with large errors when subject to data poisoning attacks, even for small adversarial perturbations. These findings question the reliability of policy values derived using OPE methods and motivate the need for developing OPE methods that are statistically robust to train-time data poisoning attacks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/lobo22a.html
  PDF: https://proceedings.mlr.press/v180/lobo22a/lobo22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-lobo22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Elita
    family: Lobo
  - given: Harvineet
    family: Singh
  - given: Marek
    family: Petrik
  - given: Cynthia
    family: Rudin
  - given: Himabindu
    family: Lakkaraju
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1264-1274
  id: lobo22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1264
  lastpage: 1274
  published: 2022-08-17 00:00:00 +0000
- title: 'Nonparametric exponential family graph embeddings for multiple representation learning'
  abstract: 'In graph data, each node often serves multiple functionalities. However, most graph embedding models assume that each node can only possess one representation. We address this issue by proposing a nonparametric graph embedding model. The model allows each node to learn multiple representations where they are needed to represent the complexity of random walks in the graph. It extends the Exponential family graph embedding model with two nonparametric prior settings, the Dirichlet process and the uniform process. The model combines the ability of Exponential family graph embedding to take the number of occurrences of context nodes into account with nonparametric priors, giving it the flexibility to learn more than one latent representation for each node. The learned embeddings outperform other state-of-the-art approaches in link prediction and node classification tasks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/lu22a.html
  PDF: https://proceedings.mlr.press/v180/lu22a/lu22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-lu22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Chien
    family: Lu
  - given: Jaakko
    family: Peltonen
  - given: Timo
    family: Nummenmaa
  - given: Jyrki
    family: Nummenmaa
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1275-1285
  id: lu22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1275
  lastpage: 1285
  published: 2022-08-17 00:00:00 +0000
- title: 'Local calibration: metrics and recalibration'
  abstract: 'Probabilistic classifiers output confidence scores along with their predictions, and these confidence scores should be calibrated, i.e., they should reflect the reliability of the prediction. Confidence scores that minimize standard metrics such as the expected calibration error (ECE) accurately measure the reliability on average across the entire population. However, it is in general impossible to measure the reliability of an individual prediction. In this work, we propose the local calibration error (LCE) to span the gap between average and individual reliability. For each individual prediction, the LCE measures the average reliability of a set of similar predictions, where similarity is quantified by a kernel function on a pretrained feature space and by a binning scheme over predicted model confidences. We show theoretically that the LCE can be estimated sample-efficiently from data, and empirically find that it reveals miscalibration modes that are more fine-grained than the ECE can detect. Our key result is a novel local recalibration method, LoRe, to improve confidence scores for individual predictions and decrease the LCE. Experimentally, we show that our recalibration method produces more accurate confidence scores, which improves downstream fairness and decision making on classification tasks with both image and tabular data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/luo22a.html
  PDF: https://proceedings.mlr.press/v180/luo22a/luo22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-luo22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Rachel
    family: Luo
  - given: Aadyot
    family: Bhatnagar
  - given: Yu
    family: Bai
  - given: Shengjia
    family: Zhao
  - given: Huan
    family: Wang
  - given: Caiming
    family: Xiong
  - given: Silvio
    family: Savarese
  - given: Stefano
    family: Ermon
  - given: Edward
    family: Schmerling
  - given: Marco
    family: Pavone
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1286-1295
  id: luo22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1286
  lastpage: 1295
  published: 2022-08-17 00:00:00 +0000
- title: 'Data sampling affects the complexity of online SGD over dependent data'
  abstract: 'Conventional machine learning applications typically assume that data samples are independently and identically distributed (i.i.d.). However, practical scenarios often involve a data-generating process that produces highly dependent data samples, which are known to heavily bias the stochastic optimization process and slow down the convergence of learning. In this paper, we conduct a fundamental study on how different stochastic data sampling schemes affect the sample complexity of online stochastic gradient descent (SGD) over highly dependent data. Specifically, with a $\phi$-mixing process of data, we show that online SGD with proper periodic data-subsampling achieves an improved sample complexity over the standard online SGD in the full spectrum of the data dependence level. Interestingly, even subsampling a subset of data samples can accelerate the convergence of online SGD over highly dependent data. Moreover, we show that online SGD with mini-batch sampling can further substantially improve the sample complexity over online SGD with periodic data-subsampling over highly dependent data. Numerical experiments validate our theoretical results.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ma22a.html
  PDF: https://proceedings.mlr.press/v180/ma22a/ma22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ma22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Shaocong
    family: Ma
  - given: Ziyi
    family: Chen
  - given: Yi
    family: Zhou
  - given: Kaiyi
    family: Ji
  - given: Yingbin
    family: Liang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1296-1305
  id: ma22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1296
  lastpage: 1305
  published: 2022-08-17 00:00:00 +0000
- title: 'Low-precision arithmetic for fast Gaussian processes'
  abstract: 'Low-precision arithmetic has had a transformative effect on the training of neural networks, reducing computation, memory and energy requirements. However, despite their promise, low-precision operations have received little attention for Gaussian process (GP) training, largely because GPs require sophisticated linear algebra routines that are unstable in low precision. We study the different failure modes that can occur when training GPs in half-precision. To circumvent these failure modes, we propose a multi-faceted approach involving conjugate gradients with re-orthogonalization, mixed precision, compact kernels, and preconditioners. Our approach significantly improves the numerical stability and practical performance of conjugate gradients in low precision over a wide range of settings, and reduces the training time on 1.8 million data points to 10 hours on a single GPU, without requiring any sparse approximations.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/maddox22a.html
  PDF: https://proceedings.mlr.press/v180/maddox22a/maddox22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-maddox22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Wesley J.
    family: Maddox
  - given: Andres
    family: Potapcynski
  - given: Andrew Gordon
    family: Wilson
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1306-1316
  id: maddox22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1306
  lastpage: 1316
  published: 2022-08-17 00:00:00 +0000
- title: 'Perturbation type categorization for multiple adversarial perturbation robustness'
  abstract: 'Recent works in adversarial robustness have proposed defenses to improve the robustness of a single model against the union of multiple perturbation types. However, these methods still suffer significant trade-offs compared to the ones specifically trained to be robust against a single perturbation type. In this work, we introduce the problem of categorizing adversarial examples based on their perturbation types. We first theoretically show on a toy task that adversarial examples of different perturbation types constitute different distributions, making it possible to distinguish them. We support these arguments with experimental validation on multiple $\ell_p$ attacks and common corruptions. Instead of training a single classifier, we propose PROTECTOR, a two-stage pipeline that first categorizes the perturbation type of the input, and then makes the final prediction using the classifier specifically trained against the predicted perturbation type. We theoretically show that at test time the adversary faces a natural trade-off between fooling the perturbation classifier and the succeeding classifier optimized with perturbation-specific adversarial training. This makes it challenging for an adversary to plant strong attacks against the whole pipeline. Experiments on MNIST and CIFAR-10 show that PROTECTOR outperforms prior adversarial training-based defenses by over 5% when tested against the union of $\ell_1$, $\ell_2$, and $\ell_\infty$ attacks. Additionally, our method extends to a more diverse attack suite, also showing large robustness gains against multiple $\ell_p$, spatial and recolor attacks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/maini22a.html
  PDF: https://proceedings.mlr.press/v180/maini22a/maini22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-maini22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Pratyush
    family: Maini
  - given: Xinyun
    family: Chen
  - given: Bo
    family: Li
  - given: Dawn
    family: Song
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1317-1327
  id: maini22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1317
  lastpage: 1327
  published: 2022-08-17 00:00:00 +0000
- title: 'A causal bandit approach to learning good atomic interventions in presence of unobserved confounders'
  abstract: 'We study the problem of determining the best atomic intervention in a Causal Bayesian Network (CBN) specified only by its causal graph. We model this as a stochastic multi-armed bandit (MAB) problem with side-information, where interventions on the CBN correspond to arms of the bandit instance. First, we propose a simple regret minimization algorithm that takes as input a causal graph with observable and unobservable nodes and in $T$ exploration rounds achieves $\tilde{O}(\sqrt{m(\mathcal{C})/T})$ expected simple regret. Here $m(\mathcal{C})$ is a parameter dependent on the input CBN $\mathcal{C}$ and could be much smaller than the number of arms. We also show that this is almost optimal for CBNs whose causal graphs have an $n$-ary tree structure. Next, we propose a cumulative regret minimization algorithm that takes as input a causal graph with observable nodes and performs better than the optimal MAB algorithms that do not use causal side-information. We experimentally compare both our algorithms with the best known algorithms in the literature.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/maiti22a.html
  PDF: https://proceedings.mlr.press/v180/maiti22a/maiti22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-maiti22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Aurghya
    family: Maiti
  - given: Vineet
    family: Nair
  - given: Gaurav
    family: Sinha
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1328-1338
  id: maiti22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1328
  lastpage: 1338
  published: 2022-08-17 00:00:00 +0000
- title: 'Case-based off-policy evaluation using prototype learning'
  abstract: 'Importance sampling (IS) is often used to perform off-policy evaluation but it is prone to several issues—especially when the behavior policy is unknown and must be estimated from data. Significant differences between target and behavior policies can result in uncertain value estimates due to, for example, high variance. Standard practices such as inspecting IS weights may be insufficient to diagnose such problems and determine for which type of inputs the policies differ in suggested actions and resulting values. To address this, we propose estimating the behavior policy for IS using prototype learning. The learned prototypes provide a condensed summary of the input-action space, which allows for describing differences between policies and assessing the support for evaluating a certain target policy. In addition, we can describe a value estimate in terms of prototypes to understand which parts of the target policy have the most impact on the estimate. We find that this provides new insights in the examination of a learned policy for sepsis management. Moreover, we study the bias resulting from restricting models to use prototypes, how bias propagates to IS weights and estimated values and how this varies with history length.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/matsson22a.html
  PDF: https://proceedings.mlr.press/v180/matsson22a/matsson22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-matsson22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Anton
    family: Matsson
  - given: Fredrik D.
    family: Johansson
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1339-1349
  id: matsson22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1339
  lastpage: 1349
  published: 2022-08-17 00:00:00 +0000
- title: 'Multistate analysis with infinite mixtures of Markov chains'
  abstract: 'Driven by applications in clinical medicine and business, we address the problem of modeling trajectories over multiple states. We build on well-known methods from survival analysis and introduce a family of sequence models based on localized Bayesian Markov chains. We develop inference and prediction algorithms, and we apply the model to real-world data, demonstrating favorable empirical results. Our approach provides a practical and effective alternative to plain Markov chains and to existing (finite) mixture models; it retains the simplicity and computational benefits of the former while matching or exceeding the predictive performance of the latter.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/maystre22a.html
  PDF: https://proceedings.mlr.press/v180/maystre22a/maystre22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-maystre22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Lucas
    family: Maystre
  - given: Tiffany
    family: Wu
  - given: Roberto
    family: Sanchis-Ojeda
  - given: Tony
    family: Jebara
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1350-1359
  id: maystre22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1350
  lastpage: 1359
  published: 2022-08-17 00:00:00 +0000
- title: 'Forget-me-not! Contrastive critics for mitigating posterior collapse'
  abstract: 'Variational autoencoders (VAEs) suffer from posterior collapse, where the powerful neural networks used for modeling and inference optimize the objective without meaningfully using the latent representation. We introduce inference critics that detect and incentivize against posterior collapse by requiring correspondence between latent variables and the observations. By connecting the critic’s objective to the literature in self-supervised contrastive representation learning, we show both theoretically and empirically that optimizing inference critics increases the mutual information between observations and latents, mitigating posterior collapse. This approach is straightforward to implement and requires significantly less training time than prior methods, yet obtains competitive results on three established datasets. Overall, the approach lays the foundation to bridge the previously disconnected frameworks of contrastive learning and probabilistic modeling with variational autoencoders, underscoring the benefits both communities may find at their intersection.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/menon22a.html
  PDF: https://proceedings.mlr.press/v180/menon22a/menon22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-menon22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sachit
    family: Menon
  - given: David
    family: Blei
  - given: Carl
    family: Vondrick
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1360-1370
  id: menon22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1360
  lastpage: 1370
  published: 2022-08-17 00:00:00 +0000
- title: 'Can mean field control (MFC) approximate cooperative multi agent reinforcement learning (MARL) with non-uniform interaction?'
  abstract: 'Mean-Field Control (MFC) is a powerful tool to solve Multi-Agent Reinforcement Learning (MARL) problems. Recent studies have shown that MFC can approximate MARL well when the population size is large and the agents are exchangeable. Unfortunately, the presumption of exchangeability implies that all agents uniformly interact with one another, which is not true in many practical scenarios. In this article, we relax the assumption of exchangeability and model the interaction between agents via an arbitrary doubly stochastic matrix. As a result, in our framework, the mean-fields ‘seen’ by different agents are different. We prove that, if the reward of each agent is an affine function of the mean-field seen by that agent, then one can approximate such a non-uniform MARL problem via its associated MFC problem within an error of $e=\mathcal{O}(\frac{1}{\sqrt{N}}[\sqrt{|\mathcal{X}|} + \sqrt{|\mathcal{U}|}])$ where $N$ is the population size and $|\mathcal{X}|$, $|\mathcal{U}|$ are the sizes of state and action spaces respectively. Finally, we develop a Natural Policy Gradient (NPG) algorithm that can provide a solution to the non-uniform MARL with an error $\mathcal{O}(\max\{e,\epsilon\})$ and a sample complexity of $\mathcal{O}(\epsilon^{-3})$ for any $\epsilon >0$.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/mondal22a.html
  PDF: https://proceedings.mlr.press/v180/mondal22a/mondal22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-mondal22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Washim Uddin
    family: Mondal
  - given: Vaneet
    family: Aggarwal
  - given: Satish V.
    family: Ukkusuri
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1371-1380
  id: mondal22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1371
  lastpage: 1380
  published: 2022-08-17 00:00:00 +0000
- title: 'Monotonicity regularization: Improved penalties and novel applications to disentangled representation learning and robust classification'
  abstract: 'We study settings where gradient penalties are used alongside risk minimization with the goal of obtaining predictors satisfying different notions of monotonicity. Specifically, we present two sets of contributions. In the first part of the paper, we show that different choices of penalties define the regions of the input space where the property is observed. As such, previous methods result in models that are monotonic only in a small volume of the input space. We thus propose an approach that uses mixtures of training instances and random points to populate the space and enforce the penalty in a much larger region. As a second set of contributions, we introduce regularization strategies that enforce other notions of monotonicity in different settings. In this case, we consider applications, such as image classification and generative modeling, where monotonicity is not a hard constraint but can help improve some aspects of the model. Namely, we show that inducing monotonicity can be beneficial in applications such as: (1) allowing for controllable data generation, (2) defining strategies to detect anomalous data, and (3) generating explanations for predictions. Our proposed approaches do not introduce relevant computational overhead while leading to efficient procedures that provide extra benefits over baseline models.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/monteiro22a.html
  PDF: https://proceedings.mlr.press/v180/monteiro22a/monteiro22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-monteiro22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: João
    family: Monteiro
  - given: Mohamed Osama
    family: Ahmed
  - given: Hossein
    family: Hajimirsadeghi
  - given: Greg
    family: Mori
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1381-1391
  id: monteiro22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1381
  lastpage: 1391
  published: 2022-08-17 00:00:00 +0000
- title: 'Set-valued prediction in hierarchical classification with constrained representation complexity'
  abstract: 'Set-valued prediction is a well-known concept in multi-class classification. When a classifier is uncertain about the class label for a test instance, it can predict a set of classes instead of a single class. In this paper, we focus on hierarchical multi-class classification problems, where valid sets (typically) correspond to internal nodes of the hierarchy. We argue that this is a very strong restriction, and we propose a relaxation by introducing the notion of representation complexity for a predicted set. In combination with probabilistic classifiers, this leads to a challenging inference problem for which specific combinatorial optimization algorithms are needed. We propose three methods and evaluate them on benchmark datasets: a naïve approach that is based on matrix-vector multiplication, a reformulation as a knapsack problem with conflict graph, and a recursive tree search method. Experimental results demonstrate that the last method is computationally more efficient than the other two approaches, due to a hierarchical factorization of the conditional class distribution.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/mortier22a.html
  PDF: https://proceedings.mlr.press/v180/mortier22a/mortier22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-mortier22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Thomas
    family: Mortier
  - given: Eyke
    family: Hüllermeier
  - given: Krzysztof
    family: Dembczyński
  - given: Willem
    family: Waegeman
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1392-1401
  id: mortier22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1392
  lastpage: 1401
  published: 2022-08-17 00:00:00 +0000
- title: 'Safety aware changepoint detection for piecewise i.i.d. bandits'
  abstract: 'In this paper, we consider the setting of piecewise i.i.d. bandits under a safety constraint. In this piecewise i.i.d. setting, there exists a finite number of changepoints where the means of some or all arms change simultaneously. We introduce the safety constraint studied in Wu et al. (2016) to this setting such that at any round the cumulative reward is above a constant factor of the default action reward. We propose two actively adaptive algorithms for this setting that satisfy the safety constraint, detect changepoints, and restart without the knowledge of the number of changepoints or their locations. We provide regret bounds for our algorithms and show that the bounds are comparable to their counterparts from the safe bandit and piecewise i.i.d. bandit literature. We also provide the first matching lower bounds for this setting. Empirically, we show that our safety-aware algorithms match the performance of the state-of-the-art actively adaptive algorithms that do not satisfy the safety constraint.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/mukherjee22a.html
  PDF: https://proceedings.mlr.press/v180/mukherjee22a/mukherjee22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-mukherjee22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Subhojyoti
    family: Mukherjee
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1402-1412
  id: mukherjee22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1402
  lastpage: 1412
  published: 2022-08-17 00:00:00 +0000
- title: 'ReVar: Strengthening policy evaluation via reduced variance sampling'
  abstract: 'This paper studies the problem of data collection for policy evaluation in Markov decision processes (MDPs). In policy evaluation, we are given a target policy and asked to estimate the expected cumulative reward it will obtain in an environment formalized as an MDP. We develop theory for optimal data collection within the class of tree-structured MDPs by first deriving an oracle exploration strategy that uses knowledge of the variance of the reward distributions. We then introduce the Reduced Variance Sampling (ReVar) algorithm that approximates the oracle strategy when the reward variances are unknown a priori and bound its sub-optimality compared to the oracle strategy. Finally, we empirically validate that ReVar leads to policy evaluation with mean squared error comparable to the oracle strategy and significantly lower than simply running the target policy.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/mukherjee22b.html
  PDF: https://proceedings.mlr.press/v180/mukherjee22b/mukherjee22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-mukherjee22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Subhojyoti
    family: Mukherjee
  - given: Josiah P.
    family: Hanna
  - given: Robert D
    family: Nowak
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1413-1422
  id: mukherjee22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1413
  lastpage: 1422
  published: 2022-08-17 00:00:00 +0000
- title: 'Probabilistic surrogate networks for simulators with unbounded randomness'
  abstract: 'We present a framework for automatically structuring and training fast, approximate, deep neural surrogates of stochastic simulators. Unlike traditional approaches to surrogate modeling, our surrogates retain the interpretable structure and control flow of the reference simulator. Our surrogates target stochastic simulators where the number of random variables itself can be stochastic and potentially unbounded. Our framework further enables an automatic replacement of the reference simulator with the surrogate when undertaking amortized inference. The fidelity and speed of our surrogates allow for both faster stochastic simulation and accurate and substantially faster posterior inference. Using an illustrative yet non-trivial example we show our surrogates’ ability to accurately model a probabilistic program with an unbounded number of random variables. We then proceed with an example that shows our surrogates are able to accurately model a complex structure like an unbounded stack in a program synthesis example. We further demonstrate how our surrogate modeling technique makes amortized inference in complex black-box simulators an order of magnitude faster. Specifically, we do simulator-based materials quality testing, inferring safety-critical latent internal temperature profiles of composite materials undergoing curing.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/munk22a.html
  PDF: https://proceedings.mlr.press/v180/munk22a/munk22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-munk22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Andreas
    family: Munk
  - given: Berend
    family: Zwartsenberg
  - given: Adam
    family: Ścibior
  - given: Atılım Güneş G.
    family: Baydin
  - given: Andrew
    family: Stewart
  - given: Goran
    family: Fernlund
  - given: Anoush
    family: Poursartip
  - given: Frank
    family: Wood
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1423-1433
  id: munk22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1423
  lastpage: 1433
  published: 2022-08-17 00:00:00 +0000
- title: 'Data augmentation in Bayesian neural networks and the cold posterior effect'
  abstract: 'Bayesian neural networks that incorporate data augmentation implicitly use a “randomly perturbed log-likelihood [which] does not have a clean interpretation as a valid likelihood function” (Izmailov et al. 2021). Here, we provide several approaches to developing principled Bayesian neural networks incorporating data augmentation. We introduce a “finite orbit” setting which allows valid likelihoods to be computed exactly, and for the more usual “full orbit” setting we derive multi-sample bounds tighter than those used previously. These models cast light on the origin of the cold posterior effect. In particular, we find that the cold posterior effect persists even in these principled models incorporating data augmentation. This suggests that the cold posterior effect cannot be dismissed as an artifact of data augmentation using incorrect likelihoods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nabarro22a.html
  PDF: https://proceedings.mlr.press/v180/nabarro22a/nabarro22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nabarro22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Seth
    family: Nabarro
  - given: Stoil
    family: Ganev
  - given: Adrià
    family: Garriga-Alonso
  - given: Vincent
    family: Fortuin
  - given: Mark
    prefix: van der
    family: Wilk
  - given: Laurence
    family: Aitchison
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1434-1444
  id: nabarro22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1434
  lastpage: 1444
  published: 2022-08-17 00:00:00 +0000
- title: 'Semiparametric causal sufficient dimension reduction of multidimensional treatments'
  abstract: 'Cause-effect relationships are typically evaluated by comparing outcome responses to binary treatment values, representing two arms of a hypothetical randomized controlled trial. However, in certain applications, treatments of interest are continuous and multidimensional. For example, understanding the causal relationship between severity of radiation therapy, summarized by a multidimensional vector of radiation exposure values, and post-treatment side effects is a problem of clinical interest in radiation oncology. An appropriate strategy for making interpretable causal conclusions is to reduce the dimension of treatment. If individual elements of a multidimensional treatment vector weakly affect the outcome, but the overall relationship between treatment and outcome is strong, careless approaches to dimension reduction may not preserve this relationship. Further, methods developed for regression problems do not directly transfer to causal inference due to confounding complications. In this paper, we use semiparametric inference theory for structural models to give a general approach to causal sufficient dimension reduction of a multidimensional treatment such that the cause-effect relationship between treatment and outcome is preserved. We illustrate the utility of our proposals through simulations and a real data application in radiation oncology.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nabi22a.html
  PDF: https://proceedings.mlr.press/v180/nabi22a/nabi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nabi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Razieh
    family: Nabi
  - given: Todd
    family: McNutt
  - given: Ilya
    family: Shpitser
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1445-1455
  id: nabi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1445
  lastpage: 1455
  published: 2022-08-17 00:00:00 +0000
- title: 'Partially adaptive regularized multiple regression analysis for estimating linear causal effects'
  abstract: 'This paper assumes that cause-effect relationships among variables can be described with a linear structural equation model. Then, a situation is considered where a set of observed covariates satisfies the back-door criterion but the ordinary least squares method cannot be applied to estimate linear causal effects because of multicollinearity/high-dimensional data problems. In this situation, we propose a novel regression approach, the “partially adaptive L$_p$-regularized multiple regression analysis” (PAL$_p$MA) method for estimating the total effects. Unlike standard regularized regression analysis, PAL$_p$MA provides a consistent or less-biased estimator of the linear causal effect. PAL$_p$MA is also applicable to evaluating direct effects through the single-door criterion. Given space constraints, the proofs, some numerical experiments, and an industrial case study on setting up painting conditions of car bodies are provided in the Supplementary Material.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nanmo22a.html
  PDF: https://proceedings.mlr.press/v180/nanmo22a/nanmo22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nanmo22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Hisayoshi
    family: Nanmo
  - given: Manabu
    family: Kuroki
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1456-1465
  id: nanmo22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1456
  lastpage: 1465
  published: 2022-08-17 00:00:00 +0000
- title: 'Efficient learning of sparse and decomposable PDEs using random projection'
  abstract: 'Learning physics models in the form of Partial Differential Equations (PDEs) is carried out through back-propagation to match the simulations of the physics model with experimental observations. Nevertheless, such matching involves computation over billions of elements, presenting a significant computational overhead. We notice many PDEs in real-world problems are sparse and decomposable, where the temporal updates and the spatial features are sparsely concentrated on small interface regions. We propose RAPID-PDE, an algorithm to expedite the learning of sparse and decomposable PDEs. Our RAPID-PDE first uses random projection to compress the high-dimensional sparse updates and features into low-dimensional representations and then uses these compressed signals during learning. Crucially, such a conversion is only carried out once prior to learning and the entire learning process is conducted in the compressed space. Theoretically, we derive a constant factor approximation between the projected loss function and the original one with a logarithmic number of projected dimensions. Empirically, we demonstrate that RAPID-PDE with data compressed to 0.05% of its original size learns similar models compared with uncompressed algorithms in learning a set of phase-field models which govern the spatial-temporal dynamics of nano-scale structures in metallic materials.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nasim22a.html
  PDF: https://proceedings.mlr.press/v180/nasim22a/nasim22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nasim22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Md
    family: Nasim
  - given: Xinghang
    family: Zhang
  - given: Anter
    family: El-Azab
  - given: Yexiang
    family: Xue
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1466-1476
  id: nasim22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1466
  lastpage: 1476
  published: 2022-08-17 00:00:00 +0000
- title: 'Linearizing contextual bandits with latent state dynamics'
  abstract: 'In many real-world applications of multi-armed bandit problems, both rewards and contexts are often influenced by confounding latent variables which evolve stochastically over time. While the observed contexts and rewards are nonlinearly related, we show that prior knowledge of latent causal structure can be used to reduce the problem to the linear bandit setting. We develop two algorithms, Latent Linear Thompson Sampling (L2TS) and Latent Linear UCB (L2UCB), which use online EM algorithms for hidden Markov models to learn the latent transition model and maintain a posterior belief over the latent state, and then use the resulting posteriors as context features in a linear bandit problem. We upper bound the error in reward estimation in the presence of a dynamical latent state, and derive a novel problem-dependent regret bound for linear Thompson sampling with non-stationarity and unconstrained reward distributions, which we apply to L2TS under certain conditions. Finally, we demonstrate the superiority of our algorithms over related bandit algorithms through experiments.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nelson22a.html
  PDF: https://proceedings.mlr.press/v180/nelson22a/nelson22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nelson22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Elliot
    family: Nelson
  - given: Debarun
    family: Bhattacharjya
  - given: Tian
    family: Gao
  - given: Miao
    family: Liu
  - given: Djallel
    family: Bouneffouf
  - given: Pascal
    family: Poupart
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1477-1487
  id: nelson22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1477
  lastpage: 1487
  published: 2022-08-17 00:00:00 +0000
- title: 'CounteRGAN: Generating counterfactuals for real-time recourse and interpretability using residual GANs'
  abstract: 'Model interpretability, fairness, and recourse for end users have grown in importance as machine learning models have become increasingly popular in areas including criminal justice, finance, healthcare, and job marketplaces. This work presents a novel method of addressing these issues by producing meaningful counterfactuals that are aimed at providing recourse to users and highlighting potential model biases. A meaningful counterfactual is a reasonable alternative scenario that illustrates how input data perturbations can influence the model’s output. The CounteRGAN method generates meaningful counterfactuals for a target classifier by utilizing a novel Residual Generative Adversarial Network (RGAN). We compare our method against leading state-of-the-art approaches on image and tabular datasets over a variety of performance metrics. The results indicate a significant improvement over existing techniques in combined metric performance, with a latency reduction of 2 to 7 orders of magnitude which enables providing real-time recourse to users. The code for reproducibility can be found here: https://github.com/gan-counterfactuals/countergan.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nemirovsky22a.html
  PDF: https://proceedings.mlr.press/v180/nemirovsky22a/nemirovsky22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nemirovsky22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Daniel
    family: Nemirovsky
  - given: Nicolas
    family: Thiebaut
  - given: Ye
    family: Xu
  - given: Abhishek
    family: Gupta
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1488-1497
  id: nemirovsky22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1488
  lastpage: 1497
  published: 2022-08-17 00:00:00 +0000
- title: 'Robust Bayesian recourse'
  abstract: 'Algorithmic recourse aims to recommend informative feedback to overturn an unfavorable machine learning decision. We introduce in this paper the Bayesian recourse, a model-agnostic recourse that minimizes the posterior probability odds ratio. Further, we present its min-max robust counterpart with the goal of hedging against future changes in the machine learning model parameters. The robust counterpart explicitly takes into account possible perturbations of the data in a Gaussian mixture ambiguity set prescribed using the optimal transport (Wasserstein) distance. We show that the resulting worst-case objective function can be decomposed into solving a series of two-dimensional optimization subproblems, and the min-max recourse finding problem is thus amenable to a gradient descent algorithm. Contrary to existing methods for generating robust recourses, the robust Bayesian recourse does not require a linear approximation step. The numerical experiment demonstrates the effectiveness of our proposed robust Bayesian recourse facing model shifts. Our code is available at https://github.com/VinAIResearch/robust-bayesian-recourse.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nguyen22a.html
  PDF: https://proceedings.mlr.press/v180/nguyen22a/nguyen22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nguyen22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tuan-Duy H.
    family: Nguyen
  - given: Ngoc
    family: Bui
  - given: Duy
    family: Nguyen
  - given: Man-Chung
    family: Yue
  - given: Viet Anh
    family: Nguyen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1498-1508
  id: nguyen22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1498
  lastpage: 1508
  published: 2022-08-17 00:00:00 +0000
- title: 'Efficient and accurate top-k recovery from choice data'
  abstract: 'The intersection of learning to rank and choice modeling is an active area of research with applications in e-commerce, information retrieval and the social sciences. In some applications such as recommendation systems, the statistician is primarily interested in recovering the set of the top-ranked items from a large pool of items as efficiently as possible using passively collected discrete choice data, i.e., the user picks one item from a set of multiple items. Motivated by this practical consideration, we propose the choice-based Borda count algorithm as a fast and accurate ranking algorithm for top-$K$ recovery, i.e., correctly identifying all of the top $K$ items. We show that the choice-based Borda count algorithm has optimal sample complexity for top-$K$ recovery under a broad class of random utility models. We prove that in the limit, the choice-based Borda count algorithm produces the same top-$K$ estimate as the commonly used Maximum Likelihood Estimate method but the former’s speed and simplicity bring considerable advantages in practice. Experiments on both synthetic and real datasets show that the counting algorithm is competitive with commonly used ranking algorithms in terms of accuracy while being several orders of magnitude faster.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nguyen22b.html
  PDF: https://proceedings.mlr.press/v180/nguyen22b/nguyen22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nguyen22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Duc
    family: Nguyen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1509-1518
  id: nguyen22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1509
  lastpage: 1518
  published: 2022-08-17 00:00:00 +0000
- title: 'Cycle class consistency with distributional optimal transport and knowledge distillation for unsupervised domain adaptation'
  abstract: 'Unsupervised domain adaptation (UDA) aims to transfer knowledge from a model trained on a labeled source domain to an unlabeled target domain. To this end, we propose in this paper a novel cycle class-consistent model based on optimal transport (OT) and knowledge distillation. The model consists of two agents, a teacher and a student cooperatively working in a cycle process under the guidance of the distributional optimal transport and distillation manner. The OT distance is designed to bridge the gap between the distribution of the target data and a distribution over the source class-conditional distributions. The optimal probability matrix then provides pseudo labels to learn a teacher that achieves a good classification performance on the target domain. Knowledge distillation is performed in the next step in which the teacher distills and transfers its knowledge to the student. And finally, the student produces its prediction for the optimal transport step. This process forms a closed cycle in which the teacher and student networks are simultaneously trained to conduct transfer learning from the source to the target domain. Extensive experiments show that our proposed method outperforms existing methods, especially the class-aware and OT-based ones on benchmark datasets including Office-31, Office-Home, and ImageCLEF-DA.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nguyen22c.html
  PDF: https://proceedings.mlr.press/v180/nguyen22c/nguyen22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nguyen22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tuan
    family: Nguyen
  - given: Van
    family: Nguyen
  - given: Trung
    family: Le
  - given: He
    family: Zhao
  - given: Quan Hung
    family: Tran
  - given: Dinh
    family: Phung
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1519-1529
  id: nguyen22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1519
  lastpage: 1529
  published: 2022-08-17 00:00:00 +0000
- title: 'Ordinal causal discovery'
  abstract: 'Causal discovery for purely observational, categorical data is a long-standing challenging problem. Unlike for continuous data, the vast majority of existing methods for categorical data focus on inferring the Markov equivalence class only, which leaves the direction of some causal relationships undetermined. This paper proposes an identifiable ordinal causal discovery method that exploits the ordinal information contained in many real-world applications to uniquely identify the causal structure. The proposed method is applicable beyond ordinal data via data discretization. Through real-world and synthetic experiments, we demonstrate that the proposed ordinal causal discovery method combined with simple score-and-search algorithms has favorable and robust performance compared to state-of-the-art alternative methods in both ordinal categorical and non-categorical data. An accompanying R package, OCD, is freely available at the first author’s website.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ni22a.html
  PDF: https://proceedings.mlr.press/v180/ni22a/ni22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ni22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yang
    family: Ni
  - given: Bani
    family: Mallick
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1530-1540
  id: ni22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1530
  lastpage: 1540
  published: 2022-08-17 00:00:00 +0000
- title: 'An explore-then-commit algorithm for submodular maximization under full-bandit feedback'
  abstract: 'We investigate the problem of combinatorial multi-armed bandits with stochastic submodular (in expectation) rewards and full-bandit feedback, where no extra information other than the reward of the selected action at each time step $t$ is observed. We propose a simple algorithm, Explore-Then-Commit Greedy (ETCG), and prove that it achieves a $(1-1/e)$-regret upper bound of $\mathcal{O}(n^\frac{1}{3}k^\frac{4}{3}T^\frac{2}{3}\log(T)^\frac{1}{2})$ for a horizon $T$, number of base elements $n$, and cardinality constraint $k$. We also show in experiments with synthetic and real-world data that ETCG empirically outperforms other full-bandit methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/nie22a.html
  PDF: https://proceedings.mlr.press/v180/nie22a/nie22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-nie22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Guanyu
    family: Nie
  - given: Mridul
    family: Agarwal
  - given: Abhishek Kumar
    family: Umrawal
  - given: Vaneet
    family: Aggarwal
  - given: Christopher
    family: John Quinn
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1541-1551
  id: nie22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1541
  lastpage: 1551
  published: 2022-08-17 00:00:00 +0000
- title: 'Evaluating high-order predictive distributions in deep learning'
  abstract: 'Most work on supervised learning research has focused on marginal predictions. In decision problems, joint predictive distributions are essential for good performance. Previous work has developed methods for assessing low-order predictive distributions with inputs sampled i.i.d. from the testing distribution. With low-dimensional inputs, these methods distinguish agents that effectively estimate uncertainty from those that do not. We establish that the predictive distribution order required for such differentiation increases greatly with input dimension, rendering these methods impractical. To accommodate high-dimensional inputs, we introduce dyadic sampling, which focuses on predictive distributions associated with random pairs of inputs. We demonstrate that this approach efficiently distinguishes agents in high-dimensional examples involving simple logistic regression as well as complex synthetic and empirical data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/osband22a.html
  PDF: https://proceedings.mlr.press/v180/osband22a/osband22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-osband22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Ian
    family: Osband
  - given: Zheng
    family: Wen
  - given: Seyed Mohammad
    family: Asghari
  - given: Vikranth
    family: Dwaracherla
  - given: Xiuyuan
    family: Lu
  - given: Benjamin
    family: Van Roy
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1552-1560
  id: osband22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1552
  lastpage: 1560
  published: 2022-08-17 00:00:00 +0000
- title: 'Understanding and mitigating the limitations of prioritized experience replay'
  abstract: 'Prioritized Experience Replay (ER) has been empirically shown to improve sample efficiency across many domains and has attracted great attention; however, there is little theoretical understanding of why such prioritized sampling helps and of its limitations. In this work, we take a deep look at prioritized ER. In a supervised learning setting, we show the equivalence between the error-based prioritized sampling method for minimizing mean squared error and uniform sampling for cubic power loss. We then provide theoretical insight into why error-based prioritized sampling improves the convergence rate over uniform sampling when minimizing mean squared error during early learning. Based on this insight, we further point out two limitations of the prioritized ER method: 1) outdated priorities and 2) insufficient coverage of the sample space. To mitigate these limitations, we propose our model-based stochastic gradient Langevin dynamics sampling method. We show that our method does provide states distributed close to an ideal prioritized sampling distribution estimated by the brute-force method, which does not suffer from the two limitations. We conduct experiments on both discrete and continuous control problems to show our approach’s efficacy and examine the practical implication of our method in an autonomous driving application.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/pan22a.html
  PDF: https://proceedings.mlr.press/v180/pan22a/pan22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-pan22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yangchen
    family: Pan
  - given: Jincheng
    family: Mei
  - given: Amir-massoud
    family: Farahmand
  - given: Martha
    family: White
  - given: Hengshuai
    family: Yao
  - given: Mohsen
    family: Rohani
  - given: Jun
    family: Luo
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1561-1571
  id: pan22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1561
  lastpage: 1571
  published: 2022-08-17 00:00:00 +0000
- title: 'Robust learning of tractable probabilistic models'
  abstract: 'Tractable probabilistic models (TPMs) compactly represent a joint probability distribution over a large number of random variables and admit polynomial time computation of (1) exact likelihoods; (2) marginal probability distributions over a small subset of variables given evidence; and (3) in some cases most probable explanations over all non-observed variables given observations. In this paper, we leverage these tractability properties to solve the robust maximum likelihood parameter estimation task in TPMs under the assumption that a TPM structure and complete training data are provided as input. Specifically, we show that TPMs learned by optimizing the likelihood perform poorly when data is subject to adversarial attacks/noise/perturbations/corruption and we can address this issue by optimizing robust likelihood. To this end, we develop an efficient approach for constructing uncertainty sets that model data corruption in TPMs and derive an efficient gradient-based local search method for learning TPMs that are robust against these uncertainty sets. We empirically demonstrate the efficacy of our proposed approach on a collection of benchmark datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/peddi22a.html
  PDF: https://proceedings.mlr.press/v180/peddi22a/peddi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-peddi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Rohith
    family: Peddi
  - given: Tahrima
    family: Rahman
  - given: Vibhav
    family: Gogate
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1572-1581
  id: peddi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1572
  lastpage: 1581
  published: 2022-08-17 00:00:00 +0000
- title: 'Attribution of predictive uncertainties in classification models'
  abstract: 'Predictive uncertainties in classification tasks are often a consequence of model inadequacy or insufficient training data. In popular applications, such as image processing, we are often required to scrutinise these uncertainties by meaningfully attributing them to input features. This helps to improve interpretability assessments. However, there exist few effective frameworks for this purpose. Vanilla forms of popular methods for the provision of saliency masks, such as SHAP or integrated gradients, adapt poorly to target measures of uncertainty. Thus, state-of-the-art tools instead proceed by creating counterfactual or adversarial feature vectors, and assign attributions by direct comparison to original images. In this paper, we present a novel framework that combines path integrals, counterfactual explanations and generative models, in order to procure attributions that contain few observable artefacts or noise. We evidence that this outperforms existing alternatives through quantitative evaluations with popular benchmarking methods and data sets of varying complexity.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/perez22a.html
  PDF: https://proceedings.mlr.press/v180/perez22a/perez22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-perez22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Iker
    family: Perez
  - given: Piotr
    family: Skalski
  - given: Alec
    family: Barns-Graham
  - given: Jason
    family: Wong
  - given: David
    family: Sutton
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1582-1591
  id: perez22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1582
  lastpage: 1591
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning large Bayesian networks with expert constraints'
  abstract: 'We propose a new score-based algorithm for learning the structure of a Bayesian Network (BN). It is the first algorithm that simultaneously supports the requirements of (i) learning a BN of bounded treewidth, (ii) satisfying expert constraints, including positive and negative ancestry properties between nodes, and (iii) scaling up to BNs with several thousand nodes. The algorithm operates in two phases. In Phase 1, we use a version of an existing BN structure learning algorithm, modified to generate an initial Directed Acyclic Graph (DAG) that supports a portion of the given constraints. In Phase 2, we follow the BN-SLIM framework, introduced by Peruvemba Ramaswamy and Szeider (AAAI 2021). We improve the initial DAG by repeatedly running a MaxSAT solver on selected local parts. The MaxSAT encoding entails local versions of the expert constraints as hard constraints. We evaluate a prototype implementation of our algorithm on several standard benchmark sets. The encouraging results demonstrate the power and flexibility of the BN-SLIM framework. It boosts the score while increasing the number of satisfied expert constraints.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/peruvemba-ramaswamy22a.html
  PDF: https://proceedings.mlr.press/v180/peruvemba-ramaswamy22a/peruvemba-ramaswamy22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-peruvemba-ramaswamy22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Vaidyanathan
    family: Peruvemba Ramaswamy
  - given: Stefan
    family: Szeider
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1592-1601
  id: peruvemba-ramaswamy22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1592
  lastpage: 1601
  published: 2022-08-17 00:00:00 +0000
- title: 'AND/OR branch-and-bound for computational protein design optimizing K*'
  abstract: 'The importance of designing proteins, such as high affinity antibodies, has become ever more apparent. Computational Protein Design can cast such design problems as optimization tasks with the objective of maximizing K*, an approximation of binding affinity. Here we lay out a graphical model framework for K* optimization that enables use of compact AND/OR search algorithms. We designed an AND/OR branch-and-bound algorithm, AOBB-K*, for optimizing K* that is guided by a new K* heuristic and can incorporate specialized performance improvements with theoretical guarantees. As AOBB-K* is inspired by algorithms from the well-studied task of Marginal MAP, this work provides a foundation for harnessing advancements in state-of-the-art mixed inference schemes and adapting them to protein design.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/pezeshki22a.html
  PDF: https://proceedings.mlr.press/v180/pezeshki22a/pezeshki22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-pezeshki22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Bobak
    family: Pezeshki
  - given: Radu
    family: Marinescu
  - given: Alexander
    family: Ihler
  - given: Rina
    family: Dechter
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1602-1612
  id: pezeshki22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1602
  lastpage: 1612
  published: 2022-08-17 00:00:00 +0000
- title: 'Identifiability of sparse causal effects using instrumental variables'
  abstract: 'Exogenous heterogeneity, for example in the form of instrumental variables, can help us learn a system’s underlying causal structure and predict the outcome of unseen intervention experiments. In this paper, we consider linear models in which the causal effect from covariates $X$ on a response $Y$ is sparse. We provide conditions under which the causal coefficient becomes identifiable from the observed distribution. These conditions can be satisfied even if the number of instruments is as small as the number of causal parents. We also develop graphical criteria under which identifiability holds with probability one if the edge coefficients are sampled randomly from a distribution that is absolutely continuous with respect to Lebesgue measure and $Y$ is childless. As an estimator, we propose spaceIV, prove that it consistently estimates the causal effect if the model is identifiable, and evaluate its performance on simulated data. If identifiability does not hold, we show that it may still be possible to recover a subset of the causal parents.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/pfister22a.html
  PDF: https://proceedings.mlr.press/v180/pfister22a/pfister22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-pfister22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Niklas
    family: Pfister
  - given: Jonas
    family: Peters
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1613-1622
  id: pfister22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1613
  lastpage: 1622
  published: 2022-08-17 00:00:00 +0000
- title: 'Bayesian quantile and expectile optimisation'
  abstract: 'Bayesian optimisation (BO) is widely used to optimise stochastic black box functions. While most BO approaches focus on optimising conditional expectations, many applications require risk-averse strategies and alternative criteria accounting for the distribution tails need to be considered. In this paper, we propose new variational models for Bayesian quantile and expectile regression that are well-suited for heteroscedastic noise settings. Our models consist of two latent Gaussian processes accounting respectively for the conditional quantile (or expectile) and the scale parameter of an asymmetric likelihood function. Furthermore, we propose two BO strategies based on max-value entropy search and Thompson sampling, that are tailored to such models and that can accommodate large batches of points. Contrary to existing BO approaches for risk-averse optimisation, our strategies can directly optimise for the quantile and expectile, without requiring replicating observations or assuming a parametric form for the noise. As illustrated in the experimental section, the proposed approach clearly outperforms the state of the art in the heteroscedastic, non-Gaussian case.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/picheny22a.html
  PDF: https://proceedings.mlr.press/v180/picheny22a/picheny22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-picheny22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Victor
    family: Picheny
  - given: Henry
    family: Moss
  - given: Léonard
    family: Torossian
  - given: Nicolas
    family: Durrande
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1623-1633
  id: picheny22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1623
  lastpage: 1633
  published: 2022-08-17 00:00:00 +0000
- title: 'Using hierarchies to efficiently combine evidence with Dempster’s rule of combination'
  abstract: 'Dempster’s rule of combination allows us to combine various independent pieces of evidence that each have a certain degree of uncertainty. This provides a useful way for dealing with uncertain evidence, but the rule is computationally intractable. In this paper, we analyze the complexity of this rule for differently structured bodies of evidence and we consider a known algorithm by Shafer and Logan to compute this rule efficiently over a hierarchical set of evidence. We show that one can check in polynomial time whether an arbitrary set of evidence has a hierarchical shape, enabling the use of Shafer and Logan’s algorithm. Moreover, we consider two different approaches to deal with non-hierarchical sets of evidence: (i) considering hierarchical subsets and (ii) taking advantage of internal hierarchical structures in the overall set. For the former case, we conclude that getting different hierarchies from an arbitrary set of pieces of evidence corresponds to the VERTEX COVER problem and we present algorithms for obtaining these hierarchies based on this correspondence. For the latter case, we present a fixed-parameter tractable algorithm which computes the belief function of any piece of evidence included in the set.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/pinto-prieto22a.html
  PDF: https://proceedings.mlr.press/v180/pinto-prieto22a/pinto-prieto22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-pinto-prieto22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Daira
    family: Pinto Prieto
  - given: Ronald
    prefix: de
    family: Haan
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1634-1643
  id: pinto-prieto22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1634
  lastpage: 1643
  published: 2022-08-17 00:00:00 +0000
- title: 'Voronoi density estimator for high-dimensional data: Computation, compactification and convergence'
  abstract: 'The Voronoi Density Estimator (VDE) is an established density estimation technique that adapts to the local geometry of data. However, its applicability has so far been limited to problems in two and three dimensions. This is because Voronoi cells rapidly increase in complexity as dimensions grow, making the necessary explicit computations infeasible. We define a variant of the VDE, termed the Compactified Voronoi Density Estimator (CVDE), suitable for higher dimensions. We propose computationally efficient algorithms for numerical approximation of the CVDE and formally prove convergence of the estimated density to the original one. We implement and empirically validate the CVDE through a comparison with the Kernel Density Estimator (KDE). Our results indicate that the CVDE outperforms the KDE on sound and image data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/polianskii22a.html
  PDF: https://proceedings.mlr.press/v180/polianskii22a/polianskii22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-polianskii22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Vladislav
    family: Polianskii
  - given: Giovanni Luca
    family: Marchetti
  - given: Alexander
    family: Kravberg
  - given: Anastasiia
    family: Varava
  - given: Florian T.
    family: Pokorny
  - given: Danica
    family: Kragic
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1644-1653
  id: polianskii22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1644
  lastpage: 1653
  published: 2022-08-17 00:00:00 +0000
- title: 'Clustering a union of linear subspaces via matrix factorization and innovation search'
  abstract: 'This paper focuses on the Matrix Factorization based Clustering (MFC) method, which is one of the few closed-form algorithms for the subspace clustering problem. Despite being simple, closed-form, and computation-efficient, MFC can outperform other, more sophisticated subspace clustering methods in many challenging scenarios. We reveal the connection between MFC and the Innovation Pursuit (iPursuit) algorithm, which was shown to be able to outperform other spectral clustering based methods by a notable margin, especially when the spans of the clusters are close. A novel theoretical study is presented which sheds light on the key performance factors of both algorithms (MFC/iPursuit), and it is shown that both algorithms can be robust to notable intersections between the spans of the clusters. Importantly, in contrast to the theoretical guarantees of other algorithms, which emphasize the distance between the subspaces as the key performance factor and do not make the innovation assumption, it is shown that the performance of MFC/iPursuit mainly depends on the distance between the innovative components of the clusters.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/rahmani22a.html
  PDF: https://proceedings.mlr.press/v180/rahmani22a/rahmani22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-rahmani22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Mostafa
    family: Rahmani
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1654-1664
  id: rahmani22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1654
  lastpage: 1664
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning in Markov games: Can we exploit a general-sum opponent?'
  abstract: 'In this paper, we study the learning problem in two-player general-sum Markov Games. We consider the online setting where we control a single player, playing against an arbitrary opponent to minimize the regret. Previous works only consider the zero-sum Markov Games setting, in which the two agents are completely adversarial. However, in some cases, the two agents may have different reward functions without having conflicting objectives. This involves a stronger notion of regret than the one used in previous works. This class of games, called general-sum Markov Games, is far from being well understood and studied. We show that the new regret minimization problem is significantly harder than in standard Markov Decision Processes and zero-sum Markov Games. To do this, we derive a lower bound on the expected regret of any “good” learning strategy which shows the constant dependencies on the number of deterministic policies, which is not present in zero-sum Markov Games and Markov Decision Processes. Then we propose a novel optimistic algorithm that nearly matches the proposed lower bound. Proving these results requires overcoming several new challenges that are not present in Markov Decision Processes or zero-sum Markov Games.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ramponi22a.html
  PDF: https://proceedings.mlr.press/v180/ramponi22a/ramponi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ramponi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Giorgia
    family: Ramponi
  - given: Marcello
    family: Restelli
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1665-1675
  id: ramponi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1665
  lastpage: 1675
  published: 2022-08-17 00:00:00 +0000
- title: 'Expectation programming: Adapting probabilistic programming systems to estimate expectations efficiently'
  abstract: 'We show that the standard computational pipeline of probabilistic programming systems (PPSs) can be inefficient for estimating expectations and introduce the concept of expectation programming to address this. In expectation programming, the aim of the backend inference engine is to directly estimate expected return values of programs, as opposed to approximating their conditional distributions. This distinction, while subtle, allows us to achieve substantial performance improvements over the standard PPS computational pipeline by tailoring computation to the expectation we care about. We realize a particular instance of our expectation programming concept, Expectation Programming in Turing (EPT), by extending the PPS Turing to allow so-called target-aware inference to be run automatically. We then verify the statistical soundness of EPT theoretically, and show that it provides substantial empirical gains in practice.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/reichelt22a.html
  PDF: https://proceedings.mlr.press/v180/reichelt22a/reichelt22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-reichelt22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tim
    family: Reichelt
  - given: Adam
    family: Goliński
  - given: Luke
    family: Ong
  - given: Tom
    family: Rainforth
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1676-1685
  id: reichelt22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1676
  lastpage: 1685
  published: 2022-08-17 00:00:00 +0000
- title: 'A free lunch from the noise: Provable and practical exploration for representation learning'
  abstract: 'Representation learning lies at the heart of the empirical success of deep learning for dealing with the curse of dimensionality. However, the power of representation learning has not been fully exploited yet in reinforcement learning (RL), due to (i) the trade-off between expressiveness and tractability; and (ii) the coupling between exploration and representation learning. In this paper, we first reveal the fact that under some noise assumption in the stochastic control model, we can obtain the linear spectral feature of its corresponding Markov transition operator in closed-form for free. Based on this observation, we propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and completes optimistic exploration for representation learning by exploiting the structure of the noise. We provide rigorous theoretical analysis of SPEDE, and demonstrate its superior practical performance over the existing state-of-the-art empirical algorithms on several benchmarks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ren22a.html
  PDF: https://proceedings.mlr.press/v180/ren22a/ren22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ren22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tongzheng
    family: Ren
  - given: Tianjun
    family: Zhang
  - given: Csaba
    family: Szepesvári
  - given: Bo
    family: Dai
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1686-1696
  id: ren22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1686
  lastpage: 1696
  published: 2022-08-17 00:00:00 +0000
- title: 'Quantum perceptron revisited: Computational-statistical tradeoffs'
  abstract: 'Quantum machine learning algorithms could provide significant speed-ups over their classical counterparts; however, whether they could also achieve good generalization remains unclear. Recently, two quantum perceptron models which give a quadratic improvement over the classical perceptron algorithm using Grover’s search have been proposed by Wiebe et al. While the first model reduces the complexity with respect to the size of the training set, the second one improves the bound on the number of mistakes made by the perceptron. In this paper, we introduce a hybrid quantum-classical perceptron algorithm with lower complexity and better generalization ability than the classical perceptron. We show a quadratic improvement over the classical perceptron in both the number of samples and the margin of the data. We derive a bound on the expected error of the hypothesis returned by our algorithm, which compares favorably to the one obtained with the classical online perceptron. We use numerical experiments to illustrate the trade-off between computational complexity and statistical accuracy in quantum perceptron learning and discuss some of the key practical issues surrounding the implementation of quantum perceptron models into near-term quantum devices, whose practical implementation represents a serious challenge due to inherent noise. However, the potential benefits make correcting this worthwhile.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/roget22a.html
  PDF: https://proceedings.mlr.press/v180/roget22a/roget22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-roget22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Mathieu
    family: Roget
  - given: Giuseppe
    family: Di Molfetta
  - given: Hachem
    family: Kadri
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1697-1706
  id: roget22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1697
  lastpage: 1706
  published: 2022-08-17 00:00:00 +0000
- title: 'Resolving label uncertainty with implicit posterior models'
  abstract: 'We propose a method for jointly inferring labels across a collection of data samples, where each sample consists of an observation and a prior belief about the label. By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs. This formulation unifies various machine learning settings; the weak beliefs can come in the form of noisy or incomplete labels, likelihoods given by a different prediction mechanism on auxiliary input, or common-sense priors reflecting knowledge about the structure of the problem at hand. We demonstrate the proposed algorithms on diverse problems: classification with negative training examples, learning from rankings, weakly and self-supervised aerial imagery segmentation, co-segmentation of video frames, and coarsely supervised text classification.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/rolf22a.html
  PDF: https://proceedings.mlr.press/v180/rolf22a/rolf22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-rolf22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Esther
    family: Rolf
  - given: Nikolay
    family: Malkin
  - given: Alexandros
    family: Graikos
  - given: Ana
    family: Jojic
  - given: Caleb
    family: Robinson
  - given: Nebojsa
    family: Jojic
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1707-1717
  id: rolf22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1707
  lastpage: 1717
  published: 2022-08-17 00:00:00 +0000
- title: 'Feature learning and random features in standard finite-width convolutional neural networks: An empirical study'
  abstract: 'The Neural Tangent Kernel is an important milestone in the ongoing effort to build a theory for deep learning. Its prediction that sufficiently wide neural networks behave as kernel methods, or equivalently as random feature models arising from linearized networks, has been confirmed empirically for certain wide architectures. In this paper, we compare the performance of two common finite-width convolutional neural networks, LeNet and AlexNet, to their linearizations on common benchmark datasets like MNIST and modified versions of it, CIFAR-10 and an ImageNet subset. We demonstrate empirically that finite-width neural networks, generally, greatly outperform the finite-width linearization of these architectures. When increasing the problem difficulty of the classification task, we observe a larger gap which is in line with common intuition that finite-width neural networks perform feature learning which finite-width linearizations cannot. At the same time, finite-width linearizations improve dramatically with width, approaching the behavior of the wider standard networks which in turn perform slightly better than their standard width counterparts. Therefore, it appears that feature learning for non-wide standard networks is important but becomes less significant with increasing width. We furthermore identify cases where both standard and linearized networks match in performance, in agreement with NTK theory, and a case where a wide linearization outperforms its standard width counterpart.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/samarin22a.html
  PDF: https://proceedings.mlr.press/v180/samarin22a/samarin22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-samarin22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Maxim
    family: Samarin
  - given: Volker
    family: Roth
  - given: David
    family: Belius
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1718-1727
  id: samarin22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1718
  lastpage: 1727
  published: 2022-08-17 00:00:00 +0000
- title: 'Robust identifiability in linear structural equation models of causal inference'
  abstract: 'We consider the problem of robust parameter estimation from observational data in the context of linear structural equation models (LSEMs). Under various conditions on LSEMs and the model parameters, prior work provides efficient algorithms to recover the parameters. However, these results are often about generic identifiability. In practice, generic identifiability is not sufficient and we need robust identifiability: small changes in the observational data should not affect the parameters by a huge amount. Robust identifiability has received far less attention and remains poorly understood. Sankararaman et al. (2019) recently provided a set of sufficient conditions on parameters under which robust identifiability is feasible. However, a limitation of their work is that their results only apply to a small sub-class of LSEMs, called “bow-free paths.” In this work, we show that for any “bow-free model”, in all but $\frac{1}{\poly(n)}$-measure of instances robust identifiability holds. Moreover, whenever an instance is robustly identifiable, the algorithm proposed in Foygel et al. (2012) can be used to recover the parameters in a robust fashion. In contrast, for generic identifiability Foygel et al. (2012) proved that with measure $1$, instances are generically identifiable. Thus, we show that robust identifiability is a strictly harder problem than generic identifiability. Finally, we validate our results on both simulated and real-world datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/sankararaman22a.html
  PDF: https://proceedings.mlr.press/v180/sankararaman22a/sankararaman22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-sankararaman22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Karthik A.
    family: Sankararaman
  - given: Anand
    family: Louis
  - given: Navin
    family: Goyal
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1728-1737
  id: sankararaman22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1728
  lastpage: 1737
  published: 2022-08-17 00:00:00 +0000
- title: 'How unfair is private learning?'
  abstract: 'As machine learning algorithms are deployed on sensitive data in critical decision making processes, it is becoming increasingly important that they are also private and fair. In this paper, we show that, when the data has a long-tailed structure, it is not possible to build accurate learning algorithms that are both private and result in higher accuracy on minority subpopulations. We further show that relaxing overall accuracy can lead to good fairness even with strict privacy requirements. To corroborate our theoretical results in practice, we provide an extensive set of experimental results using a variety of synthetic, vision (CIFAR-10 and CelebA), and tabular (Law School) datasets and learning algorithms.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/sanyal22a.html
  PDF: https://proceedings.mlr.press/v180/sanyal22a/sanyal22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-sanyal22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Amartya
    family: Sanyal
  - given: Yaxi
    family: Hu
  - given: Fanny
    family: Yang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1738-1748
  id: sanyal22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1738
  lastpage: 1748
  published: 2022-08-17 00:00:00 +0000
- title: 'Probabilistic spatial transformer networks'
  abstract: 'Spatial Transformer Networks (STNs) estimate image transformations that can improve downstream tasks by ‘zooming in’ on relevant regions in an image. However, STNs are hard to train and sensitive to mis-predictions of transformations. To circumvent these limitations, we propose a probabilistic extension that estimates a stochastic transformation rather than a deterministic one. Marginalizing transformations allows us to consider each image at multiple poses, which makes the localization task easier and the training more robust. As an additional benefit, the stochastic transformations act as a localized, learned data augmentation that improves the downstream tasks. We show across standard imaging benchmarks and on a challenging real-world dataset that these two properties lead to improved classification performance, robustness and model calibration. We further demonstrate that the approach generalizes to non-visual domains by improving model performance on time-series data.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/schwobel22a.html
  PDF: https://proceedings.mlr.press/v180/schwobel22a/schwobel22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-schwobel22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Pola
    family: Schwöbel
  - given: Frederik Rahbæk
    family: Warburg
  - given: Martin
    family: Jørgensen
  - given: Kristoffer Hougaard
    family: Madsen
  - given: Søren
    family: Hauberg
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1749-1759
  id: schwobel22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1749
  lastpage: 1759
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning functions on multiple sets using multi-set transformers'
  abstract: 'We propose a general deep architecture for learning functions on multiple permutation-invariant sets. We also show how to generalize this architecture to sets of elements of any dimension by dimension equivariance. We demonstrate that our architecture is a universal approximator of these functions, and show superior results to existing methods on a variety of tasks including counting tasks, alignment tasks, distinguishability tasks and statistical distance measurements. This last task is quite important in Machine Learning. Although our approach is quite general, we demonstrate that it can generate approximate estimates of KL divergence and mutual information that are more accurate than previous techniques that are specifically designed to approximate those statistical distances.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/selby22a.html
  PDF: https://proceedings.mlr.press/v180/selby22a/selby22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-selby22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Kira A.
    family: Selby
  - given: Ahmad
    family: Rashid
  - given: Ivan
    family: Kobyzev
  - given: Mehdi
    family: Rezagholizadeh
  - given: Pascal
    family: Poupart
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1760-1770
  id: selby22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1760
  lastpage: 1770
  published: 2022-08-17 00:00:00 +0000
- title: 'SymNet 2.0: Effectively handling Non-Fluents and Actions in Generalized Neural Policies for RDDL Relational MDPs'
  abstract: 'Relational MDPs (RMDPs) compactly represent an infinite set of MDPs with an unbounded number of objects. Solving an RMDP requires a generalized policy that applies to all instances of a domain. Recently, Garg et al. proposed SymNet for this task: it constructs a graph neural network that shares parameters across all instances in a domain, thus making it applicable to any instance in a zero-shot manner. Our analysis of SymNet reveals that it performs no better than random on 1/4th of planning competition domains. The key reasons are its design choices: it misses important information during graph construction, leading to (1) poor generalizability, and (2) potential non-identifiability of different actions. In response, our solution, SymNet2.0, substantially augments SymNet’s graph construction approach by introducing additional nodes and edges which allow a better transfer of important information about a domain. It also improves SymNet’s action decoders with relevant information from objects to make different actions identifiable during scoring. Extensive experiments on twelve competition domains, where we use imitation learning over data generated from the PROST planner, demonstrate that SymNet2.0 performs vastly better than SymNet. Interestingly, even though SymNet2.0 is trained over data from PROST, it outperforms the planner on several test instances due to the former’s ability to scale to large instances in a zero-shot manner.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/sharma22a.html
  PDF: https://proceedings.mlr.press/v180/sharma22a/sharma22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-sharma22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Vishal
    family: Sharma
  - given: Daman
    family: Arora
  - given: Florian
    family: Geißer
  - given: Mausam
    family: 
  - given: Parag
    family: Singla
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1771-1781
  id: sharma22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1771
  lastpage: 1781
  published: 2022-08-17 00:00:00 +0000
- title: 'Reframed GES with a neural conditional dependence measure'
  abstract: 'In a nonparametric setting, the causal structure is often identifiable only up to Markov equivalence, and for the purpose of causal inference, it is useful to learn a graphical representation of the Markov equivalence class (MEC). In this paper, we revisit the Greedy Equivalence Search (GES) algorithm, which is widely cited as a score-based algorithm for learning the MEC of the underlying causal structure. We observe that in order to make the GES algorithm consistent in a nonparametric setting, it is not necessary to design a scoring metric that evaluates graphs. Instead, it suffices to plug in a consistent estimator of a measure of conditional dependence to guide the search. We therefore present a reframing of the GES algorithm, which is more flexible than the standard score-based version and readily lends itself to the nonparametric setting with a general measure of conditional dependence. In addition, we propose a neural conditional dependence (NCD) measure, which utilizes the expressive power of deep neural networks to characterize conditional independence in a nonparametric manner. We establish the optimality of the reframed GES algorithm under standard assumptions and the consistency of using our NCD estimator to decide conditional independence. Together these results justify the proposed approach. Experimental results demonstrate the effectiveness of our method in causal discovery, as well as the advantages of using our NCD measure over kernel-based measures.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/shen22a.html
  PDF: https://proceedings.mlr.press/v180/shen22a/shen22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-shen22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Xinwei
    family: Shen
  - given: Shengyu
    family: Zhu
  - given: Jiji
    family: Zhang
  - given: Shoubo
    family: Hu
  - given: Zhitang
    family: Chen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1782-1791
  id: shen22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1782
  lastpage: 1791
  published: 2022-08-17 00:00:00 +0000
- title: 'Conditional simulation using diffusion Schrödinger bridges'
  abstract: 'Denoising diffusion models have recently emerged as a powerful class of generative models. They provide state-of-the-art results, not only for unconditional simulation, but also when used to solve conditional simulation problems arising in a wide range of inverse problems. A limitation of these models is that they are computationally intensive at generation time as they require simulating a diffusion process over a long time horizon. When performing unconditional simulation, a Schrödinger bridge formulation of generative modeling leads to a theoretically grounded algorithm shortening generation time which is complementary to other proposed acceleration techniques. We extend the Schrödinger bridge framework to conditional simulation. We demonstrate this novel methodology on various applications including image super-resolution, optimal filtering for state-space models and the refinement of pre-trained networks. Our code can be found at https://github.com/vdeborto/cdsb.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/shi22a.html
  PDF: https://proceedings.mlr.press/v180/shi22a/shi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-shi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yuyang
    family: Shi
  - given: Valentin
    family: De Bortoli
  - given: George
    family: Deligiannidis
  - given: Arnaud
    family: Doucet
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1792-1802
  id: shi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1792
  lastpage: 1802
  published: 2022-08-17 00:00:00 +0000
- title: 'Neural ensemble search via Bayesian sampling'
  abstract: 'Recently, neural architecture search (NAS) has been applied to automate the design of neural networks in real-world applications. A large number of algorithms have been developed to improve the search cost or the performance of the final selected architectures in NAS. Unfortunately, these NAS algorithms aim to select only one single well-performing architecture from their search spaces and thus have overlooked the capability of neural network ensembles (i.e., ensembles of neural networks with diverse architectures) in achieving improved performance over a single final selected architecture. To this end, we introduce a novel neural ensemble search algorithm, called neural ensemble search via Bayesian sampling (NESBS), to effectively and efficiently select well-performing neural network ensembles from a NAS search space. In our extensive experiments, the NESBS algorithm is shown to be able to achieve improved performance over state-of-the-art NAS algorithms while incurring a comparable search cost, thus indicating the superior performance of our NESBS algorithm over these NAS algorithms in practice.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/shu22a.html
  PDF: https://proceedings.mlr.press/v180/shu22a/shu22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-shu22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yao
    family: Shu
  - given: Yizhou
    family: Chen
  - given: Zhongxiang
    family: Dai
  - given: Bryan Kian Hsiang
    family: Low
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1803-1812
  id: shu22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1803
  lastpage: 1812
  published: 2022-08-17 00:00:00 +0000
- title: 'Shifted compression framework: generalizations and improvements'
  abstract: 'Communication is one of the key bottlenecks in the distributed training of large-scale machine learning models, and lossy compression of exchanged information, such as stochastic gradients or models, is one of the most effective instruments to alleviate this issue. Among the most studied compression techniques is the class of unbiased compression operators with variance bounded by a multiple of the square norm of the vector we wish to compress. By design, this variance may remain high, and only diminishes if the input vector approaches zero. However, unless the model being trained is overparameterized, there is no a priori reason for the vectors we wish to compress to approach zero during the iterations of classical methods such as distributed compressed SGD, which has adverse effects on the convergence speed. To circumvent this issue, several more elaborate and seemingly very different algorithms have been proposed recently. These methods are based on the idea of compressing the difference between the vector we would normally wish to compress and some auxiliary vector which changes throughout the iterative process. In this work we take a step back, and develop a unified framework for studying such methods, conceptually, and theoretically. Our framework incorporates methods compressing both gradients and models, using unbiased and biased compressors, and sheds light on the construction of the auxiliary vectors. Furthermore, our general framework can lead to the improvement of several existing algorithms, and can produce new algorithms. Finally, we performed several numerical experiments which illustrate and support our theoretical findings.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/shulgin22a.html
  PDF: https://proceedings.mlr.press/v180/shulgin22a/shulgin22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-shulgin22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Egor
    family: Shulgin
  - given: Peter
    family: Richtárik
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1813-1823
  id: shulgin22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1813
  lastpage: 1823
  published: 2022-08-17 00:00:00 +0000
- title: 'PAC-Bayesian domain adaptation bounds for multiclass learners'
  abstract: 'Multiclass neural networks are a common tool in modern unsupervised domain adaptation, yet an appropriate theoretical description for their non-uniform sample complexity is lacking in the adaptation literature. To fill this gap, we propose the first PAC-Bayesian adaptation bounds for multiclass learners. We facilitate practical use of our bounds by also proposing the first approximation techniques for the multiclass distribution divergences we consider. For divergences dependent on a Gibbs predictor, we propose additional PAC-Bayesian adaptation bounds which remove the need for inefficient Monte-Carlo estimation. Empirically, we test the efficacy of our proposed approximation techniques as well as some novel design-concepts which we include in our bounds. Finally, we apply our bounds to analyze a common adaptation algorithm that uses neural networks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/sicilia22a.html
  PDF: https://proceedings.mlr.press/v180/sicilia22a/sicilia22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-sicilia22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Anthony
    family: Sicilia
  - given: Katherine
    family: Atwell
  - given: Malihe
    family: Alikhani
  - given: Seong Jae
    family: Hwang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1824-1834
  id: sicilia22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1824
  lastpage: 1834
  published: 2022-08-17 00:00:00 +0000
- title: 'VQ-Flows: Vector quantized local normalizing flows'
  abstract: 'Normalizing flows provide an elegant approach to generative modeling that allows for efficient sampling and exact density evaluation of unknown data distributions. However, current techniques have significant limitations in their expressivity when the data distribution is supported on a low-dimensional manifold or has a non-trivial topology. We introduce a novel statistical framework for learning a mixture of local normalizing flows as “chart maps” over the data manifold. Our framework augments the expressivity of recent approaches while preserving the signature property of normalizing flows, that they admit exact density evaluation. We learn a suitable atlas of charts for the data manifold via a vector quantized auto-encoder (VQ-AE) and the distributions over them using a conditional flow. We validate experimentally that our probabilistic framework enables existing approaches to better model data distributions over complex manifolds.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/sidheekh22a.html
  PDF: https://proceedings.mlr.press/v180/sidheekh22a/sidheekh22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-sidheekh22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sahil
    family: Sidheekh
  - given: Chris B.
    family: Dock
  - given: Tushar
    family: Jain
  - given: Radu
    family: Balan
  - given: Maneesh K.
    family: Singh
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1835-1845
  id: sidheekh22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1835
  lastpage: 1845
  published: 2022-08-17 00:00:00 +0000
- title: 'Enhanced adaptive optics control with image to image translation'
  abstract: 'We aim to significantly enhance the science return of astronomical observatories, and in particular giant terrestrial optical telescopes. Observatories employ Adaptive Optics (AO) systems in order to acquire high-sensitivity, diffraction-limited images of the sky. The incumbent “workhorse” for control of AO systems employs a linear real-time controller in a closed loop, with sensing of state performed via a (Shack-Hartmann) wavefront sensor (WFS). The actuators of a deformable mirror (DM) are driven, with the action performed in each iteration having a continuous representation as an array of DC voltages. The typical control regime is practical and scalable; nonetheless, there remains a residual uncompensated turbulence that leads to optical aberrations limiting the class of scientific assets that can be acquired. We have developed and trained a translational GAN model that accurately estimates residual perturbations from WFS images. Model inference occurs in 0.34 milliseconds using off-the-shelf GPU hardware, and is applicable for use in AO control where the control loop might be running at 500Hz. We develop an AO control regime with a second controller stage actuating a second DM controlled in an open loop according to the estimated residual turbulence. Using the open-source COMPASS tool for simulation, we are able to significantly improve the performance using our new regime.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/smith22a.html
  PDF: https://proceedings.mlr.press/v180/smith22a/smith22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-smith22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Jeffrey
    family: Smith
  - given: Jesse
    family: Cranney
  - given: Charles
    family: Gretton
  - given: Damien
    family: Gratadour
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1846-1856
  id: smith22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1846
  lastpage: 1856
  published: 2022-08-17 00:00:00 +0000
- title: 'Fast inference and transfer of compositional task structures for few-shot task generalization'
  abstract: 'We tackle real-world problems with complex structures beyond the pixel-based game or simulator. We formulate this as a few-shot reinforcement learning problem where a task is characterized by a subtask graph that defines a set of subtasks and their dependencies that are unknown to the agent. Unlike previous meta-RL methods that try to directly infer the unstructured task embedding, our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks, and uses it as a prior to improve the task inference in testing. Our experimental results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks than various existing algorithms such as meta reinforcement learning, hierarchical reinforcement learning, and other heuristic agents.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/sohn22a.html
  PDF: https://proceedings.mlr.press/v180/sohn22a/sohn22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-sohn22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sungryull
    family: Sohn
  - given: Hyunjae
    family: Woo
  - given: Jongwook
    family: Choi
  - given: Lyubing
    family: Qiang
  - given: Izzeddin
    family: Gur
  - given: Aleksandra
    family: Faust
  - given: Honglak
    family: Lee
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1857-1865
  id: sohn22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1857
  lastpage: 1865
  published: 2022-08-17 00:00:00 +0000
- title: 'Mutual information based Bayesian graph neural network for few-shot learning'
  abstract: 'In the deep neural network based few-shot learning, the limited training data may make the neural network extract ineffective features, which leads to inaccurate results. By Bayesian graph neural network (BGNN), the probability distributions on hidden layers imply useful features, and the few-shot learning could be improved by establishing the correlation among features. Thus, in this paper, we incorporate mutual information (MI) into BGNN to describe the correlation, and propose an innovative framework by adopting the Bayesian network with continuous variables (BNCV) for effective calculation of MI. First, we build the BNCV simultaneously when calculating the probability distributions of features from the Dropout in hidden layers of BGNN. Then, we approximate the MI values efficiently by probabilistic inferences over BNCV. Finally, we give the correlation based loss function and training algorithm of our BGNN model. Experimental results show that our MI based BGNN framework is effective for few-shot learning and outperforms some state-of-the-art competitors by large margins on accuracy.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/song22a.html
  PDF: https://proceedings.mlr.press/v180/song22a/song22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-song22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Kaiyu
    family: Song
  - given: Kun
    family: Yue
  - given: Liang
    family: Duan
  - given: Mingze
    family: Yang
  - given: Angsheng
    family: Li
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1866-1875
  id: song22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1866
  lastpage: 1875
  published: 2022-08-17 00:00:00 +0000
- title: 'SMT-based weighted model integration with structure awareness'
  abstract: 'Weighted Model Integration (WMI) is a popular formalism aimed at unifying approaches for probabilistic inference in hybrid domains, involving logical and algebraic constraints. Despite a considerable amount of recent work, allowing WMI algorithms to scale with the complexity of the hybrid problem is still a challenge. In this paper we highlight some substantial limitations of existing state-of-the-art solutions, and develop an algorithm that combines SMT-based enumeration, an efficient technique in formal verification, with an effective encoding of the problem structure.  This allows our algorithm to avoid generating redundant models, resulting in substantial computational savings. An extensive experimental evaluation on both synthetic and real-world datasets confirms the advantage of the proposed solution over existing alternatives.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/spallitta22a.html
  PDF: https://proceedings.mlr.press/v180/spallitta22a/spallitta22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-spallitta22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Giuseppe
    family: Spallitta
  - given: Gabriele
    family: Masina
  - given: Paolo
    family: Morettin
  - given: Andrea
    family: Passerini
  - given: Roberto
    family: Sebastiani
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1876-1885
  id: spallitta22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1876
  lastpage: 1885
  published: 2022-08-17 00:00:00 +0000
- title: 'A robustness test for estimating total effects with covariate adjustment'
  abstract: 'Suppose we want to estimate a total effect with covariate adjustment in a linear structural equation model. We have a causal graph to decide what covariates to adjust for, but are uncertain about the graph. Here, we propose a testing procedure, that exploits the fact that there are multiple valid adjustment sets for the target total effect in the causal graph, to perform a robustness check on the graph. If the test rejects, it is a strong indication that we should not rely on the graph. We discuss what mistakes in the graph our testing procedure can detect and which ones it cannot and develop two strategies on how to select a list of valid adjustment sets for the procedure. We also connect our result to the related econometrics literature on coefficient stability tests.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/su22a.html
  PDF: https://proceedings.mlr.press/v180/su22a/su22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-su22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zehao
    family: Su
  - given: Leonard
    family: Henckel
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1886-1895
  id: su22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1886
  lastpage: 1895
  published: 2022-08-17 00:00:00 +0000
- title: 'Simplified and unified analysis of various learning problems by reduction to Multiple-Instance Learning'
  abstract: 'In statistical learning, many problem formulations have been proposed so far, such as multi-class learning, complementarily labeled learning, multi-label learning, multi-task learning, which provide theoretical models for various real-world tasks. Although they have been extensively studied, the relationship among them has not been fully investigated. In this work, we focus on a particular problem formulation called Multiple-Instance Learning (MIL), and show that various learning problems including all the problems mentioned above along with some new problems can be reduced to MIL with theoretically guaranteed generalization bounds, where the reductions are established under a new reduction scheme we provide as a by-product. The results imply that the MIL-reduction gives a simplified and unified framework for designing and analyzing algorithms for various learning problems. Moreover, we show that the MIL-reduction framework can be kernelized.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/suehiro22a.html
  PDF: https://proceedings.mlr.press/v180/suehiro22a/suehiro22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-suehiro22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Daiki
    family: Suehiro
  - given: Eiji
    family: Takimoto
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1896-1906
  id: suehiro22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1896
  lastpage: 1906
  published: 2022-08-17 00:00:00 +0000
- title: 'Marginal MAP estimation for inverse RL under occlusion with observer noise'
  abstract: 'We consider the problem of learning the behavioral preferences of an expert engaged in a task from noisy and partially-observable demonstrations. This is motivated by real-world applications such as a line robot learning from observing a human worker, where some observations are occluded by environmental elements. Furthermore, robotic perception tends to be imperfect and noisy. Previous techniques for inverse reinforcement learning (IRL) take the approach of either omitting the missing portions or inferring it as part of expectation-maximization, which tends to be slow and prone to local optima. We present a new method that generalizes the well-known Bayesian maximum-a-posteriori (MAP) IRL method by marginalizing the occluded portions of the trajectory. This is then extended with an observation model to account for perception noise. This novel application of marginal MAP (MMAP) to IRL significantly improves on the previous IRL technique under occlusion in both formative evaluations on a toy problem and in a summative evaluation on a produce sorting line task by a physical robot.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/suresh22a.html
  PDF: https://proceedings.mlr.press/v180/suresh22a/suresh22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-suresh22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Prasanth Sengadu
    family: Suresh
  - given: Prashant
    family: Doshi
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1907-1916
  id: suresh22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1907
  lastpage: 1916
  published: 2022-08-17 00:00:00 +0000
- title: 'High-probability bounds for robust stochastic Frank-Wolfe algorithm'
  abstract: 'We develop and analyze robust Stochastic Frank-Wolfe type algorithms for projection-free stochastic convex optimization problems with heavy-tailed stochastic gradients. Existing works on the oracle complexity of such algorithms require a uniformly bounded variance assumption, and hold only in expectation. We develop tight high-probability bounds for robust versions of Stochastic Frank-Wolfe type algorithms under heavy-tailed assumptions, including infinite variance, on the stochastic gradient. Our methodological construction of the robust Stochastic Frank-Wolfe type algorithms leverages techniques from the robust statistics literature. Our theoretical analysis highlights the need to utilize robust versions of Stochastic Frank-Wolfe type algorithms for dealing with heavy-tailed data arising in practice.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/tang22a.html
  PDF: https://proceedings.mlr.press/v180/tang22a/tang22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-tang22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tongyi
    family: Tang
  - given: Krishna
    family: Balasubramanian
  - given: Thomas
    family: Chun Man Lee
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1917-1927
  id: tang22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1917
  lastpage: 1927
  published: 2022-08-17 00:00:00 +0000
- title: 'Contrastive latent variable models for neural text generation'
  abstract: 'Deep latent variable models such as variational autoencoders and energy-based models are widely used for neural text generation. Most of them focus on matching the prior distribution with the posterior distribution of the latent variable for text reconstruction. In addition to instance-level reconstruction, this paper aims to integrate contrastive learning in the latent space, forcing the latent variables to learn high-level semantics by exploring inter-instance relationships. Experiments on various text generation benchmarks show the effectiveness of our proposed method. We also empirically show that our method can mitigate the posterior collapse issue for latent variable based text generation models.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/teng22a.html
  PDF: https://proceedings.mlr.press/v180/teng22a/teng22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-teng22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zhiyang
    family: Teng
  - given: Chenhua
    family: Chen
  - given: Yan
    family: Zhang
  - given: Yue
    family: Zhang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1928-1938
  id: teng22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1928
  lastpage: 1938
  published: 2022-08-17 00:00:00 +0000
- title: 'Semi-supervised novelty detection using ensembles with regularized disagreement'
  abstract: 'Deep neural networks often predict samples with high confidence even when they come from unseen classes and should instead be flagged for expert evaluation.  Current novelty detection algorithms cannot reliably identify such near OOD points unless they have access to labeled data that is similar to these novel samples. In this paper, we develop a new ensemble-based procedure for semi-supervised novelty detection (SSND) that successfully leverages a mixture of unlabeled ID and novel-class samples to achieve good detection performance.  In particular, we show how to achieve disagreement only on OOD data using early stopping regularization. While we prove this fact for a simple data distribution, our extensive experiments suggest that it holds true for more complex scenarios: our approach significantly outperforms state-of-the-art SSND methods on standard image data sets (SVHN/CIFAR-10/CIFAR-100) and medical image data sets with only a negligible increase in computation cost.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/tifrea22a.html
  PDF: https://proceedings.mlr.press/v180/tifrea22a/tifrea22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-tifrea22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Alexandru
    family: Tifrea
  - given: Eric
    family: Stavarache
  - given: Fanny
    family: Yang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1939-1948
  id: tifrea22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1939
  lastpage: 1948
  published: 2022-08-17 00:00:00 +0000
- title: 'Efficient inference for dynamic topic modeling with large vocabularies'
  abstract: 'Dynamic topic modeling is a well established tool for capturing the temporal dynamics of the topics of a corpus. In this work, we develop a scalable dynamic topic model by utilizing the correlation among the words in the vocabulary. By correlating previously independent temporal processes for words, our new model allows us to reliably estimate the topic representations containing less frequent words. We develop an amortised variational inference method with self-normalised importance sampling approximation to the word distribution that dramatically reduces the computational complexity and the number of variational parameters in order to handle large vocabularies. With extensive experiments on text datasets, we show that our method significantly outperforms the previous works by modeling word correlations, and it is able to handle real world data with a large vocabulary which could not be processed by previous continuous dynamic topic models.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/tomasi22a.html
  PDF: https://proceedings.mlr.press/v180/tomasi22a/tomasi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-tomasi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Federico
    family: Tomasi
  - given: Mounia
    family: Lalmas
  - given: Zhenwen
    family: Dai
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1950-1959
  id: tomasi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1950
  lastpage: 1959
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning linear non-Gaussian polytree models'
  abstract: 'In the context of graphical causal discovery, we adapt the versatile framework of linear non-Gaussian acyclic models (LiNGAMs) to propose new algorithms to efficiently learn graphs that are polytrees. Our approach combines the Chow–Liu algorithm, which first learns the undirected tree structure, with novel schemes to orient the edges. The orientation schemes assess algebraic relations among moments of the data-generating distribution and are computationally inexpensive. We establish high-dimensional consistency results for our approach and compare different algorithmic versions in numerical experiments.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/tramontano22a.html
  PDF: https://proceedings.mlr.press/v180/tramontano22a/tramontano22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-tramontano22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Daniele
    family: Tramontano
  - given: Anthea
    family: Monod
  - given: Mathias
    family: Drton
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1960-1969
  id: tramontano22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1960
  lastpage: 1969
  published: 2022-08-17 00:00:00 +0000
- title: 'Multi-source domain adaptation via weighted joint distributions optimal transport'
  abstract: 'This work addresses the problem of domain adaptation on an unlabeled target dataset using knowledge from multiple labelled source datasets. Most current approaches tackle this problem by searching for an embedding that is invariant across source and target domains, which corresponds to searching for a universal classifier that works well on all domains. In this paper, we address this problem from a new perspective: instead of crushing diversity of the source distributions, we exploit it to adapt better to the target distribution. Our method, named Multi-Source Domain Adaptation via Weighted Joint Distribution Optimal Transport (MSDA-WJDOT), aims at finding simultaneously an Optimal Transport-based alignment between the source and target distributions and a re-weighting of the source distributions. We discuss the theoretical aspects of the method and propose a conceptually simple algorithm. Numerical experiments indicate that the proposed method achieves state-of-the-art performance on simulated and real datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/turrisi22a.html
  PDF: https://proceedings.mlr.press/v180/turrisi22a/turrisi22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-turrisi22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Rosanna
    family: Turrisi
  - given: Rémi
    family: Flamary
  - given: Alain
    family: Rakotomamonjy
  - given: Massimiliano
    family: Pontil
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1970-1980
  id: turrisi22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1970
  lastpage: 1980
  published: 2022-08-17 00:00:00 +0000
- title: 'Towards unsupervised open world semantic segmentation'
  abstract: 'For the semantic segmentation of images, state-of-the-art deep neural networks (DNNs) achieve high segmentation accuracy if that task is restricted to a closed set of classes. However, as of now DNNs have limited ability to operate in an open world, where they are tasked to identify pixels belonging to unknown objects and eventually to learn novel classes, incrementally. Humans have the capability to say: “I don’t know what that is, but I’ve already seen something like that”. Therefore, it is desirable to perform such an incremental learning task in an unsupervised fashion. We introduce a method where unknown objects are clustered based on visual similarity. Those clusters are utilized to define new classes and serve as training data for unsupervised incremental learning. More precisely, the connected components of a predicted semantic segmentation are assessed by a segmentation quality estimate. Connected components with a low estimated prediction quality are candidates for a subsequent clustering. Additionally, the component-wise quality assessment allows for obtaining predicted segmentation masks for the image regions potentially containing unknown objects. The respective pixels of such masks are pseudo-labeled and afterwards used for re-training the DNN, i.e., without the use of ground truth generated by humans. In our experiments we demonstrate that, without access to ground truth and even with few data, a DNN’s class space can be extended by a novel class, achieving considerable segmentation accuracy.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/uhlemeyer22a.html
  PDF: https://proceedings.mlr.press/v180/uhlemeyer22a/uhlemeyer22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-uhlemeyer22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Svenja
    family: Uhlemeyer
  - given: Matthias
    family: Rottmann
  - given: Hanno
    family: Gottschalk
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1981-1991
  id: uhlemeyer22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1981
  lastpage: 1991
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning invariant weights in neural networks'
  abstract: 'Assumptions about invariances or symmetries in data can significantly increase the predictive power of statistical models. Many commonly used machine learning models are constrained to respect certain symmetries, such as translation equivariance in convolutional neural networks, and incorporating other symmetry types is actively being studied. Yet, learning invariances from the data itself remains an open research problem. It has been shown that the marginal likelihood offers a principled way to learn invariances in Gaussian Processes. We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks, resulting in naturally higher performing models.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ouderaa22a.html
  PDF: https://proceedings.mlr.press/v180/ouderaa22a/ouderaa22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ouderaa22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Tycho F.A.
    prefix: van der
    family: Ouderaa
  - given: Mark
    prefix: van der
    family: Wilk
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 1992-2001
  id: ouderaa22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 1992
  lastpage: 2001
  published: 2022-08-17 00:00:00 +0000
- title: 'Causal forecasting: generalization bounds for autoregressive models'
  abstract: 'Despite the increasing relevance of forecasting methods, causal implications of these algorithms remain largely unexplored. This is concerning considering that, even under simplifying assumptions such as causal sufficiency, the statistical risk of a model can differ significantly from its causal risk. Here, we study the problem of causal generalization—generalizing from the observational to interventional distributions—in forecasting. Our goal is to find answers to the question: How does the efficacy of an autoregressive (VAR) model in predicting statistical associations compare with its ability to predict under interventions? To this end, we introduce the framework of causal learning theory for forecasting. Using this framework, we obtain a characterization of the difference between statistical and causal risks, which helps identify sources of divergence between them. Under causal sufficiency, the problem of causal generalization amounts to learning under covariate shifts albeit with additional structure (restriction to interventional distributions under the VAR model). This structure allows us to obtain uniform convergence bounds on causal generalizability for the class of VAR models. To the best of our knowledge, this is the first work that provides theoretical guarantees for causal generalization in the time-series setting.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/vankadara22a.html
  PDF: https://proceedings.mlr.press/v180/vankadara22a/vankadara22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-vankadara22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Leena Chennuru
    family: Vankadara
  - given: Philipp Michael
    family: Faller
  - given: Michaela
    family: Hardt
  - given: Lenon
    family: Minorics
  - given: Debarghya
    family: Ghoshdastidar
  - given: Dominik
    family: Janzing
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2002-2012
  id: vankadara22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2002
  lastpage: 2012
  published: 2022-08-17 00:00:00 +0000
- title: 'Intervention target estimation in the presence of latent variables'
  abstract: 'This paper considers the problem of estimating unknown intervention targets in causal directed acyclic graphs from observational and interventional data in the presence of latent variables. The focus is on linear structural equation models with soft interventions. The existing approaches to this problem involve performing extensive conditional independence tests, and they estimate the unknown intervention targets alongside learning the structure of the causal model in its entirety. This joint learning approach results in algorithms that are not scalable as graph sizes grow. This paper proposes an approach that does not necessitate learning the entire causal model and focuses on learning only the intervention targets. The key idea of this approach is leveraging the property that interventions impose sparse changes in the precision matrix of a linear model. The proposed framework consists of a sequence of precision difference estimation steps. Furthermore, the necessary knowledge to refine an observational Markov equivalence class (MEC) to an interventional MEC is inferred. Simulation results are provided to illustrate the scalability of the proposed algorithm and compare it with those of the existing approaches.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/varici22a.html
  PDF: https://proceedings.mlr.press/v180/varici22a/varici22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-varici22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Burak
    family: Varici
  - given: Karthikeyan
    family: Shanmugam
  - given: Prasanna
    family: Sattigeri
  - given: Ali
    family: Tajer
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2013-2023
  id: varici22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2013
  lastpage: 2023
  published: 2022-08-17 00:00:00 +0000
- title: 'Bayesian federated estimation of causal effects from observational data'
  abstract: 'We propose a Bayesian framework for estimating causal effects from federated observational data sources. Bayesian causal inference is an important approach to learning the distribution of the causal estimands and understanding the uncertainty of causal effects. Our framework estimates the posterior distributions of the causal effects to compute the higher-order statistics that capture the uncertainty. We integrate local causal effects from different data sources without centralizing them. We then estimate the treatment effects from observational data using a non-parametric reformulation of the classical potential outcomes framework. We model the potential outcomes as a random function distributed by Gaussian processes, with defining parameters that can be efficiently learned from multiple data sources. Our method avoids exchanging raw data among the sources, thus contributing towards privacy-preserving causal learning. The promise of our approach is demonstrated through a set of simulated and real-world examples.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/vo22a.html
  PDF: https://proceedings.mlr.press/v180/vo22a/vo22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-vo22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Thanh Vinh
    family: Vo
  - given: Young
    family: Lee
  - given: Trong Nghia
    family: Hoang
  - given: Tze-Yun
    family: Leong
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2024-2034
  id: vo22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2024
  lastpage: 2034
  published: 2022-08-17 00:00:00 +0000
- title: 'Bias aware probabilistic Boolean matrix factorization'
  abstract: 'Boolean matrix factorization (BMF) is a combinatorial problem arising from a wide range of applications including recommendation systems, collaborative filtering, and dimensionality reduction. Currently, the noise model of existing BMF methods is often assumed to be homoscedastic; however, in real world data scenarios, the deviations of observed data from their true values are almost surely diverse due to stochastic noises, making each data point not equally suitable for fitting a model. In this case, it is not ideal to treat all data points as equally distributed. Motivated by such observations, we introduce a probabilistic BMF model that recognizes the object- and feature-wise bias distribution respectively, called bias aware BMF (BABF). To the best of our knowledge, BABF is the first approach for Boolean decomposition with consideration of the feature-wise and object-wise bias in binary data. We conducted experiments on datasets with different levels of background noise, bias level, and sizes of the signal patterns, to test the effectiveness of our method in various scenarios. We demonstrated that our model outperforms the state-of-the-art factorization methods in both accuracy and efficiency in recovering the original datasets, and the inferred bias level is highly significantly correlated with true existing bias in both simulated and real world datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wan22a.html
  PDF: https://proceedings.mlr.press/v180/wan22a/wan22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wan22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Changlin
    family: Wan
  - given: Pengtao
    family: Dang
  - given: Tong
    family: Zhao
  - given: Yong
    family: Zang
  - given: Chi
    family: Zhang
  - given: Sha
    family: Cao
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2035-2044
  id: wan22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2035
  lastpage: 2044
  published: 2022-08-17 00:00:00 +0000
- title: 'Meta-learning without data via Wasserstein distributionally-robust model fusion'
  abstract: 'Existing meta-learning works assume that each task has available training and testing data. However, in practice, many pre-trained models are available without access to their training data. We often need a single model to solve different tasks simultaneously, as this makes deployment much more convenient. Our work aims to meta-learn a model initialization from these pre-trained models without using corresponding training data. We name this challenging problem setting Data-Free Learning To Learn (DFL2L). We propose a distributionally robust optimization (DRO) framework to learn a black-box model to fuse and compress all the pre-trained models into a single network to address this problem. To encourage good generalization to the unseen new tasks, the proposed DRO framework diversifies the learned task embedding associated with each pre-trained model to cover the diversity in the underlying training task distributions. A model initialization is sampled from the black-box network during meta-testing as the meta learned initialization. Extensive experiments on offline and online DFL2L settings and several real image datasets demonstrate the effectiveness of the proposed methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wang22a.html
  PDF: https://proceedings.mlr.press/v180/wang22a/wang22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wang22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zhenyi
    family: Wang
  - given: Xiaoyang
    family: Wang
  - given: Li
    family: Shen
  - given: Qiuling
    family: Suo
  - given: Kaiqiang
    family: Song
  - given: Dong
    family: Yu
  - given: Yan
    family: Shen
  - given: Mingchen
    family: Gao
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2045-2055
  id: wang22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2045
  lastpage: 2055
  published: 2022-08-17 00:00:00 +0000
- title: 'Detecting textual adversarial examples through randomized substitution and vote'
  abstract: 'A line of work has shown that natural text processing models are vulnerable to adversarial examples. Correspondingly, various defense methods are proposed to mitigate the threat of textual adversarial examples, \textit{e.g.} adversarial training, input transformations, detection, \textit{etc}. In this work, we treat the optimization process for synonym substitution based textual adversarial attacks as a specific sequence of word replacement, in which each word mutually influences other words. We identify that we could destroy such mutual interaction and eliminate the adversarial perturbation by randomly substituting a word with its synonyms. Based on this observation, we propose a novel textual adversarial example detection method, termed \textit{Randomized Substitution and Vote} (RS&V), which votes the prediction label by accumulating the logits of $k$ samples generated by randomly substituting the words in the input text with synonyms. The proposed RS&V is generally applicable to any existing neural networks without modification on the architecture or extra training, and it is orthogonal to prior work on making the classification network itself more robust. Empirical evaluations on three benchmark datasets demonstrate that our RS&V could detect the textual adversarial examples more successfully than the existing detection methods while maintaining the high classification accuracy on benign samples.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wang22b.html
  PDF: https://proceedings.mlr.press/v180/wang22b/wang22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wang22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Xiaosen
    family: Wang
  - given: Yifeng
    family: Xiong
  - given: Kun
    family: He
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2056-2065
  id: wang22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2056
  lastpage: 2065
  published: 2022-08-17 00:00:00 +0000
- title: 'ST-MAML : A stochastic-task based method for task-heterogeneous meta-learning'
  abstract: 'Optimization-based meta-learning typically assumes tasks are sampled from a single distribution, an assumption that oversimplifies and limits the diversity of tasks that meta-learning can model. Handling tasks from multiple distributions is challenging for meta-learning because it adds ambiguity to task identities. This paper proposes a novel method, ST-MAML, that empowers model-agnostic meta-learning (MAML) to learn from multiple task distributions. ST-MAML encodes tasks using a stochastic neural network module that summarizes every task with a stochastic representation. The proposed Stochastic Task (ST) strategy learns a distribution of solutions for an ambiguous task and allows a meta-model to self-adapt to the current task. ST-MAML also propagates the task representation to enhance input variable encodings. Empirically, we demonstrate that ST-MAML outperforms the state-of-the-art on two few-shot image classification tasks, one curve regression benchmark, one image completion problem, and a real-world temperature prediction application.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wang22c.html
  PDF: https://proceedings.mlr.press/v180/wang22c/wang22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wang22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zhe
    family: Wang
  - given: Jake
    family: Grigsby
  - given: Arshdeep
    family: Sekhon
  - given: Yanjun
    family: Qi
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2066-2074
  id: wang22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2066
  lastpage: 2074
  published: 2022-08-17 00:00:00 +0000
- title: 'Toward learning human-aligned cross-domain robust models by countering misaligned features'
  abstract: 'Machine learning has demonstrated remarkable prediction accuracy over i.i.d. data, but the accuracy often drops when tested with data from another distribution. In this paper, we offer another view of this problem, assuming that the reason behind this accuracy drop is the reliance of models on features that are not well aligned with what a data annotator considers similar across the two datasets. We refer to these features as misaligned features. We extend the conventional generalization error bound to a new one for this setup with the knowledge of how the misaligned features are associated with the label. Our analysis offers a set of techniques for this problem, and these techniques are naturally linked to many previous methods in robust machine learning literature. We also compare the empirical strength of these methods and demonstrate the performance when these previous techniques are combined, with implementation available.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wang22d.html
  PDF: https://proceedings.mlr.press/v180/wang22d/wang22d.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wang22d.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Haohan
    family: Wang
  - given: Zeyi
    family: Huang
  - given: Hanlin
    family: Zhang
  - given: Yong Jae
    family: Lee
  - given: Eric P.
    family: Xing
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2075-2084
  id: wang22d
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2075
  lastpage: 2084
  published: 2022-08-17 00:00:00 +0000
- title: 'Generalized Bayesian quadrature with spectral kernels'
  abstract: 'Bayesian probabilistic integration, or Bayesian quadrature (BQ), has arisen as a popular means of numerical integral estimation with quantified uncertainty for problems where computational cost limits data availability. BQ leverages flexible Gaussian processes (GPs) to model an integrand which can be subsequently analytically integrated through properties of Gaussian distributions. However, BQ is inherently limited because it relies on a strict set of kernels in the GP model of the integrand, reducing the flexibility of the method in modeling varied integrand types. In this paper, we present spectral Bayesian quadrature, a form of Bayesian quadrature that allows for the use of any shift-invariant kernel in the integrand GP model while still maintaining the analytical tractability of the integral posterior, increasing the flexibility of BQ methods to address varied problem settings. Additionally, our method enables integration with respect to a uniform expectation, effectively computing definite integrals of challenging integrands. We derive the theory and error bounds for this model, as well as demonstrate GBQ’s improved accuracy, flexibility, and data efficiency, compared to traditional BQ and other numerical integration methods, on a variety of quadrature problems.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/warren22a.html
  PDF: https://proceedings.mlr.press/v180/warren22a/warren22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-warren22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Houston
    family: Warren
  - given: Rafael
    family: Oliveira
  - given: Fabio
    family: Ramos
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2085-2095
  id: warren22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2085
  lastpage: 2095
  published: 2022-08-17 00:00:00 +0000
- title: 'Causal discovery under a confounder blanket'
  abstract: 'Inferring causal relationships from observational data is rarely straightforward, but the problem is especially difficult in high dimensions. For these applications, causal discovery algorithms typically require parametric restrictions or extreme sparsity constraints. We relax these assumptions and focus on an important but more specialized problem, namely recovering the causal order among a subgraph of variables known to descend from some (possibly large) set of confounding covariates, i.e. a $\textit{confounder blanket}$. This is useful in many settings, for example when studying a dynamic biomolecular subsystem with genetic data providing background information. Under a structural assumption called the $\textit{confounder blanket principle}$, which we argue is essential for tractable causal discovery in high dimensions, our method accommodates graphs of low or high sparsity while maintaining polynomial time complexity. We present a structure learning algorithm that is provably sound and complete with respect to a so-called $\textit{lazy oracle}$. We design inference procedures with finite sample error control for linear and nonlinear systems, and demonstrate our approach on a range of simulated and real-world datasets. An accompanying $\texttt{R}$ package, $\texttt{cbl}$, is available from $\texttt{CRAN}$.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/watson22a.html
  PDF: https://proceedings.mlr.press/v180/watson22a/watson22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-watson22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: David S.
    family: Watson
  - given: Ricardo
    family: Silva
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2096-2106
  id: watson22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2096
  lastpage: 2106
  published: 2022-08-17 00:00:00 +0000
- title: 'A new constructive criterion for Markov equivalence of MAGs'
  abstract: 'Ancestral graphs are an important tool for encoding causal knowledge as they represent uncertainty about the presence of latent confounding and selection bias, and they can be inferred from data. As for other graphical models, several maximal ancestral graphs (MAGs) may encode the same statistical information in the form of conditional independencies. Such MAGs are said to be Markov equivalent. This work concerns graphical characterizations and computational aspects of Markov equivalence between MAGs. These issues have been studied in past years leading to several criteria and methods to test Markov equivalence. The state-of-the-art algorithm, provided by Hu and Evans [UAI 2020], runs in time $O(n^5)$ for instances with $n$ vertices. We propose a new constructive graphical criterion for the Markov equivalence of MAGs, which allows us to develop a practically effective equivalence test with worst-case runtime $O(n^3)$. Additionally, our criterion is expressed in terms of natural graphical concepts, which is of independent value.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wienobst22a.html
  PDF: https://proceedings.mlr.press/v180/wienobst22a/wienobst22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wienobst22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Marcel
    family: Wienöbst
  - given: Max
    family: Bannach
  - given: Maciej
    family: Liśkiewicz
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2107-2116
  id: wienobst22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2107
  lastpage: 2116
  published: 2022-08-17 00:00:00 +0000
- title: 'Residual bootstrap exploration for stochastic linear bandit'
  abstract: 'We propose a new bootstrap-based online algorithm for stochastic linear bandit problems. The key idea is to adopt residual bootstrap exploration, in which the agent estimates the next step reward by re-sampling the residuals of the mean reward estimate. Our algorithm, residual bootstrap exploration for stochastic linear bandit (\texttt{LinReBoot}), estimates the linear reward from its re-sampling distribution and pulls the arm with the highest reward estimate. In particular, we contribute a theoretical framework to demystify residual bootstrap-based exploration mechanisms in stochastic linear bandit problems. The key insight is that the strength of bootstrap exploration is based on collaborated optimism between the online-learned model and the re-sampling distribution of residuals. This observation enables us to show that the proposed \texttt{LinReBoot} secures a high-probability $\tilde{O}(d \sqrt{n})$ sub-linear regret under mild conditions. Our experiments support the easy generalizability of the \texttt{ReBoot} principle in the various formulations of linear bandit problems and show the significant computational efficiency of \texttt{LinReBoot}.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wu22a.html
  PDF: https://proceedings.mlr.press/v180/wu22a/wu22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wu22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Shuang
    family: Wu
  - given: Chi-Hua
    family: Wang
  - given: Yuantong
    family: Li
  - given: Guang
    family: Cheng
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2117-2127
  id: wu22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2117
  lastpage: 2127
  published: 2022-08-17 00:00:00 +0000
- title: 'Differentially private multi-party data release for linear regression'
  abstract: 'Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects. However, the majority of prior work has focused on scenarios where a single party owns all the data. In this paper, we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects. Within the context of linear regression that allows all parties to train models on the complete data without the ability to infer private attributes or identities of individuals, we start with directly applying the Gaussian mechanism and show it has the small eigenvalue problem. We further propose our novel method and prove it asymptotically converges to the optimal (non-private) solutions with increasing dataset size. We substantiate the theoretical results through experiments on both artificial and real-world datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wu22b.html
  PDF: https://proceedings.mlr.press/v180/wu22b/wu22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wu22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Ruihan
    family: Wu
  - given: Xin
    family: Yang
  - given: Yuanshun
    family: Yao
  - given: Jiankai
    family: Sun
  - given: Tianyi
    family: Liu
  - given: Q. Kilian
    family: Weinberger
  - given: Chong
    family: Wang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2128-2137
  id: wu22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2128
  lastpage: 2137
  published: 2022-08-17 00:00:00 +0000
- title: 'Partial likelihood Thompson sampling'
  abstract: 'We consider the problem of deciding how best to target and prioritize existing vaccines that may offer protection against new variants of an infectious disease. Sequential experiments are a promising approach; however, challenges due to delayed feedback and the overall ebb and flow of disease prevalence make available methods inapplicable for this task. We present a method, partial likelihood Thompson sampling, that can handle these challenges. Our method involves running Thompson sampling with belief updates determined by partial likelihood each time we observe an event. To test our approach, we ran a semi-synthetic experiment based on 200 days of COVID-19 infection data in the US.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/wu22c.html
  PDF: https://proceedings.mlr.press/v180/wu22c/wu22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-wu22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Han
    family: Wu
  - given: Stefan
    family: Wager
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2138-2147
  id: wu22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2138
  lastpage: 2147
  published: 2022-08-17 00:00:00 +0000
- title: 'Fine-Grained matching with multi-perspective similarity modeling for cross-modal retrieval'
  abstract: 'Cross-modal retrieval relies on learning inter-modal correspondences. Most existing approaches focus on learning global or local correspondence and fail to explore fine-grained multi-level alignments. Moreover, it remains to be investigated how to infer more accurate similarity scores. In this paper, we propose a novel fine-grained matching with Multi-Perspective Similarity Modeling (MPSM) Network for cross-modal retrieval. Specifically, the Knowledge Graph Iterative Dissemination (KGID) module is designed to iteratively broadcast global semantic knowledge, enabling domain information to be integrated and relevant nodes to be associated, resulting in fine-grained modality representations. Subsequently, vector-based similarity representations are learned from multiple perspectives to model multi-level alignments comprehensively. The Similarity Relation Graph Reconstruction (SRGR) module is further developed to enhance cross-modal correspondence by constructing similarity relation graphs and adaptively reconstructing them. Extensive experiments on the Flickr30K and MSCOCO datasets validate that our model significantly outperforms several state-of-the-art baselines.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/xie22a.html
  PDF: https://proceedings.mlr.press/v180/xie22a/xie22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-xie22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Xiumin
    family: Xie
  - given: Chuanwen
    family: Hou
  - given: Zhixin
    family: Li
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2148-2158
  id: xie22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2148
  lastpage: 2158
  published: 2022-08-17 00:00:00 +0000
- title: 'Deterministic policy gradient: Convergence analysis'
  abstract: 'The deterministic policy gradient (DPG) method proposed in Silver et al. [2014] has been demonstrated to exhibit superior performance particularly for applications with multi-dimensional and continuous action spaces. However, it remains unclear whether DPG converges, and if so, how fast it converges and whether it converges as efficiently as other PG methods. In this paper, we provide a theoretical analysis of DPG to answer those questions. We study the single timescale DPG (often the case in practice) in both on-policy and off-policy settings, and show that both algorithms attain an $\epsilon$-accurate stationary policy with a sample complexity of $\mathcal{O}(\epsilon^{-2})$. Moreover, we establish the convergence rate for DPG under Gaussian noise exploration, which is widely adopted in practice to improve the performance of DPG. To our best knowledge, this is the first non-asymptotic convergence characterization for DPG methods.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/xiong22a.html
  PDF: https://proceedings.mlr.press/v180/xiong22a/xiong22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-xiong22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Huaqing
    family: Xiong
  - given: Tengyu
    family: Xu
  - given: Lin
    family: Zhao
  - given: Yingbin
    family: Liang
  - given: Wei
    family: Zhang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2159-2169
  id: xiong22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2159
  lastpage: 2169
  published: 2022-08-17 00:00:00 +0000
- title: 'Finite-horizon equilibria for neuro-symbolic concurrent stochastic games'
  abstract: 'We present novel techniques for neuro-symbolic concurrent stochastic games, a recently proposed modelling formalism to represent a set of probabilistic agents operating in a continuous-space environment using a combination of neural network based perception mechanisms and traditional symbolic methods. To date, only zero-sum variants of the model were studied, which is too restrictive when agents have distinct objectives. We formalise notions of equilibria for these models and present algorithms to synthesise them. Focusing on the finite-horizon setting, and (global) social welfare subgame-perfect optimality, we consider two distinct types: Nash equilibria and correlated equilibria. We first show that an exact solution based on backward induction may yield arbitrarily bad equilibria. We then propose an approximation algorithm called frozen subgame improvement, which proceeds through iterative solution of nonlinear programs. We develop a prototype implementation and demonstrate the benefits of our approach on two case studies: an automated car-parking system and an aircraft collision avoidance system.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yan22a.html
  PDF: https://proceedings.mlr.press/v180/yan22a/yan22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yan22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Rui
    family: Yan
  - given: Gabriel
    family: Santos
  - given: Xiaoming
    family: Duan
  - given: David
    family: Parker
  - given: Marta
    family: Kwiatkowska
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2170-2180
  id: yan22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2170
  lastpage: 2180
  published: 2022-08-17 00:00:00 +0000
- title: 'Addressing token uniformity in transformers via singular value transformation'
  abstract: 'Token uniformity is commonly observed in transformer-based models, in which different tokens share a large proportion of similar information after going through multiple stacked self-attention layers in a transformer. In this paper, we propose to use the distribution of singular values of outputs of each transformer layer to characterise the phenomenon of token uniformity and empirically illustrate that a less skewed singular value distribution can alleviate the token uniformity problem. Based on our observations, we define several desirable properties of singular value distributions and propose a novel transformation function for updating the singular values. We show that apart from alleviating token uniformity, the transformation function should preserve the local neighbourhood structure in the original embedding space. Our proposed singular value transformation function is applied to a range of transformer-based language models such as BERT, ALBERT, RoBERTa and DistilBERT, and improved performance is observed in semantic textual similarity evaluation and a range of GLUE tasks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yan22b.html
  PDF: https://proceedings.mlr.press/v180/yan22b/yan22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yan22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Hanqi
    family: Yan
  - given: Lin
    family: Gui
  - given: Wenjie
    family: Li
  - given: Yulan
    family: He
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2181-2191
  id: yan22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2181
  lastpage: 2191
  published: 2022-08-17 00:00:00 +0000
- title: 'Differentially private SGDA for minimax problems'
  abstract: 'Stochastic gradient descent ascent (SGDA) and its variants have been the workhorse for solving minimax problems. However, in contrast to the well-studied stochastic gradient descent (SGD) with differential privacy (DP) constraints, there is little work on understanding the generalization (utility) of SGDA with DP constraints. In this paper, we use the algorithmic stability approach to establish the generalization (utility) of DP-SGDA in different settings. In particular, for the convex-concave setting, we prove that the DP-SGDA can achieve an optimal utility rate in terms of the weak primal-dual population risk in both smooth and non-smooth cases. To our best knowledge, this is the first-ever-known result for DP-SGDA in the non-smooth case. We further provide its utility analysis in the nonconvex-strongly-concave setting which is the first-ever-known result in terms of the primal population risk. The convergence and generalization results for this nonconvex setting are new even in the non-private setting. Finally, numerical experiments are conducted to demonstrate the effectiveness of DP-SGDA for both convex and nonconvex cases.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yang22a.html
  PDF: https://proceedings.mlr.press/v180/yang22a/yang22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yang22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zhenhuan
    family: Yang
  - given: Shu
    family: Hu
  - given: Yunwen
    family: Lei
  - given: Kush R.
    family: Varshney
  - given: Siwei
    family: Lyu
  - given: Yiming
    family: Ying
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2192-2202
  id: yang22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2192
  lastpage: 2202
  published: 2022-08-17 00:00:00 +0000
- title: 'Self-supervised representations for multi-view reinforcement learning'
  abstract: 'Learning policies from raw pixel images is quite important for the real-world application of deep reinforcement learning (RL). Standard model-free RL algorithms focus on single-view settings and unify the representation learning and policy learning into an end-to-end training process. However, such a learning paradigm is sample-inefficient and sensitive to hyper-parameters when supervised merely by the reward signals. Based on this, we present Self-Supervised Representations (S2R) for multi-view reinforcement learning, a sample-efficient representation learning method for learning features from high-dimensional images. In S2R, we introduce a representation learning framework and define a novel multi-view auxiliary objective based on the multi-view image states and Conditional Entropy Bottleneck (CEB) principle. We integrate S2R with the deep RL agent to learn robust representations that preserve task-relevant information while discarding task-irrelevant information and find optimal policies that maximize the expected return. Empirically, we demonstrate the effectiveness of S2R in the visual DeepMind Control (DMControl) suite and show its better performance on the default DMControl tasks and their variants by replacing the tasks’ default background with a random image or natural video.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yang22b.html
  PDF: https://proceedings.mlr.press/v180/yang22b/yang22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yang22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Huanhuan
    family: Yang
  - given: Dianxi
    family: Shi
  - given: Guojun
    family: Xie
  - given: Yingxuan
    family: Peng
  - given: Yi
    family: Zhang
  - given: Yantai
    family: Yang
  - given: Shaowu
    family: Yang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2203-2213
  id: yang22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2203
  lastpage: 2213
  published: 2022-08-17 00:00:00 +0000
- title: 'Robust textual embedding against word-level adversarial attacks'
  abstract: 'We attribute the vulnerability of natural language processing models to the fact that similar inputs are converted to dissimilar representations in the embedding space, leading to inconsistent outputs, and we propose a novel robust training method, termed Fast Triplet Metric Learning (FTML). Specifically, we argue that the original sample should have a representation similar to those of its adversarial counterparts and distinct from those of other samples for better robustness. To this end, we incorporate triplet metric learning into standard training to pull words closer to their positive samples (i.e., synonyms) and push away their negative samples (i.e., non-synonyms) in the embedding space. Extensive experiments demonstrate that FTML can significantly improve model robustness against various advanced adversarial attacks while keeping competitive classification accuracy on original samples. Besides, our method is efficient as it only needs to adjust the embedding and introduces very little overhead on the standard training. Our work shows great potential for improving textual robustness through robust word embedding.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yang22c.html
  PDF: https://proceedings.mlr.press/v180/yang22c/yang22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yang22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yichen
    family: Yang
  - given: Xiaosen
    family: Wang
  - given: Kun
    family: He
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2214-2224
  id: yang22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2214
  lastpage: 2224
  published: 2022-08-17 00:00:00 +0000
- title: 'CoSPA: An improved masked language model with copy mechanism for Chinese spelling correction'
  abstract: 'Existing BERT-based models for Chinese spelling correction (CSC) have three issues. 1) BERT tends to rectify a correct low-frequency collocation into a high-frequency one, which leads to over-correction. 2) It fails to completely detect phonic or morphological errors using the currently learned similarity knowledge between Chinese characters, and the recall rate still has room to improve. 3) Two-dimensional glyph information of Chinese characters is overlooked, and some morphologically misused characters may be difficult to detect. This paper proposes a hybrid approach, CoSPA, to address these issues. 1) This paper proposes an alterable copy mechanism to alleviate over-correction by jointly learning to copy a correct character from the input sentence or generate a character from BERT. No previous method has used a copy mechanism in BERT for CSC. 2) The attention mechanism is further applied on the phonic and shape representation of each character at the output layer. 3) Shape representation is enhanced by mining character glyphs with ResNet and fused with stroke representation via an adaptive gating unit. The experimental results show that CoSPA outperforms the previous state-of-the-art methods on the SIGHAN2015 datasets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yang22d.html
  PDF: https://proceedings.mlr.press/v180/yang22d/yang22d.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yang22d.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Shoujian
    family: Yang
  - given: Lian
    family: Yu
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2225-2234
  id: yang22d
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2225
  lastpage: 2234
  published: 2022-08-17 00:00:00 +0000
- title: 'Noisy L0-sparse subspace clustering on dimensionality reduced data'
  abstract: 'Sparse subspace clustering methods with sparsity induced by the L0-norm, such as L0-Sparse Subspace Clustering (L0-SSC), have been demonstrated to be more effective than their L1 counterparts such as Sparse Subspace Clustering (SSC). However, the theoretical analysis of L0-SSC is restricted to clean data that lie exactly in subspaces. Real data often suffer from noise and may only lie close to subspaces. In this paper, we show that an optimal solution to the optimization problem of noisy L0-SSC achieves the subspace detection property (SDP), a key element with which data from different subspaces are separated, under deterministic and semi-random models. Our results provide a theoretical guarantee on the correctness of noisy L0-SSC in terms of SDP on noisy data for the first time, which reveals the advantage of noisy L0-SSC in terms of a much less restrictive condition on subspace affinity. In order to improve the efficiency of noisy L0-SSC, we propose Noisy-DR-L0-SSC, which provably recovers the subspaces on dimensionality reduced data. Noisy-DR-L0-SSC first projects the data onto a lower dimensional space by random projection, then performs noisy L0-SSC on the dimensionality reduced data for improved efficiency. Experimental results demonstrate the effectiveness of Noisy-DR-L0-SSC.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yang22e.html
  PDF: https://proceedings.mlr.press/v180/yang22e/yang22e.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yang22e.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yingzhen
    family: Yang
  - given: Ping
    family: Li
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2235-2245
  id: yang22e
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2235
  lastpage: 2245
  published: 2022-08-17 00:00:00 +0000
- title: 'Pareto navigation gradient descent: a first-order algorithm for optimization in Pareto set'
  abstract: 'Many modern machine learning applications, such as multi-task learning, require finding optimal model parameters to trade off multiple objective functions that may conflict with each other. The notion of the Pareto set allows us to focus on the set of (often infinite number of) models that cannot be strictly improved. But it does not provide an actionable procedure for picking one or a few special models to return to practical users. In this paper, we consider optimization in Pareto set (OPT-in-Pareto), the problem of finding Pareto models that optimize an extra reference criterion function within the Pareto set. This function can either encode a specific preference from the users, or represent a generic diversity measure for obtaining a set of diversified Pareto models that are representative of the whole Pareto set. Unfortunately, despite being a highly useful framework, efficient algorithms for OPT-in-Pareto have been largely missing, especially for large-scale, non-convex, and non-linear objectives in deep learning. A naive approach is to apply Riemannian manifold gradient descent on the Pareto set, which yields a high computational cost due to the need for eigen-calculation of Hessian matrices. We propose a first-order algorithm that approximately solves OPT-in-Pareto using only gradient information, with both high practical efficiency and a theoretically guaranteed convergence property. Empirically, we demonstrate that our method works efficiently for a variety of challenging multi-task-related problems.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ye22a.html
  PDF: https://proceedings.mlr.press/v180/ye22a/ye22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ye22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Mao
    family: Ye
  - given: Qiang
    family: Liu
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2246-2255
  id: ye22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2246
  lastpage: 2255
  published: 2022-08-17 00:00:00 +0000
- title: 'Future gradient descent for adapting the temporal shifting data distribution in online recommendation systems'
  abstract: 'One of the key challenges of learning an online recommendation model is the temporal domain shift, which causes a mismatch between the training and testing data distributions and hence domain generalization error. To overcome this, we propose to learn a meta future gradient generator that forecasts the gradient information of the future data distribution for training so that the recommendation model can be trained as if we were able to look ahead at the future of its deployment. Compared with Batch Update, a widely used paradigm, our theory suggests that the proposed algorithm achieves smaller temporal domain generalization error measured by a gradient variation term in a local regret. We demonstrate the empirical advantage by comparing with various representative baselines.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/ye22b.html
  PDF: https://proceedings.mlr.press/v180/ye22b/ye22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-ye22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Mao
    family: Ye
  - given: Ruichen
    family: Jiang
  - given: Haoxiang
    family: Wang
  - given: Dhruv
    family: Choudhary
  - given: Xiaocong
    family: Du
  - given: Bhargav
    family: Bhushanam
  - given: Aryan
    family: Mokhtari
  - given: Arun
    family: Kejariwal
  - given: Qiang
    family: Liu
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2256-2266
  id: ye22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2256
  lastpage: 2266
  published: 2022-08-17 00:00:00 +0000
- title: 'Superposing many tickets into one: A performance booster for sparse neural network training'
  abstract: 'Recent works on sparse neural network training have shown that a compelling trade-off between performance and efficiency can be achieved. Existing sparse training methods usually strive to find the best sparse subnetwork possible in one single run, without involving any expensive dense or pre-training steps. For instance, dynamic sparse training (DST), one of the most prominent directions, is capable of reaching performance competitive with dense training by iteratively evolving the sparse topology during the course of training. In this paper, we argue that it is better to allocate the limited resources to create multiple low-loss sparse subnetworks and superpose them into a stronger one, instead of allocating all resources entirely to find an individual subnetwork. To achieve this, two desiderata are required: (1) efficiently producing many low-loss subnetworks, the so-called cheap tickets, within one training process limited to the standard training time used in dense training; (2) effectively superposing these cheap tickets into one stronger subnetwork without going over the constrained parameter budget. To corroborate our conjecture, we present a novel sparse training approach, termed Sup-tickets, which can satisfy the above two desiderata concurrently in a single sparse-to-sparse training process. Across various models on CIFAR-10/100 and ImageNet, we show that Sup-tickets integrates seamlessly with the existing sparse training methods and demonstrates consistent performance improvement.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yin22a.html
  PDF: https://proceedings.mlr.press/v180/yin22a/yin22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yin22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Lu
    family: Yin
  - given: Vlado
    family: Menkovski
  - given: Meng
    family: Fang
  - given: Tianjin
    family: Huang
  - given: Yulong
    family: Pei
  - given: Mykola
    family: Pechenizkiy
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2267-2277
  id: yin22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2267
  lastpage: 2277
  published: 2022-08-17 00:00:00 +0000
- title: 'Offline stochastic shortest path: Learning, evaluation and towards optimality'
  abstract: 'Goal-oriented Reinforcement Learning, where the agent needs to reach the goal state while simultaneously minimizing the cost, has received significant attention in real-world applications. Its theoretical formulation, stochastic shortest path (SSP), has been intensively researched in the online setting. Nevertheless, it remains understudied when such an online interaction is prohibited and only historical data is provided. In this paper, we consider the offline stochastic shortest path problem when the state space and the action space are finite. We design simple value iteration-based algorithms for tackling both offline policy evaluation (OPE) and offline policy learning tasks. Notably, our analysis of these simple algorithms yields strong instance-dependent bounds which can imply worst-case bounds that are near-minimax optimal. We hope our study could help illuminate the fundamental statistical limits of the offline SSP problem and motivate further studies beyond the scope of current consideration.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yin22b.html
  PDF: https://proceedings.mlr.press/v180/yin22b/yin22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yin22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Ming
    family: Yin
  - given: Wenjing
    family: Chen
  - given: Mengdi
    family: Wang
  - given: Yu-Xiang
    family: Wang
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2278-2288
  id: yin22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2278
  lastpage: 2288
  published: 2022-08-17 00:00:00 +0000
- title: 'Active learning with label comparisons'
  abstract: 'Supervised learning typically relies on manual annotation of the true labels. When there are many potential classes, searching for the best one can be prohibitive for a human annotator. On the other hand, comparing two candidate labels is often much easier. We focus on this type of pairwise supervision and ask how it can be used effectively in learning, and in particular in active learning. We obtain several insightful results in this context. In principle, finding the best of k labels can be done with k-1 active queries. We show that there is a natural class where this approach is sub-optimal, and that there is a more comparison-efficient active learning scheme. A key element in our analysis is the “label neighborhood graph” of the true distribution, which has an edge between two classes if they share a decision boundary. We also show that in the PAC setting, pairwise comparisons cannot provide improved sample complexity in the worst case. We complement our theoretical results with experiments, clearly demonstrating the effect of the neighborhood graph on sample complexity.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yona22a.html
  PDF: https://proceedings.mlr.press/v180/yona22a/yona22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yona22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Gal
    family: Yona
  - given: Shay
    family: Moran
  - given: Gal
    family: Elidan
  - given: Amir
    family: Globerson
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2289-2298
  id: yona22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2289
  lastpage: 2298
  published: 2022-08-17 00:00:00 +0000
- title: 'Cross-domain adaptive transfer reinforcement learning based on state-action correspondence'
  abstract: 'Despite the impressive success achieved in various domains, deep reinforcement learning (DRL) is still faced with the sample inefficiency problem. Transfer learning (TL), which leverages prior knowledge from different but related tasks to accelerate the target task learning, has emerged as a promising direction to improve RL efficiency. The majority of prior work considers TL across tasks with the same state-action spaces, while transferring across domains with different state-action spaces is relatively unexplored. Furthermore, such existing cross-domain transfer approaches only enable transfer from a single source policy, leaving open the important question of how to best transfer from multiple source policies. This paper proposes a novel framework called Cross-domain Adaptive Transfer (CAT) to accelerate DRL. CAT learns the state-action correspondence from each source task to the target task and adaptively transfers knowledge from multiple source task policies to the target policy. CAT can be easily combined with existing DRL algorithms and experimental results show that CAT significantly accelerates learning and outperforms other cross-domain transfer methods on multiple continuous action control tasks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/you22a.html
  PDF: https://proceedings.mlr.press/v180/you22a/you22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-you22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Heng
    family: You
  - given: Tianpei
    family: Yang
  - given: Yan
    family: Zheng
  - given: Jianye
    family: Hao
  - given: Matthew E.
    family: Taylor
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2299-2309
  id: you22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2299
  lastpage: 2309
  published: 2022-08-17 00:00:00 +0000
- title: 'Learning binary multi-scale games on networks'
  abstract: 'Network games are a natural modeling framework for strategic interactions of agents whose actions have local impact on others. Recently, a multi-scale network game model has been proposed to capture local effects at multiple network scales, such as among both individuals and groups. We propose a framework to learn the utility functions of binary multi-scale games from agents’ behavioral data. Departing from much prior work in this area, we model agent behavior as following logit-response dynamics, rather than acting according to a Nash equilibrium. This defines a generative time-series model of joint behavior of both agents and groups, which enables us to naturally cast the learning problem as maximum likelihood estimation (MLE). We show that in the important special case of multi-scale linear-quadratic games, this MLE problem is convex. Extensive experiments using both synthetic and real data demonstrate that our proposed modeling and learning approach is effective in both game parameter estimation and prediction of future behavior, even when we learn the game from only a single behavior time series. Furthermore, we show how to use our framework to develop a statistical test for the existence of multi-scale structure in the game, and use it to demonstrate that real time-series data indeed exhibits such structure.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yu22a.html
  PDF: https://proceedings.mlr.press/v180/yu22a/yu22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yu22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Sixie
    family: Yu
  - given: P. Jeffrey
    family: Brantingham
  - given: Matthew
    family: Valasik
  - given: Yevgeniy
    family: Vorobeychik
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2310-2319
  id: yu22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2310
  lastpage: 2319
  published: 2022-08-17 00:00:00 +0000
- title: 'Predictive Whittle networks for time series'
  abstract: 'Recent developments have shown that modeling in the spectral domain improves the accuracy in time series forecasting. However, state-of-the-art neural spectral forecasters do not generally yield trustworthy predictions. In particular, they lack the means to gauge predictive likelihoods and provide uncertainty estimates. We propose predictive Whittle networks to bridge this gap, which both exploit the advances of neural forecasting in the spectral domain and leverage the tractable likelihoods of probabilistic circuits. For this purpose, we propose a novel Whittle forecasting loss that makes use of these predictive likelihoods to guide the training of the neural forecasting component. We demonstrate how predictive Whittle networks improve real-world forecasting accuracy, while also allowing a transformation back into the time domain, in order to provide the necessary feedback on when the model’s prediction may become erratic.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yu22b.html
  PDF: https://proceedings.mlr.press/v180/yu22b/yu22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yu22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zhongjie
    family: Yu
  - given: Fabrizio
    family: Ventola
  - given: Nils
    family: Thoma
  - given: Devendra Singh
    family: Dhami
  - given: Martin
    family: Mundt
  - given: Kristian
    family: Kersting
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2320-2330
  id: yu22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2320
  lastpage: 2330
  published: 2022-08-17 00:00:00 +0000
- title: 'Principle of relevant information for graph sparsification'
  abstract: 'Graph sparsification aims to reduce the number of edges of a graph while maintaining its structural properties. In this paper, we propose the first general and effective information-theoretic formulation of graph sparsification, by taking inspiration from the Principle of Relevant Information (PRI). To this end, we extend the PRI from a standard scalar random variable setting to structured data (i.e., graphs). Our Graph-PRI objective is achieved by operating on the graph Laplacian, made possible by expressing the graph Laplacian of a subgraph in terms of a sparse edge selection vector w. We provide both theoretical and empirical justifications on the validity of our Graph-PRI approach. We also analyze its analytical solutions in a few special cases. We finally present three representative real-world applications, namely graph sparsification, graph regularized multi-task learning, and medical imaging-derived brain network classification, to demonstrate the effectiveness, the versatility and the enhanced interpretability of our approach over prevalent sparsification techniques. Code of Graph-PRI is available at https://github.com/SJYuCNEL/PRI-Graphs.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/yu22c.html
  PDF: https://proceedings.mlr.press/v180/yu22c/yu22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-yu22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Shujian
    family: Yu
  - given: Francesco
    family: Alesiani
  - given: Wenzhe
    family: Yin
  - given: Robert
    family: Jenssen
  - given: Jose C.
    family: Principe
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2331-2341
  id: yu22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2331
  lastpage: 2341
  published: 2022-08-17 00:00:00 +0000
- title: 'Asymptotic optimality for active learning processes'
  abstract: 'Active Learning (AL) aims to optimize basic learned model(s) iteratively by selecting and annotating unlabeled data samples that are deemed to best maximise the model performance with minimal required data. However, the learned model is prone to overfitting due to the biased distribution (sampling bias and dataset shift) formed by the non-uniform sampling used in AL. Considering AL as an iterative sequential optimization process, we first provide a perspective on AL in terms of statistical properties, i.e., asymptotic unbiasedness, consistency and asymptotic efficiency, with respect to basic estimators when the sample size (size of labeled set) becomes large, and in the limit as sample size tends to infinity. We then discuss how biases affect AL. Finally, we propose a flexible AL framework that aims to mitigate the impact of bias in AL by minimizing generalization error and importance-weighted training loss simultaneously.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/zhan22a.html
  PDF: https://proceedings.mlr.press/v180/zhan22a/zhan22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-zhan22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Xueying
    family: Zhan
  - given: Yaowei
    family: Wang
  - given: Antoni B.
    family: Chan
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2342-2352
  id: zhan22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2342
  lastpage: 2352
  published: 2022-08-17 00:00:00 +0000
- title: 'Distributed adversarial training to robustify deep neural networks at scale'
  abstract: 'Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification. To defend against such attacks, an effective and popular approach, known as adversarial training (AT), has been shown to mitigate the negative impact of adversarial attacks by virtue of a min-max robust training method. While effective, it remains unclear whether it can successfully be adapted to the distributed learning context. The power of distributed optimization over multiple machines enables us to scale up robust training over large models and datasets. Spurred by that, we propose distributed adversarial training (DAT), a large-batch adversarial training framework implemented over multiple machines. We show that DAT is general: it supports training over labeled and unlabeled data, multiple types of attack generation methods, and gradient compression operations favored for distributed optimization. Theoretically, we provide, under standard conditions in the optimization theory, the convergence rate of DAT to the first-order stationary points in general non-convex settings. Empirically, we demonstrate that DAT either matches or outperforms state-of-the-art robust accuracies and achieves a graceful training speedup (e.g., on ResNet-50 under ImageNet). Code is available at https://github.com/dat-2022/dat.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/zhang22a.html
  PDF: https://proceedings.mlr.press/v180/zhang22a/zhang22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-zhang22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Gaoyuan
    family: Zhang
  - given: Songtao
    family: Lu
  - given: Yihua
    family: Zhang
  - given: Xiangyi
    family: Chen
  - given: Pin-Yu
    family: Chen
  - given: Quanfu
    family: Fan
  - given: Lee
    family: Martie
  - given: Lior
    family: Horesh
  - given: Mingyi
    family: Hong
  - given: Sijia
    family: Liu
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2353-2363
  id: zhang22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2353
  lastpage: 2363
  published: 2022-08-17 00:00:00 +0000
- title: 'Stability of SGD: Tightness analysis and improved bounds'
  abstract: 'Stochastic Gradient Descent (SGD) based methods have been widely used for training large-scale machine learning models that also generalize well in practice. Several explanations have been offered for this generalization performance, a prominent one being algorithmic stability (Hardt et al. [2016]). However, there are no known examples of smooth loss functions for which the analysis can be shown to be tight. Furthermore, apart from properties of the loss function, data distribution has also been shown to be an important factor in generalization performance. This raises the question: is the stability analysis of Hardt et al. [2016] tight for smooth functions, and if not, for what kind of loss functions and data distributions can the stability analysis be improved? In this paper we first settle open questions regarding tightness of bounds in the data-independent setting: we show that for general datasets, the existing analysis for convex and strongly-convex loss functions is tight, but it can be improved for non-convex loss functions. Next, we give novel and improved data-dependent bounds: we show stability upper bounds for a large class of convex regularized loss functions, with negligible regularization parameters, and improve existing data-dependent bounds in the non-convex setting. We hope that our results will initiate further efforts to better understand the data-dependent setting under non-convex loss functions, leading to an improved understanding of the generalization abilities of deep networks.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/zhang22b.html
  PDF: https://proceedings.mlr.press/v180/zhang22b/zhang22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-zhang22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yikai
    family: Zhang
  - given: Wenjia
    family: Zhang
  - given: Sammy
    family: Bald
  - given: Vamsi
    family: Pingali
  - given: Chao
    family: Chen
  - given: Mayank
    family: Goswami
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2364-2373
  id: zhang22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2364
  lastpage: 2373
  published: 2022-08-17 00:00:00 +0000
- title: 'Research on video adversarial attack with long living cycle'
  abstract: 'In recent years, the vulnerability of networks has attracted the attention of researchers. However, existing attack methods do not consider the impact of video compression coding on the added adversarial perturbation, i.e., the robustness of the video adversarial example. When an adversarial sample has just been generated, its attack capability is the strongest. However, with repeated video encoding and decoding during Internet transmission, the added adversarial perturbation is progressively eliminated, eventually causing the attack performance of the adversarial sample to disappear. We define this phenomenon as the decay of the lifetime of adversarial examples. We propose an adversarial attack method based on optimized integer space to resist this performance degradation. The robustness of anti-coding, the visual concealment, and the attack success rate are all considered during the attack process. In addition, we have also reduced the rounding loss caused by normalization in the deep neural network model process. The contributions of our method are: 1) We show the performance degradation caused by video compression coding on existing video adversarial attack methods, which seems an effective way of detecting or defending against video adversarial examples. 2) A robust video adversarial attack method is proposed to resist video compression coding. The experiment shows that our method performs better on the robustness of anti-coding, visual concealment, and attack success rate.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/zhao22a.html
  PDF: https://proceedings.mlr.press/v180/zhao22a/zhao22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-zhao22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Zeyu
    family: Zhao
  - given: Ke
    family: Xu
  - given: Xinghao
    family: Jiang
  - given: Tanfeng
    family: Sun
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2374-2382
  id: zhao22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2374
  lastpage: 2382
  published: 2022-08-17 00:00:00 +0000
- title: 'Causal discovery with heterogeneous observational data'
  abstract: 'We consider the problem of causal discovery (structure learning) from heterogeneous observational data. Most existing methods assume a homogeneous sampling scheme and causal mechanism, which may lead to misleading conclusions when violated. We propose a novel approach that exploits data heterogeneity to infer possibly cyclic causal structures from causally insufficient systems. The core idea is to model the direct causal effects as functions of exogenous covariates that help explain sampling and causal heterogeneity. We investigate the structure identifiability properties of the proposed model. Structure learning is carried out in a fully Bayesian fashion, which provides natural uncertainty quantification. We demonstrate its utility through extensive simulations and two real-world applications.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/zhou22a.html
  PDF: https://proceedings.mlr.press/v180/zhou22a/zhou22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-zhou22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Fangting
    family: Zhou
  - given: Kejun
    family: He
  - given: Yang
    family: Ni
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2383-2393
  id: zhou22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2383
  lastpage: 2393
  published: 2022-08-17 00:00:00 +0000
- title: 'Convergence Analysis of Linear Coupling with Inexact Proximal Operator'
  abstract: 'Linear coupling was recently proposed to accelerate first-order algorithms by linking gradient descent and mirror descent together, which achieves the accelerated convergence rate for first-order algorithms. This work focuses on the convergence analysis of linear coupling for convex composite minimization when the proximal operator cannot be exactly computed. It is of particular interest to study the convergence of linear coupling because it not only achieves the accelerated convergence rate for first-order algorithms but also works for generic norms. We present a convergence analysis of linear coupling that allows the proximal operator to be computed up to a certain precision. Our analysis illustrates that the accelerated convergence rate of linear coupling with an inexact proximal operator can be preserved if the error sequence of the inexact proximal operator decreases at a sufficiently fast rate. More importantly, our analysis leads to better bounds than existing works on inexact proximal operators. Experimental results on several real-world datasets verify our theoretical results.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/zhou22b.html
  PDF: https://proceedings.mlr.press/v180/zhou22b/zhou22b.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-zhou22b.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Qiang
    family: Zhou
  - given: Sinno Jialin
    family: Pan
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2394-2403
  id: zhou22b
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2394
  lastpage: 2403
  published: 2022-08-17 00:00:00 +0000
- title: 'Information design for multiple independent and self-interested defenders: Work less, pay off more'
  abstract: 'This paper studies the problem of information design in a general security game setting in which multiple independent self-interested defenders attempt to provide protection simultaneously on the same set of important targets against an unknown attacker. A principal, who can be one of the defenders, has access to certain private information (i.e., attacker type) whereas other defenders do not. We investigate the question of how that principal, with additional private information, can influence the decisions of the defenders by partially and strategically revealing her information. We focus on the algorithmic study of information design for private signaling in this game setting. In particular, we develop a polynomial-time ellipsoid algorithm to compute an optimal private signaling scheme. Our key finding is that the separation oracle in the ellipsoid approach can be carefully reduced to bipartite matching. Furthermore, we introduce a compact representation of any ex-ante persuasive signaling scheme by exploiting intrinsic security resource allocation structures, enabling us to compute an optimal scheme significantly faster. Our experimental results show that by strategically revealing private information, the principal can significantly enhance the protection effectiveness on the targets.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/zhou22c.html
  PDF: https://proceedings.mlr.press/v180/zhou22c/zhou22c.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-zhou22c.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Chenghan
    family: Zhou
  - given: Andrew
    family: Spivey
  - given: Haifeng
    family: Xu
  - given: Thanh Hong
    family: Nguyen
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2404-2413
  id: zhou22c
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2404
  lastpage: 2413
  published: 2022-08-17 00:00:00 +0000
- title: 'Causal inference with treatment measurement error: a nonparametric instrumental variable approach'
  abstract: 'We propose a kernel-based nonparametric estimator for the causal effect when the cause is corrupted by error. We do so by generalizing estimation in the instrumental variable setting. Despite significant work on regression with measurement error, additionally handling unobserved confounding in the continuous setting is non-trivial: we have seen little prior work. As a by-product of our investigation, we clarify a connection between mean embeddings and characteristic functions, and how learning one simultaneously allows one to learn the other. This opens the way for kernel method research to leverage existing results in characteristic function estimation. Finally, we empirically show that our proposed method, MEKIV, improves over baselines and is robust to changes in the strength of measurement error and in the type of error distribution.'
  volume: 180
  URL: https://proceedings.mlr.press/v180/zhu22a.html
  PDF: https://proceedings.mlr.press/v180/zhu22a/zhu22a.pdf
  edit: https://github.com/mlresearch//v180/edit/gh-pages/_posts/2022-08-17-zhu22a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence'
  publisher: 'PMLR'
  author: 
  - given: Yuchen
    family: Zhu
  - given: Limor
    family: Gultchin
  - given: Arthur
    family: Gretton
  - given: Matt J.
    family: Kusner
  - given: Ricardo
    family: Silva
  editor: 
  - given: James
    family: Cussens
  - given: Kun
    family: Zhang
  page: 2414-2424
  id: zhu22a
  issued:
    date-parts: 
      - 2022
      - 8
      - 17
  firstpage: 2414
  lastpage: 2424
  published: 2022-08-17 00:00:00 +0000
