- title: 'Asian Conference on Machine Learning: Preface' abstract: 'Preface to ACML 2021.' volume: 157 URL: https://proceedings.mlr.press/v157/balasubramanian21a.html PDF: https://proceedings.mlr.press/v157/balasubramanian21a/balasubramanian21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-balasubramanian21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: i-xiii id: balasubramanian21a issued: date-parts: - 2021 - 11 - 28 firstpage: i lastpage: xiii published: 2021-11-28 00:00:00 +0000 - title: 'Vector Transport Free Riemannian LBFGS for Optimization on Symmetric Positive Definite Matrix Manifolds' abstract: 'This work concentrates on optimization on Riemannian manifolds. The Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithm is a commonly used quasi-Newton method for numerical optimization in Euclidean spaces. Riemannian LBFGS (RLBFGS) is an extension of this method to Riemannian manifolds. RLBFGS involves computationally expensive vector transports as well as unfolding recursions using adjoint vector transports. In this article, we propose two mappings in the tangent space using the inverse second root and the Cholesky decomposition. These mappings make both the vector transport and the adjoint vector transport the identity map, and therefore isometric. The identity vector transport makes RLBFGS less computationally expensive, and its isometry is also very useful in the convergence analysis of RLBFGS. Moreover, under the proposed mappings, the Riemannian metric reduces to the Euclidean inner product, which is much less computationally expensive. We focus on Symmetric Positive Definite (SPD) manifolds, which are beneficial in various fields such as data science and statistics. This work opens a research opportunity for extending the proposed mappings to other well-known manifolds.' volume: 157 URL: https://proceedings.mlr.press/v157/godaz21a.html PDF: https://proceedings.mlr.press/v157/godaz21a/godaz21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-godaz21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Reza family: Godaz - given: Benyamin family: Ghojogh - given: Reshad family: Hosseini - given: Reza family: Monsefi - given: Fakhri family: Karray - given: Mark family: Crowley editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1-16 id: godaz21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1 lastpage: 16 published: 2021-11-28 00:00:00 +0000 - title: 'Understanding How Over-Parametrization Leads to Acceleration: A case of learning a single teacher neuron' abstract: 'Over-parametrization has become a popular technique in deep learning. It is observed that by over-parametrization, a larger neural network needs fewer training iterations than a smaller one to achieve a certain level of performance — namely, over-parametrization leads to acceleration in optimization. However, although over-parametrization is widely used nowadays, little theory is available to explain the acceleration it brings. In this paper, we propose understanding it by studying a simple problem first.
Specifically, we consider a setting in which there is a single teacher neuron with quadratic activation, and over-parametrization is realized by having multiple student neurons learn the data generated by the teacher neuron. We provably show that over-parametrization helps the iterates generated by gradient descent enter the neighborhood of a globally optimal solution that achieves zero testing error more quickly.' volume: 157 URL: https://proceedings.mlr.press/v157/wang21a.html PDF: https://proceedings.mlr.press/v157/wang21a/wang21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wang21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Jun-Kun family: Wang - given: Jacob family: Abernethy editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 17-32 id: wang21a issued: date-parts: - 2021 - 11 - 28 firstpage: 17 lastpage: 32 published: 2021-11-28 00:00:00 +0000 - title: 'Hybrid Estimation for Open-Ended Questions with Early-Age Students’ Block-Based Programming Answers' abstract: 'Block-based programming is of great significance for cultivating children’s computational thinking. However, due to the following challenges, it is difficult to evaluate students’ programming ability in online learning systems: 1) unlike in the traditional Online Judge (OJ) system, there is no standard answer for a given task in block-based programming; 2) to encourage students’ interest, teachers will give comparatively high scores even to programs that are not totally correct or are unrelated to the task. Therefore, current approaches involving output comparison and code analysis do not work effectively. Furthermore, deep learning methods also face the problem of how to represent block code for classification. We propose a novel hybrid estimation model to address these challenges. We first learn a graph embedding from the parsed Abstract Syntax Tree (AST) to represent the logic of the code. Next, we provide methods to measure the workload and complexity of the code. Then, we extract key variables and task-irrelevant properties and introduce teacher bias. Finally, an XGBoost classifier is constructed. Based on real-world data produced by early-age students on an online Scratch platform, our model outperforms KimCNN, ResNet-18, and Graph2Vec+XGBoost. Moreover, we provide statistical analyses and intuitive explanations to interpret the characteristics of various groups.' volume: 157 URL: https://proceedings.mlr.press/v157/wu21a.html PDF: https://proceedings.mlr.press/v157/wu21a/wu21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wu21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Hao family: Wu - given: Tianyi family: Chen - given: Xianzhe family: Luo - given: Canghong family: Jin - given: Yun family: Zhang - given: Minghui family: Wu editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 33-48 id: wu21a issued: date-parts: - 2021 - 11 - 28 firstpage: 33 lastpage: 48 published: 2021-11-28 00:00:00 +0000 - title: 'The Power of Factorial Powers: New Parameter settings for (Stochastic) Optimization' abstract: 'The convergence rates for convex and non-convex optimization methods depend on the choice of a host of constants, including step-sizes, Lyapunov function constants and momentum constants. In this work we propose the use of factorial powers as a flexible tool for defining constants that appear in convergence proofs. We list a number of remarkable properties that these sequences enjoy, and show how they can be applied to convergence proofs to simplify or improve the convergence rates of the momentum method, accelerated gradient methods and the stochastic variance reduced gradient method (SVRG).' volume: 157 URL: https://proceedings.mlr.press/v157/defazio21a.html PDF: https://proceedings.mlr.press/v157/defazio21a/defazio21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-defazio21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Aaron family: Defazio - given: Robert M. family: Gower editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 49-64 id: defazio21a issued: date-parts: - 2021 - 11 - 28 firstpage: 49 lastpage: 64 published: 2021-11-28 00:00:00 +0000 - title: 'Local Aggressive Adversarial Attacks on 3D Point Cloud' abstract: 'Deep neural networks are known to be prone to adversarial examples, which can deliberately fool a model into making mistakes. Recently, a few works have extended this task from 2D images to 3D point clouds using global point cloud optimization. However, perturbing points globally is not effective for misleading the victim model. First, not all points are important in the optimization toward misleading. Many points consume a considerable distortion budget while contributing trivially to the attack. Second, multi-label optimization is suboptimal for adversarial attacks, since it spends extra energy on inducing the victim model to collapse across multiple labels and causes the transformed instance to be dissimilar to any particular instance. Third, independent adversarial and perceptibility losses, which handle misclassification and dissimilarity separately, treat the update of each point equally and without focus. Consequently, once the perceptibility loss approaches its budget threshold, all points become stuck on the surface of the hypersphere and the attack is locked into a local optimum. We therefore propose local aggressive adversarial attacks (L3A) to solve the above issues. Technically, we select a set of salient points to perturb: the high-score subset of the point cloud according to the gradient. A suite of aggressive optimization strategies is then developed to reinforce the imperceptible generation of adversarial examples toward misleading victim models. Extensive experiments on PointNet, PointNet++ and DGCNN demonstrate the state-of-the-art performance of our method against existing adversarial attack methods.'
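The salient-point selection that the L3A abstract describes can be sketched in a few lines. The snippet below is a minimal NumPy illustration and not the authors' code: the names `points` and `grads`, the top-k rule, and the step size are all assumptions made for the sake of the example.

```python
import numpy as np

def select_salient_points(grads, k):
    """Return the indices of the k points whose loss gradient is largest in L2 norm."""
    scores = np.linalg.norm(grads, axis=1)  # per-point saliency score
    return np.argsort(scores)[-k:]          # indices of the top-k scores

# Toy usage: perturb only the selected subset, leaving the other points fixed.
rng = np.random.default_rng(0)
points = rng.normal(size=(1024, 3))  # stand-in point cloud
grads = rng.normal(size=(1024, 3))   # stand-in for dLoss/dPoint from autograd
idx = select_salient_points(grads, k=64)
step = 0.01                          # perturbation size per update
points[idx] += step * grads[idx] / (np.linalg.norm(grads[idx], axis=1, keepdims=True) + 1e-12)
```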
volume: 157 URL: https://proceedings.mlr.press/v157/sun21a.html PDF: https://proceedings.mlr.press/v157/sun21a/sun21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-sun21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yiming family: Sun - given: Feng family: Chen - given: Zhiyu family: Chen - given: Mingjie family: Wang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 65-80 id: sun21a issued: date-parts: - 2021 - 11 - 28 firstpage: 65 lastpage: 80 published: 2021-11-28 00:00:00 +0000 - title: '$h$-DBSCAN: A simple fast DBSCAN algorithm for big data' abstract: 'DBSCAN is a classical clustering algorithm, which can identify different shapes and isolate noisy patterns from a dataset. Despite the above advantages, the bottleneck of DBSCAN is its computation time for high-dimensional datasets. This work, thus, presents a simple and fast method to improve the efficiency of the DBSCAN algorithm. We reduce the execution time in two aspects. The first is to reduce the number of points presented to DBSCAN, and the second is to apply the HNSW technique instead of a linear search structure to improve efficiency. The experimental results show that our proposed algorithm can greatly improve the clustering speed without losing accuracy, and in some cases even improving it, especially for large-scale datasets.' volume: 157 URL: https://proceedings.mlr.press/v157/weng21a.html PDF: https://proceedings.mlr.press/v157/weng21a/weng21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-weng21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Shaoyuan family: Weng - given: Jin family: Gou - given: Zongwen family: Fan editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 81-96 id: weng21a issued: date-parts: - 2021 - 11 - 28 firstpage: 81 lastpage: 96 published: 2021-11-28 00:00:00 +0000 - title: 'CTAB-GAN: Effective Table Data Synthesizing' abstract: 'While data sharing is crucial for knowledge development, privacy concerns and strict regulation (e.g., the European General Data Protection Regulation (GDPR)) unfortunately limit its full effectiveness. Synthetic tabular data emerges as an alternative to enable data sharing while fulfilling regulatory and privacy constraints. The state-of-the-art tabular data synthesizers draw methodologies from Generative Adversarial Networks (GAN) and address the two main data types in industry, i.e., continuous and categorical. In this paper, we develop CTAB-GAN, a novel conditional table GAN architecture that can effectively model diverse data types, including a mix of continuous and categorical variables. Moreover, we address data imbalance and long-tail issues, i.e., certain variables having drastic frequency differences across large values. To achieve those aims, we first introduce the information loss, classification loss and generator loss to the conditional GAN. Secondly, we design a novel conditional vector, which efficiently encodes the mixed data types and skewed distributions of data variables. We extensively evaluate CTAB-GAN against state-of-the-art GANs that generate synthetic tables, in terms of data similarity and analysis utility.
The results on five datasets show that the synthetic data of CTAB-GAN remarkably resembles the real data for all three types of variables and results in higher accuracy for five machine learning algorithms, by up to 17%.' volume: 157 URL: https://proceedings.mlr.press/v157/zhao21a.html PDF: https://proceedings.mlr.press/v157/zhao21a/zhao21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhao21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Zilong family: Zhao - given: Aditya family: Kunar - given: Robert family: Birke - given: Lydia Y. family: Chen editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 97-112 id: zhao21a issued: date-parts: - 2021 - 11 - 28 firstpage: 97 lastpage: 112 published: 2021-11-28 00:00:00 +0000 - title: 'Fairness constraint of Fuzzy C-means Clustering improves clustering fairness' abstract: 'Fuzzy C-Means (FCM) clustering is a classic clustering algorithm that is widely used in the real world. Despite the distinct advantages of the FCM algorithm, whether the use of a fairness constraint in FCM can improve clustering fairness has remained elusive. By introducing a novel fair loss term into the objective function, we propose a Fair Fuzzy C-Means (FFCM) algorithm in this study. We prove that, under the proposed objective function, the membership values are constrained by distance and fairness simultaneously during the optimization process. By studying the fuzzy C-means clustering with fairness constraint problem and proposing a fair fuzzy C-means method, this study provides a mechanistic understanding of how the fairness constraint is achieved in Fuzzy C-Means clustering and bridges the gap in fair fuzzy clustering.' volume: 157 URL: https://proceedings.mlr.press/v157/xia21a.html PDF: https://proceedings.mlr.press/v157/xia21a/xia21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-xia21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Xu family: Xia - given: Zhang family: Hui - given: Yang family: Chunming - given: Zhao family: Xujian - given: Li family: Bo editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 113-128 id: xia21a issued: date-parts: - 2021 - 11 - 28 firstpage: 113 lastpage: 128 published: 2021-11-28 00:00:00 +0000 - title: 'Meta-Model-Based Meta-Policy Optimization' abstract: 'Model-based meta-reinforcement learning (RL) methods have recently been shown to be a promising approach to improving the sample efficiency of RL in multi-task settings. However, the theoretical understanding of those methods is yet to be established, and there is currently no theoretical guarantee of their performance in a real-world environment. In this paper, we analyze the performance guarantee of model-based meta-RL methods by extending the theorems proposed by Janner et al. (2019). On the basis of our theoretical results, we propose Meta-Model-Based Meta-Policy Optimization (M3PO), a model-based meta-RL method with a performance guarantee. We demonstrate that M3PO outperforms existing meta-RL methods in continuous-control benchmarks.'
volume: 157 URL: https://proceedings.mlr.press/v157/hiraoka21a.html PDF: https://proceedings.mlr.press/v157/hiraoka21a/hiraoka21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-hiraoka21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Takuya family: Hiraoka - given: Takahisa family: Imagawa - given: Voot family: Tangkaratt - given: Takayuki family: Osa - given: Takashi family: Onishi - given: Yoshimasa family: Tsuruoka editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 129-144 id: hiraoka21a issued: date-parts: - 2021 - 11 - 28 firstpage: 129 lastpage: 144 published: 2021-11-28 00:00:00 +0000 - title: 'An Aligned Subgraph Kernel Based on Discrete-Time Quantum Walk' abstract: 'In this paper, a novel graph kernel is designed by aligning the amplitude representations of the vertices. Firstly, the amplitude representation of a vertex is calculated based on the discrete-time quantum walk. Then a matching-based graph kernel is constructed by identifying the correspondence between the vertices of two graphs. The newly proposed kernel can be regarded as a kind of aligned subgraph kernel that incorporates the explicit local information of substructures. Thus, it can address the disadvantage of the classical R-convolution kernel that the relative locations of substructures between the graphs are ignored. Experiments on several standard datasets demonstrate that the proposed kernel has better performance compared with other state-of-the-art graph kernels in terms of classification accuracy.' volume: 157 URL: https://proceedings.mlr.press/v157/liu21a.html PDF: https://proceedings.mlr.press/v157/liu21a/liu21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-liu21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Kai family: Liu - given: Lulu family: Wang - given: Yi family: Zhang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 145-157 id: liu21a issued: date-parts: - 2021 - 11 - 28 firstpage: 145 lastpage: 157 published: 2021-11-28 00:00:00 +0000 - title: 'On the Convex Combination of Determinantal Point Processes' abstract: 'Determinantal point processes (DPPs) are attractive probabilistic models for expressing item quality and set diversity simultaneously. Although DPPs are widely applicable to many subset selection tasks, there exist simple small-size probability distributions that no DPP can express. To overcome this drawback while keeping the good properties of DPPs, in this paper we investigate the expressive power of \emph{convex combinations of DPPs}. We provide upper and lower bounds for the number of DPPs required for \emph{exactly} expressing any probability distribution. For the \emph{approximation} error, we give an upper bound of $n-\lfloor \log t\rfloor +\epsilon$, for any $\epsilon >0$, on the Kullback–Leibler divergence of the approximate distribution from a given joint probability distribution, where $t$ is the number of DPPs. Our numerical simulation on an online retail dataset empirically verifies that a convex combination of only two DPPs can outperform a nonsymmetric DPP in terms of the Kullback–Leibler divergence.
By combining a polynomial number of DPPs, we can express probability distributions induced by bounded-degree pseudo-Boolean functions, which include weighted coverage functions of bounded occurrence.' volume: 157 URL: https://proceedings.mlr.press/v157/matsuoka21a.html PDF: https://proceedings.mlr.press/v157/matsuoka21a/matsuoka21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-matsuoka21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Tatsuya family: Matsuoka - given: Naoto family: Ohsaka - given: Akihiro family: Yabe editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 158-173 id: matsuoka21a issued: date-parts: - 2021 - 11 - 28 firstpage: 158 lastpage: 173 published: 2021-11-28 00:00:00 +0000 - title: 'Encoder-decoder-based image transformation approach for integrating precipitation forecasts' abstract: 'As the damage caused by heavy rainfall becomes more serious, improved precipitation forecasts are in high demand. For this purpose, arithmetic and Bayesian average-based methods have been proposed to integrate multiple 2D-grid forecasts. However, since a single weight is shared across the entire grid in these methods, local variations in the importance of forecasts cannot be taken into account. Besides, although a variety of information is available in precipitation forecasting, it is not straightforward to incorporate this additional information into the existing methods. To overcome these problems, we propose an encoder-decoder-based image transformation method that generates a weight image optimized in a pixel-wise manner, where additional information can be embedded as channels of the input images and feature maps. Through experiments on precipitation forecasts in Japan from April 2018 to March 2019, we show that our proposed integration method outperforms existing methods.' volume: 157 URL: https://proceedings.mlr.press/v157/hachiya21a.html PDF: https://proceedings.mlr.press/v157/hachiya21a/hachiya21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-hachiya21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Hirotaka family: Hachiya - given: Yusuke family: Masumoto - given: Yuki family: Mori - given: Naonori family: Ueda editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 174-188 id: hachiya21a issued: date-parts: - 2021 - 11 - 28 firstpage: 174 lastpage: 188 published: 2021-11-28 00:00:00 +0000 - title: 'A Mutual Information Regularization for Adversarial Training' abstract: 'Recently, a number of methods have been developed to alleviate the vulnerability of deep neural networks to adversarial examples, among which adversarial training and its variants have been demonstrated to be the most effective empirically. This paper aims to further improve the robustness of adversarial training against adversarial examples.
We propose a new training method called mutual information and mean absolute error adversarial training (MIMAE-AT), in which the mutual information between the probabilistic predictions of the natural and the adversarial examples and the mean absolute error between their logits are used as regularization terms for standard adversarial training. We conduct experiments and demonstrate that the proposed MIMAE-AT method improves the state of the art in adversarial robustness.' volume: 157 URL: https://proceedings.mlr.press/v157/atsague21a.html PDF: https://proceedings.mlr.press/v157/atsague21a/atsague21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-atsague21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Modeste family: Atsague - given: Olukorede family: Fakorede - given: Jin family: Tian editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 188-203 id: atsague21a issued: date-parts: - 2021 - 11 - 28 firstpage: 188 lastpage: 203 published: 2021-11-28 00:00:00 +0000 - title: 'BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning' abstract: 'Interacting with the environment online to collect data samples for training a Reinforcement Learning (RL) agent is not always feasible due to economic and safety concerns. The goal of Offline Reinforcement Learning is to address this problem by learning effective policies using previously collected datasets. Standard off-policy RL algorithms are prone to overestimations of the values of out-of-distribution (less explored) actions and are hence unsuitable for Offline RL. Behavior regularization, which constrains the learned policy within the support set of the dataset, has been proposed to tackle the limitations of standard off-policy algorithms. In this paper, we improve behavior regularized offline reinforcement learning and propose BRAC+. First, we propose a quantification of out-of-distribution actions and compare using the Kullback–Leibler divergence versus the Maximum Mean Discrepancy as the regularization protocol. We propose an analytical upper bound on the KL divergence as the behavior regularizer to reduce the variance associated with sample-based estimation. Second, we mathematically show that the learned Q values can diverge under mild assumptions even when using behavior regularized policy updates. This leads to large overestimations of the Q values and performance deterioration of the learned policy. To mitigate this issue, we add a gradient penalty term to the policy evaluation objective. By doing so, the Q values are guaranteed to converge. On challenging offline RL benchmarks, BRAC+ outperforms the baseline behavior regularized approaches by $40\%\sim 87\%$ and the state-of-the-art approach by $6\%$.' volume: 157 URL: https://proceedings.mlr.press/v157/zhang21a.html PDF: https://proceedings.mlr.press/v157/zhang21a/zhang21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhang21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Chi family: Zhang - given: Sanmukh family: Kuppannagari - given: Viktor family: Prasanna editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 204-219 id: zhang21a issued: date-parts: - 2021 - 11 - 28 firstpage: 204 lastpage: 219 published: 2021-11-28 00:00:00 +0000 - title: 'Cautious Actor-Critic' abstract: 'The oscillating performance of off-policy learning and persisting errors in the actor-critic (AC) setting call for algorithms that can conservatively learn to better suit stability-critical applications. In this paper, we propose a novel off-policy AC algorithm, cautious actor-critic (CAC). The name cautious comes from its doubly conservative nature: we exploit the classic policy interpolation from conservative policy iteration for the actor and the entropy regularization of conservative value iteration for the critic. Our key observation is that the entropy-regularized critic facilitates and simplifies the unwieldy interpolated actor update while still ensuring robust policy improvement. We compare CAC to state-of-the-art AC methods on a set of challenging continuous control problems and demonstrate that CAC achieves comparable performance while significantly stabilizing learning.' volume: 157 URL: https://proceedings.mlr.press/v157/zhu21a.html PDF: https://proceedings.mlr.press/v157/zhu21a/zhu21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhu21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Lingwei family: Zhu - given: Toshinori family: Kitamura - given: Takamitsu family: Matsubara editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 220-235 id: zhu21a issued: date-parts: - 2021 - 11 - 28 firstpage: 220 lastpage: 235 published: 2021-11-28 00:00:00 +0000 - title: 'Quaternion Graph Neural Networks' abstract: 'Recently, graph neural networks (GNNs) have become an important and active research direction in deep learning. It is worth noting that most of the existing GNN-based methods learn graph representations within the Euclidean vector space. Beyond the Euclidean space, learning representations and embeddings in hyper-complex spaces has also been shown to be a promising and effective approach. To this end, we propose Quaternion Graph Neural Networks (QGNN) to learn graph representations within the Quaternion space. As demonstrated, the Quaternion space, a hyper-complex vector space, provides highly meaningful computations and analogical calculus through the Hamilton product compared to the Euclidean and complex vector spaces. Our QGNN obtains state-of-the-art results on a range of benchmark datasets for graph classification and node classification. Besides, regarding knowledge graphs, our QGNN-based embedding model achieves state-of-the-art results on three new and challenging benchmark datasets for knowledge graph completion. Our code is available at: \url{https://github.com/daiquocnguyen/QGNN}.' volume: 157 URL: https://proceedings.mlr.press/v157/nguyen21a.html PDF: https://proceedings.mlr.press/v157/nguyen21a/nguyen21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-nguyen21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Dai Quoc family: Nguyen - given: Tu Dinh family: Nguyen - given: Dinh family: Phung editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 236-251 id: nguyen21a issued: date-parts: - 2021 - 11 - 28 firstpage: 236 lastpage: 251 published: 2021-11-28 00:00:00 +0000 - title: 'Expressive Neural Voice Cloning' abstract: 'Voice cloning is the task of learning to synthesize the voice of an unseen speaker from a few samples. While current voice cloning methods achieve promising results in Text-to-Speech (TTS) synthesis for a new voice, these approaches lack the ability to control the expressiveness of synthesized audio. In this work, we propose a controllable voice cloning method that allows fine-grained control over various style aspects of the synthesized speech for an unseen speaker. We achieve this by explicitly conditioning the speech synthesis model on a speaker encoding, pitch contour and latent style tokens during training. Through both quantitative and qualitative evaluations, we show that our framework can be used for various expressive voice cloning tasks using only a few transcribed or untranscribed speech samples for a new speaker. These cloning tasks include style transfer from a reference speech, synthesizing speech directly from text, and fine-grained style control by manipulating the style conditioning variables during inference.' volume: 157 URL: https://proceedings.mlr.press/v157/neekhara21a.html PDF: https://proceedings.mlr.press/v157/neekhara21a/neekhara21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-neekhara21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Paarth family: Neekhara - given: Shehzeen family: Hussain - given: Shlomo family: Dubnov - given: Farinaz family: Koushanfar - given: Julian family: McAuley editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 252-267 id: neekhara21a issued: date-parts: - 2021 - 11 - 28 firstpage: 252 lastpage: 267 published: 2021-11-28 00:00:00 +0000 - title: 'SPDE-Net: Neural Network based prediction of stabilization parameter for SUPG technique' abstract: 'We propose \textit{SPDE-Net}, an artificial neural network (ANN) to predict the stabilization parameter for the streamline upwind/Petrov-Galerkin (SUPG) stabilization technique for solving singularly perturbed differential equations (SPDEs). The prediction task is modeled as a regression problem and is solved using an ANN. Three training strategies for the ANN are proposed, i.e., supervised learning, global $L^2$ error minimization and local $L^2$ error minimization. We observe that the proposed method yields accurate results, and even outperforms some existing state-of-the-art ANN-based partial differential equation (PDE) solvers such as the Physics Informed Neural Network (PINN).' volume: 157 URL: https://proceedings.mlr.press/v157/yadav21a.html PDF: https://proceedings.mlr.press/v157/yadav21a/yadav21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-yadav21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Sangeeta family: Yadav - given: Sashikumaar family: Ganesan editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 268-283 id: yadav21a issued: date-parts: - 2021 - 11 - 28 firstpage: 268 lastpage: 283 published: 2021-11-28 00:00:00 +0000 - title: 'DDSAS: Dynamic and Differentiable Space-Architecture Search' abstract: 'Neural Architecture Search (NAS) has made remarkable progress in automatically designing neural networks. However, existing differentiable NAS and stochastic NAS methods are either biased towards exploitation, and thus may converge to a local minimum, or biased towards exploration, and thus converge slowly. In this work, we propose a Dynamic and Differentiable Space-Architecture Search (DDSAS) method to address the exploration-exploitation dilemma. DDSAS dynamically samples the search space, searches for architectures in the sampled subspace with gradient descent, and leverages the Upper Confidence Bound (UCB) to balance exploitation and exploration. The whole search space is elastic, offering flexibility to evolve and to consider resource constraints. Experiments on image classification datasets demonstrate that with only 4GB of memory and 3 hours of searching, DDSAS achieves 2.39% test error on CIFAR10, 16.26% test error on CIFAR100, and 23.9% test error when transferring to ImageNet. When directly searching on ImageNet, DDSAS achieves comparable accuracy with a more than 6.5-fold speedup over state-of-the-art methods. The source codes are available at https://github.com/xingxing-123/DDSAS.' volume: 157 URL: https://proceedings.mlr.press/v157/yang21a.html PDF: https://proceedings.mlr.press/v157/yang21a/yang21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-yang21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Longxing family: Yang - given: Yu family: Hu - given: Shun family: Lu - given: Zihao family: Sun - given: Jilin family: Mei - given: Yiming family: Zeng - given: Zhiping family: Shi - given: Yinhe family: Han - given: Xiaowei family: Li editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 284-299 id: yang21a issued: date-parts: - 2021 - 11 - 28 firstpage: 284 lastpage: 299 published: 2021-11-28 00:00:00 +0000 - title: 'Sinusoidal Flow: A Fast Invertible Autoregressive Flow' abstract: 'Normalising flows offer a flexible way of modelling continuous probability distributions. We consider expressiveness, fast inversion and an exact Jacobian determinant as three desirable properties a normalising flow should possess. However, few flow models have been able to strike a good balance among all these properties. Realising that the integral of a convex sum of squared sinusoidal functions leads to a bijective residual transformation, we propose Sinusoidal Flow, a new type of normalising flow that inherits the expressive power and triangular Jacobian of fully autoregressive flows while being guaranteed by the Banach fixed-point theorem to remain fast to invert, thereby obviating the need for the sequential inversion typically required in fully autoregressive flows. Experiments show that our Sinusoidal Flow is not only able to model complex distributions, but can also be reliably inverted to generate realistic-looking samples, even with many layers of transformations stacked.'
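The fast-inversion property claimed above rests on the Banach fixed-point theorem: a residual transformation $y = x + g(x)$ with a contractive $g$ can be inverted by iterating $x \leftarrow y - g(x)$. Below is a minimal numerical sketch of that inversion scheme; the contraction $g(x) = 0.5\sin(x)$ is a stand-in chosen for illustration, not the learned transformation from the paper.

```python
import numpy as np

def g(x):
    # Stand-in contraction with |g'(x)| <= 0.5 < 1; Sinusoidal Flow instead
    # learns an integral of a convex sum of squared sinusoidal functions.
    return 0.5 * np.sin(x)

def forward(x):
    return x + g(x)  # bijective residual transformation

def invert(y, iters=50):
    x = y                # any starting point works for a contraction
    for _ in range(iters):
        x = y - g(x)     # Banach fixed-point iteration, converges geometrically
    return x

x = np.linspace(-3.0, 3.0, 7)
assert np.allclose(invert(forward(x)), x, atol=1e-8)
```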
volume: 157 URL: https://proceedings.mlr.press/v157/wei21a.html PDF: https://proceedings.mlr.press/v157/wei21a/wei21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wei21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yumou family: Wei editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 299-314 id: wei21a issued: date-parts: - 2021 - 11 - 28 firstpage: 299 lastpage: 314 published: 2021-11-28 00:00:00 +0000 - title: 'Uplift Modeling with High Class Imbalance' abstract: 'Uplift modeling refers to estimating the causal effect of a treatment on an individual observation, used for instance to identify customers worth targeting with a discount in e-commerce. We introduce a simple yet effective undersampling strategy for dealing with the prevalent problem of high class imbalance (low conversion rate) in such applications. Our strategy is agnostic to the base learners and produces a 6.5% improvement over the best published benchmark for the largest public uplift data, which incidentally exhibits high class imbalance. We also introduce a new metric on calibration for uplift modeling and present a strategy to improve the calibration of the proposed method.' volume: 157 URL: https://proceedings.mlr.press/v157/nyberg21a.html PDF: https://proceedings.mlr.press/v157/nyberg21a/nyberg21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-nyberg21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Otto family: Nyberg - given: Tomasz family: Kuśmierczyk - given: Arto family: Klami editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 315-330 id: nyberg21a issued: date-parts: - 2021 - 11 - 28 firstpage: 315 lastpage: 330 published: 2021-11-28 00:00:00 +0000 - title: 'Iterative Deep Model Compression and Acceleration in the Frequency Domain' abstract: 'Deep Convolutional Neural Networks (CNNs) are successfully applied in many complex tasks, but their storage and huge computational costs hinder their deployment on edge devices. CNN model compression techniques have been widely studied in the past five years, most of which operate in the spatial domain. Inspired by the sparsity and low-rank properties of weight matrices in the frequency domain, we propose a novel frequency pruning framework for model compression and acceleration while maintaining high performance. We first apply the Discrete Cosine Transform (DCT) to convolutional kernels and train them in the frequency domain to get sparse representations. Then we propose an iterative model compression method to decompose the frequency matrices with a sampling-based low-rank approximation algorithm, and then fine-tune and recompose the low-rank matrices gradually until a predefined compression ratio is reached. We further demonstrate that model inference can be conducted with the decomposed frequency matrices, where model parameters and inference cost can be significantly reduced. Extensive experiments using well-known CNN models on three open datasets show that the proposed method outperforms the state of the art in reducing both the number of parameters and floating-point operations (FLOPs) without sacrificing much model accuracy.'
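The compression pipeline described above (a DCT on the kernels, a low-rank factorization of the frequency matrix, then recomposition) can be illustrated with a short sketch. The following stand-in uses a truncated SVD in place of the paper's sampling-based low-rank approximation, and the matrix shape and rank are arbitrary choices for the example.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Stand-in 2D weight matrix, e.g. flattened convolutional kernels.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))

# Step 1: move to the frequency domain with an orthonormal DCT.
F = dctn(W, norm="ortho")

# Step 2: rank-r approximation of the frequency matrix.
r = 8
U, s, Vt = np.linalg.svd(F, full_matrices=False)
F_lowrank = (U[:, :r] * s[:r]) @ Vt[:r, :]  # stores ~2*64*r values instead of 64*64

# Step 3: recompose an approximate spatial-domain kernel via the inverse DCT.
W_approx = idctn(F_lowrank, norm="ortho")
print("relative error:", np.linalg.norm(W - W_approx) / np.linalg.norm(W))
```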
volume: 157 URL: https://proceedings.mlr.press/v157/zeng21a.html PDF: https://proceedings.mlr.press/v157/zeng21a/zeng21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zeng21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yao family: Zeng - given: Xusheng family: Liu - given: Lintan family: Sun - given: Wenzhong family: Li - given: Yuchu family: Fang - given: Sanglu family: Lu editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 331-346 id: zeng21a issued: date-parts: - 2021 - 11 - 28 firstpage: 331 lastpage: 346 published: 2021-11-28 00:00:00 +0000 - title: 'Penalty Method for Inversion-Free Deep Bilevel Optimization' abstract: 'Solving a bilevel optimization problem is at the core of several machine learning problems such as hyperparameter tuning, data denoising, meta- and few-shot learning, and training-data poisoning. Different from simultaneous or multi-objective optimization, the steepest descent direction for minimizing the upper-level cost in a bilevel problem requires the inverse of the Hessian of the lower-level cost. In this work, we propose a novel algorithm for solving bilevel optimization problems based on the classical penalty function approach. Our method avoids computing the Hessian inverse and can handle constrained bilevel problems easily. We prove the convergence of the method under mild conditions and show that the exact hypergradient is obtained asymptotically. Our method’s simplicity and small space and time complexities enable us to effectively solve large-scale bilevel problems involving deep neural networks. We present results on data denoising, few-shot learning, and training-data poisoning problems in a large-scale setting. Our results show that our approach outperforms or is comparable to previously proposed methods based on automatic differentiation and approximate inversion in terms of accuracy, run-time, and convergence speed.' volume: 157 URL: https://proceedings.mlr.press/v157/mehra21a.html PDF: https://proceedings.mlr.press/v157/mehra21a/mehra21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-mehra21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Akshay family: Mehra - given: Jihun family: Hamm editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 347-362 id: mehra21a issued: date-parts: - 2021 - 11 - 28 firstpage: 347 lastpage: 362 published: 2021-11-28 00:00:00 +0000 - title: 'CTS2: Time Series Smoothing with Constrained Reinforcement Learning' abstract: 'Time series smoothing is essential for time series analysis and forecasting. It helps to identify trends and patterns in time series. However, the presence of irregular perturbations disrupts time series smoothness and distorts information. The goal of time series smoothing is to remove these perturbations while preserving as much information as possible. Existing smoothing algorithms have complete freedom to make corrections to the data points, which often over-smooths the time series and loses information. To the best of our knowledge, none of them considers constraining data corrections. Moreover, most existing methods either do not smooth in real time or require their parameters to be hand-tuned for different scenarios.
To improve smoothing performance while considering data correction constraints, we propose a $\mathbf{C}$onstrained reinforcement learning-based $\mathbf{T}$ime $\mathbf{S}$eries $\mathbf{S}$moothing method, or CTS$^2$. Specifically, we first formulate the smoothing problem as a Constrained Markov Decision Process (CMDP). We then incorporate data correction constraints to restrict the amount of correction at each point. Finally, we learn a policy network with a linear projection layer to smooth the time series. The linear projection layer ensures that all data corrections satisfy the data correction constraints. We evaluate CTS$^2$ on both synthetic and real-world time series datasets; our results show that CTS$^2$ successfully smooths time series in real time, satisfies all the correction constraints, and works efficiently in a variety of scenarios.' volume: 157 URL: https://proceedings.mlr.press/v157/liu21b.html PDF: https://proceedings.mlr.press/v157/liu21b/liu21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-liu21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yongshuai family: Liu - given: Xin family: Liu editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 363-378 id: liu21b issued: date-parts: - 2021 - 11 - 28 firstpage: 363 lastpage: 378 published: 2021-11-28 00:00:00 +0000 - title: 'Open Images V5 Text Annotation and Yet Another Mask Text Spotter' abstract: 'A large-scale human-labeled dataset plays an important role in creating high-quality deep learning models. In this paper we present text annotation for the Open Images V5 dataset. To our knowledge it is the largest among publicly available manually created text annotations. Using this annotation, we trained a simple Mask-RCNN-based network, referred to as Yet Another Mask Text Spotter (YAMTS), which achieves competitive performance, and in some cases even outperforms current state-of-the-art approaches, on the ICDAR 2013, ICDAR 2015 and {Total-Text} datasets. Code for the text spotting model is available online at: \url{https://github.com/openvinotoolkit/training_extensions}. The model can be exported to the OpenVINO{\texttrademark} format and run on Intel{\textregistered} CPUs.' volume: 157 URL: https://proceedings.mlr.press/v157/krylov21a.html PDF: https://proceedings.mlr.press/v157/krylov21a/krylov21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-krylov21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Ilya family: Krylov - given: Sergei family: Nosov - given: Vladislav family: Sovrasov editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 379-389 id: krylov21a issued: date-parts: - 2021 - 11 - 28 firstpage: 379 lastpage: 389 published: 2021-11-28 00:00:00 +0000 - title: 'Language Representations for Generalization in Reinforcement Learning' abstract: 'The choice of state and action representation in Reinforcement Learning (RL) has a significant effect on agent performance for the training task, but its relationship with generalization to new tasks is under-explored. One approach to improving generalization investigated here is the use of language as a representation. We compare vector-state and discrete-action representations to language representations.
We find that agents using language representations generalize better and can solve tasks with more entities, new entities, and more complexity than seen in the training task. We attribute this to the compositionality of language.' volume: 157 URL: https://proceedings.mlr.press/v157/goodger21a.html PDF: https://proceedings.mlr.press/v157/goodger21a/goodger21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-goodger21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Nikolaj family: Goodger - given: Peter family: Vamplew - given: Cameron family: Foale - given: Richard family: Dazeley editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 390-405 id: goodger21a issued: date-parts: - 2021 - 11 - 28 firstpage: 390 lastpage: 405 published: 2021-11-28 00:00:00 +0000 - title: 'Temporal Relation based Attentive Prototype Network for Few-shot Action Recognition' abstract: 'Few-shot action recognition aims at recognizing novel action classes with only a small number of labeled video samples. We propose a temporal relation based attentive prototype network (TRAPN) for few-shot action recognition. Concretely, we tackle this challenging task from three aspects. Firstly, we propose a spatio-temporal motion enhancement (STME) module to highlight object motions in videos. The STME module utilizes cues from content displacements in videos to enhance the features in the motion-related regions. Secondly, we learn the core common action transformations by our temporal relation (TR) module, which captures the temporal relations at short-term and long-term time scales. The learned temporal relations are encoded into descriptors to constitute sample-level features. The abstract action transformations are described by multiple groups of temporal relation descriptors. Thirdly, a vanilla prototype for the support class (e.g., the mean of the support class) cannot fit different query samples well. We generate an attentive prototype constructed from the temporal relation descriptors of support samples, which gives more weight to discriminative samples. We evaluate our TRAPN on the Kinetics, UCF101 and HMDB51 real-world few-shot datasets. Results show that our network achieves state-of-the-art performance.' volume: 157 URL: https://proceedings.mlr.press/v157/wang21b.html PDF: https://proceedings.mlr.press/v157/wang21b/wang21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wang21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Guangge family: Wang - given: Haihui family: Ye - given: Xiao family: Wang - given: Weirong family: Ye - given: Hanzi family: Wang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 406-421 id: wang21b issued: date-parts: - 2021 - 11 - 28 firstpage: 406 lastpage: 421 published: 2021-11-28 00:00:00 +0000 - title: 'An Optimistic Acceleration of AMSGrad for Nonconvex Optimization' abstract: 'We propose a new variant of AMSGrad (Reddi et al., 2018), a popular adaptive gradient based optimization algorithm widely used for training deep neural networks. Our algorithm adds prior knowledge about the sequence of consecutive mini-batch gradients and leverages its underlying structure, which makes the gradients sequentially predictable.
By exploiting this predictability and ideas from optimistic online learning, the proposed algorithm can accelerate convergence and increase sample efficiency. After establishing a tighter upper bound on the regret under some convexity conditions, we offer a complementary view of our algorithm that generalizes to the offline and stochastic nonconvex optimization settings. In the nonconvex case, we establish a non-asymptotic convergence bound independent of the initialization. We illustrate, via numerical experiments, the practical speedup on several deep learning models and benchmark datasets.' volume: 157 URL: https://proceedings.mlr.press/v157/wang21c.html PDF: https://proceedings.mlr.press/v157/wang21c/wang21c.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wang21c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Jun-Kun family: Wang - given: Xiaoyun family: Li - given: Belhal family: Karimi - given: Ping family: Li editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 422-437 id: wang21c issued: date-parts: - 2021 - 11 - 28 firstpage: 422 lastpage: 437 published: 2021-11-28 00:00:00 +0000 - title: 'Dynamic Coordination Graph for Cooperative Multi-Agent Reinforcement Learning' abstract: 'This paper introduces Dynamic $Q$-value Coordination Graph (QCGraph) for cooperative multi-agent reinforcement learning. QCGraph aims to dynamically represent and generalize by factorizing the joint value function of all agents according to a dynamically created coordination graph based on subsets of agents. The value can be maximized by message passing along the graph at both a local and a global level, which allows the value function to be trained end-to-end. The coordination graph is dynamically generated and used to generate the payoff functions, which are approximated using graph neural networks and parameter sharing to improve generalization over the state-action space. We show that QCGraph can solve a variety of challenging multi-agent tasks and is superior to other value factorization approaches.' volume: 157 URL: https://proceedings.mlr.press/v157/siu21a.html PDF: https://proceedings.mlr.press/v157/siu21a/siu21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-siu21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Chapman family: Siu - given: Jason family: Traish - given: Richard Yi Da family: Xu editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 438-453 id: siu21a issued: date-parts: - 2021 - 11 - 28 firstpage: 438 lastpage: 453 published: 2021-11-28 00:00:00 +0000 - title: 'S2TNet: Spatio-Temporal Transformer Networks for Trajectory Prediction in Autonomous Driving' abstract: 'To safely and rationally participate in dense and heterogeneous traffic, autonomous vehicles need to sufficiently analyze the motion patterns of surrounding traffic-agents and accurately predict their future trajectories. This is challenging because the trajectories of traffic-agents are influenced not only by the traffic-agents themselves but also by their spatial interactions with each other.
Previous methods usually rely on the sequential step-by-step processing of Long Short-Term Memory networks (LSTMs) and merely extract the interactions between spatial neighbors for a single type of traffic-agent. We propose the Spatio-Temporal Transformer Networks (S2TNet), which model the spatio-temporal interactions with a spatio-temporal Transformer and handle the temporal sequences with a temporal Transformer. We input additional category, shape and heading information into our networks to handle the heterogeneity of traffic-agents. The proposed method outperforms state-of-the-art methods on the ApolloScape Trajectory dataset by more than 7% in the weighted sum of Average and Final Displacement Errors.' volume: 157 URL: https://proceedings.mlr.press/v157/chen21a.html PDF: https://proceedings.mlr.press/v157/chen21a/chen21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-chen21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Weihuang family: Chen - given: Fangfang family: Wang - given: Hongbin family: Sun editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 454-469 id: chen21a issued: date-parts: - 2021 - 11 - 28 firstpage: 454 lastpage: 469 published: 2021-11-28 00:00:00 +0000 - title: 'Solving Machine Learning Problems' abstract: 'Can a machine learn Machine Learning? This work trains a machine learning model to solve machine learning problems from a university undergraduate-level course. We generate a new training set of questions and answers consisting of course exercises, homework, and quiz questions from MIT’s 6.036 Introduction to Machine Learning course and train a machine learning model to answer these questions. Our system demonstrates an overall accuracy of 96% for open-response questions and 97% for multiple-choice questions, compared with MIT students’ average of 93%, achieving grade-A performance in the course, all in real time. Questions cover all 12 topics taught in the course, excluding coding questions and questions with images. Topics include: (i) basic machine learning principles; (ii) perceptrons; (iii) feature extraction and selection; (iv) logistic regression; (v) regression; (vi) neural networks; (vii) advanced neural networks; (viii) convolutional neural networks; (ix) recurrent neural networks; (x) state machines and MDPs; (xi) reinforcement learning; and (xii) decision trees. Our system uses Transformer models within an encoder-decoder architecture with graph and tree representations. An important aspect of our approach is a data-augmentation scheme for generating new example problems. We also train a machine learning model to generate problem hints. Thus, our system automatically generates new questions across topics, answers both open-response and multiple-choice questions, classifies problems, and generates problem hints, pushing the envelope of AI for STEM education.'
volume: 157 URL: https://proceedings.mlr.press/v157/tran21a.html PDF: https://proceedings.mlr.press/v157/tran21a/tran21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-tran21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Sunny family: Tran - given: Pranav family: Krishna - given: Ishan family: Pakuwal - given: Prabhakar family: Kafle - given: Nikhil family: Singh - given: Jayson family: Lynch - given: Iddo family: Drori editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 470-485 id: tran21a issued: date-parts: - 2021 - 11 - 28 firstpage: 470 lastpage: 485 published: 2021-11-28 00:00:00 +0000 - title: 'Pedestrian Wind Factor Estimation in Complex Urban Environments' abstract: 'Urban planners and policy makers face the challenge of creating livable and enjoyable cities for larger populations in much denser urban conditions. While the urban microclimate holds a key role in defining the quality of urban spaces today and in the future, the integration of wind microclimate assessment in early urban design and planning processes remains a challenge due to the complexity and high computational expense of computational fluid dynamics (CFD) simulations. This work develops a data-driven workflow for real-time pedestrian wind comfort estimation in complex urban environments which may enable designers, policy makers and city residents to make informed decisions about mobility, health, and energy choices. We use a conditional generative adversarial network (cGAN) architecture to reduce the computational cost while maintaining high confidence levels and interpretability, adequate representation of urban complexity, and suitability for pedestrian comfort estimation. We demonstrate high-quality wind field approximations while reducing computation time from days to seconds.' volume: 157 URL: https://proceedings.mlr.press/v157/mokhtar21a.html PDF: https://proceedings.mlr.press/v157/mokhtar21a/mokhtar21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-mokhtar21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Sarah family: Mokhtar - given: Matt family: Beveridge - given: Yumeng family: Cao - given: Iddo family: Drori editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 486-501 id: mokhtar21a issued: date-parts: - 2021 - 11 - 28 firstpage: 486 lastpage: 501 published: 2021-11-28 00:00:00 +0000 - title: 'DPOQ: Dynamic Precision Onion Quantization' abstract: 'With the development of deployment platforms and application scenarios for deep neural networks, traditional fixed network architectures cannot meet the requirements, and dynamic network inference has become a new research trend. Many slimmable and scalable networks have been proposed to satisfy different resource constraints (e.g., storage, latency and energy), and a single network may support versatile architectural configurations including depth, width, kernel size, and resolution. In this paper, we propose a novel network architecture reuse strategy enabling dynamic precision in parameters. Since our low-precision networks are wrapped in the high-precision networks like an onion, we name it dynamic precision onion quantization (DPOQ).
We train the network using a joint loss with scaled gradients. To further improve performance and make networks of different precisions compatible with each other, we propose precision shift batch normalization (PSBN). We also propose a scalable input-specific inference mechanism based on this architecture, making the network more adaptable. Experiments on the CIFAR and ImageNet datasets show that our DPOQ achieves not only better flexibility but also higher accuracy than individual quantization.' volume: 157 URL: https://proceedings.mlr.press/v157/li21a.html PDF: https://proceedings.mlr.press/v157/li21a/li21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-li21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Bowen family: Li - given: Kai family: Huang - given: Siang family: Chen - given: Dongliang family: Xiong - given: Luc family: Claesen editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 502-517 id: li21a issued: date-parts: - 2021 - 11 - 28 firstpage: 502 lastpage: 517 published: 2021-11-28 00:00:00 +0000 - title: 'A Causal Approach for Unfair Edge Prioritization and Discrimination Removal' abstract: 'In budget-constrained settings aimed at mitigating unfairness, like law enforcement, it is essential to prioritize the sources of unfairness before taking measures to mitigate them in the real world. Unlike previous works, which only serve as a caution against possible discrimination and de-bias data after data generation, this work provides a toolkit to mitigate unfairness during data generation, given by the Unfair Edge Prioritization algorithm, in addition to de-biasing data after generation, given by the Discrimination Removal algorithm. We assume that a non-parametric Markovian causal model representative of the data generation procedure is given. The edges emanating from the sensitive nodes in the causal graph, such as race, are assumed to be the sources of unfairness. We first quantify Edge Flow in any edge X –> Y, which is the belief of observing a specific value of Y due to the influence of a specific value of X along X –> Y. We then quantify Edge Unfairness by formulating a non-parametric model in terms of edge flows. We then prove that cumulative unfairness towards sensitive groups in a decision, like race in a bail decision, is non-existent when edge unfairness is absent. We prove this result for the non-trivial non-parametric model setting when the cumulative unfairness cannot be expressed in terms of edge unfairness. We then measure the Potential to mitigate the Cumulative Unfairness when edge unfairness is decreased. Based on these measurements, we propose the Unfair Edge Prioritization algorithm that can then be used by policymakers. We also propose the Discrimination Removal Procedure that de-biases a data distribution by eliminating optimization constraints that grow exponentially in the number of sensitive attributes and the values taken by them. Extensive experiments validate the theorem and specifications used for quantifying the above measures.'
volume: 157 URL: https://proceedings.mlr.press/v157/pavan-ravishankar21a.html PDF: https://proceedings.mlr.press/v157/pavan-ravishankar21a/pavan-ravishankar21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-pavan-ravishankar21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Pavan family: Ravishankar - given: Pranshu family: Malviya - given: Balaraman family: Ravindran editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 518-533 id: pavan-ravishankar21a issued: date-parts: - 2021 - 11 - 28 firstpage: 518 lastpage: 533 published: 2021-11-28 00:00:00 +0000 - title: 'Spatial Temporal Enhanced Contrastive and Pretext Learning for Skeleton-based Action Representation' abstract: 'In this paper, we focus on unsupervised representation learning for skeleton-based action recognition. The critical issue of this task is extracting discriminative spatial-temporal information from skeleton sequences to form action representations. To better solve this, we propose a novel unsupervised framework named contrastive-pretext spatial-temporal network (CP-STN), aiming to achieve accurate action recognition by better exploiting discriminative spatial-temporal enhanced features from massive unlabeled data. We combine the contrastive and pretext-task learning paradigms in one framework by using asymmetric spatial and temporal augmentations, enabling the network to fully extract discriminative representations with spatial-temporal information. Furthermore, graph-based convolution is used as the backbone to explore natural spatial-temporal graph information in skeleton data. Extensive experimental results show that our CP-STN significantly boosts the performance of existing skeleton-based action representation learning networks and achieves state-of-the-art accuracy on two challenging benchmarks in both unsupervised and semi-supervised settings.' volume: 157 URL: https://proceedings.mlr.press/v157/zhan21a.html PDF: https://proceedings.mlr.press/v157/zhan21a/zhan21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhan21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yiwen family: Zhan - given: Yuchen family: Chen - given: Pengfei family: Ren - given: Haifeng family: Sun - given: Jingyu family: Wang - given: Qi family: Qi - given: Jianxin family: Liao editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 534-547 id: zhan21a issued: date-parts: - 2021 - 11 - 28 firstpage: 534 lastpage: 547 published: 2021-11-28 00:00:00 +0000 - title: 'QActor: Active Learning on Noisy Labels' abstract: 'Noisily labeled data is more the norm than a rarity for self-generated content that is continuously published on the web and social media by non-experts. Actively querying experts is conventionally adopted to provide labels for informative samples that are unlabeled or carry possibly incorrect labels. The new challenge that arises here is how to discern the informative and noisy labels which benefit from expert cleaning. In this paper, we aim to leverage a stringent oracle budget to robustly maximize learning accuracy.
We propose a noise-aware active learning framework, QActor, and a novel measure \emph{CENT}, which considers both cross-entropy and entropy to select informative and noisy labels for expert cleansing. QActor iteratively cleans samples via quality models and actively queries an expert on those noisy yet informative samples. To adapt to the learning capacity per iteration, QActor dynamically adjusts the query limit according to the learning loss of each learning iteration. We extensively evaluate different image datasets with noisy-label ratios ranging between 30% and 60%. Our results show that QActor can nearly match the optimal accuracy achieved using only clean data at the cost of only an additional 10% of ground-truth data from the oracle.' volume: 157 URL: https://proceedings.mlr.press/v157/younesian21a.html PDF: https://proceedings.mlr.press/v157/younesian21a/younesian21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-younesian21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Taraneh family: Younesian - given: Zilong family: Zhao - given: Amirmasoud family: Ghiassi - given: Robert family: Birke - given: Lydia Y family: Chen editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 548-563 id: younesian21a issued: date-parts: - 2021 - 11 - 28 firstpage: 548 lastpage: 563 published: 2021-11-28 00:00:00 +0000 - title: 'Max-Utility Based Arm Selection Strategy For Sequential Query Recommendations' abstract: 'We consider the query recommendation problem in closed-loop interactive learning settings like online information gathering and exploratory analytics. The problem can be naturally modelled using the Multi-Armed Bandits (MAB) framework with countably many arms. The standard MAB algorithms for countably many arms begin with selecting a random set of candidate arms and then apply standard MAB algorithms, e.g., UCB, on this candidate set downstream. We show that such a selection strategy often results in higher cumulative regret, and to this end, we propose a selection strategy based on the maximum utility of the arms. We show that in tasks like online information gathering, where sequential query recommendations are employed, the sequences of queries are correlated and the number of potentially optimal queries can be reduced to a manageable size by selecting queries with maximum utility with respect to the currently executing query. Our experimental results, using a log file from a real online literature discovery service, demonstrate that the proposed arm selection strategy substantially improves the cumulative regret with respect to state-of-the-art baseline algorithms.
Our data model and source code are available at \url{https://anonymous.4open.science/r/0e5ad6b7-ac02-4577-9212-c9d505d3dbdb/}' volume: 157 URL: https://proceedings.mlr.press/v157/puthiya-parambath21a.html PDF: https://proceedings.mlr.press/v157/puthiya-parambath21a/puthiya-parambath21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-puthiya-parambath21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Shameem family: Puthiya Parambath - given: Christos family: Anagnostopoulos - given: Roderick family: Murray-Smith - given: Sean family: MacAvaney - given: Evangelos family: Zervas editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 564-579 id: puthiya-parambath21a issued: date-parts: - 2021 - 11 - 28 firstpage: 564 lastpage: 579 published: 2021-11-28 00:00:00 +0000 - title: 'Multi-task Actor-Critic with Knowledge Transfer via a Shared Critic' abstract: 'Multi-task actor-critic is a learning paradigm proposed in the literature to improve the learning efficiency of multiple actor-critics by sharing the learned policies across tasks while the reinforcement learning progresses online. However, existing multi-task actor-critic algorithms can only handle reinforcement learning tasks within the same problem domain; they may fail in cases where tasks possess diverse state-action spaces. Taking this cue, in this paper, we embark on a study of multi-task actor-critic with knowledge transfer via a shared critic to enable multi-task learning of actor-critic in heterogeneous state-action environments. Further, for efficient learning of the proposed multi-task actor-critic, a new formula for calculating the gradient of the actor network is also presented. To evaluate the performance of our approach, we conduct comprehensive empirical studies on continuous robotic tasks with different numbers of links. The experimental results confirm the effectiveness of the proposed multi-task actor-critic algorithm.' volume: 157 URL: https://proceedings.mlr.press/v157/zhang21b.html PDF: https://proceedings.mlr.press/v157/zhang21b/zhang21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhang21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Gengzhi family: Zhang - given: Liang family: Feng - given: Yaqing family: Hou editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 580-593 id: zhang21b issued: date-parts: - 2021 - 11 - 28 firstpage: 580 lastpage: 593 published: 2021-11-28 00:00:00 +0000 - title: 'Contrastive Neural Processes for Self-Supervised Learning' abstract: 'Recent contrastive methods show significant improvement in self-supervised learning in several domains. In particular, contrastive methods are most effective where data augmentation can be easily constructed, e.g., in computer vision. However, they are less successful in domains without established data transformations such as time series data. In this paper, we propose a novel self-supervised learning framework that combines contrastive learning with neural processes. It relies on recent advances in neural processes to perform time series forecasting.
This allows us to generate augmented versions of the data by employing a set of various sampling functions and, hence, avoid manually designed augmentations. We extend conventional neural processes and propose a new contrastive loss to learn time series representations in a self-supervised setup. Therefore, unlike previous self-supervised methods, our augmentation pipeline is task-agnostic, enabling our method to perform well across various applications. In particular, a ResNet with a linear classifier trained using our approach is able to outperform state-of-the-art techniques across industrial, medical and audio datasets, improving accuracy by over 10% on periodic ECG data. We further demonstrate that our self-supervised representations are more efficient in the latent space, improving multiple clustering indexes, and that fine-tuning our method on 10% of the labels achieves results competitive with fully-supervised learning.' volume: 157 URL: https://proceedings.mlr.press/v157/kallidromitis21a.html PDF: https://proceedings.mlr.press/v157/kallidromitis21a/kallidromitis21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-kallidromitis21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Konstantinos family: Kallidromitis - given: Denis family: Gudovskiy - given: Kozuka family: Kazuki - given: Ohama family: Iku - given: Luca family: Rigazio editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 594-609 id: kallidromitis21a issued: date-parts: - 2021 - 11 - 28 firstpage: 594 lastpage: 609 published: 2021-11-28 00:00:00 +0000 - title: 'Pyramid Correlation based Deep Hough Voting for Visual Object Tracking' abstract: 'Most of the existing Siamese-based trackers treat the tracking problem as a parallel task of classification and regression. However, some studies show that the sibling head structure can lead to suboptimal solutions during network training. Through experiments we find that, without regression, the performance can be equally promising as long as we delicately design the network to suit the training objective. We introduce a novel voting-based classification-only tracking algorithm named Pyramid Correlation based Deep Hough Voting (PCDHV for short) to jointly locate the top-left and bottom-right corners of the target. Specifically, we construct a Pyramid Correlation module to equip the embedded feature with fine-grained local structures and global spatial contexts; the elaborately designed Deep Hough Voting module then takes over, integrating long-range dependencies of pixels to perceive corners; in addition, the prevalent discretization gap is simply yet effectively alleviated by increasing the spatial resolution of the feature maps while exploiting channel-space relationships. The algorithm is general, robust and simple. We demonstrate the effectiveness of the modules through a series of ablation experiments. Without bells and whistles, our tracker achieves better or comparable performance to the SOTA algorithms on three challenging benchmarks (TrackingNet, GOT-10k and LaSOT) while running at a real-time speed of 80 FPS. Codes and models will be released.
' volume: 157 URL: https://proceedings.mlr.press/v157/wang21d.html PDF: https://proceedings.mlr.press/v157/wang21d/wang21d.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wang21d.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Ying family: Wang - given: Tingfa family: Xu - given: Shenwang family: Jiang - given: Junjie family: Chen - given: Jianan family: Li editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 610-625 id: wang21d issued: date-parts: - 2021 - 11 - 28 firstpage: 610 lastpage: 625 published: 2021-11-28 00:00:00 +0000 - title: 'Calibrated Adversarial Training' abstract: 'Adversarial training is an approach to increasing the robustness of models to adversarial attacks by including adversarial examples in the training set. One major challenge of producing adversarial examples is to contain sufficient perturbation in the example to flip the model’s output while not making severe changes in the example’s semantic content. Excessive change in the semantic content could also change the true label of the example, and adding such examples to the training set results in adverse effects. In this paper, we present Calibrated Adversarial Training, a method that reduces the adverse effects of semantic perturbations in adversarial training. The method produces pixel-level adaptations to the perturbations based on a novel calibrated robust error. We provide a theoretical analysis of the calibrated robust error and derive an upper bound for it. Our empirical results show superior performance of Calibrated Adversarial Training on a number of public datasets.' volume: 157 URL: https://proceedings.mlr.press/v157/huang21a.html PDF: https://proceedings.mlr.press/v157/huang21a/huang21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-huang21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Tianjin family: Huang - given: Vlado family: Menkovski - given: Yulong family: Pei - given: Mykola family: Pechenizkiy editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 626-641 id: huang21a issued: date-parts: - 2021 - 11 - 28 firstpage: 626 lastpage: 641 published: 2021-11-28 00:00:00 +0000 - title: 'ASD-Conv: Monocular 3D object detection network based on Asymmetrical Segmentation Depth-aware Convolution' abstract: 'In the field of 3D object recognition, monocular 3D recognition is a valuable technology; compared with binocular and lidar technologies, its cost is lower. In this paper, building on existing monocular 3D recognition networks, we propose an asymmetrical segmentation depth-aware network, ASD-Conv, to better obtain the depth information of monocular images and thereby achieve better recognition results. Compared with other monocular recognition networks, ASD-Conv performs a special segmentation on the image, which better captures the depth distribution of the image and yields clear improvements on 2D, BEV and 3D image recognition tasks. The improved algorithm proposed in this paper improves detection accuracy while maintaining a certain real-time performance.
Experimental results show that, compared with the current model, the proposed monocular 3D object detection algorithm based on D-ASDConv achieves an average improvement of 2.82% (AP) in large object detection and a highest average improvement of 2.01% (AP) in small object detection on the KITTI dataset. The algorithm can effectively learn more advanced spatial-perception features, and the detection results on monocular images are more accurate.' volume: 157 URL: https://proceedings.mlr.press/v157/xingyuan21a.html PDF: https://proceedings.mlr.press/v157/xingyuan21a/xingyuan21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-xingyuan21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yu family: Xingyuan - given: Du family: Neng - given: Gao family: Ge - given: Wen family: Fan editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 642-655 id: xingyuan21a issued: date-parts: - 2021 - 11 - 28 firstpage: 642 lastpage: 655 published: 2021-11-28 00:00:00 +0000 - title: 'Convolutional Hypercomplex Embeddings for Link Prediction' abstract: 'Knowledge graph embedding research has mainly focused on the two smallest normed division algebras, $\mathbb{R}$ and $\mathbb{C}$. Recent results suggest that trilinear products of quaternion-valued embeddings can be a more effective means to tackle link prediction. In addition, models based on convolutions on real-valued embeddings often yield state-of-the-art results for link prediction. In this paper, we investigate a composition of convolution operations with hypercomplex multiplications. We propose the four approaches QMult, OMult, ConvQ and ConvO to tackle the link prediction problem. QMult and OMult can be considered as quaternion and octonion extensions of previous state-of-the-art approaches, including DistMult and ComplEx. ConvQ and ConvO build upon QMult and OMult by including convolution operations in a way inspired by the residual learning framework. We evaluated our approaches on seven link prediction datasets including WN18RR, FB15K-237 and YAGO3-10. Experimental results suggest that the benefits of learning hypercomplex-valued vector representations become more apparent as the size and complexity of the knowledge graph grow. ConvO outperforms state-of-the-art approaches on FB15K-237 in MRR, Hit@1 and Hit@3, while QMult, OMult, ConvQ and ConvO outperform state-of-the-art approaches on YAGO3-10 in all metrics. Results also suggest that link prediction performance can be further improved via prediction averaging. To foster reproducible research, we provide an open-source implementation of our approaches, including training and evaluation scripts as well as pretrained models.' volume: 157 URL: https://proceedings.mlr.press/v157/demir21a.html PDF: https://proceedings.mlr.press/v157/demir21a/demir21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-demir21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Caglar family: Demir - given: Diego family: Moussallem - given: Stefan family: Heindorf - given: Axel-Cyrille family: Ngonga Ngomo editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 656-671 id: demir21a issued: date-parts: - 2021 - 11 - 28 firstpage: 656 lastpage: 671 published: 2021-11-28 00:00:00 +0000 - title: 'Beyond $L_p$ Clipping: Equalization based Psychoacoustic Attacks against ASRs' abstract: 'Automatic Speech Recognition (ASR) systems convert speech into text and can be placed into two broad categories: traditional and fully end-to-end. Both types have been shown to be vulnerable to adversarial audio examples that sound benign to the human ear but force the ASR to produce malicious transcriptions. Of these attacks, only the “psychoacoustic” attacks can create examples with relatively imperceptible perturbations, as they leverage the knowledge of the human auditory system. Unfortunately, existing psychoacoustic attacks can only be applied against traditional models, and are obsolete against the newer, fully end-to-end ASRs. In this paper, we propose an equalization-based psychoacoustic attack that can exploit both traditional and fully end-to-end ASRs. We successfully demonstrate our attack against real-world ASRs that include DeepSpeech and Wav2Letter. Moreover, we employ a user study to verify that our method creates low audible distortion. Specifically, 80 of the 100 participants voted in favor of \textit{all} our attack audio samples as less noisy than the existing state-of-the-art attack. Through this, we demonstrate that both types of existing ASR pipelines can be exploited with minimal degradation to attack audio quality.' volume: 157 URL: https://proceedings.mlr.press/v157/abdullah21a.html PDF: https://proceedings.mlr.press/v157/abdullah21a/abdullah21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-abdullah21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Hadi family: Abdullah - given: Muhammad Sajidur family: Rahman - given: Christian family: Peeters - given: Cassidy family: Gibson - given: Washington family: Garcia - given: Vincent family: Bindschaedler - given: Thomas family: Shrimpton - given: Patrick family: Traynor editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 672-688 id: abdullah21a issued: date-parts: - 2021 - 11 - 28 firstpage: 672 lastpage: 688 published: 2021-11-28 00:00:00 +0000 - title: 'Slice-sampling based 3D Object Classification' abstract: 'Multiview-based 3D object detection has achieved great success in the past years. However, for models with complex inner structures, the performance of these methods is not satisfactory. This paper provides a method based on slice sampling for 3D object classification. First, we slice and sample the model from different depths and directions to obtain the model’s features. Then, a deep neural network designed based on the attention mechanism is used to classify the input data. The experiments show that the performance of our method is competitive on ModelNet. Moreover, for some special models with simple surfaces and complex inner structures, the performance of our method is outstanding and stable.'
volume: 157 URL: https://proceedings.mlr.press/v157/xiangwen21a.html PDF: https://proceedings.mlr.press/v157/xiangwen21a/xiangwen21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-xiangwen21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Zhao family: Xiangwen - given: Yang family: Yi-Jun - given: Zeng family: Wei - given: Yang family: Liqun - given: Wang family: Yao editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 689-704 id: xiangwen21a issued: date-parts: - 2021 - 11 - 28 firstpage: 689 lastpage: 704 published: 2021-11-28 00:00:00 +0000 - title: 'Multi-Branch Network for Cross-Subject EEG-based Emotion Recognition' abstract: 'In recent years, electroencephalogram (EEG)-based emotion recognition has received increasing attention in affective computing. Since individual differences in EEG signals are large, most models are trained for specific subjects, and their generalization is poor when applied to new subjects. In this paper, we propose a Multi-Branch Network (MBN) model to solve this problem. According to the characteristics of the cross-subject data, different branch networks are designed to separate the background features and task features of the EEG signals for classification, yielding better model performance. Besides, no new-subject data is needed during model training. To avoid the negative effect on model training caused by samples with significant differences, a tiny amount of new-subject data is used to filter the training samples and further improve model performance. Before training the model, samples with significant differences from the new subject are deleted by comparing the background features between subjects. The experimental results show that, compared with a Single-Branch Network (SBN) model, the accuracy of the MBN model is improved by 20.89% on the SEED dataset. Furthermore, compared with other common methods, the proposed method uses less new-subject data, which improves its practicality in real applications.' volume: 157 URL: https://proceedings.mlr.press/v157/lin21a.html PDF: https://proceedings.mlr.press/v157/lin21a/lin21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-lin21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Guang family: Lin - given: Li family: Zhu - given: Bin family: Ren - given: Yiteng family: Hu - given: Jianhai family: Zhang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 705-720 id: lin21a issued: date-parts: - 2021 - 11 - 28 firstpage: 705 lastpage: 720 published: 2021-11-28 00:00:00 +0000 - title: 'Skew-symmetrically perturbed gradient flow for convex optimization' abstract: 'Recently, many methods for optimization and sampling have been developed by designing continuous dynamics followed by discretization. The dynamics that have been used for optimization have corresponding underlying functionals to be minimized. On the other hand, a wider class of dynamics has been studied for sampling, which is not necessarily limited to functional minimization. For example, dynamics perturbed with skew-symmetric matrices, which cannot be seen as the minimization of functionals, have been widely used to reduce asymptotic variance.
Following this success in sampling, exploring such perturbed dynamics in the context of optimization can open a new avenue for optimization algorithm design. In this work, we introduce this perturbation technique from sampling into optimization, and show that the perturbation applied to the gradient flow yields rapid convergence for strongly convex functions. Based on these continuous dynamics, we propose an optimization algorithm for strongly convex functions with a novel discretization framework that combines the Euler method with the leapfrog method used in the Hamiltonian Monte Carlo method. Our numerical experiments show that the perturbation technique is useful for optimization.' volume: 157 URL: https://proceedings.mlr.press/v157/futami21a.html PDF: https://proceedings.mlr.press/v157/futami21a/futami21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-futami21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Futoshi family: Futami - given: Tomoharu family: Iwata - given: Naonori family: Ueda - given: Ikko family: Yamane editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 721-736 id: futami21a issued: date-parts: - 2021 - 11 - 28 firstpage: 721 lastpage: 736 published: 2021-11-28 00:00:00 +0000 - title: 'Improving Gaussian mixture latent variable model convergence with Optimal Transport' abstract: 'Generative models with both discrete and continuous latent variables are highly motivated by the structure of many real-world data sets. They present, however, subtleties in training, often manifesting in the discrete latent variable not being leveraged. In this paper, we show why such models struggle to train using traditional log-likelihood maximization, and that they are amenable to training using the Optimal Transport framework of Wasserstein Autoencoders. We find our discrete latent variable to be fully leveraged by the model when trained, without any modifications to the objective function or significant fine-tuning. Our model generates comparable samples to other approaches while using relatively simple neural networks, since the discrete latent variable carries much of the descriptive burden. Furthermore, the discrete latent provides significant control over generation.' volume: 157 URL: https://proceedings.mlr.press/v157/gaujac21a.html PDF: https://proceedings.mlr.press/v157/gaujac21a/gaujac21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-gaujac21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Benoit family: Gaujac - given: Ilya family: Feige - given: David family: Barber editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 737-752 id: gaujac21a issued: date-parts: - 2021 - 11 - 28 firstpage: 737 lastpage: 752 published: 2021-11-28 00:00:00 +0000 - title: 'Generating Deep Networks Explanations with Robust Attribution Alignment' abstract: 'Attribution methods play a key role in generating post-hoc explanations of pre-trained models; however, it has been shown that existing methods yield unfaithful and noisy explanations.
In this paper, we propose a new paradigm of attribution method: we treat the model’s explanations as part of the network’s outputs and then generate attribution maps from the underlying deep network. The generated attribution maps are up-sampled from the last convolutional layer of the network to obtain localization information about the target to be explained. Inspired by recent studies showing that adversarially robust models’ saliency maps align well with human perception, we utilize attribution maps from the robust model to supervise the learned attributions. Our proposed method can produce visually plausible explanations along with the prediction in the inference phase. Experiments on real datasets show that our proposed method yields more faithful explanations than post-hoc attribution methods with lighter computational costs.' volume: 157 URL: https://proceedings.mlr.press/v157/zeng21b.html PDF: https://proceedings.mlr.press/v157/zeng21b/zeng21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zeng21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Guohang family: Zeng - given: Yousef family: Kowsar - given: Sarah family: Erfani - given: James family: Bailey editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 753-768 id: zeng21b issued: date-parts: - 2021 - 11 - 28 firstpage: 753 lastpage: 768 published: 2021-11-28 00:00:00 +0000 - title: 'Scalable gradient matching based on state space Gaussian Processes' abstract: 'In many scientific fields, various phenomena are modeled by ordinary differential equations (ODEs). Parameters in ODEs are generally unknown and hard to measure directly. Since analytical solutions for ODEs can rarely be obtained, statistical methods are often used to infer parameters from experimental observations. Among many existing methods, Gaussian process-based gradient matching has been explored extensively. However, existing methods cannot be scaled to massive datasets: given $N$ data points, existing algorithms incur $\mathcal{O}(N^3)$ computational cost. In this paper, we propose a novel algorithm using the state space reformulation of Gaussian processes. More specifically, we reformulate Gaussian process gradient matching as a special state-space model problem, then approximate its posterior distribution by a novel Rao-Blackwellization filtering, which enjoys $\mathcal{O}(N)$ computational cost. Moreover, since our algorithm is expressed in closed form, it is 1000 times faster than existing methods as measured in wall-clock time.' volume: 157 URL: https://proceedings.mlr.press/v157/futami21b.html PDF: https://proceedings.mlr.press/v157/futami21b/futami21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-futami21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Futoshi family: Futami editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 769-784 id: futami21b issued: date-parts: - 2021 - 11 - 28 firstpage: 769 lastpage: 784 published: 2021-11-28 00:00:00 +0000 - title: 'Domain Adaptive YOLO for One-Stage Cross-Domain Detection' abstract: 'Domain shift is a major challenge for object detectors to generalize well to real world applications.
Emerging techniques of domain adaptation for two-stage detectors help to tackle this problem. However, two-stage detectors are not the first choice for industrial applications due to their long inference time. In this paper, a novel Domain Adaptive YOLO (DA-YOLO) is proposed to improve the cross-domain performance of one-stage detectors. Image-level feature alignment is used to strictly match local features like texture and loosely match global features like illumination. Multi-scale instance-level feature alignment is presented to effectively reduce instance domain shift, such as variations in object appearance and viewpoint. Consensus regularization over these domain classifiers is employed to help the network generate domain-invariant detections. We evaluate our proposed method on popular datasets such as Cityscapes, KITTI and SIM10K. The results demonstrate considerable improvement when tested under different cross-domain scenarios.' volume: 157 URL: https://proceedings.mlr.press/v157/zhang21c.html PDF: https://proceedings.mlr.press/v157/zhang21c/zhang21c.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhang21c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Shizhao family: Zhang - given: Hongya family: Tuo - given: Jian family: Hu - given: Zhongliang family: Jing editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 785-797 id: zhang21c issued: date-parts: - 2021 - 11 - 28 firstpage: 785 lastpage: 797 published: 2021-11-28 00:00:00 +0000 - title: 'Hierarchical Semantic Segmentation using Psychometric Learning' abstract: 'Assigning meaning to parts of image data is the goal of semantic image segmentation. Machine learning methods, specifically supervised learning, are commonly used in a variety of tasks formulated as semantic segmentation. One of the major challenges in supervised learning approaches is expressing and collecting the rich knowledge that experts have with respect to the meaning present in the image data. Towards this, typically a fixed set of labels is specified and experts are tasked with annotating the pixels, patches or segments in the images with the given labels. In general, however, the set of classes does not fully capture the rich semantic information present in the images. For example, in medical imaging such as histology images, the different parts of cells could be grouped and sub-grouped based on the expertise of the pathologist. To achieve such a precise semantic representation of the concepts in the image, we need access to the full depth of knowledge of the annotator. In this work, we develop a novel approach to collect segmentation annotations from experts based on psychometric testing. Our method consists of a psychometric testing procedure, active query selection, query enhancement, and a deep metric learning model to achieve a patch-level image embedding that allows for semantic segmentation of images. We show the merits of our method with evaluation on synthetically generated images, aerial images and histology images.'
volume: 157 URL: https://proceedings.mlr.press/v157/yin21a.html PDF: https://proceedings.mlr.press/v157/yin21a/yin21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-yin21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Lu family: Yin - given: Vlado family: Menkovski - given: Shiwei family: Liu - given: Mykola family: Pechenizkiy editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 798-813 id: yin21a issued: date-parts: - 2021 - 11 - 28 firstpage: 798 lastpage: 813 published: 2021-11-28 00:00:00 +0000 - title: 'Improving Hashing Algorithms for Similarity Search via MLE and the Control Variates Trick' abstract: 'Hashing algorithms are widely used for large-scale learning and similarity search, with computationally cheaper and better algorithms being proposed every year. In this paper we focus on hashing algorithms which involve estimating a distance measure $d(\vec{x}_i,\vec{x}_j)$ between two vectors $\vec{x}_i, \vec{x}_j$. Such hashing algorithms require the generation of random variables, and we propose two approaches to reduce the variance of our hashed estimates: control variates and maximum likelihood estimates. We explain how these approaches can be immediately applied to a wide subset of hashing algorithms. Further, we evaluate the impact of these methods on various datasets. We finally run empirical simulations to verify our results.' volume: 157 URL: https://proceedings.mlr.press/v157/kang21a.html PDF: https://proceedings.mlr.press/v157/kang21a/kang21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-kang21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Keegan family: Kang - given: Sergey family: Kushnarev - given: Wei Pin family: Wong - given: Rameshwar family: Pratap - given: Haikal family: Yeo - given: Chen family: Yijia editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 814-829 id: kang21a issued: date-parts: - 2021 - 11 - 28 firstpage: 814 lastpage: 829 published: 2021-11-28 00:00:00 +0000 - title: 'Feature Convolutional Networks' abstract: 'Convolutional neural networks are among the most successful deep learning models used for image processing, computer vision and natural language processing applications. In this paper, we define a convolution operator for numerical tabular features and propose the feature convolutional network model for machine learning tasks. Feature convolutional networks contain a feature convolution layer to extract pairwise feature convolutions in the relational feature spaces. Compared with the baseline multi-layer neural network model, the feature convolutional network achieves better performance in all the experiments. The experimental results suggest that feature convolutional networks can generate efficient features automatically and provide better performance through automatic feature learning. The demo code is at https://github.com/info-ruc/FeatConvNet.'
volume: 157 URL: https://proceedings.mlr.press/v157/hu21a.html PDF: https://proceedings.mlr.press/v157/hu21a/hu21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-hu21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: He family: Hu editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 830-839 id: hu21a issued: date-parts: - 2021 - 11 - 28 firstpage: 830 lastpage: 839 published: 2021-11-28 00:00:00 +0000 - title: 'Bias-tolerant Fair Classification' abstract: 'Label bias and selection bias are acknowledged as two causes in data that hinder the fairness of machine-learning outcomes. Label bias occurs when the labeling decision is disturbed by sensitive features, while selection bias occurs when subjective bias exists during data sampling. Even worse, models trained on such data can inherit or even intensify the discrimination. Most algorithmic fairness approaches perform empirical risk minimization with predefined fairness constraints, which tends to trade off accuracy for fairness. However, such methods achieve the desired fairness level at the sacrifice of benefits (receiving positive outcomes) for individuals affected by the bias. Therefore, we propose a \textbf{B}ias-Tolerant \textbf{FA}ir \textbf{R}egularized \textbf{L}oss (B-FARL), which tries to regain these benefits using data affected by label bias and selection bias. B-FARL takes the biased data as input and yields a model that approximates one trained with fair but latent data, and thus prevents discrimination without requiring explicit constraints. In addition, we show the effective components by decomposing B-FARL, and we utilize the meta-learning framework for the B-FARL optimization. The experimental results on real-world datasets show that our method is empirically effective in improving fairness towards the direction of the true but latent labels.' volume: 157 URL: https://proceedings.mlr.press/v157/zhang21d.html PDF: https://proceedings.mlr.press/v157/zhang21d/zhang21d.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhang21d.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yixuan family: Zhang - given: Feng family: Zhou - given: Zhidong family: Li - given: Yang family: Wang - given: Fang family: Chen editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 840-855 id: zhang21d issued: date-parts: - 2021 - 11 - 28 firstpage: 840 lastpage: 855 published: 2021-11-28 00:00:00 +0000 - title: 'Multi-factor Memory Attentive Model for Knowledge Tracing' abstract: 'Traditional knowledge tracing with neural networks usually embeds the required information and predicts knowledge proficiency from the embedded information. Only limited information, however, is considered in traditional methods, such as the information of exercises in terms of concepts. In this paper, we propose a multi-factor memory attentive model for knowledge tracing (MMAKT). In terms of the Neural Cognitive Diagnosis (NeuralCD) framework, MMAKT introduces the factors of knowledge concept relevancy, the difficulty of each concept, the discrimination among exercises and the student’s proficiency to construct interaction vectors.
Moreover, to achieve more accurate prediction, MMAKT introduces an attention mechanism to enhance the expression of the historical relationships between interactions. In experiments on real-world datasets, MMAKT shows better knowledge tracing and prediction performance in comparison with state-of-the-art approaches.' volume: 157 URL: https://proceedings.mlr.press/v157/liu21c.html PDF: https://proceedings.mlr.press/v157/liu21c/liu21c.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-liu21c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Congjie family: Liu - given: Xiaoguang family: Li editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 856-869 id: liu21c issued: date-parts: - 2021 - 11 - 28 firstpage: 856 lastpage: 869 published: 2021-11-28 00:00:00 +0000 - title: 'Asymptotically Exact and Fast Gaussian Copula Models for Imputation of Mixed Data Types' abstract: 'Missing values with mixed data types are a common problem in a large number of machine learning applications such as processing of surveys and in different medical applications. Recently, Gaussian copula models have been suggested as a means of performing imputation of missing values using a probabilistic framework. While the present Gaussian copula models have been shown to yield state-of-the-art performance, they have two limitations: they are based on an approximation that is fast but may be imprecise, and they do not support unordered multinomial variables. We address the first limitation using direct and arbitrarily precise approximations, both for model estimation and imputation, by using randomized quasi-Monte Carlo procedures. The method we provide has lower errors for the estimated model parameters and the imputed values, compared to previously proposed methods. We also extend the previous Gaussian copula models to include unordered multinomial variables in addition to the present support of ordinal, binary, and continuous variables.' volume: 157 URL: https://proceedings.mlr.press/v157/christoffersen21a.html PDF: https://proceedings.mlr.press/v157/christoffersen21a/christoffersen21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-christoffersen21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Benjamin family: Christoffersen - given: Mark family: Clements - given: Keith family: Humphreys - given: Hedvig family: Kjellström editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 870-885 id: christoffersen21a issued: date-parts: - 2021 - 11 - 28 firstpage: 870 lastpage: 885 published: 2021-11-28 00:00:00 +0000 - title: 'Greedy Search Algorithm for Mixed Precision in Post-Training Quantization of Convolutional Neural Network Inspired by Submodular Optimization' abstract: 'For lower bit-widths, such as less than 8 bits, many quantization strategies include re-training in order to recover the accuracy degradation. However, re-training works against the rapid deployment and wide distribution of quantized models. Therefore, post-training quantization has been getting more attention in recent years.
In one example, partial quantization according to the layer sensitivity, based on the accuracy after each quantization, has been proposed; however, the effects of quantizing one layer on the other layers have not been taken into account. To further reduce accuracy degradation, we propose a quantization scheme that considers these effects by continuously updating the accuracy after each layer quantization. Additionally, for more data compression, we extend that scheme to mixed precision, which applies a layer-by-layer fitted bit-width. Since the search space for bit allocation per layer increases exponentially with the number of layers $N$, existing methods require computationally intensive approaches such as network training. Here, we derive practical solutions to the bit allocation problem in polynomial time $O(N^2)$ using a deterministic greedy search algorithm inspired by submodular optimization, without any training. For example, the proposed algorithm completes a search on ResNet18 for ImageNet in 1 hour on a single GPU. Compared to the case without updating the layer sensitivity, our method improves the accuracy of the quantized model by more than 1% across multiple convolutional neural networks. For example, 6-bit quantization of MobileNetV2 achieves an 80.1% reduction in model size with -1.10% accuracy degradation, and 4-bit quantization of ResNet50 achieves an 82.9% size reduction with -0.194% accuracy degradation. Furthermore, results show that the proposed method reduces accuracy degradation by about 0.7% or more compared to various recent post-training quantization strategies.' volume: 157 URL: https://proceedings.mlr.press/v157/satoki21a.html PDF: https://proceedings.mlr.press/v157/satoki21a/satoki21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-satoki21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Tsuji family: Satoki - given: Kawaguchi family: Hiroshi - given: Inoue family: Atsuki - given: Sakai family: Yasufumi - given: Yamada family: Fuyuka editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 886-901 id: satoki21a issued: date-parts: - 2021 - 11 - 28 firstpage: 886 lastpage: 901 published: 2021-11-28 00:00:00 +0000 - title: 'ExNN-SMOTE: Extended Natural Neighbors Based SMOTE to Deal with Imbalanced Data' abstract: 'Many practical applications suffer from the problem of imbalanced classification. The minority class has poor classification performance; on the other hand, its misclassification cost is high. One reason for the classification difficulty is the intrinsic complicated distribution characteristics (CDCs) in imbalanced data itself. The classical oversampling method SMOTE generates synthetic minority class examples between neighbors and is parameter dependent. Furthermore, due to the blindness of neighbor selection, SMOTE suffers from overgeneralization in the minority class. To solve such problems, we propose an oversampling method called extended natural neighbors based SMOTE (ExNN-SMOTE). In ExNN-SMOTE, neighbors are determined adaptively by capturing the data distribution characteristics. Extensive experiments over synthetic and real datasets demonstrate the effectiveness of ExNN-SMOTE in dealing with CDCs and its superiority over other SMOTE-related methods.'
volume: 157 URL: https://proceedings.mlr.press/v157/guan21a.html PDF: https://proceedings.mlr.press/v157/guan21a/guan21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-guan21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Hongjiao family: Guan - given: Bin family: Ma - given: Yingtao family: Zhang - given: Xianglong family: Tang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 902-917 id: guan21a issued: date-parts: - 2021 - 11 - 28 firstpage: 902 lastpage: 917 published: 2021-11-28 00:00:00 +0000 - title: 'Geometric Value Iteration: Dynamic Error-Aware KL Regularization for Reinforcement Learning' abstract: 'The recent boom in the literature on entropy-regularized reinforcement learning (RL) approaches reveals that Kullback-Leibler (KL) regularization brings advantages to RL algorithms by canceling out errors under mild assumptions. However, existing analyses focus on fixed regularization with a constant weighting coefficient and do not consider cases where the coefficient is allowed to change dynamically. In this paper, we study the dynamic coefficient scheme and present the first asymptotic error bound. Based on the dynamic coefficient error bound, we propose an effective scheme to tune the coefficient according to the magnitude of error in favor of more robust learning. Complementing this development, we propose a novel algorithm, Geometric Value Iteration (GVI), that features a dynamic error-aware KL coefficient design with the aim of mitigating the impact of errors on performance. Our experiments demonstrate that GVI can effectively exploit the trade-off between learning speed and robustness compared with uniform averaging using a constant KL coefficient. The combination of GVI and deep networks shows stable learning behavior even in the absence of a target network, where algorithms with a constant KL coefficient would greatly oscillate or even fail to converge.' volume: 157 URL: https://proceedings.mlr.press/v157/kitamura21a.html PDF: https://proceedings.mlr.press/v157/kitamura21a/kitamura21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-kitamura21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Toshinori family: Kitamura - given: Lingwei family: Zhu - given: Takamitsu family: Matsubara editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 918-931 id: kitamura21a issued: date-parts: - 2021 - 11 - 28 firstpage: 918 lastpage: 931 published: 2021-11-28 00:00:00 +0000 - title: 'Collaborative Novelty Detection for Distributed Data by a Probabilistic Method' abstract: 'Novelty detection, which detects anomalies based on a training dataset consisting of only normal data, is an important task in several applications. In addition, in the real world, there may be situations where data are owned by multiple parties in a distributed manner but cannot be shared with each other due to privacy and confidentiality requirements. Therefore, developing distributed novelty detection methods that preserve privacy is essential. To address this challenge, we propose a probabilistic collaborative method that enables distributed novelty detection for multiple parties without sharing the original data.
The proposed method constructs a collaborative kernel based on a collaborative data analysis framework, by which intermediate representations are generated from each party and shared for collaborative novelty detection. Numerical experiments demonstrate that the proposed method obtains better performance than individual novelty detection performed locally by each party.' volume: 157 URL: https://proceedings.mlr.press/v157/imakura21a.html PDF: https://proceedings.mlr.press/v157/imakura21a/imakura21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-imakura21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Akira family: Imakura - given: Xiucai family: Ye - given: Tetsuya family: Sakurai editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 932-947 id: imakura21a issued: date-parts: - 2021 - 11 - 28 firstpage: 932 lastpage: 947 published: 2021-11-28 00:00:00 +0000 - title: 'Efficient Coreset Constructions via Sensitivity Sampling' abstract: 'A coreset for a set of points is a small subset of weighted points that approximately preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ is a set of queries, and $f:P\times Q\to\mathbb{R}$ is a cost function, then a set $S\subseteq P$ with weights $w:P\to[0,\infty)$ is an $\epsilon$-coreset for some parameter $\epsilon>0$ if $\sum_{s\in S}w(s)f(s,q)$ is a $(1+\epsilon)$ multiplicative approximation to $\sum_{p\in P}f(p,q)$ for all $q\in Q$. Coresets are used to solve fundamental problems in machine learning under various big data models of computation. Many of the suggested coresets in the recent decade used, or could have used, a general framework for constructing coresets whose size depends quadratically on the total sensitivity $t$. In this paper, we improve this bound from $O(t^2)$ to $O(t\log t)$. Thus our results imply more space efficient solutions to a number of problems, including projective clustering, $k$-line clustering, and subspace approximation. The main technical result is a generic reduction to the sample complexity of learning a class of functions with bounded VC dimension. We show that obtaining a $(\nu,\alpha)$-sample for this class of functions with appropriate parameters $\nu$ and $\alpha$ suffices to achieve space efficient $\epsilon$-coresets. Our result implies more efficient coreset constructions for a number of interesting problems in machine learning; we show applications to $k$-median/$k$-means, $k$-line clustering, $j$-subspace approximation, and the integer $(j,k)$-projective clustering problem. ' volume: 157 URL: https://proceedings.mlr.press/v157/braverman21a.html PDF: https://proceedings.mlr.press/v157/braverman21a/braverman21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-braverman21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Vladimir family: Braverman - given: Dan family: Feldman - given: Harry family: Lang - given: Adiel family: Statman - given: Samson family: Zhou editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 948-963 id: braverman21a issued: date-parts: - 2021 - 11 - 28 firstpage: 948 lastpage: 963 published: 2021-11-28 00:00:00 +0000 - title: 'Dynamic Popularity-Aware Contrastive Learning for Recommendation' abstract: 'With the development of deep learning techniques, contrastive representation learning has been increasingly employed in large-scale recommender systems. For instance, deep user-item matching models can be trained by contrasting positive and negative examples and learning discriminative user and item representations. Despite their success, the distinctive properties of recommender systems are often ignored in existing modelling. Standard methods approximate maximum likelihood estimation on user behavior data in a manner similar to language models. Specifically, the way of model optimization corresponds to approximating the user-item pointwise mutual information, which can be regarded as eliminating the influence of global item popularity on user behavior to capture intrinsic user preference. In addition, unlike the situation in language models where word frequency is relatively stable, item popularity is constantly evolving. To address these issues, we propose a novel dynamic popularity-aware (DPA) contrastive learning method for recommendation, which consists of two key components: i) a dynamic negative sampling strategy is employed to enhance the user representation, ii) a dynamic prediction recovery is adopted based on real-time item popularity. The proposed strategy can be naturally overlaid on any contrastive learning-based matching model to more accurately capture user interest and system dynamics. Finally, the effectiveness of the proposed strategy is demonstrated through comprehensive experiments on an e-commerce scenario of Alibaba Group.' volume: 157 URL: https://proceedings.mlr.press/v157/lin21b.html PDF: https://proceedings.mlr.press/v157/lin21b/lin21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-lin21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Fangquan family: Lin - given: Wei family: Jiang - given: Jihai family: Zhang - given: Cheng family: Yang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 964-968 id: lin21b issued: date-parts: - 2021 - 11 - 28 firstpage: 964 lastpage: 968 published: 2021-11-28 00:00:00 +0000 - title: 'Neural Graph Filtering for Context-aware Recommendation' abstract: ' With the rapid development of web services, various kinds of context data become available in recommender systems to handle the data sparsity problem; this setting is called context-aware recommendation (CAR). It is challenging to develop effective approaches to model and exploit these various and heterogeneous data. Recently, the heterogeneous information network (HIN) has been adopted to model the context data due to its flexibility in modelling data heterogeneity. However, most of the HIN-based methods, which rely on meta paths or graph embedding to extract features from HINs, cannot fully mine the network structure and semantic features of users and items. Besides, these methods, utilizing the global dataset to learn personalized latent factors, usually suffer from the individuality loss problem. In this paper, we propose a neural graph filtering method for context-aware recommendation, called NGF.
First, we use a unified HIN to model both the users’ feedback information and the context data. Then, we adopt graph filtering to predict aspect-level ratings on a series of independent subgraphs of the unified HIN and feed the predictions into a deep neural network (DNN) that fuses them for CAR. Concretely, graph filtering is a case-by-case algorithm for personalized recommendation on HINs, which predicts future behavior from all similar historical behaviors. We split the unified HIN into many single-aspect networks according to the semantic relations and utilize graph filtering to predict the user’s behavior on each subgraph. The subsequent deep neural network fuses the personalized aspect-level predictions. Extensive experiments on two real-world datasets demonstrate the effectiveness of our neural graph filtering for CAR.' volume: 157 URL: https://proceedings.mlr.press/v157/chuanyan21a.html PDF: https://proceedings.mlr.press/v157/chuanyan21a/chuanyan21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-chuanyan21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Zhang family: Chuanyan - given: Hong family: Xiaoguang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 969-984 id: chuanyan21a issued: date-parts: - 2021 - 11 - 28 firstpage: 969 lastpage: 984 published: 2021-11-28 00:00:00 +0000 - title: 'Lifelong Learning with Sketched Structural Regularization' abstract: 'Preventing catastrophic forgetting while continually learning new tasks is an essential problem in lifelong learning. Structural regularization (SR) refers to a family of algorithms that mitigate catastrophic forgetting by penalizing the network for changing its “critical parameters” from previous tasks while learning a new one. The penalty is often induced via a quadratic regularizer defined by an \emph{importance matrix}, e.g., the (empirical) Fisher information matrix in the Elastic Weight Consolidation framework. In practice and due to computational constraints, most SR methods crudely approximate the importance matrix by its diagonal. In this paper, we propose \emph{Sketched Structural Regularization} (Sketched SR) as an alternative approach to compress the importance matrices used for regularizing in SR methods. Specifically, we apply \emph{linear sketching methods} to better approximate the importance matrices in SR algorithms. We show that sketched SR: (i) is computationally efficient and straightforward to implement, (ii) provides an approximation error that is justified in theory, and (iii) is method oblivious by construction and can be adapted to any method that belongs to the SR class. We show that our proposed approach consistently improves various SR algorithms’ performance on both synthetic experiments and benchmark continual learning tasks, including permuted-MNIST and CIFAR-100.' volume: 157 URL: https://proceedings.mlr.press/v157/li21b.html PDF: https://proceedings.mlr.press/v157/li21b/li21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-li21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Haoran family: Li - given: Aditya family: Krishnan - given: Jingfeng family: Wu - given: Soheil family: Kolouri - given: Praveen K.
family: Pilly - given: Vladimir family: Braverman editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 985-1000 id: li21b issued: date-parts: - 2021 - 11 - 28 firstpage: 985 lastpage: 1000 published: 2021-11-28 00:00:00 +0000 - title: 'Robust Regression for Monocular Depth Estimation' abstract: 'Learning accurate models for monocular depth estimation requires precise depth annotation as gathered, e.g., through LiDAR scanners. Because the data acquisition with sensors of this kind is costly and does not scale well in general, less advanced depth sources, such as time-of-flight cameras, are often used instead. However, these sensors provide less reliable signals, resulting in imprecise depth data for training regression models. As shown in idealized environments, the noise produced by commonly used RGB-D sensors violates standard statistical assumptions of regression methods, such as least squares estimation. In this paper, we investigate whether robust regression methods, which are more tolerant toward violations of statistical assumptions, can mitigate the effects of low-quality data. As a viable alternative to established approaches of that kind, we propose the use of so-called superset learning, where the original data is replaced by (less precise but more reliable) set-valued data. To evaluate and compare the methods, we provide an extensive empirical study on common benchmark data for monocular depth estimation. Our results clearly show the superiority of robust variants over conventional regression.' volume: 157 URL: https://proceedings.mlr.press/v157/lienen21a.html PDF: https://proceedings.mlr.press/v157/lienen21a/lienen21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-lienen21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Julian family: Lienen - given: Nils family: Nommensen - given: Ralph family: Ewerth - given: Eyke family: Hüllermeier editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1001-1016 id: lienen21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1001 lastpage: 1016 published: 2021-11-28 00:00:00 +0000 - title: 'Transfer Learning with Adaptive Online TrAdaBoost for Data Streams' abstract: 'In many real-world applications, data are often produced in the form of streams. Consider, for example, data produced by sensors. In data streams there can be concept drift, where the distribution of the data changes. When we deal with multiple streams from the same domain, concepts that have occurred in one stream may occur in another. Therefore, being able to reuse knowledge across multiple streams can help models recover from concept drifts more quickly. A major challenge is that these data streams may be only partially identical, and a direct adoption of knowledge would not suffice. In this paper, we propose a novel framework to transfer both identical and partially identical concepts across different streams. In particular, we propose a new technique called Adaptive Online TrAdaBoost that tunes weight adjustments during boosting based on model performance. The experiments on synthetic data verify the desired properties of the proposed method, and the experiments on real-world data show that the method performs better than its baselines for data stream mining.'
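For intuition, the classic TrAdaBoost-style reweighting that Adaptive Online TrAdaBoost adapts can be sketched as follows. This is a minimal batch sketch under stated assumptions: the function name, the fixed beta schedule, and the 0/1 error indicators are illustrative, not the paper's adaptive online variant.

```python
# Sketch of one classic TrAdaBoost-style reweighting step (batch form);
# the paper's contribution is to tune such adjustments online by performance.
import numpy as np

def tradaboost_reweight(w_src, w_tgt, err_src, err_tgt, eps_t, n_src, n_rounds):
    """w_src, w_tgt: current instance weights; err_src, err_tgt: 0/1 error
    indicators of the weak learner; eps_t: weighted error on target data."""
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_rounds))
    eps_t = min(max(eps_t, 1e-8), 0.499)      # keep beta_tgt well defined
    beta_tgt = eps_t / (1.0 - eps_t)
    # source instances the learner gets wrong are down-weighted (they look
    # less transferable); target mistakes are up-weighted as in AdaBoost
    w_src = w_src * beta_src ** err_src
    w_tgt = w_tgt * beta_tgt ** (-err_tgt)
    z = w_src.sum() + w_tgt.sum()             # renormalize jointly
    return w_src / z, w_tgt / z
```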
volume: 157 URL: https://proceedings.mlr.press/v157/wu21b.html PDF: https://proceedings.mlr.press/v157/wu21b/wu21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wu21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Ocean family: Wu - given: Yun Sing family: Koh - given: Gillian family: Dobbie - given: Thomas family: Lacombe editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1017-1032 id: wu21b issued: date-parts: - 2021 - 11 - 28 firstpage: 1017 lastpage: 1032 published: 2021-11-28 00:00:00 +0000 - title: 'Bridging Code-Text Representation Gap using Explanation' abstract: 'This paper studies Code-Text Representation (CTR) learning, aiming to learn general-purpose representations that support downstream code/text applications such as code search, i.e., finding code that matches textual queries. However, state-of-the-art methods do not focus on bridging the gap between the code and text modalities. In this paper, we bridge this gap by providing an intermediate representation, and view it as “explanation.” Our contribution is threefold: First, we propose four types of explanation utilization methods for CTR, and compare their effectiveness. Second, we show that using explanation as the model input is desirable. Third, we confirm that even automatically generated explanations can lead to a drastic performance gain. To the best of our knowledge, this is the first work to define and categorize code explanation, for enhancing code understanding/representation.' volume: 157 URL: https://proceedings.mlr.press/v157/han21a.html PDF: https://proceedings.mlr.press/v157/han21a/han21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-han21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Hojae family: Han - given: Youngwon family: Lee - given: Minsoo family: Kim - given: Hwang family: Seung-won editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1033-1048 id: han21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1033 lastpage: 1048 published: 2021-11-28 00:00:00 +0000 - title: 'ContriQ: Ally-Focused Cooperation and Enemy-Concentrated Confrontation in Multi-Agent Reinforcement Learning' abstract: 'Centralized training with decentralized execution (CTDE) is an important setting for cooperative multi-agent reinforcement learning (MARL) due to communication constraints during execution and scalability constraints during training, which has shown superior performance but still suffers from challenges. One branch is to understand the mutual interplay between agents. Due to the communication constraints in practice, agents cannot exchange perceptual information, and thus, many approaches use a centralized attention network with scalability constraints. Contrary to these common approaches, we propose to learn to cooperate in a decentralized way by applying an attention mechanism on the local observation so that each agent could focus on allied agents with a decentralized model, and therefore promote understanding. Another branch is to model how agents cooperate and simplify the learning process. Previous approaches that focus on value decomposition have achieved innovative results but still suffer from problems.
These approaches either limit the representation expressiveness of their value function classes or relax the IGM consistency to achieve scalability, which may lead to poor performance. We combine value decomposition with game abstraction by modeling the relationships between agents as a bi-level graph. We propose a novel value decomposition network based on it through a bi-level attention network, which indicates the contribution of allied agents attacking enemies and the priority of attacking each enemy at each time step, respectively. We show that our method substantially outperforms existing state-of-the-art methods on battle games in StarCraft II, and attention analysis is also comprehensively discussed with insights.' volume: 157 URL: https://proceedings.mlr.press/v157/chenran21a.html PDF: https://proceedings.mlr.press/v157/chenran21a/chenran21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-chenran21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Zhao family: Chenran - given: Shi family: Dianxi - given: Zhang family: Yaowen - given: Yang family: Huanhuan - given: Yang family: Shaowu - given: Zhang family: Yongjun editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1049-1064 id: chenran21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1049 lastpage: 1064 published: 2021-11-28 00:00:00 +0000 - title: 'DAGSurv: Directed Acyclic Graph Based Survival Analysis Using Deep Neural Networks' abstract: 'Causal structures for observational survival data provide crucial information regarding the relationships between covariates and time-to-event. We derive motivation from the information theoretic source coding argument, and show that incorporating the knowledge of the directed acyclic graph (DAG) can be beneficial if suitable source encoders are employed. As a possible source encoder in this context, we derive a variational inference based conditional variational autoencoder for causal structured survival prediction, which we refer to as \texttt{DAGSurv}. We illustrate the performance of \texttt{DAGSurv} on low and high-dimensional synthetic datasets, and real-world datasets such as METABRIC and GBSG. We demonstrate that the proposed method outperforms other survival analysis baselines such as \texttt{Cox} Proportional Hazards, \texttt{DeepSurv} and \texttt{Deephit}, which are oblivious to the underlying causal relationship between data entities.' volume: 157 URL: https://proceedings.mlr.press/v157/sharma21a.html PDF: https://proceedings.mlr.press/v157/sharma21a/sharma21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-sharma21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Ansh Kumar family: Sharma - given: Rahul family: Kukreja - given: Ranjitha family: Prasad - given: Shilpa family: Rao editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1065-1080 id: sharma21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1065 lastpage: 1080 published: 2021-11-28 00:00:00 +0000 - title: 'Modeling Risky Choices in Unknown Environments' abstract: 'Decision-theoretic models explain human behavior in choice problems involving uncertainty, in terms of individual tendencies such as risk aversion.
However, many classical models of risk require knowing the distribution of possible outcomes (rewards) for all options, limiting their applicability outside of controlled experiments. We study the task of learning such models in contexts where the modeler does not know the distributions but instead can only observe the choices and their outcomes for a user familiar with the decision problems, for example a skilled player playing a digital game. We propose a framework combining two separate components, one for modeling the unknown decision-making environment and another for the risk behavior. By using environment models capable of learning distributions, we are able to infer classical models of decision-making under risk from observations of the user’s choices and outcomes alone, and we also demonstrate alternative models for predictive purposes. We validate the approach on artificial data and demonstrate a practical use case in modeling risk attitudes of professional esports teams.' volume: 157 URL: https://proceedings.mlr.press/v157/tanskanen21a.html PDF: https://proceedings.mlr.press/v157/tanskanen21a/tanskanen21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-tanskanen21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Ville family: Tanskanen - given: Chang family: Rajani - given: Homayun family: Afrabandpey - given: Aini family: Putkonen - given: Aurélien family: Nioche - given: Arto family: Klami editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1081-1096 id: tanskanen21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1081 lastpage: 1096 published: 2021-11-28 00:00:00 +0000 - title: 'Expert advice problem with noisy low rank loss' abstract: 'We consider the expert advice problem with a low rank but noisy loss sequence, where a loss vector $l_{t} \in [-1,1]^N$ in each round $t$ is of the form $l_{t} = U v_{t} + \epsilon_{t}$ for some fixed but unknown $N \times d$ matrix $U$ called the kernel, some $d$-dimensional seed vector $v_{t} \in \mathbb{R}^{d}$, and an additional noise term $\epsilon_t \in \mathbb{R}^{N}$ whose norm is bounded by $\epsilon$. This is a generalization of the works of Hazan et al. and Barman et al., where the former only treats noiseless loss and the latter assumes that the kernel is known in advance. In this paper, we propose an algorithm that reconstructs the kernel under the assumptions that the low-rank loss is noisy and that there is no prior information about the kernel. In this algorithm, we approximate the kernel by choosing a set of loss vectors with a high degree of independence from each other, and we give a regret bound of $O(d\sqrt{T}+d^{4/3}(N\epsilon)^{1/3}\sqrt{T})$. Moreover, in experiments, the proposed algorithm performs better than Hazan’s algorithm and the Hedge algorithm.' volume: 157 URL: https://proceedings.mlr.press/v157/liu21d.html PDF: https://proceedings.mlr.press/v157/liu21d/liu21d.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-liu21d.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yaxiong family: Liu - given: Xuanke family: Jiang - given: Kohei family: Hatano - given: Eiji family: Takimoto editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 1097-1112 id: liu21d issued: date-parts: - 2021 - 11 - 28 firstpage: 1097 lastpage: 1112 published: 2021-11-28 00:00:00 +0000 - title: 'An online semi-definite programming with a generalised log-determinant regularizer and its applications' abstract: 'We consider a variant of the online semi-definite programming problem: The decision space consists of positive semi-definite matrices with bounded diagonal entries and bounded $\Gamma$-trace norm, which is a generalization of the trace norm defined by a positive definite matrix $\Gamma$. To solve this problem, we propose a follow-the-regularized-leader algorithm with a novel regularizer, which is a generalisation of the log-determinant function parameterized by the matrix $\Gamma$. Then we apply our algorithm to online binary matrix completion (OBMC) with side information and online similarity prediction with side information, and improve mistake bounds by logarithmic factors. In particular, for OBMC our mistake bound is optimal.' volume: 157 URL: https://proceedings.mlr.press/v157/liu21e.html PDF: https://proceedings.mlr.press/v157/liu21e/liu21e.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-liu21e.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yaxiong family: Liu - given: Ken-ichiro family: Moridomi - given: Kohei family: Hatano - given: Eiji family: Takimoto editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1113-1128 id: liu21e issued: date-parts: - 2021 - 11 - 28 firstpage: 1113 lastpage: 1128 published: 2021-11-28 00:00:00 +0000 - title: 'Cross-structural Factor-topic Model: Document Analysis with Sophisticated Covariates' abstract: 'Modern text data is increasingly gathered in situations where it is paired with a high-dimensional collection of covariates: then both the text, the covariates, and their relationships are of interest to analyze. Despite the growing amount of such data, current topic models are unable to take into account large numbers of covariates successfully: they fail to model structure among covariates and distort findings of both text and covariates. This paper presents a solution: a novel factor-topic model that enables researchers to analyze latent structure in both text and sophisticated document-level covariates collectively. The key innovation is that besides learning the underlying topical structure, the model also learns the underlying factorial structure from the covariates and the interactions between the two structures. A set of tailored variational inference algorithms for efficient computation is provided. Experiments on three different datasets show the model outperforms comparable topic models in the ability to predict held-out document content. Two case studies focusing on Finnish parliamentary election candidates and game players on Steam demonstrate the model discovers semantically meaningful topics, factors, and their interactions. The model both outperforms state-of-the-art models in predictive accuracy and offers new factor-topic insights beyond other topic models.'
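To make the factor-topic interaction concrete, a toy generative sketch is given below. All dimensions, variable names, and the softmax link are illustrative assumptions for intuition only, not the paper's model or its variational inference.

```python
# Toy generative sketch of a factor-topic interaction: document covariates
# are compressed into latent factors, and the factors shift each document's
# topic proportions through an interaction matrix (all values assumed).
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_covariates, n_factors, n_topics, vocab = 100, 30, 4, 6, 500

X = rng.normal(size=(n_docs, n_covariates))          # covariates per document
B = rng.normal(size=(n_covariates, n_factors)) * 0.1 # factor loadings
W = rng.normal(size=(n_factors, n_topics))           # factor-topic interactions
beta = rng.dirichlet(np.ones(vocab), size=n_topics)  # topic-word distributions

factors = X @ B                                      # latent factor scores
logits = factors @ W                                 # factors shift topic usage
logits -= logits.max(axis=1, keepdims=True)          # numerically stable softmax
theta = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# sample a 50-word document from document 0's factor-shifted topic mixture
words = rng.choice(vocab, size=50, p=theta[0] @ beta)
```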
volume: 157 URL: https://proceedings.mlr.press/v157/lu21a.html PDF: https://proceedings.mlr.press/v157/lu21a/lu21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-lu21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Chien family: Lu - given: Jaakko family: Peltonen - given: Timo family: Nummenmaa - given: Jyrki family: Nummenmaa - given: Kalervo family: Järvelin editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1129-1144 id: lu21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1129 lastpage: 1144 published: 2021-11-28 00:00:00 +0000 - title: 'Deep Structural Contrastive Subspace Clustering' abstract: 'Deep subspace clustering based on data self-expression is devoted to learning pairwise affinities in the latent feature space. Existing methods tend to rely on an autoencoder framework to learn representations for an affinity matrix. However, the representation learning driven largely by pixel-level data reconstruction is somewhat incompatible with the subspace clustering task. With the unavailability of ground truth, can structural representations, which are exactly what subspace clustering favors, be achieved by simply exploiting the supervision information in the data itself? In this paper, we formulate this intuition as a structural contrastive prediction task and propose an end-to-end trainable framework referred to as Deep Structural Contrastive Subspace Clustering (DSCSC). Specifically, DSCSC makes use of data augmentation techniques to mine positive pairs and constructs a data similarity graph in the embedding feature space to search negative pairs. A novel structural contrastive loss is proposed on the latent representations to achieve positive-concentrated and negative-separated property for subspace preserving. Extensive experiments on the benchmark datasets demonstrate that our method outperforms the state-of-the-art deep subspace clustering methods and imply the necessity of the proposed structural contrastive loss.' volume: 157 URL: https://proceedings.mlr.press/v157/peng21a.html PDF: https://proceedings.mlr.press/v157/peng21a/peng21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-peng21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Bo family: Peng - given: Wenjie family: Zhu editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1145-1160 id: peng21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1145 lastpage: 1160 published: 2021-11-28 00:00:00 +0000 - title: 'Lifelong Learning with Branching Experts' abstract: 'The problem of branching experts is an extension of the experts problem where the set of experts may grow over time. We compare this problem in different learning settings along several axes: adversarial versus stochastic losses; a fixed versus a growing set of experts (branching experts); and single-task versus lifelong learning with expert advice. First, for the branching experts problem, we achieve tight regret bounds in both the adversarial and stochastic settings with a single algorithm. While it was known that the adversarial branching experts problem is strictly harder than the non-branching one, the stochastic branching experts problem is in fact no harder.
Next, we study the extension to lifelong learning with expert advice, in which one has to make online predictions over a sequence of tasks. For this problem, we provide a single algorithm which works in both the adversarial and stochastic settings, and our bounds, when specialized to the case without branching, recover the regret bounds previously achieved separately via different algorithms. Furthermore, we prove a regret lower bound which shows that in the lifelong learning scenario, the case with branching experts now becomes strictly harder than the non-branching case in the stochastic setting.' volume: 157 URL: https://proceedings.mlr.press/v157/wu21c.html PDF: https://proceedings.mlr.press/v157/wu21c/wu21c.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wu21c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yi-Shan family: Wu - given: Yi-Te family: Hong - given: Chi-Jen family: Lu editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1161-1175 id: wu21c issued: date-parts: - 2021 - 11 - 28 firstpage: 1161 lastpage: 1175 published: 2021-11-28 00:00:00 +0000 - title: 'Ensembling With a Fixed Parameter Budget: When Does It Help and Why?' abstract: 'Given a fixed parameter budget, one can build a single large neural network or create a memory-split ensemble: a pool of several smaller networks with the same total parameter count as the single network. A memory-split ensemble can outperform its single model counterpart (Lobacheva et al., 2020): a phenomenon known as the memory-split advantage (MSA). The reasons for MSA are still not yet fully understood. In particular, it is difficult in practice to predict when it will exist. This paper sheds light on the reasons underlying MSA using random feature theory. We study the dependence of the MSA on several factors: the parameter budget, the training set size, the L2 regularization and the Stochastic Gradient Descent (SGD) hyper-parameters. Using the bias-variance decomposition, we show that MSA exists when the reduction in variance due to the ensemble (\ie, \textit{ensemble gain}) exceeds the increase in squared bias due to the smaller size of the individual networks (\ie, \textit{shrinkage cost}). Taken together, our theoretical analysis demonstrates that the MSA mainly exists for small parameter budgets relative to the training set size, and that memory-splitting can be understood as a type of regularization. Adding other forms of regularization, \eg L2 regularization, reduces the MSA. Thus, the potential benefit of memory-splitting lies primarily in the possibility of speed-up via parallel computation. Our empirical experiments with deep neural networks and large image datasets show that MSA is not a general phenomenon, but mainly exists when the number of training iterations is small.' volume: 157 URL: https://proceedings.mlr.press/v157/deng21a.html PDF: https://proceedings.mlr.press/v157/deng21a/deng21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-deng21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Didan family: Deng - given: Emil Bertram family: Shi editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 1176-1191 id: deng21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1176 lastpage: 1191 published: 2021-11-28 00:00:00 +0000 - title: 'Revisiting Weight Initialization of Deep Neural Networks' abstract: 'The proper {\em initialization of weights} is crucial for the effective training and fast convergence of {\em deep neural networks} (DNNs). Prior work in this area has mostly focused on the principle of {\em balancing the variance among weights per layer} to maintain stability of (i) the input data propagated forwards through the network, and (ii) the loss gradients propagated backwards, respectively. This prevalent heuristic is however agnostic of dependencies among gradients across the various layers and captures only first-order effects per layer. In this paper, we investigate a {\em unifying approach}, based on approximating and controlling the {\em norm of the layers’ Hessians}, which both generalizes and explains existing initialization schemes such as {\em smooth activation functions}, {\em Dropouts}, and {\em ReLU}.' volume: 157 URL: https://proceedings.mlr.press/v157/skorski21a.html PDF: https://proceedings.mlr.press/v157/skorski21a/skorski21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-skorski21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Maciej family: Skorski - given: Alessandro family: Temperoni - given: Martin family: Theobald editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1192-1207 id: skorski21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1192 lastpage: 1207 published: 2021-11-28 00:00:00 +0000 - title: 'Robust Model-based Reinforcement Learning for Autonomous Greenhouse Control' abstract: 'Due to their high efficiency and lower weather dependency, autonomous greenhouses provide an ideal solution to meet the increasing demand for fresh food. However, managers are faced with some challenges in finding appropriate control strategies for crop growth, since the decision space of the greenhouse control problem is astronomically large. Therefore, an intelligent closed-loop control framework is highly desired to generate an automatic control policy. As a powerful tool for optimal control, reinforcement learning (RL) algorithms can surpass human beings’ decision-making and can also be seamlessly integrated into the closed-loop control framework. However, in complex real-world scenarios such as agricultural automation control, where the interaction with the environment is time-consuming and expensive, the application of RL algorithms encounters two main challenges, i.e., sample efficiency and safety. Although model-based RL methods can greatly mitigate the efficiency problem of greenhouse control, the safety problem has not received much attention. In this paper, we present a model-based robust RL framework for autonomous greenhouse control to meet the sample efficiency and safety challenges. Specifically, our framework introduces an ensemble of environment models to work as a simulator and assist in policy optimization, thereby addressing the low sample efficiency problem. As for the safety concern, we propose a sample dropout module to focus more on worst-case samples, which can help improve the adaptability of the greenhouse planting policy in extreme cases.
Experimental results demonstrate that our approach can learn a more effective greenhouse planting policy with better robustness than existing methods.' volume: 157 URL: https://proceedings.mlr.press/v157/zhang21e.html PDF: https://proceedings.mlr.press/v157/zhang21e/zhang21e.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhang21e.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Wanpeng family: Zhang - given: Xiaoyan family: Cao - given: Yao family: Yao - given: Zhicheng family: An - given: Xi family: Xiao - given: Dijun family: Luo editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1208-1223 id: zhang21e issued: date-parts: - 2021 - 11 - 28 firstpage: 1208 lastpage: 1223 published: 2021-11-28 00:00:00 +0000 - title: 'Physics-inspired Learning for Structure-Aware Texture-Sensitive Underwater Image Enhancement' abstract: 'Recently, improving the visual quality of underwater images using deep learning-based methods has drawn considerable attention. Unfortunately, diverse environmental factors (e.g., blue/green color distortion) severely limit their performance in real-world environments. Therefore, strengthening the superiority of underwater image enhancement methods is critical. In this paper, we devote ourselves to developing a new architecture with strong superiority and adaptability. Inspired by the underwater imaging principle, we establish a novel physics-inspired learning model that is easy to realize. A Structure-Aware Texture-Sensitive Network (SATS-Net) is further developed to realize the model. The structure-aware module is responsible for structural information, and the texture-sensitive module is responsible for textural information. Thus, SATS-Net successfully incorporates robust characterization absorbed from the physical principle to achieve strong robustness and adaptability. We conduct extensive experiments to demonstrate that SATS-Net outperforms existing advanced techniques in various real-world underwater environments.' volume: 157 URL: https://proceedings.mlr.press/v157/xue21a.html PDF: https://proceedings.mlr.press/v157/xue21a/xue21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-xue21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Xinwei family: Xue - given: Zexuan family: Li - given: Long family: Ma - given: Risheng family: Liu - given: Xin family: Fan editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1224-1236 id: xue21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1224 lastpage: 1236 published: 2021-11-28 00:00:00 +0000 - title: 'Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation' abstract: 'In reinforcement learning, domain randomisation is a popular technique for learning general policies that are robust to new environments and domain-shifts at deployment. However, naively aggregating information from randomised domains may lead to high variances in gradient estimation and sub-optimal policies.
To address this issue, we present a peer-to-peer online distillation strategy for reinforcement learning termed P2PDRL, where multiple learning agents are each assigned to a different environment, and then exchange knowledge through mutual regularisation based on Kullback–Leibler divergence. Our experiments on continuous control tasks show that P2PDRL enables robust learning across a wider randomisation distribution than baselines, and more robust generalisation performance to new environments at testing.' volume: 157 URL: https://proceedings.mlr.press/v157/zhao21b.html PDF: https://proceedings.mlr.press/v157/zhao21b/zhao21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhao21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Chenyang family: Zhao - given: Timothy family: Hospedales editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1237-1252 id: zhao21b issued: date-parts: - 2021 - 11 - 28 firstpage: 1237 lastpage: 1252 published: 2021-11-28 00:00:00 +0000 - title: 'PFedAtt: Attention-based Personalized Federated Learning on Heterogeneous Clients' abstract: 'In federated learning, heterogeneity among the clients’ local datasets results in large variations in the number of local updates performed by each client in a communication round. Simply aggregating such local models into a global model will confine the capacity of the system, that is, the single global model will be restricted from delivering good performance on each client’s task. This paper provides a general framework to analyze the convergence of personalized federated learning algorithms. It subsumes previously proposed methods and provides a principled understanding of the computational guarantees. Using insights from this analysis, we propose PFedAtt, a personalized federated learning method that incorporates attention-based grouping to facilitate similar clients’ collaborations. Theoretically, we provide the convergence guarantee for the algorithm, and empirical experiments corroborate the competitive performance of PFedAtt on heterogeneous clients.' volume: 157 URL: https://proceedings.mlr.press/v157/ma21a.html PDF: https://proceedings.mlr.press/v157/ma21a/ma21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-ma21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Zichen family: Ma - given: Yu family: Lu - given: Wenye family: Li - given: Jinfeng family: Yi - given: Shuguang family: Cui editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1253-1268 id: ma21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1253 lastpage: 1268 published: 2021-11-28 00:00:00 +0000 - title: 'Multi-stream based marked point process' abstract: 'When using a point process, a specific form of the model needs to be designed for the intensity function, based on physical and mathematical prior knowledge about the data. Recently, a fully trainable deep learning-based approach has been developed for temporal point processes. This approach models a cumulative hazard function (CHF), which is capable of systematic computation of an adaptive intensity function in a data-driven manner.
However, this approach does not take the attribute information of events into account, although many applications of point processes generate data with a variety of mark information, such as the location, magnitude, and depth of seismic activity. To overcome this limitation, we propose a fully trainable marked point process method, modeling decomposed CHFs for time and mark using multi-stream deep neural networks. In addition, we also propose to encode multiple types of mark information into a single image and extract necessary information adaptively without detailed knowledge about the data. We show the effectiveness of our proposed method through experiments with simulated toy data and real seismic data.' volume: 157 URL: https://proceedings.mlr.press/v157/hong21a.html PDF: https://proceedings.mlr.press/v157/hong21a/hong21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-hong21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Sujun family: Hong - given: Hirotaka family: Hachiya editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1269-1284 id: hong21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1269 lastpage: 1284 published: 2021-11-28 00:00:00 +0000 - title: 'Bayesian Latent Factor Model for Higher-order Data' abstract: 'Latent factor models are canonical tools to learn low-dimensional and linear embedding of original data. Traditional latent factor models are based on low-rank matrix factorization of covariance matrices. However, for higher-order data with multiple modes, i.e., tensors, this simple treatment fails to take into account the mode-specific relations. This ignorance leads to inefficiency in the analysis of complex structures as well as poor data compression ability. In this paper, unlike covariance matrices, we investigate the high-order covariance tensor directly by exploiting the tensor ring (TR) format and propose the Bayesian TR latent factor model, which can represent complex multi-linear correlations and achieves efficient data compression. To overcome the difficulty of finding the optimal TR-ranks and simultaneously imposing sparsity on loading coefficients, a multiplicative Gamma process (MGP) prior is adopted to automatically infer the ranks and obtain sparsity. Then, we establish an efficient parameter-expanded EM algorithm to learn the maximum a posteriori (MAP) estimate of model parameters. Finally, we evaluate our model on covariance estimation, latent factor learning and image inpainting problems.' volume: 157 URL: https://proceedings.mlr.press/v157/tao21a.html PDF: https://proceedings.mlr.press/v157/tao21a/tao21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-tao21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Zerui family: Tao - given: Xuyang family: Zhao - given: Toshihisa family: Tanaka - given: Qibin family: Zhao editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1285-1300 id: tao21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1285 lastpage: 1300 published: 2021-11-28 00:00:00 +0000 - title: 'Learning 3-opt heuristics for traveling salesman problem via deep reinforcement learning' abstract: 'The traveling salesman problem (TSP) is a classical combinatorial optimization problem.
As it represents a large number of important practical problems, it has received extensive study, and a great variety of algorithms have been proposed to solve it, including exact and heuristic algorithms. The success of heuristic algorithms relies heavily on the design of powerful heuristic rules, and most of the existing heuristic rules were manually designed by experienced experts to model their insights and observations on TSP instances and solutions. Recent studies have shown a promising alternative design strategy that directly learns heuristic rules from TSP instances without any manual intervention. Here, we report an iterative improvement approach (called Neural-3-OPT) that solves TSP through automatically learning effective 3-opt heuristics via deep reinforcement learning. In the proposed approach, we adopt a pointer network to select 3 links from the current tour, and a feature-wise linear modulation network to select an appropriate way to reconnect the segments after removing the selected 3 links. We demonstrate that our approach achieves state-of-the-art performance on both real and randomly-generated TSP instances, outperforming, to the best of our knowledge, the existing neural network-based approaches.' volume: 157 URL: https://proceedings.mlr.press/v157/sui21a.html PDF: https://proceedings.mlr.press/v157/sui21a/sui21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-sui21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Jingyan family: Sui - given: Shizhe family: Ding - given: Ruizhi family: Liu - given: Liming family: Xu - given: Dongbo family: Bu editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1301-1316 id: sui21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1301 lastpage: 1316 published: 2021-11-28 00:00:00 +0000 - title: 'Time-Constrained Multi-Agent Path Finding in Non-Lattice Graphs with Deep Reinforcement Learning' abstract: 'Multi-Agent Path Finding (MAPF) is a routing problem in which multiple agents need to each find a lowest-cost collection of routes in a graph that avoids collisions between agents. This problem occurs frequently in the domain of logistics, for example in the routing of trains in shunting yards, airplanes at airports, and picking robots in automated warehouses. A solution is presented for the MAPF problem in which agents operate on an arbitrary directed graph, rather than the commonly assumed grid world, which extends support to use cases where the environment cannot be easily modeled in a grid shape. Furthermore, constraints are introduced on the start and end times of the routing tasks, which is vital in MAPF problems that are part of larger logistics systems. A Reinforcement Learning-based (RL) approach is proposed to learn a local routing policy for an agent in a manner that relieves the need for manually designing heuristics. It relies on a Graph Convolutional Network to handle arbitrary graphs. Both single-agent and multi-agent RL approaches are presented, showing how a multi-agent setup can reduce training time by exploiting the similarities in agent properties and local graph topologies.'
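Since the approach relies on a Graph Convolutional Network to handle arbitrary graphs, a minimal single graph-convolution layer in the common Kipf-and-Welling form is sketched below. The symmetric normalization (an approximation on directed graphs) and all names are generic assumptions, not the authors' architecture.

```python
# Minimal sketch of one graph-convolution layer: H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W)
import numpy as np

def gcn_layer(A, H, W):
    """A: (n, n) adjacency of the routing graph; H: (n, d_in) node features
    (e.g., per-vertex annotations); W: (d_in, d_out) learnable weights."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1)                   # self-loops keep deg >= 1
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# toy usage on a 4-node directed graph with 3-dim node features
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 0, 0]], dtype=float)
H = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 8))
H_next = gcn_layer(A, H, W)                   # shape (4, 8)
```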
volume: 157 URL: https://proceedings.mlr.press/v157/knippenberg21a.html PDF: https://proceedings.mlr.press/v157/knippenberg21a/knippenberg21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-knippenberg21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Marijn prefix: van family: Knippenberg - given: Mike family: Holenderski - given: Vlado family: Menkovski editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1317-1332 id: knippenberg21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1317 lastpage: 1332 published: 2021-11-28 00:00:00 +0000 - title: 'Exposing Cyber-Physical System Weaknesses by Implicitly Learning their Underlying Models' abstract: 'Cyber-Physical Systems (CPS) play a critical role in today’s social life, especially with occasional pandemic events. With more reliance on the cyber operation of infrastructures, it is important to understand attacking mechanisms in CPS for potential solutions and defenses, where False Data Injection Attack (FDIA) is an important class. FDIA methods in the literature require the mathematical CPS model and state variable values to create an efficient attack vector, which is unrealistic for many attackers in the real world. Also, they do not have performance guarantees. This paper shows that it is possible to deploy an FDIA without having the CPS model and state variable information. Additionally, we prove a theoretical bound for the proposed method. Specifically, we design a scheme that learns an implicit CPS model to create tampered sensor measurements to deploy an attack based only on historical data. The proposed framework utilizes a Wasserstein generative adversarial network with two regularization terms to create such tampered measurements, also known as adversarial examples. To build an attack with confidence, we present a proof based on convergence in distribution and Lipschitz norm to show that our method captures the real observed measurement distribution. This means that our model learns the complex underlying processes from the CPSs. We demonstrate the robustness and universality of our proposed framework based on two diversified adversarial examples with different systems, domains, and datasets.' volume: 157 URL: https://proceedings.mlr.press/v157/costilla-enriquez21a.html PDF: https://proceedings.mlr.press/v157/costilla-enriquez21a/costilla-enriquez21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-costilla-enriquez21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Napoleon family: Costilla-Enriquez - given: Yang family: Weng editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1333-1348 id: costilla-enriquez21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1333 lastpage: 1348 published: 2021-11-28 00:00:00 +0000 - title: 'NAS-HPO-Bench-II: A Benchmark Dataset on Joint Optimization of Convolutional Neural Network Architecture and Training Hyperparameters' abstract: 'The benchmark datasets for neural architecture search (NAS) have been developed to alleviate the computationally expensive evaluation process and ensure a fair comparison. Recent NAS benchmarks only focus on architecture optimization, although the training hyperparameters affect the obtained model performance.
Building the benchmark dataset for joint optimization of architecture and training hyperparameters is essential to further NAS research. The existing NAS-HPO-Bench is a benchmark for joint optimization, but it does not consider the network connectivity design as done in modern NAS algorithms. This paper introduces the first benchmark dataset for joint optimization of network connections and training hyperparameters, which we call NAS-HPO-Bench-II. We collect the performance data of 4K cell-based convolutional neural network architectures trained on the CIFAR-10 dataset with different learning rate and batch size settings, resulting in data for 192K configurations. The dataset includes the exact data for 12-epoch training. We further build a surrogate model predicting the accuracies after 200-epoch training to provide performance data for longer training epochs. By analyzing NAS-HPO-Bench-II, we confirm the dependency between architecture and training hyperparameters and the necessity of joint optimization. Finally, we demonstrate the benchmarking of the baseline optimization algorithms using NAS-HPO-Bench-II.' volume: 157 URL: https://proceedings.mlr.press/v157/hirose21a.html PDF: https://proceedings.mlr.press/v157/hirose21a/hirose21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-hirose21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yoichi family: Hirose - given: Nozomu family: Yoshinari - given: Shinichi family: Shirakawa editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1349-1364 id: hirose21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1349 lastpage: 1364 published: 2021-11-28 00:00:00 +0000 - title: 'Metric Learning for comparison of HMMs using Graph Neural Networks' abstract: 'Hidden Markov models (HMMs) belong to the class of double embedded stochastic models which were originally leveraged for speech recognition and synthesis. HMMs subsequently became a generic sequence model across multiple domains like NLP, bio-informatics, and thermodynamics, to name a few. The literature has several heuristic metrics to compare two HMMs by factoring in their structure and emission probability distributions in HMM nodes. However, typical structure-based metrics overlook the similarity between HMMs having different structures yet similar behavior, and typical behavior-based metrics rely on the representativeness of the reference sequence used for assessing the similarity in behavior. Further, little exploration has taken place in leveraging the recent advancements in deep graph neural networks for learning effective representations for HMMs. In this paper, we propose two novel deep neural network based approaches to learn embeddings for HMMs and evaluate the validity of the embeddings based on subsequent clustering and classification tasks. Our proposed approaches use a Graph variational Autoencoder and diffpooling based Graph neural network (GNN) to learn embeddings for HMMs. The graph autoencoder infers latent low-dimensional flat embeddings for HMMs in a task-agnostic manner; whereas the diffpooling based graph neural network learns class-label aware embeddings by inferring and aggregating a hierarchical set of clusters and sub-clusters of graph nodes.
Empirical results reveal that the HMM embeddings learnt through the graph variational autoencoder and the diffpooling-based GNN outperform the popular heuristics, as measured by cluster quality metrics and classification accuracy in downstream tasks.' volume: 157 URL: https://proceedings.mlr.press/v157/soni21a.html PDF: https://proceedings.mlr.press/v157/soni21a/soni21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-soni21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Rajan Kumar family: Soni - given: Karthick family: Seshadri - given: Balaraman family: Ravindran editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1365-1380 id: soni21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1365 lastpage: 1380 published: 2021-11-28 00:00:00 +0000 - title: 'Layer-Wise Neural Network Compression via Layer Fusion' abstract: ' This paper proposes \textit{layer fusion}, a model compression technique that discovers which weights to combine and then fuses weights of similar fully-connected, convolutional and attention layers. Layer fusion can significantly reduce the number of layers of the original network with little additional computation overhead, while maintaining competitive performance. From experiments on CIFAR-10, we find that various deep convolutional neural networks can remain within 2% accuracy points of the original networks up to a compression ratio of 3.33 when iteratively retrained with layer fusion. For experiments on the WikiText-2 language modelling dataset, we compress Transformer models to 20% of their original size while remaining within 5 perplexity points of the original network. We also find that other well-established compression techniques can achieve competitive performance when compared to their original networks given a sufficient number of retraining steps. Generally, we observe a clear inflection point in performance as the amount of compression increases, suggesting a bound on the amount of compression that can be achieved before an exponential degradation in performance. ' volume: 157 URL: https://proceedings.mlr.press/v157/o-neill21a.html PDF: https://proceedings.mlr.press/v157/o-neill21a/o-neill21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-o-neill21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: James family: O’Neill - given: Greg family: V. Steeg - given: Aram family: Galstyan editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1381-1396 id: o-neill21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1381 lastpage: 1396 published: 2021-11-28 00:00:00 +0000 - title: 'Bayesian neural network unit priors and generalized Weibull-tail property' abstract: 'The connection between Bayesian neural networks and Gaussian processes has gained considerable attention in the last few years. Hidden units have been proven to follow a Gaussian process limit when the layer width tends to infinity. Recent work has suggested that finite Bayesian neural networks may outperform their infinite counterparts because they adapt their internal representations flexibly. To establish solid ground for future research on finite-width neural networks, our goal is to study the prior induced on hidden units.
Our main result is an accurate description of hidden unit tails, which shows that unit priors become heavier-tailed going deeper, thanks to the introduced notion of generalized Weibull-tail. This finding sheds light on the behavior of hidden units in finite Bayesian neural networks. ' volume: 157 URL: https://proceedings.mlr.press/v157/vladimirova21a.html PDF: https://proceedings.mlr.press/v157/vladimirova21a/vladimirova21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-vladimirova21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Mariia family: Vladimirova - given: Julyan family: Arbel - given: Stéphane family: Girard editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1397-1412 id: vladimirova21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1397 lastpage: 1412 published: 2021-11-28 00:00:00 +0000 - title: 'A Partial Label Metric Learning Algorithm for Class Imbalanced Data' abstract: 'The performance of machine learning algorithms depends on the distance metric, in addition to the model and loss function. Partial label metric learning can improve the accuracy of partial label learning algorithms by using training data to learn a better distance metric, and it has gradually attracted the attention of scholars in recent years. The essence of partial label learning is mainly to deal with multi-class classification problems, in which class imbalance is a common phenomenon. The class imbalance problem affects the prediction accuracy of minority-class samples, but current partial label metric learning algorithms rarely consider it. In this paper, we propose two partial label metric learning algorithms (PL-CCML-SFN and PL-CCML-LDD) that can address the class imbalance problem. The basic idea is to add a regularization term to the objective function of the PL-CCML model, which induces each class to be uniformly distributed in the new metric space and thus plays the role of balancing each class. The experimental results show that these two algorithms, compared with existing partial label metric learning algorithms, improve the overall performance on class-imbalanced data.' volume: 157 URL: https://proceedings.mlr.press/v157/liu21f.html PDF: https://proceedings.mlr.press/v157/liu21f/liu21f.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-liu21f.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Wenpeng family: Liu - given: Li family: Wang - given: Jie family: Chen - given: Yu family: Zhou - given: Ruirui family: Zheng - given: Jianjun family: He editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1413-1428 id: liu21f issued: date-parts: - 2021 - 11 - 28 firstpage: 1413 lastpage: 1428 published: 2021-11-28 00:00:00 +0000 - title: 'Scaling Average-Linkage via Sparse Cluster Embeddings' abstract: 'Average-linkage is one of the most popular hierarchical clustering algorithms. It is well known that average-linkage does not scale to large data sets due to its slow asymptotic running time. The fastest known implementation has running time quadratic in the number of data points. This paper presents a technique that we call cluster embedding.
The embedding maps each cluster to a point in a slightly higher-dimensional space. The pairwise distances between the mapped points approximate the average distance between clusters. By utilizing this embedding, we scale the task of finding close pairs of clusters, which is a key step in average-linkage clustering. We achieve an approximate, sub-quadratic time implementation of average-linkage. We show theoretically that the algorithm proposed in this paper achieves near-linear running time and scales to large data sets. Moreover, it empirically dominates average-linkage in scalability, typically offering a 3-10x speed-up on large data sets.' volume: 157 URL: https://proceedings.mlr.press/v157/lavastida21a.html PDF: https://proceedings.mlr.press/v157/lavastida21a/lavastida21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-lavastida21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Thomas family: Lavastida - given: Kefu family: Lu - given: Benjamin family: Moseley - given: Yuyan family: Wang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1429-1444 id: lavastida21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1429 lastpage: 1444 published: 2021-11-28 00:00:00 +0000 - title: 'Multi-scale Salient Instance Segmentation based on Encoder-Decoder' abstract: 'Salient instance segmentation refers to segmenting noticeable instance objects in images. In the face of multi-scale salient instances and overlapping instances, the existing salient instance segmentation methods have great limitations, including inaccurate detection of large-scale instances, missed detection of small-scale instances, and wrong segmentation of overlapping instances. To solve these problems, a new multi-scale salient instance segmentation network (MSISNet) based on an encoder-decoder is proposed. Firstly, a receptive field encoder (RFE) is designed to alleviate the problems of inaccurate detection of large-scale instances, missed detection of small-scale instances, and especially wrong segmentation of overlapping instances. Then, a pyramid decoder (PD) for the detection branch is designed to further alleviate the problem of inaccurate detection of large-scale instances and the difficulty in locating small-scale instances. Finally, a multi-stage decoder (MSD) is designed to improve the quality of the segmentation mask. Experiments on the salient instance segmentation dataset Salient Instance Segmentation-1K (SIS-1K) have been conducted, and the results show that the proposed method MSISNet is superior to the existing salient instance segmentation methods MSRNet and S4Net, achieving better segmentation accuracy and speed.' volume: 157 URL: https://proceedings.mlr.press/v157/chen21b.html PDF: https://proceedings.mlr.press/v157/chen21b/chen21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-chen21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Houru family: Chen - given: Caijuan family: Shi - given: Wei family: Li - given: Changyu family: Duan - given: Jinwei family: Yan editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 1445-1460 id: chen21b issued: date-parts: - 2021 - 11 - 28 firstpage: 1445 lastpage: 1460 published: 2021-11-28 00:00:00 +0000 - title: 'A Two-Stage Training Framework with Feature-Label Matching Mechanism for Learning from Label Proportions' abstract: 'In this paper, we study a task called Learning from Label Proportions (LLP). LLP aims to learn an instance-level classifier given a number of bags, each composed of several instances. The label of each instance is concealed, and what we know is the proportion of each class in each bag. The lack of instance-level supervision information makes the model struggle to find the right direction for optimization. In this paper, we solve this problem by developing a two-stage training framework. First, we leverage contrastive learning to train a feature extractor in an unsupervised way. Second, we train a linear classifier with the parameters of the feature extractor fixed. This framework performs much better than most baselines but is still unsatisfactory when the bag size or the number of classes is large. Therefore, we further propose a Feature-Label Matching mechanism (FLMm). FLMm can provide a roughly right optimization direction for the classifier by assigning labels to a subset of instances selected in each bag with a high degree of confidence. Therefore, the classifier can more easily establish the correspondence between instances and labels in the second stage. Experimental results on two benchmark datasets, namely CIFAR10 and CIFAR100, show that our model is far superior to baseline models; for example, accuracy increases from 43.44% to 61.25% for bag size 128 on CIFAR100.' volume: 157 URL: https://proceedings.mlr.press/v157/yang21b.html PDF: https://proceedings.mlr.press/v157/yang21b/yang21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-yang21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Haoran family: Yang - given: Wanjing family: Zhang - given: Wai family: Lam editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1461-1476 id: yang21b issued: date-parts: - 2021 - 11 - 28 firstpage: 1461 lastpage: 1476 published: 2021-11-28 00:00:00 +0000 - title: '$K^2$-GNN: Multiple Users’ Comments Integration with Probabilistic K-Hop Knowledge Graph Neural Networks' abstract: 'Integrating multiple comments into a concise statement for any online product or web service requires a non-trivial understanding of the input. Recently, graph neural networks (GNNs) have been successfully applied to learn from highly structured graph representations to model relationships between entities, such as co-references. However, current inter-sentence relation extraction cannot leverage discrete reasoning chains over multiple comments. To address this issue, in this paper, we propose a probabilistic $K$-hop knowledge graph (KKG) to extend existing knowledge graphs with inferred relations via discrete intra-sentence and inter-sentence reasoning chains. KKG associates each inferred relation with a confidence value through Bayesian inference. We further answer how a knowledge graph with inferred relations can help multiple comments integration by integrating KKG with GNN ($\text{K}^2$-GNN).
Our extensive experimental results show that our $\text{K}^2$-GNN outperforms all baseline graph models on multiple comments integration.' volume: 157 URL: https://proceedings.mlr.press/v157/zhan21b.html PDF: https://proceedings.mlr.press/v157/zhan21b/zhan21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhan21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Huixin family: Zhan - given: Kun family: Zhang - given: Chenyi family: Hu - given: Victor family: Sheng editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1477-1492 id: zhan21b issued: date-parts: - 2021 - 11 - 28 firstpage: 1477 lastpage: 1492 published: 2021-11-28 00:00:00 +0000 - title: 'Learn to Predict Vertical Track Irregularity with Extremely Imbalanced Data' abstract: 'Railway systems require regular manual maintenance, a large part of which is dedicated to inspecting track deformation. Such deformation might severely impact the running safety of trains, yet such inspections remain costly in both financial and human resources. Therefore, a more precise and efficient approach to detecting railway track deformation is urgently needed. In this paper, we showcase an application framework for predicting vertical track irregularity, based on a real-world, large-scale dataset produced by several operating railways in China. We have conducted extensive experiments on various machine learning and ensemble learning algorithms in an effort to maximize the model’s capability in capturing any irregularity. We also propose a novel approach for handling imbalanced data in multivariate time series prediction tasks with adaptive data sampling and penalized loss. This approach has proven to reduce models’ sensitivity to the imbalanced target domain, thus improving performance in predicting rare extreme values.' volume: 157 URL: https://proceedings.mlr.press/v157/chen21c.html PDF: https://proceedings.mlr.press/v157/chen21c/chen21c.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-chen21c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yutao family: Chen - given: Yu family: Zhang - given: Fei family: Yang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1493-1504 id: chen21c issued: date-parts: - 2021 - 11 - 28 firstpage: 1493 lastpage: 1504 published: 2021-11-28 00:00:00 +0000 - title: 'Semi-Open Attribute Extraction from Chinese Functional Description Text' abstract: 'Attribute extraction is the task of identifying attributes and the corresponding attribute values from unstructured text, which is important for extensive applications such as web information retrieval and recommender systems. Traditional relation extraction-based methods and joint extraction-based systems often perform attribute classification based on subject and attribute-value pairs and extract attribute triples within the scope of ontology schema categories, which rests on the closed-world assumption and cannot satisfy the diversity of attributes. In this work, we propose a semi-open information extraction system for attribute extraction in a multi-component framework.
With the proposed semi-open attribute extraction system (SOAE), more attribute-value pairs can be discovered by extracting literal triples without the limitation of a pre-defined ontology. An additional co-trained ontology-based attribute extraction model is appended as a component following the partial-closed-world assumption (PCWA), mitigating the performance degradation of SOAE caused by missing literal predicates in raw text and contributing to extracting richer attribute triples and constructing a denser knowledge graph. To evaluate the performance of the attribute extraction system, we construct a Chinese functional description text dataset, CNShipNet, and conduct experiments on it. The experimental results demonstrate that our proposed approach outperforms several state-of-the-art baselines by a large margin.' volume: 157 URL: https://proceedings.mlr.press/v157/zhang21f.html PDF: https://proceedings.mlr.press/v157/zhang21f/zhang21f.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhang21f.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Li family: Zhang - given: Yanzeng family: Li - given: Rouyu family: Zhang - given: Wenjie family: Li editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1505-1520 id: zhang21f issued: date-parts: - 2021 - 11 - 28 firstpage: 1505 lastpage: 1520 published: 2021-11-28 00:00:00 +0000 - title: 'Regularized Mutual Learning for Personalized Federated Learning' abstract: 'Federated Learning (FL) is a privacy-protected learning paradigm, which allows many clients to jointly train a model under the coordination of a server without local data leakage. In real-world scenarios, data in different clients usually cannot satisfy the independent and identically distributed (i.i.d.) assumption adopted widely in machine learning. Traditionally, training a single global model may cause performance degradation and difficulty in ensuring convergence in such a non-i.i.d. case. To handle this case, separate models can be trained for the clients to capture the personalization of each client. In this paper, we propose a new personalized FL framework, called Personalized Federated Mutual Learning (PFML), which uses the non-i.i.d. characteristics to generate specific models for clients. Specifically, the PFML method integrates mutual learning into the local update process in each client to not only improve the performance of both the global and personalized models but also speed up convergence compared with state-of-the-art methods. Moreover, the proposed PFML method can help maintain the heterogeneity of client models and protect the information of personalized models. Experiments on benchmark datasets show the effectiveness of the proposed PFML model. ' volume: 157 URL: https://proceedings.mlr.press/v157/yang21c.html PDF: https://proceedings.mlr.press/v157/yang21c/yang21c.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-yang21c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Ruihong family: Yang - given: Junchao family: Tian - given: Yu family: Zhang editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 1521-1536 id: yang21c issued: date-parts: - 2021 - 11 - 28 firstpage: 1521 lastpage: 1536 published: 2021-11-28 00:00:00 +0000 - title: 'Relation Also Need Attention: Integrating Relation Information Into Image Captioning' abstract: 'Image captioning methods with attention mechanisms are leading this field, especially models with global and local attention. However, few conventional models integrate the relationship information between various regions of the image. In this paper, such relationship features are embedded into the fused attention mechanism to explore the internal visual and semantic relations between different object regions. Besides, to alleviate the exposure bias problem and make the training process more efficient, we combine Generative Adversarial Network with Reinforcement Learning and employ the greedy decoding method to generate a dynamic baseline reward for self-critical training. Finally, experiments on the MSCOCO dataset show that the model can generate more accurate and vivid captions and performs better on multiple prevailing metrics than previous advanced models.' volume: 157 URL: https://proceedings.mlr.press/v157/chen21d.html PDF: https://proceedings.mlr.press/v157/chen21d/chen21d.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-chen21d.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Tianyu family: Chen - given: Zhixin family: Li - given: Tiantao family: Xian - given: Canlong family: Zhang - given: Huifang family: Ma editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1537-1552 id: chen21d issued: date-parts: - 2021 - 11 - 28 firstpage: 1537 lastpage: 1552 published: 2021-11-28 00:00:00 +0000 - title: 'Learning to Switch Optimizers for Quadratic Programming' abstract: 'Quadratic programming (QP) seeks to solve optimization problems involving quadratic functions that can include complex boundary constraints. QP in its unrestricted form is $\mathcal{NP}$-hard, but when restricted to the convex case it becomes tractable. Active set and interior point methods are used to solve convex problems, and in the nonconvex case various heuristics or relaxations are used to produce high-quality solutions in finite time. Learning to optimize (L2O) is an emerging approach to designing solvers for optimization problems. We develop an L2O approach that uses reinforcement learning to learn a stochastic policy to switch between pre-existing optimization algorithms to solve QP problem instances. In particular, our agent switches between three simple optimizers: Adam, gradient descent, and random search. Our experiments show that the learned optimizer minimizes quadratic functions faster and finds better-quality solutions in the long term than any of the individual optimizers it switches between. We also compare our solver with the standard QP algorithms in MATLAB and find better performance in fewer function evaluations.'
volume: 157 URL: https://proceedings.mlr.press/v157/getzelman21a.html PDF: https://proceedings.mlr.press/v157/getzelman21a/getzelman21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-getzelman21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Grant family: Getzelman - given: Prasanna family: Balaprakash editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1553-1568 id: getzelman21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1553 lastpage: 1568 published: 2021-11-28 00:00:00 +0000 - title: 'Perturbing Eigenvalues with Residual Learning in Graph Convolutional Neural Networks' abstract: 'Network-structured data is ubiquitous in natural and social science applications. The Graph Convolutional Neural Network (GCN) has attracted significant attention recently due to its success in representing, modeling, and predicting large-scale network data. Various types of graph convolutional filters have been proposed to process graph signals to boost the performance of graph-based semi-supervised learning. This paper introduces a novel spectral learning technique called EigLearn, which uses residual learning to perturb the eigenvalues of the graph filter matrix to optimize its capability. EigLearn is relatively easy to implement, and yet thorough experimental studies reveal that it is more effective and efficient than prior works on this specific issue, such as LanczosNet and FisherGCN. EigLearn perturbs only a small number of eigenvalues and does not require a complete eigendecomposition. Our investigation shows that EigLearn reaches the maximal performance improvement by perturbing about 30 to 40 eigenvalues, and the EigLearn-based GCN has efficiency comparable to the standard GCN. Furthermore, EigLearn bears a clear explanation in the spectral domain of the graph filter and shows aggregation effects in performance improvement when coupled with different graph filters. Hence, we anticipate that EigLearn may serve as a useful neural unit in various graph-involved neural net architectures.' volume: 157 URL: https://proceedings.mlr.press/v157/yao21a.html PDF: https://proceedings.mlr.press/v157/yao21a/yao21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-yao21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Shibo family: Yao - given: Dantong family: Yu - given: Xiangmin family: Jiao editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1569-1584 id: yao21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1569 lastpage: 1584 published: 2021-11-28 00:00:00 +0000 - title: 'Bayesian nonparametric model for arbitrary cubic partitioning' abstract: 'In this paper, we propose a continuous-time Markov process for cubic partitioning models of three-dimensional (3D) arrays and its application to Bayesian nonparametric relational data analysis of 3D array data. Relational data analysis is a topic that has been actively studied in the field of Bayesian nonparametrics, and models for analyzing 3D arrays have attracted much attention in recent years. In particular, the cubic partitioning model is very popular due to its practical usefulness, and various models such as the infinite relational model and the Mondrian process have been proposed.
However, these conventional models have the disadvantage that they are limited to a certain class of cubic partitions, and there is a need for a model that can represent a broader class of arbitrary cubic partitions, which has long been an open issue in this field. In this study, we propose a stochastic process that can represent arbitrary cubic partitions of 3D arrays as a continuous-time Markov process. Furthermore, by combining it with the Aldous-Hoover-Kallenberg representation theorem, we construct an infinitely exchangeable 3D relational model and apply it to real data to show its application to relational data analysis. Experiments show that the proposed model improves prediction performance by expanding the class of representable cubic partitions. ' volume: 157 URL: https://proceedings.mlr.press/v157/nakano21a.html PDF: https://proceedings.mlr.press/v157/nakano21a/nakano21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-nakano21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Masahiro family: Nakano - given: Yasuhiro family: Fujiwara - given: Akisato family: Kimura - given: Takeshi family: Yamada - given: Naonori family: Ueda editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1585-1600 id: nakano21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1585 lastpage: 1600 published: 2021-11-28 00:00:00 +0000 - title: 'Bayesian Inference for Optimal Transport with Stochastic Cost' abstract: 'In machine learning and computer vision, optimal transport has had significant success in learning generative models and defining metric distances between structured and stochastic data objects that can be cast as probability measures. The key element of optimal transport is the so-called lifting of an exact cost (distance) function, defined on the sample space, to a cost (distance) between probability measures over the sample space. However, in many real-life applications the cost is stochastic: e.g., the unpredictable traffic flow affects the cost of transportation between a factory and an outlet. To take this stochasticity into account, we introduce a Bayesian framework for inferring the optimal transport plan distribution induced by the stochastic cost, allowing for a principled way to include prior information and to model the induced stochasticity on the transport plans. Additionally, we tailor an HMC method to sample from the resulting transport plan posterior distribution.' volume: 157 URL: https://proceedings.mlr.press/v157/mallasto21a.html PDF: https://proceedings.mlr.press/v157/mallasto21a/mallasto21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-mallasto21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Anton family: Mallasto - given: Markus family: Heinonen - given: Samuel family: Kaski editor: - given: Vineeth N.
family: Balasubramanian - given: Ivor family: Tsang page: 1601-1616 id: mallasto21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1601 lastpage: 1616 published: 2021-11-28 00:00:00 +0000 - title: 'Multi-view Latent Subspace Clustering based on both Global and Local Structure' abstract: 'Most existing multi-view clustering methods focus on either the global structure or the local structure among samples, and few methods focus on both structures at the same time. In this paper, we propose Multi-view Latent subspace Clustering based on both Global and Local structure (MLCGL). In this method, a latent embedding representation is learned by exploring the complementary information from different views. In the latent space, not only the global reconstruction relationship but also the local geometric structure among the latent variables is discovered. In this way, a unified affinity graph matrix is constructed in the latent space for different views, which indicates a clear between-class relationship. Meanwhile, a rank constraint is introduced on the Laplacian graph to facilitate the division of samples into the required clusters. In MLCGL, the affinity graph also provides positive feedback to optimize the learned latent representation and contributes to dividing it into reasonable clusters. Moreover, we present an alternating iterative optimization scheme to optimize the objective function. Compared with state-of-the-art algorithms, MLCGL achieves excellent experimental performance on several real-world datasets.' volume: 157 URL: https://proceedings.mlr.press/v157/honghan21a.html PDF: https://proceedings.mlr.press/v157/honghan21a/honghan21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-honghan21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Zhou family: Honghan - given: Cai family: Weiling - given: Xu family: Le - given: Yang family: Ming editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1617-1632 id: honghan21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1617 lastpage: 1632 published: 2021-11-28 00:00:00 +0000 - title: 'Augmenting Imbalanced Time-series Data via Adversarial Perturbation in Latent Space' abstract: 'The success of training deep learning models largely depends on the amount and quality of training data. Although numerous data augmentation techniques have already been proposed for certain domains such as computer vision, where simple schemes such as rotation and flipping have been shown to be effective, other domains such as time-series data have a relatively smaller set of augmentation techniques readily available. Data imbalance is a phenomenon often observed in real-world data. However, a simple oversampling technique may make a model vulnerable to overfitting, so a proper data augmentation is desired. To tackle these problems, we propose a data augmentation method that utilizes the latent vectors of an autoencoder in a novel way. When input data are perturbed in the latent space, the reconstructed data retain properties similar to the original. In contrast, adversarial augmentation is a technique to train robust deep neural networks against unforeseen data shifts or corruptions by providing a downstream model with samples that are difficult to predict.
Our method adversarially perturbs input data in the latent space so that the augmented data are diverse and conducive to reducing the test error of a downstream model. The experimental results demonstrate that our method achieves the right balance, significantly modifying the input data to help generalization while retaining realism.' volume: 157 URL: https://proceedings.mlr.press/v157/kim21a.html PDF: https://proceedings.mlr.press/v157/kim21a/kim21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-kim21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Beomsoo family: Kim - given: Jang-Ho family: Choi - given: Jaegul family: Choo editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1633-1644 id: kim21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1633 lastpage: 1644 published: 2021-11-28 00:00:00 +0000 - title: 'Building Decision Tree for Imbalanced Classification via Deep Reinforcement Learning' abstract: 'Data imbalance is prevalent in classification problems and tends to bias the classifier towards the majority classes. This paper proposes a decision tree building method for imbalanced binary classification via deep reinforcement learning. First, the decision tree building process is regarded as a multi-step game and modeled as a Markov decision process. Then, tree-based convolution is applied to extract state vectors from the tree structure, and each node is abstracted into a parameterized action. Next, the reward function is designed based on a range of evaluation metrics for imbalanced classification. Finally, a popular deep reinforcement learning algorithm called Multi-Pass DQN is employed to find an optimal decision tree building policy. Experiments on more than 15 imbalanced data sets indicate that our method outperforms the state-of-the-art methods.' volume: 157 URL: https://proceedings.mlr.press/v157/wen21a.html PDF: https://proceedings.mlr.press/v157/wen21a/wen21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-wen21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Guixuan family: Wen - given: Kaigui family: Wu editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1645-1659 id: wen21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1645 lastpage: 1659 published: 2021-11-28 00:00:00 +0000 - title: 'Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox' abstract: 'We propose a novel machine learning framework to conduct real-time multi-speaker diarization and recognition without prior registration and pretraining in a fully online learning setting. Our contributions are two-fold. First, we propose a new benchmark to evaluate the rarely studied fully online speaker diarization problem. We build upon existing datasets of real-world utterances to automatically curate MiniVox, an experimental environment which generates infinite configurations of continuous multi-speaker speech streams. Second, we consider the practical problem of online learning with episodically revealed rewards and introduce a solution based on semi-supervised and self-supervised learning methods.
Additionally, we provide a workable web-based recognition system which interactively handles the cold-start problem of adding new users by transferring representations of old arms to new ones with an extendable contextual bandit. We demonstrate that our proposed method obtains robust performance in the online MiniVox framework given either cepstrum-based representations or deep neural network embeddings.' volume: 157 URL: https://proceedings.mlr.press/v157/lin21c.html PDF: https://proceedings.mlr.press/v157/lin21c/lin21c.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-lin21c.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Baihan family: Lin - given: Xinxin family: Zhang editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1660-1674 id: lin21c issued: date-parts: - 2021 - 11 - 28 firstpage: 1660 lastpage: 1674 published: 2021-11-28 00:00:00 +0000 - title: 'Video Action Recognition with Neural Architecture Search' abstract: 'Recently, deep convolutional neural networks have been widely used in the field of video action recognition. Current approaches tend to concentrate on the structure design for different backbone networks, but what kind of network structures can process video both effectively and quickly still remains to be solved despite the encouraging progress. With the help of neural architecture search (NAS), we search for three hyperparameters in the video processing network, which are the number of frames, the number of layers per residual stage and the channel number for all layers. We relax the entire search space into a continuous search space, and search for a set of network architectures that balance accuracy and computational efficiency by considering accuracy as the primary optimization goal and computational complexity as the secondary optimization goal. We conduct experiments on the UCF101 and Kinetics400 datasets, validating new state-of-the-art results of the proposed NAS-based scheme for video action recognition.' volume: 157 URL: https://proceedings.mlr.press/v157/zhou21a.html PDF: https://proceedings.mlr.press/v157/zhou21a/zhou21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-zhou21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Yuanding family: Zhou - given: Baopu family: Li - given: Zhihui family: Wang - given: Haojie family: Li editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1675-1690 id: zhou21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1675 lastpage: 1690 published: 2021-11-28 00:00:00 +0000 - title: 'Learning Maximum Margin Markov Networks from examples with missing labels' abstract: 'Structured output classifiers based on the framework of Markov Networks provide a transparent way to model statistical dependencies between output labels. The Markov Network (MN) classifier can be efficiently learned by the maximum margin method, which, however, requires expensive completely annotated examples. We extend the maximum margin algorithm to the learning of unrestricted MN classifiers from examples with partially missing annotation of labels.
The proposed algorithm translates learning into minimization of a novel loss function which is convex, has a clear connection with the supervised margin-rescaling loss, and can be efficiently optimized by first-order methods. We demonstrate the efficacy of the proposed algorithm on a challenging structured output classification problem, where it beats deep neural network models trained on a much larger number of completely annotated examples while using only partial annotations.' volume: 157 URL: https://proceedings.mlr.press/v157/franc21a.html PDF: https://proceedings.mlr.press/v157/franc21a/franc21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-franc21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Vojtech family: Franc - given: Andrii family: Yermakov editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1691-1706 id: franc21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1691 lastpage: 1706 published: 2021-11-28 00:00:00 +0000 - title: 'Maximization of Monotone $k$-Submodular Functions with Bounded Curvature and Non-$k$-Submodular Functions' abstract: 'The concept of $k$-submodularity is an extension of submodularity, whose maximization has various applications, such as influence maximization and sensor placement. In such situations, to model complicated real problems, we want to simultaneously deal with multiple factors, such as a more detailed parameter representing a property of a given function or a constraint that should be imposed on a given function. Moreover, it is preferable that an algorithm for the resulting problem be simple. In this paper, for both monotone $k$-submodular function maximization with bounded curvature and monotone weakly $k$-submodular function maximization, we give approximation ratio analyses of greedy-type algorithms for the problem with a matroid constraint and with an individual size constraint. Furthermore, we give an approximation ratio analysis for another type of relaxation of $k$-submodular functions, approximately $k$-submodular functions, under the matroid constraint.' volume: 157 URL: https://proceedings.mlr.press/v157/matsuoka21b.html PDF: https://proceedings.mlr.press/v157/matsuoka21b/matsuoka21b.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-matsuoka21b.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Tatsuya family: Matsuoka - given: Naoto family: Ohsaka editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1707-1722 id: matsuoka21b issued: date-parts: - 2021 - 11 - 28 firstpage: 1707 lastpage: 1722 published: 2021-11-28 00:00:00 +0000 - title: 'Unsupervised Cycle-Consistent Network for Removing Susceptibility Artifacts in Single-shot EPI' abstract: 'Single-shot EPI (ssEPI) is one of the most important ultrafast MRI sequences, commonly used for diffusion-weighted MRI and functional MRI. However, ssEPI suffers from susceptibility artifacts, especially at high field or at tissue boundaries. The widely used blip-up/down approaches, such as TOPUP, estimate the underlying distortion field from a pair of images with reversed phase-encoding directions.
Typically, iterative methods are used to find a solution to the ill-posed problem of finding the displacement map that maps the up/down acquisitions onto each other. Then geometric and intensity corrections are applied to obtain the undistorted images based on the estimated displacement map. This paper presents a new unsupervised cycle-consistent deep neural network that takes advantage of both the deep neural network and the gradient reversal method. The proposed method consists of three main components: (1) the Resnet50-Unet that maps the pair of images with inverted phase encoding to the displacement maps; (2) the geometric and intensity correction module that obtains the undistorted images; (3) the forward model, which is applied to get the cycled blip-up/down images so that the cycle-consistent loss can be optimized. In addition, the network generates two field maps to overcome motion or field drift during the scan. This new network is trained unsupervised on clinical datasets downloaded from the Human Connectome Project website. We test this method on both preclinical and clinical datasets. The preclinical dataset is collected from 20 mice based on the modified EPI pulse sequence in a 7T scanner. Both simulated and experimental results demonstrate that our method outperforms state-of-the-art methods. In conclusion, we propose an unsupervised cycle-consistent deep neural network for removing susceptibility artifacts. The results on both preclinical and clinical datasets show this new method’s acceleration and generalization capabilities.' volume: 157 URL: https://proceedings.mlr.press/v157/xie21a.html PDF: https://proceedings.mlr.press/v157/xie21a/xie21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-xie21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Weida family: Xie - given: Shi family: Chen - given: Qingjia family: Bao - given: Kewen family: Liu - given: Zhao family: Li - given: Xiaojun family: Li - given: Chongxin family: Bai - given: Piqiang family: Li - given: Chaoyang family: Liu - given: Otikovs family: Martins editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1723-1738 id: xie21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1723 lastpage: 1738 published: 2021-11-28 00:00:00 +0000 - title: 'Hybrid Summarization with Semantic Weighting Reward and Latent Structure Detector' abstract: 'Text summarization has been a significant challenge in the Natural Language Processing (NLP) field. Approaches to text summarization can be roughly divided into two main paradigms: extractive and abstractive. The former captures the most representative snippets in a document, while the latter generates a summary by understanding the latent meaning of the material with a language generation model. Recently, studies have found that jointly employing extractive and abstractive summarization models can exploit their complementary strengths, creating summaries that are both concise and informative. However, reinforced summarization models mainly depend on the ROUGE-based reward, which can only quantify the extent of word-matching rather than semantic-matching between document and summary. Meanwhile, documents are usually collected with redundant or noisy information due to the existence of repeated or irrelevant information in real-world applications.
Therefore, depending only on the ROUGE-based reward to optimize reinforced summarization models may lead to biased summary generation. In this paper, we propose a novel deep \bf{Hy}brid \bf{S}ummarization with semantic weighting \bf{R}eward and latent structure \bf{D}etector (HySRD). Specifically, HySRD introduces a new reward mechanism that simultaneously takes advantage of semantic and syntactic information among documents and summaries. To effectively model accurate semantics, a latent structure detector is designed to incorporate high-level latent structures into the sentence representation for information selection. Extensive experiments have been conducted on two well-known benchmark datasets, \emph{CNN/Daily Mail} (short input documents) and \emph{BigPatent} (long input documents). The automatic evaluation shows that our approach significantly outperforms the state-of-the-art hybrid summarization models.' volume: 157 URL: https://proceedings.mlr.press/v157/song21a.html PDF: https://proceedings.mlr.press/v157/song21a/song21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-song21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Mingyang family: Song - given: Liping family: Jing - given: Yi family: Feng - given: Zhiwei family: Sun - given: Lin family: Xiao editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1739-1754 id: song21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1739 lastpage: 1754 published: 2021-11-28 00:00:00 +0000 - title: 'Fast Rate Learning in Stochastic First Price Bidding' abstract: 'First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising. As far as learning is concerned, first-price auctions are more challenging because the optimal bidding strategy does not only depend on the value of the item but also requires some knowledge of the other bids. They have already given rise to several works in sequential learning, many of which consider models for which the value of the buyer or the opponents’ maximal bid is chosen in an adversarial manner. Even in the simplest settings, this gives rise to algorithms whose pseudo-regret grows as $\sqrt{T}$ with respect to the time horizon $T$. Focusing on the case where the buyer plays against a stationary stochastic environment, we show how to achieve significantly lower pseudo-regret: when the opponents’ maximal bid distribution is known we provide an algorithm whose pseudo-regret can be as low as $\log^2(T)$; in the case where the distribution must be learnt sequentially, a generalization of this algorithm can achieve $T^{1/3+\epsilon}$ pseudo-regret, for any $\epsilon>0$. To obtain these results, we introduce two novel ideas that can be of interest in their own right. First, by transposing results obtained in the posted price setting, we provide conditions under which the first-price bidding utility is locally quadratic around its optimum. Second, we leverage the observation that, on small sub-intervals, the concentration of the variations of the empirical distribution function may be controlled more accurately than by using the classical Dvoretzky-Kiefer-Wolfowitz inequality.
Numerical simulations confirm that our algorithms converge much faster than alternatives proposed in the literature for various bid distributions, including for bids collected on an actual programmatic advertising platform.' volume: 157 URL: https://proceedings.mlr.press/v157/achddou21a.html PDF: https://proceedings.mlr.press/v157/achddou21a/achddou21a.pdf edit: https://github.com/mlresearch//v157/edit/gh-pages/_posts/2021-11-28-achddou21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of The 13th Asian Conference on Machine Learning' publisher: 'PMLR' author: - given: Juliette family: Achddou - given: Olivier family: Cappé - given: Aurélien family: Garivier editor: - given: Vineeth N. family: Balasubramanian - given: Ivor family: Tsang page: 1754-1769 id: achddou21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1754 lastpage: 1769 published: 2021-11-28 00:00:00 +0000