Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization

Hanjun Dai; Mengjiao Yang; Yuan Xue; Dale Schuurmans; Bo Dai

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:4605-4617, 2022.

Abstract

Distributions over discrete sets capture the essential statistics including the high-order correlation among elements. Such information provides powerful insight for decision making across various application domains, e.g., product assortment based on product distribution in shopping carts. While deep generative models trained on pre-collected data can capture existing distributions, such pre-trained models are usually not capable of aligning with a target domain in the presence of distribution shift due to reasons such as temporal shift or the change in the population mix. We develop a general framework to adapt a generative model subject to a (possibly counterfactual) target data distribution with both sampling and computation efficiency. Concretely, instead of re-training a full model from scratch, we reuse the learned modules to preserve the correlations between set elements, while only adjusting corresponding components to align with target marginal constraints. We instantiate the approach for three commonly used forms of discrete set distribution—latent variable, autoregressive, and energy based models—and provide efficient solutions for marginal-constrained optimization in either primal or dual forms. Experiments on both synthetic and real-world e-commerce and EHR datasets show that the proposed framework is able to practically align a generative model to match marginal constraints under distribution shift.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-dai22c,
  title = {Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization},
  author = {Dai, Hanjun and Yang, Mengjiao and Xue, Yuan and Schuurmans, Dale and Dai, Bo},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {4605--4617},
  year = {2022},
  editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {17--23 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v162/dai22c/dai22c.pdf},
  url = {https://proceedings.mlr.press/v162/dai22c.html},
  abstract = 	 {Distributions over discrete sets capture the essential statistics including the high-order correlation among elements. Such information provides powerful insight for decision making across various application domains, e.g., product assortment based on product distribution in shopping carts. While deep generative models trained on pre-collected data can capture existing distributions, such pre-trained models are usually not capable of aligning with a target domain in the presence of distribution shift due to reasons such as temporal shift or the change in the population mix. We develop a general framework to adapt a generative model subject to a (possibly counterfactual) target data distribution with both sampling and computation efficiency. Concretely, instead of re-training a full model from scratch, we reuse the learned modules to preserve the correlations between set elements, while only adjusting corresponding components to align with target marginal constraints. We instantiate the approach for three commonly used forms of discrete set distribution—latent variable, autoregressive, and energy based models—and provide efficient solutions for marginal-constrained optimization in either primal or dual forms. Experiments on both synthetic and real-world e-commerce and EHR datasets show that the proposed framework is able to practically align a generative model to match marginal constraints under distribution shift.}
}

Endnote

%0 Conference Paper
%T Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization
%A Hanjun Dai
%A Mengjiao Yang
%A Yuan Xue
%A Dale Schuurmans
%A Bo Dai
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-dai22c
%I PMLR
%P 4605--4617
%U https://proceedings.mlr.press/v162/dai22c.html
%V 162
%X Distributions over discrete sets capture the essential statistics including the high-order correlation among elements. Such information provides powerful insight for decision making across various application domains, e.g., product assortment based on product distribution in shopping carts. While deep generative models trained on pre-collected data can capture existing distributions, such pre-trained models are usually not capable of aligning with a target domain in the presence of distribution shift due to reasons such as temporal shift or the change in the population mix. We develop a general framework to adapt a generative model subject to a (possibly counterfactual) target data distribution with both sampling and computation efficiency. Concretely, instead of re-training a full model from scratch, we reuse the learned modules to preserve the correlations between set elements, while only adjusting corresponding components to align with target marginal constraints. We instantiate the approach for three commonly used forms of discrete set distribution—latent variable, autoregressive, and energy based models—and provide efficient solutions for marginal-constrained optimization in either primal or dual forms. Experiments on both synthetic and real-world e-commerce and EHR datasets show that the proposed framework is able to practically align a generative model to match marginal constraints under distribution shift.

APA


Dai, H., Yang, M., Xue, Y., Schuurmans, D. & Dai, B. (2022). Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:4605-4617. Available from https://proceedings.mlr.press/v162/dai22c.html.
