Data-Efficient and Robust Trajectory Generation through Pathlet Dictionary Learning

Yuanbo Tang, Yan Tang, Zihui Zhao, Zixuan Zhang, Yang Li
Conference on Parsimony and Learning, PMLR 328:1072-1089, 2026.

Abstract

Trajectory generation has recently drawn growing interest in privacy-preserving urban mobility studies and location-based service applications. Although many studies have used deep learning or generative AI methods to model trajectories and achieved promising results, real-world trajectory data are noisy and often incomplete (e.g., device instability, low sampling rates, privacy-driven partial reporting), introducing distribution shifts and, as observed in our experiments, marked differences between synthetic and real trajectory distributions. To address this issue, we exploit the low-dimensional structure and regular patterns in urban trajectories and propose a parsimonious deep generative model based on sparse pathlet representations, which encode trajectories with sparse binary vectors associated with a learned compact dictionary of trajectory segments. Specifically, we introduce a probabilistic graphical model to describe the trajectory generation process, which includes a Variational Autoencoder (VAE) component and a linear decoder component. During training, the model can simultaneously learn the latent embedding of sparse pathlet representations and the pathlet dictionary that captures essential mobility patterns in the trajectory dataset. The conditional version of our model can also be used to generate customized trajectories based on temporal and spatial constraints. Our model can effectively learn the data distribution even from noisy data, achieving relative improvements of 35.4% and 26.3% over strong baselines on two real-world trajectory datasets. Moreover, the generated trajectories can be conveniently utilized for multiple downstream tasks, including trajectory prediction and data denoising. Lastly, the framework design offers a significant efficiency advantage, saving 64.8% of the time and 56.5% of GPU memory compared to previous approaches. The code repository is available at https://anonymous.4open.science/r/Data-Efficient-and-Robust-Trajectory-Generation-through-Pathlet-Dictionary-Learning-045E.
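
As a rough illustration of the sparse pathlet representation described in the abstract, the sketch below encodes a trajectory as a binary code over a small dictionary of trajectory segments and reconstructs it with a linear decoder. It is not the authors' implementation: the toy road network, the fixed dictionary `D`, and the helper names `encode_greedy` and `decode` are assumptions made for this example. In the paper the dictionary is learned and a VAE models the distribution of the codes, both of which are omitted here.

```python
import numpy as np

# Hypothetical pathlet dictionary for a toy road network with 6 edges.
# Each column is one pathlet, given as a 0/1 indicator over the edges it covers.
# In the paper the dictionary is learned; here it is fixed by hand (assumption).
D = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
], dtype=np.int64)


def decode(code: np.ndarray) -> np.ndarray:
    """Linear decoder: map a binary pathlet code to an edge-indicator vector."""
    return D @ code


def encode_greedy(trajectory_edges: np.ndarray) -> np.ndarray:
    """Greedily select pathlets that fit entirely inside the uncovered edges.

    This is a simple stand-in for sparse coding, not the paper's encoder.
    """
    code = np.zeros(D.shape[1], dtype=np.int64)
    remaining = trajectory_edges.copy()
    for j in range(D.shape[1]):
        pathlet = D[:, j]
        # Use pathlet j only if every edge it covers is still uncovered.
        if np.all(remaining >= pathlet):
            code[j] = 1
            remaining = remaining - pathlet
    return code


# A trajectory that traverses edges 0, 1, 4 and 5.
trajectory = np.array([1, 1, 0, 0, 1, 1], dtype=np.int64)
code = encode_greedy(trajectory)
reconstruction = decode(code)
print("code:", code)
print("reconstruction:", reconstruction)
```

Here the trajectory is covered by the first and third pathlets, so the code has two nonzero entries out of three. A generative model over such codes could then sample new codes and pass them through the same linear decoder to obtain trajectories.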

Cite this Paper


BibTeX
@InProceedings{pmlr-v328-tang26b,
  title = {Data-Efficient and Robust Trajectory Generation through Pathlet Dictionary Learning},
  author = {Tang, Yuanbo and Tang, Yan and Zhao, Zihui and Zhang, Zixuan and Li, Yang},
  booktitle = {Conference on Parsimony and Learning},
  pages = {1072--1089},
  year = {2026},
  editor = {Burkholz, Rebekka and Liu, Shiwei and Ravishankar, Saiprasad and Redman, William and Huang, Wei and Su, Weijie and Zhu, Zhihui},
  volume = {328},
  series = {Proceedings of Machine Learning Research},
  month = {23--26 Mar},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v328/main/assets/tang26b/tang26b.pdf},
  url = {https://proceedings.mlr.press/v328/tang26b.html},
  abstract = {Trajectory generation has recently drawn growing interest in privacy-preserving urban mobility studies and location-based service applications. Although many studies have used deep learning or generative AI methods to model trajectories and achieved promising results, real-world trajectory data are noisy and often incomplete (e.g., device instability, low sampling rates, privacy-driven partial reporting), introducing distribution shifts and, as observed in our experiments, marked differences between synthetic and real trajectory distributions. To address this issue, we exploit the low-dimensional structure and regular patterns in urban trajectories and propose a parsimonious deep generative model based on sparse pathlet representations, which encode trajectories with sparse binary vectors associated with a learned compact dictionary of trajectory segments. Specifically, we introduce a probabilistic graphical model to describe the trajectory generation process, which includes a Variational Autoencoder (VAE) component and a linear decoder component. During training, the model can simultaneously learn the latent embedding of sparse pathlet representations and the pathlet dictionary that captures essential mobility patterns in the trajectory dataset. The conditional version of our model can also be used to generate customized trajectories based on temporal and spatial constraints. Our model can effectively learn the data distribution even from noisy data, achieving relative improvements of 35.4% and 26.3% over strong baselines on two real-world trajectory datasets. Moreover, the generated trajectories can be conveniently utilized for multiple downstream tasks, including trajectory prediction and data denoising. Lastly, the framework design offers a significant efficiency advantage, saving 64.8% of the time and 56.5% of GPU memory compared to previous approaches. The code repository is available at https://anonymous.4open.science/r/Data-Efficient-and-Robust-Trajectory-Generation-through-Pathlet-Dictionary-Learning-045E.}
}
Endnote
%0 Conference Paper
%T Data-Efficient and Robust Trajectory Generation through Pathlet Dictionary Learning
%A Yuanbo Tang
%A Yan Tang
%A Zihui Zhao
%A Zixuan Zhang
%A Yang Li
%B Conference on Parsimony and Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Rebekka Burkholz
%E Shiwei Liu
%E Saiprasad Ravishankar
%E William Redman
%E Wei Huang
%E Weijie Su
%E Zhihui Zhu
%F pmlr-v328-tang26b
%I PMLR
%P 1072--1089
%U https://proceedings.mlr.press/v328/tang26b.html
%V 328
%X Trajectory generation has recently drawn growing interest in privacy-preserving urban mobility studies and location-based service applications. Although many studies have used deep learning or generative AI methods to model trajectories and achieved promising results, real-world trajectory data are noisy and often incomplete (e.g., device instability, low sampling rates, privacy-driven partial reporting), introducing distribution shifts and, as observed in our experiments, marked differences between synthetic and real trajectory distributions. To address this issue, we exploit the low-dimensional structure and regular patterns in urban trajectories and propose a parsimonious deep generative model based on sparse pathlet representations, which encode trajectories with sparse binary vectors associated with a learned compact dictionary of trajectory segments. Specifically, we introduce a probabilistic graphical model to describe the trajectory generation process, which includes a Variational Autoencoder (VAE) component and a linear decoder component. During training, the model can simultaneously learn the latent embedding of sparse pathlet representations and the pathlet dictionary that captures essential mobility patterns in the trajectory dataset. The conditional version of our model can also be used to generate customized trajectories based on temporal and spatial constraints. Our model can effectively learn the data distribution even from noisy data, achieving relative improvements of 35.4% and 26.3% over strong baselines on two real-world trajectory datasets. Moreover, the generated trajectories can be conveniently utilized for multiple downstream tasks, including trajectory prediction and data denoising. Lastly, the framework design offers a significant efficiency advantage, saving 64.8% of the time and 56.5% of GPU memory compared to previous approaches. The code repository is available at https://anonymous.4open.science/r/Data-Efficient-and-Robust-Trajectory-Generation-through-Pathlet-Dictionary-Learning-045E.
APA
Tang, Y., Tang, Y., Zhao, Z., Zhang, Z., & Li, Y. (2026). Data-Efficient and Robust Trajectory Generation through Pathlet Dictionary Learning. Conference on Parsimony and Learning, in Proceedings of Machine Learning Research 328:1072-1089. Available from https://proceedings.mlr.press/v328/tang26b.html.
