Probabilistic Querying of Continuous-Time Event Sequences

Alex Boyd, Yuxin Chang, Stephan Mandt, Padhraic Smyth
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:10235-10251, 2023.

Abstract

Continuous-time event sequences, i.e., sequences consisting of continuous time stamps and associated event types (“marks”), are an important type of sequential data with many applications, e.g., in clinical medicine or user behavior modeling. Since these data are typically modeled in an autoregressive manner (e.g., using neural Hawkes processes or their classical counterparts), it is natural to ask questions about future scenarios such as “what kind of event will occur next?” or “will an event of type $A$ occur before one of type $B$?” Addressing such queries with direct methods such as naive simulation can be highly inefficient from a computational perspective. This paper introduces a new typology of query types and a framework for addressing them using importance sampling. Example queries include predicting the $n^\mathrm{th}$ event type in a sequence and the hitting time distribution of one or more event types. We further extend these results to estimate general “$A$ before $B$” queries. We prove theoretically that our estimation method is effectively always better than naive simulation and demonstrate empirically on three real-world datasets that our approach can produce orders of magnitude improvements in sampling efficiency compared to naive methods.
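
To make the naive-simulation baseline concrete, the following minimal sketch estimates an “$A$ before $B$” query by forward simulation. It is not the paper's method or model: it assumes, purely for illustration, independent homogeneous Poisson processes per mark (so that a closed-form answer exists for comparison), and the function names, rates, and horizon are hypothetical.

```python
import math
import random


def a_before_b_once(rates, horizon, rng):
    """Simulate one sequence and report whether an A event occurs
    before any B event within the time horizon.

    rates: dict mapping mark -> constant intensity (toy assumption).
    """
    total = sum(rates.values())
    marks = list(rates)
    weights = [rates[m] for m in marks]
    t = 0.0
    while True:
        # Waiting time to the next event of any mark.
        t += rng.expovariate(total)
        if t > horizon:
            return False  # neither A nor B was seen in time
        # Draw the mark in proportion to its intensity.
        mark = rng.choices(marks, weights=weights)[0]
        if mark == "A":
            return True
        if mark == "B":
            return False
        # Any other mark: keep simulating.


def naive_estimate(rates, horizon, num_samples, seed=0):
    """Naive Monte Carlo: fraction of simulated sequences satisfying the query."""
    rng = random.Random(seed)
    hits = sum(a_before_b_once(rates, horizon, rng) for _ in range(num_samples))
    return hits / num_samples


def exact_probability(rates, horizon):
    """Closed form for the toy Poisson setting only."""
    ab = rates["A"] + rates["B"]
    return rates["A"] / ab * (1.0 - math.exp(-ab * horizon))


if __name__ == "__main__":
    rates = {"A": 1.0, "B": 2.0, "C": 5.0}  # hypothetical intensities
    print(naive_estimate(rates, horizon=1.0, num_samples=20000))
    print(exact_probability(rates, horizon=1.0))
```

Each sample here requires simulating a full sequence until the query resolves, and the estimate's standard error shrinks only as the inverse square root of the number of samples. For history-dependent models such as neural Hawkes processes no closed form is available, which is the setting where the abstract's comparison between naive simulation and importance sampling applies.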

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-boyd23a,
  title     = {Probabilistic Querying of Continuous-Time Event Sequences},
  author    = {Boyd, Alex and Chang, Yuxin and Mandt, Stephan and Smyth, Padhraic},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {10235--10251},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/boyd23a/boyd23a.pdf},
  url       = {https://proceedings.mlr.press/v206/boyd23a.html}
}
Endnote
%0 Conference Paper
%T Probabilistic Querying of Continuous-Time Event Sequences
%A Alex Boyd
%A Yuxin Chang
%A Stephan Mandt
%A Padhraic Smyth
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-boyd23a
%I PMLR
%P 10235--10251
%U https://proceedings.mlr.press/v206/boyd23a.html
%V 206
APA
Boyd, A., Chang, Y., Mandt, S., & Smyth, P. (2023). Probabilistic Querying of Continuous-Time Event Sequences. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:10235-10251. Available from https://proceedings.mlr.press/v206/boyd23a.html.
