Real-time data analysis at the LHC: present and future

Vladimir Gligorov
Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, PMLR 42:1-18, 2015.

Abstract

The Large Hadron Collider (LHC), which collides protons at an energy of 14 TeV, produces hundreds of exabytes of data per year, making it one of the largest sources of data in the world today. At present it is not possible to even transfer most of this data from the four main particle detectors at the LHC to “offline” data facilities, much less to permanently store it for future processing. For this reason the LHC detectors are equipped with real-time analysis systems, called triggers, which process this volume of data and select the most interesting proton-proton (pp) collisions. The LHC experiment triggers reduce the data produced by the LHC by between 1/1000 and 1/100000, to tens of petabytes per year, allowing its economical storage and further analysis. The bulk of the data-reduction is performed by custom electronics which ignores most of the data in its decision making, and is therefore unable to exploit the most powerful known data analysis strategies. I cover the present status of real-time data analysis at the LHC, before explaining why the future upgrades of the LHC experiments will increase the volume of data which can be sent off the detector and into off-the-shelf data processing facilities (such as CPU or GPU farms) to tens of exabytes per year. This development will simultaneously enable a vast expansion of the physics programme of the LHC’s detectors, and make it mandatory to develop and implement a new generation of real-time multivariate analysis tools in order to fully exploit this new potential of the LHC. I explain what work is ongoing in this direction and motivate why more effort is needed in the coming years.
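The data-reduction arithmetic in the abstract can be sanity-checked with a short back-of-the-envelope sketch. The reduction factors (1/1000 to 1/100000) come from the text; the specific raw-data figure of 300 EB/year is an illustrative assumption standing in for "hundreds of exabytes".

```python
# Back-of-the-envelope check of the trigger data reduction described above.
# The 300 EB/year raw-data figure is an assumed, illustrative value.

EXABYTE = 10**18   # bytes
PETABYTE = 10**15  # bytes

raw_per_year = 300 * EXABYTE  # assumed "hundreds of exabytes" per year

for factor in (1_000, 100_000):
    stored = raw_per_year / factor
    print(f"reduction 1/{factor}: ~{stored / PETABYTE:,.0f} PB/year stored")
```

The two bounds bracket roughly 3 to 300 PB/year of stored data, consistent with the "tens of petabytes per year" scale quoted in the abstract.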

Cite this Paper


BibTeX
@InProceedings{pmlr-v42-glig14,
  title     = {Real-time data analysis at the LHC: present and future},
  author    = {Gligorov, Vladimir},
  booktitle = {Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning},
  pages     = {1--18},
  year      = {2015},
  editor    = {Cowan, Glen and Germain, Cécile and Guyon, Isabelle and Kégl, Balázs and Rousseau, David},
  volume    = {42},
  series    = {Proceedings of Machine Learning Research},
  address   = {Montreal, Canada},
  month     = {13 Dec},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v42/glig14.pdf},
  url       = {https://proceedings.mlr.press/v42/glig14.html}
}
APA
Gligorov, V. (2015). Real-time data analysis at the LHC: present and future. Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, in Proceedings of Machine Learning Research 42:1-18. Available from https://proceedings.mlr.press/v42/glig14.html.
