Real-time data analysis at the LHC: present and future

Vladimir Gligorov

Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, PMLR 42:1-18, 2015.

Abstract

The Large Hadron Collider (LHC), which collides protons at an energy of 14 TeV, produces hundreds of exabytes of data per year, making it one of the largest sources of data in the world today. At present it is not possible to even transfer most of this data from the four main particle detectors at the LHC to “offline” data facilities, much less to permanently store it for future processing. For this reason the LHC detectors are equipped with real-time analysis systems, called triggers, which process this volume of data and select the most interesting proton-proton (pp) collisions. The LHC experiment triggers reduce the data produced by the LHC by between 1/1000 and 1/100000, to tens of petabytes per year, allowing its economical storage and further analysis. The bulk of the data-reduction is performed by custom electronics which ignores most of the data in its decision making, and is therefore unable to exploit the most powerful known data analysis strategies. I cover the present status of real-time data analysis at the LHC, before explaining why the future upgrades of the LHC experiments will increase the volume of data which can be sent off the detector and into off-the-shelf data processing facilities (such as CPU or GPU farms) to tens of exabytes per year. This development will simultaneously enable a vast expansion of the physics programme of the LHC’s detectors, and make it mandatory to develop and implement a new generation of real-time multivariate analysis tools in order to fully exploit this new potential of the LHC. I explain what work is ongoing in this direction and motivate why more effort is needed in the coming years.

Cite this Paper

BibTeX


@InProceedings{pmlr-v42-glig14,
  title = 	 {Real-time data analysis at the LHC: present and future},
  author = 	 {Gligorov, Vladimir},
  booktitle = 	 {Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning},
  pages = 	 {1--18},
  year = 	 {2015},
  editor = 	 {Cowan, Glen and Germain, Cécile and Guyon, Isabelle and Kégl, Balázs and Rousseau, David},
  volume = 	 {42},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Montreal, Canada},
  month = 	 {13 Dec},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v42/glig14.pdf},
  url = 	 {https://proceedings.mlr.press/v42/glig14.html},
  abstract = 	 {The Large Hadron Collider (LHC), which collides protons at an energy of 14 TeV, produces hundreds of exabytes of data per year, making it one of the largest sources of data in the world today. At present it is not possible to even transfer most of this data from the four main particle detectors at the LHC to “offline” data facilities, much less to permanently store it for future processing. For this reason the LHC detectors are equipped with real-time analysis systems, called triggers, which process this volume of data and select the most interesting proton-proton (pp) collisions. The LHC experiment triggers reduce the data produced by the LHC by between 1/1000 and 1/100000, to tens of petabytes per year, allowing its economical storage and further analysis. The bulk of the data-reduction is performed by custom electronics which ignores most of the data in its decision making, and is therefore unable to exploit the most powerful known data analysis strategies. I cover the present status of real-time data analysis at the LHC, before explaining why the future upgrades of the LHC experiments will increase the volume of data which can be sent off the detector and into off-the-shelf data processing facilities (such as CPU or GPU farms) to tens of exabytes per year. This development will simultaneously enable a vast expansion of the physics programme of the LHC’s detectors, and make it mandatory to develop and implement a new generation of real-time multivariate analysis tools in order to fully exploit this new potential of the LHC. I explain what work is ongoing in this direction and motivate why more effort is needed in the coming years.}
}

Endnote

%0 Conference Paper
%T Real-time data analysis at the LHC: present and future
%A Vladimir Gligorov
%B Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Glen Cowan
%E Cécile Germain
%E Isabelle Guyon
%E Balázs Kégl
%E David Rousseau
%F pmlr-v42-glig14
%I PMLR
%P 1--18
%U https://proceedings.mlr.press/v42/glig14.html
%V 42
%X The Large Hadron Collider (LHC), which collides protons at an energy of 14 TeV, produces hundreds of exabytes of data per year, making it one of the largest sources of data in the world today. At present it is not possible to even transfer most of this data from the four main particle detectors at the LHC to “offline” data facilities, much less to permanently store it for future processing. For this reason the LHC detectors are equipped with real-time analysis systems, called triggers, which process this volume of data and select the most interesting proton-proton (pp) collisions. The LHC experiment triggers reduce the data produced by the LHC by between 1/1000 and 1/100000, to tens of petabytes per year, allowing its economical storage and further analysis. The bulk of the data-reduction is performed by custom electronics which ignores most of the data in its decision making, and is therefore unable to exploit the most powerful known data analysis strategies. I cover the present status of real-time data analysis at the LHC, before explaining why the future upgrades of the LHC experiments will increase the volume of data which can be sent off the detector and into off-the-shelf data processing facilities (such as CPU or GPU farms) to tens of exabytes per year. This development will simultaneously enable a vast expansion of the physics programme of the LHC’s detectors, and make it mandatory to develop and implement a new generation of real-time multivariate analysis tools in order to fully exploit this new potential of the LHC. I explain what work is ongoing in this direction and motivate why more effort is needed in the coming years.

RIS


TY  - CPAPER
TI  - Real-time data analysis at the LHC: present and future
AU  - Vladimir Gligorov
BT  - Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning
DA  - 2015/08/27
ED  - Glen Cowan
ED  - Cécile Germain
ED  - Isabelle Guyon
ED  - Balázs Kégl
ED  - David Rousseau
ID  - pmlr-v42-glig14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 42
SP  - 1
EP  - 18
L1  - http://proceedings.mlr.press/v42/glig14.pdf
UR  - https://proceedings.mlr.press/v42/glig14.html
AB  - The Large Hadron Collider (LHC), which collides protons at an energy of 14 TeV, produces hundreds of exabytes of data per year, making it one of the largest sources of data in the world today. At present it is not possible to even transfer most of this data from the four main particle detectors at the LHC to “offline” data facilities, much less to permanently store it for future processing. For this reason the LHC detectors are equipped with real-time analysis systems, called triggers, which process this volume of data and select the most interesting proton-proton (pp) collisions. The LHC experiment triggers reduce the data produced by the LHC by between 1/1000 and 1/100000, to tens of petabytes per year, allowing its economical storage and further analysis. The bulk of the data-reduction is performed by custom electronics which ignores most of the data in its decision making, and is therefore unable to exploit the most powerful known data analysis strategies. I cover the present status of real-time data analysis at the LHC, before explaining why the future upgrades of the LHC experiments will increase the volume of data which can be sent off the detector and into off-the-shelf data processing facilities (such as CPU or GPU farms) to tens of exabytes per year. This development will simultaneously enable a vast expansion of the physics programme of the LHC’s detectors, and make it mandatory to develop and implement a new generation of real-time multivariate analysis tools in order to fully exploit this new potential of the LHC. I explain what work is ongoing in this direction and motivate why more effort is needed in the coming years.
ER  -

APA


Gligorov, V. (2015). Real-time data analysis at the LHC: present and future. Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, in Proceedings of Machine Learning Research 42:1-18. Available from https://proceedings.mlr.press/v42/glig14.html.
