Today the LHCb collaboration has, for the first time, released its data to the public, allowing anyone in the world to conduct research with it.
All scientific results from the LHCb collaboration are already publicly accessible in open-access papers, and the numerical results shown in graphs are available in HEPData. Starting today, not only the results but also the data used by the researchers to produce them are accessible. The data release is made within the framework of the CERN Open Data Policy, which reflects values that have been enshrined in the CERN Convention for more than sixty years. LHCb data from Run 1 and Run 2 have been used for over 600 scientific publications, including a number of significant discoveries, as described in previous LHCb public pages.
The data sample made available amounts to 20% of the total data set collected by the LHCb experiment in 2011 and 2012 during Run 1 of the Large Hadron Collider at CERN. It comprises 200 terabytes of data that anyone can download directly. These data contain information obtained from proton-proton collision events filtered and recorded with the LHCb detector, and are suitable for different types of physics studies. The image above displays an event recorded during 2011. Further releases will be planned.
The collaboration has preprocessed the data by reconstructing experimental signatures, such as the trajectories of charged particles, from the raw information delivered by its complex detector system. The data are filtered, classified according to approximately 300 processes and decays, and made available in the same format that LHCb physicists use internally. The data can be downloaded from the CERN Open Data portal.
The analysis of LHC data is a complex and time-consuming exercise: it is not trivial to get started or to produce results of value. To facilitate the analysis, the samples are therefore accompanied by extensive documentation and metadata, as well as a glossary explaining several hundred special terms used in the preprocessing. The data can be analyzed using dedicated LHCb algorithms, which are available as open source software.
We are pleased to make these data available for research, education and any non-profit application. All data sets come with digital identifiers, which can be used for reference and citation. We are excited to see how our data are used, and feedback is very welcome. We invite users of our data to open discussions and post questions in the CERN Open Data Forum.