Important note :
- The proper way to cite the HiggsML challenge : Adam-Bourdarios, C., Cowan, G., Germain, C., Guyon, I., Kégl, B. & Rousseau, D. (2015). The Higgs boson machine learning challenge. Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, in PMLR 42:19-55, http://proceedings.mlr.press/v42/cowa14.html (bibtex fragment available there)
- The dataset's final permanent home is the CERN Open Data Portal (the full 800k events have been released, including the ground truth of the private test dataset, which was withheld on Kaggle)
- Contact David Rousseau : rousseau at ijclab.in2p3.fr
The challenge ran from May to September 2014 on the Kaggle platform. It was very successful (the most popular challenge on Kaggle so far), with 1785 teams of 1942 players, 35772 submissions, and more than a thousand forum posts. The new challenge is now to make sure the wealth of ideas, software, and algorithms exposed by the participants percolates into the daily life of HEP physicists.
Important dates :
- 9th-13th November 2015 : Data Science @ LHC workshop at CERN : http://cern.ch/DataScienceLHC2015
- Aug 2015 : The proceedings of the NIPS workshop are public : Journal of Machine Learning Research, Workshops and Proceedings Vol 42 :
- 19th May 2015, 2PM : Visit to CERN of Tianqi Chen and Tong He, winners of the HEP meets ML award, and of Gabor Melis, winner of the leaderboard. Webcast presentations at 3PM, see http://cern.ch/higgsml-visit
- Early 2015 : Release of the full Challenge dataset on the CERN Open Data Portal
- Saturday 13th December : dedicated HEPML workshop at NIPS in Montreal, with some presentations by HiggsML challenge winners see detailed agenda and the video now available. Proceedings are being edited.
- 21st November : winners announcement ! on Kaggle's forum, and from ATLAS with a profile of the winners
- Monday 15th September, 11:59 PM UTC : the challenge is finished
The purpose of this web site is to provide additional information.
High Energy Physics (HEP) has been using Machine Learning (ML) techniques such as boosted decision trees (paper) and neural nets since the 90s. These techniques are now routinely used for difficult tasks such as the Higgs boson search. Nevertheless, formal connections between the two research fields are rather scarce, with some exceptions such as the AppStat group at LAL, founded in 2006. In collaboration with INRIA, AppStat promotes interdisciplinary research on machine learning, computational statistics, and high-energy particle and astroparticle physics.
We are now exploring new ways to improve the cross-fertilization of the two fields by setting up a data challenge, following in the footsteps of, among others, the astrophysics community (dark matter and galaxy zoo challenges) and neurobiology (connectomics and decoding the human brain). The organizing committee consists of ATLAS physicists and machine learning researchers.
The Challenge ran from Monday 12th May to Monday 15th September 2014.
A CERN EP seminar (video) took place on May 13th to explain and publicize the challenge to the HEP community.
The Challenge
In a nutshell, we provide a data set containing a mixture of simulated signal and background events, built from simulated events provided by the ATLAS collaboration at CERN. Competitors can use or develop any algorithm they want, and the one who achieves the best signal/background separation wins! Besides classical prizes for the winners, a special "HEP meets ML" prize will also be awarded with an invitation to CERN; we are also seeking to organise a NIPS workshop.
For this HEP challenge we deliberately picked one of the most recent and hottest playgrounds: the Higgs boson decaying into a pair of tau leptons. The first ATLAS results were made public in December 2013 in a CERN seminar, "ATLAS sees Higgs boson decay to fermions". The simulated events that participants will have in their hands are the same ones that physicists used. Participants will be working in realistic conditions, although we have simplified the original problem quite a bit so that it is tractable without any background in physics.
HEP physicists, even ATLAS physicists, who have experience with multivariate analysis, neural nets, boosted decision trees and the like are warmly encouraged to compete with machine learning experts.
The Laboratoire de l’Accélérateur Linéaire (LAL) is a French lab located in the vicinity of Paris. It is overseen by both the CNRS (IN2P3) and University Paris-Sud. It has 330 employees (125 researchers and 205 engineers and technicians) and makes internationally recognized contributions to experimental Particle Physics, Accelerator Physics, Astroparticle Physics, and Cosmology.
Contact : for any question of general interest about the challenge, please consult and use the forum provided on the Kaggle web site. For private comments, we are also reachable at higgsml_at_lal.in2p3.fr.