Important note :

  • The proper way to cite the HiggsML challenge: Adam-Bourdarios, C., Cowan, G., Germain, C., Guyon, I., Kégl, B. & Rousseau, D. (2015). The Higgs boson machine learning challenge. Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, in PMLR 42:19-55 (a BibTeX fragment is available there).
  • The dataset's permanent home is the CERN Open Data Portal. The full 800k events have been released, including the ground truth of the private test dataset, which was withheld on Kaggle.
  • Contact David Rousseau : rousseau at



The challenge ran from May to September 2014 on the Kaggle platform. It was very successful (the most popular challenge on Kaggle so far), with 1785 teams of 1942 players, 35772 submissions, and more than a thousand forum posts. The new challenge is now to make sure that the wealth of ideas, software, and algorithms shared by the participants percolates into the daily life of HEP physicists.

Important dates :

The purpose of this web site is to provide additional information.

High Energy Physics (HEP) has been using Machine Learning (ML) techniques such as boosted decision trees (paper) and neural nets since the 1990s. These techniques are now routinely used for difficult tasks such as the Higgs boson search. Nevertheless, formal connections between the two research fields are rather scarce, with some exceptions such as the AppStat group at LAL, founded in 2006. In collaboration with INRIA, AppStat promotes interdisciplinary research on machine learning, computational statistics, and high-energy particle and astroparticle physics.

We are now exploring new ways to improve the cross-fertilization of the two fields by setting up a data challenge, following in the footsteps of, among others, the astrophysics community (dark matter and galaxy zoo challenges) and neurobiology (connectomics and decoding the human brain). The organizing committee consists of ATLAS physicists and machine learning researchers.

The Challenge ran from Monday, May 12th to September 2014.

A CERN EP seminar (video) took place on May 13th to explain and publicize the challenge to the HEP community.

The Challenge

In a nutshell, we provide a data set containing a mixture of simulated signal and background events, built from simulated events provided by the ATLAS collaboration at CERN. Competitors can use or develop any algorithm they want, and the one who achieves the best signal/background separation wins! Besides classical prizes for the winners, a special "HEP meets ML" prize will also be awarded with an invitation to CERN; we are also seeking to organise a NIPS workshop.

For this HEP challenge we deliberately picked one of the most recent and hottest playgrounds: the Higgs boson decaying into a pair of tau leptons. The first ATLAS results were made public in December 2013 in a CERN seminar (see "ATLAS sees Higgs boson decay to fermions"). The simulated events that participants will have in their hands are the same ones that physicists used. Participants will be working in realistic conditions, although we have simplified the original problem quite a bit so that it is tractable without any background in physics.

HEP physicists, even ATLAS physicists, who have experience with multivariate analysis, neural nets, boosted decision trees and the like are warmly encouraged to compete with machine learning experts.

The Laboratoire de l’Accélérateur Linéaire (LAL) is a French lab located in the vicinity of Paris. It is overseen by both the CNRS (IN2P3) and University Paris-Sud. It has 330 employees (125 researchers and 205 engineers and technicians) and makes internationally recognized contributions to experimental Particle Physics, Accelerator Physics, Astroparticle Physics, and Cosmology.

Contact : for any question of general interest about the challenge, please consult and use the forum provided on the Kaggle web site. For private comments, we are also reachable at