The model which will eventually be the most useful to ATLAS is not necessarily the one which gives the absolute best performance. The AMS which we ask participants to optimize is indeed the most important criterion, but not the only one.
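For reference, the AMS is the approximate median significance used throughout the Challenge:

\[
\mathrm{AMS} = \sqrt{2\left((s + b + b_{\mathrm{reg}})\,\ln\!\left(1 + \frac{s}{b + b_{\mathrm{reg}}}\right) - s\right)},
\]

where s and b are the expected (weighted) numbers of signal and background events passing the participant's selection, and b_reg = 10 is a regularisation constant.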
To acknowledge that the AMS is not the only consideration, an Award will be given to the team that, as judged by the ATLAS collaboration members on the organizing committee, creates the model which is potentially the most useful to the ATLAS experiment, according to the criteria detailed below.
The winning team will be invited to meet ATLAS collaboration physicists at CERN, with up to $2,700 (approximately €2,000) to cover their travel expenses.
To be eligible for the HEP meets ML Award, the team must submit the model (the code, with reasonably detailed information on the principles behind it) by the competition deadline.
The team will decide internally how the Award money is divided, it being understood that the Award will not cover the travel expenses of team members who belong to the ATLAS collaboration or are based at CERN.
The Award money will be granted as reimbursement of expenses such as airfare, ground transportation, and hotel (original receipts and boarding passes will be requested). Reimbursement is conditional on participation in the visit program which will be organized.
A combination of criteria will be considered:
- AMS performance (of course we want something that works!)
- ROC performance (this is one reason we ask for a rank, in addition to the classification, so that we can study the ROC curve of the classifier; a sketch of this computation follows the list)
- documentation of the model (in particular with respect to the criteria listed here)
- simplicity/straightforwardness of the approach. For example, a model built from a cascade of independent models stacked on top of each other will be disfavored, as will a model that clearly required a lot of fine tuning by hand. It is hard to be quantitative, but one possible criterion is the number of lines needed to describe the model (including how it is trained) in enough detail that an average physicist could code it; the smaller, the better.
- CPU and memory demands. To set the scale with the Challenge data: on a regular modern laptop, one minute for training, one millisecond to classify each entry, and less than 1 GB of memory are good enough.
- robustness with respect to limited training statistics
- flexibility with respect to different optimisation targets
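As an illustration of the ROC criterion above, the following minimal sketch shows how the ROC curve and its area can be computed from a rank-based submission. The arrays y_true and rank are hypothetical stand-ins for the true labels and a participant's ranks, generated here from toy data:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Hypothetical inputs: y_true holds the true labels (1 = signal, 0 = background)
# and rank holds the rank a participant assigned to each event, a higher rank
# meaning "more signal-like" (as in the Challenge submission format).
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
score = y_true + rng.normal(0.0, 0.8, size=1000)   # toy classifier output
rank = score.argsort().argsort() + 1               # convert scores to ranks 1..N

# The rank is a monotonic transform of the score, so it yields the same ROC curve.
fpr, tpr, _ = roc_curve(y_true, rank)
print(f"AUC = {auc(fpr, tpr):.3f}")
```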
The HEP meets ML Award has been given to Tianqi Chen and Tong He (team crowwork), who released their software XGBoost very early in the competition. XGBoost is a parallelised boosted-decision-tree package which was used by many competitors and gave some of the highest scores.
They have been invited to CERN on 19 May 2015, where they will present their software at a special workshop webcast from the CERN main auditorium; see http://cern.ch/higgsml-visit.
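For readers who want to try a similar approach, below is a minimal sketch of training a boosted-decision-tree classifier with the xgboost Python package. The toy dataset and all parameter values are illustrative assumptions, not the winners' configuration:

```python
import numpy as np
import xgboost as xgb

# Toy stand-in for the Challenge data: 30 features, binary signal/background label.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 30))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

dtrain = xgb.DMatrix(X[:8000], label=y[:8000])
dvalid = xgb.DMatrix(X[8000:], label=y[8000:])

# Illustrative parameters, not the winning settings.
params = {
    "objective": "binary:logistic",
    "eta": 0.1,
    "max_depth": 6,
    "eval_metric": "auc",
}
bst = xgb.train(params, dtrain, num_boost_round=100,
                evals=[(dvalid, "valid")], verbose_eval=20)

# Scores in (0, 1); thresholding (or ranking) them gives the
# signal/background decision and the rank required by the submission format.
pred = bst.predict(dvalid)
```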