Training Dips

The main part of the study is the training of the Dips algorithm with different training datasets. The analysis code for this is hosted in a GitLab project:

https://gitlab.cern.ch/pgadow/desy-ftag-summie2022

You can download the code from the terminal in the Jupyter notebook web interface with:

git clone ssh://git@gitlab.cern.ch:7999/pgadow/desy-ftag-summie2022.git

The notebooks and the auxiliary material supplied with them then become available, and you can execute them in the web interface.

There are two notebooks. prepare_samples.ipynb provides context and explains how the hybrid training dataset was prepared. The main resource is train_Dips.ipynb.

In this notebook, the Dips neural network is defined using the Keras high-level interface for the TensorFlow machine learning library.
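Dips is based on the Deep Sets architecture: a per-track network is applied to every track, the outputs are summed over the tracks of a jet, and a jet-level network turns the sum into class probabilities. The following is a minimal NumPy sketch of that structure, not the Keras model defined in train_Dips.ipynb. The layer sizes, the number of track features, the random weights and all function names are assumptions made for illustration.

```python
import numpy as np


def init_params(n_track_features=6, n_latent=16, n_hidden=16, n_classes=3, seed=0):
    """Random weights for a toy Deep Sets model (all sizes are hypothetical)."""
    rng = np.random.default_rng(seed)
    return {
        # per-track network "phi": one dense layer
        "phi_w": rng.normal(scale=0.1, size=(n_track_features, n_latent)),
        "phi_b": np.zeros(n_latent),
        # jet-level network "F": hidden layer plus output layer
        "f_w1": rng.normal(scale=0.1, size=(n_latent, n_hidden)),
        "f_b1": np.zeros(n_hidden),
        "f_w2": rng.normal(scale=0.1, size=(n_hidden, n_classes)),
        "f_b2": np.zeros(n_classes),
    }


def dense(x, w, b, relu=True):
    """Dense layer acting on the last axis of x."""
    y = x @ w + b
    return np.maximum(y, 0.0) if relu else y


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def deep_sets_forward(tracks, mask, params):
    """Class probabilities per jet.

    tracks: array of shape (n_jets, n_tracks, n_track_features), zero-padded
    mask:   boolean array of shape (n_jets, n_tracks), True for real tracks
    """
    latent = dense(tracks, params["phi_w"], params["phi_b"])  # phi on each track
    latent = latent * mask[..., None]                         # ignore padded tracks
    pooled = latent.sum(axis=1)                               # sum over tracks
    hidden = dense(pooled, params["f_w1"], params["f_b1"])
    logits = dense(hidden, params["f_w2"], params["f_b2"], relu=False)
    return softmax(logits)
```

Because the tracks enter only through a sum, the output does not depend on the order in which the tracks of a jet are stored. This is the property that distinguishes a Deep Sets model from a recurrent network over an ordered track list.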

The network is trained for a certain number of epochs using the hybrid training dataset.

Task 1: Read through the notebooks and try to understand how the code blocks work and what the steps in the preprocessing are meant to achieve.

Task 2: Experiment with different numbers of epochs for the training and observe how the validation loss and the training loss evolve over time.

Task 3: Compare the performance of the trained Dips algorithm with that of the evaluated Dips versions which were trained on particle-flow jets.
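For Task 2, it can help to summarise the two loss curves numerically in addition to plotting them. The sketch below assumes the losses are available as a dictionary with the keys "loss" and "val_loss", which is what the history attribute of the object returned by a Keras fit call contains. The function name, the patience parameter and the example numbers are hypothetical.

```python
def summarise_losses(history, patience=3):
    """Summarise training and validation loss per epoch.

    history:  dict with equally long lists under "loss" and "val_loss"
    patience: number of epochs without a new validation minimum after
              which the training is flagged as overfitting (assumed choice)
    """
    train = list(history["loss"])
    val = list(history["val_loss"])
    if len(train) != len(val) or not val:
        raise ValueError("need equally long, non-empty loss lists")

    # index of the epoch with the lowest validation loss
    best_index = min(range(len(val)), key=val.__getitem__)
    epochs_since_best = len(val) - 1 - best_index

    return {
        "best_epoch": best_index + 1,          # epochs counted from 1
        "best_val_loss": val[best_index],
        "final_gap": val[-1] - train[-1],      # validation minus training loss
        "overfitting": epochs_since_best >= patience,
    }


# Made-up numbers for illustration only, not results from the notebook.
example_history = {
    "loss": [0.90, 0.70, 0.60, 0.52, 0.47, 0.43, 0.40],
    "val_loss": [0.92, 0.75, 0.68, 0.66, 0.67, 0.69, 0.72],
}
print(summarise_losses(example_history))
```

A training loss that keeps falling while the validation loss has passed its minimum and is rising again is the usual sign that further epochs only fit the training sample more closely.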

Two hybrid datasets are provided: one composed of ttbar + Zprime events and another composed of ttbar + graviton events.

Both can be used for training the algorithms and are available at DESY NAF:

ttbar + Zprime: /nfs/dust/atlas/user/pgadow/summie2022/data/vr_dips_samples/hybrid_ttbar_zprime/
ttbar + graviton: /nfs/dust/atlas/user/pgadow/summie2022/data/vr_dips_samples/hybrid_ttbar_graviton/

The structure of both directories is similar. The final dataset to be used for training the algorithm is VR-hybrid-resampled_scaled_shuffled.h5.

Task 4: Run one training with the "ttbar + Zprime" dataset and one with the "ttbar + graviton" dataset, using the same number of jets, epochs and batch size for a fair comparison. Evaluate the performance on the ttbar, Zprime and graviton samples in prepared_samples (inclusive_testing_ttbar_TrackJets.h5, inclusive_testing_zprime_TrackJets.h5, inclusive_testing_graviton_TrackJets.h) and compare which training dataset gives the best performance.
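One common way to compare two trainings is to combine the three network outputs into a single b-tagging discriminant and quote the light-jet rejection at a fixed b-jet efficiency. The sketch below shows this with NumPy under several assumptions: the charm fraction f_c = 0.018 and the 77% efficiency are placeholder values, the flavour labels 5 (b-jets) and 0 (light jets) are an assumed convention, and the function names are hypothetical. The notebook may use different values and helpers.

```python
import numpy as np


def b_tag_discriminant(p_b, p_c, p_u, f_c=0.018):
    """log(p_b / (f_c * p_c + (1 - f_c) * p_u)) per jet.

    f_c weights the charm against the light-jet probability; the default
    is a placeholder value, not taken from the notebook.
    """
    eps = 1e-10  # protects against division by zero and log(0)
    return np.log((p_b + eps) / (f_c * p_c + (1.0 - f_c) * p_u + eps))


def rejection_at_efficiency(scores, labels, signal=5, background=0, target_eff=0.77):
    """Cut value and background rejection at a fixed signal efficiency.

    scores: discriminant value per jet
    labels: flavour label per jet (assumed: 5 for b-jets, 0 for light jets)
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    sig = scores[labels == signal]
    bkg = scores[labels == background]

    # cut above which a fraction target_eff of the signal jets lies
    cut = np.quantile(sig, 1.0 - target_eff)

    # rejection = 1 / (fraction of background jets passing the cut)
    n_pass = np.count_nonzero(bkg > cut)
    rejection = len(bkg) / n_pass if n_pass > 0 else np.inf
    return cut, rejection
```

Applying the same function with the same target efficiency to the outputs of both trainings, separately for the ttbar, Zprime and graviton test samples, gives numbers that can be compared directly. A higher rejection at equal b-jet efficiency means better performance.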

