Yuval Reina: [email protected] Trian Xylouris: [email protected]
Below you can find an outline of how to reproduce our solution for the TrackML competition.
For any questions, please contact us.
- files: Directory containing the competition's event files and the user prepared training files
- df_test_v1.pkl : user prepared validation file for ML algorithm
- df_train_v2_reduced.pkl :user prepared training file for ML algorithm
- event*-*.csv :competition's event files
- functions: Directory with python code
- cluster.py :the clustering functions
- expand.py :expanding functions
- ml_model.py :functions related to the Machine Learning algorithm
- other.py :utility functions
- trackml-library-master: Directory with competition utility files (https://www.kaggle.com/c/trackml-particle-identification/discussion/55708)
- conda_python-dependencies.yml :conda environment file
- create clustering.ipynb :jupyter notebook, used to create solutions for training
- Create training.ipynb :jupyter notebook, used to create training files
- trackML_solution.ipynb :jupyter notebook, our main solution notebook
We used various hardware to train and run our solution. Any modern computer that can run IPython and Jupyter notebooks should be sufficient. The software was tested on Windows 10 and Ubuntu 16.04 LTS.
Conda - 4.6.11
Python 3.6
IPython 6.2.1
On a Linux machine you can build the conda environment like this:
conda env create -f conda_python-dependencies.yml
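Once the environment is active, a quick check along these lines can confirm that the interpreter matches the versions listed above. This is only a sketch: `check_versions` is a hypothetical helper and not part of the repository, and it only compares against the Python and IPython versions stated in this README.

```python
import sys

# Versions as listed in this README.
EXPECTED_PYTHON = (3, 6)
EXPECTED_IPYTHON = "6.2.1"


def check_versions(python_version=None, ipython_version=None):
    """Return a list of mismatch messages (empty if everything matches).

    python_version: tuple such as (3, 6, 5); defaults to the running interpreter.
    ipython_version: version string, or None to skip the IPython check.
    """
    problems = []
    major_minor = tuple((python_version or sys.version_info)[:2])
    if major_minor != EXPECTED_PYTHON:
        problems.append(
            "Python %d.%d found, README lists %d.%d" % (major_minor + EXPECTED_PYTHON)
        )
    if ipython_version is not None and ipython_version != EXPECTED_IPYTHON:
        problems.append(
            "IPython %s found, README lists %s" % (ipython_version, EXPECTED_IPYTHON)
        )
    return problems
```

Calling `check_versions()` with no arguments inspects the running interpreter; passing `IPython.__version__` as `ipython_version` adds the IPython comparison. A mismatch does not necessarily mean the solution will fail, only that the setup differs from the tested one.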
The notebooks used to prepare the training files are:
- create clustering.ipynb - used to create solutions, which are used to select false tracks for training.
Change path to point to the directory where you put the training events from Kaggle
Change out_path to point to the directory where you want to store the clustering results
If you want to try the ML algorithm with a different solution algorithm, you can still use clustering to build the false tracks, or use your own algorithm to do it.
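Before running the notebook, it can help to confirm that `path` really points at the event files. The sketch below assumes the `event*-*.csv` naming from the file listing above, where the part after the hyphen names the table (for example hits or truth). `group_event_files` is a hypothetical helper, not a function from the notebooks.

```python
from collections import defaultdict
from pathlib import Path


def group_event_files(path):
    """Map each event id to the CSV files found for it in `path`.

    The result looks like {"event000001000": {"hits": Path(...), ...}}.
    """
    events = defaultdict(dict)
    for csv_file in sorted(Path(path).glob("event*-*.csv")):
        # Split "event000001000-hits" into the event id and the table name.
        event_id, _, part = csv_file.stem.partition("-")
        events[event_id][part] = csv_file
    return dict(events)
```

An empty result for the directory assigned to `path` usually means the variable points at the wrong place.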
- Create training.ipynb - used to create the training files.
Change train_path to point to the directory where you put the training events from Kaggle
Change clustered_path to point to the directory where you stored the clustering results
The training results are stored in the directory 'files'
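After the notebook finishes, a short check like the following can verify that the prepared files are in place. It assumes the two file names shown in the directory listing above and that both end up in 'files'; `missing_training_files` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# File names taken from the directory listing in this README (assumed outputs).
EXPECTED_FILES = ("df_train_v2_reduced.pkl", "df_test_v1.pkl")


def missing_training_files(files_dir="files"):
    """Return the expected training/validation files not present in `files_dir`."""
    directory = Path(files_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

An empty list means both files were found; any name returned still needs to be created or copied into the directory.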
Follow trackML_solution.ipynb and run the full solution.
You can also view and run most parts of this solution in a Kaggle kernel.