The objective of this project is to learn the new ML library of Spark 2.0 and machine learning algorithms by applying them to datasets from real data science problems.
Experimental status
- Scala 2.11.11
- Spark 2.3.0
- DecisionTree (arg: tree)
- Logistic Regression (arg: logistic)
- Naive Bayes (arg: naives)
- Titanic survivors (arg: titanic) (Kaggle dataset: https://www.kaggle.com/c/titanic)
To launch training and testing for a problem, run:
Main <problem arg> <algorithm arg> (<opt: train path> <opt: test path>)
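
The command line above could be interpreted along the following lines. This is only a minimal sketch in plain Scala, not the project's actual `Main`: the names `RunConfig`, `ArgsSketch` and `parse` are hypothetical, and it assumes that zero, one, or two optional paths are accepted (train path first, then test path). The accepted argument values are the ones listed in this README.

```scala
// Hypothetical sketch: RunConfig, ArgsSketch and parse are illustrative names,
// not taken from the project's source code.
final case class RunConfig(
  problem: String,
  algorithm: String,
  trainPath: Option[String], // optional third argument
  testPath: Option[String]   // optional fourth argument
)

object ArgsSketch {
  // Argument values listed in this README.
  val problems: Set[String] = Set("titanic")
  val algorithms: Set[String] = Set("tree", "logistic", "naives")

  val usage: String =
    "Main <problem arg> <algorithm arg> (<opt: train path> <opt: test path>)"

  // Returns Right(config) for a valid command line, Left(message) otherwise.
  // Assumption: at most two optional paths may follow the two required arguments.
  def parse(args: Seq[String]): Either[String, RunConfig] = args match {
    case Seq(problem, algorithm, rest @ _*) if rest.length <= 2 =>
      if (!problems.contains(problem))
        Left(s"Unknown problem '$problem'. Usage: $usage")
      else if (!algorithms.contains(algorithm))
        Left(s"Unknown algorithm '$algorithm'. Usage: $usage")
      else
        Right(RunConfig(problem, algorithm, rest.headOption, rest.drop(1).headOption))
    case _ =>
      Left(s"Usage: $usage")
  }
}
```

Under these assumptions, `Main titanic tree` would select the Titanic problem with the decision tree algorithm and no explicit paths, while `Main titanic logistic <train path> <test path>` would also pass both dataset paths.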