Name: Walter Silima
Type: User
Company: University of the Western Cape
Bio: I am Walter Silima, an astrophysics and space science research student at the University of the Western Cape.
Twitter: walter
Location: Robert Sobukwe Road, Bellville, 7535
Blog: [email protected]
Walter Silima's Projects
This project builds on the regression repository. Its main focus is optimizing the hyper-parameters of the XGBoost regressor to best estimate the photometric redshifts under study. We used 80% of the dataset for training the algorithm and 20% for testing. We ran scikit-learn's RandomizedSearchCV with r2_score, with the median absolute deviation, and with both together, in separate trials, to find the best parameters for our testing data. Scoring with the median absolute deviation gives the best RMS and NMAD for this project.
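The RMS and NMAD figures quoted above can be computed along the following lines. This is a minimal sketch, not the project's own code: it assumes the convention common in photometric redshift work, where residuals are normalized as (z_phot - z_spec) / (1 + z_spec) and NMAD is 1.4826 times the median absolute deviation of those residuals. The function names are hypothetical.

```python
import math
import statistics


def normalized_residuals(z_phot, z_spec):
    """Residuals scaled by (1 + z_spec); an assumed, commonly used convention."""
    return [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]


def rms(z_phot, z_spec):
    """Root mean square of the normalized residuals."""
    dz = normalized_residuals(z_phot, z_spec)
    return math.sqrt(sum(d * d for d in dz) / len(dz))


def nmad(z_phot, z_spec):
    """Normalized median absolute deviation of the normalized residuals."""
    dz = normalized_residuals(z_phot, z_spec)
    centre = statistics.median(dz)
    # 1.4826 rescales the MAD so it matches the standard deviation
    # for normally distributed residuals.
    return 1.4826 * statistics.median([abs(d - centre) for d in dz])
```

Both functions take plain sequences of predicted and spectroscopic redshifts of equal length. NMAD is less sensitive to catastrophic outliers than RMS, which is why the two are usually reported together.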
In this repository we optimize the random forest (RF) hyper-parameters for the dataset: DR16 cross-matched with the WISE catalogue. We trained the algorithm on about 80% of the dataset and used scikit-learn's RandomizedSearchCV to find the parameter settings that best estimate the photometric redshifts. As scoring metrics we used "neg_mean_squared_error", "neg_median_absolute_deviation", and the two together. The "neg_median_absolute_deviation" metric yields the best results for this project.
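A randomized search of this kind might look roughly like the sketch below. It is illustrative only: the data are synthetic stand-ins rather than the DR16 and WISE photometry, the parameter grid is invented for the example, and the scorer is built with make_scorer on the assumption that the median absolute deviation metric was a custom scorer (the helper name median_abs_deviation_of_residuals is hypothetical).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer
from sklearn.model_selection import RandomizedSearchCV, train_test_split


def median_abs_deviation_of_residuals(y_true, y_pred):
    """Median absolute deviation of the prediction residuals (lower is better)."""
    resid = np.asarray(y_pred) - np.asarray(y_true)
    return float(np.median(np.abs(resid - np.median(resid))))


# greater_is_better=False makes scikit-learn negate the value,
# so the search still maximizes the score.
neg_mad_scorer = make_scorer(median_abs_deviation_of_residuals, greater_is_better=False)

# Synthetic stand-in data: 200 sources with 5 photometric features each.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 5))
y = X @ np.array([0.5, 0.2, 0.1, 0.1, 0.1]) + rng.normal(0, 0.01, 200)

# 80% for training and the search, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Illustrative search space; the project's actual ranges are not stated.
param_distributions = {
    "n_estimators": [20, 50, 100],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 5],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    scoring=neg_mad_scorer,
    cv=3,
    random_state=0,
)
search.fit(X_train, y_train)

best_params = search.best_params_
test_mad = median_abs_deviation_of_residuals(y_test, search.predict(X_test))
```

To compare scoring metrics as described above, the same search can be rerun with scoring="neg_mean_squared_error" and the held-out results compared.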
Photometric redshift estimation is currently the most powerful and efficient way to estimate the distances to extragalactic sources. The exponential data avalanche continues, and it will require low-cost, fast and efficient data-driven methods to analyse the data and make predictions from it. In this study, we present the supervised machine learning algorithms used to obtain the photometric redshifts of the galaxies and quasars found in Sloan Digital Sky Survey Data Release 16 (SDSS DR16). We adopt the K-Nearest Neighbour (KNN) and Random Forest (RF) regressors to estimate the photometric redshifts of 285685 galaxies and 124688 quasars from their photometric measurements.
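The idea behind KNN regression can be sketched in a few lines: a source's redshift is predicted as the average redshift of its k closest neighbours in photometric feature space. The version below is a simplified illustration with uniform weights and Euclidean distance, both of which are assumptions; the function name knn_predict is hypothetical and this is not the study's implementation.

```python
import math


def knn_predict(train_features, train_redshifts, query, k=3):
    """Predict a redshift as the mean redshift of the k nearest training sources."""
    # Pair each training redshift with its Euclidean distance to the query.
    distances = [
        (math.dist(features, query), z)
        for features, z in zip(train_features, train_redshifts)
    ]
    # Keep the k closest sources and average their redshifts (uniform weights).
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(z for _, z in nearest) / len(nearest)
```

Here train_features would hold one tuple of photometric measurements (for example magnitudes or colours) per source, and query is a tuple of the same length for the source whose redshift is unknown.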
Rucio - Scientific Data Management