Rinaldo Gagiano's Projects
This assignment involves preprocessing two main data sets before merging them. The first data set is imported, an unused variable is removed, and another variable is renamed. The data set is then scanned for missing values, which are replaced or removed using a variety of techniques: mean imputation, ratio replacement, removal, logical assumption replacement and constant value substitution. The second main data set is a binding of two smaller data sets. Both smaller data sets are imported from a large Excel document using specialised import specifications, then subsetted to produce the desired tables. The subsetted data sets are cleaned by removing blank columns and, once clean, bound by row. This main data set then has a variable renamed. Both main data sets have their variable data types checked and corrected. The two main data sets are then merged to form the final data set. The final data set has its data types double-checked, leading to the factorising and labelling of a variable.
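The core steps described above (mean imputation, binding by row, then merging) can be sketched in a few lines. This is a minimal, dependency-free Python illustration and not the project's own code: the helper names, column names (`region`, `score`, `population`) and values are all hypothetical.

```python
from statistics import mean

def impute_mean(rows, column):
    """Replace None in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def bind_rows(*tables):
    """Stack tables that share the same columns (row binding)."""
    return [dict(r) for t in tables for r in t]

def inner_merge(left, right, key):
    """Join two tables on `key`, keeping only keys present in both."""
    index = {r[key]: r for r in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

# Hypothetical first data set with one missing value.
first = [
    {"region": "A", "score": 10.0},
    {"region": "B", "score": None},
    {"region": "C", "score": 20.0},
]
# Hypothetical second data set, built from two smaller tables.
part_one = [{"region": "A", "population": 100}]
part_two = [{"region": "B", "population": 250}]

first_clean = impute_mean(first, "score")
second = bind_rows(part_one, part_two)
final = inner_merge(first_clean, second, "region")
print(final)
```

Region C drops out of `final` because the inner merge keeps only keys found in both tables; a left join would keep it with a missing `population`.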
ANZ Data Science Program
Shiny Dashboard produced in R relating to Australian Suicide Statistics.
A project showcasing data wrangling skills
https://rpubs.com/Od-Lanir/MATH2270
This advanced Python project detects fake news by classifying articles as fake or real. Using sklearn, we build a TfidfVectorizer on our dataset. Then, we initialize a PassiveAggressiveClassifier and fit the model. In the end, the accuracy score and the confusion matrix tell us how well our model fares.
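To make the pipeline concrete, here is a small sketch of the two ideas involved: TF-IDF weighting and a passive-aggressive (PA-I) update rule. It is written in plain Python rather than with sklearn, so it is an illustration of the technique and not the project's implementation. The function names, the toy headlines and the label encoding (-1 for fake, +1 for real) are assumptions made for the example.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Learn smoothed inverse document frequencies from tokenised docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    """Return an L2-normalised sparse TF-IDF vector; unseen terms are dropped."""
    weights = {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}
    norm = math.sqrt(sum(v * v for v in weights.values()))
    return {t: v / norm for t, v in weights.items()} if norm else {}

def dot(w, x):
    return sum(w.get(t, 0.0) * v for t, v in x.items())

def train_pa(vectors, labels, c=1.0, epochs=5):
    """Passive-aggressive (PA-I) training for labels in {-1, +1}."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            loss = max(0.0, 1.0 - y * dot(w, x))  # hinge loss
            sq_norm = sum(v * v for v in x.values())
            if loss == 0.0 or sq_norm == 0.0:
                continue  # passive: the margin is already satisfied
            tau = min(c, loss / sq_norm)  # aggressive: step size capped at c
            for t, v in x.items():
                w[t] = w.get(t, 0.0) + tau * y * v
    return w

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

# Hypothetical toy data: -1 = fake, +1 = real.
train_docs = [
    "aliens secretly control the moon".split(),
    "miracle pill cures everything overnight".split(),
    "council approves new budget".split(),
    "court publishes annual report".split(),
]
train_labels = [-1, -1, 1, 1]
test_docs = ["aliens sell miracle pill".split(),
             "council publishes budget report".split()]
test_labels = [-1, 1]

idf = fit_idf(train_docs)
weights = train_pa([tfidf(d, idf) for d in train_docs], train_labels)
predictions = [predict(weights, tfidf(d, idf)) for d in test_docs]

accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
confusion = Counter(zip(test_labels, predictions))  # keys are (true, predicted)
print(accuracy, dict(confusion))
```

On a real corpus the sklearn classes named above handle tokenisation, stop words and sparse matrices far more efficiently; the sketch only shows what the vectorizer and the classifier compute.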
Generate multiple-choice questions from any content or news article using BERT Extractive Summarization, WordNet and ConceptNet
Authorship Attribution based on logit scores
Machine Learning Model for Predicting a Ship's Crew Size using multiple predictors
Determine if a dataset of body measurements fits a normal distribution.
Data Analysis Case Study
Config files for my GitHub profile.
Chi-square Goodness of Fit Test performed on the proposed hypothesis.
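As a brief illustration of the test named above, the chi-square goodness-of-fit statistic is the sum of (observed - expected)^2 / expected over all categories, compared against a critical value for the chosen significance level. The sketch below uses hypothetical counts and a uniform expected distribution; the project's actual data and hypothesis are not shown here.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for four categories, expected to be equally likely.
observed = [18, 22, 30, 30]
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)

# Critical value of the chi-square distribution for alpha = 0.05 with
# 3 degrees of freedom (number of categories minus one).
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815
reject_null = statistic > CRITICAL_VALUE_DF3_ALPHA_05
print(statistic, reject_null)
```

With these made-up counts the statistic falls below the critical value, so the null hypothesis of equal proportions would not be rejected at the 5% level.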
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX.
Gathers Twitter API data from various sources, cleans and mines the data in R, and creates visualisations of the main aspects of Twitter accounts in order to enhance and improve McPherson College's Twitter account and presence.