Pre-Work for internship application.
Took a stab at a data pipeline to process URL logs and analyse which articles users were viewing during a specific time interval.
Found coding in R challenging without prior knowledge of its intricacies. Managed to read in the source file containing the data, then removed logs for users with fewer than three hits.
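The read-and-filter step could look something like the following sketch in R. The file name and the column names (user_id, url) are assumptions, since the log format isn't described in these notes:

```r
# Read the raw URL log; column names here are hypothetical.
logs <- read.csv("url_logs.csv", stringsAsFactors = FALSE)

# Count hits per user and keep only users with at least three hits.
hits_per_user <- table(logs$user_id)
active_users <- names(hits_per_user[hits_per_user >= 3])
logs <- logs[logs$user_id %in% active_users, ]
```

Filtering out low-activity users this way keeps the later aggregation from being skewed by one-off visitors.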
Then I noted that the second portion of each URL, after the base URL, contained category names for the website. These could be isolated and aggregated to calculate various indicators of user habits within the set interval.
Managed to work with the URLs as strings and sliced out the category information. All that remained was to aggregate and visualise it.
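Slicing the category out of the URL string and aggregating it might be sketched as below. This assumes URLs shaped like "https://example.com/sports/article-123", where the category is the first path segment; the actual URL structure is an assumption:

```r
# Split each URL on "/"; for "https://example.com/sports/article-123"
# the pieces are: "https:", "", "example.com", "sports", "article-123",
# so the category sits at position 4 (an assumption about the URL shape).
logs$category <- sapply(strsplit(logs$url, "/"), function(parts) parts[4])

# Aggregate hits per category and visualise with a simple bar plot.
category_counts <- sort(table(logs$category), decreasing = TRUE)
barplot(category_counts, las = 2, main = "Hits per category")
```

A table of counts per category, or the bar plot above, would directly show which content areas dominated during the interval.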