Lucas De Matheo's Projects
Do you know the relationship between beer and diaper sales? What could link the sales of two products from such different categories? This story is often cited as a success case of data mining applied to retail chains: by analyzing patterns in its customers' purchase data, a retailer found a correlation between the sales of the two products. Since correlation does not imply causation, the retail chain investigated further to find an explanation. The customers were largely men who, when buying diapers for their children, also bought beer to drink while watching the weekend games and looking after the children. The retailer therefore moved the diaper shelves closer to the beer, increasing its sales.
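As a minimal sketch of how such a purchase pattern can be measured, the snippet below computes support, confidence and lift for a pair of products. The baskets are invented toy data and `association_metrics` is a hypothetical helper name; neither comes from the project itself.

```python
def association_metrics(transactions, a, b):
    """Return support, confidence and lift for the rule a -> b.

    transactions: list of sets of product names (toy data in this sketch).
    """
    n = len(transactions)
    count_a = sum(1 for t in transactions if a in t)
    count_b = sum(1 for t in transactions if b in t)
    count_ab = sum(1 for t in transactions if a in t and b in t)

    # Support: share of all baskets that contain both products.
    support = count_ab / n if n else 0.0
    # Confidence: share of baskets with `a` that also contain `b`.
    confidence = count_ab / count_a if count_a else 0.0
    # Lift: confidence relative to how often `b` is bought overall.
    lift = confidence / (count_b / n) if count_a and count_b else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Invented example baskets, for illustration only.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]
metrics = association_metrics(baskets, "diapers", "beer")
```

A lift above 1 suggests the two products appear together more often than independent purchases would predict. It still says nothing about the cause, which is why the follow-up investigation in the story matters.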
This dataset was obtained from the website of SUS, the Brazilian Unified Health System. It contains SUS records on COVID-19 in hospitals in the state of Rio de Janeiro.
This repository stores the code I developed during the Data Structure II course. Here you will find a hash-based structure for efficient data search. The code is written in C.
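To illustrate the idea behind a hash structure, here is a minimal sketch of a hash table with separate chaining. It is written in Python for brevity and is not the repository's C code; the class name `ChainedHashTable` and the bucket count are assumptions made for this example.

```python
class ChainedHashTable:
    """Toy hash table using separate chaining (a list per bucket)."""

    def __init__(self, num_buckets=8):
        self._buckets = [[] for _ in range(num_buckets)]
        self._size = 0

    def _index(self, key):
        # Map the key's hash to one of the buckets.
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))
        self._size += 1

    def get(self, key, default=None):
        # Only one bucket is scanned, not the whole table.
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

    def __len__(self):
        return self._size


table = ChainedHashTable()
table.put("alice", 30)
table.put("bob", 25)
table.put("alice", 31)  # updates the existing key
```

Because a lookup only scans the bucket the key hashes to, searches stay fast on average as long as the entries are spread evenly across buckets.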
Our data is not always in just one place. When we receive data spread across different spreadsheets, it is useful to organize it quickly and correctly.
Sometimes we come across very messy datasets. Just by looking at one, you already know it will take some work to make it minimally usable. Here is an example and a solution. For each challenge, a different solution!
This mini-project generates completely random data about individuals, including name, surname, address (with latitude and longitude), profession and monthly income. The data can serve Data Visualization projects and other projects in the Data Science area. The objective of this mini-project was to show the integrated use of three tools that I usually work with (MS Excel, Python and MS Power BI), with a focus on problem solving. First, MS Excel was used to generate consistent random data, which serves as a basis for testing and applying data analysis techniques in the same tool or in other tools. Second, Python was used to enrich the generated data through a geolocation API. Finally, Power BI was used to visualize and summarize the generated data.
This project, carried out in Jupyter Notebook, aims to explore the main Data Analysis techniques with Python tools. Pandas, NumPy, Seaborn, Matplotlib, Plotly and sklearn are used. The work is divided into three notebooks: data cleaning, data analysis and machine learning. For more details and goals, see the README.
The Leek group guide to data sharing
Python program that reads spreadsheet files and concatenates their worksheets.
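A minimal sketch of the general approach with pandas, not the program's actual code: `pd.read_excel(..., sheet_name=None)` returns a dictionary mapping each worksheet name to a DataFrame, and the hypothetical helper `concat_worksheets` below stacks those frames while recording which sheet each row came from. The file name, the `source_sheet` column and the sample data are assumptions for illustration.

```python
import pandas as pd


def concat_worksheets(sheets):
    """Stack a {sheet_name: DataFrame} mapping into one DataFrame."""
    frames = []
    for sheet_name, frame in sheets.items():
        tagged = frame.copy()
        # Keep track of the worksheet each row originated from.
        tagged["source_sheet"] = sheet_name
        frames.append(tagged)
    return pd.concat(frames, ignore_index=True)


# With a real workbook (hypothetical path), the mapping would come from:
#   sheets = pd.read_excel("sales.xlsx", sheet_name=None)
# Here, in-memory frames stand in for the worksheets.
sheets = {
    "january": pd.DataFrame({"product": ["beer", "diapers"], "units": [10, 4]}),
    "february": pd.DataFrame({"product": ["beer"], "units": [7]}),
}
combined = concat_worksheets(sheets)
```

Tagging each row with its sheet name before concatenating makes it possible to trace a value back to its worksheet after the merge.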
Config files for my GitHub profile.
Repository for the Microcontrollers course.
This repo was developed in order to support my Undergraduate Thesis. It contains the tables, the code and the thesis itself.