Coder Social home page Coder Social logo

huyen-p / uci_online_retail_analysis Goto Github PK

View Code? Open in Web Editor NEW
4.0 1.0 0.0 57.91 MB

Embark on a thorough investigation as we navigate the transactional dataset of UCI, a non-store retail UK company, utilizing SQL queries and PowerBI visualization tools. This analysis empowers businesses with strategic insights for precise inventory management and sales planning in the coming year.

Home Page: https://archive.ics.uci.edu/dataset/352/online+retail

TSQL 100.00%
analysis dashboard inventory-management online-retail sale

uci_online_retail_analysis's Introduction

1 - Project Objective

Details

The author will play as a data analyst role, who will process the dataset from start to end to

  • provide valuable Power BI dashboards serving for strategic purpose of the business.
  • propose the most appropriated strategies/ recommendations to boost the company growth.

2 - Data Wrangling

Details

At this stage, an analysis of a transactional dataset from a non-retail UK company called UCI, conducted using SQL queries, revealed that 96.86% of the rows contained sales data (524,917 rows), 1.71% of the rows contained cancellation data (9,251 rows), 0.97% of the rows were duplicates (5,268 rows), 0.46% of the rows were test cases (2,470 rows), and 3 rows contained vague values labeled "Adjust bad debt." The data was collected from December 1, 2010 to December 9, 2011.

image Figure 1 - HomePage Dashboard, which summarize the result from data cleaning process

To categorize the whole data set into 5 groups including sales, cancellation, dupplicated, test case and vague rows, the DA went through the following steps:

  • Review the raw dataset from Excel format to get more farmiliar with the dataset and take note any points of the weirdness, such as data type issue, missing data, empty cells, contaminated data, invalid data. This step can be done by using Filter and Comment functions on Excel.
  • Set the right data type for each column from Excel.
  • Import the data excel file to SQL after finishing the Excel review and set up stage.
  • Conduct an exploratory data analysis (EDA) in SQL based on the key notes beforehands at the Excel review stage.
  • Conclude and group the main categories of the dataset: sale_view and cancellation_view, which are excluded duplicated rows, test case rows and vague rows.
  • Decide next steps to process for each data category. The initial data cleaning SQL script and steps can foud here (link). Since the initial result from this EDA stage indicate that the dataset mainly involve to sale and cancelled rows, the DA can shape the main research and sub research questions of this project in relation to inventory management and sale/ marketing promotions in the next part.

3 - Conclusion

Details

  • In conclusion, this project centered on the analysis of a retail dataset with the objective of providing valuable insights through Power BI dashboards to strategically support the growth of a non-store UK company, UCI.
  • The data wrangling process involved a meticulous review of the dataset, SQL analysis, and categorization into 5 data groups including sales, cancellations, duplicates, test cases, and vague rows. The main research question focused on identifying beneficial business strategies for UCI's growth in the coming year, with sub-questions addressing inventory management and sales optimization.
  • The descriptive data analysis delved into inventory management and sales performance, utilizing features such as Demand, MeanDemand, Revenue, and others. Power BI dashboards, including Inventory Management and Sale Performance, were constructed to visualize insights derived from the analysis. The discussion highlighted the importance of considering both sales and cancellation data for precise evaluation, as exemplified by the distinction between demand and sold revenue or quantity for the highest-demand product with StockCode "84879" (ASSORTED COLOR BIRD ORNAMENT) during the period from December 1, 2010, to December 9, 2011. However, the provided dashboards can be used to analyze relevant insights for any products in the dataset, depending on the specific requirements of the business.
  • Despite the comprehensive analysis, the project acknowledged limitations related to inconsistent UnitPrices for products with the same StockCode in the dataset. To address these challenges, a call for a robust data improvement process was emphasized, involving meticulous cleaning, standardization, validation, and thorough documentation to ensure data integrity, boost decision-making confidence, enhance model reliability, and improve overall operational efficiency.
  • In essence, this project not only provided actionable recommendations for UCI's growth strategies but also underscored the critical importance of maintaining a clean and standardized dataset for reliable and effective data-driven decision-making in the business context.

4 - Inventory Management Dashboard

image

5 - Sale Performance Dashboard

image

uci_online_retail_analysis's People

Contributors

huyen-p avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar

uci_online_retail_analysis's Issues

02_DataCleaningInExcel_n_SQL

01 - Checking values in the following column in Excel (to understand better about the data set)

  • 1. InvoiceNo
  • 2. StockCode
  • 3. Description
  • 4. Quantity
  • 5. InvoiceDate
  • 6. UnitPrice
  • 7. CustomerID
  • 8. Country

02 - Checking in SQL

  • 1. Missing values
  • 2. Outliers
  • 3. Contaminated Data
  • 4. Inconsistent Data
  • 5. Invalid Data
  • 6. Duplicate Data
  • 7. Data Type Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.