karolinamgoma / atm-data-transactions---analysis

In this project I analyze transactions from five different ATMs. I examine the relationships among the variables and produce basic visualizations. The data set comes from the Kaggle website.

License: GNU General Public License v3.0

Jupyter Notebook 100.00%
analyis buisness data-science ploty python seaborn visualization

ATM-data-transactions---analysis

In this project I analyze transactions from five different ATMs. I examine the relationships among the variables and produce basic visualizations.

Description of the data set:

The data comes from the Kaggle website: https://www.kaggle.com/datasets/saadfareed/data-set-of-atm-transaction-of-xyz-bank It has 10 columns and 11589 rows. It contains ATM transactions made with XYZ cards and non-XYZ cards from 2011 to 2016, the value of these transactions, and the day of the week on which each withdrawal was made. The transactions come from five ATMs. The ATM locations did not change between 2011 and 2016.

The dataset has three categorical variables (atm_name, weekday, working_day) and six numerical variables. No_Of_Withdrawals is the sum of no_of_XYZ_card_withdrawals and no_of_other_card_withdrawals, and total_amount_withdrawn is the sum of amount_withdrawn_XYZ_card and amount_withdrawn_other_card. There are also dates, which can be handled as date objects by importing the datetime module.
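
The two sum relationships described above can be checked directly. The following is a minimal sketch, assuming pandas and the column names as written in this README; the rows are made up for illustration and are not taken from the Kaggle file.

```python
import pandas as pd

# Made-up rows for illustration only; column names follow the description above.
df = pd.DataFrame({
    "atm_name": ["Airport ATM", "KK Nagar ATM"],
    "No_Of_Withdrawals": [120, 95],
    "no_of_XYZ_card_withdrawals": [70, 50],
    "no_of_other_card_withdrawals": [50, 45],
    "total_amount_withdrawn": [600000, 450000],
    "amount_withdrawn_XYZ_card": [350000, 250000],
    "amount_withdrawn_other_card": [250000, 200000],
})

# Does the total count equal the sum of the two card-type counts in every row?
counts_ok = (
    df["no_of_XYZ_card_withdrawals"] + df["no_of_other_card_withdrawals"]
    == df["No_Of_Withdrawals"]
).all()

# Does the total amount equal the sum of the two card-type amounts in every row?
amounts_ok = (
    df["amount_withdrawn_XYZ_card"] + df["amount_withdrawn_other_card"]
    == df["total_amount_withdrawn"]
).all()

print(counts_ok, amounts_ok)
```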

Cleaning the data (checking whether it is necessary): There are no missing values and no duplicates in the dataset.
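
A check of this kind might look like the sketch below. The helper name data_quality_report is hypothetical, and the sample frame deliberately contains one missing value and one duplicate row so that both counters are visible; on the real dataset both would be zero, as stated above.

```python
import pandas as pd

def data_quality_report(df: pd.DataFrame) -> dict:
    """Count missing cells and fully duplicated rows in a DataFrame."""
    return {
        "missing_values": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }

# Made-up sample: rows 0 and 1 are identical, row 2 has a missing value.
sample = pd.DataFrame({
    "atm_name": ["Airport ATM", "Airport ATM", "KK Nagar ATM"],
    "No_Of_Withdrawals": [120, 120, None],
})

report = data_quality_report(sample)
print(report)
```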

Time series analysis of transaction numbers and amounts:

The two time series charts show that the total number of withdrawals had an increasing tendency between 2012 and 2014. Transactions with non-XYZ cards are slightly higher in this period, but the increasing tendency is visible mainly in the number of XYZ card transactions. A question for the business units: what happened between 2012 and 2014 that led to more and more XYZ withdrawals? The locations of these ATMs did not change. Was there, for example, a special promotion for cash withdrawals with the XYZ card? The total also shows a decreasing tendency starting in 2015. What happened at the end of November 2016? Was it the effect of a business decision? We can see a steep drop in total transactions. After this, withdrawals with non-XYZ cards show an increasing tendency and, for the first time since 2011, almost match the number of XYZ card withdrawals.

About the amount of transactions: the trend is the same as for the number of transactions. There is an increasing tendency between 2012 and 2014, followed by a decreasing trend until the end of 2016. After the steep drop in November 2016, an increasing tendency appears immediately, particularly for non-XYZ cards. The amount withdrawn with XYZ cards stays above the amount withdrawn with non-XYZ cards until 2016, after which the tendency changes. As expected, there is a linear correlation between the number of transactions and the amount of cash withdrawn.
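
Trends like these are easier to read once the daily rows are aggregated per month. The sketch below shows one way to do that with pandas; the column name transaction_date is an assumption (the README only says the dataset contains dates), and the rows are made up.

```python
import pandas as pd

# Made-up rows; "transaction_date" is an assumed column name.
df = pd.DataFrame({
    "transaction_date": ["2012-01-01", "2012-01-02", "2012-02-01"],
    "no_of_XYZ_card_withdrawals": [70, 60, 80],
    "no_of_other_card_withdrawals": [50, 40, 45],
})
df["transaction_date"] = pd.to_datetime(df["transaction_date"])

# Sum the withdrawals of each card type per calendar month.
monthly = df.groupby(df["transaction_date"].dt.to_period("M"))[
    ["no_of_XYZ_card_withdrawals", "no_of_other_card_withdrawals"]
].sum()

print(monthly)
```

Calling monthly.plot() on the result would draw one line per card type, which is roughly the kind of chart discussed above.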

Checking categorical data:

The categorical data are atm_name (five different ATMs), working_day (H, W) and weekday (Monday to Sunday). All transactions come from five different ATMs. Almost half of the transactions take place on the weekend; workdays (Mon-Fri) account for the remaining 57.13%. Sunday is the busiest day for cash withdrawals at all five locations. This information might be useful for scheduling service visits to refill the machines with cash. It also means that the machines work as hard on the weekend as during the week. With this in mind, we could check the data on when these ATMs are out of order, to prevent them from being unavailable, especially on the weekend. What is the cash level inside the machines before the weekend? With more data we could analyze whether cash shortages occur during the weekend and whether all these ATMs are usually in service.
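
The weekday shares mentioned above can be computed with a simple group-by. This is a sketch on made-up numbers; the spelling of the weekday values ("Monday", "Saturday", ...) is an assumption about how the dataset encodes them.

```python
import pandas as pd

# Made-up rows; the weekday spelling is an assumption.
df = pd.DataFrame({
    "weekday": ["Monday", "Saturday", "Sunday", "Friday"],
    "No_Of_Withdrawals": [100, 120, 180, 100],
})

# Total withdrawals per weekday and each day's share in percent.
by_day = df.groupby("weekday")["No_Of_Withdrawals"].sum()
share_by_day = by_day / by_day.sum() * 100

# Share of withdrawals that fall on Saturday or Sunday.
is_weekend = df["weekday"].isin(["Saturday", "Sunday"])
weekend_share = (
    df.loc[is_weekend, "No_Of_Withdrawals"].sum()
    / df["No_Of_Withdrawals"].sum()
    * 100
)

busiest_day = by_day.idxmax()
print(share_by_day, weekend_share, busiest_day)
```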

Checking numerical variables:

From the regression plot, the number of transactions seems to have a direct relationship with the transaction amount. From a hexbin plot, I can see that many samples (the peaks) have about 90-150 transactions and, correspondingly, a total amount in the range of around 500000-800000. From the KDE plot, I can see that once the number of transactions exceeds about 100, the amounts vary more as the number of transactions grows.
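
The line drawn by a regression plot is an ordinary least-squares fit. As a rough illustration, the sketch below fits such a line with NumPy on made-up, exactly linear data; it is not the notebook's own code, and the numbers are not taken from the dataset.

```python
import numpy as np

# Made-up data that lies exactly on a line: amount = 5000 * count + 20000.
counts = np.array([90.0, 110.0, 130.0, 150.0])
amounts = 5000.0 * counts + 20000.0

# Degree-1 polynomial fit; polyfit returns coefficients from highest degree down.
slope, intercept = np.polyfit(counts, amounts, 1)

# Predicted amount for a day with 100 withdrawals.
predicted = slope * 100 + intercept
print(slope, intercept, predicted)
```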

The number of XYZ card transactions is slightly more closely related to the XYZ card amount than is the case for non-XYZ cards (the points stick closer to the regression line).

The 'BIG Street' ATM and the Airport ATM show a direct relationship and few outliers; however, the situation is slightly different for the KK Nagar ATM, where the transaction amount varies more at larger transaction numbers. The Mount Road ATM has more outliers on weekends. The KK Nagar and Christ College ATMs have them on both weekends and weekdays.

The heatmap shows a strong linear relationship between the number of transactions and the transaction amount. The Pearson correlation coefficient (0.916) indicates a strong positive relationship.
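
A correlation matrix of the kind a heatmap displays can be obtained with DataFrame.corr. The sketch below uses made-up numbers, so its coefficient will differ from the 0.916 reported above.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "No_Of_Withdrawals": [90, 110, 130, 150],
    "total_amount_withdrawn": [500000, 580000, 700000, 800000],
})

# Pairwise Pearson correlation between all numeric columns.
corr_matrix = df.corr(method="pearson")
r = corr_matrix.loc["No_Of_Withdrawals", "total_amount_withdrawn"]

print(corr_matrix)
```

Passing corr_matrix to seaborn.heatmap would render it as a colored grid.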

The box plot shows the presence of outliers in both the number and the amount of transactions. Outliers make up 3.67% of all numeric values.
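
The README does not say how the outliers were counted. One common convention, and the one box plots use for their whiskers by default, is the 1.5 x IQR rule; the sketch below applies it to made-up values. The helper name iqr_outlier_mask is hypothetical.

```python
import pandas as pd

def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values lying more than k * IQR outside the first or third quartile."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Made-up withdrawal counts with one obviously extreme value.
withdrawals = pd.Series([100, 105, 110, 115, 120, 125, 400])

mask = iqr_outlier_mask(withdrawals)
outlier_share = mask.mean() * 100  # percentage of values flagged as outliers

print(withdrawals[mask].tolist(), outlier_share)
```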
