zark-stark12 / crm-rfm-modeling

RFM modeling in Python

License: GNU General Public License v3.0

Python 100.00%

rfm crm python pandas rfm-modeling

crm-rfm-modeling's Introduction

CRM-RFM Modeling v1.0.4

pip install crm-rfm-modeling

This package is intended for CRM analysis: it scores a CRM dataset using the well-known RFM method. The methodology is derived from the following paper: https://link.springer.com/article/10.1057/palgrave.jdm.3240019

from crm_rfm_modeling import rfm
from crm_rfm_modeling.rfm import RFM

model = RFM()
#or
model = rfm.RFM()

RFM (Recency - Frequency - Monetary):

The RFM package models and scores CRM data with the following scoring methods: Quintile Scoring, Mean Scoring, and Median Scoring. It also allows custom scoring for each of the Recency, Frequency, and Monetary variables. How the data should be formatted before fitting the model depends on the dataset type, as listed below:

Customer/User Level CRM Dataset:
- The dataset should use the Customer IDs to score against as its index. The Recency, Frequency, and Monetary columns should be labeled exactly that way so that the variables and customers are scored correctly.
Transactional CRM Dataset:
- The dataset should have a Customer ID for each transaction, a date that can be interpreted by Python's datetime package, and a column holding the value of the transaction. The columns should appear in that order. There is no need to set the Customer IDs as the index; it is set automatically during scoring.
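
To make the two layouts concrete, here is a minimal pandas sketch that builds a small transactional table and collapses it into a customer-level one. The column names `customer_id`, `date`, and `amount`, the sample values, the reference date, and the choice to sum amounts for Monetary are all assumptions made for illustration; the package may aggregate differently.

```python
import pandas as pd

# Hypothetical transactional data: one row per purchase, with the columns in
# the order described above (customer ID, date, transaction value).
transactions = pd.DataFrame(
    {
        "customer_id": ["a", "b", "a", "c", "b"],
        "date": pd.to_datetime(
            ["2020-01-05", "2020-01-20", "2020-02-10", "2020-02-15", "2020-02-28"]
        ),
        "amount": [8.1, 4.0, 3.0, 9.7, 10.0],
    }
)

# Reference date for recency; an arbitrary choice for this example.
end_date = pd.Timestamp("2020-02-29")

# Collapse to one row per customer, indexed by customer ID, with the three
# column labels expected for a customer-level dataset.
customers = transactions.groupby("customer_id").agg(
    Recency=("date", lambda d: (end_date - d.max()).days),  # days since last purchase
    Frequency=("date", "count"),                            # number of purchases
    Monetary=("amount", "sum"),                             # total spend (assumed)
)

print(customers)
```

Either table could then be handed to `fit()` with the matching dataset type.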

crm-rfm-modeling's Issues

Updating to 1.0.3 broken?

I was using version 1.0.1 for quite a while, then noticed today that there was a 1.0.3 version.

I updated via pip, and now the import doesn't work.

The .py files for this package are also nowhere to be found on my computer. The only reference to your package is in anaconda3\lib\site-packages\crm_rfm_modeling-1.0.3.dist-info\. Can't find any actual .py files like I could with version 1.0.1.

I then uninstalled the package entirely and simply pip installed version 1.0.3 rather than updating. This didn't work either.

I reverted back to 1.0.1 and everything works fine, .py files are present on my system.
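
One way to check whether an installed release actually shipped any modules is to ask the standard library's `importlib.metadata` (Python 3.8+) which files were recorded for the distribution. This is only a diagnostic sketch; the helper name `installed_python_files` is made up for this example.

```python
from importlib import metadata


def installed_python_files(dist_name):
    """Return the .py files recorded for an installed distribution."""
    try:
        files = metadata.files(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    # metadata.files() returns None when no file listing is available.
    return [str(f) for f in (files or []) if str(f).endswith(".py")]


# An empty list for an installed release would match the symptom described
# above: only the .dist-info directory is present.
print(installed_python_files("crm-rfm-modeling"))
```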

Many NANs in fitted data?

My data:

print(transactions.head())

                                           iban_hash local_date  amount
0  0a00c7432fb237806ee8c073c620f56edf8c5b42936657... 2016-09-30     8.1
1  97287183cee5705d3c2ab6b2ccd207fbeae1c86bf88a0c... 2016-10-02     4.0
2  1b4645137726e017339bc5b10de23eb1138bd9ee15a76d... 2016-10-03     3.0
3  f9b50094d36c3fbcd4e912d7b0ded2f74b16243dd2ee3c... 2016-10-03     9.7
4  0a00c7432fb237806ee8c073c620f56edf8c5b42936657... 2016-10-04    10.0

Dtypes:

transactions.dtypes

iban_hash             object
local_date    datetime64[ns]
amount               float64

Running the model:

from crm_rfm_modeling import rfm

# set up RFM instance
myrfm = rfm.RFM()

# apply scoring to dataset
myrfm.fit(data=transactions, dataset_type='transactional', scoring_method='Median', recency_end_date='02/29/2020')

Output of myrfm.get_fitted_data()

I have just upgraded to version 1.0.4. I don't remember seeing NaNs in previous versions, even when the relative values were low; e.g., somebody with a monetary value of $1 would still get a low score rather than NaN.

I did have to edit one datetime object to get the package to work (see pull request). Could this have caused the problem? I can't revert my change because then the package doesn't work at all. :)

TypeError: descriptor '__sub__' requires a 'datetime.datetime' object but received a 'str'

In line 16, the user-specified end date was not converted to a datetime object, which raised the error above. I fixed it manually by wrapping recency_end_date in datetime.strptime(recency_end_date, '%m/%d/%y').

From the rfm_utils.py file:

Before:

def convert_transaction_to_user(df, recency_end_date=None):
    id_col = df.columns[0]
    date_col = df.columns[1]
    transaction_col = df.columns[2]
    df[date_col] = pd.to_datetime(df[date_col])

    if recency_end_date is not None:
        end_date = recency_end_date
    elif recency_end_date is None:
        end_date = datetime.today()

After:

def convert_transaction_to_user(df, recency_end_date=None):
    id_col = df.columns[0]
    date_col = df.columns[1]
    transaction_col = df.columns[2]
    df[date_col] = pd.to_datetime(df[date_col])

    if recency_end_date is not None:
        end_date = datetime.strptime(recency_end_date, '%m/%d/%y')
    elif recency_end_date is None:
        end_date = datetime.today()
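
A slightly more defensive variant of the same idea is sketched below. It is not the package's code: the helper name `parse_end_date` and its default format are assumptions. Note that `%y` matches a two-digit year, so a string with a four-digit year such as '02/29/2020' needs `%Y`; with `%y`, `strptime` raises a ValueError because of the unconverted trailing digits.

```python
from datetime import datetime


def parse_end_date(value, fmt="%m/%d/%Y"):
    """Normalize an end date given as None, a datetime, or a string."""
    if value is None:
        # Same fallback as the snippet above: use the current time.
        return datetime.today()
    if isinstance(value, datetime):
        # Already a datetime, so it can be subtracted from dates directly.
        return value
    # Strings are parsed with an explicit format; %Y expects a 4-digit year.
    return datetime.strptime(value, fmt)


print(parse_end_date("02/29/2020"))
```

Accepting both strings and datetime objects means callers that already hold a datetime do not have to format it back into a string first.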

Feedback

Feedback from anyone using this package is welcome. I wrote it to automate an RFM modeling process I ran regularly, because I could not find an existing Python package for it. If you have feedback or run into any issues, feel free to reach out.

Accepted **kwargs.. how to use them?

            "recency_scoring_method","frequency_scoring_method", and "monetary_scoring_method";
            Set the value of these **kwargs to the following defined scoring methods listed above.

These are part of the fit() method. What are these for and how can we use them?
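
Going only by the docstring quoted above, each of the three keyword arguments appears to override the scoring method for one variable. The sketch below shows how such overrides could be resolved; it is a hypothetical illustration, not the package's implementation, and the function name `resolve_scoring_methods` and the 'Quintile' default are assumptions.

```python
VALID_METHODS = {"Quintile", "Mean", "Median"}


def resolve_scoring_methods(scoring_method="Quintile", **kwargs):
    """Pick a scoring method per variable, letting kwargs override the default."""
    resolved = {}
    for variable in ("recency", "frequency", "monetary"):
        # Fall back to the overall scoring_method when no override is given.
        method = kwargs.get(f"{variable}_scoring_method", scoring_method)
        if method not in VALID_METHODS:
            raise ValueError(f"Unknown scoring method for {variable}: {method!r}")
        resolved[variable] = method
    return resolved


print(resolve_scoring_methods("Median", recency_scoring_method="Quintile"))
# → {'recency': 'Quintile', 'frequency': 'Median', 'monetary': 'Median'}
```

Under that reading, a call such as fit(..., scoring_method='Median', recency_scoring_method='Quintile') would score Recency by quintile and the other two variables by median.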

Set custom 'now' time.

Using today's date as 'now' is useful for data from today or yesterday, but I am currently working with historical data. Also, the example .csv you provide is from 5 years ago.

It would be useful to have an optional argument that sets 'now' to, for example, the latest date in the dataframe rather than today, to make test results easier to interpret.
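
As a rough sketch of the requested behaviour, recency can be measured against the latest date in the data rather than today. The column names follow the `iban_hash` / `local_date` layout used in another issue on this page; the IDs and dates are made up.

```python
import pandas as pd

# Hypothetical historical transactions.
transactions = pd.DataFrame(
    {
        "iban_hash": ["a", "b", "c", "d", "a"],
        "local_date": pd.to_datetime(
            ["2016-09-30", "2016-10-02", "2016-10-03", "2016-10-03", "2016-10-04"]
        ),
    }
)

# Treat the most recent transaction in the data as "now".
latest = transactions["local_date"].max()

# Days between "now" and each customer's last transaction.
last_seen = transactions.groupby("iban_hash")["local_date"].max()
recency_days = (latest - last_seen).dt.days

# The same date as an MM/DD/YYYY string, in case an end date has to be
# supplied in that form.
end_date_str = latest.strftime("%m/%d/%Y")

print(recency_days)
print(end_date_str)
```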

Thanks!

Different Scoring scale in quintile?

I was comparing three different scoring methods: mean, median, and quintile.

I noticed that somebody who has a recency of 381.0 gets a score of 1 with scoring_method='Mean' and a score of 5 with scoring_method='Quintile'.
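
In RFM, a smaller recency (a more recent purchase) is conventionally the better outcome, so recency is usually scored in the opposite direction from frequency and monetary value. The discrepancy above would be consistent with one method reversing the scale and the other not, though that is not confirmed for this package. The sketch below is a generic quintile scorer with an explicit direction flag; `quintile_scores` and `higher_is_better` are illustrative names, not part of the package.

```python
from bisect import bisect_right


def quintile_scores(values, higher_is_better=True):
    """Score each value from 1 to 5 by its rank-based quintile."""
    ordered = sorted(values)
    n = len(ordered)
    scores = []
    for v in values:
        k = bisect_right(ordered, v)   # number of values <= v
        score = -(-5 * k // n)         # ceil(5 * k / n), always in 1..5
        # Flip the scale when a smaller value is the better outcome.
        scores.append(score if higher_is_better else 6 - score)
    return scores


recency = [10, 50, 120, 200, 381]
print(quintile_scores(recency))                          # → [1, 2, 3, 4, 5]
print(quintile_scores(recency, higher_is_better=False))  # → [5, 4, 3, 2, 1]
```

With the flag set to False, the customer with a recency of 381 days lands in the lowest score band, which is what the usual RFM convention would give.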

How to disable weights?

I can see that weights are pre-set and must sum to 1:

    def __init__(self,weights=(0.2,0.2,0.6)):
        if sum(weights) == 1 and len(weights) == 3:
            self.weights = weights
        else:
            raise ValueError

What if I don't want to weight the scores?
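
Since the constructor insists on three weights that sum to 1, the closest thing to "no weighting" is probably three equal weights. The standalone sketch below shows that idea; it is not the package's scoring code, and `weighted_rfm_score` is a hypothetical helper. It uses `math.isclose` instead of `==` because exact float equality can reject weights that sum to 1 only approximately.

```python
import math


def weighted_rfm_score(r, f, m, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine R, F, and M scores into one weighted score."""
    # Tolerant check so rounding error in the weights does not raise.
    if len(weights) != 3 or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be three numbers that sum to 1")
    w_r, w_f, w_m = weights
    return w_r * r + w_f * f + w_m * m


# Equal weights reduce the combined score to a plain average of the three.
print(weighted_rfm_score(5, 3, 1))

# The default weights from the constructor above emphasize Monetary.
print(weighted_rfm_score(5, 3, 1, weights=(0.2, 0.2, 0.6)))
```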
