TUTORIAL: How to use DistComp

Author: OR KOREN [email protected]


Installation

pip install git+ssh://[email protected]/or-koren/distcomp.git

Quick Start

Create a dummy DataFrame for the example:
import numpy as np
import pandas as pd

def DummyDF(n=1):
    df = pd.DataFrame(index=range(n))
    # two normally distributed features
    df['feature_A'] = np.random.normal(loc=0, scale=1, size=n)
    df['feature_B'] = np.random.normal(loc=0.5, scale=1.2, size=n)

    # about 80% 'test' rows followed by the remaining 'ctrl' rows;
    # deriving the ctrl count from n keeps the labels the same length as df
    n_test = int(0.8 * n)
    df['treatment'] = ['test'] * n_test + ['ctrl'] * (n - n_test)

    # Feature_C takes test rows from feature_A and ctrl rows from feature_B,
    # so the two groups come from different distributions
    df['Feature_C'] = np.where(df['treatment'] == 'test', df['feature_A'], df['feature_B'])

    return df

df = DummyDF(n=50000)
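
Before comparing distributions, it can help to confirm that the dummy data really has the intended shape: an 80/20 test/control split, and a Feature_C whose two groups are drawn from different normal distributions. The sketch below rebuilds a comparable frame under a fixed seed and summarises Feature_C per group. The names `make_dummy_df`, `test_share` and `seed` are illustrative assumptions and are not part of distcomp.

```python
import numpy as np
import pandas as pd


def make_dummy_df(n: int = 1000, test_share: float = 0.8, seed: int = 0) -> pd.DataFrame:
    """Build a reproducible dummy frame (hypothetical helper, not part of distcomp)."""
    rng = np.random.default_rng(seed)
    n_test = int(test_share * n)
    df = pd.DataFrame({
        "feature_A": rng.normal(loc=0.0, scale=1.0, size=n),
        "feature_B": rng.normal(loc=0.5, scale=1.2, size=n),
        # the ctrl count is derived from n so the labels always match the row count
        "treatment": ["test"] * n_test + ["ctrl"] * (n - n_test),
    })
    # test rows come from feature_A, ctrl rows from feature_B
    df["Feature_C"] = np.where(df["treatment"] == "test", df["feature_A"], df["feature_B"])
    return df


# Per-group row count, mean and standard deviation of Feature_C
summary = make_dummy_df(n=50_000).groupby("treatment")["Feature_C"].agg(["count", "mean", "std"])
print(summary)
```

If the construction is right, the ctrl group should show a mean near 0.5 and a standard deviation near 1.2, while the test group should sit near 0 and 1.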

Usage, once you have a DataFrame with feature columns and an optional treatment column (e.g. a test/control column):
from distcomp import comparing_distributions

# instantiating
cd=comparing_distributions(df,
                           features=['feature_A','feature_B','Feature_C'],
                           treatment='treatment',
                           sample_frac=0.8,
                           remove_outliers_quintiles=[0.01,0.99])
                           

# methods:

cd.PlotECDF(figsize=(10,4))

# histnorm accepts 'percent', 'probability', 'density', 'probability density' or None (barmode='overlay')
cd.PlotHist(histnorm='probability density',bins=None,)

cd.KS_Test(pval=0.1,ks_alternative='two-sided',ks_mode='auto')

cd.ttest(test_group_name='test',pval=0.05)

cd.create_table_one(test_group_name='test')
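
For readers who want to see what a two-sample comparison of this kind involves, here is a minimal sketch using SciPy directly. It is not distcomp's implementation. It assumes that `remove_outliers_quintiles` trims each group to a quantile range and that `KS_Test` and `ttest` correspond to a two-sample Kolmogorov-Smirnov test and a t-test between the test and control groups. `trim_by_quantiles` and `compare_groups` are hypothetical helper names, and the choice of Welch's t-test (`equal_var=False`) is also an assumption.

```python
import numpy as np
import pandas as pd
from scipy import stats


def trim_by_quantiles(values: pd.Series, lower: float = 0.01, upper: float = 0.99) -> pd.Series:
    """Keep only values inside the [lower, upper] quantile range (hypothetical helper)."""
    lo, hi = values.quantile([lower, upper])
    return values[(values >= lo) & (values <= hi)]


def compare_groups(df, feature, treatment="treatment", test_group_name="test",
                   quantiles=(0.01, 0.99)):
    """Run a two-sample KS test and Welch's t-test on one feature (illustrative only)."""
    is_test = df[treatment] == test_group_name
    test_values = trim_by_quantiles(df.loc[is_test, feature], *quantiles)
    ctrl_values = trim_by_quantiles(df.loc[~is_test, feature], *quantiles)

    # KS compares the whole empirical distributions; the t-test compares means only
    ks = stats.ks_2samp(test_values, ctrl_values, alternative="two-sided")
    tt = stats.ttest_ind(test_values, ctrl_values, equal_var=False)
    return {
        "ks_statistic": float(ks.statistic),
        "ks_pvalue": float(ks.pvalue),
        "t_statistic": float(tt.statistic),
        "t_pvalue": float(tt.pvalue),
    }


# Demo data: the two groups are drawn from different normal distributions
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "treatment": ["test"] * 4000 + ["ctrl"] * 1000,
    "shifted": np.concatenate([rng.normal(0.0, 1.0, 4000), rng.normal(0.5, 1.2, 1000)]),
})
result = compare_groups(demo, "shifted")
print(result)
```

A p-value below the chosen threshold (such as the `pval=0.1` and `pval=0.05` arguments above) suggests that the two groups differ on that feature.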
