alexianomena / correspondence_analysis_free_use

Correspondence Analysis with python

Python 99.92% Shell 0.08%
correspondence-analysis contingency-table factor-analysis chi-square-statistics p-value python pandas seaborn reciprocal-averaging markov-chain-data-analysis

Free Correspondence Analysis Python Software

Suitable for users from any discipline

Description

Perform standard correspondence analysis of two categorical variables (code module ca.py in the folder Methods/).

The code can be used to perform correspondence analysis on any dataset that can be transformed into a pandas DataFrame (see the code ca.py in the folder Methods/).
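As a rough illustration of what a standard correspondence analysis computes, here is a minimal sketch based on the singular value decomposition of the standardized residuals. It is not the code in ca.py; the function name correspondence_analysis, the column names text and form, and the sample data are all hypothetical.

```python
import numpy as np
import pandas as pd


def correspondence_analysis(table: pd.DataFrame, n_components: int = 2):
    """Sketch of standard CA via SVD of the standardized residuals."""
    counts = table.to_numpy(dtype=float)
    P = counts / counts.sum()            # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # independence model
    S = (P - expected) / np.sqrt(expected)
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    # Principal coordinates of rows and columns.
    rows = (U[:, :k] * sigma[:k]) / np.sqrt(r)[:, None]
    cols = (Vt.T[:, :k] * sigma[:k]) / np.sqrt(c)[:, None]
    inertia = sigma ** 2                 # principal inertias
    dims = [f"dim{i + 1}" for i in range(k)]
    return (pd.DataFrame(rows, index=table.index, columns=dims),
            pd.DataFrame(cols, index=table.columns, columns=dims),
            inertia)


# Hypothetical raw data: one row per observation, two categorical variables.
raw = pd.DataFrame({
    "text": ["A", "A", "A", "B", "B", "C", "C", "C", "C", "B"],
    "form": ["x", "y", "x", "y", "z", "z", "z", "x", "z", "y"],
})
table = pd.crosstab(raw["text"], raw["form"])
row_coords, col_coords, inertia = correspondence_analysis(table)
print(row_coords.round(3))
print(col_coords.round(3))
print("total inertia:", round(float(inertia.sum()), 4))
```

The total inertia returned here equals the chi-square statistic of the table divided by the number of observations, which is a quick way to check such a sketch against other tools.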

The module mcmca.py can be used for correspondence analysis of datasets that can be assumed to be generated by a Markov chain model.

Specific Project

Project Ef5-4: "The evolution of Ancient Egyptian - Quantitative and Non-Quantitative Mathematical Linguistics".

Institutions: ZIB (Zuse Institute Berlin) & MATH+ (Berlin Mathematics Research Center).

Software requirements

Python version: 3.7 or later

Packages: numpy, pandas, matplotlib (including matplotlib.pyplot and matplotlib.backends.backend_pdf), scipy (including scipy.stats), seaborn.

You can also install all of these with conda by creating a new environment from the spec file myPy3_spec.txt (for guidance, click here).

Usage requirement

See the official publication (link here).

DOI: https://doi.org/10.12752/8257

Licence: Open Source Apache 2.0

Code Execution

Users with little to no background in Python

Helper.py: performs one correspondence analysis (in this specific project: text vs. grammatical form).

Please enter all the inputs by following the corresponding questions/descriptions.

implementation.py is required to obtain the CA figures.

Users with a moderate background in Python

implementation.py can be used to modify the default figure parameter settings. For further modifications, see the code in the folder Methods/.

Notes for all users

If the dataset is already a contingency table, the parameter isCont must be set to True and the table must be converted to a pandas DataFrame (see the example cHelper.py).
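A minimal sketch of wrapping an existing contingency table in a pandas DataFrame follows. The counts, row labels, and column labels are invented for illustration, and the variable is_cont is only a stand-in for the isCont parameter mentioned above; how that parameter is actually passed is defined in the repository's own code (see cHelper.py), not here.

```python
import pandas as pd

# Hypothetical counts: rows are texts, columns are grammatical forms.
counts = [
    [12, 5, 3],
    [4, 9, 7],
    [6, 2, 11],
]
cont_table = pd.DataFrame(
    counts,
    index=["text_1", "text_2", "text_3"],
    columns=["form_a", "form_b", "form_c"],
)

# Cheap sanity checks before any analysis: CA needs non-negative counts
# and no row or column that sums to zero.
assert (cont_table.to_numpy() >= 0).all(), "counts must be non-negative"
assert (cont_table.sum(axis=1) > 0).all(), "empty row found"
assert (cont_table.sum(axis=0) > 0).all(), "empty column found"

is_cont = True  # stand-in for the isCont parameter named in this README
print(cont_table)
```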

Supported data type (if not a contingency table)

Excel file. In our specific project, the data file contains a numerical coding of texts in Égyptien de Tradition; each data item is a ten-digit number encoding the grammatical structure of a sentence (files can be downloaded here).

You can also use your own Python function to clean your dataset instead of the function Cleaned_Data in implementation.py line 9.
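A custom cleaning function might look like the sketch below. It does not reproduce Cleaned_Data, whose signature and rules are not shown here; the function name clean_ten_digit_codes, the column names text and code, and the filtering rule (keep only entries that are exactly ten digits) are assumptions made for illustration.

```python
import pandas as pd


def clean_ten_digit_codes(df: pd.DataFrame, code_col: str = "code") -> pd.DataFrame:
    """Keep only rows whose code is exactly ten digits (hypothetical rule)."""
    codes = df[code_col].astype(str).str.strip()
    mask = codes.str.match(r"^\d{10}$")
    cleaned = df.loc[mask].copy()
    cleaned[code_col] = codes[mask]      # store the stripped string form
    return cleaned.reset_index(drop=True)


# In practice the frame could come from pd.read_excel on your own file;
# a small in-memory frame is used here so the sketch runs without one.
raw = pd.DataFrame({
    "text": ["T1", "T1", "T2", "T2"],
    "code": ["1234567890", "12345", " 0987654321 ", "abcdefghij"],
})
cleaned = clean_ten_digit_codes(raw)
print(cleaned)
```

Codes are kept as strings in this sketch so that leading zeros survive; reading them from Excel as integers would drop those zeros.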

Results

Figures/ folder is the default location of figure outputs.

Sample Figures

Click here for a higher resolution

Standard CA figure and a few statistics

Visualising the usual correspondence analysis results

Association clustermap

Visualising the strength of the association between the variables

Identify similar clusters (similarity in the strength of the associations)
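One common way to quantify the strength of association in each cell is the standardized (Pearson) residual. The sketch below computes it with numpy and pandas; whether the repository's clustermap uses this exact measure is an assumption, and the table values and labels are invented.

```python
import numpy as np
import pandas as pd


def standardized_residuals(table: pd.DataFrame) -> pd.DataFrame:
    """Pearson residuals: (observed - expected) / sqrt(expected)."""
    observed = table.to_numpy(dtype=float)
    n = observed.sum()
    # Expected counts under independence of rows and columns.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    resid = (observed - expected) / np.sqrt(expected)
    return pd.DataFrame(resid, index=table.index, columns=table.columns)


# Hypothetical contingency table.
table = pd.DataFrame(
    [[12, 5, 3], [4, 9, 7], [6, 2, 11]],
    index=["text_1", "text_2", "text_3"],
    columns=["form_a", "form_b", "form_c"],
)
resid = standardized_residuals(table)
chi2 = float((resid ** 2).to_numpy().sum())  # squared residuals sum to chi-square
print(resid.round(2))
print("chi-square:", round(chi2, 3))
```

Positive residuals mark cells that occur more often than independence would predict, negative ones less often. A DataFrame like resid could then be passed to seaborn.clustermap to group rows and columns with similar association patterns.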

Variable clustermap

Identify similar clusters of variables (chi-square similarity)
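The chi-square distance between row profiles is the usual notion of similarity in correspondence analysis, so a sketch of it is given here. The exact similarity used for the variable clustermap in this repository may differ; the function name chi_square_distances and the sample table are hypothetical.

```python
import numpy as np
import pandas as pd


def chi_square_distances(table: pd.DataFrame) -> pd.DataFrame:
    """Pairwise chi-square distances between the row profiles of a table."""
    counts = table.to_numpy(dtype=float)
    profiles = counts / counts.sum(axis=1, keepdims=True)  # row profiles
    col_mass = counts.sum(axis=0) / counts.sum()           # column masses
    # Weight each coordinate by 1/sqrt(column mass), then use Euclidean distance.
    scaled = profiles / np.sqrt(col_mass)
    diff = scaled[:, None, :] - scaled[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return pd.DataFrame(dist, index=table.index, columns=table.index)


# Hypothetical table; text_4 has the same profile as text_1 (counts doubled).
table = pd.DataFrame(
    [[12, 5, 3], [4, 9, 7], [6, 2, 11], [24, 10, 6]],
    index=["text_1", "text_2", "text_3", "text_4"],
    columns=["form_a", "form_b", "form_c"],
)
dist = chi_square_distances(table)
print(dist.round(3))
```

Rows with proportional counts, such as text_1 and text_4 here, end up at distance zero, which is why they would fall into the same cluster.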
