Name of QuantLet : Quantlet_Extraction_Evaluation_Visualisation

Published in : ''

Description : 'Extracts, grades and clusters the Quantlets in the GitHub organization Quantlet using the classes in modules/QUANTLET.py and modules/METAFILE.py. With this program you can extract, update and save the data, model topics with Latent Semantic Analysis (LSA), compute different clusterings and visualize them with t-distributed Stochastic Neighbor Embedding (t-SNE).'

Keywords : Text analysis, LSA, t-SNE, clustering, kmeans clustering, spectral clustering, visualisation

See also : ''

Author : Marius Sterling

Submitted : September 18 2018 by Marius Sterling

Example : 

Picture1

PYTHON Code

from modules.QUANTLET import QUANTLET
import os

filename = 'data_file_'
github_token = None
USER = 'Quantlet'
# Create the folders in the list if they do not exist yet
for i in ['data']:
    os.makedirs(i, exist_ok=True)

# Look for already saved files; if there are none, download all data and save it
f = sorted([i for i in os.listdir('data') if 'json' in i and filename in i])
if not f:
    q = QUANTLET(github_token=github_token, user=USER)
    q.download_metafiles_from_user()
    name = 'data/' + filename
    name += q.get_last_commit().strftime('%Y%m%d')
    name += '.json'
    q.save(name)
else:
    q = QUANTLET.load('data/' + f[-1])

# Update all existing metafiles in q
q.update_existing_metafiles()

# Update all existing metafiles and search for new Quantlets
q.update_all_metafiles(since=q.last_full_check)

# Save the updated data
name = 'data/' + filename
name += q.get_last_commit().strftime('%Y%m%d')
name += '.json'
q.save(name)


# Show the poorly graded Quantlets (grades C, D or F)
grades = q.grading()
print(grades.loc[grades['q_quali'].isin(['C', 'D', 'F'])])

# Extract corpus and dictionary, build the document term matrix dtm
# and apply tf-idf weighting to the corpus
c, d     = q.get_corpus_dictionary()
dtm      = q.get_document_term_matrix(corpus=c, dictionary=d)
c_tfidf  = q.get_corpus_tfidf(c, d)

# Fit the LSA model on the tf-idf corpus and extract the document topic matrix X
lsa      = q.lsa_model(corpus=c_tfidf, dictionary=d, num_topics=20)
X        = q.get_lsa_matrix(lsa, corpus=c_tfidf, dictionary=d)

# Cluster the Quantlets into 20 groups with k-means
cl, _    = q.cl_kmeans(X=X, n_clusters=20)

# Name the clusters via topic_labels and plot the clustering with t-SNE
named_cl = q.topic_labels(cl=cl, document_topic_matrix=X, lsa=lsa, top_n=4)
q.tsne(X, named_cl, n_iter=2500, save_directory='', save_ending='kmeans', file_type='png')
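
To make the text-analysis steps above more concrete, here is a minimal, dependency-free sketch of tf-idf weighting followed by k-means clustering on a toy corpus. It is an illustration under stated assumptions, not the implementation in modules/QUANTLET.py: the helper names `tfidf_vectors` and `kmeans`, the toy documents and the log2 idf formula are all hypothetical, and the LSA projection and the t-SNE plot are skipped entirely.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised tf-idf vectors) for tokenised docs."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: raw term count times log2(N / df)
        vec = [tf[t] * math.log2(n_docs / df[t]) for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def kmeans(points, centers, n_iter=10):
    """Plain Lloyd iterations from the given initial centers; returns labels."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in points
        ]
        # Update step: mean of the members; an empty cluster keeps its center
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy corpus (hypothetical): two finance-like and two statistics-like texts
docs = [
    "option pricing volatility smile".split(),
    "volatility surface option hedging".split(),
    "kernel density estimation bandwidth".split(),
    "density estimation histogram bandwidth".split(),
]
vocab, X = tfidf_vectors(docs)
# Initial centers are fixed to documents 0 and 2 to keep the run deterministic
labels = kmeans(X, centers=[list(X[0]), list(X[2])])
print(labels)
```

In the script above, the corresponding steps are handled by `q.get_corpus_tfidf` and `q.cl_kmeans`, with the LSA document topic matrix `X` used as clustering input instead of the raw tf-idf vectors.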
