Coder Social home page Coder Social logo

latentstructure's Introduction

This repository contains code and relevant files for the Latent Structure
modeling project described in:

Poldrack RA, Mumford JA, Schonberg T, Kalar D, Barman B, Yarkoni T (2012). Discovering relations between mind, brain, and mental disorders using topic mapping. PLOS Computational Biology, in press.

Preprint available at:

http://talyarkoni.org/papers/Poldrack_et_al_in_press_PLoS_CompBio.pdf

Guide to subdirectories:
CCA - contains results from CCA analysis

NIF-Disorders - contains files used for disorders topic modeling

clustering - contains files used for clustering of disorders

cogatlas - contains files used for cogatlas topic modeling

src: contains the source files used for all analyses. these require the following
external libraries:

http://numpy.scipy.org/
http://scikit-learn.org/stable/
http://rpy.sourceforge.net/rpy2.html
http://nipy.sourceforge.net/nibabel/

also note that src/utils needs to be in the python path

1_db_foci_to_image.py - obtains coordinates from neurosynth database and creates images
- uses utils/tal_to_mni.py

1.1_merge_peakfiles.sh - creates merged image and computes mask of voxels with activation
on at least 1% of papers

1.2_extract_nidag_text.py - extract full text of each article from database

2_get_cogat_concepts.py - read cognitive atlas concepts from RDF and get loading for each document

3_create_disorder_list.py - load NIF dysfunction ontology, grab synonyms, and add
additional missing terms

4_get_disorders_NIF.py - get loading for each disorder term from corpus

5_mk_cogatlas_docs.py - make documents based on cogatlas loadings

6_mk_disorders_docs.py - make documents based on disorder loadings

7_mk_pickled_data.py - created a pickle with the full dataset to make loading easier

8_mk_8fold_cogatlas.py - make files for 8-fold CV

9_mk_8fold_disorders.py - make files for 8-fold CV

10_mk_8fold_cogatlas_mallet_data.py - make mallet data from 8-fold files

11_mk_8fold_disorders_mallet_data.py - make mallet data from 8-fold files

12_mk_8fold_cogatlas_topic_models.py - create scripts to run mallet jobs

13_mk_8fold_disorders_topic_models.py - create scripts to run mallet jobs

14_get_best_topic_likelihood.py - check likelihoods to get get dimenstionality

15.1_get_disorders_dimensionality.py - generate additional topic models to get disorder dimensionality

15.3_get_best_disorder_dimensionality.py - get dimensionality that has unique topic dists

15_run_topic_models.py - generate scripts to run final topic models

16_mk_cogatlas_loadingdata.py - load topic data and save loadingdata.txt

17_mk_disorders_loadingdata.py - load topic data and save loadingdata.txt

18_mk_all_chisq_maps.py - make all chisquare maps - uses utils/mk_chisq_maps_filter.py

18.2_mk_6mm_chisq_maps.sh - make 6mm versions of topic

19_mk_slice_images_cogatlas.py - make slice images using p values

20_mk_slice_images_disorders.py - make slice images using p values

22_mk_latexreport_cogatlas.py - create latex report

23_mk_latexreport_disorders.py - create latex report

24_run_CCA_nonneg.py - run CCA analysis

25_cluster_disorders.R - run clustering on disorders data

26_make_cognitive_topic_figure.py - generate figure 2 from initial submission

27_make_disorders_topic_figure.py - generate figure 3 from initial submission

28_mk_8fold_fulltext_topic_models.py - create scripts to run fulltext topic models

28.1_mk_8fold_fulltext_mallet_data.py - make 8-fold data for full text analysis

28.2_run_all_fulltext_nfold.sh - run mallet jobs on 8-fold full text

28.3_get_best_dimensionality.py - get best dimensionality for full text

30_get_wordhist.py - get histograms of # of docs for each filtered paper

31_count_locations.py - count # of locations reported across all papers

32_doc_topic_hist.py - make histograms of docs/topic and topics/doc (for Figure 3)

33_plot_likelihood.py - plot empirical likelihood as function of # of topics (for Figure 2)

34_run_CCA_on_topics.py - run CCA directly on topic distributions

35_mk_fold1_loadingdata.py - get loadingdata for fold1 for each ntopics

36.1_mk_slice_corr_images.disorders.py - make images using correlation rather than p value

36_mk_slice_corr_images.cogatlas.py - make images using correlation rather than p value (for figure 

37_make_cognitive_topic_corr_figure.py - make Figure 4

38_make_disorder_topic_corr_figure.py - make Figure 6

39_get_topic_hierarchy.py - create graphviz .dot file for topic hierarchy (Figure 5)

latentstructure's People

Contributors

poldrack avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.