This project forked from netneurolab/hansen_genescognition

Code supporting Hansen et al., 2021 "Mapping gene transcription and neurocognition across human neocortex".

Molecular signatures of cognition and affect

This repository contains scripts, functions, and data I used or created in support of my work, "Mapping gene transcription and neurocognition across human neocortex". Find the preprint here, the postprint here, and the article here. All analyses were run on MATLAB version 9.8.0.1359463 (R2020a) Update 1.

Data and scripts are organized into five subfolders. The data that is used in multiple scripts is included in the root folder. This data includes:

  • gene_expression.mat: node x gene matrix of normalized expression levels, created using abagen.
  • label.mat: a gene x 1 list of gene names, which correspond to the genes in gene_expression.mat
  • genes.mat: a struct of the indices that refer to stable genes in terms of three resolutions (34, 57, and 111 left-hemisphere nodes).
  • neurosynth.mat: a node x term matrix of probabilistic measures that a given term is published alongside a given brain region
  • nodes.mat: a struct of the indices that refer to the left hemisphere brain regions
  • result.mat: the original PLS result that is used in all other analyses
  • result34.mat: the PLS result computed on the 34-node parcellation, used in scpt_brainspan.m.
  • spins.mat: a node x 10000 matrix of rotated left hemisphere brain regions used in spin tests
  • terms.mat: a struct of term names for Neurosynth terms and BrainMap terms
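
Before running any of the scripts, it can help to confirm that the shared files listed above are present and to see what each one holds. The following is a minimal sketch, assuming it is run from the repository root. It makes no assumption about the variable names stored inside each file and only reports what it finds.

```matlab
% List the shared .mat files named above and report the variables in each.
% Assumes the current folder is the repository root.
files = {'gene_expression.mat', 'label.mat', 'genes.mat', 'neurosynth.mat', ...
         'nodes.mat', 'result.mat', 'result34.mat', 'spins.mat', 'terms.mat'};

found = false(size(files));
for i = 1:numel(files)
    if exist(files{i}, 'file') == 2
        found(i) = true;
        info = whos('-file', files{i});   % names, sizes, and classes, without loading
        for j = 1:numel(info)
            fprintf('%s: %s [%s] %s\n', files{i}, info(j).name, ...
                    mat2str(info(j).size), info(j).class);
        end
    else
        fprintf('%s: not found in the current folder or on the path\n', files{i});
    end
end
```

The printed sizes can then be checked against the descriptions above, for example that the number of columns in the expression matrix matches the length of the gene name list.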

The Main Analysis

The folder PLS contains the script scpt_genes_cog_pls.m, which performs partial least squares (PLS) analysis on the gene expression and functional activation matrices. It also contains the script scpt_cca.m, which performs canonical correlation analysis on the same matrices and compares the results to the PLS results. The significance of the latent variables is assessed with a permutation test that accounts for spatial autocorrelation. The correlation of PLS-derived scores is cross-validated using the function fcn_crossval_pls_brain_obvs.m, which assigns nodes to training and test sets using a distance-based method to account for spatial autocorrelation. The terms that contribute most to the first latent variable are extracted. Finally, PLS-derived scores are distributed among three network classifications: the intrinsic (resting-state) networks, the Von Economo cytoarchitectonic classes, and the Mesulam classes of laminar differentiation.
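
To make the PLS step and its permutation test more concrete, here is a minimal sketch on synthetic data. It is not the code in scpt_genes_cog_pls.m or pls_analysis.m: it uses the textbook formulation of PLS as a singular value decomposition of the cross-covariance between two column-standardized matrices. All variable names (X, Y, spinsDemo, and so on) are hypothetical, and the matrix sizes are arbitrary. The index matrix spinsDemo only mimics the shape of spins.mat (node x permutation); it is filled with plain random permutations, which, unlike real spins, do not preserve spatial autocorrelation. Permuting the rows of Y rather than X is also an assumption.

```matlab
rng(1);                                    % reproducible synthetic data
nNodes = 40; nGenes = 30; nTerms = 12; nPerm = 200;
X = randn(nNodes, nGenes);                 % stand-in for node x gene expression
Y = randn(nNodes, nTerms);                 % stand-in for node x term activation

% Standardize columns (written out to avoid a toolbox dependency).
zs = @(A) (A - mean(A, 1)) ./ std(A, 0, 1);
Xz = zs(X);
Yz = zs(Y);

% PLS as an SVD of the gene x term cross-covariance matrix.
R = (Xz' * Yz) / (nNodes - 1);
[U, S, V] = svd(R, 'econ');
sv = diag(S);
covExplained = sv.^2 / sum(sv.^2);         % share of covariance per latent variable

% Node-level scores: projections onto the gene and term weights.
geneScores = Xz * U;
termScores = Yz * V;

% Hypothetical stand-in for spins.mat: each column reorders the node indices.
spinsDemo = zeros(nNodes, nPerm);
for k = 1:nPerm
    spinsDemo(:, k) = randperm(nNodes)';
end

% Null distribution of singular values from row-permuted Y.
svNull = zeros(numel(sv), nPerm);
for k = 1:nPerm
    Rk = (Xz' * Yz(spinsDemo(:, k), :)) / (nNodes - 1);
    svNull(:, k) = svd(Rk);
end

% Permutation p-value for each latent variable.
pPerm = (1 + sum(svNull >= sv, 2)) / (1 + nPerm);
```

With the real data, the columns of the rotated index matrix would replace spinsDemo, so that each null sample keeps the spatial structure of the original map.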

The main PLS code (pls_analysis.m) can be found here, under "Latest PLS Applications".

The data in the folder is:

Gene Set Enrichment Analysis

The folder GO contains the script scpt_GO.m, which performs gene set enrichment analysis based on two PLS-defined gene sets. Analyses were adapted from this repository, which also provides two required files that can be found here.

The data in the folder is:

  • gene_entrez_ids.csv: contains the Entrez ID corresponding to each gene in a list. This file is a cleaner version of the original probes.csv from the Allen Institute. It was created using abagen.

Cell-Type Deconvolution

The folder CTD contains the script scpt_ctd.m, which determines the ratio of genes that are preferentially expressed in seven different cell types. Significance is assessed against a null model of random gene sets. Cell-type deconvolution comes from work described in this paper, and the data (alongside much more) can also be found at Jakob Seidlitz's repo. The folder also contains the function fcn_ctd.m, which was written after publication but is easier to use, so I've included it here.

The data in the folder is:

  • celltypes_PSP.csv: a list of gene names preferentially expressed in each of seven cell types.
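
The ratio-and-null procedure described for scpt_ctd.m can be sketched as follows. This is an illustration on synthetic data, not the code in scpt_ctd.m or fcn_ctd.m. The logical membership matrix isCellTypeGene, the gene set geneSet, the set size, and the two-sided p-value are all assumptions made for the example.

```matlab
rng(2);                                    % reproducible synthetic data
nGenes = 500; nTypes = 7; setSize = 50; nNull = 1000;

% Hypothetical membership: true where a gene is preferentially
% expressed in a given cell type (gene x cell type).
isCellTypeGene = rand(nGenes, nTypes) < 0.05;

% Hypothetical PLS-defined gene set, given as gene indices.
geneSet = randperm(nGenes, setSize);

% Observed ratio: fraction of the gene set belonging to each cell type.
ratioObs = sum(isCellTypeGene(geneSet, :), 1) / setSize;

% Null model: the same ratio for random gene sets of equal size.
ratioNull = zeros(nNull, nTypes);
for k = 1:nNull
    nullSet = randperm(nGenes, setSize);
    ratioNull(k, :) = sum(isCellTypeGene(nullSet, :), 1) / setSize;
end

% Two-sided p-value: how often a random set deviates from the null mean
% at least as much as the observed set does.
nullMean = mean(ratioNull, 1);
pTwoSided = (1 + sum(abs(ratioNull - nullMean) >= abs(ratioObs - nullMean), 1)) ...
            / (1 + nNull);
```

Drawing null sets of the same size as the observed set keeps the ratios directly comparable across the observed and null cases.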

Individual Differences in Behaviour

The folder HCP contains the script scpt_hcp.m which uses cortical thickness and T1w/T2w maps from the Human Connectome Project (S1200 release) to relate the PLS-derived gene score pattern to individual differences in behaviour. Original data can be downloaded from here. Note that the script is written for all 1096 subjects with full fMRI runs, but in reality only 417 unrelated subjects were used in analyses. Due to privacy policies, their subject indices are not included. This section got cut during the review process, so the results are only included in the preprint.

The data in the folder is:

  • hcp_smyl_all_125.mat: T1w/T2w ratios for all 1096 subjects with full fMRI runs parcellated into 219 cortical regions
  • hcp_thi_all_125.mat: cortical thickness for all 1096 subjects with full fMRI runs parcellated into 219 cortical regions

Molecular Signature across Development

The folder BrainSpan contains the script scpt_brainspan.m which replicates results using gene expression estimates from the BrainSpan database. The script also tracks the gene expression-functional activation signature across human development. Many thanks to Jake Vogel for organizing the data for comparability with AHBA.

The data in the folder is:

  • gene_expression_AHBA_harmonized.csv: gene x sample matrix of estimated gene expression levels. Only includes genes that are also reported in the original AHBA dataset.
  • gene_metadata_AHBA_harmonized.csv: includes information for each gene, including gene name and Entrez ID
  • samples_metadata.csv: includes information for each sample, including age and brain region
  • mapping.mat: an index-based map that links the 34 left-hemisphere Desikan-Killiany regions to the 16 cortical regions included in BrainSpan
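
As an illustration of how an index-based map can bring 34-region values into the 16-region BrainSpan space, here is a small sketch on synthetic data. The actual layout of mapping.mat is not documented here, so the format used below is an assumption: a 34 x 1 vector (mapDemo, a hypothetical name) whose entries give the BrainSpan region index for each Desikan-Killiany region, with 0 marking regions that have no BrainSpan counterpart. Averaging the regions that share a target is likewise an assumed choice.

```matlab
rng(3);                                    % reproducible synthetic data
nDK = 34; nBS = 16;

% Hypothetical map: the first 16 entries guarantee every BrainSpan region
% is covered; the rest are random and may be 0 (no counterpart).
mapDemo = [(1:nBS)'; randi([0 nBS], nDK - nBS, 1)];

% Stand-in for a 34-region map, such as PLS-derived scores.
scoresDK = randn(nDK, 1);

% Average all Desikan-Killiany regions assigned to each BrainSpan region.
keep = mapDemo > 0;
scoresBS = accumarray(mapDemo(keep), scoresDK(keep), [nBS 1], @mean);
```

If the real map is stored differently, for example as a cell array of index lists, the same averaging can be written as a loop over the 16 target regions.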
