graspologic-org / graspologic
Python package for graph statistics
Home Page: https://graspologic-org.github.io/graspologic/
License: MIT License
For Friday (9/21):
Based on the input data formats we agree on, make sure these functions work properly and no longer call R scripts.
DoD:
write code, tests, and documentation for the following:
DoD:
using modified atlases and dMRI pipeline
Example (R console transcript):
> require(igraph)
> gg <- rg.sample.SBM.correlated(n = 100, B = matrix(c(0.5, 0.5, 0.2, 0.5), nrow = 2), rho = c(0.4, 0.6), sigma = 0.2)
> summary(gg$adjacency$A)
IGRAPH c77bf6c U--- 100 2424 --
> summary(gg$adjacency$B)
IGRAPH 3ccfb0c U--- 100 2039 --
> cor(as.vector(gg$adjacency$A[]), as.vector(gg$adjacency$B[]))
[1] 0.1494246
rg.sample.correlated.gnp <- function(P, sigma) {
  require(igraph)
  n <- nrow(P)
  # Sample A ~ Bernoulli(P) via a symmetric matrix of uniforms
  U <- matrix(0, nrow = n, ncol = n)
  U[col(U) > row(U)] <- runif(n * (n - 1) / 2)
  U <- U + t(U)
  diag(U) <- runif(n)
  A <- (U < P) + 0
  diag(A) <- 0
  # Resample each edge of B conditionally on A so that the pair is correlated:
  #   P(B_ij = 1 | A_ij = 1) = sigma + (1 - sigma) * P_ij
  #   P(B_ij = 1 | A_ij = 0) = (1 - sigma) * P_ij
  avec <- A[col(A) > row(A)]
  pvec <- P[col(P) > row(P)]
  bvec <- numeric(n * (n - 1) / 2)
  uvec <- runif(n * (n - 1) / 2)
  idx1 <- which(avec == 1)
  idx0 <- which(avec == 0)
  bvec[idx1] <- (uvec[idx1] < (sigma + (1 - sigma) * pvec[idx1])) + 0
  bvec[idx0] <- (uvec[idx0] < (1 - sigma) * pvec[idx0]) + 0
  B <- matrix(0, nrow = n, ncol = n)
  B[col(B) > row(B)] <- bvec
  B <- B + t(B)
  diag(B) <- 0
  return(list(A = graph.adjacency(A, "undirected"), B = graph.adjacency(B, "undirected")))
}
#gg <- rg.sample.SBM.correlated(n = 100, B = matrix(c(0.5,0.5,0.2,0.5), nrow = 2), rho = c(0.4,0.6), sigma = 0.2)
#cor(as.vector(gg$adjacency$A[]), as.vector(gg$adjacency$B[]))
rg.sample.SBM.correlated <- function(n, B, rho, sigma, conditional = FALSE) {
  if (!conditional) {
    # Draw block labels i.i.d. from rho
    tau <- sample(seq_along(rho), n, replace = TRUE, prob = rho)
  } else {
    # Deterministic block sizes: rho[k] * n vertices in block k
    # (the original hard-coded 1:2 here, which silently assumed two blocks)
    tau <- unlist(lapply(seq_along(rho), function(k) rep(k, rho[k] * n)))
  }
  P <- B[tau, tau]
  return(list(adjacency = rg.sample.correlated.gnp(P, sigma), tau = tau))
}
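Since the goal above is to stop calling R scripts, here is a possible NumPy port of rg.sample.correlated.gnp, a sketch under the same edge-resampling scheme. The name `sample_correlated_gnp` is hypothetical, not an existing graspologic function.

```python
import numpy as np

def sample_correlated_gnp(P, sigma, rng=None):
    """Sample a pair of correlated undirected graphs.

    A ~ Bernoulli(P); each edge of B is then resampled conditionally
    on A so that sigma controls the edgewise correlation, mirroring
    the R function rg.sample.correlated.gnp.
    """
    rng = np.random.default_rng(rng)
    n = P.shape[0]
    iu = np.triu_indices(n, k=1)          # strict upper-triangle indices
    pvec = np.asarray(P, dtype=float)[iu]
    avec = (rng.random(pvec.shape) < pvec).astype(int)
    uvec = rng.random(pvec.shape)
    bvec = np.where(
        avec == 1,
        uvec < sigma + (1 - sigma) * pvec,  # B keeps A's edge with boosted prob.
        uvec < (1 - sigma) * pvec,          # B gains an edge with damped prob.
    ).astype(int)

    def to_sym(vec):
        # Fill the upper triangle, then symmetrize (diagonal stays 0)
        M = np.zeros((n, n), dtype=int)
        M[iu] = vec
        return M + M.T

    return to_sym(avec), to_sym(bvec)
```

Usage would look like `A, B = sample_correlated_gnp(np.full((100, 100), 0.3), sigma=0.2, rng=0)`, returning two hollow symmetric 0/1 adjacency matrices.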
write code, tests, and documentation for the following:
DoD:
Many-to-one algorithm
DoD:
As the title states: add models for SBM, ER, and ZI, inheriting from a base model class.
It's slightly more intuitive to have single-tone heatmaps (i.e., color for large values, white for small values, and somewhere in between otherwise). It makes visualizing things like the below easier:
The absence of color generally indicates that something is small, whereas the presence of color usually indicates more of something, so anything else is fairly unintuitive. If you were to use three colors, your readers would really have to check the axes, limits, etc., which is why we typically do one-tone with white for small and color for large.
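The convention above comes for free with any sequential colormap; a minimal matplotlib sketch (the data and colormap choice here are illustrative, not a graspologic default):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted use
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
A = rng.random((20, 20))  # stand-in adjacency/weight matrix

# Single-tone (sequential) colormap: white for small values, saturated
# blue for large ones, so the presence of color always means "more".
fig, ax = plt.subplots()
im = ax.imshow(A, cmap="Blues", vmin=0, vmax=1)
fig.colorbar(im, ax=ax)
```

Fixing `vmin`/`vmax` keeps white anchored at the true minimum rather than the sample minimum, which matters when comparing several heatmaps side by side.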
Should be easy: most functions should already work on sparse matrices, but we will need to update our type checking in several places and write tests to make sure.
Also, J1c says that one of the SVDs does not work on sparse matrices.
We could possibly also support rank-1 + sparse matrices, where rather than many 0s they have many copies of some constant.
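A sketch of what the updated type checking might look like (this is a hypothetical version of the `import_graph` helper mentioned elsewhere in this thread, not its actual implementation):

```python
import numpy as np
from scipy import sparse

def import_graph(graph):
    """Accept dense arrays and scipy sparse matrices alike,
    returning a float-typed square adjacency matrix."""
    if sparse.issparse(graph):
        out = graph.tocsr().astype(float)   # canonicalize sparse format
    elif isinstance(graph, np.ndarray):
        out = graph.astype(float)           # cast int arrays to float
    else:
        raise TypeError(f"unsupported graph type: {type(graph)}")
    if out.shape[0] != out.shape[1]:
        raise ValueError("adjacency matrix must be square")
    return out
```

Downstream functions would then only ever see `np.ndarray` or `csr_matrix`, which narrows the surface the new tests need to cover.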
Write concrete contributing guidelines.
DoD:
A CONTRIBUTING.md that specifies the following:
First, I think seaborn is a good choice. ggplot for Python hasn't been developed in over 2 years, so while ggplot is nice, I don't think I'm going to use it. Not the biggest fan of plotly. Thoughts appreciated.
So for actual plots:
As the title states:
Write a function for omnibus embedding with the following features:
DoD:
Based on what is discovered in sprint 2, see if any significant findings can be repeated in another data set. In particular, if disease/environmental phenotype data can be related to graph-statistic properties, try to find another data set for that specific phenotype.
DoD:
Hi developers (cc @jovo),
I am running OmnibusEmbed on several correlation matrices derived from functional magnetic resonance imaging data. Currently, I have 133 subjects, each with 249 brain-region-of-interest time series. For each subject, I compute the Pearson correlation matrix, so in the end I have a [133 x 249 x 249] array (if you prefer, 133 graphs with 249 vertices each).
However, when I run:
embeddings = OmnibusEmbed(k=20).fit_transform(correlations)
embeddings becomes a 2-item tuple containing two [33117 x 20] matrices, for which np.allclose(embeddings[0], embeddings[1]) is True. Why is it returning two of them?
Also, is it safe to reshape the [33117 x 20] matrix into [133 x 249 x 20], in a way that embeddings[0] contains the embeddings of subject 0's regions?
Thank you!
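Assuming the omnibus matrix stacks each graph's vertex rows in order (subject 0's 249 vertices first, then subject 1's, and so on), the reshape asked about above is safe. A small NumPy illustration of that assumption, with a stand-in for the real output:

```python
import numpy as np

m, n, k = 133, 249, 20  # subjects, vertices, embedding dimension

# Stand-in for the (33117 x 20) stacked omnibus output.
stacked = np.arange(m * n * k, dtype=float).reshape(m * n, k)

# If row i*n + j of the stacked embedding is vertex j of subject i,
# a plain reshape recovers contiguous per-subject blocks.
per_subject = stacked.reshape(m, n, k)
```

After the reshape, `per_subject[i]` is exactly rows `i*n` through `(i+1)*n - 1` of the stacked matrix, i.e. subject i's vertex embeddings.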
"concatenated vectors through unsupervised random forest, the features that were most informative would be the ones that are used. then, rather than MDS, we simply do an eigendecomposition"
Shared IO functions in utils, so there is less inconsistency and fewer chances for breakage. E.g., I'd imagine this breaks https://github.com/neurodata/pygraphstats/blob/master/graphstats/ase/ase.py with networkx.
Todo:
DoD:
See if any of these could be useful for graphs
ASE(A) operates on
A + diag(degree_vec / (n - 1))
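For reference, that diagonal augmentation can be written directly in NumPy; `augment_diagonal` is a hypothetical helper name used here for illustration:

```python
import numpy as np

def augment_diagonal(A):
    """Return A + diag(degree_vec / (n - 1)), the augmented
    adjacency that ASE is said to operate on above."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    deg = A.sum(axis=1)                # degree of each vertex
    return A + np.diag(deg / (n - 1))  # hollow A -> diagonal = deg / (n-1)
```

For a hollow adjacency matrix this simply places each vertex's average connectivity on the diagonal, which is a common fix for the bias that an all-zero diagonal introduces into the spectral decomposition.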
Issues with setting Jupyter notebook kernels prevent them from running on Netlify.
Write code, tests, and documentation for the following:
DoD:
Primitives
dimselect
ASE
LSE
OMNI
Super Primitives
Nonpar
Semipar
GClust
OOCASE
Must be float or decimal. I will update the import_graph function to cast to float if given an int array.
Thoughts on adding a function that returns the actual latent positions, so that I don't have to keep typing np.dot(lpm.X, np.diag(lpm.d) ** 0.5)?
I think it makes sense to add to BaseEmbed, but I could see it being added to LatentPosition. @ebridge2 ?
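One way this could look (a sketch, not the actual graspologic API): expose the scaled positions as a property on the object that already stores X and d:

```python
import numpy as np

class LatentPosition:
    """Minimal sketch: stores the factors returned by an embedding and
    exposes the latent positions X @ diag(d) ** 0.5 as a property."""

    def __init__(self, X, d):
        self.X = np.asarray(X, dtype=float)
        self.d = np.asarray(d, dtype=float)

    @property
    def latent_positions(self):
        # Equivalent to np.dot(self.X, np.diag(self.d) ** 0.5),
        # but scales columns without forming the diagonal matrix.
        return self.X * np.sqrt(self.d)
```

A property keeps the call site down to `lpm.latent_positions`, and the broadcast form avoids materializing an n x n diagonal matrix for large embeddings.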
DoD:
add graphs from here https://github.com/neurodata/graphstats/tree/master/data
ER, SBM, ZI-Poisson ER, ZI-Poisson SBM, weighted ER, and weighted SBM simulations.
DoD:
Make a basic class containing a structured representation of a latent position model. This consists of an X, a Y, and an optional vtx_names attribute, where X \in \mathbb{R}^{N \times k}, Y \in \{\mathrm{NULL}\} \cup \mathbb{R}^{N \times k}, and vtx_names \in \mathcal{S}^{N}. Correspondingly, make the base embedding class contain an instance of a latent position model to return to users.
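One possible sketch of that class, under the assumption that plain NumPy arrays and a name sequence suffice (field names follow the description above; the validation logic is illustrative):

```python
from dataclasses import dataclass
from typing import Optional, Sequence
import numpy as np

@dataclass
class LatentPositionModel:
    """Structured latent position model: X in R^{N x k},
    Y either None or in R^{N x k}, and optional per-vertex names."""
    X: np.ndarray
    Y: Optional[np.ndarray] = None
    vtx_names: Optional[Sequence[str]] = None

    def __post_init__(self):
        self.X = np.asarray(self.X, dtype=float)
        if self.Y is not None:
            self.Y = np.asarray(self.Y, dtype=float)
            if self.Y.shape != self.X.shape:
                raise ValueError("Y must match the shape of X")
        if self.vtx_names is not None and len(self.vtx_names) != self.X.shape[0]:
            raise ValueError("need one vertex name per row of X")
```

The base embedding class could then build one of these in `fit` and hand it back, so undirected embeddings simply leave Y as None.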
DoD:
Roll several preprocessing steps (various types of PTR, other transforms, etc.), embeddings, and clustering steps into an sklearn Pipeline that will work with randomized parameter search.
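A minimal sketch of that wiring, with stand-in steps (StandardScaler, PCA, and KMeans are placeholders for the real PTR transform, spectral embedding, and GClust steps; only the Pipeline + randomized-search plumbing is the point):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.model_selection import RandomizedSearchCV

# Placeholder steps; swap in the real preprocessing/embedding/clustering.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("embed", PCA()),
    ("cluster", KMeans(n_init=10)),
])

# step-name__parameter keys let the search reach inside the pipeline.
param_dist = {
    "embed__n_components": [2, 3, 5],
    "cluster__n_clusters": [2, 3, 4],
}

X = np.random.default_rng(0).normal(size=(60, 10))
search = RandomizedSearchCV(pipe, param_dist, n_iter=4, cv=3, random_state=0)
search.fit(X)  # unsupervised: scored via the final estimator's score()
```

Because the final step exposes a `score` method, the search works without labels; the same skeleton accepts any graspologic transformer that follows the sklearn fit/transform convention.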
Select some phenotype features of interest and explore how the spectrally embedded data look with regard to these features. If it seems reasonable based on this output, try clustering/classification.
DoD:
form kwarg gets passed to LSE