gitter-lab / singe

Gene regulatory network reconstruction from pseudotemporal single-cell gene expression data

License: MIT License

MATLAB 50.20% Shell 39.90% Python 5.02% Dockerfile 4.88%
regulatory-network single-cell-rna-seq granger-causality

singe's Introduction

Single-cell Inference of Networks using Granger Ensembles (SINGE)

Gene regulatory network reconstruction from pseudotemporal single-cell gene expression data. Standalone MATLAB implementation of the SINGE algorithm. This code has been tested on MATLAB R2014b and R2018a on Linux, MATLAB R2020a on macOS, and MATLAB R2018a on Windows.

The software was formerly called SCINGE and has been renamed SINGE.

Citation

If you use the SINGE software please cite:

Network inference with Granger causality ensembles on single-cell transcriptomics.
Atul Deshpande, Li-Fang Chu, Ron Stewart, Anthony Gitter.
Cell Reports, 38:6, 2022.

The SINGE-supplemental repository contains additional scripts, analyses, and results related to this manuscript.

Dependencies

The dependencies vary based on how SINGE is run. Setup instructions for each mode are described below.

Modes of execution

The full SINGE pipeline runs multiple Generalized Lasso Granger (GLG) tests to infer different directed networks for different hyperparameters and subsamples of the data. These directed networks are then aggregated into a final predicted network. For small or medium datasets and relatively few hyperparameter combinations, SINGE can be run in a "standalone" mode where all the GLG tests and the aggregation step are run serially. For larger datasets or more hyperparameter combinations, the GLG tests can instead be run in parallel on a single machine or on multiple machines. After all GLG tests terminate, the results can be aggregated in a separate step.

The standalone and parallel modes are accessible in three ways: MATLAB, compiled MATLAB executables with a wrapper Bash script, or Docker.

MATLAB environment

Running SINGE through MATLAB requires the source code in this repository and the glmnet_matlab package as a dependency. Unzip glmnet_matlab.zip in either the root directory that contains SINGE_Example.m or the code subdirectory. Then use SINGE.m to run SINGE in the standalone mode or SINGE_GLG_Test.m and SINGE_Aggregate.m to run each stage separately. SINGE can be run through MATLAB in Linux, macOS, or Windows but may not work in all Windows environments.

SINGE.m usage:

SINGE(Data,gene_list,outdir,hyperparameter_file)

Example

SINGE_Example.m demonstrates a simple example with the hyperparameters specified in default_hyperparameters.txt. It runs SINGE on data1/X_SCODE_data.mat and writes the results to the Output directory.

Compiled MATLAB code with MATLAB runtime

Requires Bash, the operating system-specific compiled SINGE code, and a compatible MATLAB runtime library, which can be downloaded from https://www.mathworks.com/products/compiler/matlab-runtime.html

Linux

Starting with release 0.4.0, the compiled executables for Linux SINGE_GLG_Test and SINGE_Aggregate are available from the GitHub releases page. Download these executables and place them in the same directory as the wrapper scripts SINGE.sh, run_SINGE_GLG_Test.sh, and run_SINGE_Aggregate.sh from this repository. The compiled code has been tested with the R2018a runtime in Linux.

macOS

Starting with release 0.4.1, the compiled executables for macOS are available from the GitHub releases page in the file SINGE_mac.tgz. Download these executables, untar them with tar -xf SINGE_mac.tgz, and place them in the same directory as the wrapper scripts SINGE.sh, run_SINGE_GLG_Test_mac.sh, and run_SINGE_Aggregate_mac.sh from this repository. The compiled code has been tested with the R2020a runtime in macOS.

Windows

There are no compiled executables for Windows. We recommend running SINGE through Docker in Windows if a compatible MATLAB environment is not available.

Usage

Bash wrapper script usage:

bash SINGE.sh runtime_dir mode Data gene_list outdir [hyperparameter_file] [hyperparameter_number]
  • hyperparameter_file is required only for the standalone and GLG modes.
  • hyperparameter_number is required only for GLG mode.

Use bash SINGE.sh -h to print the complete usage message. The SINGE.sh script automatically detects whether it is running on Linux or macOS and uses the appropriate wrapper script and executables.
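The operating system check described above can be sketched as follows. This is a minimal illustration, not the actual contents of SINGE.sh: the function name glg_wrapper_for_os is hypothetical, and the sketch assumes the detection is based on the output of uname -s. The wrapper script names are the ones listed in the Linux and macOS sections.

```shell
# Hypothetical helper: map an operating system name to the GLG wrapper script.
# With no argument, the name is taken from `uname -s` (assumed detection method).
glg_wrapper_for_os() {
  case "${1:-$(uname -s)}" in
    Linux)  echo "run_SINGE_GLG_Test.sh" ;;
    Darwin) echo "run_SINGE_GLG_Test_mac.sh" ;;
    *)      echo "unsupported operating system" >&2; return 1 ;;
  esac
}
```

Calling glg_wrapper_for_os with no argument selects the wrapper for the current machine. Any other operating system returns a non-zero status, which matches the absence of compiled executables for Windows.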

Examples

Standalone mode (run GLG for all hyperparameters and aggregate the output):
bash SINGE.sh PATH_TO_RUNTIME standalone data1/X_SCODE_data.mat data1/gene_list.mat Output default_hyperparameters.txt

GLG mode (run GLG for the second hyperparameter in the hyperparameter file):
bash SINGE.sh PATH_TO_RUNTIME GLG data1/X_SCODE_data.mat data1/gene_list.mat Output default_hyperparameters.txt 2

Aggregate mode (run the aggregation step separately):
bash SINGE.sh PATH_TO_RUNTIME Aggregate data1/X_SCODE_data.mat data1/gene_list.mat Output

Replace PATH_TO_RUNTIME with the path to the MATLAB runtime.
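To run the GLG tests separately from the aggregation step, one GLG command is needed per hyperparameter combination, followed by a single Aggregate command. The sketch below only prints those commands so they can be reviewed, run serially, or handed to a job scheduler. The function name singe_commands is hypothetical, and the sketch assumes that hyperparameter_number is the 1-based line index in the hyperparameter file and that the file contains no blank lines.

```shell
# Hypothetical helper: print (do not run) one GLG command per line of the
# hyperparameter file, then one Aggregate command.
# Assumption: hyperparameter_number is the 1-based line index in the file.
singe_commands() (
  runtime_dir=$1 data=$2 gene_list=$3 outdir=$4 hyperfile=$5
  # Count lines; this fails if the file is missing or empty.
  n=$(grep -c '' "$hyperfile") || exit 1
  i=1
  while [ "$i" -le "$n" ]; do
    printf 'bash SINGE.sh %s GLG %s %s %s %s %d\n' \
      "$runtime_dir" "$data" "$gene_list" "$outdir" "$hyperfile" "$i"
    i=$((i + 1))
  done
  printf 'bash SINGE.sh %s Aggregate %s %s %s\n' \
    "$runtime_dir" "$data" "$gene_list" "$outdir"
)
```

For example, singe_commands PATH_TO_RUNTIME data1/X_SCODE_data.mat data1/gene_list.mat Output default_hyperparameters.txt prints the command list for the example data. The GLG commands are independent of each other, but the Aggregate command should run only after all of them have finished.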

Docker

Requires Docker. The most straightforward way to run SINGE through Docker is with the SINGE.sh wrapper script. The usage is the same as the examples above except the script name and MATLAB runtime path do not need to be specified. Alternatively, arbitrary commands can be run inside Docker by overriding the default entry point. We recommend specifying the version of the Docker image. See the DockerHub tags for available versions.

Examples

SINGE.sh wrapper script in standalone mode
docker run -v $(pwd):/SINGE -w /SINGE agitter/singe:0.5.1 standalone data1/X_SCODE_data.mat data1/gene_list.mat Output default_hyperparameters.txt

Arbitrary commands using Bash as the entry point

docker run -v $(pwd):/SINGE -w /SINGE --entrypoint "/bin/bash" agitter/singe:0.5.1 -c "source ~/.bashrc; conda activate singe-test; tests/compare_example_output.sh Output tests/reference/latest"

This example is part of the SINGE test code, which only runs when called from the root of the SINGE git repository.

Inputs

  • Data - Path to MAT-file with ordered single-cell expression data (sparse matrix X), pseudotime values (array ptime), optional indices of regulators (array of index values regix), and optional branching information (matrix branches). For example, the data in data1/X_SCODE_data.mat represents a linear trajectory, and data_bifurcated/X_data_bifurcated.mat represents a branching trajectory with two branches.
  • gene_list - Path to file containing list of gene names corresponding to the rows in the expression data matrix X in Data (e.g., data1/gene_list.mat)
  • outdir - Path to folder for storing results from individual GLG Tests
  • hyperparameter_file - Path to file containing a list of GLG hyperparameter combinations for the hyperparameters described below

Additional input for compiled MATLAB code with R2018a runtime

  • runtime_dir - Path to MATLAB R2018a runtime library

GLG hyperparameters:

  • --ID - Numeric identifier for the GLG hyperparameter set, which should be unique for each hyperparameter set and replicate index
  • --lambda - Sparsity parameter (lambda = 0 results in a non-sparse solution)
  • --dT - Time resolution for GLG test
  • --num-lags - Number of lags for GLG test
  • --kernel-width - Gaussian kernel width for GLG test
  • --replicate - Replicate index
  • --family - Distribution family of the gene expression values (options = gaussian, poisson, default = gaussian)
  • --prob-zero-removal - Probability of removing zero-valued samples as part of the zero-handling strategy (default = 0)
  • --prob-remove-samples - Sample removal rate for obtaining subsampled replicates (default = 0.2)
  • --date - Valid date in the dd-mmm-yyyy or mm/dd/yyyy format, used when generating the random seed

See default_hyperparameters.txt for an example hyperparameters file. Users can generate their own hyperparameter file using the bash script scripts/generate_hyperparameters.sh, which takes hyperparameter values from the files scripts/lambda.txt, scripts/kernel.txt, scripts/time.txt, scripts/probzeroremoval.txt, and scripts/probremovesample.txt.

See USAGE.md for guidelines on setting hyperparameters and running SINGE on a new dataset.

Outputs

  • SINGE_Ranked_Edge_List.txt - File with list of ranked edges according to their SINGE scores
  • SINGE_Gene_Influence.txt - File with list of genes ranked according to their SINGE influence.

Note on SINGE output for branching trajectories

When running SINGE v0.5.0 on a dataset with a branching trajectory (indicated by the matrix branches in the MAT-file), SINGE_Ranked_Edge_List.txt and SINGE_Gene_Influence.txt are calculated for the entire branching process by combining the results of the individual GLG tests from all branches. Alternatively, the user can store the individual GLG test results from each branch in a separate folder and call SINGE Aggregate on each folder to obtain branch-specific network inference.

Note on reproducibility

The master branch of this repository may be unstable as new features are implemented. Use a versioned release for stable data analysis.

Because the subsampling and zero-removal stages involve pseudo-random sample removals, SINGE generates a random seed using input hyperparameters, including the date input. The results can be reproduced by providing the same inputs and date from a previous experiment.

Testing

The tests directory contains test scripts and reference output files to test SINGE.

GitHub Actions is used to run several types of tests in a Linux environment and to deploy a temporary Docker image to DockerHub every time the repository's master branch is updated. The tests build the SINGE Docker image, run SINGE on the example data in multiple ways using Docker, and compare the generated output with the reference output.

GitHub Actions is also used to test SINGE in a macOS environment. The tests install the MATLAB runtime, run the compiled SINGE code on the example data, and compare the generated output with the reference output. The macOS tests use a more permissive threshold when comparing the generated and reference adjacency matrices due to minor operating system-specific differences in the output.

Compiling

The compiled version of SINGE for Linux is generated by compiling the MATLAB code in MATLAB R2018a:

mcc -N -m -R -singleCompThread -R -nodisplay -R -nojvm -a ./glmnet_matlab/ -a ./code/ SINGE_GLG_Test.m
mcc -N -m -R -singleCompThread -R -nodisplay -R -nojvm -a ./code/ SINGE_Aggregate.m

compile_SINGE.sh is used for testing to compile SINGE and confirm the source .m files match the versions used to create the binaries.

The compiled version of SINGE for macOS is generated by running the compile_SINGE_mac.sh script in MATLAB R2020a.

Licenses

SINGE is available under the MIT License, Copyright © 2019 Atul Deshpande, Anthony Gitter.

The file iLasso_for_SINGE.m has been modified from iLasso.m. The original third-party code is available under the MIT License, Copyright © 2014 USC-Melady.

The compiled version of SINGE includes the glmnet_matlab package, which is available under the GPL-2 license.

singe's People

Contributors

agitter, atuldeshpande

singe's Issues

libgfortran issue on macOS

A user reports

I have not succeeded in running it on a local MATLAB installation. I received errors trying to run SINGE_Example locally (MATLAB R2019b) on a MacBook running macOS 10.15.2. It seems to be an issue with libgfortran, which is required by glmnet.

$ SINGE_Example
…

Invalid MEX-file
'…/SINGE-master/glmnet_matlab/glmnetMex.mexmaci64':
dlopen(…/SINGE-master/glmnet_matlab/glmnetMex.mexmaci64,
6): Library not loaded: /usr/local/gfortran/lib/libgfortran.3.dylib
Referenced from:
…/SINGE-master/glmnet_matlab/glmnetMex.mexmaci64
Reason: image not found

Error in glmnetControl (line 79)
ivals.pmin, ivals.exmx, ivals.prec, ivals.mxit] =
glmnetMex();

Error in glmnet (line 329)
inparms = glmnetControl();

Error in iLasso_for_SINGE (line 100)
fit = glmnet(Am, bm, params.family, opt);

Error in run_iLasso_row (line 27)
[metric] = iLasso_for_SINGE(m, outs,
lambda,p1,dT,std_dev,params);

Error in SINGE_GLG_Test (line 58)
[for_metric] = run_iLasso_row(m,outs,params,irow);

Error in SINGE (line 20)
SINGE_GLG_Test(Data,'--outdir',outdir,args{:})

Error in SINGE_Example (line 16)
SINGE(data,gene_list,outdir,hyperparameter_file);

I do not have a local machine running macOS, so I cannot reproduce this issue yet. In #54 I started working on testing SINGE with the MATLAB runtime as a workaround.

Error when dropping samples

The current implementation of the function that drops samples has a bug that only drops randomly-selected zero-valued samples instead of any sample:

ind = find((rand(size(X{i}(1,:)))<probRemove)&(X{i}(1,:)==0));

We plan to correct this and release a new version of SINGE soon. We will update the test cases as well.

Glmnet Mex dependency on old version of libgfortran - Ubuntu

I am trying to run SINGE_Example.m on an Ubuntu machine. However, I am getting the following error from glmnetMex.mexa64. Unfortunately, I cannot install libgfortran3 on my machine because it is deprecated. Is there any workaround for this?

In SINGE (line 20)
In SINGE_Example (line 16)
In run (line 112)

randomizer =

  737892

Invalid MEX-file '/home/ubuntu/SINGE/SINGE/glmnet-matlab/glmnetMex.mexa64':
libgfortran.so.3: cannot open shared object file: No such file or directory

Error in glmnetControl (line 79)
ivals.pmin, ivals.exmx, ivals.prec, ivals.mxit] = glmnetMex();

Error in glmnet (line 329)
inparms = glmnetControl();

Error in iLasso_for_SINGE (line 111)
fit = glmnet(Am, bm, params.family, opt);

Error in run_iLasso_row (line 27)
[metric] = iLasso_for_SINGE(m, outs, lambda,p1,dT,std_dev,params);

Error in SINGE_GLG_Test (line 79)
[for_metric] = run_iLasso_row(m,outs,params,irow);

Error in SINGE (line 20)
SINGE_GLG_Test(Data,'--outdir',outdir,args{:})

Error in SINGE_Example (line 16)
SINGE(data,gene_list,outdir,hyperparameter_file);

Error in run (line 112)
evalin('caller', strcat(scriptStem, ';'));

Delete temporary mat files at the end of run_SINGE_GLG_Test.sh

Issue: Depending on the data size, SINGE_GLG_Test creates large TempMat*.mat files, which could exceed the user's storage quota and cause subsequent jobs to be held.
Fix: We should include the line
eval "rm Temp*.mat"
near the end of run_SINGE_GLG_Test.sh.

EDIT: The rm should be more targeted, especially if the storage is shared, so that only Temp_<ID>.mat is deleted.
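A targeted cleanup along the lines of the EDIT note could look like the sketch below. The function name cleanup_glg_temp and its optional directory argument are hypothetical, and the file name pattern Temp_<ID>.mat is taken from the note above rather than confirmed against the SINGE source.

```shell
# Hypothetical helper: remove only the temporary MAT-file of one GLG run.
# Assumption: the temporary file is named Temp_<ID>.mat (see the EDIT note).
cleanup_glg_temp() (
  id=$1
  dir=${2:-.}
  # Refuse to run without an ID so that no broader pattern is ever deleted.
  [ -n "$id" ] || { echo "cleanup_glg_temp: missing ID" >&2; exit 1; }
  rm -f -- "$dir/Temp_${id}.mat"
)
```

Because the file name is built from a single ID and contains no wildcard, temporary files that belong to other jobs on shared storage are left untouched.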

Consistent argument ordering

Before we create a standalone SINGE aggregate script for #6, we should standardize the order of the arguments to these scripts:

  • standalone_SINGE.sh Data gene_list outdir hyperparameter_file runtime_dir
  • run_SINGE_GLG_Test.sh $runtime $data --outdir $outdir $arg
  • run_SINGE_Aggregate.sh $runtime $gene_list $data $outdir

SINGE_Aggregate expects the arguments in a different order than the others. Swapping $gene_list and $data would correct this.

Is it possible to have a more detailed introduction to the inputs in 'mat' format?

Thanks very much for developing this useful tool. I would like to use it to construct a GRN with my own data. Since I'm not familiar with MATLAB, I tried to run it with the Bash script. However, I did not find a detailed description of the input files in 'mat' format. I understand that each should be a saved MATLAB workspace file with several variables.
So I tried to inspect the contents of these input 'mat' files myself. I found two variables in 'X_SCODE_data.mat', 'X' and 'ptime', and one variable in 'gene_list.mat', 'gene_list'. I also found that 'X' in 'X_SCODE_data.mat' is a sparse matrix, 'ptime' is a matrix, and 'gene_list' is a cell. I created a similar 'mat' file with the same organization from my own data but got the warning: "Unable to read some of the variables due to unknown MAT-file error."
I'm not sure whether there is more hidden information for these variables in the 'mat' format.
Is it possible to directly accept more common file types, such as 'txt/csv', as inputs?

Docker build error

Hi,

I'm working on integrating the latest SINGE release into the BEELINE repo. However, when I tried building a Docker image using the Dockerfile provided, I got the following error:

Logs
:~/SINGE/docker$ docker build -t singe -f Dockerfile .
Sending build context to Docker daemon  5.632kB
Step 1/8 : FROM amarburg/matlab-runtime
 ---> 32585984ab81
Step 2/8 : RUN apt-get update &&     apt-get -y install libxt6 bzip2
 ---> Running in ab8830c0e025
Get:1 http://security.debian.org stable/updates InRelease [39.1 kB]
Get:2 http://deb.debian.org/debian stable InRelease [122 kB]
Get:3 http://deb.debian.org/debian stable-updates InRelease [49.3 kB]
Get:4 http://security.debian.org stable/updates/main amd64 Packages [120 kB]
Get:5 http://deb.debian.org/debian stable/main amd64 Packages [10.6 MB]
Get:6 http://deb.debian.org/debian stable-updates/main amd64 Packages.diff/Index [1720 B]
Get:6 http://deb.debian.org/debian stable-updates/main amd64 Packages.diff/Index [1720 B]
Get:7 http://deb.debian.org/debian stable-updates/main amd64 Packages [6292 B]
Reading package lists...
W: Conflicting distribution: http://security.debian.org stable/updates InRelease (expected stretch but got buster)
W: Conflicting distribution: http://deb.debian.org/debian stable InRelease (expected stretch but got buster)
W: Conflicting distribution: http://deb.debian.org/debian stable-updates InRelease (expected stretch-updates but got buster-updates)
E: Could not open file /var/lib/apt/lists/deb.debian.org_debian_dists_stable-updates_main_binary-amd64_Packages.diff_Index - open (2: No such file or directory)
The command '/bin/sh -c apt-get update &&     apt-get -y install libxt6 bzip2' returned a non-zero code: 100

I'm running this on Ubuntu 18.04 and Docker version 18.09.5, build e8ff056. I also got this error when I used the Dockerfile provided in the release version 0.3.0.

I'm not sure if this is a known issue, but the README file tells me that the Docker support is still being improved. Do you have an estimate of when this will be completed? Please let me know if there is any other information about the issue that would be useful to you.

Command line interface for SINGE

We would like to have a script that shows how to run SCINGE from the command line. It will provide all of the input data files and GLG hyperparameters. A demo script will reproduce the test run in SCINGE_Example.m.

Initially this will assume that all GLG runs are done locally. We can parallelize the GLG runs for high-throughput computing (#6) as a later step.

Glmnet mex errors for large datasets

We see a significant percentage of the jobs resulting in a segmentation violation when glmnet is called with the params.family = 'poisson' option (example at bottom). We may need to involve the glmnet maintainer at Stanford at some point.

A temporary workaround would be to transform the count-based transcriptomic data and use the params.family = 'gaussian' option.


   Segmentation violation detected at Tue Jul 02 17:21:13 2019 -0500

Configuration:
Crash Decoding : Disabled - No sandbox or build area path
Crash Mode : continue (default)
Default Encoding : US-ASCII
Deployed : true
GNU C Library : 2.17 stable
Graphics Driver : Unknown software
MATLAB Architecture : glnxa64
MATLAB Entitlement ID : Unknown
MATLAB Root : /var/lib/condor/execute/slot1/dir_24929/v94
MATLAB Version : 9.4.0.813654 (R2018a)
OpenGL : software
Operating System : Linux 5.0.8-1.el7.elrepo.x86_64 #1 SMP Wed Apr 17 10:11:44 EDT 2019 x86_64
Process ID : 25220
Processor ID : x86 Family 6 Model 23 Stepping 10, GenuineIntel
Session Key : 8f018b8e-68d7-47a0-bf6d-26a0bf86030f
Static TLS mitigation : Disabled: Unable to open display
Window System : No active display

Fault Count: 1

Abnormal termination

Register State (from fault):
RAX = 0000000000003b16 RBX = 0000150d175d84a0
RCX = 0000150d175c9840 RDX = ffffffffc767d558
RSP = 0000150d13ffbe00 RBP = 0000150d13ffbf60
RSI = 0000150d17622240 RDI = 000000003fefffff

R8 = 0000000000000006 R9 = 0000000000000000
R10 = 0000150d17cf8830 R11 = 0000000000000005
R12 = 000000003ff00000 R13 = 0000000000000012
R14 = 0000000000000000 R15 = 0000150d17622250

RIP = 0000150cde6ea912 EFL = 0000000000010202

CS = 0033 FS = 0000 GS = 0000

Stack Trace (from fault):
[ 0] 0x0000150cde6ea912 /tmp/.mcrCache9.4/GLG_In0/glmnet_matlab/glmnetMex.mexa64+00252178
[ 1] 0x0000150cde6b566e /tmp/.mcrCache9.4/GLG_In0/glmnet_matlab/glmnetMex.mexa64+00034414 mexfunction_+00030329
[ 2] 0x0000150d2d2721ea bin/glnxa64/libmex.so+00414186
[ 3] 0x0000150d2d272447 bin/glnxa64/libmex.so+00414791
[ 4] 0x0000150d2d272f2b bin/glnxa64/libmex.so+00417579
[ 5] 0x0000150d2d25d30c bin/glnxa64/libmex.so+00328460
[ 6] 0x0000150d2edb52ad bin/glnxa64/libmwm_dispatcher.so+00979629 ZN8Mfh_file16dispatch_fh_implEMS_FviPP11mxArray_tagiS2_EiS2_iS2+00000829
[ 7] 0x0000150d2edb5bae bin/glnxa64/libmwm_dispatcher.so+00981934 ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2+00000030
[ 8] 0x0000150d29ad7da1 bin/glnxa64/libmwm_lxe.so+12619169
[ 9] 0x0000150d29ad8982 bin/glnxa64/libmwm_lxe.so+12622210
[ 10] 0x0000150d29bc0e79 bin/glnxa64/libmwm_lxe.so+13573753
[ 11] 0x0000150d29b623e1 bin/glnxa64/libmwm_lxe.so+13186017
[ 12] 0x0000150d293685a8 bin/glnxa64/libmwm_lxe.so+04822440
[ 13] 0x0000150d2936acbc bin/glnxa64/libmwm_lxe.so+04832444
[ 14] 0x0000150d2936701d bin/glnxa64/libmwm_lxe.so+04816925
[ 15] 0x0000150d29360ba1 bin/glnxa64/libmwm_lxe.so+04791201
[ 16] 0x0000150d29360dd9 bin/glnxa64/libmwm_lxe.so+04791769
[ 17] 0x0000150d29366846 bin/glnxa64/libmwm_lxe.so+04814918
[ 18] 0x0000150d2936692f bin/glnxa64/libmwm_lxe.so+04815151
[ 19] 0x0000150d29495503 bin/glnxa64/libmwm_lxe.so+06055171
[ 20] 0x0000150d29498cf3 bin/glnxa64/libmwm_lxe.so+06069491
[ 21] 0x0000150d299a8f6d bin/glnxa64/libmwm_lxe.so+11378541
[ 22] 0x0000150d29ac57c4 bin/glnxa64/libmwm_lxe.so+12543940
[ 23] 0x0000150d29ac5d6b bin/glnxa64/libmwm_lxe.so+12545387
[ 24] 0x0000150d2edb52ad bin/glnxa64/libmwm_dispatcher.so+00979629 ZN8Mfh_file16dispatch_fh_implEMS_FviPP11mxArray_tagiS2_EiS2_iS2+00000829
[ 25] 0x0000150d2edb5bde bin/glnxa64/libmwm_dispatcher.so+00981982 ZN8Mfh_file22dispatch_fh_with_reuseEiPP11mxArray_tagiS2+00000030
[ 26] 0x0000150d29be5d4e bin/glnxa64/libmwm_lxe.so+13725006
[ 27] 0x0000150d29955416 bin/glnxa64/libmwm_lxe.so+11035670
[ 28] 0x0000150d2995558c bin/glnxa64/libmwm_lxe.so+11036044
[ 29] 0x0000150d299eaae8 bin/glnxa64/libmwm_lxe.so+11647720
[ 30] 0x0000150d299ec229 bin/glnxa64/libmwm_lxe.so+11653673
[ 31] 0x0000150d2ea14f80 bin/glnxa64/libmwm_interpreter.so+00688000 _Z44inCallFcnWithTrapInDesiredWSAndPublishEventsiPP11mxArray_tagiS1_PKcbP15inWorkSpace_tag+00000080
[ 32] 0x0000150d2d7b586d bin/glnxa64/libmwiqm.so+00768109 _ZN3iqm15BaseFEvalPlugin7executeEP15inWorkSpace_tagRN7mwboost10shared_ptrIN14cmddistributor17IIPCompletedEventEEE+00000525
[ 33] 0x0000150d301f14a1 bin/glnxa64/libmwmcr.so+00849057
[ 34] 0x0000150d2d7abab1 bin/glnxa64/libmwiqm.so+00727729
[ 35] 0x0000150d2d78ea95 bin/glnxa64/libmwiqm.so+00608917
[ 36] 0x0000150d301bffe5 bin/glnxa64/libmwmcr.so+00647141
[ 37] 0x0000150d301c06a4 bin/glnxa64/libmwmcr.so+00648868
[ 38] 0x0000150d301b93f1 bin/glnxa64/libmwmcr.so+00619505
[ 39] 0x0000150d37574dd5 /lib64/libpthread.so.0+00032213
[ 40] 0x0000150d3a1eaead /lib64/libc.so.6+01040045 clone+00000109
[ 41] 0x0000000000000000 +00000000

This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.
** This crash report has been saved to disk as /tmp/matlab_crash_dump.25220-1 **

MATLAB is exiting because of fatal error
/var/lib/condor/execute/slot1/dir_24929/condor_exec.exe: line 40: 25220 Killed "/var/lib/condor/execute/slot1/dir_24929/GLG_Instance" "X_BMSparse" "lambda" "[0.01,0.02,0.05,0.1,0.001]" "dT" "3" "num_lags" "5" "kernel_width" ".5" "ID" "0" "replicate" "1" "family" "poisson" "date" "07/02/2019" "firsttarget" "1" "targetincr" "300" "prob_remove_samples" "0.3" "prob_zero_removal" "0.6"

Fix Docker image build failures

The latest Docker image build on GitHub Actions failed because of apt-get update failures. They appear to refer to Debian stretch no longer being supported.

I disabled building the Docker image from our GitHub Actions workflow temporarily until we can fix the Docker image.

Option to output scores for all edges and regulators

GENIE3 is a popular network inference approach. It provides a way to output all regulator-target edges: https://github.com/aertslab/GENIE3/blob/master/vignettes/GENIE3.Rmd#L116

SINGE already has a similar output format but does not output edges with a score of 0. The 0s that do appear in the output are edges that had a non-zero score before being rounded down:

ranked_edgesw.SCINGE_Score = floor(ranked_edgesw.SCINGE_Score*10^5)/10^5;

We could consider adding an optional parameter in the config file that would output all edge scores and all regulator scores even if they are 0. That would help users use SINGE in downstream analyses set up for GENIE3.
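The effect of the rounding step can be checked with a small calculation: flooring to five decimal places reports any positive score below 10^-5 as 0. The sketch below reproduces that arithmetic with awk. The function name truncate_score is hypothetical, and awk's int() truncates toward zero, which matches floor only for the non-negative scores assumed here.

```shell
# Hypothetical helper: reproduce floor(score * 10^5) / 10^5 for a
# non-negative score and print the result with five decimal places.
truncate_score() {
  awk -v s="$1" 'BEGIN { printf "%.5f\n", int(s * 100000) / 100000 }'
}

truncate_score 0.123456789   # prints 0.12345
truncate_score 0.0000034     # prints 0.00000 (non-zero score reported as 0)
```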

Example for High Throughput Computing

Provide an example of running SINGE with high-throughput computing.
It could combine a specific example with broad guidelines to facilitate adaptation to other high-throughput systems.

Prepare Docker container with MATLAB runtime

#16 confirmed that it is possible to run SINGE inside a Docker container using compiled MATLAB code. Our next step will be to create a general-purpose SINGE Docker container. Specific tasks include:

  • Create a command line interface #17
  • Release a new version of SINGE
  • Compile that new release and host it somewhere stable
  • Remove the temp versions of SINGE at https://www.biostat.wisc.edu/~gitter/tmp/
  • Create a Docker container using the strategy in #16 and the new release

Glmnet mex error on Windows 10

Running SCINGE_Example on Windows 10 with MATLAB R2014b generated a system error: MATLAB has encountered an internal problem and needs to close.

The error details include the message Access violation detected. The first lines of the stack trace (with paths edited) are:

Stack Trace (from fault):
[  0] 0x00000000fe160b03 ...\MATLAB\R2014b\bin\win64\libmx.dll+00396035 MXGETPR+00000003
[  1] 0x00007ff879dc19dc ...\SCINGE\glmnet_matlab\glmnetMex.mexw64+00006620 MEXFUNCTION+00002524

Possibly related to these issues:

Re-compiling the Glmnet code may be the next step.

Matlab crash while glmnetMex.mexmaci64 was running

I am trying to run SINGE_Example.m in MATLAB R2020a on macOS Catalina.

ver -support


MATLAB Version: 9.8.0.1873465 (R2020a) Update 8
MATLAB License Number: 40707400
Operating System: Mac OS X Version: 10.15.7 Build: 19H1824
Java Version: Java 1.8.0_202-b08 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode

MATLAB Version 9.8 (R2020a) License 40707400

I get the following Warning message:

Warning: from glmnet Fortran code (error code -5); Convergence for 5th lambda value not reached
after maxit=10000 iterations; solutions for larger lambdas returned

In elnet (line 33)
In glmnet (line 443)
In iLasso_for_SINGE (line 111)
In run_iLasso_row (line 27)
In SINGE_GLG_Test (line 79)
In SINGE (line 20)
In SINGE_Example (line 16)

After several iterations MATLAB crashes.
According to MathWorks technical support, the crash was detected while the MEX-file glmnetMex.mexmaci64 was running.
Any suggestion to solve this issue?

Regulator list

We may want to support a pre-defined list of allowed regulators. This would be a subset of the gene list and could be used if someone wants to restrict the candidate source nodes using prior knowledge.
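A minimal sketch of the idea, in Python rather than MATLAB: map a user-supplied regulator list onto positions in the gene list, and reject names that are not in the gene list. The function name and the error behavior are hypothetical choices for illustration, not part of SINGE.

```python
def restrict_regulators(gene_list, allowed_regulators):
    """Return the indices of genes that may act as regulators.

    Raises ValueError if an allowed regulator is missing from gene_list,
    since the regulator list is meant to be a subset of the gene list.
    """
    allowed = set(allowed_regulators)
    unknown = allowed - set(gene_list)
    if unknown:
        raise ValueError(f"Regulators not in gene list: {sorted(unknown)}")
    # Preserve gene_list order so indices line up with the expression matrix.
    return [i for i, gene in enumerate(gene_list) if gene in allowed]


# Hypothetical gene names used only for this example.
gene_list = ["Sox2", "Nanog", "Gata6", "Pou5f1"]
regulator_idx = restrict_regulators(gene_list, ["Nanog", "Pou5f1"])
print(regulator_idx)
```

The returned indices could then be used to select which genes are considered as candidate source nodes, while all genes remain eligible as targets.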

Exit if hyperparameters file does not exist

I was testing SINGE in the Docker image and provided a hyperparameters file that did not exist. However, SINGE continued running with default hyperparameters. SINGE.sh and SINGE.m should exit with an error in this case.

Here is the start of the output:

SINGE operating in standalone mode
standalone mode running GLG tests
grep: input/default_hyperparameters.txt: No such file or directory
hypenum: 1
arg:
sed: can't read input/default_hyperparameters.txt: No such file or directory
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/usr/local/MATLAB/MATLAB_Runtime/v94/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v94/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v94/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v94/sys/opengl/lib/glnxa64
Creating MATLAB Runtime Cache at location: /tmp/.mcrCache9.4
.max_size not found.  Using default size of 33554432 bytes.
MATLAB Runtime cache extracting component: SINGE_GLG_Te_DE1F291F7B6F70644FA3E2EA0D86137D
Acquiring MATLAB Runtime cache root-level directory lock... acquire succeeded.
Reading cache index file...
File open failed for /tmp/.mcrCache9.4/.mcr_cache_index
MATLAB Runtime cache: extractDir is /tmp/.mcrCache9.4/SINGE_0

Adding component SINGE_GLG_Te_DE1F291F7B6F70644FA3E2EA0D86137D to the cache.
MATLAB Runtime Cache: performing maintenance...
Processing cached components...
Done with cache maintenance.
Creating component directory: /tmp/.mcrCache9.4/SINGE_0
Acquiring component directory WRITE lock... acquire succeeded.
Extracting component... Component extracted to cache.  Writing creation timestamp...
Timestamp successfully created.
done.
Downgrading WRITE lock to READ lock... downgrade successful.
Component SINGE_GLG_Te_DE1F291F7B6F70644FA3E2EA0D86137D has successfully been accessed from the cache.
MATLAB Runtime Cache: performing maintenance...
Processing cached components...
Done with cache maintenance.
Checking whether index file /tmp/.mcrCache9.4/.mcr_cache_index needs to be written...
Write is needed.
Writing cache index file: /tmp/.mcrCache9.4/.mcr_cache_index
Writing cache index entry:
SINGE_GLG_Te_DE1F291F7B6F70644FA3E2EA0D86137D
SINGE_0
2519071
2021-Jan-23 21:43:46.477392


params =

  struct with fields:

                   Data: 'input/X_SCODE_data.mat'
                   date: '23-Jan-2021'
                     dT: 1
                 family: 'gaussian'
                     ID: 0
           kernel_width: 2
                 lambda: 0.0100
               num_lags: 15
                 outdir: 'output'
    prob_remove_samples: 0.2000
      prob_zero_removal: 0
              replicate: 0
                     p1: 15
             DateNumber: 738179
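
The guard requested in this issue amounts to checking the file before any GLG test starts and stopping with a non-zero exit code if it is missing. The real fix belongs in SINGE.sh and SINGE.m. The sketch below only shows the logic in Python, and the function names `require_file` and `preflight` are hypothetical.

```python
import sys
from pathlib import Path


def require_file(path, description):
    """Return path as a Path, or raise if it is not an existing regular file."""
    candidate = Path(path)
    if not candidate.is_file():
        raise FileNotFoundError(f"{description} not found: {candidate}")
    return candidate


def preflight(hyperparameters_path):
    """Return 0 if the hyperparameters file exists, otherwise report and return 1."""
    try:
        require_file(hyperparameters_path, "Hyperparameters file")
    except FileNotFoundError as err:
        # Report on stderr so the message is not lost in regular output.
        print(f"Error: {err}", file=sys.stderr)
        return 1
    return 0
```

A wrapper could call `sys.exit(preflight(path))` before launching anything else, so that a missing file never falls through to default hyperparameters.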

Support input mat files from Octave

Copying the original report from Murali-group/Beeline#33 (comment):

At this line https://github.com/gitter-lab/SINGE/blob/master/code/iLasso_for_SINGE.m#L56 , values of m.fullKp are written into a copy of the input mat file before fullKp has been declared, so MATLAB has to infer that fullKp is a struct array. This works as expected with version 7.3 input mat files written by MATLAB, which support partial writes of each value without loading fullKp into memory https://www.mathworks.com/help/matlab/import_export/load-parts-of-variables-from-mat-files.html . As I understand it, the problem with a version 7 input mat file written by Octave, which does not support writing version 7.3 mat files, is that MATLAB first loads the fullKp variable into memory. Because that variable has not been declared yet, MATLAB initializes it, but at that point it fails to infer that the type should be a struct array. The actual error I get is that the Kp2 struct cannot be assigned to a value in m.fullKp. The full error message is copied below:

Warning: The file '/usr/local/SINGE/TempMat_5.mat' was saved in a format that does not support partial loading. Temporarily loading variable 'fullKp' into memory. To use partial loading efficiently, save MAT-files with the -v7.3 flag.

Error using iLasso_for_SINGE (line 56)
Conversion to double from struct is not possible.

The workaround I implemented in that commit was to initialize the fullKp variable as a struct array in the input mat file. This appears to work, but it is not ideal to initialize a specific internal variable that could change in another SINGE version. I think SINGE could support a version 7 mat file written by Octave in one of three ways: initializing fullKp as a struct array before that write, loading and saving the input file as a version 7.3 mat file instead of directly copying it to a temporary file when parsing the input, or writing that variable somewhere other than the mat file.

Store example output data for testing

We should create a tests subdirectory that can initially store the expected output files from the example data. Those will be needed for continuous integration testing.

In addition to the final outputs, we can also store some intermediate output files if they are not too large. This will be necessary if we ever want to port the code to Python or R.
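
One way such a regression check could work is to parse the stored expected edge list and a freshly generated one, then compare the edge sets and scores within a tolerance. The sketch below assumes a tab-delimited file with the hypothetical column names Regulator, Target, and Score. The real SINGE output columns may differ.

```python
import csv
import io
import math


def read_edges(handle):
    """Parse a tab-delimited edge list into {(regulator, target): score}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {(row["Regulator"], row["Target"]): float(row["Score"]) for row in reader}


def edges_match(expected, observed, tol=1e-5):
    """True if both edge lists contain the same edges with scores within tol."""
    if expected.keys() != observed.keys():
        return False
    return all(
        math.isclose(expected[edge], observed[edge], abs_tol=tol) for edge in expected
    )


# In-memory stand-ins for a stored expected file and a new output file.
expected_text = "Regulator\tTarget\tScore\nA\tB\t0.73125\nB\tC\t0.10000\n"
observed_text = "Regulator\tTarget\tScore\nB\tC\t0.100001\nA\tB\t0.73125\n"
expected = read_edges(io.StringIO(expected_text))
observed = read_edges(io.StringIO(observed_text))
print(edges_match(expected, observed))
```

Comparing with a tolerance rather than byte-for-byte avoids spurious failures from small numerical differences across platforms or MATLAB versions. It also ignores row order, which may or may not be desirable for a ranked list.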

Rebrand to SINGE

We're changing the abbreviation from SCINGE to SINGE. I'll change the repo name immediately and leave this open until we have fixed all of the individual files.

Automating SINGE for branching trajectories

We should investigate strategies to automate the workflow of splitting a branching trajectory into multiple cell-fate based subdirectories, followed by SINGE analysis on each subdirectory and obtaining GRNs based on these analyses. Some initial thoughts are provided in USAGE.md.

Reference GRN for ESC to endoderm differentiation

Hi, I am trying to reproduce Fig 3 from your paper and I was wondering if you could share the GRN for ESC to endoderm differentiation which you determined from the ESCAPE database? Thanks a lot!
