diyabcgui's Introduction

Graphical User Interface for DIYABC-RF software

Disclaimer: DIYABC-RF GUI is in its final development stage. You may still encounter a few bugs. Please check the ongoing issues or file a new one if you encounter any problem.

We provide a graphical user interface (GUI) for the DIYABC-RF software [1], called DIYABC-RF GUI.

Note: DIYABC-RF GUI replaces the old interface DIYABC V2.1, which is no longer maintained.

Please check the project website for additional information and detailed documentation.

Availability

DIYABC-RF GUI is available as a standalone application, or as a shiny web app implemented in the diyabcGUI R package.

You can either install and run the standalone app, or install the diyabcGUI R package and run DIYABC-RF GUI as a standard shiny app (see below).

DIYABC-RF GUI provides a set of tools implementing Approximate Bayesian Computation (ABC) combined with supervised machine learning based on Random Forests (RF), for model choice and parameter inference in the context of population genetics analysis.

DIYABC-RF GUI (and the package diyabcGUI) is a user-friendly interface for the command-line programs diyabc and abcranger, which are the building blocks of the DIYABC-RF pipeline.

Authorship and licensing

The DIYABC-RF GUI software is edited by the DIYABC-RF Core team.

DIYABC-RF Core team: François-David Collin, Ghislain Durif, Louis Raynal, Mathieu Gautier, Renaud Vitalis, Eric Lombaert, Jean-Michel Marin, Arnaud Estoup

The Windows DIYABC-RF GUI standalone app is based on DesktopDeployR by Wyming Lee Pang (https://github.com/wleepang/DesktopDeployR).

See the dedicated file for detailed copyright and licensing information.

Using and citing DIYABC-RF

If you use the DIYABC-RF software suite (GUI or CLI) in your study, please consider citing [1].


Installation

Requirements

  • zip program

Standalone app

For Windows users:

  1. Please download the latest release of DIYABC-RF GUI at https://github.com/diyabc/diyabcGUI/releases/latest and unzip DIYABC-RF_GUI_<latest_version>.zip

  2. To launch DIYABC-RF GUI, run DIYABC-RF_GUI (or DIYABC-RF_GUI.bat) in the previously extracted directory, either by double-clicking it or from a terminal. You can also create a shortcut to it by right-clicking on it.

  3. It will open a new tab in your web browser and you can use DIYABC-RF GUI as a web app.

Important: do not forget to quit the app with the dedicated button when you are done (otherwise some related background processes will remain active). Repeat steps 2 and 3 to launch the application again.

A log file for DIYABC-RF GUI is available in your user-specific directory for temporary files, generally C:\Users\<username>\AppData\Local\Temp\DIYABC-RF_GUI_<timestamp>_<random_number>/.
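
As a rough illustration of how to locate these log directories from R, the sketch below lists the directories whose names start with DIYABC-RF_GUI_ in a given temporary directory, most recent first. The helper name find_gui_log_dirs is hypothetical (it is not part of diyabcGUI), and the sketch assumes that the TEMP environment variable points to the user-specific temporary directory, which is usually the case on Windows.

```r
# Hypothetical helper (not part of diyabcGUI): list DIYABC-RF GUI log
# directories found in a temporary directory, most recent first.
# Assumption: on Windows, the TEMP environment variable points to the
# user-specific directory for temporary files.
find_gui_log_dirs <- function(temp_dir = Sys.getenv("TEMP")) {
  # keep only entries named DIYABC-RF_GUI_<timestamp>_<random_number>
  dirs <- list.files(temp_dir, pattern = "^DIYABC-RF_GUI_", full.names = TRUE)
  dirs <- dirs[dir.exists(dirs)]
  # sort by modification time, newest first
  dirs[order(file.mtime(dirs), decreasing = TRUE)]
}

# Example call (on Windows): find_gui_log_dirs()
```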

If you want to open multiple DIYABC-RF projects, you need to run multiple instances of DIYABC-RF GUI simultaneously (i.e. repeat steps 2 and 3 for each project).

At the moment, the standalone app is not available for Linux and macOS users. Nonetheless, Linux and macOS users can install the diyabcGUI package (see below) and run DIYABC-RF GUI as a standard shiny app.

Note: if you encounter instability in the standalone app, we recommend installing and using the shiny app available in the diyabcGUI R package (see below).

R package installation

  1. Install the devtools package (if not already installed on your system)

      install.packages("devtools")

     Note: if you encounter any issue when installing devtools, please check the next section.

  2. Install the diyabcGUI package

      devtools::install_github(
          "diyabc/diyabcGUI",
          subdir = "R-pkg"
      )

  3. The first time after installation, you need to download the required binary files (e.g. the diyabc and abcranger command-line tools) by running

      library(diyabcGUI)
      diyabcGUI::dl_all_latest_bin()

     Note: you can run this command from time to time to update the required binary files in case new versions have been released.

  4. Launch the interface

      library(diyabcGUI)
      diyabcGUI::diyabc()

The function diyabc() will launch DIYABC-RF GUI as a standard shiny web app, which you can use either in your web browser or in the RStudio shiny app viewer.

To run multiple instances of DIYABC-RF GUI simultaneously, e.g. to manage and run multiple projects at the same time, you just need to call the function diyabc() several times from R (this is not possible from RStudio).


Potential issue with devtools

You may encounter some issues when installing devtools. In that case, please check the official devtools page.

Following devtools recommendations, make sure you have a working development environment.

  • Windows: Install Rtools.
  • Mac: Install Xcode from the Mac App Store.
  • Linux: Install a compiler and various development libraries (details vary across different flavors of Linux).

For Ubuntu users, a guide to installing the devtools requirements is available (users of other Linux distributions may still find it useful).


Shiny server installation

As a shiny app, DIYABC-RF GUI can be installed and run from a Shiny server. To do so, you just need to (on a Unix system, please adapt for a Windows server):

  1. Install the diyabcGUI package on your system (see above).
  2. Manage the file access rights so that the Shiny server has access to the R package installation directory.
  3. Create a symbolic link to the directory given by the R command system.file("application", package = "diyabcGUI") inside the site_dir folder configured in /etc/shiny-server/shiny-server.conf (by default /srv/shiny-server), e.g.:

      ln -s /path/to/R_LIBS/diyabcGUI/application /srv/shiny-server/diyabc

  4. DIYABC-RF GUI is now available on your server at https://my.shiny.server.address/diyabc
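
The symbolic link of step 3 can also be created from R. The following is only a minimal sketch, assuming a Unix system, sufficient write permissions on the site_dir folder, and the default /srv/shiny-server location. The helper name link_shiny_app is hypothetical and not part of diyabcGUI; it relies only on base R functions.

```r
# Hypothetical helper (not part of diyabcGUI): link an application directory
# into the Shiny server site directory.
# Assumptions: Unix system, write access to site_dir, default site_dir path.
link_shiny_app <- function(app_dir,
                           site_dir = "/srv/shiny-server",
                           name = "diyabc") {
  stopifnot(dir.exists(app_dir), dir.exists(site_dir))
  target <- file.path(site_dir, name)
  # do nothing if something already exists at the target location
  if (!file.exists(target)) {
    ok <- file.symlink(app_dir, target)
    if (!ok) stop("could not create symbolic link: ", target)
  }
  invisible(target)
}

# Example call, once the diyabcGUI package is installed:
# link_shiny_app(system.file("application", package = "diyabcGUI"))
```

The directory returned by system.file() depends on where the package was installed, which is why it is passed as an argument rather than hard-coded.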

Standalone build (for developers)

Please see the dedicated directory for instructions on building the standalone app.


Reference

[1] Collin F-D, Durif G, Raynal L, Gautier M, Vitalis R, Lombaert E, Marin J-M, Estoup A (2021). Extending Approximate Bayesian Computation with Supervised Machine Learning to infer demographic history from genetic polymorphisms using DIYABC Random Forest. Molecular Ecology Resources, Wiley/Blackwell, 21(8), pp. 2598–2613. doi:10.1111/1755-0998.13413, hal-03229207

diyabcgui's People

Contributors

fradav, gdurif

diyabcgui's Issues

Windows standalone build

  1. Run build_standalone.R (based on the R package electricShine).
  2. If step 1 does not work, try running build_standalone_windows.R (based on the R package RInno).

[RF] For MSS data, the list of possible parameters to estimate is missing in the RF section in parameter estimation mode

With microsatellites and/or sequences, in the random forest part, it seems that the list of parameters is not given when one wants to run a parameter estimation (it says "missing parameter"). Nevertheless, it still works: it is perfectly possible to run a parameter estimation on any parameter. Only the "reminder" is not displayed (whereas it is present in my tests with SNPs).
However, this is not a minor issue, because it is impossible to know the actual names of the mutational parameters from the standalone app. You have to dig into the header to find them, which many people will not know how to do. I suspect that the mutational parameters are the reason why it does not work for microsatellites/sequences, whereas it works for SNPs.

Add information about summary statistics

Add following information in the training set simulation panel:

For SNP loci (both IndSeq and PoolSeq SNPs)

WARNING! ALL SUMMARY STATISTICS IMPLEMENTED IN THE PROGRAM WILL BE COMPUTED AND INCLUDED IN THE TRAINING DATASET

For both IndSeq and PoolSeq SNP loci, the following set of summary statistics has been implemented.

  1. Proportion of monomorphic loci for each population, as well as for each pair and triplet of populations (ML1p, ML2p, ML3p)

Mean and variance (over loci) values are computed for all subsequent summary statistics.

  2. Heterozygosity for each population (HW) and for each pair of populations (HB)
  3. FST-related statistics for each population (FST1), for each pair (FST2), triplet (FST3) and quadruplet (FST4) of populations, and over all populations (FSTall) when the dataset includes more than four populations
  4. Patterson’s f-statistics for each triplet (f3-statistics; F3) and quadruplet (f4-statistics; F4) of populations
  5. Nei’s distance (NEI) for each pair of populations
  6. Maximum likelihood coefficient of admixture (AML) computed for each triplet of populations

For microsatellite loci

WARNING! ALL SUMMARY STATISTICS IMPLEMENTED IN THE PROGRAM WILL BE COMPUTED AND INCLUDED IN THE TRAINING DATASET

For microsatellite loci, the following set of summary statistics has been implemented.

Single sample statistics:
1. mean number of alleles across loci (NAL)
2. mean gene diversity across loci (HET)
3. mean allele size variance across loci (VAR)
4. mean M index across loci (MGW)
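
To make the first three single sample statistics more concrete, here is a small R sketch based on textbook definitions (number of distinct alleles, unbiased gene diversity, and sample variance of allele sizes, each averaged over loci). It is only an illustration: the function name single_sample_stats and the input layout are assumptions, and the exact estimators implemented in diyabc may differ. The M index (MGW) is not covered.

```r
# Illustrative sketch only: textbook versions of NAL, HET and VAR.
# Assumption: `alleles` is a numeric matrix of allele sizes with one row per
# gene copy and one column per locus (NA for missing data).
single_sample_stats <- function(alleles) {
  per_locus <- apply(alleles, 2, function(x) {
    x <- x[!is.na(x)]
    n <- length(x)
    p <- table(x) / n                           # allele frequencies
    c(NAL = length(p),                          # number of distinct alleles
      HET = n / (n - 1) * (1 - sum(p^2)),       # unbiased gene diversity
      VAR = stats::var(x))                      # allele size variance
  })
  rowMeans(per_locus)                           # mean across loci
}

# Two loci, four gene copies each
alleles <- matrix(c(100, 100, 102, 104,
                    200, 200, 200, 200), ncol = 2)
single_sample_stats(alleles)
```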

Two sample statistics:
1. mean number of alleles across loci (two samples) (N2P)
2. mean gene diversity across loci (two samples) (H2P)
3. mean allele size variance across loci (two samples) (V2P)
4. F_{ST} between two samples (FST)
5. mean index of classification (two samples) (LIK)
6. shared allele distance between two samples (DAS)
7. (\delta\mu)^2 distance between two samples (DM2)

Three sample statistics:
1. Maximum likelihood coefficient of admixture (AML)

For DNA sequence loci

WARNING! ALL SUMMARY STATISTICS IMPLEMENTED IN THE PROGRAM WILL BE COMPUTED AND INCLUDED IN THE TRAINING DATASET

For DNA sequence loci, the following set of summary statistics has been implemented.

Single sample statistics:
1. number of distinct haplotypes (NHA)
2. number of segregating sites (NSS)
3. mean pairwise difference (MPD)
4. variance of the number of pairwise differences (VPD)
5. Tajima’s D statistics (DTA)
6. Number of private segregating sites (PSS)
7. Mean of the numbers of the rarest nucleotide at segregating sites (MNS)
8. Variance of the numbers of the rarest nucleotide at segregating sites (VNS)

Two sample statistics:
1. number of distinct haplotypes in the pooled sample (NH2)
2. number of segregating sites in the pooled sample (NS2)
3. mean of within sample pairwise differences (MP2)
4. mean of between sample pairwise differences (MPB)
5. F_{ST} between two samples (HST)

Three sample statistics:
1. Maximum likelihood coefficient of admixture (SML)

Errors encountered when installing diyabc standalone on Windows

Hi developers,
I tried to install the diyabc standalone app from RStudio with R version 4.0.3, using the "build_standalone.R" script you provide in diyabcGUI-1.0.3.zip. However, the installation process failed with the following error:

Finshed: Installing your Shiny package into electricShine framework
Error in system.file("extdata", "icon", package = my_package_name, lib.loc = library_path) :
'package' must be of length 1

However, the package was listed in the package catalogue, so I tried running diyabcGUI::diyabc(). The GUI came up, but after I chose the data file, the GUI disappeared with the following error:

Loading required package: shiny
Listening on http://127.0.0.1:3375
[1] "project directory: C:\Users\admin\AppData\Local\Temp\RtmpIfo76m\diyabc56d876745c24"
Warning: Error in stri_c: can't find 'content'
50: stri_c
49: str_c
47: check_mss_data_file
46: check_data_file
45:
2: shiny::runApp
1: diyabcGUI::diyabc

Could you help me figure out the problem? I am planning to analyze my mss data with your work.

My system is Windows 10 X64 , RAM32GB, 6 cores.

Thank you very much in advance~

Siran

P.S. I also found DIYabc v2.1.0 and v2.0.4 can't run anymore.

[project setup] Fix data info for PoolSeq data set

Use following output for data info in case of PoolSeq data

  • Number of population pools : 4
  • Total number of loci = 30000
  • MRC=5 (forget MAF)
  • Number of loci available with MRC >= 5: XXXXXXXXXXXX (from diyabc output)

Same for section Loci description

  • Recall under '<n_loci> ': Number of loci available with MRC >= 5: XXXXXXXXXXXX

Windows build script asks for a CRAN mirror at first start

On a freshly installed, vanilla Windows 10 machine with R installed (but never launched):

franc@DESKTOP-QHPLHB6 C:\Users\Franc\Documents\dev\diyabcGUI>"c:\Program Files\R\R-4.0.2\bin\x64\R.exe" --no-save < build_standalone_windows.R

R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R est un logiciel libre livré sans AUCUNE GARANTIE.
Vous pouvez le redistribuer sous certaines conditions.
Tapez 'license()' ou 'licence()' pour plus de détails.

R est un projet collaboratif avec de nombreux contributeurs.
Tapez 'contributors()' pour plus d'information et
'citation()' pour la façon de le citer dans les publications.

Tapez 'demo()' pour des démonstrations, 'help()' pour l'aide
en ligne ou 'help.start()' pour obtenir l'aide au format HTML.
Tapez 'q()' pour quitter R.

> # generate standalone interface for Windows
>
> # requirement
> install.packages("devtools")
Error in contrib.url(repos, "source") :
  essai d'utilisation de CRAN sans fixer un miroir
Calls: install.packages -> contrib.url
Exécution arrêtée
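
The error above (shown in a French locale; the English message is "trying to use CRAN without setting a mirror") occurs because no CRAN mirror is configured in a fresh non-interactive R session. One possible workaround, given here as a sketch rather than as the fix adopted in the project, is to set a mirror at the top of the build script before calling install.packages(). The helper name ensure_cran_mirror is hypothetical, and the mirror URL is just one common choice.

```r
# Hypothetical helper: set a CRAN mirror if none is configured yet.
# Assumption: https://cloud.r-project.org is an acceptable mirror.
ensure_cran_mirror <- function(mirror = "https://cloud.r-project.org") {
  repos <- getOption("repos")
  # "@CRAN@" is the placeholder R uses when no mirror has been chosen
  if (is.null(repos) || is.na(repos["CRAN"]) || repos["CRAN"] == "@CRAN@") {
    repos["CRAN"] <- mirror
    options(repos = repos)
  }
  invisible(getOption("repos"))
}

ensure_cran_mirror()
# install.packages("devtools")  # would now run without asking for a mirror
```

Alternatively, the repos argument can be passed directly, e.g. install.packages("devtools", repos = "https://cloud.r-project.org").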

Scenario numbering

In the training set simulation sub-module, in the "historical scenario definition" panel, when adding new scenarios and then removing some of them, the numbering still accounts for the deleted scenarios.

Check for full coalescence in historical scenario

When defining a scenario, a check for full coalescence should be performed.

Example of a non-coalescent scenario that is currently not detected:

N1 N2 N3 N4
0 sample 1
0 sample 2
0 sample 3
0 sample 4
t1 merge 1 2
t2 merge 3 4
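
One simple way to detect such a case could look like the R sketch below. It tracks which populations still carry lineages while reading the events in order, and reports full coalescence only if a single population remains at the end. This is a simplified heuristic, not the check implemented in diyabcGUI: the function name is_fully_coalescent is hypothetical, events are assumed to be listed in chronological order, and the split syntax (admixed population first, then the two source populations) is an assumption.

```r
# Hypothetical helper: TRUE if a single population remains after all events.
# Assumptions: the first line lists effective sizes, events are given in
# chronological order, "t merge a b" moves the lineages of pop b into pop a,
# and "t split a b c r" moves the lineages of pop a into pops b and c.
is_fully_coalescent <- function(scenario_lines) {
  lines <- trimws(scenario_lines)
  lines <- lines[nzchar(lines)]
  events <- strsplit(lines[-1], "\\s+")
  active <- integer(0)  # populations that currently carry lineages
  for (ev in events) {
    if (length(ev) < 3) next
    type <- tolower(ev[2])
    if (type == "sample") {
      active <- union(active, as.integer(ev[3]))
    } else if (type == "merge") {
      active <- union(setdiff(active, as.integer(ev[4])), as.integer(ev[3]))
    } else if (type == "split") {
      active <- union(setdiff(active, as.integer(ev[3])),
                      as.integer(ev[4:5]))
    }
    # other events (e.g. varNe) do not change the set of populations
  }
  length(active) == 1
}

scenario <- c("N1 N2 N3 N4",
              "0 sample 1", "0 sample 2", "0 sample 3", "0 sample 4",
              "t1 merge 1 2", "t2 merge 3 4")
is_fully_coalescent(scenario)                     # populations 1 and 3 remain
is_fully_coalescent(c(scenario, "t3 merge 1 3"))  # a single population remains
```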

[training set simu] Freeze with complex scenarios (in particular merge to ghost pop)

I ran various tests with microsatellite data, including real analyses for which I already had results, so that I could make comparisons: overall I did not see any problems, apart from those mentioned previously.

However, I ran into a problem when I moved on to a slightly more complex analysis. Still on a microsatellite dataset (attached) containing 7 populations, the program froze when I tried the following scenario (note that this scenario works very well with the latest programs under Linux):

N1 NKo NKa NJ12 N2 N3 N4 Na
0 sample 1
0 sample 2
0 sample 3
0 sample 4
0 sample 5
0 sample 6
0 sample 7
tJ12-DBJ12 varNe 4 NJ12B
tJ12 VarNe 4 NgJ12
tgJ12 merge 8 4
tKa-DBKa varNe 3 NKaB
tKa VarNe 3 NgKa
tgKa merge 8 3
tKo-DBKo varNe 2 NKoB
tKo VarNe 2 NgKo
tgKo merge 8 2
t1 merge 8 1
t2 merge 8 5
t3 merge 8 6
t4 merge 8 7
ta VarNe 8 Naold

Edit1: no problem with the R package on Linux, cf. 86c39c2
Edit2: but the simulations failed (because conditions on the time parameters are needed)

This scenario makes fairly recurrent use of a ghost population (pop 8). I ran various tests, and my impression is that merging with this ghost population several times is what causes the problem. If I do only one merge, it works. For example:

N1 NKo NKa NJ12 N2 N3 N4 Na
0 sample 1
0 sample 2
0 sample 3
0 sample 4
0 sample 5
0 sample 6
0 sample 7
tJ12-DBJ12 varNe 4 NJ12B
tJ12 merge 3 4
tKa-DBKa varNe 3 NKaB
tKa merge 2 3
tKo-DBKo varNe 2 NKoB
tKo merge 1 2
t1 merge 5 1
t2 merge 6 5
t3 merge 7 6
t4 merge 8 7
ta VarNe 8 Naold

As soon as I merge a second population into pop 8, it crashes.

That's all.

I will try to dig into other aspects, in particular by working with Microsat/Sequence and Sequence-only datasets. Then I will run tests on SNPs. But that will probably be after the summer break.

reference table generation issue

Hello,
I am using the GUI for my DIYABC analyses. I have been running the program and some scenarios successfully over the last few days, but now I keep getting an error:

"something happened during the reftable generation:
Program of thread "6juillet reference table generation" exited (with return code 1) unsuccessfully.
sOk[ipart]=18

I can't find anything about this error in the manual, and I don't understand whether this is a memory issue or what is wrong with the program.
I would highly appreciate any advice or suggestions.
Best regards
Alice

[general] Modify management of project name and sub-projects

  • Setup main project name (mandatory, not optional as it currently is)
  • Setup training set simulation sub-project name with a corresponding sub-directory
  • Setup analysis name for each analysis (to avoid over-writing) with a corresponding sub-directory
