npg_conda's Introduction

This repository contains Conda recipes to build tools and libraries used by WSI NPG.

Our recipes differ from those provided by Anaconda Inc., Conda Forge and BioConda in order to meet our specific needs:

  • Build artefacts are separated into sub-packages. For a typical package written in C and named example, these would be:

    • example containing executables and their documentation, such as manpages.

    • libexample containing the example libraries (static and shared).

    • example-dev containing the C headers, any pkg-config files, build-time configuration executables and API manpages.

  • Recipes depend only on the Anaconda defaults channel.

  • We maintain recipes for multiple versions of packages in production. The recipes are located in a directory hierarchy organised by name and version.

  • Recipes do not support Windows or macOS.

Typical Conda recipes create a single package bundling all build artefacts (executables, libraries, headers, manpages, etc.) together. Installing a program that depends on a C shared library from another package therefore also installs any executables from that package into the target environment. This is something we specifically want to avoid.

We avoid using the conda-forge and bioconda channels so that we are in complete control of the deployed package dependency graph and do not unexpectedly upgrade (or downgrade) packages that may affect data analysis.

We don't use the Windows or macOS platforms, so we simplify our recipes by omitting support for them.

Building the recipes

Building from source requires Conda (we use Miniconda), with the conda-build and conda-verify packages installed.

Builds normally take place within the environment of a Docker container. The benefits are:

  • Builds will take place in the same environment, regardless of which OS they are run under.

  • Builds are isolated from one another because they each run in an independent container.

  • Our default build image does not contain build tools or a compiler, which reduces the chance of these being used over the Conda build tools.

Our build image contains Conda pre-installed and works by mounting two local directories, one which should contain the Conda recipes to build and another which will receive the built packages. The bin/build script handles mounting the directories and running the builds. The build script requires a list of packages to be supplied on STDIN and it will build them in that order.

This means that for a complete from-source build of all packages, they must be sorted so that packages that have dependencies are built after those they depend on. This can be achieved using the bin/recipebook script which inspects the recipes, calculates their dependency DAG and then outputs a list sorted so that they are built in the correct order:

./bin/recipebook recipes/ | head -4
rna-seqc 1.1.8 recipes/rna-seqc/1.1.8
bowtie2 2.2.7 recipes/bowtie2/2.2.7
teepot 1.2.0 recipes/teepot/1.2.0
eigen 3.3.4 recipes/eigen/3.3.4
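
The ordering step described above can be illustrated with a minimal Python sketch. It is not the bin/recipebook implementation. It assumes the dependencies have already been read from the recipes into a mapping (the package names and edges below are hypothetical, borrowed from the sub-package naming example) and uses the standard library graphlib module, available from Python 3.9, to sort them.

```python
from graphlib import TopologicalSorter


def build_order(dependencies):
    """Return package names sorted so each package follows its dependencies.

    `dependencies` maps a package name to the set of packages it depends on.
    graphlib raises CycleError if the mapping contains a dependency cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical dependency map, for illustration only.
deps = {
    "libexample": set(),
    "example-dev": {"libexample"},
    "consumer": {"example-dev", "libexample"},
}

order = build_order(deps)
for name in order:
    print(name)
```

Feeding a list sorted this way to a builder that processes packages strictly in input order guarantees that every dependency has been built before the packages that need it.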

Both of these scripts have command line help and a number of options to configure their behaviour. Note that the online help for bin/build reports default values dynamically (i.e. they are calculated for your current environment so that they describe accurately the values that will be used).

A complete build example:

./bin/recipebook | ./bin/build \
--recipes-dir $PWD --artefacts-dir $HOME/conda-artefacts \
--conda-build-image ghcr.io/wtsi-npg/centos-7-conda-build:latest --verbose

Here the recipes directory that will be mounted by the container is set explicitly, as is the artefacts directory, where the built packages will appear (these are both mounted into the container).

The artefacts directory can be used by multiple builds, sequentially. It will accumulate built packages that will be used as dependencies by later builds. Alternatively, you may prefer to push the built packages to a Conda channel and have later builds find them there.

If there are errors, Conda will report the full path to the failed build so that you can investigate. Common reasons for build failures (aside from errors in the new recipe) are:

  • The software being packaged requires an older or newer version of a compiler, build tool or library than is available in the channels

  • The software being packaged has a build system that fails to respect $PREFIX during installation

  • The software being packaged has undocumented dependencies

A successfully built package will be placed in the output root directory, which defaults to <CONDA_PREFIX>/conda-bld/. This may be changed in the .condarc file or by setting the CONDA_BLD_PATH environment variable; see the Conda build configuration section of the Conda user guide.

Naming new recipes

The rules are:

  1. The package containing the executables should be named after the commonly used name for the software (e.g. bwa, minimap2, curl)

  2. If 1. is not possible, e.g. because the executables are in a sub-package, the Conda meta-package is renamed {package name}-pkg and the executables sub-package keeps the common name (e.g. curl-pkg, curl, libcurl, libcurl-dev).

  3. If 2. is not possible, e.g. because the common name for the software is a library name and the software also provides executables, then the executables package is renamed {package}-bin (e.g. libxml2-pkg, libxml2-bin, libxml2, libxml2-dev).
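
As a rough illustration of the three rules, the following Python sketch derives the package names for a recipe. The function name, its parameters and the test for a "library name" (a leading "lib") are assumptions made for the example; they are not part of the repository's tooling.

```python
def recipe_package_names(common_name, has_sub_packages=False):
    """Suggest package names for a recipe, following the three naming rules.

    Assumption: a common name starting with "lib" is treated as a library
    name (rule 3). Real recipes may need a human decision instead.
    """
    if not has_sub_packages:
        # Rule 1: a single package named after the software.
        return [common_name]
    if common_name.startswith("lib"):
        # Rule 3: the common name is the library, so executables get "-bin".
        return [
            f"{common_name}-pkg",
            f"{common_name}-bin",
            common_name,
            f"{common_name}-dev",
        ]
    # Rule 2: the meta-package gets "-pkg", executables keep the common name.
    return [
        f"{common_name}-pkg",
        common_name,
        f"lib{common_name}",
        f"lib{common_name}-dev",
    ]


print(recipe_package_names("bwa"))
print(recipe_package_names("curl", has_sub_packages=True))
print(recipe_package_names("libxml2", has_sub_packages=True))
```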

Notes on glibc

The defaults Conda channel uses glibc 2.17 from CentOS 7.x. Our packages are built in a Docker CentOS 7.x container.

Special compilers

iRODS requires Clang to build. The Clang package available from conda-forge is unable to locate the Conda GCC 9.3 installation. We have made forks of the LLVM and Clang Conda recipes to work around this.

The packages may be built within the CentOS container using the following commands:

docker run --mount \
source=/home/ubuntu/llvmdev-feedstock,\
target=/home/conda/recipes,type=bind \
--mount \
source=/home/ubuntu/conda-artefacts,\
target=/opt/conda/conda-bld,type=bind \
-e CONDA_USER_ID=1001 -e CONDA_GROUP_ID=1001 -i --rm \
ghcr.io/wtsi-npg/centos-7-conda-build:latest \
/bin/sh -c 'export CONDA_BLD_PATH="/opt/conda/conda-bld" ; conda config --set auto_update_conda False ; cd /home/conda/recipes && conda build recipe'

docker run --mount \
source=/home/ubuntu/clangdev-feedstock,\
target=/home/conda/recipes,type=bind \
--mount \
source=/home/ubuntu/conda-artefacts,\
target=/opt/conda/conda-bld,type=bind \
-e CONDA_USER_ID=1001 -e CONDA_GROUP_ID=1001 -i --rm \
ghcr.io/wtsi-npg/centos-7-conda-build:latest \
/bin/sh -c 'export CONDA_BLD_PATH="/opt/conda/conda-bld" ; conda config --set auto_update_conda False ; cd /home/conda/recipes && conda build recipe'

npg_conda's People

Contributors

ces, dkj, dozy, eclissi91, freddodd6, frinksy, grishah, jenniferliddle, jmtcsngr, kjsanger, marcomoscasgr, mgcam, mksanger, mp15, sb10, srl147

npg_conda's Issues

Dependency graph calculation fails on build variants.

This is apparent in recipes that depend on iRODS. Recipes declaring a dependency on the variant

e.g.

host:
    - irods-dev {{ irods }}

do not have the {{ irods }} template expanded to a version during processing (I think) and hence have
that dependency omitted. The variants are declared in the root conda_build_config.yml.

We use our own template expansion hack at the moment. Perhaps we can use Conda's rendering API
instead?
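
To make the problem concrete, here is a minimal Python sketch of expanding such templates before the dependency graph is calculated. It is neither the repository's current expansion code nor Conda's rendering API. It assumes the variants have already been loaded into a dictionary mapping each key to a list of values, and the iRODS version numbers are hypothetical.

```python
import itertools
import re

# Matches Jinja-style placeholders such as "{{ irods }}".
TEMPLATE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def expand_requirement(requirement, variants):
    """Return one expanded requirement string per combination of variants.

    `variants` maps a variant key to a list of values. A requirement with no
    placeholders is returned unchanged as a single-element list.
    """
    keys = sorted(set(TEMPLATE.findall(requirement)))
    missing = [key for key in keys if key not in variants]
    if missing:
        raise KeyError(f"no variant declared for: {', '.join(missing)}")

    expanded = []
    for values in itertools.product(*(variants[key] for key in keys)):
        chosen = dict(zip(keys, values))
        expanded.append(
            TEMPLATE.sub(lambda m: chosen[m.group(1)], requirement)
        )
    return expanded


# Hypothetical variant values, for illustration only.
variants = {"irods": ["4.1.12", "4.2.7"]}
for line in expand_requirement("irods-dev {{ irods }}", variants):
    print(line)
```

Raising an error for an undeclared key, rather than silently dropping the requirement, would surface the omitted-dependency symptom described in this issue.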

npg_qc_utils 65.0 fails to build, dependency conflict

ERROR:root:conda.exceptions.UnsatisfiableError: The following specifications were found to be in conflict:
ERROR:root:  - libhts-dev==1.9+66_gbcf9bff -> libhts==1.9+66_gbcf9bff=plugins_201712_2
ERROR:root:  - samtools-dev -> libhts==1.10.2+110_gda59588

Typo in HOWTO

Bionic release is given as 18.08 at one point in the first section.

npg_qc_utils 67.0 fails to build, target bin directory not created by the build script

ERROR:root:+ pushd norm_fit
ERROR:root:+ mkdir -p build
ERROR:root:+ make -j 8 CC=/opt/conda/conda-bld/npg_qc_utils_1598365796410/_build_env/bin/x86_64-conda_cos6-linux-gnu-gcc LIBPATH=-L/opt/conda/conda-bld/npg_qc_utils_1598365796410/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib
ERROR:root:+ cp ./build/norm_fit /opt/conda/conda-bld/npg_qc_utils_1598365796410/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/bin/
ERROR:root:cp: cannot create regular file `/opt/conda/conda-bld/npg_qc_utils_1598365796410/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/bin/': Not a directory

cp ./build/norm_fit "$PREFIX/bin/"

bam_stats 4.4.1 fails to build, target bin directory not created by the build script

ERROR:root:+ cp ../bin/bam_stats /opt/conda/conda-bld/bam_stats_1598376370374/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/bin/
ERROR:root:cp: cannot create regular file `/opt/conda/conda-bld/bam_stats_1598376370374/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/bin/': Not a directory

bam_stats 1.13.0 fails to build, target bin directory not created by the build script

ERROR:root:+ cp ../bin/bam_stats /opt/conda/conda-bld/bam_stats_1598376291847/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/bin/
ERROR:root:cp: cannot create regular file `/opt/conda/conda-bld/bam_stats_1598376291847/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla/bin/': Not a directory

Use official Conda (new) compilers and build tools

Move to using the new Conda compilers (supplied by https://github.com/crosstool-ng/crosstool-ng) and using build tools supplied by Conda.

Pro:

  • We don't need to build compilers ourselves
  • We allow conda-build to do more useful linkage checking
  • We will have fewer problems when --error-overlinking becomes the default behaviour in conda-build v4.
  • We can build on a very vanilla OS setup having no compiler, make, autoconf, etc with reduced risk of accidentally using the host compiler or tools
  • We gain the possibility of cross-compiling from crosstool-ng

Con:

  • Writing recipes is trickier because some build systems do not respect setting the compiler executable via environment variables (e.g. have hard-coded gcc)
    so
    • We document the tricky parts with examples so that knowledge is spread
    • We submit patches upstream, if appropriate
  • The earliest gcc in defaults is gcc 5.4.* and some software will not build with anything later than 4.8.*, so
    • We deprecate certain software if we can
      or
    • We maintain a fork or patch that software

More information about build.py in the docs

@dozy brought up his concern regarding build.py not being mentioned in the docs. We should consider mentioning the role of scripts/build.py in the package build process. Currently, the docs assume it is well known that build.py is the one which coordinates the build process.

bambi 0.11.1 fails to build with multiple errors

./configure: line 3634: LT_INIT: command not found

and various conflicting type errors

ERROR:root:In file included from src/hts_addendum.h:27:0,                                                                                  
ERROR:root:                 from src/bambi.h:25,                                                                                           
ERROR:root:                 from src/i2b.c:19:                                                                                             
ERROR:root:src/cram/sam_header.h:242:10: error: conflicting types for 'sam_hdr_dup'                                                        
ERROR:root: SAM_hdr *sam_hdr_dup(SAM_hdr *hdr);                                                                                            
ERROR:root:          ^~~~~~~~~~~                         

Update HOWTO

Update for the new Conda channel, Docker builds and upload with aws rather than s3cmd

baton 2.0.1 fails to build when bioconda and conda-forge in channels

ERROR:root:In file included from /opt/conda/conda-bld/baton-pkg_1600274766318/_build_env/x86_64-conda-linux-gnu/sysroot/usr/include/string.h:637,
ERROR:root:                 from query.c:30:
ERROR:root:query.c: In function 'prepare_obj_repl_list':
ERROR:root:query.c:287:5: error: '__builtin_strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
ERROR:root:  287 |     strncpy(path1, path, len);
ERROR:root:      |     ^~~~~~~
ERROR:root:query.c:280:18: note: length computed here
ERROR:root:  280 |     size_t len = strlen(path) + 1;
ERROR:root:      |                  ^~~~~~~~~~~~
ERROR:root:In file included from /opt/conda/conda-bld/baton-pkg_1600274766318/_build_env/x86_64-conda-linux-gnu/sysroot/usr/include/string.h:637,
ERROR:root:                 from query.c:30:
ERROR:root:query.c:288:5: error: '__builtin_strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
ERROR:root:  288 |     strncpy(path2, path, len);
ERROR:root:      |     ^~~~~~~
ERROR:root:query.c:280:18: note: length computed here
ERROR:root:  280 |     size_t len = strlen(path) + 1;
ERROR:root:      |                  ^~~~~~~~~~~~
ERROR:root:In file included from /opt/conda/conda-bld/baton-pkg_1600274766318/_build_env/x86_64-conda-linux-gnu/sysroot/usr/include/string.h:637,
ERROR:root:                 from query.c:30:
ERROR:root:query.c: In function 'prepare_obj_list':
ERROR:root:query.c:189:5: error: '__builtin_strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
ERROR:root:  189 |     strncpy(path1, path, len);
ERROR:root:      |     ^~~~~~~
ERROR:root:query.c:182:18: note: length computed here
ERROR:root:  182 |     size_t len = strlen(path) + 1;
ERROR:root:      |                  ^~~~~~~~~~~~
ERROR:root:In file included from /opt/conda/conda-bld/baton-pkg_1600274766318/_build_env/x86_64-conda-linux-gnu/sysroot/usr/include/string.h:637,
ERROR:root:                 from query.c:30:
ERROR:root:query.c:190:5: error: '__builtin_strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
ERROR:root:  190 |     strncpy(path2, path, len);
ERROR:root:      |     ^~~~~~~
ERROR:root:query.c:182:18: note: length computed here
ERROR:root:  182 |     size_t len = strlen(path) + 1;
ERROR:root:

Caused by changes to build tools, I think. Also ... why/how is the build running as root?! The container steps down from root to the conda user on startup ...

Support for semantic versioning

I was trying to build a package with version x.y.z-beta and a parser failed to understand the version. I think it is because we only allow numeric characters and '-' was the first thing it found to upset it. Maybe beta is also going to be a problem. I was wondering if we are interested in supporting full semantic versions?
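
For discussion, here is a minimal Python sketch of what parsing a full semantic version could look like. It is not the parser used by the repository's scripts. The pattern is a simplified reading of the Semantic Versioning 2.0.0 grammar (for example, it does not reject leading zeros in numeric pre-release identifiers), and the type and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SemVer(NamedTuple):
    major: int
    minor: int
    patch: int
    prerelease: Optional[str]
    build: Optional[str]


# Simplified pattern: MAJOR.MINOR.PATCH, optional -prerelease, optional +build.
_SEMVER = re.compile(
    r"(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)"
    r"(?:-(?P<prerelease>[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
    r"(?:\+(?P<build>[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
)


def parse_semver(text):
    """Parse `text` as a semantic version or raise ValueError."""
    match = _SEMVER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a semantic version: {text!r}")
    return SemVer(
        int(match["major"]),
        int(match["minor"]),
        int(match["patch"]),
        match["prerelease"],
        match["build"],
    )


print(parse_semver("1.2.3-beta"))
```

Note that a strict parser like this would reject version strings of the form 1.9+66_gbcf9bff, which appear elsewhere in these issues: they have only two numeric components and contain an underscore. Supporting semantic versions would therefore have to be in addition to, not instead of, the existing scheme.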

Dirty htslib builds because of changes to aclocal.m4 when running autoreconf

$ git diff aclocal.m4                                                   
diff --git a/aclocal.m4 b/aclocal.m4                                    
index 7c76ad5..ae878b8 100644       
--- a/aclocal.m4                    
+++ b/aclocal.m4                    
@@ -1,7 +1,8 @@                     
-# generated automatically by aclocal 1.14.1 -*- Autoconf -*-           
-                                   
-# Copyright (C) 1996-2013 Free Software Foundation, Inc.               
+# generated automatically by aclocal 1.11.3 -*- Autoconf -*-           
                                    
+# Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,  
+# 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation,   
+# Inc.                             
 # This file is free software; the Free Software Foundation             
 # gives unlimited permission to copy and/or distribute it,             
 # with or without modifications, as long as this notice is preserved.  
@@ -11,7 +12,6 @@                   
 # even the implied warranty of MERCHANTABILITY or FITNESS FOR A        
 # PARTICULAR PURPOSE.              
                                    
-m4_ifndef([AC_CONFIG_MACRO_DIRS], [m4_defun([_AM_CONFIG_MACRO_DIRS], [])m4_defun([AC_CONFIG_MACRO_DIRS], [_AM_CONFIG_MACRO_DIRS($@)])])        
 # pkg.m4 - Macros to locate and utilise pkg-config.            -*- Autoconf -*-                                                                
 # serial 1 (pkg-config-0.24)       
 #                                  

See samtools/htslib#733

dated directory: incorrectly linked executables

... due to multiple versions of executables

ubuntu@mg8-test-esa:~/npg_esa/npg_stack$ find /software/pkg/*/*/bin | grep curl
/software/pkg/bambi/0.10.1/bin/curl-config
/software/pkg/bambi/0.10.1/bin/curl
/software/pkg/bam_stats/1.13.0/bin/curl-config
/software/pkg/bam_stats/1.13.0/bin/curl
/software/pkg/bcftools/1.7/bin/curl-config
/software/pkg/bcftools/1.7/bin/curl
/software/pkg/biobambam2/2.0.79/bin/curl-config
/software/pkg/biobambam2/2.0.79/bin/curl
/software/pkg/curl/7.58.0/bin
/software/pkg/curl/7.58.0/bin/curl-config
/software/pkg/curl/7.58.0/bin/curl
/software/pkg/samtools/1.7/bin/curl-config
/software/pkg/samtools/1.7/bin/curl
/software/pkg/staden_io_lib/1.14.9/bin/curl-config
/software/pkg/staden_io_lib/1.14.9/bin/curl
/software/pkg/tears/1.2.4/bin/curl-config
/software/pkg/tears/1.2.4/bin/curl

ubuntu@mg8-test-esa:~/npg_esa/npg_stack$ readlink "$(which curl)"
/software/pkg/tears/1.2.4/bin/curl

ubuntu@mg8-test-esa:~/npg_esa/npg_stack$ /software/npg/20180405/bin/curl --version
curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 zlib/1.2.11
Release-Date: 2018-01-24
Protocols: dict file ftp gopher http imap pop3 rtsp smtp telnet tftp
Features: AsynchDNS IPv6 Largefile libz UnixSockets

############ Compare to #############

ubuntu@mg8-test-esa:~/npg_esa/npg_stack$ /usr/bin/curl --version
curl 7.47.0 (x86_64-pc-linux-gnu) libcurl/7.47.0 GnuTLS/3.4.10 zlib/1.2.8 libidn/1.32 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP UnixSockets

curl: add TLS support

ubuntu@mg8-test-esa:~/npg_esa/npg_stack$ which curl
/software/npg/20180405/bin/curl

ubuntu@mg8-test-esa:~/npg_esa/npg_stack$ curl --version
curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 zlib/1.2.11
Release-Date: 2018-01-24
Protocols: dict file ftp gopher http imap pop3 rtsp smtp telnet tftp
Features: AsynchDNS IPv6 Largefile libz UnixSockets

ubuntu@mg8-test-esa:~/npg_esa/npg_stack$ curl https://glide.sh/get | sh
curl: (1) Protocol "https" not supported or disabled in libcurl

Recent build issues

In the past week some C++ from-source package builds have started failing with these error messages:

/opt/conda/conda-bld/irods_1624267692392/_build_env/x86_64-conda-linux-gnu/sysroot/lib/../lib64/libstdc++.so: undefined reference to `aligned_alloc@GLIBC_2.16'
/opt/conda/conda-bld/irods_1624267692392/_build_env/x86_64-conda-linux-gnu/sysroot/lib/../lib64/libstdc++.so: undefined reference to `clock_gettime@GLIBC_2.17'

Failures include recipes that previously succeeded and that are built in Docker containers that have not been changed. Does this point to an upstream issue?

Failed builds are characterised by:

  • Using older C++ compilers, e.g. 5.x

  • Using Clang 8.x (conda-forge), e.g. iRODS requires this vintage of Clang

Tasks to investigate:

  • Rebuild all recipes to establish scope, use defaults channel only

  • Investigate Conda issues

  • Review Conda commit log

  • Investigate conda-build issues

  • Review conda-build commit log

  • Check Anaconda aggregate issues (partial)

  • Check Anaconda aggregate commit logs (partial)

baton 2.0.1 fails to build with Werror=stringop-overflow

ERROR:root:In file included from /opt/conda/conda-bld/baton-pkg_1598436768177/_build_env/x86_64-conda-linux-gnu/sysroot/usr/include/string.h:637,
ERROR:root:                 from query.c:30:
ERROR:root:query.c: In function 'prepare_obj_repl_list':
ERROR:root:query.c:287:5: error: '__builtin_strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
ERROR:root:  287 |     strncpy(path1, path, len);
ERROR:root:      |     ^~~~~~~
ERROR:root:query.c:280:18: note: length computed here
ERROR:root:  280 |     size_t len = strlen(path) + 1;
ERROR:root:      |                  ^~~~~~~~~~~~

Tests mix iRODS 4.1 and 4.2 dependencies

The tests need to differentiate between iRODS 4.1 and the new iRODS 4.2 dependencies according to which version of the iRODS server is hosting the tests. The Conda channels for the new iRODS packages and dependencies are not set correctly.
