biocontainers / multi-package-containers
Testing building mulled containers for multi-requirement tools.
Home Page: https://biocontainers.pro/#/multipackage
It appears that the helper service doesn't load the entire channel: if you search for something like star in the bioconda channel, it isn't there. The last packages listed start with m, which is unlikely to be the true end of the channel.
I get this error after updating and committing to master
WARNING: Hard-coding image build instead of using Conda build - this is not recommended.
https://github.com/BioContainers/multi-package-containers/actions/runs/4218095412/jobs/7322346121#step:8:77
ls: cannot access 'singularity_import': No such file or directory
https://github.com/BioContainers/multi-package-containers/actions/runs/4218095412/jobs/7322346121#step:8:78
Error: Process completed with exit code 2.
The container below seems to exist on quay.io, but not on galaxyproject.org/singularity. Is this expected?
mulled-v2-a1289c2d7470e63e3c3a9f6131984bbf7c28ad45:2c0dd6dc388570cf3b02c569a856aa0d6e9dc84a-0
which corresponds to this line in hash.tsv:
python=3.9,matplotlib=3.5.1,pandas=1.3.5,r-sys=3.4,regex=2021.11.10,scipy=1.7.3
Hi everyone,
We’ve recently tried to containerize around 2000 high-quality Galaxy tools. All of these tools contain test cases (2800 in total) that we can run programmatically, and these tests are run before a new tool is accepted or an existing tool is upgraded. Previously the tests ran on Travis CI, using Conda to satisfy dependencies. Running the tests in containers revealed two very common classes of errors which I think we should address:
We don’t respect the extended-base requirement annotated in many bioconda recipes, which leads to broken multi-package containers (bioconda sets the destination image when building biocontainers). I’ve fixed this in galaxy-tool-util by checking whether any of the specified recipes requires the extended base container, and a new version of planemo will roll this out. But I think we also need to indicate somewhere that a container has been built with the extended base. We could record whether the minimal base image or the extended base image was used (or other images we may need in the future, think GPU etc.). @bgruening proposed prefixing the combinations with container:extended;, etc.
One issue here is that we’ll only know that a container needs the extended base after running the check in galaxy-tool-util, so I’m not sure what could be done about the helper service (https://biocontainers.pro/#/multipackage).
We don’t update multi-package containers when new package builds are published to Conda. New builds are frequently published because of missing dependencies, so multi-package containers are frequently broken despite the problem being fixed upstream. I think it would be feasible to scan for new builds and increment the multi-container build number (possibly also deleting the earlier build should we run into space problems). This is the biggest time-sink when switching a large code base from Conda to containers at this point. Relatedly, I think we should record the Conda build numbers of a build somewhere, so that we know which builds we are dealing with.
One suggestion here would be to include the Conda build numbers, so we would have container:extended;<package1>--<build2>,<package2>--<build10>.
Again it would be great to get some suggestions for how to manage this.
If adding new versions would be restrictive in terms of size we could prune everything but the newest version. I think getting working and up to date containers should be higher priority than maintaining potentially broken containers (but I can see that this might be controversial).
Another, more limited option is to prefix the build number, so we’d have build:2;container:extended;<package1>,<package2>, etc.
That also gives us a nice handle on manually triggering a rebuild. Detailed build info could still be parsed out manually or on a CI system from /usr/local/conda-meta.
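A sketch of how such prefixed combination strings could be parsed. The build:/container: keys are the hypothetical scheme proposed above, not an existing format in hash.tsv:

```python
def parse_combination(line):
    """Split a prefixed combination string into (metadata, packages).

    Hypothetical format from the proposal above, e.g.
    'build:2;container:extended;<package1>,<package2>'.
    """
    meta = {}
    parts = line.split(";")
    # Leading 'key:value' segments are metadata prefixes; the final
    # segment is the comma-separated package list.
    while len(parts) > 1 and ":" in parts[0]:
        key, _, value = parts[0].partition(":")
        meta[key] = value
        parts = parts[1:]
    packages = parts[0].split(",") if parts and parts[0] else []
    return meta, packages
```

Lines without any prefix parse unchanged, so the scheme would stay backward compatible with existing hash.tsv entries.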
Hello -
The ncbi-datasets-cli container does not download genome data from NCBI. I am able to launch the container and display the help menu, but when I attempt to run their example, datasets download genome taxon "bos taurus", I get this error: No assembly available. This error occurs with both the 12.20.1 and 12.11.0 versions. With the 11.25.1 version I get a different error:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x87ec9a]
goroutine 1 [running]:
main/datasets/datasets.baseRetryPolicy(...)
src/datasets/datasets/root.go:182
main/datasets/datasets.DefaultRetryPolicy(0xa35290, 0xc0000280d0, 0x0, 0xa2dfc0, 0xc000316810, 0x0, 0x4ae360, 0x203000)
src/datasets/datasets/root.go:152 +0x7a
github.com/hashicorp/go-retryablehttp.(*Client).Do(0xc000145b90, 0xc0000b9c50, 0x0, 0x0, 0x0)
external/com_github_hashicorp_go_retryablehttp/client.go:597 +0x1f1
github.com/hashicorp/go-retryablehttp.(*RoundTripper).RoundTrip(0xc00000e3f0, 0xc0001ed700, 0xc00000e3f0, 0x0, 0x0)
external/com_github_hashicorp_go_retryablehttp/roundtripper.go:44 +0x85
net/http.send(0xc0001ed700, 0xa2c8c0, 0xc00000e3f0, 0x0, 0x0, 0x0, 0xc0000103b0, 0x8f2540, 0x1, 0x0)
GOROOT/src/net/http/client.go:251 +0x454
net/http.(*Client).send(0xc0001fa240, 0xc0001ed700, 0x0, 0x0, 0x0, 0xc0000103b0, 0x0, 0x1, 0x0)
GOROOT/src/net/http/client.go:175 +0xff
net/http.(*Client).do(0xc0001fa240, 0xc0001ed700, 0x0, 0x0, 0x0)
GOROOT/src/net/http/client.go:717 +0x45f
net/http.(*Client).Do(...)
GOROOT/src/net/http/client.go:585
main/openapi_client.(*APIClient).callAPI(0xc0000cd180, 0xc0001ed700, 0x0, 0xc000026f80, 0x34)
src/generated/openapi_client/client.go:188 +0x4d
main/openapi_client.(*GenomeApiService).GenomeMetadataByPost(0xc0000cd188, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
src/generated/openapi_client/api_genome.go:1340 +0x46d
main/datasets/datasets.getAssemblyMetadataPage(0xc0000cd180, 0xc00014f980, 0x3e8, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
src/datasets/datasets/SummaryGenomeAccession.go:61 +0x105
main/datasets/datasets.getAssemblyMetadataWithPost(0xc00014f980, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
src/datasets/datasets/SummaryGenomeAccession.go:100 +0x194
main/datasets/datasets.glob..func7(0xd06d80, 0xc0000b9890, 0x1, 0x1, 0x0, 0x0)
src/datasets/datasets/DownloadGenomeTaxon.go:41 +0x10f
github.com/spf13/cobra.(*Command).execute(0xd06d80, 0xc0000b9870, 0x1, 0x1, 0xd06d80, 0xc0000b9870)
external/com_github_spf13_cobra/command.go:842 +0x472
github.com/spf13/cobra.(*Command).ExecuteC(0xd050a0, 0x4686c5, 0xc000000180, 0x200000003)
external/com_github_spf13_cobra/command.go:950 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
external/com_github_spf13_cobra/command.go:887
main/datasets/datasets.Execute()
src/datasets/datasets/root.go:411 +0x31
main.main()
src/cmd/datasets/main.go:10 +0x25
Like this one: https://travis-ci.org/BioContainers/multi-package-containers/builds/644710026?utm_medium=notification&utm_source=github_status
Maybe we should move the build system to GitHub's CI? It is much faster and doesn't have this silly 10-minute timeout when nothing appears on the console.
Dear BioContainers team,
How can we trigger a new build of an image when a new build of the underlying conda package is available?
Thanks
I think it would be nice to have the package names sorted alphabetically in the hash.tsv file.
Why? It makes searching faster for humans, and the hash computation depends on the ordering, too.
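To illustrate the second point, a toy hash over the package list (illustrative only, not the actual mulled-v2 algorithm from galaxy-tool-util) only yields the same value for the same package set if the entries are first sorted into a canonical order:

```python
import hashlib

def toy_combination_hash(packages):
    # Sorting makes the hash independent of the order in which the
    # packages happen to be written on the hash.tsv line.
    canonical = ",".join(sorted(packages))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()
```

Without the sort, reordering the packages on a line would silently change the computed hash and thus the container name.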
Just noticed that the related repo was archived before I got a chance to respond: galaxyproject/galaxy-lib#156
========================================
Thank you both for your quick responses.
@mvdbeek I am pretty sure I've installed requests at some point, as well as Whoosh. Now I tried again, as per your instructions, in a couple of fresh virtual environments for Python 2 and 3.
mulled-search -s bwa
Traceback (most recent call last):
File "/tmp/ccc/.venv/bin/mulled-search", line 11, in <module>
sys.exit(main())
File "/tmp/ccc/.venv/lib/python3.6/site-packages/galaxy/tool_util/deps/mulled/mulled_search.py", line 321, in main
conda_results[item] = conda.get_json(item)
File "/tmp/ccc/.venv/lib/python3.6/site-packages/galaxy/tool_util/deps/mulled/mulled_search.py", line 127, in get_json
use_exception_handler=True)
TypeError: 'NoneType' object is not callable
(Also tried on a mostly unused WSL Ubuntu with the same result.)
@bgruening, I created the environment as per your instructions (added -c bioconda), but I'm not quite there yet.
(base) rad@radm:~/repos$ conda activate galaxy
(galaxy) rad@radm:~/repos$ which mulled-search
/home/rad/miniconda3/envs/galaxy/bin/mulled-search
(galaxy) rad@radm:~/repos$ mulled-search -h
Traceback (most recent call last):
File "/home/rad/miniconda3/envs/galaxy/bin/mulled-search", line 6, in <module>
from galaxy.tool_util.deps.mulled.mulled_search import main
ModuleNotFoundError: No module named 'galaxy.tool_util'
(galaxy) rad@radm:~/repos$ mulled-search
Traceback (most recent call last):
File "/home/rad/miniconda3/envs/galaxy/bin/mulled-search", line 6, in <module>
from galaxy.tool_util.deps.mulled.mulled_search import main
ModuleNotFoundError: No module named 'galaxy.tool_util'
Then I remembered that channel order may play a part, so I tried conda create -n galaxy2 galaxy-tool-util --override-channels -c conda-forge -c bioconda -c default and it works now 🙏
Hi,
I'm hitting an error when creating mulled containers that does not seem to be related to the specified dependencies: it happens after the environment has been built successfully, and all of the latest PRs are getting it, even the automated ones.
Here is the relevant part of the error message:
[Dec 18 13:16:16] SOUT Exploding layer: sha256:aa44502a478a5d41773e70f3357e5ef57a9365721e9d3c10627c87e5a207fb59.tar.gz
[Dec 18 13:16:16] SERR
[Dec 18 13:16:16] SERR gzip: /root/.singularity/docker/sha256:aa44502a478a5d41773e70f3357e5ef57a9365721e9d3c10627c87e5a207fb59.tar.gz: not in gzip format
[Dec 18 13:16:16] SERR tar: This does not look like a tar archive
[Dec 18 13:16:16] SERR tar: Exiting with failure status due to previous errors
[Dec 18 13:16:16] SOUT Cleaning up...
[Dec 18 13:16:16] ERRO Task processing failed: Unexpected exit code [2] of container [a6c8df8d6969 step-fd70b3ce91], container preserved
..
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.8.15/x64/bin/mulled-build-files", line 8, in <module>
sys.exit(main())
File "/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/galaxy/tool_util/deps/mulled/mulled_build_files.py", line 42, in main
ret = mull_targets(
File "/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/galaxy/tool_util/deps/mulled/mulled_build.py", line 286, in mull_targets
ret = involucro_context.exec_command(involucro_args)
File "/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/galaxy/tool_util/deps/mulled/mulled_build.py", line 344, in exec_command
shutil.rmtree('./build')
File "/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/shutil.py", line 718, in rmtree
_rmtree_safe_fd(fd, path, onerror)
File "/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/shutil.py", line 655, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
File "/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/shutil.py", line 655, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
File "/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/shutil.py", line 675, in _rmtree_safe_fd
onerror(os.unlink, fullname, sys.exc_info())
File "/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/shutil.py", line 673, in _rmtree_safe_fd
os.unlink(entry.name, dir_fd=topfd)
PermissionError: [Errno 13] Permission denied: 'ncurses-6.3-h27087fc_1.json'
Error: Process completed with exit code 1.
Here is the full error message:
https://github.com/BioContainers/multi-package-containers/pull/2439/checks
Several of the latest PRs are getting it, including:
#2441
#2439
#2438
When a biocontainer running planemo tries to build a new biocontainer locally, it fails because wget is called with the --recursive option, which the container's BusyBox wget does not support. This is an edge case that took some doing to induce, but it will reappear the next time someone runs wget --recursive inside a biocontainer, even for a reasonable reason.
The Galaxy log shows it is trying to build the planemo/galaxyxml/git container, which is the very container it is running in at the time.
galaxy.tool_util.deps.containers INFO 2021-04-25 12:06:18,995 [pN:main.web.1,p:3729443,w:1,m:0,tN:LocalRunner.work_thread-0] Checking with container resolver [ExplicitContainerResolver[]] found description [None]
fi
docker kill 38f0771b56c143b9a69d89f5bfa29769 &> /dev/null
}
trap _on_exit 0
docker inspect quay.io/biocontainers/mulled-v2-c0c9dd2959e833cf8c69ae23a9398ddb0db5c98f:5d3a99482f42f5e17ec122729f67bf7be64df9e9-0 > /dev/null 2>&1
[ $? -ne 0 ] && docker pull quay.io/biocontainers/mulled-v2-c0c9dd2959e833cf8c69ae23a9398ddb0db5c98f:5d3a99482f42f5e17ec122729f67bf7be64df9e9-0 > /dev/null 2>&1
This is in the tool log:
stdout:wget -q --recursive -O /home/ross/.planemo/involucro https://github.com/involucro/involucro/releases/download/v1.1.2/involucro
stderr:wget: unrecognized option '--recursive'
BusyBox v1.22.1 (2014-05-23 01:24:27 UTC) multi-call binary.
Usage: wget [-c|--continue] [-s|--spider] [-q|--quiet] [-O|--output-document FILE]
[--header 'header: value'] [-Y|--proxy on/off] [-P DIR]
[-U|--user-agent AGENT] [-T SEC] URL...
Retrieve files via HTTP or FTP
-s Spider mode - only check file existence
-c Continue retrieval of aborted transfer
-q Quiet
-P DIR Save to DIR (default .)
-T SEC Network read timeout is SEC seconds
-O FILE Save to FILE ('-' for stdout)
-U STR Use STR for User-Agent header
-Y Use proxy ('on' or 'off')
galaxy.tool_util.deps.installable WARNING: Involucro installation requested and failed.
Traceback (most recent call last):
File "/usr/local/bin/planemo", line 10, in <module>
sys.exit(planemo())
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/decorators.py", line 73, in new_func
return ctx.invoke(f, obj, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/planemo/cli.py", line 98, in handle_blended_options
return f(*args, **kwds)
File "/usr/local/lib/python3.8/site-packages/planemo/commands/cmd_test.py", line 82, in cli
return_value = test_runnables(ctx, runnables, original_paths=uris, **kwds)
File "/usr/local/lib/python3.8/site-packages/planemo/engine/test.py", line 34, in test_runnables
with galaxy_config(ctx, runnables, **kwds) as config:
File "/usr/local/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.8/site-packages/planemo/galaxy/config.py", line 218, in galaxy_config
with c(ctx, runnables, **kwds) as config:
File "/usr/local/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.8/site-packages/planemo/galaxy/config.py", line 477, in local_galaxy_config
_handle_container_resolution(ctx, kwds, properties)
File "/usr/local/lib/python3.8/site-packages/planemo/galaxy/config.py", line 1342, in _handle_container_resolution
involucro_context = build_involucro_context(ctx, **kwds)
File "/usr/local/lib/python3.8/site-packages/planemo/mulled.py", line 40, in build_involucro_context
raise Exception("Failed to install involucro for Planemo.")
Exception: Failed to install involucro for Planemo.
Hi,
I'm trying to generate a container for modbam2bed.
I've tried the helper service, but it couldn't find either the modbam2bed package or the epi2melabs channel.
What can I do?
Thank you!
Dear BioContainers team,
Is there a way to know whether a given combo has already been built?
Say I want to know whether a bwa+samtools container already exists: how do I find out if it is already available on BioContainers?
Thanks in advance
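One programmatic way to check, assuming you already know the mulled-v2 repository name for the combination (e.g. by looking it up in hash.tsv) and using quay.io's public tag-listing API. This is a sketch, not an official BioContainers tool:

```python
import json
import urllib.error
import urllib.request

QUAY_TAGS = "https://quay.io/api/v1/repository/biocontainers/{repo}/tag/"

def tag_list_url(repo):
    """Build the quay.io tag-listing URL for a biocontainers repository."""
    return QUAY_TAGS.format(repo=repo)

def combo_exists(repo, timeout=10):
    """Return True if the repository exists on quay.io and has tags.

    Makes a network request; a 404 means no such repository.
    """
    try:
        with urllib.request.urlopen(tag_list_url(repo), timeout=timeout) as resp:
            return bool(json.load(resp).get("tags"))
    except urllib.error.HTTPError:
        return False
```

For example, combo_exists("mulled-v2-c0c9dd2959e833cf8c69ae23a9398ddb0db5c98f") would check the planemo/git/galaxyxml image mentioned elsewhere in these issues.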
The https://github.com/BioContainers/multi-package-containers/blob/master/combinations/hash.tsv file only contains the tools and build info for the mulled containers, but not the hash itself. I think it would make sense to include the computed hashes as well, so that the containers can easily be found and used from the information provided in this file.
Pip is warning about mismatched scheme.headers in some logs coming from using quay.io/biocontainers/mulled-v2-c0c9dd2959e833cf8c69ae23a9398ddb0db5c98f:5d3a99482f42f5e17ec122729f67bf7be64df9e9-0 (planemo/git/galaxyxml).
Confirmed inside the container with:
root@7bab7f7af205:/# pip install setuptools
WARNING: Value for scheme.headers does not match. Please report this to <https://github.com/pypa/pip/issues/9617>
distutils: /usr/local/include/python3.8/UNKNOWN
sysconfig: /usr/local/include/python3.8
WARNING: Additional context:
user = False
home = None
root = None
prefix = None
Found one real failure.
#322
https://travis-ci.org/BioContainers/multi-package-containers/builds/244882058
Dear BioContainers team,
I tried to use the web service for combining packages (https://biocontainers.pro/#/multipackage) and couldn't select several packages present on BioContainers: samtools, bedtools and pandas.
When I type samtools in the Search box, it only shows me bioconductor-rsamtools.
When searching for bedtools, it does show me some bedtools packages, but not all. For example, it doesn't show version 2.29.0, although it exists on BioContainers (for example, 2.29.0--hc088bd4_3 here - https://quay.io/repository/biocontainers/samtools?tab=tags).
When searching for pandas, it only shows me biopandas, although pandas does exist: https://quay.io/repository/biocontainers/pandas?tab=tags
Could you please help me with this? Of course, I could just put the names and versions of the tools I need into hash.tsv myself, but then I'm not sure how to obtain the names of the corresponding containers.
Thank you,
Slava
http://biocontainers.pro/multi-package-containers/ gives a 404.
Dear BioContainers team,
Is there a simple way to know what packages are installed in a mulled container? For example, if I have a container mulled-v2-ac74a7f02cebcfcc07d8e8d1d750af9c83b4d45a:f480262c6feea34eb5a49c4fdfbb4986490fefbb-0, how could I infer that it contains bowtie2=2.4.1, samtools=1.9 and pigz=2.3.4?
Thank you,
Slava
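One low-tech answer mentioned earlier in this thread: inside the container, /usr/local/conda-meta/ holds one JSON file per installed package, conventionally named <name>-<version>-<build>.json. A sketch that recovers the package list from those filenames (assuming that convention; package names may themselves contain hyphens, so we split from the right):

```python
import os

def parse_conda_meta_filename(filename):
    """Parse '<name>-<version>-<build>.json' into (name, version, build)."""
    stem = filename[:-len(".json")] if filename.endswith(".json") else filename
    name, version, build = stem.rsplit("-", 2)
    return name, version, build

def installed_packages(conda_meta_dir="/usr/local/conda-meta"):
    """List 'name=version' specs for every package recorded in conda-meta."""
    pkgs = []
    for fn in sorted(os.listdir(conda_meta_dir)):
        if fn.endswith(".json"):
            name, version, _build = parse_conda_meta_filename(fn)
            pkgs.append(f"{name}={version}")
    return pkgs
```

In practice you would run something like docker run --rm <image> ls /usr/local/conda-meta and feed the filenames to the parser.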
Hi all,
Not quite sure how to do this, but I'd like to rebuild the container cat=4.6,diamond=2.0.6, listed on line 137 of hash.tsv.
It seems this image was built before a bug with exposing resolv.conf in Singularity images was fixed (bioconda/bioconda-recipes#11583), which means that the image is unable to get network access (at least on my system).
What do I need to include in a PR to rebuild the container with an up-to-date base image?
Hello,
I got this error in one of the mulled containers
blastn: error while loading shared libraries: libbz2.so.1: cannot open shared object file: No such file or directory
Container ID:
mulled-v2-848eb9b6a829414c79e64bc96d43109620d1cfd9:869b2b6d573ee185276907ccda7237a29552a952-0
I thought on adding bzip2 to the list of software but the build process of the container already installs bzip2. Any suggestion on how to fix this?
Best,
Ramon
The pagination on the helper tool linked in the README is broken. After a search for a tool that has many versions there can be, e.g., 2 pages of results (try scipy or openms), but once one clicks the forward arrow or a page number below the package table, the search seems to be reset to the pre-search pagination showing all packages in the original list in alphabetical order; the search term remains in the search field but seemingly has no effect.
I know this might not be the exact right place to report this, but I hope the people who can fix it see it.
Current container names are indistinguishable from each other.
mulled-v2-05fd88b9ac812a9149da2f2d881d62f01cc49835:f0f80d4cb5631deb3715817144cb290221be9d1c-0
mulled-v2-0df1816856a9b13d24526c54b9a0cfb0caa5b6c5:1b03ef4ed69b011d06a426e4b5a1a3cc28cb81b0-0
The names are so unique that in practice they are all the same! They all look like a bunch of hexadecimal digits to me and therefore have no distinguishing features whatsoever.
Can we move to an actually informative naming scheme?
For example, container mulled-v2-9ce0e23bf798aa283902800c4cd106adf15090b2:38c4658d41b94a7b6f5d06712831441c5b522b09-0 contains py-graphviz=0.4.10,graphviz=2.38.0.
Why not call this py-graphviz+graphviz:0.4.10+2.38.0?
It is readable, it is clear which versions are used, and it is distinguishable from other containers. Yes, it will get a bit long when there are more packages, but given the already considerable length of the current names this is not really a big problem.
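The proposed scheme is mechanical enough to sketch (assuming package specs of the form name=version, as in hash.tsv; this is the proposal from this issue, not an adopted convention):

```python
def readable_name(spec):
    """Turn a hash.tsv combination like 'py-graphviz=0.4.10,graphviz=2.38.0'
    into the proposed human-readable form 'py-graphviz+graphviz:0.4.10+2.38.0'."""
    pairs = [entry.split("=", 1) for entry in spec.split(",")]
    names = "+".join(name for name, _version in pairs)
    versions = "+".join(version for _name, version in pairs)
    return f"{names}:{versions}"
```

The name half and the version half stay positionally aligned, so the mapping back to individual packages is unambiguous.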
Hi there,
I would like to create a single-package container for a package from conda-forge. Will updating the hash.tsv file with this package work? If not, how should I proceed in this situation?
The package is:
levenshtein=0.20.1
Thank you very much in advance.