galaxyproject / ephemeris

Library for managing Galaxy plugins - tools, index data, and workflows.

Home Page: https://ephemeris.readthedocs.org/

License: Other

Makefile 2.71% Python 96.77% Shell 0.51%

ephemeris's Issues

shed-tools update should include data managers

This can be an optional thing.

By extension, this means get-tool-list should get functionality to also get the data-managers.

Collect useful scripts from ansible roles?

What do you think about putting a couple more scripts in here for better reference?
Ping @jmchilton @bgruening @afgane

I'm thinking of

The first two can probably be combined.

setup_data_libraries.py never completes

I have an issue with setup_data_libraries.py. When run, if there are any jobs on the target Galaxy server that are not in an ok state, the script never finishes as it is stuck in a loop waiting for ALL jobs on the Galaxy server to be in that state.

        no_break = True
        while True:
            no_break = False
            for job in jc.get_jobs():
                if job['state'] != 'ok':
                    no_break = True
            if not no_break:
                break
            time.sleep(3)

        time.sleep(20)
        log.info("Finished importing test data.")

If the target Galaxy server has been around for some time, there will more than likely be some jobs in a new or error state, let alone other jobs unrelated to library creation that are still running.

I think it would be much better to capture the upload job IDs for each upload in a list and just wait for those to complete.
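
The suggestion above could be sketched roughly as follows. This is only an illustration, not the current ephemeris code: `wait_for_jobs`, `get_state` and the set of terminal states are assumptions made for the example. With bioblend, `get_state` could presumably be something like `lambda job_id: jc.show_job(job_id)['state']`.

```python
import time

# Assumed set of states in which a Galaxy job will not change any more.
TERMINAL_STATES = {"ok", "error", "deleted"}


def wait_for_jobs(job_ids, get_state, poll_interval=3.0, timeout=600.0,
                  sleep=time.sleep):
    """Block until every job in job_ids has reached a terminal state.

    get_state is a hypothetical callable mapping a job id to its state string.
    Returns a dict of job id -> final state.
    """
    pending = set(job_ids)
    final_states = {}
    waited = 0.0
    while pending:
        for job_id in sorted(pending):
            state = get_state(job_id)
            if state in TERMINAL_STATES:
                final_states[job_id] = state
        # Only the jobs we started are tracked; other jobs on the server are ignored.
        pending -= set(final_states)
        if not pending:
            break
        if waited >= timeout:
            raise TimeoutError("Jobs still running: %s" % sorted(pending))
        sleep(poll_interval)
        waited += poll_interval
    return final_states
```

Because only the collected upload job IDs are polled, unrelated jobs in a new or error state can no longer keep the script waiting forever, and the timeout gives a defined exit if an upload hangs.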

Project: Run tests on tools updates

Update planemo to allow tool testing against external servers.
Develop tooling to automate this as part of ephemeris updates.
usegalaxy-eu integration @erasche
- test usegalaxy-eu tools (https://github.com/usegalaxy-eu/usegalaxy-eu-tools).
- test usegalaxy-eu workflows

Testing on travis not reliable

See #55

1. Succeeding tests!
2. Changed something -> more succeeding tests.
3. Remove redundant verbosity flag in tests to clean up output -> testing fails on Python 3.5 but builds on 2.7.
4. Change some characters in echo statements to upper case -> testing fails on Python 2.7 but builds on 3.5.
5. Try something else: same random crashes -> revert again.
6. Try something else: breaks testing entirely.
7. Revert -> should be the same code as in step 3, but it crashes in another way.

It is very annoying. From step 1 onwards the code is completely functional and works well, but because Travis crashes the tests randomly, this pull request cannot be merged.

new data-manager manager script

I'm currently working on a data-manager manager script to pre-load Galaxy training instances.
The current definition looks like this:

# configuration for fetch and index genomes

data_managers:
    - id: toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/data_manager_fetch_genome_all_fasta_dbkey/0.0.2
      params:
        - 'dbkey': '{{ item }}'
        - 'reference_source|reference_source_selector': 'ucsc'
        - 'reference_source|requested_dbkey': '{{ item }}'
      items:
        - dm3
        - mm9

    - id: toolshed.g2.bx.psu.edu/repos/devteam/data_manager_bowtie2_index_builder/bowtie2_index_builder_data_manager/2.2.6
      params:
        - 'all_fasta_source': '{{ item }}'
      items:
        - dm3
        - mm9
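
To illustrate how such a definition could be turned into individual data manager runs, here is a minimal sketch. It assumes the YAML has already been loaded into Python dicts and uses a plain string replacement for `{{ item }}`; a real script would more likely use a template engine such as Jinja2. The function name and the shortened id are made up for the example.

```python
def expand_data_manager_jobs(data_managers):
    """Yield one (data manager id, params) pair per item, filling in '{{ item }}'."""
    for dm in data_managers:
        for item in dm.get("items", []):
            params = {}
            # 'params' is a list of single-key dicts, as in the YAML above.
            for param in dm.get("params", []):
                for key, value in param.items():
                    params[key] = value.replace("{{ item }}", item)
            yield dm["id"], params


data_managers = [
    {
        "id": "bowtie2_index_builder_data_manager",  # shortened id, illustration only
        "params": [{"all_fasta_source": "{{ item }}"}],
        "items": ["dm3", "mm9"],
    }
]
jobs = list(expand_data_manager_jobs(data_managers))
```

Each entry therefore produces one job per listed item, in the order the data managers appear in the file.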

Configurable default install processes

Previously we defaulted to installing tool dependencies and repository dependencies,
and this could be overridden for each repository.
I would like to be able to specify this at the top level of the tool_list.yml and on the command line,
as well as for each repository separately. So the new aspect here would be the top-level parameter in the tool_list.yml (like we are currently handling the api_key and galaxy_url).
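
A possible way to resolve such a setting is sketched below. The precedence order (per-repository entry, then command line, then top level of tool_list.yml, then built-in default) is an assumption for illustration, as are the function name and the option names used in the usage lines.

```python
def resolve_install_option(name, repository, tool_list, cli_args, default):
    """Return the value of an install option, most specific setting first.

    Assumed precedence: per-repository entry, then command line, then the
    top level of tool_list.yml, then the built-in default.
    """
    for source in (repository, cli_args, tool_list):
        value = source.get(name)
        if value is not None:
            return value
    return default


# Top-level setting in the tool list, no per-repository or CLI override.
top_level = {"install_tool_dependencies": False}
resolved = resolve_install_option(
    "install_tool_dependencies", {}, top_level, {}, default=True
)
```

Using `None` as "not set" keeps an explicit `False` distinguishable from a missing value, which matters for boolean options like these.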

loading of API key from yaml tool file does not work

found by @selten

I have the API key set here, but the installation errors. If I use the --api_key argument, it works.

$ shed-install -t install.yaml
Traceback (most recent call last):
  File "/Users/marten/devel/ephemeris/.venv/bin/shed-install", line 11, in <module>
    sys.exit(script_main())
  File "/Users/marten/devel/ephemeris/.venv/lib/python2.7/site-packages/ephemeris/shed_install.py", line 763, in script_main
    itm = get_install_tool_manager(options)
  File "/Users/marten/devel/ephemeris/.venv/lib/python2.7/site-packages/ephemeris/shed_install.py", line 600, in get_install_tool_manager
    gi = get_galaxy_connection(options)
  File "/Users/marten/devel/ephemeris/.venv/lib/python2.7/site-packages/ephemeris/__init__.py", line 26, in get_galaxy_connection
    return gi or False
UnboundLocalError: local variable 'gi' referenced before assignment
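
This kind of traceback is typical of a variable that is only assigned in some branches. The following is a hypothetical reduction, not the actual ephemeris source: the function names, the tuple standing in for a client object, and the `galaxy_instance` key are all assumptions.

```python
def get_connection_buggy(url=None, api_key=None):
    if url and api_key:
        gi = ("connection", url, api_key)  # stand-in for a real client object
    return gi or False  # UnboundLocalError when the branch above is skipped


def get_connection_fixed(url=None, api_key=None, tool_list=None):
    tool_list = tool_list or {}
    # Fall back to values from the tool list file when no argument was given.
    url = url or tool_list.get("galaxy_instance")
    api_key = api_key or tool_list.get("api_key")
    gi = None  # always bound, so the return below cannot fail
    if url and api_key:
        gi = ("connection", url, api_key)
    return gi or False
```

Initialising the variable before the branches and consulting the file-based values before giving up would cover both the crash and the missing fallback described above.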

shed-tool install issue with "changeset_revision"

I'm experiencing some issues installing tools. This had worked $some_time_ago, but now the installation log is completely full of errors like:

	* Error installing a repository (after 0:00:00.932244 seconds)! Name: bedtools,owner: iuc, revision: ['b8348686a0b9'], error: {"err_msg": "No information is available for the requested repository revision.\nOne or more of the following parameter values is likely invalid:\ntool_shed_url: https://toolshed.g2.bx.psu.edu/\nname: bedtools\nowner: iuc\nchangeset_revision: [u'b8348686a0b9']\n", "err_code": 400008}

The last time I saw this, it was because changeset_revision was being interpreted as a list and then cast to a string; ephemeris then searched for that string and failed to find it (because of course there is no revision matching the string u"['asdfasdfa']").

Any ideas on this?

Here is the full console log
https://build.usegalaxy.eu/job/usegalaxy.eu/job/install-tools/12/consoleFull

It runs the command make install from our repo which runs the command shed-tools install --toolsfile $file.yaml --galaxy $url --api_key $api_key.

Running locally with increased logging level:

Installing any updated versions of tools_iuc.yaml.lock
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): galaxy.uni-freiburg.de
DEBUG:urllib3.connectionpool:https://galaxy.uni-freiburg.de:443 "GET /api/tool_shed_repositories?key=<redacted> HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): toolshed.g2.bx.psu.edu
DEBUG:urllib3.connectionpool:https://toolshed.g2.bx.psu.edu:443 "GET /api/repositories/get_ordered_installable_revisions?owner=iuc&name=bedtools HTTP/1.1" 200 None
(1/502) Installing repository bedtools from iuc to section "Operate on Genomic Intervals" at revision ['b8348686a0b9'] (TRT: 0:00:06.692430)
DEBUG:ephemeris.shed_tools:(1/502) Installing repository bedtools from iuc to section "Operate on Genomic Intervals" at revision ['b8348686a0b9'] (TRT: 0:00:06.692430)
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): galaxy.uni-freiburg.de
DEBUG:urllib3.connectionpool:https://galaxy.uni-freiburg.de:443 "POST /api/tool_shed_repositories/new/install_repository_revision?key=<redacted> HTTP/1.1" 400 None
Traceback (most recent call last):
  File "/home/hxr/.local/lib/python3.5/site-packages/ephemeris/shed_tools.py", line 609, in install_repositories
    install_repository_revision(repository, self.tsc)
  File "/home/hxr/.local/lib/python3.5/site-packages/ephemeris/shed_tools.py", line 428, in install_repository_revision
    response = tool_shed_client.install_repository_revision(**repository)
  File "/home/hxr/.local/lib/python3.5/site-packages/bioblend/galaxy/toolshed/__init__.py", line 146, in install_repository_revision
    return self._post(url=url, payload=payload)
  File "/home/hxr/.local/lib/python3.5/site-packages/bioblend/galaxy/client.py", line 160, in _post
    files_attached=files_attached)
  File "/home/hxr/.local/lib/python3.5/site-packages/bioblend/galaxyclient.py", line 146, in make_post_request
    body=r.text, status_code=r.status_code)
bioblend.ConnectionError: Unexpected HTTP status code: 400: {"err_msg": "No information is available for the requested repository revision.\nOne or more of the following parameter values is likely invalid:\ntool_shed_url: https://toolshed.g2.bx.psu.edu/\nname: bedtools\nowner: iuc\nchangeset_revision: [u'b8348686a0b9']\n", "err_code": 400008}

This sticks out to me:

DEBUG:ephemeris.shed_tools:(1/502) Installing repository bedtools from iuc to section "Operate on Genomic Intervals" at revision ['b8348686a0b9'] (TRT: 0:00:06.692430)

I really expected to see something like

DEBUG:ephemeris.shed_tools:(1/502) Installing repository bedtools from iuc to section "Operate on Genomic Intervals" at revision 'b8348686a0b9' (TRT: 0:00:06.692430)

Changing the changeset_revision to revisions in our yaml files looks like it fixes it, but it used to work. Did something change recently in the handling of changeset_revision to force it into a list?
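
The suspected failure mode, and one way a tool list entry could be normalised so that both spellings work, can be sketched like this. `normalize_revisions` is a hypothetical helper, not an ephemeris function.

```python
# Casting a list to a string gives the value seen in the error message.
as_string = str(["b8348686a0b9"])  # "['b8348686a0b9']", not a valid revision


def normalize_revisions(repository):
    """Return a copy of a tool list entry with a 'revisions' list of strings.

    Accepts 'changeset_revision' (single string or list) and/or 'revisions'.
    """
    entry = dict(repository)
    revisions = list(entry.pop("revisions", None) or [])
    single = entry.pop("changeset_revision", None)
    if isinstance(single, (list, tuple)):
        revisions = list(single) + revisions
    elif single:
        revisions = [single] + revisions
    entry["revisions"] = [str(rev) for rev in revisions]
    return entry
```

After normalisation, each element of `revisions` can be passed on as a single changeset_revision string, so a list never reaches the Galaxy API.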

0.7.1 Release

Run-data-managers has improved significantly, an error in galaxy-wait has been fixed, and an unbound variable has been fixed (#53 #55 #57).
@jmchilton could you release 0.7.1 so that these changes are made available to the public?
Also I believe we have fixed a lot of issues in #1 now. Maybe the issues in #31 can now also be addressed?
Thanks!

Use kwargs for bioblend

The latest bioblend adds one more kwarg to the install_repository_revision method.
We are passing argument values to bioblend, so this causes some unexpected behaviour (tool_panel_section_label turns into tool_panel_section_id) when using the old script in ansible-galaxy-tools. We should use kwargs anyway to have more obvious errors.

We should also consider deprecating the tool_panel_id; lots of people on the mailing list appear to be struggling with the old role because of this.
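
The effect can be reproduced with two made-up function signatures. These are not bioblend's real signatures; `install_v1`, `install_v2` and `new_flag` exist only to show how an added parameter shifts positional arguments.

```python
def install_v1(name, install_deps=False, section_id=None, section_label=None):
    return {"section_id": section_id, "section_label": section_label}


def install_v2(name, install_deps=False, new_flag=False,
               section_id=None, section_label=None):
    # Same as v1, but one extra keyword argument was inserted in the middle.
    return {"section_id": section_id, "section_label": section_label}


positional = ("bedtools", True, None, "My Tools")
old_result = install_v1(*positional)  # label ends up where it was intended
new_result = install_v2(*positional)  # label silently becomes the section id
safe_result = install_v2("bedtools", install_deps=True, section_label="My Tools")
```

With keyword arguments the call is independent of parameter order, and a renamed or removed parameter fails loudly with a TypeError instead of being misassigned.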

Data manager does not check if index is already populated.

The shed-install command works really nicely: you feed it a yaml and it installs the tools. Oops! You forgot a tool! No worries, just add it to the yaml and run shed-install again. It will skip the already installed tools.

This behaviour is not reproduced in the run-data-managers tool. If you run a yaml again, it will start indexing all the genomes in there, even if they are already indexed. This is annoying if you have human genomes on the list, which take a long time to index.
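
One conceivable approach is to compare the requested jobs with the current contents of the data tables before running anything. The sketch below assumes each job declares the data table and value it would add, and that the existing values have already been fetched from Galaxy; both the job layout and `split_jobs` are hypothetical.

```python
def split_jobs(jobs, existing_values):
    """Separate jobs whose value is already present in their data table.

    jobs: list of dicts with 'data_table' and 'value' keys (assumed layout).
    existing_values: dict mapping data table name -> set of values present.
    """
    todo, skipped = [], []
    for job in jobs:
        if job["value"] in existing_values.get(job["data_table"], set()):
            skipped.append(job)
        else:
            todo.append(job)
    return todo, skipped


jobs = [
    {"data_table": "all_fasta", "value": "dm3"},
    {"data_table": "all_fasta", "value": "hg38"},
]
todo, skipped = split_jobs(jobs, {"all_fasta": {"hg38"}})
```

An explicit overwrite switch could bypass this check for cases where re-indexing is actually wanted.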

Retry failed external connections

We've been running our automated tool installation and we occasionally see errors due to transient network failures that cause failures of the jenkins job. If we wrapped all bioblend / external network connections in a retry (we use https://github.com/litl/backoff for some stuff here) this would be extremely helpful.

E.g.

(512/720) repository iwtomics_loadandplot already installed at revision ce633cc8f5f9. Skipping.
Traceback (most recent call last):
  File "/home/centos/shiningpanda/jobs/023ca033/virtualenvs/d41d8cd9/bin/shed-tools", line 11, in <module>
    sys.exit(main())
  File "/home/centos/shiningpanda/jobs/023ca033/virtualenvs/d41d8cd9/lib/python2.7/site-packages/ephemeris/shed_tools.py", line 744, in main
    install_tool_manager.install_repositories()
  File "/home/centos/shiningpanda/jobs/023ca033/virtualenvs/d41d8cd9/lib/python2.7/site-packages/ephemeris/shed_tools.py", line 572, in install_repositories
    repository = self.get_changeset_revision(repository)
  File "/home/centos/shiningpanda/jobs/023ca033/virtualenvs/d41d8cd9/lib/python2.7/site-packages/ephemeris/shed_tools.py", line 720, in get_changeset_revision
    installable_revisions = ts.repositories.get_ordered_installable_revisions(repository['name'], repository['owner'])
  File "/home/centos/shiningpanda/jobs/023ca033/virtualenvs/d41d8cd9/lib/python2.7/site-packages/bioblend/toolshed/repositories/__init__.py", line 149, in get_ordered_installable_revisions
    r = self._get(url=url, params=params)
  File "/home/centos/shiningpanda/jobs/023ca033/virtualenvs/d41d8cd9/lib/python2.7/site-packages/bioblend/galaxy/client.py", line 136, in _get
    status_code=r.status_code)
bioblend.ConnectionError: GET: error 502: '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">\n<html>\n\n<head>\n<title>Galaxy</title>\n<style type="text/css">\nbody\n{\n    font: 75% verdana, "Bitstream Vera Sans", geneva, arial, helvetica, helve, sans-serif;\n    background: white url(//error/content_bg.png) top repeat-x;\n    color: #303030;\n    padding: 0;\n    border: 0;\n    margin: 0;\n    margin-right: 0;\n    margin-left: 0;\n}\n\ndiv.pageTitle\n{\n    font-size: x-large;\n    font-weight: bold;\n}\n\ndiv.pageTitle a:link, div.pageTitle a:visited, div.pageTitle a:active, div.pageTitle a:hover\n{\n    text-decoration: none;\n    color: #ece7f2;\n}\n/*a:link, a:visited, a:active\n{\n}*/\ntd.masthead\n{\n    vertical-align: middle;\n    background: #023858 url(//error/masthead_bg.png) bottom;\n    height: 40px;\n    padding-left: 10px;\n}\ntd.content\n{\n    vertical-align: top;\n    padding: 10px;\n}\na:link, a:visited, a:active\n{\n    color: #303030;\n}\n</style>\n</head>\n<table width="100%" border="0" cellspacing="0" cellpadding="0">\n    <tr>\n        <td class="masthead"><div class="pageTitle"><a target="_blank" href="http://galaxyproject.org">Galaxy</a></div></td>\n    </tr>\n    <tr>\n        <td class="content">\n            <h2>The Tool Shed could not be reached</h2>\n            <p>\n            \n            \n                            You are seeing this message because a request to The Tool Shed timed out or was refused. This may\n                be a temporary issue which could be resolved by retrying the operation you were performing. If you receive this\n                message repeatedly or for an extended amount of time, please                contact an administrator.\n                        \n            </p>\n        </td>\n    </tr>\n</table>\n\n</html>\n', 0 attempts left: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
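
As a rough idea of what such a wrapper could look like, here is a standard-library sketch of retrying with exponential backoff. It is a simplified stand-in written for this example, not the API of the backoff package linked above; the decorator name and its parameters are assumptions.

```python
import functools
import time


def retry(exceptions, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry the decorated call on the given exceptions, doubling the delay."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts - 1:
                        raise  # out of attempts, surface the original error
                    sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator
```

A call such as the Tool Shed lookup in the traceback could then be wrapped with `@retry(ConnectionError)` (using whichever exception class the client raises), so that a transient 502 leads to a short wait rather than a failed job.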

create a frozen TTS repo for testing updates and such

xref: #64

possibility to install tools by version

It would be nice to have the possibility to install a tool by specifying a version (then the latest revision of that version should be installed / updated). See also galaxyproject/ansible-galaxy-tools#46

Tool testing improvements

These are a couple of things I found while implementing tool tests:

Tests only run if the test data is available from where the tests are triggered (this should be fixed with a galaxy-lib update containing galaxyproject/galaxy#7160)
Tests seem to require an admin API key; that shouldn't be necessary with galaxyproject/galaxy#7160
We create a new history for every test, even if this test is for the same tool
We just raise JobOutputExceptions, these need to be displayed properly
We should have some reporting, ideally re-using some things from planemo (manual planemo test_reports tool_test_output.json works fine)
Optionally delete test histories after run
Skip tests that use reference data
Move to next test when job becomes paused
Expose test timeout as command line option

tool_list.yaml.sample is missing

This file is missing from the repo, it'd be helpful if the sample file was present for users to take a look as mentioned here in the shed_install.py file.

1. In the YAML format via dedicated files (see ``tool_list.yaml.sample`` for a
   sample of such a file)

Doc link broken

Improve Project Structure

Implement a README.
Review and fix all Travis tests.
Generate and build documentation.
Setup a readthedocs site for this.
Write a couple tests.
Enable coverage testing, register with coveralls.

Project: refactor shed-tools

It's 1000 lines of code.
It contains multiple functions that are over 50 lines long.
It contains a class (that should be able to simplify a few things), but in this class multiple methods are over 50 lines long.
A global log variable is defined.
It overwrites the ConnectionError that is in the standard library.
It has by far the highest complexity of all files in ephemeris (18).

Shed-tools is not pleasant to work on. Navigating through it is a jungle. It should be totally refactored.
I have a few suggestions how to approach this:

Take get_tool_list_from_galaxy.py as an example. This was done for run-data-managers as well, and as a result complexity was dropped, while at the same time increasing functionality.
Essentially this means shed_tools should be written as a class that:
- Has a method to output currently installed tools
- Has a method to check whether a tool is currently installed
- Has a method to install a tool
- Has a method to test a tool (Thanks a lot @jmchilton !)
- Has some sort of install_tools method for multiple tools including logging.
The parsing of the tool list yaml may be moved to a separate file
Some stuff from get_tool_list from galaxy might be reused in shed-tools. (Like determining current tool list)
Some redundant command line options should be removed. Installing by passing a yaml string? OK. Installing by tool_list file? OK. Installing by specifying every frigging tool attribute on the commandline? We can drop that IMO.

This is something I have been thinking about doing for a while. Just putting it out there before we increase the complexity of shed-tools even further. I think refactoring will also improve the ability to test using pytest #80.
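
To make the proposed shape more concrete, here is an illustrative skeleton. It is not a design decision: the class name, method names and result layout are assumptions, and `client` stands for any object offering `get_repositories()` and `install_repository_revision(**repo)` (bioblend's Galaxy tool shed client is assumed to fit this). The logger is passed in rather than defined globally.

```python
import logging


class ShedToolsManager:
    """Illustrative skeleton only, with a hypothetical interface."""

    def __init__(self, client, log=None):
        self.client = client
        self.log = log or logging.getLogger(__name__)

    def installed_repositories(self):
        """Return the repositories currently installed in Galaxy."""
        return list(self.client.get_repositories())

    def is_installed(self, repo):
        """Check whether name, owner and revision match an installed repository."""
        keys = ("name", "owner", "changeset_revision")
        return any(
            all(installed.get(k) == repo.get(k) for k in keys)
            for installed in self.installed_repositories()
        )

    def install_repository(self, repo):
        """Install a single repository revision."""
        return self.client.install_repository_revision(**repo)

    def install_repositories(self, repos):
        """Install several repositories, logging and collecting the outcome."""
        result = {"installed": [], "skipped": [], "errored": []}
        for repo in repos:
            if self.is_installed(repo):
                self.log.info("%s already installed, skipping.", repo["name"])
                result["skipped"].append(repo)
                continue
            try:
                self.install_repository(repo)
                result["installed"].append(repo)
            except Exception as exc:  # broad on purpose in this sketch
                self.log.error("Installing %s failed: %s", repo["name"], exc)
                result["errored"].append(repo)
        return result
```

Because the client is injected, each method can be exercised in pytest with a small fake object instead of a running Galaxy instance.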

Documentation is not visible

I would like to know how to use this tool, so I just refer to documentation: https://ephemeris.readthedocs.org. However, the documentation is empty.

Suggestion drop python 2 support and start using type hints.

Python 3 was released in December 2008. It is now almost ten years later. Anyone who desperately wants to cling to python 2 is welcome to it. But I don't want to miss out on some nice functionality that python3 offers.
One of the things that would be a pleasure to use in ephemeris is type hinting. This allows the IDE to actually know what type of variable you are working with. It prevents a lot of mistakes and makes development easier. Right now I often have to look up the original function to check what is being done and to see whether a variable is some random class, a list, a string or something else. Some things can also be checked statically (by the IDE or a type checker) before the code runs, catching errors before they occur. This will be very nice when integrating pytest #80 and refactoring shed-tools #91.
So I suggest we drop the python2 testing and focus our efforts on python3 for the coming releases.
People who still need a python2 version of ephemeris: the releases up to 0.8.0 are still pretty solid and useful.
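
As a small illustration of what type hints would add, consider the sketch below; the function is invented for the example and is not part of ephemeris.

```python
from typing import Dict, List


def repository_names(repositories: List[Dict[str, str]]) -> List[str]:
    """Return the name of every repository in a tool list."""
    return [repo["name"] for repo in repositories]


names = repository_names([{"name": "bedtools", "owner": "iuc"}])
```

With the annotations in place, an IDE or a checker such as mypy can flag a call like `repository_names("bedtools")` without running the code; the hints themselves have no effect at runtime.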

shed_tool_conf_cleaner.py

Hi everybody,

If you think that this script can be useful for Galaxy admins, I can push it to ephemeris. Tell me your thoughts.

ARTbio/tools-artbio#284

Data library hierarchy structure

Hi,

It would be nice to have the possibility via data_library.yaml file to submit a structured library with subfolders, etc.
@Slugger70 will work on it

Bérénice

Issue with tool installation (BWA 546ada4a9f43)

@bgruening
Take a look here master...ARTbio:bwatest
Notice that I am testing the installation of BWA rev 546ada4a9f43.
However, the ephemeris travis test reports the installation of the latest BWA rev 53646aaaafef - https://travis-ci.org/ARTbio/ephemeris/jobs/228938642

This reproduces exactly the issue I am struggling with in GalaxyKickStart.

shed-tools options -y and -t are confusing

Hi there,
I just had a devil of a time getting shed-tools to work because of confusion between passing in a yml tool file (-t) and passing in a yml tool string (-y). Is there some way to clarify this further in the documentation?

shed-tools uninstall?

Shouldn't it be possible to uninstall a tool?

Latest documentation not updated on readthedocs.

Are webhooks configured? This will allow automatic updating of readthedocs as soon as the documentation is updated. Now the latest documentation changes are not visible on readthedocs.

shed-install verbose

The documentation (https://ephemeris.readthedocs.io/en/latest/commands/shed-install.html) states that verbose is an option, however when I tried it I got:

usage: usage: python shed-install <options>
shed-install: error: unrecognized arguments: -v=True

Change test framework

Currently we use a bash script that calls the tools.
This is not bad. But the testing is quite crude and can never cover most use cases.
In order to solve #75 I think it is better to use pytest or something comparable. Just like ncbi-genome-download.
That will probably also allow for better coverage integration (#71).

generate_tool_list_from_workflow better library interface

https://github.com/galaxyproject/planemo/pull/776/files is writing out a file and then reading it back in - it should be structured more like shed_tools where there is a CLI layer/file based layer that wraps a Python object layer.

Also note to self: does this cover subworkflows?

shed-install stores logs in a fixed tmp location.

This is a problem when you run ephemeris as a different user on the same system:
Permission denied: "/tmp/galaxy_tool_install.log"
Since this is in /tmp you cannot delete the file of the other user.
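
One way around the fixed path is to let `tempfile.mkstemp` create a uniquely named log file owned by the current user unless a path is given explicitly. The sketch below is an assumption about how this could be done, not the existing ephemeris logging setup; `make_file_logger` and the logger name are made up.

```python
import logging
import os
import tempfile


def make_file_logger(name="ephemeris_example", log_file=None):
    """Return (logger, path); creates a per-run log file if none is given."""
    if log_file is None:
        # mkstemp creates a new file that only the current user can access.
        fd, log_file = tempfile.mkstemp(prefix="galaxy_tool_install_", suffix=".log")
        os.close(fd)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.FileHandler(log_file))
    return logger, log_file
```

Printing the chosen path at start-up would keep the log easy to find, and an explicit option for the log file location still allows a fixed path where that is wanted.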

Install tools into multiple sections

As requested here: bgruening/docker-galaxy-stable#203

run-data-managers execution order dependencies based on dbkeys

Hi,

I have a situation that I'm not sure how to solve using ephemeris.
I'm trying to run some data managers that install data associated with a genome build (unique dbkey) and build indexes based on that. Mainly I have 2 types of "source" data: the genome file and the transcriptome file.
Both load an entry in the all_fasta table, which is done running the data_manager_fetch_genome_all_fasta_dbkey data manager.
Since both files belong to the same build, they should be associated with the same dbkey. My first idea was to list first the data manager run that installs the genome file and creates the new dbkey, and then run the same data manager to install the transcriptome file associated with that new dbkey. The problem is that if I list these 2 data manager jobs in 1 yaml file, both are run simultaneously: the first creates the dbkey, but by the time the second runs the entry is not yet in the dbkeys table, so it fails with "dbkey not found".

An example here

genomes:
    - genome_id: Ricinus_communis_JCVI_1.0
      name: Ricinus communis JCVI 1.0
      build_id: Ricinus_communis_JCVI_1.0
      genome: ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_dicots_03/Genomes/rco.con.gz
      all_tx_transcriptome: ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_dicots_03/Fasta/transcripts.all_transcripts.rco.fasta.gz
     
data_managers:
 ##Load the genome fasta
    - id: toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/data_manager_fetch_genome_all_fasta_dbkey/0.0.2
      params:
        - 'dbkey_source|dbkey_source_selector': 'new'
        - 'dbkey_source|dbkey': '{{ item.genome_id }}'
        - 'dbkey_source|dbkey_name': '{{ item.name }}'
        - 'reference_source|reference_source_selector': 'url'
        - 'reference_source|user_url': '{{ item.genome }}'
        - 'sequence_name': '{{ item.name }}'
        - 'sequence_id': '{{ item.build_id }}'
        - 'sorting|sorting_selector': 'as_is'
      items: "{{ genomes }}"
      data_table_reload:
        - all_fasta
        - __dbkeys__

 ## Load the transcriptome fasta
    - id: toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_dbkeys_all_fasta/data_manager_fetch_genome_all_fasta_dbkey/0.0.2
      params:
        - 'dbkey_source|dbkey_source_selector': 'new'
        - 'dbkey_source|dbkey': '{{ item.genome_id }}'
        - 'reference_source|reference_source_selector': 'url'
        - 'reference_source|user_url': '{{ item.all_tx_transcriptome }}'
        - 'sequence_name': '{{ item.name }} - Transcripts'
        - 'sequence_id': '{{ item.build_id }}_all_tx_transcriptome'
        - 'sorting|sorting_selector': 'as_is'
      items: "{{ genomes }}"
      data_table_reload:
        - all_fasta

I was wondering if there is a way to set this kind of precedence when running data managers in ephemeris, similar to what happens with the fasta files and the indexes based on them. Or what would be good practice for this situation? So far I have split the yaml file into 2 parts: one for the genome fasta (creating the dbkey) and the indexes based on it, and one for the transcriptome fasta and the indexes based on it.

Cheers,
Ignacio

Release 0.3.1?

Could the next minor release be created? It would simplify the tool installation process for the Galaxy Standalone VM currently being worked on.

Shed-install should be able to update all the tools in a galaxy instance

Would be nice for a cron job to keep all the tools up to date.
Changes could be made also fixing #62.

Please let me know if this functionality is useful and I will fix this issue and #62.

I can't find a button in galaxy that just says: "update all tools"

Create release branch

Hi just thinking:
If we create a release branch, and only merge from master to that branch when we release (so now version 0.9.0), then we can point readthedocs to that branch as our stable branch.
This will result in people checking the documentation getting the actual documentation from 0.9.0 at this moment, instead of the documentation of features we are adding to master but are not released yet.
This has already caused confusion before and this might be a working fix.

Run-data-managers only runs one job at a time.

Not very useful if installing 18 genomes at once with 5 data managers. Waiting on 90 jobs one at a time.
Since data managers may be interdependent it is maybe best to run these jobs per data manager.
So starting 18 jobs in parallel for each data manager and then wait until all jobs are finished to reload the data table and start the next data manager for the 18 genomes.
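
The staged scheme described above could be sketched with a thread pool as follows. `run_job` and `reload_tables` are hypothetical callables standing in for "submit a data manager job and wait for it" and "reload the data tables"; the stage layout is likewise an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_stages(stages, run_job, reload_tables, max_workers=18):
    """Run all jobs of one data manager in parallel, then move to the next.

    stages: list of dicts with 'data_manager' and 'jobs' keys (assumed layout).
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for stage in stages:
            # list(pool.map(...)) blocks until every job of this stage is done.
            results.append(list(pool.map(run_job, stage["jobs"])))
            # Reload only after the whole stage has finished.
            reload_tables(stage["data_manager"])
    return results
```

Jobs for different genomes overlap within a stage, while the barrier between stages keeps interdependent data managers (for example fetching a genome before building an index from it) in order.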

Improve Release Process

Document it. It is the same procedure as planemo but it isn't documented - https://github.com/galaxyproject/planemo/blob/master/docs/developing.rst.
Add others to the PyPI roles.

shed-tools update doesn't update minor versions

Hi,
Using ephemeris 0.9 to run shed-tools update, when a tool has a new minor version (i.e. a new revision but the same tool version number), I get something like this in the logs:

(1/597) Installing repository mothur_otu_association from iuc to section "NGS: Mothur" at revision 0fb08aeb215e (TRT: 0:00:01.740143)
	Repository mothur_otu_association is already installed.
	repository mothur_otu_association installed successfully (in 0:00:02.909464) at revision 0fb08aeb215e

And the tool doesn't get updated to the newest revision.
Is this the expected behavior?

Requires Galaxy >16.04

Not directly clear from the script, but older Galaxy versions will not include the tool revision and the generate_tool_list_from_ga_workflow_files.py and get-tool-list will silently produce empty files.
see galaxyproject/galaxy#1752 and galaxyproject/galaxy#1636

Difficult to debug run_data_managers

When running the data manager from ephemeris, there is no way to see what went wrong unless you tell galaxy to turn off cleanup and go digging in the job working directory. This is very awkward to debug if it is being run during a docker build, say, after the tools install. Here is what one gets in the case of any error:

galaxy.jobs.output_checker DEBUG 2018-08-15 19:49:06,367 Tool produced standard error failing job - [Traceback (most recent call last):
  File "/shed_tools/toolshed.g2.bx.psu.edu/repos/trinity_ctat/ctat_genome_resource_libs_data_manager/ea7bc21cbb7a/ctat_genome_resource_libs_data_manager/data_manager/add_ctat_resource_lib.py", line 879, in <module> # this is the line that says main(), not helpful in this case..
]
galaxy.jobs DEBUG 2018-08-15 19:49:06,526 (2) setting dataset 2 state to ERROR
galaxy.jobs INFO 2018-08-15 19:49:06,636 Collecting metrics for Job 2
galaxy.jobs DEBUG 2018-08-15 19:49:06,664 job 2 ended (finish() executed in (162.449 ms))
galaxy.tools.error_reports DEBUG 2018-08-15 19:49:06,674 Bug report plugin <galaxy.tools.error_reports.plugins.sentry.SentryPlugin object at 0x7f1a1e525190> generated response None

==> /home/galaxy/logs/slurmd.log <==
[2018-08-15T19:49:06.299] [3] sending REQUEST_COMPLETE_BATCH_SCRIPT, error:0
[2018-08-15T19:49:06.306] [3] done with job
Job 2 finished with state error.
Not all jobs successful! aborting...
Traceback (most recent call last):
  File "/usr/local/bin/run-data-managers", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/ephemeris/run_data_managers.py", line 294, in main
    data_managers.run(log, args.ignore_errors, args.overwrite)
  File "/usr/local/lib/python2.7/dist-packages/ephemeris/run_data_managers.py", line 255, in run
    run_jobs(self.fetch_jobs, self.skipped_fetch_jobs)
  File "/usr/local/lib/python2.7/dist-packages/ephemeris/run_data_managers.py", line 248, in run_jobs
    raise Exception('Not all jobs successful! aborting...')
Exception: Not all jobs successful! aborting...

Is there a way to display the contents of the galaxy_#.e file, perhaps?
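
If the job details returned by the Galaxy API include the captured output, something like the following could print it when a job fails. This is a sketch under assumptions: `describe_failed_job` is a made-up helper, `jobs_client` is expected to behave like bioblend's jobs client with `show_job(job_id, full_details=True)`, and the `stdout` / `stderr` keys are assumed to be present in the full details.

```python
def describe_failed_job(jobs_client, job_id):
    """Build a readable report for a job, including its stdout and stderr."""
    details = jobs_client.show_job(job_id, full_details=True)
    lines = ["Job %s finished with state %s." % (job_id, details.get("state"))]
    for stream in ("stdout", "stderr"):
        # Missing or empty streams are skipped rather than printed as blanks.
        text = (details.get(stream) or "").strip()
        if text:
            lines.append("--- %s ---" % stream)
            lines.append(text)
    return "\n".join(lines)
```

Logging this report just before raising the "Not all jobs successful" exception would put the tool's own error message into the build output, without having to disable cleanup on the Galaxy side.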

shed-tools install not installing all dependencies

shed-tools install -g http://127.0.0.1:8080/ -a 9da1b0b76e038a2a70e0649adb5205fa -t galaxy.yml --install_resolver_dependencies --latest --log_file logfile.txt

tools:
- name: tophat_fusion_post
  owner: devteam
  revisions:
  - f83394a2c2da
  tool_panel_section_id: ngs_mapping
  tool_panel_section_label: 'NGS: Mapping'
  tool_shed_url: https://toolshed.g2.bx.psu.edu/

When I click on "Manage Tools", it shows "Installed, missing tool dependencies". When checking these, it seems many 'Installed repository dependencies' and 'Missing tool dependencies' have not been installed. Is this an issue or am I misunderstanding how to use shed-tools?

New method for representing galaxy in ephemeris

At the suggestion of @mvdbeek. Testing against the docker image is overkill in most use cases and leads to very slow testing.
Testing against a simple galaxy instance that does not have a full stack of postgres, nginx etc. is better. This is implemented in planemo already. Maybe @jmchilton can give us some tips on how to go about doing this? We could make it an independent library which all galaxy projects can use.

Enable coverage testing

Last issue on the list #1.
We should enable coverage testing so we can make sure the entire ephemeris package is tested.
This may lead us to remove/deprecate functions that are untestable/unmaintainable for stability's sake. I feel it would improve our tests and our code base greatly.

Fail with more meaningful message if tool doesn't exist in toolshed

Currently this will just fail with an index error in create_tool_install_payload
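
A guard along the following lines could replace the bare index error. The names `RepositoryNotFoundError` and `latest_installable_revision` are hypothetical, and the sketch assumes the list of installable revisions is ordered oldest to newest and is empty when the Tool Shed does not know the repository.

```python
class RepositoryNotFoundError(Exception):
    """Raised when the Tool Shed has no installable revision for a repository."""


def latest_installable_revision(name, owner, installable_revisions):
    """Return the newest revision, or fail with an explanatory message."""
    if not installable_revisions:
        raise RepositoryNotFoundError(
            "Repository '%s' owned by '%s' has no installable revisions. "
            "Check the name, owner and tool_shed_url in the tool list."
            % (name, owner)
        )
    return installable_revisions[-1]
```

The message then names the offending tool list entry, which is the information missing from a plain IndexError.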

shed-tools update does not work on release 0.8

shed-tools update -a XXXXXXXXXXXXXXXX -g http://localhost:8080 
Must provide a tool list file, individual tools info , a list of data manager tasks or issue the update command. Look at usage.

Missing tests

I guess at some point we'll want to cover most of the functionality with tests, so in that spirit we can keep track of missing tests here:

Test for importing/publishing workflows (#74)

shed-tools Intermittent Connection Error

I get an error when using shed-tools. This seems to occur intermittently and never repeats at the same place (when deleting the installed tools via the GUI and restarting the installation process via shed-tools).

shed-tools install -g http://127.0.0.1:8080/ -a 9da1b0b76e038a2a70e0649adb5205fa -t files/sc1galaxy-tools.yml --install_resolver_dependencies --latest --log_file logfile.txt
...
Traceback (most recent call last):
  File "/remote/home/kuop/galaxy/.venv/bin/shed-tools", line 11, in <module>
    sys.exit(main())
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/ephemeris/shed_tools.py", line 937, in main
    install_tool_manager.install_repositories()
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/ephemeris/shed_tools.py", line 716, in install_repositories
    install_repository_revision(repository, self.tsc)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/ephemeris/shed_tools.py", line 517, in install_repository_revision
    response = tool_shed_client.install_repository_revision(**repository)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/bioblend/galaxy/toolshed/__init__.py", line 146, in install_repository_revision
    return self._post(url=url, payload=payload)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/bioblend/galaxy/client.py", line 160, in _post
    files_attached=files_attached)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/bioblend/galaxyclient.py", line 137, in make_post_request
    timeout=self.timeout)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/requests/api.py", line 112, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/remote/home/kuop/galaxy/.venv/lib/python2.7/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))

Add option to skip changeset revision from get_tool_list_from_galaxy.py

To enable this more easily
