galaxyproject / bioblend Goto Github PK

View Code? Open in Web Editor NEW

85.0 30.0 113.0 17.27 MB

A Python library for interacting with the Galaxy API

Home Page: https://bioblend.readthedocs.io/

License: MIT License

Python 98.29% Shell 1.58% Makefile 0.13%

usegalaxy api api-client cloudman hacktoberfest

bioblend's Introduction

Join the chat at https://gitter.im/galaxyproject/bioblend

BioBlend is a Python library for interacting with the Galaxy API.

BioBlend is supported and tested on:

Python 3.8 - 3.12
Galaxy release 19.05 and later.

Full docs are available at https://bioblend.readthedocs.io/ with a quick library overview also available in ABOUT.rst.

bioblend's People

Contributors

Stargazers

Watchers

Forkers

versi-australia kellrott nuwang claresloggett ilveroluca bioinformaticsarchive jdiggans chapmanb slugger70 andrewjrobinson tlipic crs4 odoppelt nsoranzo bzeitner takadonet jmchilton pradal dannon cariaso simleo bgruening tombair martenson fmareuil abretaud jmeppley hexylena davebx longtianpy raonyguimaraes intel-hss blankenberg ratzeni boratonaj afgane natefoo dgdekoning ykowsar gvlproject matthewralston kidaak mvdbeek remyd1 gregvonkuster carrieganote cganote manabuishii alenzhao annblackz damcorreia mandiayba whitesymmetry artbio ashvark lecorguille kikkomep dpryan79 marcoooo thomaswollmann dfornika patterja feigeliudan01 shaikhalvee kylelamoureux luke-c-sargent kkamieniecka pcm32 gitter-badger selten pvanheus andrewm-bose fredericbga freddielpf szjshuffle andreassko simonbray davidchristiany corburn brinkmanlab cat-bro ruanzhenglin nvk747 violethaze74 gallardoalba olegzharkov gmauro jazzed-edtech rikeshi gururajrkatti davelopez edwardsnj alejandro-bulgaris-qcif qfab-bioinformatics ksuderman jaidevjoshi83 b1nslashsh kxk302 martmists-gh alisajid

bioblend's Issues

Use bioblend for local install?

Hello,
I know this isn't an issue exactly, but, it would be extremely useful to have all of the automated Galaxy setup goodness of bioblend available for a local install on a single machine. Would this be possible without Herculean effort?

set_library_permissions unexpected behaviour

Hello,
When using set_library_permissions with a list containing all the user_ids of my galaxy instance for each of the roles, the last created user is not added to the said roles. I guess this is not working as intended.
Regards.

RuntimeValue substitution

Trying to substitute {"class": "RuntimeValue"} values while setting parameters. This returns referenced Galaxy error. Setting these as dummy values instead of runtime values allows the workflow to run as expected.

Running command:
gi.workflows.run_workflow(workflow_id, datamap, params=param_set, history_name=hist_name, import_inputs_to_history=True)

galaxy.web.framework.decorators ERROR 2015-08-24 20:20:27,397 Uncaught exception in exposed API method:
Traceback (most recent call last):
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/web/framework/decorators.py", line 251, in decorator
    rval = func( self, trans, *args, **kwargs)
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/api/workflows.py", line 199, in create
    populate_state=True,
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/workflow/run.py", line 22, in invoke
    return __invoke( trans, workflow, workflow_run_config, workflow_invocation, populate_state )
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/workflow/run.py", line 61, in __invoke
    modules.populate_module_and_state( trans, workflow, workflow_run_config.param_map )
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/workflow/modules.py", line 1068, in populate_module_and_stat
    step_errors = module_injector.inject( step, step_args=step_args, source="json" )
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/workflow/modules.py", line 1049, in inject
    step.upgrade_messages = module.check_and_update_state()
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/workflow/modules.py", line 743, in check_and_update_state
    return self.tool.check_and_update_param_values( inputs, self.trans, allow_workflow_parameters=True )
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/tools/__init__.py", line 1921, in check_and_update_param_values
    self.check_and_update_param_values_helper( self.inputs, values, trans, messages, update_values=update_values, allow_workflow_parameters=allow_workflow_parameters )
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/tools/__init__.py", line 1950, in check_and_update_param_values_helper
    values[ input.name ] = input.get_initial_value( trans, context )
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/tools/parameters/basic.py", line 1880, in get_initial_value
    return self.get_initial_value_from_history_prevent_repeats(trans, context, None, history=history)
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/tools/parameters/basic.py", line 1890, in get_initial_value_from_history_prevent_repeats
    history = self._get_history( trans, history )
  File "/home/groups/clinical/Galaxy/galaxy-dist/lib/galaxy/tools/parameters/basic.py", line 1660, in _get_history
    assert history is not None, "%s requires a history" % class_name
AssertionError: DataToolParameter requires a history

Is get_folders() working for people in current release?

I've got :

folderNew = gi.libraries.create_folder(library_id, version_label, description='Cached folder', base_folder_id=base_folder_id)
folder_matches = gi.libraries.get_folders(library_id, name=version_label)

But folder_matches is returning an empty array?! "print len(folder_matches)" displays 0 ?
I can go to the galaxy library and base folder in question and see the folder sitting there. Are there any characters not allowed in the version_label parameter? My typical folder label is like:

2014-02-26 20:19 . v23

I've done alot of outrageous experimenting with creating/deleting library folder/subfolder/file so my galaxy database for that might not be in pristine shape. But I'd think that wouldn't affect this call.

Help appreciated, thanks ...

Damion

Feature request: objects: Folder object attributes and methods

Once you have a Folder object from a Library, there is not much in the way of operations you can do.

Minimally, the parent attribute could point back to the Library or parent Folder.

Also, it'd be useful, but harder to implement, to be able to list, retrieve, and add datasets or sub-folders using the Folder object.

Question on replacement_params

I am trying to run a workflow that uses workflow parameters (specified like ${varliable_name} in the workflow. Is replacement_params the correct way to fill those values? If not is there a way?

Thanks

Error 404 in HistoryClient.download_dataset() function

Hi,
I'm having trouble to use the HistoryClient.download_dataset() function.

import bioblend.galaxy
from bioblend.galaxy.histories import HistoryClient
from bioblend.galaxy.datasets import DatasetClient
gi=bioblend.galaxy.GalaxyInstance(url='http://lbcd41.snv.jussieu.fr/galaxy/', key='xxx')
hc = HistoryClient( gi )
dc = DatasetClient( gi )
dataset_id='a72a8d806d8884e1'
history_id='86ef4cf00f944e48'
hc.download_dataset(dataset_id=dataset_id, history_id=history_id, file_path='~', use_default_filename=False)

this gives me

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-21-c1bcfd7c7e41> in <module>()
      7 dataset_id='a72a8d806d8884e1'
      8 history_id='86ef4cf00f944e48'
----> 9 hc.download_dataset(dataset_id=dataset_id, history_id=history_id, file_path='~', use_default_filename=False)

/usr/local/lib/python2.7/dist-packages/bioblend-0.5.0-py2.7.egg/bioblend/galaxy/histories/__init__.pyc in download_dataset(self, history_id, dataset_id, file_path, use_default_filename, to_ext)
    263             d_type = meta['data_type']
    264         url = '/'.join([self.gi.base_url, 'datasets', meta['id'], "display"]) + "?to_ext=" + d_type
--> 265         req = urllib2.urlopen(url)
    266         if use_default_filename:
    267             file_local_path = os.path.join(file_path, meta['name'])

/usr/lib/python2.7/urllib2.pyc in urlopen(url, data, timeout)
    125     if _opener is None:
    126         _opener = build_opener()
--> 127     return _opener.open(url, data, timeout)
    128 
    129 def install_opener(opener):

/usr/lib/python2.7/urllib2.pyc in open(self, fullurl, data, timeout)
    408         for processor in self.process_response.get(protocol, []):
    409             meth = getattr(processor, meth_name)
--> 410             response = meth(req, response)
    411 
    412         return response

/usr/lib/python2.7/urllib2.pyc in http_response(self, request, response)
    521         if not (200 <= code < 300):
    522             response = self.parent.error(
--> 523                 'http', request, response, code, msg, hdrs)
    524 
    525         return response

/usr/lib/python2.7/urllib2.pyc in error(self, proto, *args)
    446         if http_err:
    447             args = (dict, 'default', 'http_error_default') + orig_args
--> 448             return self._call_chain(*args)
    449 
    450 # XXX probably also want an abstract factory that knows when it makes

/usr/lib/python2.7/urllib2.pyc in _call_chain(self, chain, kind, meth_name, *args)
    380             func = getattr(handler, meth_name)
    381 
--> 382             result = func(*args)
    383             if result is not None:
    384                 return result

/usr/lib/python2.7/urllib2.pyc in http_error_default(self, req, fp, code, msg, hdrs)
    529 class HTTPDefaultErrorHandler(BaseHandler):
    530     def http_error_default(self, req, fp, code, msg, hdrs):
--> 531         raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    532 
    533 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: Not Found

The DatasetClient.download_dataset() function is working fine:
dc.download_dataset(dataset_id=dataset_id, file_path='~', use_default_filename=False)

Support for collecting outputs from tools that produce arbitrary number of outputs at runtime

Currently, tools that produce arbitrary number of outputs at runtime (referred to as A for arbitrary outputs), making use of the discover_datasets instruction are more complicated to use than regular tools (referred to as F for fixed outputs) given that the real outputs are only added to the history when the first output completes.

I first noticed this limitation when trying to run the tool "SamToFastq" from the "picard_plus" package using bioblend.galaxy.objects.GalaxyInstance and tool = gi.tools.get("...") ; outputs = tool.run(..., wait=True).

For the sake of user simplicity I propose that these two types of tools A and F are handled transparently by bioblend and the real outputs collected and returned by tool.run(..., wait=True).

My current understanding is that:

F type tools, when launched produce X outputs which when in state "ok" represent the actual tool results.
A type tools, when launched produce 1 output which when in state "ok" adds X outputs (in "ok" state?) to the history. The latter are the actual tool results.

I don't know if the API currently provides:

How to reliably distinguish an output from an F tool and an A tool.
Given an output from an A tool, how to reliably figure out which outputs were produced by it.

Does this seem doable or reasonable to implement in Bioblend?

bioblend.galaxy.libraries.create_history(name=None) returns a dict() instead of a list()

Hi,

I'm not sure is this is a valid issue or not, please comment on it.

I never understood why Galaxy API would return a list() when doing things like creating a data library or a folder in a data library, uploading a dataset to a history or a data library and searching for a dataset, history or library using an id. I'm guessing the last one shares code with searching using a name and that could be the reason.

In any case, if a list is the preferred response in all these cases, I can't see why create_history should be different.

Thanks,
Carlos

Feature request : add_user/roles_to_group functions

Hello!
It would be really nice to have this functions implemented in bioblend.
Regards.

Commit message not used

See L250, mis-use of dict() vs {}, the commit message is stored in a key equal to its value, never being used.

I could've sworn I saw an email/IRC message about this recently, just discovered this while reading the code.

Updated Workflow Client

bioblend doesn't implement much of the newer workflow APIs that include a lot of important functionality. In particular a performant run submission endpoint and ability to poll workflow scheduling.

get_libraries() cannot fetch by library_id

The code will only attempt to view the first library before breaking out. Believe the short circuiting break statement happen always after the first iteration.

Code in question is in get_libraries() fcn ,; line 77 to 81

Tools without input files break workflows.list()

I have a workflow that includes a tool that has no input file. This tool creates a file from the input parameters only.

Line 212 in bioblend.galaxy.objects.wrapper.py:

assert self.input_ids == set(self.inputs)

Compares the declared inputs to the number of starting points in the DAG. Since this tool has no input files, it looks just like an input file in the DAG layout and gets put into the variable self.input_ids. But, it is NOT an input, so the above assertion fails.

I can solve the problem by removing the assertion, but I'm not sure what else that breaks. The other option is to loop through the inputs and check them in the workflow data. The actual input files should have type set to 'data_input', while the tool should have type 'tool'.

BioBlend with CloudMan on Windows Azure?

Hi,

I was thinking that I might be able to provide a python library to support BioBlend with Cloudman on Windows Azure. The library has to use Windows SDK instead of boto to utilize Azure Virtual Machines but most features should be similar. More details should be described but by using azure-sdk-for-python, it shouldn't be difficult.

What do you think? I would like to hear any thoughts and comments for that.

Thanks,
Hyungro

ImportError: cannot import name tools

I've downloaded the bioblend source (0.2.4-dev) via git and installed without error but upon testing (in Windows and Linux via Python 2.7.5 and 2.7.2 respectively) from the python interpreter I get the following error:

Python 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bioblend import galaxy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build\bdist.win-amd64\egg\bioblend\galaxy\__init__.py", line 9, in <module>
ImportError: cannot import name tools

The only other (vaguely) relevant info I can find on this issue is here:

bcbio/bcbio-nextgen#12

I see the tools source in the bioblend tree so I'm unclear on what exactly is failing here. Any insight would be appreciated.

Add method to copy a HDA into a library

Add a new method to LibraryClient class in bioblend/galaxy/libraries/__init__.py .

Workflow inputs are imported into history even when using 'import_inputs_to_history=False'

Hi,

I recently found this[1] issue. I believe this is more an issue on Galaxy, but after no response on galaxy-dev I was hoping reporting it here would help bring some attention to this issue affecting Bioblend.

Also and maybe related. I mentioned in the same email to galaxy-dev that in both cases, setting 'import_inputs_to_history' to 'False' or 'True', all inputs are imported to the history in duplicates.

[1]http://dev.list.galaxyproject.org/Issue-with-Bioblend-s-run-workflow-import-inputs-to-history-td4661726.html

Thanks,
Carlos

Library.get_folder() incompatible with Library.folder_ids

When I try to get a folder using one of the ids found in Library.folder_ids, I get an error (or an arbitrary wrong object if the ids happen to line up).

It looks to me like the LibraryContentInfo class is stripping the leading "F" from the folder ids. I'm not sure why this is. The result is that the Galaxy API does not recognize the ID when passed to /api/libraries/LIBRARY_ID/content/FOLDER_ID

If I pull out the 'wrapped' dict from the folder's content_info object and access the u'id' value directly, I can get the Folder from Library.get_folder().

Here is the code that produces the error (library is a library object) and works around using library.content_infos:

library = ngi.libraries.get(libraryID)
content_info_ids=[c.wrapped['id'] for c in library.content_infos]
print "Library %s(%s)\nfolder ids: %r\ncontent info ids: %r" % (library.name,library.id, library.folder_ids, content_info_ids)

# get using folder_ids
try:
    folder=library.get_folder(library.folder_ids[0])
    print "folder_ids method: " + repr(folder.wrapped)
except Exception as e:
    # bioblend will report the error
    print "folder_ids method FAILED" 

# get using dict wrapped content_info
try:
    folder=library.get_folder(library.content_infos[0].wrapped['id'])
    print "content_infos method: " + repr(folder.wrapped)
except Exception as e:
    # bioblend will report the error
    print "content_infos method FAILED"

This prints:

Library MG ASsembly: HOT237_3_0125m(f597429621d6eb2b)
folder ids: [u'f597429621d6eb2b']
content info ids: [u'Ff597429621d6eb2b']

ERROR:bioblend:GET: error 400: '<html>\r\n  <head><title>Bad Request</title></head>\r\n  <body>\r\n    <h1>Bad Request</h1>\r\n    <p>The server could not comply with the request since\r\nit is either malformed or otherwise incorrect.\r\n\r\n<br/>Invalid LibraryDataset id ( f597429621d6eb2b ) specified\r\n<!--  --></p>\r\n    <hr noshade>\r\n    <div align="right">WSGI Server</div>\r\n  </body>\r\n</html>\r\n', 0 attempts left
folder_ids method FAILED

content_infos method: {u'parent_library_id': u'f597429621d6eb2b', u'update_time': u'2015-02-22T23:59:55.268456', u'library_path': [], u'deleted': False, u'description': u'', u'item_count': 0, u'parent_id': u'F2100007bb1a6035d', u'genome_build': None, u'model_class': u'LibraryFolder', u'id': u'Ff597429621d6eb2b', u'name': u'MG ASsembly: HOT237_3_0125m'}

_wait_datasets not accurate

Guys, I see that you monitor dataset status (in galaxy/objects/galaxy_instance.py) with a snippet like this:

        self.log.info('waiting for datasets')
        datasets = [_ for _ in datasets if _.state in _PENDING_DS_STATES]
        while datasets:
            time.sleep(polling_interval)
            for i in xrange(len(datasets)-1, -1, -1):
                ds = datasets[i]
                ds.refresh()
                self.log.info('{0.id}: {0.state}'.format(ds))
                if break_on_error and ds.state == 'error':
                    raise RuntimeError(_get_error_info(ds))
                if ds.state not in _PENDING_DS_STATES:
                    del datasets[i]

The problem with with running this immediately after a workflow.run() is that the logic assumes you have access to all dataset statuses while they are still in a pending state. In the event that galaxy is slow to relinquish control of the queuing connection, some datasets may have already switched from a pending state to a failed or ok state. Then your break_on_error is not able to catch those early problems in the workflow, because they are never added to the datasets queue. Make sense?

Documentation inconsistencies

So! I was writing https://github.com/galaxy-iuc/parsec and realised that writing wrappers was dumb, so naturally I automated the process. For the command line parameter documentation, I discovered that Bioblend kept really great docs! Thanks y'all! However...there are some minor inconsistencies that I'll be making a PR on soon, but I have a few questions before the PR is completed:

bioblend.galaxy.histories show_history() are deleted/visible supposed to be booleans? Additionally, other values for details should be documented, or switched to a bool if all/none are the only options. Types is undocumented.
bioblend.galaxy.histories create_history_tag() does this really return a json object? Or should that be a dict
bioblend.galaxy.histories update_dataset/update_dataset_collection both take kwds, probably should specify just what goes in there.
bioblend.galaxy.libraries create_library() Does this really return a list? it's a create method...that really seems like it should return a single dict nota list.
bioblend.galaxy.libraries upload_file_from_server() here it's specified that if folder_id is empty, then the root folder will be used. Is this true everywhere? Should this be an optional param with None as a default? What does roles do?
upload_file_from_server/upload_from_galaxy_filesystem are confusingly similar, but that's not bioblend's fault.
bioblend.galaxy.libraries upload_file_from_server() what does roles do?

Minor complaint (and goodness knows I'm guilty of abusing locals() as well), but several places in the codebase switch between named args, and **kwargs automatically passed through. My automating wrapping of all the functions inspect method signatures (since I trust those more than the docs...)

Tool Shed Users API

Implement client to the Tool Shed users API.

Support upload of FTP files.

Add support for uploading files from FTP (if configured by the server admin) in bioblend/galaxy/tools/__init__.py (using payload['files_0|ftp_files']) and bioblend/galaxy/objects/wrappers.py .

Reported by inf_b_ on IRC: https://botbot.me/freenode/galaxyproject/2015-07-28/?msg=45759506

bioblend-0.4.1.tar.gz broken

the tar file at https://pypi.python.org/pypi/bioblend/0.4.1 is broken because it contains unresolved merge conflicts which prevent me from installing it.

~/xtmp/bioblend-0.4.1/bioblend> ls
init.py cloudman config.pyc galaxyclient.py galaxyclient.py.BASE.5106.py galaxyclient.py.REMOTE.5106.py galaxyclient.pyc util
init.pyc config.py galaxy galaxyclient.py.BACKUP.5106.py galaxyclient.py.LOCAL.5106.py galaxyclient.py.orig toolshed

download dataset using local copy

Curious to see if anyone would be interesting for the ability to copy a dataset using shutil.copyfile if the dataset file path is accessible. It would be useful in cases when you want to avoid sending a huge file(s) through a http get request and where both your galaxy instance and local machine share a common storage backend.

Anyone be interested? Got some code that does it....

Argument error in "install_repository_revision"

I am trying to install some tools in our local Galaxy instance through the BioBlend API. My code is as follows:

#parse tool JSON
try:
    with open(toolfile) as data_file:
        tools = json.load(data_file)
except TypeError:
    sys.exit("Could not read toolfile")
#install specified tools
for tool in tools:
    print(tool["install_tool_dependencies"])
    gi.toolShed.install_repository_revision(
            tool_shed_url = tool["toolshed"], 
            name = tool["id"], 
            owner = tool["owner"],
            install_tool_dependencies = tool["install_tool_dependencies"], 
            install_repository_dependencies = tool["install_repository_dependencies"], 
            tool_panel_section_id = tool["tool_panel_section_id"], 
            new_tool_panel_section_label = tool["tool_panel_section_id"]
    )

the argument are parsed from a provided json file. As follows:

[
    {
    "id":"data_manager_bowtie2_index_builder",
    "owner":"devteam",
    "install_tool_dependencies":true,
    "install_repository_dependencies":true,
    "tool_panel_section_id":null,
    "toolshed":"https://toolshed.g2.bx.psu.edu/"
}]

but when I run the code I get the following error:

Traceback (most recent call last):
  File "config_galaxy.py", line 60, in <module>
    _galaxyConfig(args.instance, args.api_key, args.tools)
  File "config_galaxy.py", line 41, in _galaxyConfig
    new_tool_panel_section_label = tool["tool_panel_section_id"]
TypeError: install_repository_revision() takes at least 5 arguments (8 given)

Anybody any ideas why? Is there a mandatory argument short? I have not defined the revision code, because I just want the latest revision.

Thanks
Matthias

possible race condition

My usage of bioblend is perhaps a bit weird, so this may be due to my own extras, but I just got a stack trace and a crash which looks like it may be due to a race condition between launching a machine, and being able to determine it's identifier. Perhaps you'll see something there and choose to guard against this presumably rare race condition

[corkscrewmachine] out: Waiting for this CloudMan instance to start...
[corkscrewmachine] out: Traceback (most recent call last):
[corkscrewmachine] out: File "/usr/local/lib/python2.7/dist-packages/fabric/main.py", line 743, in main
[corkscrewmachine] out: _args, *_kwargs
[corkscrewmachine] out: File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 405, in execute
[corkscrewmachine] out: results[''] = task.run(_args, *_new_kwargs)
[corkscrewmachine] out: File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 171, in run
[corkscrewmachine] out: return self.wrapped(_args, *_kwargs)
[corkscrewmachine] out: File "/home/ubuntu/enza/inception/fabfile.py", line 145, in boot
[corkscrewmachine] out: cmi.initialize(cluster_type='Galaxy', initial_storage_size=initial_storage_size)
[corkscrewmachine] out: File "/usr/local/lib/python2.7/dist-packages/bioblend/cloudman/init.py", line 44, in wrapper
[corkscrewmachine] out: obj.wait_till_instance_ready(timeout, interval)
[corkscrewmachine] out: File "/usr/local/lib/python2.7/dist-packages/bioblend/cloudman/init.py", line 333, in wait_till_instance_ready
[corkscrewmachine] out: raise VMLaunchException(msg)
[corkscrewmachine] out: bioblend.cloudman.VMLaunchException: 'Error launching an instance: Problem updating instance 'i-1e30314f' state: EC2ResponseError: 400 Bad Request\n\nInvalidInstanceID.NotFoundThe instance ID 'i-1e30314f' does not exist2e6eb61f-5e0c-410a-a484-d5a143a1bfc6'
[corkscrewmachine] out:

Reduced Travis testing matrix.

Test only

dev with Python 2.6
dev with Python 2.7
release_15.03 with Python 2.7
release_15.01 with Python 2.7
release_14.10 with Python 2.7

See discussion here #120 (comment).

Add Endpoint for Tool Shed Repository Creation

galaxyproject/galaxy#2 (comment)

Update bioblend documentation

Cannot seem to figure which argument to pass the makefile in the doc/ directory that will update documentation that being presented at http://bioblend.readthedocs.org/en/latest/

set_library_permissions bug

Hello,

Sadly, apparently this is not only a UI bug.
I am trying to create a tool in order to allow normal users to create a data library with the right permissions for them on my Galaxy instance.

When using the set_library_permissions method, it sometimes works as intended, and sometimes gives the permissions to another user. I have no idea what is causing that.

Here is my script : http://pastebin.com/zHB89XgG
(Of course, if I uncomment the asserts, I get assertion errors, also, I tried as well with the user id formatted as a list, and i get the exact same behaviour)

Here is the stdout : http://pastebin.com/eExmWVB3
There is not any stderr, and no errors showing in the paster.log .

I am using python 2.7.3, and the latest version of Galaxy.
Let me know if i can provide anything else...
Regards,
Loïc

gi.workflows.list() fails when workflow contains unconnected input dataset

I have a Galaxy workflow with a number of input datasets connected to a full graph of work. I also have two input datasets that are not connected to anything. So long as the two inputs exist without connection the Bioblend api throws an error:

Traceback (most recent call last):
  File "/home/ubuntu/monitor.py", line 303, in <module>
    m.run_workflow(w_name, h_name, l_name)
  File "/home/ubuntu/monitor.py", line 283, in run_workflow
    raise e
AssertionError

Deleting the two inputs brings the API back to life.

JSONDecodeError for executing show_workflow()

Hi,

I tried to call show_workflow() but I got errors like this:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/bioblend/galaxy/workflows/__init__.py", line 45, in show_workflow
    return Client._get(self, id=workflow_id)
  File "/usr/local/lib/python2.7/dist-packages/bioblend/galaxy/client.py", line 35, in _get
    return r.json()
  File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 651, in json
    return json.loads(self.text or self.content, **kwargs)
  File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 413, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 402, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 420, in raw_decode
    raise JSONDecodeError("No JSON object could be decoded", s, idx)
simplejson.decoder.JSONDecodeError: No JSON object could be decoded: line 1 column 0 (char 0)

The code snippets that I tried look like this:

from bioblend.galaxy import GalaxyInstance

galaxy_url = "http://127.0.0.1:8080"
galaxy_api_key = "d8688f27a08cc6f42a065e39966b6c47"
gi = GalaxyInstance(url=galaxy_url, key=galaxy_api_key)
workflows = gi.workflows.get_workflows()[0]
gi.workflows.show_workflow(workflows['id'])

So, I printed the workflows variable to make sure it has an id.

print workflows
{u'name': u"Workflow 1", u'tags': [], u'url': u'/api/workflows/f597429621d6eb2b', u'published': True, u'model_class': u'StoredWorkflow', u'id': u'f597429621d6eb2b'}

Is there something that I missed?

Thanks,
Hyungro

Python 3 Support

Path forward:

Add dependency on six.
Switch all changed imports to use six.moves, at this point all the files would be importable within Python 3 - which would help downstream projects like planemo.
Replace unmaintained poster dependency - (requests-toolbelt may be a solution)
~~Test dropping in boto3 in lieu of boto (or just wait for it to replace the project).~~ Not needed
Update the actual BioBlend code for Python 3.
Use tox for testing.
Setup a branch for Python 3 support using Travis to test against Python 3.4.

Download returning empty file

Trying to get the contents of a dataset is returning nothing:

dataset.get_contents()

''

I have my Galaxy instance configured to make the system file names visible. Looking at the file returns data:

print "".join(open(dataset.file_name).readlines())

stats_info  bases   1130197476
stats_info  reads   7511356
stats_len   max 151
stats_len   mean    150.47
stats_len   median  151
stats_len   min 32
stats_len   mode    151
stats_len   modeval 4812241
stats_len   range   120
stats_len   stddev  1.97

Error running long workflow

Hi,

I have ran long workflow (50 steps) and things seem to be working fine for sometimes before it crashes. First all the data get loaded into the library. The history gets created and then around 40/50 steps get scheduled (looking at the new created history). The script keeps running but then it crashes and here is the log:

Traceback (most recent call last):
File "run_imported_workflow.py", line 136, in
result = gi.workflows.run_workflow(workflow, datamap, history_id=outputhist, import_inputs_to_history=True)
File "build/bdist.linux-x86_64/egg/bioblend/galaxy/workflows/init.py", line 177, in run_workflow
File "build/bdist.linux-x86_64/egg/bioblend/galaxy/client.py", line 70, in _post
File "build/bdist.linux-x86_64/egg/bioblend/galaxyclient.py", line 94, in make_post_request
bioblend.galaxy.client.ConnectionError: Unexpected response from galaxy: 500

My question, why only 40 not all the steps get scheduled? Is the cause for the above issue relating to waiting time?

By the way, the workflow gets executed if I just ran from the Galaxy GUI.

Any help is highly appreciated.

Run_workflow is synchronous?

Inside the Workflow class (I'm using the Galaxy Objects), I see:

      res = self.gi.gi.workflows.run_workflow(self.id, **kwargs)
        # res structure: {'history': HIST_ID, 'outputs': [DS_ID, DS_ID, ...]}
        out_hist = self.gi.histories.get(res['history'])
        assert set(res['outputs']).issubset(out_hist.dataset_ids)
        outputs = [out_hist.get_dataset(_) for _ in res['outputs']]

        if wait:
            self.gi._wait_datasets(outputs, polling_interval=polling_interval,
                                   break_on_error=break_on_error)

With wait=False the run() command is supposed to be asynchronous. However, in all my tests, the run_workflow() command is synchronous and does not relinquish control until the last step of the workflow completes or fails.

How do we get run_workflow asynchronous?

No documentation for Roles on readthedocs

This is related to issue #141 where there was an issue with the documentation for set_library_permissions. However, there is currently no section on http://bioblend.readthedocs.org/en/latest/api_docs/galaxy/all.html#object-oriented-galaxy-api documenting the Roles.

Some ToolShedClient functions give a 404

Bioblend doesn't seem to behave as the manual (at page http://bioblend.readthedocs.org/en/latest/api_docs/galaxy/all.html?highlight=toolshedclient#bioblend.galaxy.toolshed.ToolShedClient) describes. When I run the following code:

from bioblend.galaxy import GalaxyInstance
import bioblend
bioblend.__version__
>>> '0.6.1'
from bioblend import toolshed
gi = GalaxyInstance(url='127.0.0.1', key='<key>')
tsc = bioblend.toolshed.repositories.ToolShedClient(gi)
tsc.repository_revisions()

It gives the following 404:

bioblend.galaxy.client.ConnectionError: GET: error 404: '<html>\r\n  <head><title>Not Found</title></head>\r\n  <body>\r\n    <h1>Not Found</h1>\r\n    <p>The resource could not be found.\r\n<br/>No route for /api/repository_revisions\r\n<!--  --></p>\r\n    <hr noshade>\r\n    <div align="right">WSGI Server</div>\r\n  </body>\r\n</html>\r\n', 0 attempts left: None

Could it be that the url has moved from:

gi.url + '/repository_revisions'

https://github.com/galaxyproject/bioblend/blob/master/bioblend/toolshed/repositories/__init__.py#L236
https://github.com/galaxyproject/bioblend/blob/master/bioblend/toolshed/repositories/__init__.py#L290

to:

gi.url + '/tool_shed_repositories'

When I continue with the following code:

tsc.get_repositories()

It also gives a 404 error:

bioblend.galaxy.client.ConnectionError: GET: error 404: '<html>\r\n  <head><title>Not Found</title></head>\r\n  <body>\r\n    <h1>Not Found</h1>\r\n    <p>The resource could not be found.\r\n<br/>No route for /api/repositories\r\n<!--  --></p>\r\n    <hr noshade>\r\n    <div align="right">WSGI Server</div>\r\n  </body>\r\n</html>\r\n', 0 attempts left: None

Similarly, tsc.show_repository('65d5559f7c7e2cb9') results in a 404 to /api/repositories/65d5559f7c7e2cb9. I think this url should be /api/tool_shed_repositories/65d5559f7c7e2cb9.

To solve that, I guess the following needs to be changed (https://github.com/galaxyproject/bioblend/blob/master/bioblend/toolshed/repositories/__init__.py#L13):

self.module = 'repositories'

self.module = 'tool_shed_repositories'

p.s. The API key I use is linked to an admin account.

Question about parameter in workflow

Hi,

I'm writing a script for automating a workflow with bowtie2 in it. I have downloaded indices from bowtie2 website and wrote them in bowtie2_indices.loc. But how could I choose which reference index to use according to different sequences I will be using? I noticed that in run_workflow(), there is an argument 'param', but I'm not sure how to write this dictionary.
Maybe like:
param = {'bowtie2': {'source':'indexed', {'index':'hg19'}}}

Run_workflows returns list of workflows in stead of a dict containing a history id

Hi!

I am trying out the Galaxy API of blend I am running in this issue when invoking a workflow via the API. The call is like this:
gi.workflows.run_workflow(workflow_id=u'36ddb788a0f14eb3',dataset_map={'1186': {'id': '113b5e0ab7b052dd', 'src': 'ld'}, '1189': {'id': '999d792a6e3b4811', 'src': 'ld'}},history_name="Test API run", import_inputs_to_history=True)

And the output is this:
[{u'id': u'b847e822bdc195d0',
u'name': u'A Workflow',
u'url': u'/api/workflows/b847e822bdc195d0'},
{u'id': u'eafb646da3b7aac5',
]

So no history id in the dictionary and the workflow isn't run too. Am I calling the API wrong or is this a bug?

Thanks!

Tags cannot be added to a history

Galaxy commit bit: 32edb07930b17a868ed4088080b8d7fc577d1ce7

This is how I'm calling it:
_historyClient.create_history_tag(history.get('id'), "processed")
such that history is a history object and _historyClient is the GalaxyInstance.histories wrapper

It looks like it's dying on the server side, but I'm not sure.

Stack Trace and Error Message:
Traceback (most recent call last):
File "get_stats.py", line 64, in
_historyClient.create_history_tag(history.get('id'), "processed")
File "/usr/local/lib/python2.7/dist-packages/bioblend/galaxy/histories/init.py", line 314, in create_history_tag
return Client._post(self, payload, url=url)
File "/usr/local/lib/python2.7/dist-packages/bioblend/galaxy/client.py", line 171, in _post
files_attached=files_attached)
File "/usr/local/lib/python2.7/dist-packages/bioblend/galaxyclient.py", line 104, in make_post_request
r.status_code, body=r.text)
bioblend.galaxy.client.ConnectionError: Unexpected response from galaxy: 500: {"traceback": "Traceback (most recent call last):\n File "/Warehouse/Users/dbouchard/galaxy/lib/galaxy/web/framework/decorators.py", line 251, in decorator\n rval = func( self, trans, _args, *_kwargs)\n File "/Warehouse/Users/dbouchard/galaxy/lib/galaxy/webapps/galaxy/api/item_tags.py", line 39, in create\n tag = self._apply_item_tag( trans, self.tagged_item_class, kwd[ self.tagged_item_id ], tag_name, value )\n File "/Warehouse/Users/dbouchard/galaxy/lib/galaxy/web/base/controller.py", line 2043, in _apply_item_tag\n tag_assoc = self.get_tag_handler( trans ).apply_item_tag( trans, user, tagged_item, tag_name, tag_value )\nTypeError: apply_item_tag() takes at most 5 arguments (6 given)\n", "err_msg": "Uncaught exception in exposed API method:", "err_code": 0}

ImportError: cannot import name tools

Sometimes alternates as 'cannot import name libraries'

In [1]: from bioblend.galaxy import GalaxyInstance
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-37a6e20ea420> in <module>()
----> 1 from bioblend.galaxy import GalaxyInstance

/Library/Python/2.7/site-packages/bioblend/galaxy/__init__.py in <module>()
      7 import urllib2
      8 import simplejson
----> 9 from bioblend.galaxy import (libraries, histories, workflows, datasets, users, genomes, tools)
     10
     11

ImportError: cannot import name tools

Use actual API endpoint for dataset downloads (at least for HDAs).

http://dev.list.galaxyproject.org/Problems-downloading-files-via-bioblend-td4665435.html

Unexpected error response

In some of the commits I've made to planemo (and the entirety of parsec), I've taken to attempting to parse the bioblend error code text as JSON because bioblend primarily passes the json response from Galaxy straight to the user.

However, in https://github.com/galaxyproject/bioblend/blob/master/bioblend/galaxyclient.py#L103 I encountered one of the few times when bioblend doesn't do that. Due to the prefixed text ("Unexpected...400: ") I cannot extract the nice err_msg key from the dict, and thus have to render the entire error message.

Would a PR be accepted to change this to just raise r.text instead of the formatted string? Submitted as an issue first in case there's commentary on changing error message strings.

Have Travis test against both Galaxy master and dev branches.

unknown step type: u'data_collection_input', when running gi.workflows.list()

Hi guys,

I'm unable to list workflows via the object API - I get a ValueError. The data_collection_input isn't in the allowable set of step types.

the offending line:
https://github.com/galaxyproject/bioblend/blob/master/bioblend/galaxy/objects/wrappers.py#L274

Cheers,
Cam

Release version 0.6.1

@jmchilton @nsoranzo
I'd like to make the next minor release. You guys have something you'd like to get in or can I prepare it?

Tests fail after galaxy-central update

The commit at https://github.com/Unode/bioblend/commit/30b8dfd77bf1dbdc4f51cdcffd6aba75fa228122 addresses the configuration changes though tests still fail after this.

Could you try to rebuild it on travis to confirm this.

Thanks

docs for tool_inputs parameter to run_tool()

I could not find anything anywhere on how to construct tool_inputs dictionary for GalaxyInstance.tools.run_tool() method. Could you document it, please?

It is not immediately obvious neither from your tests/TestGalaxyTools.py nor from Galaxy code in lib/galaxy/webapps/galaxy/api/tools.py.

I also notice that executing workflows, as opposed as tools, is much better documented both here and in Galaxy API Wiki. Is that by intent? Is running tools directly discouraged? The problem with workflows is that they have to be hand-edited every time the tools used in them are upgraded.

Thanks,
Andrey