
dalmatian's People

Contributors

agraubert, bknisbac, danielrosebrock, francois-a, jcha40, jkobject, julianhess, sammeier

dalmatian's Issues

Entity does not exist Error

Hello, I am trying to add pairs to a workspace.

I have a dataframe indexed by pair_id with case_sample, control_sample and participant columns, all referring to entities that already exist.

I have created the pair entity type on the workspace.
As there is no update_pair() function, I am using update_entity_attributes('pair', newpairs), but it always rejects every pair with "message":"Entity does not exist".

I have tried on other workspaces and get the same message.

Thanks

TypeError: string indices must be integers

Hello,

I am trying to run a pipeline by passing it a sample_set, which is what it is meant to receive, and I get the error below.
'All_samples' does not exist in pair_sets, and the workflow is parametrized for sample_sets.

I don't understand why it is fetching pairs at all.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-170-ec321d99cc07> in <module>
----> 1 CreatePanelOfNormalsGATK_PANCAN = wto.create_submission('CreatePanelOfNormalsGATK_PANCAN', 'All_normals')
      2 DepthOfCovQC_PANCAN = wto.create_submission('DepthOfCovQC_PANCAN', samplesetname + "_all", 'sample_set', expression='this.samples')
      3 print("waiting for 'DepthOfCovQC_PANCAN' & 'CNV_CreatePoNForCNV'")
      4 terra.waitForSubmission(wto, [DepthOfCovQC_PANCAN, CreatePanelOfNormalsGATK_PANCAN])

/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in create_submission(self, config, entity, etype, expression, use_callcache)
   1433             except ConfigNotFound:
   1434                 self.update_config(config)
-> 1435         preflight = self.preflight(config, entity, expression, etype)
   1436         if not preflight.result:
   1437             raise ValueError(preflight.reason)

/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in preflight(self, config_name, entity, expression, etype)
   1378             etype,
   1379             entity,
-> 1380             (expression if expression is not None else 'this')+'.%s_id' % config['rootEntityType']
   1381         )
   1382         if isinstance(workflow_entities, dict) and 'statusCode' in workflow_entities and workflow_entities['statusCode'] >= 400:

/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in call_with_lock(self, *args, **kwargs)
    149     def call_with_lock(self, *args, **kwargs):
    150         with self.lock:
--> 151             return func(self, *args, **kwargs)
    152     return call_with_lock
    153 

/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in evaluate_expression(self, etype, entity, expression)
   1308             evaluator.add_entities(
   1309                 _etype,
-> 1310                 self._get_entities_internal(_etype)
   1311             )
   1312         if 'workspace' in expression:

/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in _get_entities_internal(self, etype)
    517 
    518     def _get_entities_internal(self, etype):
--> 519         return getattr(self, 'get_{}s'.format(etype))()
    520 
    521     @_synchronized

/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/base.py in get_pair_sets(self)
   1026         """Get DataFrame with sample sets and their attributes"""
   1027         df = self.get_entities('pair_set')
-> 1028         df['pairs'] = df['pairs'].apply(lambda x: [i['entityName'] for i in x] if type(x) is list and np.all(pd.notnull(x)) else x)
   1029 
   1030         # # convert JSON to table

/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3589             else:
   3590                 values = self.astype(object).values
-> 3591                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   3592 
   3593         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/base.py in <lambda>(x)
   1026         """Get DataFrame with sample sets and their attributes"""
   1027         df = self.get_entities('pair_set')
-> 1028         df['pairs'] = df['pairs'].apply(lambda x: [i['entityName'] for i in x] if type(x) is list and np.all(pd.notnull(x)) else x)
   1029 
   1030         # # convert JSON to table

/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/base.py in <listcomp>(.0)
   1026         """Get DataFrame with sample sets and their attributes"""
   1027         df = self.get_entities('pair_set')
-> 1028         df['pairs'] = df['pairs'].apply(lambda x: [i['entityName'] for i in x] if type(x) is list and np.all(pd.notnull(x)) else x)
   1029 
   1030         # # convert JSON to table

TypeError: string indices must be integers

Best,

Uploading samples takes 8 cores for 30 min

dm v0.0.17:

Re-uploading samples after a quick change runs a parallel upload across all of my CPUs.

It takes around 30 minutes to complete, although the dataframe is not that big: 1500x200.

Is there any way to make this as fast as it used to be?

Best,

dry_run logic in delete_entity_attributes

Hi,
When running e.g. delete_sample_attributes with dry_run=True, the current logic within delete_entity_attributes dictates that dry_run mode is active only if delete_files=True.

I think it would be more intuitive, and would prevent accidental deletions, if dry_run=True always printed the attributes that would be deleted without actually removing them from the data model.

Thank you for dalmatian - it's super useful!
Binyamin
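
The behaviour requested above can be sketched roughly as follows. This is only an illustration of the control flow, not dalmatian's implementation: `delete_attributes` and its toy dict-based data model are hypothetical, and only the `dry_run` and `delete_files` flag names come from the post.

```python
def delete_attributes(attrs, to_delete, dry_run=False, delete_files=False):
    """Hypothetical stand-in for the deletion logic; not dalmatian's code.

    attrs     -- dict mapping attribute name to value (a toy data model)
    to_delete -- iterable of attribute names to remove
    Returns the list of attribute names that were (or would be) removed.
    """
    targets = [name for name in to_delete if name in attrs]
    if dry_run:
        # Report only, regardless of delete_files; nothing is modified.
        print('[dry-run] would delete attributes:', ', '.join(targets))
        if delete_files:
            print('[dry-run] would also delete the files they point to')
        return targets
    for name in targets:
        # Removal of the underlying files is left out of this sketch.
        del attrs[name]
    return targets
```

The point of the sketch is that the `dry_run` check comes first and does not depend on `delete_files`, so a dry run can never modify the data model.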

Most recent version not on pip

WorkspaceManager.import_config raises an UnboundLocalError with the version currently on PyPI, but when I changed core.py to match the one on GitHub, it worked.

get_participants cannot handle lists of non-strings

(Probably related to #32)

I recently added a column containing lists of ints to the participant table, and now get_participants fails:

-> 1378 df = df.applymap(lambda x: [i['entityName'] if 'entityName' in i else i for i in x]
   1379 if isinstance(x, list) and np.all(pd.notnull(x)) else x)

... (long stack trace) ...

TypeError: argument of type 'int' is not iterable
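
The expression in the traceback evaluates `'entityName' in i` for every list element, which fails when an element is an int. A minimal sketch of a more defensive version, assuming entity references arrive as dicts with an `entityName` key (as the traceback suggests); `unwrap_entity_names` is a hypothetical helper, not part of dalmatian:

```python
def unwrap_entity_names(x):
    """Replace entity-reference dicts in a list with their names.

    Hypothetical helper. Elements that are not dicts carrying an
    'entityName' key (ints, strings, ...) are passed through unchanged,
    and non-list values are returned as they are.
    """
    if not isinstance(x, list):
        return x
    return [
        i['entityName'] if isinstance(i, dict) and 'entityName' in i else i
        for i in x
    ]


# A list of entity references, a list of ints, and a scalar cell
print(unwrap_entity_names([{'entityName': 's1'}, {'entityName': 's2'}]))
print(unwrap_entity_names([1, 2, 3]))
print(unwrap_entity_names(61))
```

A function like this could be applied to each cell of the table in place of the lambda; the `isinstance(i, dict)` check is what keeps lists of ints or plain strings from raising.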

New WorkspaceManager constructor incompatible

The old WorkspaceManager constructor signature (as described in the readme) was

dalmatian.WorkspaceManager(<namespace>, <workspace>)

However, this commit changed the signature to WorkspaceManager("<namespace>/<workspace>"). This breaks a lot of old scripts that were expecting the old signature.

Would it be possible to revert this @agraubert ?
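
One way both call styles could coexist is to normalise the arguments before use. The sketch below is an assumption about how that might look, not the library's code; `parse_workspace_args` is a hypothetical helper.

```python
def parse_workspace_args(*args):
    """Return (namespace, workspace) from either calling convention.

    Hypothetical helper, not part of dalmatian. Accepts either
    two positional strings or a single 'namespace/workspace' string.
    """
    if len(args) == 2:
        namespace, workspace = args
    elif len(args) == 1 and args[0].count('/') == 1:
        namespace, workspace = args[0].split('/')
    else:
        raise ValueError(
            "expected (namespace, workspace) or 'namespace/workspace'"
        )
    return namespace, workspace


print(parse_workspace_args('my-namespace', 'my-workspace'))
print(parse_workspace_args('my-namespace/my-workspace'))
```

With something like this at the top of a constructor, old scripts using two arguments and new scripts using the combined string would resolve to the same pair.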

Unknown firecloud model entity type

When I use upload_entities for an entity type other than participant, sample or sample set, it never works.

df.index.name = [entityname]_id
upload_entities(entityname, df)

I get:

APIException: Sample_group import failed.: (400) : {
  "causes": [],
  "message": "ErrorReport(FireCloud,Unknown firecloud model entity type: sample_group,Some(400 Bad Request),List(),List(),None)",
  "source": "FireCloud",
  "stackTrace": [],
  "statusCode": 400,
  "timestamp": 162462155497

workflow graphs

Hi, we have code that builds workflow graphs (with nodes being workflows rather than tasks) within a workspace. You can find it here. It helps us figure out how workflows are piped into each other. Is there any interest in incorporating it into dalmatian? I can make a PR if it seems worthwhile.

import_config error

I'm using firecloud-dalmatian==0.0.17 and get an error message when trying to import a public config:
wm.import_config('broadinstitute-gtex/rnaseq_bam_star_rsem_rnaseqc_v1-2_BETA_cfg')


UnboundLocalError Traceback (most recent call last)
in
----> 1 wm.import_config('broadinstitute-gtex/rnaseq_bam_star_rsem_rnaseqc_v1-2_BETA_cfg')

~/anaconda3/lib/python3.7/site-packages/dalmatian/wmanager.py in import_config(self, reference)
1161 """
1162 # get latest snapshot
-> 1163 c = get_config(reference)
1164 if len(c)==0:
1165 raise ValueError('Configuration "{}" not found (name must match exactly).'.format(reference))

~/anaconda3/lib/python3.7/site-packages/dalmatian/core.py in get_config(reference)
634 3) reference = "name"
635 """
--> 636 namespace, name, version = decode_config(namespace, name, decode_only=True)
637 return _get_config_internal(namespace, name)
638

UnboundLocalError: local variable 'namespace' referenced before assignment

The wm object is okay, because commands like wm.get_samples() work as expected.
Any suggestions?

Thanks,
Binyamin
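
The traceback shows `namespace` being read on the right-hand side of the same statement that assigns it, inside a function whose only parameter is `reference`. The sketch below reproduces that mechanism in isolation; `split_reference`, `broken_lookup` and `fixed_lookup` are hypothetical names, and this is not a claim about how the GitHub version of core.py resolves it.

```python
def split_reference(reference):
    """Hypothetical parser for 'namespace/name' strings."""
    namespace, _, name = reference.partition('/')
    return namespace, name


def broken_lookup(reference):
    # 'namespace' is assigned on the left, so Python treats it as a local
    # of this function; reading it on the right before assignment fails.
    namespace, name = split_reference(namespace)
    return namespace, name


def fixed_lookup(reference):
    # Pass the parameter that actually holds the value.
    namespace, name = split_reference(reference)
    return namespace, name


try:
    broken_lookup('some-namespace/some-config')
except UnboundLocalError as err:
    error_message = str(err)
    print('UnboundLocalError:', error_message)

print(fixed_lookup('some-namespace/some-config'))
```

Because the error comes from the function body itself, it is consistent with the `wm` object working fine for other calls such as `wm.get_samples()`.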

Upload files to workspace

Hi, is there any option to upload files (e.g. interval lists) to the Workspace Data using this? I wasn't able to find that in the documentation. Thanks

API exception when trying to connect

When I try to do any action on Terra, I get the error message below, which prevents me from running any pipeline.

Do you have any idea what to do here?
I ran gcloud auth application-default login several times and even reinstalled firecloud at the latest version, but nothing seems to work.

APIException: Unable to query samples: (401) : <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head><script src="https://jsagent.tcell.io/tcellagent.min.js" tcellappid="FCProd-EBOgM" tcellapikey="AQQBBAFLGLOxL7VE9IF9ESlLvCxD5Ykr_7xkQKq_rgn_P58IWjOhOzIh6p3aI4pTWaprlUw" tcellbaseurl="https://us.agent.tcell.insight.rapid7.com/api/v1"></script>
<title>401 Unauthorized</title>
</head><body>
<h1>Unauthorized</h1>
<p>This server could not verify that you
are authorized to access the document
requested.  Either you supplied the wrong
credentials (e.g., bad password), or your
browser doesn't understand how to supply
the credentials required.</p>
<hr>
<address>Apache Server at api.firecloud.org Port 443</address>
</body></html>

Thanks!

wm.get_config() does not return all parameters

It displays some of the optional inputs, but not all of them.

I tried with wm.get_config("gatk","CNV_Somatic_Panel_Workflow").

The function seems to do a lot of post-processing of the request response.
I don't know what might be causing it.

Best,
:)

wm.get_sample_sets() function is returning an error

Hi,

I have been receiving errors when attempting to retrieve the sample set table from the data model. This function worked for me last week. For clarity this is the full code I used to return a sample set:
wm = dalmatian.WorkspaceManager('vanallen-firecloud-dfci/ogct_germline_analysis')
sampleset = wm.get_sample_sets()

This is the error I receive:

TypeError: ("argument of type 'int' is not iterable", 'occurred at index scatterIndices')

Any help would be greatly appreciated. I love your tool and use it daily!

All the best,
Sabrina

configuration error

When I submit new jobs, any change to the configuration JSON results in the same error, for all workspaces and pipelines:

User provided a full method configuration, which did not match the configuration already present on the workspace
Either upload the provided configuration, or use a different reference type to fetch the online version
---------------------------------------------------------------------------
ConfigNotUnique                           Traceback (most recent call last)
<ipython-input-21-2e080c5450d6> in <module>()
      1 samtofastq['samtofastq_workflow.samtofastq.input_bam_cram']= 'this.WES_bam'
----> 2 refwm.create_submission(samtofastq, samplesetname,'sample_set',expression='this.samples')

~/anaconda2/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in create_submission(self, config, entity, etype, expression, use_callcache)
   1430                     print("User provided a full method configuration, which did not match the configuration already present on the workspace", file=sys.stderr)
   1431                     print("Either upload the provided configuration, or use a different reference type to fetch the online version", file=sys.stderr)
-> 1432                     raise ConfigNotUnique("Provided configuration did not match live version {}/{}".format(cfg['namespace'], cfg['name']))
   1433             except ConfigNotFound:
   1434                 self.update_config(config)

ConfigNotUnique: 'Provided configuration did not match live version broadinstitute_gtex/samtofastq_v1-0_BETA_cfg'
