getzlab / dalmatian
dalmatian is a collection of high-level companion functions for Firecloud and FISS.
The function update_pair_attributes() exists in the dalmatian source code, but is not callable in an actual python session.
Hello, I am trying to add pairs to a workspace.
I have a dataframe with pair_id as index and case_sample, control_sample, participants (all of which already exist).
I have created the entity pair on the workspace.
As there is no update_pair() function, I am using update_entity_attributes('pair', newpairs), but it always rejects every pair with the error "message":"Entity does not exist".
I have tried other workspaces and get the same message.
Thanks
Hello,
I want to run a pipeline by passing it a sample_set, which is what it is meant to receive, but I get this error:
'All_samples' does not exist in pair_sets and the workflow is parametrized for sample_sets.
I don't understand why it is looking at pairs...
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-170-ec321d99cc07> in <module>
----> 1 CreatePanelOfNormalsGATK_PANCAN = wto.create_submission('CreatePanelOfNormalsGATK_PANCAN', 'All_normals')
2 DepthOfCovQC_PANCAN = wto.create_submission('DepthOfCovQC_PANCAN', samplesetname + "_all", 'sample_set', expression='this.samples')
3 print("waiting for 'DepthOfCovQC_PANCAN' & 'CNV_CreatePoNForCNV'")
4 terra.waitForSubmission(wto, [DepthOfCovQC_PANCAN, CreatePanelOfNormalsGATK_PANCAN])
/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in create_submission(self, config, entity, etype, expression, use_callcache)
1433 except ConfigNotFound:
1434 self.update_config(config)
-> 1435 preflight = self.preflight(config, entity, expression, etype)
1436 if not preflight.result:
1437 raise ValueError(preflight.reason)
/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in preflight(self, config_name, entity, expression, etype)
1378 etype,
1379 entity,
-> 1380 (expression if expression is not None else 'this')+'.%s_id' % config['rootEntityType']
1381 )
1382 if isinstance(workflow_entities, dict) and 'statusCode' in workflow_entities and workflow_entities['statusCode'] >= 400:
/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in call_with_lock(self, *args, **kwargs)
149 def call_with_lock(self, *args, **kwargs):
150 with self.lock:
--> 151 return func(self, *args, **kwargs)
152 return call_with_lock
153
/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in evaluate_expression(self, etype, entity, expression)
1308 evaluator.add_entities(
1309 _etype,
-> 1310 self._get_entities_internal(_etype)
1311 )
1312 if 'workspace' in expression:
/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in _get_entities_internal(self, etype)
517
518 def _get_entities_internal(self, etype):
--> 519 return getattr(self, 'get_{}s'.format(etype))()
520
521 @_synchronized
/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/base.py in get_pair_sets(self)
1026 """Get DataFrame with sample sets and their attributes"""
1027 df = self.get_entities('pair_set')
-> 1028 df['pairs'] = df['pairs'].apply(lambda x: [i['entityName'] for i in x] if type(x) is list and np.all(pd.notnull(x)) else x)
1029
1030 # # convert JSON to table
/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
3589 else:
3590 values = self.astype(object).values
-> 3591 mapped = lib.map_infer(values, f, convert=convert_dtype)
3592
3593 if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/base.py in <lambda>(x)
1026 """Get DataFrame with sample sets and their attributes"""
1027 df = self.get_entities('pair_set')
-> 1028 df['pairs'] = df['pairs'].apply(lambda x: [i['entityName'] for i in x] if type(x) is list and np.all(pd.notnull(x)) else x)
1029
1030 # # convert JSON to table
/anaconda3/envs/py36/lib/python3.6/site-packages/dalmatian/base.py in <listcomp>(.0)
1026 """Get DataFrame with sample sets and their attributes"""
1027 df = self.get_entities('pair_set')
-> 1028 df['pairs'] = df['pairs'].apply(lambda x: [i['entityName'] for i in x] if type(x) is list and np.all(pd.notnull(x)) else x)
1029
1030 # # convert JSON to table
TypeError: string indices must be integers
Best,
dm v0.0.17:
Re-uploading samples after a quick change runs a parallel upload across all of my CPUs.
It takes around 30 min to complete, even though the dataframe is not that big: 1500x200.
Is there any way to make this as fast as it used to be?
Best,
Wrong variable used in wmanager.py at line 435: json_body needs to be replaced with config.
Hi,
When running e.g. delete_sample_attributes with dry_run=True, the current logic within delete_entity_attributes makes dry_run mode active only if delete_files=True.
I think it would be more intuitive, and would prevent accidentally deleting attributes, if in that case the to-be-deleted attributes were printed without actually deleting them from the data model.
Thank you for dalmatian - it's super useful!
Binyamin
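The behaviour requested above could look roughly like the sketch below. It is only an illustration: `delete_attributes` and the dict-based `store` are hypothetical stand-ins, not dalmatian's `delete_entity_attributes` or the Firecloud data model. The point is simply that `dry_run` is checked on its own, independent of any file-deletion flag.

```python
def delete_attributes(store, entity_id, attrs, dry_run=False):
    """Hypothetical stand-in for an attribute-deletion helper.

    `store` maps entity id -> {attribute: value} and plays the role of the
    data model; none of these names come from dalmatian itself.
    """
    present = [a for a in attrs if a in store.get(entity_id, {})]
    if dry_run:
        # Report what would be removed, but leave the store untouched.
        print("[dry run] would delete from {}: {}".format(entity_id, present))
        return present
    for a in present:
        del store[entity_id][a]
    return present


# Illustrative data only.
store = {"sample_1": {"bam": "gs://bucket/a.bam", "qc": "pass"}}
planned = delete_attributes(store, "sample_1", ["qc", "missing"], dry_run=True)
```

With `dry_run=True` the call only reports the attributes that exist on the entity; calling it again without `dry_run` performs the deletion.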
WorkspaceManager.import_config raises an UnboundLocalError with the version currently on PyPi, but when I changed core.py to match the one on github, it worked.
(Probably related to #32)
I recently added a column containing lists of ints to the participant table, and now get_participants fails:
-> 1378 df = df.applymap(lambda x: [i['entityName'] if 'entityName' in i else i for i in x]
1379 if isinstance(x, list) and np.all(pd.notnull(x)) else x)
... (long stack trace) ...
TypeError: argument of type 'int' is not iterable
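The failure comes from evaluating `'entityName' in i` when `i` is an int. A possible guard, sketched here as a standalone helper, is to check that the element is a dict before looking for the key. The name `unwrap_entity_refs` is hypothetical and this is not the fix adopted in dalmatian. Unlike the line in the traceback, the sketch also skips the `pd.notnull` check, so a list containing missing values is processed element by element rather than returned unchanged.

```python
def unwrap_entity_refs(x):
    """Replace entity-reference dicts in a list with their names.

    Anything that is not a list is returned unchanged; list elements that
    are not dicts carrying an 'entityName' key (ints, strings, ...) are kept.
    """
    if not isinstance(x, list):
        return x
    return [
        i["entityName"] if isinstance(i, dict) and "entityName" in i else i
        for i in x
    ]


refs = [{"entityType": "sample", "entityName": "S1"},
        {"entityType": "sample", "entityName": "S2"}]
names = unwrap_entity_refs(refs)
ints = unwrap_entity_refs([1, 2, 3])
strings = unwrap_entity_refs(["S1", "S2"])
```

The same guard also covers lists of plain strings, where indexing with `i['entityName']` would raise "string indices must be integers".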
Since upgrading to v0.0.10, every function raises this error on the workspace <dalmatian.wmanager.WorkspaceManager rmc-rnaseq/St_Jude_RMS>:
RecursionError: maximum recursion depth exceeded in comparison
Best,
The old WorkspaceManager constructor signature (as described in the readme) was
dalmatian.WorkspaceManager(<namespace>, <workspace>)
However, this commit changed the signature to WorkspaceManager("<namespace>/<workspace>"). This breaks a lot of old scripts that expect the old signature.
Would it be possible to revert this, @agraubert?
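One way to keep old scripts working would be to accept both calling conventions. The sketch below shows the idea with a hypothetical helper, `parse_workspace_reference`, which is not part of dalmatian; it normalises either form to a `(namespace, workspace)` tuple.

```python
def parse_workspace_reference(namespace, workspace=None):
    """Return (namespace, workspace) from either calling convention.

    Accepts the two-argument form (namespace, workspace) as well as the
    single-string form "namespace/workspace".
    """
    if workspace is None:
        if "/" not in namespace:
            raise ValueError(
                "expected '<namespace>/<workspace>', got {!r}".format(namespace)
            )
        # Split on the first slash only.
        namespace, workspace = namespace.split("/", 1)
    return namespace, workspace


new_style = parse_workspace_reference("rmc-rnaseq/St_Jude_RMS")
old_style = parse_workspace_reference("rmc-rnaseq", "St_Jude_RMS")
```

Both calls produce the same tuple, so a constructor built on such a helper would not need to break either signature.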
When I try to use upload_entities for an entity type other than participant, sample, or sample set, it never works.
df.index.name = [entityname]_id
upload_entities(entityname, df)
I get:
APIException: Sample_group import failed.: (400) : {
"causes": [],
"message": "ErrorReport(FireCloud,Unknown firecloud model entity type: sample_group,Some(400 Bad Request),List(),List(),None)",
"source": "FireCloud",
"stackTrace": [],
"statusCode": 400,
"timestamp": 162462155497
Hi, we have a code to make workflow graphs (with nodes being workflows rather than tasks) within a workspace. You can find it here. It's helping us figure out how workflows are piped to each other. Any interest to incorporate it into dalmatian? I can make a PR if it seems worthwhile.
I'm using firecloud-dalmatian==0.0.17 and get an error message when trying to import a public config:
wm.import_config('broadinstitute-gtex/rnaseq_bam_star_rsem_rnaseqc_v1-2_BETA_cfg')
UnboundLocalError Traceback (most recent call last)
in
----> 1 wm.import_config('broadinstitute-gtex/rnaseq_bam_star_rsem_rnaseqc_v1-2_BETA_cfg')
~/anaconda3/lib/python3.7/site-packages/dalmatian/wmanager.py in import_config(self, reference)
1161 """
1162 # get latest snapshot
-> 1163 c = get_config(reference)
1164 if len(c)==0:
1165 raise ValueError('Configuration "{}" not found (name must match exactly).'.format(reference))
~/anaconda3/lib/python3.7/site-packages/dalmatian/core.py in get_config(reference)
634 3) reference = "name"
635 """
--> 636 namespace, name, version = decode_config(namespace, name, decode_only=True)
637 return _get_config_internal(namespace, name)
638
UnboundLocalError: local variable 'namespace' referenced before assignment
The wm object is okay, because commands like wm.get_samples() work as expected.
Any suggestions?
Thanks,
Binyamin
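The traceback shows `namespace` being read on the same line that first assigns it. That pattern can be reproduced in isolation, as in the sketch below. Both functions are hypothetical illustrations of the Python scoping rule, not dalmatian's `get_config` or `decode_config`.

```python
def decode_reference(namespace, name):
    """Hypothetical stand-in for a decoding helper; it just echoes its inputs."""
    return namespace, name


def lookup_buggy(reference):
    # Assigning to `namespace` and `name` makes both names local to this
    # function, so reading them on the right-hand side fails: nothing has
    # been bound yet, and `reference` is never used.
    namespace, name = decode_reference(namespace, name)
    return namespace, name


try:
    lookup_buggy("some-namespace/some-config")
    raised = False
except UnboundLocalError:
    raised = True
```

The failure happens before any request is made, which would be consistent with other `wm` calls such as `wm.get_samples()` still working.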
Hi, is there any option to upload files (e.g. interval lists) to the Workspace Data using this? I wasn't able to find that in the documents. -thanks
Installing with pip installs a version from before September 2019.
List attribute updating was added to FISS in broadinstitute/fiss#175. It would be great for Dalmatian to incorporate it.
When trying to do any action on Terra, I get this error message, which prevents me from running any pipeline.
Do you have any idea what to do here?
I ran gcloud auth application-default login several times and even reinstalled firecloud to the latest version, but nothing seems to work.
APIException: Unable to query samples: (401) : <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head><script src="https://jsagent.tcell.io/tcellagent.min.js" tcellappid="FCProd-EBOgM" tcellapikey="AQQBBAFLGLOxL7VE9IF9ESlLvCxD5Ykr_7xkQKq_rgn_P58IWjOhOzIh6p3aI4pTWaprlUw" tcellbaseurl="https://us.agent.tcell.insight.rapid7.com/api/v1"></script>
<title>401 Unauthorized</title>
</head><body>
<h1>Unauthorized</h1>
<p>This server could not verify that you
are authorized to access the document
requested. Either you supplied the wrong
credentials (e.g., bad password), or your
browser doesn't understand how to supply
the credentials required.</p>
<hr>
<address>Apache Server at api.firecloud.org Port 443</address>
</body></html>
Thanks!
It displays some of the optional inputs, but not all of them.
I tried with wm.get_config("gatk","CNV_Somatic_Panel_Workflow").
The function seems to do a lot of post-processing of the request response.
I don't know what might be causing it.
Best,
:)
Hi! Just letting you know that iteritems was removed in pandas 2.0. I ran into this while trying to use the function "update_sample_attributes".
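For reference, `Series.items()` is the replacement for the removed `Series.iteritems()` and is also available in pandas releases before 2.0. The snippet below is a minimal illustration with made-up sample names and paths, not the code inside `update_sample_attributes`.

```python
import pandas as pd

# Illustrative attribute column: sample id -> attribute value.
attrs = pd.Series({
    "sample_a": "gs://bucket/a.bam",
    "sample_b": "gs://bucket/b.bam",
})

# Series.items() yields (index, value) pairs, as iteritems() used to.
pairs = [(sample_id, value) for sample_id, value in attrs.items()]
```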
This is the error I get.
pair_set_ids = [i['name'] for i in r]
TypeError: string indices must be integers
Hi,
I have been receiving errors when attempting to retrieve the sample set table from the data model. This function worked for me last week. For clarity this is the full code I used to return a sample set:
wm = dalmatian.WorkspaceManager('vanallen-firecloud-dfci/ogct_germline_analysis')
sampleset = wm.get_sample_sets()
This is the error I receive:
TypeError: ("argument of type 'int' is not iterable", 'occurred at index scatterIndices')
Any help would be greatly appreciated. I love your tool and use it daily!
All the best,
Sabrina
When I submit new jobs, any change to the configuration JSON results in the same error, for all workspaces and pipelines:
User provided a full method configuration, which did not match the configuration already present on the workspace
Either upload the provided configuration, or use a different reference type to fetch the online version
---------------------------------------------------------------------------
ConfigNotUnique Traceback (most recent call last)
<ipython-input-21-2e080c5450d6> in <module>()
1 samtofastq['samtofastq_workflow.samtofastq.input_bam_cram']= 'this.WES_bam'
----> 2 refwm.create_submission(samtofastq, samplesetname,'sample_set',expression='this.samples')
~/anaconda2/envs/py36/lib/python3.6/site-packages/dalmatian/wmanager.py in create_submission(self, config, entity, etype, expression, use_callcache)
1430 print("User provided a full method configuration, which did not match the configuration already present on the workspace", file=sys.stderr)
1431 print("Either upload the provided configuration, or use a different reference type to fetch the online version", file=sys.stderr)
-> 1432 raise ConfigNotUnique("Provided configuration did not match live version {}/{}".format(cfg['namespace'], cfg['name']))
1433 except ConfigNotFound:
1434 self.update_config(config)
ConfigNotUnique: 'Provided configuration did not match live version broadinstitute_gtex/samtofastq_v1-0_BETA_cfg'