metagenome-atlas / atlas_analyze
Scripts to get the most out of the output of metagenome-atlas
When I run
cd atlas_analyze
python analyze.py /scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS
I get
Error: Snakefile "/Snakefile" not found.
Traceback (most recent call last):
File "analyze.py", line 21, in <module>
"snakemake "
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS -j 1 -s /Snakefile' returned non-zero exit status 1.
However, atlas_analyze/Snakefile does exist, but for some reason analyze.py does not find it. I tried copying it to the atlas working directory, but that didn't help.
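Looking at the failing command, it passes -s /Snakefile, so the directory part of the Snakefile path apparently resolved to an empty string. Maybe something along these lines in analyze.py would make the path independent of the calling directory (just my sketch of a possible fix, not the actual code in the repository):
import os
# Resolve the Snakefile relative to analyze.py itself instead of the current
# working directory or a variable that may be empty.
script_dir = os.path.dirname(os.path.abspath(__file__))
snakefile = os.path.join(script_dir, "Snakefile")
if not os.path.exists(snakefile):
    raise FileNotFoundError("Snakefile not found at " + snakefile)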
When running ./setup.sh I got an error with mamba:
InstallError: Error: the following specs depend on 'conda' and can only be installed into the root environment: mamba
It only works when I run it from my miniconda3/bin directory:
conda install mamba -c conda-forge
It then wants to update a lot of things:
Package plan for installation in environment /vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3:
The following NEW packages will be INSTALLED:
_libgcc_mutex: 0.1-conda_forge conda-forge
_openmp_mutex: 4.5-1_gnu conda-forge
brotlipy: 0.7.0-py36he6145b8_1001 conda-forge
bzip2: 1.0.8-h7f98852_4 conda-forge
c-ares: 1.17.1-h36c2ea0_0 conda-forge
ca-certificates: 2020.12.5-ha878542_0 conda-forge
certifi: 2020.12.5-py36h5fab9bb_1 conda-forge
chardet: 4.0.0-py36h5fab9bb_1 conda-forge
conda-package-handling: 1.7.2-py36he6145b8_0 conda-forge
icu: 68.1-h58526e2_0 conda-forge
krb5: 1.17.2-h926e7f8_0 conda-forge
ld_impl_linux-64: 2.35.1-hed1e6ac_1 conda-forge
libarchive: 3.5.1-h3f442fb_1 conda-forge
libcurl: 7.71.1-hcdd3856_8 conda-forge
libedit: 3.1.20191231-he28a2e2_2 conda-forge
libev: 4.33-h516909a_1 conda-forge
libgcc-ng: 9.3.0-h2828fa1_18 conda-forge
libgomp: 9.3.0-h2828fa1_18 conda-forge
libiconv: 1.16-h516909a_0 conda-forge
libnghttp2: 1.41.0-h8cfc5f6_2 conda-forge
libsolv: 0.7.16-h8b12597_0 conda-forge
libssh2: 1.9.0-hab1572f_5 conda-forge
libstdcxx-ng: 9.3.0-h6de172a_18 conda-forge
libxml2: 2.9.10-h72842e0_3 conda-forge
lz4-c: 1.9.3-h9c3ff4c_0 conda-forge
lzo: 2.10-h516909a_1000 conda-forge
mamba: 0.7.8-py36h05d92e0_1 conda-forge
ncurses: 6.2-h58526e2_4 conda-forge
pysocks: 1.7.1-py36h5fab9bb_3 conda-forge
python_abi: 3.6-1_cp36m conda-forge
reproc: 14.2.1-h36c2ea0_0 conda-forge
reproc-cpp: 14.2.1-h58526e2_0 conda-forge
tqdm: 4.56.0-pyhd8ed1ab_0 conda-forge
urllib3: 1.26.2-pyhd8ed1ab_0 conda-forge
zstd: 1.4.8-ha95c52a_1 conda-forge
The following packages will be UPDATED:
conda: 4.3.21-py36_0 --> 4.9.2-py36h5fab9bb_0 conda-forge
conda-env: 2.6.0-0 --> 2.6.0-1 conda-forge
cryptography: 1.8.1-py36_0 --> 3.3.1-py36h0a59100_1 conda-forge
openssl: 1.0.2l-0 --> 1.1.1i-h7f98852_0 conda-forge
pycosat: 0.6.2-py36_0 --> 0.6.3-py36h8f6f2f9_1006 conda-forge
python: 3.6.1-2 --> 3.6.11-h6f2ec95_2_cpython conda-forge
readline: 6.2-2 --> 8.0-he28a2e2_2 conda-forge
requests: 2.14.2-py36_0 --> 2.25.1-pyhd3deb0d_0 conda-forge
setuptools: 27.2.0-py36_0 --> 49.6.0-py36h5fab9bb_3 conda-forge
sqlite: 3.13.0-0 --> 3.34.0-h74cdb3f_0 conda-forge
tk: 8.5.18-0 --> 8.6.10-hed695b0_1 conda-forge
xz: 5.2.2-1 --> 5.2.5-h516909a_1 conda-forge
zlib: 1.2.8-3 --> 1.2.11-h516909a_1010 conda-forge
Proceed ([y]/n)? n
I was not sure whether it would interfere with my other miniconda environments.
So I did:
conda env create -n analyze -f condaenv.yml
and this worked, with only a warning about two possible package resolutions:
Warning: 2 possible package resolutions (only showing differing packages):
- conda-forge::glib-2.66.4-hc4f0c31_1, conda-forge::libglib-2.66.4-h748fe8e_1
- conda-forge::glib-2.66.4-ha03b18c_1, conda-forge::libglib-2.66.4-hdb14261_1
Maybe this can help others who have the same issue.
Sofie
Hi Silas,
I got this error while running analyze.py; it popped up at the import_files step.
I am also wondering whether it is OK to stop here, because the program already gave all the output I need (taxonomy.tsv, mapping_rate.tsv, genome_completeness.tsv, counts/raw_counts_genomes.tsv, counts/median_coverage_genomes.tsv, annotations/KO.tsv, annotations/CAZy.tsv).
Thank you.
Zhou
--------------------------------------error-----------------------
Job counts:
count jobs
1 import_files
1
[Mon Feb 8 14:47:37 2021]
Finished job 5.
3 of 6 steps (50%) done
Select jobs to execute...
[Mon Feb 8 14:47:37 2021]
localrule analyze:
input: Results/taxonomy.tsv, Results/mapping_rate.tsv, Results/genome_completeness.tsv, Results/counts/raw_counts_genomes.tsv, Results/counts/median_coverage_genomes.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv
log: Results/Code.ipynb
jobid: 2
[NbConvertApp] ERROR | Notebook JSON is invalid: Additional properties are not allowed ('id' was unexpected)
Failed validating 'additionalProperties' in code_cell:
On instance['cells'][0]:
{'cell_type': 'code',
'execution_count': None,
'id': 'talented-colors',
'metadata': {'tags': ['snakemake-job-properties']},
'outputs': ['...0 outputs...'],
'source': '\n'
'######## snakemake preamble start (automatically inserted, do '
'n...'}
counts_per_genome= relab.sum().sort_values()
ax= counts_per_genome[-10:].plot.bar(figsize=(10,5))
TypeError Traceback (most recent call last)
in
2
3 counts_per_genome= relab.sum().sort_values()
----> 4 ax= counts_per_genome[-10:].plot.bar(figsize=(10,5))
5
6 _= ax.set_xticklabels(Labels.loc[counts_per_genome.index[-10:]])
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_core.py in bar(self, x, y, **kwargs)
946 >>> ax = df.plot.bar(x='lifespan', rot=0)
947 """
--> 948 return self(kind="bar", x=x, y=y, **kwargs)
949
950 def barh(self, x=None, y=None, **kwargs):
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_core.py in __call__(self, *args, **kwargs)
792 data.columns = label_name
793
--> 794 return plot_backend.plot(data, kind=kind, **kwargs)
795
796 def line(self, x=None, y=None, **kwargs):
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/__init__.py in plot(data, kind, **kwargs)
60 kwargs["ax"] = getattr(ax, "left_ax", ax)
61 plot_obj = PLOT_CLASSES[kind](data, **kwargs)
---> 62 plot_obj.generate()
63 plot_obj.draw()
64 return plot_obj.result
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/core.py in generate(self)
277 def generate(self):
278 self._args_adjust()
--> 279 self._compute_plot_data()
280 self._setup_subplots()
281 self._make_plot()
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/core.py in _compute_plot_data(self)
402 data = data._convert(datetime=True, timedelta=True)
403 numeric_data = data.select_dtypes(
--> 404 include=[np.number, "datetime", "datetimetz", "timedelta"]
405 )
406
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/frame.py in select_dtypes(self, include, exclude)
3440 # the "union" of the logic of case 1 and case 2:
3441 # we get the included and excluded, and return their logical and
-> 3442 include_these = Series(not bool(include), index=self.columns)
3443 exclude_these = Series(not bool(exclude), index=self.columns)
3444
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
312 data = data.copy()
313 else:
--> 314 data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
315
316 data = SingleBlockManager(data, index, fastpath=True)
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/internals/construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
710 value = maybe_cast_to_datetime(value, dtype)
711
--> 712 subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)
713
714 else:
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in construct_1d_arraylike_from_scalar(value, length, dtype)
1231 value = ensure_str(value)
1232
-> 1233 subarr = np.empty(length, dtype=dtype)
1234 subarr.fill(value)
1235
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
[Mon Feb 8 14:52:19 2021]
Error in rule analyze:
jobid: 2
log: Results/Code.ipynb (check log file(s) for error message)
RuleException:
CalledProcessError in line 82 of /home/Desktop/atlasTest/atlas_analyze/Snakefile:
Command 'set -euo pipefail; jupyter-nbconvert --log-level ERROR --execute --output /data/GV009/GV009_035/Results/Code.ipynb --to notebook --ExecutePreprocessor.timeout=-1 /data/GV009/GV009_035/.snakemake/scripts/tmpntaqydkv.Analyis_genome_abundances.ipynb' returned non-zero exit status 1.
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2340, in run_wrapper
File "/home/Desktop/atlasTest/atlas_analyze/Snakefile", line 82, in __rule_analyze
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
File "/home/anaconda3/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2352, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /data/GV009/GV009_035/.snakemake/log/2021-02-08T144532.032128.snakemake.log
Traceback (most recent call last):
File "/home/Desktop/atlasTest/atlas_analyze/analyze.py", line 21, in <module>
"snakemake "
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 213, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d . -j 1 -s /home/Desktop/atlasTest/atlas_analyze/Snakefile' returned non-zero exit status 1.
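In case it helps: I read that this "Cannot interpret numpy.generic dtype" TypeError typically comes from numpy 1.20 or newer combined with an older pandas, so checking the versions in the analyze environment may already explain it (downgrading numpy or upgrading pandas is the usual remedy):
import numpy as np
import pandas as pd
# If numpy is >= 1.20 while pandas is an older release, the plot.bar() call
# in the notebook can fail with exactly this TypeError.
print("numpy :", np.__version__)
print("pandas:", pd.__version__)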
Hi Silas,
I was trying out atlas_analyze. So far so good; I have only one error, at the second-to-last step, in rule convert_nb.
Here is the error:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 convert_nb
2
Select jobs to execute...
[Sat Jan 16 16:07:43 2021]
rule convert_nb:
input: Results/Code.ipynb
output: Results/Summary.html
jobid: 1
[NbConvertApp] Converting notebook Results/Code.ipynb to html
Traceback (most recent call last):
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 14, in parse_json
nb_dict = json.loads(s, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/bin/jupyter-nbconvert", line 11, in <module>
sys.exit(main())
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/jupyter_core/application.py", line 254, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/traitlets/config/application.py", line 845, in launch_instance
app.start()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 350, in start
self.convert_notebooks()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
self.convert_single_notebook(notebook_filename)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
return self.from_file(f, resources=resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/__init__.py", line 143, in read
return reads(buf, as_version, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/__init__.py", line 73, in reads
nb = reader.reads(s, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 58, in reads
nb_dict = parse_json(s, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 17, in parse_json
raise NotJSONError(("Notebook does not appear to be JSON: %r" % s)[:77] + "...") from e
nbformat.reader.NotJSONError: Notebook does not appear to be JSON: ''...
[Sat Jan 16 16:07:49 2021]
Error in rule convert_nb:
jobid: 1
output: Results/Summary.html
shell:
jupyter nbconvert --output Summary --to=html --TemplateExporter.exclude_input=True Results/Code.ipynb
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena/.snakemake/log/2021-01-16T160743.880488.snakemake.log
Traceback (most recent call last):
File "./analyze.py", line 21, in <module>
"snakemake "
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 213, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena -j 1 -s ./Snakefile' returned non-zero exit status 1.
And this was the previous error:
[Sat Jan 16 16:05:35 2021]
localrule analyze:
input: Results/taxonomy.tsv, Results/mapping_rate.tsv, Results/genome_completeness.tsv, Results/counts/raw_counts_genomes.tsv, Results/counts/median_coverage_genomes.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv
log: Results/Code.ipynb
jobid: 2
[NbConvertApp] ERROR | Notebook JSON is invalid: Additional properties are not allowed ('id' was unexpected)
Failed validating 'additionalProperties' in code_cell:
On instance['cells'][0]:
{'cell_type': 'code',
'execution_count': None,
'id': 'hourly-liberal',
'metadata': {'tags': ['snakemake-job-properties']},
'outputs': ['...0 outputs...'],
'source': '\n'
'######## snakemake preamble start (automatically inserted, do '
'n...'}
Traceback (most recent call last):
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/bin/jupyter-nbconvert", line 11, in <module>
sys.exit(main())
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/jupyter_core/application.py", line 254, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/traitlets/config/application.py", line 845, in launch_instance
app.start()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 350, in start
self.convert_notebooks()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
self.convert_single_notebook(notebook_filename)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
return self.from_file(f, resources=resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/notebook.py", line 32, in from_notebook_node
nb_copy, resources = super().from_notebook_node(nb, resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 143, in from_notebook_node
nb_copy, resources = self._preprocess(nb_copy, resources)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 318, in _preprocess
nbc, resc = preprocessor(nbc, resc)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
return self.preprocess(nb, resources)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 79, in preprocess
self.execute()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 74, in wrapped
return just_run(coro(*args, **kwargs))
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 53, in just_run
return loop.run_until_complete(coro)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
return future.result()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 541, in async_execute
cell, index, execution_count=self.code_cells_executed + 1
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 123, in async_execute_cell
cell, resources = self.preprocess_cell(cell, self.resources, cell_index)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 146, in preprocess_cell
cell = run_sync(NotebookClient.async_execute_cell)(self, cell, index, store_history=self.store_history)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 74, in wrapped
return just_run(coro(*args, **kwargs))
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 53, in just_run
return loop.run_until_complete(coro)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nest_asyncio.py", line 98, in run_until_complete
return f.result()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/asyncio/futures.py", line 181, in result
raise self._exception
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/asyncio/tasks.py", line 249, in __step
result = coro.send(None)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 832, in async_execute_cell
self._check_raise_for_error(cell, exec_reply)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 740, in _check_raise_for_error
raise CellExecutionError.from_cell_and_msg(cell, exec_reply['content'])
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
#CAZy
CAZy_annotations_genome= pd.read_table('Results/annotations/CAZy.tsv',index_col=0)
CAZy_presence= (CAZy_annotations_genome>0).astype(int)
CAZy_presence.head()
function_relab = relab @ CAZy_presence
sns.clustermap(function_relab)
function_relab.head()
------------------
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-87471da32f9f> in <module>
7 function_relab = relab @ CAZy_presence
8
----> 9 sns.clustermap(function_relab)
10
11 function_relab.head()
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in clustermap(data, pivot_kws, method, metric, z_score, standard_scale, figsize, cbar_kws, row_cluster, col_cluster, row_linkage, col_linkage, row_colors, col_colors, mask, dendrogram_ratio, colors_ratio, cbar_pos, tree_kws, **kwargs)
1410 row_cluster=row_cluster, col_cluster=col_cluster,
1411 row_linkage=row_linkage, col_linkage=col_linkage,
-> 1412 tree_kws=tree_kws, **kwargs)
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in plot(self, metric, method, colorbar_kws, row_cluster, col_cluster, row_linkage, col_linkage, tree_kws, **kws)
1221 self.plot_dendrograms(row_cluster, col_cluster, metric, method,
1222 row_linkage=row_linkage, col_linkage=col_linkage,
-> 1223 tree_kws=tree_kws)
1224 try:
1225 xind = self.dendrogram_col.reordered_ind
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in plot_dendrograms(self, row_cluster, col_cluster, metric, method, row_linkage, col_linkage, tree_kws)
1067 self.data2d, metric=metric, method=method, label=False, axis=0,
1068 ax=self.ax_row_dendrogram, rotate=True, linkage=row_linkage,
-> 1069 tree_kws=tree_kws
1070 )
1071 else:
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in dendrogram(data, linkage, axis, label, metric, method, rotate, tree_kws, ax)
774 plotter = _DendrogramPlotter(data, linkage=linkage, axis=axis,
775 metric=metric, method=method,
--> 776 label=label, rotate=rotate)
777 if ax is None:
778 ax = plt.gca()
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in __init__(self, data, linkage, metric, method, axis, label, rotate)
582
583 if linkage is None:
--> 584 self.linkage = self.calculated_linkage
585 else:
586 self.linkage = linkage
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in calculated_linkage(self)
649 warnings.warn(msg)
650
--> 651 return self._calculate_linkage_scipy()
652
653 def calculate_dendrogram(self):
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in _calculate_linkage_scipy(self)
618 def _calculate_linkage_scipy(self):
619 linkage = hierarchy.linkage(self.array, method=self.method,
--> 620 metric=self.metric)
621 return linkage
622
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/scipy/cluster/hierarchy.py in linkage(y, method, metric, optimal_ordering)
1063
1064 if not np.all(np.isfinite(y)):
-> 1065 raise ValueError("The condensed distance matrix must contain only "
1066 "finite values.")
1067
ValueError: The condensed distance matrix must contain only finite values.
ValueError: The condensed distance matrix must contain only finite values.
[Sat Jan 16 16:06:24 2021]
Error in rule analyze:
jobid: 2
log: Results/Code.ipynb (check log file(s) for error message)
RuleException:
CalledProcessError in line 82 of /ddn1/vol1/site_scratch/leuven/314/vsc31426/atlas_analyse/atlas_analyze/Snakefile:
Command 'set -euo pipefail; jupyter-nbconvert --log-level ERROR --execute --output /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena/Results/Code.ipynb --to notebook --ExecutePreprocessor.timeout=-1 /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena/.snakemake/scripts/tmpbmbi86fl.Analyis_genome_abundances.ipynb' returned non-zero exit status 1.
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2338, in run_wrapper
File "/ddn1/vol1/site_scratch/leuven/314/vsc31426/atlas_analyse/atlas_analyze/Snakefile", line 82, in __rule_analyze
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 566, in _callback
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 552, in cached_or_run
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2350, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena/.snakemake/log/2021-01-16T160453.469631.snakemake.log
Traceback (most recent call last):
File "./analyze.py", line 21, in <module>
"snakemake "
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 213, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena -j 1 -s ./Snakefile' returned non-zero exit status 1.
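In case it is useful: the "condensed distance matrix must contain only finite values" error should mean that function_relab contains NaN or infinite values. A rough sketch of filtering them out before calling clustermap (whether the remaining NaNs should rather be dropped or filled with 0 probably depends on the data):
import numpy as np
import pandas as pd
import seaborn as sns
def clean_for_clustermap(df: pd.DataFrame) -> pd.DataFrame:
    # Replace infinities with NaN, drop all-NaN rows and columns, and fill
    # the remaining NaNs with 0 so scipy's linkage only sees finite values.
    df = df.replace([np.inf, -np.inf], np.nan)
    df = df.dropna(how="all").dropna(axis=1, how="all")
    return df.fillna(0)
# function_relab is the relab @ CAZy_presence table from the failing cell:
# sns.clustermap(clean_for_clustermap(function_relab))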
Hi team, I am facing the error pasted below. This is from a run that completed successfully in Atlas. Do you know what the problem is?
Traceback (most recent call last):
File "/home/marco/miniconda3/atlas-testing/.snakemake/scripts/tmpgsvxn6_u.get_taxonomy.py", line 11, in <module>
from utils.mag_scripts import tax2table
ModuleNotFoundError: No module named 'utils'
[Wed Sep 29 18:02:47 2021]
Error in rule get_taxonomy:
jobid: 3
output: Results/taxonomy.tsv
Traceback (most recent call last):
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 593, in _callback
raise ex
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 579, in cached_or_run
run_func(*args)
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2460, in run_wrapper
raise ex
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2441, in run_wrapper
runtime_sourcecache_path,
File "/home/marco/miniconda3/atlas_analyze/Snakefile", line 97, in __rule_get_taxonomy
"Results/mapping_rate.tsv"
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 1365, in script
executor.evaluate()
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 377, in evaluate
self.execute_script(fd.name, edit=edit)
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 578, in execute_script
self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 421, in _execute_cmd
**kwargs
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /home/marco/miniconda3/envs/analyze/bin/python3.7 /home/marco/miniconda3/atlas-testing/.snakemake/scripts/tmpgsvxn6_u.get_taxonomy.py' returned non-zero exit status 1.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/marco/miniconda3/atlas-testing/.snakemake/log/2021-09-29T180246.949895.snakemake.log
Traceback (most recent call last):
File "./analyze.py", line 21, in <module>
"snakemake "
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /home/marco/miniconda3/atlas-testing/ -j 1 -s ./Snakefile' returned non-zero exit status 1.
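If the problem is simply that the script copied into .snakemake/scripts can no longer see the utils folder that sits next to the Snakefile, would a workaround like this at the top of get_taxonomy.py make sense? (The path is just the atlas_analyze location from my log above and is only an example.)
import sys
# Hypothetical workaround: make the utils package importable by pointing
# Python at the directory that contains it (the atlas_analyze checkout).
sys.path.insert(0, "/home/marco/miniconda3/atlas_analyze")
from utils.mag_scripts import tax2table  # the import that failed above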
When running
cd atlas_analyze
python analyze.py /scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS -s ./Snakefile
I get an error in job 3 out of 7: rule get_taxonomy:
Error in rule get_taxonomy:
jobid: 3
output: Results/taxonomy.tsv
Traceback (most recent call last):
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 593, in _callback
raise ex
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 579, in cached_or_run
run_func(*args)
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2460, in run_wrapper
raise ex
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2441, in run_wrapper
runtime_sourcecache_path,
File "/kyukon/scratch/gent/vo/000/gvo00043/vsc42339/atlas_analyze/Snakefile", line 97, in __rule_get_taxonomy
"Results/mapping_rate.tsv"
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 1365, in script
executor.evaluate()
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 377, in evaluate
self.execute_script(fd.name, edit=edit)
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 578, in execute_script
self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 421, in _execute_cmd
**kwargs
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/bin/python3.7 /kyukon/scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS/.snakemake/scripts/tmp2ifocuhj.get_taxonomy.py' returned non-zero exit status 1.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /kyukon/scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS/.snakemake/log/2021-09-27T174323.495744.snakemake.log
Traceback (most recent call last):
File "analyze.py", line 21, in <module>
"snakemake "
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /kyukon/scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS -j 1 -s /Snakefile -s ./Snakefile' returned non-zero exit status 1.
complete snakemake log:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job count min threads max threads
---------------- ------- ------------- -------------
all 1 1 1
analyze 1 1 1
convert_nb 1 1 1
get_annotations 1 1 1
get_mapping_rate 1 1 1
get_taxonomy 1 1 1
import_files 1 1 1
total 7 1 1
Select jobs to execute...
[Mon Sep 27 17:43:30 2021]
rule get_mapping_rate:
input: stats/read_counts.tsv, genomes/counts/raw_counts_genomes.tsv
output: Results/mapping_rate.tsv
jobid: 4
resources: tmpdir=/tmp
[Mon Sep 27 17:43:39 2021]
Finished job 4.
1 of 7 steps (14%) done
Select jobs to execute...
[Mon Sep 27 17:43:39 2021]
rule get_annotations:
input: genomes/annotations/gene2genome.tsv.gz, Genecatalog/annotations/eggNog.tsv.gz
output: genomes/annotations/KO.tsv, genomes/annotations/CAZy.tsv, Genecatalog/annotations/KO.tsv, Genecatalog/annotations/CAZy.tsv
jobid: 6
resources: tmpdir=/tmp, mem=60
[Mon Sep 27 17:59:30 2021]
Finished job 6.
2 of 7 steps (29%) done
Select jobs to execute...
[Mon Sep 27 17:59:30 2021]
localrule get_taxonomy:
input: genomes/taxonomy/gtdb/gtdbtk.bac120.summary.tsv
output: Results/taxonomy.tsv
jobid: 3
resources: tmpdir=/tmp
[Mon Sep 27 17:59:38 2021]
Error in rule get_taxonomy:
jobid: 3
output: Results/taxonomy.tsv
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /kyukon/scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS/.snakemake/log/2021-09-27T174323.495744.snakemake.log
Rules get_mapping_rate and get_annotations seem to have finished fine, but somehow get_taxonomy was not able to run.
Hi Silas,
I have a question regarding the output written to the 'results' folder:
(1) KO.tsv
Structure is:
MAG K00001 K0002 etc..
MAG1 1 1
and the file written to the Genecatalog/annotations folder:
(2) KO.tsv
Structure is:
Gene0014228 KO7304
Gene...
And in the Summary.html there is a last table and plot of the KEGG orthologs. I was trying to understand what I am seeing, i.e. which file you load in:
Is this KEGG ortholog table and heatmap based on file (1), so KEGG orthologs per MAG per sample?
Or do you read in file (2), KEGG orthologs per gene (somehow weighted by gene abundance) per sample?
I wanted to do further downstream analyses with the Genecatalog file, but I don't know how you translate a query gene to its abundance (occurrence) in a sample.
I was thinking: can we make graphs in which we express the abundance of certain genes relative to a common gene in each sample? Then we are talking about ratios, which I think reduces (statistically) the dependency of the data on sequencing depth. Take rpoB as reference gene, and suppose the query I am interested in is an oil-degrading gene. Then I can say that in this groundwater monitoring well I have 5 oil-degrading targets per 10 rpoB (rpoB is common to all bacteria, so either half of my bacterial population is an oil degrader, or one strain carries 5 oil-degrading genes, etc.). The samples can have different numbers of reads, but that doesn't really matter as long as we take the ratio within a sample?
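A rough pandas sketch of what I mean (the abundance file name, the query KO, and the column layout below are placeholders on my side, not the actual atlas output; K03043 should be the KO for rpoB, if I remember correctly):
import pandas as pd
# Gene -> KO mapping, structured like file (2) above: gene ID in the first
# column, KO in the second.
gene2ko = pd.read_table("Genecatalog/annotations/KO.tsv", index_col=0)
ko_of_gene = gene2ko.iloc[:, 0]
# Hypothetical gene x sample abundance table with the same gene IDs as index.
gene_abundance = pd.read_table("gene_abundance_per_sample.tsv", index_col=0)
# Sum gene abundances per KO and per sample.
ko_per_sample = gene_abundance.groupby(ko_of_gene).sum()
# Express a KO of interest relative to the single-copy reference gene rpoB.
query_ko = "K00001"   # placeholder for e.g. an oil-degrading gene
rpob_ko = "K03043"    # rpoB
ratio = ko_per_sample.loc[query_ko] / ko_per_sample.loc[rpob_ko]
print(ratio)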
Jackson did something similar with the output of MetAnnotate, and I was wondering if I can use the output of your Genecatalog file as input to his R script. I just need one extra column in the table saying to which sample each gene belongs; then I can continue :-)
Sofie
Hi!
I was just trying to play around with the scripts here; is this repo still in use?
I'm now getting this error:
MissingInputException in rule get_mapping_rate in line 92 of /scratch/pawsey0390/pbayer/seagrass_atlas/atlas_analyze/Snakefile:
Missing input files for rule get_mapping_rate:
output: Results/mapping_rate.tsv
affected files:
genomes/counts/raw_counts_genomes.tsv
It seems like the rules in binning.smk dropped TSV files in favor of Parquet files. I guess that wasn't updated here?
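If that is the case, would converting the Parquet file back to the TSV the rule expects be a reasonable stopgap? Something like this (the .parquet file name is a guess on my side, not something I verified):
import pandas as pd
# Guessing the new location: same path as before, but with a .parquet suffix.
counts = pd.read_parquet("genomes/counts/raw_counts_genomes.parquet")
counts.to_csv("genomes/counts/raw_counts_genomes.tsv", sep="\t")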
Hi,
First of all, thanks for creating this tool. I had some trouble installing it via conda (mamba, really) because of an issue related to the Python version needed to build the conda environment. Your conda configuration file suggests we use Python 3.6, but when I do that I get an error when running the pipeline with ./analyze.py {folder}:
"module 'asyncio' has no attribute 'create_task'"
This is because the create_task top-level function was added in Python 3.7 (https://stackoverflow.com/questions/53247533/attributeerror-module-asyncio-has-no-attribute-create-task)
I modified the conda config file condaenv.yml to replace the dependency python=3.6 with python=3.7. Then I ran the modified setup.py file to create an environment with Python 3.7.
The asyncio issue was solved, but another one appeared later on, during the conversion of the Jupyter notebook to HTML.
jupyter nbconvert wanted me to specify the output format. This is probably due to differences in the versions of jupyter installed.
To solve this second issue I added "--to=html" to this line in the Snakefile:
Line 90, before:
"jupyter nbconvert --output Summary --TemplateExporter.exclude_input=True {input}"
Line 90, after:
"jupyter nbconvert --output Summary --to=html --TemplateExporter.exclude_input=True {input}"
That solved the problem. I am not sure whether other problems were introduced, but I wanted to post this here in case someone else runs into the same issue.
Cheers,
Erick