metagenome-atlas / atlas_analyze
Scripts to get the most out of the output of metagenome-atlas
When I run
cd atlas_analyze
python analyze.py /scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS
I get
Error: Snakefile "/Snakefile" not found.
Traceback (most recent call last):
File "analyze.py", line 21, in <module>
"snakemake "
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS -j 1 -s /Snakefile' returned non-zero exit status 1.
However, atlas_analyze/Snakefile does exist, but for some reason analyze.py does not find it. I tried copying it to the atlas working directory, but that didn't help.
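Looking at the failing command, it passes -s /Snakefile, so the directory part of the Snakefile path apparently resolved to an empty string. Maybe something along these lines in analyze.py would make the path independent of the calling directory (just my sketch of a possible fix, not the actual code in the repository):
import os
# Resolve the Snakefile relative to analyze.py itself instead of the current
# working directory or a variable that may be empty.
script_dir = os.path.dirname(os.path.abspath(__file__))
snakefile = os.path.join(script_dir, "Snakefile")
if not os.path.exists(snakefile):
    raise FileNotFoundError("Snakefile not found at " + snakefile)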
When running ./setup.sh I got an error with mamba:
InstallError: Error: the following specs depend on 'conda' and can only be installed into the root environment: mamba
It only works when I run it from my miniconda3/bin directory:
conda install mamba -c conda-forge
It then wants to update a lot of things:
Package plan for installation in environment /vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3:
The following NEW packages will be INSTALLED:
_libgcc_mutex: 0.1-conda_forge conda-forge
_openmp_mutex: 4.5-1_gnu conda-forge
brotlipy: 0.7.0-py36he6145b8_1001 conda-forge
bzip2: 1.0.8-h7f98852_4 conda-forge
c-ares: 1.17.1-h36c2ea0_0 conda-forge
ca-certificates: 2020.12.5-ha878542_0 conda-forge
certifi: 2020.12.5-py36h5fab9bb_1 conda-forge
chardet: 4.0.0-py36h5fab9bb_1 conda-forge
conda-package-handling: 1.7.2-py36he6145b8_0 conda-forge
icu: 68.1-h58526e2_0 conda-forge
krb5: 1.17.2-h926e7f8_0 conda-forge
ld_impl_linux-64: 2.35.1-hed1e6ac_1 conda-forge
libarchive: 3.5.1-h3f442fb_1 conda-forge
libcurl: 7.71.1-hcdd3856_8 conda-forge
libedit: 3.1.20191231-he28a2e2_2 conda-forge
libev: 4.33-h516909a_1 conda-forge
libgcc-ng: 9.3.0-h2828fa1_18 conda-forge
libgomp: 9.3.0-h2828fa1_18 conda-forge
libiconv: 1.16-h516909a_0 conda-forge
libnghttp2: 1.41.0-h8cfc5f6_2 conda-forge
libsolv: 0.7.16-h8b12597_0 conda-forge
libssh2: 1.9.0-hab1572f_5 conda-forge
libstdcxx-ng: 9.3.0-h6de172a_18 conda-forge
libxml2: 2.9.10-h72842e0_3 conda-forge
lz4-c: 1.9.3-h9c3ff4c_0 conda-forge
lzo: 2.10-h516909a_1000 conda-forge
mamba: 0.7.8-py36h05d92e0_1 conda-forge
ncurses: 6.2-h58526e2_4 conda-forge
pysocks: 1.7.1-py36h5fab9bb_3 conda-forge
python_abi: 3.6-1_cp36m conda-forge
reproc: 14.2.1-h36c2ea0_0 conda-forge
reproc-cpp: 14.2.1-h58526e2_0 conda-forge
tqdm: 4.56.0-pyhd8ed1ab_0 conda-forge
urllib3: 1.26.2-pyhd8ed1ab_0 conda-forge
zstd: 1.4.8-ha95c52a_1 conda-forge
The following packages will be UPDATED:
conda: 4.3.21-py36_0 --> 4.9.2-py36h5fab9bb_0 conda-forge
conda-env: 2.6.0-0 --> 2.6.0-1 conda-forge
cryptography: 1.8.1-py36_0 --> 3.3.1-py36h0a59100_1 conda-forge
openssl: 1.0.2l-0 --> 1.1.1i-h7f98852_0 conda-forge
pycosat: 0.6.2-py36_0 --> 0.6.3-py36h8f6f2f9_1006 conda-forge
python: 3.6.1-2 --> 3.6.11-h6f2ec95_2_cpython conda-forge
readline: 6.2-2 --> 8.0-he28a2e2_2 conda-forge
requests: 2.14.2-py36_0 --> 2.25.1-pyhd3deb0d_0 conda-forge
setuptools: 27.2.0-py36_0 --> 49.6.0-py36h5fab9bb_3 conda-forge
sqlite: 3.13.0-0 --> 3.34.0-h74cdb3f_0 conda-forge
tk: 8.5.18-0 --> 8.6.10-hed695b0_1 conda-forge
xz: 5.2.2-1 --> 5.2.5-h516909a_1 conda-forge
zlib: 1.2.8-3 --> 1.2.11-h516909a_1010 conda-forge
Proceed ([y]/n)? n
I was not sure whether it would interfere with my other miniconda environments.
So I did:
conda env create -n analyze -f condaenv.yml
and this worked, with only a warning about two possible package resolutions:
Warning: 2 possible package resolutions (only showing differing packages):
- conda-forge::glib-2.66.4-hc4f0c31_1, conda-forge::libglib-2.66.4-h748fe8e_1
- conda-forge::glib-2.66.4-ha03b18c_1, conda-forge::libglib-2.66.4-hdb14261_1
Maybe this can help others who have the same issue.
Sofie
Hi Silas,
I got this error while running analyze.py; it popped up at the import_files step.
I am also wondering whether it is OK to stop here, because the program already gave all the output I need (taxonomy.tsv, mapping_rate.tsv, genome_completeness.tsv, counts/raw_counts_genomes.tsv, counts/median_coverage_genomes.tsv, annotations/KO.tsv, annotations/CAZy.tsv).
Thank you.
Zhou
--------------------------------------error-----------------------
Job counts:
count jobs
1 import_files
1
[Mon Feb 8 14:47:37 2021]
Finished job 5.
3 of 6 steps (50%) done
Select jobs to execute...
[Mon Feb 8 14:47:37 2021]
localrule analyze:
input: Results/taxonomy.tsv, Results/mapping_rate.tsv, Results/genome_completeness.tsv, Results/counts/raw_counts_genomes.tsv, Results/counts/median_coverage_genomes.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv
log: Results/Code.ipynb
jobid: 2
[NbConvertApp] ERROR | Notebook JSON is invalid: Additional properties are not allowed ('id' was unexpected)
Failed validating 'additionalProperties' in code_cell:
On instance['cells'][0]:
{'cell_type': 'code',
'execution_count': None,
'id': 'talented-colors',
'metadata': {'tags': ['snakemake-job-properties']},
'outputs': ['...0 outputs...'],
'source': '\n'
'######## snakemake preamble start (automatically inserted, do '
'n...'}
counts_per_genome= relab.sum().sort_values()
ax= counts_per_genome[-10:].plot.bar(figsize=(10,5))
TypeError Traceback (most recent call last)
in
2
3 counts_per_genome= relab.sum().sort_values()
----> 4 ax= counts_per_genome[-10:].plot.bar(figsize=(10,5))
5
6 _= ax.set_xticklabels(Labels.loc[counts_per_genome.index[-10:]])
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_core.py in bar(self, x, y, **kwargs)
946 >>> ax = df.plot.bar(x='lifespan', rot=0)
947 """
--> 948 return self(kind="bar", x=x, y=y, **kwargs)
949
950 def barh(self, x=None, y=None, **kwargs):
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_core.py in __call__(self, *args, **kwargs)
792 data.columns = label_name
793
--> 794 return plot_backend.plot(data, kind=kind, **kwargs)
795
796 def line(self, x=None, y=None, **kwargs):
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/__init__.py in plot(data, kind, **kwargs)
60 kwargs["ax"] = getattr(ax, "left_ax", ax)
61 plot_obj = PLOT_CLASSES[kind](data, **kwargs)
---> 62 plot_obj.generate()
63 plot_obj.draw()
64 return plot_obj.result
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/core.py in generate(self)
277 def generate(self):
278 self._args_adjust()
--> 279 self._compute_plot_data()
280 self._setup_subplots()
281 self._make_plot()
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/core.py in _compute_plot_data(self)
402 data = data._convert(datetime=True, timedelta=True)
403 numeric_data = data.select_dtypes(
--> 404 include=[np.number, "datetime", "datetimetz", "timedelta"]
405 )
406
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/frame.py in select_dtypes(self, include, exclude)
3440 # the "union" of the logic of case 1 and case 2:
3441 # we get the included and excluded, and return their logical and
-> 3442 include_these = Series(not bool(include), index=self.columns)
3443 exclude_these = Series(not bool(exclude), index=self.columns)
3444
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
312 data = data.copy()
313 else:
--> 314 data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
315
316 data = SingleBlockManager(data, index, fastpath=True)
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/internals/construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
710 value = maybe_cast_to_datetime(value, dtype)
711
--> 712 subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)
713
714 else:
~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in construct_1d_arraylike_from_scalar(value, length, dtype)
1231 value = ensure_str(value)
1232
-> 1233 subarr = np.empty(length, dtype=dtype)
1234 subarr.fill(value)
1235
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
[Mon Feb 8 14:52:19 2021]
Error in rule analyze:
jobid: 2
log: Results/Code.ipynb (check log file(s) for error message)
RuleException:
CalledProcessError in line 82 of /home/Desktop/atlasTest/atlas_analyze/Snakefile:
Command 'set -euo pipefail; jupyter-nbconvert --log-level ERROR --execute --output /data/GV009/GV009_035/Results/Code.ipynb --to notebook --ExecutePreprocessor.timeout=-1 /data/GV009/GV009_035/.snakemake/scripts/tmpntaqydkv.Analyis_genome_abundances.ipynb' returned non-zero exit status 1.
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2340, in run_wrapper
File "/home/Desktop/atlasTest/atlas_analyze/Snakefile", line 82, in __rule_analyze
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
File "/home/anaconda3/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2352, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /data/GV009/GV009_035/.snakemake/log/2021-02-08T144532.032128.snakemake.log
Traceback (most recent call last):
File "/home/Desktop/atlasTest/atlas_analyze/analyze.py", line 21, in <module>
"snakemake "
File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 213, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d . -j 1 -s /home/Desktop/atlasTest/atlas_analyze/Snakefile' returned non-zero exit status 1.
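In case it helps: I read that this "Cannot interpret numpy.generic dtype" TypeError typically comes from numpy 1.20 or newer combined with an older pandas, so checking the versions in the analyze environment may already explain it (downgrading numpy or upgrading pandas is the usual remedy):
import numpy as np
import pandas as pd
# If numpy is >= 1.20 while pandas is an older release, the plot.bar() call
# in the notebook can fail with exactly this TypeError.
print("numpy :", np.__version__)
print("pandas:", pd.__version__)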
Hi Silas,
I was trying out atlas_analyze. So far so good; I have only one error, at the second-to-last step, in rule convert_nb.
Here is the error:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 convert_nb
2
Select jobs to execute...
[Sat Jan 16 16:07:43 2021]
rule convert_nb:
input: Results/Code.ipynb
output: Results/Summary.html
jobid: 1
[NbConvertApp] Converting notebook Results/Code.ipynb to html
Traceback (most recent call last):
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 14, in parse_json
nb_dict = json.loads(s, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/bin/jupyter-nbconvert", line 11, in <module>
sys.exit(main())
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/jupyter_core/application.py", line 254, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/traitlets/config/application.py", line 845, in launch_instance
app.start()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 350, in start
self.convert_notebooks()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
self.convert_single_notebook(notebook_filename)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
return self.from_file(f, resources=resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/__init__.py", line 143, in read
return reads(buf, as_version, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/__init__.py", line 73, in reads
nb = reader.reads(s, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 58, in reads
nb_dict = parse_json(s, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 17, in parse_json
raise NotJSONError(("Notebook does not appear to be JSON: %r" % s)[:77] + "...") from e
nbformat.reader.NotJSONError: Notebook does not appear to be JSON: ''...
[Sat Jan 16 16:07:49 2021]
Error in rule convert_nb:
jobid: 1
output: Results/Summary.html
shell:
jupyter nbconvert --output Summary --to=html --TemplateExporter.exclude_input=True Results/Code.ipynb
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena/.snakemake/log/2021-01-16T160743.880488.snakemake.log
Traceback (most recent call last):
File "./analyze.py", line 21, in <module>
"snakemake "
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 213, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena -j 1 -s ./Snakefile' returned non-zero exit status 1.
And this was the previous error:
[Sat Jan 16 16:05:35 2021]
localrule analyze:
input: Results/taxonomy.tsv, Results/mapping_rate.tsv, Results/genome_completeness.tsv, Results/counts/raw_counts_genomes.tsv, Results/counts/median_coverage_genomes.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv
log: Results/Code.ipynb
jobid: 2
[NbConvertApp] ERROR | Notebook JSON is invalid: Additional properties are not allowed ('id' was unexpected)
Failed validating 'additionalProperties' in code_cell:
On instance['cells'][0]:
{'cell_type': 'code',
'execution_count': None,
'id': 'hourly-liberal',
'metadata': {'tags': ['snakemake-job-properties']},
'outputs': ['...0 outputs...'],
'source': '\n'
'######## snakemake preamble start (automatically inserted, do '
'n...'}
Traceback (most recent call last):
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/bin/jupyter-nbconvert", line 11, in <module>
sys.exit(main())
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/jupyter_core/application.py", line 254, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/traitlets/config/application.py", line 845, in launch_instance
app.start()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 350, in start
self.convert_notebooks()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
self.convert_single_notebook(notebook_filename)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
return self.from_file(f, resources=resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/notebook.py", line 32, in from_notebook_node
nb_copy, resources = super().from_notebook_node(nb, resources, **kw)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 143, in from_notebook_node
nb_copy, resources = self._preprocess(nb_copy, resources)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 318, in _preprocess
nbc, resc = preprocessor(nbc, resc)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
return self.preprocess(nb, resources)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 79, in preprocess
self.execute()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 74, in wrapped
return just_run(coro(*args, **kwargs))
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 53, in just_run
return loop.run_until_complete(coro)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
return future.result()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 541, in async_execute
cell, index, execution_count=self.code_cells_executed + 1
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 123, in async_execute_cell
cell, resources = self.preprocess_cell(cell, self.resources, cell_index)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 146, in preprocess_cell
cell = run_sync(NotebookClient.async_execute_cell)(self, cell, index, store_history=self.store_history)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 74, in wrapped
return just_run(coro(*args, **kwargs))
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 53, in just_run
return loop.run_until_complete(coro)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nest_asyncio.py", line 98, in run_until_complete
return f.result()
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/asyncio/futures.py", line 181, in result
raise self._exception
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/asyncio/tasks.py", line 249, in __step
result = coro.send(None)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 832, in async_execute_cell
self._check_raise_for_error(cell, exec_reply)
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 740, in _check_raise_for_error
raise CellExecutionError.from_cell_and_msg(cell, exec_reply['content'])
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
#CAZy
CAZy_annotations_genome= pd.read_table('Results/annotations/CAZy.tsv',index_col=0)
CAZy_presence= (CAZy_annotations_genome>0).astype(int)
CAZy_presence.head()
function_relab = relab @ CAZy_presence
sns.clustermap(function_relab)
function_relab.head()
------------------
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-87471da32f9f> in <module>
7 function_relab = relab @ CAZy_presence
8
----> 9 sns.clustermap(function_relab)
10
11 function_relab.head()
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in clustermap(data, pivot_kws, method, metric, z_score, standard_scale, figsize, cbar_kws, row_cluster, col_cluster, row_linkage, col_linkage, row_colors, col_colors, mask, dendrogram_ratio, colors_ratio, cbar_pos, tree_kws, **kwargs)
1410 row_cluster=row_cluster, col_cluster=col_cluster,
1411 row_linkage=row_linkage, col_linkage=col_linkage,
-> 1412 tree_kws=tree_kws, **kwargs)
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in plot(self, metric, method, colorbar_kws, row_cluster, col_cluster, row_linkage, col_linkage, tree_kws, **kws)
1221 self.plot_dendrograms(row_cluster, col_cluster, metric, method,
1222 row_linkage=row_linkage, col_linkage=col_linkage,
-> 1223 tree_kws=tree_kws)
1224 try:
1225 xind = self.dendrogram_col.reordered_ind
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in plot_dendrograms(self, row_cluster, col_cluster, metric, method, row_linkage, col_linkage, tree_kws)
1067 self.data2d, metric=metric, method=method, label=False, axis=0,
1068 ax=self.ax_row_dendrogram, rotate=True, linkage=row_linkage,
-> 1069 tree_kws=tree_kws
1070 )
1071 else:
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in dendrogram(data, linkage, axis, label, metric, method, rotate, tree_kws, ax)
774 plotter = _DendrogramPlotter(data, linkage=linkage, axis=axis,
775 metric=metric, method=method,
--> 776 label=label, rotate=rotate)
777 if ax is None:
778 ax = plt.gca()
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in __init__(self, data, linkage, metric, method, axis, label, rotate)
582
583 if linkage is None:
--> 584 self.linkage = self.calculated_linkage
585 else:
586 self.linkage = linkage
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in calculated_linkage(self)
649 warnings.warn(msg)
650
--> 651 return self._calculate_linkage_scipy()
652
653 def calculate_dendrogram(self):
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/seaborn/matrix.py in _calculate_linkage_scipy(self)
618 def _calculate_linkage_scipy(self):
619 linkage = hierarchy.linkage(self.array, method=self.method,
--> 620 metric=self.metric)
621 return linkage
622
/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/scipy/cluster/hierarchy.py in linkage(y, method, metric, optimal_ordering)
1063
1064 if not np.all(np.isfinite(y)):
-> 1065 raise ValueError("The condensed distance matrix must contain only "
1066 "finite values.")
1067
ValueError: The condensed distance matrix must contain only finite values.
ValueError: The condensed distance matrix must contain only finite values.
[Sat Jan 16 16:06:24 2021]
Error in rule analyze:
jobid: 2
log: Results/Code.ipynb (check log file(s) for error message)
RuleException:
CalledProcessError in line 82 of /ddn1/vol1/site_scratch/leuven/314/vsc31426/atlas_analyse/atlas_analyze/Snakefile:
Command 'set -euo pipefail; jupyter-nbconvert --log-level ERROR --execute --output /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena/Results/Code.ipynb --to notebook --ExecutePreprocessor.timeout=-1 /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena/.snakemake/scripts/tmpbmbi86fl.Analyis_genome_abundances.ipynb' returned non-zero exit status 1.
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2338, in run_wrapper
File "/ddn1/vol1/site_scratch/leuven/314/vsc31426/atlas_analyse/atlas_analyze/Snakefile", line 82, in __rule_analyze
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 566, in _callback
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 552, in cached_or_run
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2350, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena/.snakemake/log/2021-01-16T160453.469631.snakemake.log
Traceback (most recent call last):
File "./analyze.py", line 21, in <module>
"snakemake "
File "/vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 213, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /ddn1/vol1/site_scratch/leuven/314/vsc31426/NGS_oct/Narmena -j 1 -s ./Snakefile' returned non-zero exit status 1.
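In case it is useful: the "condensed distance matrix must contain only finite values" error should mean that function_relab contains NaN or infinite values. A rough sketch of filtering them out before calling clustermap (whether the remaining NaNs should rather be dropped or filled with 0 probably depends on the data):
import numpy as np
import pandas as pd
import seaborn as sns
def clean_for_clustermap(df: pd.DataFrame) -> pd.DataFrame:
    # Replace infinities with NaN, drop all-NaN rows and columns, and fill
    # the remaining NaNs with 0 so scipy's linkage only sees finite values.
    df = df.replace([np.inf, -np.inf], np.nan)
    df = df.dropna(how="all").dropna(axis=1, how="all")
    return df.fillna(0)
# function_relab is the relab @ CAZy_presence table from the failing cell:
# sns.clustermap(clean_for_clustermap(function_relab))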
Hi team, I am facing the error pasted below. This is from a run that completed successfully in Atlas. Do you know what the problem is?
Traceback (most recent call last):
File "/home/marco/miniconda3/atlas-testing/.snakemake/scripts/tmpgsvxn6_u.get_taxonomy.py", line 11, in <module>
from utils.mag_scripts import tax2table
ModuleNotFoundError: No module named 'utils'
[Wed Sep 29 18:02:47 2021]
Error in rule get_taxonomy:
jobid: 3
output: Results/taxonomy.tsv
Traceback (most recent call last):
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 593, in _callback
raise ex
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 579, in cached_or_run
run_func(*args)
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2460, in run_wrapper
raise ex
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2441, in run_wrapper
runtime_sourcecache_path,
File "/home/marco/miniconda3/atlas_analyze/Snakefile", line 97, in __rule_get_taxonomy
"Results/mapping_rate.tsv"
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 1365, in script
executor.evaluate()
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 377, in evaluate
self.execute_script(fd.name, edit=edit)
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 578, in execute_script
self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 421, in _execute_cmd
**kwargs
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /home/marco/miniconda3/envs/analyze/bin/python3.7 /home/marco/miniconda3/atlas-testing/.snakemake/scripts/tmpgsvxn6_u.get_taxonomy.py' returned non-zero exit status 1.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/marco/miniconda3/atlas-testing/.snakemake/log/2021-09-29T180246.949895.snakemake.log
Traceback (most recent call last):
File "./analyze.py", line 21, in <module>
"snakemake "
File "/home/marco/miniconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /home/marco/miniconda3/atlas-testing/ -j 1 -s ./Snakefile' returned non-zero exit status 1.
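If the problem is simply that the script copied into .snakemake/scripts can no longer see the utils folder that sits next to the Snakefile, would a workaround like this at the top of get_taxonomy.py make sense? (The path is just the atlas_analyze location from my log above and is only an example.)
import sys
# Hypothetical workaround: make the utils package importable by pointing
# Python at the directory that contains it (the atlas_analyze checkout).
sys.path.insert(0, "/home/marco/miniconda3/atlas_analyze")
from utils.mag_scripts import tax2table  # the import that failed above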
When running
cd atlas_analyze
python analyze.py /scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS -s ./Snakefile
I get an error in job 3 out of 7: rule get_taxonomy:
Error in rule get_taxonomy:
jobid: 3
output: Results/taxonomy.tsv
Traceback (most recent call last):
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 593, in _callback
raise ex
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 579, in cached_or_run
run_func(*args)
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2460, in run_wrapper
raise ex
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2441, in run_wrapper
runtime_sourcecache_path,
File "/kyukon/scratch/gent/vo/000/gvo00043/vsc42339/atlas_analyze/Snakefile", line 97, in __rule_get_taxonomy
"Results/mapping_rate.tsv"
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 1365, in script
executor.evaluate()
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 377, in evaluate
self.execute_script(fd.name, edit=edit)
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 578, in execute_script
self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/script.py", line 421, in _execute_cmd
**kwargs
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/bin/python3.7 /kyukon/scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS/.snakemake/scripts/tmp2ifocuhj.get_taxonomy.py' returned non-zero exit status 1.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /kyukon/scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS/.snakemake/log/2021-09-27T174323.495744.snakemake.log
Traceback (most recent call last):
File "analyze.py", line 21, in <module>
"snakemake "
File "/scratch/gent/vo/000/gvo00043/vsc42339/conda/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 265, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d /kyukon/scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS -j 1 -s /Snakefile -s ./Snakefile' returned non-zero exit status 1.
complete snakemake log:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job count min threads max threads
---------------- ------- ------------- -------------
all 1 1 1
analyze 1 1 1
convert_nb 1 1 1
get_annotations 1 1 1
get_mapping_rate 1 1 1
get_taxonomy 1 1 1
import_files 1 1 1
total 7 1 1
Select jobs to execute...
[Mon Sep 27 17:43:30 2021]
rule get_mapping_rate:
input: stats/read_counts.tsv, genomes/counts/raw_counts_genomes.tsv
output: Results/mapping_rate.tsv
jobid: 4
resources: tmpdir=/tmp
[Mon Sep 27 17:43:39 2021]
Finished job 4.
1 of 7 steps (14%) done
Select jobs to execute...
[Mon Sep 27 17:43:39 2021]
rule get_annotations:
input: genomes/annotations/gene2genome.tsv.gz, Genecatalog/annotations/eggNog.tsv.gz
output: genomes/annotations/KO.tsv, genomes/annotations/CAZy.tsv, Genecatalog/annotations/KO.tsv, Genecatalog/annotations/CAZy.tsv
jobid: 6
resources: tmpdir=/tmp, mem=60
[Mon Sep 27 17:59:30 2021]
Finished job 6.
2 of 7 steps (29%) done
Select jobs to execute...
[Mon Sep 27 17:59:30 2021]
localrule get_taxonomy:
input: genomes/taxonomy/gtdb/gtdbtk.bac120.summary.tsv
output: Results/taxonomy.tsv
jobid: 3
resources: tmpdir=/tmp
[Mon Sep 27 17:59:38 2021]
Error in rule get_taxonomy:
jobid: 3
output: Results/taxonomy.tsv
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /kyukon/scratch/gent/vo/000/gvo00043/vsc42339/MICROBIAN/CLEAN_READS/.snakemake/log/2021-09-27T174323.495744.snakemake.log
Rules get_mapping_rate and get_annotations seem to have finished fine, but somehow get_taxonomy was not able to run.
Hi Silas,
I have a question regarding the output written to the 'results' folder:
(1) KO.tsv
Structure is:
MAG K00001 K0002 etc..
MAG1 1 1
and the file written to the Genecatalog/annotations folder:
(2) KO.tsv
Structure is:
Gene0014228 KO7304
Gene...
And in the Summary.html there is a last table and plot of the KEGG orthologs. I was trying to understand what I am seeing, i.e. which file you load in:
Is this KEGG ortholog table and heatmap based on file (1), so KEGG orthologs per MAG per sample?
Or do you read in file (2), KEGG orthologs per gene (somehow weighted by gene abundance) per sample?
I wanted to do further downstream analyses with the Genecatalog file, but I don't know how you translate a query gene to its abundance (occurrence) in a sample.
I was thinking: can we make graphs in which we express the abundance of certain genes relative to a common gene in each sample? Then we are talking about ratios, which I think reduces (statistically) the dependency of the data on sequencing depth. Take rpoB as reference gene, and suppose the query I am interested in is an oil-degrading gene. Then I can say that in this groundwater monitoring well I have 5 oil-degrading targets per 10 rpoB (rpoB is common to all bacteria, so either half of my bacterial population is an oil degrader, or one strain carries 5 oil-degrading genes, etc.). The samples can have different numbers of reads, but that doesn't really matter as long as we take the ratio within a sample?
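A rough pandas sketch of what I mean (the abundance file name, the query KO, and the column layout below are placeholders on my side, not the actual atlas output; K03043 should be the KO for rpoB, if I remember correctly):
import pandas as pd
# Gene -> KO mapping, structured like file (2) above: gene ID in the first
# column, KO in the second.
gene2ko = pd.read_table("Genecatalog/annotations/KO.tsv", index_col=0)
ko_of_gene = gene2ko.iloc[:, 0]
# Hypothetical gene x sample abundance table with the same gene IDs as index.
gene_abundance = pd.read_table("gene_abundance_per_sample.tsv", index_col=0)
# Sum gene abundances per KO and per sample.
ko_per_sample = gene_abundance.groupby(ko_of_gene).sum()
# Express a KO of interest relative to the single-copy reference gene rpoB.
query_ko = "K00001"   # placeholder for e.g. an oil-degrading gene
rpob_ko = "K03043"    # rpoB
ratio = ko_per_sample.loc[query_ko] / ko_per_sample.loc[rpob_ko]
print(ratio)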
Jackson did something similar with the output of MetAnnotate, and I was wondering if I can use the output of your Genecatalog file as input to his R script. I just need one extra column in the table saying to which sample each gene belongs; then I can continue :-)
Sofie
Hi!
I was just trying to play around with the scripts here; is this repo still in use?
I'm now getting this error:
MissingInputException in rule get_mapping_rate in line 92 of /scratch/pawsey0390/pbayer/seagrass_atlas/atlas_analyze/Snakefile:
Missing input files for rule get_mapping_rate:
output: Results/mapping_rate.tsv
affected files:
genomes/counts/raw_counts_genomes.tsv
It seems like the rules in binning.smk dropped TSV files in favor of Parquet files. I guess that wasn't updated here?
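If that is the case, would converting the Parquet file back to the TSV the rule expects be a reasonable stopgap? Something like this (the .parquet file name is a guess on my side, not something I verified):
import pandas as pd
# Guessing the new location: same path as before, but with a .parquet suffix.
counts = pd.read_parquet("genomes/counts/raw_counts_genomes.parquet")
counts.to_csv("genomes/counts/raw_counts_genomes.tsv", sep="\t")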
Hi,
First of all, thanks for creating this tool. I had some trouble installing it via conda (mamba, really) because of an issue related to the Python version needed to build the conda environment. Your conda configuration file suggests we use Python 3.6, but when I do that I get an error when running the pipeline with ./analyze.py {folder}:
"module 'asyncio' has no attribute 'create_task'"
This is because the create_task top-level function was added in Python 3.7 (https://stackoverflow.com/questions/53247533/attributeerror-module-asyncio-has-no-attribute-create-task)
I modified the conda config file condaenv.yml to replace the dependency python=3.6 with python=3.7. Then I ran the modified setup.py file to create an environment with Python 3.7.
The asyncio issue was solved, but another one appeared later on, during the conversion of the Jupyter notebook to HTML.
jupyter nbconvert wanted me to specify the output format. This is probably due to differences in the versions of jupyter installed.
To solve this second issue I added "--to=html" to this line in the Snakefile:
Line 90, before:
"jupyter nbconvert --output Summary --TemplateExporter.exclude_input=True {input}"
Line 90, after:
"jupyter nbconvert --output Summary --to=html --TemplateExporter.exclude_input=True {input}"
That solved the problem. I am not sure whether other problems were introduced, but I wanted to post this here in case someone else runs into the same issue.
Cheers,
Erick