
lcmsmatching's Introduction



Our project

Workflow4Metabolomics (W4M) is a French infrastructure offering software tools for processing, analyzing and annotating metabolomics data. It is based on the Galaxy platform.

In the context of a collaboration between metabolomics (the MetaboHUB French infrastructure) and bioinformatics platforms (IFB: Institut Français de Bioinformatique), we have developed full LC/MS, GC/MS and NMR data-analysis pipelines in the Galaxy framework, covering preprocessing, normalization, quality control, statistical analysis and annotation steps. These modular and extensible workflows combine existing components (the XCMS and CAMERA packages, etc.) with a whole suite of complementary in-house tools. The implementation is accessible through a web interface, which guarantees the completeness of the parameters. The advanced features of Galaxy have made it possible to integrate components of different types and from different sources. An extensible Virtual Research Environment (VRE) is thus offered to the metabolomics community (platforms, end users, etc.), enabling the sharing of preconfigured workflows with both new users and experts in the field.

Citation

Giacomoni F., Le Corguillé G., Monsoor M., Landi M., Pericard P., Pétéra M., Duperier C., Tremblay-Franco M., Martin J.-F., Jacob D., Goulitquer S., Thévenot E.A. and Caron C. (2014). Workflow4Metabolomics: A collaborative research infrastructure for computational metabolomics. Bioinformatics, http://dx.doi.org/10.1093/bioinformatics/btu813

Galaxy

Galaxy is an open, web-based platform for data intensive biomedical research. Whether on the free public server or your own instance, you can perform, reproduce, and share complete analyses.

Homepage: https://galaxyproject.org/


How to contribute

Get our tools

All our tools are publicly available on GitHub and freely installable through the Galaxy ToolShed.

However, we will be glad to receive [good] feedback on their usage, in order to motivate us (and our funders).

It would also be great if you could cite our paper:

Franck Giacomoni, Gildas Le Corguillé, Misharl Monsoor, Marion Landi, Pierre Pericard, Mélanie Pétéra, Christophe Duperier, Marie Tremblay-Franco, Jean-François Martin, Daniel Jacob, Sophie Goulitquer, Etienne A. Thévenot and Christophe Caron (2014). Workflow4Metabolomics: A collaborative research infrastructure for computational metabolomics. Bioinformatics

doi:10.1093/bioinformatics/btu813

Push your tools / W4M as a Showcase

Your tools can be installed, integrated and hosted within the main W4M instance.

Quality standards

However, the tools must stick to the IUC standards in order to be easily integrated.

At first, your tools will be displayed in the Contribution section of the tool panel. Eventually, they may be promoted among the other tools.

Advanced mode

In order to be fully integrated into our reference workflows, your tools must follow our exchange formats between tools (for more information, contact us).

A collaboration should be established if help is needed!

Support / HelpDesk

In all cases, the tools must be maintained by their developers. A tool may be removed if this after-sales support is not provided.

Guidelines

lcmsmatching's People

Contributors

bernt-matthias, lecorguille, pkrog


Forkers

bernt-matthias

lcmsmatching's Issues

speed up HTML writing

The HTML output is currently written line by line into the output file.
Build the file content in memory instead and write it in a single shot.
This line-by-line writing could be related to the slow execution observed in the container.
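
A minimal R sketch of the proposed change, with illustrative names rather than the actual tool code: accumulate the HTML lines in a character vector and flush them with a single writeLines() call instead of writing line by line.

# rows: a list of character vectors, one per table row.

# Current pattern: one write per row.
write_html_slow <- function(rows, path) {
  con <- file(path, open = "w")
  on.exit(close(con))
  for (row in rows)
    cat("<tr><td>", paste(row, collapse = "</td><td>"), "</td></tr>\n",
        sep = "", file = con)
}

# Proposed pattern: build the whole content in memory, write it in one shot.
write_html_fast <- function(rows, path) {
  lines <- vapply(rows, function(row)
    paste0("<tr><td>", paste(row, collapse = "</td><td>"), "</td></tr>"),
    character(1))
  writeLines(lines, con = path)
}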

Chromatographic column names are ambiguous in Peakforest

The Peakforest column names displayed are ambiguous.
There is no indication of the platform/laboratory that entered the column into the database.
Even within a laboratory, there could be several installations of the same column.

List chrom cols tool

Create a separate tool (hence a separate XML file, list-chrom-cols.xml) for listing the chromatographic columns of a database.
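
A hedged R sketch of what the core of such a tool could do, assuming a single-file TSV database; the field name chromcol is an assumption, and the real implementation should go through the existing database classes.

# List the distinct chromatographic columns of a single-file (TSV) database.
list_chrom_cols <- function(db_file, col_field = "chromcol") {
  db <- read.table(db_file, header = TRUE, sep = "\t",
                   quote = "", comment.char = "", stringsAsFactors = FALSE)
  sort(unique(db[[col_field]]))
}

# Example: cat(list_chrom_cols("inhouse-db.tsv"), sep = "\n")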

Matching two variables at the same time

An idea from Stéphane Bernillon, INRA Bordeaux:

Hello Pierrick,
Thank you for the presentation of your matching tool for MS spectra.
If I understood correctly, the idea is to take the m/z values of the variables in the "Variable Metadata" table and compare them one by one with all the m/z values of a reference library.
I see two uses for this tool:

  • Annotating variables in a matrix that has already been annotated.

  • Annotating a matrix that has never been annotated.

In the first case, as it stands, the tool is completely satisfactory with an ad hoc in-house database, combining m/z and RT.

In the second case, it would seem interesting to me to use the retention-time information of the variables in the file.
Take the example of an unknown metabolite associated with the variables M100T1000 and M200T1002.
If I search successively for variable M100T1000 and then M200T1002, the proposed spectra will be less relevant than if I search for both variables M100T1000 and M200T1002 at the same time.
What remains is to find the right criterion for associating the variables, which would make it possible to search on a pseudo-spectrum rather than on a single m/z. A correlation coefficient could be that criterion.

I am available to discuss this in more detail if needed.
Best regards,
Stéphane
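
A minimal R sketch of the suggested direction, assuming the variable table provides m/z, retention time and per-sample intensities. The thresholds rt_tol and cor_min, and the naive single-linkage grouping, are illustrative assumptions rather than existing tool parameters; the CAMERA package performs a similar grouping with groupFWHM() and groupCorr().

# Illustrative sketch only: group variables into pseudo-spectra before matching.
# mz and rt are numeric vectors (one value per variable); intensities is a matrix
# with one row per variable and one column per sample.
group_pseudo_spectra <- function(mz, rt, intensities, rt_tol = 5, cor_min = 0.8) {
  n <- length(mz)
  group <- seq_len(n)
  for (i in seq_len(n - 1)) {
    for (j in seq(i + 1, n)) {
      close_rt <- abs(rt[i] - rt[j]) <= rt_tol
      correlated <- cor(intensities[i, ], intensities[j, ]) >= cor_min
      if (close_rt && correlated)
        group[j] <- group[i]  # naive single-linkage: j joins i's group
    }
  }
  split(seq_len(n), group)  # indices of the variables forming each pseudo-spectrum
}

Each pseudo-spectrum's set of m/z values could then be matched against the library together, instead of one m/z at a time.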

Conda dependency seemingly installed but failed to build job environment

Hello. Did anybody get this error:
Conda dependency seemingly installed but failed to build job environment.
?

From general Galaxy bug reports, it seems to be related to the default behaviour of Conda with job environments that take a long time to build (which might be the case here). However, it is only a guess. Here is the error in more detail:

Traceback (most recent call last):
File "/galaxy-central/lib/galaxy/jobs/runners/init.py", line 170, in prepare_job
job_wrapper.prepare()
File "/galaxy-central/lib/galaxy/jobs/init.py", line 971, in prepare
self.dependency_shell_commands = self.tool.build_dependency_shell_commands(job_directory=self.working_directory)
File "/galaxy-central/lib/galaxy/tools/init.py", line 1415, in build_dependency_shell_commands
tool_instance=self
File "/galaxy-central/lib/galaxy/tools/deps/init.py", line 112, in dependency_shell_commands
return [dependency.shell_commands(requirement) for requirement, dependency in requirement_to_dependency.items()]
File "/galaxy-central/lib/galaxy/tools/deps/resolvers/conda.py", line 392, in shell_commands
self.build_environment()
File "/galaxy-central/lib/galaxy/tools/deps/resolvers/conda.py", line 387, in build_environment
raise DependencyException("Conda dependency seemingly installed but failed to build job environment.")
DependencyException: Conda dependency seemingly installed but failed to build job environment.

Wrong reordering of column output

In the output, do not insert the new columns at the beginning of the table; append them at the end, in order not to disturb the role of the first column in W4M (variable names).
Add an option for this in the script.

Remove useless methods

Methods like getMoleculesIds(), getMoleculeNames() and others are deprecated. They are not used by the search-mz script. Remove them, or set them aside for the 4TabSql and Xls databases so they can later be ported into biodb.

Add a NA value for pos mode or neg mode

In the Galaxy tool page, in the fields "File database MS Positive mode" and "File database MS Negative mode", when only one mode name (either neg or pos) is found, add the missing mode or propose NA.

Append columns

Add an option for outputting the same input file with the new columns appended to it. Do not change the order of the input file's columns; write all columns as they are and only append the new columns.
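
A minimal R sketch of the requested behaviour, assuming the input table and the match results are data frames with the same row order (names are illustrative):

# Append the new columns at the end, leaving the input columns untouched.
append_match_columns <- function(input_table, match_columns) {
  cbind(input_table, match_columns)
}

# Instead of the current behaviour, which inserts the new columns first and
# shifts the variable-name column away from position one:
# cbind(match_columns, input_table)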

Integrate PeakForest compound information

When a match is found in PeakForest, only the spectrum information and the PeakForest compound ID are printed in the output tables.
We need to retrieve the compound information and add it to the output.
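
A hedged R sketch of the merging step only; retrieve_compound_info() and the column names are hypothetical placeholders, not the actual PeakForest API.

# Join retrieved compound records onto the match table by compound ID.
# retrieve_compound_info() is a hypothetical function returning a data frame
# with one row per compound ID and an "id" column; column names are assumptions.
add_compound_info <- function(match_table, retrieve_compound_info) {
  ids <- unique(match_table$peakforest.compound.id)
  compounds <- retrieve_compound_info(ids)
  merge(match_table, compounds,
        by.x = "peakforest.compound.id", by.y = "id",
        all.x = TRUE)  # keep rows whose compound info could not be retrieved
}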

Wrong link in HTML output when several results per row

In the main output, where several results can be shown on the same line, database IDs are listed using a character separator like "," or "|". The HTML output of this main table does not, however, take account of that, and thus displays a wrong URL link.
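
A minimal R sketch of the fix, splitting a multi-value ID cell on its separator before building one link per ID; the URL template and the default separator are placeholders, not the tool's actual values.

# Turn a cell like "ID1|ID2" into one <a> tag per ID instead of a single broken link.
ids_to_links <- function(id_cell, sep = "|",
                         url_tpl = "https://example.org/entry/%s") {
  ids <- strsplit(id_cell, sep, fixed = TRUE)[[1]]
  links <- sprintf('<a href="%s">%s</a>', sprintf(url_tpl, ids), ids)
  paste(links, collapse = ", ")
}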

Update README

  • Check the XML and particularly the help text.
  • Write all the changes made since version 3.4.3 in the README update section.

Remove ant

Try to remove the use of ant.
In particular, remove the need to run "ant test-data" in the test subdirectory.
Maybe store the generated files and stop generating them, or use a Makefile to regenerate them if needed.

Planemo test failing

See branch refact/makefile.

----------------------------------------------------------------------
XML: /private/var/folders/kd/nz_frc_x1231cz37xvyzc_v40000gn/T/tmpdrFQKD/xunit.xml
----------------------------------------------------------------------
Ran 1 test in 82.431s

FAILED (errors=1)
2017-03-26 18:22:45,435 INFO  [functional_tests.py] Shutting down
2017-03-26 18:22:45,435 INFO  [functional_tests.py] Shutting down embedded web server
2017-03-26 18:22:45,454 INFO  [functional_tests.py] Embedded web server stopped
2017-03-26 18:22:45,455 INFO  [functional_tests.py] Shutting down app
2017-03-26 18:22:45,455 INFO  [galaxy.jobs.handler] sending stop signal to worker thread
2017-03-26 18:22:45,455 INFO  [galaxy.jobs.handler] job handler queue stopped
2017-03-26 18:22:45,455 INFO  [galaxy.jobs.runners] TaskRunner: Sending stop signal to 2 worker threads
2017-03-26 18:22:45,455 INFO  [galaxy.jobs.runners] LocalRunner: Sending stop signal to 4 worker threads
2017-03-26 18:22:45,455 INFO  [galaxy.jobs.handler] sending stop signal to worker thread
2017-03-26 18:22:45,456 INFO  [galaxy.jobs.handler] job handler stop queue stopped
2017-03-26 18:22:45,459 INFO  [functional_tests.py] Embedded Universe application stopped
2017-03-26 18:22:45,460 INFO  [functional_tests.py] Cleaning up temporary files in /var/folders/kd/nz_frc_x1231cz37xvyzc_v40000gn/T/tmpSiNHxc/tmpma6L8B
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/paste/httpserver.py", line 1101, in serve_forever
    self.handle_request()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 276, in handle_request
    fd_sets = _eintr_retry(select.select, [self], [], [], timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 155, in _eintr_retry
    return func(*args)
error: (9, 'Bad file descriptor')

2017-03-26 18:22:45,462 INFO  [galaxy.jobs.handler] sending stop signal to worker thread
2017-03-26 18:22:45,463 INFO  [galaxy.jobs.handler] job handler queue stopped
2017-03-26 18:22:45,463 INFO  [galaxy.jobs.runners] TaskRunner: Sending stop signal to 2 worker threads
2017-03-26 18:22:45,463 INFO  [galaxy.jobs.runners] LocalRunner: Sending stop signal to 4 worker threads
2017-03-26 18:22:45,463 INFO  [galaxy.jobs.handler] sending stop signal to worker thread
2017-03-26 18:22:45,463 INFO  [galaxy.jobs.handler] job handler stop queue stopped
2017-03-26 18:22:45,465 ERROR [galaxy.jobs.runners.local] Job wrapper finish method failed
Traceback (most recent call last):
  File "/private/var/folders/kd/nz_frc_x1231cz37xvyzc_v40000gn/T/tmpdrFQKD/galaxy-dev/lib/galaxy/jobs/runners/local.py", line 128, in queue_job
    job_wrapper.finish( stdout, stderr, exit_code )
  File "/private/var/folders/kd/nz_frc_x1231cz37xvyzc_v40000gn/T/tmpdrFQKD/galaxy-dev/lib/galaxy/jobs/__init__.py", line 1362, in finish
    job.set_final_state( final_job_state )
  File "/private/var/folders/kd/nz_frc_x1231cz37xvyzc_v40000gn/T/tmpdrFQKD/galaxy-dev/lib/galaxy/model/__init__.py", line 686, in set_final_state
    if self.workflow_invocation_step:
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/orm/attributes.py", line 237, in __get__
    return self.impl.get(instance_state(instance), dict_)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/orm/attributes.py", line 578, in get
    value = self.callable_(state, passive)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/orm/strategies.py", line 529, in _load_for_state
    return self._emit_lazyload(session, state, ident_key, passive)
  File "<string>", line 1, in <lambda>
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/orm/strategies.py", line 599, in _emit_lazyload
    result = q.all()
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2399, in all
    return list(self)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2516, in __iter__
    return self._execute_and_instances(context)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2531, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute
    return meth(self, multiparams, params)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
    context)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1341, in _handle_dbapi_exception
    exc_info
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
    context)
  File "/Users/pierrick/.planemo/gx_venv/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
    cursor.execute(statement, parameters)
OperationalError: (sqlite3.OperationalError) disk I/O error [SQL: u'SELECT workflow_invocation_step.id AS workflow_invocation_step_id, workflow_invocation_step.create_time AS workflow_invocation_step_create_time, workflow_invocation_step.update_time AS workflow_invocation_step_update_time, workflow_invocation_step.workflow_invocation_id AS workflow_invocation_step_workflow_invocation_id, workflow_invocation_step.workflow_step_id AS workflow_invocation_step_workflow_step_id, workflow_invocation_step.job_id AS workflow_invocation_step_job_id, workflow_invocation_step.action AS workflow_invocation_step_action \nFROM workflow_invocation_step \nWHERE ? = workflow_invocation_step.job_id'] [parameters: (2,)]
There were problems with 1 test(s) - out of 1 test(s) executed. See /Users/pierrick/dev/lcmsmatching/tool_test_output.html for detailed breakdown.

Local continuous integration

Put in place a local continuous-integration system, in order to also test the Peakforest database, which requires an access token.
