wbs-tw / arcms

This project forked from leesulab/arcms

Mass spectrometry data converter from UNIFI to Parquet and HDF5 formats

Home Page: https://leesulab.github.io/arcMS/

License: Other

R 99.89% Rez 0.11%

🏹 arcMS

arcMS converts HDMSE data acquired with Waters UNIFI to a tabular format for use in R or Python, with a small file size when saved on disk.

Two output data file formats can be obtained:

  • the Apache Parquet format for minimal filesize and fast access. Two files are produced: one for MS data, one for metadata.

  • the HDF5 format with all data and metadata in one file, fast access but larger filesize.

arcMS stands for accessible, rapid and compact, and is also based on the French word arc, which means bow, to emphasize that it is compatible with the Apache Arrow library.

⬇️ Installation

You can install arcMS in R with the following command:

install.packages("pak")
pak::pkg_install("leesulab/arcMS") 

To use the HDF5 format, the rhdf5 package needs to be installed:

pak::pkg_install("rhdf5")

🚀 Usage

First load the package:

library("arcMS")

Then create connection parameters to the UNIFI API (this retrieves a token). See vignette("api-configuration") to learn how to configure the API and register a client app.

con = create_connection_params(apihosturl = "http://localhost:50034/unifi/v1", identityurl = "http://localhost:50333/identity/connect/token")

If arcMS and the R session run on a different computer from the one where the UNIFI API is installed, replace localhost with the IP address of the UNIFI API host.

con = create_connection_params(apihosturl = "http://192.0.2.0:50034/unifi/v1", identityurl = "http://192.0.2.0:50333/identity/connect/token")
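Since the two URLs differ only in host, port and path, they can be built from a single host value. The sketch below is a minimal illustration: unifi_urls is a hypothetical helper (not part of arcMS), and the default ports are simply the ones shown in the examples above, which may differ on another installation.

```r
# Hypothetical helper, not part of arcMS: builds the two URLs used above
# from a host name or IP address. Ports default to those in the examples.
unifi_urls <- function(host = "localhost", api_port = 50034L, identity_port = 50333L) {
  list(
    apihosturl  = sprintf("http://%s:%d/unifi/v1", host, api_port),
    identityurl = sprintf("http://%s:%d/identity/connect/token", host, identity_port)
  )
}

urls <- unifi_urls("192.0.2.0")
print(urls$apihosturl)
print(urls$identityurl)

# With arcMS loaded and the API reachable, the URLs could then be passed on:
# con = create_connection_params(apihosturl = urls$apihosturl,
#                                identityurl = urls$identityurl)
```

Keeping the host in one place avoids editing two long strings by hand when switching between a local and a remote UNIFI installation.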

Now these connection parameters will be used to access the UNIFI folders. The following function will show the list of folders and their IDs (e.g. abe9c297-821e-4152-854a-17c73c9ff68c in the example below).

folders = folders_search()
folders
#>                                     id                name
#> 3 abe9c297-821e-4152-854a-17c73c9ff68c          Christelle
#> 4 dde4ecfc-fe08-4cb2-ad8a-c10f3e45f4dd Imports temporaires
#>                          path folderType                             parentId
#> 3          Company/Christelle    Project 7c3a0fc7-3805-4c14-ab68-8da3e115702e
#> 4 Company/Imports temporaires    Project 7c3a0fc7-3805-4c14-ab68-8da3e115702e
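The printed output suggests that folders is a data frame, so an ID can be looked up by folder name instead of being copied by hand. The sketch below assumes that structure and rebuilds the table from the output above as a stand-in, so that it runs without a UNIFI connection.

```r
# Stand-in for the result of folders_search(), rebuilt from the output above.
# Assumption: the real result is a data frame with these columns.
folders <- data.frame(
  id = c("abe9c297-821e-4152-854a-17c73c9ff68c",
         "dde4ecfc-fe08-4cb2-ad8a-c10f3e45f4dd"),
  name = c("Christelle", "Imports temporaires"),
  path = c("Company/Christelle", "Company/Imports temporaires"),
  folderType = c("Project", "Project"),
  parentId = rep("7c3a0fc7-3805-4c14-ab68-8da3e115702e", 2),
  stringsAsFactors = FALSE
)

# Look up the ID of a folder by its name
folder_id <- folders$id[folders$name == "Christelle"]
print(folder_id)
```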

With a folder ID, we can access the list of Analysis items in the folder:

ana = analysis_search("abe9c297-821e-4152-854a-17c73c9ff68c")
ana

Finally, with an Analysis ID, we can get the list of samples (injections) acquired in this Analysis:

samples = get_samples_list("e236bf99-31cd-44ae-a4e7-74915697df65")
samples

Once we get a sample ID, we can use it to download the sample data:

convert_one_sample_data(sample_id = "0134efbf-c75a-411b-842a-4f35e2b76347")

This command will get the sample name (sample_name) and its parent analysis (analysis_name), create a folder named analysis_name in the working directory and save the sample data with the name sample_name.parquet and its metadata with the name sample_name-metadata.parquet.

With an Analysis ID, we can convert and save all samples from the chosen Analysis:

convert_all_samples_data(analysis_id = "e236bf99-31cd-44ae-a4e7-74915697df65")

To use the HDF5 format instead of Parquet, the format argument can be used as below:

convert_one_sample_data(sample_id = "0134efbf-c75a-411b-842a-4f35e2b76347", format = "hdf5")

convert_all_samples_data(analysis_id = "e236bf99-31cd-44ae-a4e7-74915697df65", format = "hdf5")

This will save each sample's data and metadata together in a single .h5 file.
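To convert only a subset of the samples in an Analysis, the single-sample function can be called in a loop. The sketch below is one possible approach rather than a feature of arcMS: convert_selected is a hypothetical wrapper, and fake_convert is a stub standing in for convert_one_sample_data so that the example runs without a UNIFI connection.

```r
# Hypothetical wrapper, not part of arcMS: calls a converter on each sample ID
# and records which conversions succeeded instead of stopping at the first error.
convert_selected <- function(sample_ids, convert_fun, ...) {
  vapply(sample_ids, function(id) {
    tryCatch({
      convert_fun(sample_id = id, ...)
      TRUE
    }, error = function(e) {
      message("Conversion failed for '", id, "': ", conditionMessage(e))
      FALSE
    })
  }, logical(1))
}

# Stub standing in for convert_one_sample_data(); it only checks its input.
fake_convert <- function(sample_id, format = "parquet") {
  if (!nzchar(sample_id)) stop("empty sample ID")
  invisible(NULL)
}

status <- convert_selected(
  c("0134efbf-c75a-411b-842a-4f35e2b76347", ""),
  fake_convert,
  format = "hdf5"
)
print(unname(status))
```

In a real session, convert_one_sample_data would be passed in place of fake_convert, with the sample IDs taken from the samples list.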

Parquet and HDF5 files can be opened easily in R with the arrow and rhdf5 packages, respectively. Parquet files contain both low and high energy spectra (HDMSE). HDF5 files contain the low energy spectra in the "ms1" dataset, the high energy spectra in the "ms2" dataset, and the metadata in the "samplemetadata" and "spectrummetadata" datasets. The fromJSON function from the jsonlite package imports the metadata JSON file (associated with the Parquet file) as a list of data frames.

sampleparquet = arrow::read_parquet("sample.parquet")
metadataparquet = jsonlite::fromJSON("sample-metadata.json")

samplems1hdf5 = rhdf5::h5read("sample.h5", name = "ms1")
samplems2hdf5 = rhdf5::h5read("sample.h5", name = "ms2")
samplemetadatahdf5 = rhdf5::h5read("sample.h5", name = "samplemetadata")
spectrummetadatahdf5 = rhdf5::h5read("sample.h5", name = "spectrummetadata")
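Because a Parquet file holds low and high energy spectra in one table, it is often useful to separate them after reading. The sketch below uses a small made-up data frame in place of the result of arrow::read_parquet. The column names (rt, mz, intensity, mslevel) and the coding of the energy level as 1 and 2 are assumptions for illustration only; the actual column names in a converted file should be checked with names(sampleparquet).

```r
# Made-up stand-in for a table read with arrow::read_parquet().
# Column names and values are hypothetical, chosen only for illustration.
spectra <- data.frame(
  rt        = c(0.5, 0.5, 0.6, 0.6),
  mz        = c(301.14, 152.07, 301.14, 152.07),
  intensity = c(1200, 340, 1500, 410),
  mslevel   = c(1, 2, 1, 2)   # assumed coding: 1 = low energy, 2 = high energy
)

# Split the rows by the assumed energy-level column
by_level    <- split(spectra, spectra$mslevel)
low_energy  <- by_level[["1"]]
high_energy <- by_level[["2"]]

print(nrow(low_energy))
print(nrow(high_energy))
```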

✨ Shiny App

A Shiny application is available to make the package easier to use. To run the app, use the following command (a few additional packages might need to be installed first):

run_app()

Contributors

julienleroux5, narvall018
