
Comments (19)

tobiasko avatar tobiasko commented on August 26, 2024 3

I think this task is not named correctly. If we go for this, we do not want a coercion function; we need to define a backend that can access spectral data stored in raw files:

https://github.com/rformassspectrometry/Spectra/blob/master/man/MsBackend.Rd

The easiest option might be to create an MsBackendDataFrame that keeps all data in memory. A backend that keeps peak lists on disk might be a cool thing; lazy evaluation could become pretty powerful here. One could also think about mixed models: keep metadata in memory and spectral data on disk.
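A minimal base R sketch of the mixed model, assuming one RDS file per spectrum as a stand-in for the raw file. The names `peak_dir`, `peakFile` and `read_peaks()` are made up for illustration and are not part of Spectra or rawDiag.

```r
# Hypothetical sketch: spectrum metadata stays in memory, peak lists stay on disk.
# One RDS file per spectrum stands in for the raw file; nothing here is Spectra API.
peak_dir <- tempfile("peaks_")
dir.create(peak_dir)

# Toy peak lists (m/z and intensity) for three spectra.
peaks <- list(
  data.frame(mz = c(100.1, 200.2), intensity = c(10, 20)),
  data.frame(mz = c(150.5, 250.5, 350.5), intensity = c(5, 15, 25)),
  data.frame(mz = 300.3, intensity = 42)
)

# Write every peak list to disk and remember only its path.
paths <- file.path(peak_dir, sprintf("scan_%03d.rds", seq_along(peaks)))
for (i in seq_along(peaks)) saveRDS(peaks[[i]], paths[i])

# In-memory part: small, cheap to filter and subset.
meta <- data.frame(
  scanIndex = seq_along(peaks),
  msLevel = c(1L, 2L, 2L),
  rtime = c(0.097, 0.350, 0.419),
  peakFile = paths,
  stringsAsFactors = FALSE
)

# Lazy part: peak data is read only for the rows that are asked for.
read_peaks <- function(meta) lapply(meta$peakFile, readRDS)

ms2 <- meta[meta$msLevel == 2L, ]   # touches metadata only
ms2_peaks <- read_peaks(ms2)        # reads two files, not three
```

Filtering happens on the small metadata table, and the disk is only touched once the peak data is actually requested.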

from rawdiag.

lgatto avatar lgatto commented on August 26, 2024 2

Yes, a separate MsBackendRawDiag package on GitHub that ships the DLLs is an option. Alternatively, a function that downloads them upon first initialisation would also work.
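A rough sketch of the download-on-first-use idea in base R. The helper `ensure_assembly()`, the example URL and the cache location are all assumptions for illustration; the real download location and licence terms would have to come from the vendor.

```r
# Hypothetical helper: fetch a required assembly once and reuse the cached copy.
# `url`, the file name and the cache location are placeholders, not real endpoints.
ensure_assembly <- function(url,
                            cache_dir = file.path(tempdir(), "rawfilereader"),
                            fetch = utils::download.file) {
  dest <- file.path(cache_dir, basename(url))
  if (!file.exists(dest)) {
    dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
    # `fetch` is injectable so the logic can be exercised without network access.
    fetch(url, dest, mode = "wb")
  }
  dest
}

# Usage with a stand-in downloader that just writes a placeholder file.
calls <- 0L
fake_fetch <- function(url, destfile, ...) {
  calls <<- calls + 1L
  writeLines("placeholder", destfile)
  invisible(0L)
}
cache <- tempfile("cache_")
p1 <- ensure_assembly("https://example.org/Some.Assembly.dll", cache, fake_fetch)
p2 <- ensure_assembly("https://example.org/Some.Assembly.dll", cache, fake_fetch)
# The second call finds the cached file, so the downloader is not invoked again.
```

In a package, a call like this could sit in `.onLoad()` or at the top of the backend's initialisation method, so users only pay the download cost once.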

from rawdiag.

cpanse avatar cpanse commented on August 26, 2024 2

1. All in memory

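library(rawDiag)
# Sample raw file shipped with rawDiag
(rawfile <- file.path(path.package(package = 'rawDiag'), 'extdata', 'sample.raw'))
# Read all scans, then coerce the peak list set to a DataFrame
system.time(PLS <- readScans(rawfile))
system.time(DF <- as.peaklistSet.DataFrame(PLS))
# Every spectrum comes from the first (and only) file
DF$fromFile = as.integer(1)

# Keep everything in memory via MsBackendDataFrame
if (require(Spectra)) {
  rawDiagSample <- MsBackendDataFrame()
  system.time(BE <- backendInitialize(object = rawDiagSample, files = rawfile, spectraData = DF))
}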
R> library(rawDiag)
R> (rawfile <- file.path(path.package(package = 'rawDiag'), 'extdata', 'sample.raw'))
[1] "/home/cp/R/x86_64-pc-linux-gnu-library/3.6/rawDiag/extdata/sample.raw"
R> system.time(PLS <-readScans(rawfile))
   user  system elapsed 
  0.123   0.053   0.643 
R> system.time(DF <- as.peaklistSet.DataFrame(PLS))
   user  system elapsed 
  0.029   0.000   0.029 
R> DF$fromFile = as.integer(1)
R> 
R> if(require(Spectra)){
+    rawDiagSample <- MsBackendDataFrame()
+    system.time(BE <- backendInitialize(object=rawDiagSample, files=rawfile, spectraData=DF))
+  }
   user  system elapsed 
  0.122   0.000   0.122 
R> BE
MsBackendDataFrame with 574 spectra
      msLevel     rtime scanIndex
    <integer> <numeric> <integer>
1           1     0.097         1
2           2      0.35         2
3           2     0.419         3
4           2     0.489         4
5           2     0.558         5
...       ...       ...       ...
570         2    46.512       570
571         2    46.581       571
572         2    46.651       572
573         1    46.806       573
574         2    47.059       574
 ... 18 more variables/columns.

2. Use MsBackendRawDiag()

library(rawDiag)
fls <- rep(rawfile <- file.path(path.package(package = 'rawDiag'), 'extdata', 'sample.raw'), 1)
be <- backendInitialize(MsBackendRawDiag(), files = fls)
sps_thermofinnigan <- Spectra(be)
sps_thermofinnigan
R> sps_thermofinnigan
MSn data (Spectra) with 574 spectra in a MsBackendRawDiag backend:
      msLevel     rtime scanIndex
    <integer> <numeric> <integer>
1           1     0.097         1
2           2      0.35         2
3           2     0.419         3
4           2     0.489         4
5           2     0.558         5
...       ...       ...       ...
570         2    46.512       570
571         2    46.581       571
572         2    46.651       572
573         1    46.806       573
574         2    47.059       574
 ... 15 more variables/columns.

file(s):
sample.raw
Processing:
  
R> 

from rawdiag.

jorainer avatar jorainer commented on August 26, 2024 2

We recently made some changes in the MsBackend definition:

  • @files and @modCount slots are gone.
  • Add new spectra variables dataStorage and dataOrigin

dataStorage is meant as the replacement for @files, with the difference that it returns the current storage location of each spectrum (e.g. mzML file, memory, HDF5 file, ...).

I don't expect any more major changes in the MsBackend class. I will now mostly work on implementing all missing analysis methods for Spectra.
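To make the per-spectrum idea concrete, here is a plain data.frame illustration. It does not use the Spectra API; the paths and values are invented, and only the column names dataStorage and dataOrigin are taken from the comment above.

```r
# Plain data.frame illustration, not the Spectra API: one storage location per spectrum.
spectra_vars <- data.frame(
  msLevel = c(1L, 2L, 2L, 1L),
  rtime = c(0.097, 0.350, 0.419, 0.489),
  dataStorage = c("/data/a.mzML", "/data/a.mzML", "<memory>", "/data/b.h5"),
  dataOrigin = c("/data/a.mzML", "/data/a.mzML", "/data/a.mzML", "/data/b.raw"),
  stringsAsFactors = FALSE
)

# What a single object-level file vector used to answer is now derived on demand.
storage_locations <- unique(spectra_vars$dataStorage)

# Subsetting keeps storage and spectra in sync without any index bookkeeping
# (compare the integer `fromFile` column needed earlier in this thread).
ms2 <- spectra_vars[spectra_vars$msLevel == 2L, ]
ms2_storage <- unique(ms2$dataStorage)
```

Because the location travels with each row, a subset never needs to re-map indices into a separate file vector.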

from rawdiag.

cpanse avatar cpanse commented on August 26, 2024 2

@jorainer @lgatto @sgibb: we are in the process of shaping an MsBackendRawFileReader package.

Package: MsBackendRawFileReader
Type: Package
Title: Bridging Spectra and ThermoFinnigan raw files
Version: 0.0.1
Authors@R: c(person(given = "Christian",
    family = "Panse", email = "[email protected]", role = c("aut", "cre"),
    comment = c(ORCID = "0000-0003-1975-3064")),
	person(given = "Tobias", family = "Kockmann",
	  email = "[email protected]", role = "aut", 
    comment = c(ORCID = "0000-0002-1847-885X")))
Depends: R (>= 3.6),
       IRanges,
	methods,
	Spectra,
	rDotNet (>= 0.9)
Suggests:
	knitr,
	testthat
Description: Implements an MsBackend for the Spectra package using
  Thermo Fisher Scientific's NewRawFileReader .Net libraries.
  The package generalizes the functionality introduced by the
  rawDiag package (Trachsel, 2018 <doi:10.1021/acs.jproteome.8b00173>).
SystemRequirements: mono 4.x or higher on OSX / Linux, .NET 4.x or
        higher on Windows, 'msbuild' and 'nuget' available in the path
URL: https://github.com/cpanse/MsBackendRawFileReader
BugReports: https://github.com/cpanse/MsBackendRawFileReader/issues
Encoding: UTF-8
LazyData: true
NeedsCompilation: no
RoxygenNote: 6.1.1
License: GPL-3
VignetteBuilder: knitr
Collate: 
    'hidden_aliases.R'
    'AllGenerics.R'
    'MsBackendRawFileReader-functions.R'
    'MsBackendRawFileReader.R'
    'zzz.R'

from rawdiag.

cpanse avatar cpanse commented on August 26, 2024 2

I wouldn't have much time for development at the moment, but would be happy to provide support if needed. I imagine that @jorainer and @sgibb would also contribute important feedback.

@lgatto @jorainer @sgibb we added you to the repo for transparency and so that you can give feedback.

from rawdiag.

lgatto avatar lgatto commented on August 26, 2024

Yes, I confirm that the correct way forward would be to define a backend that uses the mono libraries to access the raw data and keeps the metadata in a DataFrame. See MsBackendMzR, which does exactly that but with mzR.

from rawdiag.

cpanse avatar cpanse commented on August 26, 2024

@lgatto @tobiasko yep; I am not so fast ... upgrading BioC from 3.7 to devel on my playground box reminds me of my 1st SuSE4.2 install in 1996 (except for swapping install disks 1-3).

from rawdiag.

tobiasko avatar tobiasko commented on August 26, 2024

@lgatto, would you suggest bundling the code for the alternative backend into a separate Bioconductor package? Will Bioconductor be ok with hosting Thermo DLLs?

https://planetorbitrap.com/rawfilereader

from rawdiag.

lgatto avatar lgatto commented on August 26, 2024

Pinging @jorainer

from rawdiag.

tobiasko avatar tobiasko commented on August 26, 2024

Cool!!!

But this statement still confuses me: as.peaklistSet.DataFrame(...)

The above function coerces a peaklistSet object to a DataFrame? Why not simply as.DataFrame(...)? Like date <- as.Date("2017-01-01").

And I am still wondering if there is a more elegant way to initialize the backend. Maybe @lgatto can explain to us why backendInitialize(...) needs a file argument if used for MsBackendDataFrame. DF$fromFile = as.integer(1) looks creepy.

from rawdiag.

jorainer avatar jorainer commented on August 26, 2024

Nice! But be aware that there will be some quite substantial changes to the MsBackend:

  • @files slot will be removed; information about where the data is stored should be provided by the dataStorage spectra variable (and method).
  • fromFile will be removed.
  • fileNames will be removed.

I am currently finalizing the required changes in Spectra and fixing/adding unit tests. I'll ping you when it is ready.

from rawdiag.

tobiasko avatar tobiasko commented on August 26, 2024

@jorainer Thx for keeping us in the loop! Is the class definition of Spectra already stable/is it safe to inherit?

from rawdiag.

lgatto avatar lgatto commented on August 26, 2024

No major changes are expected, but some small changes are quite possible. If you prefer a more stable version, I would suggest waiting for a pre-Bioconductor release (I think it is conceivable that we will submit for the next release).

from rawdiag.

cpanse avatar cpanse commented on August 26, 2024

Cool!!!

But this statement still confuses me: as.peaklistSet.DataFrame(...)

The above function coerces a peaklistSet object to a DataFrame? Why not simply as.DataFrame(...)? Like date <- as.Date("2017-01-01").

@tobiasko this is just S3 cosmetics
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Registering-S3-methods
this is just a Hello, World! function. We won't use that S3method.
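For readers following the naming question: S3 methods are named generic.class, so a coercion from a peaklistSet would conventionally be a method of an existing generic. A minimal sketch with a toy class; the constructor `new_peaklistSet()` and the list layout are assumptions, not rawDiag's actual structure.

```r
# Toy S3 class standing in for a peaklistSet; the structure is assumed for illustration.
new_peaklistSet <- function(...) structure(list(...), class = "peaklistSet")

# S3 methods are named <generic>.<class>, so this one extends as.data.frame().
as.data.frame.peaklistSet <- function(x, ...) {
  data.frame(
    scan = vapply(x, function(p) p$scan, integer(1)),
    nPeaks = vapply(x, function(p) length(p$mZ), integer(1))
  )
}

pls <- new_peaklistSet(
  list(scan = 1L, mZ = c(100.1, 200.2)),
  list(scan = 2L, mZ = 300.3)
)
df <- as.data.frame(pls)   # dispatches to as.data.frame.peaklistSet
```

Inside a package, the method would also need an S3method(as.data.frame, peaklistSet) entry in NAMESPACE, which is what the R-exts section linked above covers.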

And I am still wondering if there is a more elegant way to initialize the backend. Maybe @lgatto can explain to us why backendInitialize(...) needs a file argument if used for MsBackendDataFrame. DF$fromFile = as.integer(1) looks creepy.

from rawdiag.

tobiasko avatar tobiasko commented on August 26, 2024

@cpanse Ok, let's discuss coding style issues offline! ;-)

from rawdiag.

lgatto avatar lgatto commented on August 26, 2024

That's cool.

I initially had the impression that you were using MsBackendDataFrame as a template. However, MsBackendMzR would be a better fit.

from rawdiag.

cpanse avatar cpanse commented on August 26, 2024

@lgatto at the moment I am only ctrl-c/ctrl-v-ing from MsBackendMzR. We are keeping it private for now to avoid confusion, but if you wish, we can add all three of you to the repository at any time.

from rawdiag.

lgatto avatar lgatto commented on August 26, 2024

I wouldn't have much time for development at the moment, but would be happy to provide support if needed. I imagine that @jorainer and @sgibb would also contribute important feedback.

from rawdiag.
