manonmartin / pepsnmr

R package for 1H-NMR data pre-treatment

License: GNU General Public License v2.0

R 100.00%

pepsnmr's Issues

Pretreated spectra are not displayed correctly with the `stacked` option

Draw(DATA[[2]], type.draw = "signal",
     output = "pdf", dirpath = out.path,
     filename = "signal", height = 480, width = 640, subtype= "stacked")

Draw option for zero order phase correction

Cannot open a new window to plot all the spectra with the dev.new() function

no plot with plot_rms in 0OPC function

No plot is drawn with non-null plot_rms argument

Reading Bruker data fails when data is stored in float format

Some Bruker data I have was loading incorrectly: it looked like the gain was set far too high, yet other software and libraries loaded it fine. I looked into the NMRGlue code specifically and found that there is a parameter indicating the data type:

# determine data type (assume int32 unless DTYPA is 2)
if isfloat is None:
    isfloat = False     # default value
    if "acqus" in dic and "DTYPA" in dic["acqus"]:
        if dic["acqus"]["DTYPA"] == 2:
            isfloat = True
        else:
            isfloat = False

This data is from firmware version (DSPFVS) 21, and I'm not sure how early this parameter was added. I tested it and found that changing some of the readBin parameters in ReadFid fixed the issue.

Original:

readBin(fidFile, what = "int", n = TD, size=4L, endian = endianness)

Fixed:

readBin(fidFile, what = "double", n = TD, endian = endianness)

I would submit a pull request for this, but I'm not sure about the best way to organize this fix since the appropriate settings for readBin depend on a few different parameters.
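
One possible way to organize the fix is to branch on DTYPA before calling readBin. The sketch below is only an illustration: `read_fid_values` is a hypothetical helper, not part of PepsNMR, and it assumes (following the NMRGlue snippet above) that DTYPA equal to 2 means 64-bit floats and anything else means 32-bit integers.

```r
# Hypothetical helper: choose readBin() settings from the DTYPA parameter.
# Assumption: DTYPA == 2 -> 64-bit doubles, otherwise 32-bit integers.
read_fid_values <- function(con, TD, endianness, DTYPA = 0) {
  if (!is.null(DTYPA) && !is.na(DTYPA) && DTYPA == 2) {
    readBin(con, what = "double", n = TD, size = 8L, endian = endianness)
  } else {
    readBin(con, what = "integer", n = TD, size = 4L, endian = endianness)
  }
}

# Round trip on temporary files to show both branches
tmp_dbl <- tempfile()
writeBin(c(1.5, -2.25, 3, 4), tmp_dbl, size = 8L, endian = "little")
as_double <- read_fid_values(tmp_dbl, TD = 4, endianness = "little", DTYPA = 2)

tmp_int <- tempfile()
writeBin(c(10L, -20L, 30L, 40L), tmp_int, size = 4L, endian = "little")
as_integer <- read_fid_values(tmp_int, TD = 4, endianness = "little", DTYPA = 0)

unlink(c(tmp_dbl, tmp_int))
```

How DTYPA would be obtained from the acqus file is left out here, since that depends on how ReadFid already parses the acquisition parameters.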

Exponential line broadening with `expLB`

Hi,
I have been using PepsNMR for a while now and am really happy with it.

There is this thing with the line broadening, though. I've always wondered why the default expLB=0.3 produces almost no noise reduction. According to the help of Apodization, the FID is multiplied by a factor of exp(-t/expLB).

Looking at the code of Apodization, the factor that is actually applied is exp(-expLB*t), which has almost no effect at the default expLB setting. To get a true line broadening of 0.3 Hz a value of expLB = 2 is required.

Am I misinterpreting something here or is this a bug in the help documentation?

Cheers
Andreas
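
To make the difference between the two forms concrete, here is a small numeric comparison. It is only a sketch: the time axis (2 s, 1000 points, in seconds) is an assumption and is not taken from PepsNMR or from any real FID.

```r
# Toy time axis in seconds; the 2 s acquisition time and the 1000 points
# are assumptions for illustration only.
t <- seq(0, 2, length.out = 1000)
expLB <- 0.3

factor_doc  <- exp(-t / expLB)   # form given in the help page
factor_code <- exp(-expLB * t)   # form reported to be in the code

# Weight left at the end of the FID under each form
end_weights <- c(doc = tail(factor_doc, 1), code = tail(factor_code, 1))
end_weights
```

With these toy values the help-page form has decayed to roughly 0.001 by the end of the FID, while the form in the code still leaves a weight of about 0.55, which fits the observation that the default setting smooths very little.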

Set the bin width in bucketing

Add an option to set either the number of bins or the bins' width

Reading in only specific Fid files

Hi,
I have the following situation: a batch with a large number of samples measured with different pulse programs. Directory structure is such that there is a sub dir per sample. In each sample dir there are four sub dirs for the four pulse progs used. Each pulse prog sub dir contains the fid, acqu, acqus, etc.
ReadFids will read in all fids in all the sample dirs if a path to the top dir is provided. But I only want to read in all the data for a specific pulse program. Providing a vector with the relevant dir names does not work. Any suggestions how I can manage this? I could use lapply with ReadFid but would not have the structure required for further processing.
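
One way to narrow things down with base R is to collect only the experiment folders for the pulse program of interest. This is a sketch under assumptions: `select_fid_dirs` is a hypothetical helper, the layout is taken to be `<top>/<sample>/<experiment>/fid` as described above, and the folder names "10" and "11" are made up.

```r
# Hypothetical helper: return the experiment folders named `experiment`
# that contain a file called "fid".
# Assumed layout: <top_dir>/<sample>/<experiment>/fid
select_fid_dirs <- function(top_dir, experiment) {
  fid_files <- list.files(top_dir, pattern = "^fid$", recursive = TRUE,
                          full.names = TRUE)
  exp_dirs <- dirname(fid_files)
  exp_dirs[basename(exp_dirs) %in% experiment]
}

# Build a small demo tree in a temporary directory
top <- file.path(tempdir(), "batch_demo")
for (s in c("sample1", "sample2")) {
  for (e in c("10", "11")) {
    dir.create(file.path(top, s, e), recursive = TRUE, showWarnings = FALSE)
    file.create(file.path(top, s, e, "fid"))
  }
}

selected <- select_fid_dirs(top, experiment = "11")
samples <- basename(dirname(selected))  # sample folders holding experiment "11"
```

Each selected folder could then be read one at a time, for example with ReadFid as mentioned above. Assembling those results into the structure that ReadFids returns is not covered by this sketch.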

Alignment with non zero reference

Dear Manon,
first of all many thanks for your package and your work.
I'm using PepsNMR to process a set of lipidomic spectra and I have an issue in the Internal referencing phase.
In our case we would like to use the chloroform signal as the reference peak, setting it to 7.26 ppm.

As you can see the peak is clearly visible in all the spectra, but when I try to run the InternalReferencing function I get the following error "ppm.value = 7.26 is not in the ppm interval [-7.27,4.7], and is set to its default ppm.value 0".

The explicit code I'm running is the following

test <- InternalReferencing(nmr_data$zopc[[1]], nmr_data$fid_info[[1]], ppm.value = 7.26, range = "window", method = "max", fromto.RC = list(c(7.5,7)))

Am I doing something fundamentally wrong?

I've been trying to manually dig into the function and I've found something odd in the part where the ppm.scale is calculated

...

   TMSPpeaks <- apply(Re(Data), 1, which.max)
}
maxpeak <- max(TMSPpeaks)
minpeak <- min(TMSPpeaks)
if (shiftHandling %in% c("zerofilling", "NAfilling", "cut")) {
    fill <- NA
    if (shiftHandling == "zerofilling") {
        fill <- 0
    }
    start <- maxpeak - 1
    end <- minpeak - m
    ppmScale <- (start:end) * ppmInterval
    if (ppm.value < min(ppmScale) | ppm.value > max(ppmScale)) {
        warning("ppm.value = ", ppm.value, " is not in the ppm interval [", 
            round(min(ppmScale), 2), ",", round(max(ppmScale), 
              2), "], and is set to its default ppm.value 0")
        ppm.value = 0
    }

So, if I understand correctly, Data (which is already real) is a matrix filled with a small constant value everywhere except in the spectral interval defined by the fromto.RC argument. This allows the maximum of each spectrum within the RC interval to be identified.

TMSPpeaks should then contain the index of the chloroform maximum in each individual sample, so maxpeak and minpeak give the range of variation of the maximum, in terms of column indices, across the samples.

The new ppm scale is calculated from these values, but why check whether the reference value (ppm.value) lies inside the range of ppmScale, i.e. (ppm.value < min(ppmScale) | ppm.value > max(ppmScale))? In my case everything works well if I remove the check.

Should the test instead be performed after the following command?

ppmScale <- ppmScale + ppm.value

Let me know and many thanks for your kind help.

all the best

Pietro
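
A toy illustration of the question above, using made-up numbers rather than real spectra. It compares the range check on the unshifted scale with the same check applied after `ppmScale + ppm.value`. The values of `m`, `ppmInterval`, `maxpeak` and `minpeak` are assumptions, and `in_range` is a hypothetical helper. Whether the shifted check is the intended behaviour is for the maintainers to confirm.

```r
# Toy values; every number here is an assumption chosen for illustration.
m <- 1000             # points per spectrum
ppmInterval <- 0.012  # ppm per point
maxpeak <- 205        # largest index of the reference peak across samples
minpeak <- 200        # smallest index of the reference peak across samples
ppm.value <- 7.26     # chloroform reference

start <- maxpeak - 1
end <- minpeak - m
ppmScale <- (start:end) * ppmInterval  # scale with the reference at 0 ppm

# Hypothetical helper: is v inside the range covered by scale?
in_range <- function(v, scale) v >= min(scale) && v <= max(scale)

check_before <- in_range(ppm.value, ppmScale)              # unshifted scale
check_after  <- in_range(ppm.value, ppmScale + ppm.value)  # shifted scale
```

With these numbers the unshifted scale stops at about 2.45 ppm, so the check fails for 7.26, while the shifted scale does contain the reference value.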

Normalization with type.norm = "firstquartile"

Dear contributors,
I believe there is a bug in Normalization. If using type.norm = "firstquartile" the following error is thrown:

Error in matrixStats::rowQuantiles(Spectrum_data, 0.25, na.rm = TRUE)[[1]] : subscript out of bounds

The cause of this is this line of code in Normalization.R:

factor <- matrixStats::rowQuantiles(Spectrum_data, 0.25, na.rm = TRUE)[[1]]

The correct code should be:

factor <- matrixStats::rowQuantiles(Spectrum_data, probs=0.25, na.rm = TRUE)

Works for me with this change.
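
For readers without matrixStats at hand, the same per-spectrum first quartile can be sketched in base R. This is an illustrative equivalent on made-up data, not the package's code. As far as I can tell, the second positional argument of `matrixStats::rowQuantiles` is `rows` rather than `probs`, which would explain why naming `probs` matters, but treat that as an assumption.

```r
# Toy spectra: 3 rows (spectra) by 5 columns (ppm points); values are made up.
Spectrum_data <- matrix(c(1, 2, 3, 4, 5,
                          2, 4, 6, 8, 10,
                          5, 5, 5, 5, NA),
                        nrow = 3, byrow = TRUE)

# One first-quartile value per spectrum (base R, default quantile type 7)
q1 <- apply(Spectrum_data, 1, quantile, probs = 0.25, na.rm = TRUE,
            names = FALSE)

# Divide every row by its own factor; the vector recycles down the columns,
# so element i of q1 is applied to row i.
normalized <- Spectrum_data / q1
```

The result keeps one factor per spectrum, which is what a row-wise normalization needs, instead of extracting a single element with `[[1]]`.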

FFT conversion not possible with current FidInfoHS and FidInfoHU

Hi Manon,

I am a user of your package and I am currently using it to help develop workflows. I would like to use your Human Serum and Human Urine Metabolome datasets as examples, but the FFT conversion is not possible because some data are missing from FidInfoHS and FidInfoHU. If I am not mistaken, nine columns are required and these objects do not have all of them; O1, for example, is missing:

Error in getArg(O1, Fid_info, "O1") :
impossible to get argument O1 it was not given directly and is not in the info matrix

Could you update these fid_info files so that the preprocessing can be reproduced with the Human Serum and Human Urine Metabolome files?

Thank you for the help.

ATB

Daniel

InternalReferencing at any ppm location (not just 0 ppm)

Travis fails with examples in `SOAP-Ex.R`

Running examples in ‘SOAP-Ex.R’ failed
The error most likely occurred in:
> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: Apodization
> ### Title: Apodization of the FID
> ### Aliases: Apodization
> ### Keywords: manip
> 
> ### ** Examples
> 
> Apod.fid <- Apodization(Data_HS$FidData_HS_2, FidInfo_HS, plotWindow=FALSE)
Begin Apodization 
End Apodization 
It lasted 0.006 s user time, 0 s system time and 0.006 s elapsed time.
> 
> #or
> Apod.res <- Apodization(Data_HS$FidData_HS_2, FidInfo_HS, plotWindow=FALSE, returnFactor=TRUE)
Begin Apodization 
End Apodization 
It lasted 0.004 s user time, 0 s system time and 0.004 s elapsed time.
> Apod.fid = Apod.res[["Fid_data"]]
> plot(Apod.res[["factor"]], type="l")
Warning in min(x) : no non-missing arguments to min; returning Inf
Warning in max(x) : no non-missing arguments to max; returning -Inf
Warning in min(x) : no non-missing arguments to min; returning Inf
Warning in max(x) : no non-missing arguments to max; returning -Inf
Error in plot.window(...) : need finite 'xlim' values
Calls: plot -> plot.default -> localWindow -> plot.window

Add pqn normalisation

Cannot run R CMD build --resave-data to better compress data

* checking data for ASCII and uncompressed saves ... WARNING

  Note: significantly better compression could be obtained
        by using R CMD build --resave-data
                         old_size new_size compress
  Data_HS.RData            21.4Mb   11.7Mb       xz
  FullFidData_HS_0.RData    4.5Mb    3.1Mb       xz

Add support for Varian data

Debug plots in InternalReferencing

Forthcoming ggplot2 release and PepsNMR

We're in the process of preparing a ggplot2 release. As part of the release process, we run the R CMD check on packages that use ggplot2 to make sure we don't accidentally break code for downstream packages.

In running the R CMD check on PepsNMR, we identified the following issue:

checking examples ... ERROR

Running examples in ‘PepsNMR-Ex.R’ failed
The error most likely occurred in:

> ### Name: DrawPCA
> ### Title: Draw the PCA scores or loadings of the signals
> ### Aliases: DrawPCA
> ### Keywords: hplot
>
> ### ** Examples
>
> require(PepsNMRData)
Loading required package: PepsNMRData
> # Draw loadings
> DrawPCA(FinalSpectra_HS, main = "PCA loadings plot",
+       Class = NULL, axes =c(1,3, 5), type ="loadings", loadingstype="l",
+       num.stacked=4, xlab="ppm", createWindow = TRUE)
dev.new(): using pdf(file="Rplots2.pdf")
Error: Aesthetics must be either length 1 or the same as the data (9): label
Execution halted

This error is occurring because PepsNMR was relying on undefined behaviour of annotate(), which isn't designed to draw different items on each facet. To fix this error, we suggest using a geom_*() function with its own data. Note that using .data$col_name within aes() is the preferred way to avoid CMD check issues about undefined variables when mapping columns (make sure you include #' importFrom rlang .data in at least one roxygen documentation block to avoid an error about the name .data being undefined).

library(ggplot2)

plot_data <- data.frame(
  facet = c("facet 1", "facet 2"),
  x = c(1, 2),
  y = c(1, 2)
)

facet_labels <- data.frame(
  facet = c("facet 1", "facet 2"),
  label = c("Text on facet 1", "Text on facet 2")
)

# to silence CMD check
# in the future you will be able to use ggplot2::vars(.data$facet)
# but this is currently not supported (tidyverse/ggplot2#2963)
facet <- NULL; rm(facet)

ggplot2::ggplot(plot_data, aes(x = .data$x, y = .data$y)) +
  ggplot2::geom_point() +
  ggplot2::geom_text(
    ggplot2::aes(label = .data$label, x = .data$x, y = .data$y),
    data = facet_labels,
    x = 1.5,
    y = 1.5
  ) +
  ggplot2::facet_wrap(ggplot2::vars(facet))

We hope to release the new version of ggplot2 in the next two weeks, at which point you will get a note from CRAN that your package checks are failing. Let me know if I can help!