Comments (7)

Chris1221 commented on July 20, 2024

Reading the files in through sqldf does not help. It is slower by at least a factor of 10.

        library(sqldf)

        # path, i and j come from the enclosing loops (not shown here).
        for (k in 1:5) {
          # file() only takes the path; header and separator belong in file.format.
          f <- file(paste0(path, "chr1_block_", i, "_perm_", j, "_k_", k, ".controls.gen"))

          # .gen files are space-delimited with no header row, so columns are named V1, V2, ...
          chunk <- sqldf("select * from f", dbname = tempfile(),
                         file.format = list(header = FALSE, row.names = FALSE, sep = " "))

          if (k == 1) {
            gen <- chunk
          } else {
            # Join on the five leading SNP columns; suffix the new columns to keep names unique.
            gen <- merge(gen, chunk, by = paste0("V", 1:5), suffixes = c("", paste0("_k", k)))
          }
        }

from corge.

Chris1221 commented on July 20, 2024

Reading line by line with readLines() was at least 100 times slower.

library(data.table)
library(foreach)

inputFile <- "../inst/extdata/toy.gen"

system.time({
  con <- file(inputFile, open = "r")

  out <- data.table(ID = 1:1000)  # assumes toy.gen holds 1000 samples

  while (length(oneLine <- readLines(con, n = 1, warn = FALSE)) > 0) {
    myVector <- unlist(strsplit(oneLine, " "))

    # Columns 6 onward hold one probability triplet per sample.
    vec <- foreach(i = seq(6, length(myVector) - 2, by = 3), .combine = c) %do% {
      one   <- as.numeric(myVector[i])
      two   <- as.numeric(myVector[i + 1])
      three <- as.numeric(myVector[i + 2])

      # Hard-call the genotype when one probability exceeds 0.9, otherwise NA.
      if (one > 0.9) {
        0
      } else if (two > 0.9) {
        1
      } else if (three > 0.9) {
        2
      } else {
        NA
      }
    }

    # Add the calls as a new column named after the third field of the line.
    out[, (myVector[3]) := vec]
    message(ncol(out))
  }

  close(con)
})
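
For reference, the per-triplet thresholding in the loop above can also be written as a vectorised operation on a single line of the file. This is only a sketch under stated assumptions: `call_genotypes` is a hypothetical helper, not part of coRge, it assumes five leading SNP columns followed by probability triplets, and it has not been timed against the approaches tried in this thread.

```r
# Hypothetical helper (not part of coRge): hard-call one line of a .gen file.
# `fields` is the character vector from strsplit(line, " ")[[1]].
call_genotypes <- function(fields, threshold = 0.9) {
  # Drop the five SNP columns and reshape the rest to one row per sample.
  probs <- matrix(as.numeric(fields[-(1:5)]), ncol = 3, byrow = TRUE)
  calls <- rep(NA_integer_, nrow(probs))
  # Assign in reverse order so the first column above the threshold wins,
  # matching the if / else if chain in the loop.
  for (g in 3:1) calls[probs[, g] > threshold] <- g - 1L
  calls
}

# Made-up example line with four samples.
line <- "1 rs1 1000 A G 1 0 0 0.05 0.95 0 0 0 1 0.4 0.3 0.3"
call_genotypes(strsplit(line, " ")[[1]])
# → 0 1 2 NA
```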

Chris1221 commented on July 20, 2024

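The above was also slower when run in parallel with

library(doParallel)
cl <- makeCluster(8)
registerDoParallel(cl)  # without this, %dopar% falls back to sequential execution

and %dopar% in place of %do%.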

Chris1221 commented on July 20, 2024

Reading lines one at a time with coRge::gen2R was really bad.

Chris1221 commented on July 20, 2024

Chaining rows together with %:% was equally disastrous. Don't go down this path.

Chris1221 commented on July 20, 2024

foreach with .combine = 'rbind' or .combine = 'c' was just insanely slow.

Chris1221 commented on July 20, 2024

Issue #19 might do it
