Comments (10)

klin333 commented on June 29, 2024

I was going to try the suggestion from the above thread, of using a different cacheDir per call to cppFunction, but that prevents caching, which is essential for most use cases. I can't even work around it with memoise::memoise(Rcpp::cppFunction), because the hash of the env argument changes over time. So the only workaround is to call Rcpp::cppFunction once and keep the resulting function as a global.

from rcpp.

eddelbuettel commented on June 29, 2024

Can we start at the top please: why would you want to call file 1000 times?

We have been at this for quite some time. There are hundreds of issues here, over three thousand posts at StackOverflow and I-honestly-have-no-idea-how-many on the dedicated mailing list. Most questions of the "I am driving cppFunction() or sourceCpp() hard and it breaks" type reflect a fundamental misunderstanding: if you need to do more than these functions offer, create a package.

PS: You are also on an R version that is two years old. It does not add credibility.

eddelbuettel commented on June 29, 2024

Does not reproduce.

Modified Code

edd@rob:/tmp/rcpp_issue_1279$ cat demo.cpp 
#include <Rcpp/Lightest>

// [[Rcpp::export]]
double f(double x){
  return x;
}
edd@rob:/tmp/rcpp_issue_1279$ cat caller.R 
cache_dir <- "cache"
if (dir.exists(cache_dir)) unlink(cache_dir, recursive=TRUE)
if (!dir.exists(cache_dir)) dir.create(cache_dir)
for (i in seq(2000)) {
    if (i %% 100 == 0) cat(i, " ")
    f <- Rcpp::sourceCpp("demo.cpp", cacheDir = cache_dir)
}
cat("\nDone\n")
edd@rob:/tmp/rcpp_issue_1279$ 

Demo

edd@rob:/tmp/rcpp_issue_1279$ Rscript caller.R 
100  200  300  400  500  600  700  800  900  1000  1100  1200  1300  1400  1500  1600  1700  1800  1900  2000  
Done
edd@rob:/tmp/rcpp_issue_1279$ 

Of course, I use ccache too. And I see no reason to turn it off given that you gave no real reason (yet?) why this loop makes sense.

klin333 commented on June 29, 2024

Ah, fair enough. Perhaps a warning in the documentation of sourceCpp or cppFunction would be helpful? Not essential, though.

Sorry for the trouble, really appreciate this package!

FYI, since you asked how this came up: the problem arose while using the einsum package (https://github.com/const-ae/einsum), which uses R code to generate C++ code on the fly and calls cppFunction to create bespoke C++ functions based on a user-specified Einstein summation spec; see the code below.

I fully understand that this is a problem with packages and use cases downstream of Rcpp, and that is completely fine if package developers and users know about it. The reason I thought it was fine to call cppFunction 1000 times with the same C++ code is that I thought the caching was done by cppFunction, i.e. I thought calling cppFunction a second time is completely free, so it made sense not to complicate the user's R code by caching outside of Rcpp.

Also I'm on Windows, not sure if it's a Windows specific problem...

I'm more than happy to close this issue, as it appears to be out of scope. I don't need this fixed at all; I can easily work around it by simply not calling Rcpp::cppFunction 1000 times.

# innocent looking but crashes R when foo1 is executed 1000 times. 
foo1 <- function(x) {
  einsum_f <- einsum::einsum_generator('ijk->ik', compile_function = TRUE) 
  einsum_f(x)
}

# users require awareness of Rcpp::cppFunction shortcomings to do it this way
einsum_f <- einsum::einsum_generator('ijk->ik', compile_function = TRUE) 
foo2 <- function(x) {
  einsum_f(x)
}
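
A minimal sketch of how that caching could live outside Rcpp, so that callers of something like foo1 do not need to manage a global themselves. The helper name compile_once and the cache environment .compiled_cache are hypothetical and not part of Rcpp or einsum. The source text itself is used as the cache key, which assumes short, non-empty snippets.

```r
# Hypothetical helper (not part of Rcpp): keep one compiled function per
# distinct source string, so each snippet is compiled and loaded only once
# per R session.
.compiled_cache <- new.env(parent = emptyenv())

compile_once <- function(code, compiler = Rcpp::cppFunction) {
  # The default `compiler` is evaluated lazily, so Rcpp is only needed
  # when no other compiler function is supplied.
  if (!exists(code, envir = .compiled_cache, inherits = FALSE)) {
    assign(code, compiler(code), envir = .compiled_cache)
  }
  get(code, envir = .compiled_cache, inherits = FALSE)
}
```

With Rcpp installed, repeated calls such as compile_once("double f(double x){ return x; }") would return the same R function object after the first call, without invoking cppFunction again.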

klin333 commented on June 29, 2024

FYI, just to finish this thread: I believe this is due to dyn.load, which is called within Rcpp::sourceCpp via source(scriptPath, local = env). The loop below will cause an R crash (the DLL is generated from the same debug code as earlier).

From the help page of dyn.load: "By default, the maximum number of DLLs that can be loaded is now 614 when the OS limit on the number of open files allows or can be increased, but less otherwise". That is probably why, though I have no idea why my loop crashes R around i = 1056 instead of 614.

for (i in seq(2000)) {
  print(i)
  `.sourceCpp_1_DLLInfo` <- dyn.load('C:/Users/User/workspace/tmp123/sourceCpp-x86_64-w64-mingw32-1.0.11/sourcecpp_169095e1a73/sourceCpp_2.dll')
}

Edit: confirmed that it's due to the dyn.load limit on the maximum number of DLLs. If I run the above loop to i = 1055, i.e. one before the crash, and then call library(dplyr) or load any other package that loads DLLs, R will crash.
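
For anyone who wants to see how many DLLs a session currently has registered, base R's getLoadedDLLs() can be counted directly. This is only a small sketch for inspection; as far as I recall, the dyn.load help page also mentions the R_MAX_NUM_DLLS environment variable, which has to be set before R starts to change the limit.

```r
# Count the DLLs currently registered in this R session.
loaded <- getLoadedDLLs()
n_loaded <- length(loaded)
cat("DLLs loaded:", n_loaded, "\n")

# If R_MAX_NUM_DLLS was set before R started, it shows up here;
# an empty string means the variable is not set.
max_dlls <- Sys.getenv("R_MAX_NUM_DLLS")
cat("R_MAX_NUM_DLLS:", if (nzchar(max_dlls)) max_dlls else "(not set)", "\n")
```

Running this before and after a loop like the one above shows how the count changes.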

klin333 commented on June 29, 2024

I suppose that if anyone is ever bothered by this, conceptually you could skip dyn.load when a rebuild is not required. That should fix this problem, assuming it is reproducible on Windows and not specific to my machine.

As I said, I'm not bothered by this, so happy for nothing to be done.

eddelbuettel commented on June 29, 2024

Lots of messages so a lot for me (or anybody else) to digest.

I know little about einsum. But rstan and the like, for example, have been doing just this for well over a decade, and it works on all platforms. So if there is something that does not work with einsum, you need to distill it further.

I would suspect that Windows may have something to do with it. So if you can, try macOS or Linux too.

Enchufa2 commented on June 29, 2024

> I thought calling cppFunction a second time is completely free

It is. If you try to compile the same function twice, the cached library is used (unless you force recompilation with rebuild=TRUE). So if there is any issue, it's with the dyn.load call.

But I cannot reproduce this. E.g., on Linux:

$ cat test.R 
trace(dyn.load, quote(message(i, ": call to dyn.load")), print=FALSE)
for (i in 1:2000)
  Rcpp::cppFunction("double f(double x){ return x; }")
$ time Rscript test.R 
Tracing function "dyn.load" in package "base"
[1] "dyn.load"
1: call to dyn.load
...
2000: call to dyn.load

real    0m4.637s
user    0m4.301s
sys     0m0.364s

I tried exactly the same on Windows 10 with R 4.2.3, and it works the same, no crash.

klin333 commented on June 29, 2024

Oh wow, upgrading to R 4.2.3 fixed it (on Windows; I never reproduced the problem on Mac). I sincerely apologise for taking up everyone's time.

It turns out this dyn.load crash has existed since at least R 3.3.2 on Windows (https://stackoverflow.com/questions/47528881/why-does-calling-dyn-load-in-a-for-loop-crash-my-r-session).

I sincerely thank everyone for their time.

Enchufa2 commented on June 29, 2024

For completeness, I tried the following on the same Windows machine:

foo1 <- function(x) {
  einsum_f <- einsum::einsum_generator('ijk->ik', compile_function = TRUE) 
  einsum_f(x)
}

for (i in 1:2000) foo1(array(c(1:18), c(3, 3, 2)))

No issue either. But, yet again, I see no reason for doing this. foo2 is at least 1000x faster.
