johnh2o2 / cuvarbase
Python library for fast time-series analysis on CUDA GPUs
Home Page: https://johnh2o2.github.io/cuvarbase/index.html
License: GNU General Public License v3.0
@astrobatty has done some great work (a while ago now) showing PDM some love and care after I abandoned it pretty early on. We should revisit this and add appropriate documentation to bring it onto an equal footing with the rest of the algorithms we have here.
Of the four available kernels, `binless_tophat` and `binless_gauss` cannot be used. The following error is raised:

The problem is that when these functions are compiled here, names without the `nbins` suffix are expected (which should really be the case), but later, when the function is selected here, the `nbins` suffix is required regardless of the function type.

This can be solved by providing a bin size even if the kernel in the `.cu` file won't use it.
It would be good to have some standard benchmarking plots comparing our speed to other implementations. The numbers will obviously vary with the type of GPU, but we should have something. The astropy implementations of Lomb-Scargle and BLS are probably the best baselines. I don't know whether there are alternatives for PDM (astrobase? https://astrobase.readthedocs.io/en/latest/astrobase.periodbase.spdm.html) or for Conditional Entropy (I have found nothing).
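As a starting point, a tiny timing harness like the one below shows the shape such a benchmark could take. All names here are hypothetical and the periodogram is a naive numpy stand-in, not cuvarbase or astropy code; in a real benchmark the same `benchmark()` call would wrap astropy's `LombScargle(t, y).power(freqs)` and the corresponding cuvarbase GPU routine on identical inputs.

```python
import numpy as np
from time import perf_counter


def benchmark(fn, *args, n_repeats=3):
    """Return the best wall-clock time (seconds) over n_repeats calls."""
    times = []
    for _ in range(n_repeats):
        t0 = perf_counter()
        fn(*args)
        times.append(perf_counter() - t0)
    return min(times)


def naive_periodogram(t, y, freqs):
    """Naive O(N * Nf) classical periodogram -- a stand-in for the
    periodogram implementations being compared."""
    yc = y - y.mean()
    w = 2 * np.pi * freqs[:, None] * t[None, :]
    c = (yc * np.cos(w)).sum(axis=1)
    s = (yc * np.sin(w)).sum(axis=1)
    return (c ** 2 + s ** 2) / len(t)


rng = np.random.default_rng(0)
t = np.sort(100 * rng.random(200))
y = np.sin(2 * np.pi * t / 7.0) + 0.1 * rng.standard_normal(t.size)
freqs = np.linspace(0.01, 2.0, 5000)

best = benchmark(naive_periodogram, t, y, freqs)
print(f"naive CPU periodogram: {best:.4f} s for {freqs.size} frequencies")
```

Plotting `best` against `Nf` for each implementation (and for a couple of GPU models) would give exactly the kind of comparison plot described above.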
The current implementation of `fap_baluev` is numerically unstable and will return 0 when the FAP is small enough. There should be a way to obtain a numerically stable implementation (perhaps using a log(-log(FAP)) transform or something similar).
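The underflow mechanism can be illustrated with a toy expression of the same form. The actual Baluev formula has more terms, and `fap_naive`/`fap_stable` below are illustrative names, not cuvarbase functions; the point is only that `log1p`/`expm1` (or returning the log-FAP directly) preserve precision where the naive form collapses to exactly 0:

```python
import numpy as np


def fap_naive(p_single, M):
    # underflows to exactly 0.0 once M * p_single drops below ~1e-16,
    # because (1 - p_single) rounds to 1.0 in float64
    return 1.0 - (1.0 - p_single) ** M


def fap_stable(p_single, M):
    # log1p/expm1 keep full precision for tiny FAPs; returning
    # log_one_minus_fap (or log(-log(FAP))) avoids underflow entirely
    log_one_minus_fap = M * np.log1p(-p_single)
    return -np.expm1(log_one_minus_fap)


p, M = 1e-20, 1000
print(fap_naive(p, M))   # -> 0.0 (precision lost)
print(fap_stable(p, M))  # ~1e-17 (correct)
```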
Kevin Burdge pointed out a paper that came out last year that would be cool for us to incorporate: https://arxiv.org/abs/2103.06193
The basic idea: when you have a small enough number of observations, you can drop both the binning step of BLS and the grid search over q, and this is actually a bit more efficient. At each trial frequency, you sort the observations by phase; then, instead of grid searching over every transit parameter (phase of the left transit boundary, width of the transit), you consider every pair of observations, and each pair defines both of these parameters.
It saves you from redundantly searching over too finely grained parameter grids.
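A minimal CPU sketch of the pair-based idea at a single trial frequency (illustrative code, not the paper's implementation or cuvarbase API; wraparound transits crossing phase 1 → 0 are ignored for simplicity):

```python
import numpy as np
from itertools import combinations


def sparse_bls_one_freq(t, y, freq):
    """Score every pair (i, j) of phase-sorted observations as candidate
    transit boundaries at one trial frequency. Each pair fixes both phi0
    (left edge) and q (width), so no grid over (phi0, q) is needed.
    Returns (power, phi0, q) maximizing the squared BLS signal residue
    s**2 / (r * (1 - r)) of Kovacs et al. (2002)."""
    n = len(t)
    ph = (t * freq) % 1.0
    order = np.argsort(ph)
    ph, yc = ph[order], y[order] - np.mean(y)
    w = np.full(n, 1.0 / n)  # equal weights for simplicity
    # prefix sums make each contiguous run [i, j] O(1) to score
    cs = np.concatenate(([0.0], np.cumsum(w * yc)))
    cr = np.concatenate(([0.0], np.cumsum(w)))
    best = (0.0, 0.0, 0.0)
    for i, j in combinations(range(n), 2):
        s = cs[j + 1] - cs[i]  # weighted flux sum in candidate transit
        r = cr[j + 1] - cr[i]  # total weight in candidate transit
        if 0.0 < r < 1.0:
            power = s * s / (r * (1.0 - r))
            if power > best[0]:
                best = (power, ph[i], ph[j] - ph[i])
    return best


# toy check: a noiseless dip injected between phases 0.2 and 0.3 at freq 0.1
t = np.linspace(0.0, 99.0, 50)
ph_true = (0.1 * t) % 1.0
y = -0.5 * ((ph_true > 0.2) & (ph_true < 0.3)).astype(float)
power, phi0, q = sparse_bls_one_freq(t, y, 0.1)
print(phi0, q)  # left edge near 0.2, width under 0.1
```

The pair loop is O(n^2) per frequency, which is exactly why this only wins in the sparse regime the paper targets.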
Our BLS implementation returns only one type of BLS "power" = the square of the "signal residue" from https://arxiv.org/abs/astro-ph/0206099.
The astropy BLS implementation offers additional conventions for the "power": https://docs.astropy.org/en/stable/timeseries/bls.html. We should (1) be very clear about what exactly we are calculating and returning, so that we avoid confusion, and (2) if possible, allow the user to specify which convention they want returned.
I believe the same is also true for Lomb-Scargle -- we're only returning a single convention. For LS the documentation is clear about which convention we use, but it would be good to allow alternatives.
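For Lomb-Scargle at least, astropy documents simple relations between its "standard", "model", and "log" normalizations, so a small conversion layer would be cheap to add. A sketch (the function name is illustrative, not cuvarbase API):

```python
import numpy as np


def convert_power(p_standard, convention):
    """Convert from the 'standard' normalization (0 <= p < 1) to the
    other conventions documented for astropy's LombScargle."""
    if convention == "standard":
        return p_standard
    if convention == "model":
        return p_standard / (1.0 - p_standard)
    if convention == "log":
        # log1p keeps precision when p_standard is tiny
        return -np.log1p(-p_standard)
    raise ValueError(f"unknown convention: {convention}")


print(convert_power(0.5, "model"))  # -> 1.0
print(convert_power(0.5, "log"))    # -> ln 2 ~ 0.693
```

The "psd" normalization additionally depends on the data weights, so it would need to be computed inside the periodogram rather than converted after the fact.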
People are sometimes interested in finding/modeling signals whose period changes over time (usually slowly, hence the interest is usually limited to a linear derivative of the period).
This is extremely straightforward to add to any phase-folding algorithm; it just requires a simple adjustment of the phase calculation.
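As a sketch of that adjustment (illustrative code, not cuvarbase API), folding with a linearly drifting frequency only changes the phase formula:

```python
import numpy as np


def fold_phase(t, f0, fdot=0.0, t0=0.0):
    """Phase-fold allowing a linear frequency drift:

        phi(t) = f0 * (t - t0) + 0.5 * fdot * (t - t0)**2   (mod 1)

    Since f = 1/P, a linear period derivative Pdot corresponds to
    fdot = -f0**2 * Pdot; fdot = 0 recovers the usual constant-period fold.
    """
    dt = np.asarray(t) - t0
    return (f0 * dt + 0.5 * fdot * dt * dt) % 1.0


t = np.linspace(0.0, 1000.0, 8)
print(fold_phase(t, f0=0.137))              # constant-period fold
print(fold_phase(t, f0=0.137, fdot=-2e-7))  # period slowly increasing
```

The extra quadratic term is all a phase-folding kernel would need; the search then gains one more grid dimension (over `fdot`).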
Kevin Burdge (MIT) has discovered a memory leak when using `cuvarbase.bls.eebls_gpu_fast`. To reproduce the problem, make a file called `memory_leak_test.py` containing the following code (the only dependencies are `numpy` and `cuvarbase`):
```python
import numpy as np
import cuvarbase.bls as bls


def run_BLS(t, y, dy):
    # set up search parameters
    search_params = dict(qmin=1e-2,
                         qmax=0.12,
                         # The logarithmic spacing of q
                         dlogq=0.1,
                         # Number of overlapping phase bins
                         # to use for finding the best phi0
                         noverlap=1)

    # derive baseline from the data for consistency
    baseline = max(t) - min(t)

    # df ~ qmin / baseline
    df = search_params['qmin'] / baseline
    fmin = 4 / baseline
    fmax = fmin + 1000000 * df

    nf = int(np.ceil((fmax - fmin) / df))
    freqs = fmin + df * np.arange(nf)

    bls_power = bls.eebls_gpu_fast(t, y, dy, freqs,
                                   **search_params)


baseline = 1000.
n_obs = 100

rand = np.random.RandomState(1)
t = baseline * np.sort(rand.rand(n_obs))
dy = 1 + 0.05 * rand.randn(n_obs)
y = rand.randn(n_obs)

for i in range(100):
    print(i)
    run_BLS(t, y, dy)
```
Running

```
mprof run --interval 0.01 --include-children python memory_leak_test.py
```

and subsequently running

```
mprof plot --output memory-profile.png
```

will produce a plot which clearly demonstrates the memory leak, since each iteration of the for loop should begin with approximately the same amount of memory.
Hi,
I'm trying to install cuvarbase under Debian 10 amd64, which uses Python 3.7.3.
`pip3 install cuvarbase` fails because pycuda 2017.1.1 will not install. I've also tried building pycuda-2017.1.1 from source using the instructions at https://johnh2o2.github.io/cuvarbase/install.html, and that also fails.
The same failure is described in inducer/pycuda#184 and was fixed with pycuda release 2018.1.1.
I can install pycuda 2019.1.2 perfectly fine using pip, but then I cannot install cuvarbase either from pip or from the git version, as it specifically requires pycuda==2017.1.1.
Is there a way to use cuvarbase with pycuda 2019.1.2?
Regards,
Paul.