frbs / sigpyproc3

Python3 version of Ewan Barr's sigpyproc library

Home Page: https://sigpyproc3.readthedocs.io

License: MIT License

Python 100.00%
filterbank pulsars fast-radio-bursts radio-astronomy

sigpyproc3's Introduction

sigpyproc

sigpyproc is a pulsar and FRB data analysis library for Python. It provides an OOP approach to pulsar data handling through the use of objects representing different data types (e.g. SIGPROC filterbank, PSRFITS, time series, Fourier series, etc.). As pulsar data processing is often time critical, speed is maintained using the excellent numba library.

Installation

The quickest way to install the package is to use pip:

pip install -U git+https://github.com/FRBs/sigpyproc3

Note that you will need Python (>=3.8) installed to use sigpyproc. Also check out the installation documentation page for more options.

Legacy Version

sigpyproc is currently undergoing major development that will modify the existing API in order to make it a modern Python replacement for SIGPROC. To use the older API, you can install the legacy branch of this repo, or install the last released version, 0.5.5.

Usage

from sigpyproc.readers import FilReader, PFITSReader

fil = FilReader("tutorial.fil")
fits = PFITSReader("tutorial.fits")

Check out the tutorials and API docs on the docs page for example usage and more info.

Contributing

Check out the developer documentation for more info about getting started.

sigpyproc3's People

Contributors

david-mckenna, ewanbarr, kmjc, mserylak, pravirkr, telegraphic, vivgastro, wfarah

sigpyproc3's Issues

Compilation issues on Mac OSX

python setup.py install fails on Mac OSX (10.15.6, but it should be the same in all recent versions) with

clang: error: unsupported option '-fopenmp'
error: command 'gcc' failed with exit status 1

It is easy to work around (the system's compiler is not OpenMP-enabled, so one needs to point to another gcc, e.g. one installed with Homebrew), but it might be good to have a recommended Mac OSX installation procedure somewhere in the repo or the docs. I don't know what the cleanest and easiest solution is.

I installed fftw and a new gcc with Homebrew, and then ran

CC="/usr/local/bin/gcc-10 -I/usr/local/include -L/usr/local/lib" python setup.py install

read_dedisp_block gives only the lowest frequency channel copied into all channels

I realised that I can just read a block and dedisperse it afterwards, but I thought I would report it anyway.

from sigpyproc.readers import FilReader

fil = FilReader(file)  # `file` is the path to a filterbank file
data = fil.read_block(0, fil.header.nsamples)
data_dd = fil.read_dedisp_block(0, fil.header.nsamples, 0)  # last argument: DM = 0
print(data_dd == data_dd[0])  # all rows are the same as the 0th
print(data_dd[0] == data)  # The 0th row is the same as the last row in the data.

Output

FilterbankBlock([[ True,  True,  True, ...,  True,  True,  True],
                 [ True,  True,  True, ...,  True,  True,  True],
                 [ True,  True,  True, ...,  True,  True,  True],
                 ...,
                 [ True,  True,  True, ...,  True,  True,  True],
                 [ True,  True,  True, ...,  True,  True,  True],
                 [ True,  True,  True, ...,  True,  True,  True]])
FilterbankBlock([[False, False, False, ..., False, False, False],
                 [False, False, False, ..., False, False, False],
                 [False, False, False, ..., False, False, False],
                 ...,
                 [False, False, False, ..., False, False, False],
                 [False, False, False, ..., False, False, False],
                 [ True,  True,  True, ...,  True,  True,  True]])
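
For reference, dedispersing at DM = 0 should be a no-op: every channel delay is zero, so the output should equal the block returned by read_block. The following is a minimal pure-Python sketch of incoherent dedispersion, not sigpyproc's kernel. The function names, the block[channel][sample] layout and the example frequencies are all assumptions made for illustration; the only physics used is the standard cold-plasma delay constant.

```python
# Illustrative sketch only; not sigpyproc's implementation.
# All names here are hypothetical.
DM_CONST = 4.148808e3  # dispersion constant in s MHz^2 pc^-1 cm^3


def channel_delays(freqs_mhz, dm, tsamp):
    """Delay of each channel relative to the highest frequency, in samples."""
    f_ref = max(freqs_mhz)
    return [
        round(DM_CONST * dm * (f ** -2 - f_ref ** -2) / tsamp)
        for f in freqs_mhz
    ]


def dedisperse(block, freqs_mhz, dm, tsamp):
    """Shift each channel earlier by its delay; block is block[chan][samp]."""
    delays = channel_delays(freqs_mhz, dm, tsamp)
    nout = len(block[0]) - max(delays)
    return [chan[d:d + nout] for chan, d in zip(block, delays)]


freqs = [1400.0, 1300.0, 1200.0]
block = [[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]

# With DM = 0 every delay is zero, so the output must equal the input
# and the channels must stay distinct.
same = dedisperse(block, freqs, dm=0.0, tsamp=1e-3)
```

A check like `same == block` is the kind of regression test that would catch the behaviour reported here.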

downsample is broken

the call here

kernels.downsample_2d(
data, write_ar, tfactor, ffactor, self.header.nchans, nsamps
)

has an extra argument, write_ar, that is not in the function signature:
def downsample_2d(array, tfactor, ffactor, nchans, nsamps):

so doing something like

fil = FilReader(filfname)
fil.downsample(tfactor=2)

gives

downsample :  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% -:--:--
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sigpyproc3/sigpyproc/base.py", line 499, in downsample
    kernels.downsample_2d(
TypeError: too many arguments: expected 5, got 6
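
To make the mismatch concrete, here is a pure-Python stand-in with the same five-parameter signature quoted above. It is a sketch, not the numba kernel: the averaging logic and the flat, time-major layout (sample index times nchans plus channel index) are assumptions. Calling it with six positional arguments fails in the same way as the traceback.

```python
# Hypothetical stand-in for the kernel; only the signature comes from the issue.
def downsample_2d(array, tfactor, ffactor, nchans, nsamps):
    """Average a flat, time-major block over tfactor samples and ffactor channels."""
    out_chans = nchans // ffactor
    out_samps = nsamps // tfactor
    result = [0.0] * (out_chans * out_samps)
    for isamp in range(out_samps):
        for ichan in range(out_chans):
            total = 0.0
            for t in range(isamp * tfactor, (isamp + 1) * tfactor):
                for c in range(ichan * ffactor, (ichan + 1) * ffactor):
                    total += array[t * nchans + c]
            result[isamp * out_chans + ichan] = total / (tfactor * ffactor)
    return result


# 4 samples x 2 channels, time-major: sample 0 = [0, 1], sample 1 = [2, 3], ...
data = [0, 1, 2, 3, 4, 5, 6, 7]
halved = downsample_2d(data, 2, 1, 2, 4)  # average pairs of samples
```

Either the caller drops the extra output array, or the kernel grows a matching parameter; which of the two was intended is not stated in the report.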

Module load time

Loading sigpyproc modules is really slow, even for console scripts that do not need the numba kernels.

This is because of numba compilation times. Maybe moving to numba was a bad idea.
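
One possible mitigation, sketched here with the standard library only, is to defer heavy imports until they are first used, so that scripts which never touch the kernels do not pay for them. The helper name lazy_import is hypothetical and colorsys merely stands in for a slow module; whether this helps in sigpyproc depends on where the time is actually spent.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose body runs only on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # with LazyLoader this only sets up the deferral
    return module


# colorsys stands in for a slow-to-import module.
colorsys = lazy_import("colorsys")
# The module body is executed here, on first attribute access.
hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
```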

Feature: buffered file reads

Issue

Currently the file IO infrastructure in sigpyproc is limited by the fact that each read of the file creates a new buffer. Calls to the sigpyproc.io.fileio.FileReader.cread(...) function can result in two allocations, one for the data read from file and another for the unpacking buffer used in the case of 1, 2 and 4 bit data. While Python may elide some of the performance cost of these buffer allocations via caching, the behaviour is unpredictable.

To reduce the memory performance issues from these re-allocations and to open up interoperability between sigpyproc and tools like pycuda and torch, it would be useful to have more fine-grained control over some of the buffer allocations.

Say we wish to build a sigpyproc pipeline that uses torch. To enable asynchronous memcopies between the host and GPU it is necessary to page-align, lock and register each memory buffer ("pinning" in CUDA parlance). Currently, to do this with a cread() we need to pin a new buffer on each loop. Pinning has a very heavy overhead and so should be avoided at all costs. The general strategy is to pin a buffer once at the beginning of a program and reuse that buffer.

Feature request

I suggest that the read_plan interface (at least on FilReader but maybe elsewhere) be updated to take an allocator method. The allocator method should take a number of bytes as an argument and return an object that exports the Python Buffer Protocol interface (PEP 3118), e.g.

import torch
from collections.abc import Buffer  # Python 3.12+; typing_extensions.Buffer on older versions

# Simple bytearray allocator (probably the default allocator)
def bytearray_allocator(nbytes) -> Buffer:
    return bytearray(nbytes)

# A torch pinned memory allocator
def pinned_allocator(nbytes) -> Buffer:
    buffer = bytearray(nbytes)
    cudart = torch.cuda.cudart()
    tensor = torch.frombuffer(buffer, dtype=torch.int8)
    r = cudart.cudaHostRegister(tensor.data_ptr(), tensor.numel() * tensor.element_size(), 0)
    if not r.success:
        raise RuntimeError(f"Unable to pin memory buffer: {r}")
    return buffer

The new call signature for read_plan would look like:

    def read_plan_buffered(
        self,
        gulp: int = 16384,
        start: int = 0,
        nsamps: int | None = None,
        skipback: int = 0,
        description: str | None = None,
        quiet: bool = False,
        allocator: Callable[[int], Buffer] | None = None,
    ) -> Iterator[tuple[int, int, np.ndarray]]:

The semantics of the call would remain mostly the same. The main difference is that the ndarray returned on each iteration is the same ndarray, just containing different data. This could have some side effects if the behaviour is not understood, e.g. if I push the array from each loop to a list then I end up with a list containing only references to the same object, where updating one updates all.
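
The aliasing pitfall can be shown without sigpyproc at all. This is a minimal standard-library sketch: read_plan_reusing is a hypothetical stand-in for the proposed iterator, io.BytesIO stands in for a file, and a bytearray stands in for whatever the allocator returns.

```python
import io


def read_plan_reusing(fileobj, gulp_bytes):
    """Yield (nbytes_read, view) pairs, refilling one buffer on every pass."""
    buffer = bytearray(gulp_bytes)
    view = memoryview(buffer)
    while True:
        nread = fileobj.readinto(view)
        if not nread:
            break
        yield nread, view[:nread]


source = io.BytesIO(b"aaaabbbbcccc")

# Pitfall: keeping the yielded views keeps three windows onto one buffer,
# so after the loop they all show the contents of the last gulp.
aliased = [view for _, view in read_plan_reusing(source, 4)]
aliased_bytes = [bytes(v) for v in aliased]

# Safe: copy each gulp before the next iteration overwrites it.
source.seek(0)
copied = [bytes(view) for _, view in read_plan_reusing(source, 4)]
```

Callers that need to keep a gulp beyond one iteration would have to copy it explicitly, which is the trade-off for avoiding a fresh allocation per read.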

Other parts of the codebase that would need to change would be:

  • The FileReader class would need a new method creadinto that wraps the existing readinto function in Python's FileIO module.
  • The unpack functions would need to take their unpacked buffers as arguments instead of creating new buffers on every call.
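
The two changes listed above could fit together roughly as follows. This is a sketch under stated assumptions: the creadinto signature, the unpack2_into helper and the bit ordering (lowest two bits hold the first sample) are all hypothetical, and io.BytesIO stands in for the file.

```python
import io


def unpack2_into(packed, unpacked):
    """Unpack 2-bit samples from `packed` into the preallocated `unpacked`.

    Assumes, for illustration, that the lowest two bits hold the first sample.
    """
    if len(unpacked) < 4 * len(packed):
        raise ValueError("output buffer too small")
    for i, byte in enumerate(packed):
        for j in range(4):
            unpacked[4 * i + j] = (byte >> (2 * j)) & 0b11
    return 4 * len(packed)


def creadinto(fileobj, raw, unpacked):
    """Fill `raw` via readinto, then unpack into `unpacked`; no new data buffers."""
    nread = fileobj.readinto(raw)
    return unpack2_into(memoryview(raw)[:nread], unpacked)


raw = bytearray(2)       # allocated once, reused for every read
unpacked = bytearray(8)  # allocated once, reused for every unpack
stream = io.BytesIO(bytes([0b11100100, 0b00011011]))
nsamples = creadinto(stream, raw, unpacked)
```

Because both buffers are passed in, they could equally come from a pinned allocator such as the one proposed above.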

PFITS multi-file

Add support to read multiple contiguous SEARCH-mode PSRFITS files.

UnboundLocalError: local variable 'data' referenced before assignment

When I use PFITSReader to read a FITS file and then call read_block to get the data, I get the following error:

Traceback (most recent call last):
  File "/home/mbs/Desktop/mycode/ASTROSOFT/sigpyproc3/read_fil.py", line 13, in <module>
    c = fil.read_block(start=1000, nsamps=2000)
  File "/home/mbs/Desktop/mycode/ASTROSOFT/sigpyproc3/sigpyproc/readers.py", line 208, in read_block
    data = self._fitsfile.read_subints(startsub, nsubs)
  File "/home/mbs/Desktop/mycode/ASTROSOFT/sigpyproc3/sigpyproc/io/pfits.py", line 412, in read_subints
    sdata = self.read_subint_pol(
  File "/home/mbs/Desktop/mycode/ASTROSOFT/sigpyproc3/sigpyproc/io/pfits.py", line 464, in read_subint_pol
    return data
UnboundLocalError: local variable 'data' referenced before assignment

How can I fix it?
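
An UnboundLocalError on a bare `return data` usually means the variable is assigned only inside conditional branches and none of them ran for this input. The sketch below reproduces that pattern in isolation. The function names and the branch conditions are hypothetical; the actual conditions in pfits.py are not shown in the traceback.

```python
def read_subint_pol_buggy(poln_order):
    """Hypothetical stand-in: `data` is bound only on recognised branches."""
    if poln_order == "AA+BB":
        data = "total intensity"
    elif poln_order == "IQUV":
        data = "stokes I"
    return data  # UnboundLocalError if neither branch ran


def read_subint_pol_fixed(poln_order):
    """Same logic, but unsupported input fails with a clear message."""
    if poln_order == "AA+BB":
        data = "total intensity"
    elif poln_order == "IQUV":
        data = "stokes I"
    else:
        raise ValueError(f"unsupported polarisation order: {poln_order!r}")
    return data
```

With an explicit else branch the reader at least reports which header value it could not handle, which points to the file property that triggers the failure.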

Roadmap discussion

Hi, I am starting this thread to discuss plans for sigpyproc.

Current work

I am refactoring the code in the packaging branch to follow PEP8 and to add type hints. I am also adding more abstraction and replacing the dynamic header class with a more strictly structured one. This is going to break the existing API (function names changed to lower case, etc.). Another change would be to refactor some of the existing code into 3 classes, profile (for a 1D pulse profile), block (for a 2D freq-time spectrum) and cube (for folded data), similar to psrchive. I will also be adding robust S/N estimation (using the pdmp approach).

Future work

FRB simulator

I have plans to integrate @vivgastro's Furby as a class inside sigpyproc (with some additional features and support for UWL-like frequency bands). This will complete sigpyproc as a single-pulse toolbox in the sense that it can generate data/pulses as well as search, visualize and measure properties of those pulses.

PSRFITS support

As @telegraphic suggested, it would be nice to support other formats (e.g., PSRFITS, HDF5). I think we can add support to read those formats into the existing sigpyproc framework. I am not sure if we should also have a unified header (e.g., all PSRFITS keywords) or a writer class as all these formats (at least sigproc and PSRFITS) are completely different. Also, there are existing packages like your working towards this. IMO we should keep the header keywords (~25) defined in the sigproc docs as the base of this package and read other format files into this framework.

For example, we can have from sigpyproc.Readers import FitsReader with all the functionalities of FilReader.

Roadmap

Should we move towards an entirely Python-based package? Most of the functionality in the C++ code (running mean/median, FFT) is readily available through NumPy and SciPy. One issue might be speed and multi-threading, but that can be compensated for using Numba. @telegraphic

We can revive the FRBs/sigproc project to have a fully modern C++ and sigpyproc-like object-oriented framework with proper documentation. The codebase there is very old and can be easily condensed using modern third-party libraries. @evanocathain

pybind11 with openmp

Test properly whether OpenMP works in the pybind11 framework, or whether there is a need to acquire/release the GIL while calling C++ code:
py::gil_scoped_acquire and py::gil_scoped_release

Built OK but cannot import on Mac

Hi,

I have seen past issues on compiling sigpyproc3 on Mac. Here I share the steps that gave me a successful build, but my problem is that I cannot import the installed sigpyproc (sadly). Please advise.

  1. Install OpenMP

brew install libomp

  2. Install clang-omp using Homebrew:

brew install llvm

Add llvm binaries to your path using:

echo 'export PATH="/usr/local/opt/llvm/bin:$PATH"' >> ~/.bash_profile

echo 'export PATH="/usr/local/include:$PATH"' >> ~/.bash_profile

echo 'export PATH="/usr/local/lib:$PATH"' >> ~/.bash_profile

  3. Test clang usage:

clang -fopenmp hello.c -o hello -L /usr/local/lib/

./hello

You can create any simple hello.c file here to test -fopenmp and clang

  4. Linking

ln -s /usr/local/Cellar/gcc/10.2.0/lib/gcc/10/libgomp.spec /usr/local/lib/libgomp.spec
ln -s /usr/local/Cellar/gcc/10.2.0/lib/gcc/10/libgomp.1.dylib /usr/local/lib/libgomp.1.dylib
ln -s /usr/local/Cellar/gcc/10.2.0/lib/gcc/10/libgomp.dylib /usr/local/lib/libgomp.dylib
ln -s /usr/local/Cellar/gcc/10.2.0/lib/gcc/10/libgomp.a /usr/local/lib/libgomp.a

  5. Installation

> pip3 install git+https://github.com/FRBs/sigpyproc3

Collecting git+https://github.com/FRBs/sigpyproc3
  Cloning https://github.com/FRBs/sigpyproc3 to /private/var/folders/62/chn0plln2b37czw0t47n9kd80000gn/T/pip-req-build-83fgta5g
  Running command git clone -q https://github.com/FRBs/sigpyproc3 /private/var/folders/62/chn0plln2b37czw0t47n9kd80000gn/T/pip-req-build-83fgta5g
Requirement already satisfied (use --upgrade to upgrade): sigpyproc==0.5.1 from git+https://github.com/FRBs/sigpyproc3 in /usr/local/lib/python3.8/site-packages
Requirement already satisfied: pybind11>=2.6.0 in /usr/local/lib/python3.8/site-packages (from sigpyproc==0.5.1) (2.6.1)
Requirement already satisfied: numpy in /usr/local/lib/python3.8/site-packages (from sigpyproc==0.5.1) (1.19.4)
Requirement already satisfied: tqdm in /usr/local/lib/python3.8/site-packages (from sigpyproc==0.5.1) (4.53.0)
Building wheels for collected packages: sigpyproc
  Building wheel for sigpyproc (setup.py) ... done
  Created wheel for sigpyproc: filename=sigpyproc-0.5.1-cp38-cp38-macosx_10_15_x86_64.whl size=138411 sha256=6eaf5ceb8639cf1b00d76989f1a6755f6289257663a9f978dbf02329df934f96
  Stored in directory: /private/var/folders/62/chn0plln2b37czw0t47n9kd80000gn/T/pip-ephem-wheel-cache-f8hrvu4u/wheels/16/24/22/1cf298bc509480534c02d09f5529f91c47cb10053eba7b6a12
Successfully built sigpyproc

Problem:

I don't know whether it is installed properly, so I checked available python3 libraries:

>help("modules")

I can find that sigpyproc is inside the list.

However, I cannot import sigpyproc:

> python3
Python 3.8.5 (default, Jul 21 2020, 10:48:26) 
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sigpyproc.Readers import FilReader
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/sigpyproc/__init__.py", line 1, in <module>
    from sigpyproc.Readers import FilReader
  File "/usr/local/lib/python3.8/site-packages/sigpyproc/Readers.py", line 6, in <module>
    from sigpyproc.Utils import File
  File "/usr/local/lib/python3.8/site-packages/sigpyproc/Utils.py", line 5, in <module>
    import sigpyproc.libSigPyProc as lib
ImportError: dlopen(/usr/local/lib/python3.8/site-packages/sigpyproc/libSigPyProc.cpython-38-darwin.so, 2): Symbol not found: ___kmpc_for_static_fini
  Referenced from: /usr/local/lib/python3.8/site-packages/sigpyproc/libSigPyProc.cpython-38-darwin.so
  Expected in: flat namespace
 in /usr/local/lib/python3.8/site-packages/sigpyproc/libSigPyProc.cpython-38-darwin.so
>>> 

I am stuck here. Any advice? Thank you very much!



Best,
Zoe

[Tracker] Implement sigproc CLI tools

Implement python version of sigproc CLI tools.

This is a tracker issue that lists the remaining apps to be added.

List of APIs

IO

  • chaninfo (statistics)
  • chop_fil
  • extract
  • filedit
  • filmerge
  • header
  • shredder
  • splice (dice)
  • zerodm

FRB/Pulsar

  • bandpass
  • best
  • decimate (downsample)
  • dedisperse (tree)
  • dedisperse_all
  • ffa
  • find
  • fold
  • giant
  • peak
  • seek

Pulse Injection

  • fake

telescope id not defined

When a new telescope id is encountered, it should fall back to some default. Currently, it results in a KeyError.
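
The requested fallback could be as small as a dict lookup with a default. This is a sketch only: the table below is an illustrative subset, and the names TELESCOPE_IDS and telescope_name are hypothetical rather than sigpyproc's own.

```python
# Illustrative subset of telescope ids; the real table lives in sigpyproc.
TELESCOPE_IDS = {0: "Fake", 1: "Arecibo", 4: "Parkes", 6: "GBT"}


def telescope_name(telescope_id, default="Unknown"):
    """Look up a telescope name, falling back to a default instead of raising."""
    return TELESCOPE_IDS.get(telescope_id, default)


known = telescope_name(4)
unknown = telescope_name(9999)  # unseen id: no KeyError
```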
