mikegrudic / pytreegrav Goto Github PK

View Code? Open in Web Editor NEW

57.0 57.0 8.0 2.3 MB

Fast N-body gravitational force and potential in Python

License: MIT License

Python 88.22% TeX 11.78%

pytreegrav's People

Contributors

Stargazers

Watchers

Forkers

austinorr ebortolas quatrope gianuzzi bwkeller agurvich jingyaodou

pytreegrav's Issues

Units of positions, masses and accelrations?

Hi Mike

What units are the positions and masses expected to be in and what are the units of the returned acceleration?

Kind Regards
Jordan

pykdgrav import potential fail

Hi,

Apparently, pykdgrav requires a non-standard module,
numba, which is not automatically installed when installing
pykdgrav via pip, e.g.:

pip install pykdgrav

Unfortunately,

pip install numba

fails.

Any hints on how to get a full, working installation of pykdgrav
would be much appreciated.

Thor.

Segmentation fault with `Potential()`

Hi, I'm very thankful to you for making pytreegrav!
This time, I have come across an issue with pytreegrav. I'm happy if I could get some help.

Problem

I have received Segmentation fault from Potential()

What is mysterious is that larger size of arrays don't lead to Segmentation fault, while smaller size of arrays do lead to.

What I did & received

The code outputs

Kinetic complete.  
Segmentation fault (core dumped)

To identify the place producing Segmentation fault, I put print("Kinetic complete.") and print("Potential complete."). I received the first but not the latter, so I think Potential() causes Segmentation fault.
I added del and gc.collect() and it doesn't make any change.
I tried sys.settrace and gdb python refering to https://stackoverflow.com/questions/10035541/what-causes-a-python-segmentation-fault. However I couldn't use it well.
I just put sys.settrace(None) in the first block of my code, and (gdb) backtrace returned

#0  0x00007fffbd5e6bd0 in pytreegrav::octree::Octree::BuildTree$2415(instance::jitclass::Octree$237fffd6b84c10$3cSizes$3aarray$28float64$2c$201d$2c$20A$29$2cDeltas$3aarray$28float64$2c$201d$2c$20A$29$2cCoordinates$3aarray$28float64$2c$202d$2c$20A$29$2cMasses$3aarray$28float64$2c$201d$2c$20A$29$2cQuadrupoles$3aarray$28float64$2c$203d$2c$20A$29$2cHasQuads$3abool$2cNumParticles$3aint64$2cNumNodes$3aint64$2cSoftenings$3aarray$28float64$2c$201d$2c$20A$29$2cNextBranch$3aarray$28int64$2c$201d$2c$20A$29$2cFirstSubnode$3aarray$28int64$2c$201d$2c$20A$29$2cTreewalkIndices$3aarray$28int64$2c$201d$2c$20A$29$3e, Array<double, 2, C, mutable, aligned>, Array<double, 1, C, mutable, aligned>, Array<double, 1, C, mutable, aligned>) ()

#1  0x00007fffbd00a5de in pytreegrav::octree::Octree::__init__$2414(instance::jitclass::Octree$237fffd6b84c10$3cSizes$3aarray$28float64$2c$201d$2c$20A$29$2cDeltas$3aarray$28float64$2c$201d$2c$20A$29$2cCoordinates$3aarray$28float64$2c$202d$2c$20A$29$2cMasses$3aarray$28float64$2c$201d$2c$20A$29$2cQuadrupoles$3aarray$28float64$2c$203d$2c$20A$29$2cHasQuads$3abool$2cNumParticles$3aint64$2cNumNodes$3aint64$2cSoftenings$3aarray$28float64$2c$201d$2c$20A$29$2cNextBranch$3aarray$28int64$2c$201d$2c$20A$29$2cFirstSubnode$3aarray$28int64$2c$201d$2c$20A$29$2cTreewalkIndices$3aarray$28int64$2c$201d$2c$20A$29$3e, Array<double, 2, C, mutable, aligned>, Array<double, 1, C, mutable, aligned>, Array<double, 1, C, mutable, aligned>, bool, bool) ()

#2  0x00007fffbcfe8151 in $3cdynamic$3e::ctor$2413(Array<double, 2, C, mutable, aligned>, Array<double, 1, C, mutable, aligned>, Array<double, 1, C, mutable, aligned>, bool, bool) ()

#3  0x00007fffbcfe863c in cpython::$3cdynamic$3e::ctor$2413(Array<double, 2, C, mutable, aligned>, Array<double, 1, C, mutable, aligned>, Array<double, 1, C, mutable, aligned>, bool, bool) ()

... (long backtrace)  

#46 0x000055555573afe5 in _start () at ../sysdeps/x86_64/elf/start.S:103

(gdb) info locals returned No symbol table info available.

Script

It's long, so I cut off some parts of the code.

from pytreegrav import Potential as ptgPotential

...(cut)

for j,i in enumerate(snaplist):
    with h5py.File(datafile, 'r') as f:
        pos_DM = np.asarray(f["PartType1/Coordinates"])
        vel_DM = np.asarray(f["PartType1/Velocities"], dtype="float64") * 1e-2  # kpc/Myr
        mas_DM = np.asarray(f["PartType1/Masses"]) * 1e8  # M_sun
        IDs_DM = np.asarray(f["PartType1/ParticleIDs"], dtype="int32")
    
    # Withdraw satellite particles having been bound in previous snapshot
#------------------------------#
#    pos_DM_sate_bnd = pos_DM
#    vel_DM_sate_bnd = vel_DM
#    mas_DM_sate_bnd = mas_DM
#    IDs_DM_sate_bnd = IDs_DM
#    num_DM_sate_bnd = len(IDs_DM_sate_bnd)
#------------------------------#
    idxs_DM_sate_bnd = np.empty_like(IDs_DM_sate_bnd)
    libc1.get_bound_idxs(IDs_DM, IDs_DM_sate_bnd, idxs_DM_sate_bnd)
    pos_DM_sate_bnd = pos_DM[idxs_DM_sate_bnd,:]
    vel_DM_sate_bnd = vel_DM[idxs_DM_sate_bnd,:]
    mas_DM_sate_bnd = mas_DM[idxs_DM_sate_bnd]
    IDs_DM_sate_bnd = IDs_DM[idxs_DM_sate_bnd]
    num_DM_sate_bnd = len(idxs_DM_sate_bnd)
#------------------------------#
    del pos_DM; del vel_DM; del mas_DM; del IDs_DM; gc.collect()


# Not important section from here...  
    # Calculate BcV
    BcV_DM_sate_bnd = calc_CoM(vel_DM_sate_bnd, num_DM_sate_bnd)

    # Calculate kinetic energies
    num_DM_sate_bnd_ctp = ctypes.c_int(num_DM_sate_bnd)
    mass_sate_ctp = ctypes.c_double(mas_DM_sate_bnd[0])
    kin_ene = np.empty(num_DM_sate_bnd, dtype="float64")
    libc1.calc_kin_ene( vel_DM_sate_bnd - BcV_DM_sate_bnd, mass_sate_ctp, num_DM_sate_bnd_ctp, kin_ene)
    del vel_DM_sate_bnd; gc.collect()
    print("Kinetic complete.")
# to here.
    

    # Calculate potential energies
    eps_DM = np.repeat(eps, num_DM_sate_bnd)
    pot_ene = ptgPotential(pos_DM_sate_bnd, mas_DM_sate_bnd, softening=eps_DM, G=4.493e-12, theta=.6, parallel=True, method="tree")
    print("Potential complete.")

...

Helps for you to read the script

The function ptgPotential is a calculator of potential energy, and is also the causer of Segmentation fault.
The flow is following:
1. Read the particle data.
2. Pick out a part of the particles using idxs_DM_sate_bnd. If I use the comment-out part (switch the section #-----#), the code uses all particles.
  2.5. (Not important) Calculate the baryocentric velocity (BcV).
3. (Not important) Calculate kinetic energies of each particles.
4. Calculate potential energies of each particles.

Context

This is an analysis code for N-body gravitational simulation.
The number of DM particles (N) = 23075000 (~10^7).

I want to calculate bounding energy (= kinetic energy - potential energy).

I use my own C library to calculate kinetic energy, and it works with no problem.

For potential energy, I use a function of pytreegrav: Potential(). And it seems to cause Segmentation fault.

What is mysterious, is that it doesn't raise Segmentation fault with all (N~10^7) particles while it raises if I pick out and use N=8075000 (~10^6) particles.

I suspect that it's not pytreegrav's fault, but I put this question here just in case.

Environment

The machine is a cluster of an institute. I don't have access to the details of OS.

Linux
Python 3.8.8 (Anaconda3-2022.05-Linux-x86_64)
Numpy 1.20.1
Numba 0.53.1

JOSS review (adrn)

Ref: openjournals/joss-reviews#3675

Hi @mikegrudic 👋 ! Overall things look good here -- I'm looking forward to using this package myself! -- but I do have some suggestions and comments below.

Installation

I did not find any instructions for installation. I recommend that the authors add an "Installation" section to the README or documentation (see below), which could either contain the necessary information or link to an INSTALL file that describes how to install the package. For an example: https://github.com/adrn/gala#installation-and-dependencies

(I was able to pip install pytreegrav, which successfully installed a wheel)

Functionality

I successfully ran the walkthrough code on my machine after installing the package.

Performance

I have verified the scaling claims (Figure 1 in JOSS article) on my machine (MacBook Pro laptop).

Documentation

The documentation is in the form of a section of the repository README. My main comment would be to separate the documentation from the README. At maximum, switch to using a documentation engine like Sphinx or MkDocs built and served on a service like Readthedocs, and link to this documentation from your README. At minimum, I would recommend making a docs/ directory, moving your current README.ipynb to docs/walkthrough.ipynb, and link to this from the README.txt file. I would also recommend removing the code/walkthrough from the README.txt itself and instead link to the walkthrough IPython notebook (so you avoid duplicating that text/code).

Other comments:

I don't see any mention of what profile is used to soften the point mass potentials when h is provided. Is there a standard profile to use (i.e. Plummer?), or is this configurable? It would be good to explicitly state this in the documentation.
As mentioned above, at minimum, please add an "Installation" section to the README that links to a file that contains installation instructions (even if it just states explicitly that the way to install is with pip)
Some users might find it useful to have API documentation, so it's worth considering whether you want to write this yourself or build it automatically with an automatic documentation build system.
Please add some documentation of how a user can run tests of the package. It looks like there is a minimal test in tests/test.py, but I would recommend switching to a pytest or nosetests-compatible test layout.
It looks like there is an associated github pages build of the README: Please link to this page (maybe with a badge link?) at the top of your README, as this serves as your main documentation.
Please add some documentation of how other users can contribute or engage with the developers. In particular, how do users 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support. An example statement: http://gala.adrian.pw/en/latest/contributing.html

General comments

Many repositories on GitHub (including GitHub itself) are switching from a default branch named "master" to "main" (see, e.g., https://github.com/github/renaming). You might want to consider renaming the default branch (but it's up to you!).
My understanding from reading the README and walkthrough is that the primary use case for this code is for re-analyzing simulation output (i.e. not for users who actually want to run simulations, as the code operates at the Python level, using numba to accelerate things). It might be worth having a few scientific case studies in the walkthrough that put the utilities in this package in context of some "real world" examples.
Though the details of code style can be subjective, Python packages generally at least follow the convention (laid out in PEP 8) that class names follow the CapWords convention and function names are lower case. Please consider reformatting the functions in this package to follow PEP8 guidelines.

octree force walk is not actually parallelizable???

CPU usage is consistent with threads being launched and doing work but timing is 2x worse when you use GetAccelParallel from pykdgrav.pykdgrav.octree

Large Memory Use Resulting in Crash

Hi,

I've been running into problems with large memory use. The code works well for a test case which I ran where 10 million particles within a unit cube and random masses were used:

Interestingly, the code crashes when I pass in particle positions and masses for approximately 8.6 million particles.

It produces the following error:

The server I am running on has approximately 129Gbs of ram. The thread memory use grows gradually until it has filled up the ram at which point it crashes. I am unsure why this is the case when the problem does not occur with the 10 million particle simulation. I initially thought the problem was to do with the particles not lying within a unit cube, and the particle masses not being between 0 and 1, but even after rescaling them so that this is the case, the error is still produced.

Any tips/suggestions would be greatly appreciated. If there is any important information I have missed don't hesitate to let me know!

Thanks in advance,
Geoff

calculations with nan inputs silently enter an infinite loop

thanks to Alex G for discovering this. will address soon.

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.