inducer / boxtree

Quad/octree building for FMMs in Python and OpenCL

Python 100.00%
python opencl pyopencl parallel-computing shared-memory parallel-algorithm quadtree octree fmm fast-multipole-method

boxtree's People

Contributors

alexfikl, dependabot[bot], gaohao95, inducer, isuruf, mattwala, xywei

boxtree's Issues

Density-likes should not be wrangler-global

Specifically, this code here:

# FIXME: dipole_vec shouldn't be stored here! Otherwise, we'll recompute
# bunches of tree-dependent stuff for every new dipole vector.
# It's not super bad because the dipole vectors are typically geometry
# normals and thus change about at the same time as the tree... but there's
# still no reason for them to be here.
self.use_dipoles = dipole_vec is not None
if self.use_dipoles:
    assert dipole_vec.shape == (self.dim, self.tree.nsources)
    if not dipoles_already_reordered:
        dipole_vec = self.reorder_sources(dipole_vec)
    self.dipole_vec = dipole_vec.copy(order="F")
else:
    self.dipole_vec = None

should not exist.

cc @gaohao95
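
One way to picture the separation this issue asks for is sketched below. It is only an illustration: `TreeDependentData` and `evaluate` are hypothetical names, not boxtree's API, and plain lists stand in for arrays. The point is that the tree-dependent object is built once, while each dipole vector is passed in per call.

```python
class TreeDependentData:
    """Hypothetical stand-in for expensive state that depends only on the tree."""

    def __init__(self, nsources):
        self.nsources = nsources
        # Stand-in for the user-order -> tree-order source permutation.
        self.user_source_ids = list(reversed(range(nsources)))

    def reorder_sources(self, values):
        return [values[i] for i in self.user_source_ids]


def evaluate(tree_data, dipole_vec=None, dipoles_already_reordered=False):
    """Hypothetical per-call entry point that receives the density-like data."""
    if dipole_vec is None:
        return None
    assert len(dipole_vec) == tree_data.nsources
    if not dipoles_already_reordered:
        dipole_vec = tree_data.reorder_sources(dipole_vec)
    return list(dipole_vec)


tree_data = TreeDependentData(nsources=3)      # built once per tree
first = evaluate(tree_data, [1.0, 2.0, 3.0])   # new dipole vector, same tree data
second = evaluate(tree_data, [4.0, 5.0, 6.0])  # no tree-dependent recomputation
```

With this split, changing the dipole vector never forces the tree-dependent object to be rebuilt.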

generated kernels require >48k shared data

After installing boxtree, I tried running the included example (demo.py), and got the following error during compilation of an OpenCL kernel:

Traceback (most recent call last):
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/pytools-2014.3.5-py3.4.egg/pytools/__init__.py", line 467, in wrapper
    return getattr(self, cache_dict_name)[key]

AttributeError: 'TreeBuilder' object has no attribute '_memoize_dic_get_kernel_info'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "demo.py", line 29, in <module>
    tree, _ = tb(queue, particles, max_particles_in_box=30)
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/boxtree-2013.1-py3.4.egg/boxtree/tree_build.py", line 170, in __call__
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/pytools-2014.3.5-py3.4.egg/pytools/__init__.py", line 469, in wrapper
    result = method(self, *args, **kwargs)
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/boxtree-2013.1-py3.4.egg/boxtree/tree_build.py", line 71, in get_kernel_info
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/boxtree-2013.1-py3.4.egg/boxtree/tree_build_kernels.py", line 1218, in get_tree_build_kernel_info
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/pyopencl-2015.1-py3.4-linux-x86_64.egg/pyopencl/scan.py", line 1036, in __init__
    self.finish_setup()
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/pyopencl-2015.1-py3.4-linux-x86_64.egg/pyopencl/scan.py", line 1125, in finish_setup
    use_bank_conflict_avoidance=use_bank_conflict_avoidance)
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/pyopencl-2015.1-py3.4-linux-x86_64.egg/pyopencl/scan.py", line 1277, in build_scan_kernel
    prg = cl.Program(self.context, scan_src).build(self.options)
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/pyopencl-2015.1-py3.4-linux-x86_64.egg/pyopencl/__init__.py", line 218, in build
    options=options, source=self._source)
  File "/home/lee8rx/anaconda/envs/py34/lib/python3.4/site-packages/pyopencl-2015.1-py3.4-linux-x86_64.egg/pyopencl/__init__.py", line 258, in _build_and_catch_errors
    raise err
pyopencl.RuntimeError: clBuildProgram failed: invalid binary - 

Build on <pyopencl.Device 'GeForce GTX 670' on 'NVIDIA CUDA' at 0x132bc80>:

ptxas error   : Entry function 'scan_scan_intervals_lev1' uses too much shared data (0xc0bc bytes, 0xc000 max)

It appears that the kernel requires slightly more than the 48k shared memory limit on my GPU (Nvidia GTX 670). Is there any relatively simple way to help limit the shared memory usage of the kernel in this case? I am not entirely certain whether the error is the fault of boxtree or pyopencl itself, but any feedback on how to avoid the issue would be appreciated.

If I modify the example to run with dims = 3 instead, the kernels do compile and run correctly for that case.
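
To see how far over the limit the kernel is, the two hexadecimal figures in the ptxas message can be decoded. The snippet below is a small sketch that parses the message quoted above; the regular expression is written for this one message format and is not a general ptxas parser.

```python
import re

msg = (
    "ptxas error   : Entry function 'scan_scan_intervals_lev1' uses too much "
    "shared data (0xc0bc bytes, 0xc000 max)"
)

# Pull out the two hexadecimal byte counts: requested and maximum.
match = re.search(r"\((0x[0-9a-fA-F]+) bytes, (0x[0-9a-fA-F]+) max\)", msg)
used, limit = (int(group, 16) for group in match.groups())
overage = used - limit

print(f"requested {used} bytes, limit {limit} bytes, over by {overage} bytes")
```

The kernel asks for 49340 bytes against a 49152-byte (48 KiB) limit, so it is over by only 188 bytes.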

python>=3.9 and numpy>=1.22 required

Both are needed because of this line:

coord_dtype: np.dtype[Any], dimensions: int) -> np.dtype[Any]:

With Python < 3.9, I get:

  File "/home/idf2/miniconda3/envs/py38/lib/python3.8/site-packages/boxtree/tools.py", line 919, in <module>
    coord_dtype: np.dtype[Any], dimensions: int) -> np.dtype[Any]:
TypeError: Type subscription requires python >= 3.9

With numpy < 1.22, I get:

>>> np.dtype[Any]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'numpy._DTypeMeta' object is not subscriptable
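
Both errors come from evaluating the subscripted annotation when the function is defined. A possible workaround, sketched here and not necessarily what the maintainers chose, is to quote the annotation so it is stored as a string and never evaluated at definition time. `LegacyDType` is a hypothetical stand-in for a `np.dtype` that does not support `[...]`, and `make_dtype` is an illustrative function name.

```python
from typing import Any


class LegacyDType:
    """Hypothetical stand-in for a class that does not support ``[...]``."""


def subscription_fails() -> bool:
    # Evaluating the subscription directly raises TypeError, as in the report.
    try:
        LegacyDType[Any]
    except TypeError:
        return True
    return False


# Quoted annotations are kept as plain strings, so this definition succeeds
# even though evaluating ``LegacyDType[Any]`` would not.
def make_dtype(coord_dtype: "LegacyDType[Any]",
               dimensions: int) -> "LegacyDType[Any]":
    return coord_dtype
```

Placing `from __future__ import annotations` at the top of the module has a similar effect for every annotation in that file.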

Auto-balance the FMM

It'd be nice if we had a mechanism for automatically balancing the FMM, maybe by automatic timing runs.
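
A rough sketch of what "automatic timing runs" could look like follows. Everything in it is an assumption for illustration: `pick_fastest` is a hypothetical helper, the tuned knob might be something like `max_particles_in_box`, and `run` stands for whatever callable performs one full FMM evaluation for a given setting.

```python
import time


def pick_fastest(run, candidates, repeats=3):
    """Time ``run(value)`` for each candidate and return the fastest one.

    ``run`` is a hypothetical callable that performs one evaluation with the
    given parameter value. The best of ``repeats`` timings is kept per value
    to damp noise.
    """
    timings = {}
    for value in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run(value)
            best = min(best, time.perf_counter() - start)
        timings[value] = best
    fastest = min(timings, key=timings.get)
    return fastest, timings


# Toy workload whose cost grows with the parameter, standing in for an FMM run.
fastest, timings = pick_fastest(lambda n: sum(range(n)), [10, 2_000_000])
```

A real balancer would probably time the individual FMM stages and not just the total, since the goal is to even out direct evaluation against expansion work.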

Quantify memory usage

The question here is effectively, "what is the biggest calculation we can fit on Titan V? (i.e. how many particles can we fit into 12 GB?)"

If that number is embarrassingly low, say below a few hundred million (@isuruf suggested as much, reporting that 100M dies in double), we should account for what the biggest memory hogs are.
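
As a starting point for that accounting, here is a back-of-the-envelope sketch. It assumes three spatial dimensions and treats 12 GB as 12 * 10**9 bytes; neither assumption comes from the issue itself.

```python
nparticles = 100_000_000       # the 100M case reported to fail
dim = 3                        # assumption: 3D coordinates
bytes_per_double = 8

# Memory for the particle coordinates alone, in double precision.
coord_bytes = nparticles * dim * bytes_per_double

device_bytes = 12 * 10**9      # assumption: 12 GB counted in decimal units
budget_per_particle = device_bytes / nparticles

print(f"coordinates: {coord_bytes / 1e9:.1f} GB")
print(f"total budget: {budget_per_particle:.0f} bytes per particle")
```

Under these assumptions the coordinates take 2.4 GB, and the whole device offers 120 bytes per particle, or 15 doubles, for coordinates, tree data, and everything else combined.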

Cannot install from PIP

Hi, using

sudo pip install boxtree

returns the following:

Downloading/unpacking boxtree
Could not find any downloads that satisfy the requirement boxtree
Cleaning up...
No distributions at all found for boxtree
Storing debug log for failure in /home/mrea1g12/.pip/pip.log

On Ubuntu 14.04, 64-bit. Any thoughts?

Tree build crashes if no sources present

  return DirectionalSourceDerivative(
/home/kirby/Code/firedrake-complex/src/sumpy/sumpy/kernel.py:1316: UserWarning: specified the name of the direction vector
  return type(kernel)(
/home/kirby/Code/firedrake-complex/src/pytential/pytential/qbx/refinement.py:476: RefinerNotConvergedWarning: QBX layer potential source refiner did not terminate after 0 iterations (the maximum). You may call 'refine_geometry_collection()' manually and pass 'visualize=True' to see what area of the geometry is causing trouble. If the issue is disturbance of expansion disks, you may pass a slightly increased value (currently: 0.025) for 'expansion_disturbance_tolerance'. As a last resort, you may use Python's warning filtering mechanism to not treat this warning as an error. The criteria triggering refinement in each iteration were: . 
  warn(
/home/kirby/Code/firedrake-complex/lib/python3.8/site-packages/numpy/core/fromnumeric.py:86: VisibleDeprecationWarning: Creating an ndarray from nested sequences exceeding the maximum number of dimensions of 32 is deprecated. If you mean to do this, you must specify 'dtype=object' when creating the ndarray.
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
/home/kirby/Code/firedrake-complex/src/pytential/pytential/qbx/refinement.py:476: RefinerNotConvergedWarning: QBX layer potential source refiner did not terminate after 0 iterations (the maximum). You may call 'refine_geometry_collection()' manually and pass 'visualize=True' to see what area of the geometry is causing trouble. If the issue is disturbance of expansion disks, you may pass a slightly increased value (currently: 0.025) for 'expansion_disturbance_tolerance'. As a last resort, you may use Python's warning filtering mechanism to not treat this warning as an error. The criteria triggering refinement in each iteration were: . 
  warn(
/home/kirby/Code/firedrake-complex/src/loopy/loopy/target/execution.py:193: ParameterFinderWarning: Unable to generate code to automatically find 'nunit_dofs' from the shape of 'result':
division with remainder in linear solve for 'nunit_dofs'
  warn("Unable to generate code to automatically "
/home/kirby/Code/firedrake-complex/src/boxtree/boxtree/tree_build.py:380: RuntimeWarning: overflow encountered in subtract
  bbox["max_"+ax] - bbox["min_"+ax]
Traceback (most recent call last):
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 822, in _get_qbx_discretization
    discr = self._get_discr_from_cache(geometry, discr_stage)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 782, in _get_discr_from_cache
    raise KeyError(
KeyError: "cached discretization does not exist on '{geometry}'for stage '{discr_stage}'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/qbx/refinement.py", line 893, in _refine_for_global_qbx
    discr, conn = get_from_cache(*ds)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/qbx/refinement.py", line 863, in get_from_cache
    discr = places._get_discr_from_cache(geometry, to_ds)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 782, in _get_discr_from_cache
    raise KeyError(
KeyError: "cached discretization does not exist on '{geometry}'for stage '{discr_stage}'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mi_decoupled_nl.py", line 241, in <module>
    Bctx.dlp.get_pot_and_grad(x, complex(kt))
  File "/home/kirby/Documents/mibc/code/laypot.py", line 97, in get_pot_and_grad
    op_grad_x = self.bound_op_and_grad(self.actx, u=density, k=kappa)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 1157, in __call__
    return self.eval(kwargs, array_context=array_context)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 1131, in eval
    return self.code.execute(exec_mapper)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/compiler.py", line 404, in execute
    self.get_exec_function(insn, exec_mapper)(
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 313, in exec_assign
    return [(name, evaluate(expr))
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 313, in <listcomp>
    return [(name, evaluate(expr))
  File "/home/kirby/Code/firedrake-complex/lib/python3.8/site-packages/pymbolic/mapper/__init__.py", line 129, in __call__
    return method(expr, *args, **kwargs)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 279, in map_interpolation
    conn = self.places.get_connection(expr.from_dd, expr.to_dd)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 853, in get_connection
    return connection_from_dds(self, from_dd, to_dd)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/dof_connection.py", line 234, in connection_from_dds
    to_discr = places.get_discretization(to_dd.geometry, to_dd.discr_stage)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 881, in get_discretization
    return self._get_qbx_discretization(geometry, discr_stage)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/symbolic/execution.py", line 829, in _get_qbx_discretization
    _refine_for_global_qbx(self, dofdesc,
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/qbx/refinement.py", line 895, in _refine_for_global_qbx
    discr, conn = _refine_qbx_stage2(
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/qbx/refinement.py", line 736, in _refine_qbx_stage2
    tree = wrangler.build_tree(places,
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/qbx/utils.py", line 129, in build_tree
    return build_tree_with_qbx_metadata(
  File "/home/kirby/Code/firedrake-complex/lib/python3.8/site-packages/pytools/__init__.py", line 2511, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/kirby/Code/firedrake-complex/src/pytential/pytential/qbx/utils.py", line 321, in build_tree_with_qbx_metadata
    tree, evt = tree_builder(queue, particles,
  File "/home/kirby/Code/firedrake-complex/src/boxtree/boxtree/tree_build.py", line 461, in __call__
    assert nboxes_guess > 0
AssertionError
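
Until the tree builder handles this case itself, a caller could check for empty input first and fail with a clearer message than the bare assertion. The helper below is a hypothetical sketch, not part of boxtree or pytential. It assumes the particles arrive as one coordinate sequence per axis and uses plain lists in place of device arrays.

```python
def require_particles(particles):
    """Return the particle count, or raise if there is nothing to build from.

    ``particles`` is assumed to be a sequence with one coordinate sequence
    per axis, all of the same length.
    """
    if len(particles) == 0 or any(len(axis) == 0 for axis in particles):
        raise ValueError("cannot build a tree without particles")
    return len(particles[0])


nparticles = require_particles([[0.0, 0.5, 1.0], [0.0, 0.5, 1.0]])

try:
    require_particles([[], []])
except ValueError as exc:
    empty_error = str(exc)
```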

GPU tests started failing on K40

PYOPENCL_TEST=nvi:k40 python test_fmm.py 'test_fmm_completeness(cl._csc, 3, 50000, 40000, "", p_normal, p_normal, None)' 

seems to fail somewhat reliably. Strangely, this happens with a fairly ancient set of versions: 88bdb3a for boxtree, and f999323804b6df44abf7da1181e66fed831f86f7 for pyopencl. It happens in the same fashion with newer code. The timing coincides with an Nvidia driver upgrade on those machines. I've tried rolling back to an older driver, but I can't quite reconstruct what version we were on.

cc @mattwala
