Coder Social home page Coder Social logo

3gx / halloc Goto Github PK

View Code? Open in Web Editor NEW

This project forked from canonizer/halloc

0.0 1.0 0.0 978 KB

A fast and highly scalable GPU dynamic memory allocator

License: Other

Makefile 2.18% Cuda 60.95% C 8.94% C++ 10.79% Shell 1.10% Perl 10.02% Python 6.01%

halloc's Introduction

Halloc GPU memory allocator, version 0.11

INTRO

Halloc is a high-throughput malloc/free-style dynamic memory allocator for
NVidia Kepler GPUs. It is based on using bit arrays to represent free blocks and
using a hash function to quickly search for free blocks. This idea, combined
with clever slab management and performance tuning, enables a really fast
allocator. Halloc achieves more than 1.5 bln. mallocs/s (more than 1
bln. malloc/free pairs/s) on K20X and 16-byte allocations, with tens of
thousands of GPU threads and more than 100MiB allocated. This is much higher
than other state-of-the-art GPU allocators. In addition, halloc's performance is
also more stable. This makes halloc suitable for use in GPGPU applications
requiring fast dynamic memory management. Halloc is mainly designed for small
allocation sizes, and delegates allocations larger than 3KiB to CUDA allocator.


REQUIREMENTS

Software: CUDA 5.0 or higher (tested with 6.5)
Hardware: Compute Capability 2.0 or higher (tested on CC 3.5 devices K20X and K40).

Note: libraries and tests are currently not compiled for compute_50/sm_50, i.e. Maxwell.


COMPILING

To compile halloc library, type (in project's top directory):

	make

To run correctness tests (CAUTION: takes a lot of time!):

	make test

To build correctness tests without running them:

	make build-corr

To build performance tests without running them:
	 
	make build-perf

Performance tests are then located in ./tst/perf/bin directory, and can be
invoked individually, e.g.

	./tst/perf/bin/throughput
	./tst/perf/bin/phase-throughput -f0.95 -F0.05 -e0.91 -g5 -t128

To install, edit PREFIX variable in the makefile to your desired install
directory (default ~/usr) and type:

	make install

To uninstall:

	make uninstall


USING HALLOC 

See samples/ directory for samples using Halloc.

Compiling Your Program

The GPU application then needs to be compiled with halloc static library using
separate device compilation and linking. Assuming that the variable $PREFIX
contains the installation prefix, and myprog.cu is the file being compiled, this
can be done as follows:

  nvcc -arch=sm_35 -O3 -I $(PREFIX)/include -dc myprog.cu -o myprog.o
	nvcc -arch=sm_35 -O3 -L $(PREFIX)/lib -lhalloc -o myprog myprog.o


Halloc API 

The functions defined by halloc are in the halloc.h file, which needs to be
included into your code to use halloc:

#include <halloc.h>

Before using halloc, in device functions it has to be initialized with ha_init()
function:

void ha_init(halloc_opts_t opts = halloc_opts_t());

It can be given a full halloc_opts_t structure to control fine halloc
parameters, such as slab size or fraction of used chunks at which the slab is
considered "busy". It can also be called just with specifying amount of memory
to allocate, or completely without any parameter list to preserve defaults:

ha_init(512 * 1024 * 1024);  // pass memory to allocate
ha_init();  // use default amount of memory

Halloc defines two functions, hamalloc to allocate and hafree to free memory
(malloc and free are used by CUDA allocator, therefore halloc has to use other
names). These functions can only be called from device code.

void *hamalloc(size_t nbytes);
void hafree(void *p);

Otherwise, these functions have pretty much the same behavior as standard C
malloc/free, e.g.:

// allocate an array
int *p = (int *)hamalloc(8 * sizeof(int));
p[0] = 0;
p[1] = threadIdx.x;
p[2] = 2;
// ...
// free the array
hafree(p);


// allocate a list
typedef struct list_ {
	int element;
	struct list_ *next;
} list;
// ...
list *l = (list *)hamalloc(sizeof(list));
l->element = 1;
l->next = (list *)hamalloc(sizeof(list));
l->next->element = 2;
l->next->next = NULL;

The functions can be used in pretty much the same way as in C programs. hamalloc
accepts the number of bytes to allocate, and returns the pointer to allocated
memory, or NULL if memory cannot be allocated. Similarly, hafree accepts either
a pointer returned by hamalloc or NULL, and frees the memory previously
allocated. Naturally, hamalloc and hafree are thread-safe, and can be called
simultaneously by threads of the same or different kernels. hamalloc allocations
persist across kernel invocations, and can be used in other kernel
calls. Pointers allocated by hamalloc can only be freed by hafree;
they cannot be deallocated, e.g., by host/device cudaFree/free.

ha_shutdown() is intended to free resources used by halloc, but is currently a
no-op.


LIMITATIONS

There is currently no way to change parameters or allocate more memory after
halloc has been initialized.


BUGS

Though correctness tests pass successfully, this provies nothing, of
course. Some bugs are most likely there ;)

halloc's People

Contributors

canonizer avatar

Watchers

3gx avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.