Comments (10)
On my Mac OS X machine the library is 36 MB, not 110 MB.
Here's a breakdown of how much space each component takes (in bytes) on Mac OS X:
scan 8396256
compact 175028
segscan 25003640
spmvmult 151168
radixsort 3735368
rand 155888
Original comment by [email protected]
on 26 Jun 2009 at 1:02
from cudpp.
Thanks. How did you figure that out?
Original comment by [email protected]
on 26 Jun 2009 at 2:07
from cudpp.
I added the .cu files to the Makefile in cudpp one at a time. That gave me a cumulative library size after each addition, so I subtracted each pair of consecutive sizes to get the per-file numbers.
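The subtraction described here can be sketched roughly as follows. This is only an illustration: the function name `perFileSizes` is made up, and it assumes the library size before any .cu file is added counts as 0 bytes, which ignores any fixed archive overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CUDPP): given the library size measured
// after adding each .cu file in turn (cumulative, in bytes), return the size
// contributed by each file as the difference between consecutive measurements.
std::vector<long long> perFileSizes(const std::vector<long long>& cumulative) {
    std::vector<long long> sizes;
    sizes.reserve(cumulative.size());
    long long previous = 0;  // assumption: the empty library is treated as 0 bytes
    for (long long total : cumulative) {
        assert(total >= previous);  // adding a file should never shrink the library
        sizes.push_back(total - previous);
        previous = total;
    }
    return sizes;
}
```

For instance, if the files had been added in the order listed above, the first two cumulative measurements would have been 8396256 and 8571284 bytes, giving 8396256 for scan and 175028 for compact.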
Original comment by [email protected]
on 26 Jun 2009 at 2:11
from cudpp.
Original comment by [email protected]
on 29 Jun 2009 at 7:51
from cudpp.
Just a comment to help users who have issues with this problem. If you need to reduce the CUDPP library binary size, you can comment out generation of the template kernels you don't need. In any of the *_app.cu files there is a Dispatch function for the corresponding algorithm. To optimize performance we have to use a large switch/if-else to dispatch at run time the appropriate compile-time optimized template kernel function. To reduce compiled object size and also compile time, you can simply comment out the switch options that you don't need.
For example, if you don't need segmented scan, comment out everything inside cudppSegmentedScanDispatch():
http://code.google.com/p/cudpp/source/browse/tags/1.1/cudpp/src/app/segmented_scan_app.cu#386
Then, if you only need forward exclusive integer +-scans, comment out everything but the lines that invoke that type of scan:
http://code.google.com/p/cudpp/source/browse/tags/1.1/cudpp/src/app/scan_app.cu#446
Your compile time and file size will be greatly reduced.
Perhaps the solution to this problem is to make compilation configurable in
some way.
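As a rough illustration of both ideas (run-time dispatch to compile-time template instantiations, and making that dispatch configurable), here is a minimal host-only C++ sketch. It is not CUDPP's actual code: `SKETCH_ENABLE_FLOAT`, `Datatype`, `Direction`, `exclusiveSumScan` and `scanDispatch` are all hypothetical names, and a plain CPU loop stands in for the real GPU kernels.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical build switch (not a real CUDPP option): compile with
// -DSKETCH_ENABLE_FLOAT=0 to leave the float instantiations out of the binary.
#ifndef SKETCH_ENABLE_FLOAT
#define SKETCH_ENABLE_FLOAT 1
#endif

enum class Datatype { Int, Float };         // hypothetical stand-ins for the
enum class Direction { Forward, Backward }; // library's run-time options

// One template instantiation is generated per <T, Backward> combination
// that the dispatch function below actually references.
template <typename T, bool Backward>
void exclusiveSumScan(const T* in, T* out, std::size_t n) {
    assert(n == 0 || (in != nullptr && out != nullptr));
    T running = T(0);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t idx = Backward ? n - 1 - i : i;
        const T value = in[idx];  // read first so in-place use is safe
        out[idx] = running;
        running += value;
    }
}

// Run-time dispatch to a compile-time specialised function. Removing a case
// (by hand or via the macro) removes the matching instantiations.
bool scanDispatch(Datatype type, Direction dir,
                  const void* in, void* out, std::size_t n) {
    switch (type) {
    case Datatype::Int:
        if (dir == Direction::Forward)
            exclusiveSumScan<int, false>(static_cast<const int*>(in),
                                         static_cast<int*>(out), n);
        else
            exclusiveSumScan<int, true>(static_cast<const int*>(in),
                                        static_cast<int*>(out), n);
        return true;
#if SKETCH_ENABLE_FLOAT
    case Datatype::Float:
        if (dir == Direction::Forward)
            exclusiveSumScan<float, false>(static_cast<const float*>(in),
                                           static_cast<float*>(out), n);
        else
            exclusiveSumScan<float, true>(static_cast<const float*>(in),
                                          static_cast<float*>(out), n);
        return true;
#endif
    default:
        return false;  // this combination was compiled out
    }
}
```

Returning `false` for a compiled-out combination is one possible design choice; it makes a missing instantiation visible to the caller instead of silently doing nothing.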
Original comment by [email protected]
on 10 Dec 2009 at 10:40
- Added labels: Priority-Medium, Type-Enhancement
- Removed labels: Priority-Low, Type-Defect
from cudpp.
Original comment by [email protected]
on 6 Jul 2011 at 2:36
- Added labels: Milestone-Release2.1, Priority-High
- Removed labels: Priority-Medium
from cudpp.
Ha - my library is 280 MB! (64-bit Linux)
Original comment by [email protected]
on 1 Oct 2011 at 8:42
from cudpp.
Yes, as we add support for more datatypes, it naturally multiplies the binary
size. The only solutions are separate compilation and linkage and/or runtime
code generation. The former is not supported by CUDA yet (will be in the
future), and the latter is not easy in CUDA yet...
Original comment by [email protected]
on 1 Oct 2011 at 10:25
from cudpp.
If I only need cudppCompact, I assume I also need to keep cudppScanDispatch intact (i.e. not commented out)? If I comment out the *Dispatch() functions in everything but cudppCompact, the library compiles but does not work. Would I also need to keep reduceDispatch?
Original comment by [email protected]
on 12 Oct 2011 at 5:03
from cudpp.
Not exactly. Compact only needs a specific type of scan -- I believe it does a forward exclusive sum scan of unsigned integers. So if you comment out the lines inside cudppScanDispatch (and the functions it calls) for everything but forward, exclusive, operator+, and CUDPP_UINT, then it should work and the library should be much smaller.
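To show why compaction depends on exactly that kind of scan, here is a minimal CPU sketch of the general scan-then-scatter technique. It is not CUDPP's implementation: `scanFlags` and `compactSketch` are hypothetical names, and the validity flags are assumed to be 0 or 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Forward exclusive sum scan of unsigned integers:
// out[i] = flags[0] + ... + flags[i-1], with out[0] = 0.
std::vector<unsigned int> scanFlags(const std::vector<unsigned int>& flags) {
    std::vector<unsigned int> out(flags.size());
    unsigned int running = 0;
    for (std::size_t i = 0; i < flags.size(); ++i) {
        out[i] = running;
        running += flags[i];
    }
    return out;
}

// Keep only the elements whose flag is 1. The scanned flags give each kept
// element its position in the output, so the scatter needs no further search.
template <typename T>
std::vector<T> compactSketch(const std::vector<T>& in,
                             const std::vector<unsigned int>& isValid) {
    assert(in.size() == isValid.size());
    const std::vector<unsigned int> offsets = scanFlags(isValid);
    const std::size_t count =
        in.empty() ? 0 : offsets.back() + isValid.back();
    std::vector<T> out(count);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (isValid[i]) {
            out[offsets[i]] = in[i];
        }
    }
    return out;
}
```

With flags {1, 0, 1, 1, 0} the scan yields offsets {0, 1, 1, 2, 3}, so the three valid elements land at output positions 0, 1 and 2.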
Original comment by [email protected]
on 12 Oct 2011 at 5:34
from cudpp.