Comments (8)
Ryan
I am actually working on a proposal for allowing extension libraries (in the
form of OpenCL .cl files + a separate Java implementation, in case OpenCL is not
available) to be added to Aparapi. I will post the suggestion either as an
issue, or possibly as a wiki page (and link here) in the next week or so.
In short the extension implementer would provide a Java interface and a Java
implementation along with a way of mapping .cl source to the interface so that
Aparapi can compile and bind the args (args would use Annotations to help
Aparapi work out the access type). This would allow the OpenCL version to use
vector types, local memory, barriers etc.
WRT the issue above.
Obviously the cost of bytecode analysis, OpenCL creation and compilation is
only incurred on the first call to a kernel instance. Provided just the data is
changing this cost should not be incurred more than once.
Are you seeing this cost each time you execute? If so this is a bug.
Maybe you are creating multiple instances of the same Kernel. In this case
each will indeed incur the cost of analysis -> code creation and compilation.
It is possible we could share the source creation between instances in this
case, obviously the actual bound args have to be on a per-instance basis.
Can you please elaborate on the use case a little, so I can work out whether
this is a bug or an enhancement?
I will bounce the extension doc proposal off you (and anyone with an interest)
once I have mulled it over for a few days.
Original comment by [email protected]
on 28 Nov 2011 at 4:52
- Changed state: Accepted
- Added labels: Type-Enhancement
- Removed labels: Type-Defect
from aparapi.
This is an enhancement request and not a bug.
One of the primary use cases we are investigating is a section of code which
calls Aparapi repeatedly, but each time has to re-execute Aparapi, incurring a
large running cost compared to the existing CPU-bound algorithm. For example,
the CPU algorithm takes ~170ms to execute, while Aparapi takes ~300ms, of which
only ~8ms is actual compute time. If we could avoid the ~292ms of
overhead during production, that would be excellent. Of course, there are
potentially other work-arounds, but this ticket could provide an elegant
solution to that problem.
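To make the arithmetic concrete, here is a rough sketch using only the figures quoted above. It assumes the ~292ms is a one-time setup cost that is paid once per kernel instance, as described earlier in the thread, and that each later call costs only the ~8ms of compute. The class and method names are made up for illustration.

```java
public class AmortizedCost {
    // Figures quoted in the comment above, in milliseconds.
    static final double CPU_MS = 170.0;
    static final double APARAPI_TOTAL_MS = 300.0;
    static final double COMPUTE_MS = 8.0;
    // Overhead = total minus compute = 292 ms (assumed to be paid only once).
    static final double SETUP_MS = APARAPI_TOTAL_MS - COMPUTE_MS;

    /** Average per-call cost when setup is paid once and the kernel runs n times. */
    static double averageMs(int n) {
        return (SETUP_MS + COMPUTE_MS * n) / n;
    }

    /** Smallest number of calls for which the reused kernel is cheaper in total than the CPU version. */
    static int breakEvenCalls() {
        // SETUP + COMPUTE*n < CPU*n  <=>  n > SETUP / (CPU - COMPUTE)
        return (int) Math.floor(SETUP_MS / (CPU_MS - COMPUTE_MS)) + 1;
    }

    public static void main(String[] args) {
        System.out.println("break-even calls: " + breakEvenCalls());
        for (int n : new int[] {1, 2, 10, 100}) {
            System.out.printf("n=%d average=%.2f ms%n", n, averageMs(n));
        }
    }
}
```

Under these assumptions the reused kernel is already ahead on the second call (292 + 2*8 = 308ms against 2*170 = 340ms), and the average cost per call approaches the ~8ms compute time as the number of calls grows.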
We also had a request from a collaborator who is interested in this framework,
who asked if he could use Aparapi to generate OpenCL from Java code, but then
have the ability to use only the resultant .cl file afterwards.
Original comment by [email protected]
on 28 Nov 2011 at 7:59
from aparapi.
Can you not create the Kernel instance once (outside the loop) and
just change the data?
Sorry if I am being slow here.
So instead of :-
    for (...){
        int[] data = //...
        // fill data
        Kernel kernel = new Kernel(){
            public void run(){
                // use data[]
            }
        };
        kernel.execute(...);
        // use modified data
    }
Instead use something like
    int[] data = //...
    Kernel kernel = new Kernel(){
        public void run(){
            // use data[]
        }
    };
    for (...){
        // fill data
        kernel.execute(...);
        // use modified data
    }
Or is there a reason for recreating the Kernel?
Certainly we could dump the OpenCL.
We even had an idea earlier whereby we could tell the JNI layer (via a
property) to output a compiler-ready C source file containing the
required buffer/host manipulation code. Kind of like a wrapped C
function that would take just pointers to float/int arrays and ints, and
the function's C code for shepherding the args would be generated
automatically. We thought this might make a good unit test. It would
certainly give someone a good starting point.
However, our code generation is very very literal and anyone with even
a few weeks of OpenCL experience would possibly scoff at it from a
performance POV. Maybe scoff is too strong. Snigger is probably
better ;)
Don't get me wrong, our codegen recreates OpenCL source structure
fairly well from bytecode. But without an autovectorization optimizer
or possibly a loop unroller optimizer our code is fairly naive.
Gary
Original comment by [email protected]
on 28 Nov 2011 at 8:30
from aparapi.
Actually, that is almost the exact work-around we are using at the moment :^)
Your code generation is fine right now...the Java developer can also unroll the
loops if needed.
Original comment by [email protected]
on 28 Nov 2011 at 8:38
from aparapi.
That's good, and actually this is a common pattern. I might need to add a wiki
page covering this.
We might still want to look at a way to at least have the code analysis, code
generation and OpenCL compile kept with the Class rather than with the
instance. This way when we create multiple instances they could share (and
minimize) this overhead.
We do need to be careful with subclasses which inherit from another Kernel...
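The per-class sharing idea above might look roughly like the following. This is only a sketch and not Aparapi code: SketchKernel, generateSource and the counter are hypothetical stand-ins, and the "expensive" step is faked with a string. Keying the cache on the concrete class returned by getClass() is one way to address the subclass concern, since a subclass then gets its own entry instead of reusing its parent's generated source.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PerClassCache {
    /** Counts how often the expensive step actually runs (demonstration only). */
    static final AtomicInteger generations = new AtomicInteger();

    /** Generated source keyed by concrete kernel class, shared by all instances of that class. */
    private static final Map<Class<?>, String> SOURCE_BY_CLASS = new ConcurrentHashMap<>();

    /** Stand-in for bytecode analysis + OpenCL generation + compilation. */
    private static String generateSource(Class<?> kernelClass) {
        generations.incrementAndGet();
        return "/* generated for " + kernelClass.getName() + " */";
    }

    /** Hypothetical base class; this is NOT Aparapi's Kernel. */
    abstract static class SketchKernel {
        abstract void run();

        String source() {
            // getClass() is the concrete class, so each subclass gets its own cache entry.
            return SOURCE_BY_CLASS.computeIfAbsent(getClass(), PerClassCache::generateSource);
        }
    }

    static class Scale extends SketchKernel {
        @Override void run() {}
    }

    /** A kernel that inherits from another kernel. */
    static class ScaleAndOffset extends Scale {
        @Override void run() {}
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            new Scale().source(); // 1000 instances, one generation
        }
        new ScaleAndOffset().source(); // subclass triggers its own generation
        System.out.println("generations: " + generations.get()); // prints "generations: 2"
    }
}
```

As noted in the thread, only the generated source would be shared this way; the bound args would still have to be kept per instance.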
Original comment by [email protected]
on 28 Nov 2011 at 11:39
from aparapi.
Take a look at the proposal for allowing extensions to be added by
developers/third party library providers.
http://code.google.com/p/aparapi/wiki/AparapiExtensionProposal
Original comment by [email protected]
on 2 Dec 2011 at 10:26
from aparapi.
This should be marked "closed" since the feature has now been implemented and
is in place.
Original comment by [email protected]
on 29 Mar 2013 at 11:06
from aparapi.
Original comment by [email protected]
on 20 Apr 2013 at 12:31
- Changed state: Done
from aparapi.