Comments (8)

GoogleCodeExporter commented on July 18, 2024
Ryan

I am actually working on a proposal for allowing extension libraries (in the 
form of OpenCL .cl files + a separate Java implementation, in case OpenCL is not 
available) to be added to Aparapi.  I will post the suggestion either as an 
issue, or possibly as a wiki page (and link here) in the next week or so. 

In short the extension implementer would provide a Java interface and a Java 
implementation along with a way of mapping .cl source to the interface so that 
Aparapi can compile and bind the args (args would use Annotations to help 
Aparapi work out the access type).  This would allow the OpenCL version to use 
vector types, local memory, barriers etc.
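
To give a rough feel for the shape of this, here is a minimal sketch. Everything in it is made up for illustration: the annotation names (`ClSource`, `ReadOnly`, `WriteOnly`), the `Squarer` interface and the `squares.cl` file name are assumptions, not Aparapi API, and the real proposal may look quite different. Only the Java fallback path actually runs here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

public class ExtensionSketch {
    // Hypothetical: maps an interface to the .cl source that implements it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ClSource { String value(); }

    // Hypothetical: access-type hints a runtime could use when binding args.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface ReadOnly {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface WriteOnly {}

    // The Java interface the extension implementer would provide.
    @ClSource("squares.cl")
    interface Squarer {
        void square(@ReadOnly float[] in, @WriteOnly float[] out);
    }

    // The plain Java implementation, used when OpenCL is not available.
    static final class JavaSquarer implements Squarer {
        @Override
        public void square(float[] in, float[] out) {
            for (int i = 0; i < in.length; i++) {
                out[i] = in[i] * in[i];
            }
        }
    }

    public static void main(String[] args) {
        float[] in = {1f, 2f, 3f};
        float[] out = new float[in.length];
        new JavaSquarer().square(in, out);
        System.out.println(Arrays.toString(out));
        // A runtime could read the mapping reflectively to locate the .cl file.
        System.out.println(Squarer.class.getAnnotation(ClSource.class).value());
    }
}
```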

WRT the issue above.
Obviously the cost of bytecode analysis, OpenCL creation and compilation is 
only incurred on the first call to a kernel instance. Provided just the data is 
changing, this cost should not be incurred more than once. 

Are you seeing this cost each time you execute? If so this is a bug.

Maybe you are creating multiple instances of the same Kernel.  In this case 
each will indeed incur the cost of analysis -> code creation and compilation.  
It is possible we could share the source creation between instances in this 
case, although the actual bound args would still have to be on a per-instance basis.

Can you please elaborate on the use case a little, so I can work out whether 
this is a bug or an enhancement?

I will bounce the extension doc proposal off you (and anyone with an interest) 
once I have mulled it over for a few days. 


Original comment by [email protected] on 28 Nov 2011 at 4:52

  • Changed state: Accepted
  • Added labels: Type-Enhancement
  • Removed labels: Type-Defect

from aparapi.

GoogleCodeExporter commented on July 18, 2024
This is an enhancement request and not a bug.

One of the primary use cases we are investigating is a section of code which 
calls Aparapi repeatedly, but each time has to re-execute Aparapi, incurring a 
large running cost compared to the existing CPU-bound algorithm. For example, 
the CPU algorithm takes ~170ms to execute, while Aparapi takes ~300ms, of which 
only ~8ms is actual compute time. If we could avoid the ~292ms of 
overhead during production, that would be excellent. Of course, there are 
potentially other work-arounds, but this ticket could provide an elegant 
solution to that problem.
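
As a back-of-the-envelope illustration of why this matters, the sketch below uses the rough figures above and assumes (this is an assumption, not something measured) that the whole ~292ms is a one-time setup cost and that every later call costs only the ~8ms of compute. The class and method names are just for this example.

```java
public class BreakEven {
    // Assumed one-time setup cost plus ~8ms per call (figures from this ticket).
    static long aparapiTotalMs(int calls) {
        return 292L + 8L * calls;
    }

    // The existing CPU-bound algorithm at ~170ms per call.
    static long cpuTotalMs(int calls) {
        return 170L * calls;
    }

    // Smallest number of calls at which the Aparapi total drops below the CPU total.
    static int breakEvenCalls() {
        int n = 1;
        while (aparapiTotalMs(n) >= cpuTotalMs(n)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(breakEvenCalls()); // prints 2
    }
}
```

Under those assumptions the setup cost is paid back by the second call (308ms vs 340ms in total), but only if it really is paid once.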

We also had a request from a collaborator who is interested in this framework, 
who asked if he could use Aparapi to generate OpenCL from Java code, but then 
have the ability to use only the resultant .cl file afterwards.

Original comment by [email protected] on 28 Nov 2011 at 7:59


GoogleCodeExporter commented on July 18, 2024
Can you not create the Kernel instance once (outside the loop) and
just change the data?

Sorry if I am being slow here.

So instead of :-

for (...) {
    int[] data = //...
    // fill data
    Kernel kernel = new Kernel() {
        public void run() {
            // use data[]
        }
    };
    kernel.execute(...);
    // use modified data
}

Instead use something like

int[] data = //...
Kernel kernel = new Kernel() {
    public void run() {
        // use data[]
    }
};

for (...) {
    // fill data
    kernel.execute(...);
    // use modified data
}

Or is there a reason for recreating the Kernel?

Certainly we could dump the OpenCL.

We even had an idea earlier whereby we could tell the JNI layer (via a
property) to output a compiler-ready C source file containing the
required buffer/host manipulation code.  Kind of like a wrapped C
function that would take just pointers to float/int arrays and ints, with
the C code for shepherding the args generated
automatically.  We thought this might make a good unit test.  It would
certainly give someone a good starting point.

However, our code generation is very literal and anyone with even
a few weeks of OpenCL experience would possibly scoff at it from a
performance POV.  Maybe scoff is too strong. Snigger is probably
better ;)

Don't get me wrong, our codegen recreates OpenCL source structure
fairly well from bytecode.  But without an autovectorization optimizer
or possibly a loop unroller optimizer our code is fairly naive.

Gary

Original comment by [email protected] on 28 Nov 2011 at 8:30


GoogleCodeExporter commented on July 18, 2024
Actually, that is almost the exact work-around we are using at the moment :^)

Your code generation is fine right now...the Java developer can also unroll the 
loops if needed.

Original comment by [email protected] on 28 Nov 2011 at 8:38


GoogleCodeExporter commented on July 18, 2024
That's good, and actually this is a common pattern.  I might need to add a wiki 
page covering this.   

We might still want to look at a way to at least have the code analysis, code 
generation and OpenCL compile kept with the Class rather than with the 
instance. This way, when we create multiple instances, they could share (and 
minimize) this overhead.  

We do need to be careful with subclasses which inherit from another Kernel... 
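
Something along these lines is what I have in mind. This is only a sketch and not how Aparapi is implemented: `Plan`, `planFor` and the `Scale` classes are hypothetical stand-ins, and the "generated source" is just a placeholder string. Keying on the runtime class (`getClass()`) means a subclass gets its own entry rather than silently reusing its parent's.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class KernelPlanCache {
    // Hypothetical holder for the per-class artifacts (generated source, compiled program).
    static final class Plan {
        final String source;
        Plan(String source) { this.source = source; }
    }

    private static final Map<Class<?>, Plan> PLANS = new ConcurrentHashMap<>();

    // Counts how many times the expensive analysis/codegen/compile step actually ran.
    static final AtomicInteger BUILDS = new AtomicInteger();

    // Returns the shared plan for the kernel's runtime class, building it on first use.
    static Plan planFor(Object kernel) {
        return PLANS.computeIfAbsent(kernel.getClass(), cls -> {
            BUILDS.incrementAndGet();
            return new Plan("/* generated for " + cls.getName() + " */");
        });
    }

    // Stand-ins for a Kernel and a subclass of it.
    static class Scale {}
    static class ScaleAndOffset extends Scale {}

    public static void main(String[] args) {
        Plan a = planFor(new Scale());
        Plan b = planFor(new Scale());          // second instance, same class
        Plan c = planFor(new ScaleAndOffset()); // subclass gets its own plan
        System.out.println(a == b);
        System.out.println(a == c);
        System.out.println(BUILDS.get());
    }
}
```

The bound args would still live on each instance; only the class-level work is shared.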

Original comment by [email protected] on 28 Nov 2011 at 11:39


GoogleCodeExporter commented on July 18, 2024
Take a look at the proposal for allowing extensions to be added by 
developers/third party library providers. 

http://code.google.com/p/aparapi/wiki/AparapiExtensionProposal

Original comment by [email protected] on 2 Dec 2011 at 10:26


GoogleCodeExporter commented on July 18, 2024
This should be marked "closed" since the feature has now been implemented and 
is in place.

Original comment by [email protected] on 29 Mar 2013 at 11:06


GoogleCodeExporter commented on July 18, 2024

Original comment by [email protected] on 20 Apr 2013 at 12:31

  • Changed state: Done

