aparapi / aparapi Goto Github PK

Official AMD Aparapi repository

License: Other

C++ 2.06% C 0.06% HTML 88.90% CSS 0.08% Java 8.89% Objective-C 0.01%

aparapi's Introduction

aparapi

This is the official home of AMD's Aparapi. Please see http://developer.amd.com/tools-and-sdks/open-source/ for additional AMD open source contributions.

NOTE: This site and source code is not affiliated with www.aparapi.com

Please refer to the current documentation or the older docs on google code Aparapi Google Code for documentation.

Binary downloads at Maven Central coming soon!

Watch this space!

Getting Started

If you are an official Contributor than Clone the repository and work on it as needed.

If you are an interested developer and just want to experiment with Aparapi then Fork the repository and submit Pull requests.

Users:

Download known working binary releases from Aparapi Releases.

Developers:

Every single project in Aparapi, including the root of the entire repository are Eclipse projects, although execution purely from the command-line is also supported.

Steps:

Clone/Fork the repository to your local machine
Import 'aparapi' to your Eclipse workspace making sure to import 'nested projects'
Open the appropriate Ant build.xml files in your Eclipse Ant view

Thank you,

The Aparapi Team

aparapi's People

Contributors

Stargazers

Watchers

Forkers

ekasitk log2 ssouyris nagyistoce codemason melvinsquest comaniac ibtawfik mykill rtvt123 anddegs mibrahim hooji sj1980 kishida klonikar iotamudelta giannileuani thesilencelies poneyo shamim8888 taozihk macroing yousraregaya qq332982511 whybear denislanks sharptrick florianroesler tpolzer juliocspires rowhit niiwise nanangarsyad lucashmsilva mozinrat qunaibit hallao0 tiongl mlbit morganyvm vbychkoviak harlixxy jameslinus deepsidhu1313 panzuanxin openthings ramanathandinesh liqingrikeiikyeong stjordanis theonanu wookimm hackerceo fagan2888 mikepapadim riyufuchi adian98 uzbekdev1 ilibx krystilizenevadies alessandroborges

aparapi's Issues

Issue with ProfileInfo timers?

I am interested in measuring the timers for copy in, execution and copy out of OpenCL code generated by Aparapi. If I do this on AMD GPU, the data transfer is 0 .

 private static class SaxpyKernel extends Kernel {
        private float alpha = 0.4f;
        private float[] x;
        private float[] y;
        private float[] z;

        public SaxpyKernel() {
        }

        public float[] getResult() {
            return this.z;
        }

        public void setArrays(float[] x, float[] y) {
            this.x = x;
            this.y = y;
            this.z = new float[x.length];
        }       

        @Override
        public void run() {
            int idx = getGlobalId();
            z[idx] = alpha * x[idx] + y[idx];
        }
}

If I print the timers for data transfer + kernel execution like this:

               Range range = Range.create(size);
        SaxpyKernel kernel = new SaxpyKernel();
        kernel.setExecutionMode(Kernel.EXECUTION_MODE.GPU);
               kernel.setArrays(a, b);

        for (int i = 0; i < ITERATIONS; i++) {      
            kernel.execute(range);
            List<ProfileInfo> profileInfo = kernel.getProfileInfo();
            for (ProfileInfo p : profileInfo) {             
                System.out.println(" " + p.getType()  + " : " + ((p.getEnd() - p.getStart())) + " (ns)");
            }
        }

I get these timers:

W : 0 (ns)
X : 6492297 (ns)
R : 0 (ns)

This happens ONLY on AMD GPU (AMD Radeon R9). If I use either Intel CPU (with Intel driver) or NVIDIA GPU I get these timers > 0 ns.

error: function "getClass" declared implicitly

I am running with JDK 1.8_101.
Here the details, this is the Aparapi Kernel:

public static class TrainKernel extends Kernel {
    private final int maps[];
    private final int vectors[];
    private final int tmp[];
    private final int mapSize, vectorSize, vectorsCount, epoch, bSize = 64;

    TrainKernel(int[][] vectors, int[][] maps, int epoch) {
        this.epoch=epoch;
        vectorSize=vectors[0].length;
        vectorsCount=vectors.length;
        this.vectors=new int[vectorsCount*vectorSize];
        int count=0;
        for (int i = 0; i < vectorsCount; i++) {
            for (int j = 0; j < vectorSize; j++, count++) {
                this.vectors[count] = vectors[i][j];
            }
        }
        mapSize = maps[0].length;
        this.maps =new int[maps.length* mapSize];
        count=0;
        for (int i = 0; i < maps.length; i++) {
            for (int j = 0; j < mapSize; j++, count++) {
                this.maps[count] = maps[i][j];
            }
        }
        tmp=new int[maps.length * mapSize * bSize];
    }

    @Override
    public void run() { 
        final int mSize= mapSize;
        final int bSize = this.bSize;
        final int mStart=mSize*getGlobalId();
        final int tStart=mStart*bSize;
        final int vectorsLength=vectors.length;
        final int vSize=vectorSize;
        final int count = vSize - mSize + 1;
        final int tLength=mSize * bSize;
        for (int e = 0; e < epoch; e++) {
            for (int i = 0; i < tLength; i++) {
                tmp[tStart+i]=0;
            }
            for (int vStart = 0; vStart < vectorsLength; vStart=vStart+vSize) {
                int maximum = 0;
                int maxIndex = 0;
                for (int i = 0; i < count; i++) {
                    int sum = 0;
                    for (int j = 0; j < mSize; j++) {
                        sum = sum + (vectors[vStart+i + j] & maps[mStart+j]);
                    }
                    if (maximum < sum) {
                        maximum = sum;
                        maxIndex = i;
                    }
                }
                for (int j = 0; j < mSize; j++) {
                    tmp[tStart+j*bSize+vectors[vStart + maxIndex + j] ^ maps[mStart + j]]++;
                }
            }
            int sum = 0;
            for (int i = 0; i < mSize; i++) {
                int maximum = 0;
                int maxIndex = 0;
                for (int j = 0; j < bSize; j++) {
                    if (maximum < tmp[tStart+i*bSize+j]) {
                        maximum = tmp[tStart+i*bSize+j];
                        maxIndex = j;
                    }
                }
                sum = sum + maxIndex;
                maps[mStart+i] = maps[mStart+i] | maxIndex;
            }
            if (sum == 0) {
                return;
            }
        }
    }
}

This is the kernel Aparapi generates:

typedef struct This_s{
int mapSize;
__global int *vectors;
int vectors__javaArrayLength0;
int vectors__javaArrayDimension0;
int vectorSize;
int epoch;
__global int *tmp;
__global int *maps;
int passid;
}This;
int get_pass_id(This *this){
return this->passid;
}
__kernel void run(
int mapSize,
__global int *vectors,
int vectors__javaArrayLength0,
int vectors__javaArrayDimension0,
int vectorSize,
int epoch,
__global int *tmp,
__global int maps,
int passid
){
This thisStruct;
This this=&thisStruct;
this->mapSize = mapSize;
this->vectors = vectors;
this->vectors__javaArrayLength0 = vectors__javaArrayLength0;
this->vectors__javaArrayDimension0 = vectors__javaArrayDimension0;
this->vectorSize = vectorSize;
this->epoch = epoch;
this->tmp = tmp;
this->maps = maps;
this->passid = passid;
{
int mSize = this->mapSize;
getClass();
int bSize = 64;
int mStart = mSize * get_global_id(0);
int tStart = mStart * bSize;
int vectorsLength = this->vectors__javaArrayLength0;
int vSize = this->vectorSize;
int count = (vSize - mSize) + 1;
int tLength = mSize * bSize;
for (int e = 0; eepoch; e++){
for (int i = 0; i<tLength; i++){
this->tmp[tStart + i] = 0;
}
for (int vStart = 0; vStart<vectorsLength; vStart = vStart + vSize){
int maximum = 0;
int maxIndex = 0;
for (int i = 0; i<count; i++){
int sum = 0;
for (int j = 0; j<mSize; j++){
sum = sum + (this->vectors[((vStart + i) + j)] & this->maps[(mStart + j)]);
}
if (maximum<sum){
maximum = sum;
maxIndex = i;
}
}
for (int j = 0; j<mSize; j++){
this->tmp[((tStart + (j * bSize)) + this->vectors[((vStart + maxIndex) + j)]) ^ this->maps[(mStart + j)]] = this->tmp[((tStart + (j * bSize)) + this->vectors[((vStart + maxIndex) + j)]) ^ this->maps[(mStart + j)]] + 1;
}
}
int sum = 0;
for (int i = 0; i<mSize; i++){
{
int maximum = 0;
int maxIndex = 0;
for (int j = 0; j<bSize; j++){
if (maximumtmp[((tStart + (i * bSize)) + j)]){
maximum = this->tmp[((tStart + (i * bSize)) + j)];
maxIndex = j;
}
}
sum = sum + maxIndex;
this->maps[mStart + i] = this->maps[(mStart + i)] | maxIndex;
}
}
if (sum==0){
return;
}
}
return;
}
}

There is an error:

nov 20, 2016 4:00:08 PM com.amd.aparapi.internal.kernel.KernelRunner warnFallBackAndExecute
WARNING: Reverting to Java Thread Pool (JTP) for class Layer$TrainKernel: OpenCL compile failed

clBuildProgram failed

"C:\TEMP\OCL4704T1.cl", line 39: error: function "getClass" declared implicitly
getClass();
^

is there a way to pass an unsafe memory region

Is there a way to pass a sun.misc.Unsafe allocated memory block to the Aparapi kernel and back again? I'm trying to avoid a memory copy if I can help it but I am unsure how to instruct Aparapi push this generic buffer region down to the kernel and back again if it's the case.

thanks,
alex

Multi-GPU support?

Hello,

There is a thread on the Google code website talking about multi-GPU support. A comment made by a contributor to this project suggest that the Range class contains a method called setDevice(Device d) which allows the range to be executed on a specific GPU or device in general. I haven't been able to find this method however. Does it exist in the available versions? If so, how can I find it.

Thank you

Forward references

There is forward references when there is no more new methods to discover, yet call tree is not completely analyzed. I fixed it here Vineg/aparapi@4386305 but since patch appeared to be quite ugly, I didn't create pull request.

Kernel error for MonteCarlo on NVIDIA and AMD GPUs

I am running MonteCarlo simulation within Aparapi. For testing I am using Intel OpenCL locally. I am running with JDK 1.8_65. The kernel that Aparapi generates is correct and the result when I compare to the sequential code is correct as well. However if I use the GPU, NVidia GPU or AMD GPU, the kernel is not correct. One declaration type is missing.

My understanding is, Aparapi generates the OpenCL kernel indendently of the architecture behind. Bytecodes -> C OpenCL. Is that correct? or is there any communication during the code generation?

Here the details, this is the Aparapi Kernel:

public static class MonteCarloKernel extends Kernel {

        private int size;
        private float[] result;

        public MonteCarloKernel(int size) {
            this.size = size;
            result = new float[size];
        }

        @Override
        public void run() {
            int idx = getGlobalId();
            int iter = 25000;

            long seed = idx;
            float sum = 0.0f;

            for (int j = 0; j < iter; ++j) {
                // generate a pseudo random number (you do need it twice)
                seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
                seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);

                // this generates a number between 0 and 1 (with an awful entropy)
                float x = ((float) (seed & 0x0FFFFFFF)) / 268435455f;

                // repeat for y
                seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
                seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
                float y = ((float) (seed & 0x0FFFFFFF)) / 268435455f;

                float dist = (float) Math.sqrt(x * x + y * y);
                if (dist <= 1.0f)
                    sum += 1.0f;
            }
            sum *= 4;
            result[idx] = (float) sum / (float) iter;
        }

        public boolean checkResult(float[] seq) {
            for (int i = 0; i < seq.length; i++) {
                if (Math.abs( (float)(result[i] - seq[i])) > 0.001) {
                    return false;
                }
            }
            return true;
        }

        public float[] getResult() {
            return result;
        }

        public int getSize() {
            return size;
        }
    }

If I use Intel OpenCL:

NAME: Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
VENDOR: Intel(R) Corporation
TYPE: CPU
DRIVER: 1.2.0.57

This is the kernel Aparapi generates (the correct kernel):


#pragma OPENCL EXTENSION cl_khr_fp64 : enable

typedef struct This_s{
   __global float *result;
   int passid;
}This;
int get_pass_id(This *this){
   return this->passid;
}
__kernel void run(
   __global float *result, 
   int passid
){
   This thisStruct;
   This* this=&thisStruct;
   this->result = result;
   this->passid = passid;
   {
      int idx = get_global_id(0);
      int iter = 25000;
      long seed = (long)idx;
      float sum = 0.0f;
      for (int j = 0; j<iter; j++){
         seed = ((seed * 25214903917L) + 11L) & 281474976710655L;
         seed = ((seed * 25214903917L) + 11L) & 281474976710655L;
         float x = (float)(seed & 268435455L) / 2.68435456E8f;
         seed = ((seed * 25214903917L) + 11L) & 281474976710655L;
         seed = ((seed * 25214903917L) + 11L) & 281474976710655L;
         float y = (float)(seed & 268435455L) / 2.68435456E8f;
         float dist = (float)sqrt((double)((x * x) + (y * y)));
         if (dist<=1.0f){
            sum = sum + 1.0f;
         }
      }
      sum = sum * 4.0f;
      this->result[idx]  = sum / (float)iter;
      return;
   }
}

When I use Aparapi on NVIDIA or AMD GPUs (same JVM - JDK 1.8.65, but different driver), I get this kernel:

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

typedef struct This_s{
   __global float *result;
   int passid;
}This;
int get_pass_id(This *this){
   return this->passid;
}
__kernel void run(
   __global float *result, 
   int passid
){
   This thisStruct;
   This* this=&thisStruct;
   this->result = result;
   this->passid = passid;
   {
      int i_1 = get_global_id(0);
      int i_2 = 25000;
       l_3 = (long)i_1;
      float f_5 = 0.0f;
      int i_6 = 0;
      for (; i_6<i_2; i_6++){
         l_3 = ((l_3 * 25214903917L) + 11L) & 281474976710655L;
         l_3 = ((l_3 * 25214903917L) + 11L) & 281474976710655L;
         float f_7 = (float)(l_3 & 268435455L) / 2.68435456E8f;
         l_3 = ((l_3 * 25214903917L) + 11L) & 281474976710655L;
         l_3 = ((l_3 * 25214903917L) + 11L) & 281474976710655L;
         float f_8 = (float)(l_3 & 268435455L) / 2.68435456E8f;
         float f_9 = (float)sqrt((double)((f_7 * f_7) + (f_8 * f_8)));
         if (f_9<=1.0f){
            f_5 = f_5 + 1.0f;
         }
      }
      f_5 = f_5 * 4.0f;
      this->result[i_1]  = f_5 / (float)i_2;
      return;
   }
}

There is an error:

clBuildProgram failed
************************************************
:21:8: error: use of undeclared identifier 'l_3'
       l_3 = (long)i_1;
       ^

Note:
NVIDIA-SMI 331.79 Driver Version: 331.79

AMD:

Name: Hawaii
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 1.2
Driver version: 1598.5 (VM)

Nvidia mobile GPU

Currently having the following issue:

OpenCL 1.2 CUDA 8.0.0
!!!!!!! clCreateContextFromType() failed device not available
jul 11, 2016 3:20:10 AM com.amd.aparapi.internal.kernel.KernelRunner warnFallBackAndExecute
ADVERT╩NCIA: Reverting to Java Thread Pool (JTP) for class com.CopyKernel: initJNI failed to return a valid handle
!!!!!!! clCreateContextFromType() failed device not available
jul 11, 2016 3:20:10 AM com.amd.aparapi.internal.kernel.KernelRunner warnFallBackAndExecute
ADVERT╩NCIA: Reverting to Java Thread Pool (JTP) for class com.DilateKernel: initJNI failed to return a valid handle

I cant get Aparapi to run on my GTX 960M. I'm on a notebook.

It runs fine on my Intel processor, however.

I tried to recompile the code but the source have some missing headers.

Thank you so much in advance.

A kernel cannot be recompiled after it is disposed

Any time a kernel is disposed with the intent of recompiling results in attempts to fall back to next device. This bug specifically effects this code in KernelRunner.java L1348

I believe that it is specifically due to the persistence of the seenBinaryKeys set after disposing of the jniContextHandle in KernelRunner.java L190.

I believe that by simply clearing the seenBinaryKeys set upon disposal will resolve this issue.

Clean-up all documentation and website

There have been a lot of changes over the past few years that are missing from the current documentation and website, causing some issues for users. This task is intended to address those concerns.

"no device object!" on 64bit Linux

I'm attempting to run the code below on 64bit Fedora 21. I have OpenCL 1.0. I have this code in a simple Java project in Eclipse with the Aparapi jar on the build path. My code compiles fine and so do the samples. I am using the 1.0.0 pre-release from the GitHub repo.

When I execute my code I receive the message "no device object!" and Java throws a SIGSEGV.

Running with -Dcom.amd.aparapi.enableVerboseJNI=true does not provide any extra information.

I'm primarily interested in running this code in JTP mode before trying out OpenCL mode (thus the big starting buffer size).

Any thoughts or suggestions would be appreciated!

Jay

package org.jayjaybillings.math.prototypes;

import com.amd.aparapi.Kernel;

/**

Testing vector addition performance time.
@author Jay Jay Billings
*/
public class VecAddTest {

/**
- @param args
  */
  public static void main(String[] args) {
  
  final int num = 200000000, maxits = 1000;
  final float[] vec = new float[num];
  
  Kernel kernel = new Kernel() {
  @OverRide
  public void run() {
  for (int j = 0; j < maxits; j++) {
  for (int i = 0; i < num; i++) {
  vec[i] += i;
  }
  }
  }
  };
  kernel.execute(8);
  
  System.out.println(vec[1]);
}

}

Execution with entrypoint fails with kernel compile error

When I try call kernel.execute("similarReduce", Range.create(similarSize)); on JTP mode I see no kernel calls at all. When I call the method in GPU\CPU mode - I receive "!!!!!!! clCreateKernel() failed invalid kernel name" error. I use workaround as recommended here. But I think using execute method with entrypoint is more prity.

OS Differences

I coded on my MacBook, the code was working well, but not pretty fast. So I switched to my Windows desktop PC with a GPU, but the code just wouldn't run. I'm getting

"Jan 31, 2017 4:32:05 PM com.amd.aparapi.internal.kernel.KernelRunner executeOpenCL
WARNUNG: ### CL exec seems to have failed. Trying to revert to Java ###"

every time I run the code. But minor changes will make the code work again.

Code:
int closest = -1;
Some loops and raytracing later...
if (closest > -1){ this.image[id] = 23; }
Will produce an error, but just
this.image[id] = 23;
without the conditional statement works great. Please help me im confused!

Regards Julius

Will you be supporting SPIR-V?

Since SPIR-V has been announced and blessed by the Khronos Group as the new universal intermediate representation format for both shader and kernel code, for use not only in Vulkan, but also in OpenCL going forward, I was wondering what your plans are for Aparapi to support compiling the Java kernel code to SPIR-V in addition to the current functionality?

( For those of you reading this who are not yet familiar with SPIR-V, see https://www.khronos.org/spir )

At any rate, thanks for this fun and useful library! I'm already having a lot of fun playing with Aparapi in my spare time project. Keep up the good work. :-)

Kernel execution on a chosen device without specifying the EXECUTION_MODE

Doc Choosing Specific Devices for Execution points out that one can choose a specific device for kernel execution.

However it seems that Kernel.setExecutionMode() is deprecated and aparapi is unable to automatically set the execution mode for a specific device.

Definition of a constant

Where is the definition of com_amd_aparapi_internal_jni_KernelRunnerJNI_JNI_FLAG_USE_ACC
located in the source code?

It has not been defined and I cant compile the source code.

Thank you in advance.

Submit io.github.aparapi to Sonatype (Maven Central)

https://issues.sonatype.org/browse/OSSRH-26711

mavenize?

Any chance you guys could publish releases on Maven Central? It'd help a whole lot. I can send a PR w/ any necessary changes if it's of interest.

user supplied Device incompatible with current EXECUTION_MODE

It seems that Kernel.setExecutionMode() is deprecated. But when no EXECUTION_MODE is specified (or set to AUTO), my code throws:

Exception in thread "main" java.lang.AssertionError: user supplied Device incompatible with current EXECUTION_MODE or getTargetDevice(); device = AMD; kernel = MatrixKernel, devices={Intel|AMD|Intel|AMD|Java Alternative Algorithm|Java Thread Pool}
at com.amd.aparapi.internal.kernel.KernelRunner.executeInternalInner(KernelRunner.java:1206)

` public static long runTest(Device device, final int r) {

    Range range = device.createRange(r * r);
    double[] randomMatrixA = getRandomSquareMatrix(r);
    double[] randomMatrixB = getRandomSquareMatrix(r);
    MatrixKernel matrixKernel = new MatrixKernel(randomMatrixA, randomMatrixB, r);

    matrixKernel.setExecutionMode(Kernel.EXECUTION_MODE.AUTO);

    final long time1 = System.currentTimeMillis();
    matrixKernel.execute(range);
    return (System.currentTimeMillis() - time1);
}`

Aparapi and Ray Casting (Ray Tracing)

Hello!

I've used Aparapi for a while now and I really like it. The project I've been working on is an open source realtime Ray Caster (Ray Tracer).

But now I'm interested in hearing your input on Aparapi- / OpenCL specific optimizations for a project like this. Would it be a good idea to use the local and / or global buffer annotations? Or do you have any other ideas?

In case you're interested in the project, you can find it here: https://github.com/macroing/OpenRC

Errors in Quick Reference Guide

I found some typos and small errors in the "Aparapi Quick Reference Guide" (QuickReference.pdf):

Section "Create an anonymous inner class extending com.amd.aparapi.Kernel"

Should "single dimension arrays of primitive from the call-site" be "single dimension arrays of primitive elements of final fields from the call-site"?

The statement "This code sets each element of data[] to its index." is wrong. There is no such code. Maybe the line in the run method should be replaced by the line in the following section.
Section "Executing your kernel over a given range (0..)"

"This code each element of data[]" sould be "This code sets each element of data[]"

"Data[getGlobalId()]" should be "data[getGlobalId()]"
Section "Setting default execution mode"

"com.amd.aparapi.ExecutionMode" should be "com.amd.aparapi.executionMode"
Section "Determining the execution mode"

"com.amd.aparapi.ExecutionMode" should be "com.amd.aparapi.executionMode"
Section "Kernel methods to determine execution identity"

"eacj" should be "each"

What does "pf" mean? Please explain it.
Section "Mapping of Aparapi Kernel math methods to Java and OpenCL equivalents", in the column "Java mapping" of the table

"Math.acos(float)" should be "Math.acos(double)"
"Math.asin(float)" should be "Math.asin(double)"
"Math.atan(float)" should be "Math.atan(double)"
"Math.atan2(float, float)" should be "Math.atan2(double, double)"
"Math.ceil(float)" should be "Math.ceil(double)"
"Math.cos(float)" should be "Math.cos(double)"
"Math.exp(float)" should be "Math.exp(double)"
"Math.floor(float)" should be "Math.floor(double)"
"Math.max(long,int)" should be "Math.max(long,long)"
"Math.min(long,int)" should be "Math.min(long,long)"
"Math.log(float)" should be "Math.log(double)"
"Math.pow(float, float)" should be "Math.pow(double, double)"
"Math.pow(double, float)" should be "Math.pow(double, double)"
"Math.IEEEremainder(float, float)" should be "Math.IEEEremainder(double, double)"
"Math.IEEEremainder(double, float)" should be "Math.IEEEremainder(double, double)"
"Math.rint( float)" should be "Math.rint(double)"
"1f/Math.sqrt( float)" should be "1.0/Math.sqrt(double)"
"1.0/Math. round( double)" should be "1.0/Math.sqrt(double)"
"Math.sin( float)" should be "Math.sin(double)"
"Math.sqrt( float)" should be "Math.sqrt(double)"
"Math.tan( float)" should be "Math.tan(double)"
"Math.toRadians( float)" should be "Math.toRadians(double)"
"Math.toDegrees( float)" should be "Math.toDegrees(double)"

Diacritics

Hi. First of all, thanks for this great project!

This is mostly an aesthetic matter. My mother tongue has diacritics, but (as one would expect) using them causes problems (name of Kernel subclasses, members used in run() and enclosing classes, if I'm not mistaken). I'd like if there was support for diacritics, since Java allows using them.

By now, I'm just a user so I can't [even try to] do that myself, so I'd be thankful if someone could do that without problems. Obviously, I'm assuming it's possible to add support for diacritics by changing á to _E1_ or something, for example. Anyway, I'm just using Aparapi in a personal project, so it's not really important.

Use of Aparapi libraries in Apache projects

Hi,

My name is edward, a member of Apache Software Foundation. I recently considering use of Aparapi in my (Java-based) open source projects (Apache Hama [1] and HORN [2]) but heard that Aparapi license is not suitable for inclusion in Apache products [3]. Can you allow us to include aparapi binary files, or give me some feedbacks?

Thanks.

Is there any issue about data types?

Hi,

I have following code.

When my code runs on aparapi, it generates wrong values as you can see its result 😞

Result      Num         Expected
2026982348  406816880   40681688012
2026982516  406816881   40681688180
2026982594  406816882   40681688258
2026982662  406816883   40681688326
2026982830  406816884   40681688494
2026982898  406816885   40681688562
2026982966  406816886   40681688630
2026983044  406816887   40681688708
2026983212  406816888   40681688876
2026983280  406816889   40681688944
2026983338  406816890   40681689002
2026983506  406816891   40681689170
2026983584  406816892   40681689248
2026983652  406816893   40681689316
2026983820  406816894   40681689484
2026983888  406816895   40681689552
2026983956  406816896   40681689620
2026984134  406816897   40681689798
2026984202  406816898   40681689866
2026984270  406816899   40681689934

Error in KernelProfile, no currentDevice (synchronization error? when trying to execute using CPU

I'm trying to do a simple test unfortunately, trying to get it to work with GPU or CPU results in:

SEVERE: Error in KernelProfile, no currentDevice (synchronization error?

The only time it works is when using JTP, using the default auto results in nothing getting executed.

static class TestKernel extends Kernel {
		final float inA[] = new float[] { 1.0f, 2.0f };
		final float inB[] = new float[] { 2.0f, 3.0f };
		final float result[] = new float[inA.length];

		@Override
		public void run() {
			int i = getGlobalId();
			result[i] = inA[i] + inB[i];
		}

		public float[] getResult() {
			return result;
		}
	}

	private static void aparapi() {
		try {
			NativeLoader.load();
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}

		TestKernel kernel = new TestKernel();

		Range range = Range.create(2);
		kernel.setExecutionMode(Kernel.EXECUTION_MODE.CPU);
		kernel.execute(range);

//		OutputUtils.print(kernel.getResult());
	}

Can I take over development of this project?

Hi, my project has a pressing need to rely on aparapi and as such I have been contributing on the project in my own repositories. Since I will be contributing significant amount of work I'd like to contribute back that effort. You are of course welcome to adopt the project back into yours but you will find a lot has changed and that may be difficult at this point, but please feel free. If not perhaps the team would like to consider moving over to my new repositories for future effort? I am willing to discuss alternatives as well.

Here is a recap of what I did so far and where the new repositories can be found.

Everything has been mavinized!

All new code is licensed under the apache license. Also since AMD is no longer the maintainer I changed the root package across all the projects.

I pulled out the core java library. This is where all the platform independent code lives and ultimately produces the jar that will be used as the dependencies. This is now its own repository and can be found here:

https://github.com/Syncleus/aparapi

I pulled out all the platform specific code, the JNI layer, into its own repository. This no longer uses ant, as the project has been mavenized. But it isnt a Java project either, so it doesnt use maven. It has been refactored to use autotools, which is a platform independent way to compile shared libraries. It currently only compiles for linux platforms though. The code for the JNI libraries can now be found here, it uses submodules so please clone recursively:

https://github.com/Syncleus/aparapi-jni

I also created an Archlinux AUR and an unofficial binary repository for the aparapi system-specific shared library. This will allow installation of the aparapi shared library using the archlinux package management system, and uninstallation as well. The AUR can be found here:

https://github.com/Syncleus/aparapi-archlinux

To add the unofficial repository to archlinux for use with pacman then add the following line to your /etc/pacman.conf file, before all the other repositories:

[aparapi]
SigLevel = Optional TrustAll
Server = http://syncleus.com/aparapi-archlinux-repo/

The examples are now also a separate repository, as it isnt really needed for the library itself. This has the life sample mavenized and working but still need to mavenize the rest of the examples. This repo can be found here:

https://github.com/Syncleus/aparapi-examples

Finally I purchased the aparapi.com domain name so I can start hosting some useful information about it and host some files there. I also have an account on maven central so I will soon be able to upload aparapi there as it is now in a state where it can be consumed as a dependency (since it is completely mavenized).

So let me know what you guys think about joining efforts and perhaps moving the development over to the new repositories?

Private superclass fields missing from struct This_s

When I create the following classes:

abstract class OperatorKernel extends Kernel {
  private float[] output;
  public void init(int pixels) {
    if (output == null) {
      output = new float[pixels];
    }
  }
}

class ProductKernel extends OperatorKernel {/*...*/}

and then call init() on a ProductKernel, I get this error:

Jul 05, 2016 7:11:43 PM com.amd.aparapi.internal.kernel.KernelRunner warnFallBackAndExecute
WARNING: Reverting to Java Thread Pool (JTP) for class com.google.users.polymorpheus.experimental.arttree.operators.X$XK
ernel: OpenCL compile failed
clBuildProgram failed
************************************************
<kernel>:36:13: error: no member named 'output' in 'struct This_s'
      this->output[pixelId]  = com_google_users_polymorpheus_experimental_arttree_operators_UnaryOperatorKernel__calcula
tePixelById(this, pixelId) * (float)this->signum;
      ~~~~  ^
<kernel>:36:163: error: no member named 'signum' in 'struct This_s'
      this->output[pixelId]  = com_google_users_polymorpheus_experimental_arttree_operators_UnaryOperatorKernel__calcula
tePixelById(this, pixelId) * (float)this->signum;

                                    ~~~~  ^

************************************************

Changing the visibility of the inherited fields to protected seems to fix this. I suspect the problem is that the bytecode-to-OpenCL compiler isn't looking at the superclass bytecode, even when the superclass isn't immediately Kernel.

Cant find KernelPreferences

I've tried to copy everything to my eclipse environment, and the code seems to be missing a KernelPreferences class? From package com.amd.aparapi

Issue when running with netbeans?

When running any sample project, I seem to be getting the following error:

Nov 24, 2015 7:54:08 PM com.amd.aparapi.internal.model.ClassModel$AttributePool <init>
WARNING: Found unexpected Attribute (name = org.netbeans.SourceLevelAnnotations)
Nov 24, 2015 7:54:08 PM com.amd.aparapi.internal.kernel.KernelRunner warnFallBackAndExecute
WARNING: Reverting to Java Thread Pool (JTP) for class volvis.RaycastRenderer$1: FP64 required but not supported

This leads me to think that there might be some issues with running using netbeans, as the attribute org.netbeans.SourceLevelAnnotations is not recognised. Due to a warning it then stops running on the GPU, tries JTP, and even that fails..

Does anyone have a clue what is going on here, and how to fix this?

I'm using java 1.8(.0_51) on Mac OS X 10.11.

Aparapi can't find OpenCL

I have been having some trouble getting aparapi to work on my desktop.
I'm running Windows 7 64 bit with an AMD graphics card and all appropriate drivers installed.
output from clinfo: http://pastebin.com/fA0Pke8Y

from latest release on code google (https://code.google.com/p/aparapi/downloads/list)
output from info sample: http://pastebin.com/wiYGEcGP
output from add sample: http://pastebin.com/GEXMR70r

from latest release on Github (https://github.com/aparapi/aparapi/releases)
output from info sample: http://pastebin.com/mbQWMybd
output from add sample: http://pastebin.com/KeWp6XnM

The aparapi.jar and opencl.dll files are both in locations in the windows path.
I can only imagine that I don't have something set up correctly, but would appreciate any help getting this to work.
Thanks,
Jeff

Error in sample package add

In the sample class com.amd.aparapi.sample.add.Main (i.e. in the file samples/add/src/com/amd/aparapi/sample/add/Main.java) there is the following line:

kernel.execute(Range.create(512));

But that should be:

kernel.execute(Range.create(size));

Please choose a more liberate license

Hi,
the Debian Java team considers packaging aparapi for official Debian. When we checked the license we stumbled upon the phrase

you will not (1) export, re-export or release to a national of a country in Country ...

which makes the software non-free from a Debian point of view since there is a restriction to a certain group of users who live in the said countries (besides the fact that we have no control to force this license).
I'm wondering whether you might consider a more liberate license - for instanse by removing the last paragraph completely (or choose some well known license like BSD or MIT).
Thanks for your cooperation, Andreas.

Using tanh and NPE in ClassModel

I was trying to use tanh by declaring

@OpenCLMapping(mapTo = "tanh")
protected float tanh(float _f) {
    return (float) Math.tanh(_f);
}

Unfortunately, this exposes a NullPointerException:

java.lang.NullPointerException
at com.amd.aparapi.internal.model.ClassModel.parse(ClassModel.java:2542)
at com.amd.aparapi.internal.model.ClassModel.parse(ClassModel.java:2526)
at com.amd.aparapi.internal.model.ClassModel.(ClassModel.java:142)

The problem is, that it tries to parse java.lang.Math, and Math.class.getClassLoader() returns null, so the ClassModel can't find the class file. I'm not quite sure how aparapi should deal with this - perhaps classes used in methods annotated with @OpenCLMapping shouldn't be parsed at all.

It would be nice, too, if the Kernel would define tanh and similar methods, anyway.

Thanks so much!

Method getGroupSize not found

When I use the method "getGroupSize()", defied the the table in section "Kernel methods to determine execution identity" in the Quick Reference Guide I get the following error message when compiling the class:

src/com/amd/aparapi/sample/taskinfo/Main.java:61: error: cannot find symbol
            groupSize[gid] = getGroupSize();
                             ^
  symbol: method getGroupSize()

Btw: Why are the methods from the table, in particular the method "getGlobalId()", not defined in the API documentation of the class "Kernel"? At least they should be described in the description of the class.

Single sequential execution reports error

When I execute the example "add" (on Linux with JDK 7) in the single sequential Java loop, set by the property "-Dcom.amd.aparapi.executionMode=SEQ", I get the following error message:

Exception in thread "main" java.lang.IllegalStateException: Can't run range with group size >1 sequentially. Barriers would deadlock!
at com.amd.aparapi.internal.kernel.KernelRunner.executeJava(KernelRunner.java:242)
at com.amd.aparapi.internal.kernel.KernelRunner.execute(KernelRunner.java:1196)
at com.amd.aparapi.Kernel.execute(Kernel.java:2028)
at com.amd.aparapi.Kernel.execute(Kernel.java:1962)
at com.amd.aparapi.Kernel.execute(Kernel.java:1933)
at com.amd.aparapi.sample.add.Main.main(Main.java:67)

Steps to reproduce on Linux:

"java -version" gives "OpenJDK Runtime Environment (IcedTea 2.6.7) (7u111-2.6.7-2~deb7u1)
OpenJDK 64-Bit Server VM (build 24.111-b01, mixed mode)"
mkdir Aparapi
cd Aparapi
unzip ~/dist_linux_x86_64.zip
cd samples/add/
java -Djava.library.path=../.. -classpath ../../aparapi.jar:add.jar -Dcom.amd.aparapi.executionMode=SEQ com.amd.aparapi.sample.add.Main

Port the wiki to markdowns inside doc dir

I started on this. Please let me know if there's another effort going on.