
libwb's Introduction

libWB


libwb's People

Contributors

abduld, andre-orr, bgp2112, bretmckee, chadheim, cwpearson, jbzdak, nirosys, oggy, profbbrown, sean-dougherty, ulidtko, zestrada, zonca


libwb's Issues

Accessing wbTimers with local installation

I'm running a local installation of libwb. Invocations of wbLog write to stdout, but I don't know how to access the timings collected by the wbTimers (via the macros wbTime_start and wbTime_stop). Guidance would be much appreciated!

-arch not included in nvcc argument list, causes kernels not to launch on older GPUs

I've run into a problem running code locally that webGPU had accepted as correct. Locally it produces random, incorrect results even for simple vector addition. It looks like the kernel was not writing anything to the memory allocated for the output vector, so what I got back was simply the pre-existing memory content. I started with a minimal test case and built up from there.

TL;DR:
"-arch=sm_xx" is not included in the nvcc arguments, and this prevented my code from launching kernels.

Apparently, GPUs with older architectures do not run kernels when the arch argument is omitted at compile time.

OS: OS X 10.8.0
CUDA version: 6.5
GPU: NVIDIA GeForce 8600 GT (compute capability 1.1)

Here is my test case:

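#include <stdio.h>
#include <stdlib.h>  // malloc, free
#include <cuda.h>

// Add 1 to each element; launched with exactly one thread per element.
__global__ void simpleCalc(float *input)
{
  int i = threadIdx.x + blockDim.x * blockIdx.x;
  input[i]++;
}

// Print the CUDA error, if any; returns -1 on failure and 0 on success.
int wbCheck(cudaError_t err)
{
  if (err != cudaSuccess) {
    printf("Got CUDA error ... %s\n", cudaGetErrorString(err));
    return -1;
  }
  return 0;
}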

int main(){
  float testVector[5]={ 2.0, 2.0, 2.0, 2.0, 2.0 };
  float *deviceTestVector;
  float *hostTestVector = ( float * )malloc( 5 * sizeof(float));

  wbCheck(cudaMalloc((void**) &deviceTestVector, 5 * sizeof(float)));
  wbCheck(cudaMemcpy(deviceTestVector, testVector, 5 * sizeof(float), cudaMemcpyHostToDevice));

  simpleCalc<<<1,5>>>(deviceTestVector);
  cudaDeviceSynchronize();

  wbCheck(cudaMemcpy(hostTestVector, deviceTestVector, 5 * sizeof(float), cudaMemcpyDeviceToHost));
  wbCheck(cudaFree(deviceTestVector));
  for( int i=0; i<5; i++) {
    printf("%f", hostTestVector[i]);
  }
  free(hostTestVector);
  return 0;
}

When I compile this as follows:

nvcc -arch=sm_11 cudatest.cu -o cudatest

Its execution output is 3.0 (repeated five times), as expected.

However, when I compile it without the arch argument:

nvcc cudatest.cu -o cudatest

Its execution output is 2.0 (repeated five times). The kernel did not execute.

Now, when I change the code a bit to work with libwb:

// MP 1
#include <wb.h>
#define wbCheck(stmt) do {                                                    \
        cudaError_t err = stmt;                                               \
        if (err != cudaSuccess) {                                             \
            wbLog(ERROR, "Failed to run stmt ", #stmt);                       \
            wbLog(ERROR, "Got CUDA error ...  ", cudaGetErrorString(err));    \
            return -1;                                                        \
        }                                                                     \
    } while(0)

__global__ void simpleCalc(float *input)
{
  int i = threadIdx.x + blockDim.x * blockIdx.x;
  input[i]++;
}

int main(){
  float testVector[5]={ 2.0, 2.0, 2.0, 2.0, 2.0 };
  float *deviceTestVector;
  float *hostTestVector = ( float * )malloc( 5 * sizeof(float));

  wbCheck(cudaMalloc((void**) &deviceTestVector, 5 * sizeof(float)));
  wbCheck(cudaMemcpy(deviceTestVector, testVector, 5 * sizeof(float), cudaMemcpyHostToDevice));

  simpleCalc<<<1,5>>>(deviceTestVector);
  cudaDeviceSynchronize();

  wbCheck(cudaMemcpy(hostTestVector, deviceTestVector, 5 * sizeof(float), cudaMemcpyDeviceToHost));
  wbCheck(cudaFree(deviceTestVector));
  for( int i=0; i<5; i++) {
    printf("%f", hostTestVector[i]);
  }
  free(hostTestVector);
  return 0;
}

I compiled this with the provided makefiles and got 2.0 (repeated five times) as the execution result.

I wouldn't mind fixing this myself and submitting a pull request, but I'm not experienced with CMake and have no idea where to start.

If someone can point me in the right direction, that would be much appreciated. Concretely, where should I implement this check (or, better said, where can I find the offending code)? And is it a good idea to have CMake compile a short piece of CUDA code that outputs the compute capability, or is there an easier way of doing it?
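As a rough illustration of the probe idea raised above, the sketch below shows only the last step: turning a compute capability into the matching nvcc flag. The name archFlag is hypothetical and not part of libwb. It assumes the major and minor numbers have already been obtained, for instance from the major and minor fields of cudaDeviceProp as filled in by cudaGetDeviceProperties().

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of libwb): build the nvcc architecture flag
// for a given compute capability. In a real probe, major and minor would be
// read from the `major` and `minor` fields of cudaDeviceProp.
std::string archFlag(int major, int minor) {
    // Compute capabilities are written as <major>.<minor> with a one-digit minor.
    assert(major >= 1 && minor >= 0 && minor <= 9);
    return "-arch=sm_" + std::to_string(major) + std::to_string(minor);
}
```

For the GeForce 8600 GT in this report (compute capability 1.1), archFlag(1, 1) yields the same flag that made the manual build work. A build script could run a small probe program once at configure time and pass the printed flag to nvcc, although whether that is preferable to a user-settable option is a design choice for the maintainers.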

WebGPU isn't working

I'm trying to compile and run, but it doesn't work.
Is it because the course is over?

libwb JSON viewer (wbLogger)

Is there a JSON viewer available (in JavaScript), similar to the one on the WebGPU site, that can be used to view the output of the wbLogger? I am able to compile and run the WebGPU labs without difficulty, but reading JSON from the logger in the terminal is not pleasant.

Many thanks!

libwb json issue

From: Joe Bungo
Sent: Tuesday, October 18, 2016 1:41 PM
To: 'G. Jan Wilms'
Subject: RE: GPU Teaching Kit QwikLAB Access

Hi Jan,

Can you try upgrading to the latest version of CUDA? This has solved a lot of compatibility issues so far.

Joe Bungo
GPU Educators Program Manager
NVIDIA Corporation | Academic Programs
developer.nvidia.com/educators

From: G. Jan Wilms
Sent: Tuesday, October 11, 2016 8:17 PM
To: Joe Bungo
Subject: Re: GPU Teaching Kit QwikLAB Access

I have found the support libraries (libwb) to be very finicky, particularly the third-party hpp code in the vendor folder. On machines with an identical setup (Visual Studio 2013 Update 5, CUDA Toolkit 7.5), the wbTime_stop() function works fine on some machines and crashes on others. The following code is the culprit:

json11::Json json = json11::Json::object{
    {"type", "timer"},
    {"id", wbTimerNode_getId(node)},
    {"session_id", wbTimerNode_getSessionId(node)},
    {"data", wbTimerNode_toJSONObject(node)}};
std::cout << json.dump() << std::endl;

The assignment of “timer” to type in the json struct is invalid (on some machines) and causes the crash:

[Screenshot: json issue]

I have been able to work around it by commenting out the wbTime_stop function call (or the assignment to type), but it is still puzzling.

Blessings,
-gjw

Runtime errors not recorded in attempts tab in web UI?

On the web UI accessed via the Coursera class, I was getting occasional errors when running the data sets for the MP1 assignment. They seemed tied to server load, since rerunning the set fixed the errors. But of the 4 or 5 errors saying either that the data set output was not correct or that an exception occurred, none were recorded in my attempts tab. So I suspect that runtime errors or exceptions are not being recorded as attempts, because I only see compile errors or successful runs logged. Either that, or the server is just acting weird under the current workload.

librt required but not mentioned in CMakeLists.txt

Building libwb fails on my system with the message:

/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/MP0.dir/wbTimer.cpp.o: undefined reference to symbol 'clock_gettime@@GLIBC_2.2.5'
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/../../../../x86_64-pc-linux-gnu/bin/ld: note: 'clock_gettime@@GLIBC_2.2.5' is defined in DSO /lib64/librt.so.1 so try adding it to the linker command line
/lib64/librt.so.1: could not read symbols: Invalid operation

After adding -lrt to the linker flags the compilation succeeds. I guess this error is due to my glibc version being 2.16.

I suggest adding something along the lines of

include(CheckFunctionExists)
set(CMAKE_EXTRA_INCLUDE_FILES time.h)
CHECK_FUNCTION_EXISTS(clock_gettime HAVE_CLOCK_GETTIME)
if(NOT HAVE_CLOCK_GETTIME)
  find_library(LIBRT_LIBRARIES rt)
  if(NOT LIBRT_LIBRARIES)
    message(FATAL_ERROR "librt not found")
  else(NOT LIBRT_LIBRARIES)
    set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${LIBRT_LIBRARIES}")
  endif(NOT LIBRT_LIBRARIES)
endif(NOT HAVE_CLOCK_GETTIME)

to CMakeLists.txt.

wbImport doesn't support cudaHostAlloc

wbImport allocates data using malloc and doesn't provide any way to use cudaHostAlloc.
Either provide a switch to enable it, or provide an API that allows reading inputLength without allocating or loading the data. Once inputLength is known, I can use cudaHostAlloc for the allocation. I will then need another API that only loads the data and doesn't calculate inputLength.
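The two-step API requested here could look roughly like the sketch below. It is not libwb code: readLength and loadInto are hypothetical names, and the sketch assumes a text input whose first line holds the element count, followed by the values.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>

// Step 1 (hypothetical): read only the element count. Nothing is allocated.
std::size_t readLength(std::istream &in) {
    std::size_t n = 0;
    if (!(in >> n)) throw std::runtime_error("missing element count");
    return n;
}

// Step 2 (hypothetical): load n values into storage owned by the caller.
// The stream must be positioned just after the count read by readLength().
void loadInto(std::istream &in, float *dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (!(in >> dst[i])) throw std::runtime_error("truncated data");
    }
}
```

Between the two calls the caller is free to pick the allocator, for example cudaHostAlloc for pinned memory, since loadInto never allocates. A single-call variant with an allocator switch would also satisfy the request; the split version just keeps the loader independent of CUDA.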

Print formatting support for wbLog

Please add print formatting support for wbLog. As an alternative, show STDOUT and STDERR on the website. It would be nice for some light debugging.
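Until something like this exists in libwb, one possible stopgap is a small printf-style helper that returns a std::string, which can then be passed to a logger as a single value. The sketch below is not part of libwb, and formatMessage is a hypothetical name; it relies only on the standard vsnprintf function.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of libwb): printf-style formatting into a string.
std::string formatMessage(const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);

    // First pass on a copy of the argument list: measure the required length.
    va_list copy;
    va_copy(copy, args);
    int len = std::vsnprintf(nullptr, 0, fmt, copy);
    va_end(copy);
    if (len < 0) {  // formatting error
        va_end(args);
        return std::string();
    }

    // Second pass: format into a buffer with room for the terminating NUL.
    std::vector<char> buf(static_cast<std::size_t>(len) + 1);
    std::vsnprintf(buf.data(), buf.size(), fmt, args);
    va_end(args);
    return std::string(buf.data(), static_cast<std::size_t>(len));
}
```

Assuming wbLog accepts any value that can be streamed, a call such as wbLog(TRACE, formatMessage("i=%d x=%.1f", i, x)) would then give formatted output without changes to the library.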

The system fails when reading CSV files created on Windows

The issue is in the wbFile_read() function in wbFile.cpp. Windows-formatted CSV files have \r\n line endings, and the files are opened in the default text mode, so each \r\n is replaced by \n in the buffer. As a result, the value returned by res = fread() is less than count (the file length), and the function fails. The following commented-out block and amendment can be applied as a quick fix, but they remove the error check, so it is probably better to perform binary reads that take \r\n line endings into account.

/* if (res != count) {
     wbLog(ERROR, "Failed to read data from ", wbFile_getFileName(file));
     wbDelete(buffer);
     return NULL;
   }
   buffer[bufferlen - 1] = '\0'; // make valid C string
*/

buffer[size * res] = '\0'; // make valid C string
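The binary-read alternative mentioned above could be sketched as follows. This is not the libwb implementation: readTextFile and normalizeLineEndings are hypothetical names, and the sketch uses C++ streams rather than the fread call in wbFile.cpp.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Convert every "\r\n" pair to "\n"; a lone '\r' is left untouched.
std::string normalizeLineEndings(const std::string &raw) {
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        // Skip a '\r' only when it is immediately followed by '\n'.
        if (raw[i] == '\r' && i + 1 < raw.size() && raw[i + 1] == '\n') continue;
        out.push_back(raw[i]);
    }
    return out;
}

// Read the whole file in binary mode, so the number of bytes read matches the
// file size on every platform, then normalize the line endings explicitly.
// Returns an empty string if the file cannot be opened.
std::string readTextFile(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return std::string();
    std::string raw((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());
    return normalizeLineEndings(raw);
}
```

With this approach the length check can stay in place, because the comparison is made against the raw byte count before any line-ending translation happens.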

Win64: Division by zero in _MSC_VER macro.

Division by zero in the _MSC_VER code path, line 23:

return ((uint64_t) counter.LowPart * NANOSEC / _hrtime_frequency) +
       (((uint64_t) counter.HighPart * NANOSEC / _hrtime_frequency) << 32);

Full Call Stack:
MP0.exe!_hrtime() Line 24 + 0xd bytes C++
MP0.exe!wb_init() Line 40 C++
MP0.exe!wbArg_new() Line 10 C++
MP0.exe!wbArg_read(int argc, char * * argv) Line 82 + 0xd bytes C++
MP0.exe!main(int argc, char * * argv) Line 10 + 0x1c bytes C++
MP0.exe!__tmainCRTStartup() Line 555 + 0x19 bytes C
MP0.exe!mainCRTStartup() Line 371 C
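The stack trace suggests that _hrtime_frequency is still zero the first time _hrtime() runs; the report does not say why. Assuming the frequency had simply not been initialised yet (on Windows it would normally come from QueryPerformanceFrequency), a guarded conversion could look like the sketch below. ticksToNanos is a hypothetical name, not the libwb function, and it takes the full 64-bit tick count rather than the separate LowPart and HighPart halves.

```cpp
#include <cstdint>

const std::uint64_t NANOSEC = 1000000000ULL;

// Hypothetical helper: convert performance-counter ticks to nanoseconds.
// Returns 0 when the frequency is unknown instead of dividing by zero.
std::uint64_t ticksToNanos(std::uint64_t ticks, std::uint64_t frequency) {
    if (frequency == 0) return 0;  // counter not initialised or unavailable
    // Split into whole seconds and a remainder so that the multiplication
    // by NANOSEC does not overflow for realistic counter frequencies.
    std::uint64_t whole = ticks / frequency;
    std::uint64_t rem = ticks % frequency;
    return whole * NANOSEC + (rem * NANOSEC) / frequency;
}
```

Returning 0 for an unknown frequency is only one option; initialising the frequency before the first call, or reporting an error, may suit the library better.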

OpenCL header causes build fail on OS X Mavericks.

Since de0d699, which added the inclusion of the OpenCL header, builds on Mavericks fail for me, with clang complaining about SSE intrinsics.

I haven't had a chance to work out a fix, so I've resorted to commenting it out until I'm done with the current assignment. I will update this when I have a fix, if someone else doesn't get to it first.

I also have a minor patch that allows the CMake script to build on Mavericks. Unfortunately, I do not have a machine running 10.8 or earlier to verify correctness on pre-Mavericks systems.
