autumnai / collenchyma-nn
Collenchyma plugin for backend-agnostic Neural Network operations
Home Page: http://autumnai.github.io/collenchyma-nn
License: Apache License 2.0
In-place activation functions, where the result of the activation is written to the same location its input is read from, would greatly reduce memory usage (for most neural network architectures by a factor of ~2).
However, I am not sure what the best way to support this would be. An additional trait for every activation function seems a bit like overkill, but maybe that's the most viable way (it is also better for partial implementations). I am open to suggestions here.
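One option, as a minimal sketch (the trait and method name here are hypothetical, not existing API): a companion trait per activation whose method takes a single tensor and overwrites it.
/// Hypothetical companion trait for an in-place ReLU: the result is
/// written back into the buffer the input was read from.
pub trait ReluPointwise<F> : NN<F> {
    fn relu_pointwise(&self, x: &mut ::co::tensor::SharedTensor<F>)
                      -> Result<(), ::co::error::Error>;
}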
We originally wanted to remove the somewhat unnecessary needs_cudnn_workspace with #37, but for some reason always computing the necessary workspace size via cuDNN for bwd_data did not work out.
The current solution looks like this:
collenchyma-nn/src/frameworks/cuda/mod.rs
Lines 309 to 313 in d9bb5da
needs_cudnn_workspace heuristics.
The Float trait boundary is too restrictive and doesn't really serve any purpose:
pub trait NN<F: Float>
Similar to autumnai/collenchyma-blas#7
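For illustration, a minimal sketch of the suggested relaxation (trait bodies elided):
// Today the umbrella trait carries a bound that nothing in it needs:
// pub trait NN<F: Float> { /* ... */ }

// Suggested: drop the bound here and add it only on the operation traits
// whose implementations genuinely require float semantics.
pub trait NN<F> { /* ... */ }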
This issue was automatically generated. Feel free to close without ceremony if
you do not agree with re-licensing or if it is not possible for other reasons.
Respond to @cmr with any questions or concerns, or pop over to #rust-offtopic on IRC to discuss.
You're receiving this because someone (perhaps the project maintainer)
published a crates.io package with the license as "MIT" xor "Apache-2.0" and
the repository field pointing here.
TL;DR the Rust ecosystem is largely Apache-2.0. Being available under that
license is good for interoperation. The MIT license as an add-on can be nice
for GPLv2 projects to use your code.
The MIT license requires reproducing countless copies of the same copyright
header with different names in the copyright field, for every MIT library in
use. The Apache license does not have this drawback. However, this is not the
primary motivation for me creating these issues. The Apache license also has
protections from patent trolls and an explicit contribution licensing clause.
However, the Apache license is incompatible with GPLv2. This is why Rust is
dual-licensed as MIT/Apache (the "primary" license being Apache, MIT only for
GPLv2 compat), and doing so would be wise for this project. This also makes
this crate suitable for inclusion and unrestricted sharing in the Rust
standard distribution and other projects using dual MIT/Apache, such as my
personal ulterior motive, the Robigalia project.
Some ask, "Does this really apply to binary redistributions? Does MIT really
require reproducing the whole thing?" I'm not a lawyer, and I can't give legal
advice, but some Google Android apps include open source attributions using
this interpretation. Others also agree with it.
But, again, the copyright notice redistribution is not the primary motivation
for the dual-licensing. It's stronger protections to licensees and better
interoperation with the wider Rust ecosystem.
To do this, get explicit approval from each contributor of copyrightable work
(as not all contributions qualify for copyright, due to not being a "creative
work", e.g. a typo fix) and then add the following to your README:
## License
Licensed under either of
* Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
* MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
### Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any
additional terms or conditions.
and in your license headers, if you have them, use the following boilerplate
(based on that used in Rust):
// Copyright 2016 collenchyma-nn Developers
//
// Licensed under the Apache License, Version 2.0, <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license <LICENSE-MIT or
// http://opensource.org/licenses/MIT>, at your option. This file may not be
// copied, modified, or distributed except according to those terms.
It's commonly asked whether license headers are required. I'm not comfortable
making an official recommendation either way, but the Apache license
recommends it in their appendix on how to use the license.
Be sure to add the relevant LICENSE-{MIT,APACHE} files. You can copy these from the Rust repo for a plain-text version.
And don't forget to update the license metadata in your Cargo.toml to:
license = "MIT OR Apache-2.0"
I'll be going through projects which agree to be relicensed and have approval by the necessary contributors and doing these changes, so feel free to leave the heavy lifting to me!
To agree to relicensing, comment with:
I license past and future contributions under the dual MIT/Apache-2.0 license, allowing licensees to choose either at their option.
Or, if you're a contributor, you can check the box in this repo next to your
name. My scripts will pick this exact phrase up and check your checkbox, but
I'll come through and manually review this issue later as well.
Currently each of the convolution operations _forward, _backward_data, _backward_filter has its own allocated workspace, when one workspace shared between them should be enough.
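As a hedged sketch of the sizing rule (helper name hypothetical): one buffer sized for the largest of the three passes satisfies all of them.
use std::cmp::max;

/// Hypothetical helper: a single workspace sized for the biggest per-pass
/// requirement can be reused by forward, backward-data and backward-filter
/// instead of allocating three separate buffers.
fn shared_workspace_size(fwd: usize, bwd_data: usize, bwd_filter: usize) -> usize {
    max(fwd, max(bwd_data, bwd_filter))
}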
Since we always use cuDNN's N-dimensional tensor, which only allows dimensions >= 3, you currently always have to reshape a SharedTensor before using it with any operation, even when the rank is irrelevant to the operation (e.g. activation functions shouldn't care about the rank of the tensor).
If a {1,2}-dimensional tensor is used for an operation, its dimensions should automatically be filled with leading 1s if that doesn't change the operation's behaviour, as in the sketch below.
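A minimal sketch of the padding rule (pure helper, name hypothetical):
/// Pad a {1,2}-dimensional shape with leading 1s so it meets cuDNN's
/// minimum rank of 3. Leading 1s are degenerate dimensions, so the
/// operation's behaviour is unchanged.
fn pad_to_min_rank(dims: &[usize], min_rank: usize) -> Vec<usize> {
    if dims.len() >= min_rank {
        return dims.to_vec();
    }
    let mut padded = vec![1; min_rank - dims.len()];
    padded.extend_from_slice(dims);
    padded
}

// pad_to_min_rank(&[128, 10], 3) == vec![1, 128, 10]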
Small chore to clean up:
use co::backend::{Backend, BackendConfig};
use co::framework::IFramework;
use co::frameworks::Cuda;
use co::tensor::SharedTensor;
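If collenchyma's prelude (see autumnai/collenchyma#42) covers these items, the cleanup could be as small as:
use co::prelude::*;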
Add bench tests to track performance improvements.
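For illustration, a bench could look roughly like this (nightly-only test::Bencher harness; the benched operation is a placeholder):
#![feature(test)]
extern crate test;

use test::Bencher;

#[bench]
fn bench_sigmoid_forward(b: &mut Bencher) {
    // Placeholder body; a real bench would build a Backend, fill a
    // SharedTensor, and time e.g. backend.sigmoid(&mut x, &mut result).
    b.iter(|| test::black_box(0u32));
}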
collenchyma-nn runs with cuDNN v3 and v4. The Feature Matrix in the README does not reflect that, though.
#18 reduces the memory usage for use cases where you need the workspace for both the forward and backward pass (training).
I haven't done any testing, but it is possible that the forward workspace is smaller than the backward one, leading to higher memory usage than necessary in pure forward (inference) use cases.
This would also help separate the automatic convolution algorithm detection for those use cases, leading to quicker startup times (this should already be possible by using an Algo other than Auto, but it would be clearer).
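A hedged sketch of the phase split (names hypothetical):
use std::cmp::max;

// Only account for the backward passes when the config is built for
// training, so pure inference pays only for the forward workspace.
enum Phase { Inference, Training }

fn workspace_size(phase: Phase, fwd: usize, bwd_data: usize, bwd_filter: usize) -> usize {
    match phase {
        Phase::Inference => fwd,
        Phase::Training => max(fwd, max(bwd_data, bwd_filter)),
    }
}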
See autumnai/collenchyma#40, autumnai/collenchyma#42
The same issue with namespace clashes also exists here. While we already have a plugin module which serves a similar (but also additional) purpose, a change to a more uniform prelude would be better.
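A sketch of what such a prelude could look like (the re-export list is illustrative, not a final set):
// src/lib.rs
pub mod prelude {
    // One `use collenchyma_nn::prelude::*;` instead of a pile of
    // individual imports, and no clashes with collenchyma's own names.
    pub use ::plugin::{Sigmoid, Relu, Tanh, Softmax, Convolution, Pooling};
}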
---- convolution_spec_cuda::it_computes_correct_convolution_on_cuda_for_f64_plain stdout ----
thread 'convolution_spec_cuda::it_computes_correct_convolution_on_cuda_for_f64_plain' panicked at 'called `Result::unwrap()` on an `Err` value: BadParam("At least one of the following conditions are met: One of the parameters `handle`, `src_desc`, `filter_desc`, `conv_desc`, `dest_desc` is NULL. The tensor `dest_desc` or `filter_desc` are not of the same dimension as `src_desc`. The tensor `src_desc`, `dest_desc` or `filter_desc` are not of the same data type. The numbers of feature maps of the tensor `src_desc` and `filter_desc` differ. The tensor `src_desc` has a dimension smaller than 3.")', ../src/libcore/result.rs:746
stack backtrace:
1: 0x562fc1e6be00 - std::sys::backtrace::tracing::imp::write::h714760a4c8c0cdd8
2: 0x562fc1e6f26b - std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::hff309ab1d83ffd90
3: 0x562fc1e6ee5b - std::panicking::default_hook::h08ad3bb09872855b
4: 0x562fc1e6281f - std::sys_common::unwind::begin_unwind_inner::h406d5f1a330b854b
5: 0x562fc1e637e8 - std::sys_common::unwind::begin_unwind_fmt::h57ea3fbee1a40196
6: 0x562fc1e6b051 - rust_begin_unwind
7: 0x562fc1ea532f - core::panicking::panic_fmt::ha6b3c19493c123b3
8: 0x562fc1e30d2c - core::result::unwrap_failed::h5ca7669a9ea0bd2d
at ../src/libcore/macros.rs:29
9: 0x562fc1e384a7 - _<std..result..Result<T, E>>::unwrap::hc227083efa2e1836
at ../src/libcore/result.rs:687
10: 0x562fc1e3c107 - collenchyma_nn::frameworks::cuda::_<impl plugin..Convolution<f64> for co..Backend<co..Cuda>>::new_convolution_config::h50a25026e3106645
at src/frameworks/cuda/mod.rs:274
11: 0x562fc1defa04 - convolution_specs::convolution_spec_cuda::it_computes_correct_convolution_on_cuda_for_f64_plain::hd6d9e72d4485bc15
at tests/convolution_specs.rs:186
12: 0x562fc1e0ccd6 - _<F as std..boxed..FnBox<A>>::call_box::h70afdbb50723dc50
13: 0x562fc1e0f41b - std::sys_common::unwind::try::try_fn::h84d8f5327fa145ba
14: 0x562fc1e6afdb - __rust_try
15: 0x562fc1e6af6d - std::sys_common::unwind::inner_try::h4e97625a08807651
16: 0x562fc1e0f79a - _<F as std..boxed..FnBox<A>>::call_box::h588f35c7f2016768
17: 0x562fc1e6d904 - std::sys::thread::Thread::new::thread_start::h74af400293164137
18: 0x7fa5884ce6a9 - start_thread
19: 0x7fa587fece9c - clone
20: 0x0 - <unknown>
The ordering of the stride and padding arguments is inconsistent between new_convolution_config and new_pooling_config, making their usage unintuitive.
fn new_pooling_config(
    &self,
    window: &[i32],
    padding: &[i32], // padding first
    stride: &[i32],
)

fn new_convolution_config(
    &self,
    src: &::co::tensor::SharedTensor<f32>,
    dest: &::co::tensor::SharedTensor<f32>,
    filter: &mut ::co::tensor::SharedTensor<f32>,
    stride: &[i32], // stride first
    zero_padding: &[i32],
)
We should decide on one ordering and use that consistently.
Are there any other function signatures that could be affected by this?
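For illustration, one possible convention (hypothetical; tensor arguments and bodies elided) would be padding before stride everywhere:
fn new_pooling_config(window: &[i32], padding: &[i32], stride: &[i32]) {}
fn new_convolution_config(padding: &[i32], stride: &[i32]) {}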
cargo build
Compiling collenchyma-nn v0.3.4 (file:///home/ben/src/leaf/collenchyma-nn)
src/frameworks/cuda/helper.rs:5:8: 5:59 error: transmute called with differently sized types: u64 (64 bits) to *const libc::c_void (32 bits) [E0512]
src/frameworks/cuda/helper.rs:5 Ok(::std::mem::transmute::<u64, *const ::libc::c_void>(
src/frameworks/cuda/helper.rs:16:8: 16:57 error: transmute called with differently sized types: u64 (6
I was wondering why it passed on Travis, but that uses rustc 1.6.0. This machine is 32-bit Linux; looks like it's because of the pointer size difference.
autumnai/collenchyma#50 has a workaround here:
collenchyma-nn/src/frameworks/cuda/mod.rs
Lines 276 to 284 in e30b59d
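For illustration, a portable alternative to the fixed-width transmute (a sketch, not necessarily what #50 does): casting through usize follows the target's pointer width instead of hard-coding 64 bits.
extern crate libc;

/// Hypothetical stand-in for the failing helper: type-checks on both
/// 32- and 64-bit targets (on 32-bit the high bits are dropped, which is
/// fine when the id is itself a pointer-sized value).
fn id_as_ptr(id: u64) -> *const libc::c_void {
    id as usize as *const libc::c_void
}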