cambridge-iccs / ftorch
A library for directly calling PyTorch ML models from Fortran.
Home Page: https://cambridge-iccs.github.io/FTorch/
License: MIT License
PyTorch installations can sometimes lead to a situation where CUDA is needed by libtorch but is not available on the host system (perhaps because it doesn't have a GPU). A solution is to install PyTorch from a CPU-only repository:
pip install torch --index-url https://download.pytorch.org/whl/cpu
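As a quick sanity check that a CPU-only build is in use (a sketch; torch.version.cuda is None for CPU-only wheels):

```python
import torch

print(torch.version.cuda)         # None for a CPU-only build
print(torch.cuda.is_available())  # False without a usable CUDA runtime
```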
It might be argued that this isn't our problem?
We should look into creating longer-form documentation using ReadTheDocs/Sphinx (rst) or FORD.
This should be hosted elsewhere and cover most of what is already there, perhaps more detail on the examples, and full API documentation.
Do we want to host on rtd?
Recently I have wanted to use an AI model from Python in a General Circulation Model (GCM) that depends on the 2017 version of the Intel compiler, but this FTorch module would be compiled with the 2021 version of the Intel compiler. I wonder how to solve this problem, and whether FTorch can be compiled with an older Intel compiler such as 2017?
The description of pt2ts.py in the utils README could be interpreted that the model will be saved in the same directory as the pt2ts.py file, but it is instead saved in the directory that pt2ts.py is called from.
Two potential solutions are:
- update the description so it is clear the model is saved to the directory pt2ts.py is called from, or
- modify pt2ts.py so the model is always saved to a fixed location (e.g. the directory containing pt2ts.py).
I would lean towards the latter, so the location is more consistent.
For example, pt2ts.py might be called from the model directory, a build directory, or the directory above (such as from a script like run_benchmarks.sh), each of which would currently save the model in a different location.
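A minimal sketch of the second option, resolving the save path relative to the script itself rather than the caller's working directory (names here are illustrative, not the current pt2ts.py code):

```python
import os

import torch

model = torch.nn.Linear(4, 2)  # stand-in for the real model
scripted_model = torch.jit.script(model)

# Save next to this script, regardless of where it is invoked from.
script_dir = os.path.dirname(os.path.abspath(__file__))
scripted_model.save(os.path.join(script_dir, "saved_model.pt"))
```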
Currently the outputs from the python ResNet inference program are not the same as the outputs from the Fortran ResNet inference program in the example.
Really we should make sure they, at minimum, produce the same output, and ideally make it something meaningful (e.g. the maximum value and its location, perhaps using a real image).
This is not an issue I've encountered, but for someone following the FTorch build instructions, the version of libtorch/pytorch installed may mean that FTorch is incompatible with the model saved in the examples, since the instructions pip install torch into a (new) virtual environment.
This would only lead to errors if breaking changes were made to the TorchScript format between the versions, and in many cases the same pip-installed torch would be used anyway.
At present the wrapper codes can only accept a single output tensor from a TorchScript model.
However, it is perfectly possible to return multiple output tensors from a PyTorch Model.
In this instance the python code returns them as a Tuple.
Is it possible to return multiple output tensors when using the C++ API (and by extension our wrapper scripts)?
The 'simple' answer is to add a concatenation layer onto the end of the pytorch model to return a single tensor and then unpack later. However, this is hacky, requires model alteration to use our code, and restricts us to returning only a single type(?).
The forward method in the API returns an IValue, which is a union over various types: https://pytorch.org/cppdocs/api/structc10_1_1_i_value.html
There has been some discussion on this topic:
It looks like this is probably possible, but may need a bit of rummaging around with C/C++ types and moving pointers.
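For reference, a minimal sketch of the Python side: a toy model whose forward pass returns two tensors as a tuple, which TorchScript preserves when scripted.

```python
from typing import Tuple

import torch


class TwoOutputNet(torch.nn.Module):
    """Toy model returning two output tensors as a tuple."""

    def forward(self, x: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        return x * 2, x + 1


scripted = torch.jit.script(TwoOutputNet())
out_a, out_b = scripted(torch.ones(3))  # the tuple survives scripting
```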
We should separate the examples from the key source files in the library.
Perhaps into an /examples directory?
At some point it would be nice to make an example in the examples repo that is as simple as possible (i.e. a 1-D input tensor etc.) so that the user can focus on the coupling procedure rather than any of the deeper challenges that may arise in later exercises (transposes/memory layouts, precision, etc.)
Candidates might be a 'Net' that simply multiplies by two etc.
I will try and do this at some point.
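A sketch of what such a minimal example could look like (names and filename are placeholders):

```python
import torch


class MultiplyByTwoNet(torch.nn.Module):
    """The simplest possible 'model': doubles a 1-D input tensor."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return 2 * x


scripted = torch.jit.script(MultiplyByTwoNet())
scripted.save("simplenet.pt")  # illustrative filename
```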
We should add a hook to generate the Fortran source for FTorch from fypp.
This will mean that the full source is in the repo and can be used directly and in documentation.
For developers this would mean they edit the fypp file and then push that, with F90 being generated by the hook.
Black has updated and there are some changes to formatting rules that need incorporating to satisfy CI.
It would be good to provide a citation.cff file in an agreed format.
This might need amending as we present and write about it.
At present the library only allows us to pass a single tensor to a model and receive a single output tensor.
This needs updating so that we can pass multiple input tensors.
In the Fortran:
In the C++: .push_back() on to the inputs to the C function.

We need to decide if it is best to:
torch_tensor_to_array()
If the latter, then the interface needs expanding beyond the current 'float' and 'double'.
I can see an argument for the former, however, as it
Discussion appreciated, then we need to reach a decision and triage:
As discussed in #78, there are (at least) two forms of optimisation that would be relatively straightforward to facilitate in some capacity, but require more consideration/are unlikely to be the default options (which is why they are not included in the referenced PR):
- model freezing (and optimize_for_inference, which is currently broken, unfortunately)
- inference mode

Model freezing: as shown in the "FTorch with/without gradients and/or frozen models" sections, freezing the model can make more modest, but not insignificant, improvements (in most cases). This can be achieved in pt2ts.py by replacing scripted_model.save(filename) with frozen_model = torch.jit.freeze(scripted_model), and then frozen_model.save(filename); see the sketch below. Note that optimize_for_inference can give errors, e.g. AttributeError: 'RecursiveScriptModule' object has no attribute 'training', when the model saved by pt2ts.py is used as part of a workflow involving FTorch. trace_to_torchscript currently uses model freezing; it would be preferable to have a shared setting and/or behaviour, unless there is a clear reason to use freezing in only one of the functions. Benchmarking trace_to_torchscript compared with script_to_torchscript may also be useful, as currently there is no clear motivation not to use the "default" script_to_torchscript.
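A sketch of that change to pt2ts.py (the model and filename are stand-ins; note that torch.jit.freeze requires the module to be in eval mode):

```python
import torch

model = torch.nn.Linear(4, 2)  # stand-in for the real model
filename = "saved_model.pt"    # illustrative filename

scripted_model = torch.jit.script(model)
scripted_model.eval()  # freezing requires a module in eval mode

# Instead of scripted_model.save(filename):
frozen_model = torch.jit.freeze(scripted_model)
frozen_model.save(filename)
```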
Inference mode: as shown in the "FTorch with InferenceMode and NoGradMode" sections, benefits were less clear, but in general it is expected to be at least as fast. This can be achieved by replacing torch::AutoGradMode enable_grad(requires_grad); with c10::InferenceMode guard(requires_grad); in all ctorch.cpp functions, but ideally both options would be presented to users. InferenceMode is more restrictive than NoGradMode, so cannot be used in all cases.

There are some suggestions that libtorch doesn't work with Intel C compilers.
I thought we'd tested this, but perhaps we need to review and remove as an option from the README if it doesn't work!
It is not intuitive that when running for GPU the input should be put on CUDA, but the output should NOT (thanks @ElliottKasoar for pointing this out).
We should clearly document this somewhere, and provide an example of running on GPU.
Copied from an email chain with a user:
I hope it's alright that I'm reaching out. I've recently set up a framework to incorporate physics-informed, neural net-based, user material subroutines in Abaqus. The framework is quite simple and doesn't take full advantage of the neural net setup. I would be very interested in coupling the PyTorch models directly to Fortran, and so, I'd be very interested in exploring FTorch.
I had a quick question - I've been trying to install the library, but the CMake configuration fails to identify the Fortran compiler. I've tried:
set(CMAKE_Fortran_COMPILER "/MinGW/bin/gfortran.exe") added to the CMakeLists.txt
cmake .. -DCMAKE_Fortran_COMPILER="/MinGW/bin/gfortran.exe" in cmd.
With both, I get the following error:
It fails with the following output:
Change Dir: C:/Users/USER/FTorch/src/build/CMakeFiles/CMakeScratch/TryCompile-mb6xhj
Run Build Command(s):devenv.com CMAKE_TRY_COMPILE.sln /build Debug /project cmTC_03248 && The system cannot find the file specified
Generator: execution of make failed. Make command was: devenv.com CMAKE_TRY_COMPILE.sln /build Debug /project cmTC_03248 &&
Would you be able to help me work this out? What am I missing here?
Thank you for taking the time, I truly appreciate it.
Comments in ftorch.f90 describing the device to specify when running on GPUs suggest torch_kGPU, as opposed to the correct torch_kCUDA.
Based on #55 it is evidently not clear how to modularise the code for repeated calls in a sensible fashion.
We should adapt one of the benchmark examples to illustrate breaking code into init, main, and finalise functions.
Related to #56: the example code in the README is now incorrect, as it has not incorporated the updates to use the layout argument in torch_tensor_from_blob().
We should either:
- update the example to use t_t_from_array(), or
- add the layout argument to the existing call.
My preference is for the first (see discussion on #56).
However, this information should also be taken out of the README and placed in the longer form API docs now, perhaps as part of #53
We have discussed that fortran-pytorch-lib might not be the nicest or most catchy name for the project.
We should settle on a name before first release.
Currently we build a file called ftorch; could we perhaps call the project FTorch?
Disadvantages:
We should provide an example of using multiple input tensors to a model.
Is there a pre-existing trained net we can deploy?
Davenet?
Do this after restructuring examples for #12
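Until a pre-trained net is chosen, a placeholder along these lines might do (purely illustrative):

```python
import torch


class MultiInputNet(torch.nn.Module):
    """Toy model taking two input tensors."""

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return x * y + y


scripted = torch.jit.script(MultiInputNet())
out = scripted(torch.ones(4), 2 * torch.ones(4))
```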
As described in #72 we should perhaps include a note warning about memory pitfalls when calling a net twice with different inputs.
Also a test?
Some of the package names are presented differently across the code, e.g.,
It would be better to make these consistent.
This may be an issue with the implementation separate to FTorch, but from very large tests on GPUs (~100,000 iterations), I sometimes start to run into CUDA memory issues for the cgdrag benchmark example.
This example calls torch_tensor_delete after every iteration, but perhaps this is not cleaning up data on the GPU?
Full error (after running ./benchmarker_cgdrag_torch ../cgdrag_model saved_cgdrag_model_gpu.pt 100000 10 --use_cuda for ~32000 iterations):
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 79.15 GiB of which 18.00 MiB is free. Including non-PyTorch memory, this process has 79.12 GiB memory in use. Of the allocated memory 78.63 GiB is allocated by PyTorch, and 5.47 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Unit testing is absolutely required for this code. Ideally this could be done using Fortran by invoking the C API.
It is possible to use styling with FORD to customise appearance.
Whilst the current online docs are functional, a low-priority task might be to make them more visually appealing and to add better formatting of tables, notes, etc.
We should add some CMake flags to improve the build experience. This should include:
Once #53 is merged it would be good to prune the readme to make it friendlier to those who are just discovering the project.
Much of the more detailed information (e.g. Windows build instructions) can just point users to the online docs.
Although it is just a wrapper for the Fortran interface, some Doxygen-style comments on the C API header would be useful.
It would be nice if we could wrap the Fortran tensor generation to transform torch_tensor_from_blob into something like torch_tensor_from_array, with 'F' or 'C' passed through as a string -> c_char.
Should we use something like sonarqube to do static analysis on the fortran source code?
Run the testing suite on a GPU system. Requires saving the torch script a bit differently? @jatkinson1000
After make install, the installed library does not automatically pass the Fortran module directory to CMake projects that use the library in the approved manner. What should work is:
find_package(FTorch)
target_link_libraries(foo PRIVATE FTorch::ftorch)
This doesn't work: the compile fails, unable to use ftorch, and the Fortran module directory does not appear on the compiler command line.
When running make following the installation instructions in README.md, an error is raised:
libtorch/include/ATen/ATen.h:4:2: error: #error C++17 or later compatible compiler is required to use ATen.
This error was reproduced using both the CPU-only nightly (accessed on 19/09/23) and CPU-only (stable) 2.0.1 libtorch binaries.
This appears to be due to pytorch now requiring support for C++17:
A compiler that fully supports C++17, such as clang or gcc (especially for aarch64, gcc 9.4.0 or newer is required)
Versions used:
When using jit.script, it appears that .eval and the no_grad context are not saved.
This is likely to be partially responsible for worse performance than expected during inference.
My current solution has been to add model->eval(); and torch::NoGradGuard no_grad; in torch_jit_module_forward, as adding these in torch_jit_load did not appear to change the behaviour.
As it is also possible that these functions may be used in training, adding these settings conditionally would also be preferable.
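For comparison, the Python-side equivalent of re-establishing these settings at inference time (a sketch with an illustrative filename; the fix described above is on the C++ side):

```python
import torch

# Settings such as eval mode and no_grad do not appear to be preserved by
# jit.script, so re-apply them when the saved model is used.
model = torch.jit.load("saved_model.pt")  # illustrative filename
model.eval()

with torch.no_grad():
    output = model(torch.ones(1, 4))
```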
We should look at creating, as a first step, integration tests.
These should be adapted from the benchmarks repo.
Possibly as a set of Fortran programs run via a shell script?
This is not a high priority at the moment, but as discussed in #24, it would be good to improve the user friendliness of the stride functionality, through an F or C layout style and shape/size inference.
This should also be documented in (and potentially help clarify) the discussion about when to transpose arrays, in both the ResNet example and (soon) the long-form documentation.
We could use overloading in Fortran 90 to provide a simpler interface.
Here's an example of overloading based on arity: https://gist.github.com/dorchard/3cc13fe75d6d109cb75ec11d41ddc104
(Note something similar can work for overloading on input type too)
This is detritus for now, we can add it back in at a later date if we decide to support conda.
This should include general user documentation rendered to readthedocs via Sphinx. There is also an option to include the API docs within this site (doxygen).
When I compile with a conda installation of pytorch, I don't get errors during compiling, but when I test the library I get MKL errors and the numbers produced seem to be incorrect. I have tested against compiling the library with libtorch downloaded from source and that appears to work fine.
Here are the modules I'm using:
Currently Loaded Modules:
1) math (S) 3) cmake/3.24.2 5) intel-oneapi-compilers-cees-beta/2021.4.0-2xfl6 7) intel-oneapi-mkl-cees-beta/2021.4.0-k4r3o
2) devel (S) 4) gcc/10.3.0 6) intel-cees-beta/2021.4.0
and I've exported the conda environment here: mima-torch-env.txt. I'm using pytorch v2.
The path to TorchConfig.cmake is /home/groups/aditis2/lauraman/miniconda3/envs/mima-torch/lib/python3.9/site-packages/torch/share/cmake/Torch, so to build the library I use:
$ cmake .. -DTorch_DIR=/home/groups/aditis2/lauraman/miniconda3/envs/mima-torch/lib/python3.9/site-packages/torch/share/cmake/Torch -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$GROUP_HOME/lauraman/test-fortran-pytorch-lib
-- The C compiler identification is Intel 2021.4.0.20210910
-- The CXX compiler identification is Intel 2021.4.0.20210910
-- The Fortran compiler identification is Intel 2021.4.0.20210910
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/groups/s-ees/share/cees/spack_cees/spack/opt/spack/linux-centos7-x86_64_v3/gcc-4.8.5/intel-oneapi-compilers-2021.4.0-2xfl6e7kdhxegq5msukpqoxdmftsu2w6/compiler/2021.4.0/linux/bin/intel64/icc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/groups/s-ees/share/cees/spack_cees/spack/opt/spack/linux-centos7-x86_64_v3/gcc-4.8.5/intel-oneapi-compilers-2021.4.0-2xfl6e7kdhxegq5msukpqoxdmftsu2w6/compiler/2021.4.0/linux/bin/intel64/icpc - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Check for working Fortran compiler: /home/groups/s-ees/share/cees/spack_cees/spack/opt/spack/linux-centos7-x86_64_v3/gcc-4.8.5/intel-oneapi-compilers-2021.4.0-2xfl6e7kdhxegq5msukpqoxdmftsu2w6/compiler/2021.4.0/linux/bin/intel64/ifort - skipped
-- Detecting Fortran/C Interface
-- Detecting Fortran/C Interface - Found GLOBAL and MODULE mangling
-- Verifying Fortran/CXX Compiler Compatibility
-- Verifying Fortran/CXX Compiler Compatibility - Success
-- MKL_ARCH: None, set to ` intel64` by default
-- MKL_LINK: None, set to ` dynamic` by default
-- MKL_INTERFACE_FULL: None, set to ` intel_ilp64` by default
-- MKL_THREADING: None, set to ` intel_thread` by default
-- MKL_MPI: None, set to ` intelmpi` by default
CMake Warning at /home/groups/aditis2/lauraman/miniconda3/envs/mima-torch/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/home/groups/aditis2/lauraman/miniconda3/envs/mima-torch/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
CMakeLists.txt:35 (find_package)
-- Found Torch: /home/groups/aditis2/lauraman/miniconda3/envs/mima-torch/lib/python3.9/site-packages/torch/lib/libtorch.so
-- Configuring done
-- Generating done
-- Build files have been written to: /scratch/users/lauraman/MiMA_pytorch/new-fortran-pytorch-lib/fortran-pytorch-lib/fortran-pytorch-lib/build_conda
$ make
Scanning dependencies of target ftorch
[ 33%] Building Fortran object CMakeFiles/ftorch.dir/ftorch.f90.o
[ 66%] Building CXX object CMakeFiles/ftorch.dir/ctorch.cpp.o
[100%] Linking CXX shared library libftorch.so
[100%] Built target ftorch
$ make install
Consolidate compiler generated dependencies of target ftorch
[100%] Built target ftorch
Install the project...
-- Install configuration: "Release"
-- Installing: /home/groups/aditis2/lauraman/test-fortran-pytorch-lib/lib64/libftorch.so
-- Set runtime path of "/home/groups/aditis2/lauraman/test-fortran-pytorch-lib/lib64/libftorch.so" to "$ORIGIN/../lib64:/home/groups/s-ees/share/cees/spack_cees/spack/opt/spack/linux-centos7-zen2/intel-2021.4.0/intel-oneapi-mkl-2021.4.0-k4r3on5jujinjf5tjqs6u3jguuecptj4/mkl/2021.4.0/lib:/home/groups/aditis2/lauraman/miniconda3/envs/mima-torch/lib/python3.9/site-packages/torch/lib:/home/groups/s-ees/share/cees/spack_cees/spack/opt/spack/linux-centos7-zen2/intel-2021.4.0/intel-oneapi-mkl-2021.4.0-k4r3on5jujinjf5tjqs6u3jguuecptj4/compiler/latest/linux/compiler/lib/intel64_lin"
-- Installing: /home/groups/aditis2/lauraman/test-fortran-pytorch-lib/include/ctorch.h
-- Installing: /home/groups/aditis2/lauraman/test-fortran-pytorch-lib/lib64/cmake/FTorchConfig.cmake
-- Installing: /home/groups/aditis2/lauraman/test-fortran-pytorch-lib/lib64/cmake/FTorchConfig-release.cmake
-- Installing: /home/groups/aditis2/lauraman/test-fortran-pytorch-lib/include/ftorch/ftorch.mod
All looks like it compiles okay. I then go into the ResNet example and follow the steps to build this. (Also, I noticed a subtle typo in the ResNet build instructions there: -DFTorchDIR should be -DFTorch_DIR.)
$ cmake .. -DFTorch_DIR=$GROUP_HOME/lauraman/test-fortran-pytorch-lib/lib64/cmake/ -DCMAKE_BUILD_TYPE=Release
-- Building with Fortran PyTorch coupling
-- Configuring done
-- Generating done
-- Build files have been written to: /scratch/users/lauraman/MiMA_pytorch/new-fortran-pytorch-lib/fortran-pytorch-lib/examples/1_ResNet18/build
$ make
Scanning dependencies of target resnet_infer_fortran
[ 50%] Building Fortran object CMakeFiles/resnet_infer_fortran.dir/resnet_infer_fortran.f90.o
[100%] Linking Fortran executable resnet_infer_fortran
[100%] Built target resnet_infer_fortran
No errors there but when I run:
$ ./resnet_infer_fortran ../saved_resnet18_model_cpu.pt
Intel MKL ERROR: Parameter 13 was incorrect on entry to SGEMM .
-6.3447808E-03
Expected behaviour (?):
I tested this but instead of using the path to torch in the conda environment, I downloaded libtorch from source and compiled the library using that. Following the same steps I got the following result; could you check if this is correct please?
$ ./resnet_infer_fortran ../saved_resnet18_model_cpu.pt
0.1825228
In the ResNet example we use pretrained=True, but this is deprecated and should now be weights='IMAGENET1K_V1'.
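The corresponding change in the example's Python code would look something like this:

```python
from torchvision.models import resnet18

# Deprecated:
# model = resnet18(pretrained=True)

# Current equivalent:
model = resnet18(weights="IMAGENET1K_V1")
```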
Integration of FTorch with a distributed CPU-based solver can lead to a scenario where there are N MPI ranks (--ntasks-per-node) and M GPUs (torch::cuda::device_count()) per node, with M <= N. The current implementation of FTorch appears to leverage only GPU:0 for all N MPI ranks. Providing the user with the ability to decide which GPU to leverage would ensure that all available GPUs are used.
An initial discussion regarding this potential feature: #84.
Furthermore, there might still be multiple MPI ranks per GPU even after uniformly distributing the MPI ranks among the available GPUs. The GPU probably runs these ML model copies serially. CUDA MPS could be utilised to run the ML model copies concurrently. An alternative might be to gather to a single task, deploy the ML model from that task, and finally scatter back to the respective tasks, all inside the Fortran code. A sketch of the rank-to-GPU mapping is given below.
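A minimal sketch of distributing ranks among devices (round-robin; rank would come from the MPI layer, and this is illustrative rather than a proposed FTorch API):

```python
import torch


def device_for_rank(rank: int) -> torch.device:
    """Map an MPI rank to a GPU in round-robin fashion (illustrative)."""
    n_gpus = torch.cuda.device_count()
    if n_gpus == 0:
        return torch.device("cpu")
    return torch.device(f"cuda:{rank % n_gpus}")
```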
Removed from being part of #20: we should add some information for advanced users about manipulating data beyond simple transposition using the 'stride' options, as was done by @jatkinson1000 and @SimonClifford in the MiMA-ML project.
## Advanced use
Those experienced with C will perhaps have noticed that there are further freedoms available beyond those presented above.
Always stride, or is there a transpose tradeoff?
Currently, data is (probably) explicitly moved to the target device, rather than created directly on the desired device, as is preferable.
Creating the tensor directly on the device would be closer to the previous form of the code (see changes for GPUs), although that exact implementation did not seem to work.
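In PyTorch terms the distinction is as follows (a Python illustration of the general point, not the ctorch.cpp code):

```python
import torch

# Move after creation: allocates on the host, then copies to the device.
x = torch.ones(1024, 1024).to("cuda")

# Create directly on the device: avoids the host allocation and copy.
y = torch.ones(1024, 1024, device="cuda")
```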
We should add some basic linting and QC on the repository to help guide PRs to meet some basic standards in advance.
For now we should do this for python.
Apply:
Consider:
Possible memory leak, need to investigate further.
When following the README of the ResNet18 example I found some typos in the instructions.
The following code is commented out and should be removed. Similarly for the torch_from_blob_f header, which is unused.
Lines 112 to 138 in 2653bf8