prabhuomkar / pytorch-cpp

C++ Implementation of PyTorch Tutorials for Everyone

License: MIT License

CMake 14.63% C++ 67.76% Python 1.06% Jupyter Notebook 15.49% Dockerfile 0.74% Shell 0.32%
cplusplus artificial-intelligence machine-learning pytorch torch tensors neural-network autograd libtorch recurrent-neural-network

pytorch-cpp's Introduction

OS (Compiler)\LibTorch 2.1.1
macOS (clang 11, 12, 13) Status
Linux (gcc 9, 10, 11) Status
Windows (msvc 2019, 2022) Status

Table of Contents

This repository provides tutorial code in C++ for deep learning researchers to learn PyTorch (i.e. Sections 1 to 3).
Python Tutorial: https://github.com/yunjey/pytorch-tutorial

1. Basics

2. Intermediate

3. Advanced

4. Interactive Tutorials

5. Other Popular Tutorials

Getting Started

Requirements

  1. C++17-compatible compiler
  2. CMake (minimum version 3.14)
  3. LibTorch version >= 1.12.0 and <= 2.1.1
  4. Conda
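
As a small illustration of the LibTorch version requirement above, the inclusive range can be checked by comparing version triples. This is only a sketch: the function name `is_supported_libtorch` and its three integer parameters are hypothetical and not part of the repository.

```cpp
#include <tuple>

// Hypothetical helper (not part of pytorch-cpp): returns true if the given
// LibTorch version lies in the inclusive range [1.12.0, 2.1.1] listed in the
// requirements. Tuples compare lexicographically: major, then minor, then patch.
inline bool is_supported_libtorch(int major_v, int minor_v, int patch_v) {
    const auto version = std::make_tuple(major_v, minor_v, patch_v);
    return version >= std::make_tuple(1, 12, 0) &&
           version <= std::make_tuple(2, 1, 1);
}
```

For example, `is_supported_libtorch(2, 0, 0)` is true, while `is_supported_libtorch(1, 11, 9)` and `is_supported_libtorch(2, 1, 2)` are false.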

For Interactive Tutorials

Note: Interactive Tutorials currently run on the LibTorch nightly version, so some tutorials may break.

conda create --name pytorch-cpp
conda activate pytorch-cpp
conda install xeus-cling notebook -c conda-forge

Clone, build and run tutorials

In Google Colab

Open In Colab

On Local Machine

git clone https://github.com/prabhuomkar/pytorch-cpp.git
cd pytorch-cpp

Generate build system

cmake -B build #<options>

Note for Windows users:
Libtorch only supports 64bit Windows and an x64 generator needs to be specified. For Visual Studio this can be done by appending -A x64 to the above command.

Some useful options:

  • -D CUDA_V=(11.8|12.1|none) (default: none): Download LibTorch for a CUDA version (none = download CPU version).
  • -D LIBTORCH_DOWNLOAD_BUILD_TYPE=(Release|Debug) (default: Release): Determines which libtorch build type version to download (only relevant on Windows).
  • -D DOWNLOAD_DATASETS=(OFF|ON) (default: ON): Download required datasets during build (only if they do not already exist in pytorch-cpp/data).
  • -D CREATE_SCRIPTMODULES=(OFF|ON) (default: OFF): Create all required scriptmodule files for prelearned models / weights during build. Requires installed python3 with pytorch and torchvision.
  • -D CMAKE_PREFIX_PATH=path/to/libtorch/share/cmake/Torch (default: <empty>): Skip the downloading of LibTorch and use your own local version (see Requirements) instead.
  • -D CMAKE_BUILD_TYPE=(Release|Debug|...) (default: <empty>): Determines the CMake build-type for single-configuration generators (see CMake docs).

Example Linux
Aim
  • Use existing Python, PyTorch (see Requirements) and torchvision installation.
  • Download all datasets and create all necessary scriptmodule files.
Command
cmake -B build \
-D CMAKE_BUILD_TYPE=Release \
-D CMAKE_PREFIX_PATH=/path/to/libtorch/share/cmake/Torch \
-D CREATE_SCRIPTMODULES=ON 
Example Windows
Aim
  • Automatically download LibTorch for CUDA 11.8 (Release version) and all necessary datasets.
  • Do not create scriptmodule files.
Command
cmake -B build -A x64 -D CUDA_V=11.8

Build

Note for Windows (Visual Studio) users:
The CMake script downloads the Release version of LibTorch, so --config Release has to be appended to the build command.

How dataset download and scriptmodule creation work:

  • If DOWNLOAD_DATASETS is ON, the datasets required by the tutorials you choose to build will be downloaded to pytorch-cpp/data (if they do not already exist there).
  • If CREATE_SCRIPTMODULES is ON, the scriptmodule files for the prelearned models / weights required by the tutorials you choose to build will be created in the model folder of the respective tutorial's source folder (if they do not already exist).

All tutorials

To build all tutorials use

cmake --build build

All tutorials in a category

You can choose to only build tutorials in one of the categories basics, intermediate, advanced or popular. For example, if you are only interested in the basics tutorials:

cmake --build build --target basics
# In general: cmake --build build --target {category}

Single tutorial

You can also choose to only build a single tutorial. For example to build the language model tutorial only:

cmake --build build --target language-model
# In general: cmake --build build --target {tutorial-name}

Note:
The target argument is the tutorial's foldername with all underscores replaced by hyphens.

Tip for users of CMake version >= 3.15:
You can specify several targets separated by spaces, for example:

cmake --build build --target language-model image-captioning
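
The naming rule from the note above (folder name with underscores replaced by hyphens) and the multi-target form from the tip can be sketched in a few lines of C++. The helper names `target_name` and `build_command` are hypothetical, and the folder names `language_model` and `image_captioning` are inferred from the target names in the examples rather than taken from the repository layout.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: maps a tutorial folder name to its CMake target name
// by replacing every '_' with '-'.
inline std::string target_name(std::string folder_name) {
    std::replace(folder_name.begin(), folder_name.end(), '_', '-');
    return folder_name;
}

// Hypothetical helper: assembles the build command for the given tutorial
// folders. Listing several targets after one --target needs CMake >= 3.15.
// With no folders, the command builds all tutorials.
inline std::string build_command(const std::vector<std::string>& folder_names) {
    std::string command = "cmake --build build";
    if (!folder_names.empty()) {
        command += " --target";
        for (const auto& folder : folder_names) {
            command += " " + target_name(folder);
        }
    }
    return command;
}
```

Calling `build_command({"language_model", "image_captioning"})` produces the same command line as the example in the tip.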

Run Tutorials

  1. (IMPORTANT!) First change into the tutorial's directory within build/tutorials. For example, assuming you are in the pytorch-cpp directory and want to change to the pytorch basics tutorial folder:
    cd build/tutorials/basics/pytorch_basics
    # In general: cd build/tutorials/{basics|intermediate|advanced|popular/blitz}/{tutorial_name}
  2. Run the executable. Note that the executable's name is the tutorial's foldername with all underscores replaced with hyphens (e.g. for tutorial folder: pytorch_basics -> executable name: pytorch-basics (or pytorch-basics.exe on Windows)). For example, to run the pytorch basics tutorial:

    Linux/Mac
    ./pytorch-basics
    # In general: ./{tutorial-name}
    Windows
    .\pytorch-basics.exe
    # In general: .\{tutorial-name}.exe
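
The two run steps can be summarized as a sketch that derives the directory to change into and the executable name from a tutorial's category and folder name. The helpers `run_directory` and `executable_name` are hypothetical names introduced for illustration; paths are relative to the pytorch-cpp directory.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: directory to change into before running a tutorial.
// `category` is e.g. "basics", "intermediate", "advanced" or "popular/blitz".
inline fs::path run_directory(const std::string& category,
                              const std::string& folder_name) {
    return fs::path("build") / "tutorials" / category / folder_name;
}

// Hypothetical helper: executable name for a tutorial folder, i.e. the folder
// name with '_' replaced by '-', plus ".exe" on Windows.
inline std::string executable_name(std::string folder_name, bool on_windows) {
    std::replace(folder_name.begin(), folder_name.end(), '_', '-');
    return on_windows ? folder_name + ".exe" : folder_name;
}
```

For the pytorch basics tutorial, `run_directory("basics", "pytorch_basics")` yields `build/tutorials/basics/pytorch_basics`, and `executable_name("pytorch_basics", false)` yields `pytorch-basics`.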

Using Docker

Find the latest and previous version images on Docker Hub.

You can build and run the tutorials (on CPU) in a Docker container using the provided Dockerfile and docker-compose.yml files:

  1. From the root directory of the cloned repo build the image:
    docker-compose build --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g)

    Note:
    When you run the Docker container, the host repo directory is mounted as a volume in the Docker container in order to cache build and downloaded dependency files so that it is not necessary to rebuild or redownload everything when a container is restarted. In order to have correct file permissions it is necessary to provide your user and group ids as build arguments when building the image on Linux.

  2. Now start the container and build the tutorials using:
    docker-compose run --rm pytorch-cpp
    This fetches all necessary dependencies and builds all tutorials. After the build is done, by default the container starts bash in interactive mode in the build/tutorials folder.
    As with the local build, you can choose to only build tutorials of a category (basics, intermediate, advanced, popular):
    docker-compose run --rm pytorch-cpp {category}
    In this case the container is started in the chosen category's base build directory.
    Alternatively, you can also directly run a tutorial by instead invoking the run command with a tutorial name as additional argument, for example:
    docker-compose run --rm pytorch-cpp pytorch-basics
    # In general: docker-compose run --rm pytorch-cpp {tutorial-name} 
    This will - if necessary - build the pytorch-basics tutorial and then start the executable in a container.

License

This repository is licensed under MIT as given in LICENSE.

pytorch-cpp's People

Contributors

amirtronics, arrufat, mfl28, mithil467, prabhuomkar

pytorch-cpp's Issues

[feature] Upgrade to libtorch version 1.8.0.

Is your feature request related to a problem? Please describe.

  • Upgrade the CMake scripts to download the latest libtorch version 1.8.0 and update the supported CUDA versions.
  • Update the CUDA version used in the colab notebook to version 11.1.
  • Update the Readme.
  • Update the PyTorch and Torchvision versions in Dockerfile.

Human detection tutorial please.

I'm always frustrated when Google returns only Python and TensorFlow human detection implementations 🤦‍♂️

It would be great if this LibTorch tutorial collection could show a human detection implementation with high accuracy.

Alternatives you've considered:
OpenPose (very slow, low FPS even on an RTX 3090).
I would like to use OpenCL for AMD and ARM Mali GPUs.

Thanks.

[feature] Standardizing C++ Code

Is your feature request related to a problem? Please describe.
The C++ code currently varies somewhat between the Beginner and Intermediate tutorials. We should use the language to its full potential for best performance.

Describe the solution you'd like
Following things need to be standardized:

  • Libtorch values should be initialized/assigned with the appropriate data type
  • Using "\n" over std::endl

Additional context
Code should look consistent throughout the whole set of tutorials.

For now, let's keep this CL in one PR.

[bug] Build Problem on Ubuntu

* https://discuss.pytorch.org/t/libtorch-for-raspberry-pi/63107
  This actually solved the build problem.

But now there are new Problems showing up:

cmake --build build
Scanning dependencies of target pytorch-cpp
[ 2%] Building CXX object CMakeFiles/pytorch-cpp.dir/main.cpp.o
[ 4%] Linking CXX executable pytorch-cpp
[ 4%] Built target pytorch-cpp
Scanning dependencies of target feedforward-neural-network
[ 7%] Building CXX object tutorials/basics/feedforward_neural_network/CMakeFiles/feedforward-neural-network.dir/src/main.cpp.o
/home/pi/Desktop/PytorchC++TestInternet/pytorch-cpp-master/tutorials/basics/feedforward_neural_network/src/main.cpp: In function ‘int main()’:
/home/pi/Desktop/PytorchC++TestInternet/pytorch-cpp-master/tutorials/basics/feedforward_neural_network/src/main.cpp:71:48: error: ‘cross_entropy’ is not a member of ‘torch::nn::functional’
auto loss = torch::nn::functional::cross_entropy(output, target);
^~~~~~~~~~~~~
/home/pi/Desktop/PytorchC++TestInternet/pytorch-cpp-master/tutorials/basics/feedforward_neural_network/src/main.cpp:74:39: error: expected primary-expression before ‘double’
running_loss += loss.item() * data.size(0);
^~~~~~
/home/pi/Desktop/PytorchC++TestInternet/pytorch-cpp-master/tutorials/basics/feedforward_neural_network/src/main.cpp:111:44: error: ‘cross_entropy’ is not a member of ‘torch::nn::functional’
auto loss = torch::nn::functional::cross_entropy(output, target);
^~~~~~~~~~~~~
/home/pi/Desktop/PytorchC++TestInternet/pytorch-cpp-master/tutorials/basics/feedforward_neural_network/src/main.cpp:113:35: error: expected primary-expression before ‘double’
running_loss += loss.item() * data.size(0);
^~~~~~
make[2]: *** [tutorials/basics/feedforward_neural_network/CMakeFiles/feedforward-neural-network.dir/build.make:63: tutorials/basics/feedforward_neural_network/CMakeFiles/feedforward-neural-network.dir/src/main.cpp.o] Fehler 1
make[1]: *** [CMakeFiles/Makefile2:354: tutorials/basics/feedforward_neural_network/CMakeFiles/feedforward-neural-network.dir/all] Fehler 2
make: *** [Makefile:84: all] Fehler 2

Is this a problem with Libtorch 1.3 as opposed to the latest version?
Do you know?

I have a similar issue while trying to build convolutional_neural_network from the repo.
cmake --build . --config Release
Scanning dependencies of target freespace_torch
[ 33%] Building CXX object CMakeFiles/freespace_torch.dir/src/convnet.cpp.o
In file included from /home/fugurcal/freespace_torch/src/convnet.cpp:2:0:
/home/fugurcal/freespace_torch/include/convnet.h:13:20: error: ‘BatchNorm2d’ is not a member of ‘torch::nn’
torch::nn::BatchNorm2d(16),
^~~~~~~~~~~
/home/fugurcal/freespace_torch/include/convnet.h:13:20: note: suggested alternative: ‘BatchNorm’
torch::nn::BatchNorm2d(16),
^~~~~~~~~~~
BatchNorm
/home/fugurcal/freespace_torch/include/convnet.h:14:20: error: ‘ReLU’ is not a member of ‘torch::nn’
torch::nn::ReLU(),
^~~~
/home/fugurcal/freespace_torch/include/convnet.h:15:20: error: ‘MaxPool2d’ is not a member of ‘torch::nn’
torch::nn::MaxPool2d(torch::nn::MaxPool2dOptions(2).stride(2))
^~~~~~~~~
/home/fugurcal/freespace_torch/include/convnet.h:15:41: error: ‘MaxPool2dOptions’ is not a member of ‘torch::nn’
torch::nn::MaxPool2d(torch::nn::MaxPool2dOptions(2).stride(2))
^~~~~~~~~~~~~~~~
/home/fugurcal/freespace_torch/include/convnet.h:15:41: note: suggested alternative: ‘Conv2dOptions’
torch::nn::MaxPool2d(torch::nn::MaxPool2dOptions(2).stride(2))
^~~~~~~~~~~~~~~~
Conv2dOptions
/home/fugurcal/freespace_torch/include/convnet.h:16:5: error: could not convert ‘{torch::nn::Conv2d((* &(& torch::nn::ConvOptions<2>(1, 16, torch::ExpandingArray<2, long int>(5)).torch::nn::ConvOptions<2>::stride(torch::ExpandingArray<2, long int>(1)))->torch::nn::ConvOptions<2>::padding(torch::ExpandingArray<2, long int>(2)))), <expression error>, <expression error>, <expression error>}’ from ‘<brace-enclosed initializer list>’ to ‘torch::nn::Sequential’
};
^
/home/fugurcal/freespace_torch/include/convnet.h:20:20: error: ‘BatchNorm2d’ is not a member of ‘torch::nn’
torch::nn::BatchNorm2d(32),
^~~~~~~~~~~
/home/fugurcal/freespace_torch/include/convnet.h:20:20: note: suggested alternative: ‘BatchNorm’
torch::nn::BatchNorm2d(32),
^~~~~~~~~~~
BatchNorm
/home/fugurcal/freespace_torch/include/convnet.h:21:20: error: ‘ReLU’ is not a member of ‘torch::nn’
torch::nn::ReLU(),
^~~~
/home/fugurcal/freespace_torch/include/convnet.h:22:20: error: ‘MaxPool2d’ is not a member of ‘torch::nn’
torch::nn::MaxPool2d(torch::nn::MaxPool2dOptions(2).stride(2))
^~~~~~~~~
/home/fugurcal/freespace_torch/include/convnet.h:22:41: error: ‘MaxPool2dOptions’ is not a member of ‘torch::nn’
torch::nn::MaxPool2d(torch::nn::MaxPool2dOptions(2).stride(2))
^~~~~~~~~~~~~~~~
/home/fugurcal/freespace_torch/include/convnet.h:22:41: note: suggested alternative: ‘Conv2dOptions’
torch::nn::MaxPool2d(torch::nn::MaxPool2dOptions(2).stride(2))
^~~~~~~~~~~~~~~~
Conv2dOptions
/home/fugurcal/freespace_torch/include/convnet.h:23:5: error: could not convert ‘{torch::nn::Conv2d((* &(& torch::nn::ConvOptions<2>(16, 32, torch::ExpandingArray<2, long int>(5)).torch::nn::ConvOptions<2>::stride(torch::ExpandingArray<2, long int>(1)))->torch::nn::ConvOptions<2>::padding(torch::ExpandingArray<2, long int>(2)))), <expression error>, <expression error>, <expression error>}’ from ‘<brace-enclosed initializer list>’ to ‘torch::nn::Sequential’
};
^
CMakeFiles/freespace_torch.dir/build.make:62: recipe for target 'CMakeFiles/freespace_torch.dir/src/convnet.cpp.o' failed
make[2]: *** [CMakeFiles/freespace_torch.dir/src/convnet.cpp.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/freespace_torch.dir/all' failed
make[1]: *** [CMakeFiles/freespace_torch.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
I am using torch 1.5. Any idea how to solve this problem?

Originally posted by @FarukUgurcali in #37 (comment)

Expression: vector subscript out of range

In the deep residual network tutorial, the function std::pair<torch::Tensor, torch::Tensor> read_data(const std::string& root, bool train) in cifar.cpp works well on Linux, but on Windows 10 I got the following error.

error

[feature] Add support for juce::image to torch::tensor and back

Is your feature request related to a problem? Please describe.
Hello,
I use Libtorch to process images. OpenCV is very large, so I am trying to avoid it. I am trying to use the JUCE C++ framework instead.

Describe the solution you'd like
I am not able to convert a torch::Tensor to a juce::image. I want to read a video file, split it into images on the fly, apply a deep learning algorithm to each image in Libtorch, and then turn the result back into an image that JUCE can understand. Does anyone have an example of something like this?

Describe alternatives you've considered
Here is someone converting OpenCV's cv::Mat to a juce::image:
https://forum.juce.com/t/opencvs-cv-mat-to-juces-image/33518

Additional context
I am trying to integrate the code with a JUCE C++ GUI.

See also:
https://discuss.pytorch.org/t/avoid-use-opencv-for-image-convert-libtorch-torch-tensor-to-juce-image-and-like-back/94792

[feature] Refactoring code of Intermediate Tutorials

Is your feature request related to a problem? Please describe.
A lot of common code currently exists, which creates too much redundancy across the training, iteration, testing, data-loading and saving sections.

Describe the solution you'd like
Move common pieces of code to utils that are specific to the intermediate tutorials.

Additional context
Following modules can be extracted to utils:

  • Trainer/Tester
  • Data Loaders

For now, let's keep this CL in one PR.

[feature] Upgrade C++ version to 17

Is your feature request related to a problem? Please describe.
There is (afaik) no pre-c++17 way to do file operations such as iterating over files in a folder in a way that is portable across operating systems without additional external dependencies such as boost.Filesystem. This makes it hard to implement e.g. custom datasets such as an ImageFolderDataset.

Describe the solution you'd like
The <filesystem> header - officially part of C++ as of version 17 - makes it possible to easily handle filesystem related operations in an OS-independent way. Compiling for c++17 requires a change in the CMake configurations. C++17 is supported by most compiler versions of the last years - that includes all compilers currently used in our CI-workflows.

Compile failure

Hello there,

I'm having this problem:
[ 21%] Building CXX object tutorials/intermediate/convolutional_neural_network/CMakeFiles/convolutional-neural-network.dir/src/main.cpp.o
/root/ml/pytorch-cpp/tutorials/intermediate/convolutional_neural_network/src/main.cpp: In function 'int main()':
/root/ml/pytorch-cpp/tutorials/intermediate/convolutional_neural_network/src/main.cpp:106:12: error: 'InferenceMode' is not a member of 'torch'
torch::InferenceMode no_grad;
^~~~~~~~~~~~~

What could be the cause?

Thank you!

[feature] Switch to Github Actions CI.

Is your feature request related to a problem? Please describe.

  • I'd suggest we switch from Travis CI to Github Actions CI as Travis seems to no longer allow unlimited free builds for open-source projects (see here).

Describe the solution you'd like

  • Remove the Travis CI setup and add equivalent Github Actions CI setup.

[bug] Build Problem on Raspberry Pi

Problem

  • I downloaded the files on a raspberrypi
  • unpacked the files
  • run: cmake -B build
  • run: cmake --build build
  • Console Text:

Scanning dependencies of target pytorch-cpp
[ 2%] Building CXX object CMakeFiles/pytorch-cpp.dir/main.cpp.o
[ 4%] Linking CXX executable pytorch-cpp
/home/pi/Desktop/LibTorch/libtorch/lib/libtorch.so: file not recognized: file format not recognized
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/pytorch-cpp.dir/build.make:87: pytorch-cpp] Fehler 1
make[1]: *** [CMakeFiles/Makefile2:327: CMakeFiles/pytorch-cpp.dir/all] Fehler 2
make: *** [Makefile:84: all] Fehler 2

  • (This is with a manually downloaded libtorch, but it behaves the same when I let CMake download libtorch.)
    The problem is that libtorch.so has an unrecognized file format.

Help me, please.
I think this is a problem specific to my system, but I have no idea what the issue is.

Desktop:

  • OS: Raspbian
  • Libtorch Version [1.4]

The time consuming bug of '.to(at::kCPU)' [bug]

This is a great project!
Recently, I found that calling '.to(at::kCPU)' on a CUDA tensor takes a very long time when I run inference with my model on the GPU. Like this:

  vector<torch::jit::IValue> inputs = {input_data.input_ids_pt_,
                                       input_data.attention_mask_pt_,
                                       input_data.token_type_ids_pt_};                              
  auto pred_res = (static_cast<torch::jit::script::Module*>(bert_model_))->forward(inputs);      // About 12 milliseconds
  auto logits = pred_res.toTensor().to(at::kCPU);        // About 80 milliseconds

So what should I do to reduce this time?
Looking forward to your reply, thanks!

A Question

When I train the model on the CPU, Loss.backward() gets slower and slower as the number of iterations increases, although the memory usage shown in VS2017 stays stable. Why?

  for (int i = 0; i < num_epochs; i++) {
      int batch_index = 0;

      for (auto &batch : *dataloader) {
          auto data = batch.data.to(device);
          auto target = batch.target.to(device);

          auto output = ae->forward(target);

          Tensor LossToTal;
          // Replacing this with torch::nn::functional::mse_loss() behaves the same
          s->caclSSIM(data, output, LossToTal);

          auto Loss = 1.0f - LossToTal.mean();

          sts = clock();
          optimizer.zero_grad();
          Loss.backward();
          optimizer.step();
          endt = clock();
          v1 = endt - sts;
          if ((batch_index + 1) % 2 == 0) {
              std::cout << "Epoch [" << i << "/" << num_epochs << "], Step [" << batch_index + 1 << "/"
                  << num_samples / batch_size << "], MeanLoss " << Loss.item<double>() <<
                  "CostTime:" << v1 << std::endl;
          }

          batch_index++;
      }
  }

Update CI GitHub Actions runner settings

  • Some of the current CI runner/OS-settings are no longer available or are deprecated (e.g. windows-2016, macos-10.15) and must be updated.
  • Some of the currently used Github actions are deprecated (checkout and setup-cmake) and must be updated to their latest versions.

[bug] Issue loading MNIST dataset

Describe the bug
There are two parts to this bug.
First, the build fails because CMake reports that the MD5 sums of the four files from the referenced classic source differ from those in the CMakeLists file.

Bypassing that issue, as I reach the part of the example where the dataset is loaded, a std::bad_alloc exception is thrown.

To Reproduce
Steps to reproduce the behavior:

  • Load the project directory in VisualStudio 2019
  • Open the CMakeLists configuration
  • Select the torch basis example project from the build targets
  • (build and) run the project

Expected behavior
The project would successfully pass through the input pipeline section of the example

Screenshots
image

Desktop (please complete the following information):

  • OS: Windows 10
  • Libtorch Version "1.9.0+cu111"

Additional context
I get the same issue using the auto-downloaded version of libtorch

Integration of torch::tensor with Siv3D Image / texture

Is your feature request related to a problem? Please describe.
I am the author of Siv3DTorch (https://github.com/QuantScientist/Siv3DTorch) which integrates OpenSiv3d with Libtorch C++.
At the moment Siv3D does not support CMake and therefore all integration efforts are on VC 19.
One main burning issue that I have is reading and writing Images/video frames from and to Siv3D without using OpenCV.

Describe the solution you'd like
I would love to have an image conversion method between the two frameworks, either using stb_image or libpng (used by Siv3D).

Additional context
The whole scenario is described here:
Siv3D/OpenSiv3D#534

The source code is here:
https://github.com/QuantScientist/Siv3DTorch/blob/master/src/loadmodel003.cpp

Many thanks for your help,

[feature] Dockerfile to support CUDA-version pytorch-cpp

Thanks for the nice code! Here is a Dockerfile to support CUDA-version pytorch-cpp. Hope it helps when you want to run the code with GPUs.

FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
LABEL maintainer="[email protected]"

# Fix the apt-get error from nvidia-docker
RUN rm /etc/apt/sources.list.d/cuda.list \
    && rm /etc/apt/sources.list.d/nvidia-ml.list \
    && apt-key del 7fa2af80 \
    && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub \
    && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub

# Install basics
RUN apt-get update -y \
    && apt-get install -y apt-utils git curl ca-certificates tree htop wget libssl-dev unzip \
    && rm -rf /var/lib/apt/lists/*
# Install g++-8 gcc-8
RUN apt-get update && apt-get install -y gcc-8 g++-8 \
  && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 60 --slave /usr/bin/g++ g++ /usr/bin/g++-8 \
  && update-alternatives --config gcc \
  && rm -rf /var/lib/apt/lists/*
# Install cmake
RUN apt-get purge -y cmake \
    && mkdir /root/temp \
    && cd /root/temp \
    && wget https://github.com/Kitware/CMake/releases/download/v3.23.4/cmake-3.23.4.tar.gz \
    && tar -xzvf cmake-3.23.4.tar.gz \
    && cd cmake-3.23.4 \
    && bash ./bootstrap \
    && make \
    && make install \
    && cmake --version \
    && rm -rf /root/temp \
    && rm -rf /var/lib/apt/lists/*
# Install libtorch
RUN cd /root/ \
    && wget https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.12.1%2Bcu102.zip -O libtorch.zip \
    && unzip libtorch.zip
# Install pytorch-cpp
RUN cd /root \
    && wget https://github.com/prabhuomkar/pytorch-cpp/archive/refs/tags/v1.12.tar.gz \
    && tar -xzvf v1.12.tar.gz
RUN cd /root/pytorch-cpp-1.12 \
    && cmake -B build \
    -D CMAKE_BUILD_TYPE=Release \
    -D CMAKE_PREFIX_PATH=/root/libtorch/share/cmake/Torch \
    -D CREATE_SCRIPTMODULES=ON \
    && cmake --build build
WORKDIR /root

[feature] How to build a single tutorial?

@mfl28 can we add some instructions on how to build a single tutorial and if necessary its datasets only?
I believe many users want to try "X" tutorial from all and if there is an easy to follow guide that would be great :)

PS: Added this as an issue for visibility and referencing.

How to use ModuleList?

I found that no code in this project uses ModuleList. I recently ran into some problems using ModuleList. If you could give some examples, I would be very grateful.

[bug] Improve README for running the codebase

Describe the bug
Users facing issues in running the code #26

Expected behavior
Users should be able to follow README and face no issues in running the code.
So, add a more detailed guide to the README covering all variations for running the code.

Additional context
Linking to OS-specific installation instructions for CMake and Libtorch will also be helpful.

Download progress bar for CIFAR-10 [feature]

Hi @prabhuomkar @mfl28
While generating the build system, the process appeared to be stuck at the step "Fetching CIFAR10 dataset...".

Current status of downloading

I actually restarted the process midway, thinking that it had frozen. The download took about 30 minutes on my computer, since the CIFAR-10 binary is a large file (about 162 MB).

A progress bar to indicate the download progress would really be helpful.

There is an existing CMake flag, SHOW_PROGRESS, to achieve this.

In PR - #41
I was able to achieve these results:

ProgressBar1
ProgressBar2

[bug] MNIST dataset download from classic source rarely works.

Describe the bug
The classical MNIST source (http://yann.lecun.com/exdb/mnist/) seems to be very unreliable lately. We should switch to an alternative mirror as used in pytorch MNIST dataset (https://ossci-datasets.s3.amazonaws.com/mnist).

This has also been reported in #83 .

To Reproduce
Steps to reproduce the behavior:

  1. Use the CMake script to download the MNIST dataset from the classic source.
  2. Most of the time, the download is not completed successfully.

Expected behavior
MNIST dataset is downloaded correctly.

Have dataloader for ImageNet

Is your feature request related to a problem? Please describe.
I want to load ImageNet, not MNIST.

Describe the solution you'd like
To have a dataloader for ImageNet that works with training one of the models.

Thanks.

Error in loading MNIST dataset, convolutional_neural_network

Hi,
I am unable to read the images in the MNIST dataset.
The error I receive is listed below. Can you please help me with it?
Thanks

: ~/Projects/cnn/build$ ./cnn

Convolutional Neural Network

CUDA available. Training on GPU.
terminate called after throwing an instance of 'c10::Error'
what(): Error opening images file at ../../../../data/mnist/train-images-idx3-ubyte
Exception raised from read_images at ../torch/csrc/api/src/data/datasets/mnist.cpp:67 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f9f8f93e0db in /home/paras/Libraries/libtorch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7f9f8f939d2e in /home/paras/Libraries/libtorch/lib/libc10.so)
frame #2: <unknown function> + 0x43847a2 (0x7f9f0b77e7a2 in /home/paras/Libraries/libtorch/lib/libtorch_cpu.so)
frame #3: torch::data::datasets::MNIST::MNIST(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, torch::data::datasets::MNIST::Mode) + 0x46 (0x7f9f0b77f846 in /home/paras/Libraries/libtorch/lib/libtorch_cpu.so)
frame #4: main + 0x121 (0x55670d49790c in ./cnn)
frame #5: __libc_start_main + 0xf3 (0x7f9ec30140b3 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: _start + 0x2e (0x55670d49752e in ./cnn)

Aborted (core dumped)
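
One possible reading of this error: the path in the message is relative, so it is resolved against the directory the executable is started from. The sketch below shows, purely lexically and without touching the disk, where `../../../../data/mnist` would point from two different starting directories. The helper name `resolve_from` is hypothetical, and the home directory `/home/paras` is an assumption taken from the library paths in the trace.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: joins `relative` onto `working_dir` and collapses
// "name/.." pairs lexically (no disk access, symlinks are not followed).
inline std::string resolve_from(const fs::path& working_dir, const fs::path& relative) {
    return (working_dir / relative).lexically_normal().generic_string();
}
```

Under these assumptions, `resolve_from("/home/paras/Projects/cnn/build", "../../../../data/mnist")` gives `/home/data/mnist`, whereas starting from `build/tutorials/intermediate/convolutional_neural_network` inside the repository gives `data/mnist`, the folder the README names as the dataset download location.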

A question

I use torch.jit.trace() to trace and save my own CNN model. My network returns a tuple containing 3 tensors.
In the C++ frontend, I call auto output = module.forward(inputs); to get the result. How can I get the 3 tensors individually and then apply C++ operations to them?

Thank you!

[feature]requesting for pretrained weights

Is your feature request related to a problem? Please describe.

I would like to train a CNN classifier on my custom data using a widely used model such as one of the ResNet series. I found it useful to initialize the model weights with ImageNet-pretrained weights. This is easy to implement with the torch::load API when my dataset's images have 3 channels, the same as ImageNet, because in that case the Conv1 layer needs no change.
The situation is different when I try to train on grayscale images, since the Conv1 weights expect in_channels=3. In the Python frontend, I guess this can be solved by immediately replacing model.conv1 like this:

model.conv1 = nn.Conv2d(in_channels=1, out_channels=64, kernel_size=7, stride=2, padding=3, bias=False)

but in the C++ frontend, replacing it does not seem to work:

model->conv1 = torch::nn::Conv2d(torch::nn::Conv2dOptions(in_channels, 64, 7).stride(2).padding(3).bias(false).dilation(1));

Describe the solution you'd like

Guidance on the correct API usage for loading the pretrained weights.

Describe alternatives you've considered

Maybe a model pretrained on a grayscale image dataset would bypass the problem.

Additional context

An exception occurs during the model's forward pass.

[bug] Unable to serialize the VGG19 script module from NeuralStyleTransfer to a file

To Reproduce
Steps to reproduce the behavior:

Run:

python model/create_vgg19_layers_scriptmodule.py 
Traceback (most recent call last):
  File "model/create_vgg19_layers_scriptmodule.py", line 19, in <module>
    main()
  File "model/create_vgg19_layers_scriptmodule.py", line 14, in main
    vgg_19_layers.save(filename)
  File "/usr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 575, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Sequential' object has no attribute 'save'

Expected behavior
The script module should be generated.

Desktop (please complete the following information):

  • OS: Arch Linux
  • Libtorch Version 1.4.1 (from Arch Linux repository)
