
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

computer-vision deep-learning drug-discovery forecasting large-language-models mxnet nlp paddlepaddle pytorch recommender-systems speech-recognition speech-synthesis tensorflow tensorflow2 translation

deeplearningexamples's Introduction

NVIDIA Deep Learning Examples for Tensor Cores

Introduction

This repository provides State-of-the-Art Deep Learning examples that are easy to train and deploy, achieving the best reproducible accuracy and performance with the NVIDIA CUDA-X software stack running on NVIDIA Volta, Turing, and Ampere GPUs.

NVIDIA GPU Cloud (NGC) Container Registry

These examples, along with our NVIDIA deep learning software stack, are provided in a monthly updated Docker container on the NGC container registry (https://ngc.nvidia.com). These containers include:

  • The latest NVIDIA examples from this repository
  • The latest NVIDIA contributions shared upstream to the respective framework
  • The latest NVIDIA Deep Learning software libraries, such as cuDNN, NCCL, and cuBLAS, all of which have been through a rigorous monthly quality-assurance process to ensure that they provide the best possible performance
  • Monthly release notes for each of the NVIDIA optimized containers

Computer Vision

Models Framework AMP Multi-GPU Multi-Node TensorRT ONNX Triton DLC NB
EfficientNet-B0 PyTorch Yes Yes - Supported - Supported Yes -
EfficientNet-B4 PyTorch Yes Yes - Supported - Supported Yes -
EfficientNet-WideSE-B0 PyTorch Yes Yes - Supported - Supported Yes -
EfficientNet-WideSE-B4 PyTorch Yes Yes - Supported - Supported Yes -
EfficientNet v1-B0 TensorFlow2 Yes Yes Yes Example - Supported Yes -
EfficientNet v1-B4 TensorFlow2 Yes Yes Yes Example - Supported Yes -
EfficientNet v2-S TensorFlow2 Yes Yes Yes Example - Supported Yes -
GPUNet PyTorch Yes Yes - Example Yes Example Yes -
Mask R-CNN PyTorch Yes Yes - Example - Supported - Yes
Mask R-CNN TensorFlow2 Yes Yes - Example - Supported Yes -
nnUNet PyTorch Yes Yes - Supported - Supported Yes -
ResNet-50 MXNet Yes Yes - Supported - Supported - -
ResNet-50 PaddlePaddle Yes Yes - Example - Supported - -
ResNet-50 PyTorch Yes Yes - Example - Example Yes -
ResNet-50 TensorFlow Yes Yes - Supported - Supported Yes -
ResNeXt-101 PyTorch Yes Yes - Example - Example Yes -
ResNeXt-101 TensorFlow Yes Yes - Supported - Supported Yes -
SE-ResNeXt-101 PyTorch Yes Yes - Example - Example Yes -
SE-ResNeXt-101 TensorFlow Yes Yes - Supported - Supported Yes -
SSD PyTorch Yes Yes - Supported - Supported - Yes
SSD TensorFlow Yes Yes - Supported - Supported Yes Yes
U-Net Med TensorFlow2 Yes Yes - Example - Supported Yes -

Natural Language Processing

Models Framework AMP Multi-GPU Multi-Node TensorRT ONNX Triton DLC NB
BERT PyTorch Yes Yes Yes Example - Example Yes -
GNMT PyTorch Yes Yes - Supported - Supported - -
ELECTRA TensorFlow2 Yes Yes Yes Supported - Supported Yes -
BERT TensorFlow Yes Yes Yes Example - Example Yes Yes
BERT TensorFlow2 Yes Yes Yes Supported - Supported Yes -
GNMT TensorFlow Yes Yes - Supported - Supported - -
FasterTransformer TensorFlow - - - Example - Supported - -

Recommender Systems

Models Framework AMP Multi-GPU Multi-Node ONNX Triton DLC NB
DLRM PyTorch Yes Yes - Yes Example Yes Yes
DLRM TensorFlow2 Yes Yes Yes - Supported Yes -
NCF PyTorch Yes Yes - - Supported - -
Wide&Deep TensorFlow Yes Yes - - Supported Yes -
Wide&Deep TensorFlow2 Yes Yes - - Supported Yes -
NCF TensorFlow Yes Yes - - Supported Yes -
VAE-CF TensorFlow Yes Yes - - Supported - -
SIM TensorFlow2 Yes Yes - - Supported Yes -

Speech to Text

Models Framework AMP Multi-GPU Multi-Node TensorRT ONNX Triton DLC NB
Jasper PyTorch Yes Yes - Example Yes Example Yes Yes
QuartzNet PyTorch Yes Yes - Supported - Supported Yes -

Text to Speech

Models Framework AMP Multi-GPU Multi-Node TensorRT ONNX Triton DLC NB
FastPitch PyTorch Yes Yes - Example - Example Yes Yes
FastSpeech PyTorch Yes Yes - Example - Supported - -
Tacotron 2 and WaveGlow PyTorch Yes Yes - Example Yes Example Yes -
HiFi-GAN PyTorch Yes Yes - Supported - Supported Yes -

Graph Neural Networks

Models Framework AMP Multi-GPU Multi-Node ONNX Triton DLC NB
SE(3)-Transformer PyTorch Yes Yes - - Supported - -
MoFlow PyTorch Yes Yes - - Supported - -

Time-Series Forecasting

Models Framework AMP Multi-GPU Multi-Node TensorRT ONNX Triton DLC NB
Temporal Fusion Transformer PyTorch Yes Yes - Example Yes Example Yes -

NVIDIA support

In each of the network READMEs, we indicate the level of support that will be provided. The range is from ongoing updates and improvements to a point-in-time release for thought leadership.

Glossary

Multinode Training Supported on a pyxis/enroot Slurm cluster.

Deep Learning Compiler (DLC) TensorFlow XLA and PyTorch JIT and/or TorchScript

Accelerated Linear Algebra (XLA) XLA is a domain-specific compiler for linear algebra that can accelerate TensorFlow models with potentially no source code changes. The results are improvements in speed and memory usage.

PyTorch JIT and/or TorchScript TorchScript is a way to create serializable and optimizable models from PyTorch code. TorchScript is an intermediate representation of a PyTorch model (a subclass of nn.Module) that can then be run in a high-performance environment such as C++.

Automatic Mixed Precision (AMP) Automatic Mixed Precision (AMP) enables mixed precision training on Volta, Turing, and NVIDIA Ampere GPU architectures automatically.
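The underflow problem that motivates AMP's loss scaling can be sketched in pure Python using `struct`'s IEEE half-precision codec. This is an illustrative sketch, not the repository's implementation; the loss scale of 2**16 is a typical default, not a value taken from these examples:

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to IEEE half precision (what FP16 storage does)."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

grad = 1e-8                     # a tiny gradient, typical late in training
scale = 2.0 ** 16               # a typical static loss scale (assumption)

naive = to_fp16(grad)           # stored directly in FP16: underflows to zero
scaled = to_fp16(grad * scale)  # loss scaled before the FP16 cast: survives
recovered = scaled / scale      # unscaled in FP32 before the optimizer step

print(naive)      # 0.0 -- the gradient is lost
print(recovered)  # close to the true 1e-8 gradient
```

Frameworks automate exactly this scale/unscale dance (and skip steps when the scaled gradients overflow), which is why AMP can be enabled without manual tuning.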

TensorFloat-32 (TF32) TensorFloat-32 (TF32) is the new math mode in NVIDIA A100 GPUs for handling the matrix math also called tensor operations. TF32 running on Tensor Cores in A100 GPUs can provide up to 10x speedups compared to single-precision floating-point math (FP32) on Volta GPUs. TF32 is supported in the NVIDIA Ampere GPU architecture and is enabled by default.
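TF32 keeps FP32's 8-bit exponent (so the dynamic range is unchanged) but only 10 explicit mantissa bits. A pure-Python sketch can show how much precision that retains by masking an FP32 bit pattern; note real Tensor Cores round rather than truncate, so this is only an approximation of the format, not of the hardware:

```python
import struct

def tf32_truncate(x: float) -> float:
    """Keep the sign, all 8 exponent bits, and the top 10 of FP32's
    23 mantissa bits -- the precision TF32 retains. Hardware rounds;
    this sketch truncates for simplicity."""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    bits &= ~0x1FFF  # zero the low 13 mantissa bits
    return struct.unpack('<f', struct.pack('<I', bits))[0]

print(tf32_truncate(1.0))           # 1.0 -- exactly representable
print(tf32_truncate(1.0 + 2**-11))  # 1.0 -- below TF32's 10-bit mantissa
print(tf32_truncate(1.0 + 2**-10))  # 1.0009765625 -- last retained bit
```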

Jupyter Notebooks (NB) The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.

Feedback / Contributions

We're posting these examples on GitHub to better support the community, facilitate feedback, as well as collect and implement contributions using GitHub Issues and pull requests. We welcome all contributions!

Known issues

In each of the network READMEs, we indicate any known issues and encourage the community to provide feedback.

deeplearningexamples's People

Contributors

alancucki, alvarognvidia, asulecki, byshiue, grzegorz-k-karch, hxl3s, izzyputterman, jan-golda, jbaczek, jconwaynv, khcs, maggiezha, meatybobby, michal2409, milesial, mmarcinkiewicz, nv-kkudrynski, nvpstr, peri044, pribalta, rajeevsrao, shakandrew, sharathts, shashank3959, shriyapalsamudram, swethmandava, szmigacz, tgrel, vinhngx, yzhang123


deeplearningexamples's Issues

AMP availability in upstream TensorFlow

When will AMP be available in upstream TensorFlow? Or be open-sourced?

It is quite difficult to use the images without source.

Specifically, the images use Python 3.5. I am happy to build my own images and recompile TF, but I cannot without the source for AMP. Thanks!

Without docker?

Can I test these examples using TensorFlow 2.0 alpha without the NGC Docker container?

BERT fp16 warning: global step (tf.train.get_global_step) has not been increased

Hi,

I am running BERT with use_fp16 and keep getting the following warning:

WARNING:tensorflow:It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 61251 vs previous value: 61251. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.

The training process also becomes much slower than before.

Can anyone please help?

Thanks.

need to run data_download outside of docker due to firewalls

It seems I cannot get Docker to correctly access the DNS system and resolve IP addresses, so I have had to run the data downloads manually. However, I cannot find the download_files.py script needed for the following line in bookcorpus/run_preprocessing.sh:

python3 /workspace/bookcorpus/download_files.py --list /workspace/bookcorpus/url_list.jsonl --out ${WORKING_DIR}/download --trash-bad-count

which makes the BookCorpus data a bit difficult to get.

Resnet18 is invalid.

file:
https://github.com/NVIDIA/DeepLearningExamples/blob/master/TensorFlow/Classification/imagenet/nvcnn_hvd.py

error log:

Traceback (most recent call last):
  File "nvcnn_hvd.py", line 1451, in <module>
    main()
  File "nvcnn_hvd.py", line 1320, in main
    batch_size)
  File "nvcnn_hvd.py", line 522, in training_step
    loss, logits = self.loss_func(images, labels, var_scope)
  File "nvcnn_hvd.py", line 1282, in loss_func
    output = model_func(net, images)
  File "nvcnn_hvd.py", line 1249, in <lambda>
    model_func = lambda net, images: inference_resnet_v1(net, images, nlayer)
  File "nvcnn_hvd.py", line 894, in inference_resnet_v1
    if   nlayer ==  18: return inference_resnet_v1_basic_impl(net, input_layer, [2,2, 2,2])
  File "nvcnn_hvd.py", line 881, in inference_resnet_v1_basic_impl
    basic_resnet_bottleneck_callback = partial(resnet_bottleneck_v1, basic=True)
NameError: name 'partial' is not defined
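The traceback shows that nvcnn_hvd.py calls partial without importing it from functools. A minimal reproduction and the likely one-line fix are below; print stands in for the script's resnet_bottleneck_v1, which is not available here:

```python
# Reproduce the NameError: `partial` is used before being imported.
try:
    callback = partial(print, "resnet")  # as in inference_resnet_v1_basic_impl
except NameError as e:
    print(e)  # name 'partial' is not defined

# The likely fix is a single import at the top of nvcnn_hvd.py:
from functools import partial

# With the import in place, binding a keyword argument works as intended:
basic_resnet_bottleneck_callback = partial(print, "bottleneck, basic =")
basic_resnet_bottleneck_callback(True)
```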

The Wiki data source might be corrupted

When running data_download.sh, I get the following issue while it downloads the Wiki data source. Any advice about this?

Downloading Wikidump
Extracting Wikidump
   0 B 0:00:00 [   0 B/s] [<=>                                                                                                                                                                                                              ]

bunzip2: Compressed file ends unexpectedly;
        perhaps it is corrupted?  *Possible* reason follows.
bunzip2: Inappropriate ioctl for device
        Input file = (stdin), output file = (stdout)

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

How to customize my own Transformer model?

Since DeepLearningExamples uses fairseq to build the Transformer model,

from fairseq import data, distributed_utils, options, progress_bar, tasks, utils, bleu, tokenizer
model = task.build_model(args)

if I want to develop my own model based on the Transformer, with many possible changes to the network structure and training strategy, how can I modify your code? Which files should I modify?

How to adjust the hyperparameters when using V100 32 GB

When I used 8x V100 32 GB GPUs to train the Transformer, I found that the results with V100 32 GB were not significantly different from V100 16 GB if I did not adjust the hyperparameters, and sometimes even decreased. I increased the batch size, but the effect was not obvious. How should I adjust the hyperparameters to fully exploit the advantage of the 32 GB V100?
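A common heuristic in this situation is the linear scaling rule: when the global batch size grows (e.g. by doubling the per-GPU batch on 32 GB cards), scale the base learning rate proportionally, usually with a warmup period. The sketch below uses hypothetical numbers, not values from the repository's configs:

```python
def scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Linear scaling rule: learning rate grows with the global batch size."""
    return base_lr * new_batch / base_batch

# Hypothetical numbers: doubling the per-GPU batch when moving to 32 GB cards.
base_lr = 0.5
base_batch = 8 * 5120    # e.g. 8x V100 16 GB
new_batch = 8 * 10240    # 8x V100 32 GB, doubled per-GPU batch

print(scaled_lr(base_lr, base_batch, new_batch))  # 1.0
```

Without such an adjustment, a larger batch simply takes the same number of optimizer steps over fewer, smoother gradients, which is consistent with seeing little difference (or a regression) from the extra memory alone.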

MPI bind memory error

Hi, I tried to pre-train the BERT-base model following the instructions. However, it failed due to a memory-binding error. Any suggestions?
Thanks,
Xiaodong

Error Msg:
WARNING: Open MPI tried to bind a process but failed. This is a
warning only; your job will continue, though performance may
be degraded.

Local host: 2cfa81745dec
Application name: /usr/bin/python3
Error message: failed to bind memory
Location: rtc_hwloc.c:445

`nan` error in Pytorch ImageNet-ResNet50

OS: Ubuntu 18.04 LTS
GPU: NVIDIA TESLA V100 32G * 8
Docker: pytorch-18.09-py3

I ran the script files RN50_FP16_8GPU.sh, RN50_FP16_4GPU.sh, RN50_FP32_8GPU.sh, and RN50_FP32_4GPU.sh, and all of them produced NaN loss after several epochs (<=6). After I replaced ResNet-50 with ResNet-18, I also got NaN loss, after ~20 epochs.

I have tried decreasing lr and batch_size, but it does not help.

PS: In addition, I could successfully run the MXNet version of ImageNet ResNet-50 v1.5.

Check failed: error == cudaSuccess unspecified launch failure

When I tried to resume training Mask R-CNN using Detectron, training went well at the beginning, but as it continued, the time per iteration grew progressively; after hundreds or thousands of iterations, training broke down with the CUDA error below:

E0628 19:13:22.730840 3624 net_dag.cc:195] Exception from operator chain starting at '' (type 'Concat'): caffe2::EnforceNotMet: [enforce fail at context_gpu.h:156] . Encountered CUDA error: unspecified launch failure Error from operator:
input: "gpu_3/roi_feat_3rd" input: "gpu_3/fc1_3rd_w" input: "gpu_3/fc1_3rd_b" output: "gpu_3/fc1_3rd" name: "" type: "FC" arg { name: "use_cudnn" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "cudnn_exhaustive_search" i: 0 } device_option { device_type: 1 cuda_gpu_id: 3 }
E0628 19:13:22.730844 3631 net_dag.cc:195] Secondary exception from operator chain starting at '' (type 'WeightedSum'): caffe2::EnforceNotMet: [enforce fail at context_gpu.h:156] . Encountered CUDA error: unspecified launch failure Error from operator:
input: "gpu_3/fc1_2nd_w_grad" input: "gpu_3/one" input: "gpu_3/fc1_2nd_w" input: "gpu_3/wd" output: "gpu_3/fc1_2nd_w_grad" name: "" type: "WeightedSum" device_option { device_type: 1 cuda_gpu_id: 3 }
E0628 19:13:22.730901 3635 net_dag.cc:195] Secondary exception from operator chain starting at '' (type 'Concat'): caffe2::EnforceNotMet: [enforce fail at context_gpu.h:156] . Encountered CUDA error: unspecified launch failure Error from operator:
input: "gpu_3/_[mask]fcn1" output: "gpu_3/[mask]_fcn1" name: "" type: "Relu" arg { name: "cudnn_exhaustive_search" i: 0 } arg { name: "order" s: "NCHW" } device_option { device_type: 1 cuda_gpu_id: 3 } engine: "CUDNN"
F0628 19:13:22.730955 3631 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failure
F0628 19:13:22.730959 3624 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.730976 3635 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731009 3632 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731016 3636 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731016 3628 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731050 3626 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731067 3630 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731077 3622 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731089 3633 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731108 3634 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731112 3627 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731127 3623 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731139 3637 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731155 3629 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failureF0628 19:13:22.731155 3625 context_gpu.h:107] Check failed: error == cudaSuccess unspecified launch failure
*** Check failure stack trace: ***

I googled this error; it seems to be something about a GPU memory leak, but during training, GPU memory usage is stable and normal until the process breaks down. I tried rebooting my server, but it didn't help. Can you help me out with this?
I checked context_gpu.h at line 107; the code is:

~ThreadLocalCUDAObjects() noexcept {
 99     for (int i = 0; i < CAFFE2_COMPILE_TIME_MAX_GPUS; ++i) {
100       for (auto& handle : cublas_handles_[i]) {
101         if (handle) {
102           CUBLAS_CHECK(cublasDestroy(handle));
103         }
104       }
105       for (auto& stream : cuda_streams_[i]) {
106         if (stream) {
107           CUDA_CHECK(cudaStreamDestroy(stream));
108         }
109       }
110       for (auto& handle : cudnn_handles_[i]) {
111         if (handle) {
112           CUDNN_CHECK(cudnnDestroy(handle));
113         }
114       }
115     }
116   }

Where to find the file sacrebleu_reference.de

I try to run the pytorch transformer model in this repository. But the runtime failed at the following line:
with open(os.path.join(args.data, 'sacrebleu_reference.de'), 'r') as reference:

Here it tries to open the file sacrebleu_reference.de. But where can I get this file? The run_preprocessing.sh script does not generate it, and I also could not find it online. So how do we get this file? Thanks.

Why is PyTorch/Distributed deleted?

Hi, I was searching for tutorials/examples of distributed learning with PyTorch. I found the contents of PyTorch/Distributed very useful, but they became unavailable as of commit hash 7a8544e. Why were they deleted?

How is Horovod installed in nvcr.io/nvidia/tensorflow:19.05-py3?

I'm rebuilding the image against Python 3.6:

  • rebuild TensorFlow against Python 3.6 with the TensorFlow fork found at /opt/tensorflow
  • install Horovod against Python 3.6

I've done the first, but I'm unclear how to do the second, given that the NVIDIA image's Dockerfile is not publicly available.

RUN ldconfig /usr/local/cuda-10.1/targets/x86_64-linux/lib/stubs && \
    HOROVOD_NCCL_HOME=/usr/local/nccl-2.4.6 \
    HOROVOD_GPU_ALLREDUCE=NCCL \
    HOROVOD_WITH_TENSORFLOW=1 \
    HOROVOD_WITHOUT_PYTORCH=1 \
    HOROVOD_WITHOUT_MXNET=1 \
    pip3 install --no-cache-dir horovod==0.16.1 && \
    ldconfig

It is unclear how I can specify HOROVOD_NCCL_HOME, as there doesn't seem to be a /usr/local/nccl-2.4.6.

NanLossDuringTrainingError when training BERT large model

I have been using the BERT FP16 + XLA implementation for several weeks. It worked great for BERT-base model training. Recently I started using it to train the large model with FP16+XLA. The training went well until around step 344k, when it hit NanLossDuringTrainingError with the message "Model diverged with loss = NaN.". The error stack with TF 1.13.1 is below. Can you provide some insight into what's wrong? Thanks.

Model diverged with loss = NaN.
Error recorded from training_loop: NaN loss during training.
training_loop marked as finished
WARNING: Reraising captured error
Traceback (most recent call last):
File "run_pretraining.py", line 610, in
File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "run_pretraining.py", line 582, in main
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2457, in train
rendezvous.raise_errors()
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
six.reraise(typ, value, traceback)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2452, in train
saving_listeners=saving_listeners)
File "/usr/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1158, in _train_model_default
saving_listeners)
File "/usr/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1407, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 676, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 1171, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 1270, in run
raise six.reraise(*original_exc_info)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 1255, in run
return self._sess.run(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 1335, in run
run_metadata=run_metadata))
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 753, in after_run
raise NanLossDuringTrainingError

Where do these magic optimization flags come from? Any docs?

Environment Variables

        # ============================================
        # Optimisation Flags - Do not remove
        # ============================================

        os.environ['CUDA_CACHE_DISABLE'] = '0'

        os.environ['HOROVOD_GPU_ALLREDUCE'] = 'NCCL'

        # os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

        os.environ['TF_GPU_THREAD_MODE'] = 'gpu_private'
        os.environ['TF_GPU_THREAD_COUNT'] = '1' if not hvd_utils.is_using_hvd() else str(hvd.size())

        os.environ['TF_USE_CUDNN_BATCHNORM_SPATIAL_PERSISTENT'] = '1'

        os.environ['TF_ADJUST_HUE_FUSED'] = '1'
        os.environ['TF_ADJUST_SATURATION_FUSED'] = '1'
        os.environ['TF_ENABLE_WINOGRAD_NONFUSED'] = '1'

        os.environ['TF_SYNC_ON_FINISH'] = '0'
        os.environ['TF_AUTOTUNE_THRESHOLD'] = '2'
        os.environ['TF_DISABLE_NVTX_RANGES'] = '1'

        # =================================================

Session Configuration


        config.gpu_options.force_gpu_compatible = True  # Force pinned memory

        if mode == 'train':
            config.intra_op_parallelism_threads = 1  # Avoid pool of Eigen threads

            if hvd_utils.is_using_hvd():
                config.inter_op_parallelism_threads = max(2, (multiprocessing.cpu_count() // hvd.size()) - 2)
            else:
                config.inter_op_parallelism_threads = 4
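The inter-op formula in the snippet above divides the machine's cores among the Horovod workers, reserves a couple of threads per worker for other work, and enforces a floor of 2. A standalone sketch (inter_op_threads is a hypothetical helper name, not from the repository):

```python
import multiprocessing

def inter_op_threads(world_size: int) -> int:
    """Per-worker inter-op parallelism: share the cores across workers,
    leave ~2 threads per worker free, and never go below 2."""
    if world_size > 1:  # distributed (Horovod) run
        return max(2, multiprocessing.cpu_count() // world_size - 2)
    return 4            # single-process default, as in the snippet above

# On a hypothetical 64-core machine with 8 workers: max(2, 64 // 8 - 2) == 6
print(max(2, 64 // 8 - 2))  # 6
```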

BERT pre-training with large batch size

Hi,
For pre-training, it is suggested to use a batch size of 14 on a V100 32 GB GPU. Is it possible to fit a larger batch size (max sequence length 512), e.g. 32, on a V100 32 GB with FP16 and XLA?
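When a larger physical batch does not fit in memory, gradient accumulation gives the same effective batch size by summing micro-batch gradients before each optimizer step. The pure-Python sketch below verifies the equivalence on a toy 1-D least-squares gradient; all names and numbers are illustrative, not from the BERT scripts:

```python
def grad(w, xs, ys):
    """Mean-squared-error gradient for a 1-D linear model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

full = grad(w, xs, ys)  # gradient over the whole batch at once

# Accumulate over micro-batches of 2, weighting each by its size,
# then normalize by the total batch size.
acc = 0.0
micro = 2
for i in range(0, len(xs), micro):
    acc += grad(w, xs[i:i + micro], ys[i:i + micro]) * micro
acc /= len(xs)

print(full, acc)  # identical: accumulation matches the full batch
```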

Source and reference streams have different lengths!

I was training the Transformer model when an error occurred. The training process for the first epoch went very well, but validation raised an error: "EOFError: Source and reference streams have different lengths!". By the way, I ran "sacrebleu -t wmt14/full -l de-en --echo src > $DATASET_DIR/sacrebleu_reference.de" to generate the reference. Does anyone know how to fix it?


| epoch 001 | valid on 'valid' subset | valid_loss 4.55658 | valid_nll_loss 2.8718 | valid_ppl 7.32 | num_updates 7867
| /workspace/data-bin/wmt14_en_de_joined_dict test 3003 examples
| Sentences are being padded to multiples of: 1
generated batches in 0.0007243156433105469 s
Traceback (most recent call last):
File "/workspace/examples/transformer/train.py", line 525, in
distributed_main(args)
File "/workspace/examples/transformer/distributed_train.py", line 57, in main
single_process_main(args)
File "/workspace/examples/transformer/train.py", line 128, in main
current_bleu, current_sc_bleu = score(args, trainer, task, epoch_itr, args.gen_subset)
File "/workspace/examples/transformer/train.py", line 392, in score
sacrebleu_score = sacrebleu.corpus_bleu(predictions, refs, lowercase=args.ignore_case)
File "/opt/conda/lib/python3.6/site-packages/sacrebleu.py", line 1031, in corpus_bleu
raise EOFError("Source and reference streams have different lengths!")
EOFError: Source and reference streams have different lengths!
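The EOFError fires inside sacrebleu when the number of hypotheses and reference lines diverge, which usually means the generated reference file does not match the test set. A pre-scoring sanity check can fail with a clearer message (a sketch, not code from the repo):

```python
def check_stream_lengths(hypotheses, reference_streams):
    # sacrebleu expects every reference stream to have exactly one line
    # per hypothesis; raise early instead of hitting its EOFError.
    for i, refs in enumerate(reference_streams):
        if len(refs) != len(hypotheses):
            raise ValueError(
                "reference stream %d has %d lines, but there are %d hypotheses"
                % (i, len(refs), len(hypotheses)))
```

Comparing `wc -l` on the generated reference file against the 3003 test sentences reported in the log is the quickest way to confirm the mismatch.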

No APEX in the Jupyter docker container for PyTorch/Segmentation/MaskRCNN/

When running the demo notebook Mask_R-CNN_demo from the Jupyter docker built with

DeepLearningExamples/PyTorch/Segmentation/MaskRCNN/pytorch/docker/docker-jupyter/Dockerfile

I encountered the below error:

/notebooks/maskrcnn-benchmark/maskrcnn_benchmark/layers/nms.py in
3 from maskrcnn_benchmark import _C
4
----> 5 from apex import amp
6
7 # Only valid with fp32 inputs - give AMP the hint

ModuleNotFoundError: No module named 'apex'

Interpreting SQUAD benchmark results and possible difference in reproduction

In the README, benchmark results for SQUAD on v100s are given, https://github.com/NVIDIA/DeepLearningExamples/blob/master/TensorFlow/LanguageModeling/BERT/README.md#nvidia-dgx-1-8x-v100-16g

Does sentences/sec correspond to the print out of examples/sec given when running run_squad.py?

When I run run_squad.py on GKE (with fp16, XLA, Horovod, batch size 2), I get 21 examples/sec for 1x V100 and 14 examples/sec for 8x V100 (I'm assuming this is per GPU?). These are slightly different from the results given in the README.

EDIT: this was run on the base model and NOT large

Error in downloading data

When running data_download.sh, it fails while downloading the Wikimedia data, saying there is no such directory. It seems some source file is missing. Is there any other way to get and process the data?

Downloading Wikidump
--2019-06-26 12:10:32-- ftp://ftpmirror.your.org/pub/wikimedia/dumps/enwiki/20190301/enwiki-20190301-pages-articles-multistream.xml.bz2
=> ‘wikidump.xml.bz2’
Resolving ftpmirror.your.org (ftpmirror.your.org)... 204.9.55.82, 2001:4978:1:420::cc09:3752
Connecting to ftpmirror.your.org (ftpmirror.your.org)|204.9.55.82|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/wikimedia/dumps/enwiki/20190301 ...
No such directory ‘pub/wikimedia/dumps/enwiki/20190301’.

[Resolved] BERT: AMP not being called

I ran BERT pretraining with TensorFlow 1.12 and Horovod 0.13.8, not with the NGC container.
But I got worse performance with mixed precision (AMP) than with fp32.

I use CUDA 9.0 and cuDNN 7.1.

The command lines to run pretraining with and without amp are as follows:
./run_pretraining.sh 32 8 5e-5 fp32 1, and
./run_pretraining.sh 32 8 5e-5 amp 1

Any suggestion?

pFP16Initializer in train_cifar10.py should be replaced with PseudoFP16Initializer

Running the Caffe2/Classification/cifar10/train_cifar10.py script results in the following error:

Traceback (most recent call last):
  File "train_cifar10.py", line 13, in <module>
    from caffe2.python.modeling.initializers import Initializer, pFP16Initializer
ImportError: cannot import name pFP16Initializer

This can be fixed by replacing all occurrences of pFP16Initializer with PseudoFP16Initializer.
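A version-agnostic import shim makes the script run on Caffe2 builds from both before and after the rename (a sketch based on the fix described above; the alias FP16Init is mine):

```python
try:
    # Newer Caffe2 builds use the renamed class.
    from caffe2.python.modeling.initializers import PseudoFP16Initializer as FP16Init
except ImportError:
    try:
        # Older builds still expose the pre-rename name.
        from caffe2.python.modeling.initializers import pFP16Initializer as FP16Init
    except ImportError:
        FP16Init = None  # Caffe2 is not installed in this environment
```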

ModuleNotFoundError: No module named 'fairseq.data.batch_C'

Hello,

When running "run_preprocessing.sh", there is no module "batch_C" in fairseq/data within the DeepLearningExamples/PyTorch/Translation/Transformer directory.

Anybody with a suggestion?
R.

Traceback (most recent call last):
File "preprocess.py", line 15, in
from fairseq.data import indexed_dataset, dictionary
File "/bfs/hpc_cluster/work/DeepLearningExamples/PyTorch/Translation/Transformer/fairseq/data/init.py", line 26, in
from .language_pair_dataset import LanguagePairDataset
File "/bfs/hpc_cluster/work/DeepLearningExamples/PyTorch/Translation/Transformer/fairseq/data/language_pair_dataset.py", line 26, in
from . import data_utils, FairseqDataset
File "/bfs/hpc_cluster/work/DeepLearningExamples/PyTorch/Translation/Transformer/fairseq/data/data_utils.py", line 31, in
import fairseq.data.batch_C
ModuleNotFoundError: No module named 'fairseq.data.batch_C'

Error in Validation of Transformer

In train.py, line 391 reads predictions = [hypo[1] + ('\n' if hypo[-1]!='\n' else '') for hypo in predictions]. I think it should be predictions = [hypo[1] + ('\n' if hypo[1][-1]!='\n' else '') for hypo in predictions], because hypo[-1] is the last element of the tuple (a whole sentence), which will never equal "\n".
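A minimal reproduction of the off-by-one, with made-up hypothesis tuples (hypo[-1] is the last tuple element, i.e. a whole sentence, not the last character):

```python
hypos = [("id0", "ein Satz"), ("id1", "noch ein Satz\n")]

# Buggy: h[-1] is a whole sentence, never "\n", so a newline is
# appended even when the sentence already ends with one.
buggy = [h[1] + ('\n' if h[-1] != '\n' else '') for h in hypos]

# Fixed: check the last character of the sentence instead.
fixed = [h[1] + ('\n' if h[1][-1] != '\n' else '') for h in hypos]
```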

SQUAD benchmark segfaults on tf upstream AMP

  1. I built an image based on https://gitlab.com/nvidia/cuda/blob/ubuntu18.04/10.0/base/Dockerfile [1] with tf-nightly-gpu==1.14.1.dev20190523 (an attempt to use upstream AMP)
  2. I was able to run the Horovod benchmark https://github.com/horovod/horovod/blob/master/docs/benchmarks.md on the image on an 8x V100 16 GB machine.
  3. However, the SQUAD benchmark (fp16, xla, horovod) on the same image and machine segfaults [2].

[1]

FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04

ENV LD_LIBRARY_PATH $LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
ENV CUDA_HOME /usr/local/cuda

RUN mkdir -p /usr/src/app
WORKDIR /usr/src
COPY requirements.txt /usr/src/

RUN apt-get update && apt-get install -yqq --no-install-recommends \
        software-properties-common \
        build-essential \
        curl \
        python3-dev \
        python3-distutils \
        libfreetype6-dev \
        libpng-dev \
        libzmq3-dev \
        pkg-config \
        rsync \
        unzip \
        nvidia-modprobe \
        git \
        wget

RUN ln -s /usr/bin/python3.6 /usr/bin/python

RUN mkdir -p /usr/src/app
WORKDIR /usr/src

RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python get-pip.py && \
    rm get-pip.py

COPY requirements.txt /usr/src/
RUN pip install --no-cache-dir -r requirements.txt

ENV LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/usr/local/nccl-2.4.2

# mpi
RUN mkdir /tmp/openmpi && \
    cd /tmp/openmpi && \
    wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.0.tar.gz && \
    tar zxf openmpi-4.0.0.tar.gz && \
    cd openmpi-4.0.0 && \
    ./configure --enable-orterun-prefix-by-default && \
    make -j $(nproc) all && \
    make install && \
    ldconfig && \
    rm -rf /tmp/openmpi

RUN ldconfig /usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs && \
    HOROVOD_NCCL_HOME=/usr/local/nccl-2.4.2 \
    HOROVOD_GPU_ALLREDUCE=NCCL \
    HOROVOD_WITH_TENSORFLOW=1 \
    HOROVOD_WITHOUT_PYTORCH=1 \
    HOROVOD_WITHOUT_MXNET=1 \
    pip install --no-cache-dir horovod && \
    ldconfig

# Install OpenSSH for MPI to communicate between containers
RUN apt-get update && apt-get install -y --no-install-recommends \
    openssh-client \
    openssh-server && \
    mkdir -p /var/run/sshd

# Allow OpenSSH to talk to containers without asking for confirmation
RUN cat /etc/ssh/ssh_config | grep -v StrictHostKeyChecking > /etc/ssh/ssh_config.new && \
    echo "    StrictHostKeyChecking no" >> /etc/ssh/ssh_config.new && \
    mv /etc/ssh/ssh_config.new /etc/ssh/ssh_config

# benchmarks
RUN cd /usr/src && git clone https://github.com/tensorflow/benchmarks

[2]


Current thread 0x00007f5bf0da9740 (most recent call first):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1871 in _create_c_op
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 2034 in __init__
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3623 in create_op
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507 in new_func
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788 in _apply_op_helper
  File "<string>", line 80 in horovod_allreduce
  File "/usr/local/lib/python3.6/dist-packages/horovod/tensorflow/mpi_ops.py", line 79 in _allreduce
  File "/usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py", line 78 in allreduce
  File "/usr/src/bert/run_squad.py", line 1314 in main
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251 in _run_main
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300 in run
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 40 in run
  File "/usr/src/bert/run_squad.py", line 1417 in <module>
Fatal Python error: Segmentation fault

Tensorflow NCF wrong movielens data location

Data after the preparation step is stored at "/tmp/cache/ml-20m" instead of "/data/cache/ml-20m"
So if one follows the instruction:

datadir=/data/cache/ml-20m
mpirun -np $numgpu \
    --allow-run-as-root \
    python ncf.py --data $datadir

the data will be missing.
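Until the instructions and the preparation script agree, a small guard before launching training can locate the cache in either place (a sketch; the helper name and fallback logic are mine):

```python
import os


def resolve_ml20m_cache(preferred="/data/cache/ml-20m",
                        fallback="/tmp/cache/ml-20m"):
    # The preparation step may write to /tmp/cache instead of /data/cache
    # (the mismatch described above), so check both before training.
    for path in (preferred, fallback):
        if os.path.isdir(path):
            return path
    raise FileNotFoundError(
        "ml-20m cache not found in %s or %s" % (preferred, fallback))
```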

Tacotron 2 and WaveGlow for PyTorch typo

In the quick start guide of "Tacotron 2 and WaveGlow for PyTorch" there is a typo. It currently suggests executing:

bash scripts/prepare-dataset.sh

The file is actually named:

prepare_dataset.sh

The data_download.sh script doesn't process wikipedia

Here are the files I see after going through data download instructions in https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT

The wikipedia directory seems to be missing files. Here's what I see after running data_download.sh, which takes about 20 hours:

4.0K    ./data/wikipedia_corpus/intermediate_files
16G     ./data/wikipedia_corpus/download
4.0K    ./data/wikipedia_corpus/final_text_files_sharded
67G     ./data/wikipedia_corpus/raw_data
13G     ./data/wikipedia_corpus/extracted_articles/AA
13G     ./data/wikipedia_corpus/extracted_articles
4.0K    ./data/wikipedia_corpus/final_text_file_single
4.0K    ./data/wikipedia_corpus/final_tfrecords_sharded
95G     ./data/wikipedia_corpus
5.2G    ./data/bookcorpus/intermediate_files
5.3G    ./data/bookcorpus/download
5.2G    ./data/bookcorpus/final_text_files_sharded
5.2G    ./data/bookcorpus/final_text_file_single
52G     ./data/bookcorpus/final_tfrecords_sharded
72G     ./data/bookcorpus
45M     ./data/squad/v2.0
34M     ./data/squad/v1.1
78M     ./data/squad
417M    ./data/pretrained_models_google/cased_L-12_H-768_A-12
394M    ./data/pretrained_models_google/chinese_L-12_H-768_A-12
643M    ./data/pretrained_models_google/multilingual_L-12_H-768_A-12
422M    ./data/pretrained_models_google/uncased_L-12_H-768_A-12
1.3G    ./data/pretrained_models_google/uncased_L-24_H-1024_A-16
684M    ./data/pretrained_models_google/multi_cased_L-12_H-768_A-12
1.3G    ./data/pretrained_models_google/cased_L-24_H-1024_A-16
9.7G    ./data/pretrained_models_google
177G    ./data
16K     ./__pycache__
12K     ./scripts/docker
44K     ./scripts
177G    .

Note that it only extracts the AA shard of the Wikipedia dump, and the final sharded directories are empty.
The BookCorpus part worked fine.

Why The Test Result of Transformer NMT Task with 4 GPUs Is Worse Than What Is Reported in Readme

In the readme file, 4 GPUs can achieve a BLEU of 28.35 and even 28.67 when training more epochs.

GPU count | Mixed precision BLEU | fp32 BLEU | Mixed precision training time | fp32 training time
--------- | -------------------- | --------- | ----------------------------- | ------------------
8         | 28.69                | 28.43     | 446 min                       | 1896 min
4         | 28.35                | 28.31     | 834 min                       | 3733 min

GPU count | Precision | BLEU score | Epochs to train | Training time
--------- | --------- | ---------- | --------------- | -------------
4         | fp16      | 28.67      | 74              | 1925 min
4         | fp32      | 28.40      | 47              | 5478 min

However, I ran the code with 4 GPUs without modifying it at all, and the best result I got is 27.63 on my "checkpoint_best.pt", which is epoch 19 in my case. I ran 80 epochs in total, and the best BLEU over all those epochs is 28.13, which is not the checkpoint selected as "checkpoint_best.pt" by the validation process.

I used the following command line to train the model:

nohup python -m torch.distributed.launch --nproc_per_node 4 /workspace/translation/train.py /workspace/data-bin/wmt14_en_de_joined_dict \
    --arch transformer_wmt_en_de_big_t2t \
    --share-all-embeddings \
    --optimizer adam \
    --adam-betas '(0.9, 0.997)' \
    --adam-eps "1e-9" \
    --clip-norm 0.0 \
    --lr-scheduler inverse_sqrt \
    --warmup-init-lr 0.0 \
    --update-freq 2 \
    --warmup-updates 8000 \
    --lr 0.0006 \
    --min-lr 0.0 \
    --dropout 0.1 \
    --weight-decay 0.0 \
    --criterion label_smoothed_cross_entropy \
    --label-smoothing 0.1 \
    --max-tokens 5120 \
    --seed 1 \
    --max-epoch 80 \
    --ignore-case \
    --fp16 \
    --save-dir /workspace/checkpoints \
    --distributed-init-method env:// > train.nohup.out &

I also tried different warmup-updates and lr values, and the results are similar. The results I got look like:

Test Checkpoint1
| Translated 3003 sentences (84994 tokens) in 25.2s (119.35 sentences/s, 3377.84 tokens/s)
| Generate test with beam=4: BLEU4 = 18.11, 50.2/23.5/12.7/7.2 (BP=1.000, ratio=1.041, syslen=67147, reflen=64512)
Test Checkpoint2
| Translated 3003 sentences (87704 tokens) in 27.5s (109.17 sentences/s, 3188.43 tokens/s)
| Generate test with beam=4: BLEU4 = 21.26, 52.5/26.7/15.5/9.4 (BP=1.000, ratio=1.061, syslen=68450, reflen=64512)
Test Checkpoint3
| Translated 3003 sentences (86611 tokens) in 25.8s (116.61 sentences/s, 3363.17 tokens/s)
| Generate test with beam=4: BLEU4 = 23.91, 55.5/29.5/17.8/11.2 (BP=1.000, ratio=1.040, syslen=67079, reflen=64512)
Test Checkpoint4
| Translated 3003 sentences (86518 tokens) in 25.8s (116.61 sentences/s, 3359.54 tokens/s)
| Generate test with beam=4: BLEU4 = 25.26, 56.7/30.9/19.0/12.3 (BP=1.000, ratio=1.035, syslen=66758, reflen=64512)
Test Checkpoint5
| Translated 3003 sentences (86768 tokens) in 25.7s (116.96 sentences/s, 3379.47 tokens/s)
| Generate test with beam=4: BLEU4 = 25.63, 56.8/31.2/19.4/12.5 (BP=1.000, ratio=1.034, syslen=66698, reflen=64512)
Test Checkpoint6
| Translated 3003 sentences (87220 tokens) in 25.8s (116.21 sentences/s, 3375.30 tokens/s)
| Generate test with beam=4: BLEU4 = 25.98, 56.9/31.5/19.8/12.9 (BP=1.000, ratio=1.042, syslen=67205, reflen=64512)
Test Checkpoint7
| Translated 3003 sentences (87715 tokens) in 25.9s (115.80 sentences/s, 3382.54 tokens/s)
| Generate test with beam=4: BLEU4 = 26.24, 57.2/31.8/20.0/13.0 (BP=1.000, ratio=1.045, syslen=67413, reflen=64512)
Test Checkpoint8
| Translated 3003 sentences (87808 tokens) in 26.8s (111.88 sentences/s, 3271.39 tokens/s)
| Generate test with beam=4: BLEU4 = 26.82, 57.6/32.3/20.5/13.6 (BP=1.000, ratio=1.045, syslen=67444, reflen=64512)
Test Checkpoint9
| Translated 3003 sentences (87394 tokens) in 25.6s (117.26 sentences/s, 3412.38 tokens/s)
| Generate test with beam=4: BLEU4 = 26.63, 57.8/32.2/20.3/13.3 (BP=1.000, ratio=1.039, syslen=67033, reflen=64512)
Test Checkpoint10
| Translated 3003 sentences (86825 tokens) in 25.8s (116.31 sentences/s, 3362.82 tokens/s)
| Generate test with beam=4: BLEU4 = 27.10, 58.1/32.7/20.7/13.7 (BP=1.000, ratio=1.031, syslen=66541, reflen=64512)
Test Checkpoint11
| Translated 3003 sentences (86850 tokens) in 25.9s (116.11 sentences/s, 3358.03 tokens/s)
| Generate test with beam=4: BLEU4 = 27.29, 58.1/32.8/20.9/13.9 (BP=1.000, ratio=1.032, syslen=66563, reflen=64512)
Test Checkpoint12
| Translated 3003 sentences (87137 tokens) in 26.2s (114.74 sentences/s, 3329.31 tokens/s)
| Generate test with beam=4: BLEU4 = 27.28, 58.2/32.9/20.9/13.8 (BP=1.000, ratio=1.035, syslen=66787, reflen=64512)
Test Checkpoint13
| Translated 3003 sentences (86810 tokens) in 25.6s (117.41 sentences/s, 3393.98 tokens/s)
| Generate test with beam=4: BLEU4 = 27.26, 58.3/32.9/20.9/13.8 (BP=1.000, ratio=1.031, syslen=66500, reflen=64512)
Test Checkpoint14
| Translated 3003 sentences (87359 tokens) in 25.8s (116.30 sentences/s, 3383.15 tokens/s)
| Generate test with beam=4: BLEU4 = 27.69, 58.3/33.2/21.3/14.3 (BP=1.000, ratio=1.036, syslen=66830, reflen=64512)
Test Checkpoint15
| Translated 3003 sentences (87415 tokens) in 26.3s (114.33 sentences/s, 3327.98 tokens/s)
| Generate test with beam=4: BLEU4 = 27.37, 58.1/32.9/21.0/14.0 (BP=1.000, ratio=1.038, syslen=66951, reflen=64512)
Test Checkpoint16
| Translated 3003 sentences (87332 tokens) in 26.7s (112.51 sentences/s, 3272.10 tokens/s)
| Generate test with beam=4: BLEU4 = 27.33, 58.1/32.9/21.0/13.9 (BP=1.000, ratio=1.039, syslen=66998, reflen=64512)
Test Checkpoint17
| Translated 3003 sentences (86721 tokens) in 25.9s (116.06 sentences/s, 3351.62 tokens/s)
| Generate test with beam=4: BLEU4 = 27.32, 58.4/33.0/20.9/13.8 (BP=1.000, ratio=1.029, syslen=66385, reflen=64512)
Test Checkpoint18
| Translated 3003 sentences (87388 tokens) in 26.2s (114.71 sentences/s, 3338.08 tokens/s)
| Generate test with beam=4: BLEU4 = 27.57, 58.3/33.1/21.2/14.2 (BP=1.000, ratio=1.038, syslen=66956, reflen=64512)
Test Checkpoint19
| Translated 3003 sentences (86919 tokens) in 25.8s (116.28 sentences/s, 3365.50 tokens/s)
| Generate test with beam=4: BLEU4 = 27.63, 58.6/33.3/21.2/14.1 (BP=1.000, ratio=1.033, syslen=66642, reflen=64512)
Test Checkpoint20
| Translated 3003 sentences (87485 tokens) in 26.1s (115.24 sentences/s, 3357.16 tokens/s)
| Generate test with beam=4: BLEU4 = 27.48, 58.1/33.0/21.1/14.1 (BP=1.000, ratio=1.037, syslen=66924, reflen=64512)
Test Checkpoint21
| Translated 3003 sentences (86993 tokens) in 26.3s (114.07 sentences/s, 3304.46 tokens/s)
| Generate test with beam=4: BLEU4 = 27.77, 58.5/33.3/21.4/14.3 (BP=1.000, ratio=1.032, syslen=66564, reflen=64512)
Test Checkpoint22
| Translated 3003 sentences (87084 tokens) in 25.4s (118.07 sentences/s, 3424.04 tokens/s)
| Generate test with beam=4: BLEU4 = 27.87, 58.6/33.3/21.5/14.4 (BP=1.000, ratio=1.032, syslen=66595, reflen=64512)
Test Checkpoint23
| Translated 3003 sentences (87013 tokens) in 26.4s (113.92 sentences/s, 3300.98 tokens/s)
| Generate test with beam=4: BLEU4 = 27.59, 58.4/33.2/21.2/14.1 (BP=1.000, ratio=1.033, syslen=66626, reflen=64512)
Test Checkpoint24
| Translated 3003 sentences (86741 tokens) in 26.0s (115.49 sentences/s, 3335.84 tokens/s)
| Generate test with beam=4: BLEU4 = 27.98, 58.7/33.5/21.6/14.4 (BP=1.000, ratio=1.029, syslen=66379, reflen=64512)
Test Checkpoint25
| Translated 3003 sentences (86884 tokens) in 25.4s (118.05 sentences/s, 3415.42 tokens/s)
| Generate test with beam=4: BLEU4 = 27.94, 58.8/33.5/21.5/14.4 (BP=1.000, ratio=1.029, syslen=66392, reflen=64512)
Test Checkpoint26
| Translated 3003 sentences (86840 tokens) in 26.4s (113.68 sentences/s, 3287.46 tokens/s)
| Generate test with beam=4: BLEU4 = 27.91, 58.7/33.5/21.5/14.4 (BP=1.000, ratio=1.028, syslen=66344, reflen=64512)
Test Checkpoint27
| Translated 3003 sentences (87050 tokens) in 26.2s (114.45 sentences/s, 3317.73 tokens/s)
| Generate test with beam=4: BLEU4 = 27.88, 58.7/33.4/21.5/14.3 (BP=1.000, ratio=1.030, syslen=66451, reflen=64512)
Test Checkpoint28
| Translated 3003 sentences (86981 tokens) in 25.8s (116.40 sentences/s, 3371.53 tokens/s)
| Generate test with beam=4: BLEU4 = 27.80, 58.7/33.3/21.4/14.3 (BP=1.000, ratio=1.031, syslen=66488, reflen=64512)
Test Checkpoint29
| Translated 3003 sentences (86219 tokens) in 25.6s (117.33 sentences/s, 3368.59 tokens/s)
| Generate test with beam=4: BLEU4 = 27.82, 58.8/33.4/21.4/14.3 (BP=1.000, ratio=1.022, syslen=65941, reflen=64512)
Test Checkpoint30
| Translated 3003 sentences (86879 tokens) in 26.9s (111.61 sentences/s, 3229.04 tokens/s)
| Generate test with beam=4: BLEU4 = 27.88, 58.6/33.4/21.5/14.4 (BP=1.000, ratio=1.031, syslen=66501, reflen=64512)
Test Checkpoint31
| Translated 3003 sentences (87082 tokens) in 26.6s (112.83 sentences/s, 3271.95 tokens/s)
| Generate test with beam=4: BLEU4 = 28.00, 58.8/33.6/21.6/14.4 (BP=1.000, ratio=1.032, syslen=66570, reflen=64512)
Test Checkpoint32
| Translated 3003 sentences (86677 tokens) in 26.6s (112.93 sentences/s, 3259.43 tokens/s)
| Generate test with beam=4: BLEU4 = 27.98, 58.8/33.5/21.6/14.4 (BP=1.000, ratio=1.028, syslen=66289, reflen=64512)
Test Checkpoint33
| Translated 3003 sentences (87034 tokens) in 26.2s (114.54 sentences/s, 3319.61 tokens/s)
| Generate test with beam=4: BLEU4 = 28.10, 58.8/33.6/21.7/14.5 (BP=1.000, ratio=1.032, syslen=66553, reflen=64512)
Test Checkpoint34
| Translated 3003 sentences (87064 tokens) in 26.3s (114.28 sentences/s, 3313.16 tokens/s)
| Generate test with beam=4: BLEU4 = 27.92, 58.4/33.3/21.6/14.4 (BP=1.000, ratio=1.031, syslen=66534, reflen=64512)
Test Checkpoint35
| Translated 3003 sentences (86818 tokens) in 26.6s (112.86 sentences/s, 3262.78 tokens/s)
| Generate test with beam=4: BLEU4 = 28.11, 58.9/33.7/21.7/14.5 (BP=1.000, ratio=1.028, syslen=66336, reflen=64512)
Test Checkpoint36
| Translated 3003 sentences (87037 tokens) in 25.9s (115.89 sentences/s, 3358.98 tokens/s)
| Generate test with beam=4: BLEU4 = 28.18, 58.8/33.6/21.8/14.6 (BP=1.000, ratio=1.031, syslen=66483, reflen=64512)
Test Checkpoint37
| Translated 3003 sentences (86740 tokens) in 25.7s (116.91 sentences/s, 3376.92 tokens/s)
| Generate test with beam=4: BLEU4 = 28.19, 58.9/33.7/21.8/14.6 (BP=1.000, ratio=1.026, syslen=66197, reflen=64512)
Test Checkpoint38
| Translated 3003 sentences (87084 tokens) in 26.1s (115.05 sentences/s, 3336.24 tokens/s)
| Generate test with beam=4: BLEU4 = 28.01, 58.7/33.5/21.6/14.5 (BP=1.000, ratio=1.032, syslen=66551, reflen=64512)
Test Checkpoint39
| Translated 3003 sentences (86972 tokens) in 27.7s (108.47 sentences/s, 3141.58 tokens/s)
| Generate test with beam=4: BLEU4 = 28.10, 58.7/33.5/21.7/14.6 (BP=1.000, ratio=1.030, syslen=66456, reflen=64512)
Test Checkpoint40
| Translated 3003 sentences (86717 tokens) in 25.7s (116.94 sentences/s, 3376.78 tokens/s)
| Generate test with beam=4: BLEU4 = 27.81, 58.7/33.4/21.4/14.2 (BP=1.000, ratio=1.028, syslen=66314, reflen=64512)
Test Checkpoint41
| Translated 3003 sentences (86542 tokens) in 26.0s (115.52 sentences/s, 3329.06 tokens/s)
| Generate test with beam=4: BLEU4 = 27.69, 58.9/33.3/21.3/14.1 (BP=1.000, ratio=1.025, syslen=66127, reflen=64512)
Test Checkpoint42
| Translated 3003 sentences (86841 tokens) in 27.1s (110.96 sentences/s, 3208.64 tokens/s)
| Generate test with beam=4: BLEU4 = 27.99, 58.7/33.5/21.6/14.5 (BP=1.000, ratio=1.028, syslen=66329, reflen=64512)
Test Checkpoint43
| Translated 3003 sentences (86986 tokens) in 26.8s (111.92 sentences/s, 3241.95 tokens/s)
| Generate test with beam=4: BLEU4 = 27.81, 58.6/33.3/21.4/14.3 (BP=1.000, ratio=1.031, syslen=66501, reflen=64512)
Test Checkpoint44
| Translated 3003 sentences (86691 tokens) in 25.6s (117.24 sentences/s, 3384.53 tokens/s)
| Generate test with beam=4: BLEU4 = 28.09, 58.8/33.6/21.7/14.6 (BP=1.000, ratio=1.026, syslen=66162, reflen=64512)
Test Checkpoint45
| Translated 3003 sentences (86845 tokens) in 26.5s (113.44 sentences/s, 3280.52 tokens/s)
| Generate test with beam=4: BLEU4 = 28.00, 58.8/33.5/21.6/14.4 (BP=1.000, ratio=1.029, syslen=66353, reflen=64512)
Test Checkpoint46
| Translated 3003 sentences (86280 tokens) in 25.7s (116.75 sentences/s, 3354.46 tokens/s)
| Generate test with beam=4: BLEU4 = 28.13, 59.0/33.6/21.7/14.6 (BP=1.000, ratio=1.021, syslen=65860, reflen=64512)
Test Checkpoint47
| Translated 3003 sentences (86857 tokens) in 26.4s (113.64 sentences/s, 3286.92 tokens/s)
| Generate test with beam=4: BLEU4 = 27.77, 58.6/33.3/21.4/14.3 (BP=1.000, ratio=1.029, syslen=66402, reflen=64512)
Test Checkpoint48
| Translated 3003 sentences (87087 tokens) in 26.0s (115.65 sentences/s, 3353.93 tokens/s)
| Generate test with beam=4: BLEU4 = 27.68, 58.4/33.2/21.3/14.2 (BP=1.000, ratio=1.032, syslen=66576, reflen=64512)
Test Checkpoint49
| Translated 3003 sentences (86627 tokens) in 25.5s (117.97 sentences/s, 3402.95 tokens/s)
| Generate test with beam=4: BLEU4 = 28.02, 59.0/33.6/21.6/14.4 (BP=1.000, ratio=1.026, syslen=66208, reflen=64512)
Test Checkpoint50
| Translated 3003 sentences (86529 tokens) in 25.9s (116.09 sentences/s, 3345.07 tokens/s)
| Generate test with beam=4: BLEU4 = 27.96, 58.8/33.5/21.5/14.4 (BP=1.000, ratio=1.024, syslen=66049, reflen=64512)
Test Checkpoint51
| Translated 3003 sentences (87095 tokens) in 26.2s (114.50 sentences/s, 3320.73 tokens/s)
| Generate test with beam=4: BLEU4 = 27.80, 58.6/33.4/21.4/14.3 (BP=1.000, ratio=1.030, syslen=66471, reflen=64512)
Test Checkpoint52
| Translated 3003 sentences (87160 tokens) in 27.2s (110.54 sentences/s, 3208.27 tokens/s)
| Generate test with beam=4: BLEU4 = 27.89, 58.6/33.4/21.5/14.4 (BP=1.000, ratio=1.032, syslen=66559, reflen=64512)
Test Checkpoint53
| Translated 3003 sentences (86909 tokens) in 26.1s (114.96 sentences/s, 3326.93 tokens/s)
| Generate test with beam=4: BLEU4 = 27.90, 58.8/33.5/21.5/14.3 (BP=1.000, ratio=1.029, syslen=66353, reflen=64512)
Test Checkpoint54
| Translated 3003 sentences (86785 tokens) in 26.1s (114.94 sentences/s, 3321.61 tokens/s)
| Generate test with beam=4: BLEU4 = 28.05, 58.8/33.6/21.6/14.5 (BP=1.000, ratio=1.028, syslen=66308, reflen=64512)
Test Checkpoint55
| Translated 3003 sentences (86914 tokens) in 25.9s (115.95 sentences/s, 3355.82 tokens/s)
| Generate test with beam=4: BLEU4 = 27.76, 58.5/33.3/21.4/14.2 (BP=1.000, ratio=1.029, syslen=66376, reflen=64512)
Test Checkpoint56
| Translated 3003 sentences (86775 tokens) in 26.5s (113.27 sentences/s, 3273.16 tokens/s)
| Generate test with beam=4: BLEU4 = 27.75, 58.5/33.2/21.4/14.3 (BP=1.000, ratio=1.028, syslen=66314, reflen=64512)
Test Checkpoint57
| Translated 3003 sentences (86522 tokens) in 26.3s (114.39 sentences/s, 3295.88 tokens/s)
| Generate test with beam=4: BLEU4 = 27.91, 58.9/33.4/21.5/14.3 (BP=1.000, ratio=1.024, syslen=66052, reflen=64512)
Test Checkpoint58
| Translated 3003 sentences (86269 tokens) in 26.1s (114.94 sentences/s, 3301.85 tokens/s)
| Generate test with beam=4: BLEU4 = 27.77, 58.7/33.3/21.4/14.2 (BP=1.000, ratio=1.021, syslen=65893, reflen=64512)
Test Checkpoint59
| Translated 3003 sentences (86738 tokens) in 25.9s (115.78 sentences/s, 3344.27 tokens/s)
| Generate test with beam=4: BLEU4 = 27.96, 58.5/33.4/21.6/14.5 (BP=1.000, ratio=1.029, syslen=66378, reflen=64512)
Test Checkpoint60
| Translated 3003 sentences (86566 tokens) in 25.7s (116.92 sentences/s, 3370.48 tokens/s)
| Generate test with beam=4: BLEU4 = 27.85, 58.7/33.4/21.5/14.3 (BP=1.000, ratio=1.025, syslen=66151, reflen=64512)
Test Checkpoint61
| Translated 3003 sentences (86785 tokens) in 25.3s (118.91 sentences/s, 3436.47 tokens/s)
| Generate test with beam=4: BLEU4 = 27.74, 58.7/33.3/21.3/14.2 (BP=1.000, ratio=1.028, syslen=66291, reflen=64512)
Test Checkpoint62
| Translated 3003 sentences (86261 tokens) in 25.7s (116.79 sentences/s, 3354.79 tokens/s)
| Generate test with beam=4: BLEU4 = 27.86, 58.8/33.4/21.5/14.3 (BP=1.000, ratio=1.021, syslen=65898, reflen=64512)
Test Checkpoint63
| Translated 3003 sentences (86569 tokens) in 25.1s (119.58 sentences/s, 3447.32 tokens/s)
| Generate test with beam=4: BLEU4 = 27.92, 58.8/33.5/21.5/14.4 (BP=1.000, ratio=1.025, syslen=66155, reflen=64512)
Test Checkpoint64
| Translated 3003 sentences (86583 tokens) in 25.8s (116.47 sentences/s, 3357.96 tokens/s)
| Generate test with beam=4: BLEU4 = 27.59, 58.5/33.2/21.2/14.1 (BP=1.000, ratio=1.025, syslen=66146, reflen=64512)
Test Checkpoint65
| Translated 3003 sentences (86707 tokens) in 26.2s (114.76 sentences/s, 3313.64 tokens/s)
| Generate test with beam=4: BLEU4 = 27.78, 58.5/33.3/21.4/14.2 (BP=1.000, ratio=1.028, syslen=66294, reflen=64512)
Test Checkpoint66
| Translated 3003 sentences (86478 tokens) in 26.0s (115.55 sentences/s, 3327.54 tokens/s)
| Generate test with beam=4: BLEU4 = 27.63, 58.5/33.2/21.3/14.1 (BP=1.000, ratio=1.025, syslen=66114, reflen=64512)
Test Checkpoint67
| Translated 3003 sentences (86564 tokens) in 25.8s (116.40 sentences/s, 3355.20 tokens/s)
| Generate test with beam=4: BLEU4 = 27.92, 58.6/33.4/21.5/14.4 (BP=1.000, ratio=1.026, syslen=66200, reflen=64512)
Test Checkpoint68
| Translated 3003 sentences (86548 tokens) in 26.2s (114.58 sentences/s, 3302.20 tokens/s)
| Generate test with beam=4: BLEU4 = 28.08, 58.8/33.6/21.7/14.5 (BP=1.000, ratio=1.024, syslen=66041, reflen=64512)
Test Checkpoint69
| Translated 3003 sentences (86580 tokens) in 25.9s (116.08 sentences/s, 3346.72 tokens/s)
| Generate test with beam=4: BLEU4 = 28.13, 58.8/33.7/21.7/14.6 (BP=1.000, ratio=1.026, syslen=66178, reflen=64512)
Test Checkpoint70
| Translated 3003 sentences (86448 tokens) in 26.1s (115.01 sentences/s, 3310.94 tokens/s)
| Generate test with beam=4: BLEU4 = 27.88, 58.8/33.5/21.5/14.3 (BP=1.000, ratio=1.023, syslen=65998, reflen=64512)
Test Checkpoint71
| Translated 3003 sentences (86832 tokens) in 26.0s (115.69 sentences/s, 3345.26 tokens/s)
| Generate test with beam=4: BLEU4 = 27.91, 58.6/33.4/21.5/14.4 (BP=1.000, ratio=1.029, syslen=66355, reflen=64512)
Test Checkpoint72
| Translated 3003 sentences (86550 tokens) in 25.6s (117.18 sentences/s, 3377.25 tokens/s)
| Generate test with beam=4: BLEU4 = 27.95, 58.8/33.5/21.5/14.4 (BP=1.000, ratio=1.024, syslen=66092, reflen=64512)
Test Checkpoint73
| Translated 3003 sentences (86415 tokens) in 25.4s (118.17 sentences/s, 3400.41 tokens/s)
| Generate test with beam=4: BLEU4 = 27.84, 58.8/33.4/21.4/14.3 (BP=1.000, ratio=1.023, syslen=65990, reflen=64512)
Test Checkpoint74
| Translated 3003 sentences (86251 tokens) in 26.2s (114.65 sentences/s, 3292.82 tokens/s)
| Generate test with beam=4: BLEU4 = 27.97, 58.8/33.5/21.6/14.4 (BP=1.000, ratio=1.021, syslen=65889, reflen=64512)
Test Checkpoint75
| Translated 3003 sentences (86418 tokens) in 26.1s (115.03 sentences/s, 3310.16 tokens/s)
| Generate test with beam=4: BLEU4 = 27.72, 58.6/33.2/21.3/14.2 (BP=1.000, ratio=1.023, syslen=65971, reflen=64512)
Test Checkpoint76
| Translated 3003 sentences (86474 tokens) in 25.9s (116.04 sentences/s, 3341.50 tokens/s)
| Generate test with beam=4: BLEU4 = 27.63, 58.6/33.2/21.2/14.1 (BP=1.000, ratio=1.023, syslen=66025, reflen=64512)
Test Checkpoint77
| Translated 3003 sentences (86100 tokens) in 25.6s (117.20 sentences/s, 3360.35 tokens/s)
| Generate test with beam=4: BLEU4 = 28.11, 59.1/33.7/21.7/14.5 (BP=1.000, ratio=1.018, syslen=65695, reflen=64512)
Test Checkpoint78
| Translated 3003 sentences (86497 tokens) in 26.2s (114.53 sentences/s, 3298.82 tokens/s)
| Generate test with beam=4: BLEU4 = 27.80, 58.7/33.4/21.4/14.3 (BP=1.000, ratio=1.024, syslen=66073, reflen=64512)
Test Checkpoint79
| Translated 3003 sentences (86905 tokens) in 26.3s (114.22 sentences/s, 3305.35 tokens/s)
| Generate test with beam=4: BLEU4 = 27.69, 58.5/33.2/21.3/14.2 (BP=1.000, ratio=1.028, syslen=66327, reflen=64512)
Test Checkpoint80
| Translated 3003 sentences (86654 tokens) in 26.3s (114.36 sentences/s, 3300.06 tokens/s)
| Generate test with beam=4: BLEU4 = 27.65, 58.5/33.2/21.3/14.1 (BP=1.000, ratio=1.026, syslen=66219, reflen=64512)

So why am I not able to achieve the results reported in the README? Could you tell me the command line that you use to run the Transformer on 4 GPUs?

Another question: the "Attention is all you need" paper uses 0.1 as the initial learning rate, whereas 0.0006 is used here. Why is there such a large difference in learning rate?
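On the learning-rate question: with --lr-scheduler inverse_sqrt, the 0.0006 is a peak rate reached at the end of warmup, not a constant. A sketch of the schedule, assuming fairseq's standard formulation and using the flags from the command above:

```python
def inverse_sqrt_lr(step, peak_lr=6e-4, warmup_updates=8000,
                    warmup_init_lr=0.0):
    # Linear warmup from warmup_init_lr to peak_lr over warmup_updates,
    # then decay proportional to 1/sqrt(step).
    if step < warmup_updates:
        return warmup_init_lr + (peak_lr - warmup_init_lr) * step / warmup_updates
    return peak_lr * (warmup_updates ** 0.5) / (step ** 0.5)
```

The paper instead parameterizes its schedule as d_model^-0.5 * min(step^-0.5, step * warmup^-1.5), so its nominal constant is not directly comparable to fairseq's peak --lr.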
