pytorch-ignite / examples
Examples, tutorials, and how-to guides
License: BSD 3-Clause "New" or "Revised" License
Following up on this question: pytorch/ignite#2301, it would be great to have a dedicated how-to guide for checkpointing, then loading the checkpoint from disk and resuming training. The setup code can be taken from https://pytorch-ignite.ai/tutorials/beginner/01-getting-started/.
The idea is to show how to tackle a use-case like:
x, y1, y2 = batch
y_pred1, y_pred2, aux = model(x)
and where we would like to compute metrics between y_pred1 vs y1 and y_pred2 vs y2. The solution is to pass an output_transform to each metric. Context: discussed on Discord, "questions" channel, multi-output thread.
Hugo has a built-in ability to split a summary from the markdown files.
Add <!--more--> (should be exact) where you want the summary to end; the content before <!--more--> will be shown as the summary.
OR
Add a summary frontmatter if the summary derived from the markdown content doesn't look good.
Link to <!--more-->
This could allow the summary to be shown consistently on the website.
cc @Priyansi
NCCL now supports the gather operator. The line at the bottom should be updated.
Convert the notebook FastaiLRFinder_MNIST.ipynb into a how-to guide, making appropriate changes.
Finalize moving tutorials into level-based folders:
We have a cifar10-distributed.py file that IMO should be put into the intermediate folder, near the notebook.
What to do:
The idea is to make a new beginner-level NLP tutorial combining this transformers example and this Text CNN notebook.
Update:
The idea has now been changed to using this 🤗 tutorial: https://huggingface.co/transformers/training.html for all the base code and then introducing Ignite for training. Ignite concepts are still derived from the TextCNN notebook.
I slightly adapted the cifar10 example in this fork, basically removing python-fire and adding torch.distributed.launch, so that it can be executed as a standalone script with clearml-task.
I executed the script with nproc_per_node in [1, 2, 3, 4] on an AWS g4dn.12xlarge instance (4x T4 GPUs) and got the following results:
I am increasing the batch size by 16 each time I add a GPU, so that each GPU gets the same number of samples. I didn't change the default number of processes (8) for any of them, because I didn't observe that the GPUs were under-used (below 95%).
I was expecting to observe a quasi-linear time improvement, but that isn't the case. Am I missing something?
PS: Here are the requirements I used to execute the script
torch==1.7.1+cu110
torchvision==0.8.2
pytorch-ignite==0.4.8
clearml==1.1.6
tensorboardX==2.4.1
A tutorial showing how custom metrics can be made using the Metric class would be useful.
The example can create a simple metric like the Levenshtein distance between two strings.
Reference - https://pytorch.org/ignite/metrics.html#how-to-create-a-custom-metric
Add weight to the frontmatter of how-to guide and tutorial pages. It would allow Hugo to sort the order of appearance.
See: https://gohugo.io/templates/lists#by-weight
TL;DR: lower numbers get higher precedence, so we should start with 1. I also suggest renaming the files to start with a number, like 01-file-name or 1-file-name, so that they show in order when viewing and are easier to scan.
It would also allow us to see the files in order in the sidebar on the website.
For example:
01-installation
02-data-iterator
03-gradient-accumulation
04-fastai-lr-finder
05-time-profiling
...
Currently, the download links for Python files are not working:
https://pytorch-ignite.ai/tutorials/getting-started/
This is because there are no Python files in this repo, which is used as a submodule in the website repo.
It would be great if the Python files were also in this repo, to ease downloading them.
Use Ray Tune with Ignite for hyperparameter optimization. Also, compare Tune with Optuna and Ax.
Ideas on how to do cross-validation: pytorch/ignite#1384
Following up on the discussion with @vfdev-5, this new how-to guide should be abstract and:
The purpose is to show explicitly how to convert pure PyTorch code to Ignite and explain what we gain by that.
Topics to cover:
It would be nice to have a how-to guide combining all the built-in loggers that Ignite provides: ClearML, TensorBoard, MLflow, etc. See more here: https://pytorch.org/ignite/contrib/handlers.html#loggers.
The code on how to use these is already provided in https://github.com/pytorch/ignite/tree/master/examples/contrib/mnist.
When creating this new how-to guide, please:
Everyone is welcome to contribute! Please feel free to ask any further questions below.
Since PyTorch-Ignite v0.4.7, save_handler in Checkpoint accepts the path to the checkpoint directory directly, rather than requiring DiskSaver(checkpoint_dir, create_dir=True). It would be nice if this could be updated for all instances in this repository too. One such instance is under Checkpointing in How to convert PyTorch Code to Ignite.
To make converting notebooks to markdown files easier, add a few frontmatter fields to the notebook as HTML comments.
Current frontmatter to add in a cell (this should be at the very top of the notebook):
<!-- ---
title: Example title
description: Example description
date: 2021-07-27
downloads: true
include_footer: true
sidebar: true
tags:
- deep learning
- machine learning
- pytorch
- python
--- -->
A new advanced standalone tutorial on idist that doesn't rely on comparison as in the blog post Distributed Made Easy with Ignite, but focuses more on idist methods like all_reduce, all_gather and broadcast.
The idea is to move the scripts from https://github.com/pytorch/ignite/tree/master/examples/reinforcement_learning here in the form of a tutorial. To create this new tutorial, please:
Feel free to ask any questions here. Everyone is welcome to contribute!
Following up on this question: pytorch/ignite#2441. As mentioned in the comments, it would be great to have an example of using LRScheduler as a how-to guide. The code example is already available in the comments.
The idea is to restructure this tutorial in the following way:
We can provide 2 parts:
Unfortunately, after upgrading Ignite I don't know how to save my model's checkpoints, because save_interval is deprecated. My code is below, but it doesn't work:
checkpointer = ModelCheckpoint(output_dir, cfg.MODEL.NAME, n_saved=10, require_empty=False)
trainer.add_event_handler(Events.EPOCH_COMPLETED(every=1), checkpointer, to_save={'model': model, 'optimizer': optimizer})
Please give an example, thanks.
Shift the files from pytorch-ignite/pytorch-ignite.github.io/content/docs/how-to-guides to pytorch-ignite/examples/how-to-guides and store them as .ipynb instead of .md
Shift the notebook HandlersTimeProfiler_MNIST.ipynb to a how-to guide in this repository. Add content from Time Profiling during training.
In the how-to-guides, dependencies are sometimes assumed and sometimes installed with !pip install. I suggest making this uniform by providing a conda environment and/or requirements.txt file(s), so that a full environment can be created.