pytorch-ignite / code-generator

Web Application to generate your training scripts with PyTorch Ignite

Home Page: https://code-generator.pytorch-ignite.ai/

License: BSD 3-Clause "New" or "Revised" License

Python 35.33% Shell 2.04% HTML 1.26% JavaScript 21.38% Vue 39.45% CSS 0.19% Dockerfile 0.36%
pytorch pytorch-ignite deep-learning neural-networks webapp vuejs hacktoberfest code-generator

code-generator's People

Contributors

afzal442, avinashsharma080, dependabot[bot], guptaaryan16, puhuk, renovate[bot], rwiteshbera, sayantan1410, theory-in-progress, trsvchn, vfdev-5, ydcjeff


code-generator's Issues

Remove useless code in image classification template

@trainer.on(TrainEvents.BACKWARD_COMPLETED(lambda _, ev: (ev % 100 == 0) or (ev == 1)))
def _():
    # do something interesting
    pass

# ----------------------------------------
# here we will use `every` to trigger
# every 100 iterations
# ----------------------------------------
@trainer.on(TrainEvents.OPTIM_STEP_COMPLETED(every=100))
def _():
    # do something interesting
    pass

Script to help contribute new templates

Clear and concise description of the problem

The idea is to provide a script that makes it easy for us to contribute templates to the code generator app. This would be a Python script that takes a main.py file as input and makes the necessary changes in the code generator app for the required tasks.

Suggested solution

I propose something like this

main.py (to be submitted as template)

### DataLoaders
train_dataloader = ...
test_dataloader = ...

### Model
class MyModel(nn.Module):
    def __init__(self):
        pass

### Training Step
def step(engine, batch):
    ...

### Evaluation function
def evaluate():
    ...

Now, using the section comments (like the data loader marker), a script can split the code and insert each piece at the right place in the template, then use it for the template creation as well. This approach seems less tedious than making all the changes individually and hoping everything works; a parsing sketch follows below.
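
A minimal parsing sketch (hedged; the "### Section" markers and the split_sections helper below are assumptions, not a final spec):

import re

# Split a submitted main.py into named sections using "### <Name>" comment
# markers, so each piece can be injected at the right place in a template.
SECTION_RE = re.compile(r"^###\s*(.+)$")

def split_sections(source: str) -> dict:
    sections, current = {}, None
    for line in source.splitlines():
        match = SECTION_RE.match(line.strip())
        if match:
            current = match.group(1).strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}

# Usage: split_sections(open("main.py").read()) gives a dict like
# {"DataLoaders": "...", "Model": "...", "Training Step": "...", ...}
# that a rendering step can then place into the template files.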

Alternative

We could also add a template-contributing.md guide to help people who want to contribute new templates to the existing app.

cc @vfdev-5 @theory-in-progress

Add support for TPU devices

Clear and concise description of the problem

It would be good to provide an option to select TPU as the accelerator instead of GPU.
We can also auto-select the TPU accelerator when opened in Colab, and add torch_xla installation steps.

What to do:

  0. Try a template with TPUs. Choose the distributed training option with 8 processes and the spawning option. "Open in Colab" one template, for example the vision classification template, install torch_xla manually (see https://colab.research.google.com/drive/1E9zJrptnLJ_PKhmaP5Vhb6DTVRvyrKHx) and run the code with the xla-tpu backend: python main.py --nproc_per_node 8 --backend xla-tpu. If everything is done correctly, training should run. A launch sketch for this step follows the list.
  1. Update the UI:
     • Add a drop-down menu for backend selection ("nccl" and "xla-tpu") in "Training Options".
     • When the user selects "xla-tpu", training should only be distributed, with 8 processes and "Run the training with torch.multiprocessing.spawn".
  2. Update content: README.md and other impacted files.
  3. If exported to Colab, make sure the accelerator is "TPU".
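
A minimal launch sketch for step 0 (hedged; it assumes torch_xla is already installed on a TPU runtime and reuses the idist.Parallel API the templates already rely on):

import ignite.distributed as idist

def run(local_rank, config):
    device = idist.device()  # resolves to an XLA device under the xla-tpu backend
    # ... training code from the template goes here ...

config = {}  # placeholder for the template's parsed config

# Spawn 8 processes, one per TPU core, with the xla-tpu backend.
with idist.Parallel(backend="xla-tpu", nproc_per_node=8) as parallel:
    parallel.run(run, config=config)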

Suggested solution

Alternative

Additional context

Extend the mapping capabilities of metadata.json to have very specific features for each template

Clear and concise description of the problem

Right now, the metadata.json works like this

{
    "training": {
        ...options
    }
}

And we just import all these options and assume that all templates are going to have them.

This causes two specific issues:

  • We cannot choose specific metadata and training options (or sub-options) for each template.
  • We can't configure special options for some templates, e.g. options specific to the text classification template for question answering, text summarisation, etc.

Suggested solution

We propose two approaches to solve this problem:

  • We can add a templates sub option for each training option which contains all templates for that specific training option and check it during rendering of the templates. This works as follows:
[Changes in metadata.json]
"deterministic": {
     "name": "deterministic",
     "type": "checkbox",
     "description": "Should the training be deterministic?",
 +   "templates": ['vision-classification', 'text-classification', ..]
   }

[Changes in TabTraining.vue(and other Vue components)]
<div v-if="deterministic.templates.includes(store.config.template)">

Then we can selectively choose which options are available for each template. This approach makes it easier to track each template and option. The only drawback is that it seems hard to specify an option like a specific argparser or backend this way.

  • We can configure a data_options.json file in the templates. This file will contain all options related to each template and may contain sub-option checks as well. This approach makes more sense when contributing new templates, as you add all the options you want to configure directly.
    This will require the following changes
[New file templateOptions.json]
{
    "templates": {
        "template-vision-classification": {
            "training": [
                "argparser",
                "deterministic",
                "torchrun",
                "spawn",
                "nproc_per_node",
                "nnodes",
                "master_addr",
                "master_port",
                "backend"
            ]
        }
    }
}

[Changes in TabTraining.vue(and other Vue components)]
<FormCheckbox
     :label="deterministic.description"
     :saveKey="deterministic.name"
     v-show="trainingOptions.includes('deterministic')"
   />

const trainingOptions = ref(templates[store.config.template]["training"])

This approach seems a bit complex but can provide more control over the options and templates. It can also help introduce template-specific features, such as specific evaluation functions and other metadata options.

Additional context

This issue was discussed in our weekly meeting this week.

cc @vfdev-5 @theory-in-progress

Model, View and Controller

Thanks to @ydcjeff we have a working base app. I think it's the right point to split the app code into "classic" components for the GUI app.

First of all, this will allow us to start writing tests, for example to verify that we generate appropriate Python files and can run the generated code.

The Model will be responsible for generating code from templates, preparing the .py files, and archiving. It is going to be Streamlit-agnostic, so we can test it using pytest; a test sketch follows below.

The Controller will glue together the Model and the Streamlit View.
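
A hedged sketch of such a test; render is a hypothetical Model-layer helper that returns the generated source of one template file for a given user configuration:

import ast

def test_generated_main_is_valid_python():
    # `render` is hypothetical: a Model-layer entry point returning a string
    source = render("main.py", {"use_amp": True})
    ast.parse(source)  # raises SyntaxError if the generated file is broken

Running the generated code in CI could build on the same helper by writing the rendered files to a temporary directory and invoking them with a tiny config.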

Ideas for Code Generator

Clear and concise description of the problem

A list of a few new features that could make Code Generator super awesome:

  1. Side-by-side comparison of PyTorch vs Ignite code! If code-generator works for pure PyTorch too, researchers will love that.
  2. Ability to choose models, e.g. for detection people can choose RetinaNet, SSD, etc.
  3. HuggingFace integration! How nice it would be to transparently fine-tune HF models (maybe choose a task and then a model); an awesome starter for many NLP applications.
  4. A direct torch.hub connection. Maybe all the models listed in torch.hub could be easily trained?

Suggested solution

We can probably go one feature at a time and iterate over a couple of releases.

Maybe we should discuss this further over GitHub / Discord.

Make the app look good on mobile

Clear and concise description of the problem

We are currently able to visit the site even on mobile, but it doesn't look good.

Suggested solution

Make the app look good, with the left sidebar and the generated code in the middle.

Alternative

N/A

Additional context

Correction for running template-vision-segmentation in Colab

This is about a trivial error.

We can run the template we want through the "Open in Colab" button in the PyTorch-Ignite Code Generator. Pressing the button links to the following code for running template-vision-segmentation directly in Colab.

!wget https://raw.githubusercontent.com/pytorch-ignite/nbs/main/nbs/0a809e9f-82c6-42cc-a7de-378f7f87cc7b/ignite-template-vision-segmentation.zip  
!unzip ignite-template-vision-segmentation.zip  
!pip install -r requirements.txt  
!python main.py config.yaml  

However, when this is executed in Colab, the following problem arises because it runs without the data required for training.

Traceback (most recent call last):
  File "/content/data.py", line 80, in setup_data
    dataset_train = VOCSegmentationPIL(
  File "/content/data.py", line 56, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torchvision/datasets/voc.py", line 101, in __init__
    raise RuntimeError("Dataset not found or corrupted. You can use download=True to download it")
RuntimeError: Dataset not found or corrupted. You can use download=True to download it

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/content/main.py", line 128, in <module>
    main()
  File "/content/main.py", line 124, in main
    p.run(run, config=config)
  File "/usr/local/lib/python3.10/dist-packages/ignite/distributed/launcher.py", line 316, in run
    func(local_rank, *args, **kwargs)
  File "/content/main.py", line 36, in run
    dataloader_train, dataloader_eval = setup_data(config)
  File "/content/data.py", line 87, in setup_data
    raise RuntimeError(
RuntimeError: Dataset not found. You can use `download_datasets` from data.py function to download it.

So, I suggest changing download=False to download=True in code-generator/src/templates/template-vision-segmentation/data.py so it runs immediately.
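
A hedged sketch of the suggested change (names follow the traceback above; the exact constructor arguments used in the template may differ):

# In template-vision-segmentation/data.py
dataset_train = VOCSegmentationPIL(
    root=config.data_path,
    year="2012",        # assumption: the template targets VOC2012
    image_set="train",
    download=True,      # was download=False; fetch the dataset when missing
)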

Some more improvements

App explanation

Let's either create a tutorial guide showing how to use the app, or simply a button with a message explaining how to use the app, where to start, etc.

Distributed:

  • Done

Let's simplify this code if no distributed option is selected:

    with idist.Parallel(
        backend=config.backend,
        nproc_per_node=config.nproc_per_node,
        nnodes=config.nnodes,
        node_rank=config.node_rank,
        master_addr=config.master_addr,
        master_port=config.master_port,
    ) as parallel:
        parallel.run(run, config=config)

to

# (no dist)
    with idist.Parallel(
        backend=config.backend,
    ) as parallel:
        parallel.run(run, config=config)

and

# single node
    with idist.Parallel(
        backend=config.backend,
        nproc_per_node=config.nproc_per_node,
    ) as parallel:
        parallel.run(run, config=config)

Readme

  • Done

We should be very careful with distributed button and this suggestion

python -m torch.distributed.launch \
  --nproc_per_node=2 \
  --use_env main.py \
  --backend="nccl"

as the dist button will add the code to spawn processes inside the main process, while dist launch will spawn more processes.

Let's do the following:

  • add another checkbox with the option: use dist launch or spawn process
    • if user picks "dist launch" -> README.md says to use: python -m torch.distributed.launch --nproc_per_node=2 ... and in the code we define config.nproc_per_node=None. Same for multi-node: config.master_addr=None etc and python -m torch.distributed.launch --nproc_per_node=2 --master_addr=master --master_port=1234 --nnodes=2 --node_rank=0 ...
    • if user picks "spawn" -> README.md says : python main.py ... and in the code we define config.nproc_per_node=2.

We can also imagine folks doing other things like here: https://github.com/sdesrozis/why-ignite

DataLoader

  • Done

If the user picks the "spawn" option, we have to update the code like this:

    train_dataloader = idist.auto_dataloader(
        train_dataset,
        batch_size=config.train_batch_size,
        num_workers=config.num_workers,
        shuffle=True,
        persistent_workers=True
    )
    eval_dataloader = idist.auto_dataloader(
        eval_dataset,
        batch_size=config.eval_batch_size,
        num_workers=config.num_workers,
        shuffle=False,
        persistent_workers=True
    )

"Save the best model by eval score" and "Early stop ..."

  • Done

It would be better to avoid such messages:

Please make sure to pass an argument to the metric_name parameter of get_handlers in main.py. Otherwise it can result in a KeyError.

Let's control what we are doing and configure everything such that we do not need to warn the user like that.

(Later) AMP mode as option ?

  • Done

It would be nice to add an AMP option, for image classification at least.

(Later) Optimizer type

  • Done

Users would like to choose the optimizer type: Adam, RMSprop, etc.

fix lr scheduler warning in segmentation template

Describe the bug

The following warning is shown in the CI (step-504):

/opt/hostedtoolcache/Python/3.6.13/x64/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:134:

UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`.

In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.

Failure to do this will result in PyTorch skipping the first value of the learning rate schedule.

See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
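
A minimal runnable sketch of the ordering PyTorch expects (hedged; the template may attach the scheduler via an Ignite handler rather than calling it inline):

import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

for _ in range(3):  # stand-in for the training loop
    loss = model(torch.randn(8, 4)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()     # must come first since PyTorch 1.1
    lr_scheduler.step()  # then the scheduler, so no LR value is skipped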

Reproduction

https://github.com/pytorch-ignite/code-generator/tree/main/src/templates/template-vision-segmentation

Steps to reproduce

python main.py \
  --data_path <path_to_dataset> \
  --train_batch_size 4 \
  --eval_batch_size 4 \
  --num_workers 2 \
  --max_epochs 2 \
  --train_epoch_length 4 \
  --eval_epoch_length 4

Expected result

No warning is shown.

Environment info

Output of python -m torch.utils.collect_env:

OS: Linux
torch: 1.9.0
torchvision: 0.10.0
ignite: 0.4.5

If you'd like to tackle this issue, please comment that you want to work on it and see the contributing guide.

Few improvements

Currently tested on master: 78b0def

  • requirements-dev.txt is missing the entry -r requirements.txt to install streamlit with the appropriate version:
-r requirements.txt
# dev
pytorch-ignite
torch
torchvision
jinja2
requests

# test
pytest
hypothesis

  • Change text: "Those in the parenthesis are used in the generated code." -> "Names in the parenthesis are variable names in the generated code." or something similar.

  • Let's explicitly create the trainer in CIFAR10 example to show how to write training_step

  • Let's add AMP option

  • Let's add an Error metric (to show how we can do metric arithmetic):

    accuracy_metric = Accuracy(device=device)
    metrics = {
        'eval_accuracy': accuracy_metric,
        'eval_loss': Loss(loss_fn=loss_fn, device=device),
        'eval_error': (1.0 - accuracy_metric) * 100
    }
  • Let's change the output of initialize and also set up a LR scheduler:
- device, model, optimizer, loss_fn = initialize(config)
+ device = idist.device()
+ model, optimizer, loss_fn, lr_scheduler = initialize(config)
  • The distributed option, if used as a multiprocessing scheme (python main.py -> multiple child processes), has/had a certain issue with dataloaders: the first iteration of each epoch is very slow. To avoid that, let's prefer to tell the user to launch things with torch.distributed.launch.

  • I think this code is useless to add to main.py if exp_logger is None:

    # --------------------------------
    # setup common experiment loggers
    # --------------------------------
    exp_logger = setup_exp_logging(
        config=config,
        eval_engine=eval_engine,
        train_engine=train_engine,
        optimizer=optimizer,
        name=name
    )
  • I'm a bit confused about the eval_max_epochs option and its value of 2. It is something I've never seen before. I think we should follow standard practice and, by default, run once over the validation dataloader. Thoughts?

  • If possible, make the sidebar resizable from a minimum possible value to a maximum value.

[Idea] Option to add dataflow, model tests

Here is an idea to think about for later versions, like v0.3 etc

Context

Let's imagine I use the generator to quick-start my specific problem: my own dataflow, model, etc.
I generate the code and start wiring things up between the training code and my custom pieces. Without running the training, it is almost impossible to ensure correctness; however, I could imagine some basic additional tests with a verbose option to check that my own dataloaders and model provide the expected info.

Feature

Let's say generated files are:

- main.py
- model.py
- dataflow.py
- utils.py

The idea is to provide additional folder:

tests
 - test_dataflow.py
 - test_model.py

where we can provide a skeleton code for

  • loop over a few dataloader batches and either show images (like here) or assert dimensions
  • assert the output shape/type of the model; a skeleton sketch follows below
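
A hedged skeleton for tests/test_model.py (the model name, input shape, and class count are illustrative assumptions):

import torch

def test_model_output_shape():
    from model import MyModel  # hypothetical name of the generated model

    num_classes = 10  # assumption for the example
    model = MyModel()
    x = torch.randn(2, 3, 224, 224)  # fake batch of two RGB images
    y = model(x)
    assert y.shape == (2, num_classes)
    assert y.dtype == torch.float32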

Anyway, this is something to discuss and brainstorm...

Add current time into logging message

Right now the logging message looks like the following (training-info.log):

...
[ignite]: train [1/90]: {'epoch': 1, 'train_loss': 2.521751880645752}
[ignite]: train [1/100]: {'epoch': 1, 'train_loss': 2.4120213985443115}
...

Let's add the current datetime instead of "ignite", as follows:

...
[20230819-12:34:56]: train [1/90]: {'epoch': 1, 'train_loss': 2.521751880645752}
[20230819-12:35:12]: train [1/100]: {'epoch': 1, 'train_loss': 2.4120213985443115}
...
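
A minimal stdlib sketch of the change (hedged; the templates may configure this through ignite.utils.setup_logger instead):

import logging

# Replace the fixed "[ignite]" tag with the current datetime via %(asctime)s.
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("[%(asctime)s]: %(message)s", datefmt="%Y%m%d-%H:%M:%S")
)
logger = logging.getLogger("ignite")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("train [1/90]: %s", {"epoch": 1, "train_loss": 2.52})
# -> [20230819-12:34:56]: train [1/90]: {'epoch': 1, 'train_loss': 2.52}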

Add comments for each parameter in config.yaml

Right now we can see the following in the config.yaml:

seed: 777
data_path: ./
train_batch_size: 32
eval_batch_size: 32
num_workers: 4
max_epochs: 20
use_amp: false
debug: false
filename_prefix: training
n_saved: 2
save_every_iters: 1000
patience: 3
output_dir: ./logs
log_every_iters: 10
lr: 0.0001
model: resnet18

and it is unclear what each parameter is responsible for.

The idea is to add comments for each parameter like below

seed: 777  # random seed
data_path: ./  # input data path 
train_batch_size: 32
eval_batch_size: 32
num_workers: 4
max_epochs: 20
use_amp: false
debug: false
filename_prefix: training  # training checkpoint filename prefix
n_saved: 2  # number of saved checkpoints
save_every_iters: 1000  # training checkpoint frequency
patience: 3  # early stopping patience parameter
output_dir: ./logs  # output folder
log_every_iters: 10
lr: 0.0001
model: resnet18

Unify output paths

There are currently two output paths: config.output_path for saving checkpoints and config.filepath for Python logging and experiment tracking systems.

The goal is to combine them into one and have an easily traceable folder name for experiments; a sketch follows below.
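
A hedged sketch of one way to do it (the config attribute names are taken from the paragraph above; SimpleNamespace stands in for the template's parsed config):

from datetime import datetime
from pathlib import Path
from types import SimpleNamespace

config = SimpleNamespace()  # stand-in for the template's parsed config

# One timestamped folder per run, shared by checkpoints and logging output.
run_dir = Path("./logs") / datetime.now().strftime("%Y%m%d-%H%M%S")
run_dir.mkdir(parents=True, exist_ok=True)

config.output_path = run_dir                     # checkpoints
config.filepath = run_dir / "training-info.log"  # python logging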

Add 404 page

Clear and concise description of the problem

If you go to an unexpected URL, e.g. https://code-generator.pytorch-ignite.ai/test, you see a blank page.
Let's show a 404 page?

Suggested solution

Alternative

Additional context

Show `txt` for requirements.txt in `CodeBlock`

Describe the bug

Currently, there is no file extension beside the Copy button for requirements.txt.

(screenshot)

Expected result

Show txt beside the Copy button.

Steps to reproduce

  1. Go to https://code-generator.netlify.app/create
  2. Choose a template
  3. See the requirements.txt in the right pane

Solution

Currently, we are using the markup language from Prism.js for txt files, which is not correct.

So change language-markup to language-txt or language-text, which don't exist among Prism.js languages; this would still allow us to add the file extension like here.

The other part is that we are highlighting txt as markup. Since text files normally don't need syntax highlighting, we could just pass an empty object as the language grammar (i.e. languages['markup'] -> {}).


If you'd like to tackle this issue, please comment that you want to work on it and see the contributing guide.

Boolean config items always give a 'False' value.

Describe the bug

Boolean config items always give a 'False' value, even when set to 'True' or 'true' in config.yaml.

Please see the setup_parser() function in utils.py, line 29. I think it's missing a default value, like on line 31; or maybe we should just skip checking the boolean type and treat it like the others?

def setup_parser():
    with open("config.yaml", "r") as f:
        config = yaml.safe_load(f.read())
    parser = ArgumentParser()
    parser.add_argument("--backend", default=None, type=str)
    for k, v in config.items():
        if isinstance(v, bool):
            parser.add_argument(f"--{k}", action="store_true")
        else:
            parser.add_argument(f"--{k}", default=v, type=type(v))
    return parser
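
A hedged sketch of one possible fix: seed the flag's default with the YAML value, so `true` in config.yaml survives when the CLI flag is absent:

from argparse import ArgumentParser

import yaml

def setup_parser():
    with open("config.yaml", "r") as f:
        config = yaml.safe_load(f.read())
    parser = ArgumentParser()
    parser.add_argument("--backend", default=None, type=str)
    for k, v in config.items():
        if isinstance(v, bool):
            # default comes from config.yaml instead of the implicit False
            parser.add_argument(f"--{k}", action="store_true", default=v)
        else:
            parser.add_argument(f"--{k}", default=v, type=type(v))
    return parser

Note that a store_true flag can still only switch a value on from the CLI; switching it off would need something like argparse's BooleanOptionalAction (Python 3.9+) or the "treat it like the others" approach suggested above.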

Expected result

Boolean config items give the value as set.

Steps to reproduce

Set any boolean config item in config.yaml to 'true'.

Update Node dependencies for CI warning

Describe the bug

Browserslist: caniuse-lite is outdated. Please run:
  npx browserslist@latest --update-db
  Why you should do it regularly: https://github.com/browserslist/browserslist#browsers-data-updating

This warning appears in the pnpm run test command and needs to be resolved. For more information, see https://github.com/pytorch-ignite/code-generator/actions/runs/5938810397/job/16103995089#step:10:36

Expected result

No warnings in CI and the CI works as expected.

Link left pane events to updates in the code (if possible)

The idea is to link (if possible) the left pane events to the updated code on the right, to increase the interactivity of the app.

For example: to inform the user which files/parts of the files were changed (like in a regular IDE, with stars or highlighting of the updated code lines).

UI and templates improvements v0.2.0

Vision template

  • Configurable parts are not in place and can lead to bad rendering:

(screenshots of the misrendered configurable parts)

  • Unclear configuration name "every save_every_iters" and no related widget:

(screenshot)

  • Put dataloader_train.sampler.set_epoch(trainer.state.epoch - 1) into trainer = setup_trainer(config, model, optimizer, loss_fn, device, dataloader_train); a sketch follows below.
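
A hedged sketch of what that could look like inside setup_trainer (trainer and dataloader_train are the arguments named in the item above):

from ignite.engine import Events

# Inside setup_trainer: re-shuffle the distributed sampler at the start of
# every epoch instead of leaving this call at module level.
@trainer.on(Events.EPOCH_STARTED)
def _set_epoch():
    dataloader_train.sampler.set_epoch(trainer.state.epoch - 1)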

DCGAN template

- [ ] We can remove Net model in model.py

Segmentation template

  • Put the deeplabv3 model into model.py and remove the Net and GAN model defs.
  • Unclear code
    # run evaluation at every training epoch end
    # with shortcut `on` decorator API and
    # print metrics to the stderr
    # again with `add_event_handler` API
    # for evaluation stats
    @trainer.on(Events.EPOCH_COMPLETED(every=1))
    def _():
        # show timer
        
        if timer is not None:
            logger.info("Time per batch: %.4f seconds", timer.value())
            timer.reset()

What is the purpose of the timer and the reset here?

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Edited/Blocked

These updates have been manually edited so Renovate will no longer make changes. To discard all commits and start over, click on a checkbox.

  • fix(deps): update all non-major dependencies (@iconify/iconify, @octokit/core, @types/ejs, @types/file-saver, @types/jest, @types/prismjs, @vitejs/plugin-vue, @vue/compiler-sfc, albumentations, continuumio/miniconda3, ejs, jest, playwright-chromium, prettier, prismjs, pytorch-ignite, semver, start-server-and-test, torch, torchvision, uuid, vue, vue-router)

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Ignored or Blocked

These are blocked by an existing closed PR and will not be recreated unless you click a checkbox below.

Detected dependencies

dockerfile
docker/Dockerfile
  • continuumio/miniconda3 24.1.2-0
github-actions
.github/workflows/ci.yml
  • actions/checkout v4
  • actions/setup-python v5
  • actions/setup-node v4
  • actions/cache v4
  • actions/cache v4
  • actions/checkout v4
  • actions/setup-python v5
  • actions/setup-node v4
  • actions/cache v4
npm
package.json
  • @iconify/iconify ^3.1.0
  • @octokit/core ^5.0.0
  • @types/ejs ^3.1.0
  • @types/file-saver ^2.0.5
  • @types/jest ^27.4.0
  • @types/prismjs ^1.26.0
  • @vitejs/plugin-vue ^2.1.0
  • @vue/compiler-sfc ^3.2.30
  • ejs ^3.1.6
  • execa ^8.0.1
  • file-saver ^2.0.5
  • jest ^27.5.0
  • jszip ^3.10.1
  • playwright-chromium ^1.33.0
  • prettier ^2.5.1
  • prismjs ^1.26.0
  • prompts ^2.4.2
  • semver ^7.3.5
  • start-server-and-test ^2.0.0
  • uuid ^9.0.0
  • vite ^2.7.13
  • vue ^3.2.30
  • vue-router ^4.0.12
pip_requirements
scripts/requirements.txt
  • torch >=1.10.2
  • torchvision >=0.11.3
  • pytorch-ignite >=0.4.8
src/templates/template-common/requirements.txt
  • torch >=1.10.2
  • torchvision >=0.11.3
  • pytorch-ignite >=0.4.8
src/templates/template-text-classification/requirements.txt
src/templates/template-vision-segmentation/requirements.txt
  • albumentations >=1.3.0

  • Check this box to trigger a request for Renovate to run again on this repository

Structure as python scripts instead of modules

Currently the templates are structured as Python modules, so to edit and run them, we need to install in editable mode.

The goal is to provide simple Python scripts instead of Python modules and verify that everything still works when structured as plain scripts.

Error using Visdom as exp. tracking system for image-segmentation

Describe the bug

AttributeError: 'VisdomLogger' object has no attribute 'writer'

Expected result

No error.

Reproduction

Please see line 137 of src/templates/template-vision-segmentation/vis.py:
logger.writer.add_image is for TensorBoard, no?

logger.writer.add_image
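
A hedged sketch of a guard; the assumption (worth verifying against the Ignite source) is that VisdomLogger exposes the underlying visdom.Visdom client as logger.vis:

from ignite.contrib.handlers import TensorboardLogger

# Only TensorboardLogger has a `.writer`; for Visdom, go through the raw
# client instead. `tag`, `img_tensor` and `global_step` are the values
# vis.py already computes.
if isinstance(logger, TensorboardLogger):
    logger.writer.add_image(tag, img_tensor, global_step)
else:
    logger.vis.image(img_tensor, opts={"caption": tag})  # assumption: `.vis`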

Steps to reproduce

  1. Choose the vision-segmentation template.
  2. Choose Visdom as the exp. tracking system.

Generate pyfiles on demand

Now we generate Python files on each interaction with the sidebar. The idea is to keep rendering the code on screen, but only write the generated strings into files when the user presses the "Download" button.

Intermediate Representation

We are currently thinking about a visual representation of the building blocks composing an application: PyTorch objects, handlers, metrics, engines, etc. The idea is to provide a graphical helper to organise events and dataflow. To my knowledge, it is an original approach that could be complementary to our code generator (from templates).

To do this, the visual representation (from a graphical tool, in the spirit of PyFlow https://github.com/wonderworks-software/PyFlow) should be described in a specific representation used by a code generator. This representation could be similar to the intermediate representation (IR) used in compilation (e.g. llvm, gcc). It helps optimisation and code generation.

I wonder about merging our efforts to have a unique representation. I mean:

  • Templates -> IR -> code
  • Graphic Tool -> IR -> code

What do you think about that?

Some insights

Remove local_rank and `idist.barrier()` from data.py if no distributed configuration selected

Clear and concise description of the problem

We got feedback that this code

    local_rank = idist.get_local_rank()

    ...

    if local_rank > 0:
        # Non-zero ranks wait here so that only rank 0 downloads the dataset
        idist.barrier()

    ...

    )
    if local_rank == 0:
        # Rank 0 has finished downloading; release the waiting ranks
        idist.barrier()

looks a bit strange if no distributed configuration is selected.

Let's put template conditions here as well; a runtime guard sketch follows below.
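
A runtime alternative (hedged) while the template conditions are being added: make the barriers no-ops outside distributed runs.

import ignite.distributed as idist

def download_datasets_rank0_first(download):
    # Barriers are only meaningful with more than one process, so skip the
    # rendezvous entirely in single-process runs.
    if idist.get_world_size() > 1 and idist.get_local_rank() > 0:
        idist.barrier()  # non-zero ranks wait for rank 0

    download()  # rank 0 (or the single process) downloads the dataset

    if idist.get_world_size() > 1 and idist.get_local_rank() == 0:
        idist.barrier()  # release the waiting ranks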

Fix LR scheduler issue on CI

Describe the bug

Currently the CI is failing on the segmentation example:

[ignite]: Configuration: 
{'accumulation_steps': 4,
 'backend': None,
 'data_path': '/home/runner/data',
 'debug': False,
 'eval_batch_size': 2,
 'eval_epoch_length': 4,
 'filename_prefix': 'training',
 'limit_sec': 60,
 'log_every_iters': 2,
 'lr': 0.007,
 'max_epochs': 2,
 'n_saved': 2,
 'num_classes': 21,
 'num_workers': 2,
 'output_dir': PosixPath('logs/20230320-090125-backend-None-lr-0.007'),
 'patience': 2,
 'save_every_iters': 2,
 'seed': 666,
 'train_batch_size': 2,
 'train_epoch_length': 4,
 'use_amp': False}
Traceback (most recent call last):
  File "/home/runner/work/code-generator/code-generator/dist-tests/vision-segmentation-all/main.py", line 181, in <module>
    main()
  File "/home/runner/work/code-generator/code-generator/dist-tests/vision-segmentation-all/main.py", line 177, in main
    p.run(run, config=config)
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/ignite/distributed/launcher.py", line 316, in run
    func(local_rank, *args, **kwargs)
  File "/home/runner/work/code-generator/code-generator/dist-tests/vision-segmentation-all/main.py", line 83, in run
    trainer.add_event_handler(Events.ITERATION_STARTED, lr_scheduler)
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/ignite/engine/engine.py", line 319, in add_event_handler
    _check_signature(handler, "handler", self, *(event_args + args), **kwargs)
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/ignite/engine/utils.py", line 10, in _check_signature
    signature = inspect.signature(fn)
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/inspect.py", line 3113, in signature
    return Signature.from_callable(obj, follow_wrapped=follow_wrapped)
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/inspect.py", line 2862, in from_callable
    return _signature_from_callable(obj, sigcls=cls,
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/inspect.py", line 2261, in _signature_from_callable
    raise TypeError('{!r} is not a callable object'.format(obj))
TypeError: <torch.optim.lr_scheduler.LambdaLR object at 0x7fdc0baab0a0> is not a callable object
Error: Process completed with exit code 1.
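
A hedged sketch of a possible fix: PyTorch LR schedulers are not callable, so wrap the scheduler in Ignite's LRScheduler handler (or call lr_scheduler.step() from a plain function) before attaching it:

from ignite.engine import Events
from ignite.handlers import LRScheduler

# `trainer` and `lr_scheduler` are the objects from main.py in the traceback.
trainer.add_event_handler(Events.ITERATION_STARTED, LRScheduler(lr_scheduler))

# Alternative without the wrapper:
# trainer.add_event_handler(
#     Events.ITERATION_STARTED, lambda: lr_scheduler.step()
# )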

v0.2.0 Roadmap

We now have two versions of the app:

The new version is made with Vue 3, while the old one is made with Streamlit.

Features lacking in the new app, present in the old app

  • Introductory page - PR: #124
  • Multiple templates - PR: #118 #119 #129 #131
  • TerminateOnNaN, TimeLimit, EarlyStopping, and Timer handlers - PR: #116
  • Unit Testing Options - PR: #133
  • Code copy button - PR: #106
  • README markdown is not rendered

Features lacking in the old app, present in the new app

  • Resizable panes
  • Deterministic Training
  • Brand colors and styles
  • Line numbers, file extensions appearance
  • Customization (ofc with maintenance)
  • Blob storage download link (shorter than base64 link)

New features to be added in new version

  • Download successful message - PR: #109
  • Make the app look good on mobile - Issue: #104, PR: #105 #108
  • Add respective file icons beside file names - PR: #110

Small fixes:

UI

  • scrolling bug (#151)
  • split between windows (#148)
  • disable trainer-logger tags if no template (#153 )

Set default configuration (#154 )

  • preconfigured handlers
  • checkpoints
  • out folder
  • tb by default

Misc

  • ClearML (#149)
  • Export to Colab (#162 )
  • Link v0.2.0 URL in old code-generator app (#159 )
  • Bug with closing help window (#156 )
  • New line trimming: like Jinja does maybe? (#152 )
  • pytest in requirements (#150)
  • READMEs update (#155)
  • larger epochs, batch size (#155)

cc @vfdev-5 @trsvchn

Option to export model training config

Currently we store the default configuration for the project in utils.py.

As discussed, it would be great to add an option for the user to dump this configuration (e.g. for reproducibility purposes).

We can make it optional with another flag like:

  • "Store config in a separate file?"

and then, for example, we can ask for the format (a dump sketch follows the list):

"What kind of config format would you like?"

  • py
  • json
  • yaml
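
A minimal dump sketch (hedged; it assumes the defaults live in a plain dict, as utils.py suggests):

import json
from pathlib import Path

import yaml

def dump_config(config: dict, fmt: str, path: str = "config") -> Path:
    # Write the resolved defaults to the format the user picked.
    out = Path(f"{path}.{fmt}")
    if fmt == "yaml":
        out.write_text(yaml.safe_dump(config, default_flow_style=False))
    elif fmt == "json":
        out.write_text(json.dumps(config, indent=2))
    elif fmt == "py":
        out.write_text("\n".join(f"{k} = {v!r}" for k, v in config.items()) + "\n")
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return out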
