The kohya-colab from hollowstrawberry

If I train at 756 and most of my images are high-definition but one is 640px, how will that be handled?

Will the image be stretched or scaled and lose detail?

No Module Found

Getting a couple of errors when trying to run the Colab:
ERROR: Could not find a version that satisfies the requirement tensorboard==2.12.2 (from versions: 1.6.0rc0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0, 1.11.0, 1.12.0, 1.12.1, 1.12.2, 1.13.0, 1.13.1, 1.14.0, 1.15.0, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.4.0, 2.4.1, 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.9.0, 2.9.1, 2.10.0, 2.10.1, 2.11.0, 2.11.1, 2.11.2, 2.12.0)
ERROR: No matching distribution found for tensorboard==2.12.2

And

ModuleNotFoundError: No module named 'accelerate'

Looks like it's getting stuck in the setup process.

Using shared drive in colab

Please add an option to use a shared drive to save LoRAs in Colab.

I have found that the trainer does not seem to recognise tags after line breaks.

example: if I have the txt file like this:

tag1,
tag2,
tag3, tag4, tag5,

after making the lora file, I checked it and found it learned from tag1 only.
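A possible workaround, sketched under the assumption that only the first line of each caption file is being read (the behaviour above suggests this, but it is not confirmed here): collapse every .txt file into a single comma-separated line before training. The names `flatten_caption` and `flatten_caption_files` are made up for this illustration.

```python
from pathlib import Path


def flatten_caption(text: str) -> str:
    """Join tags spread over several lines into one comma-separated line."""
    # Treat line breaks like commas, then drop the empty pieces
    # left behind by trailing commas and blank lines.
    pieces = text.replace("\n", ",").split(",")
    tags = [piece.strip() for piece in pieces if piece.strip()]
    return ", ".join(tags)


def flatten_caption_files(folder: str) -> int:
    """Rewrite every .txt file in `folder` in place; return how many changed."""
    changed = 0
    for path in Path(folder).glob("*.txt"):
        original = path.read_text(encoding="utf-8")
        flattened = flatten_caption(original)
        if flattened != original:
            path.write_text(flattened, encoding="utf-8")
            changed += 1
    return changed
```

With the example file above, `flatten_caption` would turn the three lines into `tag1, tag2, tag3, tag4, tag5`. Back up the dataset folder before running `flatten_caption_files`, since it overwrites the caption files.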

Will there be a Windows version?

I don't want to run the training on Google Colab; instead I use Jupyter with my own GPU, so I set COLAB=False.

I think having this option is a very good idea, but everything is made for Linux: the code uses Linux commands (apt/sed) and is overly complicated for my usage. I just want something that works on Jupyter, and I ended up spending hours understanding and modifying everything and it still doesn't work.

Could you please add an option to choose between Linux and Windows?

How to properly train a multi-folder training project?

I tried to set up and train a project with multiple folders. I set up the dataset_config exactly in the format your template suggested, but the trainer returned "Error: Your custom dataset is invalid or contains an error! Please check the original template.".

bucketing, bucket resolution and gradient accumulation

Please add bucketing, bucket resolution and gradient accumulation to the Colab!
These features are sorely missing.

SD 1.5 not selectable anymore

Hi, can you add back SD 1.5 for those of us who train real characters, please? It would be much appreciated!

Is it possible to train 512x768 images?

I saw that other LoRA trainers have that option, but they require xformers, which cannot be run on an AMD GPU. Is it possible to train non-square images in this trainer?
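For context, trainer logs quoted in other issues on this page show enable_bucket: True with non-square buckets such as (384, 640). The following is only a simplified sketch of how aspect-ratio bucketing can work, not the trainer's actual code. The parameter names mirror the resolution, bucket_reso_steps, min_bucket_reso and max_bucket_reso fields seen in those configs, and the function names are hypothetical.

```python
def make_buckets(resolution=512, step=64, min_side=256, max_side=1024):
    """List (width, height) buckets whose area does not exceed resolution**2."""
    max_area = resolution * resolution
    buckets = []
    for width in range(min_side, max_side + 1, step):
        # Largest height that is a multiple of `step` and keeps the area in budget.
        height = min(max_side, (max_area // width) // step * step)
        if height >= min_side:
            buckets.append((width, height))
    return buckets


def closest_bucket(image_size, buckets):
    """Pick the bucket whose aspect ratio is nearest to the image's."""
    img_w, img_h = image_size
    ratio = img_w / img_h
    return min(buckets, key=lambda bucket: abs(bucket[0] / bucket[1] - ratio))
```

Under these assumptions a 512x768 image would land in the (384, 640) bucket: it keeps roughly the same shape but is scaled so that the pixel count stays within the 512x512 budget.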

Error: Your MyDrive/lora_training/datasets/xxx folder is empty.

I was able to make two LoRAs, then today, after I created the folder "Loras/project_name/dataset", it says it is empty even though it has the images and txts. It then creates an empty dataset. I add the images and txts to that one, and when running it I get the message "choose a valid project name". Please help.

Fiftyone "failed to bind to port"

Attempting to launch step 3 installs prerequisites without error, but throws "fiftyone.core.service.DatabaseService failed to bind to port" before launching image GUI. No issues prior, going to go out on a limb and assume problem is similar to ticket #44 with an overnight FO functionality change.

Please add a jupyter for extracting lora from ckpt

Hey. Please include jupyter files for the utilities such as extracting and converting models. In particular for my use case, extracting Lora.
Thanks for your work.

Information request

Hi,

Following the fascinating reading of your process for the creation of LoRA at this link: https://civitai.com/models/22530, I would like to dig deeper into the following points:

You indicate in your Colab LoRA Trainer: "I recommend that your images multiplied by their repeats is between 200 and 400". What happens if my set is composed of 450 images? Do I leave repeats = 1 for 30 epochs?
Still in your Colab LoRA Trainer, you specify that: "LoCons are said to be great with artstyles". Do you have any links to additional resources on style-specific training for LoCons, or values from your own experience that work well?
In the 03/22/23 update of the Civitai article, you say: "The colabs are now faster, have better options for photorealism". Which options and which values do you think are useful for obtaining a photorealistic LoRA?

Thanks in advance and keep it up, your writing style makes information accessible and understandable to everyone, and as you say, I believe all information should be free for everyone!

Will Prodigy be added as an optimizer?

From what I have read in numerous posts and guides, Prodigy seems to be a direct upgrade from DAdaptation as an adaptive optimizer. So I wonder if it can be added to the Google Colab.

Fails at step 4 every time

Step 4, tagging your images, has been failing every time and I don't know what to do.
Here is the error: ModuleNotFoundError: No module named 'tokenizers.pre_tokenizers'

SDXL

Will there be a Kaggle version?

Kaggle is almost the same as Colab, and it offers 33 hours of GPU use every week.

Alternative image sources

Gelbooru seems to be extremely lacking in content. Is it possible to implement scraping of Danbooru?

got stuck on this message

Never mind

Invalid file in folder error

If you capitalize your filename it gives an invalid file in folder error.
I had to mess around quite a bit on my end trying to figure out what was wrong, and wound up renaming the images and text files to all lowercase for it to work.
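For anyone hitting the same thing, the renaming can be scripted. This is a minimal sketch of the workaround described above, nothing more; `lowercase_filenames` is a made-up helper name, and it assumes a case-sensitive filesystem such as the one in Colab.

```python
from pathlib import Path


def lowercase_filenames(folder: str) -> list[tuple[str, str]]:
    """Rename files in `folder` to lowercase names; return (old, new) pairs."""
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        lower = path.name.lower()
        if lower == path.name:
            continue  # already lowercase
        target = path.with_name(lower)
        if target.exists():
            # Skip rather than overwrite another file. On case-insensitive
            # filesystems this check also skips the file itself.
            continue
        path.rename(target)
        renamed.append((path.name, lower))
    return renamed
```

Calling it on the dataset folder renames, for example, `IMG_01.PNG` and `IMG_01.TXT` to `img_01.png` and `img_01.txt`, so each image and its caption keep matching names.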

Add gradient_checkpointing option

Title, add an option to enable gradient_checkpointing

I want a dataset_maker guide

How to properly train a LoRA on multiple folders?

Hello, I am trying to train a LoRA on different folders. The Extra/Advanced section says: "When enabling this, the number of repeats set in the main cell will be ignored, and the main folder set by the project name will also be ignored." But when I run the main script, it either says that the folder has no photos or fails as in the image below.

Folder including all folders and test image with tags

dataset_config.toml in config folder

[[datasets]]

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/clasic"
num_repeats = 10

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/etc"
num_repeats = 5

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/nudist_beach"
num_repeats = 10

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/orchestra"
num_repeats = 10

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/sports"
num_repeats = 10

[general]
resolution = 512
shuffle_caption = true
keep_tokens = 1
flip_aug = false
caption_extension = ".txt"
enable_bucket = true
bucket_reso_steps = 64
bucket_no_upscale = false
min_bucket_reso = 256
max_bucket_reso = 1024

training_config.toml

[additional_network_arguments]
unet_lr = 0.0005
text_encoder_lr = 0.0001
network_dim = 32
network_alpha = 16
network_module = "networks.lora"

[optimizer_arguments]
learning_rate = 0.0005
lr_scheduler = "cosine_with_restarts"
lr_scheduler_num_cycles = 3
lr_warmup_steps = 274
optimizer_type = "AdamW8bit"

[training_arguments]
max_train_epochs = 20
save_every_n_epochs = 1
save_last_n_epochs = 7
train_batch_size = 2
clip_skip = 2
min_snr_gamma = 5.0
weighted_captions = false
seed = 42
max_token_length = 225
xformers = true
lowram = true
max_data_loader_n_workers = 8
persistent_data_loader_workers = true
save_precision = "fp16"
mixed_precision = "fp16"
output_dir = "/content/drive/MyDrive/lora_training/output/jakuzure_nonon"
logging_dir = "/content/drive/MyDrive/lora_training/log"
output_name = "jakuzure_nonon"
log_prefix = "jakuzure_nonon"
save_state = false

[model_arguments]
pretrained_model_name_or_path = "/content/animefull-final-pruned-fp16.safetensors"
v2 = false

[saving_arguments]
save_model_as = "safetensors"

[dreambooth_arguments]
prior_loss_weight = 1.0

[dataset_arguments]
cache_latents = true

How do I train with multiple dataset folders?

I am trying to train multiple outfits to different keywords, but no matter what tag I use, the end result is a mishmash of the various outfits.

Problem with SD 2.1 Lora

They straight up don't work and I don't know why.

I trained a LoRA for MangledMergeV3, which is a 2.1 model. After the training I tested it, only to find that the LoRA had no effect on image generation. (I did check the box that says the model is 2.1, by the way.)

This doesn't seem to be an issue with training parameters, because I tried different parameters that I knew at least worked on SD 1.5.

Worked one time then stopped

I tried the LoRA Colab to make one. It worked, but the results weren't good (as I was expecting). When making a new one following steps from a friend, I ended up with this:

No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
Loading settings from /content/drive/MyDrive/lora_training/config/CinderFall/training_config.toml...
/content/drive/MyDrive/lora_training/config/CinderFall/training_config
prepare tokenizer
Downloading (…)olve/main/vocab.json: 100% 961k/961k [00:00<00:00, 5.46MB/s]
Downloading (…)olve/main/merges.txt: 100% 525k/525k [00:00<00:00, 6.65MB/s]
Downloading (…)cial_tokens_map.json: 100% 389/389 [00:00<00:00, 80.5kB/s]
Downloading (…)okenizer_config.json: 100% 905/905 [00:00<00:00, 228kB/s]
update token length: 225
Load dataset config from /content/drive/MyDrive/lora_training/config/CinderFall/dataset_config.toml
prepare images.
found directory /content/drive/MyDrive/lora_training/datasets/CinderFall contains 95 image files
380 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 2
resolution: (512, 512)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: False

[Subset 0 of Dataset 0]
image_dir: "/content/drive/MyDrive/lora_training/datasets/CinderFall"
image_count: 95
num_repeats: 4
shuffle_caption: True
keep_tokens: 1
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: None
caption_extension: .txt

[Dataset 0]
loading image sizes.
100% 95/95 [00:00<00:00, 596.50it/s]
make buckets
number of images (including repeats) / 各bucketの画像枚数（繰り返し回数を含む）
bucket 0: resolution (320, 768), count: 8
bucket 1: resolution (384, 640), count: 120
bucket 2: resolution (448, 576), count: 164
bucket 3: resolution (512, 512), count: 36
bucket 4: resolution (576, 448), count: 24
bucket 5: resolution (640, 384), count: 28
mean ar error (without repeats): 0.05451873417765707
prepare accelerator
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /content/kohya-trainer/train_network.py:760 in <module> │
│ │
│ 757 │ args = parser.parse_args() │
│ 758 │ args = train_util.read_config_from_file(args, parser) │
│ 759 │ │
│ ❱ 760 │ train(args) │
│ 761 │
│ │
│ /content/kohya-trainer/train_network.py:140 in train │
│ │
│ 137 │ │
│ 138 │ # acceleratorを準備する │
│ 139 │ print("prepare accelerator") │
│ ❱ 140 │ accelerator, unwrap_model = train_util.prepare_accelerator(args) │
│ 141 │ is_main_process = accelerator.is_main_process │
│ 142 │ │
│ 143 │ # mixed precisionに対応した型を用意しておき適宜castする │
│ │
│ /content/kohya-trainer/library/train_util.py:2693 in prepare_accelerator │
│ │
│ 2690 │ │ log_prefix = "" if args.log_prefix is None else args.log_pref │
│ 2691 │ │ logging_dir = args.logging_dir + "/" + log_prefix + time.strf │
│ 2692 │ │
│ ❱ 2693 │ accelerator = Accelerator( │
│ 2694 │ │ gradient_accumulation_steps=args.gradient_accumulation_steps, │
│ 2695 │ │ mixed_precision=args.mixed_precision, │
│ 2696 │ │ log_with=log_with, │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/accelerator.py:355 in │
│ __init__ │
│ │
│ 352 │ │ if self.state.mixed_precision == "fp16" and self.distributed_ │
│ 353 │ │ │ self.native_amp = True │
│ 354 │ │ │ if not torch.cuda.is_available() and not parse_flag_from_ │
│ ❱ 355 │ │ │ │ raise ValueError(err.format(mode="fp16", requirement= │
│ 356 │ │ │ kwargs = self.scaler_handler.to_kwargs() if self.scaler_h │
│ 357 │ │ │ if self.distributed_type == DistributedType.FSDP: │
│ 358 │ │ │ │ from torch.distributed.fsdp.sharded_grad_scaler impor │
╰──────────────────────────────────────────────────────────────────────────────╯
ValueError: fp16 mixed precision requires a GPU
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/accelerate:8 in <module> │
│ │
│ 5 from accelerate.commands.accelerate_cli import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/accelerate_cli.py │
│ :45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py:1104 in │
│ launch_command │
│ │
│ 1101 │ elif defaults is not None and defaults.compute_environment == Com │
│ 1102 │ │ sagemaker_launcher(defaults, args) │
│ 1103 │ else: │
│ ❱ 1104 │ │ simple_launcher(args) │
│ 1105 │
│ 1106 │
│ 1107 def main(): │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py:567 in │
│ simple_launcher │
│ │
│ 564 │ process = subprocess.Popen(cmd, env=current_env) │
│ 565 │ process.wait() │
│ 566 │ if process.returncode != 0: │
│ ❱ 567 │ │ raise subprocess.CalledProcessError(returncode=process.return │
│ 568 │
│ 569 │
│ 570 def multi_gpu_launcher(args): │
╰──────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/usr/bin/python3', 'train_network.py',
'--dataset_config=/content/drive/MyDrive/lora_training/config/CinderFall/dataset
_config.toml',
'--config_file=/content/drive/MyDrive/lora_training/config/CinderFall/training_c
onfig.toml']' returned non-zero exit status 1.

I tried doing the same as I did when it worked the first time, without success. I'm lost and don't know what to do now.
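The log above starts with "No GPU/TPU found, falling back to CPU" and ends with "fp16 mixed precision requires a GPU", which suggests the session was running without a GPU. A quick check can be sketched as follows. The helper names are made up for illustration, and "no" is the value Accelerate uses for full precision.

```python
def cuda_is_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def choose_mixed_precision(gpu_available: bool) -> str:
    """fp16 needs a GPU, so fall back to "no" (full precision) without one."""
    return "fp16" if gpu_available else "no"
```

If `cuda_is_available()` returns False in Colab, switching the runtime type to a GPU runtime and rerunning is the first thing to try. Training on CPU with mixed_precision set to "no" would avoid this particular error, but is likely to be impractically slow.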

Tagging downloads every time

Tagging downloads the tagger model and etc. every time it is run, even on a local runtime. Is this an issue with kohya-trainer?

env: PYTHONPATH=/env/python

🚶‍♂️ Launching program...

env: PYTHONPATH=/content/kohya-trainer
downloading wd14 tagger model from hf_hub. id: SmilingWolf/wd-v1-4-swinv2-tagger-v2
Downloading (…)"keras_metadata.pb";: 100% 448k/448k [00:00<00:00, 5.67MB/s]
Downloading (…)"saved_model.pb";: 100% 37.6M/37.6M [00:01<00:00, 33.9MB/s]
Downloading (…)in/selected_tags.csv: 100% 254k/254k [00:00<00:00, 484kB/s]
Downloading (…)ata-00000-of-00001";: 100% 385M/385M [00:08<00:00, 44.7MB/s]
Downloading (…)"variables.index";: 100% 24.2k/24.2k [00:00<00:00, 21.2MB/s]

Are WebP images supported?

That would be very useful.

Error loading state_dict of Civitai model

These keep happening and I have no idea why.

Traceback (most recent call last) lora trainer error

Using accelerator 0.15.0 or above.
loading model for process 0/1
load StableDiffusion checkpoint
loading u-net:
loading vae:
loading text encoder:
Replace CrossAttention.forward to use xformers
[Dataset 0]
caching latents.
100% 306/306 [00:43<00:00, 7.02it/s]
import network module: lycoris.kohya
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /content/kohya-trainer/train_network.py:760 in <module> │
│ │
│ 757 │ args = parser.parse_args() │
│ 758 │ args = train_util.read_config_from_file(args, parser) │
│ 759 │ │
│ ❱ 760 │ train(args) │
│ 761 │
│ │
│ /content/kohya-trainer/train_network.py:195 in train │
│ │
│ 192 │ │ │ net_kwargs[key] = value │
│ 193 │ │
│ 194 │ # if a new network is added in future, add if ~ then blocks for ea │
│ ❱ 195 │ network = network_module.create_network(1.0, args.network_dim, arg │
│ 196 │ if network is None: │
│ 197 │ │ return │
│ 198 │
│ │
│ /usr/local/lib/python3.9/dist-packages/lycoris/kohya.py:26 in create_network │
│ │
│ 23 │ dropout = float(kwargs.get('dropout', 0.)) │
│ 24 │ algo = kwargs.get('algo', 'lora') │
│ 25 │ disable_cp = kwargs.get('disable_conv_cp', False) │
│ ❱ 26 │ network_module = { │
│ 27 │ │ 'lora': LoConModule, │
│ 28 │ │ 'loha': LohaModule, │
│ 29 │ }[algo] │
╰──────────────────────────────────────────────────────────────────────────────╯
KeyError: 'locon'
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/accelerate:8 in <module> │
│ │
│ 5 from accelerate.commands.accelerate_cli import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/accelerate_cli.py │
│ :45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py:1104 in │
│ launch_command │
│ │
│ 1101 │ elif defaults is not None and defaults.compute_environment == Com │
│ 1102 │ │ sagemaker_launcher(defaults, args) │
│ 1103 │ else: │
│ ❱ 1104 │ │ simple_launcher(args) │
│ 1105 │
│ 1106 │
│ 1107 def main(): │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py:567 in │
│ simple_launcher │
│ │
│ 564 │ process = subprocess.Popen(cmd, env=current_env) │
│ 565 │ process.wait() │
│ 566 │ if process.returncode != 0: │
│ ❱ 567 │ │ raise subprocess.CalledProcessError(returncode=process.return │
│ 568 │
│ 569 │
│ 570 def multi_gpu_launcher(args): │
╰──────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/usr/bin/python3', 'train_network.py',
'--dataset_config=/content/drive/MyDrive/lora_training/config/aa/dataset_con
fig.toml',
'--config_file=/content/drive/MyDrive/lora_training/config/aa/training_confi
g.toml']' returned non-zero exit status 1.

what is this error? It was working fine until yesterday

How to add VAE to training

Hello and good time of day.

How to add custom VAE to training when training with custom model?

Thanks for your work, best LoRA training Colab I've ever seen.

403 error when curating dataset

As the title suggests. Everything else works fine and I was able to make the LoRA from the photos, but curating the images just gives me a 403.

support 2.1 models

Can you add 2.1 models, or guide me through how to add them?

I changed the line: model_url = "https://civitai.com/api/download/models/29103?type=Model&format=SafeTensor"
but it seems like there is a mismatch between the architectures.

Wandb integration

It would be SO good to have https://wandb.ai/ integration if you're able to implement it. Definitely worth a ko-fi!

pip dependency conflict error, can still continue

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.
torchdata 0.6.1 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.
torchtext 0.15.2 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.
torchvision 0.15.2+cu118 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible
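To see which versions actually ended up installed after a warning like this, the standard library can report them. This is a small diagnostic sketch: `installed_versions` is a made-up helper, and the package names are simply the ones mentioned in the message above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

For example, `print(installed_versions(["torch", "torchaudio", "torchdata", "torchtext", "torchvision"]))` lists the versions side by side, which makes it easier to tell whether a package the trainer imports is affected by the mismatch.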

Hope it's OK.

Please put together a Paperspace version

Hi. So Paperspace is the new kid on the block and a pretty good Colab alternative. In fact, much better than Colab in my opinion.
Could you please consider it?
Thanks.

[Dataset Maker-3rd sp] ServiceListenTimeout: fiftyone.core.service.DatabaseService failed to bind to port

ServiceListenTimeout Traceback (most recent call last)

in <cell line: 29>()
27
28 import numpy as np
---> 29 import fiftyone as fo
30 import fiftyone.zoo as foz
31 from fiftyone import ViewField as F

10 frames

/usr/local/lib/python3.10/dist-packages/fiftyone/core/service.py in find_port()
166 pass
167
--> 168 raise ServiceListenTimeout(etau.get_class_name(self), port)
169
170 return find_port()

ServiceListenTimeout: fiftyone.core.service.DatabaseService failed to bind to port

Fiftyone image curation wipes dataset

Entering the input box to finalize the FO duplicate image curation, the FO session appears to prematurely disconnect, navigating to the 'Welcome to FiftyOne!' page, and all contents of the dataset are deleted. There are no issues before this, and the FO page will display and mark images correctly. Had no issues like this prior to what I assume was the latest main branch update. Attempted with both manually uploading files and using the Gelbooru scraper, and used .jpg, .jpeg, and .png files.

Add option to "Skip existing latents"

Very good Colab, thank you for your great work, but one thing: it wastes time building latents each time the cell is run.

Alternatively, split it into three cells: prepare and install dependencies, cache latents, and set training parameters + start training.

Like in the Linaqruf Colab.

SDXL support?

Please add SDXL support

Error: Your MyDrive/lora_training/datasets/xxx folder is empty.

I have put my .png images and .txt tags into MyDrive/lora_training/datasets/xxx folder. But I still get the Error: Your MyDrive/lora_training/datasets/xxx folder is empty.
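One way to narrow this down is to check what the Colab runtime itself sees in that folder, since the Drive web interface and the mounted path can disagree (for example, if Drive is not mounted or the folder name differs slightly). This is a hedged diagnostic sketch with a hypothetical helper name, not something from the notebook.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(folder: str) -> Counter:
    """Count files in `folder` by lowercase extension."""
    path = Path(folder)
    if not path.is_dir():
        raise FileNotFoundError(f"{folder} is not a folder the runtime can see")
    return Counter(p.suffix.lower() for p in path.iterdir() if p.is_file())
```

Running `summarize_folder("/content/drive/MyDrive/lora_training/datasets/xxx")` after mounting Drive should report counts for `.png` and `.txt`. An error or an empty counter means the files are not where the trainer is looking.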

manually add folder

Do we need to make the directories manually from Drive?

I have some code that might help:

from google.colab import drive

# Mount Google Drive so the folders are created there.
drive.mount('/content/drive')

# -p creates missing parent folders and does not fail if a folder already exists.
# (A separate "!cd" line has no lasting effect, so full paths are used instead.)
!mkdir -p /content/drive/MyDrive/lora_training/config
!mkdir -p /content/drive/MyDrive/lora_training/datasets
!mkdir -p /content/drive/MyDrive/lora_training/output
!mkdir -p /content/drive/MyDrive/lora_training/log

new error

Have been using this colab successfully until today. Now there is an error:

ERROR: Could not find a version that satisfies the requirement torch==2.0.0+cu118 (from versions: 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==2.0.0+cu118

ModuleNotFoundError Traceback (most recent call last)
in <cell line: 512>()
510 display(Markdown("### ✅ Done! Go download your Lora(s) from Google Drive"))
511
--> 512 main()

1 frames
in install_dependencies()
206 get_ipython().system('sed -i 's/model_name + "."/model_name + "-{:02d}.".format(num_train_epochs)/g' train_network.py # name of the last epoch will match the rest')
207
--> 208 from accelerate.utils import write_basic_config
209 if not os.path.exists(accelerate_config_file):
210 write_basic_config(save_location=accelerate_config_file)

ModuleNotFoundError: No module named 'accelerate'

tried:
!pip install torch==2.0.0+cu118

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
ERROR: Could not find a version that satisfies the requirement torch==2.0.0+cu118 (from versions: 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==2.0.0+cu118

tried
!pip install install torch==2.0.0
works, but get the same error when running the main cell
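One thing that stands out in the pip output above is that it only looked at pypi.org and the Colab wheel index, neither of which lists a build with the +cu118 suffix. CUDA-specific builds are served from PyTorch's own wheel index, so a command along these lines may find the requested version. This is a sketch: `torch_install_command` is a hypothetical helper, and whether it fixes the notebook's main cell is not confirmed here.

```python
import sys


def torch_install_command(version="2.0.0", cuda_tag="cu118"):
    """Build a pip command that looks for CUDA builds on PyTorch's wheel index."""
    return [
        sys.executable, "-m", "pip", "install",
        f"torch=={version}+{cuda_tag}",
        "--index-url", f"https://download.pytorch.org/whl/{cuda_tag}",
    ]
```

The list can be passed to `subprocess.run(torch_install_command(), check=True)`. Building the command separately from running it makes it easy to print and inspect first.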

'is not defined'

I'm getting some errors.

'os is not defined': I fixed this one by placing import os at the very start of the code.

It then proceeds to give the error 'root_dir' is not defined.

help :((

Please add a loss chart

Update a line or bar chart in real time during training, where the horizontal axis is the epoch number and the vertical axis is the loss. If it cannot be updated in real time, it could be displayed when training completes.

add ability to use custom local model (enhancement)

Hi, your script works well. However, please also add the ability to select a local model to train on (one that was uploaded to Google Drive). Thanks.

opportunity - help me with stable diffusion

Hey, hello,

I appreciate your stable diffusion tutorial: https://huggingface.co/hollowstrawberry/holotard
It helped me a lot.

I would like to pay you to help me implement mine as I'm a bit lost.
Is that possible?

My email is tentothedollar @ gmail.com

Thanks

Tags and Caption

Hi there :)

I'm using your LoRA Trainer notebook (really straightforward, I love it!). But I was wondering how or where I put the tag and caption .txt files. Aren't they usually created by a cell? In my understanding, a .txt file is created and I need to go and make some adjustments so that it describes the image well.

Thanks in advance for your help :)

Please explain the num_repeat step

So the rule of thumb is 100 steps per image, followed by an additional 1500 steps. So for 10 images, 150 steps per image is a good sweet spot. And then the number of steps decrease the more images you add in order to prevent over training. So for instance with 20 images, 100 steps per image is enough. But you're saying I should choose a number between 200 - 400. What's the logic behind that?
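To compare the two rules of thumb, it helps to write the arithmetic down. A trainer log quoted elsewhere on this page reports "380 train images with repeating" for 95 images at 4 repeats, so images times repeats is the number of samples seen per epoch. The step estimate below is a simplification that ignores bucketing and gradient accumulation, and the function names are made up.

```python
import math


def images_per_epoch(num_images: int, num_repeats: int) -> int:
    """Samples seen in one epoch: every image is shown `num_repeats` times."""
    return num_images * num_repeats


def approx_total_steps(num_images, num_repeats, epochs, batch_size):
    """Rough optimizer step count: batches per epoch times number of epochs."""
    per_epoch = math.ceil(images_per_epoch(num_images, num_repeats) / batch_size)
    return per_epoch * epochs
```

For instance, 20 images with 10 repeats give 200 samples per epoch. At batch size 2 over 10 epochs (assumed values), that is about 1000 steps. So the "200 to 400" figure describes samples per epoch, while the "steps per image" rule describes the total across all epochs.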

How to change optimizer

I'm trying to change the optimizer to AdamW, but changing optimizer_type leads to:

AdamW produces better results
