
hollowstrawberry / kohya-colab


Accessible Google Colab notebooks for Stable Diffusion Lora training, based on the work of kohya-ss and Linaqruf

License: GNU General Public License v3.0

Jupyter Notebook 48.72% Python 51.28%

Contributors: cleanup-crew-from-discord, darkmyu, hollowstrawberry


kohya-colab's Issues

No Module Found

Getting a couple of errors when trying to run the Colab:
ERROR: Could not find a version that satisfies the requirement tensorboard==2.12.2 (from versions: 1.6.0rc0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0, 1.11.0, 1.12.0, 1.12.1, 1.12.2, 1.13.0, 1.13.1, 1.14.0, 1.15.0, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.4.0, 2.4.1, 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.9.0, 2.9.1, 2.10.0, 2.10.1, 2.11.0, 2.11.1, 2.11.2, 2.12.0)
ERROR: No matching distribution found for tensorboard==2.12.2

And

ModuleNotFoundError: No module named 'accelerate'

Looks like it's getting stuck in the setup process.
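
Both messages are consistent with the pinned install aborting, which would leave later imports unavailable. Below is a minimal sketch for confirming which modules are importable before running the main cell. `missing_modules` is a hypothetical helper, not part of the notebook, and the module list is only taken from the errors quoted above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the error messages in this report.
print(missing_modules(["accelerate", "tensorboard"]))
```

An empty list would suggest the install step succeeded and the problem lies elsewhere.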

Will there be a Windows version?

I don't want to run the training on Google Colab. Instead I use Jupyter with my own GPU, so I set COLAB=False.

I think having this option is a very good idea, but everything is made for Linux: the code uses Linux commands (apt, sed) and is overly complicated for my use case. I just want something that works in Jupyter, and I have spent hours understanding and modifying everything, yet it still doesn't work.

Could you please add an option to choose between Linux and Windows?
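
One way to reduce the dependence on `sed` is to do text substitutions in Python, which behaves the same on Linux and Windows. The sketch below is an assumption about how that could look; `replace_in_file` is a hypothetical helper and does not exist in the notebook.

```python
from pathlib import Path


def replace_in_file(path, old, new, encoding="utf-8"):
    """Replace every literal occurrence of `old` with `new` in a text file.

    Returns the number of replacements made. Unlike `sed`, `old` is treated
    as plain text, not as a regular expression.
    """
    file_path = Path(path)
    text = file_path.read_text(encoding=encoding)
    count = text.count(old)
    if count:
        file_path.write_text(text.replace(old, new), encoding=encoding)
    return count
```

This covers only the `sed` calls; the `apt` installs would still need a separate Windows equivalent.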

How to properly train a multiple folders training project?

I tried to set up and train a multiple folders training project. I set up the dataset_config exactly in the format your template suggested, but the trainer returned "Error: Your custom dataset is invalid or contains an error! Please check the original template.".

Is it possible to train on 512x768 images?

I saw that other LoRA trainers have that option, but they require xformers, which cannot run on AMD GPUs. Is it possible to train on non-square images in this trainer?
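
A training log further down this page, produced with enable_bucket = true at a 512 base resolution, lists non-square buckets such as (320, 768) and (384, 640). The following is a simplified sketch of how such a bucket list can be enumerated. It is not the trainer's own code, and the function name and defaults are assumptions chosen to match the config values shown on this page (256 minimum, 1024 maximum, steps of 64).

```python
import math


def bucket_resolutions(base=(512, 512), min_size=256, max_size=1024, step=64):
    """Enumerate (width, height) buckets whose area does not exceed the base area."""
    max_area = base[0] * base[1]
    buckets = set()
    # Largest square that fits the area budget.
    side = int(math.sqrt(max_area)) // step * step
    buckets.add((side, side))
    for width in range(min_size, max_size + 1, step):
        # Tallest height (a multiple of `step`) that keeps the area within budget.
        height = min(max_size, (max_area // width) // step * step)
        if height >= min_size:
            buckets.add((width, height))
            buckets.add((height, width))
    return sorted(buckets)


for width, height in bucket_resolutions():
    print(width, height)
```

Each image would then be assigned to the bucket closest to its aspect ratio, so portrait and landscape images are not forced into a square.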

Error: Your MyDrive/lora_training/datasets/xxx folder is empty.

I was able to make two LoRAs. Then today, after I created the folder "Loras/project_name/dataset", it says the folder is empty even though it contains the images and txt files. It then creates an empty dataset folder. I added the images and txt files to that one, and when running it I get the message "choose a valid project name". Please help.

Fiftyone "failed to bind to port"

Launching step 3 installs the prerequisites without error, but throws "fiftyone.core.service.DatabaseService failed to bind to port" before launching the image GUI. There were no issues before this. I'm going to go out on a limb and assume the problem is similar to ticket #44, with an overnight change in FiftyOne (FO) functionality.

Information request

Hi,

After reading your fascinating guide to creating LoRAs at https://civitai.com/models/22530, I would like to dig deeper into the following points:

  1. You indicate in your Lora Trainer Colab: "I recommend that your images multiplied by their repeats is between 200 and 400". What happens if my dataset has 450 images? Do I leave repeats = 1 for 30 epochs?
  2. Still in your Lora Trainer Colab, you say: "LoCons are said to be great with artstyles". Do you have any links to additional resources on style-specific training for LoCons, or values from your own experience that work well?
  3. In the 03/22/23 update of the Civitai article, you say: "The colabs are now faster, have better options for photorealism". Which options and values do you think are useful for obtaining a photorealistic LoRA?

Thanks in advance, and keep it up. Your writing style makes information accessible and understandable to everyone and, as you say, I believe all information should be free for everyone!
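
The arithmetic behind question 1 can be sketched as follows. This only applies the quoted 200 to 400 guideline mechanically and is not the author's answer; `suggest_repeats` is a hypothetical helper.

```python
import math


def suggest_repeats(num_images, low=200, high=400):
    """Pick the smallest repeat count that brings images * repeats to at least `low`.

    Returns (repeats, images * repeats, whether the product is within [low, high]).
    """
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    repeats = max(1, math.ceil(low / num_images))
    total = num_images * repeats
    return repeats, total, low <= total <= high


print(suggest_repeats(95))   # a dataset size that appears in a log on this page
print(suggest_repeats(450))  # the case asked about in question 1
```

With 450 images, even a single repeat exceeds the upper bound, so the guideline alone does not say what to do; the number of epochs is the remaining setting to adjust.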

Will Prodigy be added as an optimizer?

From what I have read in numerous posts and guides, Prodigy seems to be a direct upgrade over D-Adaptation as an adaptive optimizer. Could it be added to the Colab?

Fails at step 4 every time

Step 4, tagging your images, has been failing every time and I don't know what to do.
Here is the error: ModuleNotFoundError: No module named 'tokenizers.pre_tokenizers'

Alternative image sources

Gelbooru seems to be extremely lacking in content. Is it possible to implement scraping of Danbooru?

Invalid file in folder error

If a filename contains capital letters, the notebook gives an "invalid file in folder" error.
I had to experiment quite a bit to figure out what was wrong, and ended up renaming the images and text files to all lowercase for it to work.
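
The workaround described here, renaming everything to lowercase, can be scripted. The sketch below is an assumption about one reasonable way to do it; both function names are hypothetical and neither is part of the notebook.

```python
from pathlib import Path


def lowercase_rename_plan(folder):
    """Map each file whose name contains uppercase characters to a lowercase target path."""
    plan = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.name != path.name.lower():
            plan[path] = path.with_name(path.name.lower())
    return plan


def apply_plan(plan):
    """Rename files, skipping any whose lowercase name is already taken by another file."""
    for source, target in plan.items():
        if target.exists() and not target.samefile(source):
            print(f"skipped {source.name}: {target.name} already exists")
            continue
        source.rename(target)
```

Building the plan first makes it possible to review the list of renames before anything on Drive is changed.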

How to properly train a LoRA on multiple folders?

Hello, I am trying to train a LoRA on several folders. The extra/Advanced section says: "When enabling this, the number of repeats set in the main cell will be ignored, and the main folder set by the project name will also be ignored." But when I run the main script, it either says that the folder has no photos or fails as in the image below.
[image]
The folder containing all subfolders, and a test image with tags:
[image]
dataset_config.toml in the config folder:

[[datasets]]

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/clasic"
num_repeats = 10

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/etc"
num_repeats = 5

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/nudist_beach"
num_repeats = 10

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/orchestra"
num_repeats = 10

[[datasets.subsets]]
image_dir = "/content/drive/MyDrive/lora_training/datasets/jakuzure_nonon/sports"
num_repeats = 10

[general]
resolution = 512
shuffle_caption = true
keep_tokens = 1
flip_aug = false
caption_extension = ".txt"
enable_bucket = true
bucket_reso_steps = 64
bucket_no_upscale = false
min_bucket_reso = 256
max_bucket_reso = 1024

training_config.toml

[additional_network_arguments]
unet_lr = 0.0005
text_encoder_lr = 0.0001
network_dim = 32
network_alpha = 16
network_module = "networks.lora"

[optimizer_arguments]
learning_rate = 0.0005
lr_scheduler = "cosine_with_restarts"
lr_scheduler_num_cycles = 3
lr_warmup_steps = 274
optimizer_type = "AdamW8bit"

[training_arguments]
max_train_epochs = 20
save_every_n_epochs = 1
save_last_n_epochs = 7
train_batch_size = 2
clip_skip = 2
min_snr_gamma = 5.0
weighted_captions = false
seed = 42
max_token_length = 225
xformers = true
lowram = true
max_data_loader_n_workers = 8
persistent_data_loader_workers = true
save_precision = "fp16"
mixed_precision = "fp16"
output_dir = "/content/drive/MyDrive/lora_training/output/jakuzure_nonon"
logging_dir = "/content/drive/MyDrive/lora_training/log"
output_name = "jakuzure_nonon"
log_prefix = "jakuzure_nonon"
save_state = false

[model_arguments]
pretrained_model_name_or_path = "/content/animefull-final-pruned-fp16.safetensors"
v2 = false

[saving_arguments]
save_model_as = "safetensors"

[dreambooth_arguments]
prior_loss_weight = 1.0

[dataset_arguments]
cache_latents = true
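
Before starting a multi-folder run, it can help to check that every image_dir in the dataset config exists and contains captioned images. The sketch below assumes the config layout shown above. `check_dataset_config` and the list of image suffixes are hypothetical, and this is not the validation the notebook itself performs.

```python
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the `tomli` backport is installed on older Pythons

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}  # assumed set of image types


def check_dataset_config(toml_text):
    """Return a list of human-readable problems found in a dataset config."""
    config = tomllib.loads(toml_text)
    caption_ext = config.get("general", {}).get("caption_extension", ".txt")
    subsets = [s for d in config.get("datasets", []) for s in d.get("subsets", [])]
    problems = []
    if not subsets:
        problems.append("no [[datasets.subsets]] entries found")
    for subset in subsets:
        if "image_dir" not in subset:
            problems.append("subset without image_dir")
            continue
        folder = Path(subset["image_dir"])
        if not folder.is_dir():
            problems.append(f"missing folder: {folder}")
            continue
        images = [p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES]
        if not images:
            problems.append(f"no images in: {folder}")
        for image in images:
            if not image.with_suffix(caption_ext).exists():
                problems.append(f"no caption for: {image}")
    return problems
```

A syntax error in the file surfaces as an exception from `tomllib.loads`, which separates malformed TOML from folders that are simply missing or empty.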

Problem with SD 2.1 Lora

They simply don't work and I don't know why.

I trained a LoRA for MangledMergeV3, which is a 2.1 model. After training I tested it, only to find that the LoRA had no effect on image generation. (I did check the box that says the model is 2.1, by the way.)

This doesn't seem to be an issue with training parameters, because I tried different parameters that I knew at least worked on SD 1.5.

Worked once, then stopped

I tried the Lora Colab to make a LoRA. It worked, but the results weren't good (as I was expecting). When making a new one following steps from a friend, I ended up with this:

No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
Loading settings from /content/drive/MyDrive/lora_training/config/CinderFall/training_config.toml...
/content/drive/MyDrive/lora_training/config/CinderFall/training_config
prepare tokenizer
Downloading (…)olve/main/vocab.json: 100% 961k/961k [00:00<00:00, 5.46MB/s]
Downloading (…)olve/main/merges.txt: 100% 525k/525k [00:00<00:00, 6.65MB/s]
Downloading (…)cial_tokens_map.json: 100% 389/389 [00:00<00:00, 80.5kB/s]
Downloading (…)okenizer_config.json: 100% 905/905 [00:00<00:00, 228kB/s]
update token length: 225
Load dataset config from /content/drive/MyDrive/lora_training/config/CinderFall/dataset_config.toml
prepare images.
found directory /content/drive/MyDrive/lora_training/datasets/CinderFall contains 95 image files
380 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 2
resolution: (512, 512)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: False

[Subset 0 of Dataset 0]
image_dir: "/content/drive/MyDrive/lora_training/datasets/CinderFall"
image_count: 95
num_repeats: 4
shuffle_caption: True
keep_tokens: 1
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: None
caption_extension: .txt

[Dataset 0]
loading image sizes.
100% 95/95 [00:00<00:00, 596.50it/s]
make buckets
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (320, 768), count: 8
bucket 1: resolution (384, 640), count: 120
bucket 2: resolution (448, 576), count: 164
bucket 3: resolution (512, 512), count: 36
bucket 4: resolution (576, 448), count: 24
bucket 5: resolution (640, 384), count: 28
mean ar error (without repeats): 0.05451873417765707
prepare accelerator
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /content/kohya-trainer/train_network.py:760 in <module> │
│ │
│ 757 │ args = parser.parse_args() │
│ 758 │ args = train_util.read_config_from_file(args, parser) │
│ 759 │ │
│ ❱ 760 │ train(args) │
│ 761 │
│ │
│ /content/kohya-trainer/train_network.py:140 in train │
│ │
│ 137 │ │
│ 138 │ # acceleratorを準備する │
│ 139 │ print("prepare accelerator") │
│ ❱ 140 │ accelerator, unwrap_model = train_util.prepare_accelerator(args) │
│ 141 │ is_main_process = accelerator.is_main_process │
│ 142 │ │
│ 143 │ # mixed precisionに対応した型を用意しておき適宜castする │
│ │
│ /content/kohya-trainer/library/train_util.py:2693 in prepare_accelerator │
│ │
│ 2690 │ │ log_prefix = "" if args.log_prefix is None else args.log_pref │
│ 2691 │ │ logging_dir = args.logging_dir + "/" + log_prefix + time.strf │
│ 2692 │ │
│ ❱ 2693 │ accelerator = Accelerator( │
│ 2694 │ │ gradient_accumulation_steps=args.gradient_accumulation_steps, │
│ 2695 │ │ mixed_precision=args.mixed_precision, │
│ 2696 │ │ log_with=log_with, │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/accelerator.py:355 in │
│ __init__ │
│ │
│ 352 │ │ if self.state.mixed_precision == "fp16" and self.distributed_ │
│ 353 │ │ │ self.native_amp = True │
│ 354 │ │ │ if not torch.cuda.is_available() and not parse_flag_from_ │
│ ❱ 355 │ │ │ │ raise ValueError(err.format(mode="fp16", requirement= │
│ 356 │ │ │ kwargs = self.scaler_handler.to_kwargs() if self.scaler_h │
│ 357 │ │ │ if self.distributed_type == DistributedType.FSDP: │
│ 358 │ │ │ │ from torch.distributed.fsdp.sharded_grad_scaler impor │
╰──────────────────────────────────────────────────────────────────────────────╯
ValueError: fp16 mixed precision requires a GPU
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/accelerate:8 in <module> │
│ │
│ 5 from accelerate.commands.accelerate_cli import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/accelerate_cli.py │
│ :45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py:1104 in │
│ launch_command │
│ │
│ 1101 │ elif defaults is not None and defaults.compute_environment == Com │
│ 1102 │ │ sagemaker_launcher(defaults, args) │
│ 1103 │ else: │
│ ❱ 1104 │ │ simple_launcher(args) │
│ 1105 │
│ 1106 │
│ 1107 def main(): │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py:567 in │
│ simple_launcher │
│ │
│ 564 │ process = subprocess.Popen(cmd, env=current_env) │
│ 565 │ process.wait() │
│ 566 │ if process.returncode != 0: │
│ ❱ 567 │ │ raise subprocess.CalledProcessError(returncode=process.return │
│ 568 │
│ 569 │
│ 570 def multi_gpu_launcher(args): │
╰──────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/usr/bin/python3', 'train_network.py', '--dataset_config=/content/drive/MyDrive/lora_training/config/CinderFall/dataset_config.toml', '--config_file=/content/drive/MyDrive/lora_training/config/CinderFall/training_config.toml']' returned non-zero exit status 1.

I tried doing the same as I did when it worked the first time, without success. I'm lost and don't know what to do now.
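
The first line of the log ("No GPU/TPU found, falling back to CPU") suggests the session had no GPU attached, which would explain the final "fp16 mixed precision requires a GPU" error. A minimal sketch of a check that could be run before training follows. Both function names are hypothetical, and falling back to "no" (full precision) only avoids the crash; CPU training would still be impractically slow.

```python
def cuda_is_available():
    """Report whether PyTorch can see a CUDA device; False if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def choose_mixed_precision(cuda_available, requested="fp16"):
    """Fall back to full precision when fp16 is requested but no GPU is present."""
    if requested == "fp16" and not cuda_available:
        return "no"
    return requested


if not cuda_is_available():
    print("No GPU detected: switch the runtime to a GPU before training.")
print(choose_mixed_precision(cuda_is_available()))
```

In Colab the usual remedy is to change the runtime type to one with a GPU and rerun the cell.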

Tagging downloads the model every time

Tagging downloads the tagger model and etc. every time it is run, even on a local runtime. Is this an issue with kohya-trainer?

env: PYTHONPATH=/env/python

🚶‍♂️ Launching program...

env: PYTHONPATH=/content/kohya-trainer
downloading wd14 tagger model from hf_hub. id: SmilingWolf/wd-v1-4-swinv2-tagger-v2
Downloading (…)"keras_metadata.pb";: 100% 448k/448k [00:00<00:00, 5.67MB/s]
Downloading (…)"saved_model.pb";: 100% 37.6M/37.6M [00:01<00:00, 33.9MB/s]
Downloading (…)in/selected_tags.csv: 100% 254k/254k [00:00<00:00, 484kB/s]
Downloading (…)ata-00000-of-00001";: 100% 385M/385M [00:08<00:00, 44.7MB/s]
Downloading (…)"variables.index";: 100% 24.2k/24.2k [00:00<00:00, 21.2MB/s]

Traceback (most recent call last) lora trainer error

Using accelerator 0.15.0 or above.
loading model for process 0/1
load StableDiffusion checkpoint
loading u-net:
loading vae:
loading text encoder:
Replace CrossAttention.forward to use xformers
[Dataset 0]
caching latents.
100% 306/306 [00:43<00:00, 7.02it/s]
import network module: lycoris.kohya
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /content/kohya-trainer/train_network.py:760 in <module> │
│ │
│ 757 │ args = parser.parse_args() │
│ 758 │ args = train_util.read_config_from_file(args, parser) │
│ 759 │ │
│ ❱ 760 │ train(args) │
│ 761 │
│ │
│ /content/kohya-trainer/train_network.py:195 in train │
│ │
│ 192 │ │ │ net_kwargs[key] = value │
│ 193 │ │
│ 194 │ # if a new network is added in future, add if ~ then blocks for ea │
│ ❱ 195 │ network = network_module.create_network(1.0, args.network_dim, arg │
│ 196 │ if network is None: │
│ 197 │ │ return │
│ 198 │
│ │
│ /usr/local/lib/python3.9/dist-packages/lycoris/kohya.py:26 in create_network │
│ │
│ 23 │ dropout = float(kwargs.get('dropout', 0.)) │
│ 24 │ algo = kwargs.get('algo', 'lora') │
│ 25 │ disable_cp = kwargs.get('disable_conv_cp', False) │
│ ❱ 26 │ network_module = { │
│ 27 │ │ 'lora': LoConModule, │
│ 28 │ │ 'loha': LohaModule, │
│ 29 │ }[algo] │
╰──────────────────────────────────────────────────────────────────────────────╯
KeyError: 'locon'
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/accelerate:8 in <module> │
│ │
│ 5 from accelerate.commands.accelerate_cli import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/accelerate_cli.py │
│ :45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py:1104 in │
│ launch_command │
│ │
│ 1101 │ elif defaults is not None and defaults.compute_environment == Com │
│ 1102 │ │ sagemaker_launcher(defaults, args) │
│ 1103 │ else: │
│ ❱ 1104 │ │ simple_launcher(args) │
│ 1105 │
│ 1106 │
│ 1107 def main(): │
│ │
│ /usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py:567 in │
│ simple_launcher │
│ │
│ 564 │ process = subprocess.Popen(cmd, env=current_env) │
│ 565 │ process.wait() │
│ 566 │ if process.returncode != 0: │
│ ❱ 567 │ │ raise subprocess.CalledProcessError(returncode=process.return │
│ 568 │
│ 569 │
│ 570 def multi_gpu_launcher(args): │
╰──────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/usr/bin/python3', 'train_network.py', '--dataset_config=/content/drive/MyDrive/lora_training/config/aa/dataset_config.toml', '--config_file=/content/drive/MyDrive/lora_training/config/aa/training_config.toml']' returned non-zero exit status 1.

What is this error? It was working fine until yesterday.
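
The traceback shows a lookup table that contains only 'lora' and 'loha', indexed with algo = 'locon', so the KeyError comes from a plain dictionary miss. The sketch below reproduces that pattern with stand-in classes and adds a clearer error message. The class bodies and `resolve_algo` are hypothetical and are not the library's code.

```python
class LoConModule:
    """Stand-in for the class named in the traceback."""


class LohaModule:
    """Stand-in for the class named in the traceback."""


# Same keys as the table shown in the traceback.
SUPPORTED_ALGOS = {"lora": LoConModule, "loha": LohaModule}


def resolve_algo(algo):
    """Look up a network class, raising a descriptive error for unknown names."""
    try:
        return SUPPORTED_ALGOS[algo]
    except KeyError:
        options = ", ".join(sorted(SUPPORTED_ALGOS))
        raise ValueError(f"unknown algo {algo!r}; expected one of: {options}") from None
```

Going by the table in the traceback, the installed version maps the 'lora' key to LoConModule, so the name passed by the notebook and the names accepted by the installed library appear to have drifted apart.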

How to add VAE to training

Hello and good day.

How do I add a custom VAE when training with a custom model?

Thanks for your work, this is the best LoRA training Colab I've ever seen.

403 error when curating dataset

As the title suggests: everything else works fine and I was able to make the LoRA from the photos, but curating the images just gives me a 403.

image

pip dependency conflict error, can still continue

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.
torchdata 0.6.1 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.
torchtext 0.15.2 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible.
torchvision 0.15.2+cu118 requires torch==2.0.1, but you have torch 2.0.0+cu118 which is incompatible

Hope it's OK.

Please put together a Paperspace version

Hi. So Paperspace is the new kid on the block and a pretty good Colab alternative. In fact, much better than Colab in my opinion.
Could you please consider it?
Thanks.

[Dataset Maker-3rd sp] ServiceListenTimeout: fiftyone.core.service.DatabaseService failed to bind to port



ServiceListenTimeout Traceback (most recent call last)

in <cell line: 29>()
27
28 import numpy as np
---> 29 import fiftyone as fo
30 import fiftyone.zoo as foz
31 from fiftyone import ViewField as F

10 frames

/usr/local/lib/python3.10/dist-packages/fiftyone/core/service.py in find_port()
166 pass
167
--> 168 raise ServiceListenTimeout(etau.get_class_name(self), port)
169
170 return find_port()

ServiceListenTimeout: fiftyone.core.service.DatabaseService failed to bind to port
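
A generic way to rule out a plain port conflict is to test whether a local TCP port can be bound at all. The sketch below does not reproduce FiftyOne's own logic, and both helpers are hypothetical. If ports bind normally, the database service probably failed to start for some other reason.

```python
import socket


def can_bind(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def find_free_port(host="127.0.0.1"):
    """Ask the operating system for an unused TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```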

Fiftyone image curation wipes dataset

Entering the input box to finalize the FO duplicate image curation, the FO session appears to prematurely disconnect, navigating to the 'Welcome to FiftyOne!' page, and all contents of the dataset are deleted. There are no issues before this, and the FO page will display and mark images correctly. Had no issues like this prior to what I assume was the latest main branch update. Attempted with both manually uploading files and using the Gelbooru scraper, and used .jpg, .jpeg, and .png files.

Add option to "Skip existing latents"

Very good Colab, thank you for your great work. One issue is that it wastes time rebuilding the latents each time the cell is run.

Alternatively, split it into three cells: prepare and install dependencies, cache latents, and training parameters plus training start.

Like in the Linaqruf Colab.

manually add folder

Do we need to create the directories manually in Drive?

I have some code that might help:

from google.colab import drive
drive.mount('/content/drive')
# -p creates missing parent folders and does not fail if a folder already exists
!mkdir -p /content/drive/MyDrive/lora_training/config
!mkdir -p /content/drive/MyDrive/lora_training/datasets
!mkdir -p /content/drive/MyDrive/lora_training/output
!mkdir -p /content/drive/MyDrive/lora_training/log

new error

Have been using this colab successfully until today. Now there is an error:

ERROR: Could not find a version that satisfies the requirement torch==2.0.0+cu118 (from versions: 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==2.0.0+cu118

ModuleNotFoundError Traceback (most recent call last)
in <cell line: 512>()
510 display(Markdown("### ✅ Done! Go download your Lora(s) from Google Drive"))
511
--> 512 main()

1 frames
in install_dependencies()
206 get_ipython().system('sed -i 's/model_name + "."/model_name + "-{:02d}.".format(num_train_epochs)/g' train_network.py # name of the last epoch will match the rest')
207
--> 208 from accelerate.utils import write_basic_config
209 if not os.path.exists(accelerate_config_file):
210 write_basic_config(save_location=accelerate_config_file)

ModuleNotFoundError: No module named 'accelerate'

tried:
!pip install torch==2.0.0+cu118

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
ERROR: Could not find a version that satisfies the requirement torch==2.0.0+cu118 (from versions: 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==2.0.0+cu118

Also tried:
!pip install install torch==2.0.0
This works, but I get the same error when running the main cell.
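
The pip output lists only versions without a "+cu118" suffix, and the indexes it searched do not include the PyTorch wheel index, where CUDA-specific builds are published. The sketch below builds a pip command that points at that index. `torch_install_command` is a hypothetical helper, and whether this exact version is still offered there is an assumption.

```python
import sys


def torch_install_command(version="2.0.0", cuda_tag="cu118"):
    """Build a pip command that looks for CUDA-specific wheels on the PyTorch index."""
    return [
        sys.executable, "-m", "pip", "install",
        f"torch=={version}+{cuda_tag}",
        "--index-url", f"https://download.pytorch.org/whl/{cuda_tag}",
    ]


print(" ".join(torch_install_command()))
```

The list can be passed to `subprocess.run(..., check=True)`, or the printed command can be pasted into a notebook cell after a leading `!`.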

'is not defined'

Getting some errors:

'os is not defined': I fixed this one by placing import os at the very start of the code.

It then proceeds to give the error 'root_dir' is not defined.

Help :((

Request: add a loss chart

Update a line or bar chart in real time during training, with the epoch number on the horizontal axis and the loss on the vertical axis. If it cannot be updated in real time, it could be displayed when training completes.
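
The data behind such a chart is one loss value per epoch. A minimal sketch of that aggregation step, assuming the per-step losses are already available as a list, is shown below. `epoch_mean_losses` is a hypothetical helper and the sample numbers in the usage line are made up for illustration.

```python
def epoch_mean_losses(step_losses, steps_per_epoch):
    """Average per-step losses into one value per epoch (the last epoch may be partial)."""
    if steps_per_epoch <= 0:
        raise ValueError("steps_per_epoch must be positive")
    means = []
    for start in range(0, len(step_losses), steps_per_epoch):
        chunk = step_losses[start:start + steps_per_epoch]
        means.append(sum(chunk) / len(chunk))
    return means


# Made-up losses for five steps with two steps per epoch.
print(epoch_mean_losses([1.0, 3.0, 2.0, 4.0, 5.0], 2))
```

The resulting list can be handed to any plotting library, with the list index as the epoch number.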

Tags and Caption

Hi there :)

I'm using your Lora Trainer notebook (really straightforward, I love it!), but I was wondering how or where to put the tag and caption .txt files. Aren't they usually created by a cell? My understanding is that a .txt file is created and I then need to adjust it so that it describes the image well.

Thanks in advance for your help :)

Please explain the num_repeat step

So the rule of thumb is 100 steps per image, followed by an additional 1500 steps. For 10 images, 150 steps per image is a good sweet spot. The number of steps per image then decreases the more images you add, in order to prevent overtraining. For instance, with 20 images, 100 steps per image is enough. But you're saying I should choose a number between 200 and 400. What's the logic behind that?
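
To compare the two rules of thumb, it helps to convert images and repeats into training steps. The sketch below uses figures from a log elsewhere on this page (95 images, 4 repeats, batch size 2). The epoch count of 10 is a hypothetical value, and the calculation ignores any rounding that happens per bucket.

```python
import math


def training_steps(num_images, repeats, batch_size, epochs):
    """Return (steps per epoch, total steps), ignoring per-bucket rounding."""
    images_per_epoch = num_images * repeats
    steps_per_epoch = math.ceil(images_per_epoch / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# 95 images, 4 repeats and batch size 2 come from the log; 10 epochs is hypothetical.
print(training_steps(95, 4, 2, 10))
```

Here images times repeats is 380, inside the 200 to 400 range, and the total step count then scales with the number of epochs.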

How to change optimizer

I'm trying to change the optimizer to AdamW, but changing optimizer_type leads to:

image
image

AdamW produces better results.
