
sd-scripts's Introduction

This repository contains training, generation and utility scripts for Stable Diffusion.

Change History is moved to the bottom of the page. 更新履歴はページ末尾に移しました。

日本語版READMEはこちら

For easier use (GUI and PowerShell scripts etc...), please visit the repository maintained by bmaltais. Thanks to @bmaltais!

This repository contains the scripts for:

  • DreamBooth training, including U-Net and Text Encoder
  • Fine-tuning (native training), including U-Net and Text Encoder
  • LoRA training
  • Textual Inversion training
  • Image generation
  • Model conversion (supports 1.x and 2.x, Stable Diffusion ckpt/safetensors and Diffusers)

About requirements.txt

The file does not include PyTorch, because the required PyTorch version depends on your environment. Please install PyTorch first for your environment; see the installation instructions below.

The scripts are tested with PyTorch 2.1.2. Versions 2.0.1 and 1.12.1 are not tested but should work.

Links to usage documentation

Most of the documents are written in Japanese.

English translation by darkstorm2150 is here. Thanks to darkstorm2150!

Windows Required Dependencies

Python 3.10.6 and Git:

Give unrestricted script access to PowerShell so venv can work:

  • Open an administrator PowerShell window
  • Type Set-ExecutionPolicy Unrestricted and answer A
  • Close the admin PowerShell window

Windows Installation

Open a regular PowerShell terminal and type the following inside:

git clone https://github.com/kohya-ss/sd-scripts.git
cd sd-scripts

python -m venv venv
.\venv\Scripts\activate

pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu118
pip install --upgrade -r requirements.txt
pip install xformers==0.0.23.post1 --index-url https://download.pytorch.org/whl/cu118

accelerate config

If python -m venv shows only python, change python to py.

Note: bitsandbytes==0.43.0, prodigyopt==1.0 and lion-pytorch==0.0.6 are now included in requirements.txt. If you'd like to use another version, please install it manually.

This installation is for CUDA 11.8. If you use a different version of CUDA, please install the matching builds of PyTorch and xformers. For example, for CUDA 12.1, use the cu121 index instead:
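
pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu121
pip install xformers==0.0.23.post1 --index-url https://download.pytorch.org/whl/cu121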

Answers to accelerate config:

- This machine
- No distributed training
- NO
- NO
- NO
- all
- fp16

If you'd like to use bf16, please answer bf16 to the last question.

Note: Some users report that ValueError: fp16 mixed precision requires a GPU occurs during training. In this case, answer 0 for the 6th question: What GPU(s) (by id) should be used for training on this machine as a comma-separated list? [all]:

(Single GPU with id 0 will be used.)

Upgrade

When a new release comes out you can upgrade your repo with the following command:

cd sd-scripts
git pull
.\venv\Scripts\activate
pip install --use-pep517 --upgrade -r requirements.txt

Once the commands have completed successfully you should be ready to use the new version.

Upgrade PyTorch

If you want to upgrade PyTorch, upgrade it with the pip install command shown in the Windows Installation section. xformers must also be upgraded whenever PyTorch is upgraded.
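
For example, inside the activated venv, re-running the install commands (here for CUDA 11.8, as in the Windows Installation section) brings both packages to the tested versions:

.\venv\Scripts\activate
pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu118
pip install xformers==0.0.23.post1 --index-url https://download.pytorch.org/whl/cu118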

Credits

The LoRA implementation is based on cloneofsimo's repo. Thank you for the great work!

The LoRA expansion to Conv2d 3x3 was initially released by cloneofsimo and its effectiveness was demonstrated at LoCon by KohakuBlueleaf. Thank you so much KohakuBlueleaf!

License

The majority of the scripts are licensed under ASL 2.0 (including code from Diffusers, cloneofsimo's repo, and LoCon); however, portions of the project are available under separate license terms:

Memory Efficient Attention Pytorch: MIT

bitsandbytes: MIT

BLIP: BSD-3-Clause

Change History

Apr 7, 2024 / 2024-04-07: v0.8.7

  • The default value of huber_schedule in Scheduled Huber Loss is changed from exponential to snr, which is expected to give better results.

  • Scheduled Huber Loss の huber_schedule のデフォルト値を exponential から、より良い結果が期待できる snr に変更しました。

Apr 7, 2024 / 2024-04-07: v0.8.6

Highlights

  • The dependent libraries are updated. Please see Upgrade and update the libraries.
    • Especially imagesize is newly added, so if you cannot update the libraries immediately, please install with pip install imagesize==1.4.1 separately.
    • bitsandbytes==0.43.0, prodigyopt==1.0, lion-pytorch==0.0.6 are included in the requirements.txt.
      • bitsandbytes no longer requires complex procedures as it now officially supports Windows.
    • Also, the PyTorch version is updated to 2.1.2 (PyTorch does not need to be updated immediately). In the upgrade procedure, PyTorch is not updated, so please manually install or update torch, torchvision, xformers if necessary (see Upgrade PyTorch).
  • When logging to wandb is enabled, the entire command line is exposed. Therefore, it is recommended to write the wandb API key and HuggingFace token in the configuration file (.toml) instead; a minimal sketch is shown after this list. Thanks to bghira for raising the issue.
    • A warning is displayed at the start of training if such information is included in the command line.
    • Also, if there is an absolute path, the path may be exposed, so it is recommended to specify a relative path or write it in the configuration file. In such cases, an INFO log is displayed.
    • See #1123 and PR #1240 for details.
  • Colab seems to stop when outputting logs. Try specifying the --console_log_simple option in the training script to disable rich logging.
  • Other improvements include the addition of masked loss, scheduled Huber Loss, DeepSpeed support, dataset settings improvements, and image tagging improvements. See below for details.
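
As a minimal sketch of the wandb recommendation above: secrets can be moved out of the command line and into the training configuration file, which is then passed with --config_file. The key names below (log_with, wandb_api_key, huggingface_token) are assumed to mirror the corresponding command-line options; please check the script's --help for the exact names in your version.

# config.toml (hypothetical fragment), passed as: accelerate launch train_network.py --config_file config.toml ...
log_with = "wandb"                      # assumed option name
wandb_api_key = "xxxxxxxxxxxxxxxx"      # assumed option name; keeps the key off the command line and out of wandb logs
huggingface_token = "hf_xxxxxxxxxxxx"   # assumed option name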

Training scripts

  • train_network.py and sdxl_train_network.py are modified to record some dataset settings in the metadata of the trained model (caption_prefix, caption_suffix, keep_tokens_separator, secondary_separator, enable_wildcard).
  • Fixed a bug that U-Net and Text Encoders are included in the state in train_network.py and sdxl_train_network.py. The saving and loading of the state are faster, the file size is smaller, and the memory usage when loading is reduced.
  • DeepSpeed is supported. PR #1101 and #1139 Thanks to BootsofLagrangian! See PR #1101 for details.
  • The masked loss is supported in each training script. PR #1207 See Masked loss for details.
  • Scheduled Huber Loss has been introduced to each training script. PR #1228 Thanks to kabachuha for the PR, and to cheald, drhead, and others for the discussion! See the PR and Scheduled Huber Loss for details.
  • The options --noise_offset_random_strength and --ip_noise_gamma_random_strength are added to each training script. These options can be used to vary the noise offset and ip noise gamma in the range of 0 to the specified value. PR #1177 Thanks to KohakuBlueleaf!
  • The option --save_state_on_train_end is added to each training script (see the example command after this list). PR #1168 Thanks to gesen2egee!
  • The options --sample_every_n_epochs and --sample_every_n_steps in each training script now display a warning and are ignored when a number less than or equal to 0 is specified. Thanks to S-Del for raising the issue.
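
As a hedged illustration of how a few of the options above might be combined, the command below adds them to a typical train_network.py invocation; the base arguments (model path, dataset, output) are placeholders, and only the newly added options are the point here:

accelerate launch train_network.py --pretrained_model_name_or_path=model.safetensors --train_data_dir=train_images --output_dir=output --network_module=networks.lora --noise_offset=0.05 --noise_offset_random_strength --save_state_on_train_end --sample_every_n_epochs=1 --sample_prompts=prompt_file.txt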

Dataset settings

  • The English version of the dataset settings documentation is added. PR #1175 Thanks to darkstorm2150!
  • The .toml file for the dataset config is now read in UTF-8 encoding. PR #1167 Thanks to Horizon1704!
  • Fixed a bug that the last subset settings are applied to all images when multiple subsets of regularization images are specified in the dataset settings. The settings for each subset are correctly applied to each image. PR #1205 Thanks to feffy380!
  • Some features are added to the dataset subset settings.
    • secondary_separator is added to specify the tag separator that is not the target of shuffling or dropping.
      • Specify secondary_separator=";;;". When you specify secondary_separator, the part is not shuffled or dropped.
    • enable_wildcard is added. When set to true, the wildcard notation {aaa|bbb|ccc} can be used. The multi-line caption is also enabled.
    • keep_tokens_separator is updated to be used twice in the caption. When you specify keep_tokens_separator="|||", the part divided by the second ||| is not shuffled or dropped and remains at the end.
    • The existing features caption_prefix and caption_suffix can be used together. caption_prefix and caption_suffix are processed first, and then enable_wildcard, keep_tokens_separator, shuffling and dropping, and secondary_separator are processed in order.
    • See Dataset config for details; a minimal example subset config is sketched after this list.
  • The dataset with DreamBooth method supports caching image information (size, caption). PR #1178 and #1206 Thanks to KohakuBlueleaf! See DreamBooth method specific options for details.
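
As a minimal sketch of the new subset options above, a dataset config .toml might contain something like the following (directory names and values are illustrative; the [[datasets]]/[[datasets.subsets]] layout follows the existing dataset config format):

[[datasets]]
resolution = 768

  [[datasets.subsets]]
  image_dir = "train_images"
  caption_extension = ".txt"
  caption_prefix = "masterpiece, best quality, "
  keep_tokens_separator = "|||"
  secondary_separator = ";;;"
  enable_wildcard = true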

Image tagging

  • The support for v3 repositories is added to tag_image_by_wd14_tagger.py (--onnx option only). PR #1192 Thanks to sdbds!
    • Onnx may need to be updated. Onnx is not installed by default, so please install or update it with pip install onnx==1.15.0 onnxruntime-gpu==1.17.1 etc. Please also check the comments in requirements.txt.
  • The model is now saved in a subdirectory named after --repo_id in tag_image_by_wd14_tagger.py. This allows multiple repo_id models to be cached. Please delete unnecessary files under --model_dir.
  • Some options are added to tag_image_by_wd14_tagger.py (an example invocation follows this list).
    • Some are added in PR #1216 Thanks to Disty0!
    • Output rating tags --use_rating_tags and --use_rating_tags_as_last_tag
    • Output character tags first --character_tags_first
    • Expand character tags and series --character_tag_expand
    • Specify tags to output first --always_first_tags
    • Replace tags --tag_replacement
    • See Tagging documentation for details.
  • Fixed an error when specifying --beam_search and a value of 2 or more for --num_beams in make_captions.py.
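
As an illustrative (not definitive) invocation combining the options above; the repository id is just one example of a v3 tagger model, and the directory names are placeholders:

pip install onnx==1.15.0 onnxruntime-gpu==1.17.1
python tag_image_by_wd14_tagger.py --onnx --repo_id SmilingWolf/wd-swinv2-tagger-v3 --model_dir wd14_models --batch_size 4 --use_rating_tags --character_tags_first --caption_extension .txt train_images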

About Masked loss

The masked loss is supported in each training script. To enable the masked loss, specify the --masked_loss option.

The feature is not fully tested, so there may be bugs. If you find any issues, please open an Issue.

The ControlNet dataset is used to specify the mask. The mask images should be RGB images. A pixel value of 255 in the R channel is treated as masked (the loss is calculated only for the masked pixels), and 0 is treated as unmasked. Pixel values 0-255 are converted to 0-1 (i.e., a pixel value of 128 gives half the loss weight). See the LLLite documentation for the dataset specification.
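
A minimal sketch of enabling the feature, assuming the mask images are supplied through a ControlNet-style dataset config referenced by --dataset_config (the paths and the base training arguments are placeholders; see the LLLite documentation for the authoritative dataset specification):

# dataset_config.toml describes the ControlNet-style dataset that points at the mask images
accelerate launch train_network.py --dataset_config=dataset_config.toml --masked_loss --pretrained_model_name_or_path=model.safetensors --output_dir=output --network_module=networks.lora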

About Scheduled Huber Loss

Scheduled Huber Loss has been introduced to each training script. It is a method to improve robustness against outliers or anomalies (data corruption) in the training data.

With the traditional MSE (L2) loss function, the impact of outliers could be significant, potentially leading to a degradation in the quality of generated images. On the other hand, while the Huber loss function can suppress the influence of outliers, it tends to compromise the reproduction of fine details in images.

To address this, the proposed method employs a clever application of the Huber loss function. By scheduling the use of Huber loss in the early stages of training (when noise is high) and MSE in the later stages, it strikes a balance between outlier robustness and fine detail reproduction.

Experimental results have confirmed that this method achieves higher accuracy on data containing outliers compared to pure Huber loss or MSE. The increase in computational cost is minimal.

The newly added arguments loss_type, huber_schedule, and huber_c allow for the selection of the loss function type (Huber, smooth L1, MSE), scheduling method (exponential, constant, SNR), and Huber's parameter. This enables optimization based on the characteristics of the dataset.

See PR #1228 for details.

  • loss_type: Specify the loss function type. Choose huber for Huber loss, smooth_l1 for smooth L1 loss, and l2 for MSE loss. The default is l2, which is the same as before.
  • huber_schedule: Specify the scheduling method. Choose exponential, constant, or snr. The default is snr.
  • huber_c: Specify the Huber's parameter. The default is 0.1.
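
When trying this feature, a reasonable starting point suggested for this release is smooth L1 loss with the SNR schedule; appended to an existing training command it might look like this (the base arguments are placeholders):

accelerate launch train_network.py --pretrained_model_name_or_path=model.safetensors --train_data_dir=train_images --output_dir=output --network_module=networks.lora --loss_type smooth_l1 --huber_schedule snr --huber_c 0.1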

Please read Releases for recent updates.

主要な変更点

  • 依存ライブラリが更新されました。アップグレード を参照しライブラリを更新してください。
    • 特に imagesize が新しく追加されていますので、すぐにライブラリの更新ができない場合は pip install imagesize==1.4.1 で個別にインストールしてください。
    • bitsandbytes==0.43.0、prodigyopt==1.0、lion-pytorch==0.0.6 が requirements.txt に含まれるようになりました。
      • bitsandbytes が公式に Windows をサポートしたため複雑な手順が不要になりました。
    • また PyTorch のバージョンを 2.1.2 に更新しました。PyTorch はすぐに更新する必要はありません。更新時は、アップグレードの手順では PyTorch が更新されませんので、torch、torchvision、xformers を手動でインストールしてください。
  • wandb へのログ出力が有効の場合、コマンドライン全体が公開されます。そのため、コマンドラインに wandb の API キーや HuggingFace のトークンなどが含まれる場合、設定ファイル(.toml)への記載をお勧めします。問題提起していただいた bghira 氏に感謝します。
    • このような場合には学習開始時に警告が表示されます。
    • また絶対パスの指定がある場合、そのパスが公開される可能性がありますので、相対パスを指定するか設定ファイルに記載することをお勧めします。このような場合は INFO ログが表示されます。
    • 詳細は #1123 および PR #1240 をご覧ください。
  • Colab での動作時、ログ出力で停止してしまうようです。学習スクリプトに --console_log_simple オプションを指定し、rich のロギングを無効にしてお試しください。
  • その他、マスクロス追加、Scheduled Huber Loss 追加、DeepSpeed 対応、データセット設定の改善、画像タグ付けの改善などがあります。詳細は以下をご覧ください。

学習スクリプト

  • train_network.py および sdxl_train_network.py で、学習したモデルのメタデータに一部のデータセット設定が記録されるよう修正しました(caption_prefix、caption_suffix、keep_tokens_separator、secondary_separator、enable_wildcard)。
  • train_network.py および sdxl_train_network.py で、state に U-Net および Text Encoder が含まれる不具合を修正しました。state の保存、読み込みが高速化され、ファイルサイズも小さくなり、また読み込み時のメモリ使用量も削減されます。
  • DeepSpeed がサポートされました。PR #1101、#1139 BootsofLagrangian 氏に感謝します。詳細は PR #1101 をご覧ください。
  • 各学習スクリプトでマスクロスをサポートしました。PR #1207 詳細は マスクロスについて をご覧ください。
  • 各学習スクリプトに Scheduled Huber Loss を追加しました。PR #1228 ご提案いただいた kabachuha 氏、および議論を深めてくださった cheald 氏、drhead 氏を始めとする諸氏に感謝します。詳細は当該 PR および Scheduled Huber Loss について をご覧ください。
  • 各学習スクリプトに、noise offset、ip noise gammaを、それぞれ 0~指定した値の範囲で変動させるオプション --noise_offset_random_strength および --ip_noise_gamma_random_strength が追加されました。 PR #1177 KohakuBlueleaf 氏に感謝します。
  • 各学習スクリプトに、学習終了時に state を保存する --save_state_on_train_end オプションが追加されました。 PR #1168 gesen2egee 氏に感謝します。
  • 各学習スクリプトで --sample_every_n_epochs および --sample_every_n_steps オプションに 0 以下の数値を指定した時、警告を表示するとともにそれらを無視するよう変更しました。問題提起していただいた S-Del 氏に感謝します。

データセット設定

  • データセット設定の .toml ファイルが UTF-8 encoding で読み込まれるようになりました。PR #1167 Horizon1704 氏に感謝します。
  • データセット設定で、正則化画像のサブセットを複数指定した時、最後のサブセットの各種設定がすべてのサブセットの画像に適用される不具合が修正されました。それぞれのサブセットの設定が、それぞれの画像に正しく適用されます。PR #1205 feffy380 氏に感謝します。
  • データセットのサブセット設定にいくつかの機能を追加しました。
    • シャッフルの対象とならないタグ分割識別子の指定 secondary_separator を追加しました。secondary_separator=";;;" のように指定します。secondary_separator で区切ることで、その部分はシャッフル、drop 時にまとめて扱われます。
    • enable_wildcard を追加しました。true にするとワイルドカード記法 {aaa|bbb|ccc} が使えます。また複数行キャプションも有効になります。
    • keep_tokens_separator をキャプション内に 2 つ使えるようにしました。たとえば keep_tokens_separator="|||" と指定したとき、1girl, hatsune miku, vocaloid ||| stage, mic ||| best quality, rating: general とキャプションを指定すると、二番目の ||| で分割された部分はシャッフル、drop されず末尾に残ります。
    • 既存の機能 caption_prefix と caption_suffix とあわせて使えます。caption_prefix と caption_suffix は一番最初に処理され、その後、ワイルドカード、keep_tokens_separator、シャッフルおよび drop、secondary_separator の順に処理されます。
    • 詳細は データセット設定 をご覧ください。
  • DreamBooth 方式の DataSet で画像情報(サイズ、キャプション)をキャッシュする機能が追加されました。PR #1178、#1206 KohakuBlueleaf 氏に感謝します。詳細は データセット設定 をご覧ください。
  • データセット設定の英語版ドキュメント が追加されました。PR #1175 darkstorm2150 氏に感謝します。

画像のタグ付け

  • tag_image_by_wd14_tagger.py で v3 のリポジトリがサポートされました(--onnx 指定時のみ有効)。 PR #1192 sdbds 氏に感謝します。
    • Onnx のバージョンアップが必要になるかもしれません。デフォルトでは Onnx はインストールされていませんので、pip install onnx==1.15.0 onnxruntime-gpu==1.17.1 等でインストール、アップデートしてください。requirements.txt のコメントもあわせてご確認ください。
  • tag_image_by_wd14_tagger.py で、モデルを--repo_id のサブディレクトリに保存するようにしました。これにより複数のモデルファイルがキャッシュされます。--model_dir 直下の不要なファイルは削除願います。
  • tag_image_by_wd14_tagger.py にいくつかのオプションを追加しました。
    • 一部は PR #1216 で追加されました。Disty0 氏に感謝します。
    • レーティングタグを出力する --use_rating_tags および --use_rating_tags_as_last_tag
    • キャラクタタグを最初に出力する --character_tags_first
    • キャラクタタグとシリーズを展開する --character_tag_expand
    • 常に最初に出力するタグを指定する --always_first_tags
    • タグを置換する --tag_replacement
    • 詳細は タグ付けに関するドキュメント をご覧ください。
  • make_captions.py で --beam_search を指定し --num_beams に 2 以上の値を指定した時のエラーを修正しました。

マスクロスについて

各学習スクリプトでマスクロスをサポートしました。マスクロスを有効にするには --masked_loss オプションを指定してください。

機能は完全にテストされていないため、不具合があるかもしれません。その場合は Issue を立てていただけると助かります。

マスクの指定には ControlNet データセットを使用します。マスク画像は RGB 画像である必要があります。R チャンネルのピクセル値 255 がロス計算対象、0 がロス計算対象外になります。0-255 の値は、0-1 の範囲に変換されます(つまりピクセル値 128 の部分はロスの重みが半分になります)。データセットの詳細は LLLite ドキュメント をご覧ください。

Scheduled Huber Loss について

各学習スクリプトに、学習データ中の異常値や外れ値(data corruption)への耐性を高めるための手法、Scheduled Huber Lossが導入されました。

従来のMSE(L2)損失関数では、異常値の影響を大きく受けてしまい、生成画像の品質低下を招く恐れがありました。一方、Huber損失関数は異常値の影響を抑えられますが、画像の細部再現性が損なわれがちでした。

この手法ではHuber損失関数の適用を工夫し、学習の初期段階(ノイズが大きい場合)ではHuber損失を、後期段階ではMSEを用いるようスケジューリングすることで、異常値耐性と細部再現性のバランスを取ります。

実験の結果では、この手法が純粋なHuber損失やMSEと比べ、異常値を含むデータでより高い精度を達成することが確認されています。また計算コストの増加はわずかです。

具体的には、新たに追加された引数loss_type、huber_schedule、huber_cで、損失関数の種類(Huber, smooth L1, MSE)とスケジューリング方法(exponential, constant, SNR)を選択できます。これによりデータセットに応じた最適化が可能になります。

詳細は PR #1228 をご覧ください。

  • loss_type : 損失関数の種類を指定します。huber で Huber損失、smooth_l1 で smooth L1 損失、l2 で MSE 損失を選択します。デフォルトは l2 で、従来と同様です。
  • huber_schedule : スケジューリング方法を指定します。exponential で指数関数的、constant で一定、snr で信号対雑音比に基づくスケジューリングを選択します。デフォルトは snr です。
  • huber_c : Huber損失のパラメータを指定します。デフォルトは 0.1 です。

PR 内でいくつかの比較が共有されています。この機能を試す場合、最初は --loss_type smooth_l1 --huber_schedule snr --huber_c 0.1 などで試してみるとよいかもしれません。

最近の更新情報は Release をご覧ください。

Additional Information

Naming of LoRA

The LoRA supported by train_network.py has been named to avoid confusion. The documentation has been updated. The following are the names of LoRA types in this repository.

  1. LoRA-LierLa : LoRA for Linear Layers

    LoRA for Linear layers and Conv2d layers with 1x1 kernel

  2. LoRA-C3Lier : LoRA for Convolutional layers with 3x3 Kernel and Linear layers

    In addition to 1., LoRA for Conv2d layers with 3x3 kernel

LoRA-LierLa is the default LoRA type for train_network.py (when the conv_dim network arg is not specified).
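
As a hedged sketch of the difference: LoRA-C3Lier is selected by passing conv_dim (and usually conv_alpha) through --network_args, while omitting them gives LoRA-LierLa. The dimension values below are illustrative only:

# LoRA-LierLa (default): Linear and 1x1 Conv2d layers only
accelerate launch train_network.py --network_module=networks.lora --network_dim=16 --network_alpha=8 [other options]
# LoRA-C3Lier: additionally train 3x3 Conv2d layers
accelerate launch train_network.py --network_module=networks.lora --network_dim=16 --network_alpha=8 --network_args "conv_dim=8" "conv_alpha=4" [other options]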

Sample image generation during training

A prompt file might look like this, for example

# prompt 1
masterpiece, best quality, (1girl), in white shirts, upper body, looking at viewer, simple background --n low quality, worst quality, bad anatomy,bad composition, poor, low effort --w 768 --h 768 --d 1 --l 7.5 --s 28

# prompt 2
masterpiece, best quality, 1boy, in business suit, standing at street, looking back --n (low quality, worst quality), bad anatomy,bad composition, poor, low effort --w 576 --h 832 --d 2 --l 5.5 --s 40

Lines beginning with # are comments. You can specify options for the generated image with switches like --n after the prompt. The following can be used:

  • --n Negative prompt up to the next option.
  • --w Specifies the width of the generated image.
  • --h Specifies the height of the generated image.
  • --d Specifies the seed of the generated image.
  • --l Specifies the CFG scale of the generated image.
  • --s Specifies the number of steps in the generation.

Prompt weighting such as ( ) and [ ] works.
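
To generate these samples during training, the prompt file is passed to the training script. The option names below (--sample_prompts, --sample_every_n_steps, --sample_sampler) are believed to be the relevant ones, but check the script's --help for your version; the other arguments are placeholders:

accelerate launch train_network.py [other options] --sample_prompts=prompt_file.txt --sample_every_n_steps=200 --sample_sampler=k_euler_a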

sd-scripts's People

Contributors

ai-casanova, akx, bmaltais, bootsoflagrangian, breakcore2, cjangcjengh, ddpn08, dependabot[bot], disty0, fannovel16, feffy380, fireicewolf, hkingauditore, isotr0py, kohakublueleaf, kohya-ss, laksjdjf, linaqruf, mgz-dev, nameless1117, p1atdev, rockerboo, sdbds, shirayu, space-nuko, tingtingin, tomj2ee, tsukimiya, u-haru, xzuyn

sd-scripts's Issues

returned non-zero exit status 2 + no such file or directory ...PC\\AppData\\Local\\Programs\\Python\\Python310\\Scripts\\accelerate.exe\\__main__.py

I tried running the lora train popup script and got through the questions, but when it tried to run it gave me this error.
I think this is about as far as I have ever gotten, but I have gotten the "no such file or directory" error with "accelerate.exe\main.py" before, so whatever is causing that seems to be the issue. I did pip freeze and got a long list, which the rentry LoRA training page said was bad, so I followed the instructions, but I still get the same results even after reinstalling everything in the venv and moving over the bitsandbytes files. Honestly, at this point I'm stumped. I have tried the kohya-ss GUI LoRA script and still had no luck there because I think I had the same issue. Any help would be greatly appreciated. Thanks so much.

(venv) C:\SD-SCRIPTS\sd-scripts>accelerate launch --num_cpu_threads_per_process 11 lora_train_popup.py
usage: lora_train_popup.py [-h] [--v2] [--v_parameterization]
[--pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH]
[--train_data_dir TRAIN_DATA_DIR] [--shuffle_caption]
[--caption_extension CAPTION_EXTENSION] [--caption_extention CAPTION_EXTENTION]
[--keep_tokens KEEP_TOKENS] [--color_aug] [--flip_aug]
[--face_crop_aug_range FACE_CROP_AUG_RANGE] [--random_crop] [--debug_dataset]
[--resolution RESOLUTION] [--cache_latents] [--enable_bucket]
[--min_bucket_reso MIN_BUCKET_RESO] [--max_bucket_reso MAX_BUCKET_RESO]
[--reg_data_dir REG_DATA_DIR] [--in_json IN_JSON] [--dataset_repeats DATASET_REPEATS]
[--output_dir OUTPUT_DIR] [--output_name OUTPUT_NAME]
[--save_precision {None,float,fp16,bf16}] [--save_every_n_epochs SAVE_EVERY_N_EPOCHS]
[--save_last_n_epochs SAVE_LAST_N_EPOCHS] [--save_state] [--resume RESUME]
[--train_batch_size TRAIN_BATCH_SIZE] [--max_token_length {None,150,225}] [--use_8bit_adam]
[--mem_eff_attn] [--xformers] [--vae VAE] [--learning_rate LEARNING_RATE]
[--max_train_steps MAX_TRAIN_STEPS] [--max_train_epochs MAX_TRAIN_EPOCHS]
[--max_data_loader_n_workers MAX_DATA_LOADER_N_WORKERS] [--seed SEED]
[--gradient_checkpointing] [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
[--mixed_precision {no,fp16,bf16}] [--full_fp16] [--clip_skip CLIP_SKIP]
[--logging_dir LOGGING_DIR] [--log_prefix LOG_PREFIX] [--lr_scheduler LR_SCHEDULER]
[--lr_warmup_steps LR_WARMUP_STEPS] [--prior_loss_weight PRIOR_LOSS_WEIGHT] [--no_metadata]
[--save_model_as {None,ckpt,pt,safetensors}] [--unet_lr UNET_LR]
[--text_encoder_lr TEXT_ENCODER_LR] [--network_weights NETWORK_WEIGHTS]
[--network_module NETWORK_MODULE] [--network_dim NETWORK_DIM]
[--network_args [NETWORK_ARGS ...]] [--network_train_unet_only]
[--network_train_text_encoder_only]
lora_train_popup.py: error: argument --max_token_length: invalid choice: 75 (choose from None, 150, 225)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\My PC\AppData\Local\Programs\Python\Python310\lib\runpy.py:196 in │
│ _run_module_as_main │
│ │
│ 193 │ main_globals = sys.modules["__main__"].__dict__
│ 194 │ if alter_argv: │
│ 195 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 196 │ return _run_code(code, main_globals, None, │
│ 197 │ │ │ │ │ "__main__", mod_spec) │
│ 198 │
│ 199 def run_module(mod_name, init_globals=None, │
│ │
│ C:\Users\My PC\AppData\Local\Programs\Python\Python310\lib\runpy.py:86 in _run_code │
│ │
│ 83 │ │ │ │ │ loader = loader, │
│ 84 │ │ │ │ │ package = pkg_name, │
│ 85 │ │ │ │ │ spec = mod_spec) │
│ ❱ 86 │ exec(code, run_globals) │
│ 87 │ return run_globals │
│ 88 │
│ 89 def run_module_code(code, init_globals=None, │
│ │
│ C:\Users\My PC\AppData\Local\Programs\Python\Python310\Scripts\accelerate.exe\__main__.py:7 │
│ in <module> │
│ │
│ [Errno 2] No such file or directory: "C:\Users\My │
│ PC\AppData\Local\Programs\Python\Python310\Scripts\accelerate.exe\__main__.py" │
│ │
│ C:\Users\My │
│ PC\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\accelerate_cli. │
│ py:45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ C:\Users\My │
│ PC\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py:1104 │
│ in launch_command │
│ │
│ 1101 │ elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA │
│ 1102 │ │ sagemaker_launcher(defaults, args) │
│ 1103 │ else: │
│ ❱ 1104 │ │ simple_launcher(args) │
│ 1105 │
│ 1106 │
│ 1107 def main(): │
│ │
│ C:\Users\My │
│ PC\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py:567 │
│ in simple_launcher │
│ │
│ 564 │ process = subprocess.Popen(cmd, env=current_env) │
│ 565 │ process.wait() │
│ 566 │ if process.returncode != 0: │
│ ❱ 567 │ │ raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) │
│ 568 │
│ 569 │
│ 570 def multi_gpu_launcher(args): │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['C:\Users\My PC\AppData\Local\Programs\Python\Python310\python.exe',
'lora_train_popup.py']' returned non-zero exit status 2.

LORA caption training: extremely long pauses between epochs

For some reason there is a large delay when epochs change, making training much slower. What could cause this?
My settings:
accelerate launch --num_cpu_threads_per_process 10 train_network.py --pretrained_model_name_or_path=B:\AIimages\stable-diffusion-webui\models\Stable-diffusion\model.ckpt --train_data_dir=B:\AIimages\training\data --output_dir=B:\train\out\ --in_json=B:\AIimages\training\data\meta_lat.json --resolution=512,512 --prior_loss_weight=1.0 --train_batch_size=4 --learning_rate=1e-3 --max_train_steps=15000 --use_8bit_adam --xformers --gradient_checkpointing --mixed_precision=fp16 --save_every_n_epochs=10 --network_module=networks.lora --shuffle_caption --unet_lr=3e-4 --text_encoder_lr=3e-5 --lr_scheduler=constant --save_model_as=safetensors --seed=115

Feature Request - Flash Attention

Hi!

It would be great to see some support for flash-attention (flash-attn on pip), as you already support xformers.
https://github.com/HazyResearch/flash-attention

My understanding from using it on some other projects is that it can lower the VRAM requirements a little below xformers.
For example the sd_dreambooth extension supports flash attention (d8ahazard/sd_dreambooth_extension#283)
It would be useful for people who are unable to get xformers to work at all for training (for example, I can use it to generate images without issues, but when training with xformers CUDA errors out).

Thanks!

Feature Request - Interrupt and Image Generation previews.

Much like the stable diffusion A1111 repo and the other dreambooth scripts out there, I would like to recommend adding an Interrupt function and support for image generation at each epoch or every N steps.

Features

Generate an image preview of all concepts every X steps or epochs (allows the user to decide when the model is fully trained and prevents over-training)
Allow interrupting / saving at the current step
Allow save state to support saving every X steps (this could be set to the same amount as the preview generation)

Toggle between steps / epochs for save states and image generation

I have occasionally had my PC crash during training, or automatic Windows updates run while I was away from the PC, so being able to control how often saving happens in steps would be useful for me, because the end of an epoch is often too far away in steps to reach before something bad happens.

Can somebody modify this to run on Colab?

I don't have a decent video card for training. If someone can modify this to run on Colab, it would be much appreciated.
Or can I run this entirely on the CPU, without video card support?

Saving training states at different intervals than trained models

Hello,

I want my training to be resumable, but I don't want to create a state directory every time a model file is created ("save_every_n_epochs" parameter).

Is there currently a way to separate those two jobs?
Like a different parameter for state dir creation "save_state_every_n_epochs"?

Normally for training I would want only the last state to be saved. So setting "save_state_every_n_epochs" to something high like 9999 should save only the last state (basically the same behavior as now with models and the "save_every_n_epochs" parameter).

If there is currently no way to do it, would you consider implementing it?

P.S.
Also thanks a lot for creating a very fast and uncomplicated way to fine-tune models :)
LoRA training is amazingly fast with it.

[Question?] different result from gen_img_diffusers.py and AUTOMATIC1111 web ui

I don't think this is an issue with this repo, but I am curious why it shows different results with the same parameters. Any ideas or inputs?

AUTOMATIC1111 web ui

00363-3771183235-christmas Award winning beautiful portrait commission of a zwx supermodel with a beautiful hyperdetailed attractive outfit and f

gen_img_diffusers.py

im_20221225010148_000_3771183235

Here is the prompt:

python gen_img_diffusers.py --outdir ./images_output --xformers --fp16 --max_embeddings_multiples 1 --vae stabilityai/sd-vae-ft-mse --prompt "christmas Award winning beautiful portrait commission of a zwx supermodel with a beautiful hyperdetailed attractive outfit and face wearing a golden red and green winter cozy outfit with red background and white snow falling around. character design by charlie bowater, ross tran, and makoto shinkai, detailed, inked, western comic book art --n ((((ugly)))), (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck)))" --ckpt= ./analog-20-supermodel-2800-zwx.ckpt --sampler k_euler_a --steps 80 --scale 8 --images_per_prompt 1 --seed 3771183235

I am using a custom model. You may try the same prompt on your models and see.

PS: This is @aivandroid from Twitter. Nice to see you here 🥳

Accelerator acting like my GPU is not present?

accelerator.py not detecting my GPU

File "E:\sd-scripts\train_network.py", line 1453, in
train(args)
File "E:\sd-scripts\train_network.py", line 1017, in train
accelerator = Accelerator(gradient_accumulation_steps=args.gradient_accumulation_steps, mixed_precision=args.mixed_precision,
File "E:\sd-scripts\venv\lib\site-packages\accelerate\accelerator.py", line 355, in init
raise ValueError(err.format(mode="fp16", requirement="a GPU"))
ValueError: fp16 mixed precision requires a GPU

(venv) PS E:\sd-scripts> python
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.current_device()
0
>>> torch.cuda.device(0)
<torch.cuda.device object at 0x000002DA22E42F50>
>>> torch.cuda.get_device_name(0)
'NVIDIA GeForce RTX 4090'

Add `accelerate` torch.compile() support for faster training on Pytorch 2.0

When selecting this in accelerate config:

Do you wish to optimize your script with torch dynamo?[yes/NO]:yes
---------------------------------------------------------------------------------------------------------Which dynamo backend would you like to use?
Please select a choice using the arrow or number keys, and selecting with enter
    eager
    aot_eager
 ➔  inductor
    nvfuser
    aot_nvfuser
    aot_cudagraphs
    ofi
    fx2trt
    onnxrt
    ipex

The LORA training script errors out with:

steps:   0%|                                                                    | 0/1600 [00:00<?, ?it/s]epoch 1/2
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/output_graph.py:674 in             │
│ call_user_compiler                                                                               │
│                                                                                                  │
│   671 │   │   │   elif config.DO_NOT_USE_legacy_non_fake_example_inputs:                         │
│   672 │   │   │   │   compiled_fn = compiler_fn(gm, self.example_inputs())                       │
│   673 │   │   │   else:                                                                          │
│ ❱ 674 │   │   │   │   compiled_fn = compiler_fn(gm, self.fake_example_inputs())                  │
│   675 │   │   │   _step_logger()(logging.INFO, f"done compiler function {name}")                 │
│   676 │   │   │   assert callable(compiled_fn), "compiler_fn did not return callable"            │
│   677 │   │   except Exception as e:                                                             │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/debug_utils.py:1032 in             │
│ debug_wrapper                                                                                    │
│                                                                                                  │
│   1029 │   │   │   │   │   )                                                                     │
│   1030 │   │   │   │   │   raise                                                                 │
│   1031 │   │   else:                                                                             │
│ ❱ 1032 │   │   │   compiled_gm = compiler_fn(gm, example_inputs, **kwargs)                       │
│   1033 │   │                                                                                     │
│   1034 │   │   return compiled_gm                                                                │
│   1035                                                                                           │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_inductor/compile_fx.py:398 in compile_fx  │
│                                                                                                  │
│   395 │   │   # TODO: can add logging before/after the call to create_aot_dispatcher_function    │
│   396 │   │   # in torch._functorch/aot_autograd.py::aot_module_simplified::aot_function_simpl   │
│   397 │   │   # once torchdynamo is merged into pytorch                                          │
│ ❱ 398 │   │   return aot_autograd(                                                               │
│   399 │   │   │   fw_compiler=fw_compiler,                                                       │
│   400 │   │   │   bw_compiler=bw_compiler,                                                       │
│   401 │   │   │   decompositions=select_decomp_table(),                                          │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/optimizations/training.py:78 in    │
│ compiler_fn                                                                                      │
│                                                                                                  │
│    75 │   │   try:                                                                               │
│    76 │   │   │   # NB: NOT cloned!                                                              │
│    77 │   │   │   with enable_aot_logging():                                                     │
│ ❱  78 │   │   │   │   cg = aot_module_simplified(gm, example_inputs, **kwargs)                   │
│    79 │   │   │   │   counters["aot_autograd"]["ok"] += 1                                        │
│    80 │   │   │   │   return eval_frame.disable(cg)                                              │
│    81 │   │   except Exception:                                                                  │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py:2355 in         │
│ aot_module_simplified                                                                            │
│                                                                                                  │
│   2352 │   full_args.extend(params_flat)                                                         │
│   2353 │   full_args.extend(args)                                                                │
│   2354 │                                                                                         │
│ ❱ 2355 │   compiled_fn = create_aot_dispatcher_function(                                         │
│   2356 │   │   functional_call,                                                                  │
│   2357 │   │   full_args,                                                                        │
│   2358 │   │   aot_config,                                                                       │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/utils.py:94 in time_wrapper        │
│                                                                                                  │
│     91 │   │   if key not in compilation_metrics:                                                │
│     92 │   │   │   compilation_metrics[key] = []                                                 │
│     93 │   │   t0 = time.time()                                                                  │
│ ❱   94 │   │   r = func(*args, **kwargs)                                                         │
│     95 │   │   latency = time.time() - t0                                                        │
│     96 │   │   # print(f"Dynamo timer: key={key}, latency={latency:.2f} sec")                    │
│     97 │   │   compilation_metrics[key].append(latency)                                          │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py:2052 in         │
│ create_aot_dispatcher_function                                                                   │
│                                                                                                  │
│   2049 │   │   compiler_fn = partial(aot_wrapper_dedupe, compiler_fn=compiler_fn)                │
│   2050 │   │   # You can put more passes here                                                    │
│   2051 │   │                                                                                     │
│ ❱ 2052 │   │   compiled_fn = compiler_fn(flat_fn, fake_flat_tensor_args, aot_config)             │
│   2053 │   │                                                                                     │
│   2054 │   │   if not hasattr(compiled_fn, '_boxed_call'):                                       │
│   2055 │   │   │   compiled_fn = make_boxed_func(compiled_fn)                                    │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py:1273 in         │
│ aot_wrapper_dedupe                                                                               │
│                                                                                                  │
│   1270 │   # or not                                                                              │
│   1271 │   try:                                                                                  │
│   1272 │   │   with enable_python_dispatcher():                                                  │
│ ❱ 1273 │   │   │   fw_metadata, _out, _num_aliasing_metadata_outs = run_functionalized_fw_and_c  │
│   1274 │   │   │   │   flat_fn                                                                   │
│   1275 │   │   │   )(*flat_args)                                                                 │
│   1276 │   except RuntimeError as e:                                                             │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py:289 in inner    │
│                                                                                                  │
│    286 │   │                                                                                     │
│    287 │   │   torch._enable_functionalization(reapply_views=True)                               │
│    288 │   │   try:                                                                              │
│ ❱  289 │   │   │   outs = f(*f_args)                                                             │
│    290 │   │   finally:                                                                          │
│    291 │   │   │   torch._disable_functionalization()                                            │
│    292                                                                                           │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py:2327 in         │
│ functional_call                                                                                  │
│                                                                                                  │
│   2324 │   │   │   │   │   │   "ignore", "Anomaly Detection has been enabled."                   │
│   2325 │   │   │   │   │   )                                                                     │
│   2326 │   │   │   │   │   with torch.autograd.detect_anomaly(check_nan=False):                  │
│ ❱ 2327 │   │   │   │   │   │   out = Interpreter(mod).run(*args[params_len:], **kwargs)          │
│   2328 │   │   │   else:                                                                         │
│   2329 │   │   │   │   out = mod(*args[params_len:], **kwargs)                                   │
│   2330                                                                                           │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/fx/interpreter.py:136 in run               │
│                                                                                                  │
│   133 │   │   │   │   continue                                                                   │
│   134 │   │   │                                                                                  │
│   135 │   │   │   try:                                                                           │
│ ❱ 136 │   │   │   │   self.env[node] = self.run_node(node)                                       │
│   137 │   │   │   except Exception as e:                                                         │
│   138 │   │   │   │   msg = f"While executing {node.format_node()}"                              │
│   139 │   │   │   │   msg = '{}\n\n{}'.format(e.args[0], msg) if e.args else str(msg)            │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/fx/interpreter.py:177 in run_node          │
│                                                                                                  │
│   174 │   │   │   args, kwargs = self.fetch_args_kwargs_from_env(n)                              │
│   175 │   │   │   assert isinstance(args, tuple)                                                 │
│   176 │   │   │   assert isinstance(kwargs, dict)                                                │
│ ❱ 177 │   │   │   return getattr(self, n.op)(n.target, args, kwargs)                             │
│   178 │                                                                                          │
│   179 │   # Main Node running APIs                                                               │
│   180 │   @compatibility(is_backward_compatible=True)                                            │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/fx/interpreter.py:294 in call_module       │
│                                                                                                  │
│   291 │   │   assert isinstance(target, str)                                                     │
│   292 │   │   submod = self.fetch_attr(target)                                                   │
│   293 │   │                                                                                      │
│ ❱ 294 │   │   return submod(*args, **kwargs)                                                     │
│   295 │                                                                                          │
│   296 │   @compatibility(is_backward_compatible=True)                                            │
│   297 │   def output(self, target : 'Target', args : Tuple[Argument, ...], kwargs : Dict[str,    │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1482 in _call_impl    │
│                                                                                                  │
│   1479 │   │   if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks   │
│   1480 │   │   │   │   or _global_backward_pre_hooks or _global_backward_hooks                   │
│   1481 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                   │
│ ❱ 1482 │   │   │   return forward_call(*args, **kwargs)                                          │
│   1483 │   │   # Do not call functions when jit is used                                          │
│   1484 │   │   full_backward_hooks, non_full_backward_hooks = [], []                             │
│   1485 │   │   backward_pre_hooks = []                                                           │
│                                                                                                  │
│ /home/alpha/clone/sd-scripts/networks/lora.py:44 in forward                                      │
│                                                                                                  │
│    41 │   del self.org_module                                                                    │
│    42                                                                                            │
│    43   def forward(self, x):                                                                    │
│ ❱  44 │   return self.org_forward(x) + self.lora_up(self.lora_down(x)) * self.multiplier         │
│    45                                                                                            │
│    46                                                                                            │
│    47 def create_network(multiplier, network_dim, vae, text_encoder, unet, **kwargs):            │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1482 in _call_impl    │
│                                                                                                  │
│   1479 │   │   if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks   │
│   1480 │   │   │   │   or _global_backward_pre_hooks or _global_backward_hooks                   │
│   1481 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                   │
│ ❱ 1482 │   │   │   return forward_call(*args, **kwargs)                                          │
│   1483 │   │   # Do not call functions when jit is used                                          │
│   1484 │   │   full_backward_hooks, non_full_backward_hooks = [], []                             │
│   1485 │   │   backward_pre_hooks = []                                                           │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/nn/modules/linear.py:114 in forward        │
│                                                                                                  │
│   111 │   │   │   init.uniform_(self.bias, -bound, bound)                                        │
│   112 │                                                                                          │
│   113 │   def forward(self, input: Tensor) -> Tensor:                                            │
│ ❱ 114 │   │   return F.linear(input, self.weight, self.bias)                                     │
│   115 │                                                                                          │
│   116 │   def extra_repr(self) -> str:                                                           │
│   117 │   │   return 'in_features={}, out_features={}, bias={}'.format(                          │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_inductor/overrides.py:37 in               │
│ __torch_function__                                                                               │
│                                                                                                  │
│    34 │   │   │   and replacements[func] in replacements_using_triton_random                     │
│    35 │   │   ):                                                                                 │
│    36 │   │   │   return replacements[func](*args, **kwargs)                                     │
│ ❱  37 │   │   return func(*args, **kwargs)                                                       │
│    38                                                                                            │
│    39                                                                                            │
│    40 patch_functions = AutogradMonkeypatch                                                      │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py:825 in          │
│ __torch_dispatch__                                                                               │
│                                                                                                  │
│    822 │   │   │   ), f"{args} {kwargs}"                                                         │
│    823 │   │   │   return converter(self, args[0])                                               │
│    824 │   │                                                                                     │
│ ❱  825 │   │   args, kwargs = self.validate_and_convert_non_fake_tensors(                        │
│    826 │   │   │   func, converter, args, kwargs                                                 │
│    827 │   │   )                                                                                 │
│    828                                                                                           │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py:973 in          │
│ validate_and_convert_non_fake_tensors                                                            │
│                                                                                                  │
│    970 │   │   │   │   return converter(self, x)                                                 │
│    971 │   │   │   return x                                                                      │
│    972 │   │                                                                                     │
│ ❱  973 │   │   return tree_map_only(                                                             │
│    974 │   │   │   torch.Tensor,                                                                 │
│    975 │   │   │   validate,                                                                     │
│    976 │   │   │   (args, kwargs),                                                               │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/utils/_pytree.py:259 in tree_map_only      │
│                                                                                                  │
│   256 │   ...                                                                                    │
│   257                                                                                            │
│   258 def tree_map_only(ty: TypeAny, fn: FnAny[Any], pytree: PyTree) -> PyTree:                  │
│ ❱ 259 │   return tree_map(map_only(ty)(fn), pytree)                                              │
│   260                                                                                            │
│   261 def tree_all(pred: Callable[[Any], bool], pytree: PyTree) -> bool:                         │
│   262 │   flat_args, _ = tree_flatten(pytree)                                                    │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/utils/_pytree.py:195 in tree_map           │
│                                                                                                  │
│   192                                                                                            │
│   193 def tree_map(fn: Any, pytree: PyTree) -> PyTree:                                           │
│   194 │   flat_args, spec = tree_flatten(pytree)                                                 │
│ ❱ 195 │   return tree_unflatten([fn(i) for i in flat_args], spec)                                │
│   196                                                                                            │
│   197 Type2 = Tuple[Type[T], Type[S]]                                                            │
│   198 TypeAny = Union[Type[Any], Tuple[Type[Any], ...]]                                          │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/utils/_pytree.py:195 in <listcomp>         │
│                                                                                                  │
│   192                                                                                            │
│   193 def tree_map(fn: Any, pytree: PyTree) -> PyTree:                                           │
│   194 │   flat_args, spec = tree_flatten(pytree)                                                 │
│ ❱ 195 │   return tree_unflatten([fn(i) for i in flat_args], spec)                                │
│   196                                                                                            │
│   197 Type2 = Tuple[Type[T], Type[S]]                                                            │
│   198 TypeAny = Union[Type[Any], Tuple[Type[Any], ...]]                                          │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/utils/_pytree.py:244 in inner              │
│                                                                                                  │
│   241 │   │   @functools.wraps(f)                                                                │
│   242 │   │   def inner(x: T) -> Any:                                                            │
│   243 │   │   │   if isinstance(x, ty):                                                          │
│ ❱ 244 │   │   │   │   return f(x)                                                                │
│   245 │   │   │   else:                                                                          │
│   246 │   │   │   │   return x                                                                   │
│   247 │   │   return inner                                                                       │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py:965 in validate │
│                                                                                                  │
│    962 │   │   │   │   │   │   f"Can't call metadata mutating ops on non-Fake Tensor inputs. Fo  │
│    963 │   │   │   │   │   )                                                                     │
│    964 │   │   │   │   if not self.allow_non_fake_inputs:                                        │
│ ❱  965 │   │   │   │   │   raise Exception(                                                      │
│    966 │   │   │   │   │   │   f"Please convert all Tensors to FakeTensors first or instantiate  │
│    967 │   │   │   │   │   │   f"with 'allow_non_fake_inputs'. Found in {func}(*{args}, **{kwar  │
│    968 │   │   │   │   │   )                                                                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with
'allow_non_fake_inputs'. Found in aten._to_copy.default(*(Parameter containing:
tensor([[ 0.0292,  0.0266,  0.0296,  ...,  0.0353, -0.0317, -0.0230],
        [ 0.0112, -0.0135,  0.0291,  ..., -0.0087,  0.0124,  0.0297],
        [-0.0299,  0.0291, -0.0143,  ..., -0.0097,  0.0106, -0.0191],
        [-0.0344, -0.0083,  0.0227,  ...,  0.0093,  0.0345, -0.0343]],
       device='cuda:0', requires_grad=True),), **{'dtype': torch.float16})

While executing %self_text_model_encoder_layers_0_self_attn_q_proj : [#users=1] =
call_module[target=self_text_model_encoder_layers_0_self_attn_q_proj](args =
(%self_text_model_encoder_layers_0_layer_norm1,), kwargs = {})
Original traceback:
  File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line
209, in forward
    query_states = self.q_proj(hidden_states) * self.scale
 |   File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py",
line 317, in forward
    hidden_states, attn_weights = self.self_attn(
 |   File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py",
line 574, in forward
    layer_outputs = encoder_layer(
 |   File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py",
line 643, in forward
    encoder_outputs = self.encoder(
 |   File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py",
line 722, in forward
    return self.text_model(


The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/alpha/clone/sd-scripts/train_network.py:419 in <module>                                    │
│                                                                                                  │
│   416 │   │   │   │   │     help="only training Text Encoder part / Text Encoder関連部分のみ学   │
│   417                                                                                            │
│   418   args = parser.parse_args()                                                               │
│ ❱ 419   train(args)                                                                              │
│   420                                                                                            │
│                                                                                                  │
│ /home/alpha/clone/sd-scripts/train_network.py:283 in train                                       │
│                                                                                                  │
│   280 │   │   with torch.set_grad_enabled(train_text_encoder):                                   │
│   281 │   │     # Get the text embedding for conditioning                                        │
│   282 │   │     input_ids = batch["input_ids"].to(accelerator.device)                            │
│ ❱ 283 │   │     encoder_hidden_states = train_util.get_hidden_states(args, input_ids, tokenize   │
│   284 │   │                                                                                      │
│   285 │   │   # Sample noise that we'll add to the latents                                       │
│   286 │   │   noise = torch.randn_like(latents, device=latents.device)                           │
│                                                                                                  │
│ /home/alpha/clone/sd-scripts/library/train_util.py:1257 in get_hidden_states                     │
│                                                                                                  │
│   1254   if args.clip_skip is None:                                                              │
│   1255 │   encoder_hidden_states = text_encoder(input_ids)[0]                                    │
│   1256   else:                                                                                   │
│ ❱ 1257 │   enc_out = text_encoder(input_ids, output_hidden_states=True, return_dict=True)        │
│   1258 │   encoder_hidden_states = enc_out['hidden_states'][-args.clip_skip]                     │
│   1259 │   if weight_dtype is not None:                                                          │
│   1260 │     # this is required for additional network training                                  │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1482 in _call_impl    │
│                                                                                                  │
│   1479 │   │   if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks   │
│   1480 │   │   │   │   or _global_backward_pre_hooks or _global_backward_hooks                   │
│   1481 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                   │
│ ❱ 1482 │   │   │   return forward_call(*args, **kwargs)                                          │
│   1483 │   │   # Do not call functions when jit is used                                          │
│   1484 │   │   full_backward_hooks, non_full_backward_hooks = [], []                             │
│   1485 │   │   backward_pre_hooks = []                                                           │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/accelerate/utils/operations.py:490 in __call__   │
│                                                                                                  │
│   487 │   │   update_wrapper(self, model_forward)                                                │
│   488 │                                                                                          │
│   489 │   def __call__(self, *args, **kwargs):                                                   │
│ ❱ 490 │   │   return convert_to_fp32(self.model_forward(*args, **kwargs))                        │
│   491 │                                                                                          │
│   492 │   def __getstate__(self):                                                                │
│   493 │   │   raise pickle.PicklingError(                                                        │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/amp/autocast_mode.py:14 in                 │
│ decorate_autocast                                                                                │
│                                                                                                  │
│    11 │   @functools.wraps(func)                                                                 │
│    12 │   def decorate_autocast(*args, **kwargs):                                                │
│    13 │   │   with autocast_instance:                                                            │
│ ❱  14 │   │   │   return func(*args, **kwargs)                                                   │
│    15 │   decorate_autocast.__script_unsupported = '@autocast() decorator is not supported in    │
│    16 │   return decorate_autocast                                                               │
│    17                                                                                            │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py:83 in forward        │
│                                                                                                  │
│    80 │   │   return getattr(self._orig_mod, name)                                               │
│    81 │                                                                                          │
│    82 │   def forward(self, *args, **kwargs):                                                    │
│ ❱  83 │   │   return self.dynamo_ctx(self._orig_mod.forward)(*args, **kwargs)                    │
│    84                                                                                            │
│    85                                                                                            │
│    86 def remove_from_cache(f):                                                                  │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py:212 in _fn           │
│                                                                                                  │
│   209 │   │   │   dynamic_ctx = enable_dynamic(self.dynamic)                                     │
│   210 │   │   │   dynamic_ctx.__enter__()                                                        │
│   211 │   │   │   try:                                                                           │
│ ❱ 212 │   │   │   │   return fn(*args, **kwargs)                                                 │
│   213 │   │   │   finally:                                                                       │
│   214 │   │   │   │   set_eval_frame(prior)                                                      │
│   215 │   │   │   │   dynamic_ctx.__exit__(None, None, None)                                     │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py:333 in catch_errors  │
│                                                                                                  │
│   330 │   │   │   │   │   return hijacked_callback(frame, cache_size, hooks)                     │
│   331 │   │                                                                                      │
│   332 │   │   with compile_lock:                                                                 │
│ ❱ 333 │   │   │   return callback(frame, cache_size, hooks)                                      │
│   334 │                                                                                          │
│   335 │   catch_errors._torchdynamo_orig_callable = callback  # type: ignore[attr-defined]       │
│   336 │   return catch_errors                                                                    │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:480 in            │
│ _convert_frame                                                                                   │
│                                                                                                  │
│   477 │   def _convert_frame(frame: types.FrameType, cache_size: int, hooks: Hooks):             │
│   478 │   │   counters["frames"]["total"] += 1                                                   │
│   479 │   │   try:                                                                               │
│ ❱ 480 │   │   │   result = inner_convert(frame, cache_size, hooks)                               │
│   481 │   │   │   counters["frames"]["ok"] += 1                                                  │
│   482 │   │   │   return result                                                                  │
│   483 │   │   except (NotImplementedError, Unsupported):                                         │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:103 in _fn        │
│                                                                                                  │
│   100 │   │   prior_fwd_from_src = torch.fx.graph_module._forward_from_src                       │
│   101 │   │   torch.fx.graph_module._forward_from_src = fx_forward_from_src_skip_result          │
│   102 │   │   try:                                                                               │
│ ❱ 103 │   │   │   return fn(*args, **kwargs)                                                     │
│   104 │   │   finally:                                                                           │
│   105 │   │   │   torch._C._set_grad_enabled(prior_grad_mode)                                    │
│   106 │   │   │   torch.random.set_rng_state(rng_state)                                          │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/utils.py:94 in time_wrapper        │
│                                                                                                  │
│     91 │   │   if key not in compilation_metrics:                                                │
│     92 │   │   │   compilation_metrics[key] = []                                                 │
│     93 │   │   t0 = time.time()                                                                  │
│ ❱   94 │   │   r = func(*args, **kwargs)                                                         │
│     95 │   │   latency = time.time() - t0                                                        │
│     96 │   │   # print(f"Dynamo timer: key={key}, latency={latency:.2f} sec")                    │
│     97 │   │   compilation_metrics[key].append(latency)                                          │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:339 in            │
│ _convert_frame_assert                                                                            │
│                                                                                                  │
│   336 │   │   global initial_grad_state                                                          │
│   337 │   │   initial_grad_state = torch.is_grad_enabled()                                       │
│   338 │   │                                                                                      │
│ ❱ 339 │   │   return _compile(                                                                   │
│   340 │   │   │   frame.f_code,                                                                  │
│   341 │   │   │   frame.f_globals,                                                               │
│   342 │   │   │   frame.f_locals,                                                                │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:400 in _compile   │
│                                                                                                  │
│   397 │   try:                                                                                   │
│   398 │   │   for attempt in itertools.count():                                                  │
│   399 │   │   │   try:                                                                           │
│ ❱ 400 │   │   │   │   out_code = transform_code_object(code, transform)                          │
│   401 │   │   │   │   orig_code_map[out_code] = code                                             │
│   402 │   │   │   │   break                                                                      │
│   403 │   │   │   except exc.RestartAnalysis:                                                    │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py:341 in  │
│ transform_code_object                                                                            │
│                                                                                                  │
│   338 │   instructions = cleaned_instructions(code, safe)                                        │
│   339 │   propagate_line_nums(instructions)                                                      │
│   340 │                                                                                          │
│ ❱ 341 │   transformations(instructions, code_options)                                            │
│   342 │                                                                                          │
│   343 │   fix_vars(instructions, code_options)                                                   │
│   344                                                                                            │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:387 in transform  │
│                                                                                                  │
│   384 │   │   │   export,                                                                        │
│   385 │   │   │   mutated_closure_cell_contents,                                                 │
│   386 │   │   )                                                                                  │
│ ❱ 387 │   │   tracer.run()                                                                       │
│   388 │   │   output = tracer.output                                                             │
│   389 │   │   assert output is not None                                                          │
│   390 │   │   assert output.output_instructions                                                  │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py:1692 in run    │
│                                                                                                  │
│   1689 │                                                                                         │
│   1690 │   def run(self):                                                                        │
│   1691 │   │   _step_logger()(logging.INFO, f"torchdynamo start tracing {self.f_code.co_name}")  │
│ ❱ 1692 │   │   super().run()                                                                     │
│   1693 │                                                                                         │
│   1694 │   def match_nested_cell(self, name, cell):                                              │
│   1695 │   │   """Match a cell in this method to one in a function we are inlining"""            │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py:538 in run     │
│                                                                                                  │
│    535 │   │   │   while (                                                                       │
│    536 │   │   │   │   self.instruction_pointer is not None                                      │
│    537 │   │   │   │   and not self.output.should_exit                                           │
│ ❱  538 │   │   │   │   and self.step()                                                           │
│    539 │   │   │   ):                                                                            │
│    540 │   │   │   │   pass                                                                      │
│    541 │   │   except BackendCompilerFailed:                                                     │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py:501 in step    │
│                                                                                                  │
│    498 │   │   try:                                                                              │
│    499 │   │   │   if not hasattr(self, inst.opname):                                            │
│    500 │   │   │   │   unimplemented(f"missing: {inst.opname}")                                  │
│ ❱  501 │   │   │   getattr(self, inst.opname)(inst)                                              │
│    502 │   │   │                                                                                 │
│    503 │   │   │   return inst.opname != "RETURN_VALUE"                                          │
│    504 │   │   except BackendCompilerFailed:                                                     │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py:1758 in        │
│ RETURN_VALUE                                                                                     │
│                                                                                                  │
│   1755 │   │   │   f"torchdynamo done tracing {self.f_code.co_name} (RETURN_VALUE)",             │
│   1756 │   │   )                                                                                 │
│   1757 │   │   log.debug("RETURN_VALUE triggered compile")                                       │
│ ❱ 1758 │   │   self.output.compile_subgraph(self)                                                │
│   1759 │   │   self.output.add_output_instructions([create_instruction("RETURN_VALUE")])         │
│   1760                                                                                           │
│   1761                                                                                           │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/output_graph.py:551 in             │
│ compile_subgraph                                                                                 │
│                                                                                                  │
│   548 │   │   │   output = []                                                                    │
│   549 │   │   │   if count_calls(self.graph) != 0 or len(pass2.graph_outputs) != 0:              │
│   550 │   │   │   │   output.extend(                                                             │
│ ❱ 551 │   │   │   │   │   self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)    │
│   552 │   │   │   │   )                                                                          │
│   553 │   │   │   │                                                                              │
│   554 │   │   │   │   if len(pass2.graph_outputs) != 0:                                          │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/output_graph.py:598 in             │
│ compile_and_call_fx_graph                                                                        │
│                                                                                                  │
│   595 │   │                                                                                      │
│   596 │   │   assert_no_fake_params_or_buffers(gm)                                               │
│   597 │   │   with tracing(self.tracing_context):                                                │
│ ❱ 598 │   │   │   compiled_fn = self.call_user_compiler(gm)                                      │
│   599 │   │   compiled_fn = disable(compiled_fn)                                                 │
│   600 │   │                                                                                      │
│   601 │   │   counters["stats"]["unique_graphs"] += 1                                            │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/torch/_dynamo/output_graph.py:679 in             │
│ call_user_compiler                                                                               │
│                                                                                                  │
│   676 │   │   │   assert callable(compiled_fn), "compiler_fn did not return callable"            │
│   677 │   │   except Exception as e:                                                             │
│   678 │   │   │   compiled_fn = gm.forward                                                       │
│ ❱ 679 │   │   │   raise BackendCompilerFailed(self.compiler_fn, e) from e                        │
│   680 │   │   return compiled_fn                                                                 │
│   681 │                                                                                          │
│   682 │   def fake_example_inputs(self) -> List[torch.Tensor]:                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
BackendCompilerFailed: compile_fx raised Exception: Please convert all Tensors to FakeTensors first or
instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten._to_copy.default(*(Parameter
containing:
tensor([[ 0.0292,  0.0266,  0.0296,  ...,  0.0353, -0.0317, -0.0230],
        [ 0.0112, -0.0135,  0.0291,  ..., -0.0087,  0.0124,  0.0297],
        [-0.0299,  0.0291, -0.0143,  ..., -0.0097,  0.0106, -0.0191],
        [-0.0344, -0.0083,  0.0227,  ...,  0.0093,  0.0345, -0.0343]],
       device='cuda:0', requires_grad=True),), **{'dtype': torch.float16})

While executing %self_text_model_encoder_layers_0_self_attn_q_proj : [#users=1] =
call_module[target=self_text_model_encoder_layers_0_self_attn_q_proj](args =
(%self_text_model_encoder_layers_0_layer_norm1,), kwargs = {})
Original traceback:
  File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line
209, in forward
    query_states = self.q_proj(hidden_states) * self.scale
 |   File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py",
line 317, in forward
    hidden_states, attn_weights = self.self_attn(
 |   File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py",
line 574, in forward
    layer_outputs = encoder_layer(
 |   File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py",
line 643, in forward
    encoder_outputs = self.encoder(
 |   File "/home/alpha/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py",
line 722, in forward
    return self.text_model(


Set torch._dynamo.config.verbose=True for more information


You can suppress this exception and fall back to eager by setting:
    torch._dynamo.config.suppress_errors = True

steps:   0%|                                                                    | 0/1600 [00:03<?, ?it/s]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/alpha/.local/bin/accelerate:8 in <module>                                                  │
│                                                                                                  │
│   5 from accelerate.commands.accelerate_cli import main                                          │
│   6 if __name__ == '__main__':                                                                   │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(main())                                                                         │
│   9                                                                                              │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py:45 in main │
│                                                                                                  │
│   42 │   │   exit(1)                                                                             │
│   43 │                                                                                           │
│   44 │   # Run                                                                                   │
│ ❱ 45 │   args.func(args)                                                                         │
│   46                                                                                             │
│   47                                                                                             │
│   48 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/accelerate/commands/launch.py:1104 in            │
│ launch_command                                                                                   │
│                                                                                                  │
│   1101 │   elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA  │
│   1102 │   │   sagemaker_launcher(defaults, args)                                                │
│   1103 │   else:                                                                                 │
│ ❱ 1104 │   │   simple_launcher(args)                                                             │
│   1105                                                                                           │
│   1106                                                                                           │
│   1107 def main():                                                                               │
│                                                                                                  │
│ /home/alpha/.local/lib/python3.10/site-packages/accelerate/commands/launch.py:567 in             │
│ simple_launcher                                                                                  │
│                                                                                                  │
│    564 │   process = subprocess.Popen(cmd, env=current_env)                                      │
│    565 │   process.wait()                                                                        │
│    566 │   if process.returncode != 0:                                                           │
│ ❱  567 │   │   raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)       │
│    568                                                                                           │
│    569                                                                                           │
│    570 def multi_gpu_launcher(args):                                                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/usr/bin/python', 'train_network.py',
'--pretrained_model_name_or_path=/home/alpha/Storage/AIModels/Stable-diffusion/panatomy05full_0.7-AIModel
s_Anything-V3.0-pruned-fp16_0.3-Weighted_sum-merged.ckpt',
'--train_data_dir=/home/alpha/Storage/TrainingData/test/training_data',
'--output_dir=/home/alpha/Storage/TrainingOutput/test/', '--prior_loss_weight=1.0',
'--resolution=512,512', '--train_batch_size=1', '--learning_rate=1e-5', '--max_train_steps=1600',
'--use_8bit_adam', '--xformers', '--mixed_precision=fp16', '--cache_latents', '--save_precision=fp16',
'--save_model_as=safetensors', '--clip_skip=2', '--network_module=networks.lora']' returned non-zero exit
status 1.

(The same arguments work fine with TorchDynamo disabled.)

Maybe torch.compile() needs to be applied conditionally and manually, instead of automatically through accelerate?
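As a rough illustration of that idea, here is a minimal sketch (assuming PyTorch 2.x; the enable flag and the choice to compile only the U-Net are hypothetical, not something the scripts currently do):

import torch

def maybe_compile(model, enable: bool):
    # Hypothetical opt-in flag instead of letting accelerate wrap every
    # model with TorchDynamo automatically.
    if enable and hasattr(torch, "compile"):  # torch.compile exists in PyTorch >= 2.0
        return torch.compile(model)
    return model

# unet = maybe_compile(unet, args.torch_compile)  # compile only the U-Net;
# the text encoder stays eager, avoiding the FakeTensor error shown above.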

Feature: Print() where captions are being sourced from for clarity

Hey again. Today I realized that captions can be sourced from three different locations, but the training log doesn't tell us which one is being used:

  • metadata file
  • directory name
  • captions file

The logs currently make no distinction between these sources. It would be great to have a simple print line that states this clearly, so you can be sure you are training on the correct captions.

For example, earlier I was training LoRA DreamBooth and was accidentally using directory-name captions instead of the caption files, but had no idea until training was done.
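A minimal sketch of what such a log line could look like; the helper name load_caption_for_image and its arguments are hypothetical and not part of the current scripts:

from pathlib import Path

def load_caption_for_image(image_path: Path, metadata: dict | None, caption_ext: str = ".caption"):
    # Resolve the caption from the three possible sources and report which one won.
    caption_file = image_path.with_suffix(caption_ext)
    if metadata is not None and image_path.name in metadata:
        caption, source = metadata[image_path.name].get("caption", ""), "metadata file"
    elif caption_file.exists():
        caption, source = caption_file.read_text(encoding="utf-8").strip(), "caption file"
    else:
        caption, source = image_path.parent.name.split("_", 1)[-1], "directory name"
    print(f"caption source: {source}")  # the requested print; in practice this would be logged once per dataset
    return caption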

'--max_token_length None' doesn't seem to work

PowerShell 7.3.1
PS D:\git\sd-scripts> .\venv\Scripts\activate
(venv) PS D:\git\sd-scripts> accelerate launch .\train_network.py --max_token_length "None"
usage: train_network.py [-h] [--v2] [--v_parameterization]
                        [--pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH]
                        [--train_data_dir TRAIN_DATA_DIR] [--shuffle_caption] [--caption_extension CAPTION_EXTENSION]
                        [--caption_extention CAPTION_EXTENTION] [--keep_tokens KEEP_TOKENS] [--color_aug] [--flip_aug]
                        [--face_crop_aug_range FACE_CROP_AUG_RANGE] [--random_crop] [--debug_dataset]
                        [--resolution RESOLUTION] [--cache_latents] [--enable_bucket]
                        [--min_bucket_reso MIN_BUCKET_RESO] [--max_bucket_reso MAX_BUCKET_RESO]
                        [--reg_data_dir REG_DATA_DIR] [--in_json IN_JSON] [--dataset_repeats DATASET_REPEATS]
                        [--output_dir OUTPUT_DIR] [--output_name OUTPUT_NAME]
                        [--save_precision {None,float,fp16,bf16}] [--save_every_n_epochs SAVE_EVERY_N_EPOCHS]
                        [--save_last_n_epochs SAVE_LAST_N_EPOCHS] [--save_state] [--resume RESUME]
                        [--train_batch_size TRAIN_BATCH_SIZE] [--max_token_length {None,150,225}] [--use_8bit_adam]
                        [--mem_eff_attn] [--xformers] [--vae VAE] [--learning_rate LEARNING_RATE]
                        [--max_train_steps MAX_TRAIN_STEPS] [--seed SEED] [--gradient_checkpointing]
                        [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS] [--mixed_precision {no,fp16,bf16}]
                        [--full_fp16] [--clip_skip CLIP_SKIP] [--logging_dir LOGGING_DIR] [--log_prefix LOG_PREFIX]
                        [--lr_scheduler LR_SCHEDULER] [--lr_warmup_steps LR_WARMUP_STEPS]
                        [--prior_loss_weight PRIOR_LOSS_WEIGHT] [--save_model_as {None,ckpt,pt,safetensors}]
                        [--unet_lr UNET_LR] [--text_encoder_lr TEXT_ENCODER_LR] [--network_weights NETWORK_WEIGHTS]
                        [--network_module NETWORK_MODULE] [--network_dim NETWORK_DIM]
                        [--network_args [NETWORK_ARGS ...]] [--network_train_unet_only]
                        [--network_train_text_encoder_only]
train_network.py: error: argument --max_token_length: invalid int value: 'None'
Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\git\sd-scripts\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "D:\git\sd-scripts\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "D:\git\sd-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "D:\git\sd-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\\git\\sd-scripts\\venv\\Scripts\\python.exe', '.\\train_network.py', '--max_token_length', 'None']' returned non-zero exit status 2.
(venv) PS D:\git\sd-scripts>
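For what it's worth, omitting the option entirely already gives the default of None; the literal string "None" is rejected because the argument is parsed as an int. A small sketch of an argparse pattern that would accept it (illustrative only, and possibly different from how train_network.py currently defines the option):

import argparse

def int_or_none(value: str):
    # Accept the literal string "None" in addition to plain integers.
    return None if value.lower() == "none" else int(value)

parser = argparse.ArgumentParser()
parser.add_argument("--max_token_length", type=int_or_none, choices=[None, 150, 225], default=None)

print(parser.parse_args(["--max_token_length", "None"]))  # Namespace(max_token_length=None)
print(parser.parse_args(["--max_token_length", "225"]))   # Namespace(max_token_length=225)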

extract_lora does not work because module keys don't match any SD1.x models

Hey again.

Edit: I see it works for SD2.x models, so I guess the SD1.x keys are not the same and need to be added. Is it SpatialTransformer that's missing?

I was attempting to try out extract_lora_from_models.py, but realized that UNET_TARGET_REPLACE_MODULE = ["Transformer2DModel", "Attention"] never matches any layers in any model I throw at it, so the result is always "create LoRA for U-Net: 0 modules." and an empty output file.

Are these the correct keys for SD1.x models?
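One quick way to check is to tally the submodule class names of the loaded U-Net and see whether "Transformer2DModel" or "Attention" actually occur; a minimal sketch, assuming the diffusers U-Net has already been loaded elsewhere as unet:

from collections import Counter

def count_module_classes(model):
    # Tally class names of all submodules so the target names can be verified.
    return Counter(type(m).__name__ for m in model.modules())

# counts = count_module_classes(unet)
# print(counts.get("Transformer2DModel", 0), counts.get("Attention", 0))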

CUDA Error: no kernel image is available for execution on the device

Hi there,
I am brand new to neural networks, and when I tried to feed a test video into the computer, this is what I got:
nvidia@nvidia-desktop:~$ "/home/nvidia/test.sh"
ci: Using default 'data/ci.txt'
coord: Using default 'data/coord.txt'
Polaris Object Detection
layer filters size input output
0 conv 32 3 x 3 / 1 480 x 352 x 3 -> 480 x 352 x 32 0.292 BFLOPs
1 max 2 x 2 / 2 480 x 352 x 32 -> 240 x 176 x 32
2 conv 64 3 x 3 / 1 240 x 176 x 32 -> 240 x 176 x 64 1.557 BFLOPs
3 max 2 x 2 / 2 240 x 176 x 64 -> 120 x 88 x 64
4 conv 128 3 x 3 / 1 120 x 88 x 64 -> 120 x 88 x 128 1.557 BFLOPs
5 conv 64 1 x 1 / 1 120 x 88 x 128 -> 120 x 88 x 64 0.173 BFLOPs
6 conv 128 3 x 3 / 1 120 x 88 x 64 -> 120 x 88 x 128 1.557 BFLOPs
7 max 2 x 2 / 2 120 x 88 x 128 -> 60 x 44 x 128
8 conv 256 3 x 3 / 1 60 x 44 x 128 -> 60 x 44 x 256 1.557 BFLOPs
9 conv 128 1 x 1 / 1 60 x 44 x 256 -> 60 x 44 x 128 0.173 BFLOPs
10 conv 256 3 x 3 / 1 60 x 44 x 128 -> 60 x 44 x 256 1.557 BFLOPs
11 max 2 x 2 / 2 60 x 44 x 256 -> 30 x 22 x 256
12 conv 512 3 x 3 / 1 30 x 22 x 256 -> 30 x 22 x 512 1.557 BFLOPs
13 conv 256 1 x 1 / 1 30 x 22 x 512 -> 30 x 22 x 256 0.173 BFLOPs
14 conv 512 3 x 3 / 1 30 x 22 x 256 -> 30 x 22 x 512 1.557 BFLOPs
15 conv 256 1 x 1 / 1 30 x 22 x 512 -> 30 x 22 x 256 0.173 BFLOPs
16 conv 512 3 x 3 / 1 30 x 22 x 256 -> 30 x 22 x 512 1.557 BFLOPs
17 max 2 x 2 / 2 30 x 22 x 512 -> 15 x 11 x 512
18 conv 1024 3 x 3 / 1 15 x 11 x 512 -> 15 x 11 x1024 1.557 BFLOPs
19 conv 512 1 x 1 / 1 15 x 11 x1024 -> 15 x 11 x 512 0.173 BFLOPs
20 conv 1024 3 x 3 / 1 15 x 11 x 512 -> 15 x 11 x1024 1.557 BFLOPs
21 conv 512 1 x 1 / 1 15 x 11 x1024 -> 15 x 11 x 512 0.173 BFLOPs
22 conv 1024 3 x 3 / 1 15 x 11 x 512 -> 15 x 11 x1024 1.557 BFLOPs
23 conv 1024 3 x 3 / 1 15 x 11 x1024 -> 15 x 11 x1024 3.114 BFLOPs
24 conv 1024 3 x 3 / 1 15 x 11 x1024 -> 15 x 11 x1024 3.114 BFLOPs
25 route 16
26 conv 64 1 x 1 / 1 30 x 22 x 512 -> 30 x 22 x 64 0.043 BFLOPs
27 reorg / 2 30 x 22 x 64 -> 15 x 11 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 15 x 11 x1280 -> 15 x 11 x1024 3.893 BFLOPs
30 conv 35 1 x 1 / 1 15 x 11 x1024 -> 15 x 11 x 35 0.012 BFLOPs
31 detection
mask_scale: Using default '1.000000'
CUDA Error: no kernel image is available for execution on the device
polarisnnet: ./src/cuda.c:36: check_error: Assertion `0' failed.
/home/nvidia/test.sh: line 4: 4502 Aborted (core dumped) ./polarisnnet detector line data/test/t.data data/test/t.cfg data/test/t.weights data/test/t.mp4
Do you guys know what happened?

"The paging file is too small for this operation to complete."

This is the error I get while trying to train with the script. Is my 1070 Ti with 8 GB of VRAM the issue, did I mess up setting up the script, or is it a dependency issue?

steps:   0%|                                                                                    | 0/25 [00:00<?, ?it/s]epoch 1/1
Traceback (most recent call last):
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 62, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: The paging file is too small for this operation to complete.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\...\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\...\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\...\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\...\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\...\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\...\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\...\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\...\sd-scripts\lora_train_popup.py", line 8, in <module>
    import train_network
  File "C:\Users\...\sd-scripts\train_network.py", line 9, in <module>
    from accelerate.utils import set_seed
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\accelerate\__init__.py", line 7, in <module>
    from .accelerator import Accelerator
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\accelerate\accelerator.py", line 27, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\accelerate\checkpointing.py", line 24, in <module>
    from .utils import (
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\accelerate\utils\__init__.py", line 103, in <module>
    from .megatron_lm import (
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\accelerate\utils\megatron_lm.py", line 32, in <module>
    from transformers.modeling_outputs import (
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\transformers\__init__.py", line 30, in <module>
    from . import dependency_versions_check
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\transformers\dependency_versions_check.py", line 17, in <module>
    from .utils.versions import require_version, require_version_core
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\transformers\utils\__init__.py", line 34, in <module>
    from .generic import (
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\transformers\utils\generic.py", line 33, in <module>
    import tensorflow as tf
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\tensorflow\__init__.py", line 37, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\tensorflow\python\__init__.py", line 36, in <module>
    from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 77, in <module>
    raise ImportError(
ImportError: Traceback (most recent call last):
  File "C:\Users\...\sd-scripts\venv\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 62, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: The paging file is too small for this operation to complete.


Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors for some common causes and solutions.
If you need help, create an issue at https://github.com/tensorflow/tensorflow/issues and include the entire stack trace above this error message.

DB Warning "accelerate does not support gradient_accumulation_steps when training multiple models"

When running the DreamBooth training script with gradient accumulation, it warns: accelerate does not support gradient_accumulation_steps when training multiple models (U-Net and Text Encoder)

But the same warning does not appear when using the fine-tuning script.
Is this an actual issue for both scripts, or just the DreamBooth one?

Also, I'm not even training the Text Encoder, so I'm wondering whether I should be concerned at all.

Text Encoder training does not stop partway through

sd-scripts/train_db.py

Lines 1011 to 1013 in d9bb4aa

if stop_text_encoder_training:
    print(f"stop text encoder training at step {global_step}")
    text_encoder.train(False)

train(False) only does things like disabling dropout (which is not used at inference); it does not appear to stop parameter updates.
https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval

Gradient computation can be disabled with
text_encoder.requires_grad_(False)
but apparently the gradients just become zero, and the weights are still updated by Adam's remaining moments.

I confirmed that training does not actually stop by checking the hash of text_encoder/pytorch_model.bin.
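A minimal sketch of one way to make the freeze effective, on top of the requires_grad_(False) mentioned above: also zero the learning rate of the parameter groups that contain Text Encoder parameters, so Adam's leftover moments cannot move the weights (an illustration, not the scripts' current behavior):

def stop_text_encoder_training(text_encoder, optimizer):
    # train(False)/eval() only changes dropout/batch-norm behavior, so
    # gradients have to be disabled explicitly.
    text_encoder.requires_grad_(False)

    # Even with zero gradients, Adam's accumulated moments can still update
    # the weights, so drop the learning rate of the matching param groups to zero.
    te_params = set(text_encoder.parameters())
    for group in optimizer.param_groups:
        if any(p in te_params for p in group["params"]):
            group["lr"] = 0.0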

Save training metadata to model outputs

Hi, thanks so much for your work on the LoRA training, it's been a blast.

I was thinking it would be a big help if the training parameters used for a LoRA model were saved to the resulting .pt file. I might want to remember how I configured a model, and right now I have to remember to write down the parameters myself every time. Adding the data automatically would also help if I receive a model from somewhere else and want to know how it was trained.

Some examples of things I would find useful in this metadata:

  • SD model name/hash that was trained on
  • Directory structure/number of images/list of concepts/repeats
  • Epoch count
  • Batches per epoch
  • Regularization image count
  • Total number of optimization steps
  • LR scheduler/warmup rate
  • Training batch size
  • Learning rates

Given that the output files are PyTorch models (.pt), they seem to just be .zip files, so maybe putting the training parameters in a .json file inside would suffice. The .safetensors format also has a JSON header.

From the additional_networks extension, this data could later be inspected from a new tab or similar.

It is important that the data is embedded into the .pt file itself, so it is retained if the model is distributed later.
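Since .safetensors files already carry a JSON header with string-to-string metadata, here is a minimal sketch of what saving could look like; the key names are illustrative, not an agreed-upon schema:

from safetensors.torch import save_file

def save_lora_with_metadata(state_dict, path, args, num_train_images, num_epochs):
    # safetensors metadata must map str -> str, so everything is stringified.
    metadata = {
        "base_model": str(args.pretrained_model_name_or_path),
        "learning_rate": str(args.learning_rate),
        "lr_scheduler": str(args.lr_scheduler),
        "lr_warmup_steps": str(args.lr_warmup_steps),
        "train_batch_size": str(args.train_batch_size),
        "max_train_steps": str(args.max_train_steps),
        "num_train_images": str(num_train_images),
        "num_epochs": str(num_epochs),
    }
    save_file(state_dict, path, metadata=metadata)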

ModuleNotFoundError: No module named 'albumentations'

Traceback (most recent call last):
File "C:\Users\Siddhesh\Desktop\kohya_ss\train_network.py", line 21, in
import albumentations as albu
ModuleNotFoundError: No module named 'albumentations'
Traceback (most recent call last):
File "C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\Scripts\accelerate.exe_main
.py", line 7, in
File "C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\python.exe', 'train_network.py', '--cache_latents', '--enable_bucket', '--use_8bit_adam', '--xformers', '--pretrained_model_name_or_path=C:/Users/Siddhesh/Desktop/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned.ckpt', '--train_data_dir=C:/Users/Siddhesh/Desktop/test\img', '--resolution=512,512', '--output_dir=C:/Users/Siddhesh/Desktop/test\model', '--train_batch_size=1', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=800', '--use_8bit_adam', '--xformers', '--mixed_precision=fp16', '--save_every_n_epochs=1', '--seed=1234', '--save_precision=fp16', '--logging_dir=C:/Users/Siddhesh/Desktop/test\log', '--network_module=networks.lora', '--text_encoder_lr=1e-06', '--unet_lr=0.0001', '--network_dim=4']' returned non-zero exit status 1.

no kernel image is available for execution on the device

Error no kernel image is available for execution on the device at line 89 in file D:\ai\tool\bitsandbytes\csrc\ops.cu
Traceback (most recent call last):
File "C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Siddhesh\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "C:\Users\Siddhesh\Desktop\kohya_ss\venv\Scripts\accelerate.exe_main.py", line 7, in
File "C:\Users\Siddhesh\Desktop\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "C:\Users\Siddhesh\Desktop\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "C:\Users\Siddhesh\Desktop\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Users\Siddhesh\Desktop\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--cache_latents', '--enable_bucket', '--use_8bit_adam', '--xformers', '--pretrained_model_name_or_path=C:/Users/Siddhesh/Desktop/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned.ckpt', '--train_data_dir=C:/Users/Siddhesh/Desktop/test\img', '--resolution=512,512', '--output_dir=C:/Users/Siddhesh/Desktop/test\model', '--train_batch_size=1', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=400', '--use_8bit_adam', '--xformers', '--mixed_precision=fp16', '--save_every_n_epochs=1', '--seed=1234', '--save_precision=fp16', '--logging_dir=C:/Users/Siddhesh/Desktop/test\log', '--network_module=networks.lora', '--text_encoder_lr=1e-06', '--unet_lr=0.0001', '--network_dim=4']' returned non-zero exit status 1.

(venv) PS C:\Users\Siddhesh\Desktop\kohya_ss> python
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
>>> import sys
>>> print('A', sys.version)
A 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
>>> print('B', torch.__version__)
B 1.12.1+cu116
>>> print('C', torch.cuda.is_available())
C True
>>> print('D', torch.backends.cudnn.enabled)
D True
>>> device = torch.device('cuda')
>>> print('E', torch.cuda.get_device_properties(device))
E _CudaDeviceProperties(name='NVIDIA GeForce GTX 1060 6GB', major=6, minor=1, total_memory=6143MB, multi_processor_count=10)
>>> print('F', torch.tensor([1.0, 2.0]).cuda())
F tensor([1., 2.], device='cuda:0')

EDIT: Error no kernel image is available for execution on the device at line 89 in file D:\ai\tool\bitsandbytes\csrc\ops.cu

^^ That D:\ path must be the telling part, because I have no D:\ drive on my system!
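This usually means the precompiled bitsandbytes (or CUDA) kernels were not built for your GPU's compute architecture; the D:\ path is just where that binary was compiled, not a path on your machine. A quick check of what the card reports:

import torch

major, minor = torch.cuda.get_device_capability(torch.device("cuda"))
print(f"compute capability: sm_{major}{minor}")  # a GTX 1060 reports sm_61

# If the installed bitsandbytes build lacks kernels for this architecture,
# 8-bit Adam fails; dropping --use_8bit_adam or installing a build compiled
# for this compute capability are the usual workarounds.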

Example of training LoRA

Hi, thank you for nice work.

I have been trying to train LoRA, but have not yet succeeded.
To be precise, the loss goes down and last.safetensors is generated, but applying it hardly changes the generated images.

I have tried for days with different parameters, but have not been able to identify the problem.

I uploaded the training data and scripts at https://github.com/shirayu/example_lora_training .
Any advice would be appreciated.

The command for training: https://github.com/shirayu/example_lora_training/blob/cdf08770e41d0cf82ee5c7e20dc1dfaed8ea824b/train_lora.zsh

[WinError 2] The system cannot find the file specified

Any time I try to start a training session I get hit with the below error. I have no idea what file it's not finding. I tried to follow the installation instructions to the letter, but I'm very new to all this, so it's quite possible I messed something up. Thanks for the help.

max_train_steps = 0
stop_text_encoder_training = 0
lr_warmup_steps = 0
accelerate launch --num_cpu_threads_per_process=16 "train_db.py" --cache_latents --enable_bucket --use_8bit_adam --xformers --pretrained_model_name_or_path=E:/Ai/stable-diffusion-webui/models/Stable-diffusion/NAImodel.ckpt --train_data_dir="C:/Users/user/kohya_ss/PQ2/image" --resolution=512,512 --output_dir=C:/Users/user/kohya_ss/PQ2/model --train_batch_size=1 --learning_rate=1e-06 --lr_scheduler=constant --lr_warmup_steps=0 --max_train_steps=0 --use_8bit_adam --xformers --mixed_precision=fp16 --save_every_n_epochs=1 --seed=1234 --save_precision=fp16 --logging_dir=C:/Users/user/kohya_ss/PQ2/log --caption_extention=
Traceback (most recent call last):
File "C:\Users\user\kohya_ss\venv\lib\site-packages\gradio\routes.py", line 321, in run_predict
output = await app.blocks.process_api(
File "C:\Users\user\kohya_ss\venv\lib\site-packages\gradio\blocks.py", line 1015, in process_api
result = await self.call_function(fn_index, inputs, iterator, request)
File "C:\Users\user\kohya_ss\venv\lib\site-packages\gradio\blocks.py", line 856, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\user\kohya_ss\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\user\kohya_ss\venv\lib\site-packages\anyio_backends_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Users\user\kohya_ss\venv\lib\site-packages\anyio_backends_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\Users\user\kohya_ss\dreambooth_gui.py", line 413, in train_model
subprocess.run(run_cmd)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2544.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 503, in run
with Popen(*popenargs, **kwargs) as process:
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2544.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 971, in init
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2544.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 1440, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,

I can't start training

Traceback (most recent call last):
File "G:\kohya\sd-scripts\train_db.py", line 1229, in
train(args)
File "G:\kohya\sd-scripts\train_db.py", line 1043, in train
encoder_hidden_states = text_encoder.text_model.final_layer_norm(encoder_hidden_states)
File "G:\kohya\sd-scripts\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "G:\kohya\sd-scripts\venv\lib\site-packages\torch\nn\modules\normalization.py", line 189, in forward
return F.layer_norm(
File "G:\kohya\sd-scripts\venv\lib\site-packages\torch\nn\functional.py", line 2503, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Float but found Half
steps: 0%| | 0/3000 [00:34<?, ?it/s]
Traceback (most recent call last):
File "C:\Users\username\miniconda3\envs\kohya\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\username\miniconda3\envs\kohya\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "G:\kohya\sd-scripts\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "G:\kohya\sd-scripts\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "G:\kohya\sd-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "G:\kohya\sd-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['G:\kohya\sd-scripts\venv\Scripts\python.exe', 'train_db.py', '--pretrained_model_name_or_path=G:\stable-diffusion-webui\models\Stable-diffusion\Anything-V3.0-pruned.ckpt', '--train_data_dir=G:\kohya\dataset\train', '--output_dir=G:\kohya\results', '--prior_loss_weight=1.0', '--resolution=512', '--train_batch_size=1', '--learning_rate=1e-6', '--max_train_steps=3000', '--use_8bit_adam', '--xformers', '--mixed_precision=fp16', '--cache_latents', '--caption_extention=.txt', '--clip_skip=2', '--full_fp16', '--gradient_checkpointing']' returned non-zero exit status 1.

I'm using Windows 10 and a 3080 10 GB.
Could using conda be causing this problem?

Thanks in advance.
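
The failure is in the LayerNorm call shown in the traceback: with --full_fp16 the hidden states and the norm weights can end up in different dtypes. A hedged local workaround (not an official fix; variable names follow the traceback above) is to cast the input to the norm's weight dtype before the call:

# In train_db.py, around the failing line:
ln = text_encoder.text_model.final_layer_norm
encoder_hidden_states = ln(encoder_hidden_states.to(ln.weight.dtype))

Alternatively, dropping --full_fp16 while keeping --mixed_precision=fp16 avoids this code path entirely.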

"OSError: [WinError 1455]" when attempting to start the training.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\NAME\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\NAME\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\NAME\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\NAME\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\NAME\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\NAME\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\NAME\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\NAME\sd-scripts\train_network.py", line 8, in <module>
    import torch
  File "C:\Users\NAME\sd-scripts\venv\lib\site-packages\torch\__init__.py", line 129, in <module>
    raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\NAME\sd-scripts\venv\lib\site-packages\torch\lib\cusolver64_11.dll" or one of its dependencies.

Running this on an RTX 2060 with 6GB VRAM. I don't know if specs have an influence.

Converted v1 checkpoints in conversion script cause error in generation

Some of the weight shapes are wrong.

RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
        size mismatch for model.diffusion_model.input_blocks.1.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
        size mismatch for model.diffusion_model.input_blocks.1.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
        size mismatch for model.diffusion_model.input_blocks.2.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
...
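
The mismatch is between 2-D Linear weights written by the converter and the 4-D 1x1 Conv weights this build of the LDM code expects for proj_in/proj_out. A hedged post-processing sketch (key names are taken from the error above; file names are placeholders):

import torch

ckpt = torch.load("converted_v1.ckpt", map_location="cpu")
sd = ckpt["state_dict"] if "state_dict" in ckpt else ckpt
for key, value in list(sd.items()):
    # Unsqueeze [C_out, C_in] Linear weights to the [C_out, C_in, 1, 1] Conv shape.
    if ("proj_in.weight" in key or "proj_out.weight" in key) and value.dim() == 2:
        sd[key] = value.unsqueeze(-1).unsqueeze(-1)
torch.save(ckpt, "converted_v1_fixed.ckpt")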

No such file or directory: venv\\Scripts\\accelerate.exe\\__main__.py

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│                                                                                                  │
│ G:\AI\sd-scripts\lora_train_popup.py:432 in <module>                                             │
│                                                                                                  │
│   429                                                                                            │
│   430                                                                                            │
│   431 if __name__ == "__main__":                                                                 │
│ ❱ 432 │   main()                                                                                 │
│   433                                                                                            │
│ G:\AI\sd-scripts\lora_train_popup.py:189 in main                                                 │
│                                                                                                  │
│   186 │   arg_class = ArgStore()                                                                 │
│   187 │   ret = mb.askyesno(message="Do you want to load a json config file?")                   │
│   188 │   if ret:                                                                                │
│ ❱ 189 │   │   load_json(ask_file("json to load from", {"json"}), arg_class)                      │
│   190 │   │   arg_class = ask_elements_trunc(arg_class)                                          │
│   191 │   else:                                                                                  │
│   192 │   │   arg_class = ask_elements(arg_class)                                                │
│                                                                                                  │
│ G:\AI\sd-scripts\lora_train_popup.py:403 in load_json                                            │
│                                                                                                  │
│   400 │   with open(path) as f:                                                                  │
│   401 │   │   json_obj = json.loads(f.read())                                                    │
│   402 │   print("json loaded, setting variables...")                                             │
│ ❱ 403 │   obj.net_dim = json_obj["net_dim"]                                                      │
│   404 │   obj.scheduler = json_obj["scheduler"]                                                  │
│   405 │   obj.warmup_lr_ratio = json_obj["warmup_lr_ratio"]                                      │
│   406 │   obj.learning_rate = json_obj["learning_rate"]                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'net_dim'
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│                                                                                                  │
│ C:\Users\satya\AppData\Local\Programs\Python\Python310\lib\runpy.py:196 in _run_module_as_main   │
│                                                                                                  │
│   193 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   194 │   if alter_argv:                                                                         │
│   195 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 196 │   return _run_code(code, main_globals, None,                                             │
│   197 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   198                                                                                            │
│   199 def run_module(mod_name, init_globals=None,                                                │
│ C:\Users\satya\AppData\Local\Programs\Python\Python310\lib\runpy.py:86 in _run_code              │
│                                                                                                  │
│    83 │   │   │   │   │      __loader__ = loader,                                                │
│    84 │   │   │   │   │      __package__ = pkg_name,                                             │
│    85 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  86 │   exec(code, run_globals)                                                                │
│    87 │   return run_globals                                                                     │
│    88                                                                                            │
│    89 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ G:\AI\sd-scripts\venv\Scripts\accelerate.exe\__main__.py:7 in <module>                           │
│                                                                                                  │
│ [Errno 2] No such file or directory:                                                             │
│ 'G:\\AI\\sd-scripts\\venv\\Scripts\\accelerate.exe\\__main__.py'                                 │
│                                                                                                  │
│ G:\AI\sd-scripts\venv\lib\site-packages\accelerate\commands\accelerate_cli.py:45 in main         │
│                                                                                                  │
│   42 │   │   exit(1)                                                                             │
│   43 │                                                                                           │
│   44 │   # Run                                                                                   │
│ ❱ 45 │   args.func(args)                                                                         │
│   46                                                                                             │
│   47                                                                                             │
│   48 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│ G:\AI\sd-scripts\venv\lib\site-packages\accelerate\commands\launch.py:1104 in launch_command     │
│                                                                                                  │
│   1101 │   elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA  │
│   1102 │   │   sagemaker_launcher(defaults, args)                                                │
│   1103 │   else:                                                                                 │
│ ❱ 1104 │   │   simple_launcher(args)                                                             │
│   1105                                                                                           │
│   1106                                                                                           │
│   1107 def main():                                                                               │
│                                                                                                  │
│ G:\AI\sd-scripts\venv\lib\site-packages\accelerate\commands\launch.py:567 in simple_launcher     │
│                                                                                                  │
│    564 │   process = subprocess.Popen(cmd, env=current_env)                                      │
│    565 │   process.wait()                                                                        │
│    566 │   if process.returncode != 0:                                                           │
│ ❱  567 │   │   raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)       │
│    568                                                                                           │
│    569                                                                                           │
│    570 def multi_gpu_launcher(args):                                                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['G:\\AI\\sd-scripts\\venv\\Scripts\\python.exe', 'lora_train_popup.py']' returned non-zero
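
The CalledProcessError at the bottom is just the launcher reporting that the child process failed; the actual error is the KeyError: 'net_dim' above it, raised because the loaded JSON config is missing that key. lora_train_popup.py is a third-party helper, so this is only an illustrative, hedged sketch of a more forgiving loader:

# Fall back to the existing defaults instead of raising KeyError
# when a key is missing from the JSON config.
obj.net_dim = json_obj.get("net_dim", obj.net_dim)
obj.scheduler = json_obj.get("scheduler", obj.scheduler)
obj.warmup_lr_ratio = json_obj.get("warmup_lr_ratio", obj.warmup_lr_ratio)
obj.learning_rate = json_obj.get("learning_rate", obj.learning_rate)

Otherwise, re-saving the JSON with the expected keys should get past the error.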

Enable Training on 6GB Cards... with DeepSpeed?

I am trying to squeeze training onto my 6 GB laptop RTX 2060, and can't quite manage it with the "low memory" config:

accelerate launch --num_cpu_threads_per_process 8 train_db.py \
--pretrained_model_name_or_path="/home/alpha/Storage/AIModels/Stable-diffusion/panatomy05full_0.7-AIModels_Anything-V3.0-pruned-fp16_0.3-Weighted_sum-merged.ckpt" \
--train_data_dir="/home/alpha/Storage/TrainingData/test/training_data" \
--output_dir="/home/alpha/Storage/TrainingOutput/test/" \
--prior_loss_weight=1.0 \
--resolution=512 \
--train_batch_size=1 \
--learning_rate=1e-6 \
--max_train_steps=1600 \
--use_8bit_adam \
--xformers \
--mixed_precision="fp16" \
--cache_latents \
--gradient_checkpointing \
--save_precision="fp16" \
--full_fp16 \
--save_model_as="safetensors" \

So I figured I would investigate DeepSpeed CPU offloading with the accelerate config... but I keep running into errors on both the git version and the 0.7.7 release from PyPI. Here is an error from the PyPI release:

Traceback (most recent call last):
  File "/home/alpha/clone/sd-scripts/train_db.py", line 332, in <module>
    train(args)
  File "/home/alpha/clone/sd-scripts/train_db.py", line 154, in train
    unet, text_encoder, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
  File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/accelerator.py", line 619, in prepare
    result = self._prepare_deepspeed(*args)
  File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/accelerator.py", line 805, in _prepare_deepspeed
    engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
  File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/__init__.py", line 125, in initialize
    engine = DeepSpeedEngine(args=args,
  File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 330, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1210, in _configure_optimizer
    self.optimizer = self._configure_zero_optimizer(basic_optimizer)
  File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1455, in _configure_zero_optimizer
    optimizer = DeepSpeedZeroOptimizer(
  File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 532, in __init__
    self._param_slice_mappings = self._create_param_mapping()
  File "/home/alpha/.local/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 544, in _create_param_mapping
    lp_name = self.param_names[lp]
KeyError: <exception str() failed>
[2023-01-12 13:13:52,241] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 5398
[2023-01-12 13:13:52,244] [ERROR] [launch.py:324:sigkill_handler] ['/usr/bin/python', '-u', 'train_db.py', '--pretrained_model_name_or_path=/home/alpha/Storage/AIModels/Stable-diffusion/panatomy05full_0.7-AIModels_Anything-V3.0-pruned-fp16_0.3-Weighted_sum-merged.ckpt', '--train_data_dir=/home/alpha/Storage/TrainingData/test/training_data', '--output_dir=/home/alpha/Storage/TrainingOutput/test/', '--prior_loss_weight=1.0', '--resolution=512', '--train_batch_size=1', '--learning_rate=1e-6', '--max_train_steps=1600', '--use_8bit_adam', '--xformers', '--mixed_precision=fp16', '--cache_latents', '--gradient_checkpointing', '--save_precision=fp16', '--full_fp16', '--save_model_as=safetensors'] exits with return code = 1
Traceback (most recent call last):
  File "/home/alpha/.local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 827, in launch_command
    deepspeed_launcher(args)
  File "/home/alpha/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 540, in deepspeed_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['deepspeed', '--no_local_rank', '--num_gpus', '1', 'train_db.py', '--pretrained_model_name_or_path=/home/alpha/Storage/AIModels/Stable-diffusion/panatomy05full_0.7-AIModels_Anything-V3.0-pruned-fp16_0.3-Weighted_sum-merged.ckpt', '--train_data_dir=/home/alpha/Storage/TrainingData/test/training_data', '--output_dir=/home/alpha/Storage/TrainingOutput/test/', '--prior_loss_weight=1.0', '--resolution=512', '--train_batch_size=1', '--learning_rate=1e-6', '--max_train_steps=1600', '--use_8bit_adam', '--xformers', '--mixed_precision=fp16', '--cache_latents', '--gradient_checkpointing', '--save_precision=fp16', '--full_fp16', '--save_model_as=safetensors']' returned non-zero exit status 1.

Is there anything in particular that needs to be changed for this repo to support DeepSpeed? Or is there some other tweak that would squeeze LoRA training onto 6 GB?
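
Before reaching for DeepSpeed, it may be worth trying LoRA training with train_network.py instead of full DreamBooth, since far fewer parameters are optimized; a hedged command along these lines (paths are placeholders, flags as used elsewhere in this repo):

accelerate launch --num_cpu_threads_per_process 8 train_network.py --pretrained_model_name_or_path=model.ckpt --train_data_dir=train_data --output_dir=output --network_module=networks.lora --network_dim=4 --resolution=512,512 --train_batch_size=1 --learning_rate=1e-4 --max_train_steps=1600 --use_8bit_adam --xformers --mixed_precision=fp16 --cache_latents --gradient_checkpointing --save_model_as=safetensors

Whether DeepSpeed ZeRO offloading can be made to work with these scripts is a separate question; the KeyError above happens inside DeepSpeed's optimizer wrapping, before training even starts.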

TypeError: 'type' object is not subscriptable

An error, "TypeError: 'type' object is not subscriptable", occurs when using train_network.py to fine-tune a model.

The whole message is:

import network module: networks.lora
Traceback (most recent call last):
File "/root/autodl-tmp/sd-scripts/train_network.py", line 1455, in
train(args)
File "/root/autodl-tmp/sd-scripts/train_network.py", line 1092, in train
network = network_module.create_network(1.0, args.network_dim, vae, text_encoder, unet, **net_kwargs)
File "/root/autodl-tmp/sd-scripts/networks/lora.py", line 50, in create_network
network = LoRANetwork(text_encoder, unet, multiplier=multiplier, lora_dim=network_dim)
File "/root/autodl-tmp/sd-scripts/networks/lora.py", line 66, in init
def create_modules(prefix, root_module: torch.nn.Module, target_replace_modules) -> list[LoRAModule]:
TypeError: 'type' object is not subscriptable
Traceback (most recent call last):
File "/root/miniconda3/bin/accelerate", line 8, in
sys.exit(main())
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
simple_launcher(args)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/root/miniconda3/bin/python', '/root/autodl-tmp/sd-scripts/train_network.py', '--pretrained_model_name_or_path=/root/autodl-tmp/finalmodel.ckpt', '--in_json=/root/autodl-tmp/liangxing-lat.json', '--shuffle_caption', '--keep_tokens=1', '--train_data_dir=/root/autodl-tmp/liangxing', '--dataset_repeats=10', '--output_dir=/root/autodl-tmp/liangxing-lora-test', '--save_precision=float', '--save_model_as=ckpt', '--save_every_n_epochs=1', '--save_state', '--color_aug', '--flip_aug', '--resolution=640,640', '--train_batch_size=4', '--max_token_length=225', '--learning_rate=1e-4', '--prior_loss_weight=1.0', '--seed=2998', '--unet_lr=1e-4', '--text_encoder_lr=1e-6', '--max_train_steps=8955', '--gradient_checkpointing', '--gradient_accumulation_steps=2', '--mixed_precision=no', '--clip_skip=2', '--logging_dir=logs', '--lr_scheduler=polynomial', '--lr_warmup_steps=450', '--network_module=networks.lora']' returned non-zero exit status 1.

It seems to be related to the networks.lora module. How can I fix this? By the way, --network_module cannot simply be set to None either, otherwise another error appears (a hedged fix for the TypeError is sketched after the second traceback below):

import network module: None
Traceback (most recent call last):
File "/root/autodl-tmp/sd-scripts/train_network.py", line 1455, in
train(args)
File "/root/autodl-tmp/sd-scripts/train_network.py", line 1084, in train
network_module = importlib.import_module(args.network_module)
File "/root/miniconda3/lib/python3.8/importlib/init.py", line 118, in import_module
if name.startswith('.'):
AttributeError: 'NoneType' object has no attribute 'startswith'
Traceback (most recent call last):
File "/root/miniconda3/bin/accelerate", line 8, in
sys.exit(main())
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
simple_launcher(args)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/root/miniconda3/bin/python', '/root/autodl-tmp/sd-scripts/train_network.py', '--pretrained_model_name_or_path=/root/autodl-tmp/finalmodel.ckpt', '--in_json=/root/autodl-tmp/liangxing-lat.json', '--shuffle_caption', '--keep_tokens=1', '--train_data_dir=/root/autodl-tmp/liangxing', '--dataset_repeats=10', '--output_dir=/root/autodl-tmp/liangxing-lora-test', '--save_precision=float', '--save_model_as=ckpt', '--save_every_n_epochs=1', '--save_state', '--color_aug', '--flip_aug', '--resolution=640,640', '--train_batch_size=4', '--max_token_length=225', '--learning_rate=1e-4', '--prior_loss_weight=1.0', '--seed=2998', '--unet_lr=1e-4', '--text_encoder_lr=1e-6', '--max_train_steps=8955', '--gradient_checkpointing', '--gradient_accumulation_steps=2', '--mixed_precision=no', '--clip_skip=2', '--logging_dir=logs', '--lr_scheduler=polynomial', '--lr_warmup_steps=450']' returned non-zero exit status 1.
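
The first traceback points at the list[LoRAModule] annotation in networks/lora.py, and the interpreter is Python 3.8 (/root/miniconda3/lib/python3.8/...). Subscripting built-in types like list[...] only works at runtime from Python 3.9 onward, so this looks like a Python-version issue rather than a problem with the training arguments. Running under Python 3.10, as the installation notes assume, should fix it; a hedged local patch for older interpreters would be either of the following:

# Option 1: postpone evaluation of annotations (add at the very top of networks/lora.py).
from __future__ import annotations

# Option 2: use typing.List instead of the built-in generic in the annotation.
from typing import List
import torch

def create_modules(prefix, root_module: torch.nn.Module, target_replace_modules) -> List["LoRAModule"]:
    ...

As for the second traceback: --network_module must be given for train_network.py (networks.lora for LoRA); it cannot be omitted.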

lr_schedulers currently do not take in num_cycles or power parameters

The version of diffusers.optimization.get_scheduler currently in use does not expose the power parameter (for polynomial) or num_cycles (for cosine_with_restarts).

This means, for example:

  • the current implementations of cosine and cosine_with_restarts produce the same schedule, since num_cycles defaults to 1;
  • the current implementation of polynomial only produces a linear schedule, since power defaults to 1.

This is fixed in a future implementation of diffusers: huggingface/diffusers@d87cc15#diff-8702f762e46a3b5363085930b0b045de554909d32560864031ca7b12ddd349d5

Posting this as an issue for awareness, and as something to look into once the repo is tested and updated against a later diffusers release that includes the patch in the commit above.
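
Until get_scheduler exposes those arguments, the underlying constructors in diffusers.optimization can be called directly. A hedged sketch, assuming an existing optimizer and illustrative step counts:

from diffusers.optimization import (
    get_cosine_with_hard_restarts_schedule_with_warmup,
    get_polynomial_decay_schedule_with_warmup,
)

# cosine_with_restarts with an explicit number of cycles
lr_scheduler = get_cosine_with_hard_restarts_schedule_with_warmup(
    optimizer, num_warmup_steps=500, num_training_steps=10000, num_cycles=4
)

# polynomial decay with an explicit power
lr_scheduler = get_polynomial_decay_schedule_with_warmup(
    optimizer, num_warmup_steps=500, num_training_steps=10000, power=2.0
)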

Folder name system to Config File or JSON

The folder-name system is a pain.
When we already pass so many options each time, why rely on folder names as well? Folder names also can't contain some characters that are needed, for example in the concept name.

Add "repeat" feature for fine tuning

Each concept can have a different number of repeats in the DreamBooth method, but the metadata .json (fine-tuning method) does not have this feature.

One idea is that the folder name for fine tuning could carry the repeats, as in <repeat>_<concept>, with the concept part ignored. If no repeat is provided, it defaults to 1. The merge_captions or merge_dd_tags script would then append the repeat value to the JSON.
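
For reference, a hedged sketch of parsing the proposed <repeat>_<concept> folder names, defaulting to 1 when no numeric prefix is present:

import re

def parse_repeats(folder_name: str) -> int:
    # "10_mychar" -> 10; anything without a numeric prefix -> 1
    match = re.match(r"^(\d+)_", folder_name)
    return int(match.group(1)) if match else 1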

Question. Webp support

Is there support for the WebP format for training images? WebP gives smaller files at better quality than JPEG, and there are no JPEG artifacts that could degrade training results.
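
Assuming the dataset code loads images through Pillow (which decodes WebP out of the box), support mostly comes down to including ".webp" wherever files are filtered by extension. A quick, hedged check that a WebP file loads the same way as a PNG or JPEG:

from PIL import Image

# Pillow reads WebP natively; the file name is a placeholder.
img = Image.open("sample.webp").convert("RGB")
print(img.size)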

No data found. Please verify arguments

Hi, I don't really get why I'm getting the message

No data found. Please verify arguments / 画像がありません。引数指定を確認してください

I ran

.\venv\Scripts\activate

inside the sd-scripts folder, and then:

accelerate launch --num_cpu_threads_per_process 8 train_network.py --pretrained_model_name_or_path=B:\AIimages\stable-diffusion-webui\models\Stable-diffusion\model.ckpt --train_data_dir=B:\AIimages\training\images\input\ --output_dir=B:\AIimages\sd-scripts\output\ --prior_loss_weight=1.0 --resolution=512,512 --train_batch_size=4 --learning_rate=1e-4 --max_train_steps=200 --use_8bit_adam --xformers --mixed_precision=fp16 --save_every_n_epochs=1 --save_model_as=safetensors --clip_skip=1 --seed=42 --color_aug --network_module=networks.lora --unet_lr=5e-4 --text_encoder_lr=5e-5

There are 61 .png files with .caption files inside the input folder.

No data found. Please verify arguments

accelerate launch --num_cpu_threads_per_process=16 "train_network.py" --pretrained_model_name_or_path="C:/Programs/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt" --train_data_dir="C:/Users/pdept/Desktop/AI pics/training/512x512" --resolution=512,512 --output_dir="C:/Users/pdept/Desktop/AI pics/training/Nowy folder" --use_8bit_adam --xformers --logging_dir="" --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=1e-3 --network_dim=8 --output_name="last" --learning_rate="1e-5" --lr_scheduler="cosine" --train_batch_size="1" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --seed="1234" --cache_latents --max_data_loader_n_workers="1" --gradient_checkpointing --xformers --use_8bit_adam
prepare tokenizer
Use DreamBooth method.
prepare train images.
0 train images with repeating.
loading image sizes.
0it [00:00, ?it/s]
prepare dataset
No data found. Please verify arguments / 画像がありません。引数指定を確認してください
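
In both reports above the scripts fall back to the DreamBooth-style dataset ("Use DreamBooth method."), which expects the images inside subfolders of --train_data_dir named with a repeat prefix, not directly in that folder. A hedged example layout (names are placeholders):

train_data_dir/
  10_mychar/          <- "<repeats>_<identifier>" subfolder
    001.png
    001.caption
    002.png
    002.caption

Pointing --train_data_dir at the parent folder containing at least one such subfolder usually resolves "0 train images with repeating."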

CUDA_SETUP: WARNING!

It's not DreamBooth! Hello, I have a problem. I tried to train a LoRA with this script https://github.com/derrian-distro/LoRA_Easy_Training_Scripts but got the error WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!, even though I have already installed CUDA. I uninstalled CUDA 12 and installed version 11.6 and cuDNN v8.7.0, but it still didn't help. I also have Anaconda installed; maybe I need to point something at its path somewhere.

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\bitsandbytes\cuda_setup\paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('/usr/local/cuda/lib64')}
warn(
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
Traceback (most recent call last):
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\lora_train_popup.py", line 432, in
main()
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\lora_train_popup.py", line 197, in main
train_network.train(args)
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\train_network.py", line 114, in train
import bitsandbytes as bnb
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\bitsandbytes_init_.py", line 6, in
from .autograd._functions import (
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\bitsandbytes\autograd_functions.py", line 5, in
import bitsandbytes.functional as F
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\bitsandbytes\functional.py", line 13, in
from .cextension import COMPILED_WITH_CUDA, lib
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\bitsandbytes\cextension.py", line 41, in
lib = CUDALibrary_Singleton.get_instance().lib
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\bitsandbytes\cextension.py", line 37, in get_instance
cls.instance.initialize()
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\bitsandbytes\cextension.py", line 31, in initialize
self.lib = ct.cdll.LoadLibrary(binary_path)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\ctypes_init
.py", line 452, in LoadLibrary
return self.dlltype(name)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\ctypes_init
.py", line 364, in init
if '/' in name or '\' in name:
TypeError: argument of type 'WindowsPath' is not iterable
Traceback (most recent call last):
File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\Scripts\accelerate.exe_main
.py", line 7, in
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Artem\ai\SD-вещи\kohya-ss-sd-scripts\sd-scripts\venv\Scripts\python.exe', 'lora_train_popup.py']' returned non-zero exit status 1.
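
The TypeError: argument of type 'WindowsPath' is not iterable is a known symptom of the Linux-only bitsandbytes build on Windows: it falls back to libbitsandbytes_cpu.so, which does not exist there, so installing CUDA system-wide does not help. At the time these issues were filed, a commonly used workaround was to copy Windows-patched files into the installed package, roughly as below (this assumes a bitsandbytes_windows folder containing the patched DLLs and setup files, and a venv layout; adapt paths as needed):

cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py

Newer bitsandbytes releases ship native Windows wheels, so this workaround should no longer be needed.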

What does face aug do

I looked at the code and it seems that face aug trains on crops around the faces.

What I want to know is whether it trains on the faces in addition to the full images, or only on the faces, and also how the random crop setting relates to this.

Cache latents optionaly

It would be great to make caching the image latents optional. It takes 12 hours to convert 100k images with left/right flip enabled on a 3090 (2.6 it/s). If I'm doing quick test runs over 1 or 2 epochs, this is currently not worth the time.

If that is too much work, simply improving the speed would be fine as well, since my GPU is never at 100% while processing.

Set initial epoch number when resuming

Currently the epoch count starts from 1 when resuming. It would be better to be able to set an arbitrary starting number via an argument
(or have it taken automatically from the file name).
