Could you at least tell in the guide what location are we supposed to put the downloaded pretrains in?

It only says "Or download manually". But if I download, for example, archive of Quick96, do I extract its contents into /home/user/DeepFaceLab_Linux/DeepFaceLab/models/Model_Quick96 or in some other mysterious folder?

SAEHD refuses to use my GPU

I have a GTX 1070. It recognizes it in windows, but on the linux build, it's as if DeepFaceLab only recognizes my CPU.
Im kinda a linux noob, so be gentle.

/!\ ffmpeg fail problem

i am trying to launch bash 2_extract_image_from_data_src.sh

/!\ ffmpeg fail, job commandline:['ffmpeg', '-i', '/home/piai/DeepFaceLab_Linux/workspace/data_src.mp4', '-pix_fmt', 'rgb24', '/home/piai/DeepFaceLab_Linux/workspace/data_src/%5d.png']
Done.

but this happen... help please

unicode ascii error

when I tried to training model, it occurred an error:(system Ubutun)
Error: 'ascii' codec can't encode character '\u200b' in position 5: ordinal not in range(128)
Traceback (most recent call last):
File "/root/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 57, in trainerThread
debug=debug,
File "/root/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 162, in init
nn.initialize(self.device_config)
File "/root/DeepFaceLab_Linux/DeepFaceLab/core/leras/nn.py", line 73, in initialize
os.environ['CUDA_CACHE_MAXSIZE'] = '536870912' #512Mb (32mb default)
File "/root/anaconda3/envs/deepfacelab/lib/python3.6/os.py", line 673, in setitem
key = self.encodekey(key)
File "/root/anaconda3/envs/deepfacelab/lib/python3.6/os.py", line 745, in encode
return value.encode(encoding, 'surrogateescape')
UnicodeEncodeError: 'ascii' codec can't encode character '\u200b' in position 5: ordinal not in range(128)

XSeg_data_dst/src_mask_edit won't run because of xcb

i cannot get mask_edit work, fail on not being able to load xcb:

"QObject::moveToThread: Current thread (0x55e3156bd070) is not the object's thread (0x55e315529a80).
Cannot move to target thread (0x55e3156bd070)

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/debian/anaconda3/envs/deepfacelab/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem."

managed to get it run with QT_QPA_PLATFORM=vnc, however cannot press escape, so cannot turn it of gracefully.

xseg_train works for some reason.

running on ubuntu 20.04.1

Nvidia GPU is not detected / doesn't show option to choose GPU

Hello,

I have been facing this issue since days now, finally had to ask it here.

So I have been using DFL on windows since quite some time now, it works smooth and in terms of installation and maintenance its a boon. but when it comes to performance I noticed that windows operating system was eating up almost 40% of my GPU memory due to some internal processes and it overall reduced my speed and accuracy. Hence I decided to switch to Linux version which I had used about two years ago and it gave amazing performance back then too.

Now the problem,

In windows, dfl used to give option to choose from either CPU ot Nvidia GPU for faceset extraction, merging and training, but when I run the same bash files on Linux it skips the part where I am supposed to choose either of those and as I can see that only 8 cores are printed when the extraction starts it is obvious that CPU is being used and not GPU.

Some Info on my system,
Kernel => Linux 192.168.1.5 5.14.18-300.fc35.x86_64 #1 SMP Fri Nov 12 16:43:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
CPU => AMD Ryzen 9 5900HX with Radeon Graphics
GPU => Nvidia RTX-3060 Mobile GPU
OS- Fedora 35

So here is what happens when i run command bash -i 5_data_dst_extract_faces_S3FD.sh, there used to be an option to choose for CPU or GPU in windows

===============================================================================
(deepfacelab) [mactimus@192 scripts]$ bash -i 5_data_dst_extract_faces_S3FD.sh
[wf] Face type ( f/wf/head ?:help ) :
wf
[512] Image size ( 256-2048 ?:help ) :
512
[90] Jpeg quality ( 1-100 ?:help ) :
90
Extracting faces...

Running on CPU2
Running on CPU6
Running on CPU3
Running on CPU0
Running on CPU4
Running on CPU7
Running on CPU1
Running on CPU5

===============================================================================

I am able to see my device when i run,

conda install numba
numba -s

==================================================================================

Also when I run the following command in python env

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

-------------------------I see the following ------------------

(deepfacelab) [mactimus@192 scripts]$ python
Python 3.7.11 (default, Jul 27 2021, 14:32:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

from tensorflow.python.client import device_lib
2021-11-24 12:57:07.902835: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
print(device_lib.list_local_devices())
2021-11-24 12:57:13.142488: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-11-24 12:57:13.143137: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-11-24 12:57:14.272368: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-24 12:57:14.272548: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.702GHz coreCount: 30 deviceMemorySize: 5.81GiB deviceMemoryBandwidth: 312.97GiB/s
2021-11-24 12:57:14.272581: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-11-24 12:57:14.274654: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-11-24 12:57:14.274722: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-11-24 12:57:14.289957: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-11-24 12:57:14.290264: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-11-24 12:57:14.290364: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory
2021-11-24 12:57:14.291851: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-11-24 12:57:14.292030: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2021-11-24 12:57:14.292052: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-11-24 12:57:14.330203: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-11-24 12:57:14.330238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2021-11-24 12:57:14.330244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 7823247586605606195
]

====================================================================================

Here is my output for command nvidia-smi

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1901 G /usr/libexec/Xorg 4MiB |
+-----------------------------------------------------------------------------+

Any help is most welcomed,
Thanks

NEW VERSION TEST

The repository is alive, don't worry.
Soon I plan to upload a new version of the scripts, as well as scripts for converting models to DFLive and its subsequent launch.
Who can help me with script tests on different video cards?

I created a discussion where you will write the results of the work with a full description of the system and python packages

Required:

RTX 3x series
RTX 2x series
GTX 1x series

#Systems:

Ubuntu 16,18,20
Debian 10,11,12
CentOS
Arch Linux
Redhat

Let's make DFLab and DFLive easier to use in unix systems :)

Pretrain model

I wanna know if I can ask for pretrain SAEHD model?

Please update the Readme

install the deepfacelab as the README.md
iit will have erro like :::::
[[[[[[
python3.6: can't open file './deepfacelab/main.py':
]]]]]]
if you cut the folder 'DeepFaceLab' to the folder 'scripts' can solve this promblem (just place the folders as the windows version)

then
another issue came out
run the .sh
[[[
Traceback (most recent call last):
。。。
import error LibXrender
]]]
after install
libsm6
libxrender1
libxext-dev

error

`(deepfacelab) can@kali:~/DeepFaceLab_Linux/scripts$ bash 1_clear_workspace.sh

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

$ conda init <SHELL_NAME>

Currently supported shells are:

bash
fish
tcsh
xonsh
zsh
powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.
`

script update

hi @nagadit ,

First of all, thank you for your excellent work, it has provided a great help.

And do you have time to update the script? I need to convert the .dfm model for deepfakelive, but the script seems to be missing a lot of script version updates.

Thank you very much！

Running a shell script in KDE Neon 20.04 gives error (Anaconda)

The error given is.

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

$ conda init <SHELL_NAME>

Currently supported shells are:

bash
fish
tcsh
xonsh
zsh
powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.

Running every single listed conda init commands give similar output.

(deepfacelab) inspirer@ArvidsPCKDE:/DeepFaceLab_Linux/scripts$ conda init powershell
no change /home/inspirer/anaconda3/condabin/conda
no change /home/inspirer/anaconda3/bin/conda
no change /home/inspirer/anaconda3/bin/conda-env
no change /home/inspirer/anaconda3/bin/activate
no change /home/inspirer/anaconda3/bin/deactivate
no change /home/inspirer/anaconda3/etc/profile.d/conda.sh
no change /home/inspirer/anaconda3/etc/fish/conf.d/conda.fish
no change /home/inspirer/anaconda3/shell/condabin/Conda.psm1
no change /home/inspirer/anaconda3/shell/condabin/conda-hook.ps1
no change /home/inspirer/anaconda3/lib/python3.8/site-packages/xontrib/conda.xsh
no change /home/inspirer/anaconda3/etc/profile.d/conda.csh
No action taken.
(deepfacelab) inspirer@ArvidsPCKDE:/DeepFaceLab_Linux/scripts$

Help is appreciated.

Regards
Arvid Boisen

preview during training and interactive merger missing Linux

I'm using the linux implementation, just downloaded the repo yesterday.
When I run training script I don't get a preview window of the training process (no I am not using the "no preview" version of
the script). Also, this might be an issue with the machine I am running this on, but when I press enter the script doesn't
quit training.
Similarly, When I run the 7_merge_Quick96.sh script, I don't get the use interactive merger option.
I checked online and there seems to be at least as of Jan 2022 someone who posted a video using Linux and where
the preview during training and the interactive merger was an option.

Have these features been removed in an update or put somewhere else?

Extremely outdated, not working currently, doesn't detect GPU

Been trying to run on a cloud linux remote desktop for the last 2 days (runpod.io).
No results.

needs deleting, sorry duplicate

ubuntu20. 4. The training model does not use GPU on the CPU CUDA = 11 cudnn = 8 tensorflow GPU = 2.4.0

How to use these scripts for head swap?

If anyone has worked with head swap, can you please point to the doc which does this. A small explanation would be fine as well. Thanks!

Guide For Setup On M1 apple silicon Macos Please?

Can Anyone Make aStep Guide OR Youtube TUTORIAL for Using DeepfakeLab on M1 Macos please.
And if Possible, anyway to use 8 Core Gpu and 16 Core Neural Engine For it somehow?
On this Machine It could Rock, Just somebody Help.

AMD gpu don't work

I'm running DFL on ubuntu 22.04 and when i extract faces it refuses to use my RX 570 8GB, i alredy downgraded tensorflow-gpu, idk what to do, pls help me

2_extract_image_from_data_src.sh: 4: /main.py: not found

hi
so where is the main.py

i run it in the script directory

can not start a new thread with python3.6

when i run bash 6_train_Quick96.sh

Is this a problem caused by the tensorflow version?

Initializing models: 0%| | 0/5 [00:00<?, ?it/s]Error: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node node encoder/down1/downs_0/conv1/weight/Initializer/cai (defined at home/wangtao/anaconda3/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) placed on device Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation:
with tf.device(None): </home/wangtao/anaconda3/lib/python3.6/site-packages/tensorflow_core/python/ops/variables.py:1816>
with tf.device(/GPU:0): </home2/wangtao/yhl-xhzy/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py:259> was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:2].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
Assign: CPU
VariableV2: CPU
Const: CPU XLA_CPU XLA_GPU
Identity: CPU XLA_CPU XLA_GPU
Fill: CPU XLA_CPU XLA_GPU

RTX 3090 fails in training SAEHD or XSeg if CPU does not support AVX2 - "Illegal instruction, core dumped". Solution below - use Tensorflow 2.5.0 instead

Thanks so much to Nagadit for hosting this repository. Here is my problem and hopefully it helps someone else:

Expected behaviour

Training SAEHD or XSeg on DFL with RTX 3090, tensorflow 2.4.0

Actual behaviour

Python throws Error code of "illegal instruction, core dumped" on last line of DFL script (which says "train")
This is despite Tensorflow 2.4.0 correctly recognising the RTX 3090, and despite cuda 11.0 or 11.1 and compatible nvidia drivers (455.28) all working correctly.
You can check if Tensorflow is recognising your GPU by typing

python

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

Steps to reproduce

Install DFL on Linux as per Nagadit repository but use python 3.8 instead, and cudnn 8.0.5 and cudatoolkit 11.0 from Conda or 11.1 from Nvidia direct. Use latest requirements-cuda.txt since some newer versions of some modules are required (like h5py==2.10.0 or pyqt5==5.15.2)
Tensorflow 2.4.0
Use latest files from iperov Github.

Solution:
This will only apply to some people out there with older CPUs, but here is what I eventually found:

This is a Tensorflow 2.4.0 problem. Even if RTX 3090 works with TF 2.4.0, older CPUs do not. TF requires AVX or AVX2 support. TF 2.3 and 2.2 support AVX and AVX2. The tensorflow guys forgot to include AVX support in 2.4.0, despite it being compatible! Newer CPUs with AVX2 support will be ok with TF 2.4.0.
I therefore compiled my own tensorflow for my machine, and this produced TF 2.5.0 which had AVX support. I can now fully train DFL using RTX 3090!

To compile your own TF you can use: https://www.tensorflow.org/install/source
I compiled with cudnn 8.0.5 dev files (filename libcudnn8-dev_8.0.5.39-1+cuda11.1) and cudatoolkit 11.1 installed.
Compiling TF requires Bazel. To install bazelisk, can use:
$ wget https://github.com/bazelbuild/bazelisk/releases/download/v1.7.4/bazelisk-linux-amd64
$ chmod +x bazelisk-linux-amd64
$ sudo mv bazelisk-linux-amd64 /usr/local/bin/bazel
$ bazel version

Issuing the command:
$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

This produced tensorflow 2.5.0 and this works great with RTX3090 and current DFL build. I can train SAEHD now.

Don't think this problem is common but hopefully of some use to someone out there

Linux installation driver issues RTX 3090

Sorry if this is the wrong place but I wouldn't know where else to ask:
forum and Discord was not helpful so far.

I have been trying to get DFL working on ubuntu 20 for days now but it doesn't recocnize the RTX 3090 and starts training on CPU I guess.

I have installed everything by the guide and therefore think there is a diver - cudatoolkit incompatibility present? Can that be?

Ubuntu installes latest drivers + 460 and cudatoolkit 11.3 as far as I understand. Do I understand correctly that the conda env cudatoolkit is the one that gets respected or does the systemwide 11.3 need to be removed?
Then I tried to rollback the driver to 455 or 450 since some people told me that is working correctly. I could never get it done.
After installation and blacklisting nouveau nvidia-smi reports "no devices were found".

I am terribly desperate that I can't get it working and considering to switch to the windows installation hoping it will work out of the box.

XSeg Editor Tooltips Seem Bugged

On Ubuntu 20.04. Using the XSeg Editor works, but none of the tool-tips that show the functionality of the control work. Only a blank box is displayed on mouseover where the text should be. No errors reported in the console. Maybe missing Font on Linux? Or a color issue?

AMD GPU not an option?

I recently started playing around with deepfacelab. At first I toyed with it on my home PC that has no GPU. Decided to move the process to my workstation at the office. It has an RX560

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 550 640SP / RX 560/560X] (rev ff)

Kernel :
4.18.0-193.14.3.el8_2.x86_64

When running 5_data_dst_extract_faces_S3FD.sh I do not get an option to use the GPU. Instead it uses the AMD Ryzen 5 3400G. Granted it is faster than my personal i5-4590T at 2.91s/it vs 3.22s/it, but I was hoping for a little better.

What am I missing?

Ошибка (XSeg editor)

При попытке открыть XSeg editor такая ошибка:

Running XSeg editor.
QObject::moveToThread: Current thread (0x55625b3d5310) is not the object's thread (0x55625b85eec0).
Cannot move to target thread (0x55625b3d5310)

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/(username)/.conda/envs/deepfacelab/lib/python3.7/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.

5_XSeg_data_src_mask_edit.sh: строка 5: 1087987 Аварийный останов         (стек памяти сброшен на диск) $DFL_PYTHON "$DFL_SRC/main.py" xseg editor --input-dir "$DFL_WORKSPACE/data_src/aligned"
(deepfacelab)

Пробовал разные версии pyqt, не помогло. Дистрибутив arch.

Script 2 stalls

Hello,

I followed the installation instructions closely in the README.md and all went smoothly. Script 1 works correctly with no errors. Running "bash 2_extract_image_from_data_src.sh" also produces no errors but the operation stalls and the CPU remains idle.
Might this be related to my NVIDIA drivers?

nvidia-smi prints:

Mon Feb 14 10:09:57 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 34%   35C    P8     3W / 100W |    378MiB /  4096MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1190      G   /usr/lib/xorg/Xorg                204MiB |
|    0   N/A  N/A      4391      G   ...AAAAAAAAA= --shared-files       13MiB |
|    0   N/A  N/A      4466      G   firefox                           157MiB |
+-----------------------------------------------------------------------------+

Other suggestions, please?

I can't run shell

Hello, i have installed Linux Version on Google cloud.
But when i run, all sh give this error:

Consider that directory there is it. I have try to change ev.sh file, using absolute path, but i don't resolve problem.

Directory was installed here:

/home/deepfaker/Deepfacelab_linux/...

Using xseg and quick96 together?

Is it possible to use xseg to mask the faces etc , and quick96 to train the model ?

I tried to run the script after I finished the installation, but nothing returned

(deepfacelab)  wei@wei-B75  ~/文档/LAB/DeepFaceLab_Linux/scripts   master ±  ./2_extract_image_from_data_src.sh

just like this.
After running for an hour...two hours, there was still nothing. Is it a problem with the graphics card driver?

Something is missing here

When I launch a train script it gives me this error:

Initializing models: 0%| | 0/5 [00:00<?, ?it/s]
Error: No OpKernel was registered to support Op 'DepthToSpace' used by node DepthToSpace (defined at home/marco/DeepFaceLab_Linux/DeepFaceLab/core/leras/ops/init.py:336) with these attrs: [data_format="NCHW", block_size=2, T=DT_FLOAT]
Registered devices: [CPU]
Registered kernels:
device='GPU'; T in [DT_QINT8]
device='GPU'; T in [DT_HALF]
device='GPU'; T in [DT_FLOAT]
device='CPU'; T in [DT_VARIANT]; data_format in ["NHWC"]
device='CPU'; T in [DT_RESOURCE]; data_format in ["NHWC"]

Missing export scripts

The export scripts for AMP and SAEHD (6) export AMP as dfm.bat and 6) export AMP as dfm.bat) are missing.

Deepfacelab not utilizing GPU, failing, even though all depencencies fulfilled, python sees GPU

I am having a problem that I cannot seem to pin down. When running the trainer script, I am asked to select a GPU, which I do, so Deepfacelab clearly sees it, but then the whole thing bails out saying it cannot assign a device. Below is the output.

./6_train_SAEHD.sh

Running trainer.

Choose one of saved models, or enter a name to create a new model.
[r] : rename
[d] : delete

[0] : DF-UD256 - latest
:
0
Loading DF-UD256_SAEHD model...

Choose one or several GPU idxs (separated by comma).

[CPU] : CPU
[0] : GeForce GTX 1660 Ti

[0] Which GPU indexes to choose? :
0

Initializing models: 0%| | 0/7 [00:00<?, ?it/s]
Error: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node node encoder/down1/downs_0/conv1/weight/Initializer/cai (defined at run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/initializers/init.py:13) placed on device Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation:
with tf.device(None): </home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py:1796>
with tf.device(/GPU:0): </run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py:233> was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
Identity: CPU XLA_CPU
VariableV2: CPU
Const: CPU XLA_CPU
Assign: CPU
Fill: CPU XLA_CPU

Colocation members, user-requested devices, and framework assigned devices, if any:
encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const)
encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const)
encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill)
encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0
encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0
encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0
Assign_1 (Assign) /device:GPU:0
Assign_260 (Assign) /device:GPU:0

 [[node encoder/down1/downs_0/conv1/weight/Initializer/_cai_ (defined at run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/initializers/__init__.py:13) ]]Additional information about colocations:No node-device colocations were active during op 'encoder/down1/downs_0/conv1/weight/Initializer/_cai_' creation.

Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation:
with tf.device(None): </home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py:1796>
with tf.device(/GPU:0): </run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py:233>

Original stack trace for 'encoder/down1/downs_0/conv1/weight/Initializer/cai':
File "home/user/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "home/user/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "home/user/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 57, in trainerThread
debug=debug,
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 189, in init
self.on_initialize()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py", line 236, in on_initialize
encoder_out_ch = self.encoder.compute_output_channels ( (nn.floatx, bgr_shape))
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 147, in compute_output_channels
shape = self.compute_output_shape(shapes)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 121, in compute_output_shape
self.build()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 35, in _build_sub
layer.build()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 20, in _build_sub
self.build_sub(sublayer, f"{name}{i}")
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 35, in _build_sub
layer.build()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 33, in _build_sub
layer.build_weights()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Conv2D.py", line 76, in build_weights
self.weight = tf.get_variable("weight", (self.kernel_size,self.kernel_size,self.in_ch,self.out_ch), dtype=self.dtype, initializer=kernel_initializer, trainable=self.trainable )
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1572, in get_variable
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1315, in get_variable
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 569, in get_variable
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 521, in _true_getter
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 940, in _get_single_variable
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 260, in call
return cls._variable_v1_call(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 221, in _variable_v1_call
shape=shape)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 199, in
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2613, in default_variable_creator
shape=shape)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 264, in call
return super(VariableMetaclass, cls).call(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 1668, in init
shape=shape)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 1798, in _init_from_args
initial_value(), name="initial_value", dtype=dtype)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 909, in
partition_info=partition_info)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/initializers/init.py", line 13, in call
return tf.zeros( shape, dtype=dtype, name="cai")
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 2747, in wrapped
tensor = fun(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 2806, in zeros
output = fill(shape, constant(zero, dtype=dtype), name=name)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 239, in fill
result = gen_array_ops.fill(dims, value, name=name)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3412, in fill
"Fill", dims=dims, value=value, name=name)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 744, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3485, in _create_op_internal
op_def=op_def)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1949, in init
self._traceback = tf_stack.extract_stack()

Traceback (most recent call last):
File "/home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1365, in do_call
return fn(*args)
File "/home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1348, in run_fn
self.extend_graph()
File "/home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1388, in extend_graph
tf_session.ExtendSession(self.session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node {{colocation_node encoder/down1/downs_0/conv1/weight/Initializer/cai}} was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index=-1 requested_device_name='/device:GPU:0' assigned_device_name='' resource_device_name='/device:GPU:0' supported_device_types=[CPU] possible_devices_=[]
Identity: CPU XLA_CPU
VariableV2: CPU
Const: CPU XLA_CPU
Assign: CPU
Fill: CPU XLA_CPU

Colocation members, user-requested devices, and framework assigned devices, if any:
encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const)
encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const)
encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill)
encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0
encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0
encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0
Assign_1 (Assign) /device:GPU:0
Assign_260 (Assign) /device:GPU:0

 [[{{node encoder/down1/downs_0/conv1/weight/Initializer/_cai_}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 57, in trainerThread
debug=debug,
File "/run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 189, in init
self.on_initialize()
File "/run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py", line 568, in on_initialize
do_init = not model.load_weights( self.get_strpath_storage_for_file(filename) )
File "/run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Saveable.py", line 96, in load_weights
nn.batch_set_value(tuples)
File "/run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/ops/init.py", line 29, in batch_set_value
nn.tf_sess.run(assign_ops, feed_dict=feed_dict)
File "/home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 958, in run
run_metadata_ptr)
File "/home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1181, in run
feed_dict_tensor, options, run_metadata)
File "/home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1359, in do_run
run_metadata)
File "/home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1384, in do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node node encoder/down1/downs_0/conv1/weight/Initializer/cai (defined at run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/initializers/init.py:13) placed on device Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation:
with tf.device(None): </home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py:1796>
with tf.device(/GPU:0): </run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py:233> was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index=-1 requested_device_name='/device:GPU:0' assigned_device_name='' resource_device_name_='/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
Identity: CPU XLA_CPU
VariableV2: CPU
Const: CPU XLA_CPU
Assign: CPU
Fill: CPU XLA_CPU

Colocation members, user-requested devices, and framework assigned devices, if any:
encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const)
encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const)
encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill)
encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0
encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0
encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0
Assign_1 (Assign) /device:GPU:0
Assign_260 (Assign) /device:GPU:0

 [[node encoder/down1/downs_0/conv1/weight/Initializer/_cai_ (defined at run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/initializers/__init__.py:13) ]]Additional information about colocations:No node-device colocations were active during op 'encoder/down1/downs_0/conv1/weight/Initializer/_cai_' creation.

Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation:
with tf.device(None): </home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py:1796>
with tf.device(/GPU:0): </run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py:233>

Original stack trace for 'encoder/down1/downs_0/conv1/weight/Initializer/cai':
File "home/user/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "home/user/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "home/user/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 57, in trainerThread
debug=debug,
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 189, in init
self.on_initialize()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py", line 236, in on_initialize
encoder_out_ch = self.encoder.compute_output_channels ( (nn.floatx, bgr_shape))
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 147, in compute_output_channels
shape = self.compute_output_shape(shapes)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 121, in compute_output_shape
self.build()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 35, in _build_sub
layer.build()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 20, in _build_sub
self.build_sub(sublayer, f"{name}{i}")
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 35, in _build_sub
layer.build()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 33, in _build_sub
layer.build_weights()
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Conv2D.py", line 76, in build_weights
self.weight = tf.get_variable("weight", (self.kernel_size,self.kernel_size,self.in_ch,self.out_ch), dtype=self.dtype, initializer=kernel_initializer, trainable=self.trainable )
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1572, in get_variable
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1315, in get_variable
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 569, in get_variable
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 521, in _true_getter
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 940, in _get_single_variable
aggregation=aggregation)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 260, in call
return cls._variable_v1_call(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 221, in _variable_v1_call
shape=shape)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 199, in
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2613, in default_variable_creator
shape=shape)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 264, in call
return super(VariableMetaclass, cls).call(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 1668, in init
shape=shape)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 1798, in _init_from_args
initial_value(), name="initial_value", dtype=dtype)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 909, in
partition_info=partition_info)
File "run/media/user/Flashie/DeepFaceLab_Linux/DeepFaceLab/core/leras/initializers/init.py", line 13, in call
return tf.zeros( shape, dtype=dtype, name="cai")
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 2747, in wrapped
tensor = fun(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 2806, in zeros
output = fill(shape, constant(zero, dtype=dtype), name=name)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 239, in fill
result = gen_array_ops.fill(dims, value, name=name)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3412, in fill
"Fill", dims=dims, value=value, name=name)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 744, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3485, in _create_op_internal
op_def=op_def)
File "home/user/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1949, in init
self._traceback = tf_stack.extract_stack()

any help is appreciated. I see on iperovs repo some 3080 users having similar problems, but those are on windows, and I have a 166TI.

Conda packages not available from sources

$ conda create -n deepfacelab -c main python=3.7 cudnn=7.6.5 cudatoolkit=10.1.243
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

cudnn=7.6.5
cudatoolkit=10.1.243

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

Ffmpeg not launched?

Hello, I created my deepfacelab environment with "conda create -n deepfacelab -c main python=3.7 cudnn=7.6.5 cudatoolkit=10.1.243" on Arch linux using the README on the nagadit git pages. It installed with no problems. I've have also tried other python versions.

I am in doubt however as to how ffmpeg-python should be installed. I try with pip install ffmpeg-python within the deepfaceloab environment. This seems to install the libs but no binaries.

I've also tried:

conda install -c conda-forge ffmpeg-python

This installs and ffmpeg binary in ~/anaconda3/envs/deepfacelab/bin

When I try to run the second script in the scripts directory with "bash 2_extract_image_from_data_src.sh" the script stalls. There is no prompt to enter how many frames per second the data_src.mp4 video has. I suspect ffmpeg is not being launched.

Can someone please help? Have been stuck for days...

Extracting face not working on GPU

When 5_data_dst_extract_faces_S3FD.sh running on my ubuntu 22.04 with RTX2080, no face is detected, but it works fine when using CPU

Инструкция по установке устарела.

Если всё делать по инструкции, deepfacelab не будет работать. Судя по всему, последняя версия deepfacelab не работает с CUDA версии ниже 11 и cudnn версии ниже 8. Так же requirements-cuda не подходят, с ними не появляется список видеокарт и обработка работает только на CPU. Мне помогла установка последних версий этих пакетов.
Правильное создание окружения deepfacelab- conda create -n deepfacelab -c main python=3.7 cudnn=8.2.1 cudatoolkit=11
Установка нужных пакетов(переключившись на окружение deepfacelab)- python -m pip install tqdm numpy h5py opencv-python ffmpeg-python scikit-image scipy colorama tensorflow-gpu pyqt5 tf2onnx
Надеюсь, что это вам поможет.

File not found to the google drive link attached

https://drive.google.com/drive/folders/17a9b9zmLdnAlItifcGSE9ixDIDAT3YxP attached at Step 4. Download CelebA Dataset is coming as file not found. Please upload the file and close this issue

only use cpu train SAEHD

Since it is in a Cloud Servers, parameters cannot be passed in. All parameters are default values
and anaconda = 2021.05 cuda=11.3.
appreciate!

Something wrong with data_format

Error: Only NHWC data_format supported on CPU. Got NCHW [[nod DepthToSpace (defined at /DeepFaceLab_Linux/DeepFaceLab/core/leras/ops/init.py:337)]]

XSeg_train unable to run

Environment

$ uname -a
Linux GPU-01 5.4.0-120-generic #136-Ubuntu SMP Fri Jun 10 13:40:48 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ nvidia-smi
Sat Jul  2 12:23:56 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:17:00.0 Off |                  N/A |
|  0%   35C    P8    12W / 250W |   1091MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:65:00.0 Off |                  N/A |
|  0%   30C    P8    11W / 250W |      8MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:66:00.0 Off |                  N/A |
|  0%   31C    P8    10W / 250W |   6690MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

Steps to reproduce

I am following the guide from Druuzil Tech and Games here.

Copy data_src.mp4 into workspace
Copy the 1.7GB RTT model into workspace/model
Copy the 9GB RTM WF Faceset into workspace/data_dst/aligned
./2_extract_image_from_data_src.sh
./4_data_src_extract_faces_S3FD.sh
- Faced a GPU error and downgrade tensorflow-gpu to 2.3.1 as per #20
./5_XSeg_data_src_mask_apply.sh
./5_XSeg_train.sh

Error Output

Loading samples: 100%|#########################################################################################| 25461/25461 [00:57<00:00, 439.62it/s]
Loaded 63012 packed faces from /data/home/kryan/DeepFaceLab_Linux/workspace/data_dst/aligned
Filtering: 100%|##############################################################################################| 88473/88473 [00:58<00:00, 1514.01it/s]
Using 278 segmented samples.
================== Model Summary ==================
==                                               ==
==        Model name: XSeg                       ==
==                                               ==
== Current iteration: 1                          ==
==                                               ==
==---------------- Model Options ----------------==
==                                               ==
==         face_type: wf                         ==
==          pretrain: False                      ==
==        batch_size: 8                          ==
==                                               ==
==----------------- Running On ------------------==
==                                               ==
==      Device index: 0                          ==
==              Name: NVIDIA GeForce GTX 1080 Ti ==
==              VRAM: 9.03GB                     ==
==                                               ==
===================================================
Starting. Press "Enter" to stop training and save model.
: cannot connect to X server .8308]
Error: DNN Backward Data function launch failure : input shape([8,32,258,258]) filter shape([3,3,32,1])
         [[node gradients/Conv2D_30_grad/Conv2DBackpropInput (defined at /DeepFaceLab_Linux/DeepFaceLab/core/leras/ops/__init__.py:55) ]]

Errors may have originated from an input operation.
Input Source operations connected to node gradients/Conv2D_30_grad/Conv2DBackpropInput:
 XSeg/out_conv/weight/read (defined at /DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Conv2D.py:61)

Original stack trace for 'gradients/Conv2D_30_grad/Conv2DBackpropInput':
  File "/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 58, in trainerThread
    debug=debug)
  File "/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 17, in __init__
    super().__init__(*args, force_model_class_name='XSeg', **kwargs)
  File "/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 118, in on_initialize
    gpu_loss_gvs += [ nn.gradients ( gpu_loss, self.model.get_weights() ) ]
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/ops/__init__.py", line 55, in tf_gradients
    grads = gradients.gradients(loss, vars, colocate_gradients_with_ops=True )
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 172, in gradients
    unconnected_gradients)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py", line 669, in _GradientsHelper
    lambda: grad_fn(op, *out_grads))
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py", line 336, in _MaybeCompile
    return grad_fn()  # Exit early
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py", line 669, in <lambda>
    lambda: grad_fn(op, *out_grads))
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/nn_grad.py", line 596, in _Conv2DGrad
    data_format=data_format),
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1300, in conv2d_backprop_input
    name=name)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 744, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3485, in _create_op_internal
    op_def=op_def)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1949, in __init__
    self._traceback = tf_stack.extract_stack()

...which was originally created as op 'Conv2D_30', defined at:
  File "/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
[elided 4 identical lines from previous traceback]
  File "/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 103, in on_initialize
    gpu_pred_logits_t, gpu_pred_t = self.model.flow(gpu_input_t, pretrain=self.pretrain)
  File "/DeepFaceLab_Linux/DeepFaceLab/facelib/XSegNet.py", line 85, in flow
    return self.model(x, pretrain=pretrain)
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/XSeg.py", line 167, in forward
    logits = self.out_conv(x)
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/LayerBase.py", line 14, in __call__
    return self.forward(*args, **kwargs)
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Conv2D.py", line 101, in forward
    x = tf.nn.conv2d(x, weight, strides, 'VALID', dilations=dilations, data_format=nn.data_format)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/nn_ops.py", line 2273, in conv2d
    name=name)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 979, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 744, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3485, in _create_op_internal
    op_def=op_def)

Traceback (most recent call last):
  File "/data/home/kryan/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/data/home/kryan/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/data/home/kryan/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: DNN Backward Data function launch failure : input shape([8,32,258,258]) filter shape([3,3,32,1])
         [[{{node gradients/Conv2D_30_grad/Conv2DBackpropInput}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/home/kryan/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 129, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "/data/home/kryan/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 474, in train_one_iter
    losses = self.onTrainOneIter()
  File "/data/home/kryan/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 194, in onTrainOneIter
    loss = self.train (image_np, target_np)
  File "/data/home/kryan/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 136, in train
    l, _ = nn.tf_sess.run ( [loss, loss_gv_op], feed_dict={self.model.input_t :input_np, self.model.target_t :target_np })
  File "/data/home/kryan/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 958, in run
    run_metadata_ptr)
  File "/data/home/kryan/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1181, in _run
    feed_dict_tensor, options, run_metadata)
  File "/data/home/kryan/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/data/home/kryan/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: DNN Backward Data function launch failure : input shape([8,32,258,258]) filter shape([3,3,32,1])
         [[node gradients/Conv2D_30_grad/Conv2DBackpropInput (defined at /DeepFaceLab_Linux/DeepFaceLab/core/leras/ops/__init__.py:55) ]]

Errors may have originated from an input operation.
Input Source operations connected to node gradients/Conv2D_30_grad/Conv2DBackpropInput:
 XSeg/out_conv/weight/read (defined at /DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Conv2D.py:61)

Original stack trace for 'gradients/Conv2D_30_grad/Conv2DBackpropInput':
  File "/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 58, in trainerThread
    debug=debug)
  File "/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 17, in __init__
    super().__init__(*args, force_model_class_name='XSeg', **kwargs)
  File "/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 118, in on_initialize
    gpu_loss_gvs += [ nn.gradients ( gpu_loss, self.model.get_weights() ) ]
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/ops/__init__.py", line 55, in tf_gradients
    grads = gradients.gradients(loss, vars, colocate_gradients_with_ops=True )
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 172, in gradients
    unconnected_gradients)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py", line 669, in _GradientsHelper
    lambda: grad_fn(op, *out_grads))
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py", line 336, in _MaybeCompile
    return grad_fn()  # Exit early
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gradients_util.py", line 669, in <lambda>
    lambda: grad_fn(op, *out_grads))
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/nn_grad.py", line 596, in _Conv2DGrad
    data_format=data_format),
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1300, in conv2d_backprop_input
    name=name)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 744, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3485, in _create_op_internal
    op_def=op_def)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1949, in __init__
    self._traceback = tf_stack.extract_stack()

...which was originally created as op 'Conv2D_30', defined at:
  File "/.conda/envs/deepfacelab/lib/python3.7/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
[elided 4 identical lines from previous traceback]
  File "/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 103, in on_initialize
    gpu_pred_logits_t, gpu_pred_t = self.model.flow(gpu_input_t, pretrain=self.pretrain)
  File "/DeepFaceLab_Linux/DeepFaceLab/facelib/XSegNet.py", line 85, in flow
    return self.model(x, pretrain=pretrain)
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/XSeg.py", line 167, in forward
    logits = self.out_conv(x)
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/LayerBase.py", line 14, in __call__
    return self.forward(*args, **kwargs)
  File "/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Conv2D.py", line 101, in forward
    x = tf.nn.conv2d(x, weight, strides, 'VALID', dilations=dilations, data_format=nn.data_format)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/nn_ops.py", line 2273, in conv2d
    name=name)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 979, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 744, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/.conda/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3485, in _create_op_internal
    op_def=op_def)

Can not start a new thread with python3.6

when i run bash 6_train_Quick96.sh

file not found

hello!
when running this sh 2_extract_PNG_from_video_data_src.sh
i get this error 2_extract_PNG_from_video_data_src.sh: line 2: source: env.sh: file not found

Can you please post a license?

With no license explicitly posted, legally speaking a project defaults to nobody having permission to use these in any way, which causes many companies to just forbid any use whatsoever.

Can you please choose an appropriate common open source license and add it to the repository? Thanks.

can not assign GPU device for operation

I have used your instruction to install 'deepfacelab' on my computer,
everything is ok until I want to run this file: '5_XSEG_train.sh'
it detects my GPU device as '[0] GeForce GTX 1080 ti' and I select this device to execute the process, but the following error occurs after running the code.

I'm using the following packages and devices:
CUDA11.1
GPU 1080 ti 11 GIG
Ubuntu 18.04

Error: Cannot assign a device for operation XSeg/conv01/conv/weight: node XSeg/conv01/conv/weight (defined at media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Conv2D.py:76) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[XSeg/conv01/conv/weight]]
Traceback (most recent call last):
File "/home/mosi/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1375, in _do_call
return fn(*args)
File "/home/mosi/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1358, in _run_fn
self._extend_graph()
File "/home/mosi/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1398, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation XSeg/conv01/conv/weight: {{node XSeg/conv01/conv/weight}} was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[XSeg/conv01/conv/weight]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 57, in trainerThread
debug=debug,
File "/media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 17, in init
super().init(*args, force_model_class_name='XSeg', **kwargs)
File "/media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 189, in init
self.on_initialize()
File "/media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/models/Model_XSeg/Model.py", line 68, in on_initialize
data_format=nn.data_format)
File "/media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/facelib/XSegNet.py", line 68, in init
do_init = not model.load_weights( model_file_path )
File "/media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Saveable.py", line 96, in load_weights
nn.batch_set_value(tuples)
File "/media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/core/leras/ops/init.py", line 29, in batch_set_value
nn.tf_sess.run(assign_ops, feed_dict=feed_dict)
File "/home/mosi/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 968, in run
run_metadata_ptr)
File "/home/mosi/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1191, in _run
feed_dict_tensor, options, run_metadata)
File "/home/mosi/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1369, in _do_run
run_metadata)
File "/home/mosi/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1394, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation XSeg/conv01/conv/weight: node XSeg/conv01/conv/weight (defined at media/mosi/0D10-9619/DeepFaceLab/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Conv2D.py:76) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[XSeg/conv01/conv/weight]]

Is there any way to run it in CPU mode only? Going to use google colab.

Is there any way to run this with no GPU? CPU only?

I am trying to extract without a GPU, CPU only. Going to use google colab. I can't seem to get any images to extract after using the script. It displays nothing in Terminal nor in the folder. Is this possible or is there a way to force CPU on my end?

Maybe this helps someone having issues

First of all, thanks for maintaining this repo.

I had some struggle getting this running (again), first of all current env.sh defines "export DFL_PYTHON="python3.6"" but #3 of the installation instructions states "3.7". Changing env.sh to 3.7 worked.

I further had dependency-issues executing "conda create -n deepfacelab -c main python=3.7 cudnn=8.0.5 cudatoolkit=11.2" under Ubuntu 20.04, creating environment deepfacelab with anaconda-navigator & installing cudnn (7.6.5) & cudatoolkit (10.2.89) worked.

In my case, i also had to add "eval "$(conda shell.bash hook)"" to env.sh for not getting "CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'." error.

Hope this helps.

Tensorflow issue

I'm using Linux Mint 20.

My issue is with tensorflow.
Conda by default installs tensorflow2.2 which doesn't work with the dst face extract script and returns this error.

AttributeError: module 'tensorflow' has no attribute 'global_variables_initializer'

Downgrading tensorflow to 1.9 works with the dst face extract script but returns the following error with train SAEHD sript.

Error: module 'tensorflow.initializers' has no attribute 'glorot_uniform'
Traceback (most recent call last):
  File "/home/bertus/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 57, in trainerThread
    debug=debug,
  File "/home/bertus/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 189, in __init__
    self.on_initialize()
  File "/home/bertus/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py", line 239, in on_initialize
    inter_out_ch = self.inter.compute_output_channels ( (nn.floatx, (None,encoder_out_ch)))
  File "/home/bertus/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 147, in compute_output_channels
    shape = self.compute_output_shape(shapes)
  File "/home/bertus/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 121, in compute_output_shape
    self.build()
  File "/home/bertus/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 65, in build
    self._build_sub(v[name],name)
  File "/home/bertus/DeepFaceLab_Linux/DeepFaceLab/core/leras/models/ModelBase.py", line 33, in _build_sub
    layer.build_weights()
  File "/home/bertus/DeepFaceLab_Linux/DeepFaceLab/core/leras/layers/Dense.py", line 45, in build_weights
    kernel_initializer = tf.initializers.glorot_uniform(dtype=self.dtype)
AttributeError: module 'tensorflow.initializers' has no attribute 'glorot_uniform'

I tried to downgrade tensorflow more since it seems to be recent changes in the module that's causing this but then i ran into dependancy issue so the best solution seems to be to upgrade the scripts but i have no Python knowledge.

Error pretraining with CelebA

I ran the 4.1_download_CelebA.sh script and have a pretrain_CelebA folder inside DeepFaceLab, and the faceset.pak inside that. However when I try to train and select pretraining on, it says

[y] Enable pretraining mode ( y/n ?:help ) : 
y
Initializing models: 100%|#####################################################################################################################################################################################| 5/5 [00:02<00:00,  2.25it/s]
Loading samples: 0it [00:00, ?it/s]
Error: No training data provided.
Traceback (most recent call last):
  File "/home/user1/DeepFaceLab_Linux/DeepFaceLab/mainscripts/Trainer.py", line 106, in trainerThread
    debug=debug)
  File "/home/user1/DeepFaceLab_Linux/DeepFaceLab/models/ModelBase.py", line 246, in __init__
    self.on_initialize()
  File "/home/user1/DeepFaceLab_Linux/DeepFaceLab/models/Model_SAEHD/Model.py", line 853, in on_initialize
    generators_count=src_generators_count
  File "/home/user1/DeepFaceLab_Linux/DeepFaceLab/samplelib/SampleGeneratorFace.py", line 48, in __init__
    raise ValueError('No training data provided.')
ValueError: No training data provided.

How can I pretrain the model?

nagadit / deepfacelab_linux Goto Github PK

deepfacelab_linux's Introduction

Using

1. Install Anaconda

2. Install System Dependencies

3. Install DeepFaceLab

4. Download Pretrain (optional)

5. Navigate to the scripts directory and begin using DeepFaceLab_Linux ᗡ:

deepfacelab_linux's People

Contributors

Stargazers

Watchers

Forkers

deepfacelab_linux's Issues

I am able to see my device when i run,

Also when I run the following command in python env

Here is my output for command nvidia-smi

Expected behaviour

Actual behaviour

Steps to reproduce

Environment

Steps to reproduce

Error Output

Recommend Projects

Recommend Topics

Recommend Org