First of all thanks for the great model! I tested it extensively by now and ran across


Hints on improvements for training and matching about knn-vc HOT 3 CLOSED

bshall commented on September 27, 2024

Hints on improvements for training and matching


Comments (3)

asusdisciple commented on September 27, 2024

I was able to solve the issue. The problem lies with the train_loader. Most of the "training time" comes from heavy data loading in a multi-GPU setting. By default, the workers are shut down after every epoch, so the dataset has to be loaded again. Setting persistent_workers=True reduces the training time to 13 sec/epoch on 4 GPUs. It's still not optimal, since actual training takes only a few seconds of that time, but it is a large improvement.
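A minimal sketch of the persistent_workers change, assuming a standard torch.utils.data.DataLoader. The helper name make_train_loader and the default batch size and worker count are illustrative assumptions, not taken from the knn-vc training script. In a multi-GPU run, a DistributedSampler would normally replace shuffle=True.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_train_loader(dataset, batch_size=16, num_workers=4):
    """Build a DataLoader whose worker processes survive across epochs.

    Hypothetical helper for illustration; the values are placeholders.
    """
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        # Pinned host memory is what lets .to(device, non_blocking=True) overlap copies.
        pin_memory=torch.cuda.is_available(),
        # Keep workers alive between epochs; PyTorch rejects this when num_workers == 0.
        persistent_workers=num_workers > 0,
        drop_last=True,
    )
```

With persistent_workers=True, each worker keeps its copy of the dataset object between epochs, so any per-worker setup cost is paid once instead of once per epoch.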

I also found that it helps to reduce batch_size to a fairly small value, because the transfer to the device in

    for i, batch in pb:
        if rank == 0:
            start_b = time.time()
        x, y, _, y_mel = batch
        x = x.to(device, non_blocking=True)
        y = y.to(device, non_blocking=True)
        y_mel = y_mel.to(device, non_blocking=True)
        y = y.unsqueeze(1)

took a long time. With a smaller batch size, one epoch now trains in 7 seconds.
Just wanted to let you guys know, in case you run into these problems.
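To check where the time goes before changing batch_size, a rough timing sketch like the following can separate time spent waiting on the loader from time spent copying to the device. The function name time_batches and the max_batches parameter are assumptions made for illustration; it expects each batch to be a sequence of tensors.

```python
import time

import torch


def time_batches(loader, device, max_batches=10):
    """Return (seconds waiting on the loader, seconds copying to the device)."""
    wait, copy = 0.0, 0.0
    it = iter(loader)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(it)
        except StopIteration:
            break
        t1 = time.perf_counter()
        batch = [t.to(device, non_blocking=True) for t in batch]
        if device.type == "cuda":
            # Non-blocking copies return early; synchronize so the timing is honest.
            torch.cuda.synchronize(device)
        t2 = time.perf_counter()
        wait += t1 - t0
        copy += t2 - t1
    return wait, copy
```

If the first number dominates, the bottleneck is data loading rather than the device transfer, and worker settings are the more likely fix.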


HninLwin-byte commented on September 27, 2024

While fine-tuning the model on my own dataset, I got the error below. If anyone has encountered the same error, please share how to solve it.
checkpoints directory : /content/drive/MyDrive/data/knn-vc/pertained_model
/usr/local/lib/python3.10/dist-packages/torchaudio/transforms/_transforms.py:580: UserWarning: Argument 'onesided' has been deprecated and has no influence on the behavior of this module.
warnings.warn(
Epoch: 1: 0% 0/100 [00:00<?, ?it/s]
0% 0/31 [00:00<?, ?it/s]Before padding - Wav shape: torch.Size([1, 7040])
After padding - Wav shape: torch.Size([1, 7744])
Before padding - Wav shape: torch.Size([1, 7040])
After padding - Wav shape: torch.Size([1, 7744])
Before padding - Wav shape: torch.Size([1, 0])
Before padding - Wav shape: torch.Size([1, 7040])
After padding - Wav shape: torch.Size([1, 7744])
0% 0/31 [00:01<?, ?it/s]
Epoch: 1: 0% 0/100 [00:01<?, ?it/s]
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/content/drive/.shortcut-targets-by-id/1PdgEkagsNCmi1R4J43LiU8ygwHM4ooqv/data/knn-vc/hifigan/train.py", line 342, in <module>
main()
File "/content/drive/.shortcut-targets-by-id/1PdgEkagsNCmi1R4J43LiU8ygwHM4ooqv/data/knn-vc/hifigan/train.py", line 338, in main
train(0, a, h)
File "/content/drive/.shortcut-targets-by-id/1PdgEkagsNCmi1R4J43LiU8ygwHM4ooqv/data/knn-vc/hifigan/train.py", line 145, in train
for i, batch in pb:
File "/usr/local/lib/python3.10/dist-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
data.reraise()
File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 694, in reraise
raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/content/drive/.shortcut-targets-by-id/1PdgEkagsNCmi1R4J43LiU8ygwHM4ooqv/data/knn-vc/hifigan/meldataset.py", line 203, in __getitem__
mel_loss = self.alt_melspec(audio)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/content/drive/.shortcut-targets-by-id/1PdgEkagsNCmi1R4J43LiU8ygwHM4ooqv/data/knn-vc/hifigan/meldataset.py", line 77, in forward
wav = F.pad(wav, ((self.n_fft - self.hop_size) // 2, (self.n_fft - self.hop_size) // 2), "reflect")
RuntimeError: Expected 2D or 3D (batch mode) tensor with possibly 0 batch size and other non-zero dimensions for input, but got: [1, 0]


RF5 commented on September 27, 2024

Hi @asusdisciple, thanks for the suggestions. We have now added the persistent workers trick to the training code.

And @HninLwin-byte, thanks for your issue. It looks like one of your audio files might be corrupt or shorter than the minimum allowable length (about 160 ms). I would double-check that all files in your dataset are readable (not corrupt) and at least 160 ms long.
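One way to run that check is sketched below, assuming the dataset consists of PCM WAV files that Python's standard wave module can open. The function name find_bad_wavs is hypothetical, and the 0.16 s threshold simply restates the approximate minimum above. Non-PCM WAVs (for example 32-bit float) may be reported as unreadable by this module even though they are not corrupt.

```python
import wave
from pathlib import Path

MIN_SECONDS = 0.16  # approximate minimum length mentioned above (assumed exact here)


def find_bad_wavs(root, min_seconds=MIN_SECONDS):
    """Return (path, reason) pairs for WAV files that are unreadable or too short."""
    bad = []
    for path in sorted(Path(root).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as f:
                duration = f.getnframes() / f.getframerate()
        except (wave.Error, EOFError, OSError, ZeroDivisionError) as exc:
            # Empty, truncated, or non-PCM files end up here.
            bad.append((path, f"unreadable: {exc}"))
            continue
        if duration < min_seconds:
            bad.append((path, f"too short: {duration:.3f} s"))
    return bad
```

Calling find_bad_wavs on the dataset directory and removing or re-exporting the listed files should rule out zero-length inputs like the `[1, 0]` tensor in the traceback.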
