Comments (8)
Sorry for the late response.
I added plt.close("all"), but then I suddenly started getting messages about freeing resources outside the main thread.
So, based on your instructions, I changed the method to read:
@staticmethod
def evaluation(eval_step, losses, mcd, source_len, target_len, source, target, prediction_forced, prediction, stop_prediction, stop_target, alignment, classifier):
    """Log evaluation results.

    Arguments:
        eval_step -- number of the current evaluation step (i.e. epoch)
        losses (dictionary of {loss name, value}) -- dictionary with values of batch losses
        mcd (float) -- evaluation Mel Cepstral Distortion
        source_len (tensor) -- number of characters of input utterances
        target_len (tensor) -- number of frames of ground-truth spectrograms
        source (tensor) -- input utterances
        target (tensor) -- ground-truth spectrograms
        prediction_forced (tensor) -- ground-truth-aligned spectrograms
        prediction (tensor) -- predicted spectrograms
        stop_prediction (tensor) -- predicted stop token probabilities
        stop_target (tensor) -- true stop token probabilities
        alignment (tensor) -- alignments (attention weights for each frame) of the last evaluation batch
        classifier (float) -- accuracy of the reversal classifier
    """

    # log losses
    total_loss = sum(losses.values())
    Logger._sw.add_scalar('Eval/loss_total', total_loss, eval_step)
    for n, l in losses.items():
        Logger._sw.add_scalar(f'Eval/loss_{n}', l, eval_step)

    # show random sample: spectrogram, stop token probability, alignment and audio
    idx = random.randint(0, alignment.size(0) - 1)
    predicted_spec = prediction[idx, :, :target_len[idx]].data.cpu().numpy()
    f_predicted_spec = prediction_forced[idx, :, :target_len[idx]].data.cpu().numpy()
    target_spec = target[idx, :, :target_len[idx]].data.cpu().numpy()

    # log spectrograms
    if hp.normalize_spectrogram:
        predicted_spec = audio.denormalize_spectrogram(predicted_spec, not hp.predict_linear)
        f_predicted_spec = audio.denormalize_spectrogram(f_predicted_spec, not hp.predict_linear)
        target_spec = audio.denormalize_spectrogram(target_spec, not hp.predict_linear)
    f = Logger._plot_spectrogram(predicted_spec)
    Logger._sw.add_figure("Predicted/generated", f, eval_step)
    plt.close(f)
    f = Logger._plot_spectrogram(f_predicted_spec)
    Logger._sw.add_figure("Predicted/forced", f, eval_step)
    plt.close(f)
    f = Logger._plot_spectrogram(target_spec)
    Logger._sw.add_figure("Target/eval", f, eval_step)
    plt.close(f)

    # log audio
    waveform = audio.inverse_spectrogram(predicted_spec, not hp.predict_linear)
    Logger._sw.add_audio("Audio/generated", waveform, eval_step, sample_rate=hp.sample_rate)
    waveform = audio.inverse_spectrogram(f_predicted_spec, not hp.predict_linear)
    Logger._sw.add_audio("Audio/forced", waveform, eval_step, sample_rate=hp.sample_rate)

    # log alignment
    alignment = alignment[idx, :target_len[idx], :source_len[idx]].data.cpu().numpy().T
    f = Logger._plot_alignment(alignment)
    Logger._sw.add_figure("Alignment/eval", f, eval_step)
    plt.close(f)

    # log source text
    utterance = text.to_text(source[idx].data.cpu().numpy()[:source_len[idx]], hp.use_phonemes)
    Logger._sw.add_text("Text/eval", utterance, eval_step)

    # log stop tokens
    f = Logger._plot_stop_tokens(stop_target[idx].data.cpu().numpy(), stop_prediction[idx].data.cpu().numpy())
    Logger._sw.add_figure("Stop/eval", f, eval_step)
    plt.close(f)

    # log mel cepstral distortion
    Logger._sw.add_scalar('Eval/mcd', mcd, eval_step)

    # log reversal language classifier accuracy
    if hp.reversal_classifier:
        Logger._sw.add_scalar('Eval/classifier', classifier, eval_step)
So far so good. Six epochs into the resumed training, Xorg memory is no longer increasing with every training loop, and there have been no crashes.
from multilingual_text_to_speech.
Hello, thank you for your observation!
Unfortunately, I cannot replicate the problem.
The code does not explicitly dispose of the figures that are passed into TensorBoard's SummaryWriter. However, the documentation of SummaryWriter.add_figure(tag, figure, global_step=None, close=True, walltime=None) says that the call should automatically close the figure if close=True.
Can you please change the utils/logging.py file as follows and test whether it works?
...
# log spectrograms
if hp.normalize_spectrogram:
    predicted_spec = audio.denormalize_spectrogram(predicted_spec, not hp.predict_linear)
    f_predicted_spec = audio.denormalize_spectrogram(f_predicted_spec, not hp.predict_linear)
    target_spec = audio.denormalize_spectrogram(target_spec, not hp.predict_linear)
f = Logger._plot_spectrogram(predicted_spec)
Logger._sw.add_figure("Predicted/generated", f, eval_step)
plt.close(f)
f = Logger._plot_spectrogram(f_predicted_spec)
Logger._sw.add_figure("Predicted/forced", f, eval_step)
plt.close(f)
f = Logger._plot_spectrogram(target_spec)
Logger._sw.add_figure("Target/eval", f, eval_step)
plt.close(f)

# log audio
waveform = audio.inverse_spectrogram(predicted_spec, not hp.predict_linear)
Logger._sw.add_audio("Audio/generated", waveform, eval_step, sample_rate=hp.sample_rate)
waveform = audio.inverse_spectrogram(f_predicted_spec, not hp.predict_linear)
Logger._sw.add_audio("Audio/forced", waveform, eval_step, sample_rate=hp.sample_rate)

# log alignment
alignment = alignment[idx, :target_len[idx], :source_len[idx]].data.cpu().numpy().T
f = Logger._plot_alignment(alignment)
Logger._sw.add_figure("Alignment/eval", f, eval_step)
plt.close(f)

# log source text
utterance = text.to_text(source[idx].data.cpu().numpy()[:source_len[idx]], hp.use_phonemes)
Logger._sw.add_text("Text/eval", utterance, eval_step)

# log stop tokens
Logger._sw.add_figure("Stop/eval", Logger._plot_stop_tokens(stop_target[idx].data.cpu().numpy(), stop_prediction[idx].data.cpu().numpy()), eval_step)
...
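For reference, the figure-accumulation mechanism behind this leak can be demonstrated in isolation. This is a minimal sketch, independent of the repository's code, using matplotlib's headless Agg backend; it shows that pyplot keeps every figure in a global registry until it is closed, which is why the explicit plt.close(f) / plt.close("all") calls matter:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no GUI/Xorg involvement
import matplotlib.pyplot as plt

# pyplot tracks every figure it creates in a global registry; figures stay
# alive there even when no user code holds a reference to them.
figs = [plt.figure() for _ in range(3)]
assert len(plt.get_fignums()) == 3  # all three are still registered

plt.close(figs[0])            # close one specific figure
assert len(plt.get_fignums()) == 2

plt.close("all")              # close every remaining registered figure
assert len(plt.get_fignums()) == 0
```

The same registry is why SummaryWriter.add_figure's close=True behavior matters: if the figure were not closed after being rendered, it would stay registered for the lifetime of the process.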
Thank you very much.
Sorry for the late response.
I added plt.close("all") as the last statement in both the evaluation reporting and the training reporting.
This seems to have solved the issue where figures that were never displayed on screen caused the Xorg server to keep reserving memory.
ack, now I'm getting crashes:
Exception ignored in: <function Image.__del__ at 0x7efbc32268c0>
Traceback (most recent call last):
File "/home/muksihs/git/Multilingual_Text_to_Speech/env/lib/python3.7/tkinter/__init__.py", line 3507, in __del__
self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
(the same exception repeats several more times)
I still cannot reproduce your issue.
This issue concerns something similar, and it was solved with the following change: "I fixed issue #5 by changing the backend of matplotlib from Tkinter (TkAgg) to PyQt5 (Qt5Agg)."
(See https://stackoverflow.com/questions/14694408/runtimeerror-main-thread-is-not-in-main-loop and http://matplotlib.1069221.n5.nabble.com/Matplotlib-Tk-and-multithreading-td40647.html)
Another way is probably to remove the plt.close(...) calls I suggested above and to occasionally force garbage collection explicitly:

import gc
gc.collect()

Can you try it out and let me know, please?
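A minimal sketch of that garbage-collection variant. Note one caveat: figures created through pyplot stay in its global registry and are never garbage-collected until closed, so this sketch builds Figure objects directly instead; make_eval_figure is a hypothetical stand-in for the logger's plotting helpers, not code from the repository:

```python
import gc
from matplotlib.figure import Figure  # Figures built directly (without pyplot)
                                      # are NOT held in pyplot's global registry

def make_eval_figure():
    # hypothetical stand-in for Logger._plot_spectrogram and friends
    fig = Figure()
    ax = fig.add_subplot(111)
    ax.plot([0, 1], [1, 0])
    return fig

fig = make_eval_figure()
del fig                    # drop the last reference instead of closing it
collected = gc.collect()   # force a collection pass; returns the number of
                           # unreachable objects found (figures often contain
                           # internal reference cycles)
```

Because matplotlib objects tend to form reference cycles, an explicit gc.collect() at the end of each evaluation pass reclaims them deterministically instead of waiting for the cyclic collector to run on its own schedule.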
ok, I added PyQt5 to the environment and added the following to the main script:

import matplotlib
matplotlib.use("Qt5Agg")

And I'm resuming training now.
With the plt.close(f) code, no crashes because of thread violations so far (5 hours of run time).
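One detail worth noting (general matplotlib behavior, not specific to this repository): matplotlib.use() should run before pyplot is imported anywhere in the process, otherwise the default backend (often TkAgg) may already be initialized. A minimal sketch, using Agg instead of Qt5Agg so it runs without a display server:

```python
# Select the backend before pyplot is imported anywhere in the process.
import matplotlib
matplotlib.use("Agg")  # "Qt5Agg" in the comment above; Agg here so the
                       # sketch needs no display server or GUI toolkit
import matplotlib.pyplot as plt  # pyplot now initializes with that backend

assert matplotlib.get_backend().lower() == "agg"
fig = plt.figure()   # no Tk/Qt window machinery is created for this figure
plt.close(fig)
```

For a training script that only ever writes figures to TensorBoard, a non-interactive backend like Agg sidesteps the Tk main-loop issue entirely, since no GUI thread is involved at all.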
I am glad to hear that!