adobe-research / deepafx-st
DeepAFx-ST - Style transfer of audio effects with differentiable signal processing. Please see https://csteinmetz1.github.io/DeepAFx-ST/
License: Other
Hello!
When I run the script:
(deepafx-st) E:\CODE\DeepAFx-ST>python scripts/process.py -i "E:\CODE\DeepAFx-ST\audio files\raw\160 JAZZ DNB2_MdcL3.wav" -r "E:\CODE\DeepAFx-ST\audio files\target\05 NOT TiGHT.wav" -c "E:\CODE\DeepAFx-ST\checkpoints\style\jamendo\autodiff\lightning_logs\version_0\checkpoints\epoch=362-step=1210241-val-jamendo-autodiff.ckpt"
I encounter a runtime error:
Resampling to 24000 Hz...
Traceback (most recent call last):
File "scripts/process.py", line 89, in <module>
x_24000 = torch.tensor(resampy.resample(x.view(-1).numpy(), x_sr, 24000))
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
Any ideas how I can resolve this?
System specs:
Windows 10
Anaconda Python
cmd.exe
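The error message itself points at the usual workaround: `.view()` needs a tensor whose memory layout is contiguous, while `.reshape()` copies when it has to. The sketch below reproduces the same class of error on a small, made-up tensor (it does not load the audio files from the report, and whether the loaded audio is non-contiguous for the same reason is an assumption).

```python
import torch

# Made-up 2 x 3 tensor produced by a transpose, so its memory is non-contiguous.
x = torch.arange(6).reshape(3, 2).t()

try:
    x.view(-1)            # same call pattern as line 89 of scripts/process.py
    view_failed = False
except RuntimeError:
    view_failed = True    # "view size is not compatible with input tensor's size and stride"

# reshape() falls back to a copy when a view is impossible.
flat = x.reshape(-1)
```

Applied to the failing line, that would mean swapping `x.view(-1)` for `x.reshape(-1)` (or `x.contiguous().view(-1)`). Note that flattening a multi-channel tensor this way places the channels one after another rather than mixing them.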
I want to process an entire audio file, but the code currently uses only the first five seconds of the input and the reference.
When I comment out the following two lines in process.py, it returns the processed file, but the audio appears more than once in the output:
x_24000 = x_24000[0:1, : 24000 * 5]
r_24000 = r_24000[0:1, : 24000 * 5]
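One possible explanation for the repeated audio, offered as an assumption rather than a confirmed diagnosis: the script flattens the loaded audio before resampling (the `x.view(-1).numpy()` call in the traceback above), so for a stereo file the channels would end up back to back in one long signal. The five-second crop would hide this, and removing it would expose it. The sketch uses a synthetic stereo array, not the project's code, to show the difference between flattening and averaging the channels.

```python
import numpy as np

sr = 24000  # rate the script resamples to

# Synthetic stereo clip, shape (channels, samples): 1 second per channel.
stereo = np.stack([np.full(sr, 0.25), np.full(sr, 0.75)])

# Flattening lays the channels end to end: 2 seconds, content heard twice.
flattened = stereo.reshape(-1)

# Averaging over the channel axis keeps the original duration.
mono = stereo.mean(axis=0)
```

If this is the cause, mixing down to mono (or selecting one channel) before the flatten-and-resample step would keep the output the same length as the input.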
Hi @csteinmetz1,
Great work.
I hope the inference notebook will be available on Google Colab soon.
There is an EQ and a compressor in "system.processor".
How can I get the details?
If I run process.py with the option "--time", it crashes with "forward() got an unexpected keyword argument 'time_it'":
process.py -i examples/voice_raw.wav -r examples/voice_produced.wav --time -c checkpoints/style/libritts/tcn1/lightning_logs/version_1/checkpoints/epoch=367-step=1226911-val-libritts-tcn1.ckpt
Exception has occurred: TypeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
forward() got an unexpected keyword argument 'time_it'
File "C:\Users\xxx\anaconda3\envs\afx\Lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "D:\DeepAFx-ST-0.1.0\scripts\process.py", line 130, in
y_hat, p, e, encoder_time_sec, dsp_time_sec = system(
File "C:\Users\xxx\anaconda3\envs\afx\Lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\xxx\anaconda3\envs\afx\Lib\runpy.py", line 194, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
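The traceback says the model's `forward()` does not declare a `time_it` parameter, so the keyword passed from process.py is rejected. A defensive pattern is to inspect the signature before passing an optional keyword. The sketch below is generic Python; the two `forward_*` functions are hypothetical stand-ins, not the DeepAFx-ST classes.

```python
import inspect

def accepts_keyword(fn, name):
    """Return True if fn can be called with name=... as a keyword argument."""
    params = inspect.signature(fn).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())

# Hypothetical stand-ins for two model variants.
def forward_without_timing(x, r):
    return x

def forward_with_timing(x, r, time_it=False):
    return x

def call_forward(fn, x, r, time_it):
    # Only forward the flag when the callee declares it.
    kwargs = {"time_it": time_it} if accepts_keyword(fn, "time_it") else {}
    return fn(x, r, **kwargs)
```

For a `torch.nn.Module`, the check would need to target `system.forward`, since calling the module itself accepts arbitrary keywords and passes them along, as the `_call_impl` frame above shows.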
Is it possible to change the resampling rate from 24000 Hz to 44100 Hz? Ideally, I would like to get the full 20 Hz-20 kHz bandwidth.
Is the Google Colab notebook available anywhere? The link on GitHub does not work. Thanks!
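For context on the bandwidth question: a sampled signal can only represent frequencies up to half its sample rate (the Nyquist limit). A minimal calculation, with a helper name chosen here for illustration:

```python
def nyquist_hz(sample_rate_hz):
    """Highest representable frequency for a given sample rate."""
    return sample_rate_hz / 2

limit_24k = nyquist_hz(24000)   # 12000.0 Hz
limit_44k = nyquist_hz(44100)   # 22050.0 Hz
```

So audio resampled to 24000 Hz tops out at 12 kHz, and 44100 Hz is needed to cover 20 kHz. Whether the provided checkpoints behave well at a different rate is a separate question that this arithmetic does not answer.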
DeepAFx-ST/deepafx_st/data/dataset.py
Lines 229 to 246 in 49bd0c8
I attempted to improve DeepAFx-ST. Here's what I did.
Download the zip from https://github.com/adobe-research/DeepAFx-ST and extract it.
Open Notepad++, press Ctrl+Shift+F (Find in Files), search for 24000, set the replacement to 44100, point the directory at the extracted folder, and run Replace in Files.
At this point you can safely add the checkpoints and examples.
Edit scripts/process.py
Replace x_44100 = torch.tensor(resampy.resample(x.view(-1).numpy(), x_sr, 44100))
with x_44100 = torch.tensor(resampy.resample(x.reshape(-1).numpy(), x_sr, 44100))
Under x_44100 = x_44100.view(1, -1)
insert x_44100 = x_44100[0:1, : x_44100.shape[-1] // 2]
Under x_44100 = x
insert x_44100 = x_44100[0:1, : x_44100.shape[-1]]
Replace r_44100 = torch.tensor(resampy.resample(r.view(-1).numpy(), r_sr, 44100))
with r_44100 = torch.tensor(resampy.resample(r.reshape(-1).numpy(), r_sr, 44100))
Under r_44100 = r_44100.view(1, -1)
insert r_44100 = r_44100[0:1, : r_44100.shape[-1] // 2]
Under r_44100 = r
insert r_44100 = r_44100[0:1, : r_44100.shape[-1]]
Remove x_44100 = x_44100[0:1, : 44100 * 5]
Remove r_44100 = r_44100[0:1, : 44100 * 5]
Replace filename = os.path.basename(args.input).replace(".wav", "")
with filename = os.path.splitext(os.path.basename(args.input))[0]
Remove reference = os.path.basename(args.reference).replace(".wav", "")
Replace out_filepath = os.path.join(dirname, f"{filename}_out_ref={reference}.wav")
with out_filepath = os.path.join(dirname, f"{filename}_DeepAFx-ST.wav")
Remove in_filepath = os.path.join(dirname, f"{filename}_in.wav")
Remove torchaudio.save(in_filepath, x_44100.cpu().view(1, -1), 44100)
You should be good to go!
This approach may have broken things unrelated to processing.
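The filename change in the steps above (`os.path.splitext` instead of `.replace(".wav", "")`) matters for inputs that are not plain `.wav` names. A small comparison, using made-up paths and helper names:

```python
import os

def stem_via_replace(path):
    # Original approach: removes every ".wav" substring, wherever it occurs.
    return os.path.basename(path).replace(".wav", "")

def stem_via_splitext(path):
    # Suggested approach: strips only the final extension, whatever it is.
    return os.path.splitext(os.path.basename(path))[0]

odd_name = "takes/drums.wav.v2.wav"   # hypothetical file names
other_ext = "takes/loop.flac"
```

With `replace`, `odd_name` loses the inner ".wav" as well, and `other_ext` keeps its ".flac" suffix, so the output path would end in ".flac_DeepAFx-ST.wav". With `splitext`, only the last extension is dropped in both cases.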