Comments (5)
from music_source_separation.
Yes, both vocals and accompaniment are supported.
The pretrained model doesn't seem to work. I downloaded the checkpoints with your script and then ran separate_vocals.sh, but a size-mismatch error was raised:
size mismatch for stft.conv_real.weight: copying a param with shape torch.Size([1025, 1, 2048]) from checkpoint, the shape in current model is torch.Size([257, 1, 512]).
size mismatch for stft.conv_imag.weight: copying a param with shape torch.Size([1025, 1, 2048]) from checkpoint, the shape in current model is torch.Size([257, 1, 512]).
size mismatch for istft.ola_window: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for istft.conv_real.weight: copying a param with shape torch.Size([2048, 2048, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 1]).
size mismatch for istft.conv_imag.weight: copying a param with shape torch.Size([2048, 2048, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 1]).
size mismatch for bn0.weight: copying a param with shape torch.Size([1025]) from checkpoint, the shape in current model is torch.Size([257]).
size mismatch for bn0.bias: copying a param with shape torch.Size([1025]) from checkpoint, the shape in current model is torch.Size([257]).
size mismatch for bn0.running_mean: copying a param with shape torch.Size([1025]) from checkpoint, the shape in current model is torch.Size([257]).
size mismatch for bn0.running_var: copying a param with shape torch.Size([1025]) from checkpoint, the shape in current model is torch.Size([257]).
size mismatch for encoder_block1.conv_block1.bn1.weight: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([8]).
size mismatch for encoder_block1.conv_block1.bn1.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([8]).
size mismatch for encoder_block1.conv_block1.bn1.running_mean: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([8]).
size mismatch for encoder_block1.conv_block1.bn1.running_var: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([8]).
size mismatch for encoder_block1.conv_block1.conv1.weight: copying a param with shape torch.Size([32, 2, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 8, 3, 3]).
size mismatch for encoder_block1.conv_block1.shortcut.weight: copying a param with shape torch.Size([32, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 8, 1, 1]).
size mismatch for after_conv2.weight: copying a param with shape torch.Size([8, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 1, 1]).
size mismatch for after_conv2.bias: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([32]).
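For reference, the 1025-vs-257 pattern in the log above is consistent with an FFT window-size disagreement: a real-valued STFT with window size N produces N // 2 + 1 frequency bins, so a 2048-sample window yields 1025 bins (the checkpoint) while a 512-sample window yields 257 (the model the script builds). A minimal sketch of that arithmetic (plain Python, not the repo's code):

```python
def stft_bins(window_size: int) -> int:
    """Number of frequency bins of a real-valued (one-sided) STFT
    with the given window size."""
    return window_size // 2 + 1

# Checkpoint trained with a 2048-sample window:
print(stft_bins(2048))  # 1025 -> matches the checkpoint's stft.conv_real.weight
# Model constructed with a 512-sample window:
print(stft_bins(512))   # 257  -> matches the model raising the error
```

So the checkpoint and the model were built with different STFT configurations; the fix is to load the checkpoint into a model constructed with the same window size it was trained with.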
Hey @qiuqiangkong, in separate_scripts/download_checkpoints.sh you download the ismir2021 checkpoint, but in separate_scripts/separate_vocals.sh the default model is resunet_subbandtime. That leads to the mismatch error 😄.
I've tried the resunet_ismir2021 model to separate vocals; it reduced the accompaniment in the audio by about 70%, which is awesome. Can I improve further by fine-tuning your pretrained model? Btw, could you release the resunet_subbandtime model?
Again, thanks for your amazing work.
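When a downloaded checkpoint and a model config might disagree like this, a quick sanity check is to compare parameter shapes before calling load_state_dict. A hedged sketch (the helper name and the shape dicts are illustrative; only the two parameter names come from the error log above):

```python
def mismatched_keys(ckpt_shapes: dict, model_shapes: dict) -> list:
    """Return names of parameters whose shapes differ between a
    checkpoint and the currently constructed model."""
    return sorted(k for k in ckpt_shapes
                  if k in model_shapes and ckpt_shapes[k] != model_shapes[k])

# Shapes taken from the error log above (checkpoint vs. current model):
ckpt = {"stft.conv_real.weight": (1025, 1, 2048), "bn0.weight": (1025,)}
model = {"stft.conv_real.weight": (257, 1, 512), "bn0.weight": (257,)}
print(mismatched_keys(ckpt, model))
# ['bn0.weight', 'stft.conv_real.weight']
```

With real PyTorch objects, ckpt_shapes would be {k: tuple(v.shape) for k, v in torch.load(path).items()} and model_shapes the same over model.state_dict(); an empty result means the checkpoint matches the model you built.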
Related Issues (20)
- Is the model weights link broken? HOT 5
- The weights download command has a problem
- Poor performance using short segments (<1 s) HOT 1
- Conversion no longer works at all
- Found two command and parameter problems when using the main branch HOT 3
- scipy.io.matlab.miobase.MatReadError: Mat file appears to be empty HOT 1
- Readme: SDR incorrectly defined
- I extended the data to 10+ classes, but many classes cannot be separated; do I need to set some parameters?
- Does the mixture data need normalization?
- How to convert audio PyTorch models to a tflite model HOT 1
- Model download problem HOT 1
- python3 -m bytesep download_checkpoints does not run correctly
- /home/dzy/bytesep_data/filters/f_4_64.mat HOT 4
- Can we separate out vocals and accompaniment together? I don't see any option for that. HOT 1
- I can't create indexes
- stft and istft are placed outside forward
- No module named '_bz2' HOT 1
- invalid choice: 'download_checkpoints'
- scipy.io.matlab.miobase.MatReadError: Mat file appears to be empty
- How to generate visualizations during the training phase?