train a model using 24kHz (lpcnet, closed)

xiph commented on July 4, 2024
train a model using 24kHz

from lpcnet.

Comments (16)

jmvalin commented on July 4, 2024

The correct way to do it involves adding a few bands and changing a few constants throughout the code. It's not fundamentally hard, but it's easy to forget a few things. That being said, considering that 24 kHz isn't that big a change compared to 16 kHz, you may be able to get away with pretending that your 24 kHz audio is really 16 kHz and not modifying the code at all.
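
To see what the "pretend it is 16 kHz" shortcut implies, here is a rough sketch of the arithmetic. It assumes the values quoted elsewhere in this thread (frame_size=160, PITCH_MIN_PERIOD=32, PITCH_MAX_PERIOD=256) are counted in samples, so their meaning in milliseconds and hertz shifts when the real rate is 24 kHz. The helper name `describe` is made up for illustration and is not part of LPCNet.

```python
def describe(sample_rate_hz, frame_size=160, pitch_min_period=32, pitch_max_period=256):
    """Express sample-count constants in physical units for a given rate."""
    return {
        "frame_ms": 1000.0 * frame_size / sample_rate_hz,
        # A longer period means a lower pitch, so the max period gives the min F0.
        "f0_min_hz": sample_rate_hz / pitch_max_period,
        "f0_max_hz": sample_rate_hz / pitch_min_period,
    }


for rate in (16000, 24000):
    print(rate, describe(rate))
```

With unmodified constants, a frame covers 10 ms and the pitch search spans 62.5 to 500 Hz at 16 kHz, but about 6.7 ms and 93.75 to 750 Hz when the audio is really 24 kHz. Under these assumptions, voices that go below roughly 94 Hz would fall outside the search range.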

luvpine commented on July 4, 2024

Thank you for your answer.
Then, did you normalize the training wav files (e.g. to -26 dB or -18 dB) when you extracted features using dump_data?

hdmjdp commented on July 4, 2024

@luvpine Have you gotten good wav output at 24 kHz without modifying the code?

luvpine commented on July 4, 2024

@hdmjdp No, I mean I'd like to produce a high-quality 24 kHz wav file, not train using 24 kHz data.

hdmjdp commented on July 4, 2024

I know. Did you modify frame_size=160?

belevtsoff commented on July 4, 2024

@jmvalin Thanks a lot for the info! It would be great if you could give more details on what to change to switch to 22050Hz. Specifically, I was thinking of changing these things:

  • increase NB_BANDS to 24
  • change FRAME_SIZE to 256 (~11.6 ms)
  • OVERLAP_SIZE=256
  • TRAINING_OFFSET=128
  • WINDOW_SIZE=512

I'm not sure about the pitch parameters though. In particular, currently:
PITCH_MIN_PERIOD 32
PITCH_MAX_PERIOD 256
PITCH_FRAME_SIZE 320

My guess is that the first two should be increased in proportion to the sampling rate, but do they have to be powers of two? Also, I guess the last parameter should be increased to match the window length. Do these changes sound right to you? Thanks a lot!
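
As a rough illustration of the proportional scaling described above, the sketch below multiplies the three pitch constants by the ratio of the sampling rates. This is plain arithmetic, not a confirmed recipe: nobody in the thread verifies that these values work, and `scale_constants` is a made-up helper name.

```python
def scale_constants(constants, old_rate_hz, new_rate_hz):
    """Scale sample-count constants so they keep the same duration at a new rate."""
    ratio = new_rate_hz / old_rate_hz
    return {name: value * ratio for name, value in constants.items()}


# Values quoted in the question, assumed to apply to 16 kHz audio.
stock_16k = {"PITCH_MIN_PERIOD": 32, "PITCH_MAX_PERIOD": 256, "PITCH_FRAME_SIZE": 320}

scaled = scale_constants(stock_16k, 16000, 22050)
for name, value in scaled.items():
    print(f"{name}: {value:.1f} (nearest integer {round(value)})")

# Duration of the proposed FRAME_SIZE=256 at 22050 Hz, in milliseconds.
print(f"FRAME_SIZE=256 lasts {1000 * 256 / 22050:.1f} ms")
```

The scaled values come out at about 44, 353 and 441 samples, none of which is a power of two. Whether the pitch code requires powers of two is not answered in this thread.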

luvpine commented on July 4, 2024

@hdmjdp @belevtsoff In my case, I changed 16 kHz -> 24 kHz.

Key changes:

  • increase NB_BANDS to 20
  • change FRAME_SIZE to 480
  • change the total number of features from 55 to 59

I also changed nb_features, nb_total_features, the sampling rate, the feature index numbers (because the number of features increased), and the Conv1D layer filter size.
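
The 55 -> 59 change is consistent with a feature vector that holds two values per band plus a fixed number of extras. Here is a minimal sketch, assuming the layout is 2*NB_BANDS + 3 + LPC_ORDER with LPC_ORDER = 16 and a stock NB_BANDS of 18. None of that is stated in this thread, so check it against your copy of the source before relying on it.

```python
LPC_ORDER = 16  # assumed, not stated in the thread


def total_features(nb_bands, lpc_order=LPC_ORDER):
    """Assumed layout: two values per band, three extras, then the LPC coefficients."""
    return 2 * nb_bands + 3 + lpc_order


print(total_features(18))  # 55
print(total_features(20))  # 59
```

Under this layout, anything stored after the band values moves up by 4 when NB_BANDS goes from 18 to 20, which would explain why the hard-coded feature indices had to change as well.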

belevtsoff commented on July 4, 2024

@luvpine Thanks! This setup sort of works, but the pitch of the output comes out lowered. Have you observed anything like this? (I've changed PITCH_FRAME_SIZE to 480 as well.)

belevtsoff commented on July 4, 2024

Update: it actually keeps producing really bad results for me at 24 kHz, and I can't figure out what the problem is.

opencvbaby commented on July 4, 2024

I tried 44.1 kHz and 48 kHz, and both also produced bad waveforms. The training and validation losses look fine, so I don't know why.

opencvbaby commented on July 4, 2024

Solved the problem; 44.1 kHz now works well.

belevtsoff commented on July 4, 2024

@opencvbaby Great! Can you share the full recipe?

opencvbaby commented on July 4, 2024

@belevtsoff Sorry, I can't share the recipe. There is still some noise in the high frequencies, and I am still optimizing it. Maybe you should take care of the pitch_embedding part.
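
The thread never says what needed care in pitch_embedding. One possibility, offered purely as a guess, is that pitch periods measured in samples grow with the sampling rate and can exceed the number of rows in an embedding table sized for 16 kHz. The sketch below only illustrates that kind of range check; the table size of 256 and the sample periods are hypothetical and not taken from the LPCNet source.

```python
def out_of_range(indices, table_rows):
    """Return the indices that a lookup table with `table_rows` rows cannot hold."""
    return [i for i in indices if i < 0 or i >= table_rows]


# Hypothetical pitch periods (in samples) at 16 kHz, rescaled to 44.1 kHz.
scale = 44100 / 16000
periods_44k = [round(p * scale) for p in (40, 128, 255)]

print(periods_44k)
print(out_of_range(periods_44k, table_rows=256))
```

If a check like this reports anything for your own data, either the table needs more rows or the index needs rescaling before the lookup. Again, this is an assumption about what the comment meant.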

OswaldoBornemann commented on July 4, 2024

@opencvbaby @luvpine Could you share the parameters you changed, such as nb_features, nb_total_features, the sampling rate, the feature index numbers, and the Conv1D layer filter size? Thanks a lot.

linzai1992 commented on July 4, 2024

@belevtsoff Have you figured out your problems? Could you give some advice?

xiaoyangnihao commented on July 4, 2024

@opencvbaby Hi, what do you mean by taking care of the pitch_embedding part? Could you explain?
