Comments (4)

nattaran commented on August 22, 2024

I have the same question. I could not find the range or a description for these quality-based parameters anywhere in the repo. It would be great to add them to the README.

from nisqa.

gabrielmittag commented on August 22, 2024

Hi,

That's correct for Noisiness, Coloration and Discontinuity. For Loudness, the score represents how optimal the loudness is: a sample with non-optimal loudness (either too loud or too quiet) will be rated with a lower score.

Here is a brief explanation of the different dimensions:
[Image: explanation of the quality dimensions]

The following graph shows the model's average loudness predictions vs. the active speech level in dBov. The optimal level is around -26 dBov because most samples in the dataset were normalized to that level (apart from the ones given a non-optimal loudness on purpose).
[Figure: average predicted loudness score vs. active speech level in dBov]

Figure source: https://link.springer.com/book/10.1007/978-3-030-91479-0
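
Since a low Loudness score alone does not say whether a sample is too quiet or too loud, one way to tell the two cases apart is to measure the level of the clip itself. The following is a minimal Python sketch and not part of NISQA: it computes a plain RMS level relative to digital full scale as a rough stand-in for the active speech level in dBov, then compares it with the roughly -26 dBov optimum mentioned above. The function names and the 3 dB tolerance are assumptions made for illustration only.

```python
import math

OPTIMAL_LEVEL_DBOV = -26.0  # approximate optimum quoted in the comment above
TOLERANCE_DB = 3.0          # hypothetical margin, not taken from NISQA


def rms_level_dbov(samples, full_scale=1.0):
    """Return the RMS level of float samples relative to full scale, in dB.

    This is a plain RMS over the whole signal, not a true active speech
    level (which ignores pauses), so clips with long silences read lower.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square / (full_scale * full_scale))


def loudness_direction(level_dbov):
    """Say on which side of the assumed optimum a measured level falls."""
    if level_dbov < OPTIMAL_LEVEL_DBOV - TOLERANCE_DB:
        return "too quiet"
    if level_dbov > OPTIMAL_LEVEL_DBOV + TOLERANCE_DB:
        return "too loud"
    return "near optimal"
```

For example, `loudness_direction(rms_level_dbov(samples))` on a clip that received a low Loudness score would indicate which side of the optimum it is on, assuming `samples` holds floats scaled to the range [-1, 1].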


gabrielmittag commented on August 22, 2024

Hi,

For the overall quality MOS and the four quality dimensions, the range is [1, 5], where 1 is poor quality and 5 is excellent quality. By the way, the quality dimensions (Noisiness, Coloration, Discontinuity, Loudness) cannot be used for synthetic speech. To predict the Naturalness of synthetic speech, use the nisqa_tts.tar weights.
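
As a rough illustration of the two rules above (scores lie in [1, 5], and the four dimension scores do not apply to synthetic speech), here is a small Python sketch. It is not part of NISQA: the dictionary keys and the function name are hypothetical, and treating an out-of-range value as an error is a choice made for this sketch.

```python
SCORE_MIN, SCORE_MAX = 1.0, 5.0

# Hypothetical key names for the four quality dimensions.
QUALITY_DIMENSIONS = ("noisiness", "coloration", "discontinuity", "loudness")


def usable_scores(scores, synthetic_speech):
    """Return the scores that are meaningful for this kind of speech.

    scores maps a score name to a float expected to lie in [1, 5].
    For synthetic speech the four quality dimensions are dropped.
    """
    for name, value in scores.items():
        if not SCORE_MIN <= value <= SCORE_MAX:
            raise ValueError(f"{name}={value} is outside [1, 5]")
    if synthetic_speech:
        return {k: v for k, v in scores.items() if k not in QUALITY_DIMENSIONS}
    return dict(scores)
```

For a natural-speech clip all entries are returned unchanged; for a synthetic clip only the entries that are not quality dimensions remain.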

Let me know if anything is still unclear. I'll try to add some more info to the readme or in the wiki.


StianHanssen commented on August 22, 2024

@gabrielmittag Hi, sorry to bring this up again, but I just wanted to clarify how to read the score for the Loudness dimension. For example, if I get a 1 in Loudness, does that mean the speech is too quiet, or does it mean the speech is so loud that it is peaking, which is bad in a different way?

For Noisiness, Coloration and Discontinuity, I take it that the closer the score is to 5, the less of each there is; i.e., an audio clip that is not very noisy scores closer to 5.

