
Comments (5)

IpsumDominum commented on June 8, 2024

Hello, just want to clarify: is this not possible because the current model uses a bidirectional LSTM?

from allosaurus.

xinjli commented on June 8, 2024

Hi, thanks for asking!

Unfortunately, the current model cannot do real-time transcription. A real-time model would need a special architecture, which is not implemented in the current model.
If you want to use it for real-time purposes, the best way for now is probably to feed your audio stream into the model in fixed-length chunks (e.g. 2 seconds) and then concatenate the outputs.
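A minimal sketch of that chunk-and-concatenate approach, in case it helps. It is not allosaurus code: `recognize_chunk` is a hypothetical callable standing in for whatever runs the model on one chunk of samples, and the names `split_into_chunks` and `transcribe_in_chunks` are made up for illustration.

```python
from typing import Callable, Iterator, Sequence


def split_into_chunks(samples: Sequence[float], sample_rate: int,
                      chunk_seconds: float = 2.0) -> Iterator[Sequence[float]]:
    """Yield consecutive, non-overlapping chunks of about chunk_seconds each."""
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("a chunk must contain at least one sample")
    for start in range(0, len(samples), chunk_len):
        # The last chunk may be shorter than chunk_len.
        yield samples[start:start + chunk_len]


def transcribe_in_chunks(samples: Sequence[float], sample_rate: int,
                         recognize_chunk: Callable[[Sequence[float], int], str],
                         chunk_seconds: float = 2.0) -> str:
    """Run a (hypothetical) per-chunk recognizer and join the outputs in order."""
    outputs = [recognize_chunk(chunk, sample_rate)
               for chunk in split_into_chunks(samples, sample_rate, chunk_seconds)]
    # Skip empty results so silent chunks do not add stray separators.
    return " ".join(text for text in outputs if text)
```

Note that a phone straddling a chunk boundary may be cut in two, so the joined output can differ from what a single pass over the whole recording would give.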


willstott101 commented on June 8, 2024

Fair enough, I'm curious about the theoretical minimum latency of the model. I see there is a "window_size": 0.025, in pm_config.json and "window_size": 3, in am_config.json (uni2005). Is the minimum latency therefore basically 0.025 * 3 (seconds I assume), or am I wrong in assuming those window sizes are the overall limiter of data passed to any given execution of the neural network? Perhaps the network keeps state as windows are passed to it. Perhaps those windows aren't actually what I think they are. 🤷


xinjli commented on June 8, 2024

Hi, sorry for the late reply.

For this model, the minimum latency would be 0.025 + 0.01 + 0.01 = 0.045 seconds, because the windows are overlapping by 0.01. And of course you also need to consider the time spent on feature extraction and inference.
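The arithmetic can be written out as a small sketch. It rests on an assumption that is not confirmed in this thread: that the `3` in am_config.json is the number of consecutive feature windows stacked together, and that the 0.01 is the step between the starts of consecutive windows. The function name is made up for illustration.

```python
def stacked_window_span(window_size: float, window_shift: float,
                        num_windows: int) -> float:
    """Seconds of audio covered by num_windows windows spaced window_shift apart."""
    if num_windows < 1:
        raise ValueError("need at least one window")
    # The first window covers window_size seconds; each further window
    # extends the covered span by one shift.
    return window_size + (num_windows - 1) * window_shift


# Values quoted in this thread: 0.025 s window, 0.01 s step, 3 windows.
span = stacked_window_span(0.025, 0.01, 3)
print(f"{span:.3f} s")  # prints "0.045 s"
```

This only covers the audio the model has to wait for; feature extraction and inference time come on top of it.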


padster06 commented on June 8, 2024

Hi, continuing on Will's thread about real-time audio streaming, we ran into a bit of a blocker. The lifter function (pm.feature.lifter) seems to change its output based on the length of the array passed in as "cepstra": the same array with fewer elements gets returned with different values. Is there an obvious way to make this function invariant to input array length? Or do we need to keep state and have a rolling-average type of thing?

Thanks
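For comparison, here is a sketch of the textbook sinusoidal lifter applied frame by frame. It is an assumption that this matches what pm.feature.lifter is meant to compute; the function name `sinusoidal_lifter` and the default `L=22` are illustrative, not taken from the allosaurus source. Written this way, each coefficient's weight depends only on its index within a frame, so the result does not depend on how many frames are passed in.

```python
import math
from typing import List, Sequence


def sinusoidal_lifter(cepstra: Sequence[Sequence[float]],
                      L: int = 22) -> List[List[float]]:
    """Apply the standard sinusoidal lifter to a list of cepstral frames.

    cepstra is laid out as frames x coefficients. The weight for
    coefficient n is 1 + (L / 2) * sin(pi * n / L), independent of the
    number of frames.
    """
    if L <= 0:
        # A non-positive L disables liftering.
        return [list(frame) for frame in cepstra]
    return [
        [(1 + (L / 2.0) * math.sin(math.pi * n / L)) * c
         for n, c in enumerate(frame)]
        for frame in cepstra
    ]


frames = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
full = sinusoidal_lifter(frames)
# Lifting only the first two frames gives the same values as the
# first two frames of the full result.
assert sinusoidal_lifter(frames[:2]) == full[:2]
```

If the real function still returns length-dependent values, the dependence may come from somewhere else, for example the array's axis order or a normalization computed over the whole input. That is a guess, not something confirmed here.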

