
grvydev / s.a.t.u.r.d.a.y

Stars: 656 · Watchers: 11 · Forks: 34 · Size: 5.99 MB

A toolbox for working with WebRTC, Audio and AI

License: MIT License

Languages: Go 78.07%, JavaScript 11.28%, HTML 2.48%, Dockerfile 2.92%, Shell 0.07%, Makefile 2.60%, Python 2.58%
Topics: ai, audio, golang, opus, opus-codec, webrtc, whisper-cpp

s.a.t.u.r.d.a.y's People

Contributors

attendev, grvydev, tryanderr0r


s.a.t.u.r.d.a.y's Issues

Error with Python Install

I'm interested in trying this out and tinkering with it on my system, but at the Python install step I get:

  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/private/var/folders/j1/ydw6655510z_5szr12ptrhn40000gn/T/pip-install-oi62efds/llvmlite_6ee7e6f42e224bccba1d315e951338df/setup.py", line 55, in <module>
          _guard_py_ver()
        File "/private/var/folders/j1/ydw6655510z_5szr12ptrhn40000gn/T/pip-install-oi62efds/llvmlite_6ee7e6f42e224bccba1d315e951338df/setup.py", line 52, in _guard_py_ver
          raise RuntimeError(msg.format(cur_py, min_py, max_py))
      RuntimeError: Cannot install on Python version 3.11.4; only versions >=3.7,<3.11 are supported.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Despite installing Python 3.10.11 with pyenv, I'm still getting this error. I think I need to somehow remove 3.11.4 from my system completely (on a Mac).

Client crashes if there is no video device

It seems the client-side functions getMedia() and getDevices() crash with an uncaught exception if you don't have a webcam. I was able to work around it by setting video: false in both of these functions.

Why chunk text and audio?

Hi @GRVYDEV, I was curious why you decided to send text from the TTT to the TTS in chunks, and hence audio from the TTS to the browser client in chunks as well.
Why not pass the entire text from TTT --> TTS and the entire audio from TTS --> browser client? Is it to handle long texts that SATURDAY might need to synthesize, which could hit a bottleneck somewhere in the pipeline?

Or is it to minimize latency? I guess with chunked text and audio, SATURDAY could start speaking as soon as the TTT has generated some text, without waiting for the entire piece of text to be generated.

Thanks!

feat: Make logging better

We recently moved to using the slog package. I'd like a custom handler implemented to make our output look closer to this:
[2023-06-05 08:58:39.477] [INFO] [peer.go:243] => PeerLocal trickle peer_id=6FQaOl9g2S v=0

[feature] : add bubble chat

Summary

Converting transcription to bubble chat.
@GRVYDEV

Motivation

To make it friendly and readable
So we can make it a group call

Describe alternatives you've considered

  1. Modify the response data channel
  2. Remove TranscribedText
  3. Store all responses on the web
  4. Return the text generated in text to text
    => Make sure not too much data is sent through the data channel if we have a long call.

Additional context

I have a demo like that:
[image]

NewRTCConnection without transcriptionStream

Hi @GRVYDEV, awesome project! Thanks for taking the time and effort to build this in public.

I wanted to test my WebRTC streaming between the browser and the Go client without the STT for now.
So I wanted to pass the transcriptionStream as nil, and I saw that there's a FIXME about NewRTCConnection blowing up if a transcriptionStream is not passed in.

Could you elaborate why that happens and how I could fix it?

Right now, I get the log "...transcription relay is disabled" as expected, but then I get a "websocket: bad handshake" error.
