
talk's Introduction

(っ◔◡◔)っWOW WEBSITE WAWWW !!11

٩(^◡^)۶⁀⊙෴☉⁀(ᗒᗣᗕ)՞ web eng page!! ٩(⊙‿⊙)۶꒰(・‿・)꒱

Luffy GIF

talk's People

Contributors

choombaa, cpuchner, ctojunior, fareesh, roddurd, rudolfolah, yacinemtb, yifever


talk's Issues

Not enough issues

It would be easier to help if there were more issues; otherwise the best I can really do is nits or orthogonal feature PRs, as is OSS tradition :)

Maybe track the FIXMEs somewhere? I would make a PR, but I don't want to repeat work if you are already on it.

Making llama prompts configurable

Basically, to allow more variety in the responses, and also to change the prompts based on the model's supported format, e.g. Alpaca vs. Vicuna.
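One way the issue's idea could look (a hedged sketch, not the repo's code: the template map, function names, and the exact Alpaca/Vicuna layouts below are illustrative, based on the commonly cited formats for those model families):

```typescript
// Illustrative prompt templates keyed by model family. The exact strings a
// given fine-tune expects vary; these are the commonly cited layouts.
const PROMPT_FORMATS: Record<string, (instruction: string) => string> = {
  alpaca: (i) => `### Instruction:\n${i}\n\n### Response:\n`,
  vicuna: (i) => `USER: ${i}\nASSISTANT:`,
};

// Build a prompt for the configured format, failing loudly on unknown ones.
export function buildPrompt(format: string, instruction: string): string {
  const template = PROMPT_FORMATS[format];
  if (!template) throw new Error(`unknown prompt format: ${format}`);
  return template(instruction);
}
```

A config file or CLI flag could then select the format per model, instead of hard-coding one prompt shape.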

Only running on CPU

I've got (maybe too many) 3070s plugged in, but none of them are being utilized. Is there anywhere CUDA needs to be explicitly specified as the backend?

Research - Dynamic speech reflex

Right now, I'm planning to initiate the response with a "vim pedal", aka a hotkey, because knowing when to respond is difficult. https://github.com/yacineMTB/talk/blob/master/index.ts#L108-L135

When humans speak to each other, we use intonation and other signals to let the other human know when the floor is open, and we also use it to let the other human know that we want the floor.

Right now, we just need some naive event firing when the speaker stops speaking.
Is this something that we can get out of whisper.cpp's embeddings? Possibly a classifier trained on top of the embeddings?

Also I wouldn't shy away from running a python sidecar that takes requests from the main node proc.

What would be awesome

Figuring out how to get either whisper.cpp or some sidecar to take a byte stream and output a continuous "activation function" based on the likelihood to respond
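As a starting point for the "naive event firing when the speaker stops speaking" mentioned above, here is a minimal energy-based sketch (all names and thresholds are illustrative, not from the repo; a real solution would likely sit on top of whisper.cpp or a trained classifier as discussed):

```typescript
// Naive end-of-speech detector: fires once when audio energy stays below a
// threshold for `silentChunksNeeded` consecutive chunks.
export class SilenceDetector {
  private silentChunks = 0;

  constructor(
    private threshold = 0.01,         // RMS below this counts as silence
    private silentChunksNeeded = 10,  // consecutive silent chunks => "floor is open"
    private onSilence: () => void = () => {},
  ) {}

  // Feed one PCM chunk; returns true on the chunk where the event fires.
  feed(chunk: Float32Array): boolean {
    let sumSquares = 0;
    for (const s of chunk) sumSquares += s * s;
    const rms = Math.sqrt(sumSquares / chunk.length);
    this.silentChunks = rms < this.threshold ? this.silentChunks + 1 : 0;
    if (this.silentChunks === this.silentChunksNeeded) {
      this.onSilence();
      return true;
    }
    return false;
  }
}
```

Pure energy detection will misfire on mid-sentence pauses, which is exactly why the issue floats a classifier on top of whisper.cpp embeddings instead.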


Can this be accessed with an API or something?

Okay, so I'm just having a look over the GitHub; I know nothing else about this project.

So, it takes live microphone input, converts it to text, and even seems smart enough to wait for a lull in the speech before finalizing the current input. Right, that's pretty flash, I like that a lot.

Now that I have the speech as text, what exactly do I do with it? How do I make use of this? I really want other apps to use this data. Does this have any kind of API system or something? Basically, how do I actually get the data out of this bad boy?
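There is no documented API in the repo for this, as far as this page shows. One minimal in-process pattern (a sketch with illustrative names, not something the project provides) is to publish each finalized transcript on an EventEmitter, which other modules, or a thin WebSocket/HTTP layer for external apps, could subscribe to:

```typescript
import { EventEmitter } from "node:events";

// Hypothetical transcript bus: the transcription loop calls publish(),
// and any consumer subscribes with on("transcript", ...).
export class TranscriptBus extends EventEmitter {
  publish(text: string): void {
    this.emit("transcript", text);
  }
}
```

Exposing the same events over a local WebSocket would be the natural next step for getting the data out to other applications.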

Much appreciated!

Runpod / Docker support

Use Case

For those, such as myself (MacBook Pro, Intel + AMD GPU), without powerful GPUs, it would be great to self host on a serverless GPU service (e.g. Runpod).

Also, having a Docker build would make setup much easier. We could distribute pre-built Docker images for quick installation.

Implementation

Runpod

Here are some related examples I found:

Segfault after promiseTimeout

Sometimes a whisper timeout will cause the application to crash. Happens intermittently with both short and long conversations.

Need help with steps to reproduce

Audio process stops after a few minutes

After ~4 minutes, the audio process (record_audio.sh) stops sending chunks.

For some reason, adding a listener on stderr fixes the issue:

audioProcess.stderr.on('data', (data) => {
  console.log(data);
});

Not sure why this works, but with this snippet the application runs fine for 45+ minutes.
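A plausible explanation (an assumption, not confirmed against the repo): if nothing reads a child process's stderr pipe, the OS pipe buffer (typically ~64 KB) eventually fills, and the child blocks on its next write to stderr, which would stall record_audio.sh after a few minutes of accumulated warnings. Attaching the `data` listener drains the pipe, which is why it "fixes" it. An alternative that avoids the listener entirely is to discard stderr at spawn time; a sketch with an illustrative helper name:

```typescript
import { spawn, ChildProcess } from "node:child_process";

// Spawn a long-running recorder but discard stderr, so an unread pipe can
// never fill up and block the child. stdout stays piped for audio chunks.
export function spawnRecorder(command: string, args: string[] = []): ChildProcess {
  return spawn(command, args, {
    stdio: ["ignore", "pipe", "ignore"], // stdin off, stdout piped, stderr discarded
  });
}
```

If the stderr output is worth keeping, `stdio: ["ignore", "pipe", "inherit"]` forwards it to the parent's terminal instead of silently dropping it.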

Add linter

sigh
should be default with language
rob pike was right btw

Add a TTS engine

This thing needs to respond back to us on some event.
Right now, the strategy to reduce latency is to generate precanned responses constantly. Maybe we can also follow the same strategy with some TTS system?

Ideally this would

  • be abstracted behind some function that takes some text and responds with audio

For now we can just save it as a wav file. The scope of this task is figuring out what reasonable candidates we have for TTS, with one of the goals being low latency.

Piper interface mismatch: Indice data out of bounds thrown

Non-zero status code returned while running Gather node. Name:'/enc_p/emb/Gather' Status Message: indices element out of data bounds, idx=144 must be within the inclusive range [-130,129]
Aborted (core dumped)

The provided error indicates that there's an issue with the Piper ONNX model, which is used for generating synthetic speech. The model appears to have trouble with the 'Gather' operation—an operation that pulls out specific indices from a tensor—at a certain index in the input data.

Specifically, the error message 'indices element out of data bounds, idx=144 must be within the inclusive range [-130,129]' suggests the index at '144' is not within the limit set to run the Gather operation.

Without knowing the specifics of this TTS model, it's hard to give a specific solution. However, you can try the following:

  1. Input Sanitization: Check and sanitize the input data to ensure it's within the expected limits and format for the model.
  2. Model Parameters: Verify if you're using the correct model, and the parameters for the model are set correctly. Make sure that your indices fall within the expected range.
  3. Model Version: If an update or different version of the model or the ONNX runtime library is available, try using that. Sometimes, these types of errors can occur due to a bug or incompatibility, which may be rectified in newer versions.
  4. Contact Software Providers: If none of the above works, your best bet would be to directly get in touch with the maintainers of the Piper model or the software you're using. Providing them with the error log can help them diagnose the issue more effectively.
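For step 1, the sanitization could be as simple as dropping token ids that fall outside the embedding table before the Gather op sees them. This is an illustrative sketch, not the repo's phonemization pipeline, and the right fix may instead be matching the phoneme config to the model:

```typescript
// Drop any ids outside [0, vocabSize); out-of-range ids are what trigger
// the "indices element out of data bounds" abort in the Gather node.
export function sanitizeIds(ids: number[], vocabSize: number): number[] {
  return ids.filter((id) => id >= 0 && id < vocabSize);
}
```

Note that silently dropping ids changes the synthesized output, so logging any dropped ids would help diagnose the underlying config mismatch.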

Create a Node C++ binding (similar to the linked submodule) for Piper

output.mp4

Now, our model can speak!

Here's what the timing looks like, as observed from the C++ side:

[kache@whitebox ~]$ echo "I need to renable the response reflex. Though, I think to start with, we can just use a simple keypress." | piper --model ~/models/piper/en-gb-southern_english_female-low.onnx
Load time: 0.157878 sec
WARNING: Piper was not compiled with pcaudiolib. Output audio will be written to the current directory.
Output directory: "/home/kache/."
/home/kache/./1686592663635066288.wav
Real-time factor: 0.0332105 (infer=0.172695 sec, audio=5.2 sec)

This gives us a really good lower bound on latency.
However, most of the time observed from the Node proc (refer to the video) is spent on resource acquisition.
We should fix this!

This issue tracks fixing this by building a custom C binding around Piper, or coming up with a sidecar server/mechanism.
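The sidecar variant amounts to paying process startup (model load, etc.) once, then streaming sentences to a long-lived child's stdin instead of re-spawning Piper per utterance. A minimal sketch with an illustrative class name, generic over the wrapped command:

```typescript
import { spawn, ChildProcess } from "node:child_process";

// Keep one long-lived child process alive and stream input lines to it,
// amortizing startup cost across many requests.
export class Sidecar {
  private child: ChildProcess;

  constructor(command: string, args: string[] = []) {
    this.child = spawn(command, args, { stdio: ["pipe", "pipe", "ignore"] });
  }

  send(line: string): void {
    this.child.stdin!.write(line + "\n");
  }

  onOutput(cb: (chunk: string) => void): void {
    this.child.stdout!.on("data", (d) => cb(String(d)));
  }

  close(): void {
    this.child.stdin!.end();
  }

  done(): Promise<void> {
    return new Promise((resolve) => this.child.on("close", () => resolve()));
  }
}
```

Whether this beats a native N-API binding depends on how much of the observed latency is model load versus per-call inference; the log above suggests load (~0.16 s) dominates short utterances.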

segmentation fault

/c/Program Files/nodejs/npm: line 37: 1993 Segmentation fault "$NODE_EXE" "$NPM_CLI_JS" "$@"

Allowing interruptions

The user may want to interrupt the LLM while it's speaking, so that it doesn't ramble.

I think that a simple "detect user talking" solution might be good enough, since in normal human conversation, a person could interrupt with any sentence. If the user speaks above a certain word threshold while the LLM is speaking, the LLM should stop and yield.

I am working on this right now, leaving this issue as a placeholder.
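The word-threshold rule described above could be checked against each interim transcript while the LLM is speaking; a hedged sketch (function and parameter names are illustrative, not the in-progress implementation):

```typescript
// Yield the floor once the user's interim transcript, captured while the
// LLM is speaking, crosses a word-count threshold.
export function shouldYield(interimTranscript: string, wordThreshold = 3): boolean {
  const words = interimTranscript.trim().split(/\s+/).filter(Boolean);
  return words.length >= wordThreshold;
}
```

A threshold above one word helps ignore backchannels like "mm-hmm" that are not real interruptions.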

Unexpected token .

I am trying to install and when doing
npx cmake-js compile --CDWHISPER_CUBLAS="ON" -T whisper-addon -B Release
it gives
npx: installed 70 in 2.291s
Unexpected token .

need a research paper assistant

Would love to have this be able to talk to me about a research paper I load into a vector database, or similar functionality. I have a custom-made text-based research assistant, but a voice one would be extremely useful.

Resetting context in conversation

Sometimes you will want to whisper 'stop'

Sometimes you may want to say 'wait wait wait, this is not what I meant'

Sometimes you will scream 'Damn it, forget what I said, let's start from the beginning'

This is not possible currently, since we pass the history as well in the conversation.

Some ways to achieve this -

  • Allow a keypress like F to forget everything
  • Use some trigger words in the transcription
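The trigger-word option could be a simple scan of each transcription before it is appended to history; a sketch where the reset phrases and function name are illustrative:

```typescript
// Reset phrases to watch for in the transcript; tune to taste.
const RESET_PHRASES = ["forget what i said", "start from the beginning", "reset context"];

// True when the user's utterance asks for a context reset.
export function shouldResetContext(transcript: string): boolean {
  const lower = transcript.toLowerCase();
  return RESET_PHRASES.some((p) => lower.includes(p));
}
```

On a hit, the caller would clear the stored history before building the next prompt; the keypress option would call the same clearing path.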

getting error MODULE NOT FOUND

So I have gotten the script to run successfully:

strau@James MINGW64 ~/Downloads/talk-master/talk-master (master)
$ ./build.sh
Starting Script...
Installing npm dependencies in the current directory...
npm WARN [email protected] No repository field.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules\fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"win32","arch":"x64"})

added 19 packages from 17 contributors, updated 1 package and audited 349 packages in 7.688s

34 packages are looking for funding
run npm fund for details

found 0 vulnerabilities

Do you want to turn CUBLAS ON? [y/n] n
Installing npm dependencies for whisper.cpp examples...
npm WARN [email protected] No repository field.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\freebsd-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"freebsd","arch":"arm64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\darwin-x64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"darwin","arch":"x64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\android-arm):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"android","arch":"arm"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\android-x64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"android","arch":"x64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\freebsd-x64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"freebsd","arch":"x64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\android-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"android","arch":"arm64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-arm):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"arm"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"arm64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-ia32):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"ia32"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-loong64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"loong64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\darwin-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"darwin","arch":"arm64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\netbsd-x64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"netbsd","arch":"x64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-s390x):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"s390x"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\openbsd-x64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"openbsd","arch":"x64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-x64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"x64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-ppc64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"ppc64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\sunos-x64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"sunos","arch":"x64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\win32-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"win32","arch":"arm64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-riscv64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"riscv64"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\linux-mips64el):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"linux","arch":"mips64el"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules\fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"win32","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: @esbuild/[email protected] (node_modules@esbuild\win32-ia32):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for @esbuild/[email protected]: wanted {"os":"win32","arch":"ia32"} (current: {"os":"win32","arch":"x64"})

audited 402 packages in 4.311s

41 packages are looking for funding
run npm fund for details

found 0 vulnerabilities

Compiling whisper.cpp examples...
info find VS using VS2019 (16.11.33214.272) found at:
info find VS "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools"
info find VS run with --verbose for detailed information
info TOOL Using Visual Studio 16 2019 generator.
info CMD BUILD
info RUN [
info RUN 'cmake',
info RUN '--build',
info RUN 'C:\Users\strau\Downloads\talk-master\talk-master\whisper.cpp\build',
info RUN '--config',
info RUN 'Release',
info RUN '--target',
info RUN 'whisper-addon'
info RUN ]
Microsoft (R) Build Engine version 16.11.2+f32259642 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

whisper.vcxproj -> C:\Users\strau\Downloads\talk-master\talk-master\whisper.cpp\build\bin\Release\whisper.dll
common.vcxproj -> C:\Users\strau\Downloads\talk-master\talk-master\whisper.cpp\build\examples\Release\common.lib
whisper-addon.vcxproj -> C:\Users\strau\Downloads\talk-master\talk-master\whisper.cpp\build\bin\Release\whisper-addon.node
Moving compiled whisper.cpp code to bindings directory...
Installing npm dependencies for llama.cpp examples...
npm WARN [email protected] No repository field.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules\fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"win32","arch":"x64"})

audited 329 packages in 1.648s

34 packages are looking for funding
run npm fund for details

found 0 vulnerabilities

Compiling llama.cpp examples...
info find VS using VS2019 (16.11.33214.272) found at:
info find VS "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools"
info find VS run with --verbose for detailed information
info TOOL Using Visual Studio 16 2019 generator.
info CMD BUILD
info RUN [
info RUN 'cmake',
info RUN '--build',
info RUN 'C:\Users\strau\Downloads\talk-master\talk-master\llama.cpp\build',
info RUN '--config',
info RUN 'Release',
info RUN '--target',
info RUN 'llama-addon'
info RUN ]
Microsoft (R) Build Engine version 16.11.2+f32259642 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

ggml.vcxproj -> C:\Users\strau\Downloads\talk-master\talk-master\llama.cpp\build\ggml.dir\Release\ggml.lib
Auto build dll exports
llama.vcxproj -> C:\Users\strau\Downloads\talk-master\talk-master\llama.cpp\build\bin\Release\llama.dll
common.vcxproj -> C:\Users\strau\Downloads\talk-master\talk-master\llama.cpp\build\examples\common.dir\Release\common.lib
Auto build dll exports
llama-addon.vcxproj -> C:\Users\strau\Downloads\talk-master\talk-master\llama.cpp\build\bin\Release\llama-addon.node
Moving compiled llama.cpp code to bindings directory...
Do you want to download models? [y/n] n
Skipping model download...
Script completed successfully!

but when I go to npm run start, I get this error:

strau@James MINGW64 ~/Downloads/talk-master/talk-master (master)
$ npm run start

[email protected] start C:\Users\strau\Downloads\talk-master\talk-master
npx ts-node ./index.ts

Error: Cannot find module './bindings/whisper/whisper-addon'
Require stack:

  • C:\Users\strau\Downloads\talk-master\talk-master\index.ts
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:931:15)
    at Function.Module._resolveFilename.sharedData.moduleResolveFilenameHook.installedValue [as _resolveFilename] (C:\Users\strau\Downloads\talk-master\talk-master\node_modules@cspotcode\source-map-support\source-map-support.js:811:30)
    at Function.Module._load (internal/modules/cjs/loader.js:774:27)
    at Module.require (internal/modules/cjs/loader.js:1003:19)
    at require (internal/modules/cjs/helpers.js:107:18)
    at Object. (C:\Users\strau\Downloads\talk-master\talk-master\index.ts:8:17)
    at Module._compile (internal/modules/cjs/loader.js:1114:14)
    at Module.m._compile (C:\Users\strau\Downloads\talk-master\talk-master\node_modules\ts-node\src\index.ts:1618:23)
    at Module._extensions..js (internal/modules/cjs/loader.js:1143:10)
    at Object.require.extensions. [as .ts] (C:\Users\strau\Downloads\talk-master\talk-master\node_modules\ts-node\src\index.ts:1621:12) {
    code: 'MODULE_NOT_FOUND',
    requireStack: [ 'C:\Users\strau\Downloads\talk-master\talk-master\index.ts' ]
    }
    npm ERR! code ELIFECYCLE
    npm ERR! errno 1
    npm ERR! [email protected] start: npx ts-node ./index.ts
    npm ERR! Exit status 1
    npm ERR!
    npm ERR! Failed at the [email protected] start script.
    npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR! C:\Users\strau\AppData\Roaming\npm-cache_logs\2023-06-13T09_43_32_959Z-debug.log

Running Piper through a Docker container

A question on the Piper repo from @ishan0102: rhasspy/piper#109 (comment)

Also quick question, does this mean piper can only be called by spinning up a docker image and running these commands? Is there a way to get piper into my path so that I can call it from anywhere? I'm trying to get this repo working on my M1 mac: https://github.com/yacineMTB/talk

using the docker commands here: rhasspy/piper#109 (comment)

it's possible that it could work if the command that's run here is updated to use docker exec piper ...: https://github.com/yacineMTB/talk/blob/master/src/depedenciesLibrary/voice.ts#L25
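That substitution could be kept behind a small helper so the Docker path is a config flag rather than an edit to voice.ts. A hedged sketch: the container name "piper" comes from the linked comment, and the function name is illustrative:

```typescript
// Build the piper argv, optionally routed through a running container
// named "piper" via `docker exec`.
export function piperCommand(args: string[], useDocker: boolean): string[] {
  return useDocker
    ? ["docker", "exec", "piper", "piper", ...args]
    : ["piper", ...args];
}
```

The resulting argv array would then be handed to the existing spawn call unchanged.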

Architecture

An issue to capture the state of the architecture.
I'll edit this with the state, and open it up for discussion

Windows nvidia

Does this work on Windows, or only on Linux? If so, can anyone help me? I've been trying to run this for the past two days with no luck. If anything, I'll pay top dollar to get this project running; I'm really interested in running this repo. Please get back to me if you're willing to help.
