cbh123 / narrator


David Attenborough narrates your life

Python 100.00%

narrator's Issues

ELEVEN_API_KEY, not ELEVENLABS_API_KEY

In README.md, change:

export OPENAI_API_KEY=<token>
export ELEVENLABS_API_KEY=<eleven-token>

to:

export OPENAI_API_KEY=<token>
export ELEVEN_API_KEY=<eleven-token>

Blank image from capture

I am getting a blank image from capture.py. It is not what my webcam actually sees, and the webcam is not activated while the script runs. Any ideas?

Narration is always starting with "this image"

Is there a way to change the prompt for example to make it so that each result isn't starting with "this image"?
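
One possible workaround, besides rewording the system prompt to rule out that opener, is to trim it after the fact. The sketch below is a hypothetical post-processing helper (`strip_intro` and its list of openers are not part of the repo) that removes a leading "This image shows..." style phrase before the text is sent to text-to-speech.

```python
import re

# Hypothetical opener pattern; extend the verb list to match what you actually see.
_OPENER = re.compile(
    r"^\s*(?:in\s+)?this\s+image\b\s*"
    r"(?:(?:shows|depicts|captures|features)\b\s*|[,:]\s*)",
    re.IGNORECASE,
)


def strip_intro(text):
    """Drop a leading 'This image shows' style opener and re-capitalize."""
    trimmed = _OPENER.sub("", text, count=1)
    if not trimmed or trimmed == text:
        return text  # nothing matched, or nothing would be left
    return trimmed[0].upper() + trimmed[1:]
```

Text that does not start with one of the listed openers is returned unchanged, so the helper should be safe to apply to every narration.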

uhm, just a noob who needs help

What Python version do I need for this?

simpleaudio needs 3.8, but I found that another module requires 3.9, so I was not able to get it running. Any help is greatly appreciated.

Scripts running fine - but audio not playing

Everything seems to be working:
Getting "📸 Say cheese! Saving frame." and "🎙️ David says:" in the terminal in VSC.

But no audio is playing, and I'm not sure how to troubleshoot. Any ideas? 😊

no such file or directory ./frames

Can't give descriptions of individuals in the image provided

🎙️ David says:
As I am not allowed to provide descriptions or any other details regarding the individuals in the image you’ve provided, I can't assist with your request. If you have a different kind of inquiry not involving personal details, feel free to ask!

Any way around this? I assume this is a new limitation from OpenAI?

Context only tracks assistant

narrator/narrator.py

Line 94 in 4bab104

script = script + [{"role": "assistant", "content": analysis}]

I'm not sure of the best way to include user prompts in the message history here, since you don't want to include the actual image every time. A placeholder of some sort, showing that the LLM was prompted into the given response, may help consistency and avoid an 'intro' every time, though I'm not certain.

Something like:
{
"role": "user",
"content": [
{"type": "text", "text": "Describe this image (user uploaded image)"},
],
},
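
A minimal sketch of that idea, assuming `script` is the running list of chat messages and `analysis` is the latest narration string, as in the line quoted above. The helper name `extend_script` and the placeholder wording are hypothetical.

```python
def extend_script(script, analysis,
                  placeholder="Describe this image (image omitted from history)"):
    """Return a new history with a text-only user turn before each narration."""
    return script + [
        # Stand-in for the original request, without re-sending the image.
        {"role": "user", "content": [{"type": "text", "text": placeholder}]},
        {"role": "assistant", "content": analysis},
    ]
```

The call site would then become `script = extend_script(script, analysis)`. Whether this actually reduces the repeated intro is untested here.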

Reduce Rate Requests - errors out with pro plan

I have a paid account with OpenAI but receive this error message, suggesting I am exceeding my rate limit.

The limits for gpt-4 (https://platform.openai.com/account/limits) are:

gpt-4	10,000 TPM	3 RPM	200 RPD

My opportunity is:

I have the pro account and want to use the repo
I cannot use the repo right now.

, line 877, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
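
Two things may be going on here. The `insufficient_quota` code in the traceback usually points at billing or credits rather than request pacing, so the account's usage and billing pages are worth checking first. Separately, a 3 RPM limit means the loop needs at least 20 seconds between requests. The sketch below is a hypothetical `Throttle` helper (not part of the repo) that enforces that spacing; the clock and sleep functions are injectable only so it can be tested without waiting.

```python
import time


class Throttle:
    """Space out calls so that at most `rpm` requests start per minute."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm
        self._clock = clock
        self._sleep = sleep
        self._last = None  # start time of the previous request

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` right before each OpenAI request, with `throttle = Throttle(rpm=3)`, would keep the loop under the per-minute limit. It does nothing for the daily limit or for a quota error.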

AssertionError

My apologies if this is a silly problem with an easy fix, but I promise I have googled it and could not fix it.
The capture.py runs smoothly for me, but the narrator.py throws the following error; and it does not depend on the Eleven Labs audio id that I use.

I'd really appreciate any help you'd be able to offer :)

File "\narrator-main\narrator.py", line 114, in <module>
main()

File "\narrator-main\narrator.py", line 105, in main
play_audio(analysis)

File "\narrator-main\narrator.py", line 40, in play_audio
audio = generate(text, voice=os.environ.get("7Wqa3tuynJ4uUcRnTwAI"))

File "\anaconda3\lib\site-packages\elevenlabs\simple.py", line 61, in generate
assert isinstance(voice, Voice)

AssertionError
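
From the traceback, the voice ID itself appears to have been pasted in as the name of the environment variable, so `os.environ.get(...)` returns `None` and the `isinstance(voice, Voice)` assertion fails. A sketch of a more defensive lookup, assuming the ID is stored in a variable called `ELEVENLABS_VOICE_ID` (that name and the helper `resolve_voice_id` are assumptions, adjust them to your setup):

```python
import os


def resolve_voice_id(environ=os.environ, var_name="ELEVENLABS_VOICE_ID"):
    """Look the voice ID up by variable name and fail with a readable message."""
    voice_id = environ.get(var_name)
    if not voice_id:
        raise RuntimeError(
            f"Set {var_name} to your ElevenLabs voice ID before running narrator.py"
        )
    return voice_id
```

The string passed to `os.environ.get` is the variable's name; the ID is the value you export under that name.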

No module named 'PIL'

python capture.py
Traceback (most recent call last):
  File "/Users/jr/Documents/hobby/narrator/capture.py", line 3, in <module>
    from PIL import Image
ModuleNotFoundError: No module named 'PIL'

Not entirely sure what the issue is, wrong Python version? I'm on Python 3.11.4. Tried installing PIL without success:

pip3 install PIL
ERROR: Could not find a version that satisfies the requirement PIL (from versions: none)
ERROR: No matching distribution found for PIL
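
The import name and the package name differ here: the `PIL` module is published on PyPI as `Pillow`, so `pip3 install Pillow` is the command to try. Below is a small sketch of a startup check that produces that hint; `install_hint` is a hypothetical helper, not something in the repo.

```python
import importlib.util


def install_hint(module_name, package_name):
    """Return an install suggestion if `module_name` cannot be imported, else None."""
    if importlib.util.find_spec(module_name) is None:
        return f"{module_name} is missing; try: python -m pip install {package_name}"
    return None


if __name__ == "__main__":
    hint = install_hint("PIL", "Pillow")
    if hint:
        print(hint)
```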

Pydantic Version Error?

I get the following error when I run it, do I need to change the version of pydantic or elevenlabs for it to work?

Traceback (most recent call last):
File "/Users/Documents/GitHub/narrator/narrator.py", line 8, in <module>
from elevenlabs import generate, play, set_api_key, voices
File "/Users/anaconda3/lib/python3.11/site-packages/elevenlabs/__init__.py", line 1, in <module>
from .api import * # noqa F403
^^^^^^^^^^^^^^^^^^
File "/Users/anaconda3/lib/python3.11/site-packages/elevenlabs/api/__init__.py", line 2, in <module>
from .history import * # noqa F403
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/anaconda3/lib/python3.11/site-packages/elevenlabs/api/history.py", line 6, in <module>
from pydantic import model_validator
ImportError: cannot import name 'model_validator' from 'pydantic' (/Users/anaconda3/lib/python3.11/site-packages/pydantic/__init__.cpython-311-darwin.so)
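
`model_validator` was introduced in Pydantic 2, so this traceback suggests the installed elevenlabs package expects Pydantic 2 while the environment still has a 1.x release. Below is a hedged sketch for confirming that from Python. The helper names are hypothetical, and upgrading with `pip install -U pydantic` is a suggestion rather than a confirmed fix for this setup.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(package):
    """Return the installed major version of `package`, or None if not installed."""
    try:
        return int(version(package).split(".")[0])
    except PackageNotFoundError:
        return None


def needs_pydantic_upgrade(major):
    """model_validator only exists from Pydantic 2 onward."""
    return major is not None and major < 2


if __name__ == "__main__":
    if needs_pydantic_upgrade(major_version("pydantic")):
        print("pydantic 1.x detected; try: python -m pip install -U pydantic")
```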

UnicodeEncodeError: 'ascii' codec can't encode character '\u201d' in position 58: ordinal not in range(128)

Got this error but no clue...anyone?

👀 David is watching...
Traceback (most recent call last):
File "/narrator/narrator.py", line 102, in <module>
main()
File "/narrator/narrator.py", line 88, in main
analysis = analyze_image(base64_image, script=script)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/narrator/narrator.py", line 57, in analyze_image
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_utils/_utils.py", line 299, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 556, in create
return self._post(
^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 1055, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 834, in request
return self._request(
^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 854, in _request
request = self._build_request(options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 435, in _build_request
headers = self._build_headers(options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/openai/_base_client.py", line 393, in _build_headers
headers = httpx.Headers(headers_dict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/httpx/_models.py", line 70, in __init__
self._list = [
^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/httpx/_models.py", line 74, in <listcomp>
normalize_header_value(v, encoding),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/narrator/lib/python3.11/site-packages/httpx/_utils.py", line 53, in normalize_header_value
return value.encode(encoding or "ascii")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'ascii' codec can't encode character '\u201d' in position 58: ordinal not in range(128)
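
`\u201d` is a right curly quotation mark, and the failure happens while httpx encodes the request headers, so one likely cause is an API key that was copied with "smart quotes" around it. Below is a minimal sketch that strips them before the key is used. `clean_api_key` is a hypothetical helper, and how the key reaches the client is an assumption.

```python
# Curly single and double quotes that editors and chat apps like to substitute.
SMART_QUOTES = "\u2018\u2019\u201c\u201d"


def clean_api_key(raw):
    """Strip whitespace and straight or curly quotes from around a pasted key."""
    key = raw.strip().strip(SMART_QUOTES + "'\"").strip()
    key.encode("ascii")  # raises UnicodeEncodeError if non-ASCII remains inside
    return key
```

The simpler fix, if this is the cause, is to re-export the key in the shell using plain ASCII quotes.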

No such file or directory: './frames/frame.jpg'

capture.py seems to be capturing the images (webcam is on, console repeating "Say Cheese").

However, when running narrator.py in a different terminal, I get an error message saying that frames/frame.jpg doesn't exist.

I do not see this directory or these files.
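
A possible fix, assuming the scripts expect a `frames` folder next to them (the path comes from the error message; `ensure_frames_dir` is a hypothetical helper), is to create the folder before anything reads from or writes to it. Because `./frames` is a relative path, it is also worth checking that both scripts are started from the same working directory.

```python
import os


def ensure_frames_dir(base_dir="."):
    """Create <base_dir>/frames if it is missing and return its path."""
    frames_dir = os.path.join(base_dir, "frames")
    os.makedirs(frames_dir, exist_ok=True)  # no error if it already exists
    return frames_dir
```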

API KEY error

Hello,
Is anyone else having a problem similar to mine? I set the api_key for OpenAI as an environment variable with the cmd command setx OPENAI_API_KEY "sk-nU..." and also tried adding a system variable, but I always get the same error when launching narrator.py:

openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or
by setting the OPENAI_API_KEY environment variable

Any tips on what the problem might be?
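
One thing to check: `setx` stores the variable for future sessions only, so a terminal that was already open will not see it until it is reopened. The sketch below is a hypothetical startup check that reports which keys the current Python process can actually see. The function name and the list of variable names are assumptions based on the issues above.

```python
import os

# Assumed names, taken from the README discussion in the issues above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVEN_API_KEY")


def missing_keys(environ=os.environ, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    for name in missing_keys():
        print(f"{name} is not visible to this Python process")
```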

Build error on macOS- clang: error: invalid arch name '-arch root:xnu-10002.60.71.505.1~3/RELEASE_ARM64_T6000'

And people wonder why Nix is necessary... /eyeroll

Building wheels for collected packages: simpleaudio
  Building wheel for simpleaudio (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [22 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-10.9-universal2-cpython-39
      creating build/lib.macosx-10.9-universal2-cpython-39/simpleaudio
      copying simpleaudio/__init__.py -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio
      copying simpleaudio/shiny.py -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio
      copying simpleaudio/functionchecks.py -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio
      creating build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
      copying simpleaudio/test_audio/c.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
      copying simpleaudio/test_audio/e.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
      copying simpleaudio/test_audio/g.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
      copying simpleaudio/test_audio/left_right.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
      copying simpleaudio/test_audio/notes_2_16_44.wav -> build/lib.macosx-10.9-universal2-cpython-39/simpleaudio/test_audio
      running build_ext
      building 'simpleaudio._simpleaudio' extension
      creating build/temp.macosx-10.9-universal2-cpython-39
      creating build/temp.macosx-10.9-universal2-cpython-39/c_src
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -iwithsysroot/System/Library/Frameworks/System.framework/PrivateHeaders -iwithsysroot/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/Headers -Werror=implicit-function-declaration -Wno-error=unreachable-code -arch root:xnu-10002.60.71.505.1~3/RELEASE_ARM64_T6000 -DDEBUG=0 -I/Users/pmarreck/Documents/narrator/venv/include -I/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/include/python3.9 -c c_src/posix_mutex.c -o build/temp.macosx-10.9-universal2-cpython-39/c_src/posix_mutex.o -mmacosx-version-min=10.6
      clang: error: invalid arch name '-arch root:xnu-10002.60.71.505.1~3/RELEASE_ARM64_T6000'
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for simpleaudio
  Running setup.py clean for simpleaudio
Failed to build simpleaudio
ERROR: Could not build wheels for simpleaudio, which is required to install pyproject.toml-based projects

FFMPEG not found

Getting this when running narrator:

ValueError: ffplay from ffmpeg not found, necessary to play audio. On mac you can install it with 'brew install ffmpeg'. On linux and windows you can install it from https://ffmpeg.org/

Shouldn't ffmpeg be in the requirements.txt? Where should the executables be placed?
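
ffmpeg is a system program rather than a Python package, which is presumably why it is not in requirements.txt; the `ffplay` executable just needs to be somewhere on your PATH. Below is a sketch of a pre-flight check using the standard library's `shutil.which`; `find_tool` is a hypothetical helper.

```python
import shutil


def find_tool(name="ffplay"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    print(f"ffplay found at {path}" if path else "ffplay is not on PATH")
```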

Thanks!

David Attenborough voice not available

elevenlabs.api.error.APIError: A voice for the voice_id ENfvYmv6CRqDodDZTieQ was not found.

It doesn't seem to be listed here either:

https://api.elevenlabs.io/v1/voices

gpt-4-vision-preview does not exist

After following the install steps and adding the API keys, I receive this error.

👀 David is watching...

The model gpt-4-vision-preview does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}

OpenAI no longer provides descriptions of images with 'real people'

OpenAI now returns the following error: I'm sorry, but I'm not able to provide visual descriptions of images with real people. If you have any other questions or need information on a different topic, feel free to ask!

A voice for the voice_id was not found

Hi! Thanks for hacking this one, it's super cool :)

Narrator doesn't work for me; the ElevenLabs API always returns

elevenlabs.api.error.APIError: A voice for the voice_id XXX was not found.

where XXX is the voiceId of my freshly created voice.

any ideas? I can't find the error in their docs

Simple guide

This is a very cool project. Is there a complete step-by-step guide for it? All the additional bits to set up (Replicate, the voice in ElevenLabs, etc.) have lost me a bit...

🤞🤘

Can it start processing the next frame when the previous frame's audio starts playing?

It would be nice if the pause between narrations was shorter and more natural. (Also, do you know of a method to keep the context/memory of previous frames?)
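
One way to shorten the pause is to overlap the two slow steps: start analyzing the next frame on a worker thread while the current narration is playing. The sketch below shows only the scheduling. `analyze` and `play` are stand-ins for the repo's image-analysis and audio functions, and `narrate_pipelined` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def narrate_pipelined(frames, analyze, play):
    """Analyze frame N+1 on a worker thread while frame N's audio plays."""
    frames = iter(frames)
    try:
        first = next(frames)
    except StopIteration:
        return  # nothing to narrate
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(analyze, first)
        for frame in frames:
            text = pending.result()                # wait for the current analysis
            pending = pool.submit(analyze, frame)  # start the next one...
            play(text)                             # ...while this narration plays
        play(pending.result())                     # last frame has nothing to overlap
```

Narrations still play in order. The trade-off is that each analysis runs without the narration currently being played in its context, so history-dependent prompts would lag by one frame.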

Beginner's issue

Hello,

This is my first ever "programming" experience and I am having some issues, here is what I get when I try to run narrator.py

What can I do to solve this?

👀 David is watching...
Traceback (most recent call last):
File "/Users/ME/projectai/narrator/narrator.py", line 102, in <module>
main()
File "/Users/ME/projectai/narrator/narrator.py", line 88, in main
analysis = analyze_image(base64_image, script=script)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ME/projectai/narrator/narrator.py", line 57, in analyze_image
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/_utils/_utils.py", line 299, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/resources/chat/completions.py", line 556, in create
return self._post(
^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/_base_client.py", line 1055, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/_base_client.py", line 834, in request
return self._request(
^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/narrator/lib/python3.12/site-packages/openai/_base_client.py", line 877, in _request
raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: Error code: 404 - {'error': {'message': 'The model gpt-4-vision-preview does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}
(narrator) ME@MacBook-Pro-de-ME narrator %

Is Replicate necessary?

The README directs users to create a Replicate account and set an API key, but I skipped those steps and (other issues aside), the script worked fine. If that's true, the README can be simplified to remove the Replicate steps.

Add pillow and simpleaudio in requirements?

On Mac, Python 3.10.13,
I had to pip install pillow and simpleaudio.
Now I have
Pillow 10.1.0
simpleaudio 1.0.4

SimpleAudio incompatible

Trying to run narrator and getting this issue:

import simpleaudio as sa
File "/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/__init__.py", line 1, in <module>
from simpleaudio.shiny import *
File "/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/shiny.py", line 5, in <module>
import simpleaudio._simpleaudio as _sa
ImportError: dlopen(/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/_simpleaudio.cpython-311-darwin.so, 0x0002): tried: '/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/_simpleaudio.cpython-311-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/_simpleaudio.cpython-311-darwin.so' (no such file), '/Users/jakeboyles/Documents/repos/narrator/venv/lib/python3.11/site-packages/simpleaudio/_simpleaudio.cpython-311-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))

I tried arch -arm64 pip install simpleaudio, but that didn't work.

API Keys

Where can I add my API keys so that I don't need to export them every time I launch the project?
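
One common approach is to keep the keys in a `.env` file next to the scripts and load it at startup (and keep that file out of version control). The sketch below is a loader that uses only the standard library. The file name, the helper `load_env_file`, and the idea of calling it at the top of narrator.py are assumptions, not something the repo provides.

```python
import os
from pathlib import Path


def load_env_file(path=".env", environ=os.environ):
    """Load KEY=VALUE lines into `environ`; return the names that were set.

    Variables that already exist are left alone, so a real `export` still wins.
    """
    env_path = Path(path)
    if not env_path.exists():
        return []
    loaded = []
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        if key and key not in environ:
            environ[key] = value
            loaded.append(key)
    return loaded
```

The file would hold the same lines as the README, for example `OPENAI_API_KEY=<token>` and `ELEVEN_API_KEY=<eleven-token>`. Adding the export lines to a shell profile such as `~/.zshrc` is another option that needs no code.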