
bananaml / demo-whisper

This is a Whisper transcription starter template from Banana.dev that allows on-demand serverless GPU inference of the openai/whisper-base model from Hugging Face. Basically your own Whisper API.

Home Page: https://www.banana.dev

Dockerfile 12.33% Python 87.67%

demo-whisper's Introduction

Banana.dev Whisper transcription starter template

This is an openai/whisper-base starter template from Banana.dev that allows on-demand serverless GPU inference.

You can fork this repository and deploy it on Banana as is, or customize it based on your own needs.

Rather than accepting audio files in the request body, the app takes the path of a file stored in AWS S3, downloads it at runtime, and returns the transcribed text.
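The download-then-transcribe flow can be sketched roughly as below. This is a minimal illustration, not the repository's actual handler: `transcribe_from_s3` and the `transcribe` callable are hypothetical names, the `{"text": ...}` response shape is an assumption, and `s3_client` is assumed to be any object exposing boto3's `download_file(Bucket, Key, Filename)` method.

```python
import os
import tempfile


def transcribe_from_s3(s3_client, bucket, path, transcribe):
    """Download `path` from `bucket`, transcribe it, and return a dict.

    Hypothetical helper: `s3_client` is assumed to behave like a boto3 S3
    client, and `transcribe` is any callable mapping a local file path to text.
    """
    # Keep the original extension so audio decoders can detect the format.
    suffix = os.path.splitext(path)[1] or ".wav"
    fd, local_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        # Same positional order as boto3: bucket, object key, local filename.
        s3_client.download_file(bucket, path, local_path)
        return {"text": transcribe(local_path)}  # response shape is assumed
    finally:
        # Remove the temporary copy whether or not transcription succeeded.
        os.remove(local_path)
```

Passing the client and the transcription function in as arguments keeps the sketch independent of any particular model or AWS setup, so it can be exercised with stand-ins before real credentials are available.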

Running this app

Deploying on Banana.dev

  1. Fork this repository to your own GitHub account.
  2. Connect your GitHub account on Banana.
  3. Create a new model on Banana from the forked GitHub repository.

Running after deploying

  1. Wait for the model to build after creating it.
  2. Make an API request to it using one of the provided snippets in your Banana dashboard.

For more info, check out the Banana.dev docs.

Testing locally

Using Docker

Build the model as a Docker image. You can change the banana-whisper part to anything.

Make sure to change the three AWS variables to your own.

docker build --build-arg AWS_ACCESS_KEY_ID=your_access_key_id --build-arg AWS_SECRET_ACCESS_KEY=your_secret_key --build-arg AWS_BUCKET=your_bucket -t banana-whisper .

Run the Potassium server

docker run --publish 8000:8000 -it banana-whisper

Run inference after the above is built and running. This assumes you have a "hello_world.wav" file in your S3 bucket.

curl -X POST -H 'Content-Type: application/json' -d '{"path": "hello_world.wav"}' http://localhost:8000
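The same request can be made from Python using only the standard library. The sketch below mirrors the curl command above; the helper names `build_request` and `transcribe` are illustrative, and the shape of the JSON returned by the server is not assumed.

```python
import json
import urllib.request


def build_request(path, url="http://localhost:8000"):
    """Build the same POST request as the curl command above."""
    body = json.dumps({"path": path}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def transcribe(path, url="http://localhost:8000", timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(path, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server from the previous step running, `transcribe("hello_world.wav")` should return whatever JSON the app responds with.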

Without Docker

You could also install and run it without Docker.

Just make sure that the pip dependencies listed in the Dockerfile (and torch) are installed in your Python virtual environment.

Run the Potassium app in one terminal window.

AWS_ACCESS_KEY_ID=your_access_key_id AWS_SECRET_ACCESS_KEY=your_secret_key AWS_BUCKET=your_bucket python3 app.py

Call the model in another terminal window with the Potassium app still running from the previous step.

curl -X POST -H 'Content-Type: application/json' -d '{"path": "hello_world.wav"}' http://localhost:8000

Requirements

ffmpeg

The ffmpeg system dependency is required.
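When running without Docker, a quick way to confirm that ffmpeg is installed is to look for it on the PATH. This is a small sketch using the standard library; `ffmpeg_available` is a hypothetical helper and is not part of the repository.

```python
import shutil


def ffmpeg_available(binary="ffmpeg"):
    """Return True if `binary` can be found on the current PATH."""
    return shutil.which(binary) is not None


if not ffmpeg_available():
    print("ffmpeg not found on PATH; install it before running the app.")
```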

S3

S3 read credentials (access key id and secret) and an S3 bucket are required to read files uploaded to S3.

You should add these to your model's settings using the same keys as in the Dockerfile.
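A missing or empty setting only shows up when the first request fails, so it can help to check the three variables up front. The sketch below assumes the variable names used in the build and run commands above; `missing_aws_settings` is a hypothetical helper.

```python
import os

# Names taken from the docker build and python3 commands in this README.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_BUCKET")


def missing_aws_settings(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_aws_settings()` at startup and exiting when the list is non-empty would turn a late S3 error into an immediate, readable message.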

demo-whisper's People

Contributors

erikkaum, nik-418, kylejmorris

demo-whisper's Issues

Does not work

Hey there, I wanted to give your service a quick try and this example caught my eye. I followed the instructions, but sadly it did not work.

I know I could probably spend some time and fix it, but come on, it's the example. Just wanted to let you know.

The example request in the dashboard gives no indication of how to upload an audio file.

curl -X POST \
    -H 'Content-Type: application/json' \
    -H 'X-Banana-API-Key: KEY' \
    -d '{"prompt":"In the summer I like [MASK]."}' \
    https://URL.run.banana.dev

Is the request the same for all templates? I don't know, this is my first time using your platform... and the README says "Make an API request to it using one of the provided snippets in your Banana dashboard."

I sent the request anyway:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/potassium/potassium.py", line 194, in _handle_generic
    out = endpoint.func(req)
  File "/opt/conda/lib/python3.8/site-packages/potassium/potassium.py", line 131, in wrapper
    out = func(self._context, request)
  File "app.py", line 57, in handler
    context.get("s3").download_file(context.get("bucket"), path, "sample.wav")
  File "/opt/conda/lib/python3.8/site-packages/boto3/s3/inject.py", line 190, in download_file
    return transfer.download_file(
  File "/opt/conda/lib/python3.8/site-packages/boto3/s3/transfer.py", line 326, in download_file
    future.result()
  File "/opt/conda/lib/python3.8/site-packages/s3transfer/futures.py", line 103, in result
    return self._coordinator.result()
  File "/opt/conda/lib/python3.8/site-packages/s3transfer/futures.py", line 266, in result
    raise self._exception
  File "/opt/conda/lib/python3.8/site-packages/s3transfer/tasks.py", line 269, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/s3transfer/download.py", line 354, in _submit
    response = client.head_object(
  File "/opt/conda/lib/python3.8/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/opt/conda/lib/python3.8/site-packages/botocore/client.py", line 928, in _make_api_call
    api_params = self._emit_api_params(
  File "/opt/conda/lib/python3.8/site-packages/botocore/client.py", line 1043, in _emit_api_params
    self.meta.events.emit(
  File "/opt/conda/lib/python3.8/site-packages/botocore/hooks.py", line 412, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/botocore/hooks.py", line 256, in emit
    return self._emit(event_name, kwargs)
  File "/opt/conda/lib/python3.8/site-packages/botocore/hooks.py", line 239, in _emit
    response = handler(**kwargs)
  File "/opt/conda/lib/python3.8/site-packages/botocore/handlers.py", line 284, in validate_bucket_name
    raise ParamValidationError(report=error_msg)
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid bucket name "": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
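The error message can be reproduced in isolation. The sketch below copies only the first (plain bucket name) pattern from the message and ignores the ARN alternative; `looks_like_bucket_name` is a hypothetical helper. It illustrates that an empty bucket name, as in `Invalid bucket name ""`, fails because the pattern requires at least one character.

```python
import re

# Plain bucket-name pattern copied from the error message above.
BUCKET_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9.\-_]{1,255}$")


def looks_like_bucket_name(name):
    """Return True if `name` matches the plain bucket-name pattern."""
    return BUCKET_NAME_PATTERN.match(name) is not None


print(looks_like_bucket_name(""))           # False: {1,255} needs one character
print(looks_like_bucket_name("my-bucket"))  # True
```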

Yeah, I've seen that the repo uses S3, but only for local dev...?

Can't I just send the file directly, as with the OpenAI Whisper API? I know you are not a Whisper API company, but it's your example...

Good luck to you.
