lobehub / lobe-tts

🎤 Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser

Home Page: https://tts.lobehub.com

License: MIT License

JavaScript 0.89% Shell 0.16% TypeScript 98.95%
lobehub tts auzre edge microsoft-speech-api opeanai speech-recognition speech-to-text stt text-to-speech

lobe-tts's Introduction

Lobe TTS

A high-quality & reliable TTS/STT library for Server and Browser



📖 Introduction

🤖 Lobe Chat supports Text-to-Speech (TTS) and Speech-to-Text (STT) technologies, enabling our application to convert text messages into clear voice outputs, allowing users to interact with our conversational agent as if they were talking to a real person. Users can choose from a variety of voices to pair with the agent.

While implementing this feature, we found no satisfactory TTS (Text-to-Speech) frontend library on the market. As a result, we invested a lot of effort in building our own, covering data conversion, audio progress management, and speech visualization, among other tasks.

Note

Therefore, we decided to refine our implementation and make it open source, hoping to assist developers who wish to implement TTS. @lobehub/tts is a high-quality TTS toolkit developed in TypeScript, which supports usage both on the server-side and in the browser.

  • Server-side: With just 15 lines of code, you can achieve high-quality voice generation capabilities comparable to OpenAI's TTS service. It currently supports EdgeSpeechTTS, MicrosoftTTS, OpenAITTS, and OpenAISTT.
  • Browser-side: It provides high-quality React Hooks and visual audio components, supporting common functions such as loading, playing, pausing, and dragging the timeline. Additionally, it offers a very rich set of capabilities for adjusting the audio track styles.

📦 Usage

Generate Speech on server

Run the script below using Bun: bun index.js

// index.js
import { EdgeSpeechTTS } from '@lobehub/tts';
import { Buffer } from 'buffer';
import fs from 'fs';
import path from 'path';

// Instantiate EdgeSpeechTTS
const tts = new EdgeSpeechTTS({ locale: 'en-US' });

// Create speech synthesis request payload
const payload = {
  input: 'This is a speech demonstration',
  options: {
    voice: 'en-US-GuyNeural',
  },
};

// Call create method to synthesize speech
const response = await tts.create(payload);

// generate speech file
const mp3Buffer = Buffer.from(await response.arrayBuffer());
const speechFile = path.resolve('./speech.mp3');

fs.writeFileSync(speechFile, mp3Buffer);
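
When adapting the script above, it can help to isolate the file-writing step so it can be reused and checked without calling any TTS service. The following is a minimal sketch, not part of @lobehub/tts: writeAudioFile is a hypothetical helper name, and it assumes the audio bytes are already available as an ArrayBuffer (for example from await response.arrayBuffer()).

```typescript
import { Buffer } from 'node:buffer';
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical helper (not part of @lobehub/tts): writes raw audio bytes to disk
// and returns the absolute file path plus the number of bytes written.
function writeAudioFile(data: ArrayBuffer, file: string): { file: string; bytes: number } {
  if (data.byteLength === 0) {
    // Refuse to create an empty file; an empty body likely means synthesis failed.
    throw new Error('No audio data received');
  }
  const target = path.resolve(file);
  // Create the parent directory if it does not exist yet.
  fs.mkdirSync(path.dirname(target), { recursive: true });
  const buffer = Buffer.from(data);
  fs.writeFileSync(target, buffer);
  return { file: target, bytes: buffer.length };
}
```

In the script above, the last three lines could then become: const { bytes } = writeAudioFile(await response.arrayBuffer(), './speech.mp3');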

Important

Run on Node.js

Node.js versions that lack a global WebSocket need a polyfill. This can be done by importing the ws package and assigning it to the global object.

// import at the top of the file
import WebSocket from 'ws';

global.WebSocket = WebSocket;
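
If the same code may also run in an environment that already provides a global WebSocket, the assignment can be guarded so the native implementation is not overwritten. This is a minimal sketch under that assumption: ensureGlobalWebSocket and WebSocketCtor are hypothetical names, not part of @lobehub/tts or ws.

```typescript
// Minimal constructor shape; the WebSocket class exported by `ws` is assumed to satisfy it.
type WebSocketCtor = new (url: string, ...args: any[]) => unknown;

// Hypothetical helper: installs `impl` as the global WebSocket only when the
// runtime does not already provide one. Returns true if the polyfill was installed.
function ensureGlobalWebSocket(impl: WebSocketCtor): boolean {
  const scope = globalThis as unknown as { WebSocket?: unknown };
  if (typeof scope.WebSocket !== 'undefined') {
    return false; // keep the implementation the runtime already has
  }
  scope.WebSocket = impl;
  return true;
}
```

With ws installed, the two polyfill lines above would become import WebSocket from 'ws'; followed by ensureGlobalWebSocket(WebSocket);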

Use the React Component

import { AudioPlayer, AudioVisualizer, useAudioPlayer } from '@lobehub/tts/react';
import { Flexbox } from 'react-layout-kit';

// `url` is the address of the audio file to play, passed in by the parent component
export default ({ url }) => {
  const { ref, isLoading, ...audio } = useAudioPlayer(url);

  return (
    <Flexbox align={'center'} gap={8}>
      <AudioPlayer audio={audio} isLoading={isLoading} style={{ width: '100%' }} />
      <AudioVisualizer audioRef={ref} isLoading={isLoading} />
    </Flexbox>
  );
};

📦 Installation

Important

This package is ESM only.

To install @lobehub/tts, run the following command:

$ pnpm i @lobehub/tts

$ bun add @lobehub/tts

Compile with Next.js

Note

To work correctly with Next.js SSR, add transpilePackages: ['@lobehub/tts'] to next.config.js. For example:

// next.config.js
const nextConfig = {
  transpilePackages: ['@lobehub/tts'],
};

module.exports = nextConfig;

⌨️ Local Development

You can use GitHub Codespaces for online development, or clone the repository for local development:

$ git clone https://github.com/lobehub/lobe-tts.git
$ cd lobe-tts
$ bun install
$ bun dev

🤝 Contributing

Contributions of all types are more than welcome. If you are interested in contributing code, feel free to check out our GitHub Issues and get stuck in to show us what you’re made of.

🩷 Sponsor

Every bit counts and your one-time donation sparkles in our galaxy of support! You're a shooting star, making a swift and bright impact on our journey. Thank you for believing in us – your generosity guides us toward our mission, one brilliant flash at a time.

🔗 More Products

  • 🤖 Lobe Chat - An open-source, extensible (Function Calling), high-performance chatbot framework. It supports one-click free deployment of your private ChatGPT/LLM web application.
  • 🤯 Lobe Theme - The modern theme for Stable Diffusion WebUI, with exquisite interface design, a highly customizable UI, and efficiency-boosting features.


📝 License

Copyright © 2023 LobeHub.
This project is MIT licensed.

lobe-tts's People

Contributors

arvinxx, canisminor1990, mrslimslim, mushan0x0, semantic-release-bot, stephanvs

lobe-tts's Issues

[Request] Coqui TTS

🥰 Feature Description

It would be nice to add support for Coqui TTS, for example by starting a local server and connecting to it.

Coqui TTS docs:
https://tts.readthedocs.io/en/latest/inference.html

🧐 Proposed Solution

The library could simply run commands such as tts --list_models, either system-wide or inside a prepared virtual environment (venv), so that everything runs locally or on a server with an ordinary CPU.

📝 Additional Information

As a side note, I want to use TTS voiceover for SRT and ASS subtitles. It would be nice to have such a function in the application.
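
To illustrate the subtitle idea, here is a minimal sketch of how SRT cues might be extracted so that each cue's text could be used as the input of a synthesis payload. Nothing here exists in @lobehub/tts: Cue, srtTimeToMs, and parseSrt are hypothetical names, and ASS files would need a different parser.

```typescript
interface Cue {
  startMs: number;
  endMs: number;
  text: string;
}

// Convert an SRT timestamp such as "00:01:02,500" to milliseconds.
function srtTimeToMs(stamp: string): number {
  const match = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(stamp.trim());
  if (!match) throw new Error(`Invalid SRT timestamp: ${stamp}`);
  const [, h, m, s, ms] = match;
  return ((Number(h) * 60 + Number(m)) * 60 + Number(s)) * 1000 + Number(ms);
}

// Split an SRT document into cues; blocks without a timing line are skipped.
function parseSrt(source: string): Cue[] {
  const cues: Cue[] = [];
  for (const block of source.replace(/\r\n?/g, '\n').split(/\n{2,}/)) {
    const lines = block.split('\n').map((line) => line.trim()).filter(Boolean);
    const timing = lines.findIndex((line) => line.includes('-->'));
    if (timing === -1) continue;
    const [start, end] = lines[timing].split('-->');
    // Ignore optional positioning data that may follow the end timestamp.
    const endStamp = end.trim().split(/\s+/)[0];
    // Multi-line cues are joined into one sentence for synthesis.
    const text = lines.slice(timing + 1).join(' ');
    if (text) cues.push({ startMs: srtTimeToMs(start), endMs: srtTimeToMs(endStamp), text });
  }
  return cues;
}
```

Each resulting cue.text could then be sent to a TTS service one by one, with startMs and endMs used to place the generated audio on a timeline.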

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Awaiting Schedule

These updates are awaiting their schedule. Click on a checkbox to get an update now.

  • Update dependency eslint to v9
  • Update dependency vercel to v34

Detected dependencies

github-actions
.github/workflows/issue-auto-comments.yml
  • wow-actions/auto-comment v1
  • wow-actions/auto-comment v1
  • wow-actions/auto-comment v1
  • actions-cool/issues-helper v3
.github/workflows/issue-check-inactive.yml
  • actions-cool/issues-helper v3
.github/workflows/issue-close-require.yml
  • actions-cool/issues-helper v3
  • actions-cool/issues-helper v3
  • actions-cool/issues-helper v3
.github/workflows/release.yml
  • actions/checkout v4
  • oven-sh/setup-bun v1
.github/workflows/test.yml
  • actions/checkout v4
  • oven-sh/setup-bun v1
npm
package.json
  • @babel/runtime ^7
  • lodash-es ^4
  • openai ^4.17.3
  • query-string ^8
  • react-error-boundary ^4
  • remark-gfm ^3
  • remark-parse ^10
  • swr ^2
  • unified ^11
  • unist-util-visit ^5
  • url-join ^5
  • uuid ^9
  • @commitlint/cli ^18
  • @lobehub/i18n-cli ^1.11.1
  • @lobehub/ui ^1
  • @types/lodash-es ^4
  • @types/node ^20
  • @types/react ^18
  • @types/react-dom ^18
  • @types/uuid ^9
  • @vercel/node ^3
  • antd ^5
  • antd-style ^3
  • commitlint ^18
  • dumi ^2
  • eslint ^8
  • father 4.3.1
  • husky ^8
  • lint-staged ^15
  • prettier ^3
  • react ^18
  • react-dom ^18
  • react-layout-kit ^1
  • remark ^14
  • remark-cli ^11
  • semantic-release ^21
  • tsx ^4.1.2
  • typescript ^5
  • vercel ^28
  • @lobehub/ui >=1
  • antd >=5
  • antd-style >=3
  • lucide-react >=0.292
  • react >=18
  • react-dom >=18
  • react-layout-kit >=1


[Request] Misleading project description

🥰 Feature Description

I recently explored your GitHub project and noticed that the current project description might lead users to believe that the library generates server-side audio. However, after further investigation, it appears to function as a wrapper for external TTS services like Google and Microsoft.

🧐 Proposed Solution

To enhance clarity for potential users, I suggest updating the project description to explicitly mention that the library facilitates integration with external TTS services for audio generation, rather than directly generating audio itself.

📝 Additional Information

This adjustment would provide a more accurate understanding of the project's capabilities and help users make informed decisions about its suitability for their needs.

[Bug]

💻 Operating System

Other

🌐 Browser

Safari

🐛 Bug Description

On iOS Safari, STT recording stutters badly, and speech recognition is almost unusable.

🚦 Expected Behavior

No response

📷 Recurrence Steps

Open the STT demo at https://tts.lobehub.com/ in iOS Safari, then listen to the recording; the problem is immediately apparent.

📝 Additional Information

No response
