
This project was forked from assemblyai/assemblyai-node-sdk.



Home Page: https://www.assemblyai.com

License: MIT License





AssemblyAI JavaScript SDK

The AssemblyAI JavaScript SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, as well as the latest LeMUR models. It is written primarily for Node.js in TypeScript with all types exported, but also compatible with other runtimes.

Installation

You can install the AssemblyAI SDK with your preferred package manager:

npm install assemblyai
yarn add assemblyai
pnpm add assemblyai
bun add assemblyai

Usage

Import the AssemblyAI package and create an AssemblyAI object with your API key:

import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});

You can now use the client object to interact with the AssemblyAI API.

Create a transcript

When you create a transcript, you can either pass in a URL to an audio file or upload a file directly.

// Transcribe a file at a remote URL
let transcript = await client.transcripts.transcribe({
  audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
});

// Or upload a local file and transcribe it
transcript = await client.transcripts.transcribe({
  audio: "./news.mp4",
});

Note: You can also pass streams and buffers to the audio property.

transcribe queues a transcription job and polls it until the status is completed or error. You can configure the polling interval and polling timeout using these options:

let transcript = await client.transcripts.transcribe(
  {
    audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
  },
  {
    // How frequently the transcript is polled in ms. Defaults to 3000.
    pollingInterval: 1000,
    // How long to wait in ms until the "Polling timeout" error is thrown. Defaults to infinite (-1).
    pollingTimeout: 5000,
  }
);

If you don't want to wait until the transcript is ready, you can use submit:

let transcript = await client.transcripts.submit({
  audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
});

Get a transcript

This will return the transcript object in its current state. If the transcript is still processing, the status field will be queued or processing. Once the transcript is complete, the status field will be completed.

transcript = await client.transcripts.get(transcript.id);

If you created a transcript using submit, you can still poll until the transcript status is completed or error using waitUntilReady:

transcript = await client.transcripts.waitUntilReady(transcript.id, {
  // How frequently the transcript is polled in ms. Defaults to 3000.
  pollingInterval: 1000,
  // How long to wait in ms until the "Polling timeout" error is thrown. Defaults to infinite (-1).
  pollingTimeout: 5000,
});

List transcripts

This will return a page of transcripts you created.

const page = await client.transcripts.list();

You can also paginate through all pages:

let nextPageUrl: string | null = null;
do {
  const page = await client.transcripts.list(nextPageUrl);
  nextPageUrl = page.page_details.next_url;
} while (nextPageUrl !== null);

Delete a transcript

const res = await client.transcripts.delete(transcript.id);

Use LeMUR

Call LeMUR endpoints to summarize, ask questions, generate action items, or run a custom task.

Custom Summary:

const { response } = await client.lemur.summary({
  transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
  answer_format: "one sentence",
  context: {
    speakers: ["Alex", "Bob"],
  },
});

Question & Answer:

const { response } = await client.lemur.questionAnswer({
  transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
  questions: [
    {
      question: "What are they discussing?",
      answer_format: "text",
    },
  ],
});

Action Items:

const { response } = await client.lemur.actionItems({
  transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
});

Custom Task:

const { response } = await client.lemur.task({
  transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
  prompt: "Write a haiku about this conversation.",
});

Transcribe in real-time

Create the real-time transcriber.

const rt = client.realtime.transcriber();

You can also pass in the following options.

const rt = client.realtime.transcriber({
  realtimeUrl: "wss://localhost/override",
  // The API key passed to `AssemblyAI` is used by default
  apiKey: process.env.ASSEMBLYAI_API_KEY,
  sampleRate: 16_000,
  wordBoost: ["foo", "bar"],
});

You can also generate a temporary auth token for real-time.

const token = await client.realtime.createTemporaryToken({ expires_in: 60 });
const rt = client.realtime.transcriber({
  token: token,
});

Warning

Storing your API key in client-facing applications exposes it to anyone inspecting the client. Instead, generate a temporary auth token on the server and pass it to your client.

You can configure the following events.

rt.on("open", ({ sessionId, expiresAt }) => console.log('Session ID:', sessionId, 'Expires at:', expiresAt));
rt.on("close", (code: number, reason: string) => console.log('Closed', code, reason));
rt.on("transcript", (transcript: TranscriptMessage) => console.log('Transcript:', transcript));
rt.on("transcript.partial", (transcript: PartialTranscriptMessage) => console.log('Partial transcript:', transcript));
rt.on("transcript.final", (transcript: FinalTranscriptMessage) => console.log('Final transcript:', transcript));
rt.on("error", (error: Error) => console.error('Error', error));

After configuring your events, connect to the server.

await rt.connect();

Send audio data in chunks.

// Pseudo code for getting audio
getAudio((chunk) => {
  rt.sendAudio(chunk);
});

Or send audio data via a stream by piping to the real-time stream.

audioStream.pipeTo(rt.stream());

Close the connection when you're finished.

await rt.close();

Tests

To run the test suite, first install the dependencies, then run pnpm test:

pnpm install
pnpm test
