askorama / orama

🌌 Fast, dependency-free, full-text and vector search engine with typo tolerance, filters, facets, stemming, and more. Works with any JavaScript runtime, browser, server, service!

Home Page: https://docs.askorama.ai

License: Other


orama's Introduction


Full-text, vector, and hybrid search with a unique API.
On your browser, server, mobile app, or at the edge.
In less than 2kb.


Join Orama's Slack channel

If you need more info or help, or want to give general feedback on Orama, join the Orama Slack channel.

Highlighted features

Installation

You can install Orama using npm, yarn, pnpm, or bun:

npm i @orama/orama

Or import it directly in a browser module:

<html>
  <body>
    <script type="module">
      import { create, search, insert } from 'https://unpkg.com/@orama/orama@latest/dist/index.js'

      // ...
    </script>
  </body>
</html>

With Deno, you can just use the same CDN URL or use npm specifiers:

import { create, search, insert } from 'npm:@orama/orama'

Read the complete documentation at https://docs.askorama.ai.

Usage

Orama is quite simple to use. The first thing to do is to create a new database instance and set an indexing schema:

import { create, insert, remove, search, searchVector } from '@orama/orama'

const db = await create({
  schema: {
    name: 'string',
    description: 'string',
    price: 'number',
    embedding: 'vector[1536]', // Vector size must be expressed during schema initialization
    meta: {
      rating: 'number',
    },
  },
})

Orama currently supports 10 different data types:

Type            Description                                       Example
string          A string of characters.                           'Hello world'
number          A numeric value, either float or integer.         42
boolean         A boolean value.                                  true
enum            An enum value.                                    'drama'
geopoint        A geopoint value.                                 { lat: 40.7128, lon: 74.0060 }
string[]        An array of strings.                              ['red', 'green', 'blue']
number[]        An array of numbers.                              [42, 91, 28.5]
boolean[]       An array of booleans.                             [true, false, false]
enum[]          An array of enums.                                ['comedy', 'action', 'romance']
vector[<size>]  A vector of numbers to perform vector search on.  [0.403, 0.192, 0.830]

Orama will only index properties specified in the schema but will allow you to set and store additional data if needed.

Once the db instance is created, you can start adding some documents:

await insert(db, {
  name: 'Wireless Headphones',
  description: 'Experience immersive sound quality with these noise-cancelling wireless headphones.',
  price: 99.99,
  embedding: [...],
  meta: {
    rating: 4.5,
  },
})

await insert(db, {
  name: 'Smart LED Bulb',
  description: 'Control the lighting in your home with this energy-efficient smart LED bulb, compatible with most smart home systems.',
  price: 24.99,
  embedding: [...],
  meta: {
    rating: 4.3,
  },
})

await insert(db, {
  name: 'Portable Charger',
  description: 'Never run out of power on-the-go with this compact and fast-charging portable charger for your devices.',
  price: 29.99,
  embedding: [...],
  meta: {
    rating: 3.6,
  },
})

After the data has been inserted, you can finally start to query the database.

const searchResult = await search(db, {
  term: 'headphones',
})

In the example above, Orama looks in every string property specified in the schema for documents containing the word "headphones" and returns:

{
  elapsed: {
    raw: 99512,
    formatted: '99μs',
  },
  hits: [
    {
      id: '41013877-56',
      score: 0.925085832971998432,
      document: {
        name: 'Wireless Headphones',
        description: 'Experience immersive sound quality with these noise-cancelling wireless headphones.',
        price: 99.99,
        meta: {
          rating: 4.5
        }
      }
    }
  ],
  count: 1
}

You can also restrict the lookup to a specific property:

const searchResult = await search(db, {
  term: 'immersive sound quality',
  properties: ['description'],
})

Result:

{
  elapsed: {
    raw: 21492,
    formatted: '21μs',
  },
  hits: [
    {
      id: '41013877-56',
      score: 0.925085832971998432,
      document: {
        name: 'Wireless Headphones',
        description: 'Experience immersive sound quality with these noise-cancelling wireless headphones.',
        price: 99.99,
        meta: {
          rating: 4.5
        }
      }
    }
  ],
  count: 1
}

You can use non-string data to filter, group, and create facets:

const searchResult = await search(db, {
  term: 'immersive sound quality',
  where: {
    price: {
      lte: 199.99
    },
    'meta.rating': {
      gt: 4
    }
  },
})
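To make the semantics of a where clause concrete, here is a minimal plain-JavaScript sketch of how such a filter could be evaluated against a single document. It is not Orama's implementation: getByPath and matchesWhere are hypothetical helpers, and the operator set (gt, gte, lt, lte, eq) is an assumption extended by analogy from the operators used in this document. Nested properties are addressed with a dot path such as 'meta.rating'.

```javascript
// Illustrative only: hypothetical helpers, not part of Orama's API.

// Resolves a dot-separated path such as 'meta.rating' on a document.
function getByPath(doc, path) {
  return path
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), doc)
}

// Comparison operators, named after the ones used in `where` clauses.
const operators = {
  gt: (a, b) => a > b,
  gte: (a, b) => a >= b,
  lt: (a, b) => a < b,
  lte: (a, b) => a <= b,
  eq: (a, b) => a === b,
}

// Returns true when the document satisfies every condition in `where`.
function matchesWhere(doc, where) {
  return Object.entries(where).every(([path, conditions]) => {
    const value = getByPath(doc, path)
    return Object.entries(conditions).every(([op, expected]) => {
      const compare = operators[op]
      if (!compare) throw new Error(`Unsupported operator: ${op}`)
      // This sketch only handles numeric properties.
      return typeof value === 'number' && compare(value, expected)
    })
  })
}

const headphones = { name: 'Wireless Headphones', price: 99.99, meta: { rating: 4.5 } }
const charger = { name: 'Portable Charger', price: 29.99, meta: { rating: 3.6 } }
const where = { price: { lte: 199.99 }, 'meta.rating': { gt: 4 } }

console.log(matchesWhere(headphones, where)) // true
console.log(matchesWhere(charger, where))    // false
```

All conditions are combined with a logical AND, so a document must satisfy both the price and the rating constraint to be kept.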

Performing hybrid and vector search

Orama is a full-text and vector search engine. This allows you to adopt different kinds of search paradigms depending on your specific use case.

To perform vector or hybrid search, you can use the same search method used for full-text search.

You'll just have to specify which property you want to perform vector search on, and a vector to be used to perform vector similarity:

const searchResult = await search(db, {
  mode: 'vector', // or 'hybrid'
  vector: {
    value: [...], // OpenAI embedding or similar vector to be used as an input
    property: 'embedding' // Property to search through. Mandatory for vector search
  }
})
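For intuition about what "vector similarity" means here, the sketch below ranks a few documents against a query vector. The source does not state which similarity measure Orama applies, so this example assumes cosine similarity, a common choice for embeddings. cosineSimilarity and rankBySimilarity are hypothetical helpers written for illustration, and the 3-dimensional vectors are made up (real embeddings, such as the 1536-dimensional ones in the schema above, are much longer).

```javascript
// Illustrative only: assumes cosine similarity; not Orama's implementation.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('Vectors must have the same length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Scores every document on the given vector property, highest score first.
function rankBySimilarity(queryVector, docs, property) {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(queryVector, doc[property]) }))
    .sort((x, y) => y.score - x.score)
}

// Made-up 3-dimensional vectors, for demonstration only.
const docs = [
  { id: 'a', embedding: [0.9, 0.1, 0.0] },
  { id: 'b', embedding: [0.0, 1.0, 0.0] },
  { id: 'c', embedding: [0.5, 0.5, 0.0] },
]

const ranked = rankBySimilarity([1, 0, 0], docs, 'embedding')
console.log(ranked.map((r) => r.doc.id)) // [ 'a', 'c', 'b' ]
```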

If you're using the Orama Secure AI Proxy (highly recommended), you can skip the vector configuration at search time, since the official Orama Secure AI Proxy plugin will take care of it automatically for you:

import { create, search } from '@orama/orama'
import { pluginSecureProxy } from '@orama/plugin-secure-proxy'

const secureProxy = await pluginSecureProxy({
  apiKey: '<YOUR-PUBLIC-API-KEY>',
  defaultProperty: 'embedding', // the default property to perform vector and hybrid search on
  model: 'openai/text-embedding-ada-002' // the model to use to generate embeddings
})

const db = await create({
  schema: {
    name: 'string',
    description: 'string',
    price: 'number',
    embedding: 'vector[1536]',
    meta: {
      rating: 'number',
    },
  },
  plugins: [secureProxy]
})

const resultsHybrid = await search(db, {
  mode: 'vector', // or 'hybrid'
  term: 'Videogame for little kids with a passion about ice cream',
  where: {
    price: {
      lte: 19.99
    },
    'meta.rating': {
      gte: 4.5
    }
  }
})

Performing Geosearch

Orama supports Geosearch as a search filter. It will search through all the properties specified as geopoint in the schema:

import { create, insert, search } from '@orama/orama'

const db = await create({
  schema: {
    name: 'string',
    location: 'geopoint'
  }
})

await insert(db, { name: 'Duomo di Milano', location: { lat: 45.46409, lon: 9.19192 } })
await insert(db, { name: 'Piazza Duomo',    location: { lat: 45.46416, lon: 9.18945 } })
await insert(db, { name: 'Piazzetta Reale', location: { lat: 45.46339, lon: 9.19092 } })

const searchResult = await search(db, {
  term: 'Duomo',
  where: {
    location: {           // The property we want to filter by
      radius: {           // The filter we want to apply (in this case, "radius")
        coordinates: {    // The central coordinate
          lat: 45.4648,
          lon: 9.18998
        },
        unit: 'm',        // The unit of measurement. The default is "m" (meters)
        value: 1000,      // The radius length. In this case, 1 km
        inside: true      // Whether to return the documents inside or outside the radius. The default is "true"
      }
    }
  }
})

Orama Geosearch APIs support distance-based search (via radius), or polygon-based search (via polygon).

By default, Orama will use the Haversine formula to perform Geosearch, but high-precision search can be enabled by passing the highPrecision option in your radius or polygon configuration. This will tell Orama to use Vincenty's formulae instead, which are more precise for longer distances.
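As a rough illustration of the default method, the following sketch computes Haversine distances for the three documents inserted earlier, measured from the center of the radius filter. It is a textbook version of the formula, not Orama's internal code: haversineDistance is a hypothetical helper, and the mean Earth radius of 6,371,000 m is an assumption of this sketch.

```javascript
// Illustrative only: a textbook Haversine implementation, not Orama's code.

const EARTH_RADIUS_M = 6371000 // assumed mean Earth radius, in meters

const toRadians = (degrees) => (degrees * Math.PI) / 180

// Great-circle distance in meters between two { lat, lon } points.
function haversineDistance(a, b) {
  const dLat = toRadians(b.lat - a.lat)
  const dLon = toRadians(b.lon - a.lon)
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) * Math.sin(dLon / 2) ** 2
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h))
}

// Center and documents taken from the Geosearch example above.
const center = { lat: 45.4648, lon: 9.18998 }
const places = [
  { name: 'Duomo di Milano', location: { lat: 45.46409, lon: 9.19192 } },
  { name: 'Piazza Duomo', location: { lat: 45.46416, lon: 9.18945 } },
  { name: 'Piazzetta Reale', location: { lat: 45.46339, lon: 9.19092 } },
]

// Keep only the places within 1000 meters of the center.
const insideRadius = places.filter(
  (place) => haversineDistance(center, place.location) <= 1000
)

console.log(insideRadius.map((place) => place.name))
```

All three places lie a few hundred meters from the center at most, so each one passes the 1000 m radius check. In the Orama query above, the search term 'Duomo' narrows the results further.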

Read more in the official docs.

Official Docs

Read the complete documentation at https://docs.askorama.ai.

Official Orama Plugins

Write your own plugin: https://docs.askorama.ai/open-source/plugins/writing-your-own-plugins

License

Orama is licensed under the Apache 2.0 license.
