hkulekci / qdrant-php

Qdrant is a vector similarity engine & vector database. It deploys as an API service providing search for the nearest high-dimensional vectors. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more!

License: MIT License

PHP 100.00%
php php-client qdrant qdrant-vector-database

qdrant-php's Introduction

Qdrant PHP Client

This library is a PHP Client for Qdrant.

Installation

You can install the client in your PHP project using composer:

composer require hkulekci/qdrant

An example that creates a collection:

use Qdrant\Http\GuzzleClient;
use Qdrant\Models\Request\CreateCollection;
use Qdrant\Models\Request\VectorParams;
use Qdrant\Qdrant;

include __DIR__ . "/../vendor/autoload.php";
include_once 'config.php';

$config = new \Qdrant\Config(QDRANT_HOST);
$config->setApiKey(QDRANT_API_KEY);

$client = new Qdrant(new GuzzleClient($config));

$createCollection = new CreateCollection();
$createCollection->addVector(new VectorParams(1024, VectorParams::DISTANCE_COSINE), 'image');
$response = $client->collections('images')->create($createCollection);

Now we can insert a point:

use Qdrant\Models\PointsStruct;
use Qdrant\Models\PointStruct;
use Qdrant\Models\VectorStruct;

$points = new PointsStruct();
$points->addPoint(
    new PointStruct(
        (int) $imageId,
        new VectorStruct($data['embeddings'][0], 'image'),
        [
            'id' => 1,
            'meta' => 'Meta data'
        ]
    )
);
$client->collections('images')->points()->upsert($points);

When upserting data, if you want to wait until the upsert has actually been applied, you can pass query parameters:

$client->collections('images')->points()->upsert($points, ['wait' => 'true']);

More parameters are listed in the API reference: https://qdrant.github.io/qdrant/redoc/index.html#tag/points/operation/upsert_points

Search with a filter:

use Qdrant\Models\Filter\Condition\MatchString;
use Qdrant\Models\Filter\Filter;
use Qdrant\Models\Request\SearchRequest;
use Qdrant\Models\VectorStruct;

$searchRequest = (new SearchRequest(new VectorStruct($embedding, 'elev_pitch')))
    ->setFilter(
        (new Filter())->addMust(
            new MatchString('name', 'Palm')
        )
    )
    ->setLimit(10)
    ->setParams([
        'hnsw_ef' => 128,
        'exact' => false,
    ])
    ->setWithPayload(true);

$response = $client->collections('images')->points()->search($searchRequest);

qdrant-php's People

Contributors

fafiebig, gregpriday, hkulekci, mbukovy, nicr42, snapshotpl, yunwuxin

qdrant-php's Issues

Feature request: Decouple from Guzzle

The way the project is structured is a bit odd. It depends on Guzzle, but the code's actual dependency on Guzzle sits behind the HttpClientInterface abstraction layer. So it's possible to use a different HTTP client (such as nyholm/http-client or symfony/http-client), but the composer.json forces the user of the library to also load Guzzle, which may be undesirable and create version conflicts.

In my opinion, it would make more sense to just pass a PSR-18 Psr\Http\Client\ClientInterface along with a Config object into the Qdrant constructor. The Qdrant class could then have a function that takes a PSR-7 RequestInterface, uses the information from Config to set the right path and headers, passes it into the provided HttpClientInterface, and processes the response (the exception handling that is now present in the Guzzle class).
You could even extract this function to another class and pass that to the endpoint classes, which makes more sense from the perspective of dependency inversion.
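
To make the proposal concrete, here is a minimal sketch of such an extracted class. Psr18Transport, its constructor parameters and the RuntimeException handling are hypothetical and not part of qdrant-php; it takes a host, port and API key directly rather than the library's Config object, whose getters are not shown here. It assumes the psr/http-client and psr/http-message packages are installed, and that Qdrant reads the key from the api-key header and serves REST on port 6333 by default.

```php
<?php
declare(strict_types=1);

use Psr\Http\Client\ClientInterface;
use Psr\Http\Message\RequestInterface;
use Psr\Http\Message\ResponseInterface;

/**
 * Hypothetical transport (not part of qdrant-php): decorates any PSR-18
 * client with the connection details needed to reach a Qdrant instance.
 */
final class Psr18Transport
{
    public function __construct(
        private ClientInterface $client,
        private string $host,
        private int $port = 6333,
        private ?string $apiKey = null,
        private string $scheme = 'http'
    ) {
    }

    public function send(RequestInterface $request): ResponseInterface
    {
        // Point the request at the configured server; the path set by the
        // endpoint class is left untouched.
        $uri = $request->getUri()
            ->withScheme($this->scheme)
            ->withHost($this->host)
            ->withPort($this->port);

        $request = $request
            ->withUri($uri)
            ->withHeader('Content-Type', 'application/json');

        if ($this->apiKey !== null) {
            $request = $request->withHeader('api-key', $this->apiKey);
        }

        $response = $this->client->sendRequest($request);

        // Stand-in for the exception handling that currently lives in the
        // Guzzle-specific class.
        if ($response->getStatusCode() >= 400) {
            throw new RuntimeException(
                sprintf('Qdrant returned HTTP %d', $response->getStatusCode())
            );
        }

        return $response;
    }
}
```

With something like this in place, composer.json would only need the PSR interface packages, and users could supply whichever PSR-18 client they already have.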

I'd be willing to make a PR for this if you agree with the proposed direction.

If you have any questions, feel free to ask.

We need to implement PointsBatch class

We had a solution for PointsBatch, but it did not work with multiple named vectors. For this reason, the batch method works with PointsStruct for now.

delete method of Payload class needs to be fixed

The $filter parameter should be a Qdrant\Models\Filter\Filter object, and deleting by filter still needs to be tested. In addition, for this method, either $keys or $filters could be null.

    // Qdrant\Endpoints\Collections\Points\Payload
    public function delete(array $points, array $keys, array $filters): Response
    {
        return $this->client->execute(
            $this->createRequest(
                'POST',
                '/collections/' . $this->getCollectionName() . '/points/payload/delete',
                [
                    'filters' => $filters,
                    'keys' => $keys,
                    'points' => $points,
                ]
            )
        );
    }

Documentation:

https://qdrant.github.io/qdrant/redoc/index.html#tag/points/operation/delete_payload
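
One possible shape for the fix is sketched below. buildDeletePayloadBody is a hypothetical helper, not an existing method of the library. It assumes the request body uses the singular key filter (as I read the linked API reference) and that the caller has already converted the Filter object to an array. Parts that are null are simply left out of the body.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds the body for the delete-payload request
 * and drops every part that was not provided (null).
 */
function buildDeletePayloadBody(?array $points, ?array $keys, ?array $filter): array
{
    $body = [
        'points' => $points,
        'keys' => $keys,
        'filter' => $filter, // assumed key name, singular
    ];

    // Keep only the entries that were actually supplied.
    return array_filter($body, static fn ($value): bool => $value !== null);
}

// Delete the "meta" key from two points, without a filter.
$body = buildDeletePayloadBody([1, 2], ['meta'], null);
```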

TypeError encountered in Qdrant\Models\VectorStruct::getName() when the $name attribute is null.

Summary

TypeError encountered in Qdrant\Models\VectorStruct::getName() when the $name attribute is null.

Description

When an instance of VectorStruct is created with only the $vector argument provided and $name left as null, a TypeError occurs with the following message:

TypeError Qdrant\Models\VectorStruct::getName(): Return value must be of type string, null returned.

The issue seems to arise from the getName() method, which expects to return a string. However, when $name is null, it does not meet this expectation.

The error was resolved by modifying the getName() function to return an empty string when $name is null, like this:

return $this->name ? $this->name : '';

Environment

  • PHP Version: 8.1.17
  • Library Version: v0.4

Steps to Reproduce

  1. Create an instance of VectorStruct with only the $vector argument provided, leaving $name as null.
  2. Call the getName() method on this instance.
  3. Observe the TypeError.

Expected Behavior

If the $name argument is not provided when instantiating VectorStruct, calling the getName() method should either not cause a TypeError, or it should be documented that providing a $name is required.

Actual Behavior

A TypeError is thrown when getName() is called on an instance of VectorStruct that was instantiated with $name as null.

Possible Fix

Modify the getName() function to return an empty string when $name is null. This would look like:

return $this->name ? $this->name : '';
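
An alternative to returning an empty string would be to make the return type nullable, so callers can tell an unnamed vector from a named one. The sketch below is only an illustration: NamedVectorSketch and isNamed() are hypothetical names, not the library's VectorStruct.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a vector struct whose name is optional.
 */
final class NamedVectorSketch
{
    public function __construct(
        private array $vector,
        private ?string $name = null
    ) {
    }

    // A nullable return type means a missing name no longer triggers a TypeError.
    public function getName(): ?string
    {
        return $this->name;
    }

    public function isNamed(): bool
    {
        return $this->name !== null;
    }
}

$unnamed = new NamedVectorSketch([0.1, 0.2]);
$named = new NamedVectorSketch([0.1, 0.2], 'image');
```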

Endpoints

Collections

  • GET List collections
  • GET Collection info
  • PUT Create collection
  • PATCH Update collection parameters
  • DEL Delete collection
  • POST Update aliases of the collections
  • PUT Create index for field in collection
  • DEL Delete index for field in collection
  • GET Collection cluster info
  • POST Update collection cluster setup
  • GET List aliases for collection
  • GET List collections aliases
  • PUT Recover from a snapshot
  • GET List collection snapshots
  • POST Create collection snapshot
  • DEL Delete collection snapshot
  • GET Download collection snapshot

Points

  • GET Get point
  • POST Get points
  • PUT Upsert points
  • POST Delete points
  • POST Set payload
  • PUT Overwrite payload
  • POST Delete payload
  • POST Clear payload
  • POST Batch Update Point
  • POST Scroll points
  • POST Search points
  • POST Search batch points
  • POST Recommend points #18
  • POST Recommend batch points
  • POST Count points

Cluster

  • GET Get cluster status info
  • POST Tries to recover current peer Raft state.
  • DEL Remove peer from the cluster
  • GET Collection cluster info
  • POST Update collection cluster setup

Snapshots

  • PUT Recover from a snapshot
  • GET List collection snapshots
  • POST Create collection snapshot
  • DEL Delete collection snapshot
  • GET Download collection snapshot
  • GET List of storage snapshots
  • POST Create storage snapshot
  • DEL Delete storage snapshot
  • GET Download storage snapshot

Service

  • GET Collect telemetry data
  • GET Collect Prometheus metrics data
  • POST Set lock options
  • GET Get lock options

indexingThreshold

The indexingThreshold value may be set to 0 (as per the Qdrant docs), but it is then ignored and left out of the request.

A simple fix in OptimizersConfig.php:

if ($this->indexingThreshold || $this->indexingThreshold === 0) {
    $data['indexing_threshold'] = $this->indexingThreshold;
}
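
The root cause is that 0 is falsy in PHP, so a plain truthiness check treats it like an unset value. A variant of the fix, assuming the property is typed as a nullable int, is to compare against null instead. optimizerConfigToArray below is a hypothetical standalone function used only to illustrate the check; it is not the library's method.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration: include indexing_threshold whenever it
 * was set, including the valid value 0.
 */
function optimizerConfigToArray(?int $indexingThreshold): array
{
    $data = [];

    // A null check keeps 0, which a truthiness check would drop.
    if ($indexingThreshold !== null) {
        $data['indexing_threshold'] = $indexingThreshold;
    }

    return $data;
}
```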

Add getters for structs.

I've found a few instances where I needed to read property values from structs, namely MultiVectorStruct, PointsStruct, PointStruct and VectorStruct. A very simple solution would be to make these properties public, but a more maintainable solution might be a single trait for all structs that grants read access to all protected properties.

I haven't checked this implementation, but something like this:

<?php
namespace Qdrant\Traits;

use InvalidArgumentException;
use ReflectionProperty;

/**
 * Trait ProtectedPropertyAccessor
 *
 * Allows access to protected properties through the magic __get method.
 */
trait ProtectedPropertyAccessor
{
    /**
     * Magic method to implement generic getter functionality for protected properties.
     *
     * @param string $property The name of the property to get.
     * @return mixed The value of the property.
     * @throws InvalidArgumentException if the property doesn't exist or is not protected.
     */
    public function __get(string $property)
    {
        if (property_exists($this, $property)) {
            $reflection = new ReflectionProperty($this, $property);
            if ($reflection->isProtected()) {
                return $this->$property;
            } else {
                throw new InvalidArgumentException("Access to property '$property' is not allowed");
            }
        }

        throw new InvalidArgumentException("Property '$property' does not exist");
    }
}
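
A condensed usage sketch of the same idea follows. ReadsProtectedProperties and PointSketch are hypothetical names chosen for this example. It shows that a __get-based trait gives read-only access: reads from outside the class are routed through __get, while writes still fail because no __set is defined.

```php
<?php
declare(strict_types=1);

// Condensed, hypothetical variant of the trait, for illustration only.
trait ReadsProtectedProperties
{
    public function __get(string $property): mixed
    {
        if (!property_exists($this, $property)) {
            throw new InvalidArgumentException("Property '$property' does not exist");
        }

        return $this->$property;
    }
}

final class PointSketch
{
    use ReadsProtectedProperties;

    public function __construct(
        protected int $id,
        protected array $payload = []
    ) {
    }
}

$point = new PointSketch(7, ['meta' => 'Meta data']);
$id = $point->id; // inaccessible from here, so PHP calls __get

try {
    $point->id = 8; // no __set is defined, so PHP raises an Error
    $writeBlocked = false;
} catch (Error $e) {
    $writeBlocked = true;
}
```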

What do you think @hkulekci?

Enhance Flexibility with an `executeRaw` Function

In case of future updates and features from Qdrant, we should increase the adaptability of our library. With this proposed change, as soon as Qdrant introduces new features, users of this library can instantly integrate and utilize them without having to wait for explicit support in this package.

I'm suggesting two potential paths:

  1. Direct Implementation in Qdrant Client: We can add an executeRaw function to the main Qdrant client class. Here's a draft of that function:

public function executeRaw(string $method, string $uri, array $body = []): ResponseInterface
{
    $httpFactory = new HttpFactory();
    $request = $httpFactory->createRequest($method, $uri);
    
    if ($body) {
        try {
            $request = $request->withBody(
                $httpFactory->createStream(json_encode($body, JSON_THROW_ON_ERROR))
            );
        } catch (\JsonException $e) {
            throw new InvalidArgumentException('Json parse error!');
        }
    }

    return $this->client->execute($request);
}

However, this approach does introduce some code redundancy.

  2. Utilize AbstractEndpoint: Instead of direct implementation, we can modify the createRequest method in AbstractEndpoint to be public and static. This way, we centralize the request creation process.

public static function createRequest(string $method, string $uri, array $body = []): RequestInterface
{
    //... [code]
}

Given this modification, the executeRaw function within the main client becomes:

public function executeRaw(string $method, string $uri, array $body = []): ResponseInterface
{
    $request = AbstractEndpoint::createRequest($method, $uri, $body);
    return $this->client->execute($request);
}

@hkulekci, which approach aligns more with our vision for the library? Once we decide, I'll be happy to draft a PR for review.

Support for multiple vectors in VectorStruct

Hey @hkulekci,

First up, I've gotta say your library is seriously useful - it's saved me loads of time. So thanks for that!

I've been using it for my Laravel Scout engine and noticed something. Correct me if I'm wrong, but it doesn't look like it has support for multiple vectors, even though Qdrant supports that?

We could tweak the VectorStruct constructor to take an array or a string for $name. If it gets an array, it knows it's dealing with multiple vectors. If it's a string, then it's just a single vector. This maintains backward compat.

Here's a bit of code to show what I mean:

class VectorStruct
{
    protected array $vectors;

    public function __construct(array $vector, $name = null)
    {
        // Check if $name is an array. If true, it means we have multiple vectors.
        // Otherwise, it's a single vector.
        if (is_array($name)) {
            // Multiple vectors. $vector is now an array of vector names, and $name is an array of vectors.
            $this->vectors = array_combine($vector, $name);
        } else {
            // Single vector.
            $this->vectors = [$name => $vector];
        }
    }

    public function toSearch(): array
    {
        $search = [];
        foreach ($this->vectors as $name => $vector) {
            $search[] = [
                'name' => $name,
                'vector' => $vector,
            ];
        }
        return $search;
    }

    public function toArray(): array
    {
        return $this->vectors;
    }
}
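
Another way to get the same result, sketched here purely as an illustration, is a separate container that maps each vector name to its values, which is the shape Qdrant accepts for points with several named vectors. NamedVectorsSketch and its add() method are hypothetical names, not existing library classes.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical container for several named vectors belonging to one point.
 */
final class NamedVectorsSketch
{
    /** @var array<string, float[]> vector name => vector values */
    private array $vectors = [];

    public function add(string $name, array $vector): self
    {
        $this->vectors[$name] = $vector;

        return $this;
    }

    // Serializes to the name => values map used in the point's "vector" field.
    public function toArray(): array
    {
        return $this->vectors;
    }
}

$vectors = (new NamedVectorsSketch())
    ->add('image', [0.1, 0.2])
    ->add('text', [0.3, 0.4]);
```

Keeping this in its own class would leave the existing VectorStruct constructor untouched, so single-vector callers would not need to change.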

If you think it's a good fit, I'd be happy to throw together a PR. Or if I've got it all wrong and missed how to do this the official way, a nudge in the right direction would be awesome.

Thanks for considering!

Greg
