Comments (9)

lin-goo commented on June 7, 2024

No, it means the planner will use (and show) a different plan when there's more data.

Hi~ I have now increased the amount of data to 8000 entries and the index is now working properly. Thanks again for your answer!

-- analyse result
Limit  (cost=108.60..108.72 rows=2 width=16) (actual time=4.346..4.375 rows=2 loops=1)
  ->  Index Scan using faces_tsv_content_hnsw_idx on faces  (cost=108.60..628.14 rows=8923 width=16) (actual time=4.344..4.372 rows=2 loops=1)
        Order By: (tsv_content <=> '[-0.121626005... ,0.015510366]'::vector)
Planning Time: 0.357 ms
Execution Time: 4.467 ms

from pgvector.

ankane commented on June 7, 2024

Hi @lin-goo, it looks like you only have ~500 rows, so a table scan will likely be around the same speed. See the docs for how to encourage the planner to use the index.
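
One way to do this, as a sketch: PostgreSQL's standard enable_seqscan setting can be switched off for a single transaction so that the planner strongly prefers the index. The example assumes the tf table and tsv_content column from the CREATE TABLE statement in this thread; the query vector is only an illustration.

```sql
BEGIN;
-- Discourage sequential scans for this transaction only (standard PostgreSQL setting).
SET LOCAL enable_seqscan = off;
-- Illustrative query: tf and tsv_content come from this thread, the vector is made up.
EXPLAIN ANALYSE
SELECT id FROM tf
ORDER BY tsv_content <=> '[1, 2, 3]'
LIMIT 2;
COMMIT;
```

This is mainly useful for confirming that the index can be used at all. It is not meant as a permanent setting.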

lin-goo commented on June 7, 2024

I recreated the table with fewer dimensions, and this time the index is used. Details below:

-- create table sql
CREATE TABLE tf (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT NOT NULL DEFAULT 0,
    tsv_content vector(3) UNIQUE NOT NULL,
    created_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_time BIGINT DEFAULT 0
);
CREATE INDEX tf_tsv_content_hnsw_idx ON tf USING hnsw (tsv_content vector_cosine_ops) WITH (m = 16, ef_construction = 64);


-- query sql
EXPLAIN ANALYSE SELECT id FROM tf ORDER BY
    tsv_content <=> '[1, 2, 3]'
LIMIT 2;


-- analyse result
Limit  (cost=4.48..4.60 rows=2 width=16) (actual time=0.045..0.047 rows=2 loops=1)
  ->  Index Scan using tf_tsv_content_hnsw_idx on tf  (cost=4.48..54.60 rows=810 width=16) (actual time=0.043..0.044 rows=2 loops=1)
        Order By: (tsv_content <=> '[0,0,0]'::vector)
Planning Time: 0.077 ms
Execution Time: 0.070 ms

lin-goo commented on June 7, 2024

Does whether the index gets used depend on the number of vector dimensions? @ankane

lin-goo commented on June 7, 2024

Hi @lin-goo, it looks like you only have ~500 rows, so a table scan will likely be around the same speed. See the docs for how to encourage the planner to use the index.

There are only 500 rows because the project is still in development and doesn't store more data yet. The production environment will hold far more.

ankane commented on June 7, 2024

The difference likely has to do with TOAST (vectors over 498 dimensions / 2 KB are stored out-of-line by default, and this isn't included in the table scan cost estimate). When there are more rows, it should use the index.
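
The 498-dimension figure can be checked with a little arithmetic, assuming (per the pgvector documentation) that a vector takes 4 bytes per dimension plus 8 bytes of overhead. The queries below are a sketch against the faces table from the plan earlier in this thread. They use only standard PostgreSQL size functions.

```sql
-- 4 bytes per dimension + 8 bytes overhead: 498 dimensions come to 2000 bytes,
-- roughly the size at which PostgreSQL starts moving values out of line.
SELECT 4 * 498 + 8 AS vector_bytes;

-- Stored size in bytes of one vector value (may reflect TOAST compression).
SELECT pg_column_size(tsv_content) AS stored_bytes FROM faces LIMIT 1;

-- Main heap size versus total table size (the latter includes TOAST data).
SELECT pg_size_pretty(pg_relation_size('faces')) AS heap_size,
       pg_size_pretty(pg_table_size('faces'))    AS heap_plus_toast;
```

A heap that is small next to the total table size suggests most of the vector data lives in TOAST, which fits the explanation above for why a table scan looked cheap to the planner.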

lin-goo commented on June 7, 2024

Do you mean that the index is used in the actual query even though the EXPLAIN ANALYSE output doesn't show it?

ankane commented on June 7, 2024

No, it means the planner will use (and show) a different plan when there's more data.

lin-goo commented on June 7, 2024

I'll try increasing the amount of data then and see if the index is used, thank you very much for your reply!
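
A quick way to try this on the three-dimensional tf table is to bulk-load random vectors and refresh the planner statistics. This is only a sketch: the row count of 8000 matches the figure reported in this thread, the random data is purely illustrative, and it relies on pgvector's cast from a float array to vector.

```sql
-- Load 8000 random 3-dimensional vectors (illustrative test data only).
-- The UNIQUE constraint on tsv_content could in principle reject a duplicate,
-- which is very unlikely with random values.
INSERT INTO tf (tsv_content)
SELECT ARRAY[random(), random(), random()]::vector(3)
FROM generate_series(1, 8000);

-- Refresh statistics so the planner sees the new row count.
ANALYZE tf;

-- Re-check the plan.
EXPLAIN ANALYSE
SELECT id FROM tf
ORDER BY tsv_content <=> '[1, 2, 3]'
LIMIT 2;
```

The same approach should carry over to a higher-dimensional table, with the array length matched to the column's dimension.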
