Hello, I'm currently facing performance challenges with pgvector on

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Great, thanks for sharing <a class="user-mention notranslate" data-hovercard-type="use

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Performance Issue with Large Tables and HNSW Indexes about pgvector HOT 9 CLOSED

williamjeong2 commented on June 8, 2024

Performance Issue with Large Tables and HNSW Indexes

from pgvector.

Comments (9)

williamjeong2 commented on June 8, 2024 2

@ankane @pashkinelfe Your suggestions were very helpful.

This is the result of using 'pg_prewarm'. In the results below, the top 2 were using 'pg_prewarm' to warm up the indexes. It didn't take long.

Average QPS: 110.34824108994871, Total Queries Processed: 1079
Average QPS: 107.39188954299956, Total Queries Processed: 1112
Average QPS: 5.389447228919696, Total Queries Processed: 54
Average QPS: 16.25110630066733, Total Queries Processed: 155
Average QPS: 5.931909562061934, Total Queries Processed: 60
Average QPS: 6.5119287118480065, Total Queries Processed: 63
Average QPS: 6.479642628293995, Total Queries Processed: 67
Average QPS: 6.736313818006796, Total Queries Processed: 70

To summarize, I can say that using 'pg_prewarm' has performed the best in my case so far. Of course, in my case, especially since I have a very large index size (400 GB for the vector indexes of all tables combined), I think it will require a very large memory. Nevertheless, I think it should be possible to achieve faster TPS than what I have now, because I have seen very fast QPS results in blog posts, including supabase. Other control variables such as vector length, postgresql configuration, etc. remain to be seen.

from pgvector.

ankane commented on June 8, 2024 1

It looks like the buffer hit rate is pretty low, so a lot of reads are happening from disk. I suspect you'll see better performance with an SSD, especially since HNSW does a lot of random access. I don't think partitioning will help in this situation (unless you're filtering by the partition key). You could also try prewarming the index with pg_prewarm to get it in memory, but it may not stay there.

from pgvector.

williamjeong2 commented on June 8, 2024 1

I measured QPS with just moving the database to SSD (each line is a different table, with more data stored as you go down. The top row has 10 million rows and the last row has 20 million rows, with more rows as you go down):

Average QPS: 7.245211969821667, Total Queries Processed: 8
Average QPS: 6.831864681418859, Total Queries Processed: 7
Average QPS: 77.75631549154483, Total Queries Processed: 76
Average QPS: 78.14635672947067, Total Queries Processed: 99
Average QPS: 6.005239444042287, Total Queries Processed: 7
Average QPS: 5.587675697528546, Total Queries Processed: 7
Average QPS: 5.5092620964731545, Total Queries Processed: 6
Average QPS: 5.468562478180391, Total Queries Processed: 6

And this is the output of explain (analyze, buffers) for one of the queries:

Table that had slow QPS.

Limit  (cost=224.84..226.83 rows=3 width=1261) (actual time=163.940..164.732 rows=3 loops=1)
  Buffers: shared hit=473 read=1186
  ->  Index Scan using paper_2016_embedding_idx on paper_2016  (cost=224.84..3815999.63 rows=5742623 width=1261) (actual time=163.937..164.727 rows=3 loops=1)
        Order By: (embedding <=> '[0.6871753,-0.11484199,-0.52307606,...]'::vector)
        Buffers: shared hit=473 read=1186
Planning:
  Buffers: shared hit=1
Planning Time: 0.153 ms
Execution Time: 164.778 ms

Table that had fast QPS.

Limit  (cost=224.84..226.81 rows=3 width=1245) (actual time=15.788..16.005 rows=3 loops=1)
  Buffers: shared hit=1210 read=29
  ->  Index Scan using paper_2018_embedding_idx on paper_2018  (cost=224.84..5054610.74 rows=7706712 width=1245) (actual time=15.784..15.998 rows=3 loops=1)
        Order By: (embedding <=> '[0.6871753,-0.11484199,-0.52307606,...]'::vector)
        Buffers: shared hit=1210 read=29
Planning:
  Buffers: shared hit=42 read=10
Planning Time: 1.375 ms
Execution Time: 16.055 ms

Things are much better than before. But It looks like only the fastest tables are in the buffer. There is still a 'pg_prewarm' task left. However, in my case, my largest table is 100G excluding indexes(80G), so I don't expect it to work well. But I will give it a try.

from pgvector.

ankane commented on June 8, 2024 1

Great, thanks for sharing @williamjeong2!

from pgvector.

ankane commented on June 8, 2024

Hi @williamjeong2, can you paste the output of EXPLAIN (ANALYZE, BUFFERS) for one of the queries?

from pgvector.

williamjeong2 commented on June 8, 2024

Hi @ankane, here is the output(768 dim):

Limit  (cost=224.84..226.81 rows=3 width=1247) (actual time=8249.103..8318.815 rows=3 loops=1)
  Buffers: shared hit=128 read=1246
  ->  Index Scan using paper_embedding_idx on paper  (cost=224.84..11281870.44 rows=17203648 width=1247) (actual time=8249.099..8318.807 rows=3 loops=1)
        Order By: (embedding <=> '[0.8093864,0.60382533,...]'::vector)
        Buffers: shared hit=128 read=1246
Planning:
  Buffers: shared hit=118
Planning Time: 2.416 ms
Execution Time: 8319.119 ms

from pgvector.

williamjeong2 commented on June 8, 2024

@ankane
Thanks so much, I'll try the things you suggested to me and share them here for use cases.

from pgvector.

ankane commented on June 8, 2024

Awesome, sounds good.

from pgvector.

pashkinelfe commented on June 8, 2024

@williamjeong2 Considering you have 20Gb HNSW index, it is likely that pg_prewarm will help. Otherwise 'natural' warm-up could take a long time. To check, try running your test for around 1000 sec and see how TPS evolves. Or just use pg_prewarm )

from pgvector.

Performance Issue with Large Tables and HNSW Indexes about pgvector HOT 9 CLOSED

Comments (9)

Table that had slow QPS.

Table that had fast QPS.

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent