Right now Bench hardcodes dimensions at [...]

Find out how much we can compress OpenAI embeddings vectors (jvector issue, closed)

jbellis commented on May 24, 2024

find out how much we can compress openai embeddings vectors

from jvector.

Comments (14)

jbellis commented on May 24, 2024

Let's use the following grid for the existing parameters

        var mGrid = List.of(16, 24, 32, 48);
        var efConstructionGrid = List.of(100, 200, 400);
        var efSearchFactor = List.of(1, 2);
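
As a rough illustration of what this grid implies, the sketch below enumerates every combination of the three lists. The `GridSweep` class, the `Params` holder, and the `combinations` method are hypothetical names used only for this example; they are not taken from Bench, and the real benchmark loop may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

class GridSweep {
    /** One (M, efConstruction, efSearchFactor) combination; a hypothetical holder type. */
    static final class Params {
        final int m;
        final int efConstruction;
        final int efSearchFactor;

        Params(int m, int efConstruction, int efSearchFactor) {
            this.m = m;
            this.efConstruction = efConstruction;
            this.efSearchFactor = efSearchFactor;
        }

        @Override
        public String toString() {
            return "M=" + m + " efConstruction=" + efConstruction + " efSearchFactor=" + efSearchFactor;
        }
    }

    /** Cartesian product of the three grids, in nested-loop order. */
    static List<Params> combinations(List<Integer> mGrid,
                                     List<Integer> efConstructionGrid,
                                     List<Integer> efSearchFactor) {
        List<Params> out = new ArrayList<>();
        for (int m : mGrid) {
            for (int efC : efConstructionGrid) {
                for (int factor : efSearchFactor) {
                    out.add(new Params(m, efC, factor));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        var mGrid = List.of(16, 24, 32, 48);
        var efConstructionGrid = List.of(100, 200, 400);
        var efSearchFactor = List.of(1, 2);

        var combos = combinations(mGrid, efConstructionGrid, efSearchFactor);
        System.out.println(combos.size() + " configurations per dataset");
        combos.forEach(System.out::println);
    }
}
```

With these lists the sweep covers 4 x 3 x 2 = 24 configurations per dataset, before any PQ or on-disk options are multiplied in.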

dlg99 commented on May 24, 2024

Raw output and plots for:

        var files = List.of(
                "../ivec/pages_ada_002",
                "../hdf5/nytimes-256-angular.hdf5",
                "../hdf5/glove-100-angular.hdf5",
                "../hdf5/glove-200-angular.hdf5",
                "../hdf5/fashion-mnist-784-euclidean.hdf5",
                "../hdf5/sift-128-euclidean.hdf5");

        var mGrid = List.of(16, 24, 32, 48);
        var efConstructionGrid = List.of(100, 200, 400);
        var efSearchFactor = List.of(1, 2);
        var diskOptions = List.of(true, false);

bench_ada_hft5.txt

jbellis commented on May 24, 2024

The first graph includes PQ=0; is that a bug in the plot?

jbellis commented on May 24, 2024

I think something is broken with the last one (possibly with the inputs); a recall of 0.003 is way too low.

jbellis commented on May 24, 2024

I would like to see the plots constrained to PQ at 1/8 of the original size (the current one-size-fits-all setting) and at 1/16; if recall does not drop significantly at overquery=2, then also include 1/32.

dlg99 commented on May 24, 2024

PQ=0 is the value plotted on the chart when PQ is not used (diskOptions == false).

dlg99 commented on May 24, 2024

This is ada 100k with the dataset downloaded from S3 (the previous run used the copy from gdrive+slack), and with the PQ dimensions chosen as:

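        // Candidate PQ sizes: dims/2, dims/4, ..., dims/32 (integer division),
        // skipping any candidate that would be 1 or smaller.
        List<Integer> pqDimensions = new ArrayList<>();
        int dims = ds.baseVectors.get(0).length;
        for (int i = 2; i <= 32; i *= 2) {
            if (dims / i > 1) {
                pqDimensions.add(dims / i);
            }
        }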
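
To make the resulting selection concrete, here is a self-contained sketch of the same loop wrapped in a method. The `PqDimensions` class and `candidates` method are hypothetical names for illustration; only the loop logic comes from the snippet above, and the input of 1536 is taken from the "1536/8" remark later in this thread.

```java
import java.util.ArrayList;
import java.util.List;

class PqDimensions {
    /** Returns dims/2, dims/4, ..., dims/32 (integer division), keeping only values greater than 1. */
    static List<Integer> candidates(int dims) {
        List<Integer> pqDimensions = new ArrayList<>();
        for (int i = 2; i <= 32; i *= 2) {
            if (dims / i > 1) {
                pqDimensions.add(dims / i);
            }
        }
        return pqDimensions;
    }

    public static void main(String[] args) {
        // 1536 is the dimensionality implied by the "1536/8" remark in this thread.
        System.out.println(candidates(1536)); // prints [768, 384, 192, 96, 48]
        // A low-dimensional input drops the candidates that would collapse to 1.
        System.out.println(candidates(32));   // prints [16, 8, 4, 2]
    }
}
```

Note that integer division means the candidates do not always divide the input dimension evenly; for example, a 100-dimensional dataset yields 50, 25, 12, 6, 3.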

bench_ada_100k.txt

dlg99 commented on May 24, 2024

ada 1M does not work because "MappedRandomAccessReader doesn't support large files"

jbellis commented on May 24, 2024

That's addressed in the main branch, but 100k is fine for now.

dlg99 commented on May 24, 2024

Other datasets with the same selection of PQ:
bench_others.txt

jbellis commented on May 24, 2024

Going back to the original goal of evaluating PQ on Ada002 embeddings -- it looks like this is the first dataset we've found where even at 1/8 the size the recall at 16/100/OQ=2 is worse than the recall at 16/100/OQ=1/PQ=Off. So being even more aggressive with PQ is not warranted.

jbellis commented on May 24, 2024

PQ@768 build 63.70s,
PQ encode 4.85s,
Build M=16 ef=100 in 15.08s with 0.40 short edges
  Query PQ=false top 101/1 recall 0.9434 in 11.77s after 146464790 nodes visited
  Query PQ=true top 101/1 recall 0.9201 in 72.69s after 147004020 nodes visited
  Query PQ=false top 101/2 recall 0.9702 in 22.05s after 250168950 nodes visited
  Query PQ=true top 101/2 recall 0.9703 in 123.74s after 251120730 nodes visited

I was looking at the wrong numbers in your graph (seduced by 1536/8=192)
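
The mix-up is easier to see with the arithmetic written out. The sketch below assumes the original vectors are stored as 32-bit floats and that each PQ subspace is encoded in a single byte (256 centroids per codebook); neither assumption is stated in this thread, and `PqCompression` and `ratio` are hypothetical names for illustration.

```java
class PqCompression {
    /**
     * Compression ratio of a PQ code relative to the raw vector.
     * Assumes float32 input (Float.BYTES per dimension) and one byte per PQ subspace.
     */
    static double ratio(int originalDims, int pqSubspaces) {
        long rawBytes = (long) originalDims * Float.BYTES;
        long compressedBytes = pqSubspaces; // one byte per subspace (assumption)
        return (double) rawBytes / compressedBytes;
    }

    public static void main(String[] args) {
        int dims = 1536;
        for (int pq : new int[] {768, 384, 192}) {
            System.out.println("PQ@" + pq + " -> 1/" + (int) ratio(dims, pq) + " of the original size");
        }
    }
}
```

Under those assumptions, PQ@768 is the 1/8 setting, PQ@384 is 1/16, and PQ@192 is 1/32, so dividing the dimension count by 8 lands on the 1/32 row rather than the 1/8 one.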

jbellis commented on May 24, 2024

PQ@384 build 35.77s,
PQ encode 3.04s,
Build M=16 ef=100 in 15.49s with 0.40 short edges
  Query PQ=false top 101/1 recall 0.9463 in 2.60s after 29129956 nodes visited
  Query PQ=true top 101/1 recall 0.8293 in 7.76s after 29300004 nodes visited
  Query PQ=false top 101/2 recall 0.9719 in 4.34s after 49809210 nodes visited
  Query PQ=true top 101/2 recall 0.9629 in 12.88s after 50224696 nodes visited

PQ@192 build 22.40s,
PQ encode 1.45s,
Build M=16 ef=100 in 15.10s with 0.40 short edges
  Query PQ=false top 101/1 recall 0.9427 in 2.68s after 30315300 nodes visited
  Query PQ=true top 101/1 recall 0.6806 in 4.19s after 31296206 nodes visited
  Query PQ=false top 101/2 recall 0.9691 in 4.69s after 51024144 nodes visited
  Query PQ=true top 101/2 recall 0.8682 in 6.49s after 51692598 nodes visited

from this very small sample it looks like it's okay to reduce to 384 if you're doing OQ=2, but not 192
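
For a side-by-side view, the recall lost by turning PQ on can be tabulated directly from the log lines in this thread. The sketch below does only that subtraction; `RecallLoss` and `loss` are hypothetical names, and the numbers are copied from the PQ@768, PQ@384, and PQ@192 runs above.

```java
class RecallLoss {
    /** Recall given up by enabling PQ: recall with PQ=false minus recall with PQ=true. */
    static double loss(double recallWithoutPq, double recallWithPq) {
        return recallWithoutPq - recallWithPq;
    }

    public static void main(String[] args) {
        // Columns: PQ size, overquery factor, recall with PQ=false, recall with PQ=true.
        // Values are copied from the benchmark output quoted in this thread.
        double[][] rows = {
            {768, 1, 0.9434, 0.9201},
            {768, 2, 0.9702, 0.9703},
            {384, 1, 0.9463, 0.8293},
            {384, 2, 0.9719, 0.9629},
            {192, 1, 0.9427, 0.6806},
            {192, 2, 0.9691, 0.8682},
        };
        for (double[] r : rows) {
            System.out.printf("PQ@%d OQ=%d recall loss %.4f%n",
                    (int) r[0], (int) r[1], loss(r[2], r[3]));
        }
    }
}
```

At OQ=2 the loss is about 0.009 for PQ@384 and about 0.10 for PQ@192, which is the gap behind the conclusion above.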

jbellis commented on May 24, 2024

[I switched from numruns=10 to numruns=2, which is why all the query times got much smaller]
