milvus-io / milvus-tools
A data migration tool for Milvus.
License: Apache License 2.0
May I ask if there are any tools available to help me import data from the standalone version of Milvus into the cluster version of Milvus?
Hello guys, here is my question:
If I migrate a collection from Milvus to HDF5 and then from HDF5 to another Milvus, the Milvus auto ID generation breaks. So if I want to insert a new vector into the migrated collection on the other Milvus, it produces the following error message:
Status(code=12, message='Entities IDs are user-defined. Please provide IDs for all entities of the collection.')
[]
I want to keep auto ID generation on the migrated collection. Maybe I am doing something wrong?
The error occurs when segment_list and row_list are empty: total_vectors and total_ids are then used before ever being assigned. This happened to me when I created partition tags for all of my data but passed None for the partition tags during the Milvus-to-HDF5 step; since the default partition tag None holds no data, the error is triggered.
Here is the pull request: #42
When I execute milvusdm --yaml M2M.yaml, I encounter an error.
2021-04-15 19:50:23,301 | ERROR | milvus_to_milvus.py | transform_milvus_data | 44 | Error with: cannot reshape array of size 350208 into shape (171,64)
My milvus version is 1.0.0.
Please provide an example h5 file for float vectors and for binary vectors, for the case where the dimensionality of the vectors is not known.
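A minimal sketch of what such HDF5 files might look like, written with h5py. The dataset names ("embeddings", "ids"), the float32 layout, and the bit-packed binary layout are assumptions, not the tool's documented format; the point is that the dimensionality need not be known up front, since it can be read back from the dataset shape:

```python
import numpy as np
import h5py

# Float-vector file: each row is one float32 vector.
with h5py.File("float_vectors.h5", "w") as f:
    f.create_dataset("embeddings",
                     data=np.random.rand(100, 128).astype(np.float32))
    f.create_dataset("ids", data=np.arange(100, dtype=np.int64))

# Binary-vector file: each row packs dim bits into dim // 8 uint8 bytes.
with h5py.File("binary_vectors.h5", "w") as f:
    bits = np.random.randint(0, 2, size=(100, 512))
    f.create_dataset("embeddings", data=np.packbits(bits, axis=1))  # (100, 64)
    f.create_dataset("ids", data=np.arange(100, dtype=np.int64))

# Recover the dimensionality from the shape instead of knowing it up front.
with h5py.File("float_vectors.h5", "r") as f:
    rows, dim = f["embeddings"].shape
    print(rows, dim)  # → 100 128
```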
Migrating a collection with milvusdm.
Both the source and destination nodes are version 0.10.3 (the source is standalone, the destination is a cluster).
Error message: Error with: local variable 'total_vectors' referenced before assignment
The M2M configuration is as follows:
M2M:
  # The dest-milvus version.
  milvus_version: 0.10.3
  # Working directory of the source Milvus.
  source_milvus_path: '/data0/milvus'
  mysql_parameter:
    host: '172.18.248.189'
    user: 'root'
    port: 3306
    password: '123456'
    database: 'milvus'
  source_collection: # specify the 'partition_1' and 'partition_2' partitions of the 'test' collection.
    tidea_is_sample:
      - ''
  dest_host: '172.18.151.165'
  dest_port: 19531
  mode: 'skip' # 'skip/append/overwrite'
Error log:
2021-11-05 21:18:58,140 | DEBUG | read_milvus_meta.py | connect_mysql | 20 | Successfully connect mysql
2021-11-05 21:18:58,142 | INFO | milvus_to_milvus.py | transform_milvus_data | 38 | Ready to transform all data of collection: tidea_is_sample/partitions: ['']
2021-11-05 21:18:58,143 | DEBUG | read_milvus_meta.py | get_collection_info | 72 | Get collection info(dimension, index_file_size, metric_type, version):((512, 1073741824, 1, '0.10.3'),)
2021-11-05 21:18:58,147 | DEBUG | read_milvus_data.py | read_milvus_file | 89 | Reading milvus/db data from collection: tidea_is_sample/partition:
2021-11-05 21:18:58,148 | DEBUG | read_milvus_meta.py | get_collection_dim_type | 96 | Get meta data about dimension and types: ((512, 1),)
2021-11-05 21:18:58,148 | DEBUG | read_milvus_meta.py | get_collection_segments_rows | 109 | Get meta data about segment and rows: ()
2021-11-05 21:18:58,149 | ERROR | milvus_to_milvus.py | transform_milvus_data | 44 | Error with: local variable 'total_vectors' referenced before assignment
When I run the collection_prepare.py file to run the milvus benchmark, I get this error
AttributeError: 'Collection' object has no attribute 'flush'
How can we fix this?
My commands:
export MILVUSDM_PATH="/home/${MY_USER_NAME}/milvusdm"
export LOGS_NUM=0
pip3 install pymilvusdm
and then:
pymilvusdm
pymilvusdm: command not found
Anything wrong?
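One common cause of "command not found" right after a pip install (an assumption here, since the environment details aren't shown) is that pip dropped the console script into a per-user scripts directory that isn't on PATH. A quick sketch of the workaround:

```shell
# pip frequently installs console scripts into ~/.local/bin, which many
# distros leave off PATH by default. Prepend it before invoking pymilvusdm.
export PATH="$HOME/.local/bin:$PATH"
echo "$PATH" | grep -q "$HOME/.local/bin" && echo "scripts dir on PATH"
```

If `pip3 show -f pymilvusdm` lists the script under a different directory, prepend that directory instead.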
It was found that, in the Milvus-to-Milvus process, if one of the Milvus instances was indexing data, the transfer would fail.
OS: CentOS7.4,
Milvus old version: 0.10.3
Milvus new version: 1.1.1
We migrated from version 0.10.3 to version 1.1.1 with milvusdm:
yaml file:
M2M:
  milvus_version: 1.1.1
  source_milvus_path: '/data0/milvus'
  mysql_parameter:
  source_collection: # specify the 'partition_1' and 'partition_2' partitions of the 'test' collection
    intelligence_picture_v1:
  dest_host: '10.11.205.18'
  dest_port: 19530
  mode: 'skip' # 'skip/append/overwrite'
The error is as follows:
2022-05-06 16:53:26,929 | ERROR | grpc_handler.py | handler | 72 |
Addr [10.11.205.18:19530] fake_register_link
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNIMPLEMENTED
details = ""
debug_error_string = "{"created":"@1651827206.928677178","description":"Error received from peer ipv4:10.11.205.18:19530","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"","grpc_status":12}"
{'API start': '2022-05-06 16:53:26.927801', 'RPC start': '2022-05-06 16:53:26.928089', 'RPC error': '2022-05-06 16:53:26.928964'}
It seems that HDF5-to-Milvus does not support a custom schema. I have 3 columns: "embedding", "id", "other".
It seems that the DM tool only imports the hardcoded groups "embeddings" and "ids".
Migrating a collection with milvusdm.
Destination: version 1.x
Source: version 0.10.x
I ran into exceptions with all three versions of milvusdm; details below:
Version 0.1:
2021-08-23 16:30:23,042 | ERROR | milvus_to_milvus.py | transform_milvus_data | 44 | Error with: cannot reshape array of size 357564416 into shape (43648,256)
Version 1.0:
2021-08-23 16:20:14,277 | ERROR | milvus_client.py | insert | 98 | The amount of data inserted each time cannot exceed 256 MB
0%| | 0/1 [00:09<?, ?it/s]
Version 2.0:
2021-08-23 16:16:19,198 | ERROR | grpc_handler.py | handler | 71 |
Addr [xx.xx.xx.xx:19530] (IP address redacted) fake_register_link
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNIMPLEMENTED
details = ""
debug_error_string = "{"created":"@1629706579.197927267","description":"Error received from peer ipv4:xx.xx.xx.xx:19530","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"","grpc_status":12}"
The tool does not follow the yaml file's collection_parameter: dimension: 256 when the Faiss file has dimension 128. After creating the collection, the dimension is 128 instead of the 256 set in the yaml file.
My files are 11 GB in size. When I perform the migration, I am told that a single insert is too large.
Hi,
I've tried to export Milvus data to HDF5 when running Milvus in a standalone setup using dockerized milvus, etcd, and minio.
The yaml configuration looks like:
M2H:
  milvus_version: 2.0.0
  source_milvus_path: '<directory-where-milvus-volume-is-mapped>'
  mysql_parameter:
  source_collection:
    <my-collection-name>:
      - '_default'
  data_dir: '<data-directory-where-to-export>'
However this fails with:
ERROR | read_milvus_meta.py | connect_sqlite | 31 | SQLite ERROR: connect failed with unable to open database file
So I wonder whether it is possible to use the tool with a Milvus deployment running in standalone mode?
Thanks!
How do I pass the search param for the IVF_FLAT index type exactly with your benchmark code? Assume that I have successfully created an IVF_FLAT index on the dataset. Passing
search_parameters = {
"anns_field": anns_field,
"metric_type": metric_type,
"param": {
"nprobe": 32,
},
"limit": topk,
"expression": expression,
}
gives error:
Traceback (most recent call last):
File "go_benchmark.py", line 167, in <module>
go_search(go_benchmark=go_benchmark, uri=uri, user=user, password=password, collection_name=collection_name,
File "go_benchmark.py", line 113, in go_search
raise ValueError(msg)
ValueError: The type of go_benchmark response is not json: panic: nprobe not valid
I appreciate your help. Thanks.
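One thing that may be worth checking (an assumption, since go_benchmark's expected format isn't documented here): with the plain pymilvus 2.x client, index-specific knobs such as nprobe are nested under a "params" key inside the search param rather than placed directly in it. A sketch of that shape, with hypothetical field values:

```python
# Hypothetical search-parameter layout mirroring pymilvus 2.x, where the
# index-specific settings (nprobe) sit under a nested "params" key
# instead of directly inside "param".
search_parameters = {
    "anns_field": "embedding",       # assumed vector field name
    "metric_type": "L2",             # assumed metric
    "param": {
        "params": {"nprobe": 32},    # nested, rather than {"nprobe": 32}
    },
    "limit": 10,
    "expression": "",
}
print(search_parameters["param"]["params"]["nprobe"])  # → 32
```

If go_benchmark forwards the "param" object verbatim to pymilvus, the missing nesting could explain the "nprobe not valid" panic.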
When I execute milvusdm --yaml M2H.yaml, I get:
2021-04-23 14:28:21,740 | INFO | milvus_to_hdf5.py | read_milvus_data | 50 | Ready to read all data of collection: ann_1m_sq8/partitions: [None] 0%| | 0/1 [00:00<?, ?it/s] 2021-04-23 14:28:21,908 | ERROR | milvus_to_hdf5.py | read_milvus_data | 56 | Error with: cannot reshape array of size 307200000 into shape (600000,16)
Is there any wrong with the data volume?
from milvus import Milvus, IndexType, MetricType, Status
milvus = Milvus(host='milvusv2.local', port='19530')
param = {'collection_name':'test01', 'dimension':256, 'index_file_size':1024, 'metric_type':MetricType.L2}
milvus.create_collection(param)
milvus.create_partition('test01', 'tag01')
import random
vectors = [[random.random() for _ in range(256)] for _ in range(20)]
vector_ids = [id for id in range(20)]
milvus.insert(collection_name='test01', records=vectors, ids=vector_ids)
milvus.insert('test01', vectors, partition_tag="tag01")
ivf_param = {'nlist': 16384}
milvus.create_index('test01', IndexType.IVF_FLAT, ivf_param)
services:
  milvus:
    image: 'milvusdb/milvus:1.0.0-cpu-d030521-1ea92e'
    hostname: milvus.local
    networks:
      binhbtn:
        ipv4_address: 172.23.0.3
    volumes:
      - /tmp/db:/var/lib/milvus/db
      - /tmp/logs:/var/lib/milvus/logs
      - /tmp/wal:/var/lib/milvus/wal
  milvusv2:
    image: 'milvusdb/milvus:1.0.0-cpu-d030521-1ea92e'
    hostname: milvusv2.local
    networks:
      binhbtn:
        ipv4_address: 172.23.0.4
    volumes:
      - /tmp/2/db:/var/lib/milvus/db
      - /tmp/2/logs:/var/lib/milvus/logs
      - /tmp/2/wal:/var/lib/milvus/wal
  python37:
    image: 'python:3.7.13'
    tty: true
    networks:
      binhbtn:
        ipv4_address: 172.23.0.5
    volumes:
      - /tmp/2/db:/var/lib/milvus/db
      - /tmp/2/logs:/var/lib/milvus/logs
      - /tmp/2/wal:/var/lib/milvus/wal
      - /tmp/db:/var/lib/milvus/dest/db
      - /tmp/logs:/var/lib/milvus/dest/logs
      - /tmp/wal:/var/lib/milvus/dest/wal
    depends_on:
      - milvus
      - milvusv2
networks:
  binhbtn:
    driver: bridge
    ipam:
      config:
        - subnet: 172.23.0.0/16
M2M:
  milvus_version: 1.0.0
  source_milvus_path: '/var/lib/milvus'
  mysql_parameter:
  source_collection:
    test01:
  dest_host: 'milvus.local'
  dest_port: 19530
  mode: 'overwrite'
H2M:
  milvus_version: 1.x
  data_path:
  data_dir: '/var/lib/milvus/backup'
  dest_host: '172.23.0.3'
  dest_port: 19530
  mode: 'overwrite'
  dest_collection_name: 'test01'
  dest_partition_name: 'tag01'
  collection_parameter:
    dimension:
    index_file_size:
    metric_type:
M2H:
  milvus_version: 1.0.0
  source_milvus_path: '/var/lib/milvus'
  mysql_parameter:
  source_collection:
    test01:
  data_dir: '/var/lib/milvus/backup'
ERROR | milvus_to_milvus.py | transform_milvus_data | 44 | Error with: local variable 'total_vectors' referenced before assignment
I'm wondering how we can backup index and vector data from a Milvus cluster to an HDF5 file for HA purposes.
When I transform Milvus data, in some situations certain vectors cannot be found in the new Milvus.
origin milvus:
version=0.10.4,
index_type=IndexType.IVF_FLAT
index_param={'nlist': 16384}
new milvus:
version=1.0.0,
index_type=IndexType.IVF_FLAT
index_param={'nlist': 16384}
2021-02-08 11:52:48,802 | DEBUG | data_to_milvus.py | insert_data | 69 | Successfuly insert collection: test_bina/partition: , total num: 5000
Only the total number of vectors is printed; hopefully all the information about what was inserted into the Milvus collection can be printed.
milvusdm --yaml H2M.yaml
0%| | 0/1 [00:00<?, ?it/s]2023-01-09 21:10:13,679 | ERROR | grpc_handler.py | handler | 72 |
Addr [192.168..:19530] bulk_insert
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "grpc: received message larger than max (77985775 vs. 67108864)"
debug_error_string = "{"created":"@1673316613.678811865","description":"Error received from peer
ipv4:192.168..:19530","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"grpc: received message larger than max (77985775 vs. 67108864)","grpc_status":8}"
{'API start': '2023-01-09 21:10:11.611670', 'RPC start': '2023-01-09 21:10:11.612275', 'RPC error': '2023-01-09 21:10:13.679653'}
2023-01-09 21:10:13,680 | ERROR | milvus_client.py | insert | 86 | <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "grpc: received message larger than max (77985775 vs. 67108864)"
debug_error_string = "{"created":"@1673316613.678811865","description":"Error received from peer
ipv4:192.168..:19530","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"grpc: received message larger than max (77985775 vs. 67108864)","grpc_status":8}"
0%| | 0/1 [00:04<?, ?it/s]
How can I solve this problem? Thanks.
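The RESOURCE_EXHAUSTED error says one gRPC message exceeded the default 64 MB cap (77985775 vs. 67108864 bytes), so a workaround is to split the insert into batches that stay under the limit. A minimal sketch; the function names are hypothetical, not milvusdm's actual API:

```python
def batched_insert(insert_fn, vectors, ids,
                   max_bytes=64 * 1024 * 1024, bytes_per_vector=None):
    # Split one oversized insert into chunks under the gRPC message cap.
    # bytes_per_vector approximates the wire size of one float32 vector.
    if bytes_per_vector is None:
        bytes_per_vector = len(vectors[0]) * 4
    batch_size = max(1, max_bytes // bytes_per_vector)
    for start in range(0, len(vectors), batch_size):
        insert_fn(vectors[start:start + batch_size],
                  ids[start:start + batch_size])

# Example: 10 vectors of dim 8 (32 bytes each), with a tiny 100-byte cap
# forcing batches of 100 // 32 = 3 vectors.
batches = []
vecs = [[0.0] * 8 for _ in range(10)]
batched_insert(lambda v, i: batches.append(len(v)),
               vecs, list(range(10)), max_bytes=100)
print(batches)  # → [3, 3, 3, 1]
```

Alternatively, raising the server-side max message size avoids splitting, but batching keeps memory use bounded on both ends.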
(Status(code=0, message='Show collections successfully!'), ['milvus_datas'])
2021-03-25 16:50:16,962 | ERROR | milvus_to_milvus.py | transform_milvus_data | 47 | Error with: The sour collection: milvus_datas does not exists.
An error encountered when using DM to migrate data from Milvus to HDF5.
22-06-07 16:21:30,979 | INFO | milvus_to_hdf5.py | read_milvus_data | 49 | Ready to read all data of collection: video_fingerprint/partitions: [None]
0%| | 0/1 [00:03<?, ?it/s]
2022-06-07 16:21:34,230 | ERROR | milvus_to_hdf5.py | read_milvus_data | 56 | Error with: name 'delids' is not defined
I have tried different versions of milvusdm (1.0, 2.0) and the results are the same. It also happened with another Milvus server.
Version of milvusdm: 2.0
Version of Milvus: 1.1.1
Configuration is shown below (M2H.yaml):
M2H:
  milvus_version: 1.1.1
  source_milvus_path: '/data1/milvus_1.x_uni_video'
  mysql_parameter:
    host: '127.0.0.1'
    user: 'root'
    port: 3376
    password: 'xxxxxxxxxxxx'
    database: 'milvus'
  source_collection:
    video_fingerprint:
  data_dir: '/data1/milvus_migration/uni_video'
  mode: 'overwrite'
When trying to read an empty collection, milvusdm fails saying that the total_vectors variable is referenced before assignment. This error originates from the get_files_data function in read_milvus_data.py:
milvus-tools/pymilvusdm/core/read_milvus_data.py
Lines 55 to 80 in 41143e5
When either segment_list or row_list is empty, the for loop won't run, and thus an attempt is made to return total_vectors and total_ids, which haven't been initialized.
I've created a PR with a simple fix/workaround for this issue: #33
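The fix can be sketched as follows: initialize both accumulators before the loop so an empty segment list returns empty results instead of raising UnboundLocalError. This is a simplified stand-in for the real get_files_data, with a stub segment reader; it is not the actual pymilvusdm code:

```python
def read_segment(segment, rows):
    # Stand-in for pymilvusdm's real segment reader.
    return [[0.0] * 4] * rows, list(range(rows))

def get_files_data(segment_list, row_list):
    # Initialize the accumulators up front: if segment_list/row_list are
    # empty, the loop body never runs, and returning uninitialized names
    # is exactly the UnboundLocalError the issue reports.
    total_vectors, total_ids = [], []
    for segment, rows in zip(segment_list, row_list):
        vectors, ids = read_segment(segment, rows)
        total_vectors += vectors
        total_ids += ids
    return total_vectors, total_ids

# An empty collection now yields empty lists instead of raising.
print(get_files_data([], []))  # → ([], [])
```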
Using Milvus 1.1.1, pymilvus==1.1.1 and pymilvusdm==2.0.
pymilvus:
_DIM = 8
from milvus import Milvus, IndexType, MetricType, Status
milvus = Milvus('127.0.0.1', '19530')
collection_name = 'example_collection'
param = {'collection_name': collection_name, 'dimension': _DIM}
milvus.create_collection(param)
milvus.flush([collection_name])
M2H.yml:
M2H:
  milvus_version: 1.1.1
  source_milvus_path: '<SOURCE_MILVUS_PATH>'
  mysql_parameter:
    host: '127.0.0.1'
    user: 'root'
    port: 3306
    password: 'password'
    database: 'milvus'
  source_collection:
    example_collection:
  data_dir: 'backup'
milvusdm --yaml M2H.yml:
<TIMESTAMP> | INFO | milvus_to_hdf5.py | read_milvus_data | 49 | Ready to read all data of collection: example_collection/partitions: [None]
0%| | 0/1 [00:00<?, ?it/s]
<TIMESTAMP>| ERROR | milvus_to_hdf5.py | read_milvus_data | 56 | Error with: local variable 'total_vectors' referenced before assignment
Same error happens when non-default partition is used and contains some vectors, while the default partition stays empty.
Why is the benchmark in a binary, and not in actual code? This is not a transparent way to share and replicate benchmark results.
Why is Milvus 2.x not supported now?
It hasn't been updated for so long.
I want to test the search performance of the annoy index.
After I ingested data into Milvus and built the index, I ran go_benchmark and received the following exception:
It seems that the 'benchmark' binary currently does not support the annoy index.
Does Milvus have a plan to open-source the benchmark code?
We are going to add a new field such as "hash", but I have read in some articles that Milvus doesn't support altering a schema yet.
So we're testing the milvusDM tool.
Is it possible to migrate data from the original collection to a new collection on the same host, like below?
The original collection schema is id / image_url / embeddings.
The new collection schema is id / image_url / embeddings / hash.
Thank you.
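Since schemas can't be altered in place, the usual route is copy-and-backfill: read all entities from the original collection, compute the new "hash" field, and insert into a freshly created collection with the extended schema. A rough sketch with stubbed I/O; the real read and insert calls depend on your Milvus/pymilvus version, and the SHA-1-of-URL hash is only an example:

```python
import hashlib

def migrate(read_batches, insert_fn):
    # read_batches yields (ids, image_urls, embeddings) tuples from the
    # old collection; insert_fn writes rows into the new collection whose
    # schema adds the "hash" field. Both are stand-ins for real client calls.
    for ids, urls, embeddings in read_batches:
        hashes = [hashlib.sha1(u.encode()).hexdigest() for u in urls]
        insert_fn(ids, urls, embeddings, hashes)

# Example with an in-memory "collection":
out = []
batches = [(list(range(3)), ["a.png", "b.png", "c.png"], [[0.0] * 4] * 3)]
migrate(iter(batches), lambda *cols: out.append(cols))
print(len(out[0][3]))  # → 3
```

Batching the reads keeps memory bounded for large collections, which also sidesteps the gRPC message-size limit on insert.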
I tested milvusdm with 100 million and with 10 million records, respectively. When the amount of data is large, it simply does not run on small machines. In summary: milvusdm uses a very large amount of memory; please fix this.
2022-07-25 11:26:22,835 | ERROR | grpc_handler.py | ping | 338 | Retry to connect server 127.0.0.1:19530 failed.
2022-07-25 11:26:22,835 | ERROR | main.py | fai2mil | 72 | Fail connecting to server on 127.0.0.1:19530. Timeout
When I execute 'milvusdm --yaml M2M.yaml', I get the exception: Error with: name 'delids' is not defined
As the title says, our Milvus uses etcd to manage metadata, but the Milvus-to-Milvus migration has no configuration option for connecting to etcd.
Can the milvus benchmark 2.1 measure recall? I find that it can only measure throughput now.
For now, "pymilvusdm only supports faiss flat and ivf_flat index files", so is there a plan to support faiss ivf_pq index files?
Or is there any clue or documentation for starting this work by loading the index?
The header is 'IxPT'.
Migrating a collection with milvusdm.
Both the source and destination nodes are version 0.10.5.
Since version 0.10.5 has no partitions, the configuration is as follows:
source_collection:
  collection_name_xxx:
    - ''
The following exception is raised during the migration. Is this a limitation of the destination cluster? How can I resolve it?
ERROR | milvus_to_milvus.py | transform_milvus_data | 44 | Error with: cannot reshape array of size 100335616 into shape (48992,64)
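For what it's worth, every reshape failure reported against milvusdm in these issues shows the same pattern: the flat buffer is exactly 32 times larger than the target rows × dim, which hints at one consistent unit mismatch (for instance, raw bytes of float32 data paired with a bit-packed dimension) rather than corrupt data. This is an observation from the logs, not a confirmed diagnosis. A quick check:

```python
# (buffer size, target rows, target dim) from the reshape errors quoted
# in this issue list; each buffer is exactly 32x the target element count.
cases = [
    (350208, 171, 64),        # M2M, 2021-04-15 log
    (357564416, 43648, 256),  # M2M, 2021-08-23 log (milvusdm 0.1)
    (307200000, 600000, 16),  # M2H, 2021-04-23 log
    (100335616, 48992, 64),   # M2M, 0.10.5 migration log
]
for size, rows, dim in cases:
    print(size // (rows * dim), size % (rows * dim))  # → 32 0 each time
```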
When migrating Milvus to HDF5 with no partition set, the generated file is named None.h5; I don't know whether that works.
When migrating HDF5 to Milvus, I don't know what to specify for the 'dest_partition_name' attribute when I don't have a partition; in the end, I gave an empty string.
Finally, the following error occurred:
2022-06-21 16:44:23,939 | ERROR | grpc_handler.py | handler | 72 |
Addr [192.168.23.131:19530] fake_register_link
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNIMPLEMENTED
details = ""
debug_error_string = "{"created":"@1655801063.939163866","description":"Error received from peer ipv4:192.168.23.131:19530","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"","grpc_status":12}"
{'API start': '2022-06-21 16:44:23.937994', 'RPC start': '2022-06-21 16:44:23.938438', 'RPC error': '2022-06-21 16:44:23.939350'}
2022-06-21 16:47:28,638 | ERROR | main.py | execute | 139 | server is not healthy, please try again later
I'm really going crazy. Can you tell me how to specify the configuration items when I don't have a specific partition?
Hi, I'm using "/" to build a hierarchical structure in the partition tag, and saving with M2H fails.
In save_data.py, hdf5_filename and yaml_filename are created by simply concatenating the partition_tag.
I think "/" should be escaped, or os.makedirs should be called in save_hdf5_data (not in __init__).
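The two options above can be sketched as follows, with a hypothetical save path; neither is the actual save_data.py code:

```python
import os

def safe_hdf5_path(data_dir, collection, partition_tag):
    # Option 1: escape path separators, so the tag "a/b" becomes the
    # single filename "a__b.h5" instead of a nested path.
    safe_tag = partition_tag.replace(os.sep, "__").replace("/", "__")
    return os.path.join(data_dir, collection, safe_tag + ".h5")

def ensure_parent_dirs(path):
    # Option 2: keep "/" in the tag and create the intermediate
    # directories just before writing (i.e. in save_hdf5_data,
    # not in __init__).
    os.makedirs(os.path.dirname(path), exist_ok=True)

p = safe_hdf5_path("backup", "my_collection", "level1/level2")
print(p)  # e.g. backup/my_collection/level1__level2.h5
```

Option 1 keeps one file per partition tag; option 2 preserves the hierarchy on disk but means the reader must walk subdirectories.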