gnes-ai / gnes

GNES is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural network.

Home Page: https://gnes.ai

License: Other

Shell 1.45% Dockerfile 0.47% Python 94.48% HTML 3.60%
gnes cloud-native nlp computer-vision deep-learning machine-learning docker-swarm semantic-search neural-network dnn database tensorflow pytorch python elasticsearch search-engine video-processing grpc distributed-systems microservices

gnes's Introduction

GNES Generic Neural Elastic Search, logo made by Han Xiao


Highlights • Overview • Install • Getting Started • Hub • Documentation • Tutorial • Contributing • Release Notes • Blog

What is it

GNES [jee-nes] is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural network.

GNES enables large-scale indexing and semantic search for text-to-text, image-to-image, video-to-video and any-to-any content forms.

Highlights

💭 To know more about the key tenets of GNES, read this blog post

☁️ Cloud-Native & Elastic

GNES is all-in-microservice! The encoder, indexer, preprocessor and router each run in their own containers. They communicate via versioned APIs and collaborate under the orchestration of Docker Swarm, Kubernetes, etc. Scaling, load-balancing and automated recovery come off-the-shelf in GNES.

🐣 Easy-to-Use

How long would it take to deploy a change that involves just switching a layer in VGG? In GNES, this is just a one-line change in a YAML file. We abstract the encoding and indexing logic into a YAML config, so that you can change or stack encoders and indexers without touching the codebase.

🔬 State-of-the-Art

Taking advantage of the fast-evolving AI/ML/NLP/CV communities, we learn from best-of-breed deep learning models and plug them into GNES, making sure you always enjoy state-of-the-art performance.

🌌 Generic & Universal

Searching for text, images or even short videos? Using Python/C/Java/Go/HTTP as the client? It doesn't matter which content form you have or which language you use, GNES can handle them all.

📦 Model as Plugin

When built-in models do not meet your requirements, simply build your own with GNES Hub. Pack your model as a Docker container and use it as a plugin.

💯 Best Practice

We love to learn best practices from the community, helping GNES achieve the next level of availability, resiliency, performance, and durability. If you have any ideas or suggestions, feel free to contribute.
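To make the "one-line change in a YAML file" point above concrete, here is a minimal sketch using the gnes.flow API introduced later in this README; the YAML file names are hypothetical placeholders:

from gnes.flow import Flow

# Sketch only: switching the encoder model means pointing the encoder service
# at a different YAML spec; the rest of the flow is untouched.
# 'yaml/vgg.yml' and the other paths are hypothetical example files.
flow = (Flow(check_version=False)
        .add_preprocessor(yaml_path='yaml/prep.yml')
        .add_encoder(yaml_path='yaml/vgg.yml')   # change this single line to swap the model
        .add_indexer(yaml_path='yaml/vec.yml'))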

Overview

component overview

GNES Hub

component overview

GNES Hub ships AI/ML models as Docker containers and uses Docker containers as plugins. It offers a clean and sustainable way to port external algorithms (together with their dependencies) into the GNES framework.

GNES Hub is hosted on the Docker Hub.

Install GNES

There are two ways to get GNES: as a Docker image or as a PyPI package. For cloud users, we highly recommend using GNES via Docker.

Run GNES as a Docker Container

docker run gnes/gnes:latest-alpine

This command downloads the latest GNES image (based on Alpine Linux) and runs it in a container. When the container runs, it prints an informational message and exits.

💡 Choose the right GNES image

Besides the Alpine image optimized for space, we also provide Buster (Debian 10.0), Ubuntu 18.04 and Ubuntu 16.04-based images. The list below summarizes all available GNES tags. One can fill in {ver} with latest, stable or v0.x.x. latest refers to the latest master of this repository, which may not be stable. We recommend using an official release by changing latest to a version number, say v0.0.24, or simply using stable for the latest release, e.g. gnes:stable-ubuntu.

  • {ver}-alpine: based on Alpine Linux; no deep learning libraries; extremely lightweight and portable, enabling fast scaling even on edge devices.
  • {ver}-buster: based on Debian 10.0; no deep learning libraries; recommended for building or extending a GNES-Hub image.
  • {ver}-ubuntu18: based on Ubuntu 18.04; no deep learning libraries.
  • {ver}-full: based on Ubuntu 16.04; Python 3.6.8, CUDA 10.0, TensorFlow 1.14, PyTorch 1.1, FAISS, multiple pretrained models; heavy but self-contained, useful for testing GNES end-to-end.

⚠️ Since 2019/10/21, we have stopped hosting the public mirror on Tencent Cloud. The old Docker images still exist, but no new images will be published on Tencent Cloud.

We also provide a public mirror on GitHub Packages. Select the mirror that serves you best.

docker login --username=xxx docker.pkg.github.com/gnes-ai/gnes  # login to github package so that we can pull from it
docker run docker.pkg.github.com/gnes-ai/gnes/gnes:latest-alpine

The GNES images are built and published to the following registries:

  • Docker Hub: gnes/gnes:[tag]
  • GitHub Package: docker.pkg.github.com/gnes-ai/gnes/gnes:[tag]

Install GNES via pip

You can also install GNES as a Python3 package via:

pip install gnes

Note that this will only install a "barebone" version of GNES, consisting of the minimal dependencies needed to run GNES. No third-party pretrained models or deep learning/NLP/CV packages will be installed. We make this the default installation behavior, as a model of interest to NLP engineers may not interest CV engineers. In GNES, models serve as Docker plugins.

🚸 TensorFlow, PyTorch and torchvision are not part of the GNES installation. Depending on your model, you may have to install them in advance.

Though not recommended, you can install GNES with full dependencies via:

pip install gnes[all]
🍒 Or cherry-pick the dependencies according to the table below: (click to expand...)
pip install gnes[bert]
bert-serving-server>=1.8.6, bert-serving-client>=1.8.6
pip install gnes[flair]
flair>=0.4.1
pip install gnes[annoy]
annoy==1.15.2
pip install gnes[chinese]
jieba
pip install gnes[vision]
opencv-python>=4.0.0, imagehash>=4.0
pip install gnes[leveldb]
plyvel>=1.0.5
pip install gnes[test]
pylint, memory_profiler>=0.55.0, psutil>=5.6.1, gputil>=1.4.0
pip install gnes[transformers]
pytorch-transformers
pip install gnes[onnx]
onnxruntime
pip install gnes[audio]
librosa>=0.7.0
pip install gnes[scipy]
scipy
pip install gnes[nlp]
bert-serving-server>=1.8.6, pytorch-transformers, flair>=0.4.1, bert-serving-client>=1.8.6
pip install gnes[cn_nlp]
pytorch-transformers, bert-serving-client>=1.8.6, bert-serving-server>=1.8.6, jieba, flair>=0.4.1
pip install gnes[all]
pylint, psutil>=5.6.1, pytorch-transformers, annoy==1.15.2, bert-serving-client>=1.8.6, gputil>=1.4.0, bert-serving-server>=1.8.6, imagehash>=4.0, onnxruntime, memory_profiler>=0.55.0, jieba, flair>=0.4.1, librosa>=0.7.0, scipy, plyvel>=1.0.5, opencv-python>=4.0.0

A good way to cherry-pick dependencies is to follow the example in GNES Hub and build your own GNES image.

Either way, if you end up reading the following message after $ gnes or $ docker run gnes/gnes, then you are ready to go!

successful installation of GNES

Getting Started

🐣 Preliminaries

Before we start, let me first introduce two important concepts in GNES: microservice and workflow.

Microservice

For machine learning engineers and data scientists who are not familiar with the concepts of cloud-native and microservices, picture a microservice as an app on your smartphone. Each app runs independently, and an app may cooperate with other apps to accomplish a task. In GNES, we have four fundamental apps, a.k.a. microservices:

  • Preprocessor: transforms a real-world object into a list of workable semantic units;
  • Encoder: represents a semantic unit with a vector representation;
  • Indexer: stores the vectors in memory/on disk for fast access;
  • Router: forwards messages between microservices, e.g. batching, mapping, reducing.

In GNES, we have implemented dozens of preprocessors, encoders and indexers to process different content forms, such as image, text and video. It is also super easy to plug in your own implementation, as we shall see in an example later.
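To give a flavour of what plugging in your own component might look like, here is a minimal, hypothetical encoder sketch. It assumes a BaseEncoder base class with an encode() hook in gnes.encoder.base; consult the GNES Hub documentation for the actual interface and registration steps:

import numpy as np
from gnes.encoder.base import BaseEncoder  # assumed location of the base class

class ToyMeanEncoder(BaseEncoder):
    """Toy encoder sketch: represents each raw chunk by the mean of its byte values."""

    def encode(self, data, *args, **kwargs) -> np.ndarray:
        # `data` is assumed to be a list of raw byte chunks from the preprocessor
        return np.array([[np.frombuffer(d, dtype=np.uint8).mean()] for d in data])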

Workflow

Now that we have a bunch of apps, what do we expect them to do? A typical search system has two fundamental tasks: index and query. Indexing stores the documents, querying searches them. In a neural search system, one may face a third task: train, where one fine-tunes an encoder/preprocessor according to the data distribution in order to achieve better search relevance.

These three tasks correspond to three different workflows in GNES.
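As a rough sketch of how these workflows look from the GNES Flow API (introduced in the next section), all three share the same interface; index() and query() are demonstrated below, while train() is shown here only as an assumed analogue:

# Sketch only: `flow` is a gnes.flow.Flow object like the one built in the next
# section; `my_training_data`, `my_documents` and `my_queries` are hypothetical
# generators that yield byte strings.
with flow.build(backend='process') as fl:
    fl.train(bytes_gen=my_training_data())   # fine-tune encoder/preprocessor (assumed API)
    fl.index(bytes_gen=my_documents())       # store documents
    for query, response in fl.query(bytes_gen=my_queries()):
        pass  # consume search results here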

Building a flower search engine in 3 minutes

📣 Since v0.0.46, GNES Flow has become the main interface of GNES. GNES Flow provides a pythonic and intuitive way to implement a workflow, enabling users to run or debug GNES on a local machine. By default, GNES Flow orchestrates all microservices using a multi-thread or multi-process backend; it can also be exported to a Docker Swarm/Kubernetes YAML config, allowing one to deliver GNES to the cloud.

🔰 The complete example and the corresponding Jupyter Notebook can be found here.

In this example, we will use the new gnes.flow API (gnes >= 0.0.46 is required) to build a toy image search system for indexing and retrieving flowers based on their similarities.

Define the indexing workflow

Let's first define the indexing workflow by:

from gnes.flow import Flow
flow = (Flow(check_version=False)
        .add_preprocessor(name='prep', yaml_path='yaml/prep.yml')
        .add_encoder(yaml_path='yaml/incep.yml')
        .add_indexer(name='vec_idx', yaml_path='yaml/vec.yml')
        .add_indexer(name='doc_idx', yaml_path='yaml/doc.yml', recv_from='prep')
        .add_router(name='sync', yaml_path='BaseReduceRouter', num_part=2, recv_from=['vec_idx', 'doc_idx']))

Here, we use the pretrained InceptionV4 model as the encoder and the built-in indexers for storing vectors and documents. The flow should be quite self-explanatory; if not, you can always convert it to an SVG image and see its visualization:

flow.build(backend=None).to_url()

index

Indexing flower image data

To index our flower data, we need an iterator that generates byte strings and feeds them into the defined flow.

import random
import tarfile

def read_flowers(sample_rate=1.0):
    with tarfile.open('17flowers.tgz') as fp:
        for m in fp.getmembers():
            if m.name.endswith('.jpg') and random.random() <= sample_rate:
                yield fp.extractfile(m).read()

We can now do indexing via the multi-process backend:

with flow(backend='process') as fl:
    fl.index(bytes_gen=read_flowers(), batch_size=64)

It will take a few minutes depending on your machine.

Querying similar flowers

We simply sample 20 flower images as queries and search for their top-10 similar images:

num_q = 20
topk = 10
sample_rate = 0.05

# do the query
results = []
with flow.build(backend='process') as fl:
    for q, r in fl.query(bytes_gen=read_flowers(sample_rate)):
        q_img = q.search.query.raw_bytes
        r_imgs = [k.doc.raw_bytes for k in r.search.topk_results]
        r_scores = [k.score.value for k in r.search.topk_results]
        results.append((q_img, r_imgs, r_scores))
        if len(results) > num_q:
            break

Here is the result, where queries are on the first row.

Elastic made easy

To increase the number of parallel components in the flow, simply add replicas to each service:

flow = (Flow(check_version=False, ctrl_with_ipc=True)
        .add_preprocessor(name='prep', yaml_path='yaml/prep.yml', replicas=5)
        .add_encoder(yaml_path='yaml/incep.yml', replicas=6)
        .add_indexer(name='vec_idx', yaml_path='yaml/vec.yml')
        .add_indexer(name='doc_idx', yaml_path='yaml/doc.yml', recv_from='prep')
        .add_router(name='sync', yaml_path='BaseReduceRouter', num_part=2, recv_from=['vec_idx', 'doc_idx']))
flow.build(backend=None).to_url()

replicas

Deploying a flow via Docker Swarm/Kubernetes

One can convert a Flow object to a Docker Swarm/Kubernetes YAML compose file very easily via:

flow.build(backend=None).to_swarm_yaml()
version: '3.4'
services:
  Frontend0:
    image: gnes/gnes:latest-alpine
    command: frontend --port_in 56086 --port_out 52674 --port_ctrl 49225 --check_version
      False --ctrl_with_ipc True
  prep:
    image: gnes/gnes:latest-alpine
    command: preprocess --port_in 52674 --port_out 65461 --host_in Frontend0 --socket_in
      PULL_CONNECT --socket_out PUB_BIND --port_ctrl 49281 --check_version False --ctrl_with_ipc
      True --yaml_path yaml/prep.yml
  Encoder0:
    image: gnes/gnes:latest-alpine
    command: encode --port_in 65461 --port_out 50488 --host_in prep --socket_in SUB_CONNECT
      --port_ctrl 62298 --check_version False --ctrl_with_ipc True --yaml_path yaml/incep.yml
  vec_idx:
    image: gnes/gnes:latest-alpine
    command: index --port_in 50488 --port_out 57791 --host_in Encoder0 --host_out
      sync --socket_in PULL_CONNECT --socket_out PUSH_CONNECT --port_ctrl 58367 --check_version
      False --ctrl_with_ipc True --yaml_path yaml/vec.yml
  doc_idx:
    image: gnes/gnes:latest-alpine
    command: index --port_in 65461 --port_out 57791 --host_in prep --host_out sync
      --socket_in SUB_CONNECT --socket_out PUSH_CONNECT --port_ctrl 50333 --check_version
      False --ctrl_with_ipc True --yaml_path yaml/doc.yml
  sync:
    image: gnes/gnes:latest-alpine
    command: route --port_in 57791 --port_out 56086 --host_out Frontend0 --socket_out
      PUSH_CONNECT --port_ctrl 51285 --check_version False --ctrl_with_ipc True --yaml_path
      BaseReduceRouter --num_part 2

To deploy it, simply copy the generated YAML config to a file, say my-gnes.yml, and then run:

docker stack deploy --compose-file my-gnes.yml gnes-531

Building a cloud-native semantic poem search engine

In this example, we will build a semantic poem search engine using GNES. Unlike the previous flower search example, here we run each service as an isolated Docker container and orchestrate them via Docker Swarm. This represents a common scenario in cloud settings. You will learn how to use powerful and customized GNES images from GNES Hub.

🔰 Please check out this repository for details and follow the instructions to reproduce it.

query

👨‍💻️ Take-home messages

Let's make a short recap of what we have learned.

  • GNES is all-in-microservice; there are four fundamental components: preprocessor, encoder, indexer and router.
  • GNES has three typical workflows: train, index, and query.
  • One can leverage the GNES Flow API to define, modify, export or even visualize a workflow.
  • GNES requires an orchestration engine to coordinate all microservices. It supports Kubernetes, Docker Swarm, or a built-in multi-process/thread solution (see the sketch below).
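As a quick illustration of the last point, the same flow can be driven locally or exported for the cloud; this sketch simply recombines the calls already shown in the flower example above:

# Sketch only: reusing the `flow` object and `read_flowers` generator from the
# flower example above.
with flow.build(backend='process') as fl:          # local multi-process orchestration
    fl.index(bytes_gen=read_flowers(), batch_size=64)

swarm_yaml = flow.build(backend=None).to_swarm_yaml()  # export for Docker Swarm deployment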

Documentation

ReadTheDoc

The official documentation of GNES is hosted on doc.gnes.ai. It is automatically built, updated and archived on every new release.

Tutorial

🚧 Tutorial is still under construction. Stay tuned! Meanwhile, we sincerely welcome you to contribute your own learning experience / case study with GNES!

Benchmark

We have set up this repository to track the network latency across different GNES versions. As part of the CI/CD pipeline, this repo gets automatically updated when the GNES master is updated or a new GNES version is released.

Contributing

❤️ The beginning is always the hardest. But fear not, even if you find a typo, a missing docstring or unit test, you can simply correct them by making a commit to GNES. Here are the steps:

  1. Create a new branch, say fix-gnes-typo-1
  2. Fix/improve the codebase
  3. Commit the changes. Note that the commit message must follow the naming style, e.g. fix(readme): improve the readability and move sections
  4. Make a pull request. Note that the pull request must follow the naming style. It can simply be one of your commit messages, just copy-paste it, e.g. fix(readme): improve the readability and move sections
  5. Submit your pull request and wait for all checks to pass (usually 10 minutes)
    • Coding style
    • Commit and PR style checks
    • All unit tests
  6. Request a review from one of the developers on our core team.
  7. Get an LGTM 👍 and the PR gets merged.

Well done! Once a PR gets merged, here is what happens next:

  • all Docker images tagged with -latest will be automatically updated within an hour. You can check their build status here
  • every Friday, when a new release is published, PyPI packages and all Docker images tagged with -stable will be updated accordingly.
  • your contribution and commits will be included in our weekly release note. 🍻

More details can be found in the contributor guidelines.

Citing GNES

If you use GNES in an academic paper, you are more than welcome to make a citation. Here are the two ways of citing GNES:

  1. \footnote{https://github.com/gnes-ai/gnes}
    
  2. @misc{tencent2019GNES,
      title={GNES: Generic Neural Elastic Search},
      author={Xiao, Han and Yan, Jianfeng and Wang, Feng and Fu, Jie and Liu, Kai},
      howpublished={\url{https://github.com/gnes-ai}},
      year={2019}
    }

License

If you have downloaded a copy of the GNES binary or source code, please note that the GNES binary and source code are both licensed under the Apache License, Version 2.0.

Tencent is pleased to support the open source community by making GNES available.
Copyright (C) 2019 THL A29 Limited, a Tencent company. All rights reserved.

gnes's People

Contributors

colethienes · hanxiao · jemmyshin · larryjianfeng · mergify[bot] · micro-pixel · numb3r3 · raccoonliukai


gnes's Issues

🏗️ GNES is under construction with rapid release cycles

🙇 Hello there, I'd like to thank you for your interest in GNES, greatly appreciated!

After reading through the README, you may realize that GNES has an ambitious goal and that the workload to enable any-to-any semantic search is quite something. Right now we are rapidly iterating on GNES in terms of its usability and stability. This includes:

  • adding new APIs;
  • refactoring old APIs;
  • restructuring the project;
  • writing tutorials;
  • building a healthy GNES eco-system;
  • making demos.

⏰ A new release is scheduled every Friday evening (GMT+8); the release procedure can be found here.

Please be patient if you encounter issues, and please understand that we may not respond to your issues promptly. Thanks for your understanding. 🙏

Meanwhile, if you'd like to understand the design principles behind GNES, feel free to read this blog post: Generic Neural Elastic Search: From bert-as-service and Go Way Beyond

For other questions, please contact our team via [email protected]

Scalability Benchmarking and combining text and image

I have 3 questions:

  1. How can I combine text and image instead of text-to-text or image-to-image? Example: consider a product that has images as well as textual information; how would I use GNES to get the most similar matching product from the index?

  2. How scalable is this solution? Can I index 100 million images or texts?

  3. What should I do if I want to save the indexes to a database and load them at run time?

How to access the gRPCFrontend?

Team: Great work!

I am able to run the Docker Swarm stack with all the mentioned services running successfully. However, I am not familiar with gRPC; how do I make use of this frontend to pass documents for ingestion and further querying?

I am familiar with Python, FYI. Also, I see that you have HTTP end-to-end in the works; if there is a beta ready, I would love to try it and provide early feedback!

Thank you!

Error while running examples on README

I am trying to run GNES locally on my Linux machine. However, when I run bash run.sh, I receive the following errors.

E:EncoderService:[bas:run:305]:could not determine a constructor for the tag '!GPT2Encoder'
in "gpt2.yml", line 3, column 5

Namely, there is something wrong with gpt2.yml. I replaced the value of model_path with a real path to a downloaded GPT-2 model, but it still did not work.


I installed gnes and was also running the Docker service. Is there anything else we need to do to get it working? Thanks!

Clarify storage and distribution APIs

According to the marketing page, there are storage integrations with Tencent Cloud, but they don't appear to exist in the source code.

  1. How do you plan on persisting indexes between pod restarts?
  2. Can there be more than one index per cluster?
  3. Can indexes be written and read from in an online manner?

error while installing using pip

While installing gnes using pip install gnes:

I get the following missing-file error with the latest release. What could be the possible reasons?

building 'gnes.indexer.chunk.bindexer.cython' extension
creating build/temp.linux-x86_64-3.5
creating build/temp.linux-x86_64-3.5/gnes
creating build/temp.linux-x86_64-3.5/gnes/indexer
creating build/temp.linux-x86_64-3.5/gnes/indexer/chunk
creating build/temp.linux-x86_64-3.5/gnes/indexer/chunk/bindexer
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.5m -c gnes/indexer/chunk/bindexer/bindexer.c -o build/temp.linux-x86_64-3.5/gnes/indexer/chunk/bindexer/bindexer.o -O3 -g0
gnes/indexer/chunk/bindexer/bindexer.c:4:20: fatal error: Python.h: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

----------------------------------------

Command "/usr/bin/python3 -u -c "import setuptools, tokenize;file='/tmp/pip-install-75yi68kt/gnes/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-record-2wvrek1d/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-75yi68kt/gnes/

Is this still maintained

I really like the ambition and the direction this project has set out. However, since the last commit is from October, I am a bit concerned that the project has died :(. Can you give any comments on that?

Add a notice that the indexing flow only accepts bytes, not str

Hi. I found an issue: when using Flow to index text of type str, you get this unhelpful message:

E:CLIClient:[cli:sta: 53]:<_Rendezvous of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "Exception iterating requests!"
	debug_error_string = "None"
>

This is caused by passing a string instead of bytes.

So to solve this, you need to convert/encode your str to bytes.
I will make a PR for this soon.
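For example, a minimal workaround sketch (assuming the Flow.index API shown in the README above; the text list is just a placeholder):

# Sketch: encode str payloads to UTF-8 bytes before feeding the indexing flow.
def to_bytes(texts):
    for t in texts:
        yield t.encode('utf-8')  # Flow.index expects bytes, not str

with flow.build(backend='process') as fl:
    fl.index(bytes_gen=to_bytes(['a poem', 'another poem']), batch_size=2)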

How to use GNES for text classification?

Problem and Question

Hi, I have taken a look at the poem project and want to use different data for indexing. How can I use a labeled CSV to do supervised learning with text data?

My data sample.tsv with Thai text:

intent	question
ClassA	FAQ 1? ចូលទៅប្រើប្រាស់កម្មវិធីនេះ?
ClassA	Another FAQ Question similar to FAQ 1?
ClassB	TestQuestion with thai text ចូលទៅប្រើប្រាស់កម្មវិធីនេះ?
ClassB	Another data sample

What I have tried

  1. I tried to pass a Pandas Series and it raised a gRPC error.
  2. I tried to pass a tuple of (intent, question) and it raised a gRPC error.
  3. I tried to index the question only, converting the str into bytes. This built successfully without a gRPC error, but it raises:
W:EncoderService:[enc:emb: 42]:document (doc_id=20) contains no chunks!
W:IndexerService:[ind:_ha: 57]:document (doc_id=10) contains no chunks!
W:EncoderService:[enc:emb: 42]:document (doc_id=22) contains no chunks!
W:IndexerService:[ind:_ha: 57]:document (doc_id=12) contains no chunks!
W:EncoderService:[enc:emb: 42]:document (doc_id=24) contains no chunks!
W:IndexerService:[ind:_ha: 57]:document (doc_id=16) contains no chunks!
E:EncoderService:[enc:emb: 67]:can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
W:EncoderService:[enc:emb: 68]:encoder service throws an exception, the sequel pipeline may not work properly

Question

How can I train on the labeled CSV data?

Can you support sharding?

I really like this project. However, sometimes the data is too large; can we do sharding and then merge the results?

Modify data records after indexing

Hi.
Suppose we have 100,000 sentences or images indexed and want to remove 5 sentences or images from the search scope. How is that possible? Removing everything and recomputing would need a lot of time and resources.

Add the HTTP Client to the flow

Amazing project! Been using it for a week now and I love the architecture of the entire system.

If it's not too drastic of a change, it would be nice to have the HTTP client optionally added to the flow so that it is possible to send JSON requests through the client to the frontend. I would like to be able to deploy the client image in the swarm YAML. This would have the benefit of fully containerizing GNES and minimizing dependencies for the end user. Also, I've seen a few people have problems with leveraging the bytes generator directly.

Could it potentially be added to the service map and then to the node graph?

Examples of use cases

Hi,
This is a VERY GREAT looking project. Thank you for your effort.

But from the documentation it is not clear what kinds of uses are possible with this project.
For example, in a text search, I would imagine that GNES would enable me to find similar documents. Is that right?

And how does it compare with Vespa.ai (https://github.com/vespa-engine)?

Some examples of what is possible with GNES would be nice!

Poem search example without Docker

I am facing problems running Docker on my machine, such as:

  • once Docker is started, it never stops, even after deleting the corresponding images
  • I could not start the gRPC server

It would be better if you could provide the poem example without Docker and gRPC,
like the example you gave for image search (in a simple notebook).

Stuck on `create new stub` when running `make client_query`

Explanation

Hi. I have implemented the demo-poems-ir and it is working properly. Now I want to adapt the example to my custom class.
Here is the class:

class FaqClient(CLIClient):

    @property
    def bytes_generator(self):
        num_rows = 0

        for rr in csv.DictReader(self.args.txt_file, delimiter='\t'):
            yield json.dumps(rr['answer']).encode()
            num_rows += 1
            if num_rows > self.args.num_poems:
                return

    def query_callback(self, req, resp):
        print(colored(req.search.query, 'green'))
        for k in resp.search.topk_results:
            print(colored(k.doc.doc_id, 'magenta'))
            print(colored(k.doc.raw_text, 'yellow'))
            print(colored(k.score.value, 'blue'))
            pprint.pprint(json.loads(k.score.explained))
            input('press any key to continue...')

        if self.args.prompt:
            input('on to next query, press any key to continue...')

Problem

When I run make client_query or make client_index, it gets stuck at:

I:FaqClient:[bas:__i:136]:create new stub...

Here is the screenshot:
Screen Shot 2019-10-07 at 23 36 31

Questions

What exactly happens when the client logs create new stub...? I read the source; it only initializes GnesRPCStub with the channel. It gets stuck there and never reaches:

self.logger.critical('gnes client ready at %s:%d!' % (self.args.grpc_host, self.args.grpc_port))

Kindly advise

Waiting on channel to be ready

I'm running the demo-poems-ir and it seems to be stuck at

I:MyClient:[bas:__i:124]:setting up grpc insecure channel...
I:MyClient:[bas:__i:133]:waiting channel to be ready...

Probably something with my setup. Has anyone else seen this?
Thanks

semantic poem demo issues

Hi, I really appreciate and see potential in this GNES project. I have two questions:

  1. I have been able to successfully run the flowers demo, along with other images. However, I am unsure how to run text objects through; can this be elaborated further?

  2. I have also tried the semantic poem demo, but when I run make client_index d=10 it prints "GNES client ready" and then gets stuck at the first indexing batch with a speed of 0.0 batch/s.

I would greatly appreciate any insight into the above. Thanks in advance

Refactoring the core modules using C++ or Golang

Hi all,

Have you thought about using a statically typed language like C++ or Go to program the core modules of this project?

I have gone through this project and think it is not efficient to build such a large project in a language like Python. It is easy for demonstration purposes, but not good enough for a mature and highly scalable search-engine architecture. So from my perspective, the project should perhaps be divided into two pieces:

For the ML training modules: it is fine to program them in Python, since the deep models may be updated dynamically.

For the service modules: these should run efficiently, be performance-sensitive and highly scalable. I would suggest programming them in C++ or Go.

Here is an example project:
https://github.com/alibaba/euler
