Comments (15)

kkraune commented on June 23, 2024

@Gladiator566: I have created vespa-engine/vespa#30219 for the X-Content-Hash issue you reported. It is most likely a different issue from the one reported by @ricoms here. Thanks for reporting!

from pyvespa.

kkraune commented on June 23, 2024

Hi, sorry for the slow response! The first problem to solve is that the configproxy cannot reach the config server. Please try https://github.com/vespa-engine/sample-apps/blob/master/examples/operations/multinode-HA/docker-compose.yaml and validate that it works. Then you can modify the compose file and add your own services.

The best place to look is probably the network and hostname configuration. This looks like a connectivity problem, so maybe add a network and use a fully qualified hostname instead of just vespa.
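
A quick way to test the connectivity theory is a plain TCP probe, run from the container that hosts the configproxy. This is a minimal sketch, not an official diagnostic: the helper name `can_connect` is made up for illustration, and the port to probe (19071 is commonly the config server port) and the hostname are assumptions to adjust for your compose file.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS lookup failures, refused connections, and timeouts.
        return False
```

For example, `can_connect("vespa", 19071)` returning False while the same call with a fully qualified hostname returns True would point at name resolution rather than at Vespa itself.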

Gladiator566 commented on June 23, 2024

@kkraune Hi, I am trying to use the bge-m3 model for embedding-based hybrid search, and I followed the official tutorial to run Vespa in a local Docker container. Since I have millions of documents to feed, I use the feed_iterable function to feed the data in bulk from an iterable, and I ran into the same problems as above, such as WARNING/urllib3.connectionpool: Retrying NewConnectionError: Failed to establish a new connection, or Max retries exceeded with URL. I tried setting the max_connections parameter to a very large number, and I tried creating a session for the feed, but neither worked. How can I solve this connection error so that I can bulk-insert data into Vespa? Thank you!

My environment: Linux, pyvespa 0.39, latest Docker image.
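
For context, a feed_iterable call with a callback that records per-document failures might look like the sketch below. The helper names `to_feed_docs` and `feed_and_collect_failures` and the field name `text` are hypothetical, and the keyword arguments are assumed to match the feed_iterable signature in the installed pyvespa version. The `app` argument is expected to be a connected `vespa.application.Vespa` instance.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def to_feed_docs(rows: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Lazily turn rows into the {"id": ..., "fields": ...} dicts used for feeding."""
    for row in rows:
        # "text" is a hypothetical field name; use the fields of your schema.
        yield {"id": str(row["id"]), "fields": {"text": row["text"]}}


def feed_and_collect_failures(
    app: Any, rows: Iterable[Dict[str, Any]], schema: str, namespace: str
) -> List[Tuple[str, Any]]:
    """Feed rows through app.feed_iterable and return (id, response body) for failures."""
    failures: List[Tuple[str, Any]] = []

    def callback(response: Any, doc_id: str) -> None:
        # Invoked once per document with the response and the document id.
        if not response.is_successful():
            failures.append((doc_id, response.get_json()))

    app.feed_iterable(
        to_feed_docs(rows),
        schema=schema,
        namespace=namespace,
        callback=callback,
        max_connections=8,  # deliberately modest; a huge value does not add capacity
        max_workers=4,
    )
    return failures
```

Collecting the failed ids and response bodies this way shows what the server actually answered, which is usually more informative than the urllib3 retry warnings.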

kkraune commented on June 23, 2024

Hi @Gladiator566, I think you need to look in vespa.log to find out what the problem might be. If it turns out to be the same problem, follow the advice to try /multinode-HA/docker-compose.yaml and verify that it works before trying your own configuration.

You can also try https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html with the free trial, which makes this easier and eliminates other sources of failure.

Gladiator566 commented on June 23, 2024

@kkraune I tried Vespa Cloud as described in the tutorial, but I got an error like RuntimeError: Status code 400 doing POST at https://api.vespa-external.aws.oath.cloud:4443/application/v4/tenant/bge-m3/application/bgeM3/instance/default/deploy/dev-aws-us-east-1c: Value of X-Content-Hash header does not match computed content hash. How can I solve this problem?

kkraune commented on June 23, 2024

Thanks for reporting. Can you add the steps you took, so we can reproduce? Or did you follow the steps in https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html and it failed here?

 app = vespa_cloud.deploy()

It also helps to make sure that no applications are already deployed.

@hmusum I assume this is an error from our API. We should document how to fix it.

Gladiator566 commented on June 23, 2024

@kkraune Yes, I followed the exact steps in https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html, concretely as below:

  1. Download vespa-cli from the latest GitHub release
  2. vespa config set target cloud
  3. vespa config set application bge-m3.bgeM3
  4. vespa auth cert -N
  5. vespa auth api-key
  6. Add the public API key on the keys page of the cloud console in the browser
  7. vespa_cloud = VespaCloud(tenant=os.environ["TENANT_NAME"], application='bgeM3', key_content=None, key_location=api_key_path, application_package=application_package)

It failed at app = vespa_cloud.deploy(). There are no applications already deployed in the cloud.

Thanks.

kkraune commented on June 23, 2024

Hi again, I tried https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html and it worked for me. I ran the notebook locally on my laptop. Some ideas:

The .vespa directory in your home directory stores credentials. You can temporarily rename it to reset all credentials, then try the guide again with no other changes. You can also delete the API key in the console and try with a fresh one.
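
The temporary rename can be done by hand or with a few lines of Python. This is a small sketch; the helper name `stash_directory` and the `.bak` suffix are arbitrary choices for illustration.

```python
from pathlib import Path


def stash_directory(path: Path, suffix: str = ".bak") -> Path:
    """Rename path to the same name plus suffix and return the new location."""
    target = path.with_name(path.name + suffix)
    if target.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"{target} already exists; pick another suffix")
    path.rename(target)
    return target
```

Calling `stash_directory(Path.home() / ".vespa")` moves the credentials aside. Renaming the directory back restores them.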

hmusum commented on June 23, 2024

The "Value of X-Content-Hash header does not match computed content hash" error is due to some misconfiguration or bug on the client side, but it's hard to say what the user should do without knowing the root cause of the error.

kkraune commented on June 23, 2024

The problem seems to be a mismatch with the hash computed in

https://github.com/vespa-engine/pyvespa/blob/master/vespa/deployment.py#L644

and the validation of it in Vespa Cloud. This is with pyvespa 0.39, which is the latest version. We are looking into it.
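
To illustrate how such a mismatch can arise, here is a sketch of a content-hash check. It assumes the header carries a base64-encoded SHA-256 digest of the request body; the thread does not confirm that format, and the function names `content_hash` and `hashes_match` are hypothetical.

```python
import base64
import hashlib


def content_hash(body: bytes) -> str:
    """Base64-encoded SHA-256 digest of body (assumed header format)."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def hashes_match(sent_header: str, received_body: bytes) -> bool:
    """Mimic a server-side check: recompute the hash over the bytes that arrived."""
    return sent_header == content_hash(received_body)
```

Under that assumption, any difference between the bytes that were hashed and the bytes that were actually sent, even a single trailing byte, makes the check fail.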

vudangthinh commented on June 23, 2024

Quoting @Gladiator566: "@kkraune Hi, I am trying to use the bge-m3 model for embedding-based hybrid search, and I followed the official tutorial to run Vespa in a local Docker container. Since I have millions of documents to feed, I use the feed_iterable function to feed the data in bulk from an iterable, and I ran into the same problems as above, such as WARNING/urllib3.connectionpool: Retrying NewConnectionError: Failed to establish a new connection, or Max retries exceeded with URL. I tried setting the max_connections parameter to a very large number, and I tried creating a session for the feed, but neither worked. How can I solve this connection error so that I can bulk-insert data into Vespa? Thank you!"

Hi, I have run into the same problem. Do you know how to fix it?

kkraune commented on June 23, 2024

Hi @vudangthinh! I don't think increasing the number of connections will help. The error message is probably a symptom of a maxed-out instance.

The Vespa CLI (https://docs.vespa.ai/en/vespa-cli.html) has better feed flow control. Can you please try that, see how the feeding goes, and let me know?

vudangthinh commented on June 23, 2024

I tried vespa feed, but the error persists. At first the indexing went fine, but after feeding many documents the errors started:

feed: got error "Post "http://127.0.0.1:8080/document/v1/benchmark/hybridsearch/docid/2936": write tcp 127.0.0.1:52604->127.0.0.1:8080: write: broken pipe" (no body) for put id:benchmark:hybridsearch::2936: retrying
feed: got error "Post "http://127.0.0.1:8080/document/v1/benchmark/hybridsearch/docid/3283": write tcp 127.0.0.1:52610->127.0.0.1:8080: write: broken pipe" (no body) for put id:benchmark:hybridsearch::3283: retrying
feed: got error "Post "http://127.0.0.1:8080/document/v1/benchmark/hybridsearch/docid/3090": write tcp 127.0.0.1:52622->127.0.0.1:8080: write: broken pipe" (no body) for put id:benchmark:hybridsearch::3090: retrying
feed: got error "Post "http://127.0.0.1:8080/document/v1/benchmark/hybridsearch/docid/3273": write tcp 127.0.0.1:52634->127.0.0.1:8080: write: broken pipe" (no body) for put id:benchmark:hybridsearch::3273: retrying

kkraune commented on June 23, 2024

OK - can you please check vespa.log inside the Docker container? It could be a resource problem, and the log might tell.
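
When scanning a large vespa.log, a small filter for warning-level and worse lines can help. The sketch below assumes the log is tab-separated with the level in the sixth field (time, host, pid, service, component, level, message); verify that layout against your own log before relying on it. The function name `problem_lines` is hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def problem_lines(
    lines: Iterable[str],
    levels: Tuple[str, ...] = ("warning", "error", "fatal"),
) -> Iterator[str]:
    """Yield log lines whose level field is one of the given levels."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: time, host, pid, service, component, level, message.
        if len(fields) >= 7 and fields[5].lower() in levels:
            yield line.rstrip("\n")
```

It can be fed an open file object, for example after copying vespa.log out of the container.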

bratseth commented on June 23, 2024

You are probably sending more requests than the system can handle in a timely manner, so some of them end up crossing a connection recycling event. These requests are retried until they time out, so this is not really an error in itself, but you probably want to increase your resources (maybe run with a GPU) or feed slower. Setting a lower timeout (--timeout) should get rid of these messages and lead to less queuing, which is probably an advantage if you want to determine faster what maximum throughput you can actually get.
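
If you feed from Python, one way to feed slower is to throttle the iterable that produces the documents. This is a minimal sketch of a fixed-rate throttle; the name `throttle` and its parameters are made up for illustration, and the right rate has to be found by experiment.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttle(
    items: Iterable[T],
    per_second: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items no faster than per_second items per second."""
    if per_second <= 0:
        raise ValueError("per_second must be positive")
    interval = 1.0 / per_second
    next_time = clock()
    for item in items:
        now = clock()
        if now < next_time:
            # Too early for the next item: wait out the rest of the interval.
            sleep(next_time - now)
        # Schedule the following item one interval after this one.
        next_time = max(next_time, now) + interval
        yield item
```

Wrapping the document generator, for example `throttle(docs, per_second=200)`, before handing it to the feed call caps the request rate without changing anything else. The value 200 is only a placeholder.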
