Comments (6)
To address the "Input too long for selected model" error you're encountering, even after implementing the recent fix and adjusting your chunk size to 400 with an overlap of 80, consider the following steps:

- **Check model capabilities**: Ensure the model (`cohere.embed-english-v3`) supports the chunk sizes you're using. Different models have different input-size limits.
- **Adjust chunk parameters**: If some chunks are still too large, reduce the chunk size further or adjust the overlap. This can help keep the chunks within the model's acceptable input range.
- **Explore other models**: If adjusting sizes doesn't work, look into a different model that can handle larger inputs. The `list_supported_models` method can help identify suitable alternatives.
- **Customize the request body**: For the Cohere provider, you might need to customize how the request body is constructed, particularly how text is split and sent. This could involve adjusting settings such as `truncate`.
- **Review API and SDK settings**: Ensure your API requests are optimally configured, which may help with processing large inputs more efficiently.
- **Consult documentation and support**: If issues persist, the official documentation or support channels for the models you're using may offer additional insights or solutions.

By following these steps, you should be better equipped to resolve the input-size issue when embedding a PDF file.
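The chunk-size/overlap idea in the steps above can be illustrated with a minimal, library-agnostic sketch. This is not LlamaIndex's actual splitter: the function name is hypothetical, and it counts characters for simplicity, whereas real splitters count tokens.

```python
def chunk_text(text: str, chunk_size: int = 400, overlap: int = 80) -> list[str]:
    """Split text into overlapping chunks so each stays within a size limit.

    Character-based sketch; production splitters work on tokens or sentences.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Because every slice is at most `chunk_size` characters, no chunk can exceed the budget you choose for the embedding model.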
Details
Sources
- [Question]: ModelError: Your input is too long. Max input length is 4096 tokens, but you supplied 5441 tokens.
- Chunk size sometimes exceeds max model size
- llama-index-legacy/llama_index/legacy/embeddings/utils.py
- llama-index-legacy/llama_index/legacy/embeddings/bedrock.py
- llama-index-core/llama_index/core/service_context.py
- llama-index-legacy/llama_index/legacy/service_context.py
- docs/docs/module_guides/indexing/vector_store_index.md
- llama-index-integrations/indices/llama-index-indices-managed-vectara/llama_index/indices/managed/vectara/base.py
- llama-index-core/llama_index/core/indices/base.py
- llama-index-core/llama_index/core/indices/vector_store/base.py
- llama-index-integrations/embeddings/llama-index-embeddings-bedrock/llama_index/embeddings/bedrock/base.py
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
To continue the conversation, mention @dosu.
I suggest setting chunk_size and similar parameters with Settings rather than ServiceContext.
Here is the documentation about the migration: https://docs.llamaindex.ai/en/stable/module_guides/supporting_modules/service_context_migration/
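A sketch of the Settings-based configuration suggested above (attribute names follow the linked migration guide; check them against your installed llama-index-core version):

```python
from llama_index.core import Settings

# Global defaults, replacing the per-ServiceContext configuration.
Settings.chunk_size = 400
Settings.chunk_overlap = 80
```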
@gsuriano, thanks for the suggestion. I will definitely migrate to the Settings approach; however, I am still encountering the same error.
Perhaps another constraint is at play: in my testing, it begins to fail when the total character count of "texts" approaches 8192 characters.
To effectively change embed_batch_size, you should use the Settings object. That's why I also suggested using Settings for chunk_size: I suspect the split is being done with LlamaIndex's default values rather than the chunk_size you set on the ServiceContext.
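A sketch of setting embed_batch_size globally via Settings. The BedrockEmbedding constructor arguments shown here (in particular `model_name`) are assumptions; check the signature in your installed llama-index-embeddings-bedrock version.

```python
from llama_index.core import Settings
from llama_index.embeddings.bedrock import BedrockEmbedding

# embed_batch_size controls how many texts are sent per embedding request.
Settings.embed_model = BedrockEmbedding(
    model_name="cohere.embed-english-v3",  # assumed parameter name
    embed_batch_size=1,
)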
@gsuriano, Thank you for the advice. I was able to change the embed_batch_size via Settings. However, the error still persists even when I set embed_batch_size=1, so apparently, this is not the problem.
Upon debugging, I found that the error starts to occur at this node:

```json
{"texts": ["page_label: 2\ngenai_document_id: 1681b55d-a16b-4fe7-9aa5-a8edf60501b6\ngenai_tenant_id: 5853\ngenai_created_date_utc: 2024-05-09T16:06:43.000Z\ngenai_key1: value 1\ngenai_key2: value 2\ngenai_application: CE\ngenai_entitytype: Program\ngenai_entityid: 51137\ngenai_llmmodel: anthropic.claude-3-haiku-20240307-v1:0\ngenai_embeddingmodel: cohere.embed-english-v3\ngenai_embeddingdimension: 1024\n\nIT Change Management \nUTRGV 2 \n \nTable of Contents \nIntroduction ............................................................................................................................................................. 3 \nDefining Change .................................................................................................................................................... 3 \nRoles and Responsibilities ..............."], "input_type": "search_document", "truncate": "NONE"}
```
The "texts" field has only 848 characters. However, the error I encountered is:
```
File "C:\work\LambdaModules10\Lib\site-packages\llama_index\embeddings\bedrock\base.py", line 345, in _get_embedding
    response = self._client.invoke_model(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\work\LambdaModules10\Lib\site-packages\botocore\client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\work\LambdaModules10\Lib\site-packages\botocore\client.py", line 1021, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Input is too long for requested model.
```
If "texts" has fewer characters, it works without issues.
I would appreciate any suggestions.
I have finally found the issue.
This is caused by the `"truncate": "NONE"` parameter in this code (`llama_index/embeddings/bedrock/base.py`):

```python
request_body = json.dumps(
    {
        "texts": payload,
        "input_type": input_types[input_type],
        "truncate": "NONE",
    }
)
```

If you remove the truncate parameter, it works. The Amazon documentation states that the default value is NONE, yet explicitly specifying "NONE" causes the "input too long" error; this appears to be an internal bug in Amazon Bedrock.
In theory, since the default value is NONE, removing the truncate parameter should produce the same behavior. Source: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html
I have created a pull request with this fix.
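The request-body change can be sketched with a stdlib-only example. The helper name `build_cohere_body` is hypothetical, not the actual LlamaIndex patch; it only illustrates omitting the key instead of sending "NONE".

```python
import json


def build_cohere_body(payload, input_type="search_document", truncate=None):
    """Build a Bedrock/Cohere embed request body.

    Omitting "truncate" entirely (rather than sending the string "NONE")
    sidesteps the spurious "Input is too long" validation error.
    """
    body = {"texts": payload, "input_type": input_type}
    if truncate is not None:
        body["truncate"] = truncate  # only include the key when explicitly requested
    return json.dumps(body)
```

Calling `build_cohere_body(["some text"])` yields a body with no "truncate" key at all, relying on the documented service-side default of NONE.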