Comments (6)
To address the "Input too long for selected model" error you're encountering, even after implementing the recent fix and adjusting your chunk size to 400 with an overlap of 80, consider the following steps:

- **Check model capabilities**: Ensure the model (`cohere.embed-english-v3`) supports the chunk sizes you're using. Different models have different input-size limits.
- **Adjust chunk parameters**: If some chunks are still too large, reduce the chunk size further or adjust the overlap. This can help keep the chunks within the model's acceptable input range.
- **Explore other models**: If adjusting sizes doesn't work, look into a different model that can handle larger inputs. The `list_supported_models` method can help identify suitable alternatives.
- **Customize the request body**: For the Cohere provider, you might need to customize how the request body is constructed, particularly how text is split and sent. This could involve adjusting settings such as `truncate`.
- **Review API and SDK settings**: Ensure your API requests are optimally configured, which may help with processing large inputs more efficiently.
- **Consult documentation and support**: If issues persist, the official documentation or support channels for the models you're using may offer additional insights or solutions.

By following these steps, you should be better equipped to resolve the input-size issue when embedding a PDF file.
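The chunk-size/overlap idea in the steps above can be illustrated with a minimal, library-agnostic sketch. This is not LlamaIndex's actual splitter: the function name is hypothetical, and it counts characters for simplicity, whereas real splitters count tokens.

```python
def chunk_text(text: str, chunk_size: int = 400, overlap: int = 80) -> list[str]:
    """Split text into overlapping chunks so each stays within a size limit.

    Character-based sketch; production splitters work on tokens or sentences.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Because every slice is at most `chunk_size` characters, no chunk can exceed the budget you choose for the embedding model.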
Details
Sources
- [Question]: ModelError: Your input is too long. Max input length is 4096 tokens, but you supplied 5441 tokens.
- Chunk size sometimes exceeds max model size
- llama-index-legacy/llama_index/legacy/embeddings/utils.py
- llama-index-legacy/llama_index/legacy/embeddings/bedrock.py
- llama-index-core/llama_index/core/service_context.py
- llama-index-legacy/llama_index/legacy/service_context.py
- docs/docs/module_guides/indexing/vector_store_index.md
- llama-index-integrations/indices/llama-index-indices-managed-vectara/llama_index/indices/managed/vectara/base.py
- llama-index-core/llama_index/core/indices/base.py
- llama-index-core/llama_index/core/indices/vector_store/base.py
- llama-index-integrations/embeddings/llama-index-embeddings-bedrock/llama_index/embeddings/bedrock/base.py
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
To continue the conversation, mention @dosu.
I suggest setting chunk_size and similar parameters with Settings rather than ServiceContext.
Here is the documentation about the migration: https://docs.llamaindex.ai/en/stable/module_guides/supporting_modules/service_context_migration/
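A sketch of the Settings-based configuration suggested above (attribute names follow the linked migration guide; check them against your installed llama-index-core version):

```python
from llama_index.core import Settings

# Global defaults, replacing the per-ServiceContext configuration.
Settings.chunk_size = 400
Settings.chunk_overlap = 80
```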
@gsuriano, thanks for the suggestion. I will definitely migrate to the Settings approach; however, I am still encountering the same error.
Perhaps another constraint is at play: in my testing, it begins to fail when the total character count of "texts" approaches 8192 characters.
To effectively change embed_batch_size, you should use the Settings object. That's why I also suggested using Settings for chunk_size: I suspect the split is being done with LlamaIndex's default values rather than the chunk_size you set on the ServiceContext.
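A sketch of setting embed_batch_size globally via Settings. The BedrockEmbedding constructor arguments shown here (in particular `model_name`) are assumptions; check the signature in your installed llama-index-embeddings-bedrock version.

```python
from llama_index.core import Settings
from llama_index.embeddings.bedrock import BedrockEmbedding

# embed_batch_size controls how many texts are sent per embedding request.
Settings.embed_model = BedrockEmbedding(
    model_name="cohere.embed-english-v3",  # assumed parameter name
    embed_batch_size=1,
)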
@gsuriano, Thank you for the advice. I was able to change the embed_batch_size via Settings. However, the error still persists even when I set embed_batch_size=1, so apparently, this is not the problem.
Upon debugging, I found that the error starts to occur at this node:

```json
{"texts": ["page_label: 2\ngenai_document_id: 1681b55d-a16b-4fe7-9aa5-a8edf60501b6\ngenai_tenant_id: 5853\ngenai_created_date_utc: 2024-05-09T16:06:43.000Z\ngenai_key1: value 1\ngenai_key2: value 2\ngenai_application: CE\ngenai_entitytype: Program\ngenai_entityid: 51137\ngenai_llmmodel: anthropic.claude-3-haiku-20240307-v1:0\ngenai_embeddingmodel: cohere.embed-english-v3\ngenai_embeddingdimension: 1024\n\nIT Change Management \nUTRGV 2 \n \nTable of Contents \nIntroduction ............................................................................................................................................................. 3 \nDefining Change .................................................................................................................................................... 3 \nRoles and Responsibilities ..............."], "input_type": "search_document", "truncate": "NONE"}
```
The "texts" field has only 848 characters. However, the error I encountered is:
```
File "C:\work\LambdaModules10\Lib\site-packages\llama_index\embeddings\bedrock\base.py", line 345, in _get_embedding
    response = self._client.invoke_model(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\work\LambdaModules10\Lib\site-packages\botocore\client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\work\LambdaModules10\Lib\site-packages\botocore\client.py", line 1021, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Input is too long for requested model.
```
If "texts" has fewer characters, it works without issues.
I would appreciate any suggestions.
I have finally found the issue.
This is caused by the `"truncate": "NONE"` parameter in this code (`llama_index/embeddings/bedrock/base.py`):

```python
request_body = json.dumps(
    {
        "texts": payload,
        "input_type": input_types[input_type],
        "truncate": "NONE",
    }
)
```

If you remove the truncate parameter, it works. The Amazon documentation states that the default value is NONE, yet explicitly specifying "NONE" causes the "input too long" error; this appears to be an internal bug in Amazon Bedrock.
In theory, since the default value is NONE, removing the truncate parameter should produce the same behavior. Source: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html
I have created a pull request with this fix.
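The request-body change can be sketched with a stdlib-only example. The helper name `build_cohere_body` is hypothetical, not the actual LlamaIndex patch; it only illustrates omitting the key instead of sending "NONE".

```python
import json


def build_cohere_body(payload, input_type="search_document", truncate=None):
    """Build a Bedrock/Cohere embed request body.

    Omitting "truncate" entirely (rather than sending the string "NONE")
    sidesteps the spurious "Input is too long" validation error.
    """
    body = {"texts": payload, "input_type": input_type}
    if truncate is not None:
        body["truncate"] = truncate  # only include the key when explicitly requested
    return json.dumps(body)
```

Calling `build_cohere_body(["some text"])` yields a body with no "truncate" key at all, relying on the documented service-side default of NONE.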