
Quarkus Langchain4j extension

Home Page: https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html

License: Apache License 2.0


quarkus-langchain4j's Introduction

Quarkus LangChain4j


This repository contains Quarkus extensions that facilitate seamless integration between Quarkus and LangChain4j, enabling easy incorporation of Large Language Models (LLMs) into your Quarkus applications.

Features

Here is a non-exhaustive list of features that are currently supported:

  • Declarative AI services
  • Integration with diverse LLMs (OpenAI GPTs, Hugging Face, Ollama...)
  • Tool support
  • Embedding support
  • Document store integration (Redis, Chroma, Infinispan...)
  • Native compilation support
  • Integration with Quarkus observability stack (metrics, tracing...)

Documentation

Refer to the comprehensive documentation for detailed information and usage guidelines.

Samples

Check out the samples and integration tests to gain practical insight into how to use these extensions effectively.

Getting Started

To incorporate Quarkus LangChain4j into your Quarkus project, add the following Maven dependency:

<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-openai</artifactId>
    <version>{latest-version}</version>
</dependency>

or, to use Hugging Face:

<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-huggingface</artifactId>
    <version>{latest-version}</version>
</dependency>

Make sure to replace {latest-version} with the most recent release version available on Maven Central.
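After adding the dependency, the model is usually configured through application.properties. Here is a minimal sketch, assuming the OpenAI provider; the exact key names may vary between releases, so check the configuration reference:

```properties
# Read the API key from the environment (assumed key name; check the docs)
quarkus.langchain4j.openai.api-key=${OPENAI_API_KEY}
# Optionally raise the request timeout for slow model responses
quarkus.langchain4j.openai.timeout=60s
```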

Contributing

Feel free to contribute to this project by submitting issues or pull requests.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.


quarkus-langchain4j's Issues

Investigate if OpenAI Streaming API can be used

Hopefully this is considered worth investigating. From a number of related discussions I've read, one of the main techniques for improving perceived OpenAI response time is supporting a streaming API.
For example, in a chatbot sample, users would see the response being formed gradually, word by word or sentence by sentence, minimising the effect of a somewhat slow response.
Thanks
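The word-by-word rendering described above can be sketched with plain Java, independent of any API: a streamed response is just a sequence of tokens appended to the visible text as they arrive (illustrative only; real tokens would come from the provider asynchronously):

```java
import java.util.List;

// Illustrative only: simulates rendering a streamed response token by token.
// In a real chat UI, each partial string would be repainted as it grows.
class StreamingSketch {

    static String renderIncrementally(List<String> tokens) {
        StringBuilder visible = new StringBuilder();
        for (String token : tokens) {
            visible.append(token);
            // A chat UI would update the displayed message here.
            System.out.println(visible);
        }
        return visible.toString();
    }

    public static void main(String[] args) {
        String full = renderIncrementally(List.of("Hello", ", ", "world"));
        System.out.println("final: " + full);
    }
}
```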

Unable to use sentence-transformers/all-mpnet-base-v2 with HuggingFace

This embedding model is the default when using Python.

See: https://api.python.langchain.com/en/latest/embeddings/langchain.embeddings.huggingface.HuggingFaceEmbeddings.html

The exception is the following:

jakarta.ws.rs.ProcessingException: The timeout period of 10000ms has been exceeded while executing POST /pipeline/feature-extraction/sentence-transformers/all-mpnet-base-v2 for server api-inference.huggingface.co:443
        at org.jboss.resteasy.reactive.client.impl.InvocationBuilderImpl.unwrap(InvocationBuilderImpl.java:223)
        at org.jboss.resteasy.reactive.client.impl.InvocationBuilderImpl.method(InvocationBuilderImpl.java:344)
        at io.quarkiverse.langchain4j.huggingface.HuggingFaceRestApi$$QuarkusRestClientInterface.embed(Unknown Source)
        at io.quarkiverse.langchain4j.huggingface.QuarkusHuggingFaceClientFactory$QuarkusHuggingFaceClient.embed(QuarkusHuggingFaceClientFactory.java:84)
        at io.quarkiverse.langchain4j.huggingface.QuarkusHuggingFaceEmbeddingModel.embedTexts(QuarkusHuggingFaceEmbeddingModel.java:70)
        at io.quarkiverse.langchain4j.huggingface.QuarkusHuggingFaceEmbeddingModel.embedAll(QuarkusHuggingFaceEmbeddingModel.java:63)
        at dev.langchain4j.model.embedding.EmbeddingModel_39f15f757608db0afd626cecc7d217137da45ff3_Synthetic_ClientProxy.embedAll(Unknown Source)
        at dev.langchain4j.store.embedding.EmbeddingStoreIngestor.ingest(EmbeddingStoreIngestor.java:62)
        at io.quarkiverse.langchain4j.sample.openshift.ingestion.IngestionJob.run(IngestionJob.java:63)
        at picocli.CommandLine.executeUserObject(CommandLine.java:2026)
        at picocli.CommandLine.access$1500(CommandLine.java:148)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
        at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
        at io.quarkus.picocli.runtime.PicocliRunner$EventExecutionStrategy.execute(PicocliRunner.java:26)
        at picocli.CommandLine.execute(CommandLine.java:2170)
        at io.quarkus.picocli.runtime.PicocliRunner.run(PicocliRunner.java:40)
        at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:132)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
        at io.quarkus.runner.GeneratedMain.main(Unknown Source)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at io.quarkus.runner.bootstrap.StartupActionImpl$1.run(StartupActionImpl.java:113)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 10000ms has been exceeded while executing POST /pipeline/feature-extraction/sentence-transformers/all-mpnet-base-v2 for server api-inference.huggingface.co:443

10s seems like a lot for an embedding computation.
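A hedged workaround sketch: raise the Hugging Face client timeout in application.properties. The exact key name here is an assumption and may differ by version, so verify it against the configuration reference:

```properties
# The default appears to be 10s, which the feature-extraction call above exceeds
quarkus.langchain4j.huggingface.timeout=30s
```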

ERROR [io.qua.lan.run.ais.DeclarativeAiServiceBeanDestroyer] (main) Unable to close ...

Have this code:

///usr/bin/env jbang "$0" "$@" ; exit $?
//JAVA 21+
//PREVIEW
//JAVAC_OPTIONS -parameters

//DEPS io.quarkus.platform:quarkus-bom:3.5.1@pom
//DEPS io.quarkiverse.langchain4j:quarkus-langchain4j-openai:0.3.0
//DEPS io.quarkus:quarkus-picocli

//Q:CONFIG quarkus.banner.enabled=false
//Q:CONFIG quarkus.log.level=DEBUG
//Q:CONFIG quarkus.langchain4-openai.timeout=60s

import java.io.IOException;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;
import jakarta.enterprise.context.control.ActivateRequestContext;
import jakarta.inject.Inject;
import picocli.CommandLine.Command;
import picocli.CommandLine.Parameters;

@Command(mixinStandardHelpOptions = true, version = "0.1", header = "Explain usage of a source file using ChatGPT.", description = """
		Uses Quarkus LangChain4j and ChatGPT to explain what
			a source file does in a Quarkus project.

		Note: Be aware the source code is sent to remote server.
		""")
public class magic implements Runnable {

	@Parameters(description = "The question to answer")
	String question;

	@RegisterAiService
	public interface MetaGen {

		@SystemMessage("""
				You are to try and make a program that does what calculation or operation the
				user wants to do, execute the program and validate the result.
				""")
		@UserMessage("{question}")
		String generate(String question);

	}

	@Inject
	MetaGen gpt;

	@Override
	@ActivateRequestContext
	public void run() {

		var result = gpt.generate(question);
		System.out.println(result);

	}

}

I am just trying to do the simplest thing possible, but when I run this I get:

2023-12-10 09:09:13,265 ERROR [io.qua.lan.run.ais.DeclarativeAiServiceBeanDestroyer] (main) Unable to close magic$MetaGen$$QuarkusImpl@82d8fd9

There is no stack trace or anything; it worked fine in 0.1.0.

AI has a broad meaning

Reference to e.g. @RegisterAiService
I suggest renaming AI service, or reconsidering the term AI, since AI is a field of study and has a broad meaning. In this extension's case, it is more of a service for LLMs.

Support a user specific RAG augmentation

Consider how Quarkus Authentication and LLM API Keys will work together.

I'm assuming that if I start a demo with an API key, then in production it will be more like a company-wide API key.
Depending on the nature of the service, that may be enough: for example, if it is a chat bot offered by a bank, then anyone (or any authenticated user, possibly with a specific role) can talk to it.

I'm not quite sure how the trust boundaries will form if there is a requirement to offer user-specific AI support in production: for example, alice or bob may each have their own augmentation documents that must not be shared with anyone else.

I think #41 is very relevant, as well as #79, and #81. For example, a user logs in and possibly approves an LLM scope (#81), then a specific method with a RAG option (#41) requires a specific identity (and possibly role #79) for a user specific document only be fed into the LLM.

I guess a custom RAG store could be a database where the key is the user name and the documents are the values. The store would support request-scoped injection, have @Inject SecurityIdentity identity; and use the identity name to fetch only the documents specific to this user identity. I can give demoing it a try, maybe as part of #81.
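The per-user store sketched in the last paragraph could start as a simple map from user name to documents. This is illustrative stdlib-only code: a real implementation would be backed by a database and would take the user name from the injected SecurityIdentity rather than as a method argument:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: user-scoped augmentation documents, keyed by user name.
class PerUserDocs {
    private final Map<String, List<String>> docsByUser = new HashMap<>();

    void add(String user, String doc) {
        docsByUser.computeIfAbsent(user, k -> new ArrayList<>()).add(doc);
    }

    // Only this user's documents are ever returned for augmentation,
    // so alice's documents can never be fed into bob's prompt.
    List<String> retrieve(String user) {
        return docsByUser.getOrDefault(user, List.of());
    }
}
```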

jbang Dynamic tool

It would probably be cool to have a jbang based CodeExecutionEngine.

@maxandersen did you already mention this or did I hallucinate?

Qute templating error when using Kotlin

I have a weird bug with the Qute template when I try to replicate the example from the quickstart with Kotlin.

I'm using Quarkus 3.6 with Kotlin & Gradle and OpenAI.

I have the following MyAiService. It is basically the one from the example, but in Kotlin, and I removed the email-sending part to simplify the reproducer (code attached).

@RegisterAiService
interface MyAiService {
    @SystemMessage("You are a professional poet")
    @UserMessage("Write a poem about {topic}. The poem should be {lines} lines long.")
    fun writeAPoem(topic: String, lines: Int): String
}

When I try to call the writeAPoem method, I get the following error saying that topic is missing:

io.quarkus.qute.TemplateException: Rendering error: Entry "topic" not found in the data map in expression {topic}
	at io.quarkus.qute.TemplateException$Builder.build(TemplateException.java:169)
	at io.quarkus.qute.EvaluatorImpl.propertyNotFound(EvaluatorImpl.java:234)
	at io.quarkus.qute.EvaluatorImpl.resolve(EvaluatorImpl.java:204)
	at io.quarkus.qute.EvaluatorImpl.resolveReference(EvaluatorImpl.java:131)
	at io.quarkus.qute.EvaluatorImpl.evaluate(EvaluatorImpl.java:85)
	at io.quarkus.qute.ResolutionContextImpl.evaluate(ResolutionContextImpl.java:29)
	at io.quarkus.qute.ExpressionNode.resolve(ExpressionNode.java:36)
	at io.quarkus.qute.SectionNode$SectionResolutionContextImpl.execute(SectionNode.java:228)
	at io.quarkus.qute.SectionHelper$SectionResolutionContext.execute(SectionHelper.java:66)
	at io.quarkus.qute.Parser$1.resolve(Parser.java:1288)
	at io.quarkus.qute.SectionNode.resolve(SectionNode.java:53)
	at io.quarkus.qute.SectionNode.resolve(SectionNode.java:58)
	at io.quarkus.qute.TemplateImpl$TemplateInstanceImpl.renderData(TemplateImpl.java:233)
	at io.quarkus.qute.TemplateImpl$TemplateInstanceImpl.renderAsyncNoTimeout(TemplateImpl.java:224)
	at io.quarkus.qute.TemplateImpl$TemplateInstanceImpl.render(TemplateImpl.java:149)
	at io.quarkiverse.langchain4j.QuarkusPromptTemplateFactory$QuteTemplate.render(QuarkusPromptTemplateFactory.java:61)
	at dev.langchain4j.model.input.PromptTemplate.apply(PromptTemplate.java:73)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupport.prepareUserMessage(MethodImplementationSupport.java:197)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupport.implement(MethodImplementationSupport.java:56)
	at com.arconsis.youtube.quarkus.langchain.services.ai.MyAiService$$QuarkusImpl.writeAPoem(Unknown Source)
	at com.arconsis.youtube.quarkus.langchain.services.ai.MyAiService_y7si7UWyP2GRrk7fUaA8LGsjFd0_Synthetic_ClientProxy.writeAPoem(Unknown Source)
	at com.arconsis.youtube.quarkus.langchain.rest.PoemResource.writePoem(PoemResource.kt:16)
	at com.arconsis.youtube.quarkus.langchain.rest.PoemResource$quarkusrestinvoker$writePoem_eb52e6c497337f520dd6d26c0e6958815f82dfe5.invoke(Unknown Source)
	at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
	at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:141)
	at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
	at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:582)
	at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2513)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1538)
	at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:29)
	at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:29)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:833)

Interestingly, if I remove topic, then the lines parameter works just fine.

Steps to reproduce

  1. Download 2023-12_quarkus-llms.zip
  2. unzip
  3. add your OpenAI API key either in the application.properties or as the env variable OPEN_API_KEY
  4. run ./gradlew quarkusDev
  5. execute curl -L 'localhost:8080/poem?topic=winter%20in%20Germany&lines=19' to trigger a request
  6. observe the error from qute
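One plausible cause (an assumption, not confirmed in the issue): Kotlin does not keep method parameter names in the bytecode by default, so the runtime may be unable to map the topic argument to the {topic} placeholder. Note that the jbang script elsewhere on this page passes //JAVAC_OPTIONS -parameters for the same reason. A sketch of the Gradle setting that preserves parameter names:

```kotlin
// build.gradle.kts: keep parameter names so {topic} and {lines} can be resolved
tasks.withType<org.jetbrains.kotlin.gradle.tasks.KotlinCompile> {
    kotlinOptions {
        javaParameters = true
    }
}
```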

Dev UI support

For starters, I plan to have:

  • View information about declarative AI services and tools
  • Add embeddings into the store, if it exists in the CDI container
  • Search for relevant embeddings and view them as a table

Documentation - Clarify how to use tools

The documentation makes all the concepts extremely clear and, for the most part, you don't need to read anything other than the doc to get set up (congrats on that!).

There's something a bit obscure, though: how the AI services will call tools, and how you should define your tools so that your AI services can actually make use of them.

For instance, let's take this example of the doc:

[source,java]
----
@ApplicationScoped
public class CustomerRepository implements PanacheRepository<Customer> {

    @Tool("get the customer name for the given customerId")
    public String getCustomerName(long id) {
        return find("id", id).firstResult().name;
    }
}
----

  • How is the id parameter linked to the customerId mentioned in the tool description?
  • What will actually trigger the call to this particular tool?

Note that these might be pretty stupid questions for people with knowledge of all this but, given the rest of the doc is extremely useful for understanding the concepts, I was hoping we could also describe how this works.
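To the first question: conceptually, the method's parameter name (here id) and the @Tool description are turned into a function specification that is sent to the model along with the user message; the model then decides when to call the tool. A hand-rolled sketch of roughly what such a specification looks like, purely illustrative and not the extension's actual wire format:

```java
// Illustrative only: builds a JSON-like function spec from a tool's name,
// description, and a single parameter. The real extension derives this from
// the annotated method's signature automatically.
class ToolSpecSketch {

    static String spec(String name, String description, String paramName, String paramType) {
        return "{\"name\":\"" + name + "\","
             + "\"description\":\"" + description + "\","
             + "\"parameters\":{\"type\":\"object\",\"properties\":{"
             + "\"" + paramName + "\":{\"type\":\"" + paramType + "\"}}}}";
    }

    public static void main(String[] args) {
        System.out.println(spec("getCustomerName",
                "get the customer name for the given customerId", "id", "integer"));
    }
}
```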

Allow injection of EmbeddingStore regardless of its type

It should be possible to inject EmbeddingStore:

@Inject EmbeddingStore store;

At the moment, it needs to be RedisEmbeddingStore or ChromaEmbeddingStore.
This is not homogeneous with EmbeddingModel, which is injected as @Inject EmbeddingModel model regardless of the implementation.

Add an OpenId Connect demo where a user is required to approve the `LLM` scope

I propose to demo, and also recommend, the following setup: a user logs in to a frontend Quarkus application (say, Quarkus LLM) which talks to a microservice that uses the LLM. When the user logs in to the frontend, they are redirected to the OIDC provider, where they are asked to allow Quarkus LLM to apply the Large Language Model to whatever Quarkus LLM is expected to solve or do. Once the user approves and logs in, Quarkus LLM will propagate the access token to the microservice, which can only be accessed when the token has an LLM scope/permission.

Reduce the number of warnings when compiling to native

  1. Take one of the integration tests with an in-process embedding model
  2. Look at the log:
Warning: Could not resolve class sun.font.FontConfigManager for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.font.FontConfigManager.
Warning: Could not resolve class sun.font.FontConfigManager$FcCompFont for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.font.FontConfigManager$FcCompFont.
Warning: Could not resolve class sun.font.FontConfigManager$FontConfigFont for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.font.FontConfigManager$FontConfigFont.
Warning: Could not resolve class sun.font.FontConfigManager$FontConfigInfo for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.font.FontConfigManager$FontConfigInfo.
Warning: Could not resolve class sun.awt.X11FontManager for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.awt.X11FontManager.
Warning: Could not resolve class sun.awt.X11GraphicsConfig for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.awt.X11GraphicsConfig.
Warning: Could not resolve class sun.awt.X11GraphicsDevice for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.awt.X11GraphicsDevice.
Warning: Could not resolve class sun.java2d.xr.XRSurfaceData for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.java2d.xr.XRSurfaceData.
Warning: Could not resolve class sun.awt.X11.XToolkit for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.awt.X11.XToolkit.
Warning: Could not resolve class sun.awt.X11.XErrorHandlerUtil for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.awt.X11.XErrorHandlerUtil.
Warning: Could not resolve class software.amazon.awssdk.core.internal.interceptor.HttpChecksumRequiredInterceptor for reflection configuration. Reason: java.lang.ClassNotFoundException: software.amazon.awssdk.core.internal.interceptor.HttpChecksumRequiredInterceptor.
Warning: Could not resolve class org.apache.commons.logging.impl.Jdk14Logger for reflection configuration. Reason: java.lang.ClassNotFoundException: org.apache.commons.logging.impl.Jdk14Logger.
Warning: Could not resolve class org.apache.commons.logging.impl.Log4JLogger for reflection configuration. Reason: java.lang.ClassNotFoundException: org.apache.commons.logging.impl.Log4JLogger.
Warning: Could not resolve class org.apache.commons.logging.impl.LogFactoryImpl for reflection configuration. Reason: java.lang.ClassNotFoundException: org.apache.commons.logging.impl.LogFactoryImpl.
Warning: Could not resolve class org.apache.commons.logging.impl.WeakHashtable for reflection configuration. Reason: java.lang.ClassNotFoundException: org.apache.commons.logging.impl.WeakHashtable.
Warning: Could not resolve class sun.awt.X11.XToolkit for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.awt.X11.XToolkit.
Warning: Could not resolve class sun.awt.X11FontManager for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.awt.X11FontManager.
Warning: Could not resolve class sun.awt.X11GraphicsEnvironment for reflection configuration. Reason: java.lang.ClassNotFoundException: sun.awt.X11GraphicsEnvironment.

PgVector processor needs to produce EmbeddingStoreBuildItem (or find a completely different approach)

We produce EmbeddingStoreBuildItem when a processor creates a bean for an embedding store. This is for the Langchain4jDevUIProcessor to be able to tell whether to register the page for working with embedding stores and its JSON-RPC backend.

Or, perhaps as an improvement, we could use something more akin to SelectedEmbeddingModelCandidateBuildItem for embedding stores too, but it would somehow need to be produced earlier. SelectedEmbeddingModelCandidateBuildItem is produced after BeanDiscoveryFinishedBuildItem exists, which is too late, because we need to react to it by potentially registering the JSON-RPC bean, and that isn't possible if bean discovery is already finished.

Feature: Support Tools for Huggingface models

I just tried to use tools with Hugging Face, but it seems they are not yet supported. I got this error:

 java.lang.IllegalArgumentException: Tools are currently not supported for HuggingFace models
	at io.quarkiverse.langchain4j.huggingface.QuarkusHuggingFaceChatModel.generate(QuarkusHuggingFaceChatModel.java:108)
	at dev.langchain4j.model.chat.ChatLanguageModel_XNMsOaekknG7BdNZ5YSUkjh1SqE_Synthetic_ClientProxy.generate(Unknown Source)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupport.implement(MethodImplementationSupport.java:104)
	at com.arconsis.youtube.quarkus.langchain.services.ai.MyAiService$$QuarkusImpl.findUserLocations(Unknown Source)
	at com.arconsis.youtube.quarkus.langchain.services.ai.MyAiService_y7si7UWyP2GRrk7fUaA8LGsjFd0_Synthetic_ClientProxy.findUserLocations(Unknown Source)
	at com.arconsis.youtube.quarkus.langchain.rest.EmployeesResource.writePoem(EmployeesResource.java:19)
	at com.arconsis.youtube.quarkus.langchain.rest.EmployeesResource$quarkusrestinvoker$writePoem_71e6bb6015aacbbc37e37d75924cc107b689db1a.invoke(Unknown Source)
	at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
	at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:141)
	at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
	at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:582)
	at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2513)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1538)
	at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:29)
	at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:29)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:1583)

It would be great to also have support for tools with Hugging Face.

Allow declaring retriever on method directly

Adding the RAG retriever to the class means that it is used for all methods. However, some methods may need additional documents while others do not require RAG at all (which would cost multiple network calls for nothing).

Tokens not being processed when specifying user message as input parameter

If I supply a user message containing tokens as an input parameter, the tokens are not processed.

@SystemMessage("You are a marvel comics writer, expert in all sorts of super heroes and super villains.")
String narrate(@UserMessage String userMessage, Fight fight);

default String narrate(Fight fight) {
    try {
        var userMessage = Files.readString(Path.of(getClass().getClassLoader().getResource("usermessage.txt").toURI()));
        return narrate(userMessage, fight);
    }
    catch (IOException | URISyntaxException e) {
        throw new RuntimeException(e);
    }
}

produces

In the heart of the bustling city, amidst towering skyscrapers and bustling streets, a fierce battle unfolded. The air crackled with anticipation as {fight.winnerName}, a formidable force for justice, faced off against {fight.loserName}, a cunning mastermind of chaos. The clash of powers was about to commence, and the fate of the city hung in the balance.

With a flash of lightning, {fight.winnerName} unleashed their incredible {fight.winnerPowers}, harnessing the elements themselves. The ground trembled beneath their feet as they summoned a mighty gust of wind, sending debris swirling through the air. Their level of mastery over their powers was unmatched, and it showed in every move they made.

But {fight.loserName} was no ordinary adversary. With a sly grin, they tapped into their own unique {fight.loserPowers}, manipulating shadows and illusions with a mischievous flair. Their level of trickery and deception was unparalleled, making it difficult for even the most astute heroes to discern reality from illusion.

As the battle raged on, the clash of powers illuminated the night sky, casting an ethereal glow over the city. {fight.winnerName} fought with unwavering determination, their powers reaching new heights as they tapped into their inner strength. With each strike, they pushed themselves further, refusing to let evil prevail.

In the end, it was {fight.winnerName} who emerged victorious. Their unwavering spirit and mastery over their {fight.winnerPowers} proved to be the deciding factor. With a final surge of energy, they delivered a decisive blow, incapacitating {fight.loserName} and restoring peace to the city once more. The citizens rejoiced, grateful for the unwavering heroism of {fight.winnerName} and the triumph of good over evil.

+++++
Name: {fight.winnerName}
Powers: {fight.winnerPowers}
Level: {fight.winnerLevel}
+++++

+++++
Name: {fight.loserName}
Powers: {fight.loserPowers}
Level: {fight.loserLevel}
+++++

where the contents of usermessage.txt is

Narrate the fight between a super hero and a super villain.

During the narration, don't repeat "super hero" or "super villain".

Write 4 paragraphs maximum. Be creative.

The narration must be:
- G rated
- Workplace/family safe
- No sexism, racism, or other bias/bigotry

Here is the data you will use for the winner:

+++++
Name: {fight.winnerName}
Powers: {fight.winnerPowers}
Level: {fight.winnerLevel}
+++++

Here is the data you will use for the loser:

+++++
Name: {fight.loserName}
Powers: {fight.loserPowers}
Level: {fight.loserLevel}
+++++

Here is the data you will use for the fight:

+++++
{fight.winnerName} who is a {fight.winnerTeam} has won the fight against {fight.loserName} who is a {fight.loserTeam}.

The fight took place in {fight.location.name}, which can be described as {fight.location.description}.
+++++

whereas when I have this:

  @SystemMessage("You are a marvel comics writer, expert in all sorts of super heroes and super villains.")
  @UserMessage("""
    Narrate the fight between a super hero and a super villain.

    During the narration, don't repeat "super hero" or "super villain".
    
    Write 4 paragraphs maximum. Be creative.
    
    The narration must be:
    - G rated
    - Workplace/family safe
    - No sexism, racism, or other bias/bigotry
    
    Here is the data you will use for the winner:
    
    +++++
    Name: {fight.winnerName}
    Powers: {fight.winnerPowers}
    Level: {fight.winnerLevel}
    +++++
    
    Here is the data you will use for the loser:
    
    +++++
    Name: {fight.loserName}
    Powers: {fight.loserPowers}
    Level: {fight.loserLevel}
    +++++
    
    Here is the data you will use for the fight:
    
    +++++
    {fight.winnerName} who is a {fight.winnerTeam} has won the fight against {fight.loserName} who is a {fight.loserTeam}.
    
    The fight took place in {fight.location.name}, which can be described as {fight.location.description}.
    +++++
    """)
  String narrate(Fight fight);

I get this:

In the gritty streets of Gotham City, a clash of epic proportions unfolded. Han Solo, a hero known for his sharpshooting skills and skepticism towards the force, faced off against Storm Trooper, a villain armed with nothing more than a small gun. The odds seemed stacked against the Storm Trooper, but he was determined to prove his worth.

As the battle commenced, Han Solo swiftly dodged the Storm Trooper's feeble shots, his agility and experience shining through. With a smirk on his face, Han Solo aimed his big gun with precision, firing shots that echoed through the city. The Storm Trooper stumbled, his small gun no match for the firepower of his opponent.

Undeterred, the Storm Trooper fought back with unwavering determination. He maneuvered through the chaos, attempting to outsmart Han Solo. But the hero's quick reflexes and strategic thinking proved to be too much for the Storm Trooper to handle. With each passing moment, Han Solo's level of expertise became more evident.

In a final, decisive move, Han Solo disarmed the Storm Trooper, leaving him defenseless and defeated. The hero's victory was celebrated by the citizens of Gotham City, who witnessed the triumph of justice over villainy. Han Solo, with his unwavering resolve and unmatched skills, had once again proven himself as a true hero in the face of adversity.
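Until the parameter-passing case works, a stdlib-only check (a hypothetical helper, not part of the extension) can at least detect when placeholders such as {fight.winnerName} came back unrendered, as in the first output above:

```java
import java.util.regex.Pattern;

// Illustrative helper: detects template placeholders like {fight.winnerName}
// that were passed through to the model unrendered.
class TokenCheck {
    private static final Pattern TOKEN = Pattern.compile("\\{\\w+(?:\\.\\w+)*\\}");

    static boolean hasUnrenderedTokens(String text) {
        return TOKEN.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasUnrenderedTokens("A blow from {fight.winnerName}")); // true
        System.out.println(hasUnrenderedTokens("A blow from Han Solo"));           // false
    }
}
```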

Compile time check if chat memory is required

I just tried the example from the guide and forgot to add the chat memory / chat memory provider. The code compiles and quarkusDev starts, but when I try to use it, it fails with:

Caused by: dev.langchain4j.exception.IllegalConfigurationException: Please set up chatMemory or chatMemoryProvider in order to use tools. A ChatMemory that can hold at least 3 messages is required for the tools to work properly. While the LLM can technically execute a tool without chat memory, if it only receives the result of the tool's execution without the initial message from the user, it won't interpret the result properly.
	at dev.langchain4j.exception.IllegalConfigurationException.illegalConfiguration(IllegalConfigurationException.java:12)
	at dev.langchain4j.service.AiServices.performBasicValidation(AiServices.java:325)

It would be great if the extension could detect this at build time to avoid runtime issues.

Ability to serve a default response?

Since integrating with OpenAI costs real money, should there be a way to serve a default response when I want to "turn off" the integration at runtime? Maybe when I'm running in dev mode or doing a quick demo, I don't want it to make "real" calls to an OpenAI provider.
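One stdlib-only way to sketch the idea (in Quarkus this would more naturally be a CDI alternative or a mock; all names here are made up): hide the call behind an interface and switch to a canned response when real calls are disabled:

```java
// Illustrative only: serve a canned response when real AI calls are disabled,
// e.g. in dev mode or a demo. Names and the selection mechanism are made up.
class DefaultResponseSketch {

    interface ChatService {
        String ask(String question);
    }

    // Selects the real service or a canned fallback based on a flag.
    static ChatService create(boolean realCallsEnabled, ChatService real) {
        return realCallsEnabled ? real : q -> "(canned) The AI integration is disabled.";
    }

    public static void main(String[] args) {
        ChatService real = q -> "real answer from the provider";
        System.out.println(create(false, real).ask("hi")); // canned, no cost
        System.out.println(create(true, real).ask("hi"));  // real call
    }
}
```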

Proper CDI scope for conversation

Memory is required to keep track of the previous messages in a conversation. It's also required when using tools.

This issue is about addressing two concerns:

  • how to configure the memory size: even a simple message count may not be enough, as the context window could be exceeded (especially when using RAG), and eviction is tricky
  • how to properly define the boundary of the memory (like the web socket session)

Of course, the memory storage must be per conversation / user, not shared by the whole application.

An idea is to define a new CDI scope that would handle the second issue.
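The per-conversation keying is the part a dedicated CDI scope would formalize; the size bound shows why a plain message count is only a naive eviction policy. A plain-Java sketch (no CDI involved, names are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Memory is keyed per conversation (the "scope" boundary), and each
// conversation's history is bounded by a message count.
public class ConversationMemory {

    private final int maxMessages;
    private final Map<String, Deque<String>> byConversation = new ConcurrentHashMap<>();

    public ConversationMemory(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    public void add(String conversationId, String message) {
        Deque<String> history = byConversation.computeIfAbsent(conversationId, id -> new ArrayDeque<>());
        history.addLast(message);
        while (history.size() > maxMessages) {
            history.removeFirst(); // evict the oldest message
        }
    }

    public Deque<String> history(String conversationId) {
        return byConversation.getOrDefault(conversationId, new ArrayDeque<>());
    }

    // End of conversation (e.g. web socket session closed): drop the memory.
    public void close(String conversationId) {
        byConversation.remove(conversationId);
    }

    public static void main(String[] args) {
        ConversationMemory memory = new ConversationMemory(2);
        memory.add("alice", "Hi");
        memory.add("alice", "Tell me about accounts");
        memory.add("alice", "More please"); // evicts "Hi"
        System.out.println(memory.history("alice"));
    }
}
```

A CDI scope bound to the conversation would make the close() call automatic, which is exactly the boundary problem described above.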

AIService interaction point

The idea is to leverage AI services to provide a declarative interface like REST Client.
You would describe the interaction point with the LLM using an interface and annotations.

For example:

@RegisterAiService(
    name = ...            // Optional - overrides the config key; could also be named `configKey`
    chatModel = ...       // String - the chat model (or streaming chat model) identified by name. If not set, use the default one (validated and set at build time)
    tools = ...           // List<String> - the bean identifiers providing tools (validated at build time); if not set, all tools are available
    chatMemory = ...      // String - the bean identifier for the chat memory; if not set, use the default one
    moderationModel = ... // String - the bean identifier for the moderation model; if not set, no moderation is applied
    retriever = ...       // String - the bean identifier of the RAG retriever
)
public interface MyAiService { 
  // ...
}

All the attributes are optional, meaning that the following snippet would use sensible defaults:

@RegisterAiService
public interface MyAiService { 
  // ...
}

All the configuration attributes can be set in application.properties using the quarkus.aiservices.$name.attr=value syntax (the prefix is not decided yet).
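For illustration only (the prefix is explicitly not decided yet, and the attribute names merely mirror the annotation sketch above), the configuration could look like:

```properties
# Hypothetical syntax; neither the prefix nor the attribute names are final.
quarkus.aiservices.my-service.chat-model=my-chat-model
quarkus.aiservices.my-service.tools=weatherTool,calendarTool
quarkus.aiservices.my-service.chat-memory=my-chat-memory
```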

While the skeleton will be doable at build time, it will not be possible to initialize everything at build time, as the RAG may connect to the store (it might be interesting to preload the in-memory store at build time, but in the general case, it would not work).

It should be possible to use fault-tolerance annotations on the AiService methods (timeout, retry, or even circuit breaker...).
If OTel is available, each method would be timed and counted automatically. The outcome would also be monitored.
If an audit service is available (See #12), each method invocation will be audited.

Other extensions:

  • Tools category - in a bean providing tools, we may need security features (authentication) or a way to identify the tool category. Specific processing (such as authentication) could then be applied before calling the tool. That also means we need tool interceptors; we can reuse CDI interceptors. However, we may need access to the context (both the conversational context and the duplicated context).

Introduce Git Incremental Builder

Given that the native tests take a while to run, it would be nice to introduce a setup similar to Quarkus, where GIB (gitflow-incremental-builder) ensures that only the affected modules are built.

cc @famod

RegisterAiService is locked down to ChatLanguageModel

Hello,

Currently, RegisterAiService cannot find or be configured to use a StreamingChatLanguageModel; it is locked to a blocking ChatLanguageModel.
It would be nice to support streaming here, to allow writing chat bots with providers that support streaming responses.

Regards

ChatMemory is persisted over request boundaries

I was looking a bit more into the examples from the guides page, especially the chat memory management (https://docs.quarkiverse.io/quarkus-langchain4j/dev/ai-services.html#memory). Unfortunately, I got a bit confused when I tested it, as it seems to behave quite differently from what I initially expected. Hopefully you can help clarify things.

In the example code, the ChatMemoryBean is created as a RequestScoped bean. However, when you use it, its get method is only called once, with default as the memoryId. At the end of the first request, its close method is also called (as expected) and the memories map is freed. However, when you make a second request, the bean is not used any more and no new memory is created. Instead, all subsequent requests just use the same memory.

Is this expected? If yes, why is the bean even request scoped?

One thing I noticed is that the get method is called by dev.langchain4j.service.AiServiceContext. And AiServiceContext itself holds a map that does basically the same thing as ChatMemoryBean, in the sense that it maps memory ids to ChatMemory instances. I think this is where the memory lives longer than the request scope.

see also https://quarkusio.zulipchat.com/#narrow/stream/187030-users/topic/Quarkus.20Langchain.3A.20Chat.20Memory.20Management.20Questions

Allow declaring tools on methods

Declaring tools on the service currently gives all of its methods access to all of the declared tools. However, you may want to control which method can access which tools.

Using higher-level objects in template not working

Since the user message is processed as a Qute template, I figured I could use higher-level objects, but it doesn't seem to work.

  @SystemMessage("You are a marvel comics writer, expert in all sorts of super heroes and super villains.")
  @UserMessage("""
    Narrate the fight between a super hero and a super villain.
    
    During the narration, don't repeat "super hero" or "super villain". We know who is who.
    
    Write 4 paragraphs maximum. Be creative.
    
    The narration must be:
    - G rated
    - Workplace/family safe
    - No sexism, racism, or other bias/bigotry
    
    Here is the data you will use for the winner:
    
    +++++
    Name: {fight.winnerName}
    Powers: {fight.winnerPowers}
    Level: {fight.winnerLevel}
    +++++
    
    Here is the data you will use for the loser:
    
    +++++
    Name: {fight.loserName}
    Powers: {fight.loserPowers}
    Level: {fight.loserLevel}
    +++++
    
    Here is the data you will use for the fight:
    
    +++++
    {fight.winnerName} who is a {fight.winnerTeam} has won the fight against {fight.loserName} who is a {fight.loserTeam}.
    
    The fight took place in {fight.location.name}, which can be described as {fight.location.description}.
    +++++
    """)
  @Fallback(fallbackMethod = "narrateFallback")
  String narrate(@SpanAttribute("arg.fight") Fight fight);

results in the following stack trace.

I did try debugging, and when I put a breakpoint on QuarkusPromptTemplateFactory.QuteTemplate.render, it doesn't seem that the attribute from the method signature is present in the list of variables.


14:42:11 ERROR [io.qu.la.ru.ai.AiServiceMethodImplementationSupport] (main) Execution of io.quarkus.sample.superheroes.narration.service.LangchainNarrationService#narrate failed: io.quarkus.qute.TemplateException
	at io.quarkus.qute.CompletedStage.get(CompletedStage.java:65)
	at io.quarkus.qute.MultiResultNode.process(MultiResultNode.java:20)
	at io.quarkus.qute.TemplateImpl$TemplateInstanceImpl.lambda$renderData$5(TemplateImpl.java:239)
	at io.quarkus.qute.CompletedStage.whenComplete(CompletedStage.java:285)
	at io.quarkus.qute.TemplateImpl$TemplateInstanceImpl.renderData(TemplateImpl.java:233)
	at io.quarkus.qute.TemplateImpl$TemplateInstanceImpl.renderAsyncNoTimeout(TemplateImpl.java:224)
	at io.quarkus.qute.TemplateImpl$TemplateInstanceImpl.render(TemplateImpl.java:149)
	at io.quarkiverse.langchain4j.QuarkusPromptTemplateFactory$QuteTemplate.render(QuarkusPromptTemplateFactory.java:61)
	at dev.langchain4j.model.input.PromptTemplate.apply(PromptTemplate.java:73)
	at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.prepareUserMessage(AiServiceMethodImplementationSupport.java:248)
	at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.doImplement(AiServiceMethodImplementationSupport.java:89)
	at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.implement(AiServiceMethodImplementationSupport.java:70)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1$1.apply(MethodImplementationSupportProducer.java:31)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1$1.apply(MethodImplementationSupportProducer.java:28)
	at io.quarkiverse.langchain4j.runtime.aiservice.MetricsWrapper$2.get(MetricsWrapper.java:42)
	at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:69)
	at io.quarkiverse.langchain4j.runtime.aiservice.MetricsWrapper.wrap(MetricsWrapper.java:39)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1$2.apply(MethodImplementationSupportProducer.java:40)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1$2.apply(MethodImplementationSupportProducer.java:37)
	at io.quarkiverse.langchain4j.runtime.aiservice.SpanWrapper.wrap(SpanWrapper.java:54)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1$2.apply(MethodImplementationSupportProducer.java:40)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1$2.apply(MethodImplementationSupportProducer.java:37)
	at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1.implement(MethodImplementationSupportProducer.java:46)
	at io.quarkus.sample.superheroes.narration.service.LangchainNarrationService$$QuarkusImpl.narrate(Unknown Source)
	at io.quarkus.sample.superheroes.narration.service.LangchainNarrationService_HgURQ-FqZr4uD7EFDPHkzI-Z0Cg_Synthetic_ClientProxy.narrate(Unknown Source)
	at io.quarkus.sample.superheroes.narration.service.LangchainNarrationServiceTests.narrate(LangchainNarrationServiceTests.java:18)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at io.quarkus.test.junit.QuarkusTestExtension.runExtensionMethod(QuarkusTestExtension.java:1013)
	at io.quarkus.test.junit.QuarkusTestExtension.interceptTestMethod(QuarkusTestExtension.java:827)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
	at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
	at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
	at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:198)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:169)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:93)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:58)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:141)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:57)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:103)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:85)
	at org.junit.platform.launcher.core.DelegatingLauncher.execute(DelegatingLauncher.java:47)
	at org.junit.platform.launcher.core.SessionPerRequestLauncher.execute(SessionPerRequestLauncher.java:63)
	at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:57)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
	at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:232)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:55)
Caused by: io.quarkus.qute.TemplateException: Rendering error: Entry "fight" not found in the data map in expression {fight.winnerName}
	at io.quarkus.qute.TemplateException$Builder.build(TemplateException.java:169)
	at io.quarkus.qute.EvaluatorImpl.propertyNotFound(EvaluatorImpl.java:234)
	at io.quarkus.qute.EvaluatorImpl.resolve(EvaluatorImpl.java:204)
	at io.quarkus.qute.EvaluatorImpl.resolveReference(EvaluatorImpl.java:131)
	at io.quarkus.qute.EvaluatorImpl.lambda$resolveReference$2(EvaluatorImpl.java:135)
	at io.quarkus.qute.CompletedStage.thenCompose(CompletedStage.java:249)
	at io.quarkus.qute.EvaluatorImpl.resolveReference(EvaluatorImpl.java:135)
	at io.quarkus.qute.EvaluatorImpl.evaluate(EvaluatorImpl.java:85)
	at io.quarkus.qute.ResolutionContextImpl.evaluate(ResolutionContextImpl.java:29)
	at io.quarkus.qute.ExpressionNode.resolve(ExpressionNode.java:36)
	at io.quarkus.qute.SectionNode$SectionResolutionContextImpl.execute(SectionNode.java:228)
	at io.quarkus.qute.SectionHelper$SectionResolutionContext.execute(SectionHelper.java:66)
	at io.quarkus.qute.Parser$1.resolve(Parser.java:1288)
	at io.quarkus.qute.SectionNode.resolve(SectionNode.java:53)
	at io.quarkus.qute.SectionNode.resolve(SectionNode.java:58)

I also tried adding {@io.quarkus.sample.superheroes.narration.Fight fight} as the first line in the template (as described in https://quarkus.io/guides/qute#template-parameter-declaration-inside-the-template-itself), but that didn't work either.

ChatBot responses are a little bit slow

Sorry if this is a duplicate or just requires correct memory usage; I'm just trying to catch up with some demos.

So, when I went to localhost:8080 in the chat bot sample demo, I was not sure what to ask, so I started with:

"What should I do next" - it thought for about 10-11 seconds and offered a summary of the available accounts, mentioning the savings account among other options.

My next question was:
"Please tell me more about the savings account" - it thought a little longer, maybe 13-14 seconds, and then responded.

Then, if I stop and start the demo again, the same 10/13 second delay is observed when I ask exactly the same questions. Even though this is a restart, I thought that, given that the questions were already asked with the same API key, the model should already have picked them up somehow.

I guess this is about correctly configuring the memory managed by Redis (or Infinispan), or maybe adding some parameters to the OpenAI builder to instruct it to remember some of its own responses?

It would be good if, after a restart of the demo, a much faster response were produced (especially if exactly the same question is asked by the same authenticated user), though I appreciate that this may be difficult to achieve.

Thanks

Allow for multiple providers in the same application

Currently, each provider (i.e. OpenAI, Azure OpenAI, etc.) lives in a separate extension, and those extensions can't co-exist within the same application. It should be feasible to have an application that uses multiple providers.

Think hybrid cloud: via configuration, the application could use Azure OpenAI when deployed on Azure, but plain OpenAI when deployed on-prem. It would be runtime configuration that "switches" the provider (not build time).

It could also be possible, if one provider is unavailable or errors out, for the application to try again using an alternate provider.
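In plain Java, the runtime switch plus fallback could be sketched as follows (the provider names and the lookup map are illustrative, not an actual extension API):

```java
import java.util.Map;
import java.util.function.Function;

// Sketch of runtime provider selection: the model used by the application
// is picked from configuration at startup (or per request), with a
// fallback tried when the primary provider fails.
public class ProviderSwitch {

    static String chat(Map<String, Function<String, String>> providers,
                       String primary, String fallback, String prompt) {
        try {
            return providers.get(primary).apply(prompt);
        } catch (RuntimeException e) {
            // primary unavailable or errored out: retry with the alternate
            return providers.get(fallback).apply(prompt);
        }
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> providers = Map.of(
                "azure-openai", p -> { throw new RuntimeException("unreachable"); },
                "openai", p -> "openai says: " + p);

        // "azure-openai" fails, so the call transparently falls back.
        System.out.println(chat(providers, "azure-openai", "openai", "hi"));
    }
}
```

In the extension, the map would be replaced by CDI beans contributed by each provider extension, selected by runtime configuration rather than at build time.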
