rust-genai's Introduction

genai - Multi-AI Providers Library for Rust.

Currently supports natively: Ollama, OpenAI, Anthropic, Groq, Gemini, Cohere (more to come)

# Cargo.toml
genai = "=0.1.4" # Version lock for `0.1.x`

The goal of this library is to provide a common and ergonomic single API to many generative AI Providers, such as OpenAI, Anthropic, Cohere, Ollama.

  • IMPORTANT 1: 0.1.x will still have some breaking changes in patch releases, so make sure to lock your version, e.g., genai = "=0.1.4". In short, 0.1.x can be considered "beta releases." Version 0.2.x will follow semver more strictly.

  • IMPORTANT 2: genai is focused on normalizing chat completion APIs across AI providers and is not intended to be a full representation of any given AI provider. For that, there are excellent libraries such as async-openai for OpenAI and ollama-rs for Ollama.

Examples | Thanks | Library Focus | Changelog | ChatRequestOptions Provider Mapping

Examples

examples/c00-readme.rs

use genai::chat::{ChatMessage, ChatRequest};
use genai::utils::{print_chat_stream, PrintChatStreamOptions};
use genai::Client;

const MODEL_OPENAI: &str = "gpt-4o-mini";
const MODEL_ANTHROPIC: &str = "claude-3-haiku-20240307";
const MODEL_COHERE: &str = "command-light";
const MODEL_GEMINI: &str = "gemini-1.5-flash-latest";
const MODEL_GROQ: &str = "gemma-7b-it";
const MODEL_OLLAMA: &str = "gemma:2b"; // sh: `ollama pull gemma:2b`

// NOTE: These are the default environment key names for each AI Adapter Type.
//       They can be customized; see `examples/c02-auth.rs`.
const MODEL_AND_KEY_ENV_NAME_LIST: &[(&str, &str)] = &[
	// -- de/activate models/providers
	(MODEL_OPENAI, "OPENAI_API_KEY"),
	(MODEL_ANTHROPIC, "ANTHROPIC_API_KEY"),
	(MODEL_COHERE, "COHERE_API_KEY"),
	(MODEL_GEMINI, "GEMINI_API_KEY"),
	(MODEL_GROQ, "GROQ_API_KEY"),
	(MODEL_OLLAMA, ""),
];

// NOTE: Model to AdapterKind (AI Provider) type mapping rule
//  - starts_with "gpt"      -> OpenAI
//  - starts_with "claude"   -> Anthropic
//  - starts_with "command"  -> Cohere
//  - starts_with "gemini"   -> Gemini
//  - model in Groq models   -> Groq
//  - For anything else      -> Ollama
//
// Can be customized, see `examples/c03-kind.rs`

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
	let question = "Why is the sky red?";

	let chat_req = ChatRequest::new(vec![
		// -- Messages (de/activate to see the differences)
		ChatMessage::system("Answer in one sentence"),
		ChatMessage::user(question),
	]);

	let client = Client::default();

	let print_options = PrintChatStreamOptions::from_print_events(false);

	for (model, env_name) in MODEL_AND_KEY_ENV_NAME_LIST {
		// Skip if the required environment variable is not set
		if !env_name.is_empty() && std::env::var(env_name).is_err() {
			println!("===== Skipping model: {model} (env var not set: {env_name})");
			continue;
		}

		let adapter_kind = client.resolve_model_info(model)?.adapter_kind;

		println!("\n===== MODEL: {model} ({adapter_kind}) =====");

		println!("\n--- Question:\n{question}");

		println!("\n--- Answer:");
		let chat_res = client.exec_chat(model, chat_req.clone(), None).await?;
		println!("{}", chat_res.content_text_as_str().unwrap_or("NO ANSWER"));

		println!("\n--- Answer: (streaming)");
		let chat_res = client.exec_chat_stream(model, chat_req.clone(), None).await?;
		print_chat_stream(chat_res, Some(&print_options)).await?;

		println!();
	}

	Ok(())
}

More Examples

Thanks

Library Focus:

  • Focuses on standardizing chat completion APIs across major AI Services.

  • Native implementation, meaning no per-service SDKs.

    • Reason: While there are some variations between the various APIs, they all follow the same pattern, high-level flow, and constructs. Managing the differences at a lower layer is actually simpler and more cumulative across services than doing per-SDK gymnastics.
  • Prioritizes ergonomics and commonality, with depth being secondary. (If you require a complete client API, consider using async-openai and ollama-rs; they are both excellent and easy to use.)

  • Initially, this library will focus mostly on the text chat API (no images or function calling in the first stage).

  • The 0.1.x versions will work, but the APIs will change across patch versions, not strictly following semver.

  • Version 0.2.x will follow semver more strictly.

ChatRequestOptions

| Property    | OpenAI      | Anthropic                 | Ollama      | Groq        | Gemini                           | Cohere      |
|-------------|-------------|---------------------------|-------------|-------------|----------------------------------|-------------|
| temperature | temperature | temperature               | temperature | temperature | generationConfig.temperature     | temperature |
| max_tokens  | max_tokens  | max_tokens (default 1024) | max_tokens  | max_tokens  | generationConfig.maxOutputTokens | max_tokens  |
| top_p       | top_p       | top_p                     | top_p       | top_p       | generationConfig.topP            | p           |
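
As a rough illustration of how these options map in practice, here is a minimal sketch of setting them on a request. It assumes ChatRequestOptions exposes these as public Option fields and implements Default, and that exec_chat accepts the options as its third argument (the README example passes None there); check the crate docs for the current API.

use genai::chat::{ChatMessage, ChatRequest, ChatRequestOptions};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
	let client = Client::default();

	let chat_req = ChatRequest::new(vec![ChatMessage::user("Tell me a one-line joke.")]);

	// Assumed public `Option` fields mirroring the mapping table above; each
	// provider adapter translates them to its native property (e.g., Gemini's
	// `generationConfig.temperature`, Cohere's `p` for `top_p`).
	let options = ChatRequestOptions {
		temperature: Some(0.7),
		max_tokens: Some(512),
		top_p: Some(0.9),
		..Default::default() // assumes `ChatRequestOptions: Default`
	};

	// Same `exec_chat` as in the README example, with options passed instead of `None`.
	let chat_res = client.exec_chat("gpt-4o-mini", chat_req, Some(&options)).await?;
	println!("{}", chat_res.content_text_as_str().unwrap_or("NO ANSWER"));

	Ok(())
}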

MetaUsage

| Property      | OpenAI (usage.)   | Ollama (usage.)       | Groq (x_groq.usage.) | Anthropic (usage.)    | Gemini (usageMetadata.)  | Cohere (meta.tokens.) |
|---------------|-------------------|-----------------------|----------------------|-----------------------|--------------------------|-----------------------|
| input_tokens  | prompt_tokens     | prompt_tokens (1)     | prompt_tokens        | input_tokens (added)  | promptTokenCount (2)     | input_tokens          |
| output_tokens | completion_tokens | completion_tokens (1) | completion_tokens    | output_tokens (added) | candidatesTokenCount (2) | output_tokens         |
| total_tokens  | total_tokens      | total_tokens (1)      | completion_tokens    | (computed)            | totalTokenCount (2)      | (computed)            |

Note (1): At this point, Ollama does not emit input/output tokens when streaming, due to a limitation of the Ollama OpenAI-compatibility layer. (See ollama #4448 - Streaming Chat Completion via OpenAI API should support stream option to include Usage.)

Note (2): Right now, with the Gemini streaming API, it is not entirely clear whether the usage reported on each event is cumulative or needs to be summed. It currently appears to be cumulative (i.e., the last message carries the total input, output, and total token counts), so that is the working assumption. See the linked tweet answer for more info.
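
Putting the table and notes together, here is a sketch of reading the normalized usage after enabling capture_usage (the option defined in the ChatRequestOptions struct shown in the issues below). The usage accessor on ChatResponse and the Default impl are assumptions; the MetaUsage field names come from the Property column above.

use genai::chat::{ChatMessage, ChatRequest, ChatRequestOptions};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
	let client = Client::default();
	let chat_req = ChatRequest::new(vec![ChatMessage::user("Why is the sky blue?")]);

	// Ask the adapter to capture the normalized `MetaUsage` in the `ChatResponse`.
	let options = ChatRequestOptions {
		capture_usage: Some(true),
		..Default::default() // assumes `ChatRequestOptions: Default`
	};

	let chat_res = client.exec_chat("gpt-4o-mini", chat_req, Some(&options)).await?;

	// Assumed accessor: a `usage` field carrying the normalized counts from
	// the `Property` column above (input_tokens / output_tokens / total_tokens).
	if let Some(usage) = chat_res.usage {
		println!(
			"input: {:?}, output: {:?}, total: {:?}",
			usage.input_tokens, usage.output_tokens, usage.total_tokens
		);
	}

	Ok(())
}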

Notes on Possible Direction

  • Will add more data on ChatResponse and ChatStream, especially metadata about usage.
  • Add vision/image support to chat messages and responses.
  • Add function calling support to chat messages and responses.
  • Add embed and embed_batch
  • Add the AWS Bedrock variants (e.g., Mistral and Anthropic). Most of the work will be on AWS's "interesting" token signature scheme (without having to drag in big SDKs; this might be gated behind a cargo feature).
  • Add the Google VertexAI variants.
  • (might) add the Azure OpenAI variant (not sure yet).

rust-genai's People

Contributors

jeremychone, stargazing-dino


rust-genai's Issues

Add ChatParameters for chat specific config

I need the ability to set temperature and max tokens, but currently nothing is provided.

Here's roughly what I imagine the API might look like:

use genai::chat::{ChatMessage, ChatRequest, ChatParameters};
use genai::client::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();
    let parameters = ChatParameters::new()
        .with_max_tokens(2000)
        .with_temperature(0.7)
        .with_top_p(0.9);

    let chat_req = ChatRequest::new(vec![
        ChatMessage::user("Tell me a short story about a robot."),
    ]).with_parameters(parameters);

    let chat_res = client.exec_chat("gpt-3.5-turbo", chat_req, None).await?;
    println!("{}", chat_res.content.unwrap_or_default());

    Ok(())
}

Thoughts?

Add ChatRequestOptions temperature and max_tokens

Feature Spec

The plan was to use ChatRequestOptions for this, passing temperature and max_tokens in the ChatRequestOptions, as follows:

pub struct ChatRequestOptions {
    /// Will be implemented in 0.1.2
    /// Will capture the `MetaUsage`
    /// - In the `ChatResponse` for `exec_chat`
    /// - In the `StreamEnd` of `StreamEvent::End(StreamEnd)` for `exec_chat_stream`
    pub capture_usage: Option<bool>,

    // -- For Stream only (for now, we flat them out)
    /// Tell the chat stream executor to capture and concatenate all of the text chunks
    /// to the last `StreamEvent::End(StreamEnd)` event as `StreamEnd.captured_content` (so, will be `Some(concatenated_chunks)`)
    pub capture_content: Option<bool>,
}

They will be read and applied by the provider adapter if the provider supports those properties.

The plan is to flatten these provider properties into the ChatRequestOptions for simplicity and ergonomics.

Note: Eventually, the ChatResponse might gain a property set_chat_request_options: Option<ChatRequestOptions> capturing the options actually supported and set by the adapter/provider (so the caller can know what was really applied).
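
To make the streaming side concrete, here is a sketch of consuming the two capture options from the stream. The futures::StreamExt usage, the Chunk variant and its content field, and the captured_usage field are all assumptions layered on the doc comments above; only StreamEvent::End(StreamEnd) and captured_content are named there, so treat this as pseudocode against the real API.

use futures::StreamExt;
use genai::chat::{ChatMessage, ChatRequest, ChatRequestOptions};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
	let client = Client::default();
	let chat_req = ChatRequest::new(vec![ChatMessage::user("Say hello.")]);

	// Both fields come from the `ChatRequestOptions` struct above.
	let options = ChatRequestOptions {
		capture_usage: Some(true),
		capture_content: Some(true),
	};

	// Assumes the returned stream implements `futures::Stream` and yields
	// `Result<StreamEvent, _>` items.
	let mut stream = client
		.exec_chat_stream("gpt-4o-mini", chat_req, Some(&options))
		.await?;

	while let Some(event) = stream.next().await {
		match event? {
			// `Chunk` variant and its `content` field are assumed names.
			genai::chat::StreamEvent::Chunk(chunk) => print!("{}", chunk.content),
			// Per the doc comments above, `StreamEnd` carries the captured values;
			// `captured_usage` is an assumed field name for the `MetaUsage`.
			genai::chat::StreamEvent::End(end) => {
				println!("\ncaptured content: {:?}", end.captured_content);
				println!("captured usage: {:?}", end.captured_usage);
			}
			_ => (),
		}
	}

	Ok(())
}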

Related

This is related to PR #1 and issue #3.

OpenAI compatible API support

I noticed that the URL in this crate is hardcoded, which means only the official API endpoints of each AI service provider are currently supported.

Is there a plan to make the endpoint configurable? For example, allowing users to specify Azure OpenAI endpoints through configuration would greatly enhance flexibility.
