An AI Toolbox for Simplified Access to AWS Bedrock and Ollama from Rust

Overview

Hiramu is a powerful and flexible Rust library that provides a high-level interface for interacting with various AI models and APIs, including Ollama and AWS Bedrock.

It simplifies the process of generating text, engaging in chat conversations, and working with different AI models.

Features

  • Easy-to-use interfaces for generating text and engaging in chat conversations with AI models
  • Support for Ollama and AWS Bedrock services
  • Convenient interfaces for Claude and Mistral on AWS Bedrock
  • Asynchronous and streaming responses for efficient handling of large outputs
  • Customizable options for fine-tuning the behavior of AI models
  • Comprehensive error handling and informative error messages
  • Well-documented code with examples and explanations

Getting Started

To start using Hiramu in your Rust project, add the following to your Cargo.toml file:

[dependencies]
hiramu = "0.1.15"

Examples

Generating Text with Mistral

use hiramu::bedrock::model_info::{ModelInfo, ModelName};
use hiramu::bedrock::models::mistral::mistral_client::{MistralClient, MistralOptions};
use hiramu::bedrock::models::mistral::mistral_request_message::MistralRequestBuilder;

async fn generating_text_with_mistral() {
    let mistral_options = MistralOptions::new()
        .profile_name("bedrock")
        .region("us-west-2");

    let client = MistralClient::new(mistral_options).await;

    let request =
        MistralRequestBuilder::new("<s>[INST] What is the capital of France?[/INST]".to_string())
            .max_tokens(200)
            .temperature(0.8)
            .build();

    let model_id = ModelInfo::from_model_name(ModelName::MistralMixtral8X7BInstruct0x);
    let response = client.generate(model_id, &request).await.unwrap();

    println!("Response: {:?}", response.outputs[0].text);
}
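
Because the function is async, it has to be driven by a runtime. A minimal sketch, assuming Tokio and that generating_text_with_mistral from above is in scope:

#[tokio::main]
async fn main() {
    generating_text_with_mistral().await;
}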

Streaming Text Generation with Mistral

use futures::stream::StreamExt;
use hiramu::bedrock::models::mistral::mistral_client::{MistralClient, MistralOptions};
use hiramu::bedrock::models::mistral::mistral_request_message::MistralRequestBuilder;
use hiramu::bedrock::model_info::{ModelInfo, ModelName};

pub async fn streaming_text_with_mistral() {
    let mistral_options = MistralOptions::new()
        .profile_name("bedrock")
        .region("us-west-2");

    let client = MistralClient::new(mistral_options).await;

    let request = MistralRequestBuilder::new("<s>[INST] What is the capital of France?[/INST]".to_string())
        .max_tokens(200)
        .temperature(0.8)
        .build();

    let model_id = ModelInfo::from_model_name(ModelName::MistralMixtral8X7BInstruct0x);
    let mut stream = client.generate_with_stream(model_id, &request).await.unwrap();

    while let Some(result) = stream.next().await {
        match result {
            Ok(response) => {
                println!("Response: {:?}", response.outputs[0].text);
            }
            Err(err) => {
                eprintln!("Error: {:?}", err);
            }
        }
    }
}

Generating Text with Ollama

use std::io::Write;

use futures::TryStreamExt;

use hiramu::ollama::ollama_client::OllamaClient;
use hiramu::ollama::model::GenerateRequestBuilder;

async fn generating_text_with_ollama() {
    let client = OllamaClient::new("http://localhost:11434".to_string());
    
    let request = GenerateRequestBuilder::new("mistral".to_string())
        .prompt("Once upon a time".to_string())
        .build();

    let response_stream = client.generate(request).await.unwrap();

    response_stream
        .try_for_each(|chunk| async move {
            print!("{}", chunk.response);
            std::io::stdout().flush()?;
            Ok(())
        })
        .await
        .unwrap();
}

Chat with Ollama

use futures::TryStreamExt;
use std::io::{self, Write};

use hiramu::ollama::{ChatRequestBuilder, Message, OllamaClient, OllamaError, OptionsBuilder};

async fn demo_chat_with_ollama_with_stream() -> Result<(), OllamaError> {
    let client = OllamaClient::new("http://localhost:11434".to_string());

    let messages = vec![Message::new(
        "user".to_string(),
        "What is the capital of France?".to_string(),
    )];

    let options = OptionsBuilder::new()
        .num_predict(100) // Limit the number of predicted tokens
        .temperature(0.4);

    let request = ChatRequestBuilder::new("mistral".to_string())
        .messages(messages.to_owned())
        .options_from_builder(options)
        .build();

    let response_stream = client.chat(request).await?;

    let result = response_stream
        .try_for_each(|chunk| async {
            let message = chunk.message;
            print!("{}", message.content);
            // Flush the output to ensure the prompt is displayed.
            io::stdout().flush().unwrap();
            Ok(())
        })
        .await;

    result
}
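
The chat API is stateless: each request carries the full message history, so follow-up turns work by appending to the messages vector before building the next request. A minimal sketch, assuming the same Message type as above (the helper name is hypothetical):

use hiramu::ollama::Message;

fn push_turn(history: &mut Vec<Message>, user_text: &str, assistant_text: &str) {
    // Record the user's question and the assistant's streamed reply so the
    // next ChatRequestBuilder call sees the whole conversation.
    history.push(Message::new("user".to_string(), user_text.to_string()));
    history.push(Message::new("assistant".to_string(), assistant_text.to_string()));
}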

Chatting with Claude using Bedrock

use std::io::Write;

use futures::TryStreamExt;

use hiramu::bedrock::model_info::{ModelInfo, ModelName};
use hiramu::bedrock::models::claude::claude_client::{ClaudeClient, ClaudeOptions};
use hiramu::bedrock::models::claude::claude_request_message::{
    ChatOptions, ContentBlockDelta, ConversationRequest, Message, StreamResultData,
};

pub async fn chat_with_claude() {
    let claude_options = ClaudeOptions::new()
        .profile_name("bedrock")
        .region("us-west-2");

    let client = ClaudeClient::new(claude_options).await;

    let mut conversation_request = ConversationRequest::default();
    conversation_request
        .messages
        .push(Message::new_user_message("Hello, Claude!".to_owned()));

    let chat_options = ChatOptions::default()
        .with_temperature(0.7)
        .with_max_tokens(100)
        .with_model_id(ModelInfo::from_model_name(
            ModelName::AnthropicClaudeHaiku1x,
        ));

    let response_stream = client
        .chat_with_stream(&conversation_request, &chat_options)
        .await
        .unwrap();

    response_stream
        .try_for_each(|chunk| async move {
            match chunk {
                StreamResultData::ContentBlockStart(..) => {
                    println!("\n------------------------------");
                }
                StreamResultData::ContentBlockStop(..) => {
                    println!("\n------------------------------");
                }
                StreamResultData::ContentBlockDelta(ContentBlockDelta { delta, .. }) => {
                    print!("{}", delta.text);
                    std::io::stdout().flush().unwrap();
                }
                _ => {}
            }
            Ok(())
        })
        .await
        .unwrap();
}

Working with Images using Claude

use std::io::Write;

use futures::TryStreamExt;

use hiramu::bedrock::models::claude::claude_client::{ClaudeClient, ClaudeOptions};
use hiramu::bedrock::models::claude::claude_request_message::{ChatOptions, ContentBlockDelta, ConversationRequest, Message, StreamResultData};
use hiramu::fetch_and_base64_encode_image;

async fn image_with_claude() {
    let claude_options = ClaudeOptions::new()
        .profile_name("bedrock")
        .region("us-west-2");

    let client = ClaudeClient::new(claude_options).await;

    let image_url = "./data/mario.png";
    let input_text = "What's in this image?".to_string();
    let image = fetch_and_base64_encode_image(image_url).await.unwrap().to_string();
    let mime_type = "image/png".to_string();

    let message = Message::new_user_message_with_image(&input_text, &image, &mime_type);

    let mut conversation_request = ConversationRequest::default();
    conversation_request.messages.push(message);

    let chat_options = ChatOptions::default()
        .with_temperature(0.7)
        .with_max_tokens(100);

    let response_stream = client
        .chat_with_stream(&conversation_request, &chat_options)
        .await
        .unwrap();

    response_stream
        .try_for_each(|chunk| async move {
            match chunk {
                StreamResultData::ContentBlockStart(..) => {
                    println!("\n------------------------------");
                }
                StreamResultData::ContentBlockStop(..) => {
                    println!("\n------------------------------");
                }

                StreamResultData::ContentBlockDelta(ContentBlockDelta { delta, .. }) => {
                    print!("{}", delta.text);
                    std::io::stdout().flush().unwrap();
                }
                _ => {}
            }
            Ok(())
        })
        .await
        .unwrap();
}

Using the Raw Bedrock API

Generating a Raw Response

use hiramu::bedrock::bedrock_client::{BedrockClient, BedrockClientOptions};
use hiramu::bedrock::model_info::{ModelInfo, ModelName};

#[tokio::main]
async fn main() {
    let model_id = ModelInfo::from_model_name(ModelName::AnthropicClaudeHaiku1x);
    let profile_name = "bedrock";
    let region = "us-west-2";

    let prompt = "Hi. In a short paragraph, explain what you can do.";

    let payload = serde_json::json!({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1000,
        "messages": [{
            "role": "user",
            "content": [{
                "type": "text",
                "text": prompt
            }]
        }]
    });

    let options = BedrockClientOptions::new()
        .profile_name(profile_name)
        .region(region);

    let client = BedrockClient::new(options).await;

    let result = client
        .generate_raw(model_id.to_string(), payload)
        .await
        .unwrap();

    println!("{:?}", result);
}
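
The raw result is the provider's JSON payload as-is. As a hedged sketch, assuming generate_raw returns a serde_json::Value shaped like Anthropic's messages response (a content array of typed blocks), you could extract the first text block like this:

// Appended after the generate_raw call in main above.
if let Some(text) = result["content"][0]["text"].as_str() {
    println!("{}", text);
}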

Generating a Raw Stream Response

use futures::TryStreamExt;
use hiramu::bedrock::bedrock_client::{BedrockClient, BedrockClientOptions};
use hiramu::bedrock::model_info::{ModelInfo, ModelName};

#[tokio::main]
async fn main() {
    let model_id = ModelInfo::from_model_name(ModelName::AnthropicClaudeHaiku1x);
    let profile_name = "bedrock";
    let region = "us-west-2";

    let prompt = "Hi. In a short paragraph, explain what you can do.";

    let payload = serde_json::json!({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1000,
        "messages": [{
            "role": "user",
            "content": [{
                "type": "text",
                "text": prompt
            }]
        }]
    });

    let options = BedrockClientOptions::new()
        .profile_name(profile_name)
        .region(region);

    let client = BedrockClient::new(options).await;

    let stream = client
        .generate_raw_stream(model_id.to_string(), payload)
        .await
        .unwrap();

    stream
        .try_for_each(|chunk| async move {
            println!("{:?}", chunk);
            Ok(())
        })
        .await
        .unwrap();
}

Using Embeddings with Ollama

use hiramu::ollama::{EmbeddingsRequestBuilder, OllamaClient};

pub async fn demo_ollama_embedding() -> Result<(), Box<dyn std::error::Error>> {
    let client = OllamaClient::new("http://localhost:11434".to_string());

    let prompt = "The quick brown fox jumps over the lazy dog.";

    let request = EmbeddingsRequestBuilder::new("nomic-embed-text".to_string(), prompt.to_string())
        .keep_alive("10m".to_string())
        .build();

    match client.embeddings(request).await {
        Ok(response) => {
            // Print embeddings dimensions
            println!("Embeddings dimensions: {:?}", response.embedding.len());
            println!("Embeddings: {:?}", response);
        }
        Err(error) => {
            eprintln!("Error: {:?}", error);
        }
    }

    Ok(())
}
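
Once you have embeddings, a typical next step is comparing them. A self-contained sketch of cosine similarity in plain Rust, assuming the embedding field is a Vec<f64> (no Hiramu APIs involved):

fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    // Dot product divided by the product of the vector norms.
    let dot: f64 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm_a * norm_b)
}

For instance, embedding a query and a set of candidate documents and ranking the candidates by cosine_similarity gives a simple semantic search.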

More Examples

The repository ships with a runnable example for each feature:

  • demo_ollama (src/examples/demo_ollama.rs): A simple example that demonstrates how to use the Ollama API to generate responses.
  • demo_chat_with_ollama (src/examples/demo_chat_with_ollama.rs): A simple example that demonstrates how to use the Ollama chat API.
  • demo_bedrock_raw_generate (src/examples/demo_bedrock_raw_generate.rs): Demonstrates how to generate a raw response from the Bedrock service using the generate_raw method.
  • demo_bedrock_raw_stream (src/examples/demo_bedrock_raw_stream.rs): Demonstrates how to generate a raw stream of responses from the Bedrock service using the generate_raw_stream method.
  • demo_bedrock_raw_mistral (src/examples/demo_bedrock_raw_mistral.rs): Demonstrates how to generate a raw stream of responses from the Mistral model in the Bedrock service.
  • demo_claude_chat (src/examples/demo_claude_chat.rs): Demonstrates how to use the Claude model in the Bedrock service to generate a chat response.
  • demo_claude_chat_stream (src/examples/demo_claude_chat_stream.rs): Demonstrates how to use the Claude model in the Bedrock service to generate a stream of chat responses.
  • demo_claude_multimedia (src/examples/demo_claude_multimedia.rs): Demonstrates how to use the Claude model in the Bedrock service to generate a response from text and an image.
  • demo_ollama_embedding (src/examples/demo_ollama_embedding.rs): Demonstrates how to use the Ollama API to generate text embeddings.
  • demo_mistral_stream (src/examples/demo_mistral_stream.rs): Demonstrates how to use the Mistral model in the Bedrock service to generate a stream of responses.

Contributing

Contributions to Hiramu are welcome! If you encounter any issues, have suggestions for improvements, or want to add new features, please open an issue or submit a pull request on the GitHub repository.

To contribute to the project, follow these steps:

  1. Fork the repository and create a new branch for your changes.
  2. Make your modifications and ensure that the code compiles successfully.
  3. Write tests to cover your changes and ensure that all existing tests pass.
  4. Update the documentation, including the README and API docs, if necessary.
  5. Submit a pull request with a clear description of your changes and the problem they solve.

License

Hiramu is licensed under the MIT License.

Acknowledgements

Hiramu is built on top of a number of excellent libraries and APIs. We would like to express our gratitude to their developers and maintainers for their excellent work and contributions to the Rust ecosystem.
