LLM orchestration is an important part of creating AI-powered applications. Particularly in business use cases, AI agents and RAG pipelines are commonly used to produce more refined LLM responses. Although Langchain is currently the most popular LLM orchestration suite, there are also similar crates in Rust that we can use. In this article, we'll be diving into one of them: llm-chain.
What is llm-chain?
llm-chain is a collection of crates that describes itself as "the ultimate toolbox" for working with Large Language Models (LLMs) to create AI-powered applications and other tooling.
You can find the crate's GitHub repository here.
Comparison to other LLM orchestration crates
If you've looked around for different crates, you probably noticed that there are a few crates for LLM orchestration in Rust:
- llm-chain (this one!)
- langchain-rust
- anchor-chain
In comparison to the others, llm-chain is somewhat macro-heavy. However, it is also the most developed in terms of extra data processing utilities. If you're looking for an all-in-one package, llm-chain is the crate that's most likely to help you get over the line. It also has its own docs.
Getting Started
Pre-requisites
Before you add llm-chain to your project, make sure you have access to the prompting model you want to use. For example, if you want to use OpenAI, make sure you have an OpenAI API key (set as OPENAI_API_KEY in your environment variables).
Setup
To get started, all you need to do is add the crate to your project (as well as Tokio for async):
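A minimal setup might look like this (the exact Tokio features you enable are up to you):

```bash
cargo add llm-chain
cargo add tokio --features macros,rt-multi-thread
```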
Next, we'll want to add a provider for whatever method of model prompting you want to use. Here we'll add the OpenAI integration by adding the crate to our application:
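For OpenAI support, that would be something like:

```bash
cargo add llm-chain-openai
```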
Note that a full list of integrations can be found here, split by package.
Basic usage
Prompting
To get started with llm-chain, we can use their basic example as a way to quickly get something working. In this code snippet, we will:
- Initialise an executor using executor!()
- Use prompt!() with the system message and prompt to store both in a struct that will get used when the prompt (or chain) gets run
- Run the prompt and return the results, using a reference to the executor
- Print the results
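Here's a sketch of that snippet, based on the crate's basic example - the prompt text itself is just illustrative:

```rust
use llm_chain::{executor, parameters, prompt};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create an executor for the default OpenAI model (reads OPENAI_API_KEY)
    let exec = executor!()?;

    // Bundle the system message and user prompt, then run them against the executor
    let res = prompt!(
        "You are a robot assistant for making personalized greetings",
        "Make a personalized greeting for Joe"
    )
    .run(&parameters!(), &exec)
    .await?;

    // Print the model's response
    println!("{}", res);
    Ok(())
}
```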
Running this prompt should yield a result that looks like this:
The default model for the llm_chain_openai executor is gpt-3.5-turbo. The executor parameters can be defined in the macro - you can also find out more about this here.
Using Templates
However, if we want to move on to more advanced pipelines, the easiest way to do this is to use a prompt template with parameters. You can see below that, much like in the previous code snippet, we generate an executor and return the results. However, instead of using prompt!() by itself, we use it in Step::for_prompt_template - which you can find out more about here.
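A sketch of what this might look like, using a {{name}} template parameter of our own:

```rust
use llm_chain::{executor, parameters, prompt, step::Step};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let exec = executor!()?;

    // Wrap the prompt in a Step so the {{name}} placeholder can be filled in at run time
    let step = Step::for_prompt_template(prompt!(
        "You are a bot for making personalized greetings",
        "Make a personalized greeting for {{name}}"
    ));

    // Run the step, injecting the template parameter
    let res = step.run(&parameters!("name" => "Joe"), &exec).await?;
    println!("{}", res);
    Ok(())
}
```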
The results of the output should look like this:
Chaining prompts
Of course, one of the main reasons why we're using Langchain (or Langchain-like libraries) in the first place is to be able to orchestrate our LLM usage. The llm-chain Rust crate assists us with this by letting us create chains of LLM prompts using the Chain struct.
There are three types of chains that we can use with llm-chain:
- Sequential chains, which apply steps sequentially
- Map-reduce chains, which use a "map" step to apply to each chunk from a loaded file and then reduce the text. This is quite useful for text summarization.
- Conversational chains, which keep track of the conversation history and manage context. Conversational chains are great for chatbot applications, multi-step interactions and other places where context is essential.
Sequential chaining
The easiest type of chaining to use is sequential chaining, which simply pipes the output of each step into the next. When creating our steps, we will collect them in the Chain struct instead of running each step individually:
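Here's a sketch of building such a chain inside an async main - the prompts are placeholders:

```rust
use llm_chain::{chains::sequential::Chain, executor, parameters, prompt, step::Step};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let exec = executor!()?;

    // A sequential chain: the output of the first step is fed to the second as {{text}}
    let chain = Chain::new(vec![
        // Step 1: write a personalized birthday email
        Step::for_prompt_template(prompt!(
            "You are a bot for making personalized greetings",
            "Make a personalized birthday email for {{name}}, whose birthday is on {{date}}"
        )),
        // Step 2: summarize the email from the previous step into a tweet
        Step::for_prompt_template(prompt!(
            "You are an assistant for managing social media accounts",
            "Summarize this email into a tweet:\n--\n{{text}}"
        )),
    ]);

    // We'll inject the parameters and run the chain next
    // ...
    Ok(())
}
```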
Next, we'll use the parameters! macro to inject parameters into the prompt pipeline:
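Continuing inside main from the snippet above (the names and dates here are just example values):

```rust
// Inject the {{name}} and {{date}} parameters and run the whole chain
let res = chain
    .run(parameters!("name" => "Emil", "date" => "11th of July 2024"), &exec)
    .await?;
println!("{}", res);
```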
Running the code should yield a result that looks like this:
Map-reduce chains
Map-reduce chains typically consist of two steps:
- A "Map" step that takes a document and applies an LLM chain to it, treating the output as a new document
- The new documents are then passed to a new chain that combines the separate documents to get a single output. At the end of a Map-Reduce chain, the output can be taken for further processing by sending it to another prompting model (for instance) or as part of a sequential pipeline.
To use this pattern, we need to create a prompt template:
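A sketch of the two steps, wrapped in a helper function so we can reuse it later (the prompts are placeholders):

```rust
use llm_chain::{chains::map_reduce::Chain, prompt, step::Step};

// Build a map-reduce chain: the "map" step summarizes each chunk of text,
// and the "reduce" step combines those summaries into a single output
fn build_summary_chain() -> Chain {
    let map_prompt = Step::for_prompt_template(prompt!(
        "You are a bot for summarizing articles; you are terse and focus on accuracy",
        "Summarize this article into bullet points:\n{{text}}"
    ));

    let reduce_prompt = Step::for_prompt_template(prompt!(
        "You are a diligent bot that summarizes text",
        "Combine the summaries below into a single set of bullet points:\n{{text}}"
    ));

    Chain::new(map_prompt, reduce_prompt)
}
```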
Next, we need to take some text from a file and add it as a parameter - the {{text}} parameter in the Map prompt will automatically take in the file content:
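A sketch of what this might look like - the file name here is just an assumption:

```rust
use llm_chain::{executor, parameters, Parameters};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let exec = executor!()?;
    let chain = build_summary_chain(); // from the previous snippet

    // Load the article and pass it in as the {{text}} parameter of the map step
    let article = std::fs::read_to_string("article_to_summarize.md")?;
    let docs = vec![parameters!(article)];

    // Run the chain: the document parameters feed the map step,
    // while the (empty) base parameters apply to the rest of the chain
    let res = chain.run(docs, Parameters::new(), &exec).await?;
    println!("{}", res);
    Ok(())
}
```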
Note here that because we have two steps, the chain.run() function takes the document parameters for the map step as well as a separate set of base parameters. This means that we are passing the article content to the first prompt, but no extra parameters to the second step.
Conversational Chains
Of course, the last chain we need to talk about is conversational chains. In a nutshell, conversational chains allow you to load context from memory by using saved chat history. In situations where the platform or model cannot access saved chat history, you might store the response and then use it as extra context in the next message.
To use conversational chains, like before, we need to create a Chain (now imported from the conversation module) and define the steps for it:
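A sketch of the setup, loosely based on the crate's conversation example:

```rust
use llm_chain::{chains::conversation::Chain, executor, prompt, step::Step};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let exec = executor!()?;

    // Create a conversational chain with a system prompt
    let mut chain = Chain::new(prompt!(
        system: "You are a robot assistant for making personalized greetings."
    ))?;

    // Define the individual conversation steps
    let step1 = Step::for_prompt_template(prompt!(user: "Make a personalized greeting for Joe."));
    let step2 = Step::for_prompt_template(prompt!(user: "Now, make a personalized greeting for Jane."));
    let step3 = Step::for_prompt_template(prompt!(user: "Finally, make a personalized greeting for Alice."));
    let step4 = Step::for_prompt_template(prompt!(user: "Remind me who we made greetings for."));

    // We'll send these to the chain one by one in the next snippet
    // ...
    Ok(())
}
```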
Next, we will send each prompt to the Chain in turn, printing out the response from each one. Note that at step 4, we should receive an answer that includes the names of the previous three people we just made a personalized greeting for (Joe, Jane and Alice).
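Continuing inside main from the previous snippet (with llm_chain::parameters also imported), sending each step might look like this:

```rust
// Send each step to the chain in turn; the chain keeps the conversation history
let res1 = chain.send_message(step1, &parameters!(), &exec).await?;
println!("Step 1: {}", res1.to_immediate().await?.as_content());

let res2 = chain.send_message(step2, &parameters!(), &exec).await?;
println!("Step 2: {}", res2.to_immediate().await?.as_content());

let res3 = chain.send_message(step3, &parameters!(), &exec).await?;
println!("Step 3: {}", res3.to_immediate().await?.as_content());

// By step 4, the stored history lets the model recall Joe, Jane and Alice
let res4 = chain.send_message(step4, &parameters!(), &exec).await?;
println!("Step 4: {}", res4.to_immediate().await?.as_content());
```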
Running this should get an output that looks something like this:
Using embeddings with llm-chain
In terms of using embeddings with llm-chain, the crate provides a helper for using Qdrant as a vector store. It abstracts over the qdrant_client crate, providing an easy way to embed documents and carry out similarity search. Note that the Qdrant struct will assume that the collection(s) you want to use have already been created!
Basic usage
While we can use qdrant_client to manually create our own embeddings, llm-chain also has an integration for easy access. We are required to create our own client through qdrant_client - which we can then use with the Qdrant struct to store and search our documents.
First, let's define a couple of passages that we want to insert into our Qdrant collection:
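For example (inside our async main):

```rust
// A couple of example passages to embed and store in Qdrant
let texts = vec![
    "Rust is a systems programming language focused on safety, speed and concurrency.".to_string(),
    "Qdrant is a vector database designed for similarity search over embeddings.".to_string(),
];
```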
The Qdrant struct assumes you already have your collection set up and an existing QdrantClient, along with the collection name. We'll pass these as arguments into a new function that does the following:
- Create embeddings using llm-chain-openai
- Insert the embeddings into Qdrant
- Conduct a similarity search using the prompt
Firstly, we'll want to define a method for creating our Qdrant struct so that we can re-use it later on:
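A sketch of such a helper - the exact type parameters and constructor arguments may differ slightly between versions, so treat this as a guide rather than gospel:

```rust
use std::sync::Arc;

use llm_chain::schema::EmptyMetadata;
use llm_chain_openai::embeddings::Embeddings;
use llm_chain_qdrant::Qdrant;
use qdrant_client::prelude::QdrantClient;

// Wrap an existing Qdrant client and collection in llm-chain's Qdrant vector store,
// using OpenAI embeddings. The collection is assumed to already exist.
fn build_vector_store(
    client: Arc<QdrantClient>,
    collection_name: String,
) -> Qdrant<Embeddings, EmptyMetadata> {
    Qdrant::new(
        client,
        collection_name,
        Embeddings::default(), // OpenAI embeddings from llm-chain-openai
        None,                  // optional content payload key
        None,                  // optional metadata payload key
        None,                  // optional search parameters
    )
}
```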
Next, we can use the Qdrant struct to carry out a similarity search! We'll add our documents to our collection, then conduct a similarity search and print out the stored documents:
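Continuing inside main, with qdrant built by the helper above and the texts we defined earlier, this might look like the following (add_texts and similarity_search come from llm-chain's vector store trait):

```rust
// Embed the passages and insert them into the collection
let document_ids = qdrant.add_texts(texts).await?;
println!("Inserted documents with ids: {:?}", document_ids);

// Embed the query and fetch the 2 most similar stored documents
let results = qdrant
    .similarity_search("What is Qdrant used for?".to_string(), 2)
    .await?;
println!("Similarity search results: {:?}", results);
```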
After this, we can send the results into a Chain or use them for whatever else we need.
Usage within a prompt template
In isolation, the Qdrant struct is not particularly helpful and mainly provides convenience methods for embedding things. However, we can also add it as part of a ToolCollection, which lets the pipeline know that it is able to use embeddings.
Processing data using llm-chain
While llm-chain provides tooling for creating LLM pipelines, another important part of Langchain and libraries like it is being able to process and transform data. Prompts (and prompt engineering) are important to get right. However, if we're also feeding data into our pipeline, we'll want to make sure it's as easy as possible to find the most relevant context.
Below are a couple of useful use cases that you may want to check out.
Scraping search results
llm-chain provides the GoogleSerper convenience struct for scraping Google results, using the Serper.dev service.
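A sketch of how you might use it, assuming a SERPER_API_KEY environment variable:

```rust
use llm_chain::tools::tools::GoogleSerper;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // API key for the Serper.dev service
    let serper_api_key = std::env::var("SERPER_API_KEY")?;
    let serper = GoogleSerper::new(serper_api_key);

    // Run a simple search and print the best answer
    let result = serper.simple_search("Who was the inventor of Catan?").await?;
    println!("Best answer: {}", result);
    Ok(())
}
```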
As well as this, there is also support for the Bing Search API, which provides 1,000 free searches a month - you can find out more about the pricing here. Below is a code snippet showing how you might use the API:
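(A sketch, assuming a BING_API_KEY environment variable and that the BingSearch tool mirrors the Serper one.)

```rust
use llm_chain::tools::tools::BingSearch;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // API key for the Bing Search API
    let bing_api_key = std::env::var("BING_API_KEY")?;
    let bing = BingSearch::new(bing_api_key);

    // Run a simple search and print the best answer
    let result = bing.simple_search("Who was the inventor of Catan?").await?;
    println!("Best answer: {}", result);
    Ok(())
}
```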
Should you need to switch between one and the other, both are quite easy to use.
Extracting labelled text
llm-chain also has some convenience methods for extracting labelled text. If you have a string of bullet points, for instance, you can use extract_labeled_text() to extract the labels and their text.
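A sketch of how this might look:

```rust
use llm_chain::parsing::extract_labeled_text;

fn main() {
    // A string of bullet points in "label: text" form
    let text = "
- Alpha: the first letter of the Greek alphabet
- Beta: the second letter of the Greek alphabet
";

    // Pull out (label, text) pairs from the bullet points
    let labeled = extract_labeled_text(text);
    println!("{:?}", labeled);
}
```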
Running this code should result in an output that looks like this:
You can find out more about the parsing module for llm-chain here, as well as some of the examples.
Conclusion
Thanks for reading! With the power of llm-chain, you can easily leverage AI for your applications.