Learn how to use Langchain Expression Language (LCEL) to create robust and production-ready chains while keeping a simple code base.
Introduction
AI-based apps are becoming more and more complex as they integrate ever more tools. All these integrations mean more complexity, and more difficulty in building a fast, robust, production-ready application. To tackle these problems, Langchain created LCEL, which stands for LangChain Expression Language and is the solution they developed for writing production-ready code.
Let’s explain this right away!
Langchain LCEL
From the beginning, Langchain created the concept of chains, equivalent to pipelines or sequences of tasks (see this link for more information). It was powerful and simple, but as chains became increasingly complicated, the code became a nightmare to update and maintain.
And so came the creation of Langchain LCEL (link).
Here’s a one-liner definition of LCEL:
LangChain Expression Language (LCEL) is a declarative system designed for easily building and deploying multi-step computational chains, from simple prototypes to complex, production-level applications.
Here are the interesting parts of this definition (that we will see later on):
- Declarative system: this refers to the chaining declaration used to define a chain: chain = prompt | model | output_parser
- Multi-step computational chains: this means that you will create multiple chains and tie all of them into one single chain.
- From simple prototypes to complex, production-level applications: this means that you will create something simple and fast but also robust enough to be used in production.
Ultimately, this means that you will have a very powerful and elegant solution, capable of handling most use cases, although it will initially be complex to set up. There is always a tradeoff in any solution, and in the case of Langchain, it involves a steep learning curve.
LCEL declarative system
The LangChain Expression Language (LCEL) is a declarative system which means it focuses on defining the desired results or goals without explicitly programming the steps to achieve them.
It simplifies the process of setting up complex computational tasks by allowing users to state “what” outcome is needed rather than detailing “how” to achieve it.
Let’s take the following example to explain all this:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("tell me a short joke about {topic}")
model = ChatOpenAI(model="gpt-3.5-turbo")
output_parser = StrOutputParser()
chain = prompt | model | output_parser
chain.invoke({"topic": "ice cream"})
As you can see, we create a prompt, a model and a way to parse the output, all of which are completely standard.
Here’s where the magic happens: chain = prompt | model | output_parser
and here’s what this means:
- You create a Langchain chain that will be contained inside the variable chain
- The first part of your chain is the prompt, which will receive your input
- Then you have the | operator, which works like the Unix pipe operator and passes the output of the left-hand part to the right-hand part
- So the output of the prompt will go in as the input of the model
- And finally, the output of the model will go to the parser
At no point do you code how the output of one component goes to the next; you just write what you want, which is the essence of a declarative system.
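To make this concrete, here is a rough sketch of the plumbing that the pipe declares for you, reusing the prompt, model and output_parser objects defined above. This is only an illustration of the data flow, not how LCEL is actually implemented:

# roughly what happens when you call chain.invoke({"topic": "ice cream"})
prompt_value = prompt.invoke({"topic": "ice cream"})  # fill the prompt template
model_output = model.invoke(prompt_value)  # send the formatted prompt to the LLM
result = output_parser.invoke(model_output)  # parse the model message into a string

With LCEL you never write this plumbing yourself: the single line chain = prompt | model | output_parser declares the same data flow.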
Multi-step computational chains
When you begin a complex LLM use case, you will soon arrive at a point where you have multiple chains that do specific things (calling an API, getting data from a retriever, handling the chat history). Each of these chains will require different inputs, and sometimes the output of one or more other chains. So you will have a complex workflow of chains with different dependencies.
And this is what LCEL allows you to do: link all these chains together into one single chain.
Let’s take the classic example of a RAG chain. The principle behind a RAG chain is to embed the question, get the most relevant chunks of text from your vector store, and put them inside the prompt along with the question.
This means your prompt needs not only the question but also the retrieved data. In the logic of the code, you will therefore have a fork before the prompt: one branch gets the retriever’s data, and the other passes the question through.
Here’s a representation of the workflow:
Let’s implement this and see how it works:
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
vectorstore = FAISS.from_texts(
["Metadocs loves coding with Langchain", "bears like to eat honey"],
embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI(model="gpt-3.5-turbo")
output_parser = StrOutputParser()
setup_and_retrieval = RunnableParallel(
{"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser
chain.invoke("What does Metadocs loves to do?")
As you can see, we have a fork with two branches that run at the same time, and the rest of the process needs the output of both branches to continue. So the two branches need to be launched in parallel.
All the magic is done here:
setup_and_retrieval = RunnableParallel(
{"context": retriever, "question": RunnablePassthrough()}
)
Here’s what is happening:
- We create a RunnableParallel object, which launches a set of tasks in parallel. It accepts a dict containing the elements to be launched.
- Inside the dict we specify the context key, which will contain the retrieved data, and the question key.
- For context, we are literally passing the retriever object, which will be invoked to get the retrieved data from your vector store.
- For the question key, we are using RunnablePassthrough(), which simply takes the chain's input data and passes it through without any modification.
- This step completes only when the two keys have finished computing; the rest of the chain then continues, with {"context": RETRIEVED_DATA, "question": QUESTION} as its input.
And so, when you reach the ChatPromptTemplate object, the input you receive is {"context": RETRIEVED_DATA, "question": QUESTION}.
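As a side note, when a plain dict appears in a pipe, LCEL coerces it into a RunnableParallel for you, so the same fork can be written more compactly. Here is a small sketch reusing the objects defined above; both forms behave the same way:

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | output_parser
)
chain.invoke("What does Metadocs love to do?")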
Runnable interface
In the previous part, I talked about multi-step computational chains, but in the fork of the example I used, I only used a RunnablePassthrough and a retriever, not chains.
This is not an error at all. In fact, this is a core principle of the inner workings of LCEL.
All the components of Langchain implement the Runnable interface.
An interface, in programming terms, is a defined contract or blueprint that specifies the methods a class should implement, guiding how objects interact in a system without dictating the specific implementation details.
In this case, the Runnable interface defines a set of methods that all higher-level components need to implement to be usable. We will not go too deep, but here are some useful methods:
- Stream: stream back chunks of the response. This allows a chain to be processed in a streaming way. For example, you can display the output of a chain as it comes instead of waiting for everything to finish, without adding more code to your chains.
- Invoke: call the chain on an input. This is the default method used in most cases inside a chain. For example, when you pass a retriever, behind the scenes the chain will call retriever.invoke(args).
- Batch: call the chain on a list of inputs. This is used when you want to process a batch of items, for example if you process data file by file (stream and batch are illustrated in the short sketch right after this list).
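Here is a quick sketch of stream and batch on the RAG chain defined above (invoke is demonstrated just below); the exact outputs will of course depend on the LLM:

# batch: process a list of questions in one call
chain.batch(["What does Metadocs love to do?", "What do bears like to eat?"])

# stream: print the answer chunk by chunk as the LLM generates it
for chunk in chain.stream("What does Metadocs love to do?"):
    print(chunk, end="", flush=True)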
If we take the previous example, here are some interesting things:
RunnablePassthrough().invoke("How are you ?") # will return "How are you ?" unchanged
retriever.invoke("How are you ?") # will return the retrieved data for "How are you ?"
setup_and_retrieval = RunnableParallel(
{"context": retriever, "question": RunnablePassthrough()}
)
setup_and_retrieval.invoke("How are you ?") # will return {"context": RETRIEVED_DATA, "question": "How are you ?"}
chain = setup_and_retrieval | prompt | model | output_parser
chain.invoke("How are you ?") # will launch the whole chain on "How are you ?"
This also means that all higher-level components, from the simple RunnablePassthrough to the biggest chain, are chains. So whenever you are using LCEL, or even Langchain in general, you are using and composing chains even for the smallest things. This is the beauty of LCEL.
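Because every component and sub-chain is a Runnable, you can also invoke any slice of a chain on its own, which is very handy for inspecting intermediate results. A small sketch with the objects from the RAG example:

partial_chain = setup_and_retrieval | prompt  # stop right before the model
partial_chain.invoke("What does Metadocs love to do?")  # returns the fully formatted prompt, retrieved context included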
Langchain LCEL production ready features
Now that we have seen the beauty of LCEL, let's list all the features that make Langchain LCEL production ready:
- Stream support: with the same code, you can handle streaming use cases where you send the results as they are generated instead of when everything is finished. For example, if you want to print the result of the LLM as you receive it, you will need streaming.
- Async support: with the same code, you can handle async usage. For example, in a web app, you may want to launch the chain without blocking the processing and just show a spinner until the response is ready.
- Optimized parallel execution: all tasks or chains that can be launched in parallel will be launched in parallel without adding more code. This is extremely powerful, as parallel execution is inherently complex to write by hand.
- Retries and fallbacks: in the chain, you can specify what needs to be done if the chain fails. For example, if your RAG chain fails, you can easily add some retries and also create a fallback chain that sends a message to the user (see the sketch after this list).
- Intermediate results access: because every component is a chain, there are results at every step of a process, and LCEL allows you to retrieve the results of each step. This does not look like much, but when you want to debug, it is a real lifesaver.
- Input / output schemas: all components of Langchain are typed, meaning they have a type for their input and their output (for example, the simplified input type of our example is str, as it accepts a single string). This means that you can infer the output schema of each component and of the whole chain, which makes data validation easier.
- Compatibility with LangServe and LangSmith: these are tools to create APIs directly on top of your chain and to monitor it. They are incredibly simple to use: with literally less than 20 lines of code, you can have a fully featured API directly on top of your chain.
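To give a feel for some of these features, here is a sketch built on the RAG chain above. The with_retry, with_fallbacks and ainvoke methods are part of the Runnable interface; the fallback chain and the retry settings shown here are just illustrative choices:

import asyncio

from langchain_core.runnables import RunnableLambda

# retries: re-run the whole chain up to 3 times if it raises an exception
robust_chain = chain.with_retry(stop_after_attempt=3)

# fallback: a minimal chain that apologizes if every retry fails
fallback = RunnableLambda(lambda _: "Sorry, I could not answer your question.")
robust_chain = robust_chain.with_fallbacks([fallback])

# async: the same chain, awaited instead of blocking the caller
async def main():
    return await robust_chain.ainvoke("What does Metadocs love to do?")

print(asyncio.run(main()))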
The really interesting point is that you have all these features without changing your whole code completely.
Langchain LCEL drawbacks
We talked a lot about the good points of LCEL, but we need to talk about the drawbacks too:
- The learning curve: LCEL is difficult to learn at the beginning because it is a declarative system, so you really need to understand what is happening behind the scenes to make it work, which takes time. The good point is that once that is done, you have a very powerful tool in your hands.
- Debugging chains: when you develop your chain, the error messages you get are not helpful at all, and the only way to understand what is happening is to check the intermediate results. This means that debugging can take a lot of time, and you really need to experience it firsthand. This is also why tools like LangSmith are priceless.
- Workflow complexity: if your chain's workflow is really complicated, with conditions and the possibility of coming back to a specific part of the chain, then it is too complicated for LCEL alone and you will need to add LangGraph.
Conclusion
Langchain LCEL is an incredible upgrade to Langchain that makes it much more production ready. It allows developers to easily create streaming- and async-compatible chains with the same code, along with typing and validation features. The downside is that it is more complicated to learn, but the end result is that you will have something you can use in a real-world application.
Afterword
I hope this tutorial helped you and taught you many things. I will update this post with more nuggets from time to time. Don't forget to check out my other posts, as I write a lot of cool posts on practical AI topics.
Cheers !