
Create a real world RAG chat app with Langchain LCEL

Do you want to create a complex RAG application, add nice features like a vector store, and integrate chat history? Do you want to use Langchain but feel completely lost on how to do it?
Then you have come to the right place.


In this blog post we will create a prototype of a real-world RAG chat application with Langchain, and you will learn how to reuse the same code to power your own application. Let’s get to work!


A real world chat application

First, we need to define what we mean by real world. There are a lot of Langchain tutorials that show how to use this or that feature. For example, you will find many tutorials on how to handle chat history in an LLM call, how to use RAG, how to create your own vector stores, how to do function calling or even how to create an agent. These are the types of features you will want in your own chat application, but you will soon encounter a common problem:
How do I tie all these features together in a way that is robust and scalable but still allows as much flexibility as possible?
The answer Langchain came up with is LCEL. You can find a complete and simple explanation of Langchain LCEL here, but we can define it as a framework to build and deploy multi-step chains, from simple prototypes to complex, production-level applications.
We will use Langchain LCEL in conjunction with two other Langchain features to create our production-ready code: configurable_fields and configurable_alternatives, for which you can find an explanation here.

So let’s begin.

Features of the example

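Let’s list the features we will have in our real world RAG chat app so that our ideas are clear: a prompt that can be switched at runtime between a single-line answer and a detailed one; a retriever that can use either of two locally saved FAISS vector stores (Politic or Environnemental) at runtime; chat history, used to rephrase each follow-up into a standalone question; and a Streamlit interface to upload files to the vector stores and chat with them.

These are some really nice features that could compete with any tool on the market, right?
Let’s begin right now!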

Initialize the work environment

We will use the same setup as in the previous posts, meaning Streamlit, Langchain, a FAISS vector store and Pipenv for managing the virtual env. For better readability, I will create a new folder called create-complex-webapp-with-langchain-best-practices and copy into it all the files I need from the previous post (and rename some).

cd create-complex-webapp-with-langchain-best-practices

Now we need to install the pipenv virtual env:

pipenv install

You can now check that the web app is launching with this:

pipenv run streamlit run

Create the configurable prompt

Let’s begin with the configurable prompt: we want two prompts, one asking for a single-line explanation and one asking for a detailed answer, and the ability to switch between them at runtime.

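from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import ConfigurableField, RunnablePassthrough

# Default prompt: one-line answer.
# The {context} placeholder is reconstructed; the chain below supplies "context" and "question".
template_single_line = PromptTemplate.from_template(
    """Answer the question in a single line based on the following context.
    If there is not relevant information in the context, just say that you do not know:

{context}

Question: {question}
"""
)

# Alternative prompt: detailed answer, one idea per bullet point.
template_detailed = PromptTemplate.from_template(
    """Answer the question in a detailed way with an idea per bullet point based on the following context.
    If there is not relevant information in the context, just say that you do not know:

{context}

Question: {question}
"""
)

# Alternatives to the default prompt, keyed by the value passed at runtime.
prompt_alternatives = {
    "detailed": template_detailed,
}

# The id and default key are reconstructed from the values used at invocation time.
configurable_prompt = template_single_line.configurable_alternatives(
    which=ConfigurableField(
        id="output_type",
        name="Output type",
        description="The type for the output, single line or detailed.",
    ),
    default_key="single_line",
    **prompt_alternatives,
)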

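Here’s what is happening: we define two prompt templates, template_single_line and template_detailed, that differ only in the style of answer they ask for. We then call configurable_alternatives on the single-line template, which makes it the default and registers the detailed template as an alternative that can be selected at runtime through the "Output type" field.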
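
To make the idea of switching prompts at runtime concrete, here is a minimal plain-Python sketch. It is not how Langchain implements configurable_alternatives; the shortened template texts and the helper render_prompt are hypothetical and only illustrate "a default plus named alternatives, chosen by key".

```python
# Plain-Python stand-in for a runtime prompt switch (NOT Langchain's implementation).
# The template texts are shortened and `render_prompt` is a hypothetical helper.
PROMPTS = {
    "single_line": "Answer in a single line.\nContext: {context}\nQuestion: {question}",
    "detailed": "Answer in detail, one idea per bullet point.\nContext: {context}\nQuestion: {question}",
}
DEFAULT_KEY = "single_line"


def render_prompt(question: str, context: str, output_type: str = DEFAULT_KEY) -> str:
    """Pick a template by key and fill in the context and question."""
    if output_type not in PROMPTS:
        # This sketch chooses to fail loudly so that a typo in the key is noticed.
        raise ValueError(f"Unknown output type: {output_type!r}")
    return PROMPTS[output_type].format(context=context, question=question)


if __name__ == "__main__":
    print(render_prompt("Who wrote the report?", "The report was written by the committee."))
    print(render_prompt("Who wrote the report?", "The report was written by the committee.", "detailed"))
```

Keeping the default key separate from the alternatives mirrors the structure of the post's code, where the single-line template is the base and "detailed" is the only registered alternative.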

Create the configurable FAISS retriever

Now that we have the configurable prompt, we need the configurable retriever. We want to create two FAISS vector stores saved locally and be able to choose either one at runtime for our questions.

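from typing import List, Optional

from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_core.runnables import ConfigurableField, RunnableConfig, RunnableSerializable

politic_vector_store_path = "politic_vector_store_path.faiss"
environnetal_vector_store_path = "environnetal_vector_store_path.faiss"

# `embedding` is the embeddings object created in the app setup (not shown in this post).
class ConfigurableFaissRetriever(RunnableSerializable[str, List[Document]]):
    vector_store_topic: str

    def invoke(
        self, input: str, config: Optional[RunnableConfig] = None
    ) -> List[Document]:
        """Invoke the retriever."""
        # Read the topic from the instance so the configured value is used.
        vector_store_path = (
            politic_vector_store_path
            if "Politic" in self.vector_store_topic
            else environnetal_vector_store_path
        )
        # Recent langchain_community releases require this flag; only load files you created yourself.
        faiss_vector_store = FAISS.load_local(
            vector_store_path, embedding, allow_dangerous_deserialization=True
        )
        retriever = faiss_vector_store.as_retriever(
            search_type="similarity", search_kwargs={"k": 4}
        )
        return retriever.invoke(input, config=config)

# The field id is reconstructed from the key used at invocation time.
configurable_faiss_vector_store = ConfigurableFaissRetriever(
    vector_store_topic="default"
).configurable_fields(
    vector_store_topic=ConfigurableField(
        id="vector_store_topic",
        name="Vector store topic",
        description="The topic of the faiss vector store.",
    )
)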

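Let’s explain this code: ConfigurableFaissRetriever is a custom runnable with a single field, vector_store_topic. Each time it is invoked, it picks the Politic store if the topic contains "Politic" and the Environnemental store otherwise, loads that FAISS store from disk, builds a similarity retriever that returns the 4 closest chunks, and runs it on the input. The field is then exposed as a configurable field named "Vector store topic" so its value can be set at runtime.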
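
The topic-to-path rule is worth isolating, because it decides which store answers the question. The sketch below reproduces only that rule in plain Python, with a hypothetical helper name select_store_path; the two paths are the ones used in the post.

```python
# Isolated version of the path selection rule used inside the retriever.
# `select_store_path` is a hypothetical helper name, not part of the post's code.
POLITIC_PATH = "politic_vector_store_path.faiss"
ENVIRONMENTAL_PATH = "environnetal_vector_store_path.faiss"


def select_store_path(topic: str) -> str:
    """Return the Politic path if the topic mentions 'Politic', else the other one."""
    return POLITIC_PATH if "Politic" in topic else ENVIRONMENTAL_PATH


if __name__ == "__main__":
    for topic in ("Politic", "Environnemental", "politic"):
        print(topic, "->", select_store_path(topic))
```

Note that the substring test is case-sensitive and that any value that does not contain "Politic" silently falls back to the environmental store, so the select box values and this rule have to stay in sync.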

Create the FAISS vector stores

Now that we have the configurable FAISS retriever, we need our vector stores. The goal is to create each vector store once, save it locally and reuse it. We need to design the interface so that the vector store creation step does not show up every time we chat.

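import streamlit as st
from langchain_community.vectorstores import FAISS

# `embedding` is the embeddings object created in the app setup (not shown in this post).
with st.expander("Upload Files to Vector Stores"):
    politic_index_uploaded_file = st.file_uploader(
        "Upload a text file to Politic vector store:", type="txt", key="politic_index"
    )
    if politic_index_uploaded_file is not None:
        string_data = politic_index_uploaded_file.getvalue().decode("utf-8")
        # One chunk per paragraph (paragraphs are separated by a blank line).
        splitted_data = string_data.split("\n\n")
        politic_vectorstore = FAISS.from_texts(splitted_data, embedding=embedding)
        # Reconstructed step: save the store so the chat section can load it later.
        politic_vectorstore.save_local(politic_vector_store_path)
        st.success("Politic vector store loaded successfully!")

    environnetal_index_uploaded_file = st.file_uploader(
        "Upload a text file to the Environnemental vector store:",
        type="txt",
        key="environnetal_index",  # assumed key; the original arguments are cut off
    )
    if environnetal_index_uploaded_file is not None:
        string_data = environnetal_index_uploaded_file.getvalue().decode("utf-8")
        splitted_data = string_data.split("\n\n")
        environnetal_vectorstore = FAISS.from_texts(splitted_data, embedding=embedding)
        # Reconstructed step: save the store so the chat section can load it later.
        environnetal_vectorstore.save_local(environnetal_vector_store_path)
        st.success("Environnemental vector store loaded successfully!")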

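Let’s explain this code: both uploaders live inside a collapsed expander, so the upload interface stays out of the way while we chat. When a text file is uploaded, we decode it as UTF-8, split it on blank lines to get one chunk per paragraph, and build a FAISS vector store from those chunks. The store is then saved locally so that it can be reused.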
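
The decoding and splitting step can be tried on its own, without Streamlit or FAISS. The sketch below is a variant of the post's splitting logic under one assumption of mine: it also strips whitespace and drops empty chunks, which the original code does not do. The helper name split_uploaded_text is hypothetical.

```python
from typing import List


def split_uploaded_text(raw: bytes) -> List[str]:
    """Decode uploaded bytes as UTF-8 and return one chunk per paragraph.

    Unlike the plain `split("\\n\\n")` in the post, this variant strips each
    chunk and drops the empty ones (an assumption, not the post's behaviour).
    """
    text = raw.decode("utf-8")
    return [chunk.strip() for chunk in text.split("\n\n") if chunk.strip()]


if __name__ == "__main__":
    sample = b"First paragraph.\n\n\n\nSecond paragraph.\n"
    print(split_uploaded_text(sample))
```

Dropping empty chunks avoids embedding empty strings when a file contains several consecutive blank lines.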

Create the Langchain chain

Now that we have the vector stores and the different components, let’s create our chain to tie everything together.

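from operator import itemgetter

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate, format_document
from langchain_core.runnables import RunnableMap, RunnablePassthrough

DEFAULT_DOCUMENT_PROMPT = PromptTemplate.from_template(template="{page_content}")

def combine_documents(
    docs, document_prompt=DEFAULT_DOCUMENT_PROMPT, document_separator="\n\n"
):
    """Combine documents into a single string."""
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)

CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(
    """Given the following conversation and a follow up question, rephrase the 
follow up question to be a standalone question, in its original language.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
)

def format_chat_history(chat_history: list) -> str:
    """Format chat history (a list of role/content dicts) into a string."""
    buffer = ""
    for dialogue_turn in chat_history:
        actor = "Human" if dialogue_turn["role"] == "user" else "Assistant"
        buffer += f"{actor}: {dialogue_turn['content']}\n"
    return buffer

vector_store_topic = None
output_type = None

# `model` is the chat model created in the app setup (not shown in this post).
# Step 1: condense the chat history and the follow-up into a standalone question.
inputs = RunnableMap(
    standalone_question=RunnablePassthrough.assign(
        chat_history=lambda x: format_chat_history(x["chat_history"])
    )
    | CONDENSE_QUESTION_PROMPT
    | model
    | StrOutputParser(),
)

# Step 2: retrieve documents for the standalone question and merge them into one string.
context = {
    "context": itemgetter("standalone_question")
    | configurable_faiss_vector_store
    | combine_documents,
    "question": itemgetter("standalone_question"),
}
# Step 3: answer with the configurable prompt.
chain = inputs | context | configurable_prompt | model | StrOutputParser()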

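Let’s explain this code: the chain has three stages. First, inputs formats the chat history as text and asks the model to rephrase the history plus the follow-up question into a single standalone question. Second, context sends that standalone question to the configurable FAISS retriever and joins the retrieved documents into one string with combine_documents. Finally, the context and the question go through the configurable prompt and the model, and StrOutputParser turns the model output into a plain string.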
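
To see what the model actually receives in the first stage, the history formatting can be run without Langchain. The sketch below reuses the logic of format_chat_history and fills a plain string template with str.format; the template is a shortened stand-in for CONDENSE_QUESTION_PROMPT, and the sample messages are made up for illustration.

```python
from typing import Dict, List

# Shortened stand-in for the condense prompt; filled with str.format instead of Langchain.
CONDENSE_TEMPLATE = (
    "Chat History:\n{chat_history}\n"
    "Follow Up Input: {question}\n"
    "Standalone question:"
)


def format_chat_history(chat_history: List[Dict[str, str]]) -> str:
    """Turn role/content dicts into 'Human: ...' / 'Assistant: ...' lines."""
    buffer = ""
    for dialogue_turn in chat_history:
        actor = "Human" if dialogue_turn["role"] == "user" else "Assistant"
        buffer += f"{actor}: {dialogue_turn['content']}\n"
    return buffer


# Made-up conversation, in the same shape as the Streamlit session messages.
history = [
    {"role": "assistant", "content": "Hello, how can I assist you?"},
    {"role": "user", "content": "What does the report say about taxes?"},
]
condense_input = CONDENSE_TEMPLATE.format(
    chat_history=format_chat_history(history),
    question="And about pensions?",
)

if __name__ == "__main__":
    print(condense_input)
```

Any role other than "user" is labelled "Assistant", which matches the two roles stored by the Streamlit app.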

Create the chain invocation

Now that we have the chain ready, we need to invoke the chain with the parameters.

import os

st.header("Chat with your vector stores")
# Only show the chat once at least one vector store has been saved locally.
if os.path.exists(politic_vector_store_path) or os.path.exists(
    environnetal_vector_store_path
):
    vector_store_topic = st.selectbox(
        "Choose the vector store configuration:",
        ["Politic", "Environnemental"],
    )
    output_type = st.selectbox(
        "Select the type of response:", ["detailed", "single_line"]
    )

    if "message" not in st.session_state:
        st.session_state["message"] = [
            {"role": "assistant", "content": "Hello 👋, How can I assist you ?"}
        ]

    chat_history = []

    # Replay the conversation so far (the display calls are reconstructed).
    for message in st.session_state.message:
        with st.chat_message(message["role"]):
            st.markdown(message["content"])

    if query := st.chat_input("Ask me anything"):
        st.session_state.message.append({"role": "user", "content": query})
        with st.chat_message("user"):
            st.markdown(query)

        # Pass the selected values to the configurable prompt and retriever.
        response = chain.with_config(
            configurable={
                "vector_store_topic": vector_store_topic,
                "output_type": output_type,
            }
        ).invoke({"question": query, "chat_history": st.session_state.message})

        st.session_state.message.append({"role": "assistant", "content": response})
        with st.chat_message("assistant"):
            st.markdown(response)

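Let’s explain this code: the chat section only appears once at least one vector store exists on disk. Two select boxes set the vector store topic and the type of response. The conversation is kept in st.session_state["message"] and replayed on every rerun. When the user sends a query, we append it to the history, invoke the chain with the two selected values as configuration together with the question and the chat history, then store and display the response.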
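
The runtime values travel to the chain inside a config dictionary under the "configurable" key. As a rough sketch, the dictionary can be built and checked in a small helper before invoking the chain; build_config and the two ALLOWED_* tuples are hypothetical names, and the allowed values are simply the options of the two select boxes.

```python
from typing import Any, Dict

# Allowed values copied from the two select boxes in the post.
ALLOWED_TOPICS = ("Politic", "Environnemental")
ALLOWED_OUTPUT_TYPES = ("detailed", "single_line")


def build_config(vector_store_topic: str, output_type: str) -> Dict[str, Any]:
    """Build the config dict for the chain (hypothetical helper, not from the post)."""
    if vector_store_topic not in ALLOWED_TOPICS:
        raise ValueError(f"Unknown vector store topic: {vector_store_topic!r}")
    if output_type not in ALLOWED_OUTPUT_TYPES:
        raise ValueError(f"Unknown output type: {output_type!r}")
    return {
        "configurable": {
            "vector_store_topic": vector_store_topic,
            "output_type": output_type,
        }
    }


if __name__ == "__main__":
    print(build_config("Politic", "detailed"))
```

The resulting dictionary could then be passed as chain.with_config(build_config(topic, output_type)), assuming the field ids are "vector_store_topic" and "output_type" as in the invocation code.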

Below is what the app will look like. Nice, right?


Congratulations! You just created a RAG chat application that looks like something that could be used in the real world. You also learned how to use configuration to change the behaviour of your chains and how to compose multiple chains together.

But the fun only begins now! To have a real world application that you could put in front of users, you will need a real web framework instead of Streamlit, with features like user management and authentication, and also something more robust than local FAISS vector stores.
This is tough work but, as they say, Rome was not built in a day, right?


I hope this tutorial helped you and taught you many things. I will update this post with more nuggets from time to time. Don’t forget to check my other posts, as I write a lot about practical AI topics.

Cheers !
