
Create our own Copilot-style chat interface with LangChain

Copilot-style interfaces are becoming more and more mainstream, popularized by tools like Claude and ChatGPT that rely on them heavily. It is a very powerful feature but, at the same time, it is not that complex to build.

So let’s build our own Copilot-style chat from scratch using LangChain, OpenAI and Streamlit, so that you can understand how your favorite tools work!

As usual, we will strive for simplicity so let’s get to it!

What is a Copilot-style interface?

A Copilot-style interface is simply an interface split in two: on one side, a panel that you are actively working in; on the other, a helper panel backed by far more intelligence.

The best example is Claude’s Artifacts feature (which is actually 1 year old now).

You have the chat on the left and the Artifacts panel on the right, which can show different types of output, from code to markdown to mermaid diagrams to rendered web app code, to better answer the query.

What is important is that, using LLMs, you can have an intelligent system that gives you and shows you what you need at the right time.

This is what a copilot is: someone or something that helps you while you work and watches out for you.

Why Is It Actually a Simple Feature to Implement?

As said before, the copilot feature is really powerful.
But at the same time, it is not that complicated to create, because all the hard work is done on the LLM side: the LLM decides whether a visualization would help the answer and, if so, generates it in one of three formats: markdown, code or mermaid.

These 3 types of output are actually very powerful because they cover a lot of use cases: markdown for formatted text and documentation, code for programming answers, and mermaid for diagrams and flowcharts.

All three formats are widely used, so libraries to render them exist in the most common languages (Python, JavaScript, …) and frameworks (React, Angular, …).

The Prompt: the Core of the Copilot

The core of the copilot is the LLM, and therefore the prompt. This is where you explain to the LLM what type of visualization you want and how you want it generated. Many formats are possible, but here’s one:

{
    "explanation": "",
    "viz_content: "",
    "viz_type": "markdown | code | mermaid"
}
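
For example, a filled-in response from the LLM might look like this (illustrative values, not actual model output):

{
    "explanation": "Bubble sort repeatedly swaps adjacent elements that are out of order.",
    "viz_content": "def bubble_sort(arr):\n    for i in range(len(arr)):\n        ...",
    "viz_type": "code"
}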

To craft such a prompt, you will need to handle multiple points: describing the JSON output format, giving rules and an example for each visualization type, and handling escaping (newlines, quotes) so the output stays valid JSON.

The good news is that with a fairly powerful LLM, it will in most cases generate correct output.

Here I showed only 3 types of visualization because those are the easiest to implement, but you can use other formats. The only limits are what the LLM can generate robustly and what renderers exist for the format.

Copilot-Style chat the easy way

Now that we have seen the theory, let’s create our own copilot chat!

For that, we are going to use Python, LangChain, OpenAI and Streamlit. You can find all the code here.

Python Dependencies

First let’s install the required libraries:

pipenv install langchain langchain-openai streamlit streamlit_mermaid python-dotenv

Here we are using pipenv to add all our required libraries. Most of them are common; the only special ones are python-dotenv and streamlit_mermaid, which are respectively a library to load environment variables from .env files and a Streamlit component to render mermaid diagrams.

Create a file .env containing your OpenAI API key at the same level as your code:

OPENAI_API_KEY=
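
In your code, a single call to load_dotenv is then enough to make this key available (a minimal sketch):

from dotenv import load_dotenv

# Reads .env and exports OPENAI_API_KEY into the process environment,
# where the OpenAI client picks it up automatically
load_dotenv()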

Copilot Prompt

Now that this is done, let’s look at the code. Let’s begin with the prompt:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.messages import HumanMessage, SystemMessage

prompt = ChatPromptTemplate.from_messages(
        [
            SystemMessage(
                content="""You are a helpful programming assistant.
            You will give a detailed answer.
            If needed for the clarity of the answer, you can add a visualisation as a mermaid diagram, code, markdown or text.
            
            For mermaid diagrams, follow these rules:
            - Use '-->' for arrows (not unicode arrows like '→')
            - Start with graph direction (TD, LR, etc.)
            - Each node definition on new line with \\n
            - Escape special characters
            
            Example mermaid format:
            {
                "explanation": "Here's a flowchart example",
                "viz_content": "graph TD\\n    A[Start] ---> B[Process]\\n    B ---> C[End]",
                "viz_type": "mermaid"
            }
            
            For markdown content, follow these rules:
            - Use valid markdown syntax
            - Use \\n for newlines
            - Escape special characters
            
            Example markdown format:
            {
                "explanation": "Here's a markdown example",
                "viz_content": "# Title\\n\\nSome **bold** text and a [link](https://example.com)",
                "viz_type": "markdown"
            }
            
            For code content, follow these rules:
            - Use valid code syntax
            - Use \\n for newlines
            - Escape special characters
            
            Example code format:
            {
                "explanation": "Here's a code example",
                "viz_content": "def hello_world():\\n    print('Hello, world!')",
                "viz_type": "code"
            }
            
            Rules for all responses:
            - Must be valid JSON
            - Use \\n for newlines
            - Use \\" for quotes
            - No raw backticks
            """
            ),
            HumanMessage(
                content=f"""
            Previous conversation:
            {conversation}
            
            Current user message: {current_msg}"""
            ),
        ]
    )

Here’s how this prompt is made: a system message defines the assistant’s role, gives rules and an example JSON output for each visualization type (mermaid, markdown, code), and ends with global formatting rules so the whole answer is valid JSON; a human message then injects the previous conversation and the current user message, so the assistant has the full chat context.
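
The conversation string injected in the human message has to be built from the chat history. A hypothetical helper (not part of the original snippet) could look like this:

def format_conversation(messages: list[dict]) -> str:
    # Flatten the chat history into "role: content" lines for the prompt
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)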

LLM Output Format

Now let’s take care of the output format of the LLM answer. As mentioned before, we need 3 keys: “explanation”, “viz_content” and “viz_type”.

from typing import Literal

from pydantic import BaseModel, Field
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI

class LLMOutput(BaseModel):
    explanation: str = Field(..., description="The text explanation for the user")
    viz_content: str = Field(..., description="The content to be visualized")
    viz_type: Literal["code", "markdown", "mermaid", "text"] = Field(...)

...

    model = ChatOpenAI(model="gpt-4", temperature=0)
    parser = JsonOutputParser(pydantic_object=LLMOutput)

    chain = prompt | model | parser
    response = chain.invoke({"messages": messages})

Let’s see how this code works: LLMOutput describes the expected JSON schema with Pydantic; JsonOutputParser parses the raw LLM answer into a dict matching that schema; and the chain pipes the prompt into the model and then into the parser, so response comes back as a ready-to-use dict.
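
Since even a strong model occasionally emits invalid JSON, it can be worth guarding the call; here’s a sketch assuming you want a plain-text fallback:

from langchain_core.exceptions import OutputParserException

try:
    response = chain.invoke({"messages": messages})
except OutputParserException:
    # Fall back to plain text when the model's answer is not valid JSON
    response = {
        "explanation": "Sorry, I could not format that answer properly.",
        "viz_content": "",
        "viz_type": "text",
    }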

Generate Different Types of Visualizations

Now that we have the correct output, we can see how it is used. This snippet uses Streamlit, but the logic behind it is the same and can be applied in any framework like React or Angular.

import streamlit as st
import streamlit_mermaid as stmd

if last_msg["viz_content"]:
    if last_msg["viz_type"] == "markdown":
        st.markdown(last_msg["viz_content"])
    elif last_msg["viz_type"] == "mermaid":
        stmd.st_mermaid(last_msg["viz_content"])
    elif last_msg["viz_type"] == "code":
        st.code(last_msg["viz_content"])
    else:
        st.write(last_msg["viz_content"])

Here’s what is happening in this code snippet: we dispatch on viz_type, rendering markdown with st.markdown, mermaid diagrams with the st_mermaid component, code with st.code, and falling back to st.write for plain text.
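
For context, last_msg would typically come from Streamlit’s session state. Here’s a hypothetical sketch of that wiring, with ask_llm standing in for the prompt | model | parser chain above:

import streamlit as st

if "messages" not in st.session_state:
    st.session_state.messages = []  # persists across Streamlit reruns

if user_input := st.chat_input("Ask me anything"):
    # ask_llm is a hypothetical wrapper around the chain from the previous section
    st.session_state.messages.append(ask_llm(user_input))

last_msg = st.session_state.messages[-1] if st.session_state.messages else None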

Pretty simple, right?! And this can be reused easily in any application.

Copilot chat execution

Now that we have the code, let’s run it!

pipenv run streamlit run app.py

And now let’s check out our beautiful app:

As you can see, we first asked how metaclasses work in Python, and the LLM decided to give us some Python code along with the explanation. Then we asked for a diagram and it generated the mermaid version. Pretty neat, right?!

The best part is that it looks like what you can see in ChatGPT or Claude (though not as polished), which showcases how easy it is to add this kind of feature to your own applications.

Agents or No Agents: That Is the Question

As you can guess, what you saw before is fully implementable using LLM agents. You can actually check out my earlier article for an example. So why not use the same approach? Mainly because an agent adds extra latency, cost and complexity that a single structured LLM call avoids.
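
For illustration, in an agent-based version each visualization type could become a tool that the LLM chooses to call. A conceptual sketch using LangChain’s @tool decorator (show_mermaid is hypothetical, not code from this project):

from langchain_core.tools import tool

@tool
def show_mermaid(diagram: str) -> str:
    """Display a mermaid diagram in the visualization panel."""
    # Hypothetical hook into the UI; you would define one tool per visualization type
    return "diagram displayed"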

Therefore the decision to use agents or not depends on the use case and what you want. The best way is to begin with the simplest implementation and see how far it takes you!

Conclusion

In this post, we saw how a copilot-style chat works by using the LLM to decide on and generate a visualization according to the user query. By specifying information like the type of visualization in the LLM output, we can show many different kinds of visualization using the markdown, code and mermaid formats. The implementation itself is not difficult, which makes integrating such a feature simple (the hard part is the UX and frontend work).

The copilot style adds a lot of value to an application, as you can give the user an assistant that helps them make better use of your application, whether it is a RAG chat or a coding copilot. One of the best uses of LLMs is to help users be more productive, and the copilot-style interface really hits the mark there.
