In this post, we are going to talk about integrating reasoning models like DeepSeek or OpenAI o3 into a company's business. It is not enough to just deploy such a model and use it. The model is capable of "reasoning", but you still need to give it what to "reason" on and how. This is the role of meta knowledge.
In this blog post, I will explain reasoning models in a simple way, talk about meta knowledge, and explain why it is absolutely necessary. As usual, I will try to be straightforward, so let's get to it.
What are reasoning models?
Reasoning models are a special class of LLMs that are significantly better at handling tasks that need reasoning. They are based on the following principles (for a more detailed explanation, don't hesitate to read my detailed post on how OpenAI o1 works):
- Longer inference time: Unlike classical LLMs, where the response is generated instantly, a reasoning model takes more time to answer a prompt, as it generates different plans (CoT) and uses the final plan to answer the query. This is a paradigm change: we are not only putting resources into training but also into inference.
- CoT generation: The model generates a CoT (Chain-of-Thought), which is like a plan or a list of steps to answer the query (it is based on the prompting technique of the same name that asks the LLM to answer a query step by step).
- Large-scale reasoning dataset: To train the reasoning part, a large-scale dataset focused on reasoning, built from code and math problems, is used.
- "Aha" moments during training: The model is trained using reinforcement learning with a given reward function. During training, the model experiments with different solutions for each problem in the training dataset to maximize the reward. This means that some "aha" moments happen during training, where the model, in the middle of trying a solution to a problem, suddenly grasps a new concept or strategy and instantly gets better. This is the power of reinforcement learning.
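As a toy illustration of that training signal, the reward for a math problem can be as simple as checking the final answer. This is a sketch, not a real training loop: production reward functions also score things like output format and use learned or programmatic verifiers.

```python
# Toy outcome reward: 1.0 when the model's final answer matches the reference.
def outcome_reward(model_answer: str, reference_answer: str) -> float:
    return 1.0 if model_answer.strip() == reference_answer.strip() else 0.0

# During RL training, the model samples many candidate solutions per problem
# and is updated toward the ones that score the highest reward.
candidates = ["42", " 42 ", "41"]
rewards = [outcome_reward(c, "42") for c in candidates]
```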
Practical difference with a reasoning model
So what is really the difference between a reasoning model and a normal model for end users?
A reasoning model will first generate the plan (or CoT) it needs to answer your query and then give you the answer. Let's take an example:
Query: How to merge two dicts?
Answer:
<think>
Alright, the user just asked, "How to merge 2 dicts?" I need to figure out
what they mean by that. From our previous conversation, I see they were
asking about the capital of France, and I answered correctly. Now, their
follow-up question is about merging two dictionaries.
First, I should consider the context in which they're asking this. Maybe
they are working on a programming task or dealing with data structures.
The term "dict" usually refers to dictionaries in Python, so it's likely
they're coding something and need help combining two dictionaries.
I should explain how to merge two dictionaries by keys in Python. There
are different methods depending on whether the keys are similar or
conflicting. If keys overlap, one dictionary will overwrite the other. I
can outline both scenarios: merging without overwriting and using update
to overwrite existing keys.
Additionally, I'll provide examples to illustrate each method clearly.
Using a simple code example would make it easier for them to understand
how to implement this in their project. It's important to keep the
explanation straightforward and avoid any jargon that might confuse them.
</think>
To merge two dictionaries in Python (assuming you're working with Python
dictionaries), there are a few ways to do this, depending on how you want
to handle overlapping keys:
....
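For reference, the merging methods the truncated answer alludes to look roughly like this (a sketch in plain Python, not the model's actual output):

```python
a = {"x": 1, "y": 2}
b = {"y": 3, "z": 4}

# Dict unpacking (Python 3.5+): values from b overwrite a on overlapping keys.
merged = {**a, **b}

# Merge operator (Python 3.9+): same precedence, b wins on conflicts.
merged_op = a | b

# In-place update: mutates a directly, b's values win on overlap.
a.update(b)
```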
The part between the <think> tags is the plan (or CoT) that it will use to answer the question. In this case, it tried to:
- Figure out what the user means by the question
- Take the context into consideration
- Decide what to include in the answer
- Decide how to present it
As you can see, this is very similar to how you would approach a math problem, which is the goal of using math- and code-focused training datasets. But this has its pros and cons:
Advantages of reasoning models:
- Complex tasks: they are made to handle complex tasks that need to be decomposed into smaller tasks.
- Planning: They are very good if you want to discuss and iterate on how to achieve a goal. Because the model gives you the step-by-step on how to accomplish what you want, it is easier to understand.
- Validation: Because you have the model's plan, you can better validate the reasoning behind it. This is really interesting in contexts where you truly need the correct answer, because you can validate the steps taken to produce it.
- Control over the reasoning: You can control the reasoning by giving advice or a methodology to use at runtime. This is the goal of meta knowledge!
Disadvantages of reasoning models:
- Longer inference does not equal a better answer: In theory, longer inference should give you a better answer, but that is not always the case. Because the model generates a plan of sub-tasks to answer your query, there are just as many places where errors can happen.
- More token usage: They generate a lot more tokens, as they need to produce multiple solutions and use them to build the final plan. This means more generated tokens and a higher cost.
- Slower response: Longer inference means a slower answer. In contexts where you need a very fast answer, these are not the kind of models you need.
- Made for complex tasks: they are made to handle complex tasks, not simple ones like "How do you say that in this language?". Using a reasoning model for a simple task is like using a bazooka to kill a fly.
Reasoning models and agentic
Reasoning models are great because they can handle very complex tasks by creating a plan and decomposing it into smaller tasks. This is perfect for agentic workflows, from basic agents to the most complex multi-agent systems.
To talk about usage, let's look at some multi-agent architectures:
- Network: Each agent can communicate with every other agent. Any agent can decide which other agent to call next.
- Supervisor: Each agent communicates with a single supervisor agent. Supervisor agent makes decisions on which agent should be called next.
- Supervisor (tool-calling): This is a special case of supervisor architecture. Individual agents can be represented as tools.
- Hierarchical: You can define a multi-agent system with a supervisor of supervisors.
- Custom multi-agent workflow: Each agent communicates with only a subset of agents. Parts of the flow are deterministic, and only some agents can decide which other agents to call next.
- Single agent: a single agent uses the available tools to answer the query.
In all these architectures, three principles are at play:
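To make the supervisor pattern concrete, here is a minimal sketch in plain Python. The worker agents and the routing are hypothetical stubs; in a real system, the supervisor would ask an LLM which agent to call next.

```python
# Hypothetical worker agents: each one has a single, focused responsibility.
def sales_agent(task: str) -> str:
    return f"[sales data for: {task}]"

def report_agent(task: str) -> str:
    return f"[report built from: {task}]"

AGENTS = {"sales": sales_agent, "report": report_agent}

def supervisor(query: str) -> str:
    # Stub routing: a real supervisor would loop, asking an LLM to pick the
    # next agent until the task is done. Here the plan is hardcoded.
    result = query
    for agent_name in ["sales", "report"]:
        result = AGENTS[agent_name](result)
    return result
```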
- Single responsibility: One agent takes care of one task. The more focused the task, the better you can tailor the agent. If the task is not complex, then you do not need a reasoning model.
- Decision making: One agent decides whether it needs to call a tool or another LLM for a specific piece of information or task. In the supervisor case, you have an agent whose only job is to plan and supervise what the other agents do.
- Flow flexibility: This is how much freedom the agent system has. In the single-agent case, you have maximal flexibility, as you are only restricted by the tools you have. At the opposite end, the custom multi-agent workflow has a limited set of possible paths and outcomes. The more flexibility there is, the more use cases the agent system can handle, but at the price of increased complexity and potential failures.
But reasoning models change the scope of these principles:
- Single, bigger responsibility: The agent should still take care of only one task, but the scale differs. Instead of having a task like "Get the sales of product X from DATE1 to DATE2", you can have one agent for the whole database that has the data model in its context. It can then answer the previous task, but also all the others for the same database.
- High-level decision making: The agent should be able to call other agents or tools, but at a much bigger scale. It should be able to create a plan using all the various products and data available for a domain. You do not ask it "Get the sales per user for product X in country Y from DATE1 to DATE2"; you ask "Compare the sales of each product in the last month with the previous month and give me an analysis". The only thing it needs to know is which agents, products, and data are available and how to use them.
- Flow flexibility: The flexibility is maximal, as the agent can do whatever it is allowed to do to accomplish the task. There is no fixed flow or outcome.
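This high-level decision making can be sketched as a plan-then-execute loop, where the plan stands in for the CoT a reasoning model would emit. Both the planner and the executor below are hardcoded stubs, not a real model or real tools.

```python
def plan(query: str) -> list[str]:
    # Stub: a reasoning model would generate these sub-tasks in its CoT,
    # based on the query and the meta knowledge in its context.
    return [
        "get sales per product for last month",
        "get sales per product for the previous month",
        "compare the two periods and write an analysis",
    ]

def execute(step: str) -> str:
    # Stub executor: a real agent would call tools (SQL, code execution, ...).
    return f"done: {step}"

def run(query: str) -> list[str]:
    return [execute(step) for step in plan(query)]
```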
So reasoning models will be key to unlocking enterprise-level agentic systems, but only if we tell them how: this is the role of meta knowledge.
Meta Knowledge: unlocking reasoning models
The goal of a reasoning model is to create a plan to answer a query; in effect, it is creating a strategy to accomplish a given goal. But by itself, it only has the data it was trained on. So it is very good at math problems, coding, and a few other things. But what if you ask it questions about your company's data?
Obviously, it will not be able to give a correct answer. But if you give it enough data, it could. This is actually RAG. But what about when you want an agent that can, by itself, get the needed data or call this or that API? Then you need to give it information on what data is available, how to get it, and how to assemble all the different pieces of data it finds. You give it knowledge on how to use knowledge –> meta knowledge.
Let's take the example of an SQL database used in a webapp that sells products. Inside it, you have tables for users, products, prices, contracts, and stocks. We want to put an agent system in front of this database so that we can ask things like: "Compare the sales from the last 3 months".
To simplify this example, we will assume that the SQL queries and code generated by the system are always technically correct and that there are no data problems.
So here is what we should give this agent so that it can plan effectively:
- A tool to access the data inside the sql database.
- A tool to send the report in whatever format is accepted.
- A tool to execute generated python code that will be used by the model to do the data processing and calculus.
- The data model of the database: table names, what they contain, links between tables, …
- Information on the different business processes: specific contracts, discounts, …
- Output specification: output format, what data needs to be included, …
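As an illustration, this meta knowledge could be assembled into the agent's context like so. The table names match the example above, but the business rules, tool names, and helper function are hypothetical, just to show the shape such a document can take.

```python
# Hypothetical meta knowledge for the sales database example.
META_KNOWLEDGE = """\
## Data model
- users(id, name, country)
- products(id, name)
- prices(product_id, amount, valid_from)
- contracts(user_id, product_id, discount_pct)
- stocks(product_id, quantity)

## Business rules
- Final sale price = prices.amount * (1 - contracts.discount_pct / 100).

## Output specification
- Reports are returned as markdown tables with one row per product.
"""

TOOLS = {
    "run_sql": "execute a read-only SQL query against the database",
    "run_python": "execute generated Python code for data processing",
    "send_report": "send the final report in the accepted format",
}

def build_context(tools: dict[str, str], meta_knowledge: str) -> str:
    # Combine tool descriptions and meta knowledge into one agent context.
    tool_lines = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    return f"## Available tools\n{tool_lines}\n\n{meta_knowledge}"
```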
So here, we gave the agent not only the required tools for its tasks but also the necessary business and technical knowledge on how to use them.
What's more, using the generated plan, we can easily validate the reasoning behind it and then adjust the meta knowledge we provide. The goal is to give as much information as needed so that the agent can correctly devise a plan to finish the task, the same way we would give all the required information to a newcomer to fulfill a task.
What does Meta Knowledge look like ?
Meta knowledge is actually a fancy name for the documents that explain how the company's business works. It is the kind of documentation you would give to a newcomer in the domain you want to create the agent for. Here are some types of documents you can use:
- Slides that explain at higher level the different concepts.
- Business documents that explain the details of the domain.
- Technical specifications describing the underlying data model.
- Data models of the different data source.
- Programming code that is used in the current system.
But since this is something you give to a model, you need to be careful to transform it into a format the model can understand. A model only understands images, text, and code, so you need to process your whole corpus of documents into something the model can use. It should be:
- Understandable: this document should be understandable by both the business owners, who must validate it, and the model.
- Comprehensive: this document should contain all the information the agent needs to handle all its use cases.
- Focused: this document should be focused on the domain or group of use cases the agent needs to handle. Any more and you risk reasoning errors.
- Accepted format: this document should be in a format readable by both the model and the document's maintainers. Possible formats are, for example, markdown or code. The most important point is that it can be validated by all relevant parties.
To summarize, it is the kind of document you would give to a newcomer developer tasked with developing all the features the agent will handle. If you think it would not be enough for the developer, then it will not be enough for the agent.
Conclusion
Reasoning models are a type of model that generates a plan (CoT) to complete a task before executing it. To do so, they are trained on specific logic-oriented datasets and use longer inference time to first generate the plan. They are well suited for agentic workflows, as they can handle far more complexity than normal models. By providing meta knowledge, which is knowledge on how to use business knowledge, data sources, and internal processes in an understandable format, it is possible to create higher-level agentic systems that can, by themselves, generate correct reasoning for high-level tasks. This is really the beginning of enterprise-level agentic systems, where agent systems will communicate between themselves and achieve most tasks.
Afterwards
I hope you loved this post. Don't forget to check out my other posts, as I write a lot of cool posts on practical stuff in AI.