Building Agentic RAG Application with DeepSeek R1 — A Step-by-Step Guide [Part 2]

Building Agentic RAG Application with LangChain, CrewAI, Ollama & DeepSeek

Jan 29, 2025

While Retrieval-Augmented Generation (RAG) dominated 2023, agentic workflows are driving massive progress in 2024 and 2025. The use of AI agents opens up new possibilities for building more powerful, robust, and versatile large language model (LLM)-powered applications.

One such possibility is agentic RAG: enhancing RAG pipelines with AI agents. This article provides a step-by-step guide to building an agentic RAG application using the DeepSeek R1 model.

In Part 1, you were introduced to the agentic RAG workflow and the main components of our application. We then started the implementation by setting up the working environment and installing the LLM locally with Ollama, and we defined the tools and agents that we use in this second part.

In this part, we continue building the workflow by defining the tasks for each agent, then wrap up the whole agentic RAG workflow, and finally put it into action and test it with different queries.

1. Agentic RAG Pipeline [Part 1]
2. Setting Up the Working Environment & Installing DeepSeek LLM [Part 1]
3. Define Agentic RAG Tools [Part 1]
    3.1. Define RAG Tool
    3.2. Define Web Search Tool
4. Define Agents [Part 1]
    4.1. Define Router Agent
    4.2. Define Retriever Agent
    4.3. Define Answer Grader Agent
    4.4. Define Hallucination Grader Agent
    4.5. Define Answer Generator Agent
5. Define Agents Tasks
    5.1. Define Router Task
    5.2. Define Retriever Task
    5.3. Define Answer Grading Task
    5.4. Define Hallucination Grading Task
    5.5. Define Answer Generation Task
6. Define Agentic Workflow
7. Agentic RAG Pipeline in Action

5. Define Agents Tasks

5.1. Define Router Task

The first task is the routing task. It directs the agent to search the vector store for the answer and, if the answer is not likely to be there, to search the web instead.

from crewai import Task  # only needed if Task was not already imported in Part 1

router_task = Task(
    # {question} is a placeholder that is filled with the user's query at run time.
    description=(
        "Analyze the given question {question} to determine the appropriate search method:\n"
        "\n"
        "1. Use 'vectorstore' if:\n"
        "   - The question contains a keyword or similar words\n"
        "   - The topic is likely covered in our vector database\n"
        "\n"
        "2. Use 'web_search' if:\n"
        "   - The topic requires current or real-time information\n"
        "   - The question is about general topics not covered in our vector database\n"
        "\n"
        "Make decisions based on semantic understanding rather than keyword matching."
    ),
    expected_output=(
        "Return exactly one word:\n"
        "'vectorstore' - if the question can be answered from our RAG knowledge base\n"
        "'web_search' - if the question requires external information\n"
        "No additional explanation or preamble should be included."
    ),
    agent=router_agent,
    tools=[router_tool],
)
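The expected output asks for exactly one word, but a reasoning model such as DeepSeek R1 typically emits a <think>...</think> block before its answer and may add quotes or punctuation. The following is a minimal sketch of how the raw router output could be normalized before any code branches on it. The helper `normalize_route` and its fallback to 'web_search' are assumptions made for illustration; they are not part of CrewAI or of the original application.

```python
import re

# The two labels that router_task is asked to return.
VALID_ROUTES = ("vectorstore", "web_search")


def normalize_route(raw: str, default: str = "web_search") -> str:
    """Map raw router output to 'vectorstore' or 'web_search' (hypothetical helper)."""
    # Drop any <think>...</think> reasoning block emitted before the answer.
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).lower()
    # Unify common spelling variants of the web label.
    for variant in ("websearch", "web-search", "web search"):
        text = text.replace(variant, "web_search")
    found = [route for route in VALID_ROUTES if route in text]
    # Fall back to the default when the output is missing or ambiguous.
    return found[0] if len(found) == 1 else default


print(normalize_route("<think>Looks like a docs topic.</think>\n'vectorstore'"))  # vectorstore
```

Falling back to a web search is only one possible choice; raising an error instead would make routing failures easier to spot during development.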

5.2. Define Retriever Task

The second task retrieves the information needed to answer the question: from the vector store if the router agent decides that the question can be answered there, and from the web otherwise.

retriever_task = Task(
    # Adjacent string literals are joined as-is, so each fragment ends with a space.
    # The route labels must match the router's output exactly: 'vectorstore' or 'web_search'.
    description=(
        "Based on the response from the router task, extract information for the question {question} with the help of the respective tool. "
        "Use the web_search_tool to retrieve information from the web in case the router task output is 'web_search'. "
        "Use the rag_tool to retrieve information from the vectorstore in case the router task output is 'vectorstore'."
    ),
    expected_output=(
        "You should analyse the output of the 'router_task'. "
        "If the response is 'web_search' then use the web_search_tool to retrieve information from the web. "
        "If the response is 'vectorstore' then use the rag_tool to retrieve information from the vectorstore. "
        "Return a clear and concise text as response."
    ),
    agent=Retriever_Agent,
    context=[router_task],  # passes the router's decision to this task
    tools=[rag_tool, web_search_tool],
)
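In plain Python, the behaviour this task asks of the agent amounts to a lookup from route label to tool. The sketch below uses two stub functions, `rag_lookup` and `web_lookup`, as hypothetical stand-ins for rag_tool and web_search_tool; it illustrates the intended logic only and is not how CrewAI invokes tools internally.

```python
from typing import Callable, Dict


def rag_lookup(question: str) -> str:
    # Hypothetical stand-in for rag_tool (vector store retrieval).
    return f"[vectorstore] passages for: {question}"


def web_lookup(question: str) -> str:
    # Hypothetical stand-in for web_search_tool.
    return f"[web] results for: {question}"


# Keys must match the labels the router is asked to return.
TOOLS: Dict[str, Callable[[str], str]] = {
    "vectorstore": rag_lookup,
    "web_search": web_lookup,
}


def retrieve(route: str, question: str) -> str:
    """Call the tool selected by the router, rejecting unknown labels."""
    if route not in TOOLS:
        raise ValueError(f"Unknown route {route!r}; expected one of {sorted(TOOLS)}")
    return TOOLS[route](question)


print(retrieve("vectorstore", "What is agentic RAG?"))
```

The lookup also shows why the labels have to be spelled identically in every prompt: a label such as 'websearch' would match neither key.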

5.3. Define Answer Grading Task

The third task is the answer grading task, whose role is to evaluate whether the content returned by the retriever task is relevant to the question. If the retrieved content is not the right response to the question, the agent should try again.

grader_task = Task(
    description=(
        "Based on the response from the retriever task for the question {question}, evaluate whether the retrieved content is relevant to the question."
    ),
    # Each fragment ends with a space because adjacent string literals are joined as-is.
    expected_output=(
        "Binary score 'yes' or 'no' to indicate whether the document is relevant to the question. "
        "You must answer 'yes' if the response from the 'retriever_task' is in alignment with the question asked. "
        "You must answer 'no' if the response from the 'retriever_task' is not in alignment with the question asked. "
        "Do not provide any preamble or explanations except for 'yes' or 'no'."
    ),
    agent=Grader_agent,
    context=[retriever_task],  # passes the retrieved content to the grader
)
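To make the "grade, then try again" idea concrete, here is a small plain-Python sketch. It assumes the grader's verdict arrives as a string; `parse_grade`, `retrieve_until_relevant`, and the `max_attempts` limit are hypothetical names introduced for illustration, and the loop does not reflect how CrewAI schedules or retries tasks.

```python
from typing import Callable, Optional


def parse_grade(raw: str) -> bool:
    """Return True for 'yes' and False for 'no'; raise on anything else."""
    verdict = raw.strip().strip("'\".!").lower()
    if verdict == "yes":
        return True
    if verdict == "no":
        return False
    raise ValueError(f"Expected 'yes' or 'no', got {raw!r}")


def retrieve_until_relevant(
    retrieve: Callable[[], str],
    grade: Callable[[str], str],
    max_attempts: int = 2,
) -> Optional[str]:
    """Retry retrieval until the grader says 'yes' or attempts run out."""
    for _ in range(max_attempts):
        content = retrieve()
        if parse_grade(grade(content)):
            return content
    return None  # nothing relevant was found within the attempt limit
```

Parsing the verdict strictly means that a reply with extra explanation raises an error instead of being silently read as 'yes' or 'no'.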

5.4. Define Hallucination Grading Task

The fourth task is the hallucination grading task, which evaluates the answer and decides whether it is grounded in facts or contains made-up ones. If there are no hallucinations, the retrieved documents or answers go on to the answer generation agent.

hallucination_task = Task(
    description=(
        "Based on the response from the grader task for the question {question}, evaluate whether the answer is grounded in / supported by a set of facts."
    ),
    # Each fragment ends with a space because adjacent string literals are joined as-is.
    expected_output=(
        "Binary score 'yes' or 'no' to indicate whether the answer is in sync with the question asked. "
        "Respond 'yes' if the answer is useful and contains facts about the question asked. "
        "Respond 'no' if the answer is not useful and does not contain facts about the question asked. "
        "Do not provide any preamble or explanations except for 'yes' or 'no'."
    ),
    agent=hallucination_grader,
    context=[grader_task],  # passes the grader's output to this task
)

5.5. Define Answer Generation Task

The final task is the answer generation task, which produces the final response. Before doing so, it checks the hallucination grader's verdict to make sure that the answer is free of hallucinations and relevant to the input question.

answer_task = Task(
    # Each fragment ends with a space because adjacent string literals are joined as-is.
    description=(
        "Based on the response from the hallucination task for the question {question}, evaluate whether the answer is useful to resolve the question. "
        "If the answer is 'yes', return a clear and concise answer. "
        "If the answer is 'no', then perform a web search and return the response."
    ),
    expected_output=(
        "Return a clear and concise response if the response from 'hallucination_task' is 'yes'. "
        "Perform a web search using 'web_search_tool' and return a clear and concise response only if the response from 'hallucination_task' is 'no'. "
        "Otherwise respond as 'Sorry! unable to find a valid response'."
    ),
    context=[hallucination_task],  # passes the hallucination verdict to this task
    agent=answer_grader,
)
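The three branches described in this task's expected output can be written out as ordinary control flow. The sketch below is an illustration under stated assumptions: `final_answer` is a hypothetical helper, `draft` stands for the answer produced so far, and `web_search` is any zero-argument callable standing in for web_search_tool. In the real pipeline the agent makes this decision from the prompt, not from Python code.

```python
from typing import Callable

FALLBACK = "Sorry! unable to find a valid response"


def final_answer(verdict: str, draft: str, web_search: Callable[[], str]) -> str:
    """Mirror the three branches of answer_task's expected output."""
    verdict = verdict.strip().lower()
    if verdict == "yes":
        return draft  # grounded answer: return it as is
    if verdict == "no":
        return web_search()  # not grounded: fall back to a web search
    return FALLBACK  # verdict was neither 'yes' nor 'no'


print(final_answer("yes", "Agentic RAG adds agents to RAG.", lambda: "web result"))
```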

Now we are ready to wrap up everything and define the agentic workflow in the next section.
