Welcome to Day 9 of #30DaysOfLangChain! On Day 8, we introduced the powerful concept of LangChain Agents, autonomous LLMs that can reason and use tools. While an agent can perform a single task, real-world interactions often involve multi-turn conversations where the agent needs to remember past turns. This is where Memory Management comes in, working hand-in-hand with the Agent Executor.

Today, we’ll deepen our understanding of the AgentExecutor and, more importantly, equip our agents with conversational memory, allowing them to maintain context across a dialogue.

Agent Executors Revisited: Orchestrating the Agent’s Lifecycle

The AgentExecutor is more than just a runner for your agent; it’s the orchestrator of the agent’s entire decision-making loop. We briefly saw verbose=True on Day 8, which is invaluable for debugging. Let’s look at a few more important parameters (a configuration sketch follows the list):

  • verbose=True: As seen, this prints out all the agent’s internal Thought, Action, and Observation steps, providing transparency into its reasoning process.
  • handle_parsing_errors=True: This helps the executor gracefully handle cases where the LLM’s output for an Action or Thought doesn’t conform to the expected format, preventing a crash.
  • return_intermediate_steps=True: If set to True, the invoke method will return not just the final output, but also a list of all the intermediate Thought, Action, and Observation steps taken by the agent. This is excellent for debugging, auditing, or building UIs that visualize the agent’s process.
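
To make these concrete, here’s a minimal sketch of an executor with all three flags enabled. It assumes agent and tools come from a Day 8-style create_react_agent setup (no memory yet):

from langchain.agents import AgentExecutor

executor = AgentExecutor(
    agent=agent,   # assumed: the ReAct agent built on Day 8
    tools=tools,   # assumed: the same tool list the agent was built with
    verbose=True,                    # print Thought/Action/Observation steps
    handle_parsing_errors=True,      # recover from malformed LLM output
    return_intermediate_steps=True,  # include the steps in the invoke() result
)
result = executor.invoke({"input": "Reverse the word 'hello'."})
print(result["output"])              # the final answer
print(result["intermediate_steps"])  # list of (AgentAction, observation) tuples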

The Crucial Role of Memory for Agents

Imagine talking to someone who forgets everything you said a moment ago. That’s an agent without memory! For an agent to be truly intelligent and helpful in a conversational setting, it needs to remember:

  • Past user inputs: What was the user asking about earlier?
  • Past agent responses: What did the agent say previously?
  • Intermediate thoughts/actions: Sometimes, the agent’s internal reasoning also needs to be preserved for coherence.

LangChain provides various Memory classes to manage this conversation history, which then gets injected into the LLM’s prompt.

  • ConversationBufferMemory: The simplest form of memory. It stores all messages (input and output) in a buffer and injects them directly into the prompt.
  • ConversationBufferWindowMemory: Stores a limited number of recent messages (k). Useful for long conversations where you don’t want to exceed token limits or bring in irrelevant older context.
  • Other memory types exist for summarization, entity extraction, and more; the sketch below contrasts the two buffer variants.
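
Here’s a minimal sketch contrasting the two buffer memories; the message counts in the comments assume return_messages=True and the default "history" memory key:

from langchain.memory import ConversationBufferMemory, ConversationBufferWindowMemory

full_memory = ConversationBufferMemory(return_messages=True)
window_memory = ConversationBufferWindowMemory(k=2, return_messages=True)  # keep only the last 2 exchanges

for mem in (full_memory, window_memory):
    mem.save_context({"input": "Hi, I'm Arpan."}, {"output": "Hello, Arpan!"})
    mem.save_context({"input": "Reverse 'abc'."}, {"output": "'cba'"})
    mem.save_context({"input": "Thanks!"}, {"output": "You're welcome."})

print(len(full_memory.load_memory_variables({})["history"]))    # 6 messages: every turn kept
print(len(window_memory.load_memory_variables({})["history"]))  # 4 messages: only the last k=2 exchanges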

Integrating Memory with Your Agent

To integrate memory into an agent, you typically:

  1. Define a MessagesPlaceholder in your agent’s prompt: This placeholder (e.g., MessagesPlaceholder(variable_name="chat_history")) is where the conversation history will be inserted.
  2. Initialize a Memory object: Create an instance of ConversationBufferMemory or similar.
  3. Pass history to the AgentExecutor: When you invoke the AgentExecutor, you’ll pass the memory’s buffered messages (e.g., memory.buffer_as_messages) to the chat_history variable in your input dictionary.

The agent’s LLM can then reason over the full conversation history when deciding its next Thought or Action.
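
Condensed into a sketch (the prompt text and variable names here are illustrative, not prescriptive):

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferMemory

# Step 1: a placeholder in the prompt marks where history will be inserted
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])

# Step 2: a memory object whose memory_key matches the placeholder's variable_name
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Step 3: pass the buffered messages alongside the new input on every call
inputs = {"input": "Hello!", "chat_history": memory.buffer_as_messages}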

For more details, check out the official LangChain documentation.

Project: A Conversational Agent with Memory

We’ll enhance our Day 8 agent by adding ConversationBufferMemory. This will allow our agent to remember past turns, answer follow-up questions, and maintain context throughout a multi-turn conversation. We’ll also see return_intermediate_steps in action.

Before you run the code:

  • Ensure Ollama is installed and running (ollama serve) if using Ollama.
  • Pull any necessary Ollama models (e.g., ollama pull llama2).
  • Ensure your OPENAI_API_KEY is set if using OpenAI models.
import os
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.tools import tool
from langchain.memory import ConversationBufferMemory # Import ConversationBufferMemory
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# --- Configuration ---
LLM_PROVIDER = os.getenv("LLM_PROVIDER", "openai").lower()
OLLAMA_MODEL_CHAT = os.getenv("OLLAMA_MODEL_CHAT", "llama2").lower()

# --- Step 1: Define Custom Tools (reusing from Day 8) ---
@tool
def word_reverser(word: str) -> str:
    """Reverses a given word or string."""
    print(f"\n--- Tool Action: Reversing '{word}' ---")
    return word[::-1]

@tool
def character_counter(text: str) -> int:
    """Counts the number of characters in a given string."""
    print(f"\n--- Tool Action: Counting characters in '{text}' ---")
    return len(text)

tools = [word_reverser, character_counter]
print(f"Available tools: {[tool.name for tool in tools]}\n")

# --- Step 2: Initialize LLM ---
def initialize_llm(provider, model_name=None, temp=0.7):
    if provider == "openai":
        if not os.getenv("OPENAI_API_KEY"):
            raise ValueError("OPENAI_API_KEY not set for OpenAI provider.")
        return ChatOpenAI(model=model_name or "gpt-3.5-turbo", temperature=temp)
    elif provider == "ollama":
        try:
            llm = ChatOllama(model=model_name or OLLAMA_MODEL_CHAT, temperature=temp)
            llm.invoke("Hello!") # Test connection
            return llm
        except Exception as e:
            print(f"Error connecting to Ollama LLM or model '{model_name or OLLAMA_MODEL_CHAT}' not found: {e}")
            print("Please ensure Ollama is running and the specified model is pulled.")
            exit()
    else:
        raise ValueError(f"Invalid LLM provider: {provider}. Must be 'openai' or 'ollama'.")

llm = initialize_llm(LLM_PROVIDER)
print(f"Using LLM: {LLM_PROVIDER} ({llm.model_name if hasattr(llm, 'model_name') else OLLAMA_MODEL_CHAT})\n")

# --- Step 3: Create the Agent Prompt with Memory Placeholder ---
# Note: create_react_agent requires the {tools}, {tool_names}, and
# {agent_scratchpad} variables, and it renders the scratchpad as plain
# text, so it belongs inside a message rather than a MessagesPlaceholder.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Answer the user's questions, "
               "using a tool only when one is needed, and maintain context "
               "from previous turns.\n\n"
               "You have access to the following tools:\n{tools}\n\n"
               "Use the following format:\n"
               "Thought: reason about what to do next\n"
               "Action: the action to take, one of [{tool_names}]\n"
               "Action Input: the input to the action\n"
               "Observation: the result of the action\n"
               "... (Thought/Action/Action Input/Observation can repeat)\n"
               "Thought: I now know the final answer\n"
               "Final Answer: the answer to the user's question"),
    MessagesPlaceholder(variable_name="chat_history"), # THIS IS WHERE MEMORY WILL BE INJECTED
    ("user", "{input}\n\n{agent_scratchpad}") # ReAct scratchpad is a string, so it lives inside the message
])

# --- Step 4: Create the ReAct Agent ---
agent = create_react_agent(llm, tools, prompt)
print("Agent created using create_react_agent.\n")

# --- Step 5: Initialize Memory ---
# We use a simple buffer memory for this example
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
print("ConversationBufferMemory initialized.\n")

# --- Step 6: Create the Agent Executor with Memory ---
# Now we pass the memory's chat history to the executor's input
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,
    # AgentExecutor (as a Chain) can also accept a memory= argument directly,
    # but here we manage the memory object ourselves and pass its content in
    # each invoke call. This manual wiring is common for ReAct agents, where
    # history is just another part of the prompt.
)
print("Agent Executor created.\n")

# --- Step 7: Conduct a Multi-Turn Conversation ---
print("--- Starting Multi-Turn Conversation with Agent ---")

conversation_turns = [
    "What is the capital of France?",
    "Reverse the word 'LangChain'.",
    "How many characters are in that reversed word?", # Follow-up question referencing previous turn
    "Tell me about large language models."
]

for i, q in enumerate(conversation_turns):
    print(f"\n--- Turn {i+1} ---")
    print(f"User Question: {q}")
    try:
        # Pass the current chat history from memory to the agent's prompt
        # The 'chat_history' key here matches the variable_name in MessagesPlaceholder
        response = agent_executor.invoke({"input": q, "chat_history": memory.load_memory_variables({})["chat_history"]})
        print(f"Agent Final Answer: {response['output']}")

        # Save the current turn's interaction to memory for the next turn
        memory.save_context({"input": q}, {"output": response["output"]})

    except Exception as e:
        print(f"Agent encountered an error: {e}")
    print("\n" + "="*70 + "\n")

# --- Optional: Demonstrate return_intermediate_steps ---
print("--- Demonstrating return_intermediate_steps ---")
query_return_intermediate = "Reverse the word 'LangChain'."
print(f"Query: {query_return_intermediate}")

# A second executor with return_intermediate_steps=True returns the agent's
# (AgentAction, observation) tuples in the invoke() output dictionary under
# the 'intermediate_steps' key, alongside the final 'output'.
agent_executor_with_steps = AgentExecutor(
    agent=agent,
    tools=tools,
    handle_parsing_errors=True,
    return_intermediate_steps=True,
)
response = agent_executor_with_steps.invoke(
    {"input": query_return_intermediate,
     "chat_history": memory.load_memory_variables({})["chat_history"]}
)
print(f"Agent Final Answer: {response['output']}")
print(f"Intermediate Steps: {response['intermediate_steps']}")

# Let's show how memory looks after the conversation
print("--- Current Memory Content (from ConversationBufferMemory) ---")
print(memory.load_memory_variables({})["chat_history"])
print("-" * 70)

Code Explanation:

  1. Tools & LLM Initialization: Reuses the setup from Day 8.
  2. Prompt with MessagesPlaceholder: The critical change! We add MessagesPlaceholder(variable_name="chat_history") to the prompt. This tells the LLM where to inject the conversation history.
  3. ConversationBufferMemory Initialization: We create an instance of ConversationBufferMemory. The memory_key="chat_history" links it to the variable_name in our prompt. return_messages=True ensures it provides a list of BaseMessage objects, which is what MessagesPlaceholder expects.
  4. AgentExecutor Setup: We don’t hand the memory object to the AgentExecutor here (as a Chain it can accept a memory= argument, but we manage it ourselves). Instead, the chat_history is passed explicitly in each invoke call, which gives you fine-grained control.
  5. Multi-Turn Conversation Loop:
    • In each turn, before agent_executor.invoke, we retrieve the current chat_history from our memory object using memory.load_memory_variables({})["chat_history"].
    • This history is passed as part of the input dictionary to the invoke call, matching the chat_history variable_name in the prompt.
    • After the agent responds, we use memory.save_context({"input": q}, {"output": response["output"]}) to add the current user input and agent output to the memory, preparing it for the next turn (see the round-trip sketch after this list).
  6. return_intermediate_steps: While verbose=True visually shows the intermediate steps, the second executor we build with return_intermediate_steps=True also returns them programmatically: its invoke response dictionary contains an intermediate_steps key holding the (AgentAction, observation) tuples, which is handy for auditing or for building UIs that visualize the agent’s process.
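
To make the save/load round trip in point 5 concrete, here is a minimal standalone sketch (the message reprs in the comment are abbreviated):

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
memory.save_context({"input": "Reverse 'LangChain'."}, {"output": "'niahCgnaL'"})
print(memory.load_memory_variables({}))
# {'chat_history': [HumanMessage(content="Reverse 'LangChain'."),
#                   AIMessage(content="'niahCgnaL'")]}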

This project enhances our agent’s capabilities significantly, allowing for more natural and coherent interactions over multiple turns, a vital feature for any conversational AI.


I’m Arpan

I’m a Software Engineer driven by curiosity and a deep interest in Generative AI Technologies. I believe we’re standing at the frontier of a new era—where machines not only learn but create, and I’m excited to explore what’s possible at this intersection of intelligence and imagination.

When I’m not writing code or experimenting with new AI models, you’ll probably find me travelling, soaking in new cultures, or reading a book that challenges how I think. I thrive on new ideas—especially ones that can be turned into meaningful, impactful projects. If it’s bold, innovative, and GenAI-related, I’m all in.

“The future belongs to those who believe in the beauty of their dreams.” – Eleanor Roosevelt

“Imagination is more important than knowledge. For knowledge is limited, whereas imagination embraces the entire world.” – Albert Einstein

This blog, MLVector, is my space to share technical insights, project breakdowns, and explorations in GenAI—from the models shaping tomorrow to the code powering today.

Let’s build the future, one vector at a time.

Let’s connect