LangChain

Implementing Chains

Overview of Chains

In this lesson, we’ll explore two high-level chains that streamline common workflows in LangChain:

  1. Summarization: Batch summarization using the Create Stuff Documents Chain
  2. Retrieval: Retrieval-Augmented Generation (RAG) via the Create Retrieval Chain

These constructs let you process multiple documents without reinventing the wheel, whether you need to summarize everything at once or selectively retrieve relevant passages.


1. Summarization with Create Stuff Documents Chain

The Create Stuff Documents Chain merges a list of document chunks into a single prompt and sends it to your LLM. This is ideal when your combined content stays within the model’s context window.

Use cases:

  • Summarize multiple documents in one pass
  • Extract specific insights across all inputs
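Conceptually, "stuffing" just concatenates every chunk into one prompt before calling the model. A minimal plain-Python sketch (the helper and prompt wording here are illustrative, not LangChain internals):

```python
def stuff_documents(chunks: list[str], question: str) -> str:
    """Join all chunks into a single prompt, as the stuff chain does."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = stuff_documents(["Chunk one.", "Chunk two."], "Summarize the context.")
print(prompt)
```

Because everything lands in one prompt, this approach only works while the combined chunks fit in the model's context window.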

[Image: "Built-in Chain Constructs" slide: Summarization based on Create Stuff Documents Chain]

Example: Batch Summarization

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Initialize your chat model
llm = ChatOpenAI(model="gpt-4")

# The prompt must include a {context} placeholder for the stuffed documents
prompt = ChatPromptTemplate.from_template(
    "Summarize the following documents:\n\n{context}"
)

# Prepare document chunks
docs = [
    Document(page_content="Document text chunk 1..."),
    Document(page_content="Document text chunk 2..."),
    # ...
]

# Build the chain
chain = create_stuff_documents_chain(llm, prompt)

# Run the summarization
summary = chain.invoke({"context": docs})
print("Summary:", summary)

Note

Ensure the total token count of your document chunks doesn’t exceed your model’s context limit. You can use the tiktoken library to estimate tokens in advance.


2. Retrieval with Create Retrieval Chain

For larger corpora that exceed a single prompt window, the Create Retrieval Chain combines a retriever with the Stuff Documents logic. This effectively implements a simple RAG workflow:

  1. Retriever fetches the most relevant chunks for the query.
  2. Document Chain formats those chunks into a prompt.
  3. LLM generates the final answer.

[Image: "Built-in Chain Constructs" slide: Step 2, Create Retrieval Chain (Easy)]

[Image: Diagram of two chain icons connected by a Retriever in the center]

Example: Simple RAG Pipeline

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Set up embeddings and vector store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_texts(["Doc text A", "Doc text B"], embeddings)

# Document chain that stuffs retrieved chunks into the prompt
prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n\n{context}\n\nQuestion: {input}"
)
combine_docs_chain = create_stuff_documents_chain(ChatOpenAI(model="gpt-4"), prompt)

# Build the retrieval chain: retriever + document chain
retrieval_chain = create_retrieval_chain(vectorstore.as_retriever(), combine_docs_chain)

# Run RAG
query = "What are the key takeaways from these documents?"
result = retrieval_chain.invoke({"input": query})
print("Answer:", result["answer"])

Warning

Always index your documents using the same embedding model you use at query time to ensure consistency in vector representations.


Comparison of Built-In Chains

| Chain Type                   | Use Case                                        | Components                        | Key Benefit                  |
|------------------------------|-------------------------------------------------|-----------------------------------|------------------------------|
| Create Stuff Documents Chain | Batch summarization, multi-doc info extraction  | LLM                               | Simple prompt stitching      |
| Create Retrieval Chain       | RAG over large corpora                          | Retriever + LLM + Document Chain  | Scales beyond context window |

Next Steps

Now that you understand both the Summarization and Retrieval chains, let’s dive into hands-on demos to see them in action.

