Mastering Generative AI with OpenAI

Using Word Embeddings For Dynamic Context

Dynamic Context

In this lesson, we’ll explore how to enrich large language models (LLMs) with up-to-date information by injecting dynamic context from your own datasets—whether they’re confidential, private, internal, or public. This technique helps you build chatbots and applications that stay current beyond GPT-3.5’s September 2021 pre-training cutoff.

The image illustrates the concept of "Adding Dynamic Context" with a diagram of a chatbot and categories labeled "Confidential," "Private," "Internal," and "Public."

Why Dynamic Context Matters

Large language models are powerful, but their knowledge is frozen at the time of their last pre-training. To handle events, documents, or data generated after September 2021, you need a way to feed fresh information at query time.

The image displays the text "Adding Dynamic Context" and "September 2021 (cut-off date for the pre-training of GPT-3.5)".

Core Workflow: Indexing & Retrieval

Adding dynamic context involves two key phases:

| Phase | Purpose | Example Tools |
| --- | --- | --- |
| Indexing | Convert documents into vector embeddings | OpenAI Embeddings, Hugging Face Embeddings |
| Retrieval | Find and return the most relevant passages | Pinecone, Weaviate, Elasticsearch |

1. Indexing

  • Break each document, FAQ, or dataset entry into chunks.
  • Generate a vector embedding for each chunk.
  • Store embeddings in a vector database (often called a “vector store”).
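
To make the indexing phase concrete, here is a minimal sketch using the official `openai` Python SDK (v1.x). The helper names (`chunk`, `index_documents`) are illustrative, and a plain Python list stands in for a real vector store such as Pinecone or Weaviate:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking; real pipelines often split on
    # sentence or token boundaries instead.
    return [text[i:i + size] for i in range(0, len(text), size)]

def index_documents(documents: list[str]) -> list[tuple[str, list[float]]]:
    # Stand-in "vector store": a list of (chunk_text, embedding) pairs.
    store: list[tuple[str, list[float]]] = []
    for doc in documents:
        pieces = chunk(doc)
        resp = client.embeddings.create(
            model="text-embedding-3-small", input=pieces
        )
        store.extend(zip(pieces, (d.embedding for d in resp.data)))
    return store
```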

2. Retrieval

  • Compute the embedding for the user’s prompt.
  • Perform a similarity search against your vector store.
  • Retrieve the top-k most relevant passages.

Note

Adjust the value of k (top-k passages) based on token limits and response quality.
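
Continuing the sketch, retrieval embeds the user's prompt and ranks the stored chunks by cosine similarity. This reuses `client` and the `store` list from the indexing sketch; `cosine` and `retrieve` are illustrative helpers, and a dedicated vector database would perform this search for you:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(store: list[tuple[str, list[float]]], query: str, k: int = 3) -> list[str]:
    # Embed the prompt, then return the k most similar chunks.
    q = client.embeddings.create(
        model="text-embedding-3-small", input=[query]
    ).data[0].embedding
    ranked = sorted(store, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```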

The image is a flowchart illustrating the process of adding dynamic context, involving LLMs, similarity search, and output. Each step is represented by an icon and labeled accordingly.

End-to-End Sequence

  1. User Query: A prompt is submitted through the chatbot UI.
  2. Embedding: The application computes an embedding for the prompt.
  3. Search: The vector store returns the most similar passages.
  4. Injection: Retrieved passages are prepended (or appended) to the original prompt.
  5. LLM Call: The augmented prompt is sent to the language model API.
  6. Generation: The model uses both pre-trained knowledge and dynamic context to craft a precise answer.
  7. Response: The chatbot displays the final output to the user.
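
Putting the sequence together, a minimal sketch of steps 2 through 6 might look like the following. It reuses `client` and `retrieve` from the earlier sketches; the prompt template and the `answer` helper are illustrative assumptions, not a prescribed format:

```python
def answer(store: list[tuple[str, list[float]]], user_query: str) -> str:
    # Steps 2-3: embed the prompt and fetch the top-k passages.
    passages = retrieve(store, user_query, k=3)
    # Step 4: prepend the retrieved passages to the original prompt.
    context = "\n\n".join(passages)
    augmented = (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_query}"
    )
    # Steps 5-6: send the augmented prompt to the model.
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": augmented}],
    )
    return resp.choices[0].message.content
```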

The image is a flowchart illustrating the process of adding dynamic context, involving a user interacting with a chatbot, which connects to search and retrieval, embeddings, and a large language model (LLM).

Next Steps: Hands-On Demo

In the following section, we’ll implement a working prototype using the AskUs dataset, which contains events and information generated after GPT-3.5’s cutoff. You’ll see how dynamic context dramatically improves answer accuracy with real-world data. Ready to dive in?

