Instructor-Led Workshop
Building RAG Agents with LLMs

Agents powered by large language models (LLMs) are quickly gaining popularity with both individuals and companies as new capabilities emerge and new opportunities to improve productivity are discovered. An especially powerful recent development is the popularization of retrieval-based LLM systems that can hold informed conversations by using tools, consulting documents, and planning their approach. These systems are fun to experiment with and offer unprecedented opportunities to make everyday work easier, but they also issue many queries to large deep learning models and must be implemented efficiently.

You will be designing retrieval-augmented generation systems and bundling them into deliverable formats. Along the way, you will learn advanced LLM composition techniques for internal reasoning, dialog management, and tooling. 

 

Learning Objectives
 

By participating in this workshop, you’ll learn how to:
  • Compose an LLM system that can interact predictably with a user by leveraging internal and external reasoning components.
  • Design a dialog management and document reasoning system that maintains state and coerces information into structured formats.
  • Leverage embedding models for efficient similarity queries for content retrieval and dialog guardrailing.
  • Implement, modularize, and evaluate a retrieval-augmented generation agent that can answer questions about the research papers in its dataset without any fine-tuning.


Workshop Outline

Introduction
(15 mins)
LLM Inference Interfaces
(60 mins)

    Explore the course environment, microservices, and LLM inferencing options.

  • Get comfortable with the course environment and learn about microservices for software compartmentalization and resource delivery.
  • Discuss LLM service options for inference use-cases, including local and scalable deployment strategies and value propositions.
  • Work with remotely hosted interfaces such as GPT-4 and NGC-hosted NVIDIA AI Foundation Model endpoints.
Break (15 mins)
Pipeline Design with LangChain, Gradio, and LangServe
(60 mins)

    Orchestrate LLM endpoints into pipelines using open-source frameworks.

  • Learn how to use LangChain to chain multiple LLM-enabled modules using the functional LangChain Expression Language (LCEL) syntax.
  • Formalize internal/external reasoning and modularize them into runnables.
  • Use LangServe to interact with a Gradio frontend by sending an LLM chain over a port interface.
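The pipe-style composition that LCEL provides can be sketched in plain Python. This is a toy stand-in, not LangChain's actual `Runnable` implementation; the prompt, LLM, and parser below are fake placeholders used only to illustrate the pattern:

```python
# Toy illustration of LCEL-style chaining via the `|` operator.
# NOT LangChain code: the prompt, LLM, and parser are placeholders.

class Runnable:
    """Wraps a function so instances can be composed with `|`."""
    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Compose: the output of self feeds the input of other.
        return Runnable(lambda x: other.func(self.func(x)))

    def invoke(self, x):
        return self.func(x)

# Stand-ins for a prompt template, an LLM call, and an output parser.
prompt = Runnable(lambda topic: f"Tell me a fact about {topic}.")
fake_llm = Runnable(lambda text: {"content": text.upper()})
parser = Runnable(lambda msg: msg["content"])

chain = prompt | fake_llm | parser
print(chain.invoke("GPUs"))  # TELL ME A FACT ABOUT GPUS.
```

In actual LCEL, an expression like `prompt | llm | StrOutputParser()` composes real components the same way: the `|` operator builds a chain whose `invoke` threads data left to right.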
Break (60 mins)
Dialog Management with Running States
(75 mins)

    Develop running logic systems to remember information and guide dialog.

  • Learn how running-state logic lets a chain retain information across invocations.
  • Leverage knowledge extraction via slot filling to maintain a structured knowledge base.
  • Integrate a dialog-managing chatbot that prompts the user for credentials, retrieves information from a database interface, and maintains dialog state.
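The slot-filling idea can be sketched as a running dictionary of required fields. The slot names and helper below are hypothetical illustrations, not the workshop's code; in practice an LLM performs the extraction step:

```python
# Minimal sketch of slot filling for dialog state (illustrative only).
# A real system would use an LLM to extract the slot values from dialog.

REQUIRED_SLOTS = ["name", "email"]  # hypothetical example slots

def update_state(state, extracted):
    """Merge newly extracted slot values into the running dialog state
    and report which required slots are still missing."""
    new_state = {**state, **{k: v for k, v in extracted.items() if v}}
    missing = [slot for slot in REQUIRED_SLOTS if slot not in new_state]
    return new_state, missing

state = {}
state, missing = update_state(state, {"name": "Ada"})
# missing == ["email"], so the chatbot would ask for the email next
state, missing = update_state(state, {"email": "ada@example.com"})
# missing == [], so the chatbot can proceed with the request
```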
Working with Documents
(45 mins)

    Learn how to work with longer-form documents that exceed context limits.

  • Learn about document chunking, reduction, and refinement strategies.
  • Use the same LLM chaining skills to build systems that summarize research papers by exporting a while-loop-enabled runnable.
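A minimal illustration of one chunking strategy, fixed-size windows with overlap, so no single piece exceeds a model's context budget (real pipelines often split on sentence or token boundaries instead):

```python
def chunk_text(text, chunk_size=100, overlap=20):
    """Split text into overlapping fixed-size character windows.
    Overlap preserves context that would otherwise be cut at a boundary."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

sample = "The quick brown fox jumps over the lazy dog. " * 10
chunks = chunk_text(sample, chunk_size=120, overlap=30)
# Each chunk is at most 120 characters and shares its first 30
# characters with the tail of the previous chunk.
```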
Embeddings for Semantic Similarity and Guardrailing
(60 mins)

    Explore the use of embedding models for vector-enabled semantic reasoning.

  • Formalize the trade-offs between encoder and decoder architectures and understand how embedding models work.
  • Use vector representations to reason about the meaning and similarity of passages.
  • Design a guardrailing system that leverages a custom-built input rail to either answer a question or politely refuse.
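The input-rail idea can be sketched with cosine similarity over embeddings. The blocked-topic vectors, threshold, and refusal message below are toy assumptions; a real rail would embed the user's query with an embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical pre-computed embeddings for blocked topics (toy 2-D
# vectors; real embeddings have hundreds of dimensions).
BLOCKED_TOPIC_EMBEDDINGS = [[0.9, 0.1], [0.8, 0.3]]

def input_rail(query_embedding, threshold=0.95):
    """Return a refusal message if the query is too close to a blocked
    topic; return None to let the query pass through to the LLM."""
    for blocked in BLOCKED_TOPIC_EMBEDDINGS:
        if cosine(query_embedding, blocked) >= threshold:
            return "Sorry, I can't help with that topic."
    return None
```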
Break (15 mins)
Vector Stores for RAG Agents
(60 mins)

    Integrate vector stores to help agent systems retrieve and reason with documents.

  • Formalize vector stores as structures that help automate vector reasoning logic.
  • Incorporate vector stores into retrieval-augmented generation pipelines that reason about conversation history and preprocessed document pools.
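As a rough sketch of what a vector store automates, here is a naive in-memory version with a stand-in character-frequency "embedding" (a real store such as FAISS would use model-generated embeddings and approximate nearest-neighbor search rather than brute force):

```python
import math

def embed(text):
    """Stand-in 'embedding': a character-frequency vector over a-z.
    A real pipeline would call an embedding model instead."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Naive in-memory vector store: holds (embedding, text) pairs and
    retrieves the top-k most similar texts by brute-force cosine search."""
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append((embed(text), text))

    def top_k(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: -cosine(q, d[0]))
        return [text for _, text in ranked[:k]]

store = VectorStore()
store.add("GPUs accelerate deep learning")
store.add("Bananas are a yellow fruit")
print(store.top_k("deep learning hardware"))
```

In a RAG pipeline, the retrieved chunks are prepended to the LLM prompt so the model can ground its answer in the documents.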
Evaluation, Assessment, and Q&A
(60 mins)

    Evaluate your RAG system using LLM-as-a-Judge evaluation chains.

  • Review key learnings and motivate evaluation as a natural progression.
  • Launch your retriever component into the frontend to run the assessment.
  • Finish the course and complete the assessment to earn a certificate!
 

Workshop Details

Duration: 8 hours

Price: Contact us for pricing.

Prerequisites:

Technologies: Python, LangChain, NVIDIA AI Foundation Endpoints, FAISS, Gradio, LangServe, FastAPI

Hardware Requirements: Desktop or laptop computer capable of running the latest version of Chrome or Firefox. Each participant will be provided with dedicated access to a fully configured, GPU-accelerated workstation in the cloud.

Certificate: Upon successful completion of the assessment, participants will receive an NVIDIA DLI certificate to recognize their subject matter competency and support professional career growth.

Languages: English

Upcoming Public Workshops

If your organization is interested in boosting and developing key skills in AI, accelerated data science, or accelerated computing, you can request instructor-led training from the NVIDIA DLI.

Continue Your Learning with These DLI Trainings

Getting Started with Image Segmentation

Modeling Time-Series Data with Recurrent Neural Networks in Keras

Building Transformer-Based Natural Language Processing Applications

Building Intelligent Recommender Systems

Data Parallelism: How to Train Deep Learning Models on Multiple GPUs

Questions?