O'Reilly – Programming Generative AI 2024-10

Programming Generative AI Course. This hands-on course takes you from building simple neural networks in PyTorch to working with large multimodal models capable of understanding both text and images. Along the way, you'll learn how to train your own generative models from scratch to create an infinity of images, generate text with large language models similar to ChatGPT, write your own text-to-image pipeline to understand how diffusion-based generative models work, customize large pre-trained models like Stable Diffusion to generate images of new subjects in unique visual styles, and more.
What you will learn:
- Training a variational autoencoder (VAE) with PyTorch to learn a compact latent space of images
- Generating and editing realistic human faces with unconditional diffusion models and SDEdit
- Using large language models like GPT-2 to generate text with Hugging Face Transformers
- Performing text-based semantic image search using multimodal models such as CLIP
- Programming your own text-to-image pipeline to understand how diffusion-based generative models like Stable Diffusion work
- Properly evaluating generative models, both qualitatively and quantitatively
- Captioning images automatically with pre-trained foundation models
- Producing images in a specific visual style by efficiently fine-tuning Stable Diffusion with LoRA
- Creating personalized AI avatars by teaching new subjects and concepts to pre-trained diffusion models with Dreambooth
- Guiding the structure and composition of generated images using depth- and edge-conditioned ControlNets
- Performing near-real-time inference with SDXL Turbo for frame-by-frame video-to-video translation
This course is suitable for:
- Engineers and developers interested in building generative AI systems and applications
- Data scientists interested in working with advanced deep learning models
- Students, researchers, and academics looking for a practical or applied resource to supplement their theoretical or conceptual knowledge
- Technical artists and creative coders who want to enhance their creative practice
- Anyone interested in working with generative AI who doesn't know where or how to start
Programming Generative AI Course Specifications
- Publisher: O'Reilly
- Instructor: Jonathan Dinu
- Training level: Beginner to advanced
- Training duration: 9 hours and 24 minutes
Course contents
- Introduction
- Programming Generative AI: Introduction
- Lesson 1: The What, Why, and How of Generative AI
- Topics
- 1.1 Generative AI in the Wild
- 1.2 Defining Generative AI
- 1.3 Multitudes of Media
- 1.4 How Machines Create
- 1.5 Formalizing Generative Models
- 1.6 Generative versus Discriminative Models
- 1.7 The Generative Modeling Trilemma
- 1.8 Introduction to Google Colab
- Lesson 2: PyTorch for the Impatient
- Topics
- 2.1 What is PyTorch?
- 2.2 The PyTorch Layer Cake
- 2.3 The Deep Learning Software Trilemma
- 2.4 What Are Tensors, Really?
- 2.5 Tensors in PyTorch
- 2.6 Introduction to Computational Graphs
- 2.7 Backpropagation Is Just the Chain Rule
- 2.8 Effortless Backpropagation with torch.autograd
- 2.9 PyTorch's Device Abstraction (i.e., GPUs)
- 2.10 Working with Devices
- 2.11 Components of a Learning Algorithm
- 2.12 Introduction to Gradient Descent
- 2.13 Getting to Stochastic Gradient Descent (SGD)
- 2.14 Comparing Gradient Descent and SGD
- 2.15 Linear Regression with PyTorch
- 2.16 Perceptrons and Neurons
- 2.17 Layers and Activations with torch.nn
- 2.18 Multi-layer Feedforward Neural Networks (MLP)
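To give a concrete flavor of what Lesson 2 builds toward, here is a minimal sketch of a PyTorch training loop: linear regression fit with stochastic gradient descent. The synthetic data and hyperparameters are illustrative, not taken from the course.

```python
# Minimal PyTorch training loop: linear regression with SGD.
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic data: y = 3x + 2 plus a little noise.
X = torch.randn(256, 1)
y = 3 * X + 2 + 0.1 * torch.randn(256, 1)

# PyTorch's device abstraction: the same code runs on CPU or GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(1, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

X, y = X.to(device), y.to(device)
for epoch in range(100):
    loss = loss_fn(model(X), y)
    optimizer.zero_grad()
    loss.backward()      # backpropagation handled by torch.autograd
    optimizer.step()

print(model.weight.item(), model.bias.item())  # converges toward 3 and 2
```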
- Lesson 3: Latent Space Rules Everything Around Me
- Topics
- 3.1 Representing Images as Tensors
- 3.2 Desiderata for Computer Vision
- 3.3 Features of Convolutional Neural Networks
- 3.4 Working with Images in Python
- 3.5 The FashionMNIST Dataset
- 3.6 Convolutional Neural Networks in PyTorch
- 3.7 Components of a Latent Variable Model (LVM)
- 3.8 The Humble Autoencoder
- 3.9 Defining an Autoencoder with PyTorch
- 3.10 Setting up a Training Loop
- 3.11 Inference with an Autoencoder
- 3.12 Look Ma, No Features!
- 3.13 Adding Probability to Autoencoders (VAE)
- 3.14 Variational Inference: Not Just for Autoencoders
- 3.15 Transforming an Autoencoder into a VAE
- 3.16 Training a VAE with PyTorch
- 3.17 Exploring Latent Space
- 3.18 Latent Space Interpolation and Attribute Vectors
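As a rough sketch of the kind of model Lesson 3 defines, here is a bare-bones autoencoder for 28x28 grayscale images such as FashionMNIST; the layer sizes and latent dimension are illustrative choices, not the course's exact architecture.

```python
# A bare-bones autoencoder: compress 28x28 images to a latent vector
# and reconstruct them from it.
import torch
from torch import nn

class Autoencoder(nn.Module):
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)                      # compact latent code
        return self.decoder(z).view(-1, 1, 28, 28)

model = Autoencoder()
batch = torch.rand(8, 1, 28, 28)                 # stand-in for FashionMNIST
loss = nn.functional.mse_loss(model(batch), batch)  # reconstruction loss
```

Turning this into a VAE, as the lesson goes on to do, amounts to having the encoder output a mean and log-variance, sampling the latent code, and adding a KL-divergence term to the loss.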
- Lesson 4: Demystifying Diffusion
- Topics
- 4.1 Generation as a Reversible Process
- 4.2 Sampling as Iterative Denoising
- 4.3 Diffusers and the Hugging Face Ecosystem
- 4.4 Generating Images with Diffuser Pipelines
- 4.5 Deconstructing the Diffusion Process
- 4.6 Forward Process as Encoder
- 4.7 Reverse Process as Decoder
- 4.8 Interpolating Diffusion Models
- 4.9 Image-to-Image Translation with SDEdit
- 4.10 Image Restoration and Enhancement
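A minimal sketch of unconditional face generation with Hugging Face Diffusers, in the spirit of Lesson 4; google/ddpm-celebahq-256 is one public checkpoint and not necessarily the one used in the videos.

```python
# Unconditional face generation with a pretrained DDPM via Diffusers.
import torch
from diffusers import DDPMPipeline

pipe = DDPMPipeline.from_pretrained("google/ddpm-celebahq-256")
pipe.to("cuda" if torch.cuda.is_available() else "cpu")

# Sampling is iterative denoising: start from pure noise and run the
# learned reverse process for a fixed number of steps.
image = pipe(num_inference_steps=1000).images[0]
image.save("face.png")
```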
- Lesson 5: Generating and Encoding Text with Transformers
- Topics
- 5.1 The Natural Language Processing Pipeline
- 5.2 Generative Models of Language
- 5.3 Generating Text with Transformers Pipelines
- 5.4 Deconstructing Transformer Pipelines
- 5.5 Decoding Strategies
- 5.6 Transformers are Just Latent Variable Models for Sequences
- 5.7 Visualizing and Understanding Attention
- 5.8 Turning Words into Vectors
- 5.9 The Vector Space Model
- 5.10 Embedding Sequences with Transformers
- 5.11 Computing the Similarity Between Embeddings
- 5.12 Semantic Search with Embeddings
- 5.13 Contrastive Embeddings with Sentence Transformers
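Two small sketches of the APIs Lesson 5 leans on: text generation with GPT-2 through the Transformers pipeline, and sentence similarity with Sentence Transformers. The all-MiniLM-L6-v2 checkpoint is an illustrative public model.

```python
# Text generation with GPT-2 via the Transformers pipeline API.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator(
    "Generative models are",
    max_new_tokens=40,
    do_sample=True,   # sample instead of greedy decoding
    top_p=0.9,        # nucleus sampling
)
print(out[0]["generated_text"])

# Sentence embeddings and cosine similarity for semantic search.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")
emb = embedder.encode(
    ["a photo of a cat", "a picture of a kitten", "stock market report"],
    convert_to_tensor=True,
)
print(util.cos_sim(emb[0], emb[1:]))  # the kitten sentence scores higher
```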
- Lesson 6: Connecting Text and Images
- Topics
- 6.1 Components of a Multimodal Model
- 6.2 Vision-Language Understanding
- 6.3 Contrastive Language-Image Pretraining
- 6.4 Embedding Text and Images with CLIP
- 6.5 Zero-Shot Image Classification with CLIP
- 6.6 Semantic Image Search with CLIP
- 6.7 Conditional Generative Models
- 6.8 Introduction to Latent Diffusion Models
- 6.9 The Latent Diffusion Model Architecture
- 6.10 Failure Modes and Additional Tools
- 6.11 Stable Diffusion Deconstructed
- 6.12 Writing Our Own Stable Diffusion Pipeline
- 6.13 Decoding Images from the Stable Diffusion Latent Space
- 6.14 Improving Generation with Guidance
- 6.15 Playing with Prompts
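Zero-shot image classification with CLIP, as covered in Lesson 6, takes only a few lines with Hugging Face Transformers; the image path and candidate labels below are placeholders.

```python
# Zero-shot image classification: score one image against text labels
# in CLIP's shared embedding space.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # placeholder: any local image
labels = ["a photo of a dog", "a photo of a cat", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image

probs = logits.softmax(dim=-1)[0]
print(dict(zip(labels, probs.tolist())))
```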
- Lesson 7: Post-Training Procedures for Diffusion Models
- Topics
- 7.1 Methods and Metrics for Evaluating Generative AI
- 7.2 Manual Evaluation of Stable Diffusion with DrawBench
- 7.3 Quantitative Evaluation of Diffusion Models with Human Preference Predictors
- 7.4 Overview of Methods for Fine-Tuning Diffusion Models
- 7.5 Sourcing and Preparing Image Datasets for Fine-Tuning
- 7.6 Generating Automatic Captions with BLIP-2
- 7.7 Parameter Efficient Fine-Tuning with LoRA
- 7.8 Inspecting the Results of Fine-Tuning
- 7.9 Inference with LoRAs for Style-Specific Generation
- 7.10 Conceptual Overview of Textual Inversion
- 7.11 Subject-Specific Personalization with Dreambooth
- 7.12 Dreambooth versus LoRA Fine-Tuning
- 7.13 Dreambooth Fine-Tuning with Hugging Face
- 7.14 Inference with Dreambooth to Create Personalized AI Avatars
- 7.15 Adding Conditional Control to Text-to-Image Diffusion Models
- 7.16 Creating Edge and Depth Maps for Conditioning
- 7.17 Depth and Edge-Guided Stable Diffusion with ControlNet
- 7.18 Understanding and Experimenting with ControlNet Parameters
- 7.19 Generative Text Effects with Font Depth Maps
- 7.20 Few-Step Generation with Adversarial Diffusion Distillation (ADD)
- 7.21 Reasons to Distill
- 7.22 Comparing SDXL and SDXL Turbo
- 7.23 Text-Guided Image-to-Image Translation
- 7.24 Video-Driven Frame-by-Frame Generation with SDXL Turbo
- 7.25 Near Real-Time Inference with PyTorch Performance Optimizations
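To illustrate the few-step generation Lesson 7 closes with, a minimal SDXL Turbo sketch with Diffusers (assumes a CUDA GPU; the prompt and output path are illustrative):

```python
# One-step text-to-image with SDXL Turbo (adversarial diffusion
# distillation makes single-step sampling viable).
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
)
pipe.to("cuda")  # drop torch_dtype/variant above to run on CPU

image = pipe(
    "a portrait photo of an astronaut, studio lighting",
    num_inference_steps=1,  # single denoising step
    guidance_scale=0.0,     # turbo models are trained without CFG
).images[0]
image.save("turbo.png")
```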
- Summary
- Programming Generative AI: Summary
Course prerequisites
- Comfortable programming in Python
- Knowledge of machine learning basics
- Familiarity with deep learning and neural networks will be helpful but is not required
Installation Guide
After extracting, watch with your preferred video player.
Subtitles: None
Quality: 720p
Download link
File(s) password: www.downloadly.ir
File size
3.8 GB