O’Reilly – Data Pipelines with Apache Airflow, video edition 2022-12

This course is the video edition of the best-selling book Data Pipelines with Apache Airflow. The narrator reads the book aloud while its text, charts, and code are displayed on screen, so the experience is like listening to an audiobook that you can also follow visually.
Data pipelines manage the flow of data from initial collection through integration, cleansing, analysis, visualization, and more. Apache Airflow is a single platform that you can use to design, implement, monitor, and maintain your pipelines. Its easy-to-use interface, out-of-the-box options, and flexible scriptability with Python make Airflow well suited to a wide range of data management tasks.
This book teaches you how to build and maintain an effective data pipeline. You’ll explore common usage patterns, including gathering data from multiple sources, connecting to data lakes, and deploying to the cloud. A combination of reference and training, this practical guide covers all aspects of the directed acyclic graphs (DAGs) that power Airflow and shows you how to customize them to meet your pipeline needs.
What you will learn:
- Build, test, and deploy Airflow pipelines as DAGs
- Automate data movement and transformation
- Analyze historical datasets using backfilling
- Develop custom components
- Set up Airflow in production environments
This course is suitable for people who:
- Work in DevOps, data engineering, machine learning engineering, or systems administration.
- Have intermediate skills in Python programming.
Data Pipelines with Apache Airflow video edition course details
- Publisher: O’Reilly
- Lecturer: Julian de Ruiter, Bas Harenslak
- Training level: Beginner to advanced
- Training duration: 10 hours and 22 minutes
- Number of lessons: 80
Course headings
- Part 1. Getting started
- Chapter 1 Meet Apache Airflow
- Chapter 1 Pipeline graphs vs. sequential scripts
- Chapter 1 Introducing Airflow
- Chapter 1 When to use Airflow
- Chapter 2 Anatomy of an Airflow DAG
- Chapter 2 Running a DAG in Airflow
- Chapter 2 Running at regular intervals
- Chapter 3 Scheduling in Airflow
- Chapter 3 Cron-based intervals
- Chapter 3 Processing data incrementally
- Chapter 3 Understanding Airflow’s execution dates
- Chapter 3 Best practices for designing tasks
- Chapter 4 Templating tasks using the Airflow context
- Chapter 4 Templating the PythonOperator
- Chapter 4 Hooking up other systems
- Chapter 5 Defining dependencies between tasks
- Chapter 5 Branching
- Chapter 5 Conditional tasks
- Chapter 5 More about trigger rules
- Chapter 5 Sharing data between tasks
- Chapter 5 Chaining Python tasks with the Taskflow API
- Part 2. Beyond the basics
- Chapter 6 Triggering workflows
- Chapter 6 Polling custom conditions
- Chapter 6 Triggering other DAGs
- Chapter 7 Communicating with external systems
- Chapter 7 Developing locally with external systems
- Chapter 7 Moving data between systems
- Chapter 8 Building custom components
- Chapter 8 Building a custom hook
- Chapter 8 Building a custom operator
- Chapter 8 Packaging your components
- Chapter 9 Testing
- Chapter 9 Setting up a CI/CD pipeline
- Chapter 9 Testing with files on disk
- Chapter 9 Working with external systems
- Chapter 9 Using tests for development
- Chapter 10 Running tasks in containers
- Chapter 10 Introducing containers
- Chapter 10 Containers and Airflow
- Chapter 10 Creating container images for tasks
- Chapter 10 Running tasks in Kubernetes
- Chapter 10 Using the KubernetesPodOperator
- Part 3. Airflow in practice
- Chapter 11 Best practices
- Chapter 11 Manage credentials centrally
- Chapter 11 Use factories to generate common patterns
- Chapter 11 Designing reproducible tasks
- Chapter 11 Handling data efficiently
- Chapter 11 Managing your resources
- Chapter 12 Operating Airflow in production
- Chapter 12 Which executor is right for me?
- Chapter 12 A closer look at the scheduler
- Chapter 12 Installing each executor
- Chapter 12 Setting up the KubernetesExecutor
- Chapter 12 Capturing logs of all Airflow processes
- Chapter 12 Visualizing and monitoring Airflow metrics
- Chapter 12 Creating dashboards with Grafana
- Chapter 12 How to get notified of a failing task
- Chapter 12 Scalability and performance
- Chapter 13 Securing Airflow
- Chapter 13 Encrypting data at rest
- Chapter 13 Encrypting traffic to the webserver
- Chapter 13 Fetching credentials from secret management systems
- Chapter 14 Project: Finding the fastest way to get around NYC
- Chapter 14 Extracting the data
- Chapter 14 Structuring a data pipeline
- Part 4. In the clouds
- Chapter 15 Airflow in the clouds
- Chapter 15 Google Cloud Composer
- Chapter 16 Airflow on AWS
- Chapter 16 AWS-specific hooks and operators
- Chapter 16 Building the DAG
- Chapter 17 Airflow on Azure
- Chapter 17 Overview
- Chapter 18 Airflow in GCP
- Chapter 18 Integrating with Google services
- Chapter 18 GCP-specific hooks and operators
- Chapter 18 Getting data into BigQuery
Installation Guide
After extracting the files, play them in your preferred video player.
Subtitles: None
Quality: 720p
File(s) password: www.downloadly.ir
File size
1.3 GB