O'Reilly – Algorithms and Data Structures for Massive Datasets, Video Edition 2022-10

Apr 18, 2025 - 18:18 Updated: Aug 17, 2025 - 18:20

Algorithms and Data Structures for Massive Datasets, Video Edition. Today’s massive datasets challenge traditional data structures and algorithms. This engaging, practical guide introduces techniques that can reliably handle even the largest distributed datasets. The course familiarizes you with modern methods for managing and analyzing very large datasets. After completing it, you will be able to design and implement high-performance, scalable systems for processing large datasets. Complex concepts are explained simply, with practical, real-world examples.

What you will learn:

Probabilistic sketching data structures for solving practical problems
Choosing the right database for your application
Evaluating and designing efficient on-disk data structures and algorithms
Understanding algorithmic trade-offs in large-scale systems
Calculating basic statistics from streaming data
Sampling correctly from streaming data
Calculating percentiles with limited space

Who is this course suitable for?

Those who face the challenges of managing big data.
Those looking for ways to improve the efficiency of data processing systems.
Those who want to use modern techniques for data analysis.
Those interested in the principles behind systems like the ones at Google and Facebook.

Course details

Publisher: O'Reilly
Instructors: Emin Tahirovic, Dzejla Medjedovic, Ines Dedovic
Training level: Beginner to advanced
Training duration: 9 hours and 53 minutes

Course topics

Chapter 1. Introduction
Chapter 1. An example: How to solve it
Chapter 1. How to solve it, take two: A book walkthrough
Chapter 1. The structure of this book
Chapter 1. Latency vs. bandwidth
Part 1. Hash-based sketches
Chapter 2. Review of hash tables and modern hashing
Chapter 2. Usage scenarios in modern systems
Chapter 2. Collision resolution: Theory vs. practice
Chapter 2. Hash tables for distributed systems: Consistent hashing
Chapter 2. Adding a new node/resource
Chapter 3. Approximate membership: Bloom and quotient filters
Chapter 3. A simple implementation
Chapter 3. A bit of theory
Chapter 3. Bloom filter adaptations and alternatives
Chapter 3. Understanding metadata bits
Chapter 3. Python code for lookup
Chapter 3. Comparison between Bloom filters and quotient filters
Chapter 4. Frequency estimation and count-min sketch
Chapter 4. Update
Chapter 4. Error vs. space in count-min sketch
Chapter 4. Range queries with count-min sketch
Chapter 5. Cardinality estimation and HyperLogLog
Chapter 5. HyperLogLog incremental design
Chapter 5. LogLog
Chapter 5. Use case: Catching worms with HLL
Chapter 5. The effect of the number of buckets (m)
Part 2. Real-time analytics
Chapter 6. Streaming data: Bringing everything together
Chapter 6. Streaming data system: A meta example
Chapter 6. Deduplication
Chapter 6. Practical constraints and concepts in data streams
Chapter 6. Math bit: Sampling and estimation
Chapter 6. Biased sampling strategy
Chapter 7. Sampling from data streams
Chapter 7. Reservoir sampling
Chapter 7. Biased reservoir sampling
Chapter 7. Sampling from a sliding window
Chapter 7. Priority sampling
Chapter 7. Sampling algorithms comparison
Chapter 8. Approximate quantiles on data streams
Chapter 8. Approximate quantiles
Chapter 8. T-digest: How it works
Chapter 8. Scale functions
Chapter 8. Merging t-digests
Chapter 8. Q-digest
Chapter 8. Quantile queries with q-digests
Part 3. Data structures for databases and external memory algorithms
Chapter 9. Introducing the external memory model
Chapter 9. Example 1: Finding a minimum
Chapter 9. Example 2: Binary search
Chapter 9. Optimal searching
Chapter 9. External memory model: Simple or simplistic?
Chapter 10. Data structures for databases: B-trees, Bε-trees, and LSM-trees
Chapter 10. Data structures in this chapter
Chapter 10. B-tree balancing
Chapter 10. Delete
Chapter 10. Math bit: Why are B-tree lookups optimal in external memory?
Chapter 10. Bε-trees
Chapter 10. Lookups
Chapter 10. Log-structured merge-trees (LSM-trees)
Chapter 10. LSM-tree cost analysis
Chapter 11. External memory sorting
Chapter 11. Challenges of sorting in external memory: An example
Chapter 11. External memory merge-sort (M/B-way merge-sort)
Chapter 11. What about external quick-sort?
Chapter 11. Finding good enough pivots

Installation Guide

After extracting the files, play them with your preferred video player.

Subtitles: None

Quality: 720p

Download link

Download Part 1 – 1 GB

Download Part 2 – 42 MB

File(s) password: www.downloadly.ir

File size

1.04 GB
