Datacamp – Big Data with PySpark 2024-8

Big Data with PySpark: advance your data skills by mastering Apache Spark. Using PySpark, the Spark Python API, you will leverage parallel computation on large datasets and get ready for high-performance machine learning. From cleaning data to creating features and implementing machine learning models, you’ll execute end-to-end workflows with Spark. The track ends with building a recommendation engine using the popular MovieLens dataset and the Million Songs dataset.
What you’ll learn
- Learn to implement distributed data management and machine learning in Spark using the PySpark package.
- Learn the fundamentals of working with big data with PySpark.
- Learn how to clean data with Apache Spark in Python.
- Learn the gritty details of data wrangling and feature engineering, the work data scientists spend 70-80% of their time on.
- Learn how to make predictions from data with Apache Spark, using decision trees, logistic regression, linear regression, ensembles, and pipelines.
Specification of Big Data with PySpark
- Publisher : Datacamp
- Teacher : Lore Dirick
- Language : English
- Level : All Levels
- Number of Courses : 6
- Duration : 25 hours
Installation Guide
Extract the files and watch with your favorite player
Subtitle : English
Quality: 720p
Download Links
Big Data Fundamentals with PySpark
Building Recommendation Engines with PySpark
Cleaning Data with PySpark
Feature Engineering with PySpark
Introduction to PySpark
Machine Learning with PySpark
Password file(s): www.downloadly.ir
File size
409 MB