This plan includes
- Limited free courses access
- Play & Pause Course Videos
- Video Recorded Lectures
- Learn on Mobile/PC/Tablet
- Quizzes and Real Projects
- Lifetime Course Certificate
- Email & Chat Support
What you'll learn
- PySpark Programming
- Data Analysis
- Python and Bokeh
- Data Transformation and Manipulation
- Data Visualization
- Big Data Machine Learning
- Geo Mapping
- Geospatial Machine Learning
- Creating Dashboards
Course Overview
Welcome to the Building Big Data Pipelines with PySpark & MongoDB & Bokeh course. In
this course we will build an intelligent data pipeline using big data technologies such as
Apache Spark and MongoDB.
We will build an ETLP pipeline. ETLP stands for Extract, Transform, Load, and Predict.
These are the stages that our data has to go through in order
to become useful. Once the data has gone through this pipeline, we will be able to
use it to build reports and dashboards for data analysis.
The data pipeline that we will build comprises data processing using PySpark, predictive
modelling using Spark’s MLlib machine learning library, and data analysis using MongoDB and
Bokeh.
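The four ETLP stages described above can be sketched in plain Python. This is only an illustrative outline under stated assumptions, not the course's code: the sample records, the function names, the in-memory list standing in for a MongoDB collection, and the mean-based "prediction" are all hypothetical placeholders for what the course does with PySpark, MLlib, and MongoDB.

```python
from statistics import mean

# Hypothetical raw records; the course's real dataset is not shown here.
RAW_ROWS = [
    {"lat": "10.5", "lon": "20.1", "value": "4.0"},
    {"lat": "11.0", "lon": "19.8", "value": "6.0"},
    {"lat": "", "lon": "21.3", "value": "5.0"},  # incomplete row
]


def extract():
    """Extract: return the raw rows (stand-in for reading a source file)."""
    return list(RAW_ROWS)


def transform(rows):
    """Transform: drop incomplete rows and cast string fields to floats."""
    cleaned = []
    for row in rows:
        if all(row.values()):  # an empty string marks a missing field
            cleaned.append({key: float(val) for key, val in row.items()})
    return cleaned


def load(rows, store):
    """Load: append rows to a list standing in for a MongoDB collection."""
    store.extend(rows)
    return store


def predict(store):
    """Predict: attach a naive mean-based value (placeholder for an MLlib model)."""
    baseline = mean(row["value"] for row in store)
    return [dict(row, prediction=baseline) for row in store]


def run_pipeline():
    """Run the stages in order: Extract, Transform, Load, Predict."""
    store = []
    load(transform(extract()), store)
    return predict(store)


if __name__ == "__main__":
    for row in run_pipeline():
        print(row)
```

Keeping each stage as its own function mirrors the way the curriculum splits the work into separate ETL and machine learning scripts, so any one stage could be swapped for a PySpark or MongoDB implementation without touching the others.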
- You will learn how to create data processing pipelines using PySpark
- You will learn machine learning with geospatial data using the Spark MLlib library
- You will learn data analysis using PySpark, MongoDB and Bokeh inside Jupyter Notebook
- You will learn how to manipulate, clean and transform data using PySpark DataFrames
- You will learn basic geo mapping
- You will learn how to create dashboards
- You will also learn how to create a lightweight server to serve Bokeh dashboards
Pre-requisites
- Basic understanding of Python
- Little or no prior understanding of GIS is needed
- Basic understanding of Programming concepts
- Basic understanding of Data
- Basic understanding of what Machine Learning is
Target Audience
- Python Developers at any level
- Developers at any level
- Machine Learning engineers at any level
- Data Scientists at any level
- The curious mind
- GIS Developers at any level
Curriculum 25 Lectures 02:03:01
Section 1: Introduction
Section 2: Setup and Installations
- Lecture 1: Python Installation
- Lecture 2: Installing Third Party Libraries
- Lecture 3: Installing Apache Spark
- Lecture 4: Installing Java (Optional)
- Lecture 5: Testing Apache Spark Installation
- Lecture 6: Installing MongoDB
- Lecture 7: Installing NoSQL Booster for MongoDB
Section 3: Data Processing with PySpark and MongoDB
- Lecture 1: Integrating PySpark with Jupyter Notebook
- Lecture 2: Data Extraction
- Lecture 3: Data Transformation
- Lecture 4: Loading Data into MongoDB
Section 4: Machine Learning with PySpark and MongoDB
- Lecture 1: Data Pre-processing
- Lecture 2: Building the Predictive Model
- Lecture 3: Creating the Prediction Dataset
Section 5: Data Visualization
- Lecture 1: Loading the Data Sources from MongoDB
- Lecture 2: Creating a Map Plot
- Lecture 3: Creating a Bar Chart
- Lecture 4: Creating a Magnitude Plot
- Lecture 5: Creating a Grid Plot
Section 6: Building the Data Pipeline Scripts
- Lecture 1: Installing Visual Studio Code
- Lecture 2: Creating the PySpark ETL Script
- Lecture 3: Creating the Machine Learning Script
- Lecture 4: Creating the Dashboard Server
Section 7: Source Code and Notebook
- Lecture 1: Source Code and Notebook
Frequently Asked Questions
- How do I access the course after purchase?
It's simple. When you sign up, you'll immediately have unlimited viewing of thousands of expert courses, paths to guide your learning, tools to measure your skills, and hands-on resources like exercise files. There’s no limit on what you can learn, and you can cancel at any time.
- Are these video-based online self-learning courses?
Yes. All of the courses come with online video-based lectures created by certified instructors. Instructors have crafted these courses with a blend of high-quality interactive videos, lectures, quizzes, and real-world projects to give you in-depth knowledge of the topic.
- Can I play and pause the course at my convenience?
Yes, absolutely, and that's one of the advantages of self-paced courses. You can pause or resume the course at any time, move back and forth between lectures, play the videos multiple times, and so on.
- How do I contact the instructor with doubts or questions?
Most general questions are already answered within the course lectures. However, if you need further help from the instructor, you can use the built-in Chat with Instructor option to send a message, and they will reply to you within 24 hours. You can ask as many questions as you want.
- Do I need a PC to access the course, or can I use a mobile or tablet as well?
Brilliant question, isn't it? You can access the courses on any device, such as a PC, mobile, tablet, or even a smart TV. On mobile and tablet you can download the Learnfly Android or iOS app. If the mobile app is not available in your country, you can access the course directly by visiting our website, which is fully mobile friendly.
- Do I get a certificate for the courses?
Yes. Once you complete any course on our platform, along with the assessments provided by the instructor, you will be eligible for a certificate of course completion.
- For how long can I access my course on the platform?
You need an active subscription to access courses on our platform. While your subscription is active, you can access any course on our platform with no restrictions.
- Is there a free trial?
Currently, we do not offer a free trial.
- Can I cancel anytime?
Yes, you can cancel your subscription at any time. Your subscription will auto-renew until you cancel, but why would you want to?