Big Data Engineering Certification Program
Course Duration: 4 Months
Course Fee: Rs 49,999/-
Module-I: Introduction to Big Data and Hadoop
Module-II: HADOOP Framework
Module-III: ETL & Batch Processing with Hadoop
Module-IV: Functional Programming with Scala
Module-V: Big Data Analytics with Spark
Module-VI: Stream processing Frameworks
Module-VII: Capstone Project
-
Module-I: Introduction to Big Data and Hadoop
Topics
Introduction to Big Data & Big Data Analytics
Challenges of Traditional Systems
Distributed Systems and Introduction to Hadoop
-
Module-II: HADOOP Framework
Components of Hadoop Ecosystem
1. Data and Distributed Storage (HDFS)
2. YARN Architecture
3. MapReduce Programming
4. Components of Pig
-
Module-III: ETL & Batch Processing with Hadoop
Topics
1) ETL & Data Warehousing
2) Data Ingestion using Sqoop and Flume
3) Apache Kafka Architecture
4) Apache Hive Architecture, Interfaces, Hive Metastore, Dynamic Partitioning
5) NoSQL Databases, HBase Architecture, Data Model
6) Oozie: Workflow Scheduler for Hadoop
-
Module-IV: Functional Programming with Scala
Topics
1) Introduction to Scala
2) Programming with Scala
3) Type Inference, Classes and Collections
-
Module-V: Big Data Analytics with Spark
Topics
1) Introduction to Spark
2) Hadoop Ecosystem vs Spark
3) Spark Core Processing with RDDs: RDD Operations, Debugging, Partitioning, Scheduling & Shuffling
4) Analytics using Spark SQL: Data Pre-processing, Regression, Classification & Clustering
5) Spark GraphX
-
Module-VI: Stream processing Frameworks
Topics
1) Streaming Overview
2) Real Time Processing of Big Data
3) Spark Streaming
4) Structured Streaming Applications
-
Module-VII: Capstone Project
Tools Covered
Java, Hadoop, Sqoop, Flume, Hive, HBase, Scala, Apache Spark