Welcome to DigiStacKedu
Instructor

Mr. Prashant

Sr. Hadoop Developer & Consultant

Mr. Prashant is an expert in Big Data technologies and has worked with top-level IT companies.


  • Price : Call Now
  • Lessons : 12
  • Length : 3 Months
  • Level : Advanced
  • Category : Analysis
  • Started : 15 APR 2024
  • Shifts : 02
  • Classes : 90

Course Description

Big Data is all about distributed data processing, and it is an emerging field in the current IT sector. Professionals use Hadoop to process the huge amounts of data stored on various RDBMS and company servers. This course will upgrade your skills in Apache Spark, Python, Scala, and ETL operations using Sqoop; moreover, you will learn how real-time data is processed using Apache Kafka.

After completing this course you'll be able to work on big data tools; moreover, you will be an expert in the Python and Scala programming languages. In this course you will learn how to do distributed data processing using Python and Scala.

We will start this course with the basics of Hadoop and its architecture. Then you will work on advanced Hadoop tools such as Apache Sqoop and Apache Hive, which are required to process big data. You will learn how to analyze different types of data sets and how to visualize them and generate reports.

Course Syllabus

Module 1: Introduction to Big Data & Hadoop

  1. Introduction to big data and its background.
  2. Understanding the role of the Hadoop Framework.
  3. A Brief History of Distributed Computing.
  4. Understanding the Basics of Distributed Computing.
  5. Understanding Data Warehouses and their Role.
  6. Understanding Big Data vs Traditional Data Warehouse Systems.
  7. Introduction to RDBMS.
  8. Working on IBM DB2 or MySQL.
  9. Working on DB2/MySQL Databases and Tables.
  10. DB2/MySQL Hands-on Lab.
  11. Understanding RDBMS in a Big Data Environment.

Module 2: Virtualization & Cloudera Setup

  1. Understanding the Basics of Virtualization.
  2. Virtual Machine Installation and Configuration.
  3. Implementing Virtualization to Work with Big Data.
  4. Setup and Configuration of the Cloudera VM.
  5. Understanding Cloudera HDFS and Cloudera Manager.
  6. Understanding Cloudera and the MySQL Database.

Module 3: Hadoop Distributed File System (HDFS)

  1. Introduction to the Hadoop Distributed File System.
  2. Hadoop Cluster Architecture.
  3. Understanding Name Node and Data Nodes in Hadoop.
  4. Understanding the Role of the Job Tracker and Task Tracker.
  5. Role of the Secondary Name Node.
  6. Data Replication and Rack Awareness.
  7. High Availability in Hadoop.
  8. Working on the HDFS File System.
  9. Useful Commands to Manage HDFS.
  10. HDFS Hands-on Lab.
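The replication and rack-awareness lessons above rest on two ideas: files are split into fixed-size blocks, and each block's replicas are spread across racks. Here is a toy Python sketch of both, not HDFS code: the block size, rack layout, and node names are invented, and real HDFS uses 128 MB blocks with the NameNode enforcing the placement policy.

```python
# Toy sketch of HDFS-style block splitting and rack-aware replica
# placement. Block size, racks, and node names are made up for
# illustration; real HDFS defaults to 128 MB blocks.

BLOCK_SIZE = 4  # bytes, tiny on purpose
RACKS = {"rack1": ["node1", "node2"], "rack2": ["node3", "node4"]}

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split a byte string into fixed-size blocks, as HDFS does to files."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(writer_rack: str = "rack1"):
    """Default-policy sketch: first replica on the writer's rack,
    the other two together on one remote rack."""
    remote_rack = next(r for r in RACKS if r != writer_rack)
    return [RACKS[writer_rack][0], RACKS[remote_rack][0], RACKS[remote_rack][1]]

blocks = split_into_blocks(b"hello hdfs!")
placements = {i: place_replicas() for i in range(len(blocks))}
```

With the default replication factor of 3, losing any single node (or even a whole rack) still leaves a readable copy of every block.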

Module 4: MapReduce Programming

  1. Introduction to Map & Reduce.
  2. Deep Background of MapReduce.
  3. Understanding the Map Function.
  4. Understanding the Reduce Function.
  5. Working on Map and Reduce Together.
  6. MapReduce Programming Using Java.
  7. Temperature Analysis Using MapReduce Programming.
  8. Assignments on MapReduce Programming.
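The temperature-analysis exercise is the classic MapReduce example: find the maximum temperature per year. The course does this in Java on Hadoop; as a rough preview of the data flow, the map, shuffle, and reduce phases can be simulated in one Python process. The sample records are made up.

```python
from collections import defaultdict

# Pure-Python simulation of the MapReduce "max temperature per year"
# job: map parses records, shuffle groups by key, reduce aggregates.

records = ["1950,34", "1950,38", "1951,29", "1951,41"]

def map_phase(record):
    """Map: parse one line into a (year, temperature) pair."""
    year, temp = record.split(",")
    return year, int(temp)

def shuffle(pairs):
    """Shuffle: group values by key, as Hadoop does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(year, temps):
    """Reduce: emit the maximum temperature for one year."""
    return year, max(temps)

pairs = [map_phase(r) for r in records]
result = dict(reduce_phase(y, t) for y, t in shuffle(pairs).items())
# result == {"1950": 38, "1951": 41}
```

On a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; the logic per record is the same.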

Module 5: Apache Hive

  1. An Introduction to Apache Hive.
  2. Introduction to Data Warehouses.
  3. Understanding the Hive Environment and its Architecture.
  4. Hive Thrift Server and Hive Client.
  5. Working with Hive Query Language (HQL).
  6. Performing DDL Operations Through Hive.
  7. Nested Queries in Hive.
  8. Hive Partitioning and Bucketing.
  9. Working on Hive User-Defined Functions.
  10. Hive Left Join and Right Outer Join.
  11. Windowing Functions in Hive.
  12. Hands-on Lab on Apache Hive.
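The bucketing lesson comes down to one rule: Hive assigns a row to a bucket by hashing the clustering column modulo the bucket count (declared with CLUSTERED BY ... INTO N BUCKETS). Here is a hypothetical Python sketch of that assignment; the hash function and sample rows are simplified stand-ins, not Hive's actual hash.

```python
# Sketch of Hive-style bucketing: bucket = hash(key) % num_buckets.
# The polynomial hash below is a stand-in chosen for reproducibility;
# Hive's real hash function differs.

NUM_BUCKETS = 4

def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Deterministic bucket assignment from a string key."""
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0x7FFFFFFF  # keep it a positive 31-bit int
    return h % num_buckets

rows = ["alice", "bob", "carol", "dave"]
buckets = {row: bucket_for(row) for row in rows}
```

Because the assignment is deterministic, two bucketed tables clustered on the same column into the same bucket count can be joined bucket-by-bucket, which is what makes bucketed joins efficient.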

Module 6: ETL with Apache Sqoop & Apache Flume

  1. An Introduction to ETL Techniques.
  2. Introduction to Apache Sqoop.
  3. Requirement of Sqoop and its Background.
  4. Performing Sqoop Operations on the Hadoop Data Warehouse.
  5. Loading Data Using the Sqoop Tool in Cloudera.
  6. Introduction to Flume and its Methodology.
  7. Requirement of Apache Flume.
  8. Activating a Flume Agent on Hadoop and Working on Sinks.
  9. Execution of a Flume Agent on Hadoop.

Module 7: Apache Pig

  1. Introduction to Apache Pig.
  2. Apache Pig Architecture.
  3. Understanding Pig Local and MapReduce Modes.
  4. Working with Pig Scripts.
  5. Executing and Managing Pig Scripts on Hadoop.
  6. Apache Pig Advantages and Disadvantages.
  7. Working on Pig User-Defined Functions.
  8. Assignments on Apache Pig.

Module 8: Introduction to Apache Spark

  1. Introduction to Apache Spark.
  2. An Introduction to Python Programming.
  3. Working on the Scala Programming Language.
  4. Understanding the Runtime Architecture of Spark.
  5. Working on RDDs (Resilient Distributed Datasets).
  6. Creating Spark RDDs Using Scala and Python.
  7. Eclipse Setup and Configuration for Apache Spark.
  8. Spark App Development Using Scala.

Module 9: Spark Transformations & Actions

  1. Working on Spark Transformations and Actions.
  2. Performing the Map Transformation on an RDD.
  3. Performing the FlatMap Transformation on an RDD.
  4. Performing the Filter Operation on a Spark RDD.
  5. Executing GroupByKey and ReduceByKey.
  6. Understanding Union and Join Operations on RDDs.
  7. Word Count Analysis Using the Scala Programming Language.
  8. Introduction to HDFS and the Hue Environment.
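The word-count analysis in this module chains flatMap, map, and reduceByKey. As a single-process preview of the same pipeline, those transformations can be simulated over plain Python lists; the helper function names below are our own, not Spark's API, and in the course this runs on an RDD via the Spark shell.

```python
# Simulating the Spark word-count pipeline over plain lists:
# flatMap -> map to (word, 1) -> reduceByKey with addition.

lines = ["big data big ideas", "data moves fast"]

def flat_map(func, data):
    """flatMap: apply func to each element and flatten the results."""
    return [item for element in data for item in func(element)]

def reduce_by_key(func, pairs):
    """reduceByKey: combine all values that share a key with func."""
    acc = {}
    for key, value in pairs:
        acc[key] = func(acc[key], value) if key in acc else value
    return acc

words = flat_map(str.split, lines)                 # flatten lines into words
pairs = [(w, 1) for w in words]                    # map each word to (word, 1)
counts = reduce_by_key(lambda a, b: a + b, pairs)  # sum counts per word
# counts == {"big": 2, "data": 2, "ideas": 1, "moves": 1, "fast": 1}
```

In Spark the same three steps are lazy and distributed: nothing executes until an action (such as collect or saveAsTextFile) is called.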

Module 10: YARN Architecture

  1. Understanding the YARN Architecture.
  2. Understanding the Resource Manager and App Manager.
  3. Understanding the YARN Slave Daemons.
  4. Background of the Node Manager & App Master.
  5. Role of Containers in YARN.
  6. Understanding YARN Jobs in Cloudera.
  7. Spark Shell (Scala and Python Shells).
  8. Using Pair RDDs to Join Two Datasets.

Module 11: Spark SQL & Data Frames

  1. Executing Spark Applications.
  2. Spark SQL on the Cloudera Virtual Machine.
  3. Understanding HiveContext and SQLContext.
  4. Working on Data Frames Using Python.
  5. Working on Data Frames Using Scala.
  6. Understanding Case Class and Row Functionality.
  7. Performing Join Operations on Data Frames.
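The join lesson can be previewed without a SparkSession: an inner join over two row lists mirrors what a Data Frame join does on a key column. The column names and rows below are invented for illustration, and this single-machine sketch ignores the distributed shuffle a real Spark join performs.

```python
# Toy inner join on plain lists of dicts, mirroring
# df_employees.join(df_salaries, "id") in Spark SQL.

employees = [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]
salaries = [{"id": 1, "salary": 50000}, {"id": 3, "salary": 40000}]

def inner_join(left, right, key):
    """Inner join two row lists on a shared key column."""
    index = {row[key]: row for row in right}          # build lookup on right side
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

joined = inner_join(employees, salaries, "id")
# joined == [{"id": 1, "name": "Asha", "salary": 50000}]
```

Rows with no match on the other side (id 2 and id 3 here) are dropped, exactly as in an inner join; left and right outer joins would keep them with nulls.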

Module 12: Spark Streaming & Apache Kafka

  1. An Introduction to Spark Streaming.
  2. Why Spark Streaming?
  3. Developing a Network Streaming Module.
  4. Working on the Streaming Context.
  5. Windowing a Live Stream Using the Spark Streaming Module.
  6. Introduction to Apache Kafka.
  7. Producing and Consuming Apache Kafka Messages.
  8. Collecting Web Server Logs with Apache Flume.
  9. Sending Web Server Log Messages from Apache Flume to Apache Kafka.
  10. Working on Broadcast Variables & Accumulators.
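The windowing lessons revolve around one mechanism: keep the most recent events and recompute an aggregate each time a new one arrives. Here is a count-based Python sketch of that sliding window; real Spark Streaming windows are time-based over micro-batches and run against a live source such as a socket or Kafka topic, and the sample stream below is invented.

```python
from collections import deque

# Count-based sliding window: sum the last `window_size` events
# every time a new event arrives.

def windowed_sums(events, window_size=3):
    """Yield the running sum over the most recent `window_size` events."""
    window = deque(maxlen=window_size)  # oldest value drops out automatically
    sums = []
    for value in events:
        window.append(value)
        sums.append(sum(window))        # aggregate over the current window
    return sums

stream = [1, 2, 3, 4, 5]
result = windowed_sums(stream)
# result == [1, 3, 6, 9, 12]
```

Spark's window(windowLength, slideInterval) generalizes this: the window covers a time span rather than a fixed count, and the aggregate is computed across the cluster.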

Module 13: Project Work

  1. An Introduction to Project Development.
  2. Project Assignments.
  3. Task Allocation and Project Guidance.
  4. Final Project Submission.

Student Reviews

Reviews from our students after completing this course.

Frequently Asked Questions

Yes, you will get an industry-valid certificate after completion of this course, and you will be tagged as a Big Data Expert.
Yes, you can pay your fee in two installments, but it depends on the course you selected.
Yes, you can join the Big Data & Hadoop online course even if you do not come from a computer science background. Our experts will help you upgrade your technical skills; you just need some basic computer skills for this course.
This course is 100% practical-oriented. You will work on Cloudera, HDFS, Hive, Pig, Sqoop, and Flume; moreover, you will learn how to do distributed data processing using Apache Spark and Kafka.
DigiStackEdu provides cost-effective, quality training. We focus on every student, and we understand the value of money. Our trainers are certified and have more than 10 years of experience in Digital Marketing, Data Science, Java Programming, Web Designing, Big Data, and Data Analytics. Moreover, we have trained 47,000+ students and professionals who are working in top-level IT companies.
DigiStackEdu only provides internships in Data Science, Digital Marketing, Java Programming, PHP Programming, and Web Development. For more details, please contact our support team.
Big Data and Apache Spark form the most in-demand data processing stack in the current IT industry, and you'll get a lot of job opportunities after completing this course.
You can submit your fee after 3 classes. Even after that, if you face any issue within the next 7 days, you can contact our support team for a refund.
We'll start this course with very basic concepts of Big Data and Hadoop, then move on to advanced Hadoop concepts. It will take approximately 3 months to master Big Data, after which you will receive an industry-valid certificate and be tagged as a Hadoop Expert.