In-Person Classroom

Unfortunately, this training model is not available for this certification

  • 4 days of guaranteed-to-run in-person training
  • Access to CP’s study guide designed by industry experts
  • Tips and tricks to help you pass the exam
  • 2 practice tests to gauge your learning post-training
  • Application assistance and support by certified staff

$ 2499 $ 2299

Live Online Classroom

An online training model with the virtual presence of an instructor

  • 4 days of assured instructor-led live online training
  • Access to CP’s study guide designed by industry experts
  • 24 PDUs certificate on completion of the training
  • 100% exam pass guarantee in the 1st attempt
  • Recorded lesson video for post-training learning

$ 2299 $ 1949

Online Self-Study

Study at your own pace with the self-study model of learning

  • 180 days of full access to the complete course
  • Access to CP’s study guide designed by industry experts
  • 24 PDUs certificate on completion of the training
  • 100% exam pass guarantee in the 1st attempt
  • Application assistance and support by certified staff

$ 2299 $ 899

Big Data Hadoop Administrator Certification Training

This course provides you with expertise in maintaining complex Hadoop clusters. The training dives deep into Big Data concepts and the responsibilities of the Hadoop administrator role.

Course Overview

Skill Rise Hadoop Certification training helps you gain expertise in maintaining complex Hadoop clusters. Our training provides hands-on preparation for the real-world challenges faced by Hadoop Administrators.

Course Agenda

  • Data & Existing Solutions

  • Welcome to the world of Big Data—What, Why & Where

  • Case studies

  • Hadoop & its Ecosystem

  • Hadoop Core components

  • Hadoop & its capabilities

  • Gain knowledge of HDFS: its internals, workings & features

  • Learn about possibilities without HDFS

  • Identify the differences and similarities among the distributions of Hadoop

  • Identify the requirements to set up a Hadoop cluster

  • The need for Cluster Management Solution

  • Choice of Installation methods—Automated/ Manual

  • Linux machines setup—Virtualization & Cloud

  • Hadoop Cluster Setup—Apache Hadoop V2 & Cloudera Distribution of Hadoop (CDH)

  • Cloudera Manager features and capabilities

  • Working with Hadoop cluster, HDFS & data

  • Working with management console/UI (user interfaces) & Linux terminals

  • Understand administration scenarios

  • List and describe the files that control Hadoop configuration

  • Explain how to manage Hadoop configuration with Cloudera Manager

  • Locate configuration files and make changes

  • Explain how to deal with stale configurations

  • Explain the properties of addresses and ports of RPC and HTTP servers run by Hadoop Daemons

  • Locate log files generated on hosts

  • Filter information in log files

  • Explain how to get diagnostic information from log files

  • Explain how to add and remove nodes in an ad-hoc way

  • Explain how to add and remove nodes in a systematic way, otherwise known as commissioning and decommissioning of nodes

  • Explain how to balance a cluster

  • List the steps for managing services including adding, deleting, starting, stopping and checking the status of services

  • Explain the procedure to enable rack awareness

  • List the steps to add, remove and move role instances and hosts

  • Cite the challenges faced with the first version of Hadoop

  • Explain the features in the second version that help overcome the challenges faced in the first version

  • Describe the role of computational frameworks

  • Explain MapReduce concepts

  • Describe MRv2 on YARN

  • Explain how YARN works and how to configure it

  • Describe YARN applications

  • Describe YARN memory and CPU settings

  • Describe the scheduling concepts

  • Identify the Schedulers

  • Explain the ways to manage resources using Schedulers

  • Describe FIFO, Fair Scheduler, and Capacity Scheduler

  • Explain how to configure Schedulers

  • Explain queue management

  • Planning Hadoop Cluster

  • General Planning considerations

  • Workload and cluster sizing

  • Making Choices—Hardware, Software & Network

  • Making Choices—Master/Slave considerations

  • News from the world—Existing Setups

  • Explain the concepts of Hadoop client, edge nodes, and gateway nodes

  • Install and configure Hadoop clients

  • Explain how Hue works

  • Install and configure Hue

  • Describe how authentication and authorization are managed in Hue

  • Understand Data Ingestion & its types

  • Knowing about various data ingestion tools & their capabilities

  • Understanding how Flume works

  • Understanding how Sqoop works

  • List some of the services and open-source components that work within the Hadoop ecosystem

  • List the advantages and key features of Hive

  • Briefly describe the components of Hive

  • Explain how to configure Hive in different modes

  • Explain the architecture of HBase and cite the advantages of using HBase

  • Explain how Apache Kafka works

  • Describe the architecture of Apache Spark

  • Describe the different ways to avoid risks and secure data

  • Identify the different threat categories

  • Describe the security aspects for different nodes

  • Describe operating system security

  • Describe Kerberos and how it works

  • Describe Service Level Authorization

  • Describe cluster monitoring

  • Describe the ways to choose the right monitoring solutions

  • List the features and considerations of Cloudera Manager for monitoring

  • Describe the different categories of Hadoop Metrics

  • List the different types of Hadoop Metrics

  • List the steps to monitor a cluster by using Cloudera Manager