Apache Hadoop Administrator | Agilitics






Apache Hadoop Administrator


No. of Days: 4

Prerequisites:
Prior knowledge of Apache Hadoop is not required. Unix/Linux
administration knowledge is helpful.
Associated Certification(s):
Upon completion of the course, attendees can pursue the CCAH or HDP
Administrator certification. Certification is a strong differentiator: it
gives employers and customers tangible evidence of your skills and
expertise.
Course Objectives:
This four-day administrator training course for Apache Hadoop gives
participants a comprehensive understanding of the steps needed to
operate and maintain a Hadoop cluster, from installation and
configuration through load balancing and tuning. The course prepares
participants for the real-world challenges faced by Hadoop
administrators.
Course Content:
Through instructor-led discussion and interactive, hands-on exercises,
participants will navigate the Hadoop ecosystem, learning topics such as:

  • The internals of YARN, MapReduce, and HDFS
  • Determining the correct hardware and infrastructure for your cluster
  • Proper cluster configuration and deployment to integrate with the data center
  • How to load data into the cluster from dynamically generated files using Flume and from relational databases using Sqoop
  • Configuring the FairScheduler to provide service-level agreements for multiple users of a cluster
  • Best practices for preparing and maintaining Apache Hadoop in production
  • Troubleshooting, diagnosing, tuning, and solving Hadoop issues

Course Outline
The Case for Apache Hadoop

  • Why Hadoop?
  • Core Hadoop Components
  • Fundamental Concepts

HDFS

  • HDFS Features
  • Writing and Reading Files
  • NameNode Memory Considerations
  • Overview of HDFS Security
  • Using the NameNode Web UI
  • Using the Hadoop File Shell

Getting Data into HDFS

  • Ingesting Data from External Sources with Flume
  • Ingesting Data from Relational Databases with Sqoop
  • Best Practices for Importing Data

YARN and MapReduce

  • What Is MapReduce?
  • Basic MapReduce Concepts
  • YARN Cluster Architecture
  • Resource Allocation
  • Failure Recovery
  • Using the YARN Web UI
  • MapReduce Version 1

Planning Your Hadoop Cluster

  • General Planning Considerations
  • Choosing the Right Hardware
  • Network Considerations
  • Configuring Nodes
  • Planning for Cluster Management

Hadoop Installation and Initial Configuration

  • Deployment Types
  • Installing Hadoop
  • Specifying the Hadoop Configuration
  • Performing Initial HDFS Configuration
  • Performing Initial YARN and MapReduce Configuration
  • Hadoop Logging

Installing and Configuring Hive, Impala, and Pig

  • Hive
  • Impala
  • Pig

Hadoop Clients

  • What Is a Hadoop Client?
  • Installing and Configuring Hadoop Clients
  • Installing and Configuring Hue
  • Hue Authentication and Authorization

Cloudera Manager / Apache Ambari

  • The Motivation for Cloudera Manager / Apache Ambari
  • Cloudera Manager / Apache Ambari Features
  • Express and Enterprise Versions
  • Cloudera Manager / Apache Ambari Topology
  • Installing Cloudera Manager / Apache Ambari
  • Installing Hadoop Using Cloudera Manager / Apache Ambari
  • Performing Basic Administration Tasks Using Cloudera Manager /
    Apache Ambari

Advanced Cluster Configuration

  • Configuring Hadoop Ports
  • Explicitly Including and Excluding Hosts
  • Configuring HDFS for Rack Awareness
  • Configuring HDFS High Availability

Hadoop Security

  • Why Hadoop Security Is Important
  • Hadoop’s Security System Concepts
  • What Kerberos Is and How It Works

Cluster Maintenance

  • Checking HDFS Status
  • Copying Data Between Clusters
  • Adding and Removing Cluster Nodes
  • Rebalancing the Cluster
  • Cluster Upgrading

Cluster Monitoring and Troubleshooting

  • General System Monitoring
  • Monitoring Hadoop Clusters
  • Troubleshooting Hadoop Clusters
  • Common Misconfigurations