Sunday, February 02, 2020

Data Platform Tips 67 - Azure HDInsight Availability Infrastructure

HDInsight provides customized infrastructure to ensure that four primary services are highly available, with automatic failover:

  • Apache Ambari server
  • Application Timeline Server for Apache YARN
  • Job History Server for Hadoop MapReduce
  • Apache Livy

This infrastructure consists of a number of services and software components, some of which are designed by Microsoft. The following components are unique to the HDInsight platform:

  • Slave failover controller
  • Master failover controller
  • Slave high availability service
  • Master high availability service

Other high availability services are supported by open source Apache reliability components, which are also present on HDInsight clusters:

  • Hadoop File System (HDFS) NameNode
  • YARN ResourceManager
  • HBase Master
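
As a rough illustration, the Apache-backed services above can be looked up through the Ambari REST API that HDInsight clusters expose. The sketch below only builds the request URLs and makes no network calls. The cluster name "mycluster" and the helper names are hypothetical, and the service/component identifiers assume Ambari's standard names (HDFS/NAMENODE, YARN/RESOURCEMANAGER, HBASE/HBASE_MASTER).

```python
from urllib.parse import quote

# Assumed Ambari (service, component) identifiers for the three
# Apache-backed high availability services listed above.
HA_COMPONENTS = {
    "HDFS NameNode": ("HDFS", "NAMENODE"),
    "YARN ResourceManager": ("YARN", "RESOURCEMANAGER"),
    "HBase Master": ("HBASE", "HBASE_MASTER"),
}


def ambari_component_url(cluster_name: str, service: str, component: str) -> str:
    """Build the Ambari REST URL for one component (hypothetical helper)."""
    name = quote(cluster_name, safe="")
    base = f"https://{name}.azurehdinsight.net/api/v1/clusters/{name}"
    return f"{base}/services/{service}/components/{component}"


def ha_component_urls(cluster_name: str) -> dict:
    """Map each high availability service label to its Ambari REST URL."""
    return {
        label: ambari_component_url(cluster_name, service, component)
        for label, (service, component) in HA_COMPONENTS.items()
    }


if __name__ == "__main__":
    # "mycluster" is a placeholder; substitute a real cluster name.
    for label, url in ha_component_urls("mycluster").items():
        print(f"{label}: {url}")
```

Sending an authenticated GET request to one of these URLs with the cluster login credentials should return JSON describing that component, including which hosts it runs on. HBase Master is only present on clusters that include HBase.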

