Kubernetes for Scalability and High Availability
Kubernetes is an open-source system for automatically deploying, scaling, and managing containerized applications. With Kubernetes, we can deploy our applications quickly and scale them to meet demand without taking anything down in the process.

Kubernetes (k8s) is a platform for managing containerized workloads and services that provides the means to deploy, scale, and monitor applications. It was originally designed by Google and is now maintained by the Cloud Native Computing Foundation. Many cloud providers offer Kubernetes as a managed platform, or provide infrastructure on which Kubernetes can be deployed. But how can we be sure that Kubernetes stays up and running when a single component, or even an entire data center, goes down?
This is where planning for Kubernetes high availability comes into play. K8s HA is not only about the stability of Kubernetes itself; it is also about setting up Kubernetes, along with supporting components such as etcd, in such a way that there is no single point of failure, as explained by Kubernetes expert Lucas Käldström.
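To see why a supporting component such as etcd must be replicated, it helps to look at the quorum arithmetic: etcd can only commit writes while a majority of its members are healthy. The short sketch below is plain Python, not part of any Kubernetes tooling, and simply computes that majority and the number of member failures a cluster of a given size can tolerate.

```python
import math

def etcd_fault_tolerance(members: int) -> tuple[int, int]:
    """Return (quorum, tolerated_failures) for an etcd cluster of `members` nodes.

    etcd needs a majority (quorum) of members to agree before it can commit
    a write, so the cluster keeps working only while a majority remains.
    """
    quorum = math.floor(members / 2) + 1
    return quorum, members - quorum

for size in (1, 3, 5):
    quorum, tolerated = etcd_fault_tolerance(size)
    print(f"{size} members: quorum={quorum}, tolerates {tolerated} failure(s)")

# A single etcd member tolerates zero failures, i.e. it is a single point of
# failure; three members tolerate one failure, five members tolerate two.
```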
High availability essentially means eliminating every single point of failure in the cluster. A highly available cluster should be able to lose a few masters and keep running. Suppose a cluster has multiple masters and multiple etcd replicas but is still running just one kube-dns replica. If the node running kube-dns fails, the cluster will experience a partial outage, because service discovery queries may go unanswered. So we need to look one level deeper, identify the key components in the cluster, and eliminate each of their single points of failure, as in the sketch below.
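Removing kube-dns as a single point of failure, for instance, is usually just a matter of running more than one replica of the cluster DNS deployment. Here is a minimal sketch using the official Python client; the deployment name "coredns" in the "kube-system" namespace is an assumption based on a default kubeadm cluster (older clusters may name it "kube-dns" instead).

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (use load_incluster_config()
# when running inside the cluster).
config.load_kube_config()

apps = client.AppsV1Api()

# Patch the cluster DNS deployment so that more than one replica is running,
# removing it as a single point of failure. Name and namespace are assumed.
apps.patch_namespaced_deployment(
    name="coredns",
    namespace="kube-system",
    body={"spec": {"replicas": 2}},
)
```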
Horizontal Pod Autoscaler
Enterprise applications have to be designed up front for scalability and change. This has significant implications for both application architecture and infrastructure. Application architecture has evolved gradually from unmanageable monolithic or three-tier patterns to interconnected microservices. Microservices introduce new form factors not only for functionality and team size but also for the unit of infrastructure. A portable container, or a pod, often works out as the most appropriate unit of infrastructure for a microservices-based architecture. Fortunately, container-based scalable infrastructure is a reality, thanks to the Kubernetes project, incepted at Google and open-sourced in 2014.
The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization.
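As a concrete sketch, the following uses the official Python client to create an autoscaling/v1 HorizontalPodAutoscaler for a hypothetical Deployment named "web", targeting 50% average CPU utilization; the names and thresholds are illustrative rather than taken from the text.

```python
from kubernetes import client, config

config.load_kube_config()
autoscaling = client.AutoscalingV1Api()

# HorizontalPodAutoscaler for a hypothetical Deployment named "web":
# keep average CPU utilization around 50%, with between 2 and 10 replicas.
hpa = client.V1HorizontalPodAutoscaler(
    api_version="autoscaling/v1",
    kind="HorizontalPodAutoscaler",
    metadata=client.V1ObjectMeta(name="web-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web"
        ),
        min_replicas=2,
        max_replicas=10,
        target_cpu_utilization_percentage=50,
    ),
)

autoscaling.create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```

This is equivalent to running kubectl autoscale deployment web --cpu-percent=50 --min=2 --max=10 against the same cluster.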
The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The controller periodically adjusts the number of replicas in a replication controller or deployment so that the observed average CPU utilization matches the target specified by the user.
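The adjustment the controller makes on each pass follows the proportional rule documented for the HPA: scale the current replica count by the ratio of observed to target utilization. Below is a rough sketch of that calculation in plain Python, ignoring the tolerances, stabilization windows, and min/max clamping the real controller also applies.

```python
import math

def desired_replicas(current_replicas: int,
                     current_cpu_utilization: float,
                     target_cpu_utilization: float) -> int:
    """Core HPA rule: desired = ceil(current * observed / target)."""
    ratio = current_cpu_utilization / target_cpu_utilization
    return math.ceil(current_replicas * ratio)

# 4 pods averaging 90% CPU against a 50% target -> scale out to 8 pods.
print(desired_replicas(4, 90, 50))  # 8
# 4 pods averaging 20% CPU against a 50% target -> scale in to 2 pods.
print(desired_replicas(4, 20, 50))  # 2
```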