The growing demand for distributed applications is forcing teams to rethink how applications are deployed and managed in data centers. Resource optimization, network management, resource scheduling and fault tolerance are some of the challenges in deploying distributed applications. A number of technologies have evolved to address these challenges. In this post, we will briefly touch upon Mesos’ approach to these problems.
What is Mesos?
Mesos is a cluster manager that acts like an operating system for distributed applications: it provisions resources such as CPU, memory and disk to applications across many machines, much as an operating system does on a single machine. Mesos uses a master-slave architecture in which the master distributes work among the slaves.
Mesos implements resource scheduling at two levels, one at the master and the other at the application. In a resource offer, the master collects the resources available on the slaves and offers them to the applications. Each application can accept or reject an offer.
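The two-level offer model can be illustrated with a small Python sketch. This is not the Mesos API: the `Offer`, `Master` and `Framework` classes and their methods are hypothetical names chosen for illustration, and the sketch assumes an accepted offer is consumed whole, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources available on one slave (hypothetical structure)."""
    slave: str
    cpus: float
    mem_mb: int


class Framework:
    """Second level: the application decides whether an offer fits its needs."""

    def __init__(self, need_cpus, need_mem_mb):
        self.need_cpus = need_cpus
        self.need_mem_mb = need_mem_mb

    def accept(self, offer):
        # Accept only offers large enough for one task; reject everything else.
        return offer.cpus >= self.need_cpus and offer.mem_mb >= self.need_mem_mb


class Master:
    """First level: the master decides which resources to offer."""

    def __init__(self, free_offers):
        self.free = list(free_offers)

    def offer_to(self, framework):
        # Offer every free resource bundle; rejected ones stay available
        # so they can be offered to other applications later.
        accepted, remaining = [], []
        for offer in self.free:
            (accepted if framework.accept(offer) else remaining).append(offer)
        self.free = remaining
        return [offer.slave for offer in accepted]


master = Master([Offer("slave-1", 1.0, 512), Offer("slave-2", 4.0, 8192)])
framework = Framework(need_cpus=2.0, need_mem_mb=2048)
launched_on = master.offer_to(framework)
```

Here the master never needs to know what the application's placement rules are; it only tracks which resources are still free after the application has answered.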
Mesos addresses the following challenges in deploying distributed applications:
- Resource scheduling – The master focuses on acquiring resources and offering them to the applications, and delegates the choice of the optimal resources to the applications themselves. This lets the master support a variety of applications and handle resource scheduling across a large number of them. Mesos also supports adding resources dynamically.
- Network management – Mesos abstracts network management concerns such as DNS and IP address management by integrating with third-party modules.
- Fault tolerance – There are two points of failure: the master and the applications. The master is highly available. Applications have to handle failures on their own: Mesos does not recover a failed application, but it detects failures between application components and informs the components that are still active so that they can recover.
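To make the fault-tolerance split concrete, here is a minimal Python sketch of an application-side scheduler that reacts to failure notifications by queuing tasks for relaunch. The state names `TASK_LOST` and `TASK_FAILED` mirror Mesos task states, but the `RecoveringScheduler` class, its `status_update` method and the retry policy are assumptions made for illustration, not the Mesos scheduler interface.

```python
from collections import deque

# States treated as failures in this sketch.
FAILED_STATES = {"TASK_LOST", "TASK_FAILED"}


class RecoveringScheduler:
    """Hypothetical application scheduler that retries failed tasks."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.pending = deque()   # tasks waiting to be relaunched
        self.retries = {}        # task id -> number of failures seen
        self.abandoned = []      # tasks that exceeded the retry limit

    def status_update(self, task_id, state):
        # The cluster manager only reports the failure; deciding what to do
        # about it is the application's responsibility.
        if state not in FAILED_STATES:
            return
        failures = self.retries.get(task_id, 0) + 1
        self.retries[task_id] = failures
        if failures <= self.max_retries:
            self.pending.append(task_id)
        else:
            self.abandoned.append(task_id)


scheduler = RecoveringScheduler(max_retries=1)
scheduler.status_update("task-1", "TASK_RUNNING")  # healthy, ignored
scheduler.status_update("task-1", "TASK_LOST")     # first failure, retried
scheduler.status_update("task-1", "TASK_FAILED")   # second failure, abandoned
```

A real application would relaunch the tasks in `pending` the next time it accepts a resource offer.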
When to use Mesos?
- To achieve optimal resource utilization.
- To handle large-scale deployments of up to 10K nodes.
- To support a variety of workloads, including real-time processing (Spark, Storm), batch processing (Hadoop), high-performance computing (MPI), data storage (Elasticsearch, Cassandra, HDFS) and continuous integration (Jenkins).
Why Mesos?
- Performs multi-resource (CPUs, memory, disk) scheduling.
- Provides a REST API to ease orchestration of the cluster.
- Has an extensible, future-proof architecture that can accommodate future workloads.
- Supports a variety of deployment platforms, including Docker.
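Multi-resource scheduling raises the question of what a "fair" share means when applications need different mixes of CPU and memory. One well-known answer is Dominant Resource Fairness (DRF), which the Mesos allocator is based on. The sketch below shows the core idea only; the function names and the cluster numbers are made up for illustration and do not come from the Mesos code base.

```python
def dominant_share(used, total):
    """Largest fraction of any single resource type held by one application."""
    return max(used.get(resource, 0) / total[resource] for resource in total)


def next_to_offer(usage, total):
    """Pick the application with the smallest dominant share.

    Under DRF, this application is the one that receives the next offer.
    """
    return min(usage, key=lambda name: dominant_share(usage[name], total))


# Hypothetical cluster totals and current usage per application.
total = {"cpus": 9, "mem_gb": 18}
usage = {
    "A": {"cpus": 2, "mem_gb": 8},  # dominant resource: memory (8/18)
    "B": {"cpus": 6, "mem_gb": 2},  # dominant resource: CPU (6/9)
}
chosen = next_to_offer(usage, total)
```

Application B already holds two thirds of the cluster's CPUs, so the next offer goes to A even though A uses more memory in absolute terms.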
The Mesos ecosystem is growing very fast. The architectural choices made by Mesos and its large community make it a compelling candidate for large-scale deployments. Comparable tools include Kubernetes, YARN and Swarm, each with its own pros and cons. We covered Kubernetes in an earlier post, which might interest you.