Jun 22

Kronos – a cron replacement to schedule complex data workflows

The growing need for insights from vast data sources has given rise to data-driven business intelligence products that build and execute complex data workflows.

A data workflow is a set of interdependent data-driven tasks. Simple solutions use a cron-based approach, which works well for workflows with few or no task dependencies. However, cron triggers jobs by time alone and has no notion of one task waiting for another, so it breaks down when there are complex dependencies between tasks.

At Cognitree, we build and execute complex data workflows for our customers to gather insights from their data. To schedule our data pipelines we built Kronos, a scheduling tool that adds more features on top of what cron offers.

What is Kronos

Kronos is a Java-based replacement for cron that builds, runs and monitors complex data pipelines, with flexible deployment options. It handles dependency resolution, workflow management and failures. Kronos is built on top of Quartz and uses a DAG (Directed Acyclic Graph) to manage tasks.
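
To make the DAG idea concrete, here is a minimal sketch of how a run order can be derived from task dependencies using a topological sort (Kahn's algorithm). It is an illustration only and not the Kronos API: the class name DagOrderSketch, the method executionOrder and the task names are all hypothetical, and the Kronos implementation may work differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the Kronos codebase.
class DagOrderSketch {

    // dependsOn maps each task name to the tasks that must finish before it starts.
    // Every task must appear as a key, even if its dependency list is empty.
    static List<String> executionOrder(Map<String, List<String>> dependsOn) {
        Map<String, Integer> pending = new HashMap<>();          // unfinished dependencies per task
        Map<String, List<String>> dependents = new HashMap<>();  // reverse edges: task -> tasks waiting on it
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
            for (String dep : e.getValue()) {
                if (!dependsOn.containsKey(dep)) {
                    throw new IllegalArgumentException("Unknown dependency: " + dep);
                }
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Tasks with no dependencies can run immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (String task : dependsOn.keySet()) {
            if (pending.get(task) == 0) {
                ready.add(task);
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String task = ready.poll();
            order.add(task);
            // Finishing a task may unblock the tasks that were waiting on it.
            for (String next : dependents.getOrDefault(task, List.of())) {
                if (pending.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }

        // Tasks left unscheduled mean the graph contains a cycle.
        if (order.size() != dependsOn.size()) {
            throw new IllegalStateException("Cycle detected; the graph is not a DAG");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> workflow = new LinkedHashMap<>();
        workflow.put("extract", List.of());
        workflow.put("clean", List.of("extract"));
        workflow.put("train", List.of("clean"));
        workflow.put("report", List.of("clean", "train"));
        System.out.println(executionOrder(workflow)); // prints [extract, clean, train, report]
    }
}
```

A plain crontab can only approximate this ordering by spacing jobs out in time and hoping each one finishes before the next starts. A dependency graph makes the ordering explicit and rejects circular definitions up front.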

Examples of data pipelines include batch jobs, chains of multiple tasks and machine learning jobs.

Kronos can be compared with Oozie and Azkaban. Those tools are targeted specifically at Hadoop workflows, while Kronos is flexible and can run any workflow, including big data pipelines.

The architecture is flexible and extensible, with each component of Kronos designed to be pluggable.

Why Kronos

  • Dependency Management: Define and manage dependencies among tasks.
  • Dynamic: Define/modify workflow and task dependencies at runtime.
  • Extensible: Define custom source of tasks, task handlers and the persistence store.
  • Policy Driven: Define custom policies to handle timeouts.
  • Fault Tolerant: Handle system/process faults.
  • Flexible deployment model: Embed as a library or deploy in standalone or distributed mode.

Today, we are proud to open source and share Kronos, our workflow management framework.

https://github.com/cognitree/kronos

Do give it a try by heading over to the getting started section, and help us improve Kronos by sending us feedback.

In upcoming posts, we will talk about the use cases we have solved using Kronos.
