This blog shows how we leveraged the Kubernetes cluster autoscaler with Amazon EKS service in order to build a cost effective solution for an on-demand deployment of microservices in a dynamically scaling environment. This blog along with a detailed explanation of the use case also provides a step-by-step guide to... read more →
Apr
13
Jul
25
There are many challenges involved in visualizing application traffic patterns. We first need to visualize the sequence of components along various flows of the traffic. Then we need to filter the traffic by different dimensions like protocol, client id, etc. and finally we need to view different metrics like volume,... read more →
Jul
06
At Cognitree, most of our customer projects involve analysing a large amount of data to gather insights. These projects involve design and development of ETL pipelines, real-time analysis, report generation and training models using statistical learning algorithms. The solutions are often built using open source tools and although the components... read more →
Jul
02
In this blog post, we’d like to outline how we defined policies for time series data management in Elasticsearch. Background As part of an IoT security solution built for a startup client, we have a typical real-time data processing pipeline: data from sensors is received into Kafka topics and consumed... read more →
Jun
27
Apart from the pre-built functions available for data analysis, Spark enables developers to write custom user defined functions that can be applied on a single row, a group of rows or a window of rows to analyse data. In this blog, we will explore in detail how we implemented a... read more →
Jun
22
Increasing need for insights from vast data sources has given rise to data-driven business intelligence products which build and execute complex data workflows. A data workflow is a set of inter-dependent data-driven tasks. Simple solutions use a cron based approach which works well for simple workflows with few or no... read more →
Jun
13
Version 2.0.0 of Flume sink plugin is now available. The release includes support for Flume 1.8.0 and Elasticsearch 6.2.4. Many thanks to Alexey Mikka for his contributions. Please use Cognitree's fork of Apache Flume 1.8.0 to make use of class loaders to load the plugin.
Jun
11
Last year, Cognitree enhanced Apache Flume 1.7.0 with a support for using classloaders to load plugins. We have now ported the support for users of Flume 1.8.0. The binary distribution can be downloaded here.
Jul
14
Cognitree has open sourced a Flume sink plugin for Elasticsearch 5.4. The sink plugin is compatible with Flume version 1.7. To avoid dealing with versioning hell for dependencies we highly recommend to use this plugin with Cognitree's fork of Flume. Motivation We were looking to analyze large amounts of streaming... read more →
Jul
11
Apache Flume is a tool for moving large amounts of data from various sources to a centralized data store. It provides an extensible framework to expand its applicability to various sources and stores. The plugin framework currently lacks the ability to provide an isolated class loading to the plugins. Today,... read more →