There are many challenges involved in visualizing application traffic patterns. First, we need to visualize the sequence of components along the various flows of traffic. Then we need to filter the traffic by dimensions such as protocol, client ID, etc., and finally view metrics such as volume and latency along each segment of the flow.
In this blog, we will show how Sankey charts address these challenges. Several popular tools support Sankey charts, and Elastic recently added support for them through custom Vega visualizations. We can now build Sankey charts in Kibana using Vega, a declarative visualization grammar. We were really impressed with the power of these tools and how easily we could build rich visualizations from data indexed in Elasticsearch. So let's dig into how Vega visualizations in Kibana help address these challenges.
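A Kibana Vega visualization starts with a spec that pulls its data directly from Elasticsearch. The sketch below shows the general shape of such a spec; the index name `traffic-metrics` and the field `source.keyword` are assumptions for illustration, not the actual names used in our deployment:

```json
{
  "$schema": "https://vega.github.io/schema/vega/v3.json",
  "data": [
    {
      "name": "traffic",
      "url": {
        "index": "traffic-metrics",
        "body": {
          "size": 0,
          "aggs": {
            "flows": {
              "terms": { "field": "source.keyword" }
            }
          }
        }
      },
      "format": { "property": "aggregations.flows.buckets" }
    }
  ]
}
```

The `url` object here is Kibana-specific: instead of fetching from an HTTP endpoint, the Vega plugin treats it as an Elasticsearch query and feeds the response into the named data set.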
Cognitree was helping a startup build a traffic management solution for application servers deployed in the cloud. The solution could proxy application traffic and apply a set of policies on the request and response flows. Policies such as authentication, transformation, and caching were provided out of the box.
Today, as we recall the challenges involved in building the solution, one in particular stands out: presenting a breakdown of traffic across dimensions such as time, protocol, volume, and latency in a simple yet comprehensive and interactive way. The aim was to support analyses such as pinpointing the root causes of congestion hotspots and visualizing attacks entering the network.
Let’s take a simplified version of the above use case to showcase the visualization solution. Assume that we are proxying two endpoints of an application and applying policies on the traffic:
- /users/self: returns information about the authenticated user after applying authentication policy and auditing the requests.
- /users/self/media/recent: returns the most recent media published by the authenticated user. Authentication is enforced, and if the media item is cached, it is served from the cache to avoid a round trip to the backend application server.
The flow for both the endpoints is as shown in the diagram.
The basic insight to provide is traffic volume or latency broken down by endpoints and policies. In a Sankey chart, the components along the path of the traffic are shown as nodes, and the links between the nodes represent the flow of traffic between the components. In our simplified use case, the endpoints and policies form the nodes, and the flow of traffic for each endpoint is clearly depicted.
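The data behind such a chart reduces to a set of source-target-value records, one per link. A minimal sketch for our two endpoints might look like the following, where the counts are purely illustrative:

```json
[
  { "source": "/users/self", "target": "authentication", "value": 1200 },
  { "source": "authentication", "target": "audit", "value": 1200 },
  { "source": "/users/self/media/recent", "target": "authentication", "value": 800 },
  { "source": "authentication", "target": "cache", "value": 650 }
]
```

The Vega spec derives the node list from the distinct `source` and `target` values and sizes each link in proportion to `value`.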
Metrics can be viewed on any component or link through mouse-over events. Hovering over a node shows the total request count received at that endpoint or policy, while hovering over a link shows the latency introduced by the components along the path of the traffic.
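In Vega, this hover behavior is expressed with the `tooltip` encoding channel on a mark. The fragment below sketches the idea; the data set name `nodes` and the fields `name` and `requests` are assumed for illustration:

```json
"marks": [
  {
    "type": "rect",
    "from": { "data": "nodes" },
    "encode": {
      "update": {
        "tooltip": {
          "signal": "datum.name + ': ' + datum.requests + ' requests'"
        }
      }
    }
  }
]
```

The `signal` expression is evaluated against the datum under the cursor, so the same pattern works for links by referencing the link's latency field instead.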
The real power of the visualization comes next. We can drill down into the traffic of a particular endpoint by clicking on it, which is very useful when troubleshooting specific flows. Clicking the “Show All” button on the graph zooms back out to the full view.
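This click-to-drill-down interaction can be wired up with a Vega signal that captures the clicked node and resets when a "show all" mark is clicked. A sketch under assumed mark names (`@showAll` refers to a mark named `showAll`; our actual spec may differ):

```json
"signals": [
  {
    "name": "selectedEndpoint",
    "value": null,
    "on": [
      { "events": "rect:click", "update": "datum.name" },
      { "events": "@showAll:click", "update": "null" }
    ]
  }
]
```

A `filter` transform on the link data can then reference `selectedEndpoint` to show only the flows passing through the selected node.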
These rich features are amplified when the graphs are integrated with the filtering abilities of Kibana. We can design a dashboard of such visualizations in Kibana and apply filters on dimensions such as time range, endpoints, and policies. Overall, Kibana and Vega are a very good combination for building aesthetic dashboards.
To build these visualizations, we used Kafka for ingesting the metrics and Kafka Connect to index the data into Elasticsearch. We installed the latest version of Kibana (6.3), which ships with the Vega plugin out of the box, to render the Sankey charts from the data in Elasticsearch. A pictorial representation of the stack is shown below:
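The Kafka-to-Elasticsearch leg of the pipeline can be set up with the Elasticsearch sink connector for Kafka Connect. A minimal configuration sketch follows; the connector name, topic, and connection URL are placeholders for illustration:

```json
{
  "name": "traffic-metrics-sink",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "topics": "traffic-metrics",
    "connection.url": "http://localhost:9200",
    "key.ignore": "true",
    "schema.ignore": "true"
  }
}
```

With `schema.ignore` enabled, the connector indexes the JSON metric events as-is and lets Elasticsearch infer the mapping, which keeps the ingestion path simple for this use case.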
Vega visualizations, with their wide variety of designs including the Sankey charts described above, boost the already powerful real-time visualization abilities of Kibana. We will continue to explore this combination of tools for our use cases and share our experiences in upcoming blogs.