KubeAcademy by VMware
Metric Collection with Prometheus

In this lesson we will learn about Prometheus and how it collects metrics about your Kubernetes cluster and applications.

Hart Hoover

Senior Field Engineer at Kong

Hart Hoover is a Senior Field Engineer at Kong. His expertise lies in technical training, consulting, community building, Linux-based operating systems, computing automation, and cloud application architecture.

Hello. My name is Hart Hoover, Manager of Kubernetes Education at VMware. In this course, I'll be going through how Prometheus collects and stores metrics in a Kubernetes cluster. Prometheus is a CNCF-graduated open-source project written in Go. It records real-time metrics in a time series database, collects them over HTTP using a pull model, and includes a query language and alerting capabilities.

Typically, when someone talks about Prometheus, they are including several smaller services working together. Let's start with the metrics that are available for Prometheus to collect. There are four metric types: counters, gauges, histograms, and summaries. Counters, well, count: a counter is a metric that only counts up from zero. Typically, this is used for requests, tasks completed, or errors. Gauges are similar to counters, except they are for metrics that go up and down instead of always increasing. A gauge could be used for temperature or memory use.
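
As a minimal sketch of how an application might define these first two metric types (assuming the Go client library, github.com/prometheus/client_golang; the metric names and port here are hypothetical), a counter and a gauge could be instrumented and exposed like this:

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// A counter only ever counts up from zero: good for requests served or errors.
var requestsTotal = promauto.NewCounter(prometheus.CounterOpts{
	Name: "myapp_requests_total", // hypothetical metric name
	Help: "Total number of requests handled.",
})

// A gauge can go up and down: good for temperature or current memory use.
var memoryUsage = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "myapp_memory_usage_bytes", // hypothetical metric name
	Help: "Current memory usage in bytes.",
})

func main() {
	requestsTotal.Inc()        // one more request handled
	memoryUsage.Set(123456789) // latest observed memory use

	// Expose everything in the default registry on /metrics, the endpoint
	// Prometheus pulls from over HTTP.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Here promauto registers both metrics in the default registry, and the /metrics endpoint is what Prometheus scrapes using its HTTP pull model.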

A histogram samples observations and counts them in buckets. Histograms can measure things like an application performance index score and also provide a sum of all observed values. A histogram consists of three elements: a count of the number of samples, a sum of the observed values, and labeled buckets that allow users to query the collected data. Finally, summaries are similar to histograms, but they also calculate quantiles over a sliding time window. This allows you to calculate averages or percentiles of observed values.
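
Continuing the same sketch with the Go client library (again, the metric names, buckets, and quantile objectives are illustrative), a histogram and a summary could be defined like this:

```go
package main

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// A histogram counts each observation into labeled buckets and also keeps a
// running count and sum of everything it has observed.
var requestDuration = promauto.NewHistogram(prometheus.HistogramOpts{
	Name:    "myapp_request_duration_seconds", // hypothetical metric name
	Help:    "Request latency distribution.",
	Buckets: prometheus.DefBuckets, // default boundaries; tune for your workload
})

// A summary estimates quantiles over a sliding time window, so you can read
// percentiles of observed values directly.
var requestSize = promauto.NewSummary(prometheus.SummaryOpts{
	Name:       "myapp_request_size_bytes", // hypothetical metric name
	Help:       "Request sizes with 50th/90th/99th percentile estimates.",
	Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
	MaxAge:     5 * time.Minute, // length of the sliding window
})

func main() {
	requestDuration.Observe(0.042) // one request that took 42 ms
	requestSize.Observe(512)       // one request body of 512 bytes
}
```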

Now that we know the metric types, we need to store them for querying, and Prometheus uses a time series database to store its metrics. Each time series in Prometheus is identified by a metric name and a set of key-value labels, and each data sample pairs a Unix timestamp with a value for the metric. When Prometheus runs in a Kubernetes cluster, this state is typically backed by persistent storage volumes, so if the Prometheus server is restarted, data is not lost.
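
As an illustration only (the metric name, labels, and values here are made up), a stored series and one of its samples can be pictured like this:

```
# a series is identified by its metric name plus key-value labels
http_requests_total{job="api-server", method="GET", code="200"}

# each sample stored for that series is a (Unix timestamp, value) pair, e.g.
#   t=1712345678000 (milliseconds)   v=1027
```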

So we have a set of metrics, and we know how they are stored in Prometheus, but how do we get data out of Prometheus for querying and context? Prometheus uses its own query language, called PromQL, to let users select and aggregate time series data. The results can be displayed as tabular data or graphed on a dashboard. Here, I've noted a few Kubernetes-specific queries: in the first example, we query the rate at which pods are restarting over a two-minute window; in the second, we can see whether the count of etcd members is enough for a quorum.
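
The exact queries from the slide aren't reproduced here, but illustrative PromQL along the same lines might look like this (the metric and job names assume kube-state-metrics is installed and etcd is scraped under a job named etcd):

```promql
# Per-container restart rate over the last two minutes
rate(kube_pod_container_status_restarts_total[2m])

# Do we still have an etcd quorum, i.e. is a majority of members healthy?
count(up{job="etcd"} == 1) > (count(up{job="etcd"}) / 2)
```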

Here, we can see the components of a typical Prometheus architecture stack as deployed in Kubernetes. Prometheus queries the Kubernetes API to auto-discover its targets, the places it should be getting metrics from. Kubernetes nodes also export metrics, using special exporters, in a way that Prometheus understands. Prometheus processes these metrics and stores them in a time series database backed by persistent volume storage, where dashboarding tools can build graphs from the data for visualization. Finally, an alerting service called Alertmanager can process monitoring alerts based on the data and send them to multiple destinations, including webhooks, email, or chat systems like Slack. Thank you. In the next video, we'll look at a Prometheus stack deployed in a Kubernetes cluster.
