When I talk to others about getting started in the cloud, I often hear the misconception that everything you need is endlessly and magically available and that you can never run out of resources. You can indeed expand almost endlessly, but cloud infrastructure is often not set up this way, and for good reason. Today we will tackle monitoring and look at some of the tooling available to us to make sure we have eyes on our resources!
What is monitoring?
Monitoring involves tooling that tracks the metrics of the hardware and software of an IT system. By collecting and displaying these metrics, often coupled with alerting, those running the environments can keep track of which resources are being consumed by which processes.
For instance, in the cloud, you might be running an Azure Red Hat OpenShift Kubernetes cluster built on several underlying nodes and storage with varying CPU and memory. After setting up your environment and running it for a while, you might find that some of your containers constantly crash or don’t perform as expected. One common reason for this is resource constraints. Monitoring shows when storage, CPU, networking, or memory is running short, so that timely action can be taken to adjust these resources when necessary.
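To make the alerting idea concrete, here is a minimal sketch in Python. The `Sample` type, the `find_alerts` function, the sample numbers, and the 80% threshold are all made up for illustration. A real setup would pull these values from your monitoring tool rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One hypothetical metric reading for a resource."""
    resource: str
    used: float
    capacity: float


def find_alerts(samples, threshold=0.8):
    """Return (resource, utilization) pairs at or above the threshold."""
    alerts = []
    for s in samples:
        utilization = s.used / s.capacity
        if utilization >= threshold:
            alerts.append((s.resource, utilization))
    return alerts


# Illustrative readings only: CPU cores, memory in GiB, storage in GiB.
samples = [
    Sample("cpu", 3.6, 4.0),
    Sample("memory", 6.0, 16.0),
    Sample("storage", 95.0, 100.0),
]

for resource, utilization in find_alerts(samples):
    print(f"ALERT: {resource} is at {utilization:.0%} of capacity")
```

With these readings, CPU and storage cross the threshold while memory does not, which is the kind of signal that tells you where to look first.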
Doesn’t the cloud allow for endless use?
One major advantage of the cloud is the ability to quickly scale your resources. There are two forms of scaling: horizontal and vertical. Vertical scaling involves increasing the power and size of the machine being used; horizontal scaling involves adding similar machines to share the workload.
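As a rough sketch of how a horizontal scaling decision can be computed, the function below follows the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas × current metric / target metric). The function name and the numbers are illustrative assumptions, and a real autoscaler adds tolerances, limits, and cooldowns that are left out here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Proportional scaling rule: scale replica count with observed load."""
    ratio = current_utilization / target_utilization
    return max(1, math.ceil(current_replicas * ratio))


# 4 machines running at 90% CPU, aiming for 60% each.
print(desired_replicas(4, 90, 60))
# 2 machines running at 30% CPU, aiming for 60% each.
print(desired_replicas(2, 30, 60))
```

Vertical scaling has no equivalent formula here: it means swapping the machine for a larger one, which is usually a matter of picking a different size from the provider's catalog.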
Scaling in this manner needs to be explicitly configured, but it is often not the solution! If you have a process that happens to be saving log information 12x over instead of 2x, it is putting a strain on your storage. The solution here is not to add more storage! By implementing monitoring correctly, we can see which process specifically is demanding these resources and investigate accordingly.
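The idea of finding the process that is demanding the resources can be sketched as follows. The process names and byte counts are invented for illustration, and `top_writers` is a hypothetical helper. In practice the write events would come from your monitoring tool's per-process or per-container metrics.

```python
from collections import Counter


def top_writers(write_events, n=3):
    """Sum bytes written per process and return the n largest writers."""
    totals = Counter()
    for process, bytes_written in write_events:
        totals[process] += bytes_written
    return totals.most_common(n)


# Hypothetical (process, bytes written) events collected over some window.
events = [
    ("api", 200),
    ("log-shipper", 1200),
    ("api", 150),
    ("db", 500),
    ("log-shipper", 1300),
]

for process, total in top_writers(events, n=2):
    print(f"{process}: {total} bytes written")
```

Here the made-up `log-shipper` process stands out immediately, which points to fixing the logging rather than buying more storage.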
Then there is also a budget constraint. The business, of course, wants to keep costs low. Increasing storage costs more money, so you want to first understand whether this is absolutely necessary. Monitoring can help point in the right direction to make this analysis.
What tools are available?
The tooling for monitoring of course depends on the environment you are working in. In Azure, we have Azure Monitor for monitoring cloud resources. But we also have open-source tools such as Prometheus and Grafana, which are widely used to monitor Kubernetes.
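As a small taste of what working with Prometheus looks like, the sketch below builds a URL for its HTTP query endpoint, `/api/v1/query`. The server address `http://localhost:9090` is an assumption (9090 is the default port, but your setup may differ), and `build_query_url` is a hypothetical helper. `up` is a metric Prometheus records for every target it scrapes.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


# Assumed local Prometheus server; replace with your own address.
url = build_query_url("http://localhost:9090", "up")
print(url)
```

Opening that URL against a running Prometheus server returns JSON with the current value of the metric for each target. Grafana issues the same kind of query behind its dashboards.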
It’s worth it!
What is important is that you incorporate monitoring from the get-go; it should not be an afterthought for your system. By investing in monitoring upfront to bring transparency into the system, in the long run you will save money by only adjusting your system when necessary and prevent avoidable system outages caused by inconspicuous processes.