David Linthicum
Contributor

Understand the trade-offs with reactive and proactive cloudops

analysis
May 20, 2022 | 4 mins
Cloud Computing | Cloud Management | Managed Cloud Services

Before you get excited about proactive cloudops tools, know their limitations, especially if you're using a cloud service provider.


It’s a no-brainer. Proactive ops systems can figure out issues before they become disruptive and can make corrections without human intervention.

For instance, an ops observability tool, such as an AIops tool, sees that a storage system is producing intermittent I/O errors, which means that the storage system is likely to suffer a major failure sometime soon. Data is automatically transferred to another storage system using predefined self-healing processes, and the system is shut down and marked for maintenance. No downtime occurs.
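To make that concrete, here is a minimal Python sketch of the kind of rule such a pipeline might evaluate. The metrics client, the threshold, and the fleet helpers (pick_healthy_volume, migrate_data, mark_for_maintenance) are all hypothetical placeholders, not any particular AIops product's API.

# Hypothetical proactive remediation rule: watch a storage volume's
# I/O error rate and act before it becomes a hard failure.

IO_ERROR_THRESHOLD = 0.02  # assumed error-rate SLO; tune per system

def looks_likely_to_fail(volume, metrics):
    """Flag a volume whose intermittent I/O errors are trending upward."""
    recent = metrics.error_rate(volume, window_minutes=15)     # hypothetical metrics client
    baseline = metrics.error_rate(volume, window_minutes=1440)
    # The early-warning signal is a trend, not a hard failure.
    return recent > IO_ERROR_THRESHOLD and recent > 3 * baseline

def remediate(volume, fleet):
    """Predefined self-healing: move the data, then drain the volume."""
    target = fleet.pick_healthy_volume(exclude=volume)         # hypothetical placement call
    fleet.migrate_data(source=volume, destination=target)
    fleet.mark_for_maintenance(volume)                         # no user-visible downtime

The important part is the shape of the logic: the trigger is a trend, not an outage.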

These types of proactive processes and automations occur thousands of times an hour, and the only way you’ll know that they are working is a lack of outages caused by failures in cloud services, applications, networks, or databases. We know all. We see all. We track data over time. We fix issues before they become outages that harm the business.

It’s great to have this technology to get our downtime to near zero. However, like anything, it comes with trade-offs that you need to consider.

Traditional reactive ops technology is just that: It reacts to failure and sets off a chain of events, including messaging humans, to correct the issue. In a failure event, when something stops working, we quickly determine the root cause and fix it, either with an automated process or by dispatching a human.

The downside of reactive ops is the downtime. We typically don’t know there’s an issue until we have a complete failure—that’s just part of the reactive process. Typically, we are not monitoring the details around the resource or service, such as I/O for storage. We focus on just the binary: Is it working or not?
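In code, a reactive monitor often reduces to a binary probe like the Python sketch below; the health endpoint and the paging stub are assumed stand-ins for whatever your environment actually uses.

import urllib.request

def page_on_call(message: str) -> None:
    print(f"ALERT: {message}")  # stand-in for a real paging integration

def is_up(url: str, timeout: float = 5.0) -> bool:
    """Binary reactive check: the service either answers or it doesn't."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

# Reactive pattern: nothing happens until the probe fails outright.
if not is_up("https://storage.example.internal/health"):  # hypothetical endpoint
    page_on_call("storage service is down")

Note what is missing: there is no I/O error rate, no queue depth, no trend. By the time this check fires, the outage has already happened.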

I’m not a fan of cloud-based system downtime, so reactive ops seems like something to avoid in favor of proactive ops. However, in many of the cases I see, even if you’ve purchased a proactive ops tool, its observability systems may not be able to see the details needed for proactive automation.

Major hyperscaler cloud services (storage, compute, database, artificial intelligence, etc.) can be monitored in a fine-grained way, such as ongoing I/O utilization, CPU saturation, and the like. Much of the other technology you use on cloud-based platforms may offer only primitive APIs into its internal operations and can only tell you whether it is working or not. As you may have guessed, proactive ops tools, no matter how good, won’t do much for these cloud resources and services.
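As an illustration of the hyperscaler side, the sketch below pulls per-volume read-op counts from Amazon CloudWatch with boto3; the volume ID is a placeholder, and the example assumes you have AWS credentials configured.

import boto3
from datetime import datetime, timedelta, timezone

# CloudWatch exposes fine-grained EBS metrics that a proactive tool can trend over time.
cloudwatch = boto3.client("cloudwatch")

end = datetime.now(timezone.utc)
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EBS",
    MetricName="VolumeReadOps",               # per-volume I/O activity
    Dimensions=[{"Name": "VolumeId", "Value": "vol-0123456789abcdef0"}],  # placeholder ID
    StartTime=end - timedelta(hours=1),
    EndTime=end,
    Period=300,                               # five-minute buckets
    Statistics=["Sum"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])

A proactive tool trends series like this over time to flag saturation before it becomes a failure; software that exposes only an up/down health check gives it nothing comparable to work with.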

I’m finding that more of these types of systems run on public clouds than you might think. We’re spending big bucks on proactive ops with no ability to monitor the internal signals that would indicate a resource is likely to fail.

Moreover, a public cloud resource, such as a major storage or compute system, is already monitored and operated by the provider. You’re not in control of the resources provided to you in a multitenant architecture, and the cloud providers do a very good job of running proactive operations on your behalf. They see issues with hardware and software resources long before you will and are in a much better position to fix things before you even know there is a problem. Even with a shared responsibility model for cloud-based resources, the providers take it upon themselves to keep their services running on an ongoing basis.

Proactive ops are the way to go—don’t get me wrong. The trouble is that in many instances, enterprises are making huge investments in proactive cloudops with little ability to leverage it. Just saying.

David S. Linthicum is an internationally recognized industry expert and thought leader. Dave has authored 13 books on computing, the latest of which is An Insider’s Guide to Cloud Computing. Dave’s industry experience includes tenures as CTO and CEO of several successful software companies, and upper-level management positions in Fortune 100 companies. He keynotes leading technology conferences on cloud computing, SOA, enterprise application integration, and enterprise architecture. Dave writes the Cloud Computing blog for InfoWorld. His views are his own.
