Let’s face it, we need to rethink security. Of 143 enterprises who recently offered me views on the topic, 118 said they spent too much on security, 94 said that they didn’t believe their critical assets were secured satisfactorily, and 127 said that their security solutions left holes in their protective shields. I blogged earlier about the need to think about security in a different way, and for this blog I want to look at the thing enterprises say would make a fundamental difference. What is that? The answer is activity.
Some security issues aim to steal information. Some aim to compromise information or destroy data outright. Some are fairly narrow in scope, some very broad. Some attempt to exploit holes in protection and some to create them. There are a lot of variables, which makes it difficult to identify bad actors and risky situations. But enterprises say that, in the end, one thing that seems to be a universal property of a security problem is the generation of an unusual activity pattern.
One enterprise told me about a recent ransomware attack they suffered. It started with a small group of servers, and they caught it because they saw an unusual pattern of file access at an unusual time of day. The operations team got concerned, cut off the impacted servers, and took all their critical databases offline until they figured out what was happening. Another saw that a particular server was initiating access to applications it didn’t really have any relationship with, and also cut it off. In both cases, the discovery of a security problem was accidental; nobody was explicitly monitoring for the conditions. One company was just checking on a reported application slowdown and the other was tracing a network connection.
I don’t believe that every potential security problem can be uncovered by watching for unusual activity, not because attacks wouldn’t be likely to create it, but because narrow-scope attacks wouldn’t generate the kind of major shifts that a reasonably effective and practical level of activity monitoring could detect. And any sort of activity monitoring would be a challenge.
One obvious strategy might be to look at database accesses. A sudden increase in the rate of database access could be a signal that malware of some sort is operating, and a change in access patterns might signal a hacking attempt. Of the 143 enterprises who offered comments on security enhancements, only 29 said they reviewed database activity “regularly”, and none said they tracked it dynamically in real time. Even the 29 who reviewed database activity regularly did so to monitor usage for performance management, not to enhance security.
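To make the point concrete, here’s a minimal sketch of what dynamic tracking might look like: a rolling baseline of per-interval query counts, with anything far outside the baseline flagged. The window size, threshold, and data source here are all hypothetical; a real deployment would feed this from database audit logs.

```python
# A minimal sketch of dynamic database-access monitoring: keep a rolling
# baseline of per-interval query counts and flag intervals that deviate
# sharply from it. Window, threshold, and the sample data are hypothetical.
from collections import deque
from statistics import mean, stdev

WINDOW = 60          # number of past intervals kept as the baseline
THRESHOLD_SIGMA = 4  # how far from the baseline counts as "unusual"

baseline = deque(maxlen=WINDOW)

def check_interval(query_count: int) -> bool:
    """Return True if this interval's query count looks anomalous."""
    anomalous = False
    if len(baseline) >= 10:  # wait until the baseline is meaningful
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and (query_count - mu) / sigma > THRESHOLD_SIGMA:
            anomalous = True
    baseline.append(query_count)
    return anomalous

# Example: a steady load of roughly 100 queries per interval, then a surge.
for count in [98, 103, 101, 97, 105, 99, 102, 100, 104, 101, 99, 2400]:
    if check_interval(count):
        print(f"Unusual database activity: {count} queries this interval")
```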
One problem enterprises cited in this area was the difficulty of associating database access with a specific application or user. It’s a problem of tracing back from the database toward the user, through what might be a series of applications and components, to the point where it’s possible to identify what’s actually going on and why. Of my 29 enterprises who reviewed database activity, 22 said they had tried to trace access to applications for performance management and cost allocation reasons, and 9 said they’d been successful.
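One way to make that trace-back easier, at least in home-grown applications, is to tag every query with the application and request that issued it, so database logs can be tied back to a workflow. Here’s a simple sketch of that idea; the application name and request ID are hypothetical, and sqlite3 simply stands in for whatever database is actually in use.

```python
# A sketch of making database access traceable back to the application:
# prefix every query with a comment carrying the calling application and
# request ID, so a query log (or a proxy in front of the database) can
# attribute each access to a specific workflow.
import sqlite3

def tagged_query(conn, sql, params, *, app, request_id):
    """Run a query prefixed with a traceability comment."""
    tag = f"/* app={app} request={request_id} */ "
    return conn.execute(tag + sql, params)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, balance REAL)")
conn.execute("INSERT INTO accounts VALUES (1, 250.0)")

# The comment travels with the statement, so logged SQL can be tied back
# to the (hypothetical) 'billing-service' application and its request.
rows = tagged_query(
    conn,
    "SELECT balance FROM accounts WHERE id = ?",
    (1,),
    app="billing-service",
    request_id="req-7f3a",
).fetchall()
print(rows)
```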
Componentization in general, and the cloud in particular, have been complicating factors in activity monitoring because they tend to increase the attack surface. You can break into an application, in theory, at any component boundary, but when you do you may actually be breaking into a component shared by multiple applications, which makes it difficult to know what’s actually going on. Yes, a large volume of accesses linked to a breach could be detected, but could they be tied back to the source?
I’ve noted in the past that application performance management, a goal almost as universal as security, is best accomplished through the tracing of workflows. That’s true of activity monitoring for security reasons too. Depending on how work is steered through a maze of components (service mesh is an example) it may be possible to determine workflows overall by looking at steering policies, but only if steering is done by a separate mechanism rather than defined by each step along the path. Good workflow-centric APM relies on software probes within components to time-stamp things, and a combination of probes and credentialing of requests could make it difficult to introduce false messages into even a complex workflow.
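For illustration, here’s a rough sketch of what such a probe could look like in practice: a wrapper that time-stamps each component’s work and carries a per-request token (a stand-in for request credentialing) so every record can be tied to one workflow. The component names and the token scheme are hypothetical.

```python
# A rough sketch of the probe idea: a decorator time-stamps entry and exit
# of each component, and a per-request token travels with the work so every
# trace record can be tied back to a single workflow.
import contextvars
import functools
import time
import uuid

request_token = contextvars.ContextVar("request_token", default=None)
trace_log = []  # in a real system this would go to a trace collector

def probe(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return func(*args, **kwargs)
        finally:
            trace_log.append({
                "component": func.__name__,
                "token": request_token.get(),
                "start": start,
                "end": time.time(),
            })
    return wrapper

@probe
def validate_order(order):
    return order["qty"] > 0

@probe
def update_inventory(order):
    return {"reserved": order["qty"]}

def handle_request(order):
    request_token.set(str(uuid.uuid4()))  # credential the whole workflow
    if validate_order(order):
        update_inventory(order)

handle_request({"qty": 3})
for record in trace_log:
    print(record)
```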
Of the 143 enterprises who had comments on this issue, 83 said they used probe technology in software, but only in software they’d developed themselves. None of these enterprises said they insisted on probes in third-party software, and in fact 34 of them said they would be concerned about the security of the probes. And none of the probe users applied probes to security monitoring.
Activity monitoring, in the end, is about workflows, because both legitimate transactions and threats look like work, and the movement of that work, valid or not, is what makes up workflows. Getting a handle on workflows would seem to be the baseline requirement for effective security, but as those who have followed my blogs know, I doubt that enterprises overall are as workflow-sensitive as they should be. They don’t consider workflow dynamics in performance management, especially in latency management. They don’t consider them in network loading either. Why should they think about them for security?
So, is there another way? The only other option I can see, one that might at least approach the ability of probes to monitor and analyze activity, is to harness the knowledge of the network. In order for work to flow, it has to thread through addressable resources. In all but a few cases, the network associates logical names (URLs/URIs) with those addresses, and those logical names can in turn be associated with the elements of applications and the steps in processes.
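As a simple illustration, the sketch below resolves a set of internal service names through the resolver and uses the result to describe network flows in application terms rather than raw addresses. The service names are hypothetical; in practice they would come from service discovery or middleware configuration.

```python
# A sketch of harnessing the network's own knowledge: resolve the logical
# names applications use into addresses, then use that map to label
# observed traffic. The names below are hypothetical placeholders.
import socket

SERVICE_NAMES = {
    "orders.internal.example.com": "order-entry front end",
    "inventory.internal.example.com": "inventory service",
    "payments.internal.example.com": "payment gateway",
}

def build_address_map(names):
    """Map resolved IP addresses back to the application element they serve."""
    addr_map = {}
    for name, role in names.items():
        try:
            for info in socket.getaddrinfo(name, None):
                addr_map[info[4][0]] = (name, role)
        except socket.gaierror:
            pass  # name not resolvable from here; skip it
    return addr_map

addr_map = build_address_map(SERVICE_NAMES)

def describe_flow(src_ip, dst_ip):
    """Describe a flow record in application terms rather than raw addresses."""
    src = addr_map.get(src_ip, (src_ip, "unknown"))
    dst = addr_map.get(dst_ip, (dst_ip, "unknown"))
    return f"{src[1]} -> {dst[1]}"
```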
Right now, the knowledge needed to make this work is divided between pieces of middleware and components of things like DNS. Could all that be collected, and perhaps more important, could it be correlated and made actionable? That’s where AI might really make itself useful. As I’ve noted in other blogs, we have multiple management systems and multiple operations process sets, and combining them to do anything can be both a technical and a training/organizational challenge. AI could cross those boundaries to correlate data, and through that gain insight into user and application workflow patterns that look suspicious. Even without the use of probes, AI could move security practices forward significantly, and if probes were available it could do even better.
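As one illustration of the kind of correlation I mean, the sketch below combines per-workflow features that might come from different sources (request volume, distinct destinations, off-hours activity) and lets an unsupervised model flag the outliers. The data and features are invented, and scikit-learn’s IsolationForest simply stands in for whatever AI tooling would actually be used.

```python
# A small illustration of cross-source correlation: per-workflow feature
# vectors assembled from different monitoring systems, with an unsupervised
# model flagging the outliers. All data here is hypothetical.
from sklearn.ensemble import IsolationForest

# Each row: [requests per minute, distinct destinations, fraction of
# activity outside business hours] for one observed workflow.
workflows = [
    [120, 3, 0.05],
    [115, 3, 0.04],
    [130, 4, 0.06],
    [125, 3, 0.05],
    [118, 3, 0.03],
    [122, 4, 0.05],
    [900, 40, 0.95],   # the kind of shift activity monitoring should catch
]

model = IsolationForest(contamination=0.15, random_state=0).fit(workflows)
labels = model.predict(workflows)  # -1 marks an outlier

for row, label in zip(workflows, labels):
    if label == -1:
        print("Suspicious workflow pattern:", row)
```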
Security is perhaps the broadest and most valuable of all the AI missions that operations teams could promote, and some vendors are starting to develop AI tools already. Generic AI, such as the ChatGPT model, could be pressed into service here too, but I think there are still “hallucination” issues to be dealt with there. Yet among the users who stated an interest in AI for security (of the 118 enterprises who offered comments here, 101 thought AI could be helpful), well over three-quarters tended to look for a generative AI/ChatGPT-type solution. At a minimum, this suggests that realistic purpose-built AI tools for security aren’t publicized enough to influence buyers as they should.
Enterprises spend a lot on security, and expect to continue to do so, and while they also tend to believe they’re getting some results for their money, they acknowledge that their overall sense of risk isn’t diminishing much. It’s pretty clear that vendors know that, and are trying to present a better overall model, but we’re still groping for just what the best approach is. Perhaps 2024 will be the year we make real progress.