We’re hearing a lot about how analytics is going to change networking, how it’s essential in SDN, NFV, and the cloud, and maybe also critical in improving your social standing, good looks, and financial health, and might even make you run faster. How big can “big data” get? Apparently the sky’s the limit. As usual, most of this is just crap, and so we need to look at the question of applying “analytics” to network services to sort out any good ideas from the inevitable chaff of marketing claims.
First, “Knowledge is Power” not “data”. In order for “data” to become knowledge, you need the critical notion of context. When I know that traffic is heavy, it’s nice. If I know what road it’s on and when, it’s nicer. If I know it’s on the road I plan to use, it’s power. The point here is that collecting information on network behavior does little good if you can’t apply that data contextually in some way, and there are two approaches you can take to get to context. Then you have to turn that “knowledge” into power separately.
The first approach to data context in network analytics is one of baselining. If traffic is “heavy” it has to be heavy relative to something, and in baselining you attempt to define a normal state, as a value of variables or more likely a range of values. When data falls outside the range at any point, you take that as an indication of abnormal behavior, which means that you undertake some action for remediation (the “power” part). However, getting baselines for variables won’t create context by itself, because you can’t relate the conditions across measurement points with anything in particular. Baselining, or simply range-testing in analytics, isn’t particularly helpful, and most people who do anything useful with it really mean past-state analysis when they say “baselining”.
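To make the limitation concrete, here is a minimal sketch of range-testing in Python. The variable names, the `Baseline` class, and the numbers are all hypothetical, chosen only for illustration. Notice that the result is nothing more than a list of variables that are out of range; nothing in it says what the condition means or what it affects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    """Accepted 'normal' range for one measured variable (hypothetical)."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def out_of_range(sample: dict[str, float], baselines: dict[str, Baseline]) -> list[str]:
    """Return the names of variables whose value falls outside its baseline."""
    return sorted(
        name
        for name, value in sample.items()
        if name in baselines and not baselines[name].contains(value)
    )


# Illustrative numbers only; these are not measurements from a real network.
baselines = {
    "link_utilization_pct": Baseline(10.0, 70.0),
    "latency_ms": Baseline(1.0, 40.0),
}
sample = {"link_utilization_pct": 92.0, "latency_ms": 18.0}

flagged = out_of_range(sample, baselines)
print(flagged)
```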
What some analytics approaches advocate is to look at the state of the network holistically, with all variables considered individually relative to their accepted range of values. You then essentially pattern-match to decide what past state this present one corresponds to, and you accept the way that past state was interpreted as being the context of the current conditions. The NOC said this was Friday-and-I-have-no-date traffic last time it happened, so that’s what I’ll call it this time. Presumably, if we can determine the remedies taken last time and apply them (or at least suggest them) automatically, we can respond correctly. However, we have to assume 1) that our baseline map has accurately established context based on past-state analysis, and 2) that somebody has created rules for response that can be applied to the current situation. Most analytics processes don’t really address the second of these issues; it’s up to an operations specialist to somehow create general scripts or policies and make them runnable on demand.
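A rough sketch of that past-state matching, in Python, might look like the following. Everything here is an assumption made for illustration: the variable names, the ranges, the recorded history, and the choice of a simple mismatch count as the similarity measure. A real product could match states in many other ways.

```python
def classify(value, low, high):
    """Reduce one measurement to 'low', 'normal', or 'high' against its range."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


def signature(sample, ranges):
    """Network state as a tuple of per-variable classifications, in name order."""
    return tuple(classify(sample[name], *ranges[name]) for name in sorted(ranges))


def closest_past_state(current, history):
    """Return (label, mismatch_count) for the recorded state nearest to `current`."""
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b))

    past_signature, label = min(history, key=lambda entry: distance(current, entry[0]))
    return label, distance(current, past_signature)


# Hypothetical ranges and NOC-labelled history, for illustration only.
ranges = {
    "latency_ms": (1.0, 40.0),
    "link_utilization_pct": (10.0, 70.0),
    "packet_loss_pct": (0.0, 0.5),
}
history = [
    (("normal", "normal", "normal"), "business as usual"),
    (("high", "high", "normal"), "Friday-and-I-have-no-date traffic"),
    (("normal", "low", "high"), "upstream link failure"),
]

sample = {"latency_ms": 55.0, "link_utilization_pct": 88.0, "packet_loss_pct": 0.2}
current = signature(sample, ranges)
label, mismatches = closest_past_state(current, history)
print(current, label, mismatches)
```

The match only hands back the label somebody attached last time. The response rules that go with that label still have to come from somewhere else.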
The second approach to gaining context is to take a service-driven approach. A network asserts a service set that’s consumed by its users. Each service is dependent on resource behaviors that fulfill it, and if you understand what behaviors are associated with a given service you can correlate the state of these behaviors with the services. Now if “Behavior 4” has a variable out of range, you can presume that means that the services depending on Behavior 4 will be impacted.
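As a minimal sketch of that correlation, assume a hand-built map from each service to the behaviors it depends on. The service names and the map itself are hypothetical; only “Behavior 4” comes from the example above.

```python
def impacted_services(out_of_range_behaviors, service_behaviors):
    """Return the services that depend on at least one out-of-range behavior."""
    return sorted(
        service
        for service, behaviors in service_behaviors.items()
        if behaviors & out_of_range_behaviors
    )


# Hypothetical dependency map: service name -> set of behaviors that fulfill it.
service_behaviors = {
    "Service A": {"Behavior 1", "Behavior 4"},
    "Service B": {"Behavior 2"},
    "Service C": {"Behavior 3", "Behavior 4"},
}

impacted = impacted_services({"Behavior 4"}, service_behaviors)
print(impacted)
```

The hard part in practice is not the lookup but building and maintaining the dependency map.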
The critical requirement in a service-based analytics application is that there be a correlation between the data you collect and “services”. That means either that you have to measure only service-specific variables and ignore resource state, or that you understand the way resources relate to the services.
Resource relationships to services depend on whether the service network is a provisioned resource or a connection network. In a provisioned resource, you make a specific connectivity change to accommodate the service and so you presumably have some knowledge of how you did it. In cloud networking, for example, if you use Neutron to set up connections among hosted elements, you know what connections you set up. In connection networks, the members of the network are mutually addressable, and so you don’t have to do anything special to let them talk. Instead you have to know how a given connection would be carried, which means an analysis of the state of the forwarding rules for the nodes.
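For the connection-network case, the analysis of forwarding state can be sketched as a hop-by-hop walk through each node’s forwarding table. This is a deliberately simplified model with assumptions of my own: each table maps an exact destination to a next hop, where real forwarding uses prefix matching and other rule types, and the node names and `trace_path` helper are hypothetical.

```python
def trace_path(forwarding, src, dst, max_hops=16):
    """Follow next-hop entries from src to dst.

    `forwarding` maps node -> {destination: next_hop}. Returns the list of
    nodes traversed, or None if there is no route, a loop, or too many hops.
    """
    path = [src]
    node = src
    while node != dst:
        next_hop = forwarding.get(node, {}).get(dst)
        if next_hop is None or next_hop in path or len(path) > max_hops:
            return None
        path.append(next_hop)
        node = next_hop
    return path


# Hypothetical forwarding state for a four-node network.
forwarding = {
    "A": {"D": "B"},
    "B": {"D": "C"},
    "C": {"D": "D"},
}

path = trace_path(forwarding, "A", "D")
print(path)
```

Once the path is known, the resources on it are the ones whose state matters to that connection.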
One thing all of this demonstrates, if you think it through, is that there are really two networks here—a service network and a resource network. There are also then two ways of looking at network conditions—based on how they impact services directly (the service-network view) and based on the health of the resources, working on the theory that healthy resources would support services as designed.
You might think this means that the service context is useless, but the opposite is true. That’s because there are two levels of “service” in a network. One level defines the “functional behavior” of the service and is created and sustained by maintaining functional relationships among elements, and the other defines the “structural behavior” of the service, which is created by that connection network (or networks). Resources, or infrastructure, assert their own services. When we talk about a service view of something in relation to analytics we’re not talking about the retail functional relationships but rather the structural relationships, which is good because it’s the resources we have data from.
For new technologies like SDN and NFV I think this dualism is critical, both to allow analytics to be used effectively and to make operations of a network practical. Where a “service” is coerced from “multi-tenant” resources by network/service policies, you can’t spend a lot of time fiddling with individual connected users and applications because you set up the multi-tenant connection network to avoid that. In that case, you have to consider the whole connection network as a service.
The final point here, the “power” part of knowledge, is making something happen with what you now know. The service-based framing of network analytics means that you have something ecosystemic you can use as context—the connection-network service you defined. Logically, if that’s your framework then you have to be able to take service experience and pull resource conditions out of it to create your analysis, which means that analytics has to be related to some sort of service, and in a way that allows you to collect resource data for that service on demand. This is the thing you need to look for when somebody talks “network analytics” to you.
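One way to picture “collect resource data for that service on demand” is the sketch below. It assumes a map from each connection-network service to its resources and a collector function that polls one resource; both are hypothetical stand-ins, as are the names and the placeholder value the fake collector returns.

```python
def service_snapshot(service, service_resources, collect):
    """Poll only the resources behind one service and return their conditions."""
    return {
        resource: collect(resource)
        for resource in sorted(service_resources.get(service, ()))
    }


# Hypothetical mapping of connection-network services to their resources.
service_resources = {
    "tenant-net-1": {"node-a", "node-b"},
    "tenant-net-2": {"node-c"},
}

polled = []


def fake_collect(resource):
    """Stand-in for a real telemetry query; records which resources were polled."""
    polled.append(resource)
    return {"status": "placeholder"}  # not real data


snapshot = service_snapshot("tenant-net-1", service_resources, fake_collect)
print(polled)
```

Only the resources behind the named service are touched, which is the point: the service is the context that decides what data to pull.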