Friday, March 28, 2014

Network Functions Virtualization Meets Network Monitoring and Forensics

By Adwait Gupte, Product Manager

Enterprises and service providers are increasingly flirting with Network Functions Virtualization (NFV) as a means to achieve greater efficiency, scalability and agility in the core and datacenter.

NFV promises a host of benefits in the way networks are created, managed and evolved. Compute virtualization has, of course, redefined data centers, transforming servers from physical computers into virtual processing nodes that can run on one or many physical machines. This separation of processing hardware from the abstract “ability to process” gives operators a great deal of flexibility in how datacenters are run and how workloads are placed, especially in multi-tenant environments.

Network Functions Virtualization (NFV) applies the same concept to networking. But haven’t switches and appliances always been distributed network “processing” nodes? NFV proposes replacing integrated, purpose-built software/hardware boxes, such as routers and switches, with commodity processing platforms running software that performs the actual network function. Rather than a box with its own network OS, processing power, memory and network ports that together function as a router, NFV proposes general-purpose hardware with processing power, memory and ports, running software that turns it into a router. In some cases, handing a networking job to a general-purpose processor is more costly and less efficient than dedicated silicon. The advantage of the virtualized router, though, is that the software layer can be changed on the fly to turn that router into a switch, a gateway or a load balancer. This flexibility enables polymorphism within network infrastructure and promises a more nimble design that can be dynamically repurposed as the needs of the network change, thus future-proofing the investment made in acquiring the infrastructure.
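The “polymorphism” idea can be sketched in a few lines of code: the hardware stays fixed while the network function it runs is swapped out. This is a conceptual illustration only; all class and function names here are made up, and real NFV platforms are of course far richer.

```python
# Conceptual sketch: the same "box" (generic compute) is repurposed by
# swapping the software network function it runs. All names here are
# illustrative, not taken from any real NFV platform.

def act_as_switch(packet):
    # Forward based on the L2 destination (stubbed out).
    return f"switched {packet['dst_mac']}"

def act_as_router(packet):
    # Forward based on the L3 destination (stubbed out).
    return f"routed {packet['dst_ip']}"

class CommodityBox:
    """General-purpose hardware; its role is just the function it runs."""
    def __init__(self, function):
        self.function = function

    def repurpose(self, function):
        # Change the box's role on the fly -- no new hardware needed.
        self.function = function

    def process(self, packet):
        return self.function(packet)

box = CommodityBox(act_as_router)
pkt = {"dst_mac": "aa:bb:cc:dd:ee:ff", "dst_ip": "10.0.0.7"}
print(box.process(pkt))       # routed 10.0.0.7
box.repurpose(act_as_switch)  # same hardware, new network function
print(box.process(pkt))       # switched aa:bb:cc:dd:ee:ff
```

The same box answers as a router one moment and a switch the next, which is exactly the dynamic repurposing the paragraph above describes.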

Today, switching and routing functions can be virtualized, with some tradeoffs. More sophisticated functions for security and network/application monitoring still require hardware acceleration. Network performance management (NPM) and application performance management (APM) tools, and security systems such as IPS, which operate on real-time data, have arrived in virtual form factors for some use cases. Technologically speaking, this seems to be the logical evolution following the virtualization of much of the data center infrastructure. While there remains debate as to whether tool vendors will embrace or attempt to stymie this evolution, the more critical question is: which elements require optimized processing and hardware acceleration?

From the customer’s viewpoint, virtualization reduces the CAPEX allocated to such tools and systems. As virtualized tools become available, it should become easier for customers to scale their tool deployments to match their growing networks. The hope of scaling out without buying additional costly hardware-based appliances is an obvious attraction: customers can instead increase the compute power of their existing infrastructure and, as necessary, buy more instances of the virtualized probes. In a multi-tenant situation, these probes may even be dynamically shared as the traffic load of individual tenants varies. But what if those tools and probes cannot function without hardware acceleration? What if running them on general-purpose compute proves more expensive than running them on optimized systems?

There’s no reason to adopt virtual tools and systems that can’t get the job done or that increase costs.

Further, while routing and switching are well-understood functions that even nascent players can virtualize, any such changeover carries significant operational cost. Advanced monitoring features are far more complicated and sophisticated: compared with basic infrastructure elements, tools and security systems require a greater development investment and more often depend on highly integrated hardware to function efficiently.

I think the driving force behind this transformation will have to come from the customers, especially large ones, who have the economic wherewithal to force the vendors to toe the line towards virtualization. An example of such a shift is AT&T’s Domain 2.0 project. As John Donovan put it, “No army can hold back an economic principle whose time has come.”

As the large customers build pressure on vendors to move towards virtualization, I think we will start seeing movement towards NFV within the advanced products of the networking space. One element of this change is already occurring in forensics, or “historical” (as opposed to real-time) network analysis. Historical analysis functions, such as IDS or network forensics, can be virtualized to a great degree, but today these systems tend to be monolithic devices that combine capture, storage and analysis. As has been shown repeatedly in the past, there is certainly value in specialization, especially when line-rate performance is required: capturing network data, storing it efficiently for retrieval, and building smart analytics are diverse functions that have been coupled in the past.

Today, just as we consider decoupling network functions from underlying hardware, we should also look at the benefits of decoupling network data from analysis software and hardware appliances. After all, these systems are hardware, software, and data. Ultimately, NFV provides an opportunity for analytics tools and security systems to offload data capture and storage to other elements, enabling hardware optimization where required and freeing the data to be used by a variety of systems. A move towards NFV by the analytics vendors would bring all the advantages of scalability and cost-effectiveness that NFV promises in other networking domains, but analytics vendors need to decouple data processing as much as they need to virtualize functionality.

Tuesday, March 18, 2014

High Density Tapping in a 100Gbps World

By Joseph Collins, Product Manager

Network bandwidth utilization has been rapidly increasing. In one example, Dave Jameson, Principal Architect at Fujitsu Network Communications, writes that “in 1995 there were approximately 5 million cell phone subscribers in the US, less than 2 percent of the population. By 2012, according to CTIA, there were more than 326 million subscribers. Of those, more than 123 million were smartphones.” Smartphones have made unlimited amounts of information available, creating the “human centric network.” (Jameson, 2014)

The increase in data is not limited to the telecommunications market, however. In every industry we see growing bandwidth utilization and big data, and this growth has driven data centers to increase the capacity of their fiber links. Goldman Sachs has upgraded its ratings on a number of optical equipment manufacturers because of the huge deals being made by organizations that need increased bandwidth. (Jackson, 2014) This need for speed has finally propelled 10GbE data center switch shipments past 1GbE, according to Crehan Research. (Crehan Research Inc., 2014)
[Figure: Data Center Switch Port Shipments 2009-2013, Crehan Research Inc.]
While growth in bandwidth utilization is a major factor in the move from 1GbE to 10GbE, so are form factor and lower prices. Growth also isn’t expected to stop at 10GbE: Goldman Sachs’ Simona Jankowski writes that 100GbE will soon be an important revenue maker for the market, one of the reasons for the firm’s stock rating upgrades. (Jackson, 2014)

Data center specialists need flexible, high-port-density solutions for network access. Products such as network TAPs must address today’s tapping needs while giving networking professionals the ability to prepare and upgrade for tomorrow. This translates into chassis that support tapping of Multimode and Singlemode links from 1Gbps up to 100Gbps at high port density, with the modularity to allow scalable transitions and upgrades.

VSS Monitoring’s HD Fiber TAP provides such a solution to the network access problem. In a 1RU form factor, it can passively tap up to 24 links of 1-100Gbps Singlemode or 1-10Gbps Multimode, 16 links of 40Gbps or 100Gbps Singlemode, or a mix of the two. This design allows expansion from 1Gbps to 10Gbps, and on to 40Gbps and 100Gbps tapping access, for years to come.

Availability

The HD Fiber TAP chassis, 1Gbps-10Gbps MM and 1Gbps-40Gbps SM modules are shipping now; the 100G MM modules are to follow in April.

Friday, March 14, 2014

SDN Applications Alone Do Not Meet Customer Needs for Visibility and Security on Large-Scale Networks

By: Andrew R. Harding, Vice President of Products

Last week I wrote about how the term “network TAP” is being misused in the SDN world. I explained how engineers might combine TAPs, NPBs, and SDN in a solution, using the joint IBM and VSS Monitoring “Converged Monitoring Fabric” as an example. And in the last week, the leading SDN proponent announced a "TAP"--that is, an automated SPAN configuration tool that works with OpenFlow switches. It's an interesting announcement, which you can read about here: http://www.sdncentral.com/news/onf-debuts-network-tapping-hands-on-openflow-education/2014/03/. The announcement was made at the Open Networking Summit, the annual meeting of the Open Networking Foundation (ONF). (http://www.opennetsummit.org/)

ONF, which has been led by Dan Pitt since 2011, and which some say has been driven by Nick McKeown from behind the curtains at Stanford, is moving from shepherding the OpenFlow specification to delivering an open-source project. (https://www.opennetworking.org/) This event is worth noting because its first SDN application is an “aggregation tap” that works with an OpenFlow controller and OpenFlow switches. This is quite a development for the ONF, which had spurned open source in the past, leaving white space in the SDN arena for single-vendor projects (like the languishing Project Floodlight) and multi-vendor projects like OpenDaylight (ODL). (http://www.opendaylight.org/) But it's not a TAP. This application requires tapping, and TAPs, to access traffic. An OpenFlow switch alone can't get traffic from the production network, and spanning ports directly from a production OpenFlow switch runs into precisely the same issues as traditional attempts at using SPAN.

Dan Pitt, speaking for the ONF, asserts that the project is merely an educational tool and that the open-source project, called “SampleTap,” is a “non-invasive, experimental project.” That sounds very much like the initial positioning of some SDN applications from embattled SDN startups, which touted their own "tapping" SDN application as “your first production SDN application.” The passive nature of tapping traffic and then aggregating it does make this use case a safe starting point for SDN. TAPs don't perturb the network, and combining TAPs with NPBs delivers visibility into network data. For simple use cases, such as educational and lab deployments, this open-source SDN application might provide a starting point for software engineers who need to learn about the network, or for networking engineers investigating SDN. SDN code alone, however, fails to provide visibility and fails to improve security on large-scale networks. SDN applications alone, open-source or commercialized, do not meet those customer needs because:
  • Tools must be optimized. Switches can’t do this. They are limited to link aggregation, and very few production OpenFlow switches even support LAG.
  • Traffic must be groomed. Current switches cannot re-write packets. They cannot support port and time stamping. They can only support basic aggregation and filtering.
  • A monitoring fabric is a hardware-accelerated meshed forwarding system. OpenFlow cannot deliver this today; NPBs and the vMesh architecture do.
  • Initial tapping is required. No SDN offering supports a complete solution from TAPs to passive NPB to active use cases.
  • Latency of white-box & commodity silicon switches is unacceptable for many applications.
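The first two limitations above can be made concrete with a toy model of what an OpenFlow 1.0-era switch can actually express for monitoring: match a few header fields and forward to a tool port. This is an illustrative sketch, not a real OpenFlow controller API; the port numbers and field names are assumptions for the example.

```python
# Toy model of a bare OpenFlow-style flow table used as a "tap
# aggregator": it can filter on header fields and aggregate matching
# traffic to a tool port (24 here), but it cannot rewrite packets,
# timestamp them, or groom traffic the way an NPB can.

flow_table = [
    # Aggregate web traffic from tapped ports 1-3 to tool port 24.
    {"match": {"in_port": 1, "tp_dst": 80}, "actions": ["output:24"]},
    {"match": {"in_port": 2, "tp_dst": 80}, "actions": ["output:24"]},
    {"match": {"in_port": 3, "tp_dst": 80}, "actions": ["output:24"]},
]

def forward(packet):
    """Return the output actions for a packet; first matching entry wins."""
    for entry in flow_table:
        if all(packet.get(k) == v for k, v in entry["match"].items()):
            return entry["actions"]
    return []  # no match: the packet is simply not delivered to any tool

print(forward({"in_port": 2, "tp_dst": 80}))   # ['output:24']
print(forward({"in_port": 2, "tp_dst": 443}))  # []
```

Everything beyond this match-and-forward pattern (packet rewriting, stamping, hardware-accelerated meshing) is exactly what the list above says a switch alone cannot provide.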


SDN apps alone are incomplete and must rely on NPBs and TAPs. Still, this sample application is an intriguing development. It runs atop the OpenDaylight SDN Controller, the same platform as the converged monitoring fabric from VSS Monitoring and IBM. A demo of the application, which is based on Java and HTML5, showed a multi-switch system that supports aggregation and OpenFlow filters. It also showed very basic unidirectional service insertion, a basic approach to augmenting switch functionality with functions available only on remote systems. This service insertion is a precursor to tool chaining and service chaining, which have been something of a holy grail in networking. The idea of “insertion” and “chaining” goes all the way back to Cisco’s venerable Service Insertion Architecture and Juniper’s “service chaining vision,” announced in 2013 with meager results thereafter. Using complex routing configurations, or overloading ancient protocols such as WCCP, in pursuit of chaining has been a bugaboo that led to many a network outage over the years, so doing chaining cleanly remains an important goal in networking.
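At its core, service or tool chaining is just ordered composition: traffic traverses a sequence of functions, each feeding the next. A minimal sketch, with entirely hypothetical service names standing in for real tools such as deduplicators or IDS filters:

```python
# Sketch of service/tool chaining as ordered composition. The services
# here (dedup, filter_web) are hypothetical stand-ins for real chained
# tools; a production chain must also handle failure (fail-safe bypass).

def dedup(packets):
    # Drop duplicate copies of packets (e.g., from multiple tap points).
    seen, out = set(), []
    for p in packets:
        if p not in seen:
            seen.add(p)
            out.append(p)
    return out

def filter_web(packets):
    # Keep only traffic destined for TCP port 80.
    return [p for p in packets if p.endswith(":80")]

def chain(*services):
    """Compose services so traffic traverses them in order."""
    def run(packets):
        for svc in services:
            packets = svc(packets)
        return packets
    return run

monitoring_chain = chain(dedup, filter_web)
traffic = ["10.0.0.1:80", "10.0.0.1:80", "10.0.0.2:443"]
print(monitoring_chain(traffic))  # ['10.0.0.1:80']
```

The hard part in a real network is not the composition itself but doing it at line rate and failing safe when a chained service goes down, which is why chaining has historically caused outages when bolted onto routing tricks or WCCP.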

Robust service chaining can actually be delivered today, and is deployed in many large-scale networks. In monitoring networks, the chaining of performance tools and passive IDS systems uses VSS Monitoring’s vBrokers. In active security deployments, service chaining for production traffic uses VSS vProtector, which was designed to provide simple, fail-safe service chaining. Today, to deploy in production the kind of functionality that an educational application such as SampleTap demonstrates, network engineers need commercial systems.

As such applications evolve from OpenFlow 1.0 to the more recent and far more robust OpenFlow 1.3 standard, projects like this sample application represent a new tool for investigating SDN in a well-known use case. The application can also help clarify the differences between TAPs, NPBs, and SDN aggregation applications, and it might just foreshadow a method for combining SDN systems with NPBs. In discussions about the sample application, Dan Pitt has assured his listeners that there are no plans to turn SampleTap into a product; his goal, he said, is advancing OpenFlow, not delivering products.

This announcement might just be a milestone: perhaps OpenFlow 1.3, or a follow-up version of the specification, will mark the point at which users really need to consider integrating OpenFlow support more broadly and expecting it to be used widely. The announcement stated that the application was tested with available OpenFlow switches and that the source will be available soon on ONF’s GitHub repository under the Apache 2.0 open-source license. The ONF has sponsored the building of an application atop the OpenDaylight controller, which might be the death knell for earlier projects such as the Floodlight controller, languishing as its sponsor pivots to a new business focus. I look forward to further use of OpenFlow and integration between OpenFlow monitoring points and network packet brokers, such as that available from IBM and VSS Monitoring today. As for the open-source sample application, we all just need to wait a few days to get past the demo and get access to the code…

Tuesday, March 11, 2014

Definitions of SDN, TAPs, and What's Required to Monitor Large-Scale Networks

By: Andrew R. Harding, Vice President of Products 

Even if you don't know what a network TAP is, you should read this post, because a recent announcement from the Open Networking Foundation may have caused some confusion about the definitions of SDN, TAPs, and what’s required to monitor large-scale networks. (You can read the announcement here: http://www.sdncentral.com/news/onf-debuts-network-tapping-hands-on-openflow-education/2014/03/ .)

A network TAP is a tool that enables network engineers to access the data on networks to complete performance analysis, troubleshooting, security, and compliance tasks. Engineers tap the network with a TAP, and as networks grow in scale and complexity, tapping systems have evolved into monitoring switches and packet brokering fabrics. Such fabrics require TAPs, and other more sophisticated elements, to aggregate, filter, and optimize the tapped traffic.  These other elements are called Network Packet Brokers (NPBs) or Network Visibility Controllers. I will use the NPB moniker.

Using a TAP is an alternative to configuring mirror or SPAN ports on network switches. SPAN “mirrors” are switch ports that carry copies of network traffic. (SPAN stands for “Switched Port Analyzer” or “Switch Port for Analysis,” depending on whom you talk to.) SPAN ports have performance constraints and physical limits, and they perturb the system under analysis (as they are a subordinate function within a network switch), so most folks prefer to use TAPs in large-scale networks. Lately, some software engineers—or their collaborators in marketing—have been calling their software-defined networking applications "TAPs." This naming scheme is clever marketing, but it's not accurate.

If you step back and think about SPAN for a moment, while it's useful for ad-hoc data access, it is a fundamentally limited approach. Yes, it's integrated with the switch, but configuring a switch to copy every packet from several ports to one other port is asking for trouble: the switch will hit performance limits and start to drop packets, because dropping under congestion is exactly what switches are designed to do. Each switch also has only a limited number of SPAN ports. A passive TAP doesn't perturb the system, doesn't consume a switch port, and doesn't require switch configuration changes; it simply splits off a copy of the traffic an engineer needs to access. Cisco itself recognizes the limits of SPAN ports and recommends: "the best strategy is to make decisions based on the traffic levels of the configuration and, when in doubt, to use the SPAN port only for relatively low-throughput situations." (http://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/san-consolidation-solution/net_implementation_white_paper0900aecd802cbe92.pdf)
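The oversubscription behind Cisco's caution is simple arithmetic. The figures below are illustrative, not measurements: even three moderately loaded gigabit links offer more traffic than a single gigabit SPAN port can carry.

```python
# Why SPAN drops packets: copying several busy ports to one SPAN port
# oversubscribes it. All numbers here are illustrative, not measured.

span_port_gbps = 1.0
source_ports_gbps = [0.6, 0.7, 0.5]  # three moderately loaded 1G links
# (Full-duplex links carry traffic both ways, so a mirror can see up to
# 2x line rate per port; we sum only one direction for simplicity.)

offered = sum(source_ports_gbps)
oversubscription = offered / span_port_gbps
dropped_fraction = max(0.0, 1 - span_port_gbps / offered)

print(f"offered load: {offered:.1f} Gbps into a {span_port_gbps:.0f} Gbps SPAN port")
print(f"oversubscription ratio: {oversubscription:.1f}x")
print(f"at least {dropped_fraction:.0%} of mirrored traffic must be dropped")
```

A passive TAP sidesteps this entirely: each tapped link gets its own copy at full line rate, and aggregation decisions move to an NPB built for the job.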

It’s obvious that a clever SDN marketeer would avoid calling their latest application "auto-SPAN" or "programmable SPAN," because that would limit them to the use cases where SPAN can meet the need: low-utilization scenarios. Networks need TAPs: no argument there. TAPs do not modify the system or the data under test; with TAPs, Heisenberg uncertainty does not apply. You get what's on the wire from a TAP, including physical-layer errors, which are sometimes required to sort out network issues. And TAPs don't drop packets. If you operate a network, you're likely evaluating the benefits of a system that aggregates and filters network monitoring data to simplify delivering that data to performance tools and security systems. That's what network packet brokers do, at the most basic level. Optimizing that traffic, maximizing the use of performance tools, and simplifying large-scale security deployments are more advanced features. VSS Monitoring offers TAPs as well as basic and advanced NPBs. The SDN gang, for some reason, didn't choose to call their systems software-defined NPBs--maybe because they can't do what NPBs do? Or maybe because that's a mouthful: SDN-NPBs. TLA2! And so these systems, which use OpenFlow (or other means) to program a switch to aggregate monitored traffic, have been called “taps”. (They could more accurately have been called “SDN Aggregators”.)

They might be described as an SDN “forwarding system” because they don’t actually support tapping at all. In fact, IBM and VSS Monitoring have qualified a solution that combines SDN technology with network packet brokers. This solution supports TAPs, NPBs, and integrates with SDN systems, too. You can learn more about this "converged monitoring fabric" here: http://public.dhe.ibm.com/common/ssi/ecm/en/qcs03022usen/QCS03022USEN.PDF and here: http://www.vssmonitoring.com/resources/SolutionBriefs/VSS-IBM%20SDN_Solution%20Brief.pdf


Furthermore, VSS offers the option of TAP port pairs integrated into the NPBs themselves. SDN switches do not support integrated TAPs and require additional products to actually tap the network. The vMesh architecture is a network fabric (though not a general-purpose fabric, as it is optimized for network monitoring and security deployments). To deploy such a fabric, SDN or otherwise, you cannot make progress without TAPs. You can't forget Layer 1.