The other day, I overheard a discussion between a colleague and a market analyst over the value of packet-level information. The analyst didn’t think full packet capture made sense for NPM/APM tools, because they could perform their functions effectively using only metadata and/or flow statistics.
So, are network recorders and their ilk really on the way out? Is complete packet capture useless?
I argue “no.” And here’s why: APM tools can generally identify issues for a given application (e.g. Lync calls are dropping). These issues might arise from the compute infrastructure (slow processors, insufficient RAM), but they could also lie within the network infrastructure (link overload, badly tuned TCP parameters, etc.). In the latter case, the root cause would be extremely difficult to identify and debug without having a complete, packet-level record.
When investigating a breach or “exfiltration” (such as Target’s), you absolutely need the full packet data, not just flow-level metrics (which show that some activity took place, not exactly “what” activity) or metadata (which shows that “some data” was sent out, not “which data”). Summarized flow statistics (or metadata) are an inherently lossy approach to “compressing” monitoring data. True, they take up less space and can be processed faster than full packets, but they omit information that could be critical to a discovery process.
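To make the “lossy” point concrete, here is a minimal sketch in Python. The packet tuples, addresses, and function names are hypothetical and exist only for illustration; real flow exporters track far more fields. The idea is simply that once packets are collapsed into per-flow counters, the payload can no longer be searched.

```python
from collections import defaultdict

# Hypothetical captured packets: (source, destination, dest port, payload).
packets = [
    ("10.0.0.5", "203.0.113.9", 443, b"GET /status"),
    ("10.0.0.5", "203.0.113.9", 443, b"card=4111-XXXX"),
    ("10.0.0.7", "198.51.100.2", 53, b"example.com?"),
]


def summarize_flows(pkts):
    """Collapse packets into per-flow packet and byte counters."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, dst, port, payload in pkts:
        record = flows[(src, dst, port)]
        record["packets"] += 1
        record["bytes"] += len(payload)
    return dict(flows)


def find_payload(pkts, needle):
    """Return the packets whose payload contains the given byte string."""
    return [pkt for pkt in pkts if needle in pkt[3]]


flow_stats = summarize_flows(packets)   # tells us "how much" was sent
leaked = find_payload(packets, b"card=")  # tells us "what" was sent
```

The flow record for the first conversation reports two packets and a byte count, which is enough to flag unusual volume. Only the packet list can answer which of those packets carried the card data.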
While full packet capture is not required to show that the application infrastructure is faultless when performance issues arise, it is certainly required when the problem is caused by the network, or when the exact data that was transmitted is needed for troubleshooting or security purposes. Full packet capture makes sense for both APM and security use cases. However, capturing every packet everywhere, all the time, is prohibitively expensive. Network engineers and security analysts need to capture just the data they need and no more.
Aside from the obvious compliance mandates, continuous packet capture prevents data gaps. Implemented efficiently, full packet capture is also feasible in terms of cost and management. One of the key elements of such efficiency is decoupling the data from vertically integrated tools. I covered probe virtualization in a previous post, but some of these points are worth repeating in the context of making full packet capture scalable:
- Tools that integrate capture, storage, and analysis of packet data are expensive. They also have limited storage and compute capacity. If you run out of either, the only way to expand is to buy a new appliance. An open capture and storage infrastructure makes the scaling of at least those parts of the equation more cost effective.
- NPM/APM tools already make use of complete packets in the sense that they hook into a network tap/span port and accept these packets. Whether they store them internally or process them on the fly and discard them depends on the tool. The point is, if we are able to separate the collection of the data (packet capture) from the consumption of the data (the NPM/APM analytics, forensics etc.), it makes the data a lot more versatile. We can collect the data once and use it for multiple purposes, anytime, anywhere.
- The exact tool that will be used to analyze this data need not be known at the time of collection, since the data can be collected in an open format (e.g. PCAP). Such a format makes the data future proof.
- Virtualized analytics tools are on the horizon (customers are demanding them). When they arrive, these virtualized appliances will need to be fed data from a separate capture/storage infrastructure, although some of these functions can be taken care of by the Network Packet Brokers (NPBs) that collect the data across the network.
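The “collect once, use many times” idea in the points above can be sketched with nothing more than the classic PCAP file layout. This is a minimal illustration, not a production recorder: the function names are made up for this post, the frames are dummy bytes rather than real Ethernet traffic, and an in-memory buffer stands in for the capture store. It assumes the little-endian, microsecond-resolution variant of the format.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4    # classic pcap, microsecond timestamps
LINKTYPE_ETHERNET = 1      # link-layer type recorded in the file header


def write_pcap(stream, frames, snaplen=65535):
    """Write (seconds, microseconds, frame_bytes) tuples as a pcap file."""
    # Global header: magic, version 2.4, thiszone, sigfigs, snaplen, linktype.
    stream.write(struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0,
                             snaplen, LINKTYPE_ETHERNET))
    for sec, usec, frame in frames:
        captured = frame[:snaplen]
        # Record header: timestamp, captured length, original length.
        stream.write(struct.pack("<IIII", sec, usec, len(captured), len(frame)))
        stream.write(captured)


def read_pcap(stream):
    """Yield (seconds, microseconds, frame_bytes) from a pcap stream."""
    header = stream.read(24)
    (magic,) = struct.unpack("<I", header[:4])
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian microsecond pcap file")
    while True:
        record = stream.read(16)
        if len(record) < 16:
            return
        sec, usec, incl_len, _orig_len = struct.unpack("<IIII", record)
        yield sec, usec, stream.read(incl_len)


# Capture once (dummy frames: 14 zero bytes standing in for an Ethernet header).
storage = io.BytesIO()
write_pcap(storage, [
    (1700000000, 0, b"\x00" * 14 + b"hello"),
    (1700000000, 500, b"\x00" * 14 + b"secret-data"),
])

# Consumer 1: a performance-style tool that only needs byte counts.
storage.seek(0)
total_bytes = sum(len(frame) for _, _, frame in read_pcap(storage))

# Consumer 2: a forensics-style tool that inspects the payload itself.
storage.seek(0)
hits = [frame for _, _, frame in read_pcap(storage) if b"secret" in frame]
```

Neither consumer had to exist, or be known, when the frames were written. Both read the same stored bytes, which is the property that makes an open capture format attractive.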
A full, historical record of packets (based on continuous capture as a separate network function) is not only useful, but will remain relevant for the foreseeable future. The value of such capture increases further with a system that uses programmability to trigger packet capture based on external events, then forwards packets in real time while simultaneously recording the flow of interest for asynchronous analysis. Now, that’s something only VSS Monitoring can do today (a post for another day).