Using Network Analytics to Spot and Fix Network Issues Faster

By pg.menon posted Apr 30, 2018 01:00 PM


This two-part series examines Aruba’s Network Analytics Engine (NAE), a unique framework for network assurance and remediation that is built into the ArubaOS-CX network operating system. This first blog covers the NAE architecture and how it speeds network troubleshooting. The second blog explores implementation examples and use cases for NAE.


Today’s fast-changing markets expect businesses to be more agile. Increasing business agility is crucial to growth and profitability. Since IT is an essential part of business operation, the same applies to IT operations. Innovations in IT infrastructure that reduce downtime, identify anomalies, improve performance and resolve problems quickly contribute directly to business agility and business continuity. The Network Analytics Engine (NAE), which is built into the ArubaOS-CX network operating system for the Aruba 8400 and 8320 core and aggregation switches, is one such innovation designed for network operators.


IT departments have sophisticated systems in place to monitor and manage their infrastructure. A vast array of tools helps operators collect event logs, watch key performance metrics and capture traffic flow data to improve performance, spot anomalies and analyze failures. Working alongside these existing tools, NAE gives network operators a rules-based framework that quickly finds the root cause, triggers follow-up actions and lowers mean time to repair.


What is Network Analytics Engine?

NAE is a first-of-its-kind built-in framework for network assurance and remediation. Combining the full automation and deep visibility capabilities of ArubaOS-CX, this unique framework allows monitoring, troubleshooting and easy network data collection by using simple scripting agents.


Quite simply, NAE lets you analyze a problem in real time. It gives you the insight you need to resolve the issue or, even better, it takes corrective action based on established policies. When it detects an anomaly, it can proactively collect additional statistics and data to troubleshoot the problem.


NAE is made up of agents, rules, databases, APIs and a web UI.

  • NAE Agents - NAE makes use of agents to collect context. Agents are user-defined scripts that are triggered on the device when a specific event occurs. Once triggered, they collect additional relevant network information.
  • NAE Rules - Agents are triggered by rules that are also defined by the user. An example of a rule is short-term-high-CPU, where additional context is collected when the CPU utilization exceeds a certain threshold for a specified period of time.
  • NAE Databases - NAE is tied to the configuration and state database as well as a time series database.
    • Configuration and state database - The tie-in with the configuration and network state database is what makes NAE possible, because NAE has direct access to the entire current state of ArubaOS-CX, all statistics included. It also lets an agent check whether a network event was related to a configuration change, which is useful in determining root cause.
    • Time series database - The tie-in with a time series database gives users the ability to rewind and play back the network context surrounding a network event. Under normal use, the database is estimated to store 400 days of data.
  • REST APIs - NAE has REST APIs for integration with external systems such as security information and event management (SIEM) tools and log analytics engines. In addition, operators can use the APIs to request information from other devices in the network to create a complete picture of the network state when a specific event occurs and automatically take corrective action based on policy.
  • Web UI - The ArubaOS-CX web user interface gives operators quick and easy visibility. Besides letting you monitor the status of a switch, it lets you view and configure NAE agents, scripts and alerts. Automatically generated graphs provide additional context for troubleshooting networks.

Figure: Network Analytics Engine components
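
To make the relationship between rules and agents concrete, here is a minimal Python sketch of the short-term-high-CPU idea described above. It is not the NAE scripting API: the class names `SustainedThresholdRule` and `Agent`, the `fake_context` helper and the sample values are all hypothetical, chosen only to show how a rule that requires a threshold to be exceeded for a period of time can trigger a script that gathers extra context.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class SustainedThresholdRule:
    """Hypothetical rule: met when a metric stays above a threshold for period_s seconds."""
    threshold: float                     # e.g. CPU utilization in percent
    period_s: float                      # how long the value must stay above the threshold
    above_since: Optional[float] = None  # start time of the current excursion, if any
    fired: bool = False                  # True once the current excursion has triggered

    def observe(self, timestamp: float, value: float) -> bool:
        """Return True once per excursion, at the first sample that satisfies the rule."""
        if value <= self.threshold:
            # Back under the threshold: forget the excursion and re-arm the rule.
            self.above_since = None
            self.fired = False
            return False
        if self.above_since is None:
            self.above_since = timestamp
        if not self.fired and timestamp - self.above_since >= self.period_s:
            self.fired = True
            return True
        return False


class Agent:
    """Hypothetical agent: runs a user-supplied collector when its rule fires."""

    def __init__(self, rule: SustainedThresholdRule,
                 collect_context: Callable[[], Dict[str, str]]) -> None:
        self.rule = rule
        self.collect_context = collect_context
        self.alerts: List[dict] = []

    def on_sample(self, timestamp: float, value: float) -> None:
        if self.rule.observe(timestamp, value):
            # Gather context at the moment the rule is satisfied.
            self.alerts.append({
                "time": timestamp,
                "value": value,
                "context": self.collect_context(),
            })


def fake_context() -> Dict[str, str]:
    # Stand-in for whatever extra state a real script would read from the device.
    return {"top_process": "hypothetical-daemon"}


agent = Agent(SustainedThresholdRule(threshold=90.0, period_s=10.0), fake_context)
for t, cpu in [(0, 50), (5, 95), (10, 96), (15, 97), (20, 98), (25, 40)]:
    agent.on_sample(float(t), float(cpu))

print(agent.alerts)
```

In this sketch a single spike does not raise an alert. The agent records one alert only after utilization has stayed above 90 percent for 10 seconds, and the rule re-arms once the value drops back below the threshold.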


Three Benefits of Built-in Network Analytics

NAE delivers clear benefits to network operators:


  1. Addressing administrative boundaries saves time. In many networks, administrative boundaries limit operators’ visibility into the network. Event logs are processed by log analytics systems that are usually under a different administrative domain, so network visibility, performance insights and subsequent corrective actions often require people to work across those boundaries. That can cause unnecessary delays and may not be agile or flexible enough for operators to address business needs. The built-in NAE gives network operators some freedom and flexibility to deal directly with network-related issues.
  2. Real-time context speeds troubleshooting. Event logs from applications and infrastructure are generally gathered into standalone log analytics tools for root cause analysis, and scripts associated with those tools request additional context around an event. Because these tools service a large number of devices, they may react to an event and request additional context from a particular device only after a delay, by which time the exact context at the moment of the event may be lost. When the centralized servers and tools are unreachable or unavailable, this information cannot be collected at all. The built-in NAE automatically triggers the gathering of additional information at the time the event occurs, giving meaningful context and enabling faster troubleshooting. In addition, the rules-based framework can trigger follow-up actions to remedy the problem. Real-time context is also useful for system optimization: when CPU utilization suddenly peaks, for example, remedial action can be taken immediately.
  3. NAE is a turnkey solution. Most analytics toolsets have to be pulled together and integrated before they can be used. This includes database integration, time series data stores, streaming data feeds and scripts. NAE is a prepackaged solution that can be used out-of-the-box. It comes with integrated databases and prepackaged scripts for common use cases. In addition, supplementary assistance in the form of scripts and use cases can be found on GitHub or at the Aruba Solutions Exchange.
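
The value of keeping history on the device, so that the context around an event can be rewound and played back later, can be illustrated with a small Python sketch. This is only an assumption-laden illustration, not how NAE's time series database is implemented: the `LocalHistory` class, its `record` and `replay` methods and the sample data are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, metric value)


class LocalHistory:
    """Hypothetical on-device history: a bounded buffer of recent samples."""

    def __init__(self, max_samples: int) -> None:
        # A deque with maxlen silently drops the oldest sample when full.
        self._samples: Deque[Sample] = deque(maxlen=max_samples)

    def record(self, timestamp: float, value: float) -> None:
        self._samples.append((timestamp, value))

    def replay(self, event_time: float, before_s: float, after_s: float) -> List[Sample]:
        """Return the samples recorded in the window around event_time, oldest first."""
        start, end = event_time - before_s, event_time + after_s
        return [s for s in self._samples if start <= s[0] <= end]


history = LocalHistory(max_samples=1000)
for t in range(0, 60, 5):
    # Made-up metric values, recorded every 5 seconds.
    history.record(float(t), float(t * 2))

# Later, look back at what the metric was doing around an event at t = 30 s.
context = history.replay(event_time=30.0, before_s=10.0, after_s=5.0)
print(context)
```

Because the samples are already stored locally when the event happens, the window around the event can be retrieved at any later time, which is the property a delayed request from a central tool cannot guarantee.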

NAE is an innovative new way for network operators to identify and resolve problems faster. In my next blog, I will explore implementation examples. 


Go Deeper

Read my blog “The Three Biggest Network Automation Benefits of REST APIs.”


Read the blog “ArubaOS-CX: A Modern, Programmable Network for the Mobile and IoT Age,” by Tom Black, VP and GM of the campus switching business unit at Aruba.


PG Menon is senior director of product and solutions marketing at Aruba.


