Small data observability can deliver the real-time insights that matter, enabling enterprises to maximize the availability of complex and growing infrastructures.
Capturing data for system observability is a growing priority for organizations. It is important for maximizing IT efficiency, and it is essential for identifying and resolving security and performance issues. For SRE, IT operations and other critical teams, capturing and analyzing the right data makes a large difference in how quickly and efficiently they can draw conclusions and make informed decisions. As more and more data is generated at the edge, teams are being forced to re-evaluate their observability strategy.
The most common approach companies take is to move raw edge data to a central repository and analyze it in batches: a “big data” observability strategy. However, this has become impractical, or at the very least very expensive, at scale. The world produced an estimated 79 zettabytes of data in 2021, leaving many businesses overwhelmed with storage and analytics expenses in the form of software-as-a-service (SaaS) contracts and cloud-related expenses.
The nature of the edge and of edge data also presents unique concerns. Companies face the challenge of configuring, deploying and maintaining agents in thousands of locations at once to extract all the data produced. These locations may be completely incompatible with edge visibility solutions that rely on on-premises hardware or virtual machines for observability. There is simply no room for these solutions on an IoT device or a lean branch location server.
There is an alternative approach to enabling observability at the edge: pushing analytics to where events are happening and pairing it with the ability to dynamically control what is being analyzed in real time. We call this “dynamic edge observability,” or what you might call a “small data” strategy. This approach lets teams run analyses and retrieve the results in real time, adjusting what is analyzed as the raw data generated by the edge infrastructure changes. This provides a high degree of specificity and flexibility, accelerating incident response while keeping costs stable even as businesses analyze more data.
Yet big data and small data are not mutually exclusive: big data can provide the context needed to better leverage the hyperspecific approach of small data. Let’s take a closer look at how big data and small data approaches to observability compare, how small data can dynamically generate insights, and how the two approaches can work in tandem.
Big and small data observability strategies explained
The big data approach to observability involves bringing raw telemetry from the edge to a central repository where a SaaS or cloud provider can analyze it to generate insights. When faced with the question of how much data to collect, teams often think “as much as possible” in hopes of having what they need when questions come up.
Unfortunately, this strategy has significant drawbacks. Transiting and storing so much data is expensive – as is paying for the analysis of that data – and often only a fraction of the data ends up offering meaningful insights. The process is often time-consuming, and waiting for an analysis introduces a lag between identifying issues and getting the information needed to fix them. Many companies generate so much data that they run out of quota or storage space, which forces them to settle for short retention times or pay more to ingest and store more data, leading to unreasonable costs.
In contrast, a dynamic edge observability strategy analyzes raw data streams where the data is generated and produces only “small data”: intelligently aggregated metrics and logs that provide direct insight. These results are available locally for immediate action and are also collected centrally for global visibility.
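To make the idea of aggregating at the edge concrete, here is a minimal sketch in Python. It is not any vendor's agent: the `Event` record, its fields and the `summarize` function are hypothetical names chosen for illustration, and a real agent would work on a continuous stream rather than a list.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One raw telemetry record seen at the edge (hypothetical shape)."""
    source_ip: str
    status: str        # "ok" or "error" in this sketch
    latency_ms: float


def summarize(events):
    """Collapse a window of raw events into one small summary record."""
    if not events:
        return {"count": 0, "errors": 0, "max_latency_ms": None, "by_source": {}}
    return {
        "count": len(events),
        "errors": sum(1 for e in events if e.status == "error"),
        "max_latency_ms": max(e.latency_ms for e in events),
        "by_source": dict(Counter(e.source_ip for e in events)),
    }


# A window of raw events that never has to leave the edge location.
window = [
    Event("10.0.0.5", "ok", 12.0),
    Event("10.0.0.5", "error", 250.0),
    Event("10.0.0.9", "ok", 8.5),
]

# Only this summary would be shipped to the central repository.
summary = summarize(window)
print(summary)
```

The summary can be acted on locally and is also the only thing that needs to be sent upstream, which is the "small data" in this picture.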
Small data also addresses some of the cost and logistical concerns of big data. Since each edge agent pushes only a very small amount of data compared with shipping raw data to a central repository, companies are less likely to run out of quota. Processing small amounts of data is faster and cheaper, especially at scale. The focus on collecting more signal and less noise means less money is spent storing unusable data.
The case for small data
Edge agents can slice data along any dimension you need to investigate, allowing teams to instantly analyze a relevant subset of the data. As a result, small data can enable rapid response in a range of use cases where big data would be too slow and expensive.
For example, imagine a retail chain with hundreds of stores has a single faulty cash register in one location. A big data approach can diagnose this malfunction, but since you won’t know in advance which of the thousands of machines will need to be analyzed and repaired, you need to collect all data from all machines at all times. With a dynamic edge observability strategy, the operations group responsible for the repair could instead instruct the edge observability agent to collect in-depth analysis for the specific cash register’s IP address, and only while the problem is occurring. From there, the group could quickly diagnose misconfigurations or other issues, then remove the request for further analysis and return to baseline observability.
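The cash register scenario can be sketched as a tiny agent whose level of detail is switched on and off at run time. This is only an illustration under assumptions: `EdgeAgent`, `watch`, `unwatch` and `observe` are hypothetical names, and the IP addresses and messages are made up.

```python
class EdgeAgent:
    """Hypothetical agent: a cheap baseline always, detail only on request."""

    def __init__(self):
        self.watched = set()       # IPs with an active in-depth request
        self.baseline_count = 0    # always-on, low-cost metric
        self.detail = []           # detailed records, watched IPs only

    def watch(self, ip):
        """Start collecting detailed records for one IP address."""
        self.watched.add(ip)

    def unwatch(self, ip):
        """Remove the request and fall back to baseline for that IP."""
        self.watched.discard(ip)

    def observe(self, ip, message):
        """Handle one raw event at the edge."""
        self.baseline_count += 1
        if ip in self.watched:
            self.detail.append((ip, message))


agent = EdgeAgent()
agent.observe("10.1.2.3", "sale ok")               # baseline only
agent.watch("10.1.2.3")                            # problem reported
agent.observe("10.1.2.3", "card reader timeout")   # captured in detail
agent.observe("10.1.2.4", "sale ok")               # other registers: baseline
agent.unwatch("10.1.2.3")                          # diagnosis finished
agent.observe("10.1.2.3", "sale ok")               # back to baseline

print(agent.baseline_count, agent.detail)
```

Every event still feeds the baseline metric, but detailed records exist only for the one register and only for the period in which the request was active.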
Small data is also useful for responding to potentially malicious activity. If a DNS provider notices a huge spike in traffic from a particular country, it can configure edge agents to provide more detail about the top queries from that country. From there, examining those top queries might indicate bot traffic, suggest a DDoS attack, or merit further investigation.
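As a rough sketch of the "top queries from one country" drill-down, assuming each record an edge agent sees is a `(country, query_name)` pair. The `top_queries` function, the placeholder country codes `AA` and `BB`, and the domain names are all invented for illustration.

```python
from collections import Counter


def top_queries(records, country, n=3):
    """Return the n most frequent query names seen from one country."""
    counts = Counter(name for c, name in records if c == country)
    return counts.most_common(n)


# Hypothetical DNS query records: (country code, queried name).
records = [
    ("AA", "login.example.com"),
    ("AA", "login.example.com"),
    ("AA", "www.example.com"),
    ("BB", "www.example.com"),
    ("AA", "login.example.com"),
    ("AA", "www.example.com"),
    ("AA", "api.example.com"),
]

# Drill into the country that is spiking, and only that country.
result = top_queries(records, "AA", n=2)
print(result)
```

A single name dominating the list, as `login.example.com` does here, is the kind of signal that might prompt a closer look for bot traffic.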
Staying afloat in a deluge of data
More data at the edge means more potential insights, but it can also lead to more resources being spent analyzing the data. And ultimately, the information produced may not be timely or useful. Even if an organization has found a way to manage its current volume of data, it will be a huge challenge to scale a big data approach without also incorporating a small data observability strategy. Small data observability can deliver the real-time insights that matter, enabling enterprises to maximize the availability of complex and growing infrastructures.