RAD Runtime - Performance Impact Guide
Introduction
This document explains the safety of the RAD Security Runtime monitoring solution. It describes the performance impact of running its components in production environments. It also lists the measures, guardrails, and configuration options that enable customers to mitigate any impact on production workloads.
eBPF
The major component of our runtime analysis is the Runtime agent, which is deployed as part of a Kubernetes DaemonSet to monitor each node. It uses eBPF, a modern and highly secure technology built directly into the Linux kernel.
eBPF is not like traditional kernel modules. It is a sandboxed, in-kernel virtual machine that provides robust guarantees against system impact.
eBPF safety guarantees
Before an eBPF program is loaded, it is checked by the eBPF verifier, a static analysis tool within the kernel. If a program fails any of the following checks, the kernel rejects it:
- Crash-Proof Guarantee: The verifier checks all possible execution paths of the program. It ensures that the program cannot access memory outside its designated sandbox or call unapproved kernel functions, making it impossible for the agent to cause a kernel panic.
- Bounded Execution (No Infinite Loops): The verifier confirms that all programs will run to completion in a finite amount of time. This check ensures that our agent can never cause a kernel-level hang or CPU starvation.
- Secure Sandboxing: The agent’s eBPF programs are isolated. They can only access a small, predefined set of kernel “helper” functions and specific memory regions. This strict isolation prevents any potential for data corruption or unintended side effects.
Minimal Performance Overhead
The agent is designed for high performance and minimal overhead:
- Event-Driven: eBPF programs are event-driven. They lie dormant until a specific event occurs (e.g., a process starts, a network connection is opened). This design choice ensures there’s no unnecessary CPU usage.
- No Context Switching: The agent's eBPF programs run in kernel space, so they can observe data directly. This avoids the expensive operation of constantly copying data between kernel and user space.
- Read-Only Data Access: Our eBPF programs are “read-only.” They do not modify any kernel data structures or application logic, so they do not require synchronization, which would consume additional CPU cycles.
Runtime DaemonSet Kubernetes configuration
The agent container loads the eBPF programs and maintains user-space in-memory queues to move and analyze Runtime events.
The container is part of the Kubernetes pod. The standard controls provided by Kubernetes are used to limit resource usage. The Helm chart provides various configuration options.
By default, the chart sets the limits and requests for CPU and ephemeral storage to the same value, which prevents the agent from using more resources than the node has available at any time. This ensures that the Runtime Pods are terminated before they can affect the production workload.
Please note that in most of our production deployments, usage rarely exceeds the defined requests. We often see a P50 Pod CPU usage of around 20 millicores, and P95 rarely exceeds 50 millicores. Pod memory usage is typically about 200 MiB at both P50 and P95.
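The requests-equal-limits default described above can be checked against a rendered Pod spec. The following is a minimal sketch, not part of the RAD tooling: the function name `requests_match_limits` and the sample values are hypothetical, and only the `resources.requests` / `resources.limits` field names come from the standard Kubernetes container spec. It compares quantity strings literally, so equivalent spellings such as `100m` and `0.1` would be reported as a mismatch.

```python
def requests_match_limits(container, resources=("cpu", "ephemeral-storage")):
    """Return True if every listed resource has identical request and limit.

    `container` is a dict shaped like a Kubernetes container spec.
    Quantities are compared as plain strings (a simplification).
    """
    res = container.get("resources", {})
    requests = res.get("requests", {})
    limits = res.get("limits", {})
    return all(
        name in requests and requests[name] == limits.get(name)
        for name in resources
    )


# Hypothetical container spec; the values are illustrative only.
agent_container = {
    "name": "agent",
    "resources": {
        "requests": {"cpu": "100m", "ephemeral-storage": "1Gi"},
        "limits": {"cpu": "100m", "ephemeral-storage": "1Gi"},
    },
}

print(requests_match_limits(agent_container))
```

A check like this could run in CI against the output of `helm template` to catch overrides that unintentionally let limits drift away from requests.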
All internal queues are bounded, meaning they never grow without limit. If the queues are full, events are dropped, which is the standard approach for runtime monitoring tools. The queue size is also configured via the Helm chart:
20 000 events in any of the processing queues.
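The bounded-queue behavior described above can be illustrated with a short sketch. This is not the agent's implementation: the class name `BoundedEventQueue` is hypothetical, and dropping the newest event when the queue is full is an assumption, since the text does not say which events are dropped. The capacity of 20 000 mirrors the value quoted above.

```python
import queue


class BoundedEventQueue:
    """Fixed-capacity queue that drops incoming events once it is full."""

    def __init__(self, capacity=20_000):
        self._q = queue.Queue(maxsize=capacity)
        self.dropped = 0  # number of events rejected because the queue was full

    def offer(self, event):
        """Try to enqueue without blocking; return False if the event was dropped."""
        try:
            self._q.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1  # assumption: the newest event is the one dropped
            return False

    def poll(self):
        """Return the oldest queued event, or None if the queue is empty."""
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None

    def __len__(self):
        return self._q.qsize()


# Small capacity so the drop behavior is easy to see.
q = BoundedEventQueue(capacity=3)
for i in range(5):
    q.offer({"event_id": i})
print(len(q), q.dropped)
```

Because the producer never blocks and memory use is capped by the capacity, a burst of events costs dropped data rather than unbounded memory growth, and the drop counter makes that loss visible.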
Finally, the Runtime agent also maintains self-telemetry - a set of metrics and diagnostic logs sent to the RAD API and continuously monitored to improve the operational performance.
To send the data to the RAD API, the Runtime Pod uses an additional container, exporter. It also uses bounded internal queues to optimize the data export process.
This container uses the same resource configuration as the agent and can also be configured using the Helm chart values.
The Helm chart also allows limiting tracing to specific Kubernetes namespaces. By default, the Runtime agent uses the following configuration: