RAD Runtime - Performance Impact Guide

Introduction

This document explains the safety of the RAD Security Runtime monitoring solution and describes the performance impact of running its components in production environments. It also lists the measures, guardrails, and configuration options that let customers mitigate any impact on production workloads.

eBPF

The major component of our runtime analysis is the Runtime agent, which is deployed as part of a Kubernetes DaemonSet so that it monitors every node. It uses eBPF, a modern and highly secure technology built directly into the Linux kernel. Unlike traditional kernel modules, eBPF programs run in a sandboxed, in-kernel virtual machine that provides robust guarantees against system impact.

eBPF safety guarantees

Before an eBPF program can run, it must pass the kernel's eBPF verifier, a static analysis tool that checks the code of every eBPF program at load time. If a program fails any of the following checks, the kernel rejects it:
  • Crash-Proof Guarantee: The verifier checks all possible execution paths of the program. It ensures that the program cannot access memory outside its designated sandbox or call unapproved kernel functions, making it impossible for the agent to cause a kernel panic.
  • Bounded Execution (No Infinite Loops): The verifier confirms that all programs will run to completion in a finite amount of time. This check ensures that our agent can never cause a kernel-level hang or CPU starvation.
  • Secure Sandboxing: The agent’s eBPF programs are isolated. They can only access a small, predefined set of kernel “helper” functions and specific memory regions. This strict isolation prevents any potential for data corruption or unintended side effects.

Minimal Performance Overhead

The agent is designed for high performance and minimal overhead:
  • Event-Driven: eBPF programs are event-driven. They lie dormant until a specific event occurs (e.g., a process starts, a network connection is opened). This design choice ensures there’s no unnecessary CPU usage.
  • No Context Switching: The agent’s eBPF programs run in kernel space, so they can observe data directly where it is produced. This avoids the expensive operation of constantly copying data between kernel and user space.
  • Read-Only Data Access: Our eBPF programs are “read-only.” They do not modify any kernel data structures or application logic, so they do not require locking or synchronization that would consume additional CPU cycles.

Runtime DaemonSet Kubernetes configuration

The agent container loads the eBPF programs and maintains user-space, in-memory queues to move and analyze Runtime events. Because the container runs inside a Kubernetes Pod, standard Kubernetes controls are used to limit its resource usage. The Helm chart provides various configuration options. By default, the chart sets the following config:
runtime:
  agent:
    resources:
      limits:
        cpu: 200m
        memory: 1Gi
        ephemeral-storage: 1Gi
      requests:
        cpu: 100m
        memory: 128Mi
        ephemeral-storage: 100Mi
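These defaults can be overridden in your own values file. For example, the following sketch (reusing the same keys as above) raises the CPU and ephemeral-storage requests to match their limits, which is the approach recommended below for tightly packed nodes:
runtime:
  agent:
    resources:
      limits:
        cpu: 200m
        memory: 1Gi
        ephemeral-storage: 1Gi
      requests:
        cpu: 200m               # equal to the CPU limit
        memory: 128Mi
        ephemeral-storage: 1Gi  # equal to the ephemeral-storage limit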
To control worst-case resource usage on nodes where spare capacity is extremely constrained (tightly packed Pods), we also suggest setting the requests for CPU and ephemeral storage equal to their limits, as in the override above. This prevents the agent from being scheduled onto a node that cannot accommodate its peak usage and ensures that the Runtime Pods are throttled or terminated before they can affect production workloads.

In practice, across most of our production deployments, usage rarely exceeds the defined requests. We typically observe a P50 CPU usage of around 20 millicores per Pod, with P95 rarely exceeding 50 millicores; Pod memory is usually about 200 MiB at both P50 and P95.

All internal queues are bounded, so they never grow without limit. If the queues are full, the agent follows the standard approach for runtime monitoring tools and starts dropping events. The queue size is also configured via the Helm chart:
runtime:
  agent:
    eventQueueSize: 20000
This allows each processing queue to hold up to 20,000 events.

Finally, the Runtime agent also maintains self-telemetry: a set of metrics and diagnostic logs sent to the RAD API and continuously monitored to improve operational performance. To send this data to the RAD API, the Runtime Pod uses an additional exporter container. It also uses bounded internal queues to optimize the data export process. This container uses the same resource configuration as the agent and can also be configured through the Helm chart values.

The Helm chart also allows excluding specific Kubernetes namespaces from tracing. By default, the Runtime agent uses the following configuration:
runtime:
  agent:
    env:
      TRACER_IGNORE_NAMESPACES: |
        cert-manager,
        rad,
        kube-node-lease,
        kube-public,
        kube-system
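To exclude additional namespaces, the same list can simply be extended; the monitoring and logging entries below are only placeholders for your own namespaces:
runtime:
  agent:
    env:
      TRACER_IGNORE_NAMESPACES: |
        cert-manager,
        rad,
        kube-node-lease,
        kube-public,
        kube-system,
        monitoring,
        logging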

High-frequency events tracing

The RAD agent traces some open-file events, i.e. cases where a running program accesses a file inside a container. Optionally, RAD Runtime can also trace HTTP requests sent from the Kubernetes cluster. Because these are high-frequency events, they are sampled by default to minimise resource usage. The sampling can be configured in the Helm chart; by default, it uses the following values:
runtime:
  agent:
    sampling:
      # Enable sampling of high-frequency Runtime events (HTTP and Open File).
      enabled: true
      # Minimum number of samples to be collected per event deduplication key (container ID and file path or HTTP URL).
      minSamples: 10
      # Maximum number of samples to be collected per event deduplication key (container ID and file path or HTTP URL).
      maxSamples: 100
      # Percentage of all events to be sampled after the minimum and before the maximum number of events is collected.
      ratio: 5
      # TTL to expire the minimum and maximum counters.
      ttl: "5m"
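With these defaults, the agent always collects the first 10 events for a given deduplication key, then samples roughly 5% of subsequent events until 100 samples have been collected for that key, and expires the counters every 5 minutes. In clusters with very high HTTP or file-access volume, sampling can be tightened further; the values below are purely illustrative:
runtime:
  agent:
    sampling:
      enabled: true
      minSamples: 5    # fewer guaranteed samples per key
      maxSamples: 50   # lower cap per key
      ratio: 1         # sample only 1% between the minimum and maximum
      ttl: "10m"       # reset the counters less frequently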

PII detection

Optionally, PII detection can be enabled. This feature uses a natural language processing analyzer that is deployed on the customer's cluster, so no data is sent to external services (including the RAD API) for analysis. By default, we run 3 replicas of this analyzer, and its resources are configured through the Helm chart. Furthermore, we rate-limit the number of requests sent to the analyzer (by default, 10 requests per second per node). Finally, to reduce egress costs when a large amount of data is sent to the RAD API, we also offer the option to configure AWS PrivateLink. Please head to https://docs.rad.security/docs/configure-aws-privatelink for the details.
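As a rough sketch only, a values override for the analyzer might look like the following; the key names are hypothetical (only the defaults of 3 replicas and 10 requests per second per node come from this document), so check the chart's values.yaml for the exact structure:
runtime:
  # Hypothetical keys for illustration only; the actual chart structure may differ.
  piiAnalyzer:
    replicas: 3            # default number of analyzer replicas
    rateLimitPerNode: 10   # requests per second per node sent to the analyzer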