# Runtime Performance Overview

> Performance impact guide for RAD Security Runtime monitoring solution

# RAD Runtime - Performance Impact Guide

## Introduction

This document explains the safety of the RAD Security Runtime monitoring solution. It describes the performance impact of running its components in production environments and lists the measures, guardrails, and configuration options that let customers mitigate any impact on production workloads.

## eBPF

The major component of our runtime analysis is the Runtime `agent`, which is deployed as a part of a Kubernetes `DaemonSet` to monitor each node. It uses eBPF, a modern and highly secure technology built directly into the Linux kernel.

eBPF is not like traditional kernel modules. It is a sandboxed, in-kernel virtual machine that provides robust guarantees against system impact.

### eBPF safety guarantees

Before an eBPF program is loaded, its code is checked by the eBPF verifier, a static analysis tool within the kernel. If a program fails *any* of the following checks, the kernel will reject it:

* **Crash-Proof Guarantee:** The verifier checks all possible execution paths of the program. It ensures that the program cannot access memory outside its designated sandbox or call unapproved kernel functions, making it impossible for the `agent` to cause a kernel panic.
* **Bounded Execution (No Infinite Loops):** The verifier confirms that all programs will run to completion in a finite amount of time. This check ensures that our agent can never cause a kernel-level hang or CPU starvation.
* **Secure Sandboxing:** The agent's eBPF programs are isolated. They can only access a small, predefined set of kernel "helper" functions and specific memory regions. This strict isolation prevents any potential for data corruption or unintended side effects.

### Minimal Performance Overhead

The `agent` is designed for high performance and minimal overhead:

* **Event-Driven:** eBPF programs are event-driven. They lie dormant until a specific event occurs (e.g., a process starts, a network connection is opened). This design choice ensures there's no unnecessary CPU usage.
* **No Context Switching:** The `agent`'s eBPF programs run in kernel space, so they can observe data directly. This avoids the expensive operation of constantly copying data between kernel and user space.
* **Read-Only Data Access:** Our eBPF programs are "read-only." They do not modify any kernel data structures or application logic, so they do not need synchronization that would cost additional CPU cycles.

## Runtime DaemonSet Kubernetes configuration

The `agent` container loads the eBPF programs and maintains user-space, in-memory queues to move and analyze Runtime events.

The container is part of the Kubernetes pod. The standard controls provided by Kubernetes are used to limit resource usage. The Helm chart provides various configuration options.

By default, the chart sets the following config:

```yaml theme={null}
runtime:
  agent:
    resources:
      limits:
        cpu: 200m
        memory: 1Gi
        ephemeral-storage: 1Gi
      requests:
        cpu: 100m
        memory: 128Mi
        ephemeral-storage: 100Mi
```

In environments where nodes have very little spare capacity (tightly packed pods), we also suggest setting the `limits` and `requests` for CPU and ephemeral storage to the same value. This caps the agent's worst-case usage at the amount already reserved for it, so the agent can never use more resources than the node has available. It also ensures that the `runtime` Pods are throttled or terminated before any impact on the production workload can occur.
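As a minimal sketch of that suggestion, the values below pin the CPU and ephemeral-storage `limits` to the default `requests`. The specific numbers are illustrative assumptions, not sizing recommendations, and the memory settings are left at the chart defaults.

```yaml
runtime:
  agent:
    resources:
      limits:
        # Assumption for illustration: CPU limit lowered to match the request.
        cpu: 100m
        memory: 1Gi
        # Assumption for illustration: storage limit lowered to match the request.
        ephemeral-storage: 100Mi
      requests:
        cpu: 100m
        memory: 128Mi
        ephemeral-storage: 100Mi
```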

Please note that in most of our production deployments, usage rarely exceeds the defined `requests`. We often see a `P50` Pod CPU usage of around `20 millicores`, and `P95` rarely exceeds `50 millicores`. Pod memory usage is typically about `200MiB` for both `P50` and `P95`.

All internal queues are bounded, meaning they never grow without limit. When a queue is full, the agent starts dropping events, which is the standard approach for Runtime monitoring tools. The queue size is also configured via the Helm chart:

```yaml theme={null}
runtime:
  agent:
    eventQueueSize: 20000
```

This allows each processing queue to hold up to `20 000` events.

Finally, the Runtime `agent` also maintains self-telemetry: a set of metrics and diagnostic logs that are sent to the RAD API and continuously monitored to improve operational performance.

To send the data to the RAD API, the Runtime Pod uses an additional container, `exporter`. It also uses bounded internal queues to optimize the data export process.

This container uses the same resource configuration as the `agent` and can also be configured using the Helm chart `values`.

The Helm chart also allows limiting tracing to specific Kubernetes namespaces. By default, the Runtime `agent` uses the following configuration:

```yaml theme={null}
runtime:
  agent:
    env:
      TRACER_IGNORE_NAMESPACES: |
        cert-manager,
        rad,
        kube-node-lease,
        kube-public,
        kube-system
```
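As a sketch of how the ignore list might be extended, the values below add one extra namespace to the defaults. The `monitoring` namespace is a hypothetical example, not something the chart defines. Because a Helm value override replaces the default string rather than merging with it, the default namespaces are repeated here.

```yaml
runtime:
  agent:
    env:
      # The default entries are repeated because this override replaces
      # the chart's default value instead of extending it.
      # `monitoring` is a hypothetical namespace used only for illustration.
      TRACER_IGNORE_NAMESPACES: |
        cert-manager,
        rad,
        kube-node-lease,
        kube-public,
        kube-system,
        monitoring
```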

## High-frequency events tracing

The RAD agent traces some open-file events, i.e. when a running program accesses a file in a container.

Optionally, RAD Runtime can trace HTTP requests being sent from Kubernetes clusters.

These high-frequency events are sampled by default to minimise resource usage. The sampling can be configured in the [Helm chart](https://artifacthub.io/packages/helm/rad/rad-plugins). By default, it uses the following values:

```yaml theme={null}
runtime:
  agent:
    sampling:
      # Enable sampling of high frequency Runtime events (HTTP and Open File).
      enabled: true
      # Minimum number of samples to be collected per event deduplication key (container id and file path or HTTP url).
      minSamples: 10
      # Maximum number of samples to be collected per event deduplication key (container id and file path or HTTP url).
      maxSamples: 100
      # Percentage of all events to be sampled after the minimum and before the maximum number of events is collected.
      ratio: 5
      # TTL to expire the minimum and maximum counters.
      ttl: "5m"
```
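For a cluster where HTTP or open-file events are especially frequent, sampling could be tightened. The override below is a minimal sketch: the keys are the ones shown in the defaults, but the values are illustrative assumptions rather than recommendations, and the comments paraphrase the descriptions of the defaults rather than specify the sampler's exact behaviour.

```yaml
runtime:
  agent:
    sampling:
      enabled: true
      # Illustrative: collect at least 5 samples per deduplication key (default: 10).
      minSamples: 5
      # Illustrative: collect at most 50 samples per deduplication key (default: 100).
      maxSamples: 50
      # Illustrative: sample 1% of events between the minimum and the maximum (default: 5).
      ratio: 1
      # Illustrative: expire the per-key counters after 10 minutes (default: "5m").
      ttl: "10m"
```

Lower values reduce the number of events that reach the processing queues, at the cost of less detail per container, file path, or URL.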

### PII detection

Optionally, PII detection can be enabled. This uses a natural language processing analyzer. The analyzer is deployed on the customer cluster to avoid sending data to external services (including the RAD API) for analysis. By default, three replicas of this analyzer are deployed, and their resources are configured through the Helm chart.

Furthermore, we rate-limit the number of requests sent to the analyzer (by default, 10 requests per second per node).

## AWS PrivateLink

Finally, to reduce egress costs when a large amount of data is sent to the RAD API, we also offer the option to configure AWS PrivateLink.

For AWS PrivateLink configuration details, see the [AWS Setup Guide](/rad-security/integrations/aws-setup).
