Commit 1c124c0d authored by Mario Macias, committed by GitHub

NETOBSERV-307 & NETOBSERV-308: changed example deployments + documentation (#23)

* Move memlock removal to initialization + extra documentation to work with eBPF

* Documented individual capabilities instead of privileged
parent 178ef040
......@@ -3,6 +3,14 @@
The Network Observability eBPF Agent allows collecting and aggregating all the ingress and
egress flows on a Linux host (requires kernel 4.18+ with eBPF enabled).
* [How to compile](#how-to-compile)
* [How to configure](#how-to-configure)
* [How to run](#how-to-run)
* [Development receipts](#development-receipts)
* [Known issues](#known-issues)
* [Frequently-asked questions](#frequently-asked-questions)
* [Troubleshooting](#troubleshooting)
## How to compile
```
......@@ -19,24 +27,56 @@ The eBPF Agent is configured by means of environment variables. Check the
The NetObserv eBPF Agent is designed to run as a DaemonSet in OpenShift/K8s. It is triggered and
configured by our [Network Observability Operator](https://github.com/netobserv/network-observability-operator).
However, you can also run it directly as an executable with administrative privileges:
However, you can also run it directly as an executable from your command line:
```
export FLOWS_TARGET_HOST=...
export FLOWS_TARGET_PORT=...
sudo -E bin/netobserv-ebpf-agent
```
To deploy locally, use instructions from [flowlogs-dump (like tcpdump)](./examples/flowlogs-dump/README.md).
To deploy it as a Pod, you can check the [deployment example](./examples/performance/deployment.yml).
To deploy it as a Pod, you can check the [deployment examples](./deployments).
The Agent needs to be executed either with:
1. The following [Linux capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html)
(recommended way): `BPF`, `PERFMON`, `NET_ADMIN`, `SYS_RESOURCE`. If you
[deploy it in Kubernetes or OpenShift](./deployments/flp-daemonset-cap.yml),
the container running the Agent needs to define the following `securityContext`:
```yaml
securityContext:
  runAsUser: 0
  capabilities:
    add:
      - BPF
      - PERFMON
      - NET_ADMIN
      - SYS_RESOURCE
```
(Note that `runAsUser: 0` is still needed.)
2. Administrative privileges. If you
[deploy it in Kubernetes or OpenShift](./deployments/flp-daemonset.yml),
the container running the Agent needs to define the following `securityContext`:
```yaml
securityContext:
  privileged: true
  runAsUser: 0
```
This option is only recommended if your Kernel does not recognize some of the above capabilities.
We found some Kubernetes distributions (e.g. K3s) that do not recognize the `BPF` and
`PERFMON` capabilities.
Here is a list of distributions where we tested both full privileges and capability approaches,
and whether they worked (✅) or did not (❌):
| Distribution | K8s Server version | Capabilities | Privileged |
|-------------------------------|--------------------|--------------|------------|
| Amazon EKS (Bottlerocket AMI) | 1.22.6 | ✅ | ✅ |
| K3s (Rancher Desktop) | 1.23.5 | ❌ | ✅ |
| Kind | 1.23.5 | ❌ | ✅ |
| OpenShift | 1.23.3 | ✅ | ✅ |
## Where is the collector?
As part of our Network Observability solution, the eBPF Agent is designed to send the traced
flows to our [Flowlogs Pipeline](https://github.com/netobserv/flowlogs-pipeline) component.
In addition, we provide a simple gRPC+Protobuf library to allow implementing your own collector.
Check the [packet counter code](./examples/performance/server/packet-counter-collector.go)
for an example of a simple collector using our library.
## Development receipts
......@@ -62,7 +102,38 @@ Tested in Fedora 35 and Red Hat Enterprise Linux 8.
## Known issues
## External Traffic in OpenShift (OVN-Kubernetes CNI)
### External Traffic in OpenShift (OVN-Kubernetes CNI)
For egress traffic, you can see the source Pod metadata. For ingress traffic (e.g. an HTTP response),
you see the destination **Host** metadata.
\ No newline at end of file
you see the destination **Host** metadata.
## Frequently-asked questions
### Where is the collector?
As part of our Network Observability solution, the eBPF Agent is designed to send the traced
flows to our [Flowlogs Pipeline](https://github.com/netobserv/flowlogs-pipeline) component.
In addition, we provide a simple gRPC+Protobuf library to allow implementing your own collector.
Check the [packet counter code](./examples/performance/server/packet-counter-collector.go)
for an example of a simple collector using our library.
## Troubleshooting
### Deployed as a Kubernetes Pod, the agent shows permission errors in the logs and can't start
In your [deployment file](./deployments/flp-daemonset-cap.yml), make sure that the container runs as
the root user (`runAsUser: 0`) and with the granted capabilities or privileges (see [how to run](#how-to-run) section).
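When diagnosing these permission errors, it can help to confirm which capabilities the process actually holds. The following is a minimal sketch, not part of the Agent: it assumes a Linux host, reads the `CapEff` bitmask from `/proc/self/status`, and reports which of the four capabilities listed in the how-to-run section are missing. The helper names `missingCaps` and `parseCapEff` are hypothetical; the bit numbers come from the kernel's `linux/capability.h`.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// required lists the capabilities named in the README together with
// their bit numbers as defined in linux/capability.h.
var required = []struct {
	name string
	bit  uint
}{
	{"NET_ADMIN", 12},
	{"SYS_RESOURCE", 24},
	{"PERFMON", 38},
	{"BPF", 39},
}

// missingCaps (hypothetical helper) returns the names of the required
// capabilities that are not set in the effective-capability bitmask.
func missingCaps(capEff uint64) []string {
	var missing []string
	for _, c := range required {
		if capEff&(uint64(1)<<c.bit) == 0 {
			missing = append(missing, c.name)
		}
	}
	return missing
}

// parseCapEff (hypothetical helper) extracts the hexadecimal CapEff value
// from the text of a /proc/<pid>/status file.
func parseCapEff(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "CapEff:") {
			value := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
			return strconv.ParseUint(value, 16, 64)
		}
	}
	return 0, fmt.Errorf("CapEff line not found")
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading process status:", err)
		os.Exit(1)
	}
	capEff, err := parseCapEff(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, "parsing capabilities:", err)
		os.Exit(1)
	}
	if m := missingCaps(capEff); len(m) > 0 {
		fmt.Println("missing capabilities:", strings.Join(m, ", "))
	} else {
		fmt.Println("all required capabilities are present")
	}
}
```

Running this inside the container with the same `securityContext` as the Agent shows whether the runtime granted the capabilities that were requested; a distribution that does not recognize `BPF` or `PERFMON` would be expected to report them as missing.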
### The Agent doesn't work in my Amazon EKS cluster
Although Amazon Linux 2 enables eBPF by default in EC2, the
[EKS images are shipped with eBPF disabled](https://github.com/awslabs/amazon-eks-ami/issues/728).
You need to either:
1. Provide your own AMI configured to work with eBPF.
2. Use another Linux distribution that ships with eBPF enabled by default. We have successfully
tested the eBPF Agent in EKS with the [Bottlerocket](https://aws.amazon.com/es/bottlerocket/)
Linux distribution, without requiring any extra configuration.
......@@ -6,5 +6,7 @@ but the files contained here are useful for documentation and manual testing.
* `flp-daemonset.yml`, shows how to deploy/configure the Agent when Flowlogs Pipeline is deployed
as daemonset, taking the target host configuration from the Host IP.
* `flp-daemonset-cap.yml`, same as `flp-daemonset.yml`, but assigning individual capabilities instead
of deploying a fully-privileged container.
* `flp-service.yml`, shows how to deploy/configure the Agent when Flowlogs Pipeline is deployed
as a service, explicitly setting the host configuration as the service name.
\ No newline at end of file
# Example deployment for manual testing with flp
# It requires loki to be installed
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: netobserv-ebpf-agent
  labels:
    k8s-app: netobserv-ebpf-agent
spec:
  selector:
    matchLabels:
      k8s-app: netobserv-ebpf-agent
  template:
    metadata:
      labels:
        k8s-app: netobserv-ebpf-agent
    spec:
      serviceAccountName: netobserv-account
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: netobserv-ebpf-agent
          image: quay.io/mmaciasl/netobserv-ebpf-agent:main
          # imagePullPolicy: Always
          securityContext:
            capabilities:
              add:
                - BPF
                - PERFMON
                - NET_ADMIN
                - SYS_RESOURCE
            runAsUser: 0
          env:
            - name: FLOWS_TARGET_HOST
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: FLOWS_TARGET_PORT
              value: "9999"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: flp
  labels:
    k8s-app: flp
spec:
  selector:
    matchLabels:
      k8s-app: flp
  template:
    metadata:
      labels:
        k8s-app: flp
    spec:
      containers:
        - name: flowlogs-pipeline
          image: quay.io/netobserv/flowlogs-pipeline:latest
          ports:
            - containerPort: 9999
          args:
            - --config=/etc/flp/config.yaml
          volumeMounts:
            - mountPath: /etc/flp
              name: config-volume
      volumes:
        - name: config-volume
          configMap:
            name: flp-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: flp-config
data:
  config.yaml: |
    log-level: debug
    pipeline:
      - name: ingest
      - name: decode
        follows: ingest
      - name: enrich
        follows: decode
      - name: encode
        follows: enrich
      - name: loki
        follows: encode
    parameters:
      - name: ingest
        ingest:
          type: grpc
          grpc:
            port: 9999
      - name: decode
        decode:
          type: protobuf
      - name: enrich
        transform:
          type: network
          network:
            rules:
              - input: SrcAddr
                output: SrcK8S
                type: "add_kubernetes"
              - input: DstAddr
                output: DstK8S
                type: "add_kubernetes"
      - name: encode
        encode:
          type: none
      - name: loki
        write:
          type: loki
          loki:
            type: loki
            staticLabels:
              app: netobserv-flowcollector
            labels:
              - "SrcK8S_Namespace"
              - "SrcK8S_OwnerName"
              - "DstK8S_Namespace"
              - "DstK8S_OwnerName"
              - "FlowDirection"
            url: http://loki:3100
            timestampLabel: TimeFlowEnd
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: netobserv-account
......@@ -24,6 +24,7 @@ spec:
# imagePullPolicy: Always
securityContext:
privileged: true
runAsUser: 0
env:
- name: FLOWS_TARGET_HOST
valueFrom:
......@@ -124,18 +125,4 @@ apiVersion: v1
kind: ServiceAccount
metadata:
name: netobserv-account
---
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
name: example
allowPrivilegedContainer: true
allowHostDirVolumePlugin: true
allowHostNetwork: true
allowHostPorts: true
runAsUser:
type: RunAsAny
seLinuxContext:
type: RunAsAny
users:
- system:serviceaccount:network-observability:netobserv-account
......@@ -24,6 +24,7 @@ spec:
# imagePullPolicy: Always
securityContext:
privileged: true
runAsUser: 0
env:
- name: FLOWS_TARGET_HOST
value: "flp"
......@@ -138,18 +139,3 @@ apiVersion: v1
kind: ServiceAccount
metadata:
name: netobserv-account
---
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
name: example
allowPrivilegedContainer: true
allowHostDirVolumePlugin: true
allowHostNetwork: true
allowHostPorts: true
runAsUser:
type: RunAsAny
seLinuxContext:
type: RunAsAny
users:
- system:serviceaccount:network-observability:netobserv-account
......@@ -98,6 +98,8 @@ func FlowsAgent(cfg *Config) (*Flows, error) {
func (f *Flows) Run(ctx context.Context) error {
alog.Info("starting Flows agent")
systemSetup()
tracedRecords, err := f.interfacesManager(ctx)
if err != nil {
return err
......
package agent
func systemSetup() {
}
package agent
import (
"github.com/cilium/ebpf/rlimit"
"github.com/sirupsen/logrus"
)
var slog = logrus.WithField("component", "systemSetup")
// systemSetup holds some system-dependent initialization processes
func systemSetup() {
	if err := rlimit.RemoveMemlock(); err != nil {
		slog.WithError(err).
			Warn("can't remove mem lock. The agent might not be able to start eBPF programs")
	}
}
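To see what the memlock warning refers to, one option is to inspect the process's current `RLIMIT_MEMLOCK` values. The sketch below is an illustration only and is not how the Agent or `cilium/ebpf` implements `RemoveMemlock`: it assumes Linux, hard-codes the resource number `8` because the standard `syscall` package does not export `RLIMIT_MEMLOCK`, and uses a hypothetical helper named `describeLimit`.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// rlimitMemlock is the Linux resource number for RLIMIT_MEMLOCK.
// It is hard-coded as an assumption because the standard syscall
// package does not export this constant.
const rlimitMemlock = 8

// describeLimit (hypothetical helper) renders a limit value, treating the
// all-ones value (RLIM_INFINITY on 64-bit Linux) as "unlimited".
func describeLimit(v uint64) string {
	if v == ^uint64(0) {
		return "unlimited"
	}
	return fmt.Sprintf("%d bytes", v)
}

func main() {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(rlimitMemlock, &rl); err != nil {
		fmt.Fprintln(os.Stderr, "reading memlock limit:", err)
		os.Exit(1)
	}
	fmt.Printf("memlock soft=%s hard=%s\n",
		describeLimit(uint64(rl.Cur)), describeLimit(uint64(rl.Max)))
}
```

A small, finite limit printed here, combined with the warning above, would suggest that the process was not allowed to raise the limit, which is one reason the `SYS_RESOURCE` capability is listed among the requirements.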
......@@ -11,7 +11,6 @@ import (
"time"
"github.com/cilium/ebpf/ringbuf"
"github.com/cilium/ebpf/rlimit"
"github.com/netobserv/netobserv-ebpf-agent/pkg/flow"
"github.com/sirupsen/logrus"
"github.com/vishvananda/netlink"
......@@ -53,11 +52,6 @@ func NewFlowTracer(iface string, sampling uint32) *FlowTracer {
// before exiting.
func (m *FlowTracer) Register() error {
ilog := log.WithField("iface", m.interfaceName)
// Allow the current process to lock memory for eBPF resources.
// TODO: manually invoke unix.Prlimit with lower/reasonable rlimit
if err := rlimit.RemoveMemlock(); err != nil {
return fmt.Errorf("removing mem lock: %w", err)
}
// Load pre-compiled programs and maps into the kernel, and rewrites the configuration
spec, err := loadBpf()
if err != nil {
......