Alban Crequy authored
In previous patches, I added the option --cgroupmap to filter events belonging to a set of cgroup-v2 cgroups. Although this approach works fine with systemd services and containers when cgroup-v2 is enabled, it does not work with containers when only cgroup-v1 is enabled, because bpf_get_current_cgroup_id() only works with cgroup-v2. It also requires Linux 4.18 to get this bpf helper function.

This patch adds an additional way to filter by containers, using mount namespaces. Note that this does not help with systemd services, since they normally don't create a new mount namespace (unless you set some options like 'ReadOnlyPaths=', see "man 5 systemd.exec").

My goal with this patch is to filter Kubernetes pods, even on distributions with an older kernel (<4.18) or without cgroup-v2 enabled.

- This is only implemented for tools that already support filtering by cgroup id (bindsnoop, capable, execsnoop, profile, tcpaccept, tcpconnect, tcptop and tcptracer).
- I picked the mount namespace because the other namespaces could be disabled in Kubernetes (e.g. HostNetwork, HostPID, HostIPC).

It can be tested by following the example in docs/special_filtering added in this commit. To avoid compiling locally, the following commands can be used:

```
sudo bpftool map create /sys/fs/bpf/mnt_ns_set type hash key 8 value 4 \
  entries 128 name mnt_ns_set flags 0

docker run -ti --rm --privileged \
  -v /usr/src:/usr/src -v /lib/modules:/lib/modules \
  -v /sys/fs/bpf:/sys/fs/bpf --pid=host kinvolk/bcc:alban-containers-filters \
  /usr/share/bcc/tools/execsnoop --mntnsmap /sys/fs/bpf/mnt_ns_set
```

Co-authored-by: Alban Crequy <alban@kinvolk.io>
Co-authored-by: Mauricio Vásquez <mauricio@kinvolk.io>
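
The map created above starts out empty, so a mount namespace has to be added to it before the filter can match anything. The following is a minimal sketch of how such an entry might be prepared, not the procedure from docs/special_filtering. It assumes that the 8-byte key is the mount namespace inode number stored as a u64 in host byte order, and that the 4-byte value is unused. The helper names (`inode_to_key_hex`, `mntns_inode`, `bpftool_update_command`) are hypothetical and exist only for this illustration.

```python
import os
import struct


def inode_to_key_hex(ino):
    """Format an inode number as 8 space-separated hex bytes.

    Assumption: the map key (size 8 in the bpftool command) is a u64
    in host byte order holding the mount namespace inode number.
    """
    raw = struct.pack("=Q", ino)  # "=Q": native byte order, unsigned 64-bit
    return " ".join(f"{b:02x}" for b in raw)


def mntns_inode(pid="self"):
    """Return the mount namespace inode number of a process (Linux only)."""
    # os.stat follows the /proc/<pid>/ns/mnt link to the namespace object.
    return os.stat(f"/proc/{pid}/ns/mnt").st_ino


def bpftool_update_command(ino, map_path="/sys/fs/bpf/mnt_ns_set"):
    """Build a bpftool command line that inserts the inode into the pinned map."""
    key = inode_to_key_hex(ino)
    # Assumption: the 4-byte value is a placeholder, so it is set to zero.
    return (
        f"bpftool map update pinned {map_path} "
        f"key hex {key} value hex 00 00 00 00 any"
    )
```

Calling `print(bpftool_update_command(mntns_inode()))` from inside the container to be traced would print a command that can then be run as root on the host. Only processes in mount namespaces present in the map should then be reported by a tool started with --mntnsmap.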
32ab8583