Skip to main content

2 posts tagged with "security"

View All Tags

· 5 min read
Barun Acharya

LSM hooks in Linux Kernel mediates access to internal kernel objects such as inodes, tasks, files, devices, and IPC. LSMs, in general, refer to these generic hooks added in the core kernel code. Further, security modules could make use of these generic hooks to implement enhanced access control as independent kernel modules. AppArmor, SELinux, Smack, TOMOYO are examples of such independent kernel security modules.

LSM seeks to allow security modules to answer the question "May a subject S perform a kernel operation OP on an internal kernel object OBJ?"

LSMs can drastically reduce the attack surface of a system if appropriate policies using security modules are implemented.

DACs vs. MACs

DAC (Discretionary Access Control) based access control is a means of restricting access to objects based on the identity of subjects or groups. For decades, Linux only had DAC-based access controls in the form of user and group permissions. One of the problems with DACs is that the primitives are transitive in nature. A user who is a privileged user could create other privileged users, and that user could have access to restricted objects.

With MACs (Mandatory Access Control), the subjects (e.g., users, processes, threads) and objects (e.g., files, sockets, memory segments) each have a set of security attributes. These security attributes are centrally managed through MAC policies. In the case of MAC, the user/group does not make any access decision, but the access decision is managed by security attribute.

LSMs are a form of MAC-based controls.

LSM Hooks

LSM mediates access to kernel objects by placing hooks in the kernel code just before the access.

It can be seen here that the LSM hooks are applied after the DAC and other sanity checks are performed.

Here it is shown that the LSM hooks are applied in core objects, and these hooks are dereferenced using a global hooks table. These global hooks are added ( e.g., check apparmor hooks when the security module is initialized.

TOCTOU problem handling

LSMs are typically used for a system's policy enforcement. One school of thought is that the enforcement can be handled in an asynchronous fashion, i.e., the kernel audit events could pass the alert to userspace, and then the userspace could enforce the decision asynchronously.

Such an approach has several issues, i.e., the asynchronous nature might result in the malicious actor causing the actual damage before the actor could be identified. For example, if the unlink() of a file object is to be blocked, the asynchronous nature might result in the unlink getting successful before the attack could be blocked.

LSM hooks are applied inline to the kernel code processing; the kernel has the security context and other details of the object while making the decision inline. Thus the enforcement is inline to the access attempt, and any blocking/denial action can be performed without TOCTOU problems.

Security Modules currently defined in Linux kernel


./security/smack/smack_lsm.c:4926:DEFINE_LSM(smack) = {
./security/tomoyo/tomoyo.c:588:DEFINE_LSM(tomoyo) = {
./security/loadpin/loadpin.c:246:DEFINE_LSM(loadpin) = {
./security/commoncap.c:1468:DEFINE_LSM(capability) = {
./security/selinux/hooks.c:7387:DEFINE_LSM(selinux) = {
./security/bpf/hooks.c:30:DEFINE_LSM(bpf) = {
./security/safesetid/lsm.c:264:DEFINE_LSM(safesetid_security_init) = {
./security/lockdown/lockdown.c:163:DEFINE_LSM(lockdown) = {
./security/integrity/iint.c:174:DEFINE_LSM(integrity) = {
./security/yama/yama_lsm.c:485:DEFINE_LSM(yama) = {
./security/apparmor/lsm.c:1905:DEFINE_LSM(apparmor) = {

In the above list, AppArmor and SELinux are undoubtedly the most widely used. AppArmor is relatively easier to use, but SELinux provides the greater intensive and fine-grained policy specification. Linux POSIX.1e capabilities logic is also implemented as a security module.

There can be multiple security modules used at the same time. This is true in most cases; the capabilities module is always loaded alongside SELinux or any other LSM. The capabilities security module is always ordered first in execution (controlled using .order = LSM_ORDER_FIRST flag).

Stackable vs Non-Stackable LSMs

Note that AppArmor, SELinux, and Smack security modules initialize themselves as exclusive (LSM_FLAG_EXCLUSIVE) security modules. There cannot be two security modules in the system with LSM_FLAG_EXCLUSIVE flag set. Thus, this means that one cannot have any two of the following (SELinux, AppArmor, Smack) security modules registered simultaneously.

BPF-LSM is a stackable LSM and thus can be used alongside AppArmor or SELinux.

Permissive hooks in LSMs

Certain POSIX-compliant filesystems depend on the ability to grant accesses that would ordinarily be denied at a coarse level (DAC level) of granularity (check capabilities man page for CAP_DAC_OVERRIDE). LSM supports DAC override (a.k.a., permissive hooks) for particular objects such as POSIX-compliant filesystems, where the security module can grant access the kernel was about to deny.

Security Modules: A general critique

LSMs, as generic MAC-based security primitives, are very powerful. The security modules allow the administrator to impose additional restrictions on the system to reduce the attack surface. However, if the security module policy specification language is hard to understand/debug, the administrator usually takes a stance of disabling it altogether, thus imposing friction in adoption.


  1. Linux Security Modules: General Security Support for the Linux Kernel, Wright & Cowan et al., 2002

· 5 min read
Barun Acharya

A few months back I presented at Cloud Native eBPF Day Europe 2022 about Armoring Cloud Native Workloads with BPF LSM and planted a thought about building a holistic tool for runtime security enforcement leveraging BPF LSM. I have spent the past few weeks collaborating with the rest of the team at KubeArmor to realize that thought. This blog post will explore the why’s and how’s of implementing security enforcement as part of KubeArmor leveraging BPF LSM superpowers at its core.


Linux Security Modules provides with security hooks necessary to set up the least permissive perimeter for various workloads. A nice introduction to LSMs here.

KubeArmor is a cloud-native runtime security enforcement system that leverages these LSMs to secure the workloads.

LSMs are really powerful but they weren’t built with modern workloads including Containers and Orchestrators in mind. Also, the learning curve of their policy language seems to be steep thus imposing friction in adoption.

eBPF has provided us with the ability to safely and efficiently extend the kernel’s capabilities without requiring changes to kernel source code or loading kernel modules.

BPF LSM leverages the powerful LSM framework while providing us with the ability to load our custom programs with decision-making into the kernel seamlessly helping us protect modern workloads with enough context while we can choose to keep the interface easy to understand and user-friendly.

KubeArmor already integrates with AppArmor and SELinux and has a set of tools and utilities providing a seamless experience for enforcing security but these integrations come with their own set of complexities and limitations. Thus the need to integrate with BPF LSM would provide us with fine-grained control over the LSM hooks.


The Implementation can be conveyed by the following tales:

  1. Map to cross boundaries - Establishing the interface between KubeArmor daemon (Userspace) and BPF Programs (KernelSpace)
  2. Putting Security on the Map - Handling Policies in UserSpace and Feeding them into the Map
  3. Marshal Law - Enforcing Policies in the KernelSpace

Map to cross boundaries🗺️

There’s a clear boundary between the territories of KernelSpace and the Userspace, so how do we establish the routes between these two.

We leverage eBPF Maps for establishing the interface between the KubeArmor daemon (Userspace) and BPF Programs (KernelSpace). As described in the kernel doc,

‘maps’ is a generic storage of different types for sharing data between kernel and userspace.

It seems apt to help us navigate here 😁

For each container KubeArmor needs to protect or how I like to term it ‘Armor Up’ the workload, we create an entry in the global BPF Hash of Maps pinned to the BPF filesystem under /sys/fs/bpf/kubearmor_containers, this entry has a value to another BPF Hashmap which has all the details of policies that are needed to enforce.

Map Interface

Putting Security on the Map📍

KubeArmor Security Policies have a lot of metadata, We cannot put all that to Maps and let the BPF Program navigate those complexities.

For instance, we usually map the security policies through labels associated with Pods and Containers. But we can’t send that to the BPF Program, let alone labels eBPF programs wouldn’t handle container names/IDs as well. So KubeArmor extracts information from Kubernetes and CRI (Docker/Containerd/CRI-O) APIs and simplifies it to something we can extract in the eBPF Program as well.

struct key {
u32 pid_ns;
u32 mnt_ns;

struct containers {
__uint(type, BPF_MAP_TYPE_HASH_OF_MAPS);
__uint(max_entries, X);
__uint(key_size, sizeof(struct key)); // 2*u32
__uint(value_size, sizeof(u32)); // Rule Map File Descriptor
__uint(pinning, LIBBPF_PIN_BY_NAME); // Created in Userspace, Identified in Kernel Space using pinned name

Similarly, KubeArmor can receive conflicting policies, we need to handle and resolve them as part of KubeArmor Userspace program before putting them on the Map.

After all the rule simplification, conflict resolution, and handling of policy updates, We send the data to the eBPF Map prepping the BPF Programs to get ready for enforcement.

Marshal Law

We finally marshal all the data in the kernel space and impose MAC (Military Access Control? 🔫 Pun intended :P ).

In the kernel space where our BPF LSM Programs reside, In each program, we extract the main entity i.e. File Path, Process Path, or Network Socket/Protocol, and pair them up with their parent process paths and look up in the respective maps. We do the decision making based on these lookup values.

There are a fair bit of complexities involved here which I have skipped, if you are interested in them, check out Design Doc and Github Pull Request.

I would also like to credit How systemd extended security features with BPF LSM which acted as an inspiration for the implementation design.

Armoring Up

The environment requires a kernel >= 5.8 configured with CONFIG_BPF_LSM, CONFIG_DEBUG_INFO_BTF and the BPF LSM enabled (via CONFIG_LSM="...,bpf" or the "lsm=...,bpf" kernel boot parameter).

You can use the daemon1024/kubearmor:bpflsm image and follow the Deployment Guide to try it out.

A sample alert here showing enforcement through BPF LSM: BPF

Next Steps

We have unraveled just a drop of what BPF LSM is capable of, and we plan to extend our security features to lots of other use cases and BPF LSM would play an important role in it.

Near future plans include supporting wild cards in Policy Rules and doing in-depth performance analysis and optimizing the implementation.

That sums up my journey to implement security enforcement leveraging BPF LSM at its core. It was a lot of fun and I learned a lot. Hope I was able to share my learnings 😄

If you have any feedback about the design and implementation feel free to comment on the Github PR. If you have any suggestions/thoughts/questions in general or just wanna say hi, my contact details are here ✌️