Hardening Linux Containers: LXC/LXD Security Profiles for Production Workloads

Linux containers — whether managed by LXC, LXD, or Proxmox PCT — offer significantly lower overhead than full virtual machines. That efficiency comes with a different security model. Containers share the host kernel; a kernel vulnerability or privileged container escape is a full host compromise, not just a guest compromise. Hardening containers is not optional for production workloads: it is the difference between containers as a liability and containers as a safe deployment primitive.

This article covers unprivileged container configuration, AppArmor and seccomp profiles, resource limits, network isolation, and monitoring integration with Wazuh. All examples use Proxmox PCT (which wraps LXC) but the underlying concepts apply to any LXC/LXD environment.

Privileged vs. Unprivileged Containers

This is the most important security decision when creating a container. A privileged container runs with UID 0 inside the container mapped to UID 0 on the host: if a process escapes the container namespace, it is root on the host. An unprivileged container uses UID mapping: UID 0 inside the container maps to an unprivileged UID (e.g., UID 100000) on the host, so an escaped process has no special privileges there.
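Before hardening anything else, it is worth auditing an existing fleet for privileged containers. A minimal sketch, using sample files under /tmp in place of the real /etc/pve/lxc directory (CTIDs are illustrative):

```shell
# Sample configs standing in for /etc/pve/lxc/*.conf on a real host
mkdir -p /tmp/pve-lxc
printf 'unprivileged: 1\ncores: 2\n' > /tmp/pve-lxc/101.conf
printf 'cores: 4\nmemory: 2048\n'    > /tmp/pve-lxc/102.conf

# grep -L lists files that do NOT contain the flag, i.e. privileged CTs:
grep -L 'unprivileged: 1' /tmp/pve-lxc/*.conf
# → /tmp/pve-lxc/102.conf
```

On a Proxmox host, point the same grep at /etc/pve/lxc to get the list of containers that still need converting.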

Create unprivileged containers by default. In Proxmox, leave the “Unprivileged container” checkbox checked at creation time (it is the default). Converting an existing privileged container is not a simple config flip, since on-disk file ownership must be remapped; the supported route is a backup and restore with the unprivileged flag set (pct restore <CTID> <archive> --unprivileged 1). The relevant settings in the container’s configuration file (/etc/pve/lxc/<CTID>.conf):

# /etc/pve/lxc/102.conf
unprivileged: 1
lxc.idmap: u 0 100000 65536
lxc.idmap: g 0 100000 65536

The UID map means container UIDs 0-65535 map to host UIDs 100000-165535. A process running as root (UID 0) in the container appears as UID 100000 on the host — an unprivileged user with no special access.

Unprivileged containers have minor operational constraints: bind-mounting host directories owned by root requires mapping adjustments, and some kernel interfaces may behave differently. These are solvable. The security gain is substantial and non-negotiable for multi-tenant or production deployments.
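The bind-mount adjustment usually amounts to shifting ownership of the host-side path into the mapped range. A sketch, with illustrative paths and the 100000 offset from the idmap above:

```
# /etc/pve/lxc/102.conf — bind-mount a host path into the container at /data
mp0: /srv/ct-data,mp=/data

# On the host, shift ownership to the container's mapped root UID first,
# so that root inside the container can write to it:
#   chown -R 100000:100000 /srv/ct-data
```

Container UID N appears on the host as 100000 + N, so files meant for a specific in-container user (e.g., www-data, UID 33) would be chowned to 100033 instead.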

AppArmor Profiles

AppArmor provides mandatory access control at the kernel level. LXC ships with a default AppArmor profile (lxc-container-default) that restricts the most dangerous kernel interfaces. For production workloads, start with the default profile and add restrictions specific to your workload.

Check the active AppArmor profile for a running container:

aa-status | grep lxc

The default LXC profile blocks:

  • Loading kernel modules (CAP_SYS_MODULE)
  • Mounting filesystems (CAP_SYS_ADMIN mount operations)
  • Access to /proc/sysrq-trigger and /proc/sys/kernel
  • Raw socket creation in most contexts

For a web server container that needs no special kernel access, you can layer additional restrictions:

# /etc/apparmor.d/lxc/lxc-webserver
profile lxc-webserver flags=(attach_disconnected,mediate_deleted) {
  #include <abstractions/base>
  #include <abstractions/lxc/container-base>

  # Deny all network operations except TCP/UDP
  deny network raw,
  deny network packet,

  # Deny ptrace on other processes
  deny ptrace (read, trace),

  # Deny writes to /proc and /sys except approved paths
  deny /proc/sys/** w,
  deny /sys/** w,
}

Load the profile and assign it to the container in the LXC config:

apparmor_parser -r /etc/apparmor.d/lxc/lxc-webserver

# In /etc/pve/lxc/102.conf:
lxc.apparmor.profile: lxc-webserver

Seccomp Filtering

Seccomp (Secure Computing Mode) restricts which system calls a process can make. The attack surface of the Linux kernel is largely its syscall interface — limiting the syscalls available to a container dramatically reduces the kernel attack surface.

LXC applies a default seccomp profile that blocks a curated list of dangerous syscalls. The blocked syscalls include:

  • kexec_load — replace the running kernel
  • open_by_handle_at — bypass path-based access controls (used in Docker breakouts)
  • init_module, finit_module — load kernel modules
  • ptrace — limited to same UID by default
  • mount — prevented for non-CAP_SYS_ADMIN processes

View the default seccomp profile applied by LXC:

cat /usr/share/lxc/config/common.seccomp

For containers that need further restriction, create a custom seccomp profile using the allowlist approach. The abbreviated example below covers only basic file I/O; a real service would also need its network and process-management syscalls enumerated:

2
allowlist
read
write
open
openat
close
stat
fstat
lstat
poll
lseek
mmap
mprotect
munmap
exit_group

The number at the top is the version of LXC's seccomp policy format (version 2 adds per-architecture sections and argument filtering; the kernel enforces the result as a BPF filter). Allowlist profiles are more restrictive than blocklist profiles but require careful enumeration of every syscall the application uses. Use strace -f to record syscalls during normal operation before building an allowlist, then point the container at the profile with lxc.seccomp.profile in its config.
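The recorded trace can then be reduced to a candidate allowlist. A sketch over a saved log (the inline sample stands in for real strace output; with strace -f, strip the leading PID column first):

```shell
# Sample strace output standing in for a real trace file
cat > /tmp/trace.log <<'EOF'
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF", 832) = 832
close(3) = 0
openat(AT_FDCWD, "/var/www/index.html", O_RDONLY) = 4
EOF

# The syscall name is everything before the first '(' on each line:
grep -oE '^[a-z0-9_]+' /tmp/trace.log | sort -u
# → close, openat, read (one per line)
```

Run the workload through its full lifecycle (startup, normal traffic, log rotation, shutdown) before trusting the list; a syscall missing from the allowlist kills the process with SIGSYS by default.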

Capability Dropping

Linux capabilities decompose root privileges into discrete units. Even a process running as UID 0 can be stripped of dangerous capabilities. For a typical web service container:

# /etc/pve/lxc/102.conf
# Drop all capabilities, add back only what's needed
lxc.cap.drop: sys_module
lxc.cap.drop: sys_rawio
lxc.cap.drop: sys_pacct
lxc.cap.drop: sys_admin
lxc.cap.drop: sys_nice
lxc.cap.drop: sys_resource
lxc.cap.drop: sys_time
lxc.cap.drop: sys_tty_config
lxc.cap.drop: mknod
lxc.cap.drop: audit_write
lxc.cap.drop: audit_control
lxc.cap.drop: mac_override
lxc.cap.drop: mac_admin

For the strictest posture, invert the approach with lxc.cap.keep: everything not explicitly listed is dropped. Note that lxc.cap.drop and lxc.cap.keep are mutually exclusive in LXC; use one or the other:

# Keep only specific capabilities; all others are dropped.
# If the container needs to bind to port 80/443:
lxc.cap.keep: net_bind_service
# If it also needs to manage its own networking:
# lxc.cap.keep: net_admin
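To verify the result, inspect the capability sets of a process running inside the container (via pct enter or pct exec). /proc/self/status works on any Linux system; the decode step assumes libcap's capsh is installed:

```shell
# CapEff is the effective set; CapBnd is the bounding set, i.e. the mask
# of capabilities this process and its descendants may ever acquire.
grep -E 'Cap(Eff|Bnd)' /proc/self/status

# With libcap installed, decode the hex mask into capability names:
#   capsh --decode=$(awk '/CapBnd/ {print $2}' /proc/self/status)
```

A fully stripped container should show a bounding set with only the bits you kept; a mask of all f's means the drop/keep lines did not take effect.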

Resource Limits

Resource limits serve two purposes: preventing runaway workloads from affecting host stability (availability), and limiting the blast radius of a compromised container (security — a crypto miner is limited in its impact).

Proxmox PCT resource configuration:

# /etc/pve/lxc/102.conf
cores: 2
cpulimit: 1.5          # Max 1.5 CPU cores
cpuunits: 1024         # CPU scheduler weight
memory: 1024           # 1 GB RAM limit
swap: 512              # 512 MB swap
rootfs: local-lvm:8    # 8 GB root disk

For finer-grained control, cgroup v2 limits via LXC config:

# Limit total processes (prevents fork bombs)
lxc.cgroup2.pids.max: 200

# IO weight (cgroup v2 uses the io controller; blkio is cgroup v1)
lxc.cgroup2.io.weight: 100
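For hard bandwidth caps rather than proportional weights, cgroup v2's io.max takes per-device limits. A sketch; the 253:0 major:minor device number is an assumption, look up the real one for your storage with lsblk:

```
# Cap reads on device 253:0 to 50 MB/s and writes to 25 MB/s
lxc.cgroup2.io.max: 253:0 rbps=52428800 wbps=26214400
```

The io.max value string also accepts riops= and wiops= for IOPS limits, which are often more useful than byte limits on SSD-backed storage.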

Also configure ulimits inside the container for per-process limits:

# /etc/security/limits.d/container-limits.conf (inside the container)
*    hard    nproc     100
*    hard    nofile    65535
*    hard    fsize     1048576
www-data  hard  nproc  50

Network Isolation

Each container should have a dedicated network interface connected to an appropriate bridge or VLAN. Avoid placing containers on the same bridge as the host’s primary interface.

# /etc/pve/lxc/102.conf
net0: name=eth0,bridge=vmbr20,firewall=1,hwaddr=BC:24:11:xx:xx:xx,ip=10.0.20.15/24,gw=10.0.20.1,tag=20

Enable the Proxmox container firewall (firewall=1) and configure rules in the Proxmox firewall GUI or via /etc/pve/firewall/102.fw:

[RULES]
IN ACCEPT -p tcp -dport 22 -source 10.0.10.0/24   # SSH from management only
IN ACCEPT -p tcp -dport 443                       # HTTPS from anywhere
IN DROP                                           # Default deny inbound
OUT ACCEPT -p tcp -dport 443                      # HTTPS outbound allowed
OUT ACCEPT -p udp -dport 53                       # DNS
OUT DROP                                          # Default deny outbound
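The default-deny intent can also be expressed as policies in the [OPTIONS] section of the same file, instead of (or in addition to) trailing DROP rules; these are standard Proxmox guest firewall options:

```
[OPTIONS]
enable: 1
policy_in: DROP
policy_out: DROP
```

Setting the policy centrally keeps the [RULES] section down to explicit allows, which is easier to audit than interleaved accept/drop rules.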

SSH Hardening Inside Containers

If the container runs an SSH server, harden it beyond defaults:

# /etc/ssh/sshd_config (inside the container)
PermitRootLogin no
PasswordAuthentication no
ChallengeResponseAuthentication no
UsePAM yes
AllowUsers deploy-user
MaxAuthTries 3
ClientAliveInterval 300
ClientAliveCountMax 2
LoginGraceTime 30
Banner /etc/ssh/banner

Use key-based authentication exclusively. Distribute authorized keys via configuration management rather than manually. For bastion-accessed containers, consider not running SSH at all and accessing the container console via pct enter <CTID> from the Proxmox host or through the web console.
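Drift from this baseline is easy to catch with a directive check. A sketch over a sample file; in practice, point it at the container's real /etc/ssh/sshd_config:

```shell
# Sample sshd_config standing in for the container's real file
cat > /tmp/sshd_config <<'EOF'
PermitRootLogin no
PasswordAuthentication no
MaxAuthTries 3
EOF

# Fail loudly if any required directive is missing or has the wrong value
# (grep -qx matches the whole line exactly):
for directive in 'PermitRootLogin no' 'PasswordAuthentication no' 'MaxAuthTries 3'; do
  grep -qx "$directive" /tmp/sshd_config || echo "MISSING: $directive"
done
```

A check like this slots naturally into a configuration-management run or a cron-driven compliance script, alongside `sshd -t` to validate syntax before reloading.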

Wazuh Monitoring Integration

Deploy the Wazuh agent inside each container to provide file integrity monitoring, log analysis, and vulnerability assessment. The agent reports to the Wazuh manager on the SIEM host.

# Inside the container: install the Wazuh agent (apt-key is deprecated;
# use a dedicated keyring referenced via signed-by instead)
curl -s https://packages.wazuh.com/key/GPG-KEY-WAZUH | \
  gpg --dearmor -o /usr/share/keyrings/wazuh.gpg
echo "deb [signed-by=/usr/share/keyrings/wazuh.gpg] https://packages.wazuh.com/4.x/apt/ stable main" \
  > /etc/apt/sources.list.d/wazuh.list
apt-get update && apt-get install -y wazuh-agent

# Configure the manager address
sed -i 's/MANAGER_IP/siem01.management.example-corp.com/' \
  /var/ossec/etc/ossec.conf

systemctl enable --now wazuh-agent

Configure Wazuh FIM (File Integrity Monitoring) to watch critical paths inside the container:

<syscheck>
  <directories check_all="yes" realtime="yes">/etc</directories>
  <directories check_all="yes" realtime="yes">/usr/bin</directories>
  <directories check_all="yes" realtime="yes">/usr/sbin</directories>
  <directories check_all="yes">/var/www</directories>
  <ignore>/etc/mtab</ignore>
  <ignore>/etc/mnttab</ignore>
</syscheck>

Wazuh will alert on unauthorized file modifications, new SUID binaries, and configuration changes — critical signals for detecting container compromise.

Audit Checklist Before Production Deployment

  • Container is unprivileged (unprivileged: 1 in config)
  • AppArmor profile assigned and active
  • Default seccomp profile confirmed, custom profile if needed
  • Dangerous capabilities dropped
  • CPU, memory, and PID limits configured
  • Dedicated network interface with firewall rules applied
  • SSH: key-only auth, root login disabled, banner configured
  • Wazuh agent installed and reporting to SIEM
  • No sensitive secrets in environment variables visible via /proc/<pid>/environ
  • Automated backup snapshot configured (Proxmox scheduled backup)

Conclusion

Container security is not a single setting — it is a layered composition of kernel-level controls. Unprivileged containers eliminate the most dangerous escape path. AppArmor and seccomp restrict kernel interface exposure. Capability dropping narrows what a process can do even as root within the container. Resource limits bound the blast radius of compromise or runaway workloads. Network isolation prevents lateral movement if a container is compromised. And continuous monitoring via Wazuh closes the detection gap that prevention controls leave open.

Apply these controls as defaults for every new container, not as a retrofit when something goes wrong. Configuration management (Puppet, Ansible) can enforce the baseline across a fleet, catching drift before it becomes a vulnerability.
