A reverse proxy is a single point of failure unless you engineer redundancy into the design from day one. Apache HTTP Server, when combined with keepalived’s VRRP (Virtual Router Redundancy Protocol) implementation, delivers active-passive failover that completes within a few seconds, behind a shared virtual IP that clients never need to update. This guide walks through a production-ready dual-Apache setup with keepalived, covering everything from initial configuration to split-brain prevention.
Architecture Overview
The setup consists of two Apache servers, proxy01 (10.0.1.10) and proxy02 (10.0.1.11), sharing a virtual IP (VIP) of 10.0.1.100. All external traffic targets the VIP. Under normal conditions proxy01 holds the VIP (MASTER state). If proxy01 fails its health check or goes offline, keepalived on proxy02 detects the failure within a few seconds and promotes itself to MASTER, claiming the VIP via a gratuitous ARP broadcast.
DNS for your public-facing domains points to the upstream load balancer or NAT rule, which forwards to the VIP. Clients are never aware of the underlying failover.
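The detection time for a hard failure of proxy01 can be estimated from the VRRP timers. The sketch below assumes VRRP version 2 timing as defined in RFC 3768 (the skew and master-down formulas are in the comments) and uses the advert_int and priority values from this guide; the function name is illustrative only.

```shell
#!/bin/bash
# VRRPv2 timing (RFC 3768), assumed here:
#   skew_time            = (256 - priority) / 256 seconds
#   master_down_interval = 3 * advert_int + skew_time
master_down_interval() {
    local advert_int="$1" priority="$2"
    awk -v a="$advert_int" -v p="$priority" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

# proxy02 in this guide: advert_int 1, priority 100
master_down_interval 1 100
```

A failed health check takes a different path: the check has to fail `fall` times at `interval` seconds before the priority drops, so that case is governed by the vrrp_script settings rather than by this timer alone.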
Prerequisites
- Two servers on the same L2 segment (VIP failover relies on ARP)
- Apache 2.4 with mod_proxy, mod_proxy_http, mod_proxy_wstunnel, mod_ssl, mod_headers, and mod_rewrite enabled
- keepalived 2.2+ installed on both nodes
- Identical SSL certificates deployed to both nodes (or a shared NFS mount)
- Firewall rules allowing VRRP (protocol 112) between the two servers
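The module prerequisite can be checked mechanically before going further. This is a minimal sketch: it assumes the module identifiers that `httpd -M` normally prints, and it also looks for headers_module and rewrite_module because the virtual hosts in this guide use Header and RewriteRule. The function and variable names are made up for illustration.

```shell
#!/bin/bash
# Module identifiers as printed by `httpd -M` (assumed naming).
REQUIRED_MODULES="proxy_module proxy_http_module proxy_wstunnel_module ssl_module headers_module rewrite_module"

# Reads `httpd -M` style output on stdin and prints every required module that is absent.
missing_modules() {
    local loaded mod
    loaded="$(cat)"
    for mod in $REQUIRED_MODULES; do
        # -w avoids matching proxy_module inside proxy_http_module
        printf '%s\n' "$loaded" | grep -qw -- "$mod" || echo "$mod"
    done
}

# Usage on a proxy node:
#   httpd -M 2>/dev/null | missing_modules
```

Empty output means every module is loaded; any printed name needs to be enabled on that node.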
Apache Configuration
The Apache configuration should be identical on both nodes. Use configuration management (Puppet, Ansible) to enforce this — configuration drift is the most common source of post-failover surprises.
<VirtualHost *:443>
ServerName app.example-corp.com
SSLEngine on
SSLCertificateFile /etc/pki/tls/certs/example-corp.crt
SSLCertificateKeyFile /etc/pki/tls/private/example-corp.key
# Deprecated since Apache 2.4.8: the chain can instead be appended to the SSLCertificateFile
SSLCertificateChainFile /etc/pki/tls/certs/example-corp-chain.crt
SSLProtocol -all +TLSv1.2 +TLSv1.3
SSLCipherSuite ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305
SSLHonorCipherOrder off
Header always set Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"
Header always set X-Content-Type-Options "nosniff"
Header always set X-Frame-Options "DENY"
Header always set Referrer-Policy "strict-origin-when-cross-origin"
ProxyPreserveHost On
ProxyTimeout 60
ProxyPass /api/ http://backend01.internal:3001/api/
ProxyPassReverse /api/ http://backend01.internal:3001/api/
ProxyPass / http://frontend01.internal:3000/
ProxyPassReverse / http://frontend01.internal:3000/
ErrorLog /var/log/httpd/app-error.log
CustomLog /var/log/httpd/app-access.log combined
</VirtualHost>
<VirtualHost *:80>
ServerName app.example-corp.com
RewriteEngine On
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L]
</VirtualHost>
Also harden the global Apache configuration:
ServerTokens Prod
ServerSignature Off
TraceEnable Off
FileETag None
Timeout 60
KeepAliveTimeout 15
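Configuration parity between the two nodes can be spot-checked with a directory digest. The sketch below is not a replacement for configuration management; it assumes GNU coreutils and findutils on both nodes, and `config_digest` is a hypothetical helper name.

```shell
#!/bin/bash
# Prints one SHA-256 digest summarising every regular file under a directory.
# File paths are hashed relative to the directory, so two nodes compare equal
# only when both names and contents match.
config_digest() {
    ( cd "$1" &&
      find . -type f -print0 | LC_ALL=C sort -z |
      xargs -0 -r sha256sum | sha256sum | cut -d' ' -f1 )
}

# Usage: run on each node and compare the two values, for example
#   config_digest /etc/httpd/conf.d
```

If the digests differ, a recursive diff of the two directories shows where the drift is.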
keepalived Configuration
On proxy01 (MASTER):
global_defs {
router_id PROXY01
enable_script_security
}
vrrp_script chk_apache {
script "/usr/bin/systemctl is-active --quiet httpd"
interval 2
weight -20
fall 2
rise 2
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 110
advert_int 1
authentication {
auth_type PASS
# keepalived uses only the first 8 characters of auth_pass
auth_pass ChangeMe
}
virtual_ipaddress {
10.0.1.100/24 dev eth0
}
track_script {
chk_apache
}
notify_master "/etc/keepalived/notify.sh MASTER"
notify_backup "/etc/keepalived/notify.sh BACKUP"
notify_fault "/etc/keepalived/notify.sh FAULT"
}
On proxy02 (BACKUP) — identical except state BACKUP, priority 100, and router_id PROXY02.
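Because only three values differ, the proxy02 file can be derived from proxy01's rather than maintained by hand. This is a rough sketch that assumes the exact values shown above (MASTER, 110, PROXY01); `to_backup_config` is an illustrative name, and a configuration management template would normally do this job.

```shell
#!/bin/bash
# Reads proxy01's keepalived.conf on stdin and writes the proxy02 variant to stdout.
# Each pattern is anchored to its keyword so lines such as notify_master are untouched.
to_backup_config() {
    sed -e 's/^\([[:space:]]*state[[:space:]]\{1,\}\)MASTER/\1BACKUP/' \
        -e 's/^\([[:space:]]*priority[[:space:]]\{1,\}\)110/\1100/' \
        -e 's/^\([[:space:]]*router_id[[:space:]]\{1,\}\)PROXY01/\1PROXY02/'
}

# Usage:
#   to_backup_config < keepalived.conf.proxy01 > keepalived.conf.proxy02
```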
The weight -20 on the Apache health script means that if httpd is not active, the effective priority drops from 110 to 90 — lower than proxy02’s 100 — triggering a failover even while the server itself is online.
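The priority arithmetic can be written out explicitly. This small sketch models only the negative-weight case used in this guide, where a failing check adds the (negative) weight to the base priority; the function name is hypothetical.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT CHECK_OK
# CHECK_OK is 1 when the tracked script succeeds, anything else when it fails.
effective_priority() {
    local base="$1" weight="$2" check_ok="$3"
    if [ "$check_ok" = "1" ]; then
        echo "$base"
    else
        echo $(( base + weight ))
    fi
}

effective_priority 110 -20 1   # httpd active on proxy01: prints 110
effective_priority 110 -20 0   # httpd stopped on proxy01: prints 90
```

Any weight whose magnitude exceeds the priority gap between the nodes (10 here) would trigger the same failover; -20 simply leaves some margin.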
Notification Script
Create /etc/keepalived/notify.sh to send alerts on state transitions:
#!/bin/bash
STATE=$1
HOSTNAME=$(hostname -s)
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
logger -t keepalived "VRRP state change: $HOSTNAME entered $STATE at $TIMESTAMP"
# Optionally POST to a webhook or send email; --max-time stops a dead endpoint from stalling the script
curl -s --max-time 5 -X POST https://alerts.example-corp.com/webhook \
-H 'Content-Type: application/json' \
-d "{\"host\":\"$HOSTNAME\",\"state\":\"$STATE\",\"time\":\"$TIMESTAMP\"}"
Make the script executable:
chmod 755 /etc/keepalived/notify.sh
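With enable_script_security set, keepalived is documented to refuse scripts that run as root when a non-root user could write to any part of their path. The sketch below checks only the permission bits of a single file, assuming GNU stat; ownership and parent directories would need the same scrutiny, and the function name is illustrative.

```shell
#!/bin/bash
# Succeeds when the file is not writable by group or others.
script_is_locked_down() {
    local mode
    mode="$(stat -c '%a' "$1")" || return 1
    # 022 is the octal mask for the group-write and other-write bits.
    [ $(( 0$mode & 022 )) -eq 0 ]
}

# Usage:
#   script_is_locked_down /etc/keepalived/notify.sh || echo "notify.sh is too permissive"
```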
SSL Certificate Synchronization
Both nodes must present identical certificates. Two approaches:
- Shared NFS mount: Store certs on an NFS share mounted read-only on both proxies. Simple, but the NFS server becomes a dependency.
- rsync on renewal: Use a post-renewal hook in your ACME client to rsync new certificates to the standby node and reload Apache there. More complex but eliminates the NFS dependency.
With Let’s Encrypt or step-ca, run the renewal on one node only (normally the master) so that the two proxies never end up holding different certificates. Configure its renewal hook to push the renewed certificate to the standby immediately and to reload Apache on both nodes.
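A post-renewal hook along these lines could implement the rsync approach. It is a sketch under several assumptions: root SSH access from the master to the standby, the certificate paths from the virtual host above, and hypothetical names (`STANDBY_HOST`, `DRY_RUN`, `run`, `push_certs`). The dry-run mode only prints the commands, which makes the hook safe to try first.

```shell
#!/bin/bash
STANDBY_HOST="${STANDBY_HOST:-proxy02}"   # assumed hostname of the standby node
DRY_RUN="${DRY_RUN:-0}"                   # set to 1 to print commands instead of running them

# Executes a command, or only prints it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Copies certificate, chain, and key to the standby, then reloads Apache on both nodes.
push_certs() {
    run rsync -a /etc/pki/tls/certs/example-corp.crt \
                 /etc/pki/tls/certs/example-corp-chain.crt \
                 "root@${STANDBY_HOST}:/etc/pki/tls/certs/" &&
    run rsync -a /etc/pki/tls/private/example-corp.key \
                 "root@${STANDBY_HOST}:/etc/pki/tls/private/" &&
    run ssh "root@${STANDBY_HOST}" systemctl reload httpd &&
    run systemctl reload httpd
}

# Usage from the ACME client's renewal hook:
#   push_certs
```

The standby is reloaded before the local node so that a failed copy stops the chain before either Apache picks up a half-deployed certificate set.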
Session Persistence
If your backend application is not fully stateless, you need session persistence to ensure that after a failover, existing sessions remain valid. Options:
- Shared session store: Store sessions in Redis or PostgreSQL. Both proxy nodes forward to the same backends, so failover is transparent to the application layer.
- mod_proxy_balancer with lbmethod=bybusyness (provided by mod_lbmethod_bybusyness): For backends that handle their own session stickiness via a shared cache, this distributes load effectively.
- Client-side sessions (JWT): Stateless JWTs are carried by the client and validated on each request, so there is no server-side session state to synchronize.
The shared session store approach is the most robust for HA deployments. Redis Sentinel or Redis Cluster can itself be made highly available.
Health Checks and Monitoring
Beyond keepalived’s Apache process check, implement application-level health checks:
<Location /health>
ProxyPass http://backend01.internal:3001/health
ProxyPassReverse http://backend01.internal:3001/health
# Restrict to internal monitoring systems
Require ip 10.0.0.0/8
</Location>
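Monitor the VIP itself from a vantage point outside the proxy pair. Because the /health location is restricted with Require ip 10.0.0.0/8, the monitoring host must sit inside that range or be added to the Require rule. A simple HTTP check against https://app.example-corp.com/health every 30 seconds gives you end-to-end validation that the full proxy-to-backend chain is functioning.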
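Such a check could look like the following, run from a monitoring host that the Require rule permits. It is a minimal sketch: the 5-second timeout and the rule that only 2xx counts as healthy are assumptions, and the function names are illustrative.

```shell
#!/bin/bash
HEALTH_URL="${HEALTH_URL:-https://app.example-corp.com/health}"

# Only 2xx responses count as healthy; curl reports 000 when it cannot connect.
status_is_healthy() {
    case "$1" in
        2[0-9][0-9]) return 0 ;;
        *)           return 1 ;;
    esac
}

# Prints the HTTP status code returned through the VIP.
probe() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$HEALTH_URL"
}

# Succeeds only when the full proxy-to-backend chain answers with 2xx.
check_health() {
    status_is_healthy "$(probe)"
}

# Usage from cron or a monitoring agent:
#   check_health || logger -t healthcheck "VIP health check failed"
```

A 403 from this probe usually means the monitoring host is outside the allowed address range rather than that the backend is down.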
Split-Brain Prevention
Split-brain occurs when both nodes simultaneously believe they are MASTER and both claim the VIP — resulting in an ARP conflict and unpredictable routing. Prevention strategies:
- VRRP authentication: The auth_pass directive ensures only legitimate peers participate in the election. Use a strong random password.
- Unicast VRRP: In environments where multicast is blocked or unreliable, configure keepalived to use unicast advertisements with explicit peer IP addresses.
- Network design: Ensure both nodes are on the same L2 broadcast domain with low-latency connectivity. VRRP over a routed path introduces risk.
- Fence the failed node: In critical environments, use IPMI/BMC-based fencing to power-cycle the failed node rather than just withdrawing the VIP.
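Prevention is worth pairing with detection. The sketch below counts how many nodes currently hold the VIP by inspecting `ip -o -4 addr show` output; it assumes a management host with SSH access to both proxies, and the helper names are hypothetical.

```shell
#!/bin/bash
VIP="${VIP:-10.0.1.100}"

# Reads `ip -o -4 addr show` output on stdin; succeeds if the VIP is configured.
holds_vip() {
    grep -qF " inet ${VIP}/"
}

# Maps the number of VIP holders to a verdict.
vip_state() {
    case "$1" in
        0) echo "no-master" ;;
        1) echo "ok" ;;
        *) echo "split-brain" ;;
    esac
}

# Usage from a management host:
#   holders=0
#   for node in proxy01 proxy02; do
#       ssh "$node" ip -o -4 addr show dev eth0 | holds_vip && holders=$((holders + 1))
#   done
#   vip_state "$holders"
```

The trailing slash in the match keeps 10.0.1.10 from being mistaken for 10.0.1.100.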
Testing Failover
Test failover regularly — ideally in a staging environment that mirrors production, and at least quarterly in production with a maintenance window:
# On proxy01, stop Apache and watch VIP move to proxy02
systemctl stop httpd
ip addr show eth0 # VIP should disappear from proxy01
# On proxy02:
ip addr show eth0 # VIP should appear here within a few seconds (two failed checks, then the election)
# Restart Apache on proxy01 and verify it reclaims MASTER
systemctl start httpd
# Once the check has passed twice (rise 2), proxy01's priority returns to 110 and it should reclaim the VIP
Automate this test in your CI/CD pipeline or as a scheduled chaos engineering job. Failover that has never been tested is failover that will fail when you need it most.
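An automated version of this test needs a way to wait for the VIP to move and to fail if it does not. The following polling helper is a sketch; `wait_until` is a hypothetical name, and the commented usage assumes SSH access to both proxies from the job runner.

```shell
#!/bin/bash
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Retries the command once per second until it succeeds.
# Prints the number of seconds waited; returns 1 if the timeout is reached.
wait_until() {
    local timeout="$1" elapsed=0
    shift
    until "$@" >/dev/null 2>&1; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$(( elapsed + 1 ))
    done
    echo "$elapsed"
}

# Usage in a scheduled failover test:
#   vip_on_proxy02() { ssh proxy02 ip -o -4 addr show dev eth0 | grep -qF ' inet 10.0.1.100/'; }
#   ssh proxy01 systemctl stop httpd
#   wait_until 10 vip_on_proxy02 || echo "failover did not complete within 10s"
#   ssh proxy01 systemctl start httpd
```

Recording the printed wait time on every run gives a simple trend line for how long failover actually takes.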
Conclusion
A dual-Apache setup with keepalived VRRP provides a straightforward, cost-effective path to reverse proxy high availability. The key disciplines are configuration parity between nodes, application-level health checks that reflect real service health, SSL synchronization, and regular failover testing. With these in place, a single proxy failure becomes a seconds-long hiccup rather than an outage.
