A Public Key Infrastructure (PKI) is the foundation of trust in a networked environment. Every TLS certificate, code signing signature, and mutual authentication exchange depends on a chain of certificates rooted in a trust anchor. Most organizations start with a single self-signed CA or a cloud-managed certificate service, then discover years later that their ad hoc approach cannot scale — they have no offline root, no clear intermediate CA hierarchy, no automated renewal, and no revocation infrastructure. This guide covers designing a multi-tier PKI that is auditable, scalable, and suitable for enterprise certificate management from day one.
Why a Multi-Tier Hierarchy
A single root CA that issues leaf certificates directly is simple but fragile. If the root’s private key is compromised or the root CA must be rebuilt, every issued certificate becomes untrusted simultaneously. The multi-tier model separates concerns:
- Root CA: The ultimate trust anchor. Its key is generated in an air-gapped ceremony, stored in hardware (HSM or YubiKey), and never connected to a network. It is brought out only to sign intermediate CA certificates and root CRLs, perhaps once per year.
- Intermediate CAs: Online subordinates of the root, partitioned by purpose or organizational unit. The intermediate issues leaf certificates. If an intermediate is compromised, revoke it from the root — leaf certificates issued by other intermediates are unaffected.
- Leaf Certificates: End-entity certificates for servers, users, devices, or code signing. Short-lived (90 days or less), auto-renewed, issued by an intermediate.
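To make the containment property concrete, here is a minimal Python sketch that models the hierarchy as a tree. The class names, the `is_trusted` helper, and the `laptop-0042` subject are illustrative assumptions rather than part of any PKI library, and real chain validation also checks signatures, validity periods, and constraints.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class CA:
    """A certificate authority node in a simplified trust hierarchy."""
    name: str
    parent: Optional["CA"] = None
    revoked: bool = False
    children: List["CA"] = field(default_factory=list, repr=False)

    def issue_intermediate(self, name: str) -> "CA":
        """Create a subordinate CA signed by this CA."""
        child = CA(name=name, parent=self)
        self.children.append(child)
        return child


@dataclass(eq=False)
class Leaf:
    """An end-entity certificate issued by an intermediate."""
    subject: str
    issuer: CA


def is_trusted(leaf: Leaf) -> bool:
    """Walk from the leaf's issuer up to the root; any revoked CA breaks the chain."""
    ca: Optional[CA] = leaf.issuer
    while ca is not None:
        if ca.revoked:
            return False
        ca = ca.parent
    return True


root = CA("Example Corp Root CA")
tls_ca = root.issue_intermediate("TLS Intermediate CA")
device_ca = root.issue_intermediate("Device Intermediate CA")

web = Leaf("api.internal.example-corp.com", issuer=tls_ca)
laptop = Leaf("laptop-0042", issuer=device_ca)  # hypothetical device name

# Compromise of one intermediate: only its own leaves lose trust.
tls_ca.revoked = True
print(is_trusted(web), is_trusted(laptop))  # False True
```

Setting `root.revoked = True` instead makes every leaf untrusted at once, which is the failure mode the offline root is meant to make unlikely.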
Root CA Ceremony
The root CA key generation must be treated as a high-assurance ceremony. The objective is to generate the key in a controlled environment, with witnesses, with the private key never touching a networked system.
Minimum ceremony requirements:
- Air-gapped laptop (no WiFi, no Bluetooth, freshly installed OS booted from live USB)
- At least two witnesses from separate departments
- Key generation documented with command-line transcript and timestamps
- Private key stored in at least two HSMs (primary + backup) and one encrypted paper backup in a physical safe
- Root CA certificate exported and distributed to all managed systems via Puppet/Group Policy
# Generate root CA on air-gapped system using the step CLI
step certificate create "Example Corp Root CA" \
  root-ca.crt root-ca.key \
  --profile root-ca \
  --kty EC \
  --curve P-384 \
  --not-after 87600h # 10 years
# Verify
step certificate inspect root-ca.crt
Intermediate CA Design
Partition intermediates by purpose. Common patterns:
- TLS Intermediate CA: Issues server certificates for web services and APIs. 5-year validity, path length constraint of 0 (cannot issue further intermediates).
- Client Auth Intermediate CA: Issues mutual TLS certificates for service-to-service authentication and VPN client certificates.
- Code Signing Intermediate CA: Issues certificates for signing software artifacts and container images.
- Device Intermediate CA: Issues certificates for enrolled workstations and servers (used in 802.1X network authentication).
# Generate the TLS intermediate key and CSR (run on online CA server after root ceremony).
# --csr produces a signing request instead of a certificate; the profile and
# validity are applied by the root when it signs.
step certificate create "Example Corp TLS Intermediate CA" \
  tls-intermediate.csr tls-intermediate.key \
  --csr \
  --kty EC \
  --curve P-256
# Sign with root CA (transfer CSR to air-gapped root, sign, transfer cert back)
# 43800h = 5 years
step certificate sign tls-intermediate.csr root-ca.crt root-ca.key \
  --profile intermediate-ca \
  --not-after 43800h \
  > tls-intermediate.crt
# Create full chain
cat tls-intermediate.crt root-ca.crt > tls-chain.crt
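A certificate that outlives its issuer cannot be validated once the issuer expires, so the lifetimes used in this guide have to nest. The following Python sketch checks that arithmetic. The ceremony date and the function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def hours(h: int) -> timedelta:
    """Convert a step-style hour count (e.g. 87600h) into a timedelta."""
    return timedelta(hours=h)


def validity_nests(issued_at: datetime, lifetime: timedelta,
                   issuer_not_after: datetime) -> bool:
    """True if a certificate issued at issued_at expires no later than its issuer."""
    return issued_at + lifetime <= issuer_not_after


root_created = datetime(2025, 1, 1, tzinfo=timezone.utc)  # hypothetical ceremony date
root_not_after = root_created + hours(87600)              # 3650 days

# An intermediate signed at the ceremony fits inside the root's lifetime.
print(validity_nests(root_created, hours(43800), root_not_after))  # True

# A 5-year intermediate signed six years into the root's life would outlive it.
late = root_created + timedelta(days=6 * 365)
print(validity_nests(late, hours(43800), root_not_after))  # False
```

The second case shows why the root should be renewed well before its final five years if intermediates keep a 5-year lifetime.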
Online CA with step-ca
step-ca (from Smallstep) provides a production-ready online CA server with ACME protocol support, JWK provisioners, and native Kubernetes integration. Deploy it as the online intermediate CA for automated leaf certificate issuance:
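# Initialize step-ca (this generates a placeholder root and intermediate)
step ca init \
  --name "Example Corp TLS CA" \
  --dns "ca.internal.example-corp.com" \
  --address ":443"
# Replace the generated CA material with the ceremony root and signed intermediate
cp root-ca.crt "$(step path)/certs/root_ca.crt"
cp tls-intermediate.crt "$(step path)/certs/intermediate_ca.crt"
cp tls-intermediate.key "$(step path)/secrets/intermediate_ca_key"
# The real root key stays offline; remove the generated placeholder key
rm "$(step path)/secrets/root_ca_key"
# Update the root fingerprint in $(step path)/config/defaults.json to this value
step certificate fingerprint root-ca.crt
# Add ACME provisioner for automated renewal
step ca provisioner add acme --type ACME
# Start the CA
step-ca "$(step path)/config/ca.json"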
With the ACME provisioner, any service that supports ACME (certbot, Caddy, Traefik, cert-manager) can automatically obtain and renew certificates from your internal CA — the same workflow as Let’s Encrypt, but under your control.
OCSP Responders
Certificate Revocation Lists (CRLs) are downloaded by clients periodically and can be megabytes in size for large CAs. Online Certificate Status Protocol (OCSP) is preferable for leaf certificates — clients query an OCSP responder in real time for a specific certificate’s status.
Embed the OCSP responder URL in every issued certificate:
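# X.509 certificate template (example path: templates/x509/leaf.tpl).
# The ocspServer and issuingCertificateURL fields populate the
# Authority Information Access extension (OID 1.3.6.1.5.5.7.1.1).
{
  "subject": {{ toJson .Subject }},
  "sans": {{ toJson .SANs }},
  "keyUsage": ["digitalSignature"],
  "extKeyUsage": ["serverAuth", "clientAuth"],
  "ocspServer": ["http://ocsp.internal.example-corp.com"],
  "issuingCertificateURL": ["http://crt.internal.example-corp.com/tls-intermediate.crt"]
}
# In step-ca's ca.json, reference the template from the provisioner and
# raise the maximum duration so the 90-day default is allowed.
{
  "authority": {
    "claims": {
      "maxTLSCertDuration": "2160h",
      "defaultTLSCertDuration": "2160h"
    },
    "provisioners": [
      {
        "type": "ACME",
        "name": "acme",
        "options": {
          "x509": { "templateFile": "templates/x509/leaf.tpl" }
        }
      }
    ]
  }
}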
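The OCSP responder must be highly available and fast: clients that check revocation during the handshake wait on it, so a slow responder degrades TLS performance for every service using your PKI. Deploy it behind a load balancer with health checks, and enable OCSP stapling on your web servers so that each server caches the signed response and delivers it in the handshake, sparing clients the round trip to the responder.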
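The caching behaviour that OCSP stapling relies on can be sketched in a few lines of Python. This is a simplified model, not how any particular web server implements it: the `OCSPResponse` and `StapleCache` names are hypothetical, the actual OCSP request is abstracted behind an injected `fetch` callable, and refreshing at the midpoint of the validity window is an assumed policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass
class OCSPResponse:
    status: str              # "good", "revoked", or "unknown"
    this_update: datetime    # when the responder produced the response
    next_update: datetime    # when the response stops being valid


class StapleCache:
    """Server-side cache for one certificate's OCSP response."""

    def __init__(self, fetch: Callable[[], OCSPResponse]):
        self._fetch = fetch
        self._cached: Optional[OCSPResponse] = None

    def get(self, now: datetime) -> OCSPResponse:
        cached = self._cached
        if cached is not None:
            # Serve from cache during the first half of the validity window.
            halfway = cached.this_update + (cached.next_update - cached.this_update) / 2
            if now < halfway:
                return cached
        try:
            self._cached = self._fetch()
        except OSError:
            # Responder unreachable: keep serving the old response while it is still valid.
            if cached is None or now >= cached.next_update:
                raise
        return self._cached
```

Refreshing early leaves the second half of the window as a buffer, so a short responder outage does not leave the server without a valid response to staple.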
CRL Distribution
Maintain CRLs for all CA levels. Publish them at stable URLs that do not change when you rebuild or rotate CAs:
- http://crl.internal.example-corp.com/root-ca.crl: updated annually or on revocation
- http://crl.internal.example-corp.com/tls-intermediate.crl: updated every 24 hours
Use HTTP (not HTTPS) for CRL and OCSP distribution — a client validating a certificate cannot use that certificate to fetch the CRL that would validate it. This chicken-and-egg problem is why revocation infrastructure uses plain HTTP.
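A CRL that silently stops being republished causes validation failures once it expires, so the publication schedule is worth monitoring. The Python sketch below flags CRLs that are older than their schedule allows. It assumes the publication timestamps (for example, each CRL's thisUpdate field) have already been collected by other means, and the one-hour grace period and the function name are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Maximum acceptable age per CRL, taken from the publication schedule above.
MAX_AGE = {
    "root-ca.crl": timedelta(days=365),
    "tls-intermediate.crl": timedelta(hours=24),
}
# Assumed allowance for publishing jitter.
GRACE = timedelta(hours=1)


def stale_crls(last_published: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of CRLs that are missing or older than their schedule allows."""
    stale = []
    for name, max_age in MAX_AGE.items():
        published = last_published.get(name)
        if published is None or now - published > max_age + GRACE:
            stale.append(name)
    return stale


now = datetime(2025, 6, 1, 12, tzinfo=timezone.utc)
observed = {
    "root-ca.crl": now - timedelta(days=100),
    "tls-intermediate.crl": now - timedelta(hours=30),
}
print(stale_crls(observed, now))  # ['tls-intermediate.crl']
```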
Auto-Renewal with cert-manager
In Kubernetes environments, cert-manager automates certificate lifecycle management using your step-ca ACME endpoint:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca
spec:
  acme:
    server: https://ca.internal.example-corp.com/acme/acme/directory
    privateKeySecretRef:
      name: internal-ca-account-key
    solvers:
      - http01:
          ingress:
            class: traefik
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-tls
  namespace: production
spec:
  secretName: api-tls-secret
  duration: 2160h # 90 days
  renewBefore: 360h # Renew 15 days before expiry
  dnsNames:
    - api.internal.example-corp.com
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
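The interaction between duration and renewBefore is simple arithmetic: renewal is attempted at the expiry time minus renewBefore. A small Python sketch with the values from the manifest above follows. The issuance date and the function name are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before: datetime, duration: timedelta,
                 renew_before: timedelta) -> datetime:
    """Renewal is due renew_before ahead of the certificate's expiry."""
    return not_before + duration - renew_before


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)  # hypothetical issuance date
renew_at = renewal_time(issued, timedelta(hours=2160), timedelta(hours=360))

print(renew_at.date())           # 2025-03-17
print((renew_at - issued).days)  # 75
```

With these settings each certificate is replaced after 75 of its 90 days, which leaves 15 days to notice and fix a failing renewal before anything expires.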
Trust Distribution
The root CA certificate must be present in the trust store of every system that will verify certificates issued by your PKI. Distribute via configuration management:
# Puppet: install root CA into system trust store
file { '/etc/pki/ca-trust/source/anchors/example-corp-root.crt':
  ensure => file,
  source => 'puppet:///modules/pki/example-corp-root.crt',
  notify => Exec['update-ca-trust'],
}
exec { 'update-ca-trust':
  command     => '/usr/bin/update-ca-trust extract',
  refreshonly => true,
}
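After distribution it is worth confirming that each host actually trusts the intended root, not just that a file was copied. One way to do that, sketched here in Python, is to compare SHA-256 fingerprints of the certificates in the host's PEM trust bundle against the fingerprint recorded at the root ceremony. The function names are hypothetical, and the bundle location varies by distribution, so the caller supplies the bundle text.

```python
import base64
import hashlib
import re
from typing import Set

# Matches each PEM certificate block and captures its base64 body.
PEM_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def bundle_fingerprints(bundle_text: str) -> Set[str]:
    """Return SHA-256 fingerprints (lowercase hex) of every certificate in a PEM bundle."""
    fingerprints = set()
    for body in PEM_RE.findall(bundle_text):
        der = base64.b64decode("".join(body.split()))
        fingerprints.add(hashlib.sha256(der).hexdigest())
    return fingerprints


def root_is_installed(bundle_text: str, expected_fingerprint: str) -> bool:
    """Check whether the expected root fingerprint appears in the trust bundle.

    Accepts fingerprints written in upper or lower case, with or without colons.
    """
    normalized = expected_fingerprint.replace(":", "").lower()
    return normalized in bundle_fingerprints(bundle_text)
```

A configuration-management fact or a periodic job could call `root_is_installed` with the contents of the system bundle and report hosts where it returns False.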
Operational Runbook
Document and practice these procedures before you need them:
- Revoke a leaf certificate: Call the CA’s revocation API or run step ca revoke <serial>. OCSP status updates immediately; the CRL updates at the next publication interval.
- Revoke an intermediate CA: Sign a new CRL from the root including the intermediate’s serial. Update the root CRL distribution point. Decommission the online CA instance.
- Rotate an intermediate CA: Generate a new intermediate, sign it with the root, deploy the new chain to every server that presents it (clients that trust the root need no change), migrate issuance, and revoke the old intermediate after all its leaf certs expire.
- Root CA compromise: The worst case. Requires removing the root from all trust stores (Puppet/Group Policy), generating a new root, signing new intermediates, re-issuing all leaf certificates. Plan for 72-hour incident response window minimum.
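The timing rule in the rotation procedure can be expressed as a short Python sketch: the old intermediate is safe to revoke once every leaf it issued has expired, and with 90-day leaves that is at most 2160 hours after issuance is cut over. The function names and the cutover date are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def safe_to_revoke(outstanding_leaf_expiries: Iterable[datetime],
                   now: datetime) -> bool:
    """True once every leaf issued by the old intermediate has expired."""
    return all(expiry <= now for expiry in outstanding_leaf_expiries)


def latest_possible_expiry(cutover: datetime,
                           max_leaf_lifetime: timedelta = timedelta(hours=2160)
                           ) -> datetime:
    """Upper bound on leaf expiry if the old intermediate stops issuing at cutover."""
    return cutover + max_leaf_lifetime


cutover = datetime(2025, 1, 1, tzinfo=timezone.utc)  # hypothetical migration date
print(latest_possible_expiry(cutover).date())  # 2025-04-01
```

Short leaf lifetimes therefore bound how long an old intermediate must remain valid during a planned rotation.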
Conclusion
A multi-tier PKI is an investment that pays dividends for the lifetime of your infrastructure. The air-gapped root ceremony, intermediate CA partitioning, OCSP/CRL infrastructure, and automated leaf renewal are not gold-plating — they are the minimum viable architecture for a PKI that can survive a partial compromise and maintain operations through it. Build it right the first time; rebuilding a trusted CA hierarchy is one of the most disruptive operations an enterprise can undertake.
