SOAR Playbook Engineering: Designing Automated Response Logic for Security Operations

Excerpt: Security Orchestration, Automation, and Response (SOAR) playbooks are the backbone of a scalable security operations center. This guide covers TheHive 5 and Cortex 3 integration with MISP, playbook design patterns for automated alert triage, enrichment workflows, containment action design, and measuring playbook quality with operational metrics.

Introduction

A security operations center that handles alerts manually is fundamentally limited by analyst throughput. As environments grow and threat volume increases, the gap between alert volume and analyst capacity widens until critical alerts are missed or response times drift into hours. SOAR platforms close this gap by automating repeatable, well-understood response logic — freeing analysts to focus on investigation and judgment.

This article takes a practical, engineering-focused approach to SOAR playbook development using TheHive 5 and Cortex 3, with MISP as the threat intelligence backbone. We will cover the principles that distinguish effective automation from brittle scripts, design patterns for common security scenarios, and the metrics that tell you whether your playbooks are working.

The SOAR Stack: TheHive, Cortex, and MISP

Before designing playbooks, it is important to understand the role each component plays in the stack:

  • TheHive 5 is the case management and orchestration platform. Alerts flow in, analysts triage them, and cases are managed through to resolution. TheHive 5 uses a multi-tenant data model with organizations and custom fields, and offers significantly improved performance over TheHive 4.
  • Cortex 3 is the analysis and response engine. It provides a library of “analyzers” (enrichment modules) and “responders” (action modules) that can be triggered from TheHive. Each analyzer or responder runs as a separate job, launched either as a local process or as a Docker container, and most are written in Python.
  • MISP (Malware Information Sharing Platform) is the threat intelligence hub. It stores indicators of compromise (IOCs), threat actor TTPs, and vulnerability data. Cortex analyzers can query MISP to contextualize observables, and TheHive can export cases to MISP as events.

The integration flow for a typical alert:

Alert Source (SIEM/EDR/NDR)
        |
        v
TheHive Alert Ingestion (API or webhook)
        |
        v
Automated Pre-processing (TheHive Functions)
   - Observable extraction
   - Severity assignment
   - Tag application
        |
        v
Cortex Analyzer Execution (automatic or on-demand)
   - IP/domain reputation lookup
   - File hash check (VirusTotal, MISP)
   - ASN/GeoIP enrichment
        |
        v
Case Creation (if alert meets threshold)
        |
        v
Analyst Review + Decision
        |
        v
Cortex Responder Execution (containment actions)
   - Firewall block rule
   - EDR host isolation
   - AD account disable
        |
        v
Case Closure + MISP Export

Playbook Design Principles

Well-designed playbooks share several characteristics that distinguish them from ad-hoc automation scripts:

Idempotency

A playbook triggered multiple times for the same alert should produce the same result as being triggered once. Actions like “block IP X” should be idempotent — if the block already exists, the playbook should succeed without error rather than failing or creating a duplicate entry. This is critical because alert deduplication is imperfect and you will encounter repeat triggers.
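
The idempotent pattern can be sketched in a few lines of Python. The `Blocklist` class here is a hypothetical in-memory stand-in for a firewall alias, not a real firewall client; an actual integration would query the firewall for the existing entry before adding it.

```python
class Blocklist:
    """Hypothetical stand-in for a firewall alias (not a real firewall client)."""

    def __init__(self):
        self._entries = set()

    def ensure_blocked(self, ip: str) -> str:
        """Add ip if it is absent; a repeat call succeeds without side effects."""
        if ip in self._entries:
            return "already_present"
        self._entries.add(ip)
        return "created"

    def entries(self):
        return sorted(self._entries)


blocklist = Blocklist()
first = blocklist.ensure_blocked("203.0.113.7")
second = blocklist.ensure_blocked("203.0.113.7")  # repeat trigger for the same alert
```

Both calls succeed, and the blocklist holds a single entry. Returning a distinct status for the repeat call lets the playbook record that nothing changed without treating it as a failure.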

Graceful Degradation

When an enrichment source is unavailable (VirusTotal API rate limit, MISP maintenance window), the playbook should continue with reduced information rather than fail completely. Wrap each enrichment call in its own error handling (try/except in Python code) so that one failing source does not abort the others, and ensure TheHive cases are created even when enrichment is partial. Log enrichment failures as case notes so analysts know the context is incomplete.
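
A minimal sketch of this pattern, assuming each enrichment source is exposed as a plain Python callable. The two lookup functions are hypothetical placeholders, not real analyzer APIs.

```python
from typing import Callable, Dict, List, Tuple


def enrich(observable: str,
           sources: Dict[str, Callable[[str], dict]]) -> Tuple[dict, List[str]]:
    """Run every enrichment source; keep results and failure notes separately."""
    results, notes = {}, []
    for name, lookup in sources.items():
        try:
            results[name] = lookup(observable)
        except Exception as exc:  # one failing source must not abort the playbook
            notes.append(f"Enrichment source {name} failed: {exc}")
    return results, notes


# Hypothetical lookups, used only for illustration.
def geoip_lookup(ip: str) -> dict:
    return {"country": "ZZ"}


def rate_limited_lookup(ip: str) -> dict:
    raise TimeoutError("API rate limit exceeded")


results, notes = enrich(
    "203.0.113.7",
    {"geoip": geoip_lookup, "reputation": rate_limited_lookup},
)
```

The `notes` list is what would be attached to the case, so an analyst can see which sources did not contribute.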

Explicit Decision Points

Not every response action should be automated. Identify which decisions require human judgment and design explicit pause points. A good rule of thumb: automate enrichment and low-risk containment actions (e.g., passive network block on an external IP), but require analyst approval for high-impact actions (host isolation, account lockout).
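
One way to make the pause points explicit is a small policy table that the playbook consults before running any responder. The action names and risk tiers below are assumptions chosen for illustration; they are not part of TheHive or Cortex.

```python
from typing import Optional

# Hypothetical action names and risk tiers; adapt them to your own responders.
ACTION_RISK = {
    "block_external_ip": "low",
    "isolate_host": "high",
    "disable_account": "high",
}


def next_step(action: str, approved_by: Optional[str] = None) -> str:
    """Return 'execute' or 'await_approval' for a requested containment action."""
    risk = ACTION_RISK.get(action, "high")  # unknown actions default to high risk
    if risk == "low" or approved_by:
        return "execute"
    return "await_approval"
```

Defaulting unknown actions to high risk means a newly added responder cannot run unattended until someone classifies it.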

Observable-Centric Design

Design enrichment workflows around observable types rather than alert types. A phishing alert and a malware alert may both produce the same observable types (URLs, file hashes, email addresses) — your enrichment playbook should handle each observable type consistently regardless of how it was generated.
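
As a rough illustration, the sketch below groups observables by data type across two different alert types, so that a URL seen in both a phishing alert and a malware alert is enriched once. The alert dictionaries are simplified, made-up examples rather than full TheHive payloads, and the hash value is a placeholder (the MD5 of an empty string).

```python
def plan_enrichment(alerts):
    """Group observables by data type, ignoring which alert type produced them."""
    plan = {}
    for alert in alerts:
        for obs in alert["observables"]:
            plan.setdefault(obs["dataType"], set()).add(obs["data"])
    # Sort for stable output; duplicates were already removed by the sets
    return {data_type: sorted(values) for data_type, values in plan.items()}


alerts = [
    {"type": "phishing", "observables": [
        {"dataType": "url", "data": "http://example.test/login"},
        {"dataType": "mail", "data": "sender@example.test"},
    ]},
    {"type": "malware", "observables": [
        {"dataType": "url", "data": "http://example.test/login"},
        {"dataType": "hash", "data": "d41d8cd98f00b204e9800998ecf8427e"},
    ]},
]

plan = plan_enrichment(alerts)
```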

Building an Alert Ingestion Pipeline

Alerts arrive in TheHive from multiple sources: SIEM export scripts, Wazuh integration, Cloudflare log analysis, EDR webhooks. The ingestion pipeline must normalize these into a consistent format before playbook logic runs.

# Example: Python script to ingest Wazuh alerts into TheHive
# Runs as a webhook target for Wazuh integration

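import json
import os
from datetime import datetime

import requests

THEHIVE_URL = "https://thehive.example-corp.com"
# Read the API key from the environment rather than hardcoding it in the script
THEHIVE_API_KEY = os.environ.get("THEHIVE_API_KEY", "your-api-key-here")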

def create_thehive_alert(wazuh_alert):
    """Convert a Wazuh alert to TheHive alert format and create it."""

    # Extract observables from Wazuh alert fields
    observables = []

    if wazuh_alert.get("data", {}).get("srcip"):
        observables.append({
            "dataType": "ip",
            "data": wazuh_alert["data"]["srcip"],
            "tags": ["source", "wazuh"],
            "message": "Source IP from Wazuh alert"
        })

    if wazuh_alert.get("data", {}).get("url"):
        observables.append({
            "dataType": "url",
            "data": wazuh_alert["data"]["url"],
            "tags": ["wazuh"]
        })

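    # Map Wazuh rule level (0-15) to TheHive severity (1-4).
    # These thresholds are a starting point; tune them to your rule set.
    wazuh_level = wazuh_alert.get("rule", {}).get("level", 0)
    if wazuh_level >= 15:
        severity = 4  # Critical
    elif wazuh_level >= 12:
        severity = 3  # High
    elif wazuh_level >= 8:
        severity = 2  # Medium
    else:
        severity = 1  # Low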

    alert_data = {
        "type": "wazuh",
        "source": "wazuh-manager",
        "sourceRef": wazuh_alert.get("id", str(datetime.now().timestamp())),
        "title": wazuh_alert.get("rule", {}).get("description", "Wazuh Alert"),
        "description": json.dumps(wazuh_alert, indent=2),
        "severity": severity,
        "tags": [
            "wazuh",
            f"rule-{wazuh_alert.get('rule', {}).get('id', 'unknown')}",
            wazuh_alert.get("agent", {}).get("name", "unknown-agent")
        ],
        "observables": observables,
        "customFields": {
            "wazuh-rule-id": {"string": str(wazuh_alert.get("rule", {}).get("id"))},
            "wazuh-level": {"integer": wazuh_level}
        }
    }

    response = requests.post(
        f"{THEHIVE_URL}/api/v1/alert",
        headers={"Authorization": f"Bearer {THEHIVE_API_KEY}"},
        json=alert_data,
        timeout=10,  # avoid hanging the webhook handler if TheHive is unreachable
    )
    response.raise_for_status()
    return response.json()
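
The script above falls back to the current timestamp when the Wazuh alert has no `id`, which gives every retry a new `sourceRef` and so defeats deduplication on that field. A possible alternative, sketched here on the assumption that rule ID, agent name, and alert timestamp together identify an event, is to derive the reference from a hash of those fields. The helper name `stable_source_ref` is hypothetical.

```python
import hashlib
import json


def stable_source_ref(wazuh_alert: dict) -> str:
    """Derive a deterministic reference from fields assumed to identify the event."""
    if wazuh_alert.get("id"):
        return str(wazuh_alert["id"])
    key = {
        "rule": wazuh_alert.get("rule", {}).get("id"),
        "agent": wazuh_alert.get("agent", {}).get("name"),
        "timestamp": wazuh_alert.get("timestamp"),
    }
    # sort_keys makes the serialized form, and therefore the hash, reproducible
    digest = hashlib.sha256(json.dumps(key, sort_keys=True).encode("utf-8")).hexdigest()
    return digest[:32]
```

Submitting the same Wazuh event twice then produces the same reference, which keeps the ingestion step consistent with the idempotency principle described earlier.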

Automated Enrichment Workflows

Enrichment is the process of adding context to observables — turning a raw IP address into “this IP belongs to a known botnet C2 infrastructure, last seen 2 hours ago, associated with Emotet campaign.” This context is what separates an alert from a case and enables triage decisions.

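Configure automatic analyzer execution in TheHive with a Function that runs whenever an observable is created. The handler below is a sketch of the routing logic: the analyzer IDs must match the analyzers enabled in your Cortex instance, and the exact handler signature and the shape of the returned actions depend on your TheHive version.

// TheHive Function: Auto-run analyzers on new observables
// Trigger: AnyEvent where objectType = Observable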

function handle(input) {
  const observable = input.object;
  const analyzerMap = {
    "ip": [
      "MaxMind_GeoIP_4_0",
      "AbuseIPDB_2_0",
      "MISP_2_1",
      "Shodan_Host_2_0"
    ],
    "domain": [
      "DomainTools_Iris_Investigate_1_0",
      "MISP_2_1",
      "VirusTotal_GetReport_3_1"
    ],
    "hash": [
      "VirusTotal_GetReport_3_1",
      "MISP_2_1",
      "MalwareBazaar_1_0"
    ],
    "url": [
      "URLhaus_2_0",
      "VirusTotal_GetReport_3_1"
    ]
  };

  const analyzers = analyzerMap[observable.dataType] || [];

  return analyzers.map(analyzerId => ({
    "type": "RunAnalyzer",
    "analyzerId": analyzerId,
    "observableId": observable._id
  }));
}

Containment Action Design

Containment responders are the highest-risk automation in a SOAR playbook — a misconfigured firewall block can take down a service, and an incorrect host isolation can interrupt business operations. Design containment actions with explicit safeguards:

# Example Cortex responder: Block IP at perimeter firewall
# Uses a pfSense REST API to add the IP to a blocklist alias
# (the endpoint and auth scheme depend on the API package installed on pfSense)

import ipaddress

import requests
from cortexutils.responder import Responder

class PfSenseBlockIP(Responder):
    def __init__(self):
        Responder.__init__(self)
        # Passing a message makes the responder fail early if the setting is missing
        self.pfsense_url = self.get_param("config.pfsense_url", None, "pfSense URL is missing")
        self.api_key = self.get_param("config.api_key", None, "pfSense API key is missing")
        self.blocklist_name = self.get_param("config.blocklist_name", "SOAR_BLOCKLIST")

    def run(self):
        # Get the IP to block from the observable
        ip = self.get_param("data.data")
        if not ip:
            # error() writes the failure report and exits the responder
            self.error("No IP address found in observable")

        # Safety check: never block private addresses. is_private covers the
        # RFC 1918 ranges as well as loopback, link-local, and other special ranges.
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            self.error(f"Invalid IP address: {ip}")
        if addr.is_private:
            self.error(f"Refusing to block private IP address: {ip}")

        # Add to pfSense alias via API
        try:
            response = requests.post(
                f"{self.pfsense_url}/api/v1/firewall/alias/entry",
                headers={"Authorization": f"Bearer {self.api_key}"},
                json={
                    "name": self.blocklist_name,
                    "address": ip,
                    "detail": f"SOAR block - TheHive case {self.get_param('data.case._id', 'unknown')}"
                },
                verify=True,
                timeout=10
            )
        except requests.RequestException as exc:
            # Network failures are reported to TheHive instead of raising a traceback
            self.error(f"pfSense API request failed: {exc}")

        if response.status_code in (200, 201):
            self.report({
                "success": True,
                "message": f"IP {ip} added to blocklist {self.blocklist_name}",
                "pfsense_response": response.json()
            })
        else:
            self.error(f"pfSense API error: {response.status_code} {response.text}")

if __name__ == "__main__":
    PfSenseBlockIP().run()
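
The private-address check in the responder can be extended with a list of networks that must never be blocked, such as the organization's own public ranges or the DNS resolvers it depends on. The sketch below is one way to do that with the standard `ipaddress` module. `PROTECTED_NETWORKS` and `is_blockable` are hypothetical names, and the range shown is only an example.

```python
import ipaddress
from typing import Tuple

# Hypothetical protected ranges; replace with your own public ranges and
# critical third-party services (a public DNS resolver range is used here).
PROTECTED_NETWORKS = [ipaddress.ip_network("8.8.8.0/24")]


def is_blockable(ip: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a proposed block target."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False, "invalid address"
    if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_multicast:
        return False, "non-routable or internal address"
    if any(addr in net for net in PROTECTED_NETWORKS):
        return False, "address is in a protected network"
    return True, "ok"
```

Returning a reason alongside the decision lets the responder put the refusal into the case timeline, so an analyst can see why a block was skipped.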

Playbook Quality Metrics

Automation that is never measured becomes maintenance debt with no feedback loop. Instrument your SOAR deployment with these key metrics to understand whether playbooks are working and where to invest optimization effort:

  • Mean Time to Enrich (MTTE): time from alert creation to enrichment complete. Target under 5 minutes for high-severity alerts.
  • Automation Rate: percentage of alerts closed without analyst interaction. A well-tuned SOC can aim for a 60-80% automation rate on alerts that classify cleanly as known-good or known-bad.
  • False Positive Rate per Playbook: track separately for each playbook to identify which enrichment sources or detection rules are generating noise.
  • Escalation Rate: percentage of automated alerts that are escalated to cases. Sudden spikes may indicate a new attack campaign or a misconfigured detection rule.
  • Responder Success Rate: percentage of containment actions that complete without error. Low success rates indicate integration maintenance needs.
  • Playbook Coverage: percentage of incoming alert types covered by an automated playbook, as opposed to requiring manual triage.
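
Several of these metrics reduce to simple arithmetic over exported alert and responder records. The sketch below assumes a simplified record layout (`created_at`, `enriched_at`, `closed_by`, `escalated`, `status`). These field names are illustrative and would need to be mapped from whatever your TheHive export provides. Escalation rate is computed here over all processed alerts.

```python
from datetime import datetime, timedelta
from statistics import mean


def playbook_metrics(alerts, responder_runs):
    """Compute MTTE and rate metrics from simplified alert and responder records."""
    enrich_seconds = [
        (a["enriched_at"] - a["created_at"]).total_seconds()
        for a in alerts if a.get("enriched_at")
    ]
    auto_closed = sum(1 for a in alerts if a["closed_by"] == "automation")
    escalated = sum(1 for a in alerts if a["escalated"])
    succeeded = sum(1 for r in responder_runs if r["status"] == "Success")
    return {
        "mtte_seconds": mean(enrich_seconds) if enrich_seconds else None,
        "automation_rate": auto_closed / len(alerts) if alerts else None,
        "escalation_rate": escalated / len(alerts) if alerts else None,
        "responder_success_rate":
            succeeded / len(responder_runs) if responder_runs else None,
    }


t0 = datetime(2024, 1, 1, 12, 0, 0)
alerts = [
    {"created_at": t0, "enriched_at": t0 + timedelta(seconds=120),
     "closed_by": "automation", "escalated": False},
    {"created_at": t0, "enriched_at": t0 + timedelta(seconds=240),
     "closed_by": "analyst", "escalated": True},
    {"created_at": t0, "enriched_at": None,
     "closed_by": "automation", "escalated": False},
    {"created_at": t0, "enriched_at": t0 + timedelta(seconds=240),
     "closed_by": "analyst", "escalated": False},
]
runs = [{"status": "Success"}] * 3 + [{"status": "Failure"}]

metrics = playbook_metrics(alerts, runs)
```

With this sample data the mean time to enrich is 200 seconds, the automation rate is 0.5, the escalation rate is 0.25, and the responder success rate is 0.75. Alerts that were never enriched are excluded from MTTE, so it is worth tracking their count separately.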

Conclusion

Effective SOAR playbook engineering requires treating automation the same way you treat production code: with version control, testing, explicit error handling, and continuous measurement. The TheHive/Cortex/MISP stack provides excellent building blocks, but the value comes from the engineering discipline applied to the playbooks running on top of it.

Start with enrichment automation — it is low-risk, immediately valuable, and teaches you the patterns you will need for containment actions. Layer in automated triage logic as your enrichment data quality improves. Add containment actions carefully, with explicit safety checks and analyst approval gates. Measure everything, and use the metrics to drive iterative improvement. A security operations center with mature SOAR automation is a fundamentally different beast from one running on manual processes — more consistent, more scalable, and able to maintain analyst effectiveness as the threat landscape evolves.
