The Detection-and-Response Pipeline
Section 0.4 defined the maturity spectrum that measures SOC capability. This section maps the full pipeline: the operational path an attack takes from generating telemetry through detection, triage, investigation, and containment, and back into the feedback loop. You'll see which stage each module in this course builds.
Seven stages, seven failure points
Scenario
An attacker compromises a user account through AiTM phishing. The sign-in event is recorded in SigninLogs (Stage 1: telemetry exists). No analytics rule queries for AiTM token characteristics (Stage 2: detection gap). No alert fires (Stage 3 doesn't happen). The SOC never sees the event (Stages 4-6 don't happen). No post-incident review identifies the gap (Stage 7 doesn't happen). The attacker operates for 21 days. The telemetry existed from minute one. The pipeline broke at Stage 2 and every subsequent stage failed silently.
The detection-and-response pipeline is the operational path from attack to resolution. Seven stages, each dependent on the previous one. When any stage breaks, every subsequent stage fails, and the failure is silent because no alert fires to announce the gap.
Understanding the pipeline is the prerequisite for everything this course builds. Module 1 builds the operational infrastructure for Stages 4-6 (triage, investigation, containment). Modules 2-6 build Stage 2 (detection rules). Modules 7-8 build investigation and response methodology. Modules 9-12 build the feedback loop (Stage 7) and operational maturity.
Estimated time: 30 minutes.
Figure 0.5. The seven-stage detection-and-response pipeline. NE's AiTM incident broke at Stage 2 (no detection rule for the technique). When any stage breaks, every subsequent stage fails silently: no alert fires to announce the gap.
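The dependency chain can be made concrete with a small model. This is a minimal Python sketch, not part of the course tooling: the stage names come from this section, while `run_pipeline` and the `working` dictionary are hypothetical names introduced for illustration.

```python
# Stage names as used in this section, in pipeline order.
STAGES = [
    "telemetry", "detection", "alert", "triage",
    "investigation", "containment", "feedback",
]


def run_pipeline(working):
    """Return (stage number, name, status) for every stage.

    A stage can only run if it works AND every earlier stage ran.
    Once one stage breaks, the rest are never reached.
    """
    results = []
    upstream_ok = True
    for number, name in enumerate(STAGES, start=1):
        if not upstream_ok:
            status = "never reached"
        elif working.get(name, False):
            status = "ran"
        else:
            status = "broken"
            upstream_ok = False
        results.append((number, name, status))
    return results


# NE scenario: the telemetry exists, but no rule queries it.
ne = {name: True for name in STAGES}
ne["detection"] = False

for number, name, status in run_pipeline(ne):
    print(f"Stage {number} ({name}): {status}")
```

Nothing in the model raises an error when Stage 2 is broken. Stages 3-7 simply report "never reached", which is the silent failure the figure describes.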
Stage 1: Telemetry: what the platform collects
Telemetry is the data the platform records. On the Microsoft stack, this includes: SigninLogs (every authentication event), AuditLogs (every directory change), OfficeActivity (every email, file, and collaboration action), DeviceProcessEvents (every process execution on managed endpoints), EmailEvents (every email delivery), CloudAppEvents (every cloud application action), and dozens of additional tables from connected data sources.
The telemetry exists whether or not anyone examines it. This is both the opportunity and the frustration: the data needed to detect most attacks is already being collected and stored, and the ingestion cost is already being paid, but without a detection rule that queries it, the data sits unused. At NE, the AiTM sign-in was recorded in full detail (the source IP, the MFA method, the token acquisition characteristics) from minute one. The data was there. No rule looked at it.
What makes telemetry deceptive is volume. NE ingests approximately 48,000 sign-in events per week across 810 users. The AiTM sign-in was one record among those 48,000. A human scanning the raw log would never find it: the signal-to-noise ratio is roughly 1:48,000 for that technique. This is why Stage 2 exists: automated rules that query the telemetry at machine speed, filtering millions of events down to the handful that match attack patterns.
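The volume argument is easy to check with arithmetic. The sketch below uses the two NE figures from the text. The five seconds of review time per record is an assumption added purely for illustration.

```python
EVENTS_PER_WEEK = 48_000   # NE sign-in events per week (from the text)
USERS = 810                # NE user count (from the text)

# Average sign-in events generated per user per week.
events_per_user_per_week = EVENTS_PER_WEEK / USERS

# One malicious record among a week of sign-ins.
signal_fraction = 1 / EVENTS_PER_WEEK

# ASSUMPTION: a human reviewer spends 5 seconds on each record.
SECONDS_PER_RECORD = 5
hours_to_read_one_week = EVENTS_PER_WEEK * SECONDS_PER_RECORD / 3600

print(f"Events per user per week: {events_per_user_per_week:.1f}")
print(f"Signal fraction: {signal_fraction:.6%}")
print(f"Hours to read one week of sign-ins: {hours_to_read_one_week:.1f}")
```

Under that assumption, reading a single week of sign-in records by hand would take more than a full working week, and SigninLogs is only one table among many.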
Telemetry quality matters as much as telemetry presence. A sign-in event that records the IP address and timestamp is useful. A sign-in event that also records the MFA method, the authentication flow, and the conditional access evaluation result is investigable. The difference is data connector configuration: which fields are collected, which enrichment is applied, and whether the retention period is long enough to support investigation. Module 1 Section 1.8 maps the telemetry sources to SOC workflow functions so you know what data you have, where it lives, and where the gaps are.
Stage 2: Detection: rules that query the telemetry
Detection rules are the automated queries that examine telemetry and produce alerts. A scheduled analytics rule runs a KQL query every N minutes, looking for patterns that match known attack behaviors. When the query returns results, the rule creates an incident in the Sentinel queue.
The detection stage is where most SOCs have the largest gap. The typical mid-size organization has 20-40 analytics rules covering 10-15% of the attack techniques relevant to its threat landscape. The other 85-90% of techniques produce telemetry that no rule examines. The gap is invisible because the SOC only sees what the rules catch; techniques with no rules produce no alerts, so no one knows they're missing.
NE had 23 analytics rules before the AiTM incident. Twelve were Microsoft templates enabled as-is. Eleven were custom rules written in 2024 covering basic scenarios: failed login thresholds, new inbox rules, impossible travel. None queried the MFA method field in SigninLogs. None correlated sign-in anomalies with subsequent email manipulation. None detected the pattern that defines AiTM: a sign-in where MFA was satisfied by a claim in the token rather than by interactive user authentication. The rule that would have caught the attack is 8 lines of KQL. It didn't exist. Modules 2-6 build 28 detection rules across four domains (identity, email, endpoint, cloud application) that close the most critical gaps.
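The logic of such a rule is a filter over sign-in records. The Python sketch below illustrates that logic only. It is not the 8-line KQL rule the course builds, and the record layout (`user`, `ip`, `mfa_detail`) and the sample values are simplified assumptions rather than the real SigninLogs schema.

```python
# ASSUMPTION: a simplified marker for "MFA was satisfied by a claim in the
# token" instead of an interactive prompt. The exact field and value in
# real sign-in logs should be confirmed against your own tenant.
TOKEN_CLAIM = "MFA requirement satisfied by claim in the token"


def find_token_claim_signins(signins):
    """Return sign-ins whose MFA was satisfied by a token claim."""
    return [s for s in signins if s.get("mfa_detail") == TOKEN_CLAIM]


# Illustrative records (documentation-range IPs, invented users).
signins = [
    {"user": "a.lee@ne.example", "ip": "198.51.100.7",
     "mfa_detail": "Interactive MFA completed"},
    {"user": "j.doe@ne.example", "ip": "203.0.113.50",
     "mfa_detail": TOKEN_CLAIM},
    {"user": "m.khan@ne.example", "ip": "198.51.100.22",
     "mfa_detail": "Interactive MFA completed"},
]

for hit in find_token_claim_signins(signins):
    print(f"Review sign-in: {hit['user']} from {hit['ip']}")
```

A filter this broad would likely need further conditions before it is quiet enough for production. The point is that the deciding field was already in the telemetry and no rule read it.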
Stage 3: Alert: the incident appears in the queue
When a detection rule returns results, Sentinel creates an incident with severity, entity mappings, and alert details. The incident appears in the queue for the next available analyst. This stage is automatic: the detection rule's configuration determines the severity and the entity mapping. The quality of this stage depends entirely on Stage 2. If the rule is well-configured with accurate entity mapping and appropriate severity, the incident provides good triage context. If the rule is a template enabled without customization, the incident may have incorrect severity or missing entity information.
Entity mapping is the architectural detail that separates a useful alert from a noisy one. When a rule maps the UserPrincipalName field to an Account entity and the IPAddress field to an IP entity, the resulting incident links directly to the investigation graph: the analyst clicks the account and sees every related alert, every recent sign-in, every associated device. Without entity mapping, the analyst reads raw query results and manually copies values into search queries. The difference is minutes per triage. Across hundreds of alerts per week, those minutes compound into analyst hours lost to friction that proper rule configuration eliminates.
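To see why mapping matters, consider what it enables: once result columns become typed entities, related incidents can be found by lookup instead of by copy and paste. The Python sketch below models that idea only. It is not Sentinel's implementation, and the function names and sample incidents are hypothetical.

```python
# Column -> entity type, mirroring the mapping described in the text.
ENTITY_MAPPING = {"UserPrincipalName": "Account", "IPAddress": "IP"}


def map_entities(row, mapping=ENTITY_MAPPING):
    """Turn raw query-result columns into a set of (type, value) entities."""
    return {
        (entity_type, row[column])
        for column, entity_type in mapping.items()
        if row.get(column)
    }


def related_incidents(entity, incidents):
    """Return titles of all incidents that share the given entity."""
    return [i["title"] for i in incidents if entity in i["entities"]]


# Illustrative incidents built from invented query results.
incidents = [
    {"title": "New inbox rule",
     "entities": map_entities({"UserPrincipalName": "j.doe@ne.example"})},
    {"title": "Suspicious sign-in",
     "entities": map_entities({"UserPrincipalName": "j.doe@ne.example",
                               "IPAddress": "203.0.113.50"})},
    {"title": "Failed login threshold",
     "entities": map_entities({"UserPrincipalName": "a.lee@ne.example",
                               "IPAddress": "198.51.100.7"})},
]

account = ("Account", "j.doe@ne.example")
print(related_incidents(account, incidents))
```

An incident created with an empty mapping produces no entities, so it never appears in any pivot. That is the manual-search case the paragraph above describes.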
Stages 4-6: Triage, investigation, containment
These three stages are the operational core that Module 1 builds. Stage 4 (triage): the L1 analyst evaluates the alert using the triage decision framework: enrichment queries, classification criteria, disposition recording. Stage 5 (investigation): the L2 analyst scopes the incident, determining what the attacker accessed, what the blast radius is, and what containment is needed. Stage 6 (containment): the SOC lead or an L2 analyst with appropriate permissions executes containment: disable accounts, revoke sessions, isolate devices, block IPs.
The quality of Stages 4-6 depends on the operational infrastructure: documented triage methodology (Section 1.5), escalation framework for ambiguous alerts (Section 1.4), shift handover for investigation continuity (Section 1.3), tier boundaries for appropriate skill matching (Section 1.2), and metrics for quality measurement (Section 1.6). Without this infrastructure, these stages operate on habit and individual skill: inconsistent, unmeasured, and fragile.
The NE incident exposed the operational gaps at every stage. When the managed SOC partner triaged the alert, they followed the standard credential compromise playbook: check MFA, check impossible travel, close if both pass. The playbook had no path for AiTM. When the internal team discovered the compromise on Monday, the investigation was ad hoc: no documented evidence collection sequence, no standardized containment checklist, no pre-authorized containment actions. The CISO asked for a timeline and the team spent 4 hours assembling one from memory and scattered notes. Module 1 builds the infrastructure that makes all three stages systematic.
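The playbook gap can be expressed as a decision function. In this Python sketch, the first function follows the three steps named above. The second adds a hypothetical AiTM branch to show what a missing path looks like. The alert fields and the extra branch are illustrative assumptions, not the framework Module 1 builds.

```python
def credential_compromise_playbook(alert):
    """Playbook as described: check MFA, check impossible travel,
    close if both pass."""
    if not alert["mfa_satisfied"]:
        return "escalate"
    if alert["impossible_travel"]:
        return "escalate"
    return "close"


def playbook_with_aitm_path(alert):
    """HYPOTHETICAL revision: MFA satisfied by a token claim is treated
    as a reason to escalate rather than as a passed check."""
    if alert.get("mfa_via_token_claim", False):
        return "escalate"
    return credential_compromise_playbook(alert)


# An AiTM sign-in as the original playbook sees it: MFA shows as
# satisfied and there is no impossible travel.
aitm_alert = {
    "mfa_satisfied": True,
    "impossible_travel": False,
    "mfa_via_token_claim": True,
}

print(credential_compromise_playbook(aitm_alert))
print(playbook_with_aitm_path(aitm_alert))
```

The original playbook closes the alert because both checks pass. It executed correctly and still produced the wrong outcome, which is why the methodology needs documenting and reviewing in addition to the rules.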
Stage 7: Feedback: the loop that makes everything better
Stage 7 is where investigation findings flow back into detection. Every true-positive (TP) investigation reveals something about how the attack worked: which techniques were used, which telemetry was visible, which detection rules fired (or didn't). That intelligence feeds the detection backlog: new rules for techniques that weren't covered, tuning for rules that produced false positives (FPs) during the investigation, and coverage assessments that identify the next priority gaps.
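Turning one investigation into backlog items is mostly set arithmetic. The Python sketch below assumes each rule is tagged with the ATT&CK technique IDs it detects. The function name, the rule-to-technique tags, and the incident data are illustrative and are not taken from NE's actual rule library.

```python
def build_detection_backlog(observed, rule_library, fired, fp_rules):
    """Derive backlog items from one true-positive investigation.

    observed:     technique IDs seen in the investigation
    rule_library: rule name -> set of technique IDs the rule targets
    fired:        names of rules that alerted during the incident
    fp_rules:     names of rules that produced false positives
    """
    covered = set().union(*rule_library.values())
    return {
        # Techniques the attacker used that no rule targets.
        "new_rules": sorted(set(observed) - covered),
        # Rules that target an observed technique but stayed silent.
        "review": sorted(
            name for name, techniques in rule_library.items()
            if techniques & set(observed) and name not in fired
        ),
        # Rules that generated noise during the investigation.
        "tune": sorted(fp_rules),
    }


# Illustrative inputs (rule-to-technique tags are assumptions).
rule_library = {
    "Failed login threshold": {"T1110"},
    "New inbox rule": {"T1114.003"},
    "Impossible travel": {"T1078"},
}
observed = {"T1557", "T1114.003", "T1078"}

backlog = build_detection_backlog(
    observed, rule_library,
    fired={"New inbox rule"},
    fp_rules={"Failed login threshold"},
)
print(backlog)
```

Each key maps to one kind of feedback work: writing a rule, finding out why an existing rule missed, or tuning a noisy one.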
Without Stage 7, the pipeline is a one-way conveyor. Alerts come in, incidents go out, and the detection library never improves. The same gaps persist. The same techniques succeed. The SOC is permanently reactive, responding to the attacks the existing rules catch, blind to everything else.
The feedback loop is the hardest stage to sustain. It requires dedicated time (the L3 protected block from Section 1.2), a structured process (the monthly tuning review from Module 10), and organizational commitment to prioritize improvement over throughput. Most SOCs acknowledge that feedback is important and then never do it because the alert queue always has more immediate urgency. The queue is infinite. The feedback loop has no deadline. Without structural protection (scheduled time that is not negotiable), Stage 7 atrophies. Modules 10-12 build the operational cadences that make Stage 7 systematic: monthly tuning reviews, quarterly coverage assessments, threat intelligence integration.
In the typical SOC, only Stages 1, 3, 4, and sometimes 5 exist. The telemetry is collected (Stage 1). Template rules produce alerts (Stage 3, but Stage 2 was never customized). L1 triages (Stage 4). L2 sometimes investigates (Stage 5, but only when the alert is obviously serious). Containment is ad hoc (Stage 6: no pre-authorized playbook, no documented sequence). Feedback doesn't exist (Stage 7). The pipeline works for the attacks the template rules happen to catch. Everything else passes through undetected.
Estimate your own coverage
Stage 2 (detection) is the most common pipeline failure point: telemetry exists, but no rule queries it. The estimator below uses your actual data sources and rule count to estimate what percentage of the ATT&CK attack surface your SOC can detect. The output includes a tactic-by-tactic heatmap showing exactly which areas you cover well and which are gaps.
The estimate is based on your data sources (telemetry visibility) and rule count (detection logic). Most Microsoft-stack SOCs score 10-25% on first run.
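If you want to reproduce the idea by hand, the core calculation is a ratio per tactic. The Python sketch below is a simpler stand-in, not the course estimator: it assumes you already have a list of relevant techniques per tactic and know which ones at least one rule covers. All technique labels are placeholders, not real ATT&CK IDs.

```python
def coverage_by_tactic(relevant, covered):
    """Return tactic -> (covered count, relevant count)."""
    return {
        tactic: (len(techniques & covered), len(techniques))
        for tactic, techniques in relevant.items()
    }


def overall_coverage(relevant, covered):
    """Fraction of all relevant techniques with at least one rule."""
    all_techniques = set().union(*relevant.values())
    if not all_techniques:
        return 0.0
    return len(all_techniques & covered) / len(all_techniques)


# Placeholder technique labels, grouped by tactic (hypothetical data).
relevant = {
    "Initial Access": {"IA-1", "IA-2", "IA-3"},
    "Persistence": {"P-1", "P-2", "P-3", "P-4"},
    "Collection": {"C-1", "C-2", "C-3"},
}
covered = {"IA-1", "C-2"}  # techniques targeted by at least one rule

for tactic, (hit, total) in coverage_by_tactic(relevant, covered).items():
    print(f"{tactic}: {hit}/{total}")
print(f"Overall: {overall_coverage(relevant, covered):.0%}")
```

The per-tactic breakdown matters more than the overall figure. In this sample, Persistence has zero coverage even though the overall number looks like a reasonable starting point.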
Your turn: establish the environment baseline
Coverage estimates are abstract until they're anchored in environment size. Before you can say "20% of attack techniques are covered," you need to know how big the environment is. Write the query that counts distinct users in the NE tenant.
SOC Operations Principle
The detection-and-response pipeline has seven stages. Each stage depends on the previous one. When Stage 2 (detection) breaks because no rule exists for the technique, Stages 3-7 don't happen. The failure is silent. The only way to find pipeline breaks before an attacker exploits them is to measure coverage at Stage 2 and quality at Stages 4-6. This course builds every stage.