Why Detection Engineering Cannot Close the Gap

3-4 hours · Module 0 · Free

What you already know

Section 0.1 measured the detection gap from two angles: coverage ratio (where the gap is) and dwell time (what the gap costs). Both treated the gap as a measurement problem. This section explains why the gap exists structurally. Three categories of threat require three different operational responses, and five architectural limitations prevent detection rules from addressing them all. Understanding these limitations is what separates the argument "we need more rules" from "we need a different approach."

Scenario

Marcus Webb proposes hiring two additional detection engineers and tripling Northgate's analytics rule output over the next year. His investment would bring the rule count from 147 to over 400. Rachel asks: "If we triple the number of rules, does that triple the number of ATT&CK techniques we cover?" Marcus pauses. At his previous employer, tripling the rule count improved coverage from 29% to 34% because most new rules covered variants of techniques that already had rules. Uncovered techniques stayed uncovered. He needs to articulate why before he can propose the complementary investment.

The detection pyramid: three layers of visibility

David Bianco's Pyramid of Pain established that not all detection indicators are equally valuable because not all are equally painful for attackers to change. The detection pyramid extends that insight into a visibility model that maps three categories of threat to three operational disciplines.

The detection pyramid maps three threat categories to three operational disciplines. The known-unknown layer — hunting's domain — contains the most damaging threats and receives the least investment.

Known-known threats are techniques where the attack pattern is documented, the telemetry is available, and a detection rule exists in your SIEM. When the attack occurs, an alert fires. This is detection engineering's domain. The CardinalOps 2025 report, analyzing over 13,000 rules across production SIEMs including Sentinel, found the average enterprise covers 21% of ATT&CK techniques. Of rules that do exist, 13% are non-functional due to misconfigured data sources. In practice, this base layer is narrower than most organizations believe.

Known-unknown threats are techniques that are documented in ATT&CK, discussed in threat intelligence, and presented at security conferences, but for which your environment has no detection rule. The telemetry to surface them is probably flowing into Sentinel already. What creates this gap is operational: nobody is querying that data for that pattern. Given that 79% of ATT&CK techniques have no corresponding SIEM rule in the average enterprise, this layer represents the single largest unaddressed threat surface. This is hunting's domain.

Unknown-unknown threats are novel attack paths that researchers have not yet classified: zero-day exploitation of platform behaviors, creative abuse of legitimate features. You cannot hunt for something you cannot hypothesize. This is the domain of anomaly detection: statistical baselining that identifies deviation from normal. Sentinel UEBA, Defender for Identity's behavioral models, and custom time-series decomposition queries operate here.
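As a rough illustration of statistical baselining, the sketch below flags a value that deviates sharply from a user's own history using a z-score. It is a generic technique chosen for illustration, not a description of how Sentinel UEBA or Defender for Identity work internally; the download counts and the threshold of 3.0 are hypothetical.

```python
from statistics import mean, stdev

def is_anomalous(baseline, observed, z_threshold=3.0):
    """Flag `observed` if it sits more than z_threshold standard
    deviations away from the mean of the baseline values."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > z_threshold

# Hypothetical daily file-download counts for one user over a week.
baseline = [12, 15, 11, 14, 13, 12, 16]

normal_day = is_anomalous(baseline, 14)    # within the usual range
unusual_day = is_anomalous(baseline, 90)   # far outside the usual range
```

Note that the check needs no hypothesis about what the attacker is doing, which is why it suits the unknown-unknown layer and also why it cannot replace a hypothesis-driven hunt.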

Most SOC budgets fund the base layer exclusively. Detection engineers write rules, tune thresholds, and manage false positives. The known-unknown layer, which contains the most actionable gap, receives no operational investment until a hunting program begins. A vendor offering a managed UEBA service covers the unknown-unknown layer partially, but UEBA detects statistical deviations, not specific hypothesized techniques. The middle layer remains empty until a human analyst formulates a hypothesis and tests it.

Damage distribution inverts relative to investment. The base layer receives the most funding and handles the most frequent threats, but those threats are the least damaging precisely because they trigger alerts and get contained. The known-unknown layer receives no operational funding, yet it holds the threats that cause the most organizational damage: long-dwell-time intrusions, data exfiltration campaigns, and undetected persistence that survives remediation. Threats designed to evade detection rules are, by definition, the threats that operate in the layers detection engineering cannot reach. They succeed because nobody is looking in the layer where they live.

Five structural limitations

The detection gap is not a staffing problem. It follows from five limitations that are properties of the detection method itself.

Rules encode anticipation. Every detection rule implements what its author predicted the attack would look like. A Sentinel rule detecting inbox rule creation monitors for New-InboxRule operations with ForwardTo or RedirectTo actions. A BEC operator who reads the published rule logic uses MoveToFolder instead, redirecting financial emails to the RSS Feeds folder. Same objective, same ATT&CK technique (T1564.008), different action type. The rule does not fire. Researchers at the Fraunhofer Institute analyzed 292 widely deployed SIEM rules and found that at least 44% could be evaded using straightforward modification techniques: character case changes, field value substitution, process path manipulation, argument reordering, and equivalent command alternatives. Not exotic zero-day evasions. Variations a moderately skilled attacker applies by reading the rule and changing one parameter. Offensive tool developers specifically study published detection logic: EvilGinx developers modified their AiTM proxy to avoid triggering Microsoft's anomalousToken risk detection, and PowerShell obfuscation frameworks evolve specifically to bypass AMSI detection patterns.
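The inbox rule example can be sketched in a few lines. The event dictionaries and field names below are simplified, hypothetical stand-ins for audit records, not the real log schema; the point is only that a predicate written around anticipated action types misses a variant with the same objective.

```python
# Hypothetical, simplified audit events; field names are illustrative only.
FORWARDING_ACTIONS = {"ForwardTo", "RedirectTo"}

def narrow_rule(event):
    """Detection rule: fires only on the action types its author anticipated."""
    return (
        event["operation"] == "New-InboxRule"
        and bool(FORWARDING_ACTIONS & set(event["parameters"]))
    )

def hunt_predicate(event):
    """Hunting view: surface every inbox rule creation for human review."""
    return event["operation"] == "New-InboxRule"

events = [
    {"user": "alice", "operation": "New-InboxRule",
     "parameters": {"ForwardTo": "ext@example.com"}},
    # Same objective (hide financial mail), different action type.
    {"user": "bob", "operation": "New-InboxRule",
     "parameters": {"MoveToFolder": "RSS Feeds",
                    "SubjectContainsWords": "invoice"}},
]

alerted = [e["user"] for e in events if narrow_rule(e)]     # ['alice']
hunted = [e["user"] for e in events if hunt_predicate(e)]   # ['alice', 'bob']
```

The broader predicate would be too noisy to run as an automated alert, but it is a reasonable starting dataset for an analyst.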

Rules require ingested telemetry. A rule monitoring service principal authentication queries AADServicePrincipalSignInLogs. If your organization does not ingest this table, the rule is syntactically correct and operationally blind. The same applies to AADNonInteractiveUserSignInLogs (where AiTM token replay appears), MicrosoftGraphActivityLogs (where Graph API abuse appears), and CloudAppEvents (where OAuth consent events appear). The telemetry exists. Microsoft records it. The organization chose not to ingest it. Every detection rule targeting those data sources is blind by architecture.
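One way to make this limitation visible is to compare the tables each rule depends on with the tables the workspace actually ingests. The sketch below assumes you can export both lists; the rule names and the ingested set are hypothetical.

```python
# Hypothetical rule inventory: rule name -> tables the rule queries.
rules = {
    "Service principal sign-in anomaly": {"AADServicePrincipalSignInLogs"},
    "Token replay from new IP": {"AADNonInteractiveUserSignInLogs"},
    "Suspicious OAuth consent": {"CloudAppEvents"},
    "Interactive sign-in brute force": {"SigninLogs"},
}

# Hypothetical set of tables this workspace ingests.
ingested_tables = {"SigninLogs", "AuditLogs", "CloudAppEvents"}

def blind_rules(rules, ingested):
    """Return each rule that depends on at least one table that is not
    ingested, mapped to the sorted list of its missing tables."""
    return {
        name: sorted(tables - ingested)
        for name, tables in rules.items()
        if tables - ingested
    }

blind = blind_rules(rules, ingested_tables)
```

Here two of the four rules are enabled and syntactically valid but can never fire.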

Rules trade sensitivity for specificity. An analytics rule that fires on more than five failed logins in ten minutes will not catch an attacker who paces at four attempts per ten minutes. Every threshold is a permission slip for a patient adversary. In practice, detection engineers tune toward specificity because false positives carry an immediate operational cost: a rule that fires 50 times per day with a 2% true positive rate generates 49 wasted investigations for every real detection. The threshold gets raised, exclusions get added, and the detection floor climbs. Hunting does not operate on this tradeoff because hunting queries do not generate automated alerts. A hunting query that returns 500 results does not generate 500 incidents. It generates a dataset that a human analyst reviews, enriches with context, and makes judgments about. The analyst can tolerate a noisy result set because they are investigating, not triaging an alert queue. They examine the 500 results, identify the 3 that are suspicious based on contextual factors, and investigate those in depth. Hunting can afford to operate at sensitivity levels that would be operationally destructive as automated detections.
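The threshold arithmetic above can be checked with a small simulation. This is a sketch under stated assumptions: the rule is modeled as "more than five failures in any ten-minute sliding window", and the attacker timings are invented for illustration.

```python
from datetime import datetime, timedelta

def max_in_window(timestamps, window):
    """Largest number of events that fall inside any sliding window."""
    ts = sorted(timestamps)
    best = 0
    start = 0
    for end in range(len(ts)):
        # Shrink the window until it spans less than `window`.
        while ts[end] - ts[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best

def rule_fires(timestamps, threshold=5, window=timedelta(minutes=10)):
    """Fires on MORE than `threshold` failures within the window."""
    return max_in_window(timestamps, window) > threshold

t0 = datetime(2025, 1, 1, 9, 0)

# Patient attacker: 4 attempts every 10 minutes for 2 hours (48 attempts).
paced = [t0 + timedelta(minutes=10 * w, seconds=20 * i)
         for w in range(12) for i in range(4)]

# Impatient attacker: 8 attempts within a few minutes.
burst = [t0 + timedelta(seconds=30 * i) for i in range(8)]

# The false-positive cost quoted above: 50 alerts/day at a 2% true positive rate.
alerts_per_day = 50
true_positives = round(alerts_per_day * 0.02)   # 1
wasted = alerts_per_day - true_positives        # 49
```

The paced attacker makes six times as many attempts as the burst attacker and never crosses the threshold.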

Rules are static. A detection rule deployed today matches today's technique variant. The technique evolves, the environment changes, new applications introduce behavioral patterns the rule was not designed to handle. Without continuous validation, rules accumulate silent failures. No mechanism in Sentinel or Defender XDR allows a detection rule to evaluate its own effectiveness. A rule that has not fired in 12 months cannot tell you whether it is a well-targeted detection for a rare event or a broken rule that would not fire even if the attack occurred. The AiTM toolkits that detection rules matched in 2023 have since been modified to evade the specific patterns those rules look for. EvilGinx developers study published detection logic and engineer evasion against it. Your rule stays still. The technique moves on. Nobody notices until someone runs the underlying query manually with broader parameters. Running that query is a hunt. It tests the rule by looking for the technique the rule was supposed to catch.
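Because a silent rule cannot distinguish "rare event" from "broken", one practical response is to queue long-silent rules for a validation hunt. The sketch below assumes you can export a last-fired date per rule; the rule names, dates, and the 365-day cutoff are hypothetical.

```python
from datetime import date, timedelta

def validation_queue(last_fired, today, max_silence=timedelta(days=365)):
    """Return rules that never fired or have been silent longer than
    max_silence, ordered with never-fired rules first, then oldest."""
    silent = [
        (name, fired)
        for name, fired in last_fired.items()
        if fired is None or today - fired > max_silence
    ]
    silent.sort(key=lambda item: item[1] or date.min)
    return [name for name, _ in silent]

# Hypothetical export: rule name -> date of last alert (None = never fired).
last_fired = {
    "AiTM token anomaly": date(2023, 11, 2),
    "Inbox forwarding rule": date(2025, 5, 20),
    "Service principal credential added": None,
}

queue = validation_queue(last_fired, today=date(2025, 6, 1))
```

Each entry in the queue is a candidate hunt: run the rule's underlying query manually with broader parameters and see whether the technique is present but unmatched.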

Rules generate alerts without context. When a rule fires, it produces an alert with the matched entity, the severity, and the technique tag. It does not explain whether the matched behavior is a weekly admin process or an actual attack. It does not answer "was this the same user who registered a new MFA method an hour ago?" or "did this sign-in originate from an AiTM proxy?" Those questions require querying different tables and applying reasoning the rule was never designed to perform. Hunting operates in the opposite direction. A hunt for BEC persistence does not look for inbox rule creation in isolation. It looks for the chain: anomalous sign-in, MFA registration within 24 hours, inbox rule creation within 48 hours, and mailbox access from the same session. The hunt produces findings with built-in context because the hypothesis defined the context from the start. An analyst who runs that hunt learns the environment's behavioral patterns through direct engagement with the data, which is why experienced hunters make better alert triagers than analysts who have only worked the queue.
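The chained hypothesis can be sketched as a correlation over per-user event timelines. This is a simplified illustration: the event kinds and timestamps are hypothetical, both time windows are assumed to be measured from the anomalous sign-in, and the final step (mailbox access from the same session) is omitted for brevity.

```python
from datetime import datetime, timedelta

def find_bec_chains(events):
    """Find users with an anomalous sign-in followed by an MFA registration
    within 24 hours and an inbox rule creation within 48 hours."""
    by_user = {}
    for user, kind, ts in sorted(events, key=lambda e: e[2]):
        by_user.setdefault(user, []).append((kind, ts))

    findings = []
    for user, items in by_user.items():
        for kind, signin in items:
            if kind != "anomalous_signin":
                continue
            mfa = [t for k, t in items if k == "mfa_registration"
                   and signin <= t <= signin + timedelta(hours=24)]
            rule = [t for k, t in items if k == "inbox_rule_created"
                    and signin <= t <= signin + timedelta(hours=48)]
            if mfa and rule:
                findings.append({"user": user, "signin": signin,
                                 "mfa": mfa[0], "inbox_rule": rule[0]})
    return findings

# Hypothetical events: (user, kind, timestamp).
events = [
    ("alice", "anomalous_signin",   datetime(2025, 1, 6, 8, 0)),
    ("alice", "mfa_registration",   datetime(2025, 1, 6, 9, 0)),
    ("alice", "inbox_rule_created", datetime(2025, 1, 7, 10, 0)),
    ("bob",   "inbox_rule_created", datetime(2025, 1, 6, 14, 0)),  # routine
    ("carol", "anomalous_signin",   datetime(2025, 1, 6, 8, 0)),
    ("carol", "mfa_registration",   datetime(2025, 1, 9, 8, 0)),   # too late
    ("carol", "inbox_rule_created", datetime(2025, 1, 6, 12, 0)),
]

findings = find_bec_chains(events)
```

Only the full chain produces a finding, and the finding carries its context with it: the sign-in, the MFA change, and the inbox rule arrive together rather than as three unrelated alerts.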

Analyst Decision

Assessment: Northgate's plan to reach 400 rules and improve ATT&CK coverage will meaningfully strengthen the known-known layer. It does not address the five structural limitations identified above.

Limitation exposure: Of 147 existing rules, an estimated 13% may be non-functional (telemetry gaps). If the Fraunhofer finding generalizes to Northgate's rule set, at least 44% of the remainder are vulnerable to straightforward evasion by an attacker who reads the published rule logic. Rules tuned for specificity create a detection floor below which attacker activity passes unnoticed.
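The exposure estimate can be worked through as plain arithmetic. This assumes the industry-wide percentages quoted in this section apply directly to Northgate's 147 rules, which is an assumption rather than a measurement of Northgate's environment.

```python
total_rules = 147
non_functional_rate = 0.13   # share of rules broken by data source problems
evadable_rate = 0.44         # lower bound on rules evadable by simple changes

non_functional = round(total_rules * non_functional_rate)   # 19
functional = total_rules - non_functional                   # 128
evadable = round(functional * evadable_rate)                # 56
robust = functional - evadable                              # 72
```

On those assumptions, roughly 72 of the 147 rules are both functional and resistant to simple evasion, which is about half of the nominal rule count.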

Recommendation: Approve the detection engineering plan. Supplement with a parallel hunting capability that investigates the known-unknown layer the rules do not cover, validates existing rules by hunting for the techniques they claim to detect, and converts hunt findings into new detection rules through the hunt-to-detection pipeline.

The complementary relationship

These five limitations are architectural. Tripling Marcus's team produces three times as many rules operating under the same five constraints. More rules improve depth and breadth within the known-known layer. They do not reach into the known-unknown layer, where the 79% of ATT&CK techniques with no corresponding rule sit.
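The reason rule count and coverage diverge is that coverage counts distinct techniques, not rules. The sketch below reproduces the 29% to 34% figure from the scenario with invented data: the technique labels are placeholders (not real ATT&CK IDs), and the total of 100 techniques is a simplification chosen to keep the percentages readable.

```python
def coverage(rule_techniques, total_techniques):
    """Share of techniques with at least one rule mapped to them."""
    return len(set(rule_techniques)) / total_techniques

TOTAL_TECHNIQUES = 100  # hypothetical simplification

# 100 existing rules spread across 29 distinct techniques.
before = [f"tech-{i % 29:02d}" for i in range(100)]

# 200 new rules: 180 are variants on already-covered techniques,
# 20 land on just 5 previously uncovered techniques.
new_rules = ([f"tech-{i % 29:02d}" for i in range(180)]
             + [f"tech-{29 + i % 5:02d}" for i in range(20)])
after = before + new_rules

coverage_before = coverage(before, TOTAL_TECHNIQUES)   # 29 of 100
coverage_after = coverage(after, TOTAL_TECHNIQUES)     # 34 of 100
```

The rule count triples from 100 to 300 while the number of covered techniques rises from 29 to 34.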

Hunting addresses each limitation through a structurally different approach. Where rules encode anticipation, the hunter formulates a hypothesis based on current intelligence and tests it against live data. The hypothesis can be refined in real time as the data reveals unexpected patterns. Where rules require specific tables, the hunter queries whatever data exists and adjusts the methodology to the available telemetry. Where rules use fixed thresholds, the hunter examines all activity below and above the threshold, identifying patterns that automated detection cannot afford to alert on. Where rules are static, the hunter adapts the query to current technique variants during the investigation itself. Where rules lack context, the hunter brings months of environmental knowledge to every result, distinguishing the weekly admin process from the attacker's lateral movement by recognizing patterns that no rule could encode.

Both disciplines reinforce each other through the hunt-to-detection pipeline. Every hunt that discovers a new pattern produces a detection rule that automates the finding permanently, moving it from the known-unknown layer into the known-known layer. Every detection rule that fires generates data that informs the next hunting hypothesis. Every rule that fails to fire on a technique the hunter found through manual investigation identifies a detection gap that needs remediation.

This cycle is continuous: hunt, discover, automate, hunt the next gap. Each successful campaign shrinks the known-unknown layer by one technique. Detection engineering maintains the growing known-known layer by tuning and validating the rules that hunts produce. Breaking the cycle by funding only one discipline leaves the other layer permanently unmanned.

Responding to a coverage gap analysis by doubling the analytics rule backlog

Two years later the rule count has tripled, but the coverage ratio has improved by only five percentage points, because most new rules added depth on already-covered techniques rather than breadth on uncovered ones. The known-unknown layer is almost as large as it was before the investment. The five limitations did not change because no amount of rule-writing changes architectural properties of the detection method.

Threat Hunting Principle

The five limitations of detection rules are architectural, not operational. No staffing increase, budget allocation, or tooling upgrade resolves them. Hunting exists because rules are structurally incapable of covering the full threat surface. The two disciplines are complementary by design, connected through the hunt-to-detection pipeline that converts hunting discoveries into permanent automated coverage.

Section 0.3 maps the specific M365 threat landscape that hunting must address. AiTM session hijacking, living-off-the-cloud techniques, OAuth persistence, and hybrid identity exploitation all operate inside the detection gap using legitimate credentials and standard operations.
