The Detection Gap

3-4 hours · Module 0 · Free

What you already know

You operate a SIEM. Sentinel fires scheduled alerts. Defender XDR generates incidents. Your analytics rule count is a number you can quote. But you've never calculated what percentage of ATT&CK techniques those rules actually cover, or how long attackers operate in your environment before anyone notices. Those two measurements define the hunting opportunity: one tells you where the gap is, the other tells you what the gap costs. This section teaches you to calculate both.

Scenario

Rachel Okafor asks Tom Ashworth for the SOC's detection coverage status. Tom reports 147 analytics rules deployed across Sentinel and Defender XDR. Rachel asks the follow-up question that changes everything: "Of the ATT&CK techniques attackers actually use against M365 environments, what percentage do those 147 rules cover?" Tom doesn't know. Three weeks later, an incident investigation reveals an attacker was in the environment for 23 days before a high-severity alert fired. During those 23 days, the attacker registered a secondary MFA method, created mailbox forwarding rules, consented to two OAuth applications, and downloaded 4,200 files from three SharePoint document libraries. Every action was logged. None generated an alert.

Rule count versus coverage ratio

Ask your SOC lead how many analytics rules are deployed. You will get a number: 80, 120, 200. The number sounds like coverage. It is not coverage. It is rule count.

Coverage is a ratio: the number of ATT&CK techniques your rules detect, divided by the number of ATT&CK techniques relevant to your environment. A 2025 industry analysis of enterprise SIEMs found average detection coverage at 21% of ATT&CK techniques, meaning 79% of adversary techniques defined in the framework produced no alert, no incident, no notification of any kind. When narrowed to the ten most frequently used techniques in observed attacks, organizations covered only four.

Twenty rules that all detect variants of email phishing give you deep coverage of one technique and zero coverage of everything else. A SOC with 50 rules distributed across 40 distinct ATT&CK techniques has a stronger detection posture than a SOC with 200 rules clustered on 15 techniques. The metric that matters is not how many rules you have. It is how many distinct techniques those rules cover, measured against the techniques attackers actually use in your threat landscape.
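The difference between rule count and technique breadth can be shown on a toy inventory. The sketch below is self-contained: the rule names and technique mappings in the inline `datatable` are hypothetical and exist only to illustrate the two measurements side by side.

```kql
// Hypothetical rule-to-technique mapping; names and IDs are illustrative only
let RuleMap = datatable(RuleName: string, Technique: string) [
    "Phish link click",          "T1566",
    "Phish attachment opened",   "T1566",
    "Phish reported by user",    "T1566",
    "Suspicious inbox rule",     "T1564",
    "New MFA method registered", "T1098"
];
RuleMap
// RuleCount is the number leadership usually hears; DistinctTechniques is the coverage numerator
| summarize RuleCount = dcount(RuleName), DistinctTechniques = dcount(Technique)
```

Five rules collapse to three techniques in this example. The same two-column comparison applied to a real rule inventory shows how much of the rule count is depth on techniques you already cover.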

NE's 147 analytics rules map to 23 distinct ATT&CK techniques. Four tactics have zero custom rules. Post-access operations produce no alerts.

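The query below extracts the numerator from your Sentinel workspace: the count of distinct ATT&CK techniques tagged on at least one scheduled analytics rule that fired in the last 90 days. Rules that are deployed but did not fire in that window are not counted, so treat the result as a lower bound on mapped techniques.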

KQL

// Coverage ratio numerator — distinct ATT&CK techniques with firing rules
SecurityAlert
| where TimeGenerated > ago(90d)
| where ProviderName == "ASI Scheduled Alerts"
| extend Techniques = tostring(
    parse_json(ExtendedProperties).["Techniques"])
| where isnotempty(Techniques) and Techniques != "[]"
| extend TechniqueList = parse_json(Techniques)
| mv-expand TechniqueList
| summarize by Technique = tostring(TechniqueList)
| summarize CoveredTechniques = count()

The denominator requires judgment. The full ATT&CK Enterprise matrix contains more than 200 techniques and more than 400 sub-techniques. Not all are relevant to your environment. An organization running exclusively in the cloud can exclude techniques requiring physical access or on-premises server exploitation. Open the ATT&CK Navigator, select the techniques relevant to your M365 environment (cloud identity, SaaS, email, endpoint if Defender for Endpoint is deployed, hybrid identity if Microsoft Entra Connect, formerly Azure AD Connect, is in use), and count. A typical M365 E5 environment produces a denominator between 70 and 100 techniques.
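Once the denominator exists as a list, the same alert data can name the uncovered techniques instead of only counting the covered ones. The sketch below is one way to do that, under two assumptions: the `RelevantTechniques` list is a hypothetical placeholder for the technique IDs you selected in ATT&CK Navigator, and your scheduled alerts carry technique IDs in `ExtendedProperties` in the same form the numerator query expects.

```kql
// Hypothetical denominator: replace with the technique IDs selected in ATT&CK Navigator
let RelevantTechniques = datatable(Technique: string) [
    "T1078", "T1098", "T1114", "T1528", "T1564", "T1566"
];
// Techniques tagged on scheduled rules that fired in the last 90 days
let CoveredTechniques =
    SecurityAlert
    | where TimeGenerated > ago(90d)
    | where ProviderName == "ASI Scheduled Alerts"
    | extend Techniques = tostring(parse_json(ExtendedProperties).["Techniques"])
    | where isnotempty(Techniques) and Techniques != "[]"
    | mv-expand TechniqueList = parse_json(Techniques)
    | summarize by Technique = tostring(TechniqueList);
// Relevant techniques with no matching alert: the candidate hunting surface
RelevantTechniques
| join kind=leftanti CoveredTechniques on Technique
| order by Technique asc
```

The output is a list of technique IDs rather than a percentage. Matching here is by exact ID string, so decide up front whether your denominator is expressed at technique or sub-technique level.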

If your ratio is below 25%, you're in the majority. Between 25% and 40% indicates a mature detection engineering program. Above 40% is advanced. Regardless of where you land, the remaining gap is the hunting surface.
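The arithmetic and the banding can be written down in a few lines. This sketch uses the NE figures from this section's worked example (23 covered techniques against a denominator of 78) as literal inputs; substitute your own two numbers.

```kql
// Example inputs: 23 covered techniques, 78 relevant techniques
print Covered = 23, Relevant = 78
| extend CoveragePct = round(100.0 * Covered / Relevant, 1)
// Bands follow the thresholds described above
| extend Band = case(
    CoveragePct < 25, "Below 25%: in the majority",
    CoveragePct <= 40, "25-40%: mature program",
    "Above 40%: advanced")
```

With these inputs the ratio works out to 29.5%, which falls in the 25-40% band.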

Analyst Decision

Numerator: 23 distinct ATT&CK technique IDs mapped across 147 rules. Denominator (M365-relevant techniques from ATT&CK Navigator): 78. Coverage ratio: 23 / 78 = 29.5%.

Tactic distribution (custom rules): Initial Access (14 rules), Execution (9 rules), Persistence (3 rules), Defense Evasion (2 rules). Credential Access (0), Lateral Movement (0), Collection (0), Exfiltration (0).

Assessment: NE's detection posture covers the entry point but loses visibility after initial access. Post-compromise activity produces no alerts from the custom analytics layer. Priority: hunt the four zero-rule tactics before writing additional Initial Access rules.
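A tactic distribution like the one in this assessment can be approximated from alert data. The sketch below assumes the `Tactics` column in `SecurityAlert` holds a comma-separated list and that tactic names are written without spaces (for example `InitialAccess`); both are assumptions to verify against your workspace. The `AllTactics` list is supplied inline so that tactics with zero rules show up as zeros rather than disappearing.

```kql
// Assumed tactic spellings; adjust to match the values in your Tactics column
let AllTactics = datatable(Tactic: string) [
    "InitialAccess", "Execution", "Persistence", "PrivilegeEscalation",
    "DefenseEvasion", "CredentialAccess", "Discovery", "LateralMovement",
    "Collection", "CommandAndControl", "Exfiltration", "Impact"
];
let RulesByTactic =
    SecurityAlert
    | where TimeGenerated > ago(90d)
    | where ProviderName == "ASI Scheduled Alerts"
    | where isnotempty(Tactics)
    | mv-expand Tactic = split(Tactics, ",") to typeof(string)
    | extend Tactic = trim(@"\s+", Tactic)
    // Count distinct rule names, not alert volume
    | summarize Rules = dcount(AlertName) by Tactic;
AllTactics
| join kind=leftouter RulesByTactic on Tactic
| project Tactic, Rules = coalesce(Rules, 0)
| order by Rules asc
```

Because it reads fired alerts, this counts only rules that produced at least one alert in the window. A tactic showing zero here may still have a deployed rule that has never fired.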

What lives in the gap

These are not theoretical categories. They are techniques from production M365 compromise investigations where no detection rule existed at the time of discovery.

OAuth application abuse (T1528) uses the standard Entra ID consent flow. The attacker consents to an application with Mail.ReadWrite and Files.ReadWrite.All permissions. The application reads every email and downloads OneDrive files indefinitely, surviving password resets because the OAuth grant persists independently. The consent itself is a legitimate operation. The distinction between a user consenting to a productivity tool and a user tricked into consenting to a data theft application is behavioral: data access volume, access patterns, permission scope relative to stated purpose. That analysis is a hunt, not a rule.
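A starting point for that hunt is to list recent consent grants that include the high-value scopes named above. The sketch assumes Entra ID audit logs are ingested into the `AuditLogs` table and that consent events carry a `ConsentAction.Permissions` modified property; treat both as assumptions to confirm in your tenant.

```kql
// Consent grants that include mail or file read/write scopes
AuditLogs
| where TimeGenerated > ago(30d)
| where OperationName == "Consent to application"
| extend AppName = tostring(TargetResources[0].displayName)
| extend User = tostring(InitiatedBy.user.userPrincipalName)
| mv-expand Prop = TargetResources[0].modifiedProperties
| where tostring(Prop.displayName) == "ConsentAction.Permissions"
| extend Permissions = tostring(Prop.newValue)
| where Permissions has_any ("Mail.ReadWrite", "Files.ReadWrite.All")
| project TimeGenerated, User, AppName, Permissions
| order by TimeGenerated desc
```

The result is a review list, not a verdict. The behavioral questions (access volume, access pattern, scope relative to purpose) still have to be answered for each application it returns.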

Inbox rule manipulation via Graph API (T1564.008) supports BEC operations. The attacker creates inbox rules that redirect emails containing financial keywords to hidden folders and suppress security notifications. Most detection rules monitor rule creation through Outlook or OWA. Rules created via the Microsoft Graph API, the preferred method for automated post-compromise toolkits, produce telemetry in CloudAppEvents, not Exchange admin audit logs. If your detection only watches the admin audit path, Graph API rule creation is invisible.
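A first look at inbox rule activity in CloudAppEvents might resemble the sketch below. The three `ActionType` values are assumptions based on common Exchange operation names, not a verified list for your tenant; run a `summarize count() by ActionType` first and adjust the filter to what your data actually contains.

```kql
// ActionType values are assumptions; confirm them against your own CloudAppEvents data
CloudAppEvents
| where TimeGenerated > ago(30d)
| where ActionType in ("New-InboxRule", "Set-InboxRule", "UpdateInboxRules")
| project TimeGenerated, AccountDisplayName, ActionType, Application, IPAddress, RawEventData
| order by TimeGenerated desc
```

The rule conditions and actions sit inside `RawEventData`. Reviewing them for financial keywords, hidden-folder moves, and delete actions is the manual part of the hunt.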

Service principal credential addition (T1098.001) exploits existing automation. The attacker adds a new credential to a service principal that already has high-privilege permissions, perhaps one IT created for a legitimate workflow. Authentication as that service principal appears in AADServicePrincipalSignInLogs, a table many organizations do not ingest into Sentinel.
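The credential change itself is recorded in the directory audit trail even when the sign-in table is not ingested. The sketch below assumes `AuditLogs` is available and that the operation names match the strings shown; operation names vary in punctuation between tenants and over time, so the second filter uses a phrase match rather than an exact string.

```kql
// New secrets or certificates added to service principals and applications
AuditLogs
| where TimeGenerated > ago(30d)
| where OperationName == "Add service principal credentials"
    or OperationName has "Certificates and secrets management"
| extend Target = tostring(TargetResources[0].displayName)
// The actor may be a user or another application
| extend Actor = coalesce(
    tostring(InitiatedBy.user.userPrincipalName),
    tostring(InitiatedBy.app.displayName))
| project TimeGenerated, OperationName, Target, Actor, Result
| order by TimeGenerated desc
```

Each row is a candidate for one question: was this credential added by the team that owns the workflow?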

Conditional Access policy modification (T1562.001) weakens the perimeter silently. An attacker holding the Global Administrator role adds an IP exclusion to a CA policy, exempting their infrastructure from MFA requirements. The modification is recorded in AuditLogs but buried among hundreds of daily directory change events. Without a specific detection rule for CA policy changes, the attacker weakens identity security without generating any alert.
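Pulling CA policy changes out of the daily noise is a short query. This sketch assumes the three operation names below are how your tenant records policy changes and that the before and after policy definitions appear in the first modified property.

```kql
// Conditional Access policy changes with before/after values
AuditLogs
| where TimeGenerated > ago(30d)
| where OperationName in (
    "Add conditional access policy",
    "Update conditional access policy",
    "Delete conditional access policy")
| extend Policy = tostring(TargetResources[0].displayName)
| extend Actor = tostring(InitiatedBy.user.userPrincipalName)
| extend OldValue = tostring(TargetResources[0].modifiedProperties[0].oldValue)
| extend NewValue = tostring(TargetResources[0].modifiedProperties[0].newValue)
| project TimeGenerated, OperationName, Policy, Actor, OldValue, NewValue
| order by TimeGenerated desc
```

Comparing `OldValue` with `NewValue` for added exclusions (locations, users, groups) is where an IP exemption of the kind described above would surface.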

All four techniques share three characteristics: the attacker uses a legitimate M365 operation, the operation is recorded in available telemetry, and no detection rule covers it. The data exists. The rule does not. That is the detection gap.

What the gap costs: dwell time

The coverage ratio tells you where the gap is. Dwell time tells you what happens inside it.

Mandiant's M-Trends 2026 report, drawn from over 500,000 hours of incident response investigations conducted in 2025, found that global median dwell time rose to 14 days, up from 11 the previous year. After a decade of improvement, the trend reversed. Internal detection improved to a median of 9 days, but external notification cases jumped to 25 days, pulled up by espionage campaigns achieving median dwell times of 122 days.

The Microsoft Digital Defense Report 2025 adds M365-specific context. Average dwell time across Microsoft IR engagements was 12 days, with data collection or staging activity observed in 80% of reactive engagements. These numbers represent investigated incidents. The intrusions that were never detected never appear in any dwell time statistic. The actual median, including undetected compromise, is unknowable and almost certainly higher.

What does the attacker accomplish during those days? On Day 1, they establish persistence: register a new MFA method on the compromised account, create inbox rules that suppress security notifications, consent to an OAuth application with Mail.ReadWrite and Files.ReadWrite.All permissions. The identity is now theirs even if the password is reset.

Days 2 through 5 are reconnaissance. The attacker maps organizational structure through GAL harvesting, identifies high-value targets by reading executive calendars and shared mailboxes, tests privilege boundaries by attempting access to SharePoint sites and admin portals. Every action uses legitimate M365 operations. Every action is logged in tables your detection rules ignore.

Days 5 through 14 are objective execution. For BEC operators, this is the social engineering window: they study communication patterns, then send the fraudulent invoice from the compromised mailbox. For data theft operators, this is the exfiltration window: they download SharePoint libraries using sync or bulk download, exfiltrating through Microsoft's own infrastructure on standard ports. For ransomware affiliates, this is staging: mapping Active Directory, disabling backup processes, positioning for the encryption event that is the last step, not the first.
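The exfiltration window described above leaves a volume signature. The sketch below looks for users downloading unusually many distinct files per hour. It assumes the Office 365 connector populates `OfficeActivity`, and the threshold of 500 files is an arbitrary starting point rather than a recommended value.

```kql
// Hourly download volume per user; the 500-file threshold is an arbitrary starting point
OfficeActivity
| where TimeGenerated > ago(30d)
| where Operation in ("FileDownloaded", "FileSyncDownloadedFull")
| summarize
    Files = dcount(OfficeObjectId),
    Sites = dcount(Site_Url)
    by UserId, bin(TimeGenerated, 1h)
| where Files > 500
| order by Files desc
```

Tune the threshold against a baseline of your own users. Sync clients and migration accounts will generate legitimate spikes that need to be excluded before the output is useful.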

Measuring your own dwell time

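Your Sentinel incident data contains the measurement. The query below calculates the gap between the earliest recorded activity in each closed incident and the time the incident was created. The earliest recorded activity is the earliest evidence attached to the incident, which may be later than the attacker's true first action.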

KQL

// Dwell time baseline: first recorded activity to incident creation
SecurityIncident
| where TimeGenerated > ago(180d)
// SecurityIncident logs a new row on every update; keep only the latest row per incident
| summarize arg_max(TimeGenerated, *) by IncidentNumber
| where Status == "Closed"
// FirstActivityTime is a native column; it is not stored inside AdditionalData
| where isnotnull(FirstActivityTime)
| extend DwellDays = datetime_diff('day', CreatedTime, FirstActivityTime)
| where DwellDays >= 0 and DwellDays < 365
| summarize
    Median = percentile(DwellDays, 50),
    P75 = percentile(DwellDays, 75),
    P90 = percentile(DwellDays, 90),
    IncidentCount = count()

The P90 is the number that should concern you most. It represents the long tail: intrusions where the attacker had extended undetected access, reached the entrenchment phase, and caused the most organizational damage. If your P90 exceeds 30 days, your detection layer has a significant detection lag that hunting directly addresses. Compare your median to the M-Trends 2026 benchmark of 14 days. If yours is lower, your detection engineering is effective for the threats it covers. But the undetected intrusions may have dwell times far longer.

The distinction between MTTD and MTTR matters here. Mean time to respond (MTTR) measures how fast the SOC acts after an alert fires. Mean time to detect (MTTD) measures how long the attacker was present before the alert existed. A SOC with a 2-hour MTTR and a 30-day median dwell time responds quickly to incidents it eventually detects. But the attacker had 30 days of unmonitored access before that response began. Hunting compresses MTTD for threats that live in the detection gap.
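The two measurements can be placed side by side from the same incident data. This is a rough sketch under stated assumptions: it uses `FirstActivityTime` to `CreatedTime` as a proxy for time to detect, and `CreatedTime` to `ClosedTime` as a proxy for time to respond. Neither proxy is exact, and it reports medians rather than means to limit the effect of outliers.

```kql
// Detection lag versus response time, using incident timestamps as proxies
SecurityIncident
| where TimeGenerated > ago(180d)
// Keep only the latest row per incident
| summarize arg_max(TimeGenerated, *) by IncidentNumber
| where Status == "Closed"
| where isnotnull(FirstActivityTime) and isnotnull(ClosedTime)
| extend DetectHours = datetime_diff('hour', CreatedTime, FirstActivityTime)
| extend RespondHours = datetime_diff('hour', ClosedTime, CreatedTime)
| where DetectHours >= 0 and RespondHours >= 0
| summarize
    MedianDetectHours = percentile(DetectHours, 50),
    MedianRespondHours = percentile(RespondHours, 50),
    IncidentCount = count()
```

A large `MedianDetectHours` next to a small `MedianRespondHours` is the pattern described above: fast response to incidents that were slow to surface.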

There is a category of compromise that never appears in any dwell time statistic: the ones where the attacker achieved their objective and left without detection. The BEC operator who intercepted one wire transfer and disappeared. The data theft operator who exfiltrated a customer database that surfaced on a dark web marketplace months later. M-Trends 2026 found some espionage-linked intrusions achieved dwell times approaching 400 days, well beyond standard 90-day log retention policies. Hunting does not guarantee you will find these intrusions. But it is the only operational activity that proactively looks for them. Detection rules wait for a pattern. Hunting goes looking.

Coverage decays

Even strong coverage degrades. Three forces work against you continuously.

Technique evolution is the first. When Microsoft publishes detection guidance, attackers adapt. M-Trends 2026 found that the mean time from vulnerability disclosure to exploitation has dropped to negative seven days, meaning attackers are routinely exploiting vulnerabilities before they are publicly disclosed, let alone patched. Your rule that worked six months ago may catch 80% of current variants instead of 95%. Nobody notices unless someone tests.

Environmental drift is the second. Your M365 environment changes constantly: new applications deployed, new conditional access policies, new user populations onboarded, new integrations connected. Each change introduces attack surface your existing rules were not designed to cover. A rule written before you adopted Defender for Cloud Apps does not examine OAuth consent phishing through that workload.

Rule atrophy is the third. A rule that has not fired in 12 months is either a well-targeted detection for a rare event or a broken rule that would not fire even if the attack occurred. Without validation, there is no way to tell the difference. A 2025 analysis found 13% of SIEM detection rules are non-functional due to misconfigured data sources and missing log fields. Organizations accumulate silent failures.
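A first pass at finding quiet rules can come from alert history. The sketch below lists scheduled rules that fired at some point in the last year but not in the last 90 days. Both windows are arbitrary choices, and the query assumes your `SecurityAlert` retention covers the full lookback.

```kql
// Rules that fired within the last year but have been silent for 90 days
let Lookback = 365d;
let QuietPeriod = 90d;
SecurityAlert
| where TimeGenerated > ago(Lookback)
| where ProviderName == "ASI Scheduled Alerts"
| summarize LastFired = max(TimeGenerated), Alerts = count() by AlertName
| where LastFired < ago(QuietPeriod)
| order by LastFired asc
```

Rules that have never fired do not appear here at all, because they leave no rows in `SecurityAlert`. Finding those requires comparing this output against your full rule inventory, and either list still needs validation testing to separate rare-event detections from broken rules.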

The coverage ratio is not a number you calculate once. It changes as your environment evolves, attackers adapt, and rules age without maintenance. Hunting reveals whether your detections still work. When a hunt finds evidence of a technique that should have triggered an existing rule, it surfaces both a potential compromise and a detection failure simultaneously.

The SOC reports 147 analytics rules deployed

The security dashboard shows all rules healthy. The mean time to respond is under four hours. Leadership concludes detection is adequate. Nobody has calculated the coverage ratio or the dwell time baseline. When the coverage query runs, 147 rules map to 23 distinct techniques clustered on Initial Access and Execution. Four tactics have zero custom rules. The attacker's entire post-access operation is invisible. The dashboard measured effort and response speed while ignoring detection breadth and detection lag.

Threat Hunting Principle

Two numbers define your hunting surface. The coverage ratio tells you where the gap is. The dwell time baseline tells you what the gap costs. Together they answer the question every hunting program starts with: where do we look, and why does it matter?

Section 0.2 examines why the detection gap cannot be closed by writing more rules. Five structural limitations of detection engineering are architectural, not staffing problems. Understanding these limitations clarifies why hunting is a complementary discipline, not a workaround for insufficient rule-writing capacity.