Key Metrics: MTTD, MTTR, and What Actually Matters

3-4 hours · Module 0 · Free

What you already know

Section 0.3 defined the five-layer endpoint security stack — hardening, prevention, detection, response, and forensic readiness. This section introduces the metrics that measure whether each layer is actually working. You'll learn to distinguish between the metrics that impress executives and the metrics that drive engineering decisions — because they are not the same metrics.

Scenario

Your CISO presents last quarter's metrics to the board. MTTD is down to 12 minutes from 45. MTTR is down to 25 minutes from 90. The board sees green arrows and concludes the security posture is improving. What the dashboard doesn't show: the improvement came entirely from faster auto-triage of commodity malware alerts. Zero new custom detection rules were deployed. Zero ASR rules moved from audit to block. The detection coverage for credential theft, lateral movement, and persistence is unchanged at 0%. The numbers improved. The security posture did not.

Why MTTD and MTTR are necessary but insufficient

Mean Time to Detect (MTTD) and Mean Time to Respond (MTTR) are the most commonly reported SOC metrics. They appear in every vendor pitch, every analyst report, and every board-level security presentation. They matter because they measure speed, and speed matters during an active incident. A SOC that detects an adversary-in-the-middle (AiTM) attack in 8 minutes and contains it in 15 minutes produces a vastly different outcome than one that takes 4 hours for each.

The problem is that MTTD and MTTR can be optimized without improving security. Auto-closing low-severity alerts reduces MTTR to near zero for those alerts — but the alerts are closed, not investigated. Tuning alert thresholds to suppress noisy detections reduces alert volume and makes MTTD look faster — but the noisy detection might have caught a real attack at a lower confidence level. Detecting only commodity threats for which MDE has built-in detections produces a fast MTTD for those threats — but says nothing about detection capability for targeted attacks using custom tooling. A SOC that handles 95% of alerts in under 10 minutes but misses the 5% that represent actual compromise has excellent MTTD and MTTR numbers and a catastrophic security posture.

MTTD and MTTR also have a survivorship bias problem: they only include attacks that were detected. If an attacker operates undetected for 45 days (a dwell time well within the range that industry reports such as Mandiant's M-Trends document for sophisticated intrusions), those 45 days do not appear in your MTTD calculation because the attack was never detected during the measurement period. Your MTTD dashboard shows 12 minutes because it only measures the commodity threats your built-in rules caught. The targeted attack that has been running for six weeks is invisible to the metric entirely.
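The survivorship effect can be made concrete with a little arithmetic. The sketch below is illustrative only: the detection times and the single undetected intrusion are hypothetical numbers chosen to match the 12-minute and 45-day figures in the text.

```python
from statistics import mean

# Hypothetical detection times (minutes) for alerts the built-in rules caught.
detected_minutes = [8, 10, 12, 14, 16]

# One hypothetical intrusion that has been running undetected for 45 days.
undetected_minutes = 45 * 24 * 60  # 64,800 minutes and still counting

# The dashboard only averages what was detected.
reported_mttd = mean(detected_minutes)

# What the average would look like if the intrusion were found today and counted.
mttd_with_intrusion = mean(detected_minutes + [undetected_minutes])

print(f"Reported MTTD: {reported_mttd:.0f} minutes")
print(f"MTTD including the intrusion: {mttd_with_intrusion / 60 / 24:.1f} days")
```

Under these assumed numbers the dashboard value stays at 12 minutes until the intrusion is found, at which point a single incident moves the mean from minutes to days.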

There is also the attribution problem. An improved MTTR might reflect faster analyst response — which is genuinely good. Or it might reflect that the types of alerts changed. If your most complex alert type was eliminated (by auto-closing or threshold change), the remaining alerts are simpler and faster to handle. The average improves, but the hard problems are no longer being measured. Without segmenting MTTD and MTTR by alert type, severity, and detection source, the aggregate number conceals more than it reveals.
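A small worked example shows how the aggregate can improve while nothing about analyst performance changes. The alert types and response times below are hypothetical; the only point is the comparison between the overall mean and the per-type means.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (alert_type, minutes_to_respond) records for two quarters.
last_quarter = [
    ("commodity_malware", 20),
    ("commodity_malware", 30),
    ("credential_theft", 220),
]
# This quarter the complex alert type no longer appears (auto-closed or suppressed).
this_quarter = [
    ("commodity_malware", 20),
    ("commodity_malware", 30),
]

def overall_mttr(alerts):
    """Aggregate MTTR across every alert, regardless of type."""
    return mean(minutes for _, minutes in alerts)

def mttr_by_type(alerts):
    """MTTR segmented by alert type."""
    buckets = defaultdict(list)
    for alert_type, minutes in alerts:
        buckets[alert_type].append(minutes)
    return {alert_type: mean(values) for alert_type, values in buckets.items()}

print(overall_mttr(last_quarter), overall_mttr(this_quarter))
print(mttr_by_type(last_quarter))
print(mttr_by_type(this_quarter))
```

In this sketch the aggregate falls from 90 to 25 minutes, yet the commodity malware MTTR is 25 minutes in both quarters. The segmented view makes it visible that the hard alert type simply stopped being measured.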

The metrics that drive endpoint security engineering decisions are different. They measure what the stack actually prevents, what it actually detects, and what coverage gaps remain open. They are harder to present on a dashboard but impossible to game without actually improving the security posture.

Figure ES0.4 — Vanity metrics measure activity and look good on dashboards. Engineering metrics measure whether the endpoint security stack actually prevents, detects, and contains attacks. Both have a place — but only engineering metrics drive configuration decisions.

The four engineering metrics that matter

Prevention metrics measure whether your prevention controls are configured and effective. The percentage of ASR rules in block mode across the fleet tells you how much of the prevention layer is actually preventing. If 3 of 18 ASR rules are in block mode on 60% of devices, your prevention coverage is approximately 10% of its potential. The target is not 100% — some rules have legitimate business conflicts that keep them in audit or warn mode permanently. But the metric tracks progress toward the maximum enforceable configuration.
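The 10% figure comes from multiplying the share of rules in block mode by the share of devices they apply to. A minimal sketch of that arithmetic follows; the function name is hypothetical and the formula is a simplification that assumes the same rules are in block mode on every covered device.

```python
def asr_block_coverage(rules_in_block, total_rules, device_fraction):
    """Fraction of potential ASR prevention that is actually enforced.

    Simplifying assumption: the same set of rules is in block mode on
    every device included in device_fraction.
    """
    return (rules_in_block / total_rules) * device_fraction

coverage = asr_block_coverage(rules_in_block=3, total_rules=18, device_fraction=0.60)
print(f"Prevention coverage: {coverage:.0%}")
```

Tracking this single number over time gives a simple progress indicator, even though the realistic ceiling is below 100% for the reasons given above.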

The ASR block rate per rule per month measures the actual prevention impact. An ASR rule that blocks 200 execution attempts per month is preventing 200 potential compromises — each one a technique that would have required analyst triage if it had been allowed to execute. An ASR rule that blocks zero has two interpretations: either the technique is not being attempted against your environment, or the rule is not configured correctly. High block rates validate deployment. Zero block rates prompt investigation.

AV cloud protection level across the fleet reveals how many devices operate with enhanced protection versus defaults. Default level is functional. High+ catches significantly more unknown threats through deeper cloud-side analysis and longer detonation timeouts. The metric shows the gap between what your license provides and what your configuration delivers.

Detection metrics measure whether your detection layer finds real threats. The total custom detection rule count matters because zero custom rules means you depend entirely on Microsoft's built-in detections — which catch commodity threats but do not cover your organization's specific patterns or the targeted techniques from your threat model. The custom detection true positive rate — the percentage of alerts confirmed as genuine threats — drives detection engineering. Rules below 30% TP rate generate noise that degrades SOC effectiveness and should be tuned. Rules above 80% TP rate are high-quality detections that can safely trigger automated response. The hunting cadence — how frequently analysts execute proactive hunting queries — measures whether detection extends beyond automated rules. A SOC that hunts weekly finds threats that scheduled rules miss because the hunt adapts to emerging threat intelligence in real time rather than waiting for a rule to be written and deployed.
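The 30% and 80% thresholds translate directly into a triage rule for the detection backlog. The sketch below is one possible encoding; the function name and the labels it returns are hypothetical.

```python
def triage_rule(true_positives, total_alerts):
    """Classify a custom detection rule by its true positive rate."""
    if total_alerts == 0:
        return "no data"  # the rule never fired, so no rate can be computed
    tp_rate = true_positives / total_alerts * 100
    if tp_rate < 30:
        return "tune"  # noise that degrades SOC effectiveness
    if tp_rate > 80:
        return "automation candidate"  # reliable enough to trigger a response
    return "keep monitoring"

print(triage_rule(2, 20))   # 10% TP rate
print(triage_rule(18, 20))  # 90% TP rate
print(triage_rule(10, 20))  # 50% TP rate
```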

ATT&CK technique coverage percentage measures detection breadth — the percentage of MITRE ATT&CK techniques relevant to your threat model that are covered by at least one detection rule, whether built-in or custom. A SOC with a 12-minute MTTD but only 20% technique coverage detects fast but detects little. The Defender XDR Threat Analytics reports provide a starting point for mapping your coverage against active threat campaigns targeting your industry. The coverage number improves only when you build new detection rules — faster triage of existing alerts does not change it.
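Technique coverage is a set calculation: the techniques in the threat model that have at least one rule, divided by all techniques in the threat model. The sketch below uses real ATT&CK technique IDs, but the choice of which ones are in the threat model and which are covered is hypothetical.

```python
# Hypothetical threat model: techniques judged relevant to the organization.
threat_model = {"T1003", "T1021", "T1053", "T1059", "T1566"}

# Hypothetical inventory: techniques with at least one built-in or custom rule.
techniques_with_rules = {"T1059", "T1204"}

# Only coverage of in-scope techniques counts toward the metric.
covered = threat_model & techniques_with_rules
coverage_pct = len(covered) / len(threat_model) * 100
gaps = sorted(threat_model - techniques_with_rules)

print(f"Coverage: {coverage_pct:.0f}%")
print(f"Uncovered techniques: {gaps}")
```

Here a rule for T1204 adds nothing to the metric because that technique is outside the assumed threat model, and the list of gaps becomes the detection engineering backlog.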

MDE's exposure score from Defender Vulnerability Management assigns a numeric score based on the security posture of your fleet: unpatched vulnerabilities, misconfigured security controls, risky device configurations, and missing hardening measures. The absolute number matters less than the trend. A decreasing exposure score means your hardening and patching efforts are reducing the attack surface. An increasing score means new vulnerabilities or misconfigurations are being introduced faster than you remediate them. A flat score means your hardening effort is keeping pace with entropy but not closing the gap. The exposure score is refreshed daily, making it one of the few endpoint security metrics that gives day-over-day feedback on your engineering efforts.
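The three trend readings can be sketched as a simple comparison of the first and last values in a series of daily scores. The function name, the sample scores, and the one-point tolerance are assumptions for illustration, not values taken from the product.

```python
def exposure_trend(scores, tolerance=1.0):
    """Label a series of exposure scores (lower is better).

    tolerance is a hypothetical noise band: changes smaller than this
    are treated as flat.
    """
    if len(scores) < 2:
        return "insufficient data"
    delta = scores[-1] - scores[0]
    if delta < -tolerance:
        return "improving"  # attack surface is shrinking
    if delta > tolerance:
        return "degrading"  # exposure is growing faster than remediation
    return "flat"           # keeping pace, not closing the gap

print(exposure_trend([62, 58, 55]))
print(exposure_trend([40, 44, 49]))
print(exposure_trend([50, 50.5, 49.8]))
```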

Forensic readiness metrics measure whether evidence will exist when needed. The percentage of endpoints with advanced audit policies, PowerShell ScriptBlock logging, and Sysmon configured tells you whether the IR team can reconstruct attacker actions after containment. This metric is binary for each endpoint: the logging is configured or it is not. Partial logging creates evidence gaps that can undermine the entire investigation. An endpoint with advanced audit policies but no PowerShell ScriptBlock logging can tell you that a process was created but not what the PowerShell script inside that process actually did. An endpoint with Sysmon but default audit policies can show you process-level detail but misses the authentication events that reveal how the attacker obtained their credentials. The target for forensic readiness is 100% deployment across the endpoint fleet — every device generates the evidence the IR team needs, before the incident occurs.
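Because an endpoint only counts as ready when all three logging sources are configured, the fleet metric is the share of endpoints that pass every check. The sketch below assumes a hypothetical inventory keyed by device name; how that inventory is collected is outside its scope.

```python
REQUIRED_SOURCES = ("advanced_audit", "scriptblock_logging", "sysmon")

# Hypothetical inventory of logging configuration per endpoint.
endpoints = {
    "ws-001": {"advanced_audit": True, "scriptblock_logging": True, "sysmon": True},
    "ws-002": {"advanced_audit": True, "scriptblock_logging": False, "sysmon": True},
    "ws-003": {"advanced_audit": False, "scriptblock_logging": True, "sysmon": True},
    "ws-004": {"advanced_audit": False, "scriptblock_logging": False, "sysmon": False},
}

def is_forensically_ready(config):
    """An endpoint is ready only if every required source is configured."""
    return all(config.get(source, False) for source in REQUIRED_SOURCES)

ready = [name for name, config in endpoints.items() if is_forensically_ready(config)]
readiness_pct = len(ready) / len(endpoints) * 100

print(f"Forensic readiness: {readiness_pct:.0f}% ({len(ready)} of {len(endpoints)})")
```

In this sample only one of four endpoints is ready, even though three of them have at least some logging in place. Partial configurations do not count toward the metric.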

KQL

// ASR enforcement status across the fleet — block vs audit
DeviceEvents
| where Timestamp > ago(30d)
| where ActionType startswith "Asr"
| summarize
    BlockEvents = countif(ActionType endswith "Blocked"),
    AuditEvents = countif(ActionType endswith "Audited")
    by DeviceName
| summarize
    DevicesWithBlocks = dcountif(DeviceName, BlockEvents > 0),
    DevicesAuditOnly = dcountif(DeviceName, BlockEvents == 0 and AuditEvents > 0),
    TotalDevices = dcount(DeviceName)

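If DevicesAuditOnly is higher than DevicesWithBlocks, most of the devices reporting ASR activity are generating telemetry but preventing nothing. Every device in the audit-only column is running ASR rules that log attack techniques as they execute successfully: the rules are watching the attack, not stopping it. Note that the query only counts devices that produced at least one ASR event in the last 30 days, so devices with no ASR events at all, including devices where ASR is not configured, do not appear in TotalDevices.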

KQL

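// Custom detection effectiveness: TP rate by rule (last 30 days)
// Assumption: a Classification column is available on the alert data you query.
// If your AlertInfo schema does not expose it, join to wherever analyst
// classifications are stored, and verify the exact DetectionSource value in your tenant.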
AlertInfo
| where Timestamp > ago(30d)
| where DetectionSource == "CustomDetection"
| summarize
    TotalAlerts = count(),
    TruePositives = countif(Classification == "TruePositive"),
    FalsePositives = countif(Classification == "FalsePositive")
    by Title
| extend TPRate = round(TruePositives * 100.0 / TotalAlerts, 1)
| sort by TPRate asc

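If this query returns zero rows, no custom detection rule produced an alert in the last 30 days: either you have no custom detection rules, in which case your detection layer depends entirely on Microsoft's built-in models, or the rules you have never fired. If it returns rules with TPRate below 30%, those rules are generating more noise than signal and need tuning. The ascending sort puts the worst performers first; those are your tuning priorities. Alerts that have not been classified yet count in TotalAlerts but not in TruePositives, so a backlog of unclassified alerts lowers TPRate.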

What we see in 90% of environments

A monthly security report showing MTTD, MTTR, and total alerts handled — presented as evidence of a functioning security program. No metric tracks how many ASR rules are in block mode. No metric tracks custom detection coverage. No metric tracks forensic readiness. The SOC knows how fast it handles alerts. Nobody knows how many attack techniques the stack fails to detect at all. The dashboard is green. The coverage is 15%.

Building the measurement cadence

Metrics without cadence are snapshots that never become trends. The measurement cadence for an endpoint security program:

Daily (automated, zero analyst time). Device health summary: total devices, active sensors, inactive sensors, stale AV signatures. These run as saved queries or scheduled Sentinel analytics rules and alert on threshold violations. If the number of inactive sensors increases by more than 5% in a single day, something is wrong with your MDE deployment and needs immediate investigation: a misconfigured Intune policy, a network segment outage, or an attacker deliberately disabling the sensor on compromised machines. The daily health check is the only metric that catches sensor degradation before it creates a blind spot.
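The daily threshold check reduces to a day-over-day comparison. The sketch below reads the 5% as a relative increase in the inactive sensor count, which is one interpretation of the rule above; the function name is hypothetical.

```python
def inactive_sensor_alert(yesterday, today, threshold=0.05):
    """Return True when inactive sensors grew by more than threshold in one day.

    Assumption: the threshold is relative to yesterday's inactive count.
    """
    if yesterday == 0:
        # No baseline to compare against: any newly inactive sensor is notable.
        return today > 0
    return (today - yesterday) / yesterday > threshold

print(inactive_sensor_alert(100, 104))  # 4% increase
print(inactive_sensor_alert(100, 106))  # 6% increase
```

If your fleet is small, an absolute floor (for example, at least a handful of newly inactive devices) may be a useful addition to avoid alerting on a single laptop going offline.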

Weekly (15 minutes during standup). ASR enforcement summary: how many rules are in block mode across the fleet, how many block events occurred this week by rule, and whether any rules that were in block mode have reverted to audit or been disabled. Rule reversion is a common issue: an IT administrator disables an ASR rule to troubleshoot an application issue and forgets to re-enable it. Without the weekly check, the rule can remain disabled for months. The week-over-week trend in block events matters more than any single number: a sudden drop in block events for a rule that normally blocks 50 times per week may indicate that the rule was disabled rather than that attacks stopped. Hunting cadence check: were the weekly hunting queries executed, and were there any findings? This is a quick status check, not a deep analysis.
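A sudden drop in block events for a rule can be flagged by comparing the latest week against the average of the preceding weeks. The sketch below is one way to do that; the function name, the sample counts, and the 50% drop ratio are assumptions to tune for your environment.

```python
from statistics import mean

def suspicious_drop(weekly_blocks, drop_ratio=0.5):
    """Return True when the latest week falls below drop_ratio times the prior average."""
    if len(weekly_blocks) < 2:
        return False  # not enough history to compare against
    *history, current = weekly_blocks
    baseline = mean(history)
    return baseline > 0 and current < baseline * drop_ratio

# Hypothetical weekly block counts for one ASR rule, oldest first.
print(suspicious_drop([50, 48, 52, 3]))   # sharp drop: check whether the rule was disabled
print(suspicious_drop([50, 48, 52, 47]))  # normal variation
```

A flagged rule is a prompt to check its configured mode, not proof that it was disabled; attack volume can also fall for legitimate reasons.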

Monthly (2 hours for the engineering review). Custom detection performance review — for each active custom detection rule, calculate the TP rate over the last 30 days. Rules below 30% are flagged for tuning or retirement. Rules above 80% are candidates for automated response. Exposure score trend — is the score improving as hardening efforts reduce the attack surface, or degrading because new vulnerabilities are being introduced faster than remediation? Generate the executive dashboard for the CISO's monthly report, combining the engineering metrics with the MTTD/MTTR numbers the board expects to see.

The CISO presentation is where framing matters. Lead with the metric they care about — MTTD improved 15% this quarter — then explain the engineering cause: "We moved 6 ASR rules from audit to block mode, preventing 340 incidents that would have required analyst triage. That freed analyst capacity for the complex alerts, improving MTTD on high-severity incidents by 22%." The CISO gets the number the board wants. You get the engineering investment you need. The ASR enforcement metric explains the MTTD improvement, creating a cause-and-effect narrative that justifies continued configuration work.

Quarterly (half day for the strategic assessment). ATT&CK technique coverage mapping — which techniques in your threat model are covered by detections? Which gaps remain? This drives the detection engineering backlog for the next quarter. Maturity model re-assessment from Section 0.9 — score each layer of the stack against the maturity model and compare to the previous quarter. The delta is the progress metric that justifies continued investment to leadership. If the delta is flat — the same score two quarters in a row — something is blocking progress and needs escalation.

The cadence ensures metrics are not a one-time assessment but a continuous measurement system that tracks improvement and catches degradation before it becomes a security incident. A device compliance score that drops 5% in a week could indicate a failed Intune policy update, a new software deployment that broke compliance checks, or an attacker disabling security controls. Without the weekly cadence, you discover the degradation during the monthly review — or during the incident investigation.

Posture Assessment

Prevention layer: ASR rules in block mode: 0/18. AV cloud protection: Default. Prevention score: 0/10.

Detection layer: Custom detection rules: 0. Built-in alerts: active (commodity). Hunting cadence: none. Detection score: 2/10.

Forensic readiness: ScriptBlock logging: not configured. Sysmon: not deployed. Advanced audit: default. Forensic score: 0/10.

Composite posture: 2/30. This is the NE baseline. The course target: 24/30 by the end of the capstone module.
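The composite score is the sum of the three layer scores, each out of 10. A minimal sketch of that arithmetic, using the baseline values above and the stated course target, follows; the variable names are hypothetical.

```python
# Layer scores from the baseline assessment above, each out of 10.
layer_scores = {"prevention": 0, "detection": 2, "forensic_readiness": 0}

MAX_PER_LAYER = 10
COURSE_TARGET = 24

composite = sum(layer_scores.values())
maximum = MAX_PER_LAYER * len(layer_scores)
gap_to_target = COURSE_TARGET - composite

print(f"Composite posture: {composite}/{maximum}")
print(f"Points needed to reach the target: {gap_to_target}")
```

Recomputing the same sum after each quarterly assessment gives the delta that the strategic review compares against the previous quarter.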

Endpoint Security Principle

The most dangerous metric is the one that improves while security degrades. MTTD and MTTR can both decrease while attack surface, detection coverage gaps, and forensic readiness remain unchanged. Engineering metrics — ASR enforcement percentage, custom detection TP rate, technique coverage, forensic readiness completeness — measure whether the stack works. Report MTTD/MTTR for the executives. Use engineering metrics for the decisions.

Section 0.5 maps the Microsoft ecosystem — how Defender for Endpoint, Intune, Sentinel, Entra ID, and Defender XDR integrate into a unified endpoint security architecture. You'll understand which signals flow where, which console handles which operations, and why endpoint security cannot be designed in isolation from identity and cloud security.
