Mitigate Incidents Using Microsoft Defender XDR

10-14 hours · Module 1 · Free

What you already know

You understand the four Defender products and how the correlation engine groups their alerts into incidents. Now you need to work those incidents. This section covers the complete lifecycle from the moment an incident appears in the queue through triage, investigation, containment, and closure.

Scenario

You start your shift at Northgate Engineering and open the incident queue. Three new incidents are waiting: a High-severity multi-stage attack targeting a finance team member, a Medium-severity password spray against 12 accounts, and a Low-severity impossible travel alert on a known frequent traveler. You have five minutes to triage each one before the outgoing analyst's shift ends and the handoff is complete. The triage framework in this section is how you make those decisions consistently.

How incidents are created

An incident is a collection of correlated alerts that together describe an attack. The correlation engine evaluates shared entities across alerts: the same user, device, IP address, or mailbox appearing in multiple alerts within a time window. When the engine determines these alerts represent stages of a single attack, it groups them into one incident.

A single alert can also become an incident on its own. A high-severity ransomware pre-encryption alert creates an incident immediately because the response urgency does not allow waiting for additional correlation. The engine continues monitoring: if a second alert fires against the same entity within the time window, it merges into the existing incident rather than creating a new one.
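You can approximate the shared-entity idea in Advanced Hunting. The query below is a rough sketch, not the correlation engine's actual logic: it looks only at user accounts, and the 24-hour window is an assumed value chosen for illustration.

```kql
// Illustration only: accounts referenced by more than one alert in an assumed 24-hour window.
// The real correlation engine weighs more entity types and signals than this.
AlertEvidence
| where Timestamp > ago(24h)
| where isnotempty(AccountUpn)
| summarize
    AlertCount = dcount(AlertId),
    AlertTitles = make_set(Title, 20),
    FirstAlert = min(Timestamp),
    LastAlert = max(Timestamp)
    by AccountUpn
| where AlertCount > 1          // keep accounts that appear in multiple alerts
| sort by AlertCount desc
```

An account that appears in several alerts in a short span is the kind of overlap that leads the engine to group alerts, although the portal's grouping may differ from what this query returns.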

The incident object contains several components you'll use in every investigation. The severity score is calculated from the highest-severity alert plus the breadth of impacted entities. A High-severity alert affecting one user creates a High-severity incident; the same alert affecting twelve users creates a High-severity incident with a higher priority score. The attack story narrative is the correlation engine's interpretation of what happened, generated by mapping the correlated alerts to an ATT&CK chain. The Activities tab shows every automated and manual action taken against the incident, including attack disruption actions from Section 1.1. The Evidence and Response tab aggregates all supporting artifacts: files, processes, URLs, mailbox items, registry entries, and network connections collected across every alert in the incident.

Figure 1.2: The incident lifecycle. Each stage has specific actions and decision criteria covered in this section.

The incident queue: where every shift starts

Defender Portal

security.microsoft.com → Incidents & alerts → Incidents
This is the first page you open at every shift start. Filter by Status = "New" to see unworked incidents. Sort by Severity descending, then by Last activity descending within the same severity. The "Attack disruptions" summary card at the top shows incidents where automated containment already acted.

The queue columns give you triage information before you open any incident. Severity (High, Medium, Low, Informational) sets the priority order. The auto-generated incident name describes the attack type and target, such as "Multi-stage attack: phishing followed by credential theft on j.morrison." Status distinguishes New incidents from those already In Progress or Resolved. Categories map the incident to ATT&CK tactics (Initial Access, Credential Access, Persistence). Impacted assets show the users, devices, and mailboxes involved, giving you a quick scope estimate. Last activity tells you whether the attack may still be in progress: an incident with alerts from five minutes ago needs faster attention than one with its last alert six hours ago.

The queue also shows whether Automated Investigation and Response (AIR) or automatic attack disruption has already acted. If disruption has already disabled the compromised account and isolated the device, the incident still needs investigation, but the urgency shifts from stopping an active attack to verifying containment and assessing damage. Look for the yellow "Attack Disruption" tag in the incident row.
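If you want the same ordering outside the portal, the sort can be sketched at the alert level in Advanced Hunting. This is an approximation under stated assumptions: it ranks alerts from AlertInfo rather than incidents, and the 24-hour lookback and the SeverityRank column are illustrative choices, not portal behavior.

```kql
// Alert-level approximation of the queue sort: severity first, then recency.
// Incidents themselves are worked in the portal queue; this only ranks recent alerts.
AlertInfo
| where Timestamp > ago(24h)
| extend SeverityRank = case(
    Severity == "High", 1,
    Severity == "Medium", 2,
    Severity == "Low", 3,
    4)                           // Informational and anything unexpected sort last
| sort by SeverityRank asc, Timestamp desc
| project Timestamp, Severity, Title, Category, ServiceSource, AlertId
```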

The five-minute triage framework

Triage is not investigation. Triage is the decision about what to do next: investigate immediately, assign to a specific analyst, schedule for later review, or classify as a known false positive. Five minutes per incident is the target for initial triage.

Minute 1: Read the attack story. Open the incident and read the auto-generated narrative on the Attack Story tab. The engine tells you what it thinks happened: which alerts it correlated, which ATT&CK tactics it mapped, and what entities are involved. You are not validating the story yet. You are building your initial mental model.

Minute 2: Check the entities. How many users, devices, and mailboxes are affected? A single-user incident has a contained blast radius. An incident affecting twelve users across eight devices is a potential campaign. Check whether any impacted user holds a privileged role, such as Global Admin or Exchange Admin, or whether the account is a service account. A compromised privileged account escalates the incident by at least one severity level in your assessment, regardless of what the alert severity says.

Minute 3: Check for active attack signals. Look at the timestamp on the most recent alert. If the last alert is within the past thirty minutes, the attack may still be in progress and containment is urgent. Check whether automatic attack disruption has already acted. If not, and the attack appears active, containment actions (disable account, isolate device) take priority over further investigation.

Minute 4: Review AIR status. Open the Investigations tab to see if Automated Investigation and Response has run. AIR examines the evidence, produces a verdict for each entity and artifact (Malicious, Suspicious, Clean, or No Threats Found), and recommends remediation actions. Some actions execute automatically. Others require your approval. If AIR ran and the verdict is Malicious with pending remediation actions, reviewing and approving those actions is your next step.

Minute 5: Set status and assign. Set the incident to In Progress. Add a tag describing the attack type (Phishing, BEC, Ransomware, PasswordSpray). Assign the incident to yourself or to the appropriate analyst. If the incident is a known false positive pattern (a recurring alert from a specific detection rule that has been validated as benign in previous incidents), classify as False Positive and close with a comment explaining the pattern.
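The entity count from Minute 2 and the recency check from Minute 3 can also be pulled with one query when the portal view is slow or cluttered. This is a sketch, assuming you copy the alert IDs from the incident by hand; the placeholder IDs are hypothetical. It does not check privileged roles, which you still look up separately.

```kql
// Hypothetical placeholders: replace with the alert IDs listed on the incident.
let IncidentAlertIds = dynamic(["<alert-id-1>", "<alert-id-2>"]);
AlertEvidence
| where AlertId in (IncidentAlertIds)
| summarize
    Users = make_set_if(AccountUpn, isnotempty(AccountUpn)),
    Devices = make_set_if(DeviceName, isnotempty(DeviceName)),
    MostRecentAlert = max(Timestamp)
| extend
    UserCount = array_length(Users),
    DeviceCount = array_length(Devices),
    MinutesSinceLastAlert = datetime_diff("minute", now(), MostRecentAlert)
| extend PossiblyActive = MinutesSinceLastAlert <= 30   // thirty-minute threshold from Minute 3
```

PossiblyActive is only a prompt to look closer. An old last alert does not prove the attacker has left.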

Incident Comment

Triage assessment: High-severity, multi-stage incident. AiTM phishing with credential theft and post-compromise activity. 1 user affected (j.morrison, Finance). 1 device (DESKTOP-NGE042). Attack disruption has NOT acted. Most recent alert: 12 minutes ago. Active attack.

Immediate action: Disabling user account and revoking sessions now. Investigation to follow after containment.

Assigned to: T. Ashworth, Shift 1. Status: In Progress.

This is what the triage comment looks like in the incident timeline. The comment documents what you found, what you decided, and what happens next. Every analyst who opens this incident after you sees this entry. Section 1.7 covers the documentation discipline in full.

The four-level investigation workflow

After triage, investigation follows a consistent four-level drill-down from the broadest context to the most granular data. Each level answers different questions.

Level 1: The incident. The Attack Story tab gives you the full picture as the correlation engine understands it. The incident graph shows entities (users, devices, mailboxes, IP addresses, files) as nodes with connections between them representing the attack flow. Read the graph to understand the scope: how many entities are involved, which entities connect to multiple alerts, and where the attack chain starts and ends. The Attack Story page also allows you to take response actions directly: click any entity node to see available actions for that entity.

Level 2: The alerts. Each alert is one chapter of the attack story. Click an alert to see its specific evidence, the detection logic that triggered, and recommended response actions. A "Phishing email with credential harvesting link" alert shows the sender, recipient, subject line, URLs in the email body, and the delivery action (delivered, blocked, or zero-hour auto purge). A "Sign-in from anonymous IP address" alert shows the authentication details: IP address, geolocation, client application, Conditional Access evaluation result, and risk level. The alert detail page also shows related entities that connect this alert to others in the incident.

Level 3: The entities. Clicking a user entity opens their complete activity profile across all Defender products: recent sign-ins, devices used, email received, cloud app activity, and every alert that references this user across all incidents. Clicking a device opens its timeline showing every process creation, network connection, and file operation, plus its security posture (vulnerabilities, missing patches, configuration gaps) and alert history. Entities are investigation pivot points. When you see that a compromised user's device has a separate malware alert, the entity view reveals the connection. When you see that the source IP from a suspicious sign-in also appears in alerts for two other users, the entity view exposes a broader campaign.

Level 4: Advanced Hunting. When the portal UI does not answer your question, you write a KQL query. Advanced Hunting provides raw access to every event recorded by every Defender product. If you need every application the compromised user accessed in the four hours after the stolen token was used, the portal user profile shows recent activity but not a comprehensive time-bounded view. Advanced Hunting does. If you need to know whether any other user received the same phishing email, a query against EmailEvents with the sender domain and URL pattern returns the answer in seconds.

KQL

// Advanced Hunting: find all recipients of mail from the same phishing sender
EmailEvents
| where Timestamp > ago(24h)                                        // widen if the campaign may be older
| where SenderFromAddress =~ "invoice-update@contoso-billing.com"   // =~ compares case-insensitively
| project Timestamp, RecipientEmailAddress, Subject, DeliveryAction, ThreatTypes
| sort by Timestamp asc

This query identifies every recipient who received email from the same phishing sender. If the incident started with one user, this query tells you whether the campaign hit five others who haven't reported anything yet. That is the difference between containing one compromised account and catching a campaign before the other five accounts are exploited.
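The other Level 4 question, which applications the compromised user touched in the four hours after the stolen token was used, can be sketched the same way. The object ID and timestamp below are hypothetical placeholders, and CloudAppEvents only covers applications that Defender for Cloud Apps can see, so treat the result as a lower bound.

```kql
// Hypothetical values: substitute the user's Entra object ID and the
// time the stolen token was first used (taken from the sign-in alert).
let CompromisedUserId = "<account-object-id>";
let TokenFirstUsed = datetime(2025-01-01T00:00:00Z);   // placeholder, not real data
CloudAppEvents
| where Timestamp between (TokenFirstUsed .. TokenFirstUsed + 4h)
| where AccountObjectId == CompromisedUserId
| summarize
    Events = count(),
    Actions = make_set(ActionType, 25),
    SourceIPs = make_set(IPAddress, 25),
    FirstSeen = min(Timestamp),
    LastSeen = max(Timestamp)
    by Application
| sort by FirstSeen asc
```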

The containment sequence

When investigation confirms a compromise, the order of your response actions determines whether you eliminate the attacker's access or leave gaps they exploit while you work.

Figure 1.3: Each step closes a gap the previous step does not. Skipping or reordering steps leaves windows of continued attacker access.

Step 1: Disable the account. This blocks new sign-ins immediately, so the attacker can no longer authenticate as the user. Tokens that were issued before the account was disabled are dealt with in Step 2. From the incident page, click the user entity, then select "Disable user in Active Directory" or "Suspend user in Entra ID" depending on your directory. If automatic attack disruption already disabled the account, verify the action in the Activities tab and move to Step 2.

Step 2: Revoke sessions. Disabling the account prevents new sign-ins but does not invalidate existing OAuth tokens. A token stolen via AiTM can remain valid for up to one hour after the account is disabled. Revoking refresh tokens forces all clients to re-authenticate, which fails because the account is disabled. Run Revoke-MgUserSignInSession in Graph PowerShell or use the "Revoke sessions" action in the portal entity flyout.

Step 3: Reset the password. This prevents re-authentication with the compromised credential. If you reset without disabling first, the attacker's existing token keeps working until it expires. If you reset without revoking, the attacker's refresh token may still obtain new access tokens. The three steps together close every access path.

Step 4: Investigate scope. With the attacker locked out, you have time to assess damage. What emails did they read or forward? What files did they download from SharePoint or OneDrive? Did they create persistence mechanisms: inbox forwarding rules, OAuth app consent grants, additional user accounts, mail transport rules? Advanced Hunting queries across CloudAppEvents, EmailEvents, and the AuditLogs table (available when Microsoft Sentinel is connected) answer these questions. Section 1.8 builds the specific queries for this investigation.

Step 5: Eradicate. Remove everything the attacker created. Delete the inbox forwarding rule. Revoke any OAuth app consents they granted. Remove any accounts or guest users they provisioned. Undo any Conditional Access policy changes. Revert any mail transport rule modifications. Each persistence mechanism you miss is a path back in. The Evidence and Response tab on the incident aggregates the artifacts AIR found, which is your starting checklist.

Step 6: Re-enable. After eradication is complete, re-enable the account with a new password and verified MFA enrollment. If the user's MFA method was also compromised (the attacker registered their own authenticator app during the session), remove all MFA methods and have the user re-register from a verified clean device. Walk the user through what happened and what to watch for.
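As a preview of the persistence checks in Steps 4 and 5, the sketch below looks for inbox rule changes, mailbox forwarding changes, and app consent events made by the compromised account. The object ID and start time are hypothetical placeholders, and the ActionType strings are the commonly seen values; confirm them against your own tenant's data before relying on the result.

```kql
// Hypothetical values: substitute the user's Entra object ID and the estimated start of the compromise.
let CompromisedUserId = "<account-object-id>";
let CompromiseStart = datetime(2025-01-01T00:00:00Z);   // placeholder, not real data
CloudAppEvents
| where Timestamp > CompromiseStart
| where AccountObjectId == CompromisedUserId
| where ActionType in~ ("New-InboxRule", "Set-InboxRule", "Set-Mailbox")   // rule and forwarding changes
    or ActionType contains "Consent to application"                        // OAuth consent grants
| project Timestamp, ActionType, AccountDisplayName, IPAddress, RawEventData
| sort by Timestamp asc
```

RawEventData holds the rule parameters, such as a forwarding address or a move-to-folder action, which tell you what to delete during eradication.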

Incident classification and closure

After remediation, classify and close the incident. Classification drives the SOC metrics that leadership uses to evaluate detection effectiveness, and it feeds back into the correlation engine's tuning.

True Positive means the alert correctly identified a real threat. The attack happened. Track these to measure the value your detection rules produce.

False Positive means the alert was incorrect and no threat existed. Track these to identify detection rules that need tuning. A rule with a 40% false positive rate spends two of every five triage cycles on alerts that produce no security outcome.

Benign True Positive means the detection was technically correct (the activity did occur) but the activity was authorized. A penetration test triggering behavioral detection. An IT admin running a tool that looks like credential dumping. A user who set up an inbox rule for a legitimate business reason. The detection worked; the operational context made the activity benign. Track these separately because they validate detection coverage (the rule fires on the right behavior) while indicating opportunities for exception tuning.

Add a closing comment explaining what happened, what actions were taken, and what follow-up is needed. This comment becomes the incident record. When the CISO asks about the high-severity incident from last Tuesday, the comment is the answer. When the next analyst encounters a similar incident, the comment tells them how the previous one was resolved. Section 1.7 covers documentation standards in full.

Security Operations Principle

Disable before you reset. Revoke before you re-enable. The containment sequence exists because each step closes a specific access path that the other steps leave open. Disabling blocks new sign-ins. Revoking kills tokens. Resetting kills the credential. Skip any step and the attacker retains access through the mechanism that step would have eliminated.
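One way to check that the sequence worked is to look for activity by the account after containment. This is a sketch with hypothetical placeholder values. An empty result is consistent with containment but does not prove it, because CloudAppEvents covers only applications visible to Defender for Cloud Apps and events can arrive with some delay.

```kql
// Hypothetical values: substitute the user's Entra object ID and the time you finished Steps 1 to 3.
let CompromisedUserId = "<account-object-id>";
let ContainmentTime = datetime(2025-01-01T00:00:00Z);   // placeholder, not real data
CloudAppEvents
| where Timestamp > ContainmentTime
| where AccountObjectId == CompromisedUserId
| summarize Events = count(), LastSeen = max(Timestamp) by Application, IPAddress, ActionType
| sort by LastSeen desc
```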

Section 1.3 covers Defender for Office 365 in investigation depth: how the email protection stack detects threats, Threat Explorer for email investigation, remediation actions for phishing and BEC, and how Automated Investigation and Response handles email campaigns at scale.
