
TH4.2 Hunting AiTM Session Hijacking

6-8 hours · Module 4
What you already know

The previous section covered building the authentication baseline. This section uses that baseline to hunt AiTM (adversary-in-the-middle) session hijacking.

How AiTM works, and why MFA does not stop it

The attacker deploys a reverse proxy (EvilProxy, Evilginx, Caffeine, Tycoon 2FA) that sits between the user and the real Microsoft login page. The user sees a legitimate-looking login form. They enter their credentials. The proxy forwards them to Microsoft. Microsoft challenges with MFA. The user completes MFA on their phone. Microsoft issues a session token. The proxy captures the token and forwards it to the attacker while also completing the login for the user, who notices nothing unusual.

Anti-Pattern

Hunting AiTM session hijacking without a hypothesis

The hunter opens Advanced Hunting and starts writing queries without a clear hypothesis. They find interesting data but cannot determine whether it represents a threat, a misconfiguration, or normal activity. Every hunt starts with a hypothesis: a specific, testable statement about attacker behavior. Without a hypothesis, you are exploring, not hunting. Exploration has value, but it produces findings you cannot action without additional scoping.

The attacker now has a valid session token with a completed MFA claim. They import this token into a browser or a tool like ROADtools and authenticate as the user. From Microsoft's perspective, this is a valid session: the credentials were correct, MFA was completed, and the token is properly signed. The only anomaly: the token is being used from a different IP, device, and location than the one where MFA was completed.

This is the detection surface. The interactive sign-in (where MFA was completed) happened from the user's real device, through the proxy's IP. The non-interactive token refresh (where the attacker uses the stolen token) happens from the attacker's infrastructure. The gap between these two authentication events is the signal.

The core AiTM detection query

// AiTM Session Hijacking — Core Detection
// Finds non-interactive token refreshes from IPs that do not
// appear in the user's interactive sign-in history
let baselineWindow = 30d;
let detectionWindow = 7d;
// Step 1: Build per-user interactive IP baseline
let InteractiveBaseline = SigninLogs
| where TimeGenerated > ago(baselineWindow)
| where ResultType == 0
| where IsInteractive == true
| summarize
    BaselineIPs = make_set(IPAddress, 100),
    BaselineSubnets = make_set(
        strcat(split(IPAddress, ".")[0], ".",
               split(IPAddress, ".")[1], ".",
               split(IPAddress, ".")[2], ".0/24"), 50)
    by UserPrincipalName;
// Step 2: Find non-interactive sign-ins from non-baseline IPs
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(detectionWindow)
| where ResultType == 0
| join kind=inner InteractiveBaseline
    on UserPrincipalName
// Check: is this non-interactive IP in the user's baseline?
// set_has_element tests exact membership in the dynamic array
| extend IPInBaseline = set_has_element(BaselineIPs, IPAddress)
| extend SubnetKey = strcat(
    split(IPAddress, ".")[0], ".",
    split(IPAddress, ".")[1], ".",
    split(IPAddress, ".")[2], ".0/24")
| extend SubnetInBaseline = set_has_element(BaselineSubnets, SubnetKey)
| where IPInBaseline == false and SubnetInBaseline == false
// This non-interactive sign-in came from an IP (and subnet)
//   that the user has NEVER signed in from interactively
// This is the core AiTM signal — a token being used from
//   infrastructure the user does not operate from
| extend Country = tostring(LocationDetails.countryOrRegion)
| extend City = tostring(LocationDetails.city)
| extend DeviceOS = tostring(DeviceDetail.operatingSystem)
| extend Browser = tostring(DeviceDetail.browser)
| extend App = AppDisplayName
| project
    TimeGenerated,
    UserPrincipalName,
    IPAddress,
    Country,
    City,
    App,
    DeviceOS,
    Browser,
    ResourceDisplayName,
    UserAgent,
    // Include the baseline for comparison
    BaselineIPCount = array_length(BaselineIPs)
| sort by TimeGenerated desc
// INTERPRETATION:
// Each row is a non-interactive token refresh from a non-baseline IP
// Not all are malicious — some are legitimate:
//   - User on a new ISP (home IP changed)
//   - User traveling (hotel WiFi)
//   - Corporate proxy that rotates IPs
// Filter further with the enrichment queries below

Reducing false positives: IP reputation and context

The core query produces candidates, not findings. Enrichment separates legitimate new IPs from attacker infrastructure.

// Enrichment 1: How many users share this non-baseline IP?
// Attacker IPs are typically used by 1 user (targeted replay)
// Corporate infrastructure IPs serve many users
let suspicious = AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
| join kind=inner (
    SigninLogs
    | where TimeGenerated > ago(30d)
    | where ResultType == 0 and IsInteractive == true
    | summarize BaselineIPs = make_set(IPAddress, 100)
        by UserPrincipalName
) on UserPrincipalName
| where not(set_has_element(BaselineIPs, IPAddress))
| summarize
    AffectedUsers = dcount(UserPrincipalName),
    Users = make_set(UserPrincipalName, 10),
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated),
    Apps = make_set(AppDisplayName, 10)
    by IPAddress;
suspicious
| extend IPClassification = case(
    AffectedUsers == 1, "SINGLE-USER (investigate)",
    AffectedUsers <= 3, "LOW-VOLUME (suspicious)",
    AffectedUsers > 3, "MULTI-USER (likely infrastructure)")
| sort by AffectedUsers asc
// SINGLE-USER IPs are the highest signal — one user's token
//   is being replayed from infrastructure nobody else uses
// MULTI-USER IPs may be a new corporate proxy, VPN, or ISP
//   — investigate before classifying as malicious

// Enrichment 2: Did the user have an interactive sign-in from
// this IP within 24 hours? If yes, the user may have traveled
// or changed networks, which lowers confidence of compromise
// Note: the baseline here covers the 30 days BEFORE the 7-day
// detection window. A baseline that overlapped the detection
// window would already contain every recent interactive IP,
// and the check below could never return a row.
let nonBaselineEvents = AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
| join kind=inner (
    SigninLogs
    | where TimeGenerated between (ago(37d) .. ago(7d))
    | where ResultType == 0 and IsInteractive == true
    | summarize BaselineIPs = make_set(IPAddress, 100)
        by UserPrincipalName
) on UserPrincipalName
| where not(set_has_element(BaselineIPs, IPAddress))
| project TimeGenerated, UserPrincipalName, IPAddress,
    NonInteractiveTime = TimeGenerated;
// Check for nearby interactive sign-in from same IP
// (inner join: only user/IP pairs with a match are returned)
nonBaselineEvents
| join kind=inner (
    SigninLogs
    | where TimeGenerated > ago(7d)
    | where ResultType == 0 and IsInteractive == true
    | project InteractiveTime = TimeGenerated,
        UserPrincipalName, InteractiveIP = IPAddress
) on UserPrincipalName
| where InteractiveIP == IPAddress
| extend TimeDiff = abs(datetime_diff(
    'hour', NonInteractiveTime, InteractiveTime))
| where TimeDiff < 24
// If this query returns a row for a user/IP pair: the user DID
//   sign in interactively from this IP within 24 hours
// This REDUCES the AiTM confidence, because the user may have
//   legitimately moved to this IP
// If a user/IP pair from the core query is ABSENT here: the IP
//   has ONLY appeared in non-interactive logs, which is the
//   highest AiTM confidence
| summarize NearbyInteractive = count()
    by UserPrincipalName, IPAddress
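
// Enrichment 3: Is this IP associated with a known residential
// proxy, VPN, or Tor exit node? These are common attacker
// infrastructure for AiTM replay
// Check the IP against Microsoft's risk detections
// Note: an IP seen only in non-interactive logs will not appear
// in SigninLogs. If this returns nothing, try the same filter
// against AADNonInteractiveUserSignInLogs.
SigninLogs
| where TimeGenerated > ago(7d)
// Replace the empty strings with the suspicious IPs from Enrichment 1
| where IPAddress in ("", "")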
| extend RiskDetail = tostring(
    parse_json(RiskEventTypes_V2))
| where isnotempty(RiskDetail) and RiskDetail != "[]"
| project TimeGenerated, UserPrincipalName, IPAddress,
    RiskDetail, RiskLevelDuringSignIn
// If Microsoft flags this IP with "anonymizedIPAddress",
//   "maliciousIPAddress", or "suspiciousIPAddress" risk types,
//   the IP is known-bad infrastructure
// Combine with the single-user classification from Enrichment 1
//   for highest-confidence findings

The AiTM investigation decision tree

After running the core query and enrichment:

High confidence (escalate to IR): Non-baseline IP + single-user association + no nearby interactive sign-in + Microsoft risk detection on the IP + application access to mail or files. This pattern matches a complete AiTM replay: a stolen token used from attacker infrastructure to access high-value data.

Medium confidence (investigate further): Non-baseline IP + single-user + no nearby interactive sign-in, but no Microsoft risk detection. The IP may be a new legitimate location. Check: did the user report travel? Is the IP geolocation consistent with known user locations? Is the accessed application consistent with the user's normal app set?

Low confidence (monitor): Non-baseline IP but multi-user association, or nearby interactive sign-in exists. Likely a new corporate IP or user network change. Add to the baseline if confirmed legitimate.
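
The medium-confidence check "is the accessed application consistent with the user's normal app set?" can be sketched as a query. This is a minimal illustration, not part of the original hunt: targetUser and suspectIP are hypothetical placeholders you fill in from the core query results, and the 37-day lookback assumes your workspace retains that much sign-in data.

```kusto
// Sketch: compare apps reached from a suspect IP with the user's prior app set
// Assumptions: targetUser and suspectIP are placeholders (hypothetical values)
let targetUser = "user@example.com";
let suspectIP = "203.0.113.10";
// Apps the user touched in the 30 days before the detection window
let NormalApps = toscalar(
    AADNonInteractiveUserSignInLogs
    | where TimeGenerated between (ago(37d) .. ago(7d))
    | where ResultType == 0 and UserPrincipalName =~ targetUser
    | summarize make_set(AppDisplayName, 200));
// Apps reached from the suspect IP during the detection window
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
| where UserPrincipalName =~ targetUser and IPAddress == suspectIP
| summarize
    Events = count(),
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated)
    by AppDisplayName, ResourceDisplayName
| extend AppInNormalSet = set_has_element(NormalApps, AppDisplayName)
| sort by AppInNormalSet asc, Events desc
```

Rows with AppInNormalSet == false sort to the top. An app the user has never used, reached only from the suspect IP, pushes a medium-confidence candidate toward escalation. A familiar app from a plausible location pushes it toward monitoring.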

AiTM SESSION HIJACKING: DETECTION LOGIC

USER AUTHENTICATES (interactive): real device, real IP, MFA completed → SigninLogs: IP = user's real location

ATTACKER REPLAYS TOKEN (non-interactive): different IP, different device, valid MFA claim → AADNonInteractiveUserSignInLogs: IP = attacker infrastructure

PROXY DETECTION: Non-interactive IP ∉ Interactive baseline. The attacker's IP never appeared in the user's interactive sign-in history.

Single-user IP? (Enrichment 1): attacker IPs serve 1 user only

No nearby interactive? (Enrichment 2): user never authenticated from this IP

Risk detection? (Enrichment 3): Microsoft flags the IP as suspicious

ALL THREE = HIGH CONFIDENCE AiTM → Escalate to IR

Figure TH4.2. AiTM detection logic. The core signal is non-interactive token use from IPs outside the interactive baseline. Enrichment queries raise or lower confidence.

Run the core detection query against your workspace (7-day detection window, 30-day baseline). How many non-baseline non-interactive events appear?

If the count is > 100, you likely need VPN/proxy exclusions. Identify the most common non-baseline IPs with Enrichment 1. If an IP serves 10+ users, treat it as infrastructure and add it to your known-good list.

If the count is 0-10, examine each event individually. Check: is the IP single-user? Is there a nearby interactive sign-in? What application was accessed? A non-baseline IP accessing Exchange Online or SharePoint with no nearby interactive sign-in is the textbook AiTM pattern.

If you find a high-confidence result: do not remediate immediately from the hunting query. Open an incident, document the finding per TH1.7, and follow your organization's IR process. The hunt identified the compromise. The IR process contains and remediates it.
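
One way to apply the VPN/proxy exclusions mentioned above is to filter known-good egress ranges out of the detection window before the baseline comparison. The sketch below is an assumption-laden illustration: KnownGoodRanges is a hypothetical list (the values shown are documentation ranges), and you would replace it with the infrastructure you confirmed through Enrichment 1.

```kusto
// Sketch: drop confirmed infrastructure ranges from the detection window
// Assumption: KnownGoodRanges is a hypothetical, manually maintained list
let KnownGoodRanges = dynamic(["198.51.100.0/24", "203.0.113.0/24"]);
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
// ipv4_is_in_any_range returns null for values it cannot parse as IPv4
// (for example IPv6); coalesce keeps those rows instead of dropping them
| extend InKnownGood = coalesce(
    ipv4_is_in_any_range(IPAddress, KnownGoodRanges), false)
| where InKnownGood == false
| summarize
    Events = count(),
    Users = dcount(UserPrincipalName)
    by IPAddress
| sort by Users desc
```

The same extend and where lines can be placed in the core detection query directly after its ResultType filter. Keeping the list in one let statement makes it easy to record in the checkpoint which ranges were excluded and why.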

Compliance Context

MFA prevents the attacker from authenticating with stolen credentials alone. AiTM proxies do not steal credentials alone; they intercept the entire authentication flow, including the MFA step. The attacker receives a session token with a valid MFA claim. From the identity provider's perspective, MFA was completed successfully. This is not a bypass of MFA. It is a capture of the session that MFA created. The only MFA methods that resist AiTM are phishing-resistant methods such as FIDO2 security keys and passkeys, which bind the authentication to the legitimate domain and fail when the proxy's domain does not match. Organizations relying on push-notification MFA or SMS MFA are vulnerable to AiTM. The hunt in this subsection detects the result of a successful AiTM attack: the token replay, which is the only detection opportunity left once MFA has been completed through the proxy.

Extend this hunt

For environments with Defender for Cloud Apps connected, extend the AiTM detection by correlating the non-baseline non-interactive sign-in with immediate mailbox access. An AiTM attacker who replays a token typically accesses the mailbox within minutes, reading email, creating inbox rules, or downloading attachments. Query CloudAppEvents for MailItemsAccessed or New-InboxRule operations from the suspicious IP within 1 hour of the non-baseline sign-in. This temporal correlation, token replay followed by immediate mailbox access, is the highest-confidence AiTM indicator. TH5 (Hunting Cloud Persistence) covers the inbox rule analysis in depth.
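
The temporal correlation described above might look like the following. Treat it as a sketch under stated assumptions rather than a finished detection: it assumes CloudAppEvents is ingested into the same workspace with a TimeGenerated column, that its AccountObjectId matches UserId in the sign-in logs, and that suspectIPs is a hypothetical list you populate from the core query and Enrichment 1. It measures only from the first replay per user and IP.

```kusto
// Sketch: mailbox activity within 1 hour of a non-baseline token replay
// Assumptions: suspectIPs is a placeholder list (hypothetical values);
// CloudAppEvents.AccountObjectId corresponds to UserId in the sign-in logs
let suspectIPs = dynamic(["203.0.113.10"]);
let correlationWindow = 1h;
// First successful non-interactive sign-in per user from each suspect IP
let Replays = AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
| where IPAddress in (suspectIPs)
| summarize FirstReplay = min(TimeGenerated)
    by UserId, UserPrincipalName, IPAddress;
CloudAppEvents
| where TimeGenerated > ago(7d)
| where ActionType in ("MailItemsAccessed", "New-InboxRule")
| where IPAddress in (suspectIPs)
| join kind=inner Replays
    on $left.AccountObjectId == $right.UserId,
       $left.IPAddress == $right.IPAddress
// Keep only mailbox activity inside the correlation window
| where TimeGenerated between (FirstReplay .. (FirstReplay + correlationWindow))
| summarize
    Events = count(),
    Actions = make_set(ActionType, 10),
    FirstMailboxActivity = min(TimeGenerated)
    by UserPrincipalName, IPAddress, FirstReplay
| extend MinutesAfterReplay = datetime_diff(
    'minute', FirstMailboxActivity, FirstReplay)
| sort by MinutesAfterReplay asc
```

A small MinutesAfterReplay value combined with New-InboxRule in Actions is the pattern to escalate first. If the query returns nothing, confirm that mailbox auditing events are actually reaching CloudAppEvents before treating the absence as a negative result.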

Checkpoint

Hunt window: ___ to ___

Baseline window: 30 days prior

Non-baseline non-interactive events (raw): ___

After VPN/infrastructure exclusion: ___

Single-user IPs identified: ___

High-confidence findings: ___

Medium-confidence (further investigation): ___

Low-confidence (monitoring): ___

Known-good IPs added to baseline: ___

Incidents opened: ___
