DE3.8 Valid Account Compromise Detection

4-5 hours · Module 3

What you already know

Section 3.7 built DE3-006 for USB-based initial access: a physical vector where the detection signal is the post-mount process chain. This section addresses the hardest initial access detection problem in the course: an attacker who signs in with legitimate credentials, produces no failed logons, triggers no malware signature, and leaves no exploit artifact. The only signal is behavioral deviation from the compromised user's established patterns.

Scenario

Rachel reviews a compromised account discovered during unrelated threat hunting. The attacker obtained p.morrison's credentials from a credential dump on a dark web marketplace; NE was never directly attacked. The attacker signed in at 02:14 from a residential IP in Romania, accessed SharePoint, downloaded three documents from the engineering specs library, and signed out at 02:38. No failed logons. No malware. No impossible travel either: p.morrison hadn't signed in since 17:45 the previous day. Every individual sign-in event was legitimate. The anomaly was the pattern: a user who has never signed in after midnight, from a country they've never visited, accessing a library they touch once a quarter.

Why valid account compromise is the hardest detection problem

Every other rule in this module detects an attack artifact: a URL click, a spray pattern, a token replay, a suspicious process chain. DE3-007 detects the absence of normal behavior. The attacker produces no artifact you can write a signature for. They sign in successfully, use the same applications the legitimate user uses, and access resources the user is authorized to touch. The signal isn't what the attacker does. It's the subtle difference between how they do it and how the real user does it.

This detection pattern is entirely different from indicator-based rules. DE3-001 through DE3-006 fire when a specific observable occurs: a click event correlated with a sign-in anomaly, a spray ratio exceeding a threshold, a process chain following a USB mount. DE3-007 fires when the aggregate behavior of an account deviates from its own historical pattern. You're comparing the account against itself, not against a static rule.

Entra ID Protection handles a portion of this through its risk engine. Microsoft's machine learning model evaluates sign-in risk based on factors including unfamiliar sign-in properties, atypical travel, and anomalous token activity. The risk score appears in SigninLogs as RiskLevelDuringSignIn and RiskLevelAggregated. The limitation is that Entra ID Protection evaluates each sign-in independently. It catches a sign-in from an unfamiliar country. It doesn't catch an attacker who signs in from the same country as the user, at a slightly unusual time, accessing slightly unusual resources, where no single sign-in exceeds the risk threshold, but the combined pattern is clearly abnormal.

DE3-007 fills that gap by building per-user baselines and comparing current behavior across multiple dimensions simultaneously.

Building the behavioral baseline

The foundation of DE3-007 is a 30-day baseline that captures four dimensions of each user's normal behavior: where they sign in from (IP ranges and countries), what they sign in with (device identifiers and operating systems), when they sign in (hour-of-day distribution), and what they access (application IDs and resource names).

// 30-day per-user baseline — location, device, timing, apps
// Run in Advanced Hunting to see what "normal" looks like
let BaselineWindow = 30d;
let BaselineStart = ago(BaselineWindow);
SigninLogs
| where TimeGenerated > BaselineStart
| where ResultType == "0"
| extend HourOfDay = hourofday(TimeGenerated)
| extend Country = tostring(LocationDetails.countryOrRegion)
| extend DeviceOS = tostring(DeviceDetail.operatingSystem)
| summarize
    KnownIPs = make_set(IPAddress, 50),
    KnownCountries = make_set(Country),
    KnownDeviceOS = make_set(DeviceOS),
    KnownApps = make_set(AppDisplayName, 25),
    TypicalHours = make_set(HourOfDay),
    TotalSignins = count(),
    EarliestHour = min(HourOfDay),
    LatestHour = max(HourOfDay)
    by UserPrincipalName
| where TotalSignins > 10

UserPrincipalName       KnownCountries   KnownDeviceOS     TypicalHours       TotalSignins
──────────────────────  ───────────────  ────────────────  ─────────────────  ────────────
p.morrison@ne.co.uk     ["GB"]           ["Windows"]       [7,8,9..17]        287
s.chen@ne.co.uk         ["GB","US"]      ["Windows","iOS"] [6,7,8..19,20]     412
r.okafor@ne.co.uk       ["GB","DE"]      ["Windows","iOS"] [7,8..22,23]       534

The baseline reveals each user's operational fingerprint. p.morrison signs in exclusively from Great Britain, on Windows, between 07:00 and 17:00. A sign-in from Romania at 02:14 on a Linux device deviates on three dimensions simultaneously. r.okafor, by contrast, regularly signs in from Germany and works until 23:00. A sign-in from Germany at 22:00 deviates on both country and hour for p.morrison but falls entirely inside the baseline for r.okafor's account.

This per-user sensitivity is what makes behavioral detection powerful. The same sign-in event is suspicious for one user and normal for another. A static rule that alerts on "any sign-in from Romania" would produce false positives for every NE employee who travels to Romania legitimately. A behavioral rule that alerts on "a sign-in from a country this specific user has never visited, at an hour they've never been active, on an OS they've never used" is precise.
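To make the per-user sensitivity concrete, the sketch below scores one hypothetical sign-in (Germany, iOS, 22:00) against two baselines modeled on the sample output. The datatable rows are illustrative stand-ins rather than real query results, and the hour lists are written out in full as an assumption, because the sample output abbreviates them. It uses set_has_element() for the membership test and scores only three dimensions to keep the example short.

```kql
// Hypothetical baselines modeled on the sample output.
// Hour lists are assumed expansions of the abbreviated ranges.
let Baseline = datatable(UserPrincipalName:string, KnownCountries:dynamic,
                         KnownOS:dynamic, KnownHours:dynamic)
[
    "p.morrison@ne.co.uk", dynamic(["GB"]), dynamic(["Windows"]),
        dynamic([7,8,9,10,11,12,13,14,15,16,17]),
    "r.okafor@ne.co.uk", dynamic(["GB","DE"]), dynamic(["Windows","iOS"]),
        dynamic([7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23])
];
// One hypothetical sign-in, scored against every baseline row
let SigninCountry = "DE";
let SigninOS = "iOS";
let SigninHour = 22;
Baseline
| extend NewCountry = iff(set_has_element(KnownCountries, SigninCountry), 0, 1)
| extend NewOS = iff(set_has_element(KnownOS, SigninOS), 0, 1)
| extend OffHours = iff(set_has_element(KnownHours, SigninHour), 0, 1)
| extend DeviationScore = NewCountry + NewOS + OffHours
| project UserPrincipalName, NewCountry, NewOS, OffHours, DeviationScore
```

With these inputs, p.morrison's row should come back with all three flags set and r.okafor's with none: the same event, two different verdicts.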

Figure DE3.8. The deviation score aggregates anomalies across four behavioral dimensions. A single new IP is normal travel. Three or more simultaneous deviations from a user's established pattern warrant investigation.

Building DE3-007: the production rule

The production rule uses materialize() to compute the 30-day baseline once, then joins current sign-in activity against it. The deviation score increments for each dimension where the current sign-in falls outside the user's established baseline.

// DE3-007: Valid account behavioral deviation
// ATT&CK: T1078 — Valid Accounts
// Chain: All chains | Sprint 2 | Severity: Medium
let LookbackWindow = 1h;
let BaselineWindow = 30d;
let DeviationThreshold = 3;
// Build per-user baseline from past 30 days
let Baseline = materialize(
    SigninLogs
    | where TimeGenerated between (
        ago(BaselineWindow) .. ago(LookbackWindow))
    | where ResultType == "0"
    | extend Country = tostring(
        LocationDetails.countryOrRegion)
    | extend DeviceOS = tostring(
        DeviceDetail.operatingSystem)
    | extend HourOfDay = hourofday(TimeGenerated)
    | summarize
        KnownCountries = make_set(Country),
        KnownIPs = make_set(IPAddress, 100),
        KnownOS = make_set(DeviceOS),
        KnownApps = make_set(AppDisplayName, 50),
        KnownHours = make_set(HourOfDay)
        by UserPrincipalName
);
// Score current sign-ins against baseline
SigninLogs
| where TimeGenerated > ago(LookbackWindow)
| where ResultType == "0"
| extend Country = tostring(
    LocationDetails.countryOrRegion)
| extend DeviceOS = tostring(
    DeviceDetail.operatingSystem)
| extend HourOfDay = hourofday(TimeGenerated)
| join kind=inner (Baseline) on UserPrincipalName
| extend NewCountry = iff(Country !in (KnownCountries), 1, 0)
| extend NewIP = iff(IPAddress !in (KnownIPs), 1, 0)
| extend NewOS = iff(DeviceOS !in (KnownOS), 1, 0)
| extend NewApp = iff(AppDisplayName !in (KnownApps), 1, 0)
| extend OffHours = iff(HourOfDay !in (KnownHours), 1, 0)
| extend DeviationScore =
    NewCountry + NewIP + NewOS + NewApp + OffHours
| where DeviationScore >= DeviationThreshold
| project TimeGenerated, UserPrincipalName, IPAddress,
    Country, DeviceOS, AppDisplayName, HourOfDay,
    DeviationScore, NewCountry, NewIP, NewOS,
    NewApp, OffHours, RiskLevelDuringSignIn

The materialize() function matters here because the baseline is expensive to build. It evaluates the 30-day summarize once and caches the result for the duration of the query, so every reference to Baseline reuses the cached table instead of scanning SigninLogs again. With a 30-day window against SigninLogs, which can contain millions of rows in a large environment, recomputing the baseline for each reference risks a timeout, for example when you extend the rule to compare several activity windows against the same baseline. This is a KQL performance pattern you'll use repeatedly for baseline-comparison rules.

The deviation threshold of 3 is the starting point, not the final value. During the 14-day observation period, you'll evaluate every alert and adjust. If the threshold produces too many alerts from users who travel frequently and use multiple devices, raise it to 4. If it misses compromise scenarios where the attacker stays within two dimensions of normal behavior, lower it to 2 and add Entra ID Protection risk as a weighted dimension.

Anti-Pattern

Many organizations rely entirely on Entra ID Protection's impossible travel detection or Defender for Cloud Apps' anomaly policy for behavioral compromise detection. Impossible travel catches the obvious case: a sign-in from New York followed by Tokyo 30 minutes later. It misses the sophisticated case: an attacker who checks the user's last sign-in location and signs in from a nearby country during off-hours. DE3-007's multi-dimensional scoring catches patterns that single-dimension detection architecturally cannot. Impossible travel is one input, not the complete answer.

Tuning the behavioral baseline

The 30-day window was chosen because it balances stability against relevance. A 7-day window adapts quickly to behavioral changes but produces volatile baselines: a user who travels for one week skews the window. A 90-day window is stable but includes behavior patterns from before role changes, project reassignments, or application migrations.

You'll make three tuning adjustments during the observation period. First, exclude service accounts. Service accounts sign in from consistent infrastructure with consistent patterns, so they'll never deviate, and they inflate the baseline computation. Filter them in the baseline query with | where UserPrincipalName !has "#EXT#" | where UserPrincipalName !startswith "svc-". The first clause drops guest accounts, which carry #EXT# in their UPN; the second assumes your service accounts follow a svc- naming convention. Second, weight the dimensions differently if your environment warrants it. A new country is a stronger signal than a new IP within the same country. You can modify the scoring to assign 2 points for a new country and 1 point for other dimensions. Third, consider adding Entra ID Protection's RiskLevelDuringSignIn as a sixth dimension. A sign-in scored "medium" by Microsoft's model that also deviates on two other dimensions crosses the threshold, combining Microsoft's detection with your own.
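The second and third adjustments can be sketched as a change to the scoring step. The example below is self-contained: the Scored datatable stands in for rows that have already been flagged by the production rule, and its values are hypothetical. CountryWeight and RiskPoint are illustrative names, not part of the rule as published, and treating "medium" and "high" as one point is an assumption you should tune.

```kql
// Hypothetical rows standing in for the flag columns produced by DE3-007
let Scored = datatable(UserPrincipalName:string, NewCountry:long, NewIP:long,
                       NewOS:long, NewApp:long, OffHours:long,
                       RiskLevelDuringSignIn:string)
[
    "p.morrison@ne.co.uk", 1, 1, 0, 0, 0, "none",
    "s.chen@ne.co.uk",     0, 1, 0, 0, 1, "medium",
    "r.okafor@ne.co.uk",   0, 1, 0, 0, 0, "low"
];
let CountryWeight = 2;       // assumption: a new country counts double
let DeviationThreshold = 3;
Scored
// Sixth dimension: one point when Microsoft's model rates the sign-in medium or high
| extend RiskPoint = iff(RiskLevelDuringSignIn in ("medium", "high"), 1, 0)
| extend DeviationScore =
    NewCountry * CountryWeight + NewIP + NewOS + NewApp + OffHours + RiskPoint
| where DeviationScore >= DeviationThreshold
| project UserPrincipalName, DeviationScore, NewCountry, NewIP, NewOS,
    NewApp, OffHours, RiskLevelDuringSignIn
```

Under this weighting, a new country plus a new IP reaches the threshold on its own, while a new IP alone does not. Changing the weights changes what a given score means, so revisit the threshold whenever you change them.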

Run the rule against 14 days of historical data to understand your false positive rate before deploying it as a scheduled analytics rule. The output tells you which users have the most volatile baselines (frequent travelers, multi-device users, after-hours workers) and whether those users need a higher threshold or explicit exclusion.

Investigating a behavioral deviation alert

When DE3-007 fires, the alert gives you a user, a deviation score, and the specific dimensions that triggered. The investigation workflow starts from the deviation dimensions and expands outward.

If the new country dimension fired, pivot to the IP address. Look up the ASN: is it a residential ISP, a hosting provider, a VPN service, or a Tor exit node? Hosting provider IPs and known proxy services are stronger indicators than residential ISPs in the same country. An attacker using a VPS in Romania produces a different risk profile than a user connecting from a hotel in Bucharest.
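A minimal sketch of that IP pivot, assuming the AutonomousSystemNumber column is populated in your SigninLogs table. The IP address is a hypothetical placeholder from a documentation range. The query returns the AS number only; mapping that number to an ISP, hosting provider, or VPN service still requires an external lookup.

```kql
// Hypothetical IP taken from a DE3-007 alert (documentation range)
let SuspectIP = "203.0.113.10";
SigninLogs
| where TimeGenerated > ago(30d)
| where IPAddress == SuspectIP
| summarize
    Signins = count(),
    Users = dcount(UserPrincipalName),      // more than one user on this IP?
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated)
    by IPAddress, AutonomousSystemNumber,
       Country = tostring(LocationDetails.countryOrRegion)
```

An IP that first appeared minutes before the alert and has touched several unrelated accounts deserves more suspicion than one the tenant has seen for weeks.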

If the new application dimension fired, examine what the user accessed. An attacker with stolen credentials often targets high-value resources early: executive mailboxes, SharePoint document libraries containing financial data, or administrative portals they've identified through prior reconnaissance. The CloudAppEvents table shows application-level activity after the sign-in. A sign-in that deviates on timing and immediately accesses the CFO's mailbox or downloads files from a restricted SharePoint library is materially different from a deviation that accesses Outlook and reads the user's own email.
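One way to review post-sign-in activity is to pull CloudAppEvents for the alert's IP address in the hour after the sign-in. The sketch below uses hypothetical values for the IP and the sign-in time, and it assumes the Advanced Hunting schema, where the time column is Timestamp; in a Log Analytics workspace you may need TimeGenerated instead.

```kql
// Hypothetical values copied from a DE3-007 alert
let SuspectIP = "203.0.113.10";
let SigninTime = datetime(2024-01-15 02:14:00);
CloudAppEvents
| where Timestamp between (SigninTime .. (SigninTime + 1h))
| where IPAddress == SuspectIP
| summarize
    Events = count(),
    Objects = make_set(ObjectName, 25),     // files, mailboxes, sites touched
    FirstEvent = min(Timestamp),
    LastEvent = max(Timestamp)
    by Application, ActionType
| order by Events desc
```

Filtering on the IP rather than the account keeps the sketch simple; if the attacker shares an egress IP with legitimate users, add an account filter before drawing conclusions.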

If multiple dimensions fired simultaneously, correlate the timing with other rules in this module. Does the flagged user also appear in DE3-004 (token theft) results within the same window? Did a DE3-001 (phishing click) or DE3-003 (spray success) fire for the same account within the past 24 hours? A behavioral deviation that correlates with an initial access detection is not a coincidence. Treat it as a confirmed attack chain.
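The correlation check can be sketched against the SecurityAlert table. This rests on two assumptions that may not hold in your workspace: that the module's analytics rules are named with their rule ID as a prefix (for example "DE3-004 ..."), and that the account name appears in the alert's Entities field. The account value is a hypothetical placeholder.

```kql
// Hypothetical account name taken from the DE3-007 alert
let SuspectAccount = "p.morrison";
SecurityAlert
| where TimeGenerated > ago(24h)
// Assumption: alert names start with the rule ID
| where AlertName startswith "DE3-" and AlertName !startswith "DE3-007"
| where Entities contains SuspectAccount
| summarize
    Alerts = dcount(SystemAlertId),
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated)
    by AlertName
| order by FirstSeen asc
```

Any row returned here means another initial access rule fired for the same account in the past day, which is the corroboration the response decision depends on.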

The response decision depends on the deviation score and the correlation evidence. A score of 3 with no corroborating alerts warrants a conditional access review: force reauthentication and check whether the user confirms the activity. A score of 4 or higher, or any score that correlates with another DE3 rule, warrants immediate session revocation and a password reset. The cost of a false positive (the user reauthenticates) is far lower than the cost of a missed true positive (the attacker maintains access while you investigate).
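The decision logic above can be expressed as a case() expression, which is useful if you want the recommended response to appear as a column in the alert output. The input rows are hypothetical, and CorrelatedDE3Alerts is an illustrative column name for the count of other DE3 alerts on the same account.

```kql
// Hypothetical alert rows; CorrelatedDE3Alerts is an illustrative column
datatable(UserPrincipalName:string, DeviationScore:long, CorrelatedDE3Alerts:long)
[
    "p.morrison@ne.co.uk", 4, 0,
    "s.chen@ne.co.uk",     3, 0,
    "r.okafor@ne.co.uk",   3, 1
]
| extend RecommendedResponse = case(
    // Score of 4+ or any corroborating DE3 alert: contain immediately
    DeviationScore >= 4 or CorrelatedDE3Alerts > 0,
        "Revoke sessions and reset password",
    // Score of 3 with no corroboration: verify with the user
    DeviationScore == 3,
        "Force reauthentication and confirm activity with user",
    "Below threshold")
```

The containment branch is evaluated first, so a score of 3 with a correlated alert is escalated rather than sent down the verification path.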

Rule Specification: DE3-007

Rule ID: DE3-007

Name: Valid account behavioral deviation, multi-dimension scoring

ATT&CK: T1078 (Valid Accounts)

Hypothesis: Compromised accounts exhibit simultaneous deviations across location, device, timing, and application access that legitimate users rarely produce

Data sources: SigninLogs (30-day baseline + current window)

Frequency: Every 1 hour | Lookback: 1 hour 15 minutes

Severity: Medium (Score 3-4) | High (Score 5+)

Entity mapping: Account → UserPrincipalName | IP → IPAddress | Host → DeviceOS

FP sources: Traveling executives, new employees (no baseline yet), users who change devices frequently, seasonal workers returning after extended absence

Chains: All six chains. T1078 is a secondary entry vector for every attack scenario
