In this section

KQL Fundamentals for Identity Security

4-5 hours · Module 1 · Free
What you already know

Sections 1.1 through 1.7 taught you to read individual sign-in log fields — authentication method, CA evaluation, risk signals, device state, and location. Every query you ran used the same KQL operators: where to filter, extend to parse JSON, project to select columns. This section formalizes those operators and adds the aggregation patterns that turn thousands of individual records into the population-level analysis that security operations requires. These are the query patterns you'll reuse in every subsequent module of this course.

Scenario

You need to answer a question that the portal can't: "How many users in the tenant signed in successfully without MFA in the last 7 days, broken down by application?" The Entra admin center shows individual sign-in records. It doesn't aggregate across users, calculate distinct counts, or group by application. The answer requires KQL — specifically, the summarize operator with dcount and countif. Every security question that involves "how many," "which users," "over what time period," or "compared to baseline" is a KQL question.

The query pipeline

KQL is a pipe-based language. You start with a table name and pipe data through operators that filter, transform, and aggregate. Each operator receives the output of the previous one and passes its result to the next. If you've used PowerShell pipelines or Unix command-line tools, the model is familiar. If not, think of it as a sequence of processing steps: data flows top to bottom, and each step narrows, reshapes, or summarizes what flows through.

Every identity security query follows the same structure: start with the table, filter the time window and conditions, parse any JSON fields, transform or classify, then aggregate or project the output. The operators below are the ones that appear in 90% of identity security queries.

Azure Portal

Microsoft SentinelLogs (or Defender XDRAdvanced Hunting)
Both interfaces run KQL against the same underlying data. Sentinel Logs gives you access to all Log Analytics tables including SigninLogs, AuditLogs, and AADNonInteractiveUserSignInLogs. Defender Advanced Hunting uses the same KQL syntax against the IdentityLogonEvents and related Defender tables. Every query in this module runs in either interface — use whichever your environment provides.

Filtering with where

The where operator filters rows by condition. Every query starts with a where that scopes the time window, and most add additional where clauses to narrow to the relevant population.

KQL
// Filter to successful sign-ins in the last 24 hours
SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0

Standard comparisons: == (equals), != (not equals), >, <, >=, <=. For strings, == is case-sensitive and =~ is case-insensitive. The has operator checks for word-boundary matches and is indexed — use it over contains for performance. The in operator checks membership in a list: where AppDisplayName in ("Office 365 Exchange Online", "Microsoft Teams"). The !in negation finds everything not in the list.

For identity security, has is almost always preferable to contains. The query where AppDisplayName has "Exchange" matches "Office 365 Exchange Online" but not "DataExchangeService." The query where AppDisplayName contains "Exchange" matches both. Word-boundary precision prevents false matches against application names that contain your search term as a substring.

Selecting with project and transforming with extend

project selects which columns appear in the output. extend adds new calculated columns without dropping existing ones. You've been using both since Section 1.1 — now understand the distinction.

project is exclusive — only the columns you name survive. Use it as the final step to produce clean, readable output. extend is additive — it keeps everything and adds new columns. Use it mid-query to parse JSON fields and create derived classifications. project-away is the inverse of project — it removes specific columns while keeping everything else.

The case() function inside extend is the conditional logic engine. It evaluates conditions in order and returns the value for the first match — the KQL equivalent of if/else chains. You used it in Section 1.7 to create device trust classifications and in Section 1.4 to classify authentication strength. The pattern appears in every detection rule and every posture assessment query.

Aggregating with summarize

The summarize operator is where KQL transitions from record-level to population-level analysis. It groups rows and calculates aggregate values — the operator that turns 50,000 sign-in events into a table showing sign-in counts by user, distinct application counts, and country sets.

KQL
// Per-user sign-in summary with multi-dimensional aggregation
SigninLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0
| summarize
    SignIns = count(),
    Apps = dcount(AppDisplayName),
    IPs = dcount(IPAddress),
    Countries = make_set(tostring(LocationDetails.countryOrRegion), 5)
    by UserPrincipalName
| order by SignIns desc

The aggregate functions you'll use most: count() counts rows in each group. dcount() counts distinct values (approximate — uses HyperLogLog, which is fast but not exact for large populations). make_set() creates a deduplicated array of values — useful for seeing which countries or applications appear without counting each one. min() and max() find the earliest or latest timestamp, the smallest or largest value. countif() counts rows that match a condition within each group — countif(RiskLevelDuringSignIn == "high") counts high-risk sign-ins per user without a separate where clause.

The by clause specifies grouping columns. by UserPrincipalName creates one row per user. by UserPrincipalName, AppDisplayName creates one row per user-application combination. by bin(TimeGenerated, 1h) groups events into one-hour buckets, producing a time series — the foundation for trend analysis and spike detection.

Working with JSON arrays: parse_json and mv-expand

Many sign-in log fields contain JSON arrays. The ConditionalAccessPolicies array from Section 1.5, the AuthenticationDetails array from Section 1.4, and the RiskEventTypes_V2 array from Section 1.6 all require JSON parsing before they can be queried.

Two operators handle this. parse_json() converts a string to a queryable JSON object. Array indexing (AuthDetail[0]) extracts a specific element by position without changing the row count — use this when you know the position of the element you want, like the first authentication step. mv-expand expands an array into individual rows — one row per element — increasing the row count. Use this when you want to analyze each array element individually, like each CA policy evaluation.

The key difference: mv-expand on a sign-in with 15 CA policies produces 15 rows. Array indexing produces 1 row with the value from the specified position. Choose based on whether you need to filter or aggregate across array elements (mv-expand) or extract a known element (indexing).

Variables and subqueries with let

The let operator defines named values or subqueries that you reference later. It's essential for complex detection queries that compare current activity against a baseline, define reusable threshold values, or chain multi-step analysis.

KQL
// Define thresholds and sensitive apps as reusable variables
let timeRange = 24h;
let sensitiveApps = dynamic(["Azure Portal", "Microsoft Graph Explorer",
    "Azure Active Directory PowerShell"]);
SigninLogs
| where TimeGenerated > ago(timeRange)
| where AppDisplayName in (sensitiveApps)
| where ResultType == 0
| project TimeGenerated, UserPrincipalName, AppDisplayName, IPAddress

let also defines subqueries whose results are joined with the main query — the pattern you saw in the post-risky-sign-in correlation query in Section 1.6 and the service principal baseline query in Section 1.3. The subquery runs once, its results are stored, and the main query joins against them. This is how you build detection logic that says "find all activity that happened after a risky sign-in for the same user."

Time functions

Identity security queries constantly work with time. ago() calculates a relative time — ago(24h) is 24 hours before now, ago(7d) is 7 days, ago(30m) is 30 minutes. between() filters a time range — where TimeGenerated between (ago(48h) .. ago(24h)) selects events from yesterday only. datetime_diff() calculates the gap between two timestamps — datetime_diff('minute', Timestamp2, Timestamp1) returns the minutes between events, which is the foundation for impossible travel detection. bin() rounds timestamps to intervals for grouping — bin(TimeGenerated, 1h) enables time-series analysis by bucketing events into hourly windows.

KQL QUERY PIPELINE — IDENTITY SECURITY PATTERN SigninLogs Table source where Filter rows extend Parse + classify summarize Aggregate order by Sort output project Select columns ~500K rows ~50K rows ~50K rows ~200 rows ~200 rows ~200 rows Data narrows at each stage. The biggest reduction is where (time + conditions) and summarize (aggregation). Put where clauses as early as possible — they determine query performance.

Figure 1.8 — The KQL query pipeline for identity security. Data narrows from table source through filtering and aggregation to final output.

Data type pitfalls in sign-in logs

KQL is strongly typed, and the sign-in log fields contain a mix of types that produce silent failures if you don't handle them correctly. Four pitfalls catch new query writers.

Dynamic type fields require explicit casting. LocationDetails, DeviceDetail, ConditionalAccessPolicies, and AuthenticationDetails are all dynamic type (JSON). You must use tostring() to extract values for comparison: tostring(LocationDetails.countryOrRegion) works, but LocationDetails.countryOrRegion == "US" may fail silently — the query returns no results without an error because the comparison between a dynamic type and a string literal doesn't evaluate as you expect.

Empty versus null versus absent. Some fields are empty strings (""), some are null, and some are not present at all. Use isnotempty() defensively when filtering on optional fields. where isnotempty(tostring(LocationDetails.countryOrRegion)) excludes sign-ins where the country couldn't be determined from the IP. Without this filter, your aggregations include blank entries that distort percentages.

ResultType is numeric. Compare with == (numeric equality), not has or contains (string operations). where ResultType == 0 is correct. where ResultType has "0" will not work as expected because has operates on strings and ResultType is an integer.

Case sensitivity in UPN comparisons. String comparisons with == are case-sensitive. UPNs in Entra ID are case-insensitive but may be stored with varying capitalization. Use =~ for case-insensitive comparison when filtering by UPN: where UserPrincipalName =~ "admin@domain.com".

What we see in 90% of environments

"We don't need KQL — we use the portal for log analysis." The Entra admin center provides a GUI for browsing individual sign-in events. It cannot correlate sign-ins across users (detecting spray patterns), calculate baselines (establishing normal behavior), join multiple tables (combining sign-in data with audit data), create time series (identifying trends), or build automated detection rules. Every detection rule in EI13 is a KQL query. Every verification step in the Defense Design Method is a KQL query. Every operational monitoring procedure in EI14 runs KQL. The portal shows you what happened to one user. KQL shows you what's happening across your tenant.

Putting it together: investigating a suspicious sign-in

A complete investigation query demonstrates how the operators chain together. The scenario: you received an alert for a medium-risk sign-in. You need to assess the sign-in, determine the risk context, and decide whether to escalate.

KQL
// Complete investigation: risky sign-in context assessment
let targetUser = "casey.finance@yourdomain.onmicrosoft.com";
SigninLogs
| where TimeGenerated > ago(24h)
| where UserPrincipalName =~ targetUser
| where RiskLevelDuringSignIn in ("medium", "high")
| where ResultType == 0
| extend
    Country = tostring(LocationDetails.countryOrRegion),
    DeviceOS = tostring(DeviceDetail.operatingSystem),
    AuthMethod = tostring(parse_json(AuthenticationDetails)[0].authenticationMethod),
    CAStatus = ConditionalAccessStatus
| extend Assessment = case(
    Country !in ("US", "GB"), strcat("ALERT: Unexpected country — ", Country),
    AuthMethod == "Password", "WARNING: Password-only auth (no MFA)",
    CAStatus == "notApplied", "CRITICAL: No CA policy evaluated",
    "Context appears normal — review post-auth activity")
| project
    TimeGenerated, AppDisplayName, IPAddress, Country,
    DeviceOS, AuthMethod, RiskLevelDuringSignIn,
    RiskEventTypes_V2, CAStatus, Assessment

This query combines every operator from this section: let for the target variable, where for filtering, extend for JSON parsing and conditional classification with case(), and project for clean output. The Assessment column produces an automated triage note — not a replacement for human judgment, but a starting point that highlights the most concerning combination of signals. Run the post-authentication activity queries from Section 1.10 for the follow-up investigation.

Identity Security Principle

KQL is the operational language of identity security on the Microsoft stack. Every verification query, every baseline measurement, every detection rule, and every investigation in this course is a KQL query. The portal shows you a single sign-in record. KQL shows you the pattern across 50,000. The transition from record-level browsing to population-level analysis is the transition from reactive to proactive security operations.

Next

Section 1.9 builds on these fundamentals with advanced query patterns — time-windowed anomaly detection, session reconstruction, multi-step aggregation, and statistical deviation from baselines. These are the patterns that turn raw sign-in data into security intelligence.

Unlock the Full Course See Full Course Agenda