
Your First AI-Assisted Investigation

2-3 hours · Module 0 · Free

Scenario

Defender XDR fires an alert: "Suspicious sign-in followed by inbox rule creation" for user d.chen@northgateeng.com. The sign-in originated from 185.234.72.19 (Netherlands hosting provider) at 09:12 UTC. MFA was completed successfully. Four minutes later, a new inbox rule appeared on the account. This matches the Code of Conduct AiTM campaign pattern from section 0.1. Open your Claude workspace and investigate.

The alert details

Before you prompt Claude, understand what you are working with. The alert contains three data points: the user, the anomalous IP, and the inbox rule creation. Your job is to determine whether this is AiTM token theft, credential compromise, or a legitimate sign-in followed by a coincidental rule change. The investigation requires querying at least four Sentinel tables.

Alert: Suspicious sign-in followed by inbox rule creation
Severity: High
User: d.chen@northgateeng.com
Sign-in time: 2026-04-15T09:12:00Z
Sign-in IP: 185.234.72.19 (NL, hosting provider)
MFA: Completed (push notification approved)
Inbox rule: Created at 2026-04-15T09:16:00Z
Rule action: Mark as read, move to RSS Subscriptions
MITRE: T1566.002 (Phishing: Spearphishing Link) → T1539 (Steal Web Session Cookie) → T1564.008 (Email Hiding Rules)

Step 1: Prompt Claude

Open your course Project (the one you configured in section 0.3). Paste the alert details into a new conversation with the investigation prompt pattern from section 0.1. Adapt it to match the specific alert data above.

Your prompt should specify: the user, the suspicious IP, the timeline, the MFA status, the inbox rule details, and the tables you expect to query (SigninLogs, AADNonInteractiveUserSignInLogs, CloudAppEvents, OfficeActivity). Ask Claude to generate the investigation queries, explain what each query tests, and predict what the results will show if this is an AiTM compromise.

Structure matters here. A vague prompt ("investigate this user for possible compromise") produces generic queries that may not address the AiTM-specific evidence chain. A structured prompt that includes the MITRE techniques, the specific tables, and the expected attack sequence produces targeted queries that match the investigation you actually need to run. Compare the output from both approaches. The structured prompt produces queries with appropriate time windows, correct field filtering, and inline comments explaining what each filter targets. The vague prompt produces valid KQL but misses the AiTM-specific indicators: it queries for failed sign-ins (irrelevant in AiTM where sign-ins succeed) and skips the token replay table entirely.

This is the difference between using AI as an autocomplete tool and using it as an investigation accelerator. The structured prompt encodes your investigative hypothesis. Claude operationalises that hypothesis into queries. The quality of the output is determined by the quality of the input, and the quality of the input comes from your understanding of the attack technique.

If you have access to a Sentinel workspace, run the queries against your own environment, substituting one of your users and a known-good sign-in. If you do not have a workspace, evaluate the queries structurally: are the table names real, are the field names correct, and does the logic match the investigation objective?

Step 2: Validate the output

Apply the five-check validation from section 0.2 to every query Claude generates.

Prompt Pattern

Pattern: Five-check validation for investigation queries

For each query Claude generated, check:

1. References: Does the table exist in Sentinel? Run: union withsource=T * | distinct T

2. Fields: Do the field names exist? Run: TableName | getschema

3. Evidence: Does the query test what the investigation needs, or something adjacent?

4. Logic: Do the filters include the right events and exclude the wrong ones?

5. Platform: Is everything Sentinel/KQL, or did Claude mix in Splunk or Elastic syntax?
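
Checks 1 and 2 can be run as two short queries. This is a sketch, assuming the four tables named in Step 1 and a handful of columns you expect Claude to use. The 7-day lookback is an arbitrary choice, and a table that received no data in that window will be missing from the first result even if it exists in the workspace.

```kql
// Check 1: which of the expected tables actually hold data in this workspace?
union withsource=T *
| where TimeGenerated > ago(7d)
| distinct T
| where T in ("SigninLogs", "AADNonInteractiveUserSignInLogs", "CloudAppEvents", "OfficeActivity")

// Check 2: do the columns referenced by the generated query exist in the table?
// Any column name missing from the output is one Claude got wrong.
AADNonInteractiveUserSignInLogs
| getschema
| where ColumnName in ("UserPrincipalName", "IPAddress", "ResultType", "AppDisplayName", "UserAgent")
| project ColumnName, ColumnType
```

Run the two statements separately. Repeat the second one for each table, swapping in the column names that the generated query projects or filters on.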

Here is what to look for specifically in this investigation. Claude will almost certainly generate a SigninLogs query that works correctly. SigninLogs is one of the most widely documented Sentinel tables, so its field names (UserPrincipalName, IPAddress, ResultType, MfaDetail, ConditionalAccessStatus) are likely to be well represented in Claude's training data. The query will filter on the user and the time window, project the relevant fields, and probably sort by TimeGenerated descending.
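
For comparison, a query of the shape described above might look like the following. It is a sketch rather than Claude's actual output: the one-hour window on either side of the alert and the extra projected columns (Location, AppDisplayName, UserAgent) are assumptions chosen for illustration.

```kql
// Interactive sign-ins for the alerted user, one hour either side of the alert.
SigninLogs
| where TimeGenerated between (datetime(2026-04-15T08:12:00Z) .. datetime(2026-04-15T10:12:00Z))
| where UserPrincipalName =~ "d.chen@northgateeng.com"   // =~ is a case-insensitive match
| project TimeGenerated, UserPrincipalName, IPAddress, Location, ResultType,
          MfaDetail, ConditionalAccessStatus, AppDisplayName, UserAgent
| sort by TimeGenerated desc
```

Note that the query does not filter on failed sign-ins. In an AiTM compromise the interesting rows are the successful ones.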

Where Claude is more likely to make errors is in the token replay query and the inbox rule query. For token replay, the correct table is AADNonInteractiveUserSignInLogs. Claude sometimes references AADSignInEventsBeta, which is a Defender XDR advanced hunting table rather than a Sentinel workspace table, or invents a table name like NonInteractiveSignIns, which does not exist at all. Both look plausible. Neither will resolve in a Sentinel workspace. Run union withsource=T * | distinct T | where T contains "AAD" and verify.
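
A token replay query against the correct table could be sketched as follows. The four-hour window after the sign-in and the choice to summarize by IP address, application, and user agent are assumptions; adjust them to the evidence you have.

```kql
// Non-interactive (token-based) sign-ins for the user after the suspicious sign-in.
// A new IP address appearing shortly after 09:12 is the token replay indicator.
AADNonInteractiveUserSignInLogs
| where TimeGenerated between (datetime(2026-04-15T09:12:00Z) .. datetime(2026-04-15T13:12:00Z))
| where UserPrincipalName =~ "d.chen@northgateeng.com"
| summarize FirstSeen = min(TimeGenerated),
            LastSeen  = max(TimeGenerated),
            SignIns   = count()
          by IPAddress, AppDisplayName, UserAgent
| sort by FirstSeen asc
```

Summarizing instead of listing raw rows keeps the output readable, because non-interactive sign-ins are far more numerous than interactive ones.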

For inbox rule detection, both CloudAppEvents and OfficeActivity contain the data, but the field names and ActionType values differ between them. Claude might generate a query against CloudAppEvents with ActionType == "New-InboxRule" (correct) or ActionType == "Set-InboxRule" (different operation, catches modifications not creations). It might also use OfficeActivity with Operation == "New-InboxRule" (correct for that table) but project fields that belong to CloudAppEvents. The cross-table field confusion is the context leakage failure mode from section 0.2.
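
The difference between the two tables is easier to see side by side. The sketch below uses the same one-hour window for both. The RawEventData.UserId and RawEventData.Parameters paths in the CloudAppEvents query are assumptions about the shape of the raw Exchange audit record, so confirm them against a sample row before relying on the filter.

```kql
// CloudAppEvents: the operation name is in ActionType, details are in RawEventData.
CloudAppEvents
| where TimeGenerated between (datetime(2026-04-15T09:00:00Z) .. datetime(2026-04-15T10:00:00Z))
| where ActionType == "New-InboxRule"
| where tostring(RawEventData.UserId) =~ "d.chen@northgateeng.com"   // assumed path, verify
| project TimeGenerated, ActionType, AccountDisplayName, IPAddress,
          RuleParameters = RawEventData.Parameters                  // assumed path, verify

// OfficeActivity: the operation name is in Operation, the user is in UserId.
OfficeActivity
| where TimeGenerated between (datetime(2026-04-15T09:00:00Z) .. datetime(2026-04-15T10:00:00Z))
| where OfficeWorkload == "Exchange"
| where Operation == "New-InboxRule"
| where UserId =~ "d.chen@northgateeng.com"
| project TimeGenerated, Operation, UserId, ClientIP, Parameters
```

If a generated query filters on Operation in CloudAppEvents, or projects AccountDisplayName from OfficeActivity, that is the cross-table confusion described above.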

When you find an error, correct it. Note what the error was, which failure mode it maps to (hallucinated reference, outdated syntax, incorrect logic, context leakage), and how long the correction took. This is your first entry in your prompt engineering log. Save the prompt you used, the corrected queries, and the validation notes. These are the first artifacts in your prompt library.

Step 3: Build the timeline

After validating and running the queries (or evaluating them structurally if you do not have a Sentinel workspace), ask Claude to assemble the results into a chronological investigation timeline. Provide the query results or describe the expected results based on the alert data:

"Based on these findings, build a chronological investigation timeline for d.chen@northgateeng.com from initial compromise through current scope. Format each entry: timestamp, event description, source table, investigative significance. Distinguish between confirmed findings and items requiring further investigation."

Evaluate the timeline Claude produces against three criteria. First, does it correctly sequence the events in the order they occurred? The attacker chain follows a predictable pattern for AiTM: initial sign-in via proxy, token exfiltration, token replay from attacker infrastructure, persistence (inbox rule), mailbox access, and lateral phishing. Claude should produce this sequence from the evidence.
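
One way to cross-check the sequence Claude produces is to pull the raw events from each table into a single ordered view. This is a minimal sketch, assuming a one-hour window and the hypothetical output column names SourceTable, Event, and SourceIP; it is not a replacement for the per-table queries.

```kql
let targetUser  = "d.chen@northgateeng.com";
let windowStart = datetime(2026-04-15T09:00:00Z);
let windowEnd   = datetime(2026-04-15T10:00:00Z);
union
    (SigninLogs
     | where TimeGenerated between (windowStart .. windowEnd)
     | where UserPrincipalName =~ targetUser
     | project TimeGenerated, SourceTable = "SigninLogs",
               Event = strcat("Interactive sign-in to ", AppDisplayName), SourceIP = IPAddress),
    (AADNonInteractiveUserSignInLogs
     | where TimeGenerated between (windowStart .. windowEnd)
     | where UserPrincipalName =~ targetUser
     | project TimeGenerated, SourceTable = "AADNonInteractiveUserSignInLogs",
               Event = strcat("Non-interactive sign-in to ", AppDisplayName), SourceIP = IPAddress),
    (OfficeActivity
     | where TimeGenerated between (windowStart .. windowEnd)
     | where UserId =~ targetUser
     | project TimeGenerated, SourceTable = "OfficeActivity",
               Event = Operation, SourceIP = ClientIP)
| sort by TimeGenerated asc
```

The result will be noisier than Claude's timeline, which is the point: any event in Claude's sequence should be traceable to a row here.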

Second, does it identify the AiTM-specific indicators? The key differentiator between AiTM and credential theft is that MFA was completed successfully. In credential theft, the attacker has the password but not the second factor. In AiTM, the attacker relays the real MFA challenge through the proxy. The MfaDetail field showing completed MFA combined with sign-in from a hosting provider is the signature that distinguishes the two.
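
The signature described above can be tested directly. The sketch below assumes MfaDetail carries an authMethod property and parses it defensively in case the column arrives as a string; AutonomousSystemNumber is projected so you can look up the network owner and confirm the hosting-provider attribution.

```kql
// Successful sign-in from the suspicious IP where MFA was satisfied.
SigninLogs
| where TimeGenerated between (datetime(2026-04-15T09:00:00Z) .. datetime(2026-04-15T09:30:00Z))
| where UserPrincipalName =~ "d.chen@northgateeng.com"
| where IPAddress == "185.234.72.19"
| where ResultType == "0"                                   // 0 = success
| extend MfaMethod = tostring(parse_json(tostring(MfaDetail)).authMethod)  // assumed property name
| project TimeGenerated, IPAddress, AutonomousSystemNumber, AuthenticationRequirement,
          MfaMethod, ConditionalAccessStatus, AppDisplayName
```

A row with a populated MfaMethod from this IP is consistent with AiTM. An empty result suggests the alert data and the logs disagree, which is worth resolving before going further.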

Third, does it distinguish between what the evidence confirms and what remains unverified? A good timeline separates facts ("8 outbound emails sent at 09:31") from assessments ("likely lateral phishing based on identical subject line"). Claude sometimes presents assessments as facts. The validation discipline catches this.

The timeline is the primary investigation deliverable. In Module 2, you build timelines for every investigation type. In Module 4, the timeline becomes the foundation of the incident report. The quality of the timeline determines the quality of every downstream deliverable.

Claude Exchange

Claude Response (excerpt): Investigation Timeline

09:12:00Z Sign-in from 185.234.72.19 (NL). MFA completed. [SigninLogs]

09:14:00Z Non-interactive sign-in from 91.215.xx.xx. [AADNonInteractive]

09:16:00Z Inbox rule created: mark read, move to RSS. [CloudAppEvents]

09:18:00Z MailItemsAccessed x23 from 91.215.xx.xx. [OfficeActivity]

09:31:00Z 8 outbound phishing emails to internal contacts. [OfficeActivity]

Analyst Evaluation

✓ Sequence correct. The timeline follows the AiTM attack chain: initial sign-in → token replay → persistence → mailbox access → lateral phishing.

✓ AiTM confirmed. Two different IPs with different infrastructure profiles appearing within 2 minutes confirm token exfiltration, not legitimate concurrent sessions.

✗ Missing scope step. Timeline stops at lateral phishing but does not assess whether any of the 8 recipients clicked. Add: query SigninLogs for each recipient in the 24 hours following the phishing send. This determines whether the compromise is isolated to d.chen or has propagated.

That missing scope step is typical. Claude builds the timeline from the evidence you provided but does not reason about what the evidence implies for the next investigation thread. The 8 outbound phishing emails mean 8 potential additional compromises. Each recipient needs a sign-in review for the 24 hours following the phishing send. If any of them authenticated through the same AiTM proxy, the compromise has propagated and your containment scope now extends beyond a single account.
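
The recipient review can be sketched as a single query. The recipient addresses below are hypothetical placeholders, since the alert data does not name them; take the real list from the outbound message records. Only the proxy IP is listed as known attacker infrastructure because the replay IP is truncated in the timeline.

```kql
let phishSent   = datetime(2026-04-15T09:31:00Z);
// Hypothetical placeholders: replace with the 8 actual recipients.
let recipients  = dynamic(["recipient1@northgateeng.com", "recipient2@northgateeng.com"]);
// Add the full replay IP here once you have it from AADNonInteractiveUserSignInLogs.
let attackerIps = dynamic(["185.234.72.19"]);
SigninLogs
| where TimeGenerated between (phishSent .. phishSent + 24h)
| where UserPrincipalName in~ (recipients)
| where ResultType == "0"                                   // successful sign-ins only
| extend FromKnownAttackerIp = IPAddress in (attackerIps)
| project TimeGenerated, UserPrincipalName, IPAddress, Location,
          AppDisplayName, FromKnownAttackerIp
| sort by FromKnownAttackerIp desc, TimeGenerated asc
```

A match on a known attacker IP is strong evidence of propagation. The absence of a match is weaker evidence, because the attacker may rotate infrastructure, so also review the remaining rows for unfamiliar locations.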

An experienced analyst automatically opens that thread because they have worked incidents where a single compromised account cascaded into 15 compromises over a weekend. Claude does not open that thread because scoping requires operational judgment about what matters next, not just what the current evidence contains. This is the judgment boundary in practice: Claude handles the mechanical timeline construction, the analyst handles the investigative reasoning about what the timeline means for scope and containment.

What you built

You completed your first full AI-assisted investigation of an AiTM account compromise. You generated investigation queries using a structured prompt pattern that encoded your investigative hypothesis. You validated each query against the five-check discipline and caught at least one error (most students find errors in the token replay or inbox rule queries). You built a chronological timeline from cross-table evidence. You identified a scope gap that Claude missed: the 8 recipients who may have clicked the lateral phishing link.

The containment decision demonstrates the judgment boundary from section 0.1. Based on the timeline, the immediate containment actions are: revoke the user's active sessions, reset the user's password, delete the malicious inbox rule, and block the two attacker IPs at the Conditional Access level. These are clear from the evidence and Claude would recommend all four if asked. The judgment call is whether to also disable the accounts of the 8 phishing recipients as a precautionary measure before you have evidence of their compromise. That decision depends on the operational impact of disabling 8 users, the criticality of those users' roles, and how quickly you can investigate the 8 accounts. AI cannot make that call because it depends on context only you have: your organization's risk tolerance, the business impact of disabling those accounts, and the current capacity of your investigation team.

This is the working pattern for every module in the course. Module 2 expands it to 20+ investigation types with templates for each. Module 3 applies the same pattern to detection engineering. Module 4 uses the investigation timeline as the input for AI-drafted incident reports, executive summaries, and regulatory notification assessments. The discipline does not change. The security domain does.

Deploy what you built. Save the investigation prompt you used, including any modifications you made after seeing the initial output. Save the corrected queries with your validation notes inline. Add the validation notes to your prompt engineering log with the failure mode classification for each error you found. The prompt library you start building now grows throughout the course. By Module 10, it contains 100+ tested templates across investigation, detection, documentation, automation, governance, and deployment. Each template improves every time you use it on a real incident, because your validation notes capture what works in your specific environment, what Claude gets wrong consistently, and what adaptations your workspace requires.

Anti-Pattern

Stopping the investigation when the timeline looks complete

Claude's timeline ends at the 8 outbound phishing emails. The analyst marks the investigation complete. Two days later, a second account compromise is discovered: one of the 8 recipients clicked the phishing link and completed MFA through the same AiTM proxy. The original investigation missed the propagation because the analyst relied on Claude's timeline without asking the follow-up question that operational experience demands. AI builds timelines from the evidence you provide. Scoping the investigation beyond the initial evidence is the analyst's responsibility.
