
What AI Actually Does (and Does Not Do)

3-4 hours · Module 1 · Free

Scenario

Your CISO asks you to evaluate three vendor proposals. Vendor A offers "AI-powered SIEM enrichment." Vendor B offers "AI-driven threat detection." Vendor C offers "generative AI for security automation." All three use the word "AI." All three mean something fundamentally different. Vendor A runs predefined correlation rules and calls the result AI. Vendor B trains an anomaly detection model on your historical sign-in data. Vendor C uses a large language model to generate KQL queries and investigation reports from natural language prompts. If you cannot distinguish between the three, you cannot evaluate the proposals.

Three categories of "AI" in security products

The security industry uses "AI" to describe three fundamentally different technologies. Conflating them leads to governance failures: applying LLM governance to a rule-based system wastes resources, and applying rule-based governance to an LLM misses the hallucination risk entirely.

Rule-based systems with conditional logic. Deterministic if-then logic. YARA rules, Sigma rules, threshold-based SIEM alerts, static analysis signatures. Given the same input, they always produce the same output. They do not learn, adapt, or generalize. Their failure mode is well understood: false positives when the rule is too broad, false negatives when the rule is too narrow. When a vendor calls this AI, they are using the term loosely.
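The determinism described above can be seen in a minimal sketch of a threshold rule. The threshold value, the event fields, and the function name are hypothetical and chosen only for illustration; they are not taken from any SIEM product.

```python
from collections import Counter

FAILED_LOGON_THRESHOLD = 5  # hypothetical threshold, not from any product


def brute_force_alerts(events, threshold=FAILED_LOGON_THRESHOLD):
    """Return the sorted accounts whose failed sign-in count meets the threshold."""
    failures = Counter(e["user"] for e in events if e["result"] == "failure")
    return sorted(user for user, count in failures.items() if count >= threshold)


# Hypothetical sample events: six failures for user_01, four for user_02.
events = (
    [{"user": "user_01", "result": "failure"}] * 6
    + [{"user": "user_02", "result": "failure"}] * 4
    + [{"user": "user_02", "result": "success"}]
)

first_run = brute_force_alerts(events)
second_run = brute_force_alerts(events)
print(first_run, first_run == second_run)
```

Running the rule twice on the same events gives the same alert list. Lowering the threshold to 4 adds user_02, which is the too-broad or too-narrow trade-off in miniature.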

Traditional machine learning. Statistical models trained on historical data to classify new observations. User and Entity Behavior Analytics (UEBA) anomaly detection, malware classification in endpoint protection, spam filtering. These models learn patterns from training data and generalize to new inputs. Their failure mode is distribution shift: when production data diverges from training data, accuracy degrades. A UEBA model trained on pre-pandemic office behavior generates false positives when the workforce shifts to remote access.
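Distribution shift can be illustrated with a toy baseline model. This is a sketch under stated assumptions: the only feature is the sign-in hour, the detector is a plain z-score rather than a real UEBA algorithm, and all data values are invented.

```python
from statistics import mean, stdev


def fit_baseline(training_hours):
    """Learn the mean and standard deviation of sign-in hours from history."""
    return mean(training_hours), stdev(training_hours)


def is_anomalous(hour, baseline, z_threshold=3.0):
    """Flag an hour that sits more than z_threshold deviations from the mean."""
    mu, sigma = baseline
    return abs(hour - mu) / sigma > z_threshold


# Hypothetical training data: office-era sign-ins cluster around 09:00.
office_hours = [8, 9, 9, 10, 9, 8, 10, 9, 9, 8]
baseline = fit_baseline(office_hours)

# Hypothetical production data after a shift to remote, flexible working.
remote_hours = [7, 9, 13, 20, 22, 6, 21]
flagged = [h for h in remote_hours if is_anomalous(h, baseline)]
print(flagged)
```

None of the training-era hours are flagged, but most of the remote-era hours are, even though nothing malicious happened. The model is unchanged; the data moved away from what it learned.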

Large language models. Neural networks trained on massive text corpora that generate output one token at a time, where each token is selected based on its statistical probability given the preceding context. Claude, ChatGPT, Gemini, Copilot. These are the systems this course teaches you to use. They are fundamentally text generation engines, not factual verification engines. Their failure mode, hallucination, is not a bug to be patched. It is an architectural property of how token prediction works.

A fourth category emerged in 2026. Claude Security, built on Opus 4.7, scans codebases for vulnerabilities by tracing data flows across files and modules, then generates targeted patches for human review. CrowdStrike integrates Opus 4.7 into Falcon for AI-driven exposure analysis. Palo Alto Networks uses it in Unit 42 Frontier AI Defense. These are LLMs applied to code analysis with structured verification pipelines layered on top. The underlying technology is the same token prediction architecture, but the deployment pattern includes automated adversarial validation before findings reach a human reviewer. Understanding that distinction matters: the vendor's verification pipeline is what makes the output trustworthy, not the model's inherent accuracy.

Figure 1.1: Four categories of "AI" in security products. The fourth category wraps LLM generation in automated verification pipelines.

How large language models work: what practitioners need to know

You do not need to understand transformer architecture. You need to understand four properties that directly affect how you use LLMs for security work.

Token prediction, not comprehension. An LLM generates output one token at a time. A token is a word fragment, roughly four characters. Each token is selected based on its statistical probability given all preceding tokens. The model does not "understand" your KQL query. It identifies that after SigninLogs | where UserPrincipalName ==, the statistically probable next tokens form a string value, followed by | where TimeGenerated, followed by a time filter. This is why LLMs produce syntactically correct KQL that references non-existent tables: the syntax patterns are well-represented in training data, but the model has no connection to your workspace to verify whether a table exists.
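A toy next-token predictor makes the point concrete. This is a deliberately simplified sketch, not how a production LLM is built: it counts which token follows each three-token context in a tiny invented corpus and always picks the most frequent one. InboxRuleEvents_CL is a made-up table name, and WORKSPACE_TABLES is a hypothetical stand-in for your real workspace.

```python
from collections import Counter, defaultdict

CONTEXT = 3  # number of preceding tokens the toy model conditions on

# Hypothetical training corpus. InboxRuleEvents_CL is an invented table name
# standing in for "a table that appears in other people's examples".
CORPUS = [
    'SigninLogs | where UserPrincipalName == "<upn>" | where TimeGenerated > ago(1d)',
    'SigninLogs | where UserPrincipalName == "<upn>" | where TimeGenerated > ago(7d)',
    'InboxRuleEvents_CL | where UserPrincipalName == "<upn>" | where TimeGenerated > ago(1d)',
    'InboxRuleEvents_CL | where UserPrincipalName == "<upn>" | where TimeGenerated > ago(1d)',
    'InboxRuleEvents_CL | where UserPrincipalName == "<upn>" | where TimeGenerated > ago(1d)',
]

# Hypothetical list of tables that actually exist in the analyst's workspace.
WORKSPACE_TABLES = {"SigninLogs", "CloudAppEvents"}


def train(corpus):
    """Count which token follows each CONTEXT-length window of tokens."""
    counts = defaultdict(Counter)
    for query in corpus:
        tokens = ["<s>"] * CONTEXT + query.split() + ["</s>"]
        for i in range(CONTEXT, len(tokens)):
            counts[tuple(tokens[i - CONTEXT:i])][tokens[i]] += 1
    return counts


def complete(counts, prompt_tokens, max_tokens=20):
    """Greedily append the most frequent next token until the end marker."""
    tokens = ["<s>"] * CONTEXT + list(prompt_tokens)
    generated = []
    for _ in range(max_tokens):
        options = counts.get(tuple(tokens[-CONTEXT:]))
        if not options:
            break
        next_token = options.most_common(1)[0][0]
        if next_token == "</s>":
            break
        tokens.append(next_token)
        generated.append(next_token)
    return generated


model = train(CORPUS)
completion = complete(model, "SigninLogs | where UserPrincipalName ==".split())
full_query = complete(model, [])
print(" ".join(completion))
print(" ".join(full_query), "| table exists:", full_query[0] in WORKSPACE_TABLES)
```

The completion follows the familiar filter pattern because that pattern dominates the corpus. The query generated from an empty prompt is equally well formed, yet it starts with a table the workspace does not have. Nothing in the prediction step ever consults the workspace.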

Context window: what the model can see. The context window is the total text the model can process at once. For Claude Opus 4.7, the input context is 1 million tokens (roughly 750,000 words). For Claude Sonnet 4.6, it is 200,000 tokens. Everything the model needs to know must fit in this window. For long investigations, split the work into focused conversations within your Project. A single conversation that outgrows the window causes the model to lose your earliest context.
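One way to plan that segmentation is to estimate token counts before you paste evidence into a conversation. The sketch below assumes the rough four-characters-per-token figure mentioned earlier; real tokenizers vary, and the function names and budget value are hypothetical.

```python
import math

CHARS_PER_TOKEN = 4  # rough heuristic only; real tokenizers vary


def estimate_tokens(text):
    """Approximate the token count of a piece of text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def segment(artifacts, token_budget):
    """Greedily group artifacts into batches that stay within the budget.

    An artifact larger than the budget still gets its own batch, so it
    would need trimming before use.
    """
    batches, current, used = [], [], 0
    for artifact in artifacts:
        cost = estimate_tokens(artifact)
        if current and used + cost > token_budget:
            batches.append(current)
            current, used = [], 0
        current.append(artifact)
        used += cost
    if current:
        batches.append(current)
    return batches


# Hypothetical evidence exports of 400, 400, 400, and 100 characters.
artifacts = ["a" * 400, "b" * 400, "c" * 400, "d" * 100]
batches = segment(artifacts, token_budget=250)
print([len(batch) for batch in batches])
```

Each batch becomes one focused conversation. The budget should sit well below the model's window so that your instructions and the model's responses also fit.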

Training data cutoff. Claude's current reliable cutoff is May 2025. The model does not know about vulnerabilities disclosed after that date, Microsoft portal changes shipped after that date, or threat campaigns discovered after that date, unless it uses web search or you provide the information. Older material causes a related problem. Claude may generate a PowerShell command using Get-MsolUser even though the MSOnline module was deprecated in March 2024 in favor of Get-MgUser in the Microsoft Graph PowerShell SDK. The model produces the deprecated version because its training data contains thousands of examples of the old cmdlet and fewer of the replacement. When you need current threat data, provide it in the prompt. Claude excels at analyzing intelligence you supply. It cannot supply intelligence it does not have.

Non-determinism. The same prompt produces different output on different runs because token selection includes a randomness factor. Ask for a KQL query twice and you may get two slightly different queries. Both may be valid. One may be better. Your validation discipline applies to every output, every time.
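The randomness factor can be sketched as temperature sampling over candidate next tokens. The scores below are invented, and the function is a simplified illustration rather than any vendor's sampling implementation.

```python
import math
import random

# Hypothetical scores for three candidate next tokens.
logits = {"ago(1d)": 2.0, "ago(24h)": 1.8, "ago(48h)": 1.0}


def sample_token(logits, temperature, rng):
    """Pick a token: greedy at temperature 0, weighted random otherwise."""
    if temperature == 0:
        return max(logits, key=logits.get)
    scaled = {token: score / temperature for token, score in logits.items()}
    peak = max(scaled.values())  # subtract the peak for numerical stability
    weights = [math.exp(score - peak) for score in scaled.values()]
    return rng.choices(list(scaled), weights=weights, k=1)[0]


rng = random.Random(7)  # seeded only so this demonstration is repeatable
sampled = [sample_token(logits, 1.0, rng) for _ in range(200)]
greedy = [sample_token(logits, 0, rng) for _ in range(200)]
print(sorted(set(sampled)), sorted(set(greedy)))
```

With sampling enabled, repeated runs spread across the plausible candidates. With greedy selection they collapse to one. Every candidate here is valid KQL, which is why two runs of the same prompt can both be usable and still differ.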

Seeing the mechanics in action

Theory matters only if it explains what you see in practice. The following Claude Exchange shows these mechanics shaping a single response. An analyst asks Claude to triage a suspicious sign-in alert. Read the response, then read the analyst evaluation to see where the mechanics produced correct output and where they produced operationally unsafe output.

Claude Exchange

Analyst Prompt

Triage this Defender alert: "Suspicious sign-in activity" for user_01@contoso.com. Sign-in from IP 192.0.2.45 at 03:14 UTC, location Lagos Nigeria, UserAgent "Mozilla/5.0 (Windows NT 10.0)" with MFA satisfied via push notification. The user is based in London and last signed in from a London IP at 17:42 UTC the previous day. Sentinel workspace has standard M365 tables. Generate enrichment queries and an initial severity assessment.

Claude Response (excerpt)

Severity: HIGH. The sign-in exhibits AiTM indicators: impossible travel (London 17:42 → Lagos 03:14 = 9.5 hours, flight time ~6.5 hours), MFA satisfied via push (consistent with MFA fatigue or session token replay), and a generic User-Agent string that may indicate proxy infrastructure.

// Enrichment 1: Check for concurrent sessions
SigninLogs
| where UserPrincipalName == "user_01@contoso.com"
| where TimeGenerated between (ago(48h) .. now())
| project TimeGenerated, IPAddress, Location,
    ResultType, AppDisplayName, UserAgent
| sort by TimeGenerated asc

// Enrichment 2: Check for inbox rule creation
CloudAppEvents
| where AccountObjectId == "user_01@contoso.com"
| where ActionType == "New-InboxRule"
| where TimeGenerated > ago(48h)

Analyst Evaluation

✓ Correct: Severity assessment is sound. Impossible travel + MFA push + generic UA is a textbook AiTM pattern. The enrichment queries target the right tables and the right fields.

✗ Wrong field: Enrichment 2 filters CloudAppEvents on AccountObjectId with a UPN string. AccountObjectId expects a GUID, not an email address. The correct filter is AccountDisplayName or a join on AccountObjectId after resolving the GUID from IdentityInfo. This is a hallucination: the model chose a plausible field name without verifying the data type.

✗ Missing context: Claude assessed severity as HIGH based on technique alone. But the analyst has not checked whether user_01 is a Global Admin, whether the Lagos IP is a known VPN exit node, or whether the user has a travel history to Nigeria. Environmental context changes severity. AI cannot supply it.

✗ Incomplete scope: No query for AADNonInteractiveUserSignInLogs, which would show token replay activity that occurs after the interactive sign-in. AiTM attacks use stolen session cookies for non-interactive access. Missing this table misses the persistence mechanism.

That exchange shows the mechanics from this section at work. The severity assessment and first enrichment query are correct because sign-in triage is well-represented in training data (low hallucination risk). The AccountObjectId error is a hallucination produced by token prediction: the model selected a plausible field name with no connection to the schema that would show the field expects a GUID. The missing environmental context reflects the same boundary as the training cutoff. Claude knows only what is in its training data and in the prompt, so it cannot know whether user_01 is an admin or whether the IP belongs to a corporate VPN unless the analyst supplies that information. The missing AADNonInteractiveUserSignInLogs query reflects a coverage gap: the model's training data contains far more examples of SigninLogs and CloudAppEvents than of non-interactive sign-in tables.

Predicting where hallucinations concentrate

Not all AI outputs carry equal hallucination risk. In security operations, risk correlates with how well the topic is represented in training data and how verifiable the output is against an external reference. Learning to predict the risk level before you read the output tells you where to invest verification time.

Low risk. KQL syntax patterns for common tables (SigninLogs, SecurityEvent, DeviceProcessEvents). Report structure and formatting. MITRE ATT&CK descriptions for well-documented techniques. PowerShell cmdlets from widely used modules (Az, Microsoft.Graph). These are well-represented in training data, and errors are immediately visible when you run the query or command. The triage severity assessment in the exchange above is a low-risk output: AiTM is a well-documented, widely discussed attack type.

Moderate risk. KQL for less common tables (AADNonInteractiveUserSignInLogs, IdentityInfo, BehaviorAnalytics). Field names that changed in recent schema updates. API endpoints for services that update frequently. Regulatory citations for frameworks that have been revised (NIST CSF 2.0 replacing 1.1). The CloudAppEvents field error in the exchange above is a moderate-risk hallucination: the table is common, but the value AccountObjectId holds (a GUID, not a UPN string) is a detail the model conflated with similar fields on other tables. Schema verification catches these in seconds.

High risk. Threat actor attribution. Campaign timeline details. Specific CVE descriptions, affected versions, and patch status. Incident statistics and metrics. Tool-specific configuration that varies by version. If the analyst had asked Claude "which threat group operates the AiTM infrastructure behind this IP," the model would generate a plausible attribution (complete with campaign name, motivation, and geographic origin) that might be entirely fabricated from patterns in its training data. Never include AI-generated attribution, statistics, or vulnerability specifics in a deliverable without verifying against primary sources.

The Claude Exchange above illustrates the gradient. The triage assessment (low risk, well-represented pattern) was correct. The CloudAppEvents filter (moderate risk, a field-level detail on a common table) was wrong. An attribution request about the same campaign would sit in the high-risk tier, where a plausible answer may be entirely fabricated.
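The gradient can be written down as a small lookup that you consult before reading an output. This is a sketch only: the category labels are hypothetical shorthand for the examples listed in the three tiers, and treating unknown categories as high risk is an assumption, not a rule from this course.

```python
# Hypothetical shorthand labels for the output types listed in each tier.
RISK_TIERS = {
    "low": {"kql_common_table", "report_structure",
            "attack_technique_description", "powershell_common_module"},
    "moderate": {"kql_less_common_table", "recently_changed_field",
                 "fast_moving_api_endpoint", "revised_framework_citation"},
    "high": {"threat_actor_attribution", "campaign_timeline", "cve_specifics",
             "incident_statistics", "version_specific_configuration"},
}

# Verification effort per tier, paraphrased from the tier descriptions.
VERIFICATION = {
    "low": "Run it; errors surface on execution.",
    "moderate": "Check the table and field schema before running.",
    "high": "Verify against primary sources before it enters a deliverable.",
}


def risk_tier(output_type):
    """Return the tier for an output type; unknown types default to high."""
    for tier, output_types in RISK_TIERS.items():
        if output_type in output_types:
            return tier
    return "high"  # assumption: be most cautious with unclassified output


for output_type in ("kql_common_table", "recently_changed_field",
                    "threat_actor_attribution"):
    tier = risk_tier(output_type)
    print(f"{output_type}: {tier} -> {VERIFICATION[tier]}")
```

The point of the lookup is where verification time goes, not whether it happens. Every tier still gets verified.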

Anti-Pattern

Treating AI output as a draft that needs editing rather than output that needs verification

An analyst receives the CloudAppEvents query from the exchange above, reads it quickly, decides it looks right, and runs it. It returns zero results. The analyst concludes no inbox rules were created. In reality, the query returned zero results because the filter field was wrong, not because no rules exist. The analyst has a false negative from a field-type mismatch. Editing would have missed this because the field name looks correct. Verification (Check 2: does this field exist with this data type?) catches it in seconds. Every AI output is unverified until you verify it. The five-check discipline from C0.2 is the verification method.

The false negative from the CloudAppEvents error is the cost of skipping verification. In a real investigation, that gap means the analyst closes the alert as benign while the attacker's inbox rule continues intercepting financial emails. The difference between editing and verification is the difference between reading the query and checking the schema.
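A schema check of this kind can be partly automated. The sketch below is illustrative and rests on stated assumptions: SCHEMA is a hypothetical, hand-maintained excerpt that you would need to confirm against your own workspace, and the regular expression only understands simple equality filters on quoted strings.

```python
import re
import uuid

# Hypothetical schema excerpt mapping each field to the value format it holds.
# Confirm against your own workspace before relying on it.
SCHEMA = {
    "CloudAppEvents": {"AccountObjectId": "guid",
                       "AccountDisplayName": "text",
                       "ActionType": "text"},
    "SigninLogs": {"UserPrincipalName": "text", "IPAddress": "text"},
}

# Matches simple filters of the form: where Field == "literal"
FILTER = re.compile(r'where\s+(\w+)\s*==\s*"([^"]*)"')


def is_guid(value):
    """Return True when the string parses as a GUID."""
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


def check_filters(table, query):
    """List filters that use an unknown field or a non-GUID value for a GUID field."""
    columns = SCHEMA.get(table)
    if columns is None:
        return [f"unknown table: {table}"]
    problems = []
    for field, literal in FILTER.findall(query):
        if field not in columns:
            problems.append(f"unknown field: {field}")
        elif columns[field] == "guid" and not is_guid(literal):
            problems.append(f"{field} expects a GUID, got {literal!r}")
    return problems


query = """CloudAppEvents
| where AccountObjectId == "user_01@contoso.com"
| where ActionType == "New-InboxRule"
| where TimeGenerated > ago(48h)"""

problems = check_filters("CloudAppEvents", query)
print(problems)
```

The check flags the AccountObjectId filter before the query runs, so the empty result is never mistaken for evidence that no rule exists. It covers only one check and one filter shape, so it supplements the manual discipline rather than replacing it.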

The four AI categories, the four LLM properties, and the hallucination risk gradient give you the vocabulary to evaluate every AI output before you trust it. The next section applies that vocabulary across six SOC functions, showing you where AI acceleration is highest and where the verification cost justifies the risk.
