7.1 Microsoft Sentinel: SIEM + SOAR Architecture

16-20 hours · Module 7
What you already know

Module 6 covered KQL fundamentals: the query language that powers detection and investigation. This module covers the platform where those queries run: Microsoft Sentinel. Workspace architecture, data ingestion, cost management, and the operational decisions that determine whether Sentinel is a security tool or an expensive log store.

Introduction

Required role: Microsoft Sentinel Contributor for workspace configuration; Microsoft Sentinel Reader for queries.

Azure portal > Microsoft Sentinel > Overview
Check the workspace overview: ingestion volume trend, active incidents, and connected data sources.

Every module in this course generates data. Defender XDR generates alerts and incidents. Defender for Endpoint generates device telemetry. Entra ID generates sign-in and audit logs. Purview generates DLP and audit events. Defender for Cloud generates security alerts and recommendations. Until now, you have accessed this data through product-specific portals: the Defender portal for XDR data, the Purview portal for audit data, the Azure portal for cloud security data.

Sentinel is where all of this data converges. It is the central data platform that ingests security data from every source, stores it in a queryable format, runs detection rules against it, generates incidents when threats are found, and enables automated response through playbooks. Sentinel is not just another portal: it is the data infrastructure that makes cross-product investigation (the technique you learned in Modules 3.9, 4.10, and 5.10) operationally scalable.

Anti-Pattern

Deploying Sentinel without a workspace design

A single Sentinel workspace is created with all data sources connected. Six months later, ingestion costs are uncontrollable because nobody decided which data belongs in the Analytics tier, which belongs in Basic logs, and which should not be ingested at all.

This subsection teaches you what Sentinel is architecturally, how it processes data from ingestion to response, and how it differs from traditional SIEM platforms. This architectural understanding is essential for the configuration decisions in subsections 7.2 through 7.12.

What a SIEM does and why it matters

A Security Information and Event Management (SIEM) system serves five functions in security operations.

[Diagram: SIEM, five core functions. (1) Collect: ingest logs from all sources. (2) Store: retain for query and compliance. (3) Detect: run rules against ingested data. (4) Investigate: query data with KQL. (5) Respond: automate actions via playbooks.]
Figure 7.1: The five core SIEM functions. Sentinel implements all five, plus SOAR (Security Orchestration, Automation, and Response) capabilities that extend the Respond function with full workflow automation through Logic Apps playbooks.

Collect: ingest log data from every security-relevant source, including endpoints, identities, email, cloud infrastructure, network devices, firewalls, applications, and third-party security tools. Without comprehensive collection, threats that generate signals in uncollected sources are invisible.

Store: retain the collected data for investigation and compliance. Retention periods vary by regulation and operational need: some data needs 90 days for active investigation, while other data needs 7 years for regulatory compliance. Storage must be cost-effective at scale: a mid-size organization generates 10-50 GB of security log data per day.
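To make the storage arithmetic concrete, the sketch below estimates how much data a workspace holds at steady state for a given daily volume and retention period. The function name is hypothetical, the inputs are the figures quoted above (10-50 GB per day, 90 days, and 7 years taken as 7 × 365 days), and the calculation ignores compression and growth.

```python
def retained_volume_gb(daily_gb: float, retention_days: int) -> float:
    """Steady-state volume held when each day's logs are kept for retention_days."""
    if daily_gb < 0 or retention_days < 0:
        raise ValueError("daily_gb and retention_days must be non-negative")
    return daily_gb * retention_days


SEVEN_YEARS_DAYS = 7 * 365  # simplification: ignores leap days

for daily_gb in (10, 50):
    short_term = retained_volume_gb(daily_gb, 90)
    long_term = retained_volume_gb(daily_gb, SEVEN_YEARS_DAYS)
    print(f"{daily_gb} GB/day -> {short_term:,.0f} GB at 90 days, "
          f"{long_term:,.0f} GB at 7 years")
```

At 50 GB per day, 7-year retention means holding roughly 128 TB, which is why the tier a table is assigned to matters as much as whether it is ingested.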

Detect: run detection rules (analytics rules in Sentinel) against the stored data. Rules evaluate patterns that indicate threats: a sign-in from a new country followed by inbox rule creation (BEC pattern), a process creating a scheduled task after executing a suspicious download (malware persistence), or a storage account accessed from a Tor exit node (data exfiltration). When a rule matches, an alert is generated and grouped into an incident.
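The BEC pattern can be sketched as a sequence rule over in-memory events. This is an illustration of the logic only, not how Sentinel implements analytics rules (those are KQL queries). The `Event` type, the event kind labels, the one-hour window, and the sample users are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    user: str
    kind: str          # "signin" or "inbox_rule_created" (hypothetical labels)
    time: datetime
    country: str = ""  # only meaningful for sign-in events


def detect_bec(events, known_countries, window=timedelta(hours=1)):
    """Flag users whose sign-in from a new country is followed by an inbox rule."""
    new_country_signins = {}  # user -> times of sign-ins from unseen countries
    hits = []
    for ev in sorted(events, key=lambda e: e.time):
        if ev.kind == "signin":
            if ev.country not in known_countries.get(ev.user, set()):
                new_country_signins.setdefault(ev.user, []).append(ev.time)
        elif ev.kind == "inbox_rule_created":
            for signin_time in new_country_signins.get(ev.user, []):
                if timedelta(0) <= ev.time - signin_time <= window:
                    hits.append((ev.user, signin_time, ev.time))
                    break  # one alert per rule-creation event is enough
    return hits


t0 = datetime(2024, 1, 1, 9, 0)
events = [
    Event("alice", "signin", t0, "GB"),
    Event("alice", "inbox_rule_created", t0 + timedelta(minutes=10)),
    Event("bob", "signin", t0, "NG"),
    Event("bob", "inbox_rule_created", t0 + timedelta(minutes=20)),
]
known = {"alice": {"GB"}, "bob": {"GB"}}
alerts = detect_bec(events, known)
print([user for user, _, _ in alerts])
```

Alice creates an inbox rule too, but her sign-in came from a country already seen for her account, so only Bob's sequence matches.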

Investigate: query the stored data to understand what happened. KQL queries (Module 6) are the primary investigation tool. The investigation combines data from multiple tables to build the cross-product timelines you practiced in Modules 3.9, 4.10, and 5.10.

Respond: take action when a threat is confirmed. Automated responses (playbooks) can disable accounts, isolate devices, block IPs, create tickets, send notifications, and orchestrate multi-step remediation workflows. Manual responses use the investigation findings to guide containment and eradication actions.

How Sentinel differs from traditional SIEM

Traditional SIEM platforms (Splunk, IBM QRadar, ArcSight, LogRhythm) were designed for on-premises deployment. Sentinel was designed for cloud-native operation. The differences are not cosmetic: they fundamentally change the operational model.

Traditional SIEM vs Microsoft Sentinel

Aspect | Traditional SIEM | Microsoft Sentinel
Deployment | On-premises servers, sized at purchase | Cloud-native, scales automatically
Capacity planning | Buy hardware for peak + growth | Pay per GB ingested: no hardware
Scaling | Buy more servers, re-architect | Scales transparently: ingest more, pay more
Maintenance | Patch OS, update SIEM, manage storage | Microsoft manages infrastructure
Query language | SPL (Splunk), AQL (QRadar), custom | KQL (shared with Defender XDR, Azure)
Microsoft integration | Requires connectors, often incomplete | Native first-party, one-click connectors
Automation | Separate SOAR product (Demisto, Phantom) | Built-in SOAR (Logic Apps playbooks)
Pricing model | License-based (per EPS, per source) | Consumption-based (per GB/day)
Time to deploy | Weeks to months | Hours (workspace creation to data ingestion)
Threat intelligence | Separate TI platform | Built-in TI integration + Microsoft TI

The practical impact: With traditional SIEM, increasing your log coverage from 10 sources to 50 sources requires hardware upgrades, storage expansion, and capacity planning. With Sentinel, you enable additional data connectors and the platform scales automatically. The barrier to comprehensive visibility is budget (cost per GB), not infrastructure. This changes the security conversation from "we can't collect that data because we don't have capacity" to "we can collect any data: the question is whether the detection value justifies the ingestion cost."

The Log Analytics workspace: Sentinel's data foundation

Sentinel does not have its own data store. It runs on top of a Log Analytics workspace: the Azure data platform that stores log data in tables and provides KQL query capability. When you "create a Sentinel workspace," you are actually enabling the Sentinel solution on an existing (or new) Log Analytics workspace. The workspace is the foundation. Sentinel adds the security layer: analytics rules, incidents, automation, hunting, workbooks, and the security-specific features.

Understanding this architecture matters for three reasons. First, Log Analytics workspace settings (retention, access controls, pricing tier) directly affect Sentinel's capabilities and cost. Second, some settings are configured at the workspace level (retention policies, data collection rules) and others at the Sentinel level (analytics rules, automation rules). Third, the workspace can contain non-security data (performance counters, application logs) alongside security data, and Sentinel queries can access all data in the workspace, not just security tables.

[Diagram: Sentinel architecture, layered model. Data sources (M365 / Defender, Entra ID, Azure Activity, Firewalls / Syslog, Custom Apps, AWS / GCP) feed the Log Analytics Workspace (data platform). Microsoft Sentinel (SIEM + SOAR layer) sits on top: Analytics Rules, Incidents, Automation, Hunting, Workbooks, TI.]
Figure 7.2: Sentinel's layered architecture. Data sources feed into the Log Analytics workspace (the data platform). Sentinel runs on top of the workspace, providing analytics rules, incidents, automation, hunting, workbooks, and threat intelligence. The workspace stores the data; Sentinel makes it actionable for security operations.

The SOAR component: automation rules and playbooks

SIEM collects data and detects threats. SOAR (Security Orchestration, Automation, and Response) automates the response. Sentinel includes SOAR natively through two mechanisms.

Automation rules are lightweight, no-code logic that runs when an incident is created or updated. They can: change the incident severity (escalate a medium-severity incident to high if it involves a VIP user), assign the incident to a specific analyst or team (route BEC incidents to the email security team), add tags for categorisation (tag incidents involving external IPs with "external-threat"), suppress known false positive patterns (auto-close incidents from a known-benign source), and trigger playbooks for more complex automation.

Automation rules are the first line of automated response: they handle the routine incident management tasks that otherwise consume analyst time. In a mature SOC, automation rules handle 60-80% of incident management actions automatically: assigning, tagging, severity adjustment, and false positive suppression. The analyst's time is reserved for investigation and response decisions that require human judgment.
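The condition-then-action shape of automation rules can be sketched in a few lines. This is a simplified model, not Sentinel's rule engine or its API: the `Incident` fields, the VIP and benign-source lists, the team name, and the rule order are assumptions chosen to mirror the examples above.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    title: str
    severity: str
    entities: set
    tags: set = field(default_factory=set)
    owner: str = ""
    status: str = "New"


VIP_USERS = {"ceo@contoso.example"}   # hypothetical VIP list
BENIGN_SOURCES = {"vuln-scanner-01"}  # hypothetical known-benign source


def apply_automation_rules(incident: Incident) -> list:
    """Evaluate rules in a fixed order and return the actions that fired."""
    actions = []
    # Suppression runs first so closed incidents are not escalated or routed.
    if incident.entities & BENIGN_SOURCES:
        incident.status = "Closed"
        incident.tags.add("known-benign")
        actions.append("auto-closed")
        return actions
    if incident.severity == "Medium" and incident.entities & VIP_USERS:
        incident.severity = "High"
        actions.append("escalated")
    if "BEC" in incident.title:
        incident.owner = "email-security-team"
        actions.append("assigned")
    return actions


vip_incident = Incident("BEC: suspicious inbox rule", "Medium",
                        {"ceo@contoso.example", "203.0.113.7"})
print(apply_automation_rules(vip_incident), vip_incident.severity, vip_incident.owner)
```

Ordering is the main design decision: putting suppression first keeps known false positives from being escalated or assigned before they are closed.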

Playbooks are full automation workflows built on Azure Logic Apps. They can call any API, integrate with any service, and orchestrate multi-step response workflows. For example, when a high-severity BEC incident is created, a playbook automatically resets the user's password, revokes all sessions, checks for inbox forwarding rules (and deletes them if found), sends a notification to the user's manager, creates a ticket in ServiceNow, and posts an alert to the SOC Slack channel, all within 30 seconds of the incident being created.

Playbooks bridge the gap between detection and response. Without automation, the sequence is: detection → analyst reads alert → analyst decides action → analyst takes action (multiple portal clicks). With a playbook, the sequence is: detection → playbook takes pre-approved actions immediately → analyst reviews and adjusts. The time from detection to initial containment drops from minutes (manual) to seconds (automated).
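The ordered, multi-step nature of a playbook is easier to see as code. Real playbooks are Logic Apps workflows that call service connectors; the sketch below replaces every external call with a stub that only records what it would do. The function name, the rule dictionary shape, and the log messages are hypothetical.

```python
def run_bec_playbook(user: str, mailbox_rules: list, log: list) -> int:
    """Run the pre-approved BEC containment steps in order.

    Each step appends a description to log instead of calling a real service.
    Returns the number of forwarding rules that would be deleted.
    """
    log.append(f"reset password: {user}")
    log.append(f"revoke sessions: {user}")
    # Only rules that forward mail outside the organization are removed.
    forwarding = [r for r in mailbox_rules if r.get("forwards_externally")]
    for rule in forwarding:
        log.append(f"delete inbox rule: {rule['name']}")
    log.append(f"notify manager of: {user}")
    log.append(f"create ticket for: {user}")
    log.append("post to SOC channel")
    return len(forwarding)


steps = []
rules = [
    {"name": "Move newsletters", "forwards_externally": False},
    {"name": "fwd-all", "forwards_externally": True},
]
deleted = run_bec_playbook("bob@contoso.example", rules, steps)
print(deleted)
for step in steps:
    print(step)
```

Containment steps (password reset, session revocation) come before notification and ticketing, so the account is locked down even if a later step fails.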

Module 10 covers analytics rules (detection) and automation rules in detail. Module 8 covers playbook creation. This subsection establishes the architectural context: Sentinel is not just a data collection platform but a detection and response platform where automation is a core capability, not a bolt-on.

Data flow: from source to investigation

Understanding how data flows through Sentinel clarifies the configuration decisions in subsequent subsections.

Step 1: Ingestion. Data sources send log data to the Log Analytics workspace through data connectors (Module 8). Microsoft sources (Defender XDR, Entra ID, Azure Activity) use built-in connectors with minimal configuration. Third-party sources (firewalls, SaaS applications, endpoint tools) use Syslog/CEF, API connectors, or custom data collection rules. Each data source writes to a specific table: Defender XDR writes to SecurityAlert and SecurityIncident, Entra ID writes to SigninLogs and AuditLogs, endpoints write to DeviceProcessEvents and DeviceNetworkEvents.

Step 2: Storage. Ingested data is stored in the workspace according to the table's log tier assignment: Analytics (full query capability, 90-day default retention), Basic (limited query, 30-day retention, lower cost), or Archive (no live query, long-term retention, lowest cost). The tier determines what you can do with the data. Analytics tier data supports full KQL queries and analytics rules. Basic tier data supports limited queries. Archive tier data must be restored before querying. Subsection 7.4 covers tier selection in detail.
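The tier rules above amount to a small capability lookup. The sketch below encodes only what this step states; the table-to-tier assignments are hypothetical examples (`FirewallVerbose_CL` and `OldAuditArchive` are made-up names), not recommendations.

```python
from enum import Enum


class Tier(Enum):
    ANALYTICS = "Analytics"
    BASIC = "Basic"
    ARCHIVE = "Archive"


# Capabilities as described in Step 2 above.
CAPABILITIES = {
    Tier.ANALYTICS: {"full_kql": True, "analytics_rules": True, "needs_restore": False},
    Tier.BASIC: {"full_kql": False, "analytics_rules": False, "needs_restore": False},
    Tier.ARCHIVE: {"full_kql": False, "analytics_rules": False, "needs_restore": True},
}

# Hypothetical assignments for illustration only.
TABLE_TIERS = {
    "SigninLogs": Tier.ANALYTICS,
    "FirewallVerbose_CL": Tier.BASIC,
    "OldAuditArchive": Tier.ARCHIVE,
}


def can_back_detection_rule(table: str) -> bool:
    """True if the table's tier supports scheduled analytics rules."""
    return CAPABILITIES[TABLE_TIERS[table]]["analytics_rules"]


for table, tier in TABLE_TIERS.items():
    print(table, tier.value, can_back_detection_rule(table))
```

A check like this is useful before moving a table to a cheaper tier: any analytics rule that depends on that table stops being viable once the table leaves the Analytics tier.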

Step 3: Detection. Sentinel analytics rules run on a schedule (every 5 minutes, every hour, etc.) and evaluate KQL queries against the ingested data. When a rule's query returns results, an alert is created. Alerts are grouped into incidents by the correlation engine based on shared entities (the same user, IP, or device appears in multiple alerts). Incidents appear in the Sentinel incident queue for analyst investigation. Module 10 covers analytics rule creation.
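Grouping alerts by shared entities can be modeled as finding connected components: two alerts belong to the same incident if they share an entity, directly or through a chain of other alerts. The union-find sketch below illustrates that idea under stated assumptions; it is not Sentinel's actual correlation engine, and the alert IDs and entity strings are invented.

```python
def group_alerts(alerts: dict) -> list:
    """Group alert IDs that share entities, directly or transitively.

    alerts maps alert_id -> set of entity strings.
    Returns a list of sets of alert IDs, one set per incident.
    """
    parent = {alert_id: alert_id for alert_id in alerts}

    def find(a):
        # Follow parent links to the group representative, compressing the path.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    first_alert_for_entity = {}
    for alert_id, entities in alerts.items():
        for entity in entities:
            if entity in first_alert_for_entity:
                # Shared entity: merge this alert's group with the earlier one.
                parent[find(alert_id)] = find(first_alert_for_entity[entity])
            else:
                first_alert_for_entity[entity] = alert_id

    groups = {}
    for alert_id in alerts:
        groups.setdefault(find(alert_id), set()).add(alert_id)
    return sorted(groups.values(), key=lambda g: sorted(g))


sample_alerts = {
    "A1": {"user:bob", "ip:203.0.113.7"},
    "A2": {"ip:203.0.113.7", "device:LT-01"},
    "A3": {"device:LT-01"},
    "A4": {"user:carol"},
}
incidents = group_alerts(sample_alerts)
print(incidents)
```

A1 and A3 share no entity directly, but both connect through A2, so all three land in one incident while A4 stays separate.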

Step 4: Investigation. The analyst opens the incident, reviews the alerts, and uses KQL to query the workspace data for additional context. This is where the skills from Modules 1-6 converge: KQL queries (Module 6) against sign-in data (Module 1), endpoint data (Module 2), audit data (Module 3), cloud security data (Module 4), with Copilot assistance (Module 5).

Step 5: Response. Based on the investigation findings, the analyst takes containment and remediation actions, either manually (through the Defender portal or Azure portal) or through playbooks that automate pre-approved response actions. Automation rules can also trigger automatic response at the moment of incident creation, before the analyst even sees the incident.

Why Sentinel for Microsoft environments

If your security stack is primarily Microsoft (M365, Azure, Entra ID, Defender XDR), Sentinel provides native integration that no third-party SIEM can match. The Defender XDR data connector ingests all Defender product data with one click. The Entra ID connector ingests sign-in and audit data with one click. The Azure Activity connector ingests management plane data with one click. The data arrives in well-structured tables with a consistent schema. The KQL queries you write for Defender XDR Advanced Hunting work in Sentinel with minimal modification (some table names differ, but the query patterns are identical).

For organizations running mixed environments (Microsoft + third-party firewalls, non-Microsoft endpoint tools, SaaS applications), Sentinel provides Syslog/CEF connectors for network devices, API connectors for SaaS platforms, and custom data collection rules for bespoke data sources. The unified workspace means you query Microsoft data and third-party data with the same KQL, in the same workspace, with the same investigation workflow.

The alternative, running a third-party SIEM alongside Defender XDR, creates a split-brain problem: some data is in the SIEM, some is in Defender XDR, and cross-product investigation requires querying both platforms. Sentinel eliminates the split brain by serving as both the SIEM and the investigation platform.

SC-200 exam assumption

The SC-200 exam assumes Sentinel is your SIEM. Questions reference Sentinel tables (SecurityAlert, SigninLogs, DeviceProcessEvents), Sentinel features (analytics rules, automation rules, hunting, workbooks), and the unified security operations platform (Sentinel + Defender XDR). The exam does not test third-party SIEM products. Your Sentinel knowledge is the foundation for 40-45% of the exam questions.

Try it yourself

If you have a Sentinel workspace from Module 0 setup, navigate to Microsoft Sentinel in the Azure portal. Review the Overview page: the data ingestion volume chart (how much data is entering the workspace daily), the active analytics rules count, the open incidents count, and the enabled data connectors. If no Sentinel workspace exists, create one now: the setup takes 5 minutes and is covered step-by-step in subsection 7.3. Having a live workspace to work with through the rest of this module transforms the content from theoretical to hands-on.

The Overview page shows a dashboard with ingestion trends, incident metrics, and connector status. In a fresh lab workspace, you may see minimal data (if no connectors are enabled) or moderate data (if you connected Defender XDR and Entra ID in Module 0). The key metric to note: daily ingestion volume in GB. This number drives cost (subsection 7.5) and determines which log tiers are appropriate (subsection 7.4).

Compliance Context

Licensing a product is not deploying a product. Defender for Endpoint requires onboarding devices, configuring policies, and validating telemetry. Defender for Office 365 requires configuring anti-phishing policies, safe links, and safe attachments for your specific domains and VIPs. An enabled-but-unconfigured Defender product provides less protection than a properly configured open-source alternative. Configuration is the capability: the license is just the starting point.

Section Reference

Review the techniques covered in this subsection and add the operational patterns to your team's runbook. Each pattern is adaptable to your specific environment and security requirements.