Analyst Tiers and Role Architecture

8-10 hours · Module 1 · Free

What you already know

Section 1.1 documented the operating model — who does the work and when. This section defines how the work is divided within the team. L1, L2, and L3 are not seniority grades — they're capability specializations with different scope, different tools, and different time horizons. You'll define the boundaries that determine when an alert moves between tiers.

Tiers are specializations, not ranks

Scenario

Your L1 analyst spends 14 minutes on an alert — checking sign-in logs, correlating with inbox activity, querying for related device events. The investigation is thorough. It's also an L2 investigation being done at L1 speed, on L1 time, while 12 other alerts queue up. Meanwhile, the L3 analyst planned to spend this morning reviewing false positive patterns — but they've been pulled into queue coverage because three high-severity alerts arrived during a staffing gap. Nobody is doing L3 work. Nobody notices because L3 work has no SLA.

The three-tier SOC model is universal in design and dysfunctional in practice. Nearly every SOC defines L1, L2, and L3 roles. Nearly every SOC operates them as seniority bands rather than capability specializations — L1 is junior, L2 is senior, L3 is whatever the most experienced person does when they're not helping with queue overflow. The problem isn't the model. The problem is that the boundaries between tiers are undefined, so alerts flow wherever capacity exists rather than where capability matches.

A tier is a capability specialization with a defined scope, a defined time boundary, and a defined escalation trigger. L1 triage has a 15-minute scope: classify or escalate within 15 minutes. L2 investigation has a 2-hour scope: determine full scope and recommend action within 2 hours. L3 deep analysis has no queue pressure: L3 analysts work on detection feedback, complex investigations, and program improvement during protected time. The boundaries make the tiers functional. Without them, everyone does everything and nobody runs the feedback loop.

Estimated time: 40 minutes.

Figure 1.2 — The three tiers as capability specializations. Each tier has a defined scope, time boundary, and escalation trigger. Without enforced boundaries, L1 analysts do L2 work slowly while L3 analysts cover queue gaps instead of maintaining the feedback loop.

L1 — Triage

L1 triage is the first human evaluation of every alert. The L1 analyst's job is classification: determine whether the alert represents a true positive, false positive, benign true positive, or requires escalation — and do it within 15 minutes. This is not superficial work. It's structured evaluation using a documented decision framework (built in Section 1.5) that checks specific criteria in a specific order.

The 15-minute boundary

Fifteen minutes is the L1 scope boundary. If the analyst can't classify the alert in 15 minutes, the alert exceeds L1 capability and must be escalated to L2. This boundary exists for two reasons. First, it prevents the L1 analyst from spending L2 time on a single alert while 12 others queue. Second, it recognizes that if the analyst can't classify in 15 minutes, the alert likely requires investigation depth that the L1 decision framework isn't designed to provide.

The 15-minute boundary is not a speed target — it's a scope boundary. An analyst who closes every alert in 3 minutes is probably not investigating thoroughly enough. An analyst who spends 14 minutes on every alert is probably doing L2 work at L1 speed. The target is for 70-80% of alerts to resolve at L1 within 15 minutes using the triage framework. The remaining 20-30% escalate to L2 for investigation.

What L1 does in 15 minutes

L1 triage follows the decision framework from Section 1.5. For each alert, the analyst reads the alert details (rule name, severity, entities, timestamps); checks the entity against known-good lists (service accounts, VPN exit IPs, scheduled processes); runs the standard enrichment queries for the alert type (sign-in history for credential alerts, inbox rule audit for email alerts, process history for endpoint alerts); evaluates the enrichment results against the classification criteria; and records the disposition with a brief rationale.
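As an illustration of a standard enrichment query for a credential alert, the sketch below summarizes a user's recent sign-in history by source IP. It assumes Entra ID sign-in logs are connected to the workspace as SigninLogs; the TargetUser value is a hypothetical placeholder, and the 30-day lookback is an assumption rather than a value from the triage framework.

```kql
// Sign-in history enrichment for one user (sketch).
// Assumption: Entra ID sign-in logs land in the SigninLogs table.
let TargetUser = "user@example.com";   // hypothetical placeholder: the user named in the alert
SigninLogs
| where TimeGenerated > ago(30d)
| where UserPrincipalName =~ TargetUser
| summarize
    SignIns = count(),
    Failures = countif(ResultType != "0"),   // ResultType "0" means success
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated),
    Apps = make_set(AppDisplayName, 10)
    by IPAddress, Location
| order by LastSeen desc
```

An IP with a long FirstSeen-to-LastSeen history and many successful sign-ins supports a "matches the user's history" finding. An IP that appears only in the alert window does not, and that is the kind of result the classification criteria have to weigh.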

The L1 fifteen minutes — how the time breaks down

0:00-1:00 — Read alert: rule name, severity, entities, timestamps, context

1:00-3:00 — Check known sources: VPN IPs, service accounts, scheduled processes, VIP list

3:00-8:00 — Run enrichment queries: sign-in history, MFA method, related alerts, entity context

8:00-12:00 — Evaluate results against classification criteria: TP / FP / BTP / Undetermined

12:00-15:00 — Record disposition with rationale — or escalate if classification isn't possible

At 15:00 — if not classified, escalate via capability trigger. Don't extend. The queue is waiting.
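The hard stop at 15:00 can be expressed as a simple rule. The sketch below uses an inline datatable of hypothetical in-progress alerts (the table, column names, and values are all invented for illustration; real triage start times would come from your ticketing system) and a fixed AsOf timestamp so the result is reproducible.

```kql
// Hypothetical snapshot of alerts currently in L1 triage.
let AsOf = datetime(2026-01-15 09:30:00);   // fixed "now" for a reproducible example
let L1BoundaryMinutes = 15;                 // the L1 scope boundary from this section
let InTriage = datatable(AlertId:string, Analyst:string, TriageStarted:datetime, Classified:bool)
[
    "A-101", "analyst-a", datetime(2026-01-15 09:27:00), false,
    "A-102", "analyst-a", datetime(2026-01-15 09:12:00), false,
    "A-103", "analyst-b", datetime(2026-01-15 09:05:00), true
];
InTriage
| extend MinutesInTriage = datetime_diff("minute", AsOf, TriageStarted)
| extend NextStep = case(
    Classified, "Record disposition",
    MinutesInTriage >= L1BoundaryMinutes, "Escalate to L2",
    "Continue triage")
| project AlertId, Analyst, MinutesInTriage, NextStep
```

With this sample data, A-101 (3 minutes in) continues, A-102 (18 minutes, unclassified) escalates, and A-103 is already classified. The rule has no branch for extending the clock, which is the point of treating 15 minutes as a scope boundary.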

The key distinction is between alerts the framework resolves and alerts the framework can't resolve. A brute-force alert where the source IP is a known pen-test scanner resolves immediately (FP, known scanner). A credential compromise alert where MFA succeeded normally and the IP matches the user's history resolves in 5 minutes (BTP or FP). A credential alert where the user is a finance executive, the IP is unfamiliar, and MFA succeeded via a "satisfied by claim" assertion rather than an interactive prompt does not resolve within 15 minutes, because it requires L2 investigation to determine whether the authentication is legitimate or the result of an AiTM token capture. That alert escalates.
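To show how the "satisfied by claim" condition might be surfaced during enrichment, here is a minimal sketch. It assumes SigninLogs is available and that the AuthenticationDetails column carries per-step fields named authenticationMethod and authenticationStepResultDetail; verify those field names against your own logs before relying on them. TargetUser is a hypothetical placeholder.

```kql
// Successful sign-ins where an MFA step was satisfied by a token claim
// rather than an interactive prompt (sketch; field names are assumptions to verify).
let TargetUser = "user@example.com";   // hypothetical placeholder
SigninLogs
| where TimeGenerated > ago(1d)
| where UserPrincipalName =~ TargetUser
| where ResultType == "0"
| mv-expand Step = todynamic(AuthenticationDetails)
| extend
    Method = tostring(Step.authenticationMethod),
    StepResult = tostring(Step.authenticationStepResultDetail)
| where StepResult contains "satisfied by claim"
| project TimeGenerated, UserPrincipalName, IPAddress, Location, AppDisplayName, Method, StepResult
```

A match here does not prove token theft; claim-satisfied MFA is common in legitimate sessions. It only confirms one of the conditions that, combined with an unfamiliar IP and a high-value user, pushes the alert past what the L1 framework can classify.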

What L1 doesn't do

L1 does not investigate. Investigation means tracing the full scope of an incident — what systems did the attacker access, what actions did they take, what data was exposed, what containment is needed. Investigation requires KQL queries that join multiple tables, entity timeline reconstruction, and judgment about scope that the L1 framework doesn't provide. When an L1 analyst spends 30 minutes joining SigninLogs with OfficeActivity to trace an authentication anomaly through subsequent email access, they're doing L2 work on L1 time — and the queue is growing.

L2 — Investigation

L2 investigation picks up where L1 triage ends. The L2 analyst receives an escalated alert and determines the full scope of the incident — every action the attacker took, every system they accessed, the complete timeline from initial access to current state.

The 2-hour investigation boundary

L2 investigations have a 2-hour soft boundary. Most incidents that L2 handles — credential compromise, inbox rule manipulation, suspicious application consent — can be scoped within 2 hours using KQL queries in Advanced Hunting, entity pages in Defender XDR, and UEBA insights in Sentinel. If an investigation exceeds 2 hours, it typically indicates either a multi-system compromise that requires L3/IR involvement or an investigation that needs a different approach.

What L2 produces

Every L2 investigation produces three outputs. First, a scope determination — what happened, what the attacker accessed, what the blast radius is. Second, a containment recommendation — what actions should be taken (disable account, revoke sessions, block IP, isolate device) and whether those actions require approval from the SOC lead or security management. Third, an investigation record in the ticket — the evidence collected, the queries run, the timeline reconstructed, and the rationale for the determination.

NE L2 Investigation Output — INC-NE-2026-0342 (Excerpt)

Scope Determination

Account c.richardson@ne.com compromised via AiTM token capture at 06:47. Inbox rule "Archive-2026" created at 06:49 moving invoice/payment emails to hidden folder. MailItemsAccessed shows 847 email reads between 06:50 and 08:15. No external forwarding. No lateral movement to other accounts.

Containment Recommendation

1. Disable account immediately.

2. Revoke all sessions.

3. Delete inbox rule "Archive-2026".

4. Reset password + MFA.

5. Review MailItemsAccessed for sensitive data exposure.

Classification

True Positive → Severity 2 (VIP Finance user, confirmed active compromise). SOC lead notified.
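Scope findings like those in the excerpt typically come from mailbox audit data. The sketch below is one way such evidence might be gathered; it assumes the Office 365 connector populates OfficeActivity and that MailItemsAccessed auditing is available in the tenant. The user and time window are hypothetical placeholders, not values from the incident above.

```kql
// Inbox rule changes and mail access for one mailbox in a fixed window (sketch).
let TargetUser = "user@example.com";               // hypothetical placeholder
let WindowStart = datetime(2026-01-15 06:00:00);   // hypothetical window
let WindowEnd = datetime(2026-01-15 09:00:00);
OfficeActivity
| where TimeGenerated between (WindowStart .. WindowEnd)
| where UserId =~ TargetUser
| where Operation in ("New-InboxRule", "Set-InboxRule", "MailItemsAccessed")
| summarize
    Events = count(),
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated),
    ClientIPs = make_set(ClientIP, 10)
    by Operation
```

Recording the query alongside its results in the ticket is what makes the investigation record reproducible for later quality review.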

The investigation record is critical for two purposes beyond the immediate incident. It feeds the quality metrics (Section 1.6) — was the L1 escalation appropriate? Was the L2 determination correct? It also feeds the detection feedback loop — did the investigation reveal that the detection rule should have fired earlier, or that a detection gap exists for a technique the attacker used?

L2 tools and skills

L2 analysts need deeper tool proficiency than L1: KQL at an intermediate level (joins, summarize with make_set/make_list, time-series analysis, and cross-table correlation); Defender XDR Advanced Hunting for cross-product queries; entity pages for quick entity context; and UEBA for behavioral anomaly context. They also need the judgment to distinguish correlation from causation: just because two events happened near each other in time doesn't mean they're related.
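The cross-table correlation described above might look like the following sketch, which joins a suspicious sign-in to the mailbox activity that followed it. It assumes both SigninLogs and OfficeActivity are connected; SuspectUser and SuspectIp are hypothetical placeholders (the IP is from a documentation range), and the one-day lookback is an assumption.

```kql
// Mailbox activity after the first successful sign-in from a suspicious IP (sketch).
let SuspectUser = "user@example.com";   // hypothetical placeholder
let SuspectIp = "203.0.113.10";         // hypothetical placeholder (documentation range)
let Lookback = 1d;
SigninLogs
| where TimeGenerated > ago(Lookback)
| where UserPrincipalName =~ SuspectUser and IPAddress == SuspectIp
| where ResultType == "0"
| summarize FirstSuspectSignIn = min(TimeGenerated) by User = tolower(UserPrincipalName)
| join kind=inner (
    OfficeActivity
    | where TimeGenerated > ago(Lookback)
    // Lowercase the key on both sides because join keys are case-sensitive.
    | project User = tolower(UserId), ActivityTime = TimeGenerated, Operation, ClientIP
  ) on User
| where ActivityTime >= FirstSuspectSignIn
| summarize
    Events = count(),
    Operations = make_set(Operation, 50),
    ClientIPs = make_set(ClientIP, 20),
    FirstActivity = min(ActivityTime),
    LastActivity = max(ActivityTime)
    by User, FirstSuspectSignIn
```

The result lists everything the account did after the suspicious sign-in, not everything the attacker did. Separating the two, for example by comparing ClientIPs against the user's normal sources, is where the correlation-versus-causation judgment comes in.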

The L1-to-L2 escalation quality problem

The quality of the L1-to-L2 handoff determines whether L2 time is spent investigating or re-triaging. A well-structured escalation includes: what the alert is, what enrichment the L1 analyst ran, what the results showed, and specifically what the L1 analyst couldn't determine. "I ran the sign-in history query and the IP is unfamiliar. The user is in Finance. MFA succeeded via claim assertion rather than an interactive prompt. I can't determine if this is legitimate or AiTM. Requesting L2 investigation."

A poorly structured escalation is: "Looks suspicious. Escalating." That escalation forces L2 to restart triage from zero — consuming 15 minutes of L2 time on work L1 already did (or should have done). The escalation format is defined in Section 1.4. Getting it right is the difference between L2 spending 80% of their time investigating and 80% of their time re-triaging.

The metric that catches this is escalation accuracy — what percentage of L1 escalations result in L2 determining the alert actually warranted investigation (not re-classifiable at L1). A healthy escalation accuracy rate is 60-80%. Below 50% means L1 is over-escalating — either the triage framework is too conservative or L1 training needs attention. Above 90% might mean L1 is under-escalating — keeping alerts they should hand off, leading to missed investigations or delayed detection.
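As a worked example of the metric, the sketch below computes escalation accuracy per L1 analyst from a hypothetical escalation log. The datatable, its column names, and the outcome labels are invented for illustration, since Sentinel has no built-in escalation table. The 50%, 60-80%, and 90% thresholds come from the text above; the label for rates that fall between those bands is an assumption.

```kql
// Escalation accuracy per L1 analyst from a hypothetical escalation log.
let Escalations = datatable(EscalationId:int, L1Analyst:string, L2Outcome:string)
[
    1,  "analyst-a", "InvestigationWarranted",
    2,  "analyst-a", "InvestigationWarranted",
    3,  "analyst-a", "ReclassifiableAtL1",
    4,  "analyst-a", "InvestigationWarranted",
    5,  "analyst-a", "InvestigationWarranted",
    6,  "analyst-b", "ReclassifiableAtL1",
    7,  "analyst-b", "InvestigationWarranted",
    8,  "analyst-b", "ReclassifiableAtL1",
    9,  "analyst-b", "ReclassifiableAtL1",
    10, "analyst-b", "InvestigationWarranted"
];
Escalations
| summarize
    Total = count(),
    Warranted = countif(L2Outcome == "InvestigationWarranted")
    by L1Analyst
| extend AccuracyPct = round(100.0 * Warranted / Total, 1)
| extend Reading = case(
    AccuracyPct < 50, "Likely over-escalating",
    AccuracyPct > 90, "Possibly under-escalating",
    AccuracyPct >= 60 and AccuracyPct <= 80, "Healthy range",
    "Between bands, watch the trend")   // assumption: the text does not label these ranges
```

In this sample, analyst-a lands at 80% (healthy range) and analyst-b at 40% (likely over-escalating). Five escalations per analyst is far too few to act on; in practice the rate only becomes meaningful over a month or more of data.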

L3 — Deep analysis and the feedback loop

L3 is where the feedback loop lives. The L3 analyst handles complex investigations that exceed L2 scope (multi-system compromises, insider threat indicators, APT-pattern activity) and, more importantly, performs the proactive work that improves the SOC over time — false positive analysis, detection rule development, ATT&CK coverage assessment, and analyst training.

The time protection problem

L3 is the most important tier and the most frequently stolen. When queue pressure builds — three high-severity alerts arrive during a staffing gap, or an analyst is on leave and the remaining team can't keep up — the L3 analyst is the obvious person to pull into queue coverage. They're the most experienced. They resolve alerts fastest. And every hour they spend in the queue is an hour not spent on the work that makes the SOC better.

In a survey of SOC leads, the most common pattern is L3 time allocated at 60% proactive work and 40% queue support in theory, and 20% proactive work and 80% queue support in practice. The queue always wins because it has an SLA. The feedback loop never has an SLA. Detection tuning can always wait until next week. Next week the queue is heavy again. The tuning never happens.

What L3 time protection looks like

NE's post-incident restructuring addressed this directly. The SOC lead's calendar now blocks 10 hours per week as "L3 protected time" — visible to the team, not available for queue coverage except during declared incidents. During L3 time, the lead reviews false positive data from the past week, evaluates escalation patterns for training opportunities, works the detection backlog (converting investigation findings into detection rules), and runs the monthly tuning review.
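The weekly false positive review could start from a query like the sketch below. It assumes incidents are closed in Sentinel with a classification set, and it groups by incident title as a stand-in for the detection rule, which is a simplification.

```kql
// False positives closed in the past week, grouped by incident title (sketch).
SecurityIncident
| where TimeGenerated > ago(7d)
// SecurityIncident logs one row per update; keep only the latest state of each incident.
| summarize arg_max(TimeGenerated, *) by IncidentNumber
| where Status == "Closed" and Classification == "FalsePositive"
| summarize
    FalsePositives = count(),
    Reasons = make_set(ClassificationReason, 5)
    by Title
| order by FalsePositives desc
```

The titles at the top of the list are the natural candidates for the tuning backlog worked during protected L3 time.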

The protection has to be structural, not aspirational. A policy that says "the L3 analyst should spend 50% of time on proactive work" fails when the queue demands attention. A calendar block that the team treats as equivalent to an analyst being on leave succeeds because the L1/L2 analysts plan for reduced queue capacity during those hours. NE runs with two analysts covering L1/L2 during the SOC lead's L3 blocks — the same staffing they'd have if the lead were on leave.

Run this query to see how your SOC's investigation time is actually distributed — it reveals whether L2/L3 time is being consumed by triage work:

KQL — Analyst Time Distribution (Who's Doing What?)

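// How is analyst time distributed across triage vs investigation?
// SecurityIncident logs one row per incident update, so keep only the
// latest row per incident before filtering on Status.
SecurityIncident
| where TimeGenerated > ago(30d)
| summarize arg_max(TimeGenerated, *) by IncidentNumber
| where Status == "Closed"
// Owner is a dynamic column; cast to string so it can be a grouping key.
| extend Analyst = tostring(Owner.assignedTo)
| extend TimeToClose = datetime_diff("minute",
    ClosedTime, CreatedTime)
| summarize
    IncidentsClosed = count(),
    MedianMinutes = percentile(TimeToClose, 50),
    AvgMinutes = round(avg(TimeToClose), 1),
    Quick = countif(TimeToClose < 15),
    Extended = countif(TimeToClose > 60)
    by Analyst
| extend QuickRate = round(100.0 * Quick / IncidentsClosed, 1)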

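If one analyst has a significantly higher Extended count, they may be doing L2 work on L1 time. If the SOC lead has a high QuickRate, they're likely being consumed by queue work instead of L3 activities. Treat the numbers as a proxy: time-to-close runs from incident creation to closure, so it includes queue wait, and Owner reflects only the final assignee. Even so, the distribution shows how tiers actually operate versus how they're supposed to operate.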

What we see in 90% of SOC tier structures

L1, L2, and L3 are defined in the job descriptions but not in the operating procedures. The boundaries are "L1 handles simple stuff, L2 handles complex stuff, L3 handles the really complex stuff." Nobody defined "simple" or "complex." In practice, L1 analysts who are thorough spend 25 minutes on alerts that should escalate at 15. L2 analysts get pulled into L1 queue coverage during peaks. L3 analysts rarely do L3 work because the queue always needs help. The result: investigation quality depends on who's on shift, the feedback loop doesn't run, and the SOC processes the same quality of alerts forever.

NE's tier structure — three people, full coverage

NE's SOC has three people: Tom Ashworth (L1/L2 rotation), Priya Sharma (L1/L2 rotation), and the SOC lead (L2/L3 with protected L3 time). BlueVoyant covers after-hours L1 triage. This is a minimal team. The tier structure works because the boundaries are explicit.

During business hours, one analyst runs L1 triage (queue primary) while the other is available for L2 investigation. They rotate daily. When no L2 investigations are active, the second analyst also triages L1 — but the L2 role takes priority over L1 throughput. If an L1 alert needs escalation, the L2 analyst drops L1 work immediately.

The SOC lead handles complex L2 escalations and dedicates 10 protected hours per week to L3 work. The protection is enforced by the team: during L3 blocks, Tom and Priya handle the full queue. If a Severity 1 incident arrives during L3 time, the lead engages — that's the exception that overrides protection. Everything else waits.

What the structure produces

With boundaries enforced, NE's three-person team produces: consistent L1 triage quality (because the decision framework runs within 15 minutes, not ad-hoc investigation), L2 investigations that scope incidents fully (because the L2 analyst isn't also trying to keep the L1 queue moving), and genuine L3 improvement work (because 10 hours per week is protected, not aspirational).

Before the boundaries, the same three people produced: inconsistent triage (because some alerts got 3 minutes and others got 25), incomplete investigations (because the L2 analyst was simultaneously monitoring L1), and zero L3 work (because the SOC lead was always in the queue). Same people. Same tools. Different structure.

Career progression through tiers

The tier structure also defines career progression. L1 competencies: triage framework execution, basic enrichment queries, alert classification accuracy above 85%. L2 competencies: KQL at intermediate level, cross-table investigation, scope determination, containment recommendation. L3 competencies: false positive root cause analysis, detection rule development, ATT&CK coverage assessment, program metrics and reporting.

Each tier has defined competencies that the analyst can work toward. Progression from L1 to L2 happens when the analyst demonstrates L2 competencies consistently — not after a fixed time period. Some analysts reach L2 proficiency in 6 months. Others take 18 months. The competency framework makes the criteria transparent.

This progression path matters for retention. One of the primary drivers of SOC burnout is the feeling of being stuck — triaging the same alerts with the same tools and no visible path to growth. A defined competency framework gives the L1 analyst concrete skills to develop and a measurable threshold to reach. At NE, Tom and Priya both started at L1. Tom reached L2 proficiency in 8 months by focusing on KQL skill development. Priya reached L2 in 10 months by developing stronger investigation methodology. Both paths are valid because the competency framework measures output, not approach. The framework also makes the conversation with management concrete: "I've demonstrated L2 competencies in these areas — here's the evidence — and I'm requesting the role change" is a defensible conversation that doesn't depend on subjective assessment.

SOC Operations Principle

Tiers are capability specializations with defined scope boundaries, not seniority grades. L1 has 15 minutes to classify or escalate. L2 has 2 hours to scope and recommend. L3 has protected time for the work that makes the SOC better. Without enforced boundaries, everyone does everything, the feedback loop never runs, and the SOC stays exactly where it is.

Section 1.3 — Shift Handover Design. Tiers define how work divides within a shift. Handover defines how work transfers between shifts — what context passes, what gets lost, and how to ensure investigation continuity when the person who started the investigation isn't the person who finishes it.
