TH0.12 Hunting Maturity Models

3-4 hours · Module 0 · Free

What you already know

Section 0.11 defined the five cognitive skills that separate effective hunters from analysts running queries. You know what makes hunting work at the individual level. But individual skill alone doesn't produce a hunting program. Programs require organizational infrastructure: methodology, tooling, time allocation, data collection, and management commitment. The Hunting Maturity Model provides a five-level framework for assessing where your organization sits and what specific investments advance you to the next level. This section teaches the model, shows how to score your own organization honestly, and maps each level to the M365 capabilities that define it.

Scenario

Rachel Okafor presents the hunting program proposal to the risk committee at Northgate Engineering (NE). The committee chair asks a direct question: "Where are we today, and where will we be in twelve months?" Rachel needs more than "we'll start hunting." She needs a framework that places NE on a defined progression, shows the current level with evidence, defines the target level with specific criteria, and maps the investment required to close the gap. The committee manages risk by measuring it. An answer without a measurement framework won't survive the follow-up questions.

Why maturity models matter for hunting programs

Detection rules, response playbooks, and vulnerability management all have established measurement frameworks. Hunting programs frequently lack one. The result is that leadership can't distinguish between a team that runs ad hoc KQL queries on Friday afternoons and a team that executes structured hypothesis-driven campaigns with documented findings, detection rule outputs, and measurable coverage improvement. Without a framework, the hunting budget competes with every other security investment on subjective claims rather than observable criteria.

A maturity model provides three things. First, it gives you an honest assessment of current capability, defined by artifacts you can demonstrate rather than aspirations you describe. Second, it defines the next level in terms of specific, observable changes. Third, it gives leadership a progression they can track quarterly, which converts the hunting program from a discretionary investment into a measurable capability building toward a defined target.

The Hunting Maturity Model: origin and structure

David Bianco created the Hunting Maturity Model while working as a security architect and threat hunter at Sqrrl. The model describes five levels of organizational hunting capability, from HMM0 (no proactive activity) through HMM4 (continuous automation of proven hunting procedures with analyst effort focused on novel threats). It has become one of the most widely adopted frameworks for hunting capability assessment: SANS and the Sqrrl Threat Hunting Reference Model reference it, and SOC maturity programs globally have adopted it.

The model's key insight is that hunting maturity depends on two independent axes: the quality and quantity of data the organization routinely collects, and the skill level of the analysts performing the hunts. Your actual maturity level is the lower of the two. An organization that collects every M365 log into Sentinel but assigns analysts who can only run pre-built queries from a blog post is constrained by the analyst axis. An organization with skilled hunters but incomplete data ingestion, missing AADNonInteractiveUserSignInLogs or CloudAppEvents for example, is constrained by the data axis.

This two-axis structure explains a common frustration: teams that deploy Sentinel, connect all data sources, and still produce no hunting results. The data axis is at HMM3 or HMM4, but the analyst skill axis sits at HMM0. The overall maturity is HMM0. The investment required isn't more data. It's analyst development, methodology, and protected time.
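The "lower of the two axes" rule is simple enough to express directly. The following is a minimal sketch, not part of Bianco's model itself: the `HMM` enum and the `overall_maturity` function are names invented here for illustration.

```python
from enum import IntEnum


class HMM(IntEnum):
    """Hunting Maturity Model levels, ordered so min() picks the lower one."""
    HMM0 = 0
    HMM1 = 1
    HMM2 = 2
    HMM3 = 3
    HMM4 = 4


def overall_maturity(data_axis: HMM, skill_axis: HMM) -> tuple[HMM, str]:
    """Return the overall level and the axis that constrains it."""
    overall = min(data_axis, skill_axis)
    if data_axis == skill_axis:
        constraint = "neither (axes are level)"
    elif data_axis < skill_axis:
        constraint = "data"
    else:
        constraint = "analyst skill"
    return overall, constraint


# The frustrating case described above: rich data, no hunting skill.
level, constraint = overall_maturity(HMM.HMM3, HMM.HMM0)
print(f"Overall: {level.name}, constrained by: {constraint}")
```

Returning the constraining axis alongside the level is a deliberate choice: the level alone says where you are, while the constraint says where the next investment should go.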

The five levels

HMM0 — Initial. The organization relies entirely on automated alerting. Sentinel analytics rules fire, someone investigates. When no rule fires, no one looks. No analyst time is dedicated to examining data proactively. The data may be flowing into Sentinel, but only automated queries consume it. Most organizations that have deployed Defender XDR and connected Sentinel data connectors but haven't formalized proactive search operations sit here. The detection gap from Section 0.1 is fully open, with 73 uncovered techniques and no one looking for them.

In an M365 context, HMM0 looks like this: Sentinel ingests SigninLogs, SecurityAlert, and maybe DeviceEvents. The SOC processes the alerts those tables generate through analytics rules. Advanced Hunting exists but nobody opens it unless they're investigating a specific alert. There's no hypothesis, no documentation, and no hunt-derived detection rules. The only coverage expansion comes from enabling additional vendor rule templates.

HMM1 — Minimal. The organization collects data into Sentinel and analysts can search it reactively. Hunting happens, but only in response to specific triggers: a threat intelligence report mentions a new technique, an incident investigation reveals a wider pattern, or a vendor advisory prompts a check. There is no scheduled cadence, no hypothesis backlog, no documentation standard, and no output pipeline that converts hunt findings into permanent detection rules.

An analyst at HMM1 might read a blog post about AiTM phishing, open Advanced Hunting, run a query checking AADNonInteractiveUserSignInLogs for unusual session token patterns, find nothing conclusive, and move on without recording the effort. The searching capability exists. The discipline to document, track, and repeat does not. This is where many organizations land after deploying a SIEM, which makes it a very common maturity level in the industry. It's also the level where Bianco's model draws the first meaningful line: HMM1 is the minimum level at which any hunting occurs, because the analyst is at least searching beyond what the automated rules cover, even if only when an external trigger prompts the search.

HMM2 — Procedural. This is the defining transition. Hunting follows a documented methodology. Hypotheses come from defined sources: ATT&CK coverage gaps, threat intelligence reports, incident post-mortems, or environmental changes. Hunts are scoped before execution, run against specified data sources, analyzed with documented criteria, and recorded in hunt records that peers can review. The critical output pipeline exists: hunt findings produce detection rules that close the gap the hunt identified. There's a maintained backlog of hypotheses. There's a cadence, even if it's only one campaign per month.

At HMM2, the organization follows data analysis procedures created by others. The hypothesis templates, query patterns, and documentation frameworks come from external sources: conference presentations, community hunting playbooks, courses like this one, or shared threat intelligence. The analysts adapt those procedures to their environment, interpret the results, and produce original findings. They're not yet creating entirely new analytical techniques, but they're competent at applying established ones.

This course targets HMM2. An organization that completes the methodology module (TH1), builds the ATT&CK coverage analysis (TH3), and executes campaign modules (TH4 through TH13) with documented outputs operates at HMM2. That's not a conservative target. It's a realistic one that most organizations can reach within six months of sustained effort, and it represents the operational level where hunting consistently produces detection rules and measurable coverage improvement.

HMM3 — Innovative. Threat intelligence drives hypothesis generation systematically, through structured TI consumption that produces backlog items within 48 hours of relevant reports rather than ad hoc blog reading. Frequently-executed hunts are automated as scheduled queries that run without manual initiation but still require analyst review. Behavioral baselines generate hypotheses automatically: UEBA identifies statistical anomalies that feed the hunting pipeline. Program metrics are tracked and reported to leadership with quantified outcomes.

The key distinction from HMM2 is that analysts at HMM3 create new data analysis procedures. They develop original analytical techniques tailored to their environment's specific threat profile, rather than adapting community playbooks. They build novel KQL patterns that detect behaviors no published query addresses. They identify data correlations across tables that produce findings unique to their organization's telemetry. Few organizations reach HMM3 without dedicated hunting resources and at least twelve months of sustained HMM2 operations.

HMM4 — Leading. Routine hunting activities are automated. Human effort focuses on novel hypothesis generation and investigation of the most complex, creative threats. New hypotheses are generated from TI feeds with minimal manual intervention. The hunt-to-detection pipeline operates automatically for well-understood technique categories. Analysts spend their time on adversary-simulation-informed hypotheses that no automated system would generate. This level is rare and develops over years of sustained investment. Most organizations will never need HMM4. An HMM3 program that consistently produces original detection content and measurable coverage expansion already exceeds industry norms.

Figure TH0.12a — Hunting Maturity Model levels. Each level has observable criteria on two axes: data collection and analyst skill. Your actual maturity is the lower of the two.

Honest assessment: the two-axis test

The most common misuse of maturity models is aspirational self-assessment. A team that runs scheduled KQL queries from a shared library claims HMM3 because the hunts are "automated" and "TI-driven." But there's no hypothesis documentation, no original analytical procedures, no analyst review of automated results, and no detection rules produced from findings. Running someone else's queries on a timer is scheduled alerting, not hunting.

Bianco's two-axis design makes honest assessment straightforward. Score each axis independently, then take the lower value.

Data axis assessment. In M365, this maps directly to Sentinel data connector status. HMM0 and HMM1 organizations have basic connectors enabled: SecurityAlert, SigninLogs, maybe DeviceEvents. HMM2 organizations ingest broadly across all three hunting clusters from Section 0.10, covering identity tables (SigninLogs, AADNonInteractiveUserSignInLogs, AuditLogs), collaboration tables (CloudAppEvents, EmailEvents, OfficeActivity), and endpoint tables (DeviceProcessEvents, DeviceNetworkEvents, DeviceFileEvents). HMM3 and HMM4 organizations add non-default sources such as MicrosoftGraphActivityLogs, custom connectors for SaaS applications, or third-party feeds enriching Sentinel's native tables.
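As a rough illustration of the data axis check, the sketch below compares a list of ingested tables against the three clusters named above. The table names come from this section; the banding rules, the `data_axis_band` function, and the choice to model only MicrosoftGraphActivityLogs as a non-default source are assumptions made for the example. How you obtain the inventory of active tables is left open.

```python
# Table names are taken from the paragraph above. The banding rules are an
# illustrative reading of that paragraph, not an official HMM scoring method.
HUNTING_CLUSTERS = {
    "identity": {"SigninLogs", "AADNonInteractiveUserSignInLogs", "AuditLogs"},
    "collaboration": {"CloudAppEvents", "EmailEvents", "OfficeActivity"},
    "endpoint": {"DeviceProcessEvents", "DeviceNetworkEvents", "DeviceFileEvents"},
}
# Only one example of a non-default source is modeled here (assumption);
# custom SaaS connectors and third-party feeds would need their own entries.
NON_DEFAULT_SOURCES = {"MicrosoftGraphActivityLogs"}


def data_axis_band(ingested: set[str]) -> tuple[str, dict[str, list[str]]]:
    """Return the data-axis band and any cluster tables that are missing."""
    missing = {
        cluster: sorted(tables - ingested)
        for cluster, tables in HUNTING_CLUSTERS.items()
        if tables - ingested
    }
    if missing:
        return "HMM0-HMM1", missing
    if ingested & NON_DEFAULT_SOURCES:
        return "HMM3-HMM4", missing
    return "HMM2", missing


# Hypothetical inventory: all three clusters plus SecurityAlert, no Graph logs.
example_tables = set().union(*HUNTING_CLUSTERS.values()) | {"SecurityAlert"}
band, missing = data_axis_band(example_tables)
print(band, missing)
```

The function reports missing tables per cluster so the output doubles as a to-do list for connector work, rather than only a score.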

Analyst skill axis assessment. HMM0 analysts process alerts. HMM1 analysts search for IOCs when prompted. HMM2 analysts follow documented procedures, adapt community-sourced queries to their environment, and produce interpretable findings. HMM3 analysts create original analytical procedures that no external source provided. HMM4 analysts automate their proven procedures and focus exclusively on novel, creative hypothesis generation.
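The skill axis can be sketched as a ladder of demonstrated behaviors, where each rung only counts if every rung below it is also demonstrated. This is one possible encoding of the paragraph above; the `SkillEvidence` fields and the `skill_axis_level` function are hypothetical names, and the "no skipping rungs" rule is an assumption chosen to guard against aspirational scoring.

```python
from dataclasses import dataclass


@dataclass
class SkillEvidence:
    """Observable analyst behaviors, one per level above HMM0."""
    searches_when_prompted: bool = False          # HMM1
    follows_documented_procedures: bool = False   # HMM2
    creates_original_procedures: bool = False     # HMM3
    automates_proven_procedures: bool = False     # HMM4


def skill_axis_level(evidence: SkillEvidence) -> int:
    """Count consecutive rungs demonstrated from the bottom; stop at the first gap."""
    ladder = [
        evidence.searches_when_prompted,
        evidence.follows_documented_procedures,
        evidence.creates_original_procedures,
        evidence.automates_proven_procedures,
    ]
    level = 0
    for demonstrated in ladder:
        if not demonstrated:
            break
        level += 1
    return level


# A team that schedules queries but has no documented procedures scores HMM1.
scheduled_only = SkillEvidence(searches_when_prompted=True,
                               automates_proven_procedures=True)
print(f"HMM{skill_axis_level(scheduled_only)}")
```

Stopping at the first gap means automation without documented procedures earns no credit, which mirrors the point that scheduled queries alone are not hunting.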

Northgate Engineering's current state illustrates the two-axis gap. Sentinel ingests all three data clusters: identity, collaboration, endpoint. On the data axis, NE sits at HMM2. But Tom Ashworth and Priya Sharma have run exactly zero documented hunts. They have KQL proficiency from alert investigation, but with no proactive search on record, the skill axis scores HMM0. The overall maturity is therefore HMM0. Rachel's proposal to the board should present this honestly: the data foundation is already in place. The investment required isn't infrastructure. It's methodology, protected time, and the structured practice that builds analyst capability from HMM0 to HMM2 on the skill axis.

Posture Assessment

Organization: Northgate Engineering

Assessment date: Current quarter

Data collection axis: HMM2. Sentinel ingests identity, collaboration, and endpoint clusters. AADNonInteractiveUserSignInLogs, CloudAppEvents, and Device* tables all active. 90-day retention on all hunting tables. MicrosoftGraphActivityLogs not yet enabled (limits Graph API abuse detection).

Analyst skill axis: HMM0. SOC analysts triage alerts from analytics rules. No proactive search activity documented. No hypothesis generation. No hunt records exist. Analysts have KQL proficiency from alert investigation but have not applied it to hypothesis-driven search.

Overall HMM level: HMM0 (constrained by analyst skill axis)

Gap to HMM2: Data axis already meets HMM2 requirements. Skill axis requires: documented methodology, hypothesis backlog from ATT&CK coverage analysis, minimum 4 hours/week protected time, hunt record template, and hunt-to-detection rule pipeline. Estimated timeline: 3 to 6 months with structured campaign execution.

Recommendation: Invest in methodology and protected time, not infrastructure. Begin with TH1 (Hunt Cycle methodology), execute TH3 (ATT&CK coverage analysis), run first three campaigns (TH4 through TH6) to establish cadence and documentation habit.

This posture assessment format is the artifact Rachel takes to the board. It converts a subjective "we need hunting" into a measurable state: here's where we are on each axis, here's the specific gap, here's what closes it. The board can track the skill axis advancing from HMM0 to HMM1 after the first documented hunt, and from HMM1 to HMM2 after the first quarter of structured campaign execution.

The HMM1-to-HMM2 transition: where programs succeed or stall

The transition from HMM1 to HMM2 is where most hunting programs either establish themselves or quietly fail. HMM1 is comfortable. The analyst searches when motivated, finds something occasionally, and reports the finding informally. There's no accountability, no cadence, and no output pipeline. The program exists in name but produces no permanent improvement to detection coverage.

HMM2 requires four changes that organizations resist. First, a documented methodology. The Hunt Cycle in TH1 provides this: hypothesis, scope, data identification, query execution, analysis, documentation, detection rule output. The methodology is a procedure the analyst follows, not a suggestion. Second, a hypothesis backlog. Random topic selection produces random value. The ATT&CK coverage analysis in TH3 produces a prioritized backlog with specific techniques your detection rules don't cover, ranked by threat relevance to your environment. Third, protected time. A hunting program that exists only when the alert queue is empty doesn't exist. Four hours per week, shielded from alert triage, is the minimum sustainable cadence. Fourth, an output pipeline. Every completed hunt produces either a new detection rule or a documented negative finding. The rule closes a coverage gap permanently. The negative finding provides audit evidence and eliminates the hypothesis from future prioritization.
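The four elements and the output rule can be checked mechanically. The sketch below is illustrative only: `ProgramState`, `HuntRecord`, and `missing_hmm2_elements` are hypothetical names, and the only threshold taken from the text is the four-hour weekly minimum.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PROTECTED_HOURS = 4  # weekly minimum stated in the paragraph above


@dataclass
class ProgramState:
    has_documented_methodology: bool
    backlog_size: int                 # prioritized hypotheses awaiting a hunt
    protected_hours_per_week: float
    has_output_pipeline: bool


def missing_hmm2_elements(state: ProgramState) -> list[str]:
    """List which of the four HMM2 elements are absent."""
    missing = []
    if not state.has_documented_methodology:
        missing.append("documented methodology")
    if state.backlog_size == 0:
        missing.append("hypothesis backlog")
    if state.protected_hours_per_week < MIN_PROTECTED_HOURS:
        missing.append("protected time")
    if not state.has_output_pipeline:
        missing.append("output pipeline")
    return missing


@dataclass
class HuntRecord:
    hypothesis: str
    detection_rule: Optional[str] = None      # name of the rule produced, if any
    negative_finding: Optional[str] = None    # documented "nothing found" summary

    def is_complete(self) -> bool:
        """A hunt counts only if it ends in a rule or a documented negative."""
        return bool(self.detection_rule) or bool(self.negative_finding)


# Methodology and backlog exist, but the alert queue consumes every hour.
stalled = ProgramState(True, backlog_size=12, protected_hours_per_week=0,
                       has_output_pipeline=True)
print(missing_hmm2_elements(stalled))
```

An empty list from `missing_hmm2_elements` is a necessary condition for HMM2 in this sketch, not a sufficient one: the hunts still have to be run and each `HuntRecord` closed out.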

Organizations stall at HMM1 when any of these four elements is missing. The most common failure: the analyst has the methodology and the backlog, but the alert queue consumes every available hour. Without protected time, the program decays back to ad hoc searching. Section 0.8 covered this as an organizational readiness prerequisite, and Section 0.15 addresses it in the 90-day implementation plan.

Other maturity frameworks: where HMM fits

The HMM is not the only framework available. Two others are worth understanding because you'll encounter them in SOC assessments and vendor conversations.

MITRE INFORM (INformed FORMation assessment) is MITRE's framework for assessing threat-informed defense maturity. Updated in 2026, INFORM evaluates three dimensions: cyber threat intelligence, defensive measures, and testing and evaluation. Each dimension has technical components scored from least to most threat-informed. INFORM is broader than HMM because it assesses your entire threat-informed defense program, not hunting specifically. If your organization already uses MITRE ATT&CK for technique mapping and detection coverage analysis, INFORM provides the assessment framework for measuring how well you apply that ATT&CK knowledge across detection, hunting, and validation. Hunting maturity is one component within INFORM's broader scope.

SOC-CMM (SOC Capability Maturity Model) assesses security operations across five domains (Business, People, Process, Technology, and Services) with twenty-six aspects including threat hunting. The 2025 Global SOC Maturity Report found that organizations with fewer than seven full-time SOC staff rarely progress beyond maturity level 2 across all domains. SOC-CMM uses a six-level scale (0 through 5) for maturity and a four-level scale (0 through 3) for capability. It's more comprehensive than HMM but also more resource-intensive to administer. If your SOC is undergoing a full capability assessment, threat hunting will be evaluated as one aspect within SOC-CMM. If you're specifically measuring hunting program maturity, HMM is more focused and actionable.

For practical purposes, the choice of model matters less than the honest application of it. The HMM's simplicity, with five levels, two axes, and observable criteria, makes it effective for board-level communication and quarterly progress tracking. If your organization already uses SOC-CMM, map the HMM levels to the corresponding hunting capability scores. If you're piloting MITRE INFORM, the hunting maturity component aligns with HMM's structure. The mapping is usually straightforward because the underlying progression from ad hoc to structured to automated applies across all frameworks.

Using the model with leadership

The maturity model serves as a translation layer between technical capability and business risk language. When Rachel presents to NE's board, she doesn't describe HMM levels. She says: "Our ability to detect threats depends entirely on automated rules. When those rules miss something, and our coverage analysis shows they miss 73 of the 95 techniques relevant to our environment, nobody is looking. The hunting program adds a structured human layer that investigates the gaps our automation doesn't cover."
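The coverage figure in Rachel's statement is straightforward arithmetic, and the same calculation gives a trend once hunts start producing rules. A minimal sketch follows; the `coverage_pct` function is a name invented here, and the "five techniques closed" scenario is a hypothetical projection, not a figure from the course.

```python
def coverage_pct(relevant: int, uncovered: int) -> float:
    """Percentage of relevant techniques with at least one detection rule."""
    if relevant <= 0 or not 0 <= uncovered <= relevant:
        raise ValueError("uncovered must be between 0 and relevant, relevant > 0")
    return 100 * (relevant - uncovered) / relevant


# Figures from the statement above: 73 of 95 relevant techniques uncovered.
today = coverage_pct(95, 73)
print(f"Covered today: {95 - 73} of 95 ({today:.1f}%)")

# Hypothetical projection: hunts close five previously uncovered techniques.
later = coverage_pct(95, 73 - 5)
print(f"After five techniques closed: {later:.1f}% (+{later - today:.1f} points)")
```

Expressed this way, the starting point is roughly 23% coverage, and each technique closed by a hunt-derived rule adds about one percentage point.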

The two-axis structure helps Rachel explain why the investment is primarily people and process, not technology. The board might expect a capital request for new tooling. Instead, Rachel shows the posture assessment: the data axis is already at HMM2, so no new infrastructure is needed. The investment is 4 to 8 hours of analyst time per week, a methodology the analyst follows, and a documentation standard that produces measurable output. That's an operational expense the board can track quarterly.

The model also sets realistic expectations. A team at HMM0 on the skill axis won't produce original threat intelligence findings or automated hunting pipelines in the first quarter. Promising HMM3 capabilities from an HMM0 team creates expectations that can't be met, eroding confidence in the program before it delivers. Present the current level honestly. Define one level of advancement per quarter. Four quarters of measurable progress builds more leadership confidence than one quarter of overpromised results.

In month three, Rachel reports the first completed hunt record, documented evidence that an analyst investigated a hypothesis, searched specific data sources, and recorded findings. That's the HMM0-to-HMM1 transition, observable in a single artifact. In month six, she reports the first detection rule produced from a hunt, a permanent improvement to automated coverage that didn't exist before the program. That's progress toward HMM2, and the coverage trend metric from Section 0.14 quantifies it. Each milestone is observable, measurable, and directly tied to risk reduction.

Anti-pattern

Aspirational self-assessment that skips levels. A team deploys a vendor's pre-built hunting query pack, schedules it to run weekly, and claims HMM3 because the hunts are "automated" and "TI-driven." No analyst reviews the results. No hypothesis was documented before execution. No hunt record exists. No detection rule was produced from findings. Running someone else's queries on a timer is scheduled alerting with extra steps. Honest assessment starts with observable artifacts: hunt records, analyst-written hypotheses, and detection rules that didn't exist before the hunt. If those artifacts don't exist, the maturity level is lower than the team believes. The SOC-CMM 2025 report found consistent overestimation in self-assessed SOCs. The pattern is widespread, not unique to your organization.

Threat Hunting Principle

Measure capability by artifacts, not aspiration. Your HMM level is defined by what you can demonstrate (hunt records, documented hypotheses, detection rules produced from findings) rather than what you believe your team could do with enough time. Honest assessment determines what you do next. Present HMM0 honestly and advance to HMM1 in three months rather than claiming HMM2 and producing nothing. Observable progress builds leadership confidence. Undelivered promises destroy it.

Section 0.13 takes the maturity assessment, the ROI model from Section 0.7, and the readiness assessment from Section 0.8 and combines them into the leadership case, the structured business justification that secures budget, headcount, and protected time for the hunting program.