
Passive Reconnaissance — What's Visible Before the Attack

6-8 hours · Module 1 · Free

What you already know

You've heard of OSINT. You may have run Shodan queries or checked your organization's exposure through Have I Been Pwned. This section walks the full passive reconnaissance process from the attacker's perspective, with specific examples of what each source reveals, how the attacker uses it to shape their operational plan, and what you can audit before the attacker does.

Scenario

Before sending a single packet to Northgate Engineering's network, the attacker knows the CEO's name (Rachel Okafor), the mail server's IP, that the organization runs M365 E5 with Defender for Endpoint, that three employees have LinkedIn profiles mentioning "migrating to Azure," and that a developer posted a GitHub commit referencing an internal API endpoint. All from public sources. None of it generated a single alert in the SIEM. The attacker's operational plan is already half-written before the first phishing email is drafted.

Your attack surface is public

Passive reconnaissance is invisible to your SIEM. There is no alert for someone reading your LinkedIn page. No log entry for someone querying certificate transparency. No firewall event for someone browsing your job postings. The attacker builds a detailed operational picture in complete silence, and the first time you see them is when the phishing email arrives. By that point, they already know your email format, your security stack, the name of your CFO's executive assistant, and potentially a valid password from a breached third-party service.

You cannot prevent the attacker from reading public data. What you can do is audit what is exposed, reduce what does not need to be public, and assume the attacker has already read everything that remains.

Four passive reconnaissance categories. Each shapes a different part of the attacker's operational plan. None generates a single alert in the target's SIEM.

Infrastructure reconnaissance

DNS records are public by design and reveal more about your infrastructure than most organizations realize.

MX records reveal your email provider immediately. northgateeng.com MX → northgateeng-com.mail.protection.outlook.com tells the attacker the organization uses Microsoft 365. That single record determines the phishing strategy: the attacker builds an AiTM proxy targeting the Microsoft login page rather than a Google or Okta login page.

SPF records reveal every third-party service authorized to send email on your behalf. A record that includes spf.protection.outlook.com, _spf.google.com, mail.zendesk.com, and servers.mcsv.net tells the attacker that the organization uses M365, Google Workspace, Zendesk, and Mailchimp. Each is a potential phishing pretext: a spoofed Zendesk ticket, a fake Mailchimp campaign update, a fraudulent Google Workspace notification.
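The mapping from SPF includes to services can be sketched in a few lines. This is a minimal illustration, not a complete SPF parser: the lookup table holds only the four include domains named above, and the function name `services_from_spf` is hypothetical.

```python
# Include domains named in this section, mapped to the service they indicate.
SPF_INCLUDE_HINTS = {
    "spf.protection.outlook.com": "Microsoft 365",
    "_spf.google.com": "Google Workspace",
    "mail.zendesk.com": "Zendesk",
    "servers.mcsv.net": "Mailchimp",
}


def services_from_spf(record: str) -> list[str]:
    """Return the services implied by include: mechanisms in one SPF TXT record."""
    services = []
    for term in record.strip('"').split():
        # A mechanism may carry a qualifier (+, -, ~, ?) in front of its name.
        mechanism = term.lstrip("+-~?").lower()
        if mechanism.startswith("include:"):
            domain = mechanism[len("include:"):]
            services.append(SPF_INCLUDE_HINTS.get(domain, f"unknown ({domain})"))
    return services


record = '"v=spf1 include:spf.protection.outlook.com include:mail.zendesk.com ~all"'
print(services_from_spf(record))  # ['Microsoft 365', 'Zendesk']
```

Running the same function against your own SPF record shows which vendor relationships are readable by anyone with a DNS client.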

DMARC records reveal the organization's email authentication enforcement level. A DMARC policy of p=none means the organization monitors but does not enforce email authentication, allowing spoofed emails to reach inboxes. A policy of p=reject means spoofed emails are blocked, forcing the attacker to use a different domain rather than impersonating the target's own domain. The attacker checks DMARC before deciding whether to spoof the target's domain directly or register a lookalike domain.
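The decision the attacker makes from the DMARC record can be expressed as a small lookup. The sketch below assumes the TXT record has already been retrieved; the function names and the sample `rua` address are hypothetical. It also handles `p=quarantine`, the third standard policy value, which sits between the two discussed above.

```python
def dmarc_policy(record: str) -> str:
    """Return the value of the p= tag in a DMARC TXT record, or 'missing'."""
    tags = {}
    for part in record.strip('"').split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip().lower()
    return tags.get("p", "missing")


def spoofing_exposure(policy: str) -> str:
    """Translate a policy value into the attacker's likely choice."""
    if policy == "reject":
        return "direct spoofing blocked; expect lookalike domains"
    if policy == "quarantine":
        return "spoofed mail likely junked; lookalike domains still likely"
    # p=none or no usable policy: nothing stops mail that fails authentication.
    return "direct spoofing of the domain is viable"


# Hypothetical record for illustration only.
record = "v=DMARC1; p=none; rua=mailto:dmarc-reports@northgateeng.com"
print(dmarc_policy(record), "->", spoofing_exposure(dmarc_policy(record)))
```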

Certificate transparency logs are the highest-yield infrastructure source. Querying crt.sh for %.northgateeng.com returns every publicly logged certificate issued for the domain's subdomains, including certificates for internal hostnames: vpn.northgateeng.com, jenkins.internal.northgateeng.com, staging-api.northgateeng.com. Each reveals an internal service. The jenkins hostname points to a CI/CD pipeline. The staging-api hostname points to a staging environment, which often has weaker access controls than production.

CLI Output

Attacker's passive infrastructure reconnaissance — Northgate Engineering
$ dig MX northgateeng.com +short
10 northgateeng-com.mail.protection.outlook.com
→ Email provider: Microsoft 365
$ dig TXT northgateeng.com +short | grep spf
"v=spf1 include:spf.protection.outlook.com include:mail.zendesk.com ~all"
→ SaaS: M365 + Zendesk. No Google, no Mailchimp.
$ curl -s "https://crt.sh/?q=%25.northgateeng.com&output=json" | jq -r '.[].name_value' | sort -u
vpn.northgateeng.com
mail.northgateeng.com
autodiscover.northgateeng.com
jenkins.internal.northgateeng.com
staging-api.northgateeng.com
sso.northgateeng.com
→ Internal services exposed: VPN, Jenkins CI/CD, staging API, SSO portal
$ curl -sI https://vpn.northgateeng.com | grep -i server
Server: Cisco ASA
→ VPN appliance identified. Check CVE database for unpatched vulns.

Every piece of information narrows the attacker's options and increases the precision of their initial access attempt. The MX record eliminates Google-targeted phishing. The SPF record reveals Zendesk as a trusted sender they can impersonate. The CT logs expose Jenkins, a high-value target for supply chain access. The VPN response header identifies the appliance family. Only the final header request contacted the target directly, and a single HTTPS request to a public VPN portal looks like ordinary traffic. The DNS and certificate transparency lookups were answered by public resolvers and crt.sh.
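For a defender auditing their own domain, the certificate transparency step can be sketched as follows. It assumes the crt.sh JSON has already been saved locally and uses the `name_value` field shown in the transcript above, which can hold several newline-separated names. The `SENSITIVE_LABELS` set is an illustrative assumption, not an authoritative list.

```python
import json

# Labels that often mark internal or non-production services (assumed list; tune it).
SENSITIVE_LABELS = {"internal", "staging", "dev", "test", "jenkins", "vpn"}


def hostnames_from_crtsh(raw_json: str) -> set[str]:
    """Collect unique hostnames from crt.sh JSON output."""
    names = set()
    for entry in json.loads(raw_json):
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard as its parent domain
            if name:
                names.add(name)
    return names


def flag_sensitive(names: set[str]) -> list[str]:
    """Return hostnames containing a label that suggests an internal service."""
    flagged = []
    for name in sorted(names):
        labels = set(name.replace("-", ".").split("."))
        if labels & SENSITIVE_LABELS:
            flagged.append(name)
    return flagged


# Stand-in for a saved crt.sh response, built from the hostnames in this section.
sample = json.dumps([
    {"name_value": "vpn.northgateeng.com"},
    {"name_value": "jenkins.internal.northgateeng.com\nstaging-api.northgateeng.com"},
    {"name_value": "*.northgateeng.com\nmail.northgateeng.com"},
])
print(flag_sensitive(hostnames_from_crtsh(sample)))
```

Each flagged hostname is a candidate for a private CA certificate or a wildcard certificate that does not disclose the service name.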

Technology fingerprinting from public data

Beyond DNS, attackers extract technology information from sources the organization cannot easily remove. HTTP response headers from public-facing web applications often reveal the web server software, framework version, and sometimes the programming language. An X-Powered-By: ASP.NET header confirms a Windows server environment. A Server: nginx/1.24.0 header tells the attacker the exact version and whether it has known vulnerabilities.
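A self-audit of response headers can be sketched as a filter over headers you have already captured from your own sites. The header list and function names below are assumptions chosen for illustration; real deployments may disclose technology through other headers as well.

```python
import re

# Headers that commonly disclose server-side technology (assumed, not exhaustive).
DISCLOSING_HEADERS = ("server", "x-powered-by", "x-aspnet-version")


def disclosed_technology(headers: dict[str, str]) -> dict[str, str]:
    """Return only the headers that reveal software or framework details."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower() in DISCLOSING_HEADERS
    }


def has_version(value: str) -> bool:
    """True when a header value includes something shaped like a version number."""
    return re.search(r"\d+\.\d+", value) is not None


captured = {
    "Server": "nginx/1.24.0",
    "X-Powered-By": "ASP.NET",
    "Content-Type": "text/html",
}
for name, value in disclosed_technology(captured).items():
    detail = "version exposed" if has_version(value) else "product exposed"
    print(f"{name}: {value} ({detail})")
```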

Error pages are another source. A 404 page that displays the IIS default error format confirms Windows and IIS. A stack trace accidentally left in a production error response might reveal internal file paths, framework versions, and database driver information. Attackers systematically probe for these responses because each one eliminates uncertainty about the target environment.

Public code repositories are increasingly valuable. GitHub searches for the organization's domain name or employee email addresses may reveal committed API keys, internal URLs, configuration files, or infrastructure-as-code templates that describe the entire cloud environment. A single committed Terraform file can expose the organization's full Azure resource topology, including resource group names, virtual network configurations, and storage account names that the attacker can use to craft targeted attacks against specific services.
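The repository search described above can be approximated offline against files you have already cloned. This sketch only looks for email addresses and hostnames under a given domain. It does not detect API keys, and the sample snippet, including the `/v2/orders` path and the `dev.one` address, is hypothetical.

```python
import re


def find_domain_references(text: str, domain: str) -> dict[str, list[str]]:
    """Find email addresses and hostnames under `domain` inside arbitrary text."""
    escaped = re.escape(domain)
    emails = re.findall(rf"[A-Za-z0-9._%+-]+@{escaped}", text)
    # The lookbehind keeps matches from starting mid-token or inside an email address.
    hosts = re.findall(rf"(?<![@\w.-])(?:[A-Za-z0-9-]+\.)+{escaped}", text)
    return {"emails": sorted(set(emails)), "hosts": sorted(set(hosts))}


# Hypothetical committed file content.
snippet = '''
API_BASE = "https://staging-api.northgateeng.com/v2/orders"
# contact: dev.one@northgateeng.com
'''
print(find_domain_references(snippet, "northgateeng.com"))
```

Purpose-built secret scanners cover far more patterns. The point here is only that domain references are trivially searchable once code is public.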

People reconnaissance

LinkedIn is the attacker's organizational chart. Job titles map to access privileges. The reporting structure reveals trust relationships that can be exploited through social engineering.

"Senior Finance Manager" has access to payment systems. "Executive Assistant to the CEO" has delegated access to the CEO's mailbox and calendar. The attacker does not need to compromise the CEO directly. They compromise the assistant who has the same data access with less security scrutiny. "IT Support Technician" has help desk privileges that can reset passwords and bypass MFA for locked-out users. The attacker targets this role for voice phishing because one successful call can produce a password reset for any employee in the organization.

Job postings are a technical specification of the environment. A posting for "SOC Analyst: experience with Microsoft Sentinel, CrowdStrike Falcon, and Splunk required" tells the attacker the exact detection stack. They obtain a CrowdStrike Falcon trial, test their payload against it, and iterate until it evades detection, all before they touch the target environment. Postings for "Cloud Security Engineer: Azure experience required" confirm the cloud platform. Postings mentioning specific compliance frameworks (PCI DSS, HIPAA, FedRAMP) reveal the regulatory environment, which tells the attacker what data the organization holds and how it is obliged to respond to a breach.
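A quick way to review your own postings is to flag any that name specific security products. The product list below is an assumption built from the products mentioned in this section, and a real review would use your actual stack. The function name is hypothetical.

```python
# Products named in this section (extend with your own stack).
PRODUCT_NAMES = [
    "Microsoft Sentinel",
    "CrowdStrike Falcon",
    "Splunk",
    "Defender for Endpoint",
]


def named_products(posting: str, products: list[str] = PRODUCT_NAMES) -> list[str]:
    """Return the products a job posting mentions by name (case-insensitive)."""
    lowered = posting.lower()
    return [product for product in products if product.lower() in lowered]


specific = "SOC Analyst: experience with Microsoft Sentinel, CrowdStrike Falcon, and Splunk required"
generic = "SOC Analyst: experience with enterprise SIEM and EDR platforms required"

print(named_products(specific))  # ['Microsoft Sentinel', 'CrowdStrike Falcon', 'Splunk']
print(named_products(generic))   # []
```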

Email format discovery is trivially easy. One employee's email address appearing anywhere public (a conference bio, a GitHub commit, a press release, a breached third-party database) reveals the format for the entire organization. first.last@northgateeng.com means the attacker can generate a valid email address for any person whose name they know from LinkedIn. Combined with the organizational chart from LinkedIn, the attacker can build a complete email directory with job titles, reporting relationships, and access-level estimates without querying a single internal system.

Social media provides operational timing intelligence. Employees posting about conferences, vacations, or travel reveal when key staff will be unavailable. An attacker monitoring the CISO's LinkedIn sees a post about speaking at a conference next week. They schedule the initial access attempt for that week, knowing the most senior security decision-maker is away from the SOC.

Credential reconnaissance

If passive reconnaissance produces one finding that changes the entire operation, it is valid credentials. Breached passwords provide direct access without exploitation, without phishing, without triggering any technique your detection rules were built to catch. The attacker authenticates as a legitimate user because they are using a legitimate credential.

Breach databases are the primary source. When a third-party service is breached, credentials of every user who registered with their work email become available. The attacker queries for *@northgateeng.com, receives a list of employees with breached credentials, and attempts credential stuffing against M365. For any employee who reused that password, the first attempt succeeds, so the compromised account never generates a failed-login alert.
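Password reuse can be tested defensively without sending the password anywhere. The sketch below follows the k-anonymity scheme used by the Have I Been Pwned Pwned Passwords range endpoint, where only the first five hex characters of the SHA-1 hash are sent and the response lists `SUFFIX:COUNT` lines. No network call is made here. The response body is a made-up stand-in, and the function names are hypothetical.

```python
import hashlib


def range_query_parts(password: str) -> tuple[str, str]:
    """Split the upper-case SHA-1 hex digest into a 5-char prefix and 35-char suffix."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, response_body: str) -> int:
    """Look up a hash suffix in a range response made of SUFFIX:COUNT lines."""
    for line in response_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


prefix, suffix = range_query_parts("password")
# Fabricated response for illustration; a real one comes from the range endpoint for `prefix`.
fake_body = f"0018A45C4D1DEF81644B54AB7F969B88D65:1\n{suffix}:42\n"
print(prefix, breach_count(suffix, fake_body))
```

Only the five-character prefix would ever leave the machine, which is what makes this check usable inside a password-change workflow.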

Infostealer malware has created a secondary credential market that is arguably more dangerous than breach databases. Infostealers like Lumma, StealC, and Vidar run on employees' personal devices and capture everything stored in the browser: passwords, cookies, session tokens, autofill data, cryptocurrency wallets. Infostealer malware compromised 3.9 billion credentials across 4.3 million devices in 2024, with Lumma alone responsible for 23.3 million detections in 2025. An infostealer log from a single personal laptop might contain the employee's M365 password, a valid session cookie that bypasses MFA entirely, VPN credentials, and SSH keys.

The infection happens on a device the organization does not manage, does not monitor, and cannot detect. The first sign is when those credentials are used against the environment. If the attacker replays a valid session cookie, even MFA provides no protection because the authentication already happened. CYFIRMA reported that 54% of ransomware victims in 2025 had corporate credentials previously exposed in infostealer logs.

The pipeline from infostealer infection to enterprise breach is now measured in days. An employee's personal laptop is infected by Lumma through a malicious advertisement or pirated software download. The stealer harvests every saved credential in the browser within seconds and transmits the data to attacker infrastructure. The stolen credentials are packaged into "stealer logs" and sold on dark web marketplaces or Telegram channels. An initial access broker purchases the log, tests the M365 credentials, verifies access, and lists the corporate entry point for sale. A ransomware affiliate buys the access and begins post-exploitation. The entire chain from personal device infection to ransomware staging can complete within a week.

The defensive implication is significant: your perimeter defense and endpoint detection cannot protect against an infection on a device you don't control. The defense must happen at the authentication layer. Phishing-resistant MFA (FIDO2 security keys or passkeys) prevents stolen passwords from being useful. Continuous Access Evaluation (CAE) in Entra ID shortens the window in which a stolen session token stays usable once the session is revoked or conditions change. Conditional Access policies that require compliant devices prevent authentication from unmanaged endpoints. These controls are covered in depth in the Identity and Access Management course.

Analyst Decision

Self-assessment priority order: Run the same passive reconnaissance against your own organization before the attacker does. Start with credential exposure (Have I Been Pwned domain search, infostealer log monitoring services) because valid credentials are the single highest-impact finding. Then audit DNS and certificate transparency for exposed internal hostnames. Then review job postings for security stack disclosure. Then audit LinkedIn exposure for IT and security staff.

Remediation priorities: For breached credentials, force password resets and deploy phishing-resistant MFA (FIDO2/passkeys) so that valid passwords alone are insufficient. For exposed internal hostnames in CT logs, evaluate whether internal services need public certificates or whether private CA certificates would work. For job postings, test whether "experience with enterprise SIEM and EDR platforms" communicates the same requirement as naming specific products.

Continuous monitoring: Passive reconnaissance self-assessment is not a one-time exercise. New employees join and update LinkedIn profiles. New certificates are issued. New breach databases become available. Set a quarterly cadence for credential exposure monitoring and an annual cadence for a full passive reconnaissance audit.
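A recurring audit is mostly a comparison against the previous run. As a minimal sketch, assuming each audit saves its findings as a set of strings (hostnames here, but breached addresses work the same way), the change between two runs is a pair of set differences. The function name and the `grafana` hostname are hypothetical.

```python
def diff_snapshots(previous: set[str], current: set[str]) -> dict[str, list[str]]:
    """Report what appeared and what disappeared between two audit runs."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


last_quarter = {"vpn.northgateeng.com", "staging-api.northgateeng.com"}
this_quarter = {"vpn.northgateeng.com", "grafana.northgateeng.com"}

print(diff_snapshots(last_quarter, this_quarter))
# {'added': ['grafana.northgateeng.com'], 'removed': ['staging-api.northgateeng.com']}
```

Reviewing only the "added" list keeps each cycle short enough that the cadence is realistic to maintain.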

Assuming your organization is not interesting enough to be targeted

Small and mid-size organizations assume threat actors target only large enterprises. The initial access broker (IAB) market does not filter by organization size. An access broker finds a vulnerable Cisco ASA VPN, compromises it, and lists the access for $500 without caring whether the organization has 50 employees or 50,000. The ransomware affiliate who buys that access cares about one thing: whether the organization has enough revenue to pay the ransom. Passive reconnaissance is automated at scale. Every organization with internet-facing infrastructure is being scanned and cataloged.

Offensive Operations Principle

Passive reconnaissance is invisible to your detection stack because it never touches your infrastructure. The attacker builds their target model from public information and uses it to craft campaigns that bypass your specific controls. You cannot detect passive reconnaissance, but you can audit what it would find and reduce what does not need to be public.

Section 1.6: Active Reconnaissance. Where passive reads public data, active touches your infrastructure. The attacker's challenge: gather useful data without triggering your alerts. Section 1.6 covers port scanning, directory brute-forcing, credential spraying, and why most active reconnaissance succeeds because your detection thresholds were designed for a different problem.
