Why Detection Is Cross-Source: Following an Attack Trail in a SIEM

Module 0

Picture the attacker who finally gets a working password for a Northgate account. It is a real win for them, and simply holding it is also the least useful thing they can do. A stolen credential sitting idle achieves nothing. To get value from it they have to sign in, and then do something: reach a machine, find data worth taking, move toward a system that matters, get information back out. Every one of those actions happens somewhere different, and somewhere different means a different log.

This is the single most important habit to build before you write a query in anger, so it is worth stating plainly. An attack is not an event. It is a sequence of actions strung across the layers of the estate, and the evidence for it is scattered across the matching sources in exactly the same way the islands were scattered in the previous sub.

The job of detection and investigation is to gather that sequence back into one story. The SIEM gives you the assembled record; this sub is about the shape of the thing you are looking for inside it.

No attack lives in one log

A useful intrusion almost never confines itself to a single system, because a single system is rarely where the attacker's goal lives. The credential is stolen in one place, used in another, and acted upon in a third. Each step is a footprint, and each footprint lands in whichever source records that kind of activity. The authentication lands in the identity logs. The program that runs lands in the endpoint logs. The connection out to a command server lands in the network and DNS logs. The file that leaves lands in the proxy or the cloud logs.

If you only ever look at one of those sources, you see one footprint and call it either nothing or a mystery. A lone failed sign-in is noise. A single odd process is noise. A connection to an unfamiliar address is noise.

None of them, in isolation, looks like an attack, because an attack is not in any one of them. It is in the relationship between them: the failed sign-ins, then a success, then the odd process under that same account, then the connection out, all within a short window. The relationship is the signal. The individual events are just the places it touches down.

This is why the whole-estate view from the previous sub matters so much in practice. The reason you want every source in one place is not tidiness. It is that the evidence you actually need is the line drawn between events in different sources, and you cannot draw that line if the events live in systems you have to visit one at a time. Detection is cross-source by nature because attacks are.

There is an uncomfortable corollary worth facing early. Attackers understand this fragmentation and lean on it. Keeping each step quiet within its own source, and spacing the steps out in time, is a deliberate way to stay under the threshold of any single team's attention.

The sign-in is unremarkable to whoever watches identity, the process is unremarkable to whoever watches endpoints, and the gap between them gives each event a chance to be forgotten before the next one lands. The defensive answer to that is the ability to ask every source the same question at once and line up the answers on one clock, which is the capability the whole-estate SIEM exists to provide.

A sharper eye on any single source cannot substitute for it, because the thing being hidden is the relationship, and the relationship is invisible from inside one log.
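
Example search, illustration only, not run here. As a rough sketch of what "the same question at once, on one clock" can look like, the search below asks two sources about the account used in this section's example search and orders the answers by time. The `identity` index name and the `action` field are assumptions made for illustration; only the `endpoint` index and the `user`, `host`, and `process_name` fields appear elsewhere in this section, and your environment may name things differently.

````spl
(index=identity OR index=endpoint) user=p.sharma ```index=identity is a hypothetical index name```
| sort 0 _time ```0 lifts the default result cap so the whole timeline is ordered oldest first```
| table _time index user host action process_name ```action is an assumed identity field; it stays empty on endpoint rows```
````

The `index` column is the useful part of the output: reading down it shows where the story changes source, which is the relationship a single-source view cannot show.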

Following a trail: a spray that worked

Walk one real shape end to end, staying conceptual, so the cross-source idea stops being abstract. A password spray is among the simplest intrusions to picture, and Northgate sees one in this course.

It begins in the identity layer. An external address, 193.32.162.89, tries one common password against a long list of accounts. Most attempts fail, which is the defining signature of a spray: many accounts, few attempts each, almost all unsuccessful. On its own this is a wall of failed authentications, and failed authentications happen all day for innocent reasons.

But buried in that wall is one success, because eventually the guessed password matches a real one. That success is the moment the incident becomes real, and it is still only an identity event.

Now the trail leaves identity. The account that succeeded signs in properly and, a few minutes later, a process starts on the workstation tied to that account. That is an endpoint event, in a different source entirely, and nothing in the endpoint log knows or cares that the account behind it was sprayed minutes earlier.

Shortly after, that same host opens a connection to an address it has never contacted before, reaching out for instructions. That is a network event, in a third source. Follow it a little further and you may see internal connections as the attacker tests what else the account can reach, which is the network layer again, now describing movement rather than egress.

Step back from the walk and notice that no individual step was loud. The spray was a pile of failed logins of the kind any busy directory produces all day. The process was one of many thousands that start on Northgate's endpoints every hour. The outbound connection was one among millions.

What made the trail findable was never the volume of any one footprint. It was that a single account sat behind a failed-then-succeeded sign-in, an unfamiliar process, and a connection to an address with no prior history, all inside a few minutes.

The spray itself also carries a statistical shape, many accounts touched while almost every attempt fails, that ordinary login activity does not reproduce, and that shape is what a detection keys on first. The rest of the trail then confirms what the shape suggested.
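
Example search, illustration only, not run here. The statistical shape described above can be sketched as a per-address summary: how many distinct accounts an address tried, and how its failures compare with its successes. The `identity` index, the `src_ip` and `action` fields, the `"failure"` and `"success"` values, and both thresholds are assumptions chosen for illustration, not values taken from the Northgate data.

````spl
index=identity ```hypothetical index name for sign-in events```
| stats dc(user) as accounts_tried, count(eval(action="failure")) as failures, count(eval(action="success")) as successes by src_ip ```src_ip and action are assumed field names```
| where accounts_tried >= 20 AND failures > 10 * successes ```illustrative thresholds only: many accounts, almost all failing```
| sort 0 -accounts_tried
````

A row with a large `accounts_tried`, a large `failures`, and a `successes` of one or more is the spray that worked. The account behind that success is the identifier to carry into the next source.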

Figure: The same account links four events that live in three sources. Each event alone is unremarkable; the sequence is the incident. Reading the dashed line is the work, and it is only possible because every source sits in one record.

Notice what carried you from one source to the next. At each step you held onto an identifier, first the account name and then the host, and asked the next source what that identifier did. The sprayed-and-succeeded account became the thing you looked for on the endpoint. The host running the process became the thing you looked for in the network logs. You did not search blindly in each source. You followed a thread.

The pivot is the skill

That thread has a name. The act of taking an identifier from an event in one source and using it to find related events in another is a pivot, and it is the core investigative move you will perform thousands of times in this course. Pivots work because the sources share fields. The same account name appears in identity and endpoint records. The same host appears in endpoint and network records. The same address appears in network and web records. Above all, every event carries a time, and the SIEM puts them on one clock, so you can ask what an account did and, more pointedly, what it did in the ten minutes after a suspicious sign-in.
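
Example search, illustration only, not run here. One way to ask the "ten minutes after" question is to let the sign-in event set the window rather than typing a timestamp by hand. This is a sketch under assumptions: the `identity` index, the `action` field, and the `"success"` value are hypothetical names, and the window is anchored on the earliest successful sign-in inside whatever time range the search itself covers, so that range should be kept narrow.

````spl
(index=identity user=p.sharma action=success) OR (index=endpoint user=p.sharma) ```identity index and action field are assumptions```
| eval signin_time=if(index=="identity", _time, null()) ```only sign-in events carry a sign-in time```
| eventstats min(signin_time) as first_signin ```copy the earliest sign-in time onto every event```
| where index=="endpoint" AND _time>=first_signin AND _time<=first_signin+600 ```600 seconds = ten minutes```
| sort 0 _time
| table _time host process_name parent_process_name
````

The shared `user` field links the two sources and the shared clock bounds the window. Neither half works without the other.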

The identifier you carry changes as the investigation moves, and learning to choose it well is most of the craft. Early on the anchor is usually the account, because identity is where a great many incidents first surface. Once you have a host, the host often becomes the better anchor, since it gathers every account and process that touched that machine, well beyond the one you started from.

Once you have an external address, that address becomes the anchor, and you can ask which other hosts also spoke to it and whether any of that contact predates the incident you are working. Each anchor reveals a different slice of the same event, and the richest anchor shifts as the picture fills in.
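
Example search, illustration only, not run here. When the anchor is an external address, the two questions above (who else spoke to it, and since when) can be sketched as a single summary per host. The address `203.0.113.10` is a placeholder from the documentation range, not an address from the Northgate incident, and the `network` index and the `dest_ip` and `src_host` fields are assumed names.

````spl
index=network dest_ip=203.0.113.10 ```placeholder address; network index and dest_ip are hypothetical names```
| stats min(_time) as first_contact, max(_time) as last_contact, count as connections by src_host ```src_host is an assumed field holding the internal machine name```
| sort 0 first_contact ```earliest contact first, so pre-incident activity rises to the top```
| fieldformat first_contact=strftime(first_contact, "%Y-%m-%d %H:%M:%S")
| fieldformat last_contact=strftime(last_contact, "%Y-%m-%d %H:%M:%S")
````

Run over a range that starts well before the incident, a `first_contact` that predates the suspicious sign-in suggests the address was in play earlier than the alert implied. More than one `src_host` suggests the host you started from is not the only one involved.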

Example search (illustration only, not run here)

index=endpoint user=p.sharma
| sort _time
| table _time host process_name parent_process_name

The pivot as a single search: take the account from the sign-in and ask the endpoint what it ran. Carrying an identifier from one source into the next is the move you repeat all course.

Example output (illustration only, 2 results)

_time      host           process_name     parent_process_name
08:17:41   WS-SHARMA-01   powershell.exe   winword.exe
08:18:09   WS-SHARMA-01   cmd.exe          powershell.exe

An investigation is a chain of pivots. You start from one anchor, often the thing the alert handed you, and you carry an identifier from it into the next source to widen the picture. Each pivot either extends the story or closes a door, and both outcomes are progress. The skill that separates a confident analyst from a stuck one is rarely knowing more commands. It is knowing which identifier to carry next and which source will answer the question you have right now.

The shared field is the bridge. Knowing which identifier links two sources, and trusting the single clock that orders them, is what lets you walk an incident from its first footprint to its last.
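
Example search, illustration only, not run here. The next link in the chain can be sketched the same way: take the host from the endpoint results and ask the network source where that host connected. The `network` index and the `src_host` and `dest_ip` fields are assumptions made for illustration; the host name comes from the example output in this section.

````spl
index=network src_host=WS-SHARMA-01 ```network index and src_host are hypothetical names```
| stats min(_time) as first_contact, count as connections by dest_ip ```dest_ip is an assumed field name```
| sort 0 -first_contact ```destinations first seen most recently come first```
| fieldformat first_contact=strftime(first_contact, "%Y-%m-%d %H:%M:%S")
````

A destination whose `first_contact` falls just after the suspicious process started, with no earlier history in the search range, is a candidate for the next anchor.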

Anti-Pattern

Re-searching each source from scratch.

A reliable way to lose a trail is to treat every source as a fresh hunt: query identity for anything odd, then separately query the endpoints, then separately query the network, and try to eyeball the connections between three piles of unrelated results. Carry an identifier instead. Take the account, host, or address from the event you already have and ask the next source about that specifically, in the window around it. The pivot is what keeps an investigation one thread rather than three disconnected searches.
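
Example search, illustration only, not run here. Carrying the identifier can even be written into the search itself. In this sketch a subsearch finds the accounts that signed in successfully from the spray address and hands them to the endpoint search as a filter, so the endpoint is only asked about those accounts. The `identity` index and the `src_ip` and `action` fields are assumed names, and subsearches are subject to result and runtime limits, so treat this as a pattern for small result sets and not a general join.

````spl
index=endpoint
    [ search index=identity src_ip=193.32.162.89 action=success
      | dedup user
      | fields user ] ```the subsearch returns user values that become the filter; identity, src_ip, and action are assumptions```
| sort 0 _time
| table _time user host process_name parent_process_name
````

Contrast this with running three unrelated searches: here the second source is never queried for "anything odd", only for what the already-suspect accounts did.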

The seven shapes you will learn to catch

Real intrusions reuse a small number of shapes, and this course is built around seven of them, drawn through the Northgate environment so you meet each one as a trail to follow rather than a definition to memorize. They are worth seeing as a set now, because the modules ahead return to them.

The password spray you just walked is the first: many accounts, few attempts, one success that opens the door. Second is an adversary-in-the-middle token replay, where the attacker captures a valid session token through a phishing proxy and replays it to walk straight past multi-factor authentication, so the sign-in looks legitimate and the giveaway is in the context around it.

Third is endpoint compromise, where a malicious process establishes itself on a host and you trace what it spawned and what it touched.

Fourth is the pre-encryption phase of a ransomware operation, the reconnaissance, staging, and shadow-copy deletion that happens before any files are locked, which is the window where you can still stop it. Fifth is a hybrid pivot, where an attacker moves between the on-premises estate and the cloud, using a foothold in one to reach the other.

Sixth is an edge-to-identity chain, starting at an internet-facing device and ending in a valid account. Seventh is the full-span capstone, which asks you to carry an investigation across all of these layers at once.

You are not expected to recognize these on sight yet. The point of listing them is to show that the cross-source trail you followed for the spray is the same kind of object in every case. Different entry, different goal, same fundamental task: find the footprints, establish that they belong to one actor, and put them in order.

Seen as a set, the seven are chosen to cover the estate on purpose. Some begin in identity, some on an endpoint, some at the network edge, and some in the cloud, so that by the end you have followed trails that start from every major source rather than rehearsing the same entry point seven times over. An attacker does not get to choose where your attention is strong, so the course refuses to let your attention be strong in only one place.

Why a single-source view misses the attack

It is worth being blunt about the failure this sub is guarding against, because it is the most common way a real incident goes unseen for too long. A team watching one source in isolation, however carefully, is structurally unable to see an attack that spans several. The identity team sees a strange sign-in and, finding nothing else strange in identity, closes it.

The endpoint team sees an odd process and, lacking the sign-in context, treats it as a low-priority curiosity. Neither is wrong about their own source. Both miss the incident, because the incident lives in the join between their sources that neither of them is positioned to make.

The whole-estate SIEM exists precisely to dissolve that boundary, and the analyst who thinks in trails rather than in single events is the one who uses it well. Hold that frame as you go into the rest of the course. Every detection you build and every investigation you run is, underneath, an attempt to reassemble a sequence that an attacker scattered across sources, hoping no one was looking at all of them at once.

There is one problem this leaves untouched, and it is the hardest one. The estate produces a vast amount of activity that looks, footprint for footprint, exactly like the early steps of an attack: failed sign-ins, new processes, fresh connections. Telling the rare real trail apart from the constant benign noise is its own discipline, and SPL0.5 takes it on directly.
