Chapter 13: Cloud Forensics — Microsoft Azure and OneDrive on Windows
Introduction
For the first twelve chapters of this course, the disk has been the crime scene. You learned to image it, parse its file system, walk its registry, carve its unallocated clusters, and reconstruct user behavior from artifacts written by Windows itself. That model breaks the moment a user signs into a Microsoft account and lets OneDrive begin syncing the Documents folder. Files that appear in Explorer may have zero bytes resident on the local volume. Timestamps in the Master File Table may reflect a placeholder created seconds ago, not the original document authored two years prior on a different machine. The "deletion" the suspect performed last Tuesday may still exist in a second-stage recycle bin held by Microsoft in a data center the analyst will never physically touch.
This chapter extends the file systems analysis discipline into the cloud. The focus is deliberately Windows-centric and artifact-driven. We will spend most of our time on OneDrive client artifacts because that is where on-disk forensics meets cloud-resident evidence, and because the OneDrive sync engine is one of the most consequential and least understood data structures on a modern Windows endpoint. We will also cover the Microsoft Azure platform conceptually so you understand the broader environment your endpoint artifacts live within, the legal mechanisms that govern provider-held evidence, and the tenant-side log sources that corroborate what you find on disk.
Learning Objectives
By the end of this chapter, you will be able to:
- Differentiate the IaaS, PaaS, and SaaS service models within Microsoft Azure and Microsoft 365, and identify which categories of evidence are reachable through endpoint acquisition versus provider-mediated legal process.
- Describe the legal framework governing cloud evidence acquisition, including the Stored Communications Act, the CLOUD Act of 2018, and the Microsoft Law Enforcement Request process.
- Locate, identify, and interpret OneDrive client artifacts on a Windows endpoint, including the ODL log files, the SQLite-based SyncEngine databases, and the NTFS reparse points used by Files On-Demand.
- Reconstruct file activity timelines across the OneDrive sync boundary, accounting for placeholder behavior, dehydration and rehydration events, and version history.
- Distinguish OneDrive Personal from OneDrive for Business at the artifact level and explain how the differences affect investigative scope.
13.1 The Cloud Forensics Problem Space
Before tools and artifacts, you need a clean mental model of what "the cloud" actually is from an investigator's perspective. The National Institute of Standards and Technology, in Special Publication 800-145, defines three service models that remain the most useful starting point.
Infrastructure as a Service (IaaS) delivers virtualized compute, storage, and networking. In Azure, this is the world of Virtual Machines, managed disks, and virtual networks. The customer controls the operating system and everything above it. From a forensic standpoint, an IaaS virtual machine can often be acquired in a manner that closely resembles traditional disk imaging because the customer has administrative access to the guest OS and can snapshot the underlying managed disk.
Platform as a Service (PaaS) delivers a managed runtime. Azure App Service, Azure SQL Database, and Azure Functions are common examples. The customer controls application code and data but not the operating system. Forensic acquisition shifts from disk imaging to log review and database export. There is no guest OS to image.
Software as a Service (SaaS) delivers a finished application. Microsoft 365, which includes OneDrive for Business, SharePoint Online, Exchange Online, and Teams, is the dominant SaaS platform you will encounter as an investigator. The customer controls only configuration and content. Acquisition is almost entirely provider-mediated through tools such as Microsoft Purview eDiscovery or Microsoft Graph API queries.
The shared responsibility model formalizes who controls what. Microsoft is responsible for the physical infrastructure, the hypervisor, and the underlying services. The customer is responsible for data, identity, and the configurations they choose. Forensic responsibility tracks the same line. The artifacts you can compel through normal investigative process are those the customer controls. Anything below that line requires Microsoft's cooperation, generally through legal process.
Several characteristics of cloud environments complicate every step of the investigation:
- Volatility. Resources can be created, modified, and destroyed in seconds. A compromised virtual machine deleted by an attacker may leave no recoverable disk image at all unless soft-delete or backup features were enabled in advance.
- Multi-tenancy. Provider infrastructure is shared across customers. Physical seizure of a server is almost never an option because doing so would affect unrelated tenants.
- Jurisdiction. Data may be stored in any region the customer or provider selects. A U.S. investigation may need to acquire evidence physically located in Ireland or Singapore.
- Log retention defaults. Many of the most useful Azure and Microsoft 365 logs are retained for only 90 days under default licensing. Evidence that existed at the time of an incident may be unrecoverable by the time the investigator is engaged.
- Chain of custody. When a provider exports data on the customer's behalf, the analyst did not perform the acquisition. Documentation of the export process, hash values returned by the provider, and the transport chain become the chain of custody.
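The chain-of-custody point above can be made concrete with a small verification step. The following is a minimal sketch, assuming the provider supplies SHA-256 values in a manifest that maps relative file names to hex digests; the manifest shape and the function names are hypothetical, and the algorithm actually used by a given export should be confirmed before relying on this.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large exports do not exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_export(export_dir: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Return 'match', 'mismatch', or 'missing' for each manifest entry.

    The manifest format (relative name -> SHA-256 hex digest) is an
    assumption made for illustration, not a provider-defined format.
    """
    results = {}
    for relative_name, expected in manifest.items():
        candidate = export_dir / relative_name
        if not candidate.is_file():
            results[relative_name] = "missing"
        elif sha256_of_file(candidate) == expected.lower():
            results[relative_name] = "match"
        else:
            results[relative_name] = "mismatch"
    return results
```

Recording the per-file result alongside the export documentation gives the report a verifiable link between what the provider produced and what the analyst examined.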
| Service Model | Customer Controls | Direct Forensic Access | Acquisition Path |
|---|---|---|---|
| IaaS (Azure VM) | Guest OS, applications, data | Yes (via guest OS or disk snapshot) | Snapshot managed disk; image from within guest |
| PaaS (App Service, Azure SQL) | Application code, data, configuration | Partial (data and logs only) | Database export; diagnostic log download |
| SaaS (Microsoft 365, OneDrive) | Content, identity configuration | Indirect | Purview eDiscovery; Graph API; legal process |
Analyst Perspective
On your first cloud case, the most valuable thirty minutes you can spend are on the licensing page. Whether your organization has Microsoft 365 E3 or E5, whether Microsoft Defender for Cloud is enabled, and whether Audit (Premium) has been turned on will determine what evidence even exists. Investigators routinely discover that the log they need was never being generated.
13.2 Legal and Procedural Framework
Cloud evidence acquisition is a legal exercise as much as a technical one. The relevant U.S. statute is the Stored Communications Act, codified at 18 U.S.C. § 2701 and following, which governs how providers may disclose customer content and metadata. Different categories of data require different legal instruments, ranging from a subpoena for basic subscriber information to a search warrant for content.
The CLOUD Act of 2018 (Clarifying Lawful Overseas Use of Data Act) clarified that U.S. providers must produce data in their possession, custody, or control regardless of where it is physically stored. It also created a framework for executive agreements with foreign governments to streamline cross-border requests. Before the CLOUD Act, the location of the data center was a frequent point of legal contention, most famously in the Microsoft Ireland case.
For non-U.S. investigations, Mutual Legal Assistance Treaties (MLATs) remain the formal channel, though they are notoriously slow. A preservation request under 18 U.S.C. § 2703(f) is a fast and important first step. It directs the provider to preserve specific records for 90 days (renewable once) while the investigator pursues the legal process needed to compel production. Sending a preservation letter is often the single most time-sensitive action in a cloud case.
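Because the preservation window is time-boxed, it helps to compute the expiry dates the moment the letter goes out. The sketch below is simple date arithmetic under one assumption: the 90-day period is counted from the date the request was served. How the period is actually counted in a given matter is a question for counsel, and the function name is hypothetical.

```python
from datetime import date, timedelta

PRESERVATION_DAYS = 90  # initial period described in the text


def preservation_deadlines(served_on: date) -> dict[str, date]:
    """Compute the initial expiry and the expiry after a single renewal.

    Assumes the clock starts on the service date (an assumption, not
    legal advice).
    """
    initial_expiry = served_on + timedelta(days=PRESERVATION_DAYS)
    return {
        "initial_expiry": initial_expiry,
        "expiry_if_renewed": initial_expiry + timedelta(days=PRESERVATION_DAYS),
    }


deadlines = preservation_deadlines(date(2024, 1, 1))
print(deadlines["initial_expiry"])     # 2024-03-31
print(deadlines["expiry_if_renewed"])  # 2024-06-29
```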
Microsoft publishes a Law Enforcement Request process and accepts properly served legal documents through a dedicated portal. Internal corporate investigators working within their own tenant do not need legal process at all because the organization owns the data; they use Microsoft Purview eDiscovery directly. The distinction between an external law enforcement investigator and an internal corporate investigator working the same tenant is fundamental and changes every step of the workflow.
Putting It Together: Choosing the Acquisition Path
A mid-sized accounting firm suspects a departing employee exfiltrated client tax records to a personal account during their two-week notice period. The firm's IT director engages you. You have two evidence sources within reach: the employee's company-issued Windows laptop, still on premises, and the firm's Microsoft 365 tenant.
The endpoint will tell you what files were touched locally and may contain ODL log entries showing sync activity. The tenant Unified Audit Log will tell you what was downloaded, shared externally, or accessed through the web interface. Neither source alone is sufficient. The endpoint cannot prove what happened in a browser session. The tenant log cannot prove which physical machine performed an action without correlating IP addresses and device identifiers.
Your sequence is preservation first, acquisition second. You direct the IT director to immediately place the user's mailbox and OneDrive on legal hold within Microsoft Purview, which preserves content even if the user deletes it. You then perform a triage acquisition of the laptop while it remains in your custody. Only after both preservation actions are complete do you begin parsing artifacts.
13.3 Microsoft Azure Architecture for Investigators
Students consistently conflate "Azure" with "Microsoft 365." They are related but distinct. Azure is the IaaS and PaaS cloud platform. Microsoft 365 is the SaaS productivity suite that runs on top of Azure infrastructure. Both share a common identity plane: Microsoft Entra ID, formerly known as Azure Active Directory.
The Azure resource hierarchy proceeds from broadest to narrowest:
- Tenant. The top-level Entra ID directory representing an organization. One tenant per organization is typical.
- Management groups. Optional containers for organizing multiple subscriptions under common policy.
- Subscriptions. Billing and administrative boundaries. A tenant may contain many subscriptions.
- Resource groups. Logical containers for related resources within a subscription.
- Resources. The actual virtual machines, storage accounts, databases, key vaults, and so on.
This hierarchy matters to investigators because access control, logging configuration, and legal scope all attach at specific levels. A search warrant scoped to a subscription does not authorize collection from a sibling subscription in the same tenant. An audit log enabled at the resource group level does not capture events at the subscription level.
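Scope questions like the warrant example above often come down to reading a resource identifier correctly. Azure resource IDs follow the general pattern /subscriptions/{id}/resourceGroups/{name}/providers/{namespace}/{type}/{name}. The following sketch, with hypothetical function names and placeholder GUIDs, splits an ID and checks whether it falls inside an authorized subscription.

```python
def parse_resource_id(resource_id: str) -> dict:
    """Extract the subscription, resource group, and resource name from an ID."""
    segments = [s for s in resource_id.split("/") if s]
    parsed = {"subscription": None, "resource_group": None, "name": None}
    if len(segments) >= 2 and segments[0].lower() == "subscriptions":
        parsed["subscription"] = segments[1].lower()
    if len(segments) >= 4 and segments[2].lower() == "resourcegroups":
        parsed["resource_group"] = segments[3]
    if len(segments) > 4:
        parsed["name"] = segments[-1]
    return parsed


def in_scope(resource_id: str, authorized_subscription: str) -> bool:
    """True only when the resource belongs to the authorized subscription."""
    return parse_resource_id(resource_id)["subscription"] == authorized_subscription.lower()


# Placeholder identifiers for illustration only.
vm_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000001"
    "/resourceGroups/rg-finance/providers/Microsoft.Compute/virtualMachines/vm-app01"
)
print(in_scope(vm_id, "00000000-0000-0000-0000-000000000001"))  # True
print(in_scope(vm_id, "00000000-0000-0000-0000-000000000002"))  # False
```

Running a check like this over every resource ID in a collection plan is a cheap way to catch a sibling-subscription resource before it is collected outside the authorized scope.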
Microsoft Entra ID is the identity plane for both Azure and Microsoft 365. Every user, every service principal, every device, and every authentication event in either environment passes through Entra ID. Sign-in logs and audit logs from Entra ID are the most important single source for investigating account compromise, lateral movement, and credential abuse in Microsoft cloud environments.
13.4 Azure and Microsoft 365 Evidence Sources
The following log sources are the ones you will reach for most often. Retention periods and licensing requirements should be verified against current Microsoft documentation before you commit to a case strategy, because Microsoft has changed both repeatedly.
- Azure Activity Log. Records control-plane operations such as creating, modifying, or deleting Azure resources. Default retention is 90 days. Useful for reconstructing what an attacker did to the environment itself.
- Entra ID Sign-in Logs. Record every authentication attempt against Entra ID, successful or failed, including source IP, application, and conditional access result. Default retention varies by license tier.
- Entra ID Audit Logs. Record directory changes such as user creation, group membership changes, and role assignments.
- Microsoft 365 Unified Audit Log. The single most important log source for SaaS investigations. Captures user and admin actions across Exchange, SharePoint, OneDrive, Teams, and Entra ID. Searched through Microsoft Purview. The events most relevant to OneDrive cases include FileAccessed, FileDownloaded, FileUploaded, FileSyncDownloadedFull, FileDeleted, and SharingSet.
- Microsoft Graph API. A programmatic interface that allows scripted collection of mailbox content, OneDrive files, audit events, and identity data. Useful for repeatable, scriptable acquisition workflows.
- Microsoft Sentinel. Microsoft's cloud SIEM. If the organization has it deployed, it may contain longer retention of the above sources plus correlation rules and incident records.
| Source | What It Captures | Default Retention | Primary Use |
|---|---|---|---|
| Azure Activity Log | Resource control-plane changes | 90 days | Attacker actions on infrastructure |
| Entra ID Sign-in Log | Authentication attempts | 7 to 30 days (license dependent) | Account compromise investigation |
| Entra ID Audit Log | Directory and identity changes | 7 to 30 days (license dependent) | Privilege escalation, persistence |
| Unified Audit Log | M365 user and admin actions | 90 to 180 days (license dependent) | OneDrive, Exchange, SharePoint activity |
| Microsoft Graph API | Programmatic export of content and metadata | N/A (live query) | Scripted acquisition |
Warning
Default retention values change. The Unified Audit Log was 90 days for most tenants for years, then Microsoft began extending it to 180 days for many license tiers. Always confirm current retention against Microsoft documentation and the tenant's actual configuration before relying on a log source. Never assume an event "must be there" based on what was true on a previous case.
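Once a Unified Audit Log export is in hand, the first pass is usually a filter for the OneDrive-relevant operations. The sketch below assumes each exported row carries a JSON string in a column named AuditData, and that the JSON includes fields such as Operation, CreationTime, UserId, ClientIP, and SourceFileName. Those names should be checked against the actual export, since column layouts vary by export method. Malformed rows are counted rather than silently dropped so the gap can be documented.

```python
import json

ONEDRIVE_OPERATIONS = {
    "FileAccessed", "FileDownloaded", "FileUploaded",
    "FileSyncDownloadedFull", "FileDeleted", "SharingSet",
}


def onedrive_events(rows, user=None):
    """Return (events sorted by time, count of rows that could not be parsed).

    `rows` is an iterable of dicts, e.g. from csv.DictReader. The column
    and field names used here are assumptions to verify per export.
    """
    events, skipped = [], 0
    for row in rows:
        try:
            data = json.loads(row.get("AuditData") or "")
        except json.JSONDecodeError:
            skipped += 1
            continue
        if not isinstance(data, dict):
            skipped += 1
            continue
        if data.get("Operation") not in ONEDRIVE_OPERATIONS:
            continue
        if user and str(data.get("UserId", "")).lower() != user.lower():
            continue
        events.append({
            "time": data.get("CreationTime", ""),
            "operation": data["Operation"],
            "user": data.get("UserId"),
            "client_ip": data.get("ClientIP"),
            "file": data.get("SourceFileName"),
        })
    # ISO 8601 timestamps in a consistent format sort correctly as strings.
    return sorted(events, key=lambda e: e["time"]), skipped
```

A nonzero skipped count is itself worth noting in the case file, because it means part of the export was not examined by this pass.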
13.5 OneDrive as a Hybrid Artifact
OneDrive is where Azure-side cloud forensics meets endpoint file system analysis, which is why it deserves the bulk of this chapter. Two distinct products share the OneDrive name and the same client binary on Windows, but they store data in different services and generate slightly different artifacts.
OneDrive Personal is the consumer service tied to a Microsoft account. Storage lives in a consumer Azure backend. It is the OneDrive people use for their family photos and the OneDrive that frequently appears in criminal cases involving individuals.
OneDrive for Business is part of Microsoft 365 and is technically a per-user SharePoint Online site collection. Each user has a personal site, and OneDrive for Business presents a sync client view of that site. It is the OneDrive that appears in nearly all corporate investigations.
A single Windows installation can sync both at once. The client maintains separate sync roots, separate settings databases, and separate ODL log streams for each account. Investigators must inventory which accounts are configured before assuming a single artifact set.
The defining technical feature of modern OneDrive is Files On-Demand, which uses NTFS reparse points to present cloud files in Explorer without downloading their content. A placeholder file occupies a directory entry and a small amount of metadata in the Master File Table, but its data stream contains zero allocated clusters. When the user opens the file, the OneDrive client intercepts the I/O request, downloads the content, and rehydrates the file. If the file is later marked "free up space," the content is removed and the placeholder restored.
This has direct consequences for traditional disk imaging. A dead-box image of a system using Files On-Demand will contain placeholder entries for files whose content does not exist locally. File carving against unallocated space will not recover them because they were never resident. Recognizing placeholders is critical, and the indicator is straightforward: a placeholder file carries the O (offline) attribute, visible from attrib or programmatically through the FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS flag.
Warning
A standard file count or hash inventory of a OneDrive folder will produce results that look complete but are not. On a dead-box image, files whose content is dehydrated cannot be hashed because there is no local data to hash; on a live system, reading them to compute a hash can trigger a download that alters the evidence. Always check for the offline attribute and account for placeholder files separately in your acquisition documentation. Treating placeholders as missing files is wrong, and so is treating them as full files.
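The attribute check described in this section can be sketched as a small bitmask decoder. The constants below are the Win32 file attribute values from the Windows SDK headers. The function name and the decision rule (treat a file as a placeholder when either the offline or the recall-on-data-access bit is set) are assumptions for illustration. The input is a plain integer, so the same function works on values reported by an MFT parser or read from a live test system.

```python
# Win32 file attribute bits relevant to Files On-Demand.
CLOUD_FLAGS = {
    "SPARSE_FILE": 0x200,
    "REPARSE_POINT": 0x400,
    "OFFLINE": 0x1000,
    "PINNED": 0x80000,
    "UNPINNED": 0x100000,
    "RECALL_ON_DATA_ACCESS": 0x400000,
}

PLACEHOLDER_MASK = CLOUD_FLAGS["OFFLINE"] | CLOUD_FLAGS["RECALL_ON_DATA_ACCESS"]


def describe_attributes(attributes: int) -> dict:
    """List the cloud-related flags that are set and apply the placeholder rule."""
    flags = [name for name, bit in CLOUD_FLAGS.items() if attributes & bit]
    return {"flags": flags, "is_placeholder": bool(attributes & PLACEHOLDER_MASK)}


# Illustrative values only; confirm real combinations on a test system.
print(describe_attributes(0x501620)["is_placeholder"])  # True
print(describe_attributes(0x20)["is_placeholder"])      # False
```

Splitting an inventory into placeholder and non-placeholder groups with a check like this, before any hashing begins, keeps the acquisition documentation honest about which files had content available to hash.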
13.6 OneDrive Client Artifacts on Windows
The OneDrive client writes most of its forensically interesting state under the user's local profile. The exact paths have remained stable across recent versions, but tool authors note that Microsoft changes file formats periodically and parsers should be validated against known data.
Settings and databases. The directory %LocalAppData%\Microsoft\OneDrive\settings\ contains per-account subdirectories. Each subdirectory holds the SQLite databases that track sync state. The most important file is the SyncEngine database, historically named in patterns such as SyncEngineDatabase.db and accompanied by .dat files containing per-file metadata. These databases record file paths, parent folder identifiers, sync status, hashes, and timestamps for every item the client knows about. They are the primary source for answering "what did this user have in their OneDrive on this date" because they describe the cloud-side state as the client most recently observed it, not just what is currently resident on disk.
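Before trusting any parser, it is worth looking at a SyncEngine database directly. The sketch below opens a working copy of a SQLite file read-only and reports each table with its row count. It deliberately assumes nothing about table or column names, since those are undocumented and may change between client versions. The function name is hypothetical, and the file should always be a copy, never the original evidence.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path: Path) -> dict[str, int]:
    """Return {table name: row count} for a SQLite file opened read-only."""
    # mode=ro asks SQLite not to write to the database file.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier safely
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
```

The table inventory gives a baseline to compare against parser output: if a tool reports far fewer records than the raw row counts suggest, that discrepancy needs an explanation before the output goes into a report.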
ODL logs. The directory %LocalAppData%\Microsoft\OneDrive\logs\ contains the OneDrive client's structured diagnostic logs in two formats: .odl files for current activity and .odlsent files for log entries already transmitted to Microsoft. The format is binary and proprietary, but it has been reverse engineered by the community. Tools from Yogesh Khatri and CCL Solutions can parse ODL files into a readable timeline of sync events, including file uploads, downloads, deletes, renames, and conflict resolutions. ODL parsing is one of the highest-value techniques in modern Windows forensics because it reveals activity that occurred entirely within the cloud sync engine and is invisible to the standard Windows event log.
Registry. The key HKCU\Software\Microsoft\OneDrive, stored in the user's NTUSER.DAT hive, records account configuration, the sync root path for each account, the email address associated with each account, and the last sign-in time. The Accounts subkey contains one named subkey for each configured account (typically Personal for OneDrive Personal and Business1, Business2, etc. for tenants). The Tenants subkey under Business1 records the friendly tenant name. These values are essential for confirming which accounts to focus on before parsing the larger artifact set.
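The account inventory described above can be sketched in a few lines. The classification helper works on subkey names from any source, including an offline hive parser. The live enumeration uses Python's standard winreg module and is only appropriate on a test system, because casework should read the NTUSER.DAT hive from the image instead. The value names UserEmail and UserFolder are assumptions to verify against your own test data.

```python
import re

ACCOUNTS_KEY = r"Software\Microsoft\OneDrive\Accounts"


def classify_account(subkey_name: str) -> str:
    """Map an Accounts subkey name to the product it represents."""
    if subkey_name.lower() == "personal":
        return "OneDrive Personal"
    if re.fullmatch(r"Business\d+", subkey_name, re.IGNORECASE):
        return "OneDrive for Business"
    return "unknown"


def list_live_accounts() -> list[dict]:
    """Enumerate configured accounts for the current user (Windows only)."""
    import winreg  # imported lazily so the module loads on other platforms

    accounts = []
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, ACCOUNTS_KEY) as root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for index in range(subkey_count):
            name = winreg.EnumKey(root, index)
            values = {}
            with winreg.OpenKey(root, name) as subkey:
                for value_name in ("UserEmail", "UserFolder"):  # assumed value names
                    try:
                        values[value_name] = winreg.QueryValueEx(subkey, value_name)[0]
                    except FileNotFoundError:
                        values[value_name] = None
            accounts.append({"subkey": name, "type": classify_account(name), **values})
    return accounts


print(classify_account("Business1"))  # OneDrive for Business
print(classify_account("Personal"))   # OneDrive Personal
```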
Known Folder Move (KFM). When KFM is enabled, the user's Desktop, Documents, and Pictures folders are redirected into the OneDrive sync root. This is invisible to most users and easy to miss as an investigator. Registry values under HKCU\Software\Microsoft\OneDrive\Accounts\Business1 record whether KFM is active and which folders are redirected. If KFM is on, "the user's Documents folder" and "the OneDrive sync root" are the same physical directory, and every file the user saves to Documents is uploaded to the cloud automatically.
Recycle bin behavior. OneDrive maintains its own two-stage recycle bin in the cloud. When a user deletes a synced file, the local copy goes to the standard Windows Recycle Bin and the cloud copy goes to a first-stage OneDrive Recycle Bin. If the user empties the OneDrive recycle bin, the file moves to a second-stage recycle bin where it is retained for a further period (currently 30 days for personal accounts; for business accounts the documented default is 93 days counted from the original deletion across both stages, subject to the tenant's retention configuration). For business accounts, an administrator can recover items from the second-stage bin even after the user has tried to remove them. This is an evidence preservation gift on the right kind of case.
Shell extensions and overlay icons. The OneDrive shell extension registers under HKLM\Software\Microsoft\Windows\CurrentVersion\Explorer\ShellIconOverlayIdentifiers. The overlay icons (cloud, checkmark, sync arrows) are not forensic evidence themselves but their presence confirms the client was installed and active.
Putting It Together: Reconstructing an Exfiltration
A corporate investigator is asked to determine whether a sales engineer copied a proprietary pricing spreadsheet to a personal account before resigning. The endpoint is a company-issued Windows laptop. The corporate tenant's Unified Audit Log is available.
The investigator begins at the registry. HKCU\Software\Microsoft\OneDrive\Accounts shows two configured accounts: Business1 for the corporate tenant and Personal for an outlook.com address. The presence of a personal OneDrive account on a company device is itself a finding worth documenting.
Next, the investigator parses the SyncEngine databases for both accounts. The Business1 database contains an entry for Pricing_Q4_Confidential.xlsx with a known hash. The Personal database contains an entry for a file with the same name and the same hash, in a folder named Backup. The hash match indicates the same content existed in both clouds.
The ODL logs for the Business1 account show a FileDownloaded event for Pricing_Q4_Confidential.xlsx two days before the resignation date, followed by ODL events for the Personal account showing FileUploaded for the same name within minutes. The Unified Audit Log is then queried for the same file path and confirms the corporate-side download event with the source IP address of the laptop.
The investigator now has three independent corroborating sources: the local SyncEngine state, the local ODL timeline, and the tenant audit log. Each on its own could be challenged. Together they form a defensible reconstruction.
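The hash comparison at the center of this scenario amounts to a join between two inventories. The sketch below assumes each inventory has already been extracted into a list of records with path and hash fields. That record shape is hypothetical, and the code also assumes both accounts report the same hash algorithm, which must be confirmed before a match is treated as meaningful.

```python
from collections import defaultdict


def shared_content(business_items, personal_items):
    """Return (business_path, personal_path, hash) for every hash present in both."""
    by_hash = defaultdict(list)
    for item in business_items:
        if item.get("hash"):  # skip folders and records with no hash
            by_hash[item["hash"]].append(item)
    matches = []
    for item in personal_items:
        for counterpart in by_hash.get(item.get("hash"), []):
            matches.append((counterpart["path"], item["path"], item["hash"]))
    return matches


# Hypothetical records mirroring the scenario in the text.
business = [{"path": "Sales/Pricing_Q4_Confidential.xlsx", "hash": "abc123"}]
personal = [
    {"path": "Backup/Pricing_Q4_Confidential.xlsx", "hash": "abc123"},
    {"path": "Photos/beach.jpg", "hash": "ffee99"},
]
print(shared_content(business, personal))
```

Joining on the hash rather than the file name means a renamed copy is still found, and an unrelated file that happens to share a name is not reported.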
13.7 Timeline Reconstruction Across the Sync Boundary
Traditional Windows timeline analysis treats the MFT, the USN journal, and the registry as authoritative for "when did this file appear on this system." Sync clients break that assumption in subtle ways.
When OneDrive creates a placeholder for a file that already exists in the cloud, the MFT records the placeholder creation time, not the original document creation time. Two files in the same OneDrive folder may have nearly identical local creation times despite having been authored years apart on different machines. Standard MAC time analysis will mislead the investigator unless the cloud-side metadata is also consulted.
Dehydration and rehydration events generate additional surprises. When a user opens a dehydrated file, the rehydration may update the last access time. When the system later marks the file offline again to free space, the resulting change to the file's reparse data may update the modified time even though the user did not edit the content. The ODL log is the corrective source: it records the actual sync events with their original timestamps as reported by the service.
OneDrive also maintains version history on the cloud side, retaining previous versions of edited files for a period that varies by service tier. Version history is acquired through the web interface, the Microsoft Purview eDiscovery search, or the Graph API. It cannot be reconstructed from the local artifacts alone, but the local SyncEngine database will indicate which files have multiple versions stored cloud-side.
The disciplined approach is to build two timelines and reconcile them. The first is the local timeline from MFT, USN, registry, and ODL. The second is the cloud timeline from Unified Audit Log entries and version history exports. The points where they agree are strong. The points where they disagree are usually the points the case turns on.
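The two-timeline reconciliation can be sketched as a tolerance-based match. The code below assumes both timelines have been normalized into records with file, action, and timezone-aware time fields. That schema, the action labels, and the five-minute tolerance are all assumptions for illustration. Events that match on file and action within the tolerance are paired, and everything else is reported as present on one side only.

```python
from datetime import datetime, timedelta, timezone


def reconcile(local_events, cloud_events, tolerance=timedelta(minutes=5)):
    """Pair local and cloud events; return agreed pairs and one-sided leftovers."""
    unmatched_cloud = list(cloud_events)
    agreed, local_only = [], []
    for local in sorted(local_events, key=lambda e: e["time"]):
        match = next(
            (
                cloud for cloud in unmatched_cloud
                if cloud["file"] == local["file"]
                and cloud["action"] == local["action"]
                and abs(cloud["time"] - local["time"]) <= tolerance
            ),
            None,
        )
        if match is None:
            local_only.append(local)
        else:
            unmatched_cloud.remove(match)  # each cloud event pairs at most once
            agreed.append((local, match))
    return {"agreed": agreed, "local_only": local_only, "cloud_only": unmatched_cloud}


utc = timezone.utc
local = [{"file": "a.xlsx", "action": "download", "time": datetime(2024, 3, 1, 10, 0, tzinfo=utc)}]
cloud = [{"file": "a.xlsx", "action": "download", "time": datetime(2024, 3, 1, 10, 1, tzinfo=utc)}]
print(len(reconcile(local, cloud)["agreed"]))  # 1
```

The local_only and cloud_only lists are the disagreements the paragraph above refers to, and each entry in them deserves an explanation in the report.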
13.8 Acquisition Strategies and Tooling
For endpoint-side collection, KAPE (the Kroll Artifact Parser and Extractor) ships with targets for the OneDrive logs, settings, and registry hives. A KAPE collection is the fastest way to capture the artifact set without imaging the entire disk. For tenant-side collection, Microsoft Purview eDiscovery (Standard or Premium) is the supported workflow for legal-quality export of mailbox and OneDrive content along with audit data. Microsoft Graph API enables scripted acquisition for repeatable workflows and is the right choice when collecting from many users programmatically.
Tool validation deserves explicit attention. The ODL format is proprietary, parsers are community-maintained, and Microsoft has changed the format without notice in past releases. Before relying on parser output in a report, validate it against a known-good dataset that you generated yourself by performing controlled actions on a test system. The same applies to SyncEngine database parsers. Treat any tool output for these formats as a working hypothesis until you have confirmed the underlying records by hand.
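The validation step described above reduces to comparing what you did on the test system with what the parser says happened. The sketch below assumes both sides have been reduced to (action, filename) tuples. The tuple shape and function name are hypothetical, and the step that converts real parser output into tuples is left to the analyst because it depends on the tool.

```python
from collections import Counter


def validate_parser(expected_actions, parsed_actions):
    """Compare controlled actions with parser output, respecting repeat counts."""
    expected = Counter(expected_actions)
    parsed = Counter(parsed_actions)
    return {
        "missing": sorted((expected - parsed).elements()),     # performed, not reported
        "unexpected": sorted((parsed - expected).elements()),  # reported, not performed
    }


performed = [("upload", "test1.docx"), ("rename", "test1.docx"), ("delete", "test2.txt")]
reported = [("upload", "test1.docx"), ("delete", "test2.txt"), ("upload", "desktop.ini")]
print(validate_parser(performed, reported))
```

An empty result in both lists supports relying on the parser for that client version. Anything in the missing list is a known blind spot that should be disclosed alongside the tool's output.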
Chapter Summary
- Cloud forensics requires reconciling endpoint artifacts with provider-held evidence; neither source is sufficient alone.
- The NIST service models (IaaS, PaaS, SaaS) determine what an investigator can acquire directly versus what requires legal process.
- Preservation under 18 U.S.C. § 2703(f) and a Microsoft Purview legal hold are time-critical first steps in any cloud case.
- Microsoft Entra ID is the identity plane for both Azure and Microsoft 365; sign-in and audit logs from Entra ID are foundational to most investigations.
- The Unified Audit Log is the most important single source for SaaS user activity, but its retention and licensing dependencies must be verified per tenant.
- OneDrive Personal and OneDrive for Business share a client binary but generate separate, parallel artifact sets that must be inventoried independently.
- Files On-Demand uses NTFS reparse points to present cloud files as placeholders with no locally resident content; standard imaging and hashing workflows must account for this.
- The OneDrive SyncEngine SQLite databases and ODL logs are the highest-value endpoint artifacts and are the focus of the companion lab.
- Timeline analysis across the sync boundary requires reconciling local MFT and USN records with cloud-side audit log and version history data.
The next chapter shifts from cloud-resident artifacts to anti-forensics and evidence destruction techniques, where you will see how attackers and informed users attempt to defeat the artifact sources you have learned to read. Many of those techniques interact directly with the OneDrive sync engine in ways that the careful investigator can detect.