A Step-by-Step Microsoft Cloud Cybersecurity Health Check for Internal IT Teams

A structured, domain-by-domain assessment process your IT team can run today to identify the security gaps in your Microsoft 365, Azure, and Entra ID environment.

Published April 1, 2026 · ~25 min read · Security Assessment · Microsoft 365 · Azure · Entra ID

Introduction

Most organizations running Microsoft 365 and Azure know their security posture has gaps. The problem is rarely awareness — it's knowing where to start. The Microsoft admin portals are sprawling. Security settings are scattered across Entra ID, Intune, Defender, Purview, Exchange Online, and Azure itself. When everything is "kind of configured" but nothing has been systematically reviewed, it's difficult to know what's actually protecting you and what's just the default.

Running a formal third-party assessment is the gold standard, but it's not always immediately available or budgeted. What you can do right now — today, with the team you have — is run a structured health check across the twelve governance domains that matter most in a Microsoft cloud environment. This article gives you that structure. It's the same framework used in professional security assessments, adapted so an internal IT team can work through it methodically and produce a real, prioritized list of findings.

This is not a compliance checklist you can rush through in an afternoon. Each domain requires you to actually open the relevant admin portals, check real configurations, and make honest assessments about whether your current state is intentional or inherited. Some of what you find will be straightforward to fix. Some will surface questions your team hasn't had to answer yet. Both of those outcomes are the point.

How to Use This Health Check

The approach is straightforward: work through each of the twelve domains in order, evaluate your environment against the criteria described, document what you find, and then prioritize remediation based on risk. This is not a pass/fail exercise. There is no score that tells you whether you're "good" or "bad." The output is a prioritized action list — a clear record of what needs attention, how urgently, and how much effort is involved.

Before you begin, set up a simple tracking spreadsheet. Five columns are all you need: Domain (which of the twelve areas), Finding (what you observed), Severity (Critical, High, Medium, or Low), Remediation (what needs to happen), and Effort (estimated time and complexity). That spreadsheet becomes your working document. By the time you finish all twelve domains, it will contain every actionable finding from the assessment and the basis for a remediation roadmap.
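If your team prefers working in code over a spreadsheet, the same five columns translate directly into a small data structure. This is a minimal sketch (the field names mirror the columns above; nothing here comes from a Microsoft tool) that sorts findings most-urgent first, which is the order your remediation roadmap should follow:

```python
from dataclasses import dataclass

# Severity ranks for sorting; lower rank = more urgent.
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

@dataclass
class Finding:
    domain: str       # which of the twelve areas
    finding: str      # what you observed
    severity: str     # Critical, High, Medium, or Low
    remediation: str  # what needs to happen
    effort: str       # estimated time and complexity

def prioritize(findings):
    """Return findings sorted most-urgent first."""
    return sorted(findings, key=lambda f: SEVERITY_RANK[f.severity])

findings = [
    Finding("Email", "DMARC still at p=none", "High",
            "Move to p=quarantine, then p=reject", "2 hours"),
    Finding("Identity", "Legacy auth not blocked", "Critical",
            "Create a blocking Conditional Access policy", "1 hour"),
]
ordered = prioritize(findings)
# The Critical identity finding sorts ahead of the High email finding.
```

A flat structure like this also makes it trivial to export back to CSV for the remediation roadmap later.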

Some domains will take fifteen minutes. Others will take an hour or more, especially if configurations are unclear or undocumented. Budget two to three full working days for a thorough pass across everything. If that seems like a lot, consider that most organizations have never done this at all — spending two days to build a complete picture of your security posture is time well invested.

Who should run this?

Ideally, the person or team with Global Reader access (at minimum) in your Microsoft 365 tenant, plus access to the Azure portal for any IaaS resources. You don't need Global Admin to assess — you need it to fix things later. Read access is sufficient for the review itself.

The 12 Assessment Domains
1. Identity & Authentication
2. Email Security
3. Data Protection
4. Endpoint Management
5. Endpoint Protection
6. Application Security
7. Network & Infrastructure
8. Logging & Monitoring
9. Privileged Access
10. Incident Response
11. Backup & Recovery
12. Governance & Documentation

Domain 1: Identity and Authentication

Identity is the perimeter in a cloud environment. If an attacker can authenticate as one of your users, they bypass nearly everything else you've built. This domain is where you start because findings here tend to be the most consequential — and the most common.

Multi-Factor Authentication

The first question is whether MFA is enforced for all users, and enforced is the key word. Many organizations have MFA "enabled" through Security Defaults or per-user MFA settings, but that's not the same as enforcing it through Conditional Access. Security Defaults are Microsoft's baseline — they require MFA registration and prompt for MFA on certain conditions — but they give you no granularity. You can't exclude a service account, require a specific MFA method, or combine MFA with device compliance checks. If your tenant is still running on Security Defaults, note that as a finding. The move to Conditional Access policies is necessary for any environment beyond a handful of users.

Check the Entra admin center under Protection > Conditional Access. You should see at least one policy that requires MFA for all users, targeting all cloud apps, with appropriate exclusions (break-glass accounts, service accounts with compensating controls). If the only MFA in your environment is the legacy per-user toggle in the Microsoft 365 admin center, that's a critical finding. Per-user MFA doesn't integrate with Conditional Access and creates blind spots as your policies evolve.

Legacy Authentication

Legacy authentication protocols — POP3, IMAP, SMTP AUTH, older Exchange ActiveSync — do not support MFA. An attacker with a stolen password can authenticate through a legacy protocol and completely sidestep your MFA policies. Check whether you have a Conditional Access policy that explicitly blocks legacy authentication. This should be a separate policy from your MFA policy, targeting all users and all cloud apps, with the condition set to block legacy authentication clients. If this policy does not exist, that's a critical finding. This is one of the most impactful security controls you can deploy and one of the most commonly missing.
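Before the blocking policy goes live, gauge its impact by exporting recent Entra sign-in logs and counting legacy-protocol sign-ins. The sign-in log's client app field (`clientAppUsed` in Graph exports) identifies the protocol; the value set below reflects values commonly seen in those exports but should be treated as illustrative, not exhaustive:

```python
# Legacy client app values as they commonly appear in Entra sign-in
# log exports; treat this set as illustrative, not exhaustive.
LEGACY_CLIENTS = {
    "Exchange ActiveSync", "IMAP4", "POP3",
    "Authenticated SMTP", "Other clients",
}

def legacy_signins(rows):
    """Filter sign-in log rows (dicts) down to legacy-protocol sign-ins."""
    return [r for r in rows if r.get("clientAppUsed") in LEGACY_CLIENTS]

sample = [
    {"userPrincipalName": "alice@contoso.com", "clientAppUsed": "Browser"},
    {"userPrincipalName": "scanner@contoso.com", "clientAppUsed": "IMAP4"},
]
hits = legacy_signins(sample)
# hits contains only the IMAP4 sign-in, an account that would break
# if legacy auth were blocked today.
```

Anything this filter surfaces (scanners, old multifunction printers, scripted mailboxes) needs a migration plan or a documented compensating control before the block goes in.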

Global Administrator Count

Navigate to Entra ID > Roles and administrators > Global Administrator and count the assigned users. The target is two to four, with one of those being a documented break-glass account. In practice, we routinely find environments with eight, twelve, or more Global Admins — often because someone needed elevated access for a one-time task and the role was never removed. Every Global Admin account is a high-value target. If an attacker compromises any one of them, they have full control of the tenant. Document the count and note any accounts that shouldn't be there.

While you're looking at admin accounts, check whether administrators use separate accounts for their daily work versus administrative tasks. An admin who reads email, clicks links, and browses the web from the same account that has Global Admin privileges is carrying far more risk than necessary. Dedicated admin accounts — even without Privileged Identity Management — dramatically reduce the blast radius of a compromised credential.

Self-Service Password Reset

Check whether SSPR is enabled under Entra ID > Password reset. If it's enabled, verify that it requires MFA verification (not just a security question or alternate email). SSPR that relies on weak verification methods is an account takeover vector. If SSPR is not enabled at all, that's a different kind of finding — it means your help desk is handling all password resets, which is operationally expensive and creates its own social engineering risks.

Risky Sign-In Policies

If your organization has Entra ID P2 licensing, check whether risk-based Conditional Access policies are configured. These use Microsoft's machine learning to detect anomalous sign-in patterns — impossible travel, unfamiliar locations, known-malicious IPs — and respond by requiring MFA or blocking access entirely. Navigate to Protection > Conditional Access and look for policies with sign-in risk or user risk conditions. If you have P2 licenses and these policies aren't configured, you're paying for a capability you're not using.

What good looks like.

MFA enforced via Conditional Access for all users. Legacy authentication blocked by policy. Two to four Global Admins with a documented break-glass account. Admin accounts separated from daily-use accounts. SSPR enabled with MFA verification. Risk-based policies active if licensed for P2.

Domain 2: Email Security

Email remains the primary attack vector for most organizations, and Microsoft 365 email security involves more layers than most teams realize. The default settings are not sufficient for any organization handling sensitive data, and even organizations that have made changes often have gaps in their configuration.

DKIM and DMARC

Start with the fundamentals. DKIM (DomainKeys Identified Mail) cryptographically signs your outgoing email so receiving servers can verify the message actually originated from your domain. Check whether DKIM is enabled for every sending domain in your tenant by navigating to Microsoft 365 Defender > Email & collaboration > Policies & rules > Threat policies > DKIM. Each domain should show DKIM signing as enabled with valid CNAME records published in DNS.

Then check DMARC. This is a DNS record that tells receiving servers what to do when a message fails SPF and DKIM checks. The record lives in your public DNS as a TXT record on _dmarc.yourdomain.com. What matters is the policy directive: p=none means you're only monitoring — nothing is being blocked. p=quarantine sends failed messages to spam. p=reject tells receiving servers to drop them outright. If your DMARC policy is still set to p=none, you're collecting data but not actually protecting your domain from spoofing. That's a high finding if it's been more than 90 days since you first published the record.
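You can retrieve the record from any machine with a DNS lookup on the `_dmarc` TXT record. The policy directive is just a tag inside that string, and a few lines of code extract it reliably. This is a hypothetical helper, stdlib only, useful if you're checking many domains at once:

```python
def dmarc_policy(txt_record: str) -> str:
    """Extract the p= policy directive from a DMARC TXT record string.

    Returns "none", "quarantine", "reject", or "missing" when no
    p= tag is present (e.g. the record isn't a DMARC record at all).
    """
    for tag in txt_record.split(";"):
        name, _, value = tag.strip().partition("=")
        if name.lower() == "p":
            return value.strip().lower()
    return "missing"

policy = dmarc_policy("v=DMARC1; p=none; rua=mailto:dmarc@contoso.com")
# policy == "none": monitoring only, nothing is being blocked.
```

Run it over each sending domain's record; any result of "none" (or "missing") on a record older than 90 days goes straight into the findings sheet as High.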

Safe Links and Safe Attachments

These are Defender for Office 365 features (Plan 1 or Plan 2). Safe Links rewrites URLs in email to route through Microsoft's scanning service at click time. Safe Attachments detonates file attachments in a sandbox before delivery. Check whether policies exist under Threat policies > Safe Links and Safe Attachments. Look specifically at whether the policies apply to all users or just a subset. We frequently find organizations where Safe Links exists but only covers a pilot group that was never expanded, or where Safe Attachments is configured but the action is set to "Monitor" rather than "Block."

Auto-Forwarding Rules

External email auto-forwarding is a classic data exfiltration technique. A compromised account sets up a forwarding rule to an external address, and every email the user receives is silently copied out of the organization. Check whether outbound auto-forwarding is blocked under Exchange admin center > Mail flow > Remote domains. The default remote domain ("*") should have auto-forwarding set to disabled. Also check for existing forwarding rules by running a mail flow report or searching transport rules for any that redirect mail externally.
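If you export your mailbox list (for example, `Get-Mailbox` output with the forwarding address column), a quick scan separates external forwards from harmless internal ones. The field names below mirror that export but are illustrative; adjust them to match whatever your report actually produces:

```python
# Your accepted domains; forwards to anything else are flagged.
INTERNAL_DOMAINS = {"contoso.com", "contoso.onmicrosoft.com"}

def external_forwards(mailboxes):
    """Flag mailboxes whose forwarding address points outside the org.

    `mailboxes` mimics rows from an Exchange mailbox export with a
    ForwardingSmtpAddress column (None when no forward is set).
    """
    flagged = []
    for mb in mailboxes:
        addr = mb.get("ForwardingSmtpAddress")
        if not addr:
            continue
        # Some exports prefix the address with "smtp:"; strip it.
        domain = addr.lower().removeprefix("smtp:").split("@")[-1]
        if domain not in INTERNAL_DOMAINS:
            flagged.append(mb["UserPrincipalName"])
    return flagged

rows = [
    {"UserPrincipalName": "bob@contoso.com", "ForwardingSmtpAddress": None},
    {"UserPrincipalName": "eve@contoso.com",
     "ForwardingSmtpAddress": "smtp:attacker@gmail.com"},
]
# external_forwards(rows) flags only eve@contoso.com
```

Remember this only catches mailbox-level forwarding; inbox rules created by users (or attackers) need a separate check.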

Anti-Phishing Policies

The default anti-phishing policy in Microsoft 365 provides basic protection, but it's worth checking whether your organization has configured custom policies that go further. Look for impersonation protection — this guards against attackers spoofing the display names of your executives or your trusted domains. Check whether mailbox intelligence is enabled, which uses machine learning based on each user's communication patterns to detect phishing. If your anti-phishing configuration is just the out-of-box default with no customization, note that as a medium finding. The defaults are better than nothing, but they miss scenarios that targeted phishing exploits.

Mail Flow Bypass Rules

This one catches organizations more often than they expect. Over the years, transport rules get created to bypass spam filtering for a specific vendor, allow a scanning appliance to relay, or whitelist a partner domain. Those rules accumulate. Check Exchange admin center > Mail flow > Rules and look for any rule that modifies the spam confidence level, bypasses spam filtering, or sets the authentication-results header to pass. Each of those rules is a hole in your email security posture, and many exist long after the original need has passed.

Domain 3: Data Protection

Data protection in Microsoft 365 centers on controlling what information leaves your organization and how sensitive data is classified, labeled, and handled. This domain is where compliance requirements meet practical configuration — and where the gap between policy on paper and policy in practice tends to be widest.

Sensitivity Labels

Check the Microsoft Purview compliance portal under Information protection > Labels. Are sensitivity labels defined? Are they published to users? And most importantly, are users actually applying them? Many organizations define labels during a project, publish them, and then never follow up on adoption. If labels exist but usage reports show negligible adoption, the control is effectively not in place. If no labels are defined at all, that's a finding worth noting — it means there's no mechanism for users to classify sensitive documents, and no basis for automated DLP policies tied to classification.

Data Loss Prevention Policies

DLP policies scan content across Exchange, SharePoint, OneDrive, and Teams for patterns that match sensitive information types — Social Security numbers, credit card numbers, health records, financial data. Check Purview > Data loss prevention > Policies for active policies. Look at what sensitive information types are covered, which locations the policies apply to, and what the enforcement actions are. A common gap is having DLP policies only on Exchange but not on SharePoint or Teams, meaning a user can share a document containing SSNs through Teams without triggering any policy.

External Sharing

Navigate to the SharePoint admin center and check the sharing settings. SharePoint and OneDrive external sharing has four levels: anyone (anonymous links), new and existing guests, existing guests only, and only people in your organization. The question isn't whether external sharing should be allowed — for most organizations, it needs to be — but whether the current setting is intentional. If SharePoint is set to "Anyone" and nobody made a deliberate decision about that, it's a finding. Also check whether link expiration and link permissions (view vs. edit) are configured with sensible defaults.

Teams Guest Access

Check the Teams admin center under Users > Guest access. Guest access allows external users to participate in Teams channels and access shared files. The question is whether your guest access settings were reviewed and set intentionally, or whether they're just the defaults. Specifically, check whether guests can create and delete channels, whether they have access to shared files, and whether there's any process for reviewing stale guest accounts. An environment with guest access fully open and no review process is carrying more risk than most teams realize — every guest account is a potential entry point that exists outside your normal lifecycle management.

Assessment Finding Documentation Template

Domain    | Finding                                   | Severity | Remediation                                                | Effort
Identity  | Legacy auth not blocked by CA policy      | Critical | Create CA policy blocking legacy auth clients              | 1 hour
Email     | DMARC policy set to p=none for 6+ months  | High     | Move DMARC to p=quarantine, monitor 30 days, then p=reject | 2 hours
Endpoints | BitLocker not enforced on Windows devices | High     | Deploy BitLocker configuration profile via Intune          | 4 hours

Sample entries — your assessment should produce 20-50 findings across all twelve domains.

Domain 4: Endpoint Management

Endpoint management is the operational foundation for everything else in this assessment. If devices aren't enrolled in management, you can't enforce compliance, push security configurations, or ensure that the devices accessing your data meet any standard at all. This domain is about whether your organization has a managed device posture or a hope-based one.

Intune Enrollment

Check the Intune admin center under Devices > All devices. How many devices are enrolled? Compare that number against your expected device count. If you have 200 employees and 85 enrolled devices, you have an enrollment gap that means a substantial portion of your workforce is accessing corporate data from unmanaged devices. The enrollment rate doesn't need to be 100% — BYOD and contractor devices might reasonably be handled differently — but you should know the number and have an explanation for the gap.
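Using the example numbers above, the gap is easy to quantify for your findings sheet. A trivial sketch, with the function name our own invention:

```python
def enrollment_gap(expected: int, enrolled: int):
    """Return (coverage %, unmanaged device count) for reporting."""
    coverage = round(100 * enrolled / expected, 1)
    return coverage, expected - enrolled

coverage, unmanaged = enrollment_gap(expected=200, enrolled=85)
# coverage == 42.5 (percent), unmanaged == 115 devices
```

Record both numbers in the finding: "42.5% coverage" is a clearer statement of risk than "some devices aren't enrolled."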

Compliance Policies

Enrollment alone doesn't secure anything. Compliance policies define the minimum security bar a device must meet: OS version, encryption status, antivirus state, password requirements. Check Devices > Compliance policies and examine what's configured for each platform. A common finding is that compliance policies exist for Windows but not for iOS or Android, or that they're defined but so lenient they don't actually exclude any device. The more important check is whether compliance status is actually used as a gate — which leads directly to the next item.

Conditional Access and Device Compliance

The real power of compliance policies comes when they're tied to Conditional Access. A Conditional Access policy that requires a compliant device means that a user on an unmanaged or noncompliant device can't access Exchange, SharePoint, or Teams. Check your Conditional Access policies for any that include a "Require device to be marked as compliant" grant control. If no such policy exists, your compliance policies are informational only — they report status but don't actually block access. This is a high finding because it means the gap between your security intent and your security reality is wider than your reporting suggests.

Security Baselines

Intune provides security baselines — pre-built configuration profiles based on Microsoft's security recommendations. Check Endpoint security > Security baselines for applied baselines. The Windows security baseline covers hundreds of settings across BitLocker, Defender, firewall, credential protections, and more. If no baselines are applied and your team hasn't created equivalent custom profiles, the devices in your environment are likely running with a mix of default settings that vary by hardware manufacturer and OS build.

BitLocker Encryption

Check whether BitLocker is enforced through Intune. Navigate to Endpoint security > Disk encryption and look for a BitLocker profile. If BitLocker isn't enforced through management, individual devices may or may not have encryption enabled depending on OEM defaults and user behavior. A lost or stolen laptop without disk encryption is a data breach notification event in most regulatory frameworks. This is one of the most straightforward security controls to deploy and one of the most costly to skip.

Windows Update Management

Check whether Windows updates are managed through Intune update rings or Windows Autopatch. Under Devices > Windows > Update rings, look for configured rings with appropriate deferral periods. The alternative — relying on users to install updates whenever Windows prompts them — results in a fleet of devices at varying patch levels, some weeks or months behind on security updates. If feature updates and quality updates aren't managed centrally, note it. Unpatched devices are the easiest targets in any environment.

Domain 5: Endpoint Protection

Endpoint protection is distinct from endpoint management. Management is about enrollment, compliance, and configuration. Protection is about detecting and responding to threats on the device itself. This domain is primarily about Microsoft Defender for Endpoint and its integration with the rest of your security stack.

Defender for Endpoint Deployment

Check whether Defender for Endpoint is onboarded to your managed devices. In the Microsoft 365 Defender portal, navigate to Settings > Endpoints > Onboarding to see the deployment status. The critical question isn't just whether the agent is installed — Windows comes with Defender Antivirus by default — but whether devices are onboarded to the Defender for Endpoint service, which provides the EDR (endpoint detection and response) capability, threat analytics, and the security signal that feeds into other Microsoft security products. If devices show Defender Antivirus running but aren't onboarded to the cloud service, you have antivirus but not EDR.

Attack Surface Reduction Rules

ASR rules are a set of policies that block specific behaviors commonly exploited by malware: Office applications creating child processes, scripts executing obfuscated code, credential stealing from LSASS, and others. Check Endpoint security > Attack surface reduction in Intune. If ASR rules are not configured, you're missing a layer of protection that blocks known attack techniques before they execute. If they are configured, check the mode — many organizations deploy ASR rules in audit mode for testing and never switch them to block mode. Audit mode generates telemetry but doesn't stop anything.

Tamper Protection

Tamper protection prevents malicious actors (and sometimes well-meaning admins) from disabling Defender Antivirus, real-time protection, or cloud-delivered protection. Check the Defender portal under Settings > Endpoints > Advanced features to verify that tamper protection is enabled. When tamper protection is off, an attacker with local admin access can simply disable Defender before deploying their payload. This should be enabled in every environment without exception.

Automated Investigation and Response

Defender for Endpoint includes automated investigation capabilities that can triage alerts, collect evidence, and in some cases automatically remediate threats. Check the automation level under Settings > Endpoints > Advanced features. The options range from no automation through semi-automated (requires approval) to fully automated. Most organizations should use at least semi-automated — having every alert sit in a queue waiting for manual review means your response time is measured in hours or days rather than minutes.

Intune Compliance Integration

One of the most valuable connections in the Microsoft security stack is feeding the Defender for Endpoint risk signal into Intune compliance. This means that if Defender detects a high-risk condition on a device, that device is automatically marked noncompliant in Intune, which (if your Conditional Access policies require compliance) blocks the user's access to corporate resources until the risk is resolved. Check Intune > Endpoint security > Microsoft Defender for Endpoint to verify this connection is configured. If it's not, your EDR and your access management are operating in separate silos.

Domain 6: Application Security

Application security in the Microsoft 365 context covers how well you're protecting the services your users access and how much visibility you have into the applications connecting to your tenant.

Defender for Office 365

First, determine whether you have Defender for Office 365 and at what plan level. Plan 1 gives you Safe Links, Safe Attachments, and anti-phishing with impersonation protection. Plan 2 adds Threat Explorer, automated investigation for email, attack simulation training, and campaign views. You covered Safe Links and Safe Attachments in the email domain; here, note the plan level and, if you're licensed for Plan 2, check whether its features are actually configured. Many organizations pay for Plan 2 and only use Plan 1 capabilities.

Defender for Cloud Apps

Cloud App Security (now Defender for Cloud Apps) gives you visibility into the SaaS applications your users are connecting to — sanctioned and unsanctioned. Check whether it's configured under the Defender portal. The shadow IT discovery feature analyzes your network traffic or endpoint logs to identify which cloud services are in use. If this isn't configured, you likely have no visibility into how many unauthorized SaaS applications have access to your corporate data. Even if you're not ready to set up enforcement policies, enabling discovery alone is a valuable step.

OAuth App Consent

By default, users in Entra ID can consent to third-party applications requesting access to their data. This means any user can authorize an app to read their email, access their files, or act on their behalf — without IT approval. Check the consent settings under Entra ID > Enterprise applications > Consent and permissions. If user consent is set to allow all apps, that's a high finding. At minimum, user consent should be restricted to apps from verified publishers for low-risk permissions, with everything else requiring admin approval. Better still is to require admin consent for all apps and manage exceptions through a defined request workflow.

Enterprise App Inventory

Navigate to Entra ID > Enterprise applications > All applications and review the list. Many organizations are surprised by the number of applications registered in their tenant. Each application with assigned permissions is a potential access path to your data. Look for applications you don't recognize, applications with high-privilege permissions (like Mail.ReadWrite or Directory.ReadWrite.All), and applications that haven't been used in months. A stale, over-permissioned application registration is an attractive target for an attacker.
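Exporting the application list with granted permissions lets you triage it in bulk rather than eyeballing the portal. This sketch flags apps holding the high-privilege permissions named above; the permission set and field names are illustrative, so extend them to match your own export and risk tolerance:

```python
# Graph permissions the review above calls out as high-privilege,
# plus comparable ones; extend to match your risk tolerance.
HIGH_PRIVILEGE = {"Mail.ReadWrite", "Directory.ReadWrite.All",
                  "Files.ReadWrite.All", "Application.ReadWrite.All"}

def risky_apps(apps):
    """Flag apps (dicts with 'name' and 'permissions' keys) that
    hold at least one high-privilege permission."""
    return [a["name"] for a in apps
            if HIGH_PRIVILEGE & set(a["permissions"])]

inventory = [
    {"name": "HR Portal", "permissions": ["User.Read"]},
    {"name": "Old Sync Tool", "permissions": ["Mail.ReadWrite", "User.Read"]},
]
# risky_apps(inventory) flags only "Old Sync Tool"
```

Cross-reference the flagged apps against last-used dates: high privilege plus months of inactivity is exactly the stale, over-permissioned registration the review is looking for.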

Domain 7: Network and Infrastructure Security

This domain applies primarily to organizations with Azure IaaS resources — virtual machines, virtual networks, storage accounts, databases. If your Microsoft cloud footprint is purely Microsoft 365 with no Azure resources, you can note this domain as not applicable and move on. But if you have even a few VMs running in Azure, these checks are essential.

Network Security Groups

NSGs are the basic firewall layer for Azure networking. Every VM's network interface and every subnet should have an NSG applied. Check for any VM or subnet with no associated NSG — that resource is effectively open to the network. Then check the rules themselves. Look for inbound rules that allow traffic from "Any" source on management ports like RDP (3389) or SSH (22). Exposing RDP to the internet is one of the most commonly exploited misconfigurations in Azure, and it appears in a surprisingly large percentage of production environments.
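If you export NSG rules (the Azure CLI and portal both support export), the "management port open to Any" check can be automated across every NSG at once. The rule fields below are simplified stand-ins for what a real export contains, so treat this as a sketch to adapt:

```python
MANAGEMENT_PORTS = {"3389", "22"}  # RDP and SSH

def risky_inbound_rules(rules):
    """Flag inbound-allow rules open to any source on management ports.

    Rule dicts are simplified stand-ins for an NSG rule export:
    name, direction, access, source, destination_port.
    """
    return [
        r["name"] for r in rules
        if r["direction"] == "Inbound"
        and r["access"] == "Allow"
        and r["source"] in ("*", "Any", "Internet")
        and (r["destination_port"] in MANAGEMENT_PORTS
             or r["destination_port"] == "*")
    ]

rules = [
    {"name": "allow-rdp-any", "direction": "Inbound", "access": "Allow",
     "source": "*", "destination_port": "3389"},
    {"name": "allow-https-lb", "direction": "Inbound", "access": "Allow",
     "source": "10.0.0.0/8", "destination_port": "443"},
]
# risky_inbound_rules(rules) flags only "allow-rdp-any"
```

Note the wildcard-port case: a rule allowing all ports from Any is just as exposed as one naming 3389 explicitly, and a real check should also handle port ranges like "3000-4000".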

Firewall and Network Architecture

For environments beyond a few VMs, check whether a hub-and-spoke network architecture is in place with centralized traffic inspection. Is Azure Firewall deployed, or a third-party NVA? Is traffic between spokes routed through the hub? If your Azure network is a flat architecture with no centralized inspection point, any compromise of one workload can move laterally without detection. For smaller environments, this may not be necessary, but the architecture decision should be documented and intentional.

Storage Account Security

Check your storage accounts for public access settings. Navigate to each storage account and verify that "Allow Blob public access" is disabled unless explicitly needed. Check whether storage accounts use private endpoints or service endpoints to restrict access to your virtual network. A storage account with public access enabled and no network restrictions is accessible to anyone on the internet who knows (or guesses) the URL. Also verify that the "Secure transfer required" setting is enabled, ensuring all access uses HTTPS.

Just-In-Time VM Access

JIT access is a Defender for Cloud feature that locks down management ports on VMs and only opens them temporarily when an admin requests access. Check Defender for Cloud > Workload protections > Just-in-time VM access. If JIT is not configured, management ports are either always open (bad) or managed through static NSG rules that someone has to remember to update. JIT provides a time-limited, audited, approval-based access model that significantly reduces the attack surface of your VMs.

Diagnostic Logging

Check whether diagnostic settings are configured on your Azure resources to send logs to a central Log Analytics workspace. Activity logs, NSG flow logs, and resource-specific diagnostic logs should all be flowing to a location where they can be queried, alerted on, and retained. If diagnostic logging is not configured, you have no record of what happened on your Azure resources beyond the default 90-day activity log retention — which is often insufficient for investigation or compliance.

Domain 8: Logging and Monitoring

Logging and monitoring is the domain that determines whether you'd actually know about a security incident when it happens. Everything you've assessed so far is about prevention and configuration. This domain is about detection — and it's where many organizations have the biggest gap between their assumed posture and their real one.

Unified Audit Log

The Unified Audit Log in Microsoft 365 records user and admin activity across Exchange, SharePoint, OneDrive, Teams, Entra ID, and more. Check whether it's enabled by navigating to the Purview compliance portal under Audit and running a search. If the search returns results, auditing is on. If it returns nothing and the tenant has been active for a while, auditing may be disabled. This is a critical finding if it's off — without the audit log, you have no forensic record of what happened in your Microsoft 365 environment. No ability to investigate a compromised account. No ability to determine what data was accessed or exfiltrated.

Log Retention

By default, Microsoft 365 retains audit log data for 180 days with E3 licensing and up to one year with E5. Entra ID sign-in logs are retained for only 7 days with free Entra and 30 days with P1/P2. These defaults are usually insufficient for investigation and compliance. Check whether you have extended retention configured. If your logs are retained for only 7-30 days and an attacker was in your environment for two months before detection, you've already lost the evidence you need. Consider exporting logs to a Log Analytics workspace or SIEM for longer-term retention.
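The arithmetic here is worth making explicit, because it turns a licensing detail into a concrete finding. Using the retention defaults quoted above (confirm them against your own licensing before relying on the numbers), this sketch computes how much of an attacker's dwell time would fall outside your retained sign-in logs:

```python
# Sign-in log retention windows in days, per the defaults quoted
# above; confirm against your own licensing before relying on these.
SIGNIN_LOG_RETENTION = {"free": 7, "p1": 30, "p2": 30}

def evidence_gap(license_tier: str, dwell_time_days: int) -> int:
    """Days of attacker activity that fall outside the retained
    sign-in logs (0 means the full window is still queryable)."""
    return max(0, dwell_time_days - SIGNIN_LOG_RETENTION[license_tier])

# A 60-day dwell time on P1 licensing leaves 30 days unrecoverable.
gap = evidence_gap("p1", 60)
```

Industry dwell-time reports regularly put median intrusion durations well past 30 days, which is the practical argument for exporting logs to a workspace or SIEM with a longer window.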

Alert Policies

Microsoft 365 includes default alert policies for certain activities — suspicious email sending patterns, malware detected, eDiscovery searches. Check the alerts under Defender portal > Incidents & alerts > Alert policies. Are the default policies still active? Have any been disabled? More importantly, have custom alert policies been created for activities specific to your organization's risk profile? Activities like new inbox forwarding rules, mass file downloads, changes to Conditional Access policies, or new Global Admin role assignments should all trigger alerts.

SIEM Integration

Check whether your Microsoft 365 and Azure logs are flowing to a SIEM — whether that's Microsoft Sentinel, Splunk, or another platform. A SIEM provides correlation, alerting, and investigation capabilities that go well beyond what the native Microsoft portals offer. If there's no SIEM in place, note it. If Sentinel is deployed, check which data connectors are active. A common gap is having Sentinel deployed with only one or two connectors enabled, meaning it's receiving only a fraction of the available telemetry.

Alert Review Process

This is the question that separates operational security from security theater: is anyone actually reviewing alerts? If your organization generates alerts but nobody looks at them regularly — or if alerts go to a shared mailbox that nobody checks — the alerting infrastructure is generating noise, not security. Ask your team: who reviews alerts? How often? What's the response process when an alert fires? If the honest answer is "we look at them when we have time," that's a finding. Detection without response is just logging.

Domain 9: Privileged Access Management

Privileged accounts are the highest-value targets in any environment. This domain focuses on how those accounts are managed, monitored, and restricted.

Least Privilege Role Assignment

Review the role assignments in Entra ID > Roles and administrators. For each role with members, ask: does this person need this specific role for their job? Global Administrator is the most commonly over-assigned role, but Exchange Administrator, SharePoint Administrator, and Intune Administrator are also frequently granted to people who only need a subset of the permissions those roles provide. Microsoft provides more granular roles for most tasks — a user who manages Exchange mailboxes probably needs the Mail Recipients role, not Exchange Administrator. Document over-privileged assignments as findings.
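The review above can be made systematic with a mapping from (broad role, job function) to the narrower role that would suffice. The mapping entries below are a small illustrative sample; extend it from Microsoft's built-in role documentation for your environment.

```python
# Sketch of a least-privilege review pass; the mapping is illustrative.

GRANULAR_ALTERNATIVE = {
    # (broad role, job function) -> suggested narrower role
    ("Exchange Administrator", "manage mailboxes"): "Mail Recipients",
    ("Global Administrator", "manage users"): "User Administrator",
    ("Global Administrator", "reset passwords"): "Helpdesk Administrator",
}

def review(assignments: list[tuple[str, str, str]]) -> list[str]:
    """Each assignment is (user, role, job function); return findings."""
    findings = []
    for user, role, job in assignments:
        narrower = GRANULAR_ALTERNATIVE.get((role, job))
        if narrower:
            findings.append(f"{user}: {role} -> consider {narrower}")
    return findings

print(review([
    ("alice", "Exchange Administrator", "manage mailboxes"),
    ("bob", "Security Reader", "review reports"),
]))
# -> ['alice: Exchange Administrator -> consider Mail Recipients']
```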

Privileged Identity Management

PIM is an Entra ID P2 feature that enables just-in-time role activation. Instead of users having permanent standing access to admin roles, PIM requires them to "activate" the role when they need it, with a defined maximum duration. Check Entra ID > Identity Governance > Privileged Identity Management for configured roles. If PIM is available but not configured, your admin accounts have permanent standing privileges 24/7 — even when the admin is sleeping, on vacation, or not performing administrative work. PIM reduces the window of exposure from "always" to "only when actively needed."

Emergency Access Accounts

Every organization needs at least one emergency access (break-glass) account. This is a cloud-only Global Admin account that is excluded from Conditional Access and MFA, stored securely for use only when all other admin access fails. Check whether such an account exists, whether it's documented, and whether its credentials are stored securely (not in someone's password manager alongside their daily-use passwords). Also check whether the account has been tested recently — an emergency account you've never tested might have an expired password, a revoked session, or MFA requirements that weren't properly excluded.

Service Account Inventory

Service accounts, application registrations, and managed identities all represent non-human access to your environment. Check whether your team maintains an inventory of these accounts, what permissions each has, and when they were last reviewed. Service accounts are often created during projects, granted broad permissions for convenience, and then forgotten. A service account with Global Admin privileges and a password that hasn't been rotated in two years is a significant risk — and it appears in more environments than you'd expect.
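A minimal version of that inventory review is a staleness sweep: flag any credential past a rotation threshold. The field names below are hypothetical; populate the records from wherever your team tracks app registrations and service accounts.

```python
# Sketch of a service-account staleness check; inventory fields are hypothetical.
from datetime import date, timedelta

ROTATION_LIMIT = timedelta(days=365)

def stale_accounts(inventory: list[dict], today: date) -> list[str]:
    """Return names of accounts whose secret is older than the rotation limit."""
    return [
        acct["name"]
        for acct in inventory
        if today - acct["secret_last_rotated"] > ROTATION_LIMIT
    ]

inventory = [
    {"name": "svc-backup", "secret_last_rotated": date(2024, 1, 15)},
    {"name": "svc-hr-sync", "secret_last_rotated": date(2026, 2, 1)},
]
print(stale_accounts(inventory, today=date(2026, 4, 1)))
# -> ['svc-backup']
```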

Domain 10: Incident Response Readiness

This domain is different from the others because it's not about checking configurations in a portal. It's about evaluating whether your team is prepared to respond effectively when something goes wrong in your cloud environment.

Incident Response Plan

Does your organization have a documented incident response plan that specifically addresses cloud-based incidents? An on-premises IR plan from five years ago doesn't count — cloud incidents have different evidence sources, different containment procedures, and different communication requirements. The plan should define roles, escalation paths, communication procedures, and specific technical steps for common scenarios. If no cloud-specific IR plan exists, that's a high finding. You don't want to be writing the plan during the incident.

Containment Procedures

Ask your team: if you discovered right now that a user account was compromised, could you walk through the containment steps without looking anything up? The immediate actions should include revoking all active sessions, resetting the user's password, disabling the account, reviewing recent sign-in activity, checking for new inbox forwarding rules, reviewing OAuth app consents, and preserving audit logs. If your team needs to research these steps during a live incident, you're losing time that matters. Document the specific procedures and make sure at least two people on the team can execute them.
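The containment steps above can be captured as an ordered runbook, so the sequence is written down rather than reconstructed mid-incident. The step list is taken from the text; the tracking structure around it is a sketch.

```python
# Account-compromise containment runbook from the text, as checkable data.

CONTAINMENT_RUNBOOK = [
    "Revoke all active sessions",
    "Reset the user's password",
    "Disable the account",
    "Review recent sign-in activity",
    "Check for new inbox forwarding rules",
    "Review OAuth app consents",
    "Preserve audit logs",
]

def remaining_steps(completed: set[str]) -> list[str]:
    """Return the runbook steps not yet done, in original order."""
    return [step for step in CONTAINMENT_RUNBOOK if step not in completed]

done = {"Revoke all active sessions", "Reset the user's password"}
print(remaining_steps(done)[0])
# -> Disable the account
```

Even this trivial structure beats a wiki page nobody opens: it gives the two people who must be able to execute the procedure a fixed order to drill against.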

Forensic Readiness

When an incident occurs, you need to collect evidence quickly. Check whether your team has pre-configured access to the Unified Audit Log, Entra ID sign-in logs, and Defender telemetry. Check whether anyone on the team has used the Content Search tool in Purview to investigate what a compromised account accessed. If the first time someone uses these tools is during an active incident, the learning curve will cost you critical hours. This is where tabletop exercises prove their value — they reveal gaps in tool familiarity and process before a real event forces the issue.

Tabletop Exercises

Has your team run a tabletop exercise for a cloud-based compromise scenario? A tabletop is a structured walkthrough: present a scenario (compromised admin account, ransomware in SharePoint, BEC targeting the CFO), and have the team walk through their response step by step. It costs nothing but time, and it consistently reveals gaps that documentation reviews miss — things like "we assumed the security team would handle that, but they assumed we would," or "we need access to a portal that nobody on the team can actually log into." If you've never run a cloud-specific tabletop, schedule one. It's one of the highest-ROI security activities you can do.

The honest test.

For this domain, the assessment isn't about what's documented — it's about what your team can actually execute under pressure. If the IR plan exists but nobody has read it recently, and the containment procedures are documented but nobody has practiced them, the effective state is unpreparedness regardless of what the documentation says.

Domain 11: Backup and Recovery

This domain addresses one of the most common and most consequential misunderstandings in Microsoft 365: the assumption that Microsoft backs up your data. They don't — not in the way you need them to.

Microsoft 365 Data Backup

Microsoft provides retention and availability, not backup. Exchange Online has deleted items recovery. SharePoint has versioning and recycle bins. OneDrive has restoration windows. But none of these are a substitute for a proper backup. They don't protect against a malicious admin purging data, they have limited retention windows, and they don't give you the point-in-time recovery capability that a real backup provides. Check whether your organization uses a third-party backup solution for Microsoft 365 data — solutions like Veeam for Microsoft 365, AvePoint, Druva, or similar. If there's no backup beyond Microsoft's native retention, that's a high finding. Ransomware, malicious insiders, and accidental bulk deletions are all scenarios where you need a backup that exists outside the Microsoft 365 service itself.

Recovery Testing

If a backup solution is in place, the follow-up question is whether recovery has been tested. A backup you've never restored from is a backup you hope works. Check when the last recovery test was performed. Can you restore a single mailbox? A specific SharePoint site? A single file from a specific point in time? If recovery has never been tested, note that as a finding. The time to discover that your backup solution doesn't actually restore data correctly is during a test, not during an incident.

Azure Resource Backup

If you have Azure IaaS resources, check whether Azure Backup is configured for your virtual machines, databases, and file shares. Navigate to Azure portal > Backup center and review what's covered. Common gaps include VMs that were created outside the standard provisioning process and never added to a backup policy, or databases where the backup was configured at creation but the retention policy was never reviewed. Also check whether backup alerts are configured — if a backup job fails, does anyone know?

Domain 12: Governance and Documentation

The final domain is the one that holds everything else together over time. Technical controls degrade without governance. Configurations drift without documentation. Access accumulates without review. This domain is about the processes that maintain your security posture beyond the initial setup.

Security Policy Documentation

Check whether your organization has written security policies that are current, accessible, and actually referenced. A security policy from 2019 that nobody has read since it was approved is not functional governance — it's compliance theater. The policies should cover acceptable use, access management, data classification, incident response, and change management, at minimum. They should reflect your current environment (Microsoft 365, Azure, Entra ID), not the on-premises environment you had five years ago.

Cloud Asset Inventory

Does your team maintain an inventory of cloud resources? This includes Azure subscriptions, resource groups, virtual machines, storage accounts, databases, registered applications, and licensed services. If the answer is "it's all in the portal," that's not an inventory — that's discovery on demand. An inventory is a documented, maintained list that tells you what exists, who owns it, what its purpose is, and whether it's still needed. Without an inventory, you can't assess your attack surface, you can't identify orphaned resources, and you can't answer basic questions about your environment during an audit or incident.
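The difference between "it's all in the portal" and an inventory is that an inventory makes governance gaps a query. A sketch, with illustrative field names, of what that query looks like once every record carries an owner, a purpose, and a still-needed flag:

```python
# Sketch of an asset-inventory gap report; record fields are hypothetical.

def inventory_gaps(assets: list[dict]) -> dict[str, list[str]]:
    """Group assets by the governance problem they exhibit."""
    gaps = {"no_owner": [], "no_purpose": [], "flagged_unneeded": []}
    for a in assets:
        if not a.get("owner"):
            gaps["no_owner"].append(a["name"])
        if not a.get("purpose"):
            gaps["no_purpose"].append(a["name"])
        if a.get("still_needed") is False:
            gaps["flagged_unneeded"].append(a["name"])
    return gaps

assets = [
    {"name": "vm-legacy-app", "owner": "", "purpose": "unknown migration leftover"},
    {"name": "st-reports", "owner": "finance", "purpose": "monthly exports",
     "still_needed": False},
]
print(inventory_gaps(assets)["no_owner"])
# -> ['vm-legacy-app']
```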

User Access Reviews

Check whether your organization conducts periodic access reviews. Entra ID P2 includes access reviews as a feature — check Identity Governance > Access reviews to see if any are configured. Even without the formal feature, the question is whether anyone periodically verifies that user access is still appropriate. People change roles, leave projects, and move to different departments, but their access rarely shrinks accordingly. Without access reviews, permissions accumulate over time until most users have far more access than their current role requires.

Onboarding and Offboarding

The final check: does your onboarding and offboarding process include security steps? When a new employee joins, are they provisioned with appropriate (not excessive) permissions? When someone leaves, is their account disabled promptly, their sessions revoked, their devices wiped, and their access to shared resources removed? Check for accounts that are disabled but still licensed, or accounts that were disabled months ago but never deleted. A common finding is that offboarding handles email and laptop return but misses Azure role assignments, app registrations owned by the departing user, or shared mailbox access that persists indefinitely.
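The sweep for those common misses can be scripted once you have a directory export. The account fields below are hypothetical stand-ins for whatever your export provides; the logic is the point — disabled accounts should hold neither licenses nor role assignments.

```python
# Sketch of an offboarding-gap sweep; account fields are hypothetical.

def offboarding_findings(accounts: list[dict]) -> list[str]:
    """Flag disabled accounts that still hold licenses or Azure roles."""
    findings = []
    for a in accounts:
        if not a["enabled"] and a.get("licenses"):
            findings.append(f"{a['upn']}: disabled but still licensed")
        if not a["enabled"] and a.get("azure_roles"):
            findings.append(f"{a['upn']}: disabled but retains Azure roles")
    return findings

accounts = [
    {"upn": "former.employee@contoso.com", "enabled": False,
     "licenses": ["E3"], "azure_roles": ["Contributor"]},
    {"upn": "active.user@contoso.com", "enabled": True, "licenses": ["E3"]},
]
for finding in offboarding_findings(accounts):
    print(finding)  # two findings, both for former.employee@contoso.com
```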

Scoring Your Assessment

By this point, your tracking spreadsheet should have a substantial list of findings. The next step is making sense of that list — turning observations into priorities. Not every finding carries the same weight, and your remediation plan needs to reflect the difference between "this could lead to a full tenant compromise" and "this is a best practice we should get to eventually."

Use a four-tier severity model. The boundaries between tiers should be driven by two factors: the potential impact of the finding being exploited, and the likelihood that it will be. A configuration that an attacker can exploit remotely with no authentication is more urgent than one that requires existing internal access plus elevated privileges.

Severity Classification Framework

Critical (fix immediately; active risk of compromise): No MFA enforcement. Legacy authentication not blocked. Unified Audit Log disabled. Global Admin accounts with no MFA. RDP exposed to the internet on Azure VMs.

High (fix within 30 days; significant exposure): DMARC at p=none for 90+ days. No third-party M365 backup. BitLocker not enforced. No device compliance in Conditional Access. User consent unrestricted for OAuth apps. No incident response plan.

Medium (fix within 90 days; improvement needed): Sensitivity labels defined but low adoption. DLP policies on Exchange only (not SharePoint/Teams). ASR rules in audit mode. PIM available but not configured. No access reviews conducted.

Low (address in next planning cycle; best practice): Security policies need updating. Cloud asset inventory incomplete. Guest access policy not formally reviewed. Recovery testing not documented.

Once you've classified every finding by severity, the next decision is sequencing. Not everything with the same severity level should be tackled in the same order. Effort matters. A critical finding that takes one hour to remediate should be fixed before a critical finding that requires a two-week project. This is where a simple effort-versus-impact matrix helps you prioritize within each severity tier.

Priority Remediation Matrix — Impact vs. Effort

High Impact / Low Effort (Do First): Block legacy auth. Enable MFA via CA. Enable audit log. Enable tamper protection.

High Impact / High Effort (Plan and Schedule): Deploy third-party backup. Implement PIM. Deploy Sentinel with full connectors. Build IR plan.

Low Impact / Low Effort (Quick Wins): Update security policies. Review guest access settings. Configure alert notifications. Document break-glass accounts.

Low Impact / High Effort (Backlog): Build cloud asset inventory. Implement information barriers. Full DLP expansion across all workloads.

What to Do With Your Results

If you've worked through all twelve domains honestly, your tracking spreadsheet now contains somewhere between 20 and 50 findings. That's normal. In fact, if you ended up with fewer than 15, you probably weren't looking hard enough — or your environment is genuinely well-managed, in which case you should be documenting it as a competitive advantage.

The natural reaction to a long list of findings is to feel overwhelmed. Resist the urge to try to fix everything at once or, worse, to shelve the list because it's too large to tackle. The assessment has already done the hardest part: it's turned the vague sense of "our security could be better" into a concrete, prioritized list of specific things that need attention.

Start with the critical findings. These are the items where the risk is active and immediate — an attacker could exploit these configurations today. Block legacy authentication. Enforce MFA through Conditional Access. Enable the Unified Audit Log. These are typically low-effort, high-impact changes that can be completed in hours, not weeks. There is no reason to defer them.

For high findings, build a 30-day remediation plan. Assign each item to a specific person with a target completion date. Some of these will require change management — deploying a third-party backup solution, configuring PIM, or building an incident response plan involves stakeholder input and potentially budget approval. Start those conversations now so the 30-day timeline is realistic.

Medium and low findings should be folded into your team's quarterly planning. They represent genuine improvements, but the sky isn't falling. The goal is steady progress: reassess quarterly, tackle the next tier of findings, and demonstrate measurable improvement over time. That cadence — assess, remediate, reassess — is how mature security programs operate. It's also how you demonstrate progress to leadership in concrete terms rather than abstract security metrics.

Track your progress visibly.

Keep your assessment spreadsheet as a living document. When a finding is remediated, mark it complete with the date and the person who fixed it. When you reassess, add new findings. Over time, this spreadsheet becomes a running record of your security posture improvement — and a powerful artifact for compliance, audits, and budget conversations.

Final Takeaways

This health check is a starting point, not a destination. The twelve domains give your team a structured way to evaluate your Microsoft cloud security posture, identify what needs attention, and build a remediation plan based on actual findings rather than assumptions. The process itself has value beyond the list of findings it produces — working through it forces your team to look at configurations they may not have reviewed since the initial setup, ask questions about decisions that were made (or inherited) years ago, and confront the gap between what they assumed was in place and what actually is.

The reality for most organizations is that cloud security is not a project with a finish line. It's an ongoing discipline. Microsoft ships new features, threat actors develop new techniques, your organization changes its tools and workflows, and your security posture drifts as a result. The teams that maintain a strong posture are the ones that assess regularly, remediate deliberately, and treat security configuration as something that requires the same attention as uptime or performance. Quarterly reassessment against these twelve domains will keep your team honest about where you stand.

If your team completes this assessment and realizes the scope of work is larger than you can take on internally — whether that's the remediation itself, the deeper analysis some findings require, or the ongoing monitoring that follows — Cybernerds runs this exact process as a structured engagement. We assess across all twelve domains, deliver a prioritized findings report, and work with your team to build and execute the remediation plan.
