Virtual Desktop Infrastructure (VDI) is fairly common in customer environments, especially in today’s world where many are working from home as a result of COVID-19. As such, we want to ensure that Microsoft provides protection for VDI machines, and that you understand how Microsoft Defender Advanced Threat Protection (Microsoft Defender ATP) works within your VDI deployment. In this blog post, we’ll cover VDI, how it works with Microsoft Defender ATP, best practices, and some lessons learned.
When we talk about VDI, we often talk about two different deployment types: persistent and non-persistent. Let’s look at both of these types and explore how they interact with Microsoft Defender ATP onboarding.
Persistent VDI is a deployment type where the virtual machines (VMs) persist their state, meaning that the machine doesn’t lose its state or data when it is rebooted or shut down, or when a user logs off. In short, the persistent VDI machine behaves much like a physical machine in that local data is saved or persisted across these actions (reboot, shutdown, logoff). Onboarding a persistent VDI machine into Microsoft Defender ATP is handled the same way you would onboard a physical machine, such as a desktop or laptop. Group policy, Microsoft Endpoint Manager, and other methods can be used to onboard a persistent machine. In the Microsoft Defender Security Center (https://securitycenter.windows.com), under onboarding, you would select your preferred onboarding method and follow the instructions for that type. As you can see, onboarding persistent VDI machines really isn’t different from onboarding a physical machine or a server that is a virtual machine. For the remainder of the post we will focus on non-persistent VDI.
Non-persistent VDI is the opposite of persistent VDI. In non-persistent VDI, the virtual machine state does NOT persist across actions, such as reboot, logoff, or shutdown. Typically, when one of these actions is performed, the virtual machine is deleted. It’s helpful to understand at a high level how the non-persistent VDI model works. This will help paint a better picture of how Microsoft Defender ATP onboarding fits in.
In a non-persistent deployment type, VDI pools are typically deployed off one virtual machine commonly referred to as the VDI master, golden image, or master image. For simplicity, I’ll refer to it as the VDI master here. At an extremely high level it looks something like this:
The VDI master is a virtual machine that has Windows installed, as well as software, and any customizations. No users will ever actually log on to this machine; instead, it is used by the virtualization platform as a template to deploy VDI machines. The deployment of the VDI machines from the VDI master typically happens by building the master, installing any software and customizations, then shutting it down in order to provision multiple VDI machines into groupings called “pools.” The VDI machines that are provisioned into pools are what end users log on to.
As you can see, we have a unique scenario with non-persistent VDI. Changes are only made to the VDI master, and the VDI machines that users log on to are based off of this VDI master. But the VDI machines do not save their state and are essentially deleted after a user is done with that machine. This means at any given time two things can be happening:
- VDI machines are being spun up in the pool
- VDI machines are being deleted or de-provisioned
For specifics on this, you will want to contact your virtualization platform vendor.
To protect your VDI machines, you should onboard them into Microsoft Defender ATP. But this puts you in a bit of a chicken/egg scenario. If you somehow deploy the onboarding mechanism to the VDI machines after they are provisioned/spun up, you might have some time lapse between the time the machine came online, and when it is onboarded into Microsoft Defender ATP. This leaves a potential gap in time where the VDI machines would not be protected by Microsoft Defender ATP.
What about using the startup script via group policy? Although you can use this method, there is potential for things to go wrong here as well, such as if group policy is broken for some reason, or you run into domain controller/sysvol issues, etc. Configuration management tools such as Microsoft Endpoint Manager are typically not used on non-persistent VDI machines, mostly because anything that is installed or done to the machine with a configuration management solution is lost when the machine is deleted/de-provisioned (so all of the actions there are in vain).
What you need is a way to onboard the VDI machines at first boot as soon as they are created. This is why you have the “VDI onboarding for non-persistent machines” option in the Microsoft Defender Security Center, as shown in the following image:
Since you want the VDI machines (which are child clones of the VDI master) to be onboarded immediately at first boot, you must stage the onboarding script on the VDI master. That way, it is executed as a startup script at first boot on all of the VDI machines that are provisioned from the VDI master.
Note: This is important to understand: The placement and configuration of the VDI onboarding startup script on the VDI master is merely staging the file and configuring it as a startup script to be executed on the VDI machines. It is NOT intended to onboard the actual VDI master; in fact, you should not onboard a VDI master (more on this later).
The documentation shows that there are two different ways to configure the startup script for VDI machine onboarding:
- A single entry for each machine
- Multiple entries for each machine
This is explained in detail in the documentation, but essentially boils down to how many objects for a given VDI machine you want to see in the Microsoft Defender Security Center.
Let’s take a look at what is happening behind the scenes. The difference is specifically in the single entry for each machine configuration. With this method, we call the .ps1 directly as a startup script as mentioned in the documentation. Here is why: if you crack open the .ps1 onboarding script, you can see the following:
The .ps1 is generating a unique value called senseGuid, which is based off of a concatenation of the OrgID (pulled from the WindowsDefenderATPOnboardingScript.cmd file), the “_” character, and the computer name value. If not already present, the senseGuid value is written to the registry on the VDI machine at the path noted in the $senseGuidRegPath line. Later in the .ps1, the WindowsDefenderATPOnboardingScript.cmd file is then called. When the machine starts the SENSE service, the machineID is calculated, and its conversation with the tenant for onboarding begins. The senseGuid value produced by the .ps1 in the registry on the machine ensures the same machineID is used as long as the machine DNS name stays the same. The machineID value is also calculated as part of the onboarding process, and once it is calculated it is also written to the same path in the registry on the local machine as a value called senseID.
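To make the idea concrete, here is a minimal sketch of how a stable identifier can be derived from the OrgID and the computer name. It is not the code from the onboarding .ps1: the function names, the example OrgID, and the use of a name-based UUID (uuid5) as the hashing step are assumptions chosen for illustration, and a plain dictionary stands in for the registry.

```python
import uuid


def derive_sense_guid(org_id: str, computer_name: str) -> str:
    """Return a GUID string that depends only on the OrgID and computer name."""
    # Concatenation described in the post: OrgID, the "_" character, computer name.
    seed = f"{org_id}_{computer_name}"
    # Assumption: a name-based UUID stands in for whatever hashing the real
    # script uses to turn the seed into a GUID.
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, seed))


def ensure_sense_guid(registry: dict, org_id: str, computer_name: str) -> str:
    """Write senseGuid only if it is not already present (dict = fake registry)."""
    if "senseGuid" not in registry:
        registry["senseGuid"] = derive_sense_guid(org_id, computer_name)
    return registry["senseGuid"]


# Hypothetical values for illustration only.
first_boot = derive_sense_guid("example-org-id", "VDI-001")
after_reprovision = derive_sense_guid("example-org-id", "VDI-001")
other_machine = derive_sense_guid("example-org-id", "VDI-002")
```

The point of the sketch is that the same name always yields the same value, while a different name yields a different one, which is what lets a re-provisioned machine map back to its earlier identity.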
There you have it; this is the logic that ensures the machine only has a single entry in the portal each time onboarding is run. If you opt for the other route, the .ps1 is not used and the .cmd is called directly as a startup script, which doesn’t have this logic; therefore, multiple entries for the same machine are populated in the portal.
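The effect of the two options on the portal can be sketched with a small simulation. This is only a model under stated assumptions: the portal is a set of IDs, the single-entry path derives the ID from a hypothetical OrgID and the computer name, and the multiple-entry path is modeled as a fresh random ID on every provisioning, which is a simplification of what the service actually does.

```python
import uuid


def count_portal_entries(computer_name: str, provisions: int, single_entry: bool) -> int:
    """Count distinct portal entries after a machine is provisioned repeatedly."""
    portal = set()
    for _ in range(provisions):
        if single_entry:
            # ID depends only on the (hypothetical) OrgID and the computer name.
            machine_id = uuid.uuid5(uuid.NAMESPACE_DNS, f"example-org-id_{computer_name}")
        else:
            # Assumption: with no stable seed, each new instance gets a new ID.
            machine_id = uuid.uuid4()
        portal.add(machine_id)
    return len(portal)


single = count_portal_entries("VDI-001", provisions=3, single_entry=True)
multiple = count_portal_entries("VDI-001", provisions=3, single_entry=False)
```

With three provisioning cycles of the same machine name, the single-entry model ends up with one object and the multiple-entry model with three.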
There is more to this VDI onboarding story, such as how all of this relates to the management or servicing of the VDI Master.
VDI Master Servicing
Patching and servicing of VDI machines in a pool doesn’t happen directly on those machines. If you push software or configuration changes to them, those changes are lost at logoff/reboot/shutdown of the VDI machine. The patching/servicing happens only on the VDI master. Typically, organizations re-compose their VDI pools at least once a month to incorporate the latest Microsoft updates. At a high level, there are three approaches to servicing the VDI master:
- Reuse the existing master by powering it on, patching and updating it, then shutting it back down.
- Build a new master from scratch via automated process such as Microsoft Deployment Toolkit.
- Offline servicing (if possible).
Option 1 – Reuse the existing master
Reusing the existing master can have unintended consequences if you are not careful. Because you have staged the onboarding script on the VDI master (so that it is executed as a startup script at first boot on all of the VDI machines), the onboarding script is going to run whenever you power on the VDI master. Remember that you should never onboard the VDI master.
If you power on the VDI master, it will be onboarded (which you don’t want). The problem arises if you then apply patches/service updates, shut it down, and deploy a VDI pool from it without first offboarding the VDI master and doing some cleanup. Here is the scenario and outcome:
- Suppose that you have onboarded the VDI master simply by powering it on and it is assigned a senseGuid and a senseID. Those are written to the registry (with all the other onboarding info).
- The VDI master is patched/serviced and shut down, and a new VDI pool is deployed from it.
- Once the VDI pool is deployed, the VDI machines start to boot up.
- The VDI machines run the onboarding (because they are configured to do so), but onboarding exits because they are already onboarded.
- Since the VDI machines are basically a clone (I use this term loosely here) of the VDI master, they already have the senseID and senseGuid written to their registry along with all the other onboarding information.
- When all the VDI machines start to report their telemetry, they are reporting it under the same senseID (or machineID as it is called in the Microsoft Defender Security Center).
Having all your VDI machines reporting telemetry to the same senseID/machineID in Microsoft Defender ATP causes a number of issues. There is no delineation between VDI machines, so if something happens on one of them, you won’t actually know which one it happened on, since they all report their telemetry to the same machineID. This is a bad scenario any way you look at it.
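The scenario above can be modeled in a few lines. This is a toy simulation, not the real onboarding logic: a dictionary stands in for the registry, a random UUID stands in for the senseID that the service calculates, and the function names are hypothetical.

```python
import copy
import uuid


def onboard(registry: dict) -> None:
    """Onboarding exits early when onboarding information is already present."""
    if "senseID" in registry:
        return
    # Assumption: a random UUID stands in for the calculated senseID.
    registry["senseID"] = str(uuid.uuid4())


def provision_pool(master_registry: dict, size: int) -> list:
    """Clone the master's registry for each VDI machine, then run first boot."""
    pool = []
    for _ in range(size):
        clone = copy.deepcopy(master_registry)
        onboard(clone)  # the staged startup script runs at first boot
        pool.append(clone)
    return pool


# A master that was powered on for servicing and onboarded itself.
dirty_master = {}
onboard(dirty_master)
dirty_pool = provision_pool(dirty_master, size=5)

# A master that was never onboarded (or was offboarded and cleaned up).
clean_pool = provision_pool({}, size=5)

distinct_ids_dirty = len({machine["senseID"] for machine in dirty_pool})
distinct_ids_clean = len({machine["senseID"] for machine in clean_pool})
```

Five machines cloned from the onboarded master share a single ID, while five machines cloned from a clean master each get their own.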
How do you avoid this scenario (and you should, at all costs)? If your organization uses the method of turning on and servicing the same VDI master, then you need to perform a couple of extra steps each time.
Note: Each time you boot the VDI master for servicing/patching, make sure to run the offboarding script (downloadable from the Microsoft Defender Security Center). This will turn off the Microsoft Defender ATP sensor and remove the onboarding information from the registry. You also need to clean out the contents of the Cyber folder, because data begins to accumulate there once the machine is onboarded. Only the system account has access to perform this action. You can use PsExec to open a command prompt as system and then delete the folder contents:
PsExec.exe -s cmd.exe
cd "C:\ProgramData\Microsoft\Windows Defender Advanced Threat Protection\Cyber"
del *.* /f /s /q
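Before you shut the master down and deploy a pool from it, it can be worth confirming that the cleanup actually took effect. The following is a minimal sketch of such a check, assuming it is run with enough privileges to read the folder (per the note above, that means the system account); the function names are hypothetical and it only inspects the folder, it does not verify the registry.

```python
from pathlib import Path

# Folder path taken from the commands above.
CYBER_DIR = Path(r"C:\ProgramData\Microsoft\Windows Defender Advanced Threat Protection\Cyber")


def leftover_files(folder: Path) -> list:
    """Return every file still present under the folder, searching recursively."""
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.rglob("*") if p.is_file())


def cyber_folder_is_clean(folder: Path = CYBER_DIR) -> bool:
    """True when no files remain, which is the state 'del *.* /f /s /q' aims for."""
    return not leftover_files(folder)
```

A check like `cyber_folder_is_clean()` could be run as a final gate in your servicing procedure, so that a master with leftover data is never used to deploy a pool.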
Our documentation has been updated here to reflect these offboarding and cleanup steps. This brings us to option number two, which is to build your VDI master from scratch via an automated process each time you recompose your VDI pools.
Option 2 – Build a new master from scratch
There are a number of good reasons to consider this method. First, the issue described above can never happen, since you don’t reuse the same master over and over. You also get other benefits:
- Using a clean/fresh install of Windows each time (no WinSxS or image cleanup via DISM required)
- VDI master build automation
- VDI master software package consistency
- Automated integration of Microsoft Defender Application Control policies
- Automated integration of Microsoft Defender ATP onboarding
- Automated development and testing of your organization’s vNext VDI image
- More agile iterations between Windows 10 versions and software
- ...and more.
Building your VDI master with automation tools such as the Microsoft Deployment Toolkit (MDT) also removes a significant amount of room for human error. Customers who use this method to deploy their VDI pools have had great success with it. A sample script that can be used to stage the Microsoft Defender ATP onboarding script on your VDI master during an MDT task sequence is here.
Option 3 – Offline Servicing
This option is really only available if you’re using a Microsoft-formatted virtual hard disk, such as a VHD or VHDX. DISM can be used to service these disks offline, and this has also been added to the documentation.
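As a rough illustration of what offline servicing involves, the sketch below builds the usual DISM mount, add-package, and unmount-with-commit command lines for a virtual hard disk. It only constructs the commands and does not run them; the function name and all paths are hypothetical, and the exact steps for your environment should come from the documentation.

```python
def dism_offline_servicing_commands(vhd_path: str, mount_dir: str, package_path: str) -> list:
    """Build DISM command lines to mount a disk, add an update, and commit."""
    return [
        # Mount the virtual disk to an empty folder.
        ["dism", "/Mount-Image", f"/ImageFile:{vhd_path}", "/Index:1", f"/MountDir:{mount_dir}"],
        # Apply an update package to the mounted, offline image.
        ["dism", f"/Image:{mount_dir}", "/Add-Package", f"/PackagePath:{package_path}"],
        # Unmount and save the changes back into the disk.
        ["dism", "/Unmount-Image", f"/MountDir:{mount_dir}", "/Commit"],
    ]


# Hypothetical paths for illustration only.
commands = dism_offline_servicing_commands(
    r"D:\Images\vdi-master.vhdx", r"C:\Mount", r"D:\Updates\update.msu"
)
```

Because the master is never booted in this approach, the staged onboarding script does not get a chance to run against it.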
I hope this helps better explain Microsoft Defender ATP onboarding and servicing for non-persistent VDI machines. Let us know what you think by leaving a comment below. And stay tuned: we will talk about Microsoft Defender Antivirus settings in a non-persistent VDI environment next time!
Jesse Esquivel, Program Manager
Microsoft Defender ATP