Sunday, October 19, 2008

Problem with Autodiscover on Exchange 2007 SP1

I ran into this problem with an Exchange Server 2007 SP1 implementation running on Windows Server 2008 x64. I also noticed some forum posts where people had the exact same issue on Windows Server 2003 x64. However, I cannot verify that the solution described here would help on Windows Server 2003 x64.

Although the underlying problem is with the Autodiscover web service, the visible symptoms may be Outlook 2007 crashes in Scheduling Assistant or errors in the Out of Office Assistant (OOF). Offline Address Book (OAB) downloads may also fail. When used with Outlook 2007, all of these features depend on Autodiscover, so they fail when Autodiscover has a problem.

In my case, users noticed that they could not "Add Attendees" when creating a meeting request with Scheduling Assistant: Outlook 2007 crashed as soon as they selected an attendee from the address book. After a little investigation, I found that the Out of Office Assistant was also failing with the message "Your Out of Office settings cannot be displayed, because the server is currently unavailable". My attempts to download the Offline Address Book were unsuccessful as well: the downloads ran forever without completing and without producing an error message.

I will not list all the boring troubleshooting details here, but I want to mention that I deleted and recreated the Autodiscover virtual directory with the "Remove-AutodiscoverVirtualDirectory" and "New-AutodiscoverVirtualDirectory" cmdlets in PowerShell. These commands ran without any issues, but they did not help.

I also ran the "Test-OutlookWebServices" cmdlet on the Exchange Server and received the following error at the fourth step:

[PS] C:\Windows\System32>Test-OutlookWebServices |fl

Id      : 1003
Type    : Information
Message : About to test AutoDiscover with the e-mail address Administrator@xyz.com.

Id      : 1007
Type    : Information
Message : Testing server EX01.xyz.com with the published name https://EX01.xyz.com/EWS/Exchange.asmx & https://mail.xyz.com/EWS/Exchange.asmx.

Id      : 1019
Type    : Information
Message : Found a valid AutoDiscover service connection point. The AutoDiscover URL on this object is https://EX01.xyz.com/autodiscover/autodiscover.xml.

Id      : 1006
Type    : Information
Message : The Autodiscover service was contacted at https://EX01.xyz.com/autodiscover/autodiscover.xml.

WARNING: An unexpected error has occurred and debug information is being generated: Object reference not set to an instance of an object.
Test-OutlookWebServices : Object reference not set to an instance of an object.
At line:1 char:24 
+ Test-OutlookWebServices  <<<< |fl

Because the Outlook "Test E-mail AutoConfiguration" tool ended with a "certificate error", I also re-created the SSL certificate and enabled it for all services. Needless to say, none of these actions resolved the problem.

I finally found the actual cause and the solution on the TechNet Forums. (At http://forums.microsoft.com/TechNet/ShowPost.aspx?PageIndex=0&SiteID=17&PageID=0&PostID=3760137, to be exact.) Apparently, the problem was caused by a bug in ".NET Framework 2.0 SP2", which simply was not playing nicely with Exchange Server 2007 SP1. Even if you never explicitly choose to install .NET 2.0 Service Pack 2, it is installed as part of the ".NET Framework 3.5 SP1" setup.

With this information, I searched for .NET Framework 2.0 SP2 in Programs and Features, but failed to find it. So I uninstalled .NET Framework 3.5 SP1 instead and restarted the server. This did not help. Finally, I found that .NET 2.0 SP2 is displayed as "Update for Microsoft Windows (KB948609)" in Programs and Features on Windows Server 2008. After I uninstalled this update, all of my problems disappeared. I was able to verify that Out of Office Assistant, Scheduling Assistant and Offline Address Book downloads were fully functional. In addition, the errors disappeared from both the "Test-OutlookWebServices" test in PowerShell and the "Test E-mail AutoConfiguration" test in Outlook.

I had to spend long hours searching the web and testing every possible solution before I found the one that worked. I hope this post shortens your troubleshooting. Please use the Comments link below if you want to say something about this post.

Thursday, October 9, 2008

Service Fails to Start After Reboot - Port Conflicts

For the last couple of months, I have experienced a high number of unexpected service failures. Seemingly at random, one of the services fails to start after a reboot. Usually, the failed service belongs to a recently installed application. Every one of these incidents was caused by a conflicting TCP or UDP port: because the port number is already in use, the service fails to start.

In one of the past incidents it was the BlackBerry Router service; today it was the Internet Authentication Service (IAS). All of the port conflicts I have seen were on Windows 2003 Domain Controllers or Windows 2003 SBS. Since the troubleshooting methodology is the same in each case, I will describe how to determine which application is causing the port conflict and how to resolve the problem.

First of all, you have to know the port numbers and port type (TCP or UDP) of the failed service. For example, the BlackBerry Router service uses TCP 3101, while IAS uses UDP 1645-1646 and UDP 1812-1813. The easiest way to find the port information for a specific service is a Google search. For the BlackBerry Router service, typing “blackberry router service port” in the Google search bar quickly leads to port 3101.

As a second step, you need to verify the port conflict (i.e. verify that the port is already in use). You can achieve this by simply running the following command:

netstat -an | findstr :portnumber

Replace portnumber with the port number of the failed service. If the failed service uses multiple ports, repeat the command for each port. For example, for the BlackBerry Router service, use the following command:

netstat -an | findstr :3101

If the port is not in use, the command above will not return any results. This also means that your root cause is not a port conflict, and unfortunately the rest of this post will not help with your problem.

If the port is already in use, you should see a result similar to the line below.

TCP 0.0.0.0:3101 0.0.0.0:0 LISTENING
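
If you save the netstat output to a file, the same check can be scripted. The following is a minimal Python sketch, not part of the original troubleshooting; the function name port_in_use and the sample output are made up for illustration, and it assumes the usual "Proto / Local Address / Foreign Address / State" column layout of netstat -an on Windows.

```python
def port_in_use(netstat_output, port):
    """Return the netstat rows whose local address ends in exactly :port."""
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with the protocol; the second field is the
        # local address (assumed column layout, see the text above).
        if len(fields) >= 2 and fields[0] in ("TCP", "UDP"):
            local_port = fields[1].rsplit(":", 1)[-1]
            if local_port == str(port):
                hits.append(line.strip())
    return hits


# Hypothetical sample output, for illustration only.
sample = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:3101           0.0.0.0:0              LISTENING
  TCP    10.0.0.5:1025          10.0.0.9:31010         ESTABLISHED
  UDP    0.0.0.0:1645           *:*
"""

for row in port_in_use(sample, 3101):
    print(row)
```

Unlike a plain findstr :3101, this compares the whole local port number, so a connection to a remote port such as 31010 is not reported as a match.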

Alternatively, you can verify the port conflict by searching the Windows Application Log for errors, or by looking at the text-based log files of the failed service.

After verifying that the port is in use, the next step is to determine which application or process is using it. The tool for this job is TCPView from Sysinternals. You can download it at http://technet.microsoft.com/en-us/sysinternals/bb897437.aspx.

The compressed download includes two executables: TCPView.exe (the GUI version) and TCPvcon.exe (the command-line version). I strongly recommend using the command-line version because it’s faster. Simply extract TCPvcon.exe to a folder, open a command prompt, navigate to the folder, and run the command with the following syntax:

TCPvcon -a >tcpview.txt

Now, you can open tcpview.txt with your favorite text editor. Then, search the file contents for the conflicting port number(s). When you find the matching port number, you will also see the application using that port.
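
If you have several ports to check, the search can also be scripted. The following is a minimal Python sketch under stated assumptions: find_port_matches is a hypothetical helper name, and because I am not assuming any particular layout for the TCPvcon output, it simply returns each matching line together with a few lines before it, where the owning process name is likely to appear.

```python
import re


def find_port_matches(lines, ports, context=3):
    """Return (port, line_number, snippet) for each line mentioning ':port'.

    snippet is the matching line plus up to `context` lines before it.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    results = []
    for port in ports:
        # ':3101' must not be followed by another digit, so ':31010' is skipped.
        pattern = re.compile(r":%d(?!\d)" % port)
        for i, line in enumerate(lines):
            if pattern.search(line):
                start = max(0, i - context)
                results.append((port, i + 1, lines[start:i + 1]))
    return results


# Usage, assuming tcpview.txt was produced by the command above:
#
#     with open("tcpview.txt") as f:
#         for port, lineno, snippet in find_port_matches(f, [1645, 1646, 1812, 1813]):
#             print("port %d, line %d:" % (port, lineno))
#             print("\n".join(snippet))
```

The amount of context is a guess; increase it if the process name does not show up in the snippet.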

In my experience, the most common applications causing the conflict are LSASS.EXE and DNS.EXE.

LSASS.EXE
Local Security Authority Subsystem Service (LSASS) is a process in Microsoft Windows operating systems that is responsible for enforcing the security policy on the system. In addition, it is responsible for the following components:
• Local Security Authority
• Net Logon service
• Security Accounts Manager service
• LSA Server service
• Secure Sockets Layer (SSL)
• Kerberos v5 authentication protocol
• NTLM authentication protocol

By default, LSASS.EXE uses random TCP ports in the range 1024 to 65535. To restrict the port usage of this process, follow the instructions in the MS KB article below:

http://support.microsoft.com/kb/224196

After selecting a dedicated non-conflicting port for LSASS and applying the registry changes in the article, you need to restart the server.

DNS.EXE
In the past, the DNS server listened only on TCP port 53 and UDP port 53. However, this behavior was changed by a recent security patch. If you applied security update 953230 on your DNS server, you will find that it allocates 2500 UDP ports by default. This is the root cause of most port conflicts. For general information about this security patch and its side effects, see the following article:

http://support.microsoft.com/kb/953230

The problem has multiple workarounds, but if you want to restore the service and resolve the port conflict without opening a security gap, you need to define reserved ports in the registry. For the SBS 2003 version of the detailed resolution, see the following article:

http://support.microsoft.com/kb/956189

For Windows 2000/2003 servers in general, the resolution can be found at:

http://support.microsoft.com/kb/812873

As described in the articles, after the registry changes, a reboot will be required.
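
To illustrate what the reserved-port entries look like, here is a small Python sketch that turns a list of port numbers into range strings. It rests on my reading of KB812873, which you should verify against the article itself: that ReservedPorts is a multi-string value whose entries are "start-end" ranges, with a single port written as the same number twice. The function name reserved_ports_entries is made up, and the sketch only builds the strings; it does not touch the registry.

```python
def reserved_ports_entries(ports):
    """Collapse port numbers into 'start-end' strings, one per contiguous run."""
    entries = []
    run_start = prev = None
    for p in sorted(set(ports)):
        if prev is not None and p == prev + 1:
            # Still inside the current contiguous run.
            prev = p
            continue
        if run_start is not None:
            entries.append("%d-%d" % (run_start, prev))
        run_start = prev = p
    if run_start is not None:
        entries.append("%d-%d" % (run_start, prev))
    return entries


# The IAS and BlackBerry Router ports mentioned earlier in this post.
for entry in reserved_ports_entries([1645, 1646, 1812, 1813, 3101]):
    print(entry)
```

Each printed string would go on its own line of the multi-string value, assuming the format described above.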

The Official SBS Blog also has a great entry explaining some additional problems caused by DNS, along with their recommended solutions. For me, the most important one occurs at the IPSEC level, where the IPSEC service fails and the IPSEC driver enters "block mode". This is a very common issue and can only be avoided by adding a "ReservedPorts" entry for IPSEC. Please read the following post for details.


I have tried to explain some common root causes of IP port conflicts and a methodology for resolving them. I hope this post eases the pain of troubleshooting similar problems. I would also like to hear about other conflicting applications and how you resolved them. Please use the comments link if you want to share your story about a conflicting port.