Witness Server Boot Time, GetDagNetworkConfig and the pain of Exchange 2010 DR Tests

 

So we recently had a client who wanted to perform a DR test of their Exchange 2010 DAG.  The DAG consisted of a single, all in one server in production, and a single all in one server in DR.  The procedure for this test was to disconnect all network connectivity between prod and DR, shutdown the exchange server and the domain controller, snapshot them, and then start them back up.

Now, we can all agree that snapshots and domain controllers are inherently dangerous, so its up to you to ensure that you have your ducks in a row to ensure that this doesn’t replicate back to production.  That discussion is outside this article.

Now, initially they had trouble bringing up the databases in DR, as well as many components of the DAG.  This article will walk through an example, and try to make sense of what’s causing these issues.

So, here is our setup, we have a two node DAG cluster, stretched across two sites. 

Production

PHDC-SOAEXC01 – Prod all in one Exchange Server

PROD-DC01 – Prod domain controller

PHDC-SOADC01 – Primary witness server

DR

SFDC-SOAEXC01 – DR all in one Exchange Server

DR-DC01 – DR domain controller

SFDC-SOADC01 – Alternate witness server

The DAG name is SOA-DAG-01 and the Active Directory Sites are:

Prod = PH

DR = SF

So in our scenario, we shutdown both PHDC-SOAEXC01 and PHDC-SOADC01.  This will cause the databases in DR to dismount because quorum has been lost by the DR server.

Now, in a DR “test”, we would shutdown the DR exchange server, and the DR domain controller to snapshot them.  I just want to warn you, DO NOT EVER roll a domain controller back to a snapshot in a production environment.  This is a purely hypothetical setup.  Rant over.

Now, in our case, we have snapshotted and rebooted DR-DC01 and SFDC-SOAEXC01.  When we open the Exchange Management Console, we see that the DR servers databases is in a failed state:

image

Now, lets start running through the DR activation steps.  Here is what the process should normally be:

  1. Stop the mailbox servers in the prod site
  2. Stop the cluster service on all mailbox servers in the DR site
  3. Restore the mailbox servers in the DR site, evicting the prod servers from the cluster

After step 3, the database’s should mount, but as you will see, they wont, and I’ll try to explain why.

So, step 1, lets mark the prod servers as down:

   1: Stop-DatabaseAvailabilityGroup SOA-DAG-01 -ActiveDirectorySite PH -ConfigurationOnly

You should expect to see some errors, this is completed expected because the prod site is unable, hence the –configurationonly option:

image

Now, step 2, we will stop the clustering service on SFDC-SOAEXC01 with the powershell command:

   1: Stop-Service ClusSvc

Now, step 3, we will restore the dag with just the servers in DR:

   1: Restore-DatabaseAvailabilityGroup SOA-DAG-01 -ActiveDirectorySite SF

You may get an error stating

Server ‘PHDC-SOAEXC01’ in database availability group ‘SOA-DAG-01’ is marked to be stopped, but couldn’t be removed fro

m the cluster. Error: A server-side database availability group administrative operation failed. Error: The operation f

ailed. CreateCluster errors may result from incorrectly configured static addresses. Error: An error occurred while attempting a cluster operation. Error: Cluster API ‘"EvictClusterNodeEx(‘PHDC-SOAEXC01.SOA.corp’) failed with 0x46.

Simply re-run the command again and it should complete:

image

So now, we should have the databases mounted, and we should be able to see the prod servers as stopped by running the following command:

Get-DatabaseAvailabilityGroup -Status | FL

But, behold, we get an error stating GetDagNetworkConfig failed on the server.  Error: the NetworkManager has not yet been initialized

image

So, here is the first road block, what happened is that since the DR server is one node, it uses the boot time of the alternate file share witness to determine if it is allowed to form quorum.  This is due to a one node cluster, always having cluster, and it trying to prevent split brain.  Tim McMichael does a great job of explaining it Tim McMichael Blog Post.  Essentially the boot time is stored in the registry of the Exchange Server under:

HKEY_LOCAL_MACHINE\Software\Microsoft\ExchangeServer\v14\Replay\Parameters

The Exchange Server checks if it was rebooted more recently than the AFSW, it will not form quorum.  So how do we fix?  We can start by rebooting the AFSW to see what behavior changes.

After we do so, we can re-run:

Get-DatabaseAvailabilityGroup -Status | FL

Now, we get the network and stopped servers info, but there are some entries that are in a broken state, and we get the message that the DAG witness is in a failed state:

image

Note the WitnessServerinUse field reports InvalidConfiguration

We have to re-run our Restore-DatabaseAvailabilityGroup command to resolve this:

Restore-DatabaseAvailabilityGroup SOA-DAG-01 -ActiveDirectorySite SF

Now if we re-run Get-DatabaseAvailabilityGroup –Status | FL we get an expected output:

image

Now, we see that the WitnessShareInUse is set to the alternate.

So, are the databases mounted!? If we check, they are no longer failed, but are “Disconnected and Resyncing”

image

We need to force the server in DR to start because of the single node quorum issue.  This can be done with the following command:

Start-DatabaseAvailabilityGroup SOA-DAG-01 -ActiveDirectorySite SF

Now the database is mounted:

image

So, you can see, the testing can affect what occurs with the DR test, but also the setup with the single node cluster can cause this issue.  The boot time of the alternate file share witness is also extremely important to what the node can do when it restarts.

Hopefully you find the info useful!  Happy Holidays to all!

Advertisement
Posted in Exchange 2010, High Availability | Tagged , | Leave a comment

Follow me on Twitter

Set up the new twitter handle today finally.  Follow me @PaulPonzeka!

Posted in Uncategorized | Leave a comment

Configure Application Impersonation for Exchange 2010 in Resource Forest

 

With the new Exchange 2010 RBAC model, one of the configuration changes is regards to EWS and Application Impersonation.  Instead of defining the ACL’s directly, you configure roles for the appropriate permissions.

If your in a resource forest setup, things are a little different.  Here are the steps.

Your service account, named ServiceAccount needs to be assigned Application Impersonation rights to all the accounts in the Accounting OU.  The user accounts are in client.corp and the Exchange mailboxes are stored in exchange.corp and there is a forest trust between the two.

Step 1:

Create a Universal Security Group in client.corp named UG-ExchangeImpersation.

Step 2:

Create a new linked role group with the Application Impersonation rights bound to this group.  Run the following from an Exchange Management Shell in exchange.corp:

$remotecred = get-credential
Put in a user name of an admin account for client.corp

New-RoleGroup ROLEGROUP-ExchangeImpersonation –LinkedForeignGroup “UG-ExchangeImpersation” –LinkedDomainController DC01.client.corp –RecipientOrganizationalUnitScope ‘exchange.corp\Accounting’

Step 3:

Add serviceaccount to the UG-ExchangeImpersonation group in Client.corp.

Ensure that serviceaccount has a linked mailbox in exchange.corp.

Once AD replication finishes, you should have impersonation rights on all users in that Organizational Unit!

Posted in Uncategorized | Leave a comment

You cannot open an additional mailbox with Outlook Anywhere and Exchange 2010

 

If your running Exchange 2010 and utilize Outlook Anywhere, you may have users who complain that they cannot open an additional mailbox.  When they go to expand the additional mailbox, they get the error “Cannot Expand the subfolder” or a similar error.  The scenario is that you have two separate Active Directory sites, and you have outlook anywhere served out of each of them.  For example:

User – Paul Ponzeka

Site – NewYork

OutlookAnywhere URL – outlookanywhere-NY.company.com

User – Jon Smith

Site – SanFrancisco

OutlookAnywhere URL – outlookanywhere-SF.company.com.

When you add Jon Smith as an additional mailbox to open in Paul Ponzeka’s exchange profile, you cannot open Jon Smith’s mailbox.

This is due to a change in behavior in Exchange 2010 SP2 RU3.  The CAS servers now will try to force a user to connect to a CAS server that is in the local site of the mailbox your trying to connect to.  You can see this in the RPC log:

2012-09-10T17:02:06.140Z,92,1,/o=Company/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=Paul Ponzeka72f,,OUTLOOK.EXE,14.0.4760.1000,Classic,,,ncacn_http,,DelegateLogon,1003 (rop::UnknownUser),00:00:00.0156005,"Logon: Delegate, /o=Company/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=John Smith in database DAG01-MDB12 last mounted on SF-MBX04.company.corp at 9/6/2012 2:45:10 AM, currently Mounted",RopHandler: Logon: [RopExecutionException] The client should use Outlook Anywhere and RpcClientAccess server from site SanFrancisco to access the mailbox.. Error code = UnknownUser

The import portion of the log is:

“The client should use Outlook Anywhere and RpcClientAccess server from site SanFrancisco to access the mailbox.. Error code = UnknownUser”

What is happening is the CAS server that Pauls mailbox connects to for Outlook Anywhere has determined that the San Francisco site has a CAS server that is better suited to handle the request for Matt Williams mailbox.  However, since Paul is opening it as an additional mailbox, he cannot open the mailbox and the connection fails.  The workaround is the create the following registry entry:

 

HKLM\System\CurrentControlSet\Services\MSExchangeRPC\ParametersSystem

Create a DWORD entry named EnablePreferredSiteEnforcement and ensure the value is set to 0.  Recycle the Microsoft Exchange RPC Access Service and this should fix it. The fix is described in the following KB article from Microsoft:

http://support.microsoft.com/kb/2725008

Posted in Uncategorized | Leave a comment

Get a “You don’t have sufficient permissions. This operation can only be performed by a manager of this group.” in Exchange 2010.

 

In our environment, like most bigger organizations, we have separate teams for helpdesk, and separate teams for engineering.  Helpdesk should have rights to perform actions on certain items, such as users and distribution groups, but not on the server level.  We utilize Role Based Administration in Exchange 2010 to give our helpdesk team the ability to manage end users, but not the servers.  I recently received a ticket regarding the helpdesk guys not being able to add certain users to a distribution list, they would receive the error “You don’t have sufficient permissions. This operation can only be performed by a manager of this group.”:

 

image

There was a bug in Exchange 2010 SP1 that was fixed in RU 3 for 2010 SP1 (http://support.microsoft.com/kb/2487852), but we are running Exchange 2010 SP2 RU1, so we were way past that.

What was the issue?  Role based administration.  We had assigned the helpdesk group the “Recipient Management” role.  When I checked the group that the helpdesk tech was trying to add the user to, I noticed that the group was a Mail Universal Security Group.  The tech had no trouble adding the user to a Mail Universal Distribution Group.

We could see the issue when the tech tried to create a new distribution group.  A distribution group type worked fine, but when he tried to create a security group he got the error that “A parameter cannot be found that matches parameter name ‘Type’”:

image

So we needed to add the management role “Security Group Creation and Membership” to the group the helpdesk team was in.

Since we were running in a resource forest setup, we needed to create a new role group, and matching Universal Security Group in the management domain and add the role group to this group.  Then we could add the users we want to it.

In our management domain, aptly named management.corp we created a group called Group-HelpDesk-SecurityGroup and add our helpdesk technicians to this group.  Then in the Exchange Management Shell, as an Organization Admin we run:

$remotecred = get-credential

This will cause a windows pop up box, where we need to enter our security credentials for management.corp for later.

Then run:

New-RoleGroup “NameofRoleGroup” –LinkedForeignGroup Group-Helpdesk-SecurityGroup –LinkedDomainController DC01.management.corp –LinkedCredentials $remotecred –Roles “Security Group Creation and Membership”

Now, have the helpdesk technicians close and reopen their Exchange Consoles or Exchange Management Shell and they should be able to add the group members to distribution lists, as well as security groups.

ADDED BONUS:

If your wondering how to create a linked group and assign it to a role, follow the below.  In this case we want to add a group called REMOTE-ORGMGMT to the role “Organization Management” in Exchange 2010.

Create group in management.corp called “REMOTE-ORGMGMT” and add the admins to this group that you want to have this right.  In the Exchange Management Shell run:

$remotecred = get-credential

This will cause a windows pop up box, where we need to enter our security credentials for management.corp for later.

$roles = Get-RoleGroup “Organization Management”

New-RoleGroup “MANAGEMENT-OrganizationManagement” –LinkedForeignGroup “REMOTE-ORGMGMT” –LinkedDomainController DC01.management.corp –LinkedCredentials $remotecred –roles $roles.roles

Then have the Management.corp admin users close and re-open their Exchange Management Consoles and you should be all set.

Posted in Role Based Administration | Tagged , | Leave a comment

Cisco Call Manager and Exchange 2010 Unified Messaging–Fast Busy Signal When Calling Voicemail

 

In this article I spoke about how to configure the Exchange 2010 side of Unified Messaging with Cisco Call Manager:

https://port25.wordpress.com/2012/07/18/cisco-call-manager-and-exchange-2010-unified-messaging/

I wanted to point out this issue in a separate article because it seems a decent amount of people are having the same issue.

After we configured Exchange 2010 UM with Cisco Call Manager, we got a fast busy signal when calling the users voicemail.  After turning up diagnostic logging on all of the Exchange UM servers, we would see the SIP connection hit:

image

But the user would get a fast busy, and we saw event ID 1006 in the event log, which stated The Unified Messaging server has ended a call with ID … because the user at the far end disconnected

image

We checked the trunk between Exchange and CCM, and it was set to the G.711 codec (as Exchange doesn’t support G.729).

After some digging inside the call manager, I changed this setting inside Cisco Call Manager:

System->Service Paramters

Select one of your call manager servers, and select the serviceCisco CallManager (Active):

Change the Default Interregion Max Audio Bit Rate from the default setting of 8 kbps (G.729) to 64 kbps (G.722, G.711).

BEFORE:

image

AFTER:

image

After this change, everything worked!

Enjoy. 

Posted in Exchange 2010, Unified Messaging | Tagged , | Leave a comment

Cisco Call Manager and Exchange 2010 Unified Messaging

 

Recently had an implementation in between Cisco Call Manager 8 and Exchange 2010 SP2 Unified Messaging.  We followed the documentation from Cisco on how to configure the Cisco Call Manager:

http://www.microsoft.com/en-us/download/details.aspx?id=18298

Essentially here is the breakdown from the cisco side:

  1. Set up a SIP trunk from Cisco Call Manager to Exchange 2010 UM Server
  2. Create a pilot identifier that points to the UM Server, in our case 1000

Trust me, this is very light, but is meant for Exchange admins who are looking at how to configure the Exchange side of things.  One thing to note, the walkthrough does not tell you of an issue you need to watch out for, which is Exchange 2010 UM only speaks over G711, not G729.  Even if your SIP trunk is configured for G711, you need to make the following change in Cisco Call Manager:

System->Service Paramters

Select one of your call manager servers, and select the service Cisco CallManager (Active):

Change the Default Interregion Max Audio Bit Rate from the default setting of 8 kbps (G.729) to 64 kbps (G.722, G.711).  Otherwise you will get a fast busy when calling your voicemail.

So, here is our configuration:

(2) Cisco Call Manager Servers [phdc-voiccm01.voice.com] and [phdc-voiccm01.voice.com]

(1)  Exchange 2010 UM Server [nysrv2-um01.exchange.corp]

Pilot Identifier = 1000

Number of digits in extension = 4

We need to create a UM Dial Plan. Navigate to Organization Configuration->Unified Messaging->UM Dial Plan->New UM Dial Plan

image

Here we set the digits to 4, the URI type to Telephone Extension, and since we are not using SIP with TLS, leave it as unsecured. Also, since we are North America, the country/region code is 1.

Click Next, and you can add nysrv2-um01 to the dial plan:

image

Open the dial plan you just created, as there are some settings we need to change:

Subscriber Access Tab

Add “1000” as Telephone number to associate:

image

Settings Tab

Change Audio Codec to G711

image

Save and close.

Next we need to add a new UM IP Gateway.

Go to Organization Configuration->Unified Messaging->UM IP Gateways->New UM IP Gateway

Create a separate UM IP Gateway for each cisco call manager, ensure to add the dial plan you created above:

image

image

image

You’ll notice also that you will get a “Default Hunt Group” listed underneath the IP Gateways if you associate it with a dial plan:

image

Creating the Dial Plan also automatically creates your UM Mailbox Policy, which you can check out at Organization Configuration->Unified Messaging->UM Mailbox Policies

image

This goes into the default pin settings for users mailbox’s, as well as features they can access.  Besides security purposes, there is not much to change here.

Next, enable a user for Unified Messaging by right clicking on his mailbox and selecting “Enable for Unified Messaging”

image

Set the Unified Messaging Mailbox Policy and set the PIN:

image

For the extension, if you have entered in the phone number in AD, it will automatically be filled in:

 image

Once it’s done, your user will get an email letting them know they have been enabled for Exchange Unified Message:

image

You should be all set!

Posted in Exchange 2010, Unified Messaging | Tagged , | 3 Comments

Users Are Unable to Use Activesync After Migration from Exchange 2007 to Exchange 2010

 

At a recent customer, we ran into an issue where a set of users were migrated from Exchange 2007 to Exchange 2010.  All of the users activesync worked without issue, but one user was unable to connect.  No matter what we tried, he would get”unable to connect to server” on his phone.  We checked the activesync logs, would see an initial connection but then nothing else.

Checking the event logs of one of the CAS servers, we found error event ID 1053: “Exchange Activesync doesn’t have sufficient permissions to create the container under Active Directory User”Untitled

So I opened Active Directory Users and Computers, selected View-Advanced Features:

image

Then I opened the user account, went to to the security tab->;Advanced:

23

Here, the “Include inheritable permissions from this objects parent” was UNCHECKED:

admin

I checked this box, hit apply, and boom active sync started working. Since this account was not a domain admin and just a standard user account, this was unexpected.

Posted in ActiveSync, exchange 2007, Exchange 2010, Threat Management Gateway | Tagged | 2 Comments

Configuration Exchange 2010 DAG Replication for use with WAN Acceleration

 

If you using WAN acceleration devices such as Silverpeak’s, Riverbed’s or Citrix Branch Repeater’s, and are sending Exchange 2010 replication traffic through them, there are some changes you should make to ensure that you are getting the best utilization out of these devices. 

By default, Exchange 2010 comes set with NetworkCompression and NetworkEncryption set to InterSubnetOnly

image

You can see this by running the command:

Get-DatabaseAvailablityGroup –Identity DagName –Status | FL

This means that Exchange 2010 will encrypt and compress the replication network traffic across sites.  Since the WAN accelerators cannot unencrypt the data, it cannot reduce the traffic.  If we disable these two options and let the dedicated WAN accelerators handle the reduction, we’ll get much better utilization.

You want to run the command:

Set-DatabaseAvailabilityGroup –Identity DagName –NetworkCompression disabled –NetworkEncryption disabled

Check the DAG with the Get-DatabaseAvailabilityGroup command and the settings should be changed:

image

Posted in Exchange 2010 | Tagged , | 1 Comment

Outlook Rules Are Not Working After Moving a Mailbox From Exchange 2010 to Exchange 2007

 

Recently had an issue with a customer who was in the middle of an Exchange 2007 to Exchange 2010 migration.  After moving some test users, there was a bug exposed with a separate vendor’s (not Microsoft Exchange) unified messaging system.  The UM vendor needed to apply a patch that had to be scheduled for a later date.  The bug prevented users from receiving Voicemails on their desk phones and being able to call in and check their VM’s.

As a workaround, we moved the users who had been migrated to Exchange 2010, back to Exchange 2007.  After the move, the UM worked fine, but the users rules were broken, all except for Client Side rules.

Turns out it is a bug with Exchange 2007 SP3, that is resolved in Exchange 2007 SP3 RU7 available here:

http://www.microsoft.com/en-us/download/details.aspx?id=29426

The workaround (not recommended) is to run isinteg on the specified database that contains users having the issue.  This is ONLY if you do not want to install the update.  Below is the specific KB page regarding the issue:

http://support.microsoft.com/kb/2654700.

Posted in Uncategorized | Leave a comment