  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 3507

Exchange 2010 DAG Node Down And Cannot Get It Back Up

Hello,

I am an extremely unhappy camper right now. Last week I set up a DAG for Exchange 2010 SP3 running on Windows Server 2012. It worked fine. It is set up like so:

ex01 - 172.16.1.141 - mapi network
            10.10.10.141 - replication network
MBX, CAS, HUB

ex02 - 172.16.1.66 - mapi network
            10.10.10.66 - replication network
MBX, CAS, HUB

srvr01 - 172.16.1.217 - File Witness Server
No Exchange loaded

ex01 mbx database replicated to ex02
ex02 mbx database replicated to ex01

It was working fine. I tested a switchover, which worked fine. I rebooted ex01 and it failed over to ex02 with no problem. Then everything was put back to normal.

Today I replaced a network switch to which ex01's replication NIC (10.10.10.141) was connected.
I just powered it off, replaced it, and plugged the LAN cable back into the new one.

The ex01 mbx database successfully failed over to ex02 and is running fine. HOWEVER,
in EMC there are now no network interfaces listed, where before they were there. The status of both the MAPI Network and Replication Network subnets is Unknown. In the Failover Cluster Manager console, node ex01 is DOWN. When I validate the cluster, it says the connections exist between the nodes, but it cannot ping ex01 from ex02 or ex02 from ex01 on the replication (10.0.0.0/8) subnet.

I have done everything I can think of to get this network connection working: rebooting, powering off, uninstalling the NIC and reconfiguring it. Both ex servers show node ex01 as down and no NIC info for the interfaces. I need some help. Please advise. Thanks.
Asked by: rstuemke
 
Alexander Kireev Commented:
Hello,

Could you post the output of the cmdlet "Get-DatabaseAvailabilityGroupNetwork | fl"?

Did you follow the instructions for network configuration? See Table 1 in this article:
https://www.simple-talk.com/sysadmin/exchange/exchange-2010-dag-creation-and-configuration-part-1/

The replication network adapter must have the "Register this connection’s addresses in DNS" check box cleared.
 
SreRaj Commented:
Hi,

From ex01, try to ping the gateway IP address for subnet 10.0.0.0/8 and verify that connectivity is there. Also verify that, after replacing the switch, the gateway for 10.0.0.0/8 is still connected to the switch.

Also, could you please confirm that all the IP addresses you mentioned earlier are statically configured and that no DHCP server is allocating addresses through this switch?
 
I Qasmi Commented:
You need to check the preferred network connection order on each server.
Chances are that an old or disconnected connection is set as the preferred LAN connection for network access, which can cause the failure.

Cross-check and verify that the NIC you have installed is set as the top priority.

Also open Failover Cluster Manager, go to Cluster Core Resources, and check
whether all the cluster core resources under the network are up.

If not, try bringing the resource online (right-click > Bring Online) and check again.

Check these as well:

http://blogs.technet.com/b/timmcmic/archive/2010/05/12/cluster-core-resources-fail-to-come-online-on-some-exchange-2010-database-availability-group-dag-nodes.aspx

http://workinghardinit.wordpress.com/2010/06/18/exchange-2010-dag-issue-cluster-ip-address-resource-cluster-ip-address-cannot-be-brought-online/
 
rstuemke (Author) Commented:
Thanks for all the responses.    I will answer each one in a separate post.

chestor02 -

Showing the DAG first. 1730W436 is the bad boy.

[PS] C:\>get-databaseavailabilitygroup | fl


RunspaceId                             : e44896bc-ca9d-4895-b901-12d593239662
Name                                   : CTCCDAG
Servers                                : {1730W436QPS1, 1730W50RXBX1}
WitnessServer                          : 1730wc4xmth1.calvaryspringfield.org
WitnessDirectory                       : C:\CTCCDAG Witness Directory
AlternateWitnessServer                 : 1730wcl2hyh1.calvaryspringfield.org
AlternateWitnessDirectory              : C:\CTCCDAG Witness Directory
NetworkCompression                     : InterSubnetOnly
NetworkEncryption                      : InterSubnetOnly
DatacenterActivationMode               : Off
StoppedMailboxServers                  : {}
StartedMailboxServers                  : {}
DatabaseAvailabilityGroupIpv4Addresses : {172.16.1.159}
DatabaseAvailabilityGroupIpAddresses   : {172.16.1.159}
AllowCrossSiteRpcClientAccess          : False
OperationalServers                     :
PrimaryActiveManager                   :
ServersInMaintenance                   :
ServersInDeferredRecovery              :
ThirdPartyReplication                  : Disabled
ReplicationPort                        : 0
NetworkNames                           : {}
WitnessShareInUse                      :
AdminDisplayName                       :
ExchangeVersion                        : 0.10 (14.0.100.0)
DistinguishedName                      : CN=CTCCDAG,CN=Database Availability Groups,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Calvary Temple,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=calvaryspringfield,DC=org
Identity                               : CTCCDAG
Guid                                   : 9e311866-c047-44c1-bf11-814ead816c9f
ObjectCategory                         : calvaryspringfield.org/Configuration/Schema/ms-Exch-MDB-Availability-Group
ObjectClass                            : {top, msExchMDBAvailabilityGroup}
WhenChanged                            : 6/19/2013 1:34:56 PM
WhenCreated                            : 6/13/2013 9:24:21 AM
WhenChangedUTC                         : 6/19/2013 6:34:56 PM
WhenCreatedUTC                         : 6/13/2013 2:24:21 PM
OrganizationId                         :
OriginatingServer                      : 1730W6FZNQW1.calvaryspringfield.org
IsValid                                : True

I cannot get the DAG networks to display on either server. Both return this error:

[PS] C:\>get-databaseavailabilitygroupnetwork | fl
A server-side administrative operation has failed. 'GetDagNetworkConfig' failed on the server. Error: The NetworkManager has not yet been initialized. Check the event logs to determine the cause. [Server: 1730W436QPS1.calvaryspringfield.org]
    + CategoryInfo          : NotSpecified: (0:Int32) [Get-DatabaseAvailabilityGroupNetwork], DagNetworkRpcServerException
    + FullyQualifiedErrorId : C67769A,Microsoft.Exchange.Management.SystemConfigurationTasks.GetDatabaseAvailabilityGroupNetwork
    + PSComputerName        : 1730w50rxbx1.calvaryspringfield.org


Yes, I used that same URL to set up my network. HOWEVER, there is another item that showed up (enabled) in the replication network adapter's list, called MICROSOFT FAILOVER CLUSTER VIRTUAL ADAPTER PERFORMANCE FILTER. I tried both leaving it enabled and disabling it, but it made no difference.
I went through and checked my network, and it is set up just as the URL indicates.

The Register-in-DNS box was cleared when the replication network was set up, and it remains unchecked.

From a command prompt on each node, I tried to ping the other node's 10.10.10.xxx address. Before I replaced the switch, I could ping with no problem. Now the pings time out.
I put the old switch back and tried the whole thing again, yet it still would not ping.
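
A ping timeout can also come from a firewall dropping ICMP, so a TCP check from the replication address is a useful second data point. The following is only a minimal sketch, not part of the original troubleshooting: the helper name `can_connect` is hypothetical, and it assumes the DAG uses Exchange 2010's default replication port, TCP 64327 (the DAG output above reports `ReplicationPort : 0`, which is taken here to mean "default").

```python
import socket


def can_connect(dest_ip, port, source_ip=None, timeout=3.0):
    """Return True if a TCP connection to dest_ip:port succeeds.

    If source_ip is given, the local end of the socket is bound to that
    address first, so the attempt is made from that interface's address
    (for example the replication NIC rather than the MAPI NIC).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if source_ip is not None:
            # Port 0 lets the operating system pick a free local port.
            sock.bind((source_ip, 0))
        sock.connect((dest_ip, port))
        return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False
    finally:
        sock.close()
```

Run on ex01, a call such as `can_connect("10.10.10.66", 64327, source_ip="10.10.10.141")` would test the replication path toward ex02. If it returns False while the same call over the 172.16.1.x addresses returns True, that would point at the replication path itself rather than at Exchange.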
 
rstuemke (Author) Commented:
Ok....  SreRaj,

1) There is no gateway for subnet 10.0.0.0/8. There are only two devices with 10.10.10.x addresses, and those are the two nodes on the REPLICATION NETWORK. The 10.10.10.x network shares the same switches as the 172.16.1.x network, for which there is a gateway at 172.16.1.254.

2) Confirmed all IP addresses are static.   No DHCP.
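
The reason no gateway is needed for the replication network can be illustrated with a small sketch: a destination inside the sending interface's own subnet is reached directly on the wire, and a gateway is only consulted for destinations outside it. The function name `is_on_link` is hypothetical, and the /8 and /16 prefix lengths are assumptions taken from the subnets quoted in this thread, not from the actual NIC masks.

```python
import ipaddress


def is_on_link(local_interface, destination):
    """Return True if destination lies inside the local interface's subnet.

    local_interface is an address with prefix length, e.g. "10.10.10.141/8".
    On-link destinations are reached directly, with no gateway involved.
    """
    iface = ipaddress.ip_interface(local_interface)
    return ipaddress.ip_address(destination) in iface.network


if __name__ == "__main__":
    # ex01's replication NIC talking to ex02's replication NIC: on-link.
    print(is_on_link("10.10.10.141/8", "10.10.10.66"))
    # The same NIC talking to ex02's MAPI address: not on-link.
    print(is_on_link("10.10.10.141/8", "172.16.1.66"))
```

Since both replication addresses are on-link to each other, a missing gateway by itself would not explain the ping timeouts between 10.10.10.141 and 10.10.10.66.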
 
rstuemke (Author) Commented:
Ok........  iQasmi

1) Network connection order is MAPI, then REPLICATION on both nodes.
    No old LAN connections in the list

2)  Cluster Core Resources - File Share Witness - Online
                                             Name - CTCCDAG - Online
                                             IP Address - 172.16.1.159 - Online


checking provided URLs.
 
rstuemke (Author) Commented:
Okay, reviewed the articles.     Checked the Cluster Networks.  
Cluster Network 1 - Allow Clients To Connect Thru This Network - CHECKED
Cluster Network 2 - Allow Clients To Connect Thru This Network - NOT CHECKED

Tried changing the setting on Network 2, but even though I changed it, and applied it, it would not remain set.


Somehow, the replacement of the switch "disturbed" a perfectly good DAG setup...... bummer
 
rstuemke (Author) Commented:
Here is some additional information I did not get earlier:

[PS] C:\Windows\system32>get-databaseavailabilitygroupnetwork | fl


RunspaceId         : 1710c4b6-db83-4bff-8a1d-dae65ac421f0
Name               : MAPI Network
Description        :
Subnets            : {{172.16.0.0/16,Unknown}}
Interfaces         : {}
MapiAccessEnabled  : False
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : CTCCDAG\MAPI Network
IsValid            : True

RunspaceId         : 1710c4b6-db83-4bff-8a1d-dae65ac421f0
Name               : Replication Network
Description        :
Subnets            : {{10.0.0.0/8,Unknown}}
Interfaces         : {}
MapiAccessEnabled  : False
ReplicationEnabled : True
IgnoreNetwork      : False
Identity           : CTCCDAG\Replication Network
IsValid            : True


Is MAPIACCESSENABLED : FALSE on the MAPI network a concern? I do not know how to change that.
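
One thing that can be checked offline is whether the two DAG network subnets in this output are consistent with the node addresses listed at the top of the thread. The sketch below is only an illustration: the dictionaries simply restate values quoted in this thread, and the helper names `networks_for` and `overlapping_pairs` are hypothetical.

```python
import ipaddress

# Subnets as reported by Get-DatabaseAvailabilityGroupNetwork in this thread.
DAG_NETWORKS = {
    "MAPI Network": ipaddress.ip_network("172.16.0.0/16"),
    "Replication Network": ipaddress.ip_network("10.0.0.0/8"),
}

# Node addresses as listed in the original question.
NODE_ADDRESSES = {
    "ex01": ["172.16.1.141", "10.10.10.141"],
    "ex02": ["172.16.1.66", "10.10.10.66"],
}


def networks_for(address):
    """Return the names of the DAG networks whose subnet contains address."""
    ip = ipaddress.ip_address(address)
    return [name for name, net in DAG_NETWORKS.items() if ip in net]


def overlapping_pairs(networks):
    """Return pairs of network names whose subnets overlap."""
    names = sorted(networks)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if networks[a].overlaps(networks[b])
    ]


if __name__ == "__main__":
    for node, addresses in NODE_ADDRESSES.items():
        for address in addresses:
            print(node, address, networks_for(address))
    print("overlaps:", overlapping_pairs(DAG_NETWORKS))
```

Each node address falls into exactly one DAG network and the two subnets do not overlap, so the subnet definitions look consistent with the addressing. That suggests the empty `Interfaces : {}` lists are probably not caused by a subnet mismatch, though this check cannot say what the actual cause is.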
 
rstuemke (Author) Commented:
UPDATE

Installed an Exchange 2010 MBX server on another file server running Windows Server 2012.
Set up the DAG replication network on a separate NIC as 10.10.10.135 (MAPI is 172.16.1.135).
Exchange installed successfully and the server is working fine at this point. However, it could not ping any other 10.10.10.x machine; the pings timed out. So I am wondering if there is an underlying issue with our network and the Exchange servers are just presenting the symptoms of the problem. Any help is greatly appreciated.

Tried to join this MBX server to the existing DAG.  It failed.  Timed out trying to connect to the main Exchange Server:


1730W107048721
Failed

Error:
A database availability group administrative operation failed.
Error: The operation failed. CreateCluster errors may result from incorrectly
configured static addresses.
Error: An error occurred while attempting a cluster operation.
Error: Cluster API '"AddClusterNode() (MaxPercentage=100) failed with 0x5b4.
Error: This operation returned because the timeout period expired"' failed.
[Server: 1730W50RXBX1.calvaryspringfield.org]
 
SreRaj Commented:
I see there are two interfaces on the server, each with an IP address that belongs to a different subnet. Are you connecting both of these interfaces to the same switch? If so, does the switch have a separate VLAN configured for each of these subnets, and have the required port mappings been done for these VLANs?
 
rstuemke (Author) Commented:
The different interfaces are connected to the same LAN and set of switches. There is no VLAN configured. The interfaces had previously been working on the original DAG a couple of weeks ago. At that time, I used 10.10.0.0/16 for the replication network and it was plugged into the same switches as the 172.16.1.0/24 network. It was working fine then: all members of the DAG up, all networks and network interfaces up, and my mailbox replicated on EX02. My problems started when I replaced a network switch connected to EX02's replication network. Things have never worked correctly since then. In fact, I have another open question dedicated to that problem, so I won't go into details here.
 
rstuemke (Author) Commented:
I was finally able to delete the MBX copies and remove the servers from the DAG. I worked through the problem little by little.
 
rstuemke (Author) Commented:
No one was really able to help me on this one.