| Written By: | Gary Teft |
|---|---|
| Manufacturer: | Avaya |
| Product: | Communication Manager |
| Version: | R3.x and higher |
| Patch Information: | |
| Ticket Number(s): | |
Description:
An Avaya NR-CONN alarm is raised when an IP-connected resource in one network region cannot ping an IP-connected resource in another network region, even though the two regions have been administered as connected on the IP Network-Region form.
The alarm is raised when test 1417 fails; this is the test that initiates the ping between network regions. When test 1417 runs, it chooses among the available IP resources. It usually tries to ping between CLANs or Media Gateways, but it will use other resources, up to and including IP stations.
The alarm does not indicate which ping failed or which resources were used for the ping. This document shows how to use the ECS logs to determine where the failure occurred.
Steps
First, to determine whether the alarm is still active, go into CM/SAT and run the “test failed-ip-network-region <enter>” command to see whether the ping test is still failing. If the test passes, the issue has been resolved and you can follow the normal process for closing the issue.
If the test still fails, it reports the network regions that are failing and you will need to continue troubleshooting.
Run the command “list ip-network-region monitor <enter>”. If a name is associated with the network region, record it so you can provide it to the customer. In the example below, NR 5 does not have a name but NR 6 does.
list ip-network-region monitor Page 1
IP NETWORK REGIONS MONITOR
RTCP Monitor Port Report Codec UDP Port Range
Region Name IP Address Number Period Set Min Max
5 . . . 5005 5 1 2048 3329
6 MM LAB . . . 5005 5 1 5000 5999
The next step is to determine the date and time the failure occurred. If it is a CHRONIC occurrence, you can also use these steps to determine whether the failure is always between the same network regions or whether it varies.
After accessing the server that reported the alarms, run the Linux shell command “almdisplay |more” or “almdisplay -res |more” and capture the date and time of the NR-CONN alarms. The example below shows two alarms on 1-8-2012, the first at 03:31 and the other at 18:51.
admin@cm-server1> almdisplay -res |more
ID MO Source On Bd Lvl Ack Date
1 A CMG 25 WRN N Mon Jan 09 10:13:07 EST 2012
2 NR-CONN n MIN Y Sun Jan 08 18:51:19 EST 2012
3 NR-CONN n MIN Y Sun Jan 08 03:31:24 EST 2012
4 A CMG 26 WRN N Fri Jan 06 08:50:03 EST 2012
5 CO-TRK 018V403 y MAJ Y T
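When the alarm listing is long, the NR-CONN timestamps can be pulled out of a saved copy of the almdisplay output with a short script. The following is a minimal sketch, not an Avaya tool. It assumes the alarm lines end with a date in the form shown in the listing above, and that ECS log file names begin with YYYY-MMDD (as the file names in this article's egrep example do). The names `nr_conn_times` and `log_glob` are made up for this illustration.

```python
import re
from datetime import datetime

# Sample lines modelled on the almdisplay listing in this article.
SAMPLE = """\
ID MO Source On Bd Lvl Ack Date
1 A CMG 25 WRN N Mon Jan 09 10:13:07 EST 2012
2 NR-CONN n MIN Y Sun Jan 08 18:51:19 EST 2012
3 NR-CONN n MIN Y Sun Jan 08 03:31:24 EST 2012
4 A CMG 26 WRN N Fri Jan 06 08:50:03 EST 2012
"""

# Matches "<Mon> <dd> <hh:mm:ss> <TZ> <yyyy>" at the end of an alarm line.
DATE_RE = re.compile(
    r"(?P<mon>[A-Z][a-z]{2}) (?P<day>\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"[A-Z]{2,4} (?P<year>\d{4})\s*$"
)


def nr_conn_times(text):
    """Return the timestamps of all NR-CONN alarm lines, oldest first."""
    times = []
    for line in text.splitlines():
        if "NR-CONN" not in line.split():
            continue
        m = DATE_RE.search(line)
        if m is None:
            continue  # truncated or unexpected line: skip rather than guess
        stamp = f"{m['mon']} {m['day']} {m['time']} {m['year']}"
        # %b expects English month abbreviations (the default C locale).
        times.append(datetime.strptime(stamp, "%b %d %H:%M:%S %Y"))
    return sorted(times)


def log_glob(when):
    """Build a file glob for that day's ECS logs (assumed YYYY-MMDD prefix)."""
    return when.strftime("%Y-%m%d") + "*"


if __name__ == "__main__":
    for t in nr_conn_times(SAMPLE):
        print(t.isoformat(sep=" "), log_glob(t))
```

The glob returned by `log_glob` can be used in place of a whole-year pattern to narrow the log search to the day of the alarm.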
To find out which devices were involved in the failed network region test, go to a Linux shell on the active Communication Manager server and run the following command from the ECS log directory (typically /var/log/ecs).
admin@cm-server1> egrep -i 'net_region|Internetwork|Testing|pinged' 2012*
2012-0108-000137.log:20120108:032621536:11058832:ps_mapa(31526):HIGH:[net_region:
2012-0108-000137.log: Internetwork region connectivity test failures:
2012-0108-000137.log: Testing between regions 7 and 50
2012-0108-000137.log: MG 7 pinged MG 32
2012-0108-000137.log: MG 32 pinged MG 7 ]
2012-0108-183415.log:20120108:184624053:11263338:ps_mapo(31541):HIGH:[net_region:
2012-0108-183415.log: Internetwork region connectivity test failures:
2012-0108-183415.log: Testing between regions 12 and 52
2012-0108-183415.log: MG c pinged MG 34 << MG c is hexadecimal; convert hex to decimal to identify the MG (hex c = 12, so MG 12)
2012-0108-183415.log: MG 34 pinged MG c ]
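For a chronic alarm with many log entries, the same information can be collected from a saved copy of the egrep output. The sketch below groups the lines into failures, converts the MG identifiers from hexadecimal to decimal, and counts how often each region pair fails. It assumes every MG identifier in these lines is hexadecimal (the article only confirms this for MG c), and it only recognises "MG x pinged MG y" lines. Lines for other resource types such as CLANs or IP stations may look different and are ignored here. All function names are made up for this illustration.

```python
import re
from collections import Counter

# Sample modelled on the egrep output in this article.
SAMPLE = """\
2012-0108-000137.log:20120108:032621536:11058832:ps_mapa(31526):HIGH:[net_region:
2012-0108-000137.log: Internetwork region connectivity test failures:
2012-0108-000137.log: Testing between regions 7 and 50
2012-0108-000137.log: MG 7 pinged MG 32
2012-0108-000137.log: MG 32 pinged MG 7 ]
2012-0108-183415.log:20120108:184624053:11263338:ps_mapo(31541):HIGH:[net_region:
2012-0108-183415.log: Internetwork region connectivity test failures:
2012-0108-183415.log: Testing between regions 12 and 52
2012-0108-183415.log: MG c pinged MG 34
2012-0108-183415.log: MG 34 pinged MG c ]
"""

STAMP_RE = re.compile(r":(\d{8}):(\d{6})\d*:.*\[net_region:")
REGION_RE = re.compile(r"Testing between regions (\d+) and (\d+)")
PING_RE = re.compile(r"MG ([0-9a-fA-F]+) pinged MG ([0-9a-fA-F]+)")


def mg_number(token):
    """Convert an MG identifier from the log to decimal (assumed to be hex)."""
    return int(token, 16)


def parse_failures(text):
    """Group the egrep lines into one dict per logged test failure."""
    failures = []
    current = None
    for line in text.splitlines():
        m = STAMP_RE.search(line)
        if m:  # a new "[net_region:" entry starts here
            current = {"date": m.group(1), "time": m.group(2),
                       "regions": None, "pings": []}
            failures.append(current)
            continue
        if current is None:
            continue
        m = REGION_RE.search(line)
        if m:
            current["regions"] = (int(m.group(1)), int(m.group(2)))
            continue
        m = PING_RE.search(line)
        if m:
            current["pings"].append(
                (mg_number(m.group(1)), mg_number(m.group(2))))
    return failures


def region_pair_counts(failures):
    """Count failures per region pair, to see whether the pair varies."""
    return Counter(f["regions"] for f in failures if f["regions"])


if __name__ == "__main__":
    for f in parse_failures(SAMPLE):
        print(f["date"], f["time"], f["regions"], f["pings"])
    print(region_pair_counts(parse_failures(SAMPLE)))
```

Under the hexadecimal assumption, MG 32 and MG 34 in the sample become MG 50 and MG 52, which happen to match the region numbers 50 and 52 in the "Testing between regions" lines. Treat that as an observation about this sample, not as a rule, and confirm the gateway numbers in CM before acting on them.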
The steps above show how to identify where the failure occurred. At this point, use the gathered data to test the connections and see whether the network regions can ping each other; the key is to ping in both directions. For example, you could find IP resources in NR 11 and NR 211 and then ping between them.
If you can ping successfully in both directions, the issue is resolved and
you can follow procedure in completing/closing the SR. If the ping still fails,
use network troubleshooting procedures to determine where the failure is,
and then communicate your findings to the customer.
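Because the check only counts when it succeeds in both directions, it can help to write down every ping to run and then record each result. The sketch below is only a bookkeeping aid and does not send any pings itself. The IP addresses are placeholders from the documentation address ranges, standing in for resources found in NR 11 and NR 211, and the function names are made up for this illustration.

```python
from itertools import product


def ping_plan(resources_a, resources_b):
    """List every (source, target) ping to try, in both directions."""
    forward = list(product(resources_a, resources_b))
    reverse = [(dst, src) for src, dst in forward]
    return forward + reverse


def one_way_failures(results):
    """Return pings that worked while the reverse failed or was not recorded."""
    return sorted(
        (src, dst)
        for (src, dst), ok in results.items()
        if ok and not results.get((dst, src), False)
    )


def both_ways_ok(results):
    """True only if every recorded ping succeeded and has a recorded reverse."""
    return (
        bool(results)
        and all(results.values())
        and all((dst, src) in results for src, dst in results)
    )


if __name__ == "__main__":
    nr_11 = ["192.0.2.10"]      # placeholder address for a resource in NR 11
    nr_211 = ["198.51.100.20"]  # placeholder address for a resource in NR 211
    plan = ping_plan(nr_11, nr_211)
    for src, dst in plan:
        print(f"ping from {src} to {dst}")

    # Results as recorded by the engineer after running each ping by hand.
    results = {plan[0]: True, plan[1]: False}
    print("one-way only:", one_way_failures(results))
    print("resolved:", both_ways_ok(results))
```

A result that appears in `one_way_failures` points to a problem in one direction only, which is worth noting when reporting findings to the customer.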
If this is a CHRONIC issue and the customer is not experiencing any
problems (for example, they simply do not have any IP resources in one of
the network regions), you can disable test 1417. Disabling test 1417 is not
service affecting and will most likely not cause any issues in the future.
However, the test is designed to identify communication issues and
disable/reroute calls through other network regions to complete the
call/connection, so disable it ONLY as a last resort.
Manufacturer Release notes: