Examining Avaya G430 and G450 Media Gateway Recovery
The G700 Media Gateway (MG) that Avaya introduced circa 2002 was a radical departure from the IPSI-based Port Networks (PN). While the PN was based on the older PBX architecture of the System 75, the G700 MG was built on the chassis of a Cajun Layer 3 Switch, clearly signaling Avaya’s commitment to Voice over IP. (Yes, the MG can be used as a router!)
Further, Avaya adopted the standards-based H.248 protocol for controlling the Media Gateways rather than porting over its proprietary Control Channel Message Set protocol. I learned to understand and eventually appreciate Media Gateways from Gray Tillman, a senior member of the engineering group at Avaya.
Fast forward to the present, and the newer G450 and G430 Media Gateways, sometimes called Branch Gateways, have superseded not only the Port Networks, but also the G700 in capacity and performance. What has remained constant about these H.248 Media Gateways is their process for registering to the main Communication Manager or, if they lose communication with the main, to a survivable Communication Manager.
A few of my students begin class believing that the S8300 server embedded in slot v1 and the MG chassis comprise a single, rather odd, monolithic server. I take pains to demonstrate that they are administered separately, run different software/firmware, are assigned different IP addresses, and therefore are logically separate devices. The S8300 simply shares some hardware resources with the MG, such as power and Ethernet connectivity. The best way to think about the CM software running on the S8300 server is to view it as a completely separate device.
Technically speaking, MGs register to, and exchange signaling with, a “gatekeeper.” Avaya provides two types of gatekeepers: one is the TN799 Control-LAN (CLAN) circuit pack that lives within a Port Network, and the other is the Processor Ethernet (aka PROCR) interface within CM. Starting with CM5.2, Avaya has been steadily promoting the use of Processor Ethernet over CLANs, to the point where I believe CLANs provide no value in CM7.
The MG uses TCP Keep Alive messages as heartbeats to confirm it can communicate with CM-Main. Every 15 seconds (not administrable), the MG sends the Keep Alive to CM, expecting CM to acknowledge receipt. If the MG misses three consecutive acknowledgments of the heartbeats (not administrable), it realizes that CM is not available and starts both its Primary Search and Total Search timers.
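The heartbeat rule above can be illustrated with a minimal Python sketch. The 15-second interval and the three-miss limit come from the paragraph above; the function names and the list-of-booleans input are hypothetical, and the timing is only approximate because it ignores how long the MG waits for each acknowledgment.

```python
KEEPALIVE_INTERVAL_S = 15  # from the article; not administrable
MISSED_ACK_LIMIT = 3       # from the article; not administrable


def heartbeat_that_detects_outage(acks):
    """Return the 1-based index of the heartbeat at which the MG gives up on CM.

    `acks` is a hypothetical list of booleans, one per heartbeat sent:
    True if CM acknowledged it, False if the acknowledgment was missed.
    Returns None if three consecutive misses never occur.
    """
    consecutive_misses = 0
    for index, acked in enumerate(acks, start=1):
        # Any acknowledgment resets the count; only consecutive misses matter.
        consecutive_misses = 0 if acked else consecutive_misses + 1
        if consecutive_misses == MISSED_ACK_LIMIT:
            return index
    return None


def approx_seconds_to_detect(acks):
    """Approximate elapsed time, assuming one heartbeat every 15 seconds."""
    index = heartbeat_that_detects_outage(acks)
    return None if index is None else index * KEEPALIVE_INTERVAL_S
```

In this model, an outage that begins right after an acknowledged heartbeat is noticed roughly 45 seconds later, at which point both search timers start.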
Unlike the Port Network, which is mostly configured from within Communication Manager’s System Administration Terminal, configuring an MG requires using a secure shell utility, such as PuTTY, to connect to the Media Gateway. Among the required configuration steps is inputting at least one, and as many as four, IP addresses of Gatekeepers. These addresses constitute the Media Gateway Controller (MGC) list.
Until the Primary Search Timer (default = 1 minute, range = 1 to 59 minutes) expires, the Media Gateway will repeatedly try to reconnect with CM-Main. In Avaya’s early design, there might be several CLANs that could provide a connection to CM-Main, so there was a way for the MG to try up to three of them. Now, Avaya advocates bypassing CLANs and communicating with the Processor Ethernet of CM, which means there is only one IP address for CM-Main. The Transition Point identifies how many of the Gatekeeper addresses at the top of the MGC list provide the connection to CM-Main.
As soon as the Primary Search Timer expires, but before the Total Search Timer (default = 30 minutes, range = 2 to 60 minutes) expires, the MG will repeatedly try all the addresses on the MGC list, including CM-Main. In one variation of Avaya’s new design, the last three addresses would be the Processor Ethernet of CM-Survivable servers. So, I might populate the four IP addresses like this:
- CM-Main
- CM-Survivable Core (“ESS”) #1
- CM-Survivable Core (“ESS”) #2
- CM-Survivable Remote (“LSP”)
The other variation, designed to minimize Split Registration (the topic of a future article), limits the list to two servers:
- CM-Main
- CM-Survivable Remote (“LSP”)
To populate the list of CM addresses, use the MG’s command: set mgc list ip#1 ip#2 ip#3 ip#4
To set the quantity of addresses that connect to CM-main, use the command: set reset-times transition-point x
To modify the Primary Search Timer, use the command: set reset-times primary-search xx
To modify the Total Search Timer, use the command: set reset-times total-search xx
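To see how the MGC list, the Transition Point, and the two timers interact, here is a minimal Python sketch of the search behavior as described above. It is a simplified model, not gateway firmware: the function name is hypothetical, the addresses are documentation placeholders, and the rule that the Transition Point must be at least 1 is my assumption.

```python
def addresses_to_try(mgc_list, transition_point, elapsed_min,
                     primary_search_min=1, total_search_min=30):
    """Return the gatekeeper addresses the MG would try at `elapsed_min`.

    `elapsed_min` is the number of minutes since both timers started.
    Defaults for the timers are the ones quoted in the article.
    """
    if not 1 <= len(mgc_list) <= 4:
        raise ValueError("the MGC list holds one to four addresses")
    # Assumption: at least one address must belong to CM-Main.
    if not 1 <= transition_point <= len(mgc_list):
        raise ValueError("transition point must fall within the MGC list")

    if elapsed_min < primary_search_min:
        # Primary search: only the addresses that lead to CM-Main.
        return list(mgc_list[:transition_point])
    if elapsed_min < total_search_min:
        # After primary search expires: every address, CM-Main included.
        return list(mgc_list)
    # Total Search Timer expired: the search window is over.
    return []


# Hypothetical list: CM-Main first, then three survivable servers.
mgc = ["192.0.2.10", "192.0.2.20", "192.0.2.30", "192.0.2.40"]
print(addresses_to_try(mgc, transition_point=1, elapsed_min=0.5))
print(addresses_to_try(mgc, transition_point=1, elapsed_min=5))
```

With a Transition Point of 1, only the first address is tried during the first minute; after that, all four are candidates until the Total Search Timer runs out.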
When the Total Search Timer expires, the MG reboots. Only then would any stable calls still using resources (trunks, lines, VoIP resources) within the Media Gateway be dropped. This is a significant benefit of an MG over a PN. As you may have read in another of my articles, about two minutes after the IPSI detects the outage, the IPSI reboots the PN, thereby dropping any calls using the PN’s resources. In contrast, the MG by default waits up to 30 minutes before rebooting.
If there are stable calls in progress when the MG registers to a survivable CM, the audio connection is generally preserved (some types of stable calls are dropped during the registration). However, these calls are forever orphaned from features, such as transfer or conference.
A common misconception is that the Total Search Timer starts right after the Primary Search Timer expires, but in fact, they both start at the same time. So, if the Primary Search Timer were set to 7 minutes and the Total Search Timer to 30 minutes, the MG would reboot 23 minutes after the Primary Search Timer expired.
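The arithmetic is easy to check with a short Python sketch. The ranges and defaults are the ones quoted earlier in this article; the function name is hypothetical, and the requirement that the Total Search Timer exceed the Primary Search Timer is my assumption rather than a documented rule.

```python
def reboot_timeline(primary_search_min=1, total_search_min=30):
    """Model the two timers, which both start when CM is declared unreachable."""
    if not 1 <= primary_search_min <= 59:
        raise ValueError("primary search range is 1 to 59 minutes")
    if not 2 <= total_search_min <= 60:
        raise ValueError("total search range is 2 to 60 minutes")
    # Assumption: total search must outlast primary search.
    if total_search_min <= primary_search_min:
        raise ValueError("total search must be longer than primary search")

    return {
        "primary_expires_at_min": primary_search_min,
        "reboot_at_min": total_search_min,
        # Because the timers run concurrently, subtract rather than add.
        "reboot_after_primary_expiry_min": total_search_min - primary_search_min,
    }


print(reboot_timeline(primary_search_min=7, total_search_min=30))
```

For the 7-minute and 30-minute settings, the gap between the Primary Search Timer expiring and the reboot is 23 minutes, not 30.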
Using the Standard Local Survivability (SLS) feature, the MG itself can act as a PBX of last resort. While this additional level of survivability looks fantastic on a Request for Proposal, in reality it is difficult to program.
Unlike CM-Survivable Core, or CM-Survivable Remote, SLS does not get a copy of CM’s Translations database. Instead, how SLS should behave must be programmed individually on each MG using several obscure CLI commands or using the deprecated Provisioning and Installation Manager (PIM) program. Because of the hassles in programming SLS and keeping it current, I discourage using it.
Once the Media Gateway has registered to CM-Survivable, you can force it to return to CM-Main either manually or automatically. The automatic methods require the administrator to change a Recovery Rule in CM, setting the thresholds. The threshold choices are when there are zero active calls within the MG, and/or at particular hours and days of the week.
Alternatively, the administrator can issue the command from CM’s administration terminal:
enable mg-return network-region x
or
enable mg-return all
This command temporarily overrides the recovery rules for media gateways’ return to CM-main.
For the last decade or more, Avaya has shifted its focus from IPSI-controlled Port Networks to the H.248 Media Gateways. Now, the glory days of MGs are over, coinciding with the introduction of the Avaya Aura Media Server (AAMS) in Aura 7.0. How the AAMS’ registration and recovery mechanism works will be the focus of a future article.