Understanding Avaya Aura Media Server Survivability Settings
My recent articles have explored how Port Networks and H.248 Media Gateways invoke the survivable modes of Avaya Aura Communication Manager (CM). In this article, I describe how the newest actor, Avaya Aura Media Server (AAMS), can also activate a CM-Survivable Core (fka Enterprise Survivable Server) and CM-Survivable Remote (fka Local Survivable Processor). Throughout, I will use the generic term CM-Survivable to reference both the Survivable Core and Survivable Remote servers.
If you follow Avaya’s announcements, then I’m sure you have heard that AAMS is one of the significant enhancements introduced in Avaya Aura 7.0. Actually, AAMS is a mature product that was created at Nortel circa 2003 to provide additional capabilities to their various communication servers. Back then, it was known as the Media Application Server. Since then, it has grown to provide a plethora of services, such as software-based DSPs, many leading voice and video CODECs, announcements, text-to-speech conversion, speech recognition, and DTMF detection.
Those abilities have long been available to other formerly-Nortel products. Now they are also available to Communication Manager. Because AAMS has been around for a while, older releases are still in the field; for CM to use it, AAMS must be release 7.7 or newer.
CM accesses the services of AAMS with SIP connections. You start by defining the AAMS as a “media server.” Interestingly, communication from CM to AAMS is defined within a SIP Signaling-Group, but lacks a corresponding Trunk-group. Further, the SIP communication is directly between the two devices and must not traverse a Session Manager. Similarly, AAMS needs to be configured to communicate with CM.
As described in other articles, a CM server (either CM-Main or CM-Survivable) becomes active whenever it controls DSP resources, which happens when either an H.248 Media Gateway (MG) or a Port Network (PN) registers to the server. Because the AAMS contains DSP resources, it can also activate a CM server.
The first issue is determining which of up to 313 CM-Survivable (63 Survivable Core + 250 Survivable Remote) servers an AAMS could register to. That begins with an option on the third page of the change survivable-processor form called “Priority with respect to Media Servers.”
If no priority is assigned, then an AAMS cannot register to that CM-Survivable and make it go active. However, this setting does not prevent a PN or MG from registering to this CM-Survivable server.
The next challenge is deciding exactly which priority to assign. This requires an analysis of your network topology, an estimation of which network failures are most likely and/or most catastrophic, and a ranking of several survivability possibilities depending on how the network might fracture. That plan should drive the placement of resources such as AAMS, PNs, MGs and CM-Survivable servers. It would also suggest which priority to assign to each CM-Survivable.
If your environment contains a mix of PNs, MGs and AAMS, you will want a failover strategy that causes as many of them as possible to register to the same CM-Survivable server. That administration needs to apply to your H.323 endpoints as well.
Assignable priorities start at 2 and go up to 9999. Since CM-Main is implicitly assigned a priority of 1, it follows that larger integers mean lower priorities. By the way, priorities do not need to be assigned sequentially, so an administrator can deliberately leave numerical gaps to be filled in later.
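To make the numbering rule concrete, here is a minimal sketch in Python. It is not Avaya code; the server names and the function names are hypothetical, and the only facts taken from the text are the assignable range (2 to 9999), the implicit priority of 1 for CM-Main, and the rule that a server without a priority is not a candidate for AAMS registration.

```python
CM_MAIN_PRIORITY = 1                  # implicit priority of CM-Main, per the article
MIN_PRIORITY, MAX_PRIORITY = 2, 9999  # assignable range, per the article


def is_assignable(priority):
    """Return True if the value falls in the assignable range 2..9999."""
    return isinstance(priority, int) and MIN_PRIORITY <= priority <= MAX_PRIORITY


def preference_order(servers):
    """Order servers from most to least preferred (smallest number first).

    `servers` maps a hypothetical server name to its priority, or to None
    when no priority is assigned. Servers without a priority are left out,
    because an AAMS cannot register to them.
    """
    ranked = {name: p for name, p in servers.items() if p is not None}
    for name, p in ranked.items():
        if not is_assignable(p):
            raise ValueError(f"{name}: priority {p} is outside 2..9999")
    ranked["CM-Main"] = CM_MAIN_PRIORITY
    return sorted(ranked, key=lambda name: ranked[name])


# Gaps (10, 20, 500) are deliberate, leaving room for later additions.
example = {"SC-east": 10, "SR-branch": 500, "SR-lab": None, "SC-west": 20}
print(preference_order(example))
```

Leaving gaps such as 10, 20 and 500 means a new server can later be slotted in at, say, 15 without renumbering the others.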
Only if a priority is assigned on page 3 of the change survivable-processor form will you be able to populate the MEDIA SERVER REPORTING LIST on page 4. Effectively, this list identifies which of the potentially 250 AAMS servers could register to this CM-Survivable server.
Each AAMS needs to receive a list of all the CM-Survivable servers it might communicate with. So, CM-Main analyzes all the CM-Survivable entries and compiles a list per AAMS. I speculate that as part of its reporting mechanism, CM-Main then provides each AAMS with its custom list of CM-Survivable servers and their assigned priorities.
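The compilation step can be pictured as inverting the per-server reporting lists into one list per AAMS. The sketch below is only an illustration of that idea under my own assumptions: the dictionary layout, the server names and the function name are hypothetical, and nothing here reflects how CM-Main actually stores or transmits the data.

```python
from collections import defaultdict


def compile_lists(survivable_entries):
    """Invert per-server reporting lists into one sorted list per AAMS.

    `survivable_entries` maps a CM-Survivable name to a dict with:
      "priority"       - int, or None when no priority is assigned
      "reporting_list" - names of the AAMS allowed to register to it
    Returns a dict: AAMS name -> list of (priority, server), best first.
    """
    per_aams = defaultdict(list)
    for server, entry in survivable_entries.items():
        if entry["priority"] is None:
            continue  # no priority means no reporting list can be populated
        for aams in entry["reporting_list"]:
            per_aams[aams].append((entry["priority"], server))
    return {aams: sorted(pairs) for aams, pairs in per_aams.items()}


entries = {
    "SC-east": {"priority": 10, "reporting_list": ["aams-1", "aams-2"]},
    "SR-branch": {"priority": 500, "reporting_list": ["aams-1"]},
    "SR-lab": {"priority": None, "reporting_list": []},
}
for aams, servers in compile_lists(entries).items():
    print(aams, servers)
```

In this example aams-1 ends up with two candidate servers and aams-2 with one, while SR-lab appears on no list because it has no priority.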
Next, we need a heartbeat mechanism for AAMS to learn when CM-Main has become unavailable. AAMS periodically sends a status “report” to CM-Main that CM must promptly acknowledge. The Report Interval (RI) determines the frequency of this “heartbeat” (default 60 seconds). The Report Expiration (RE) timer (default 180 seconds) determines how long AAMS will wait for a response from CM-Main.
If the Report Expiration timer expires, the AAMS will look to its list of assigned CM-Survivable servers. It will then work its way down the list, sending status reports to each CM-Survivable until one responds. That said, the available documentation suggests that each AAMS simultaneously sends reports to all its configured CMs (CM-Main and all assigned CM-Survivable servers). Either way, when a CM-Survivable receives a report from AAMS telling it that CM-Main is down, it is effectively a registration that activates CM-Survivable.
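The timer logic can be sketched as follows. This is a simplified model of the sequential "work down the list" behavior only, not the simultaneous-report behavior the documentation hints at, and it is not AAMS code. The function names, the `responds` callback and the choice to treat the timer as expired at exactly 180 seconds are my assumptions; the two default values come from the text.

```python
REPORT_INTERVAL_S = 60     # default Report Interval (RI), per the article
REPORT_EXPIRATION_S = 180  # default Report Expiration (RE), per the article


def main_expired(last_ack_s, now_s, expiration_s=REPORT_EXPIRATION_S):
    """True once CM-Main has gone unacknowledged for the full RE window.

    Assumption: the timer counts as expired at exactly `expiration_s`.
    """
    return now_s - last_ack_s >= expiration_s


def choose_server(last_ack_s, now_s, survivable_list, responds):
    """Pick the CM server the AAMS should be reporting to.

    `survivable_list` is a list of (priority, name) pairs for this AAMS.
    `responds` is a hypothetical callback: name -> bool (server answered).
    Returns "CM-Main", a CM-Survivable name, or None if nobody answers.
    """
    if not main_expired(last_ack_s, now_s):
        return "CM-Main"
    for _priority, name in sorted(survivable_list):  # smallest number first
        if responds(name):
            return name
    return None


servers = [(500, "SR-branch"), (10, "SC-east")]
print(choose_server(0, 120, servers, lambda name: True))
print(choose_server(0, 180, servers, lambda name: name == "SR-branch"))
```

With the defaults, three report intervals fit inside one expiration window (180 / 60 = 3), so a single lost report does not by itself trigger a move to a CM-Survivable server.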
If you assign the same priority (for example ‘3’) to two or more CM-Survivable servers, then you need to make sure that each one has unique AAMS assigned to it on the Media Server Reporting List. In other words, no AAMS can be assigned a list of CM-Survivable servers with duplicated priorities.
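A quick way to audit this rule is to check each AAMS's compiled list for repeated priority values. The sketch below assumes the per-AAMS lists are available as (priority, server) pairs; that layout, the names and the function are hypothetical and are not something CM exposes in this form.

```python
from collections import Counter


def duplicated_priorities(per_aams_lists):
    """Find AAMS whose list contains the same priority more than once.

    `per_aams_lists` maps an AAMS name to a list of (priority, server).
    Returns a dict: AAMS name -> sorted list of the repeated priorities.
    An empty dict means every list satisfies the uniqueness rule.
    """
    problems = {}
    for aams, pairs in per_aams_lists.items():
        counts = Counter(priority for priority, _server in pairs)
        repeated = sorted(p for p, n in counts.items() if n > 1)
        if repeated:
            problems[aams] = repeated
    return problems


lists = {
    # Two servers share priority 3, but each AAMS sees only one of them.
    "aams-1": [(3, "SR-north"), (20, "SC-east")],
    "aams-2": [(3, "SR-south"), (20, "SC-east")],
    # Misconfigured: this AAMS sees both priority-3 servers.
    "aams-3": [(3, "SR-north"), (3, "SR-south")],
}
print(duplicated_priorities(lists))
```

Here SR-north and SR-south can both carry priority 3 as long as no single AAMS is on both reporting lists; aams-3 is the case the rule forbids.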
In a different article, I discussed how the Split Registration Prevention feature (SRPF) works with MGs. I was surprised to learn that it works the same way with AAMS devices.
Fallback to CM-Main can be invoked automatically when the ms-recovery-rule threshold is met (that is, as soon as possible, or at a particular day and time). Alternatively, fallback can be invoked manually from CM-Main with the command: enable ms-return.
Another implication of the introduction of AAMS is that it modifies a technical distinction between CM-Survivable Core (SC) and CM-Survivable Remote (SR). Previously, either a PN or an MG could register to a Survivable Core, but only an MG could register to a Survivable Remote. An AAMS can register to either a Survivable Core or a Survivable Remote. In other words, the distinction between the two types of survivable servers is now simply that a PN cannot register to a Survivable Remote server.
With the addition of AAMS in Aura 7, Avaya has introduced some fantastic features. It has also added flexibility to the survivability strategies that can be applied to CM.