How to Virtualize ALL of Avaya Aura on a Single Server

Paul Leatherman is the CTO at CRI (Communication Resources, Inc.), a major Avaya partner and systems integrator. I first interviewed Paul, an Avaya veteran, at the Avaya Technology Forum last year. I spoke to him again at the Avaya Evolutions San Francisco show last week. Below is a transcript of our conversation, which focuses on CRI’s flavor of Avaya virtualization, called Integrated Server Aura:

Paul Leatherman CTO of CRI

Photo by Andres Larranaga




Fletch: You’re doing something that’s really innovative called I.S. Aura, or Integrated Server. What’s that all about?

Leatherman: So the Integrated Server is kind of taking step one of virtualization and instead of having an application require its own physical server, we started consolidating it. In particular, we’ve been doing this Integrated Server for a number of years. We’ve done point solutions like conferencing or messaging. But this is the first time we’ve taken the Avaya core infrastructure: Communication Manager, Session Manager, System Manager, AES, Presence, the whole thing. When you look at the Aura core, everything is in a single box.

Fletch: People wanted to go VMware so they could virtualize their environment, but they didn’t want to have a whole stack of servers. You guys are using VMware to bring all of this together on one piece of hardware.

Leatherman: That is correct. And then late last year when Avaya embraced the fact that they delivered the ability to not only install it but support it by Avaya, we took it that next step and said, “Let’s make it easy to consume.” So while we build this infrastructure, make it available to our direct partners. Do they want everything in that box? Do they want pieces of it? Do they want the systems integrations side of CRI to implement it? We’re very flexible on how we deliver the solution.

Fletch: So you can kind of look back at what the whole solution needs to be for a specific customer and almost do a custom build.

Leatherman: We can. We position it as the foundation piece that can have everything included. A while back, Avaya delivered something called Mid-Size Enterprise, and that was based on the System Platform scenario where they consolidated a lot of these applications. The challenge was it was a singular template where you got everything in it no matter what.

Fletch: Like it or not, it was all there.

Leatherman: Correct, no flexibility. Instead, we start out with a kind of reference architecture.  We can do the whole Aura core, but let’s say you don’t want Presence. It’s not applicable or you’re not ready for it yet. You don’t have to put that VM in, you don’t have to pay for that service, and you can customize it.

Fletch: That’s going to save on licensing fees alone by just not buying crap you don’t need. Not that it’s crap, but it’s just crap to you at that particular point in time because you don’t need that. It’s a great thing but it adds no value right now.

Leatherman: Correct, not relevant to your business at this point in time, so why have that virtual machine running or the expense of it being installed.

Fletch: But if you need it later, it’s a key code to add in.

Leatherman: Correct. Now what’s also nice with this “Keep the Flexibility” theme is let’s say you want the Aura core, and you like the fact that we consolidate it in a single server, but you have this other DevConnect partner that says, “Oh, well, my application can run on VMware, too.” Well guess what? We just allocate the resources. We can put it on the same physical box, still that single server, but I’m also leveraging the third party app that brings in that capability. We can also start expanding into other Avaya applications, let’s say, CMS, right? So, you can add CMS into there if you’re a call center. So it’s not locked down into a single architecture. Think of it more as a reference, a place to start, and we made it real simple to consume.

Fletch: You can build that environment and mold that environment and constantly mold it one way to the left or to the right, depending on how your business model changes. That’s really kind of cool. That’s going to let people get into more technology than they’ve ever been able to afford or handle before, and allow it to be flexible so it actually does something, because people don’t buy technology just to be cool. It has to have an ROI there.

Leatherman: Absolutely. Now let me ask you a question. Does it scare you a little bit that I put all of this stuff into a single server?

Fletch: I’m a little scared. What if that one box goes down? You have to have some resiliency there.

Leatherman: Yeah, the whole eggs in one basket scenario.

Fletch: Something tells me that you have a pretty cool answer for that.

Leatherman: We do, and it’s leveraging again a little bit of VMware because what if I then added a second box, so a second physical server? I use VMware to consolidate but now I start marrying that up with what Avaya brought to the table. So adding this next box, if I did have a disastrous box failure of physical hardware, well guess what, Avaya has protected me with its High Availability features, so I get the best of both worlds: consolidation from VMware, a second one for the HA, and away we go.

Fletch: It scares a lot of people, the question of what if the app goes down, but I think that with something like this, they really have a security blanket now.

Leatherman: Exactly, and but wait, there’s more.

Fletch: But wait, there’s more? (laughs) So if I order this second one right now, I get it and I just get shipping and handling?

Leatherman: (laughs) Something like that. Let’s go to the next level of redundancy or availability. What about backups? Traditionally applications come out and have their own backup scheme. This application does it this way. This one is on Windows. This one is on Linux. The way they accomplished it was application specific. In the VMware world, I can now back up the entire set, so think of it as an entire appliance where whatever applications are on there can be backed up as a virtual machine to another location, another appliance, ready to go.

Fletch: That’s pretty cool.

Leatherman: And the other nice thing about it, let’s say I had a failure. You don’t go restore. You simply hit the start button, and the application runs out of the backup.

Fletch: Wow, that’s pretty cool.

Leatherman: So I can go build the redundancy. How far do you want to go with that comfort blanket, your SLAs, or whatever you want to do?

Fletch: So where do you see this driving technology, Paul? Is there any one industry that’s really waking up that hasn’t in the past?

Leatherman: I think where this has a nice play is in two areas. One, we started out by making it a mid-market play. This is really nice, it scales up to about 5000 users. You have some gateways or something that you put on there for your phones and that kind of thing. But what we’ve also seen happen is there’s large enterprises who weren’t ready to change the world yet to go to the new technology, but they had the need for the remote work, or the Bring Your Own Device, and some of the new SIP stuff. So we can use this as sort of a gateway to the new, communicating to the old, and it’s real easy for them to consume, and they’ll start moving it in that direction.

Fletch: So I think that a lot of people were afraid of the new technology. They were afraid of the unknowns of the new technology. I don’t want to do something that’s new that’s going to potentially take my business out of service. Resiliency, security, reliability, these are key things and until you start getting that you’re not going to get the big players to buy in. Until you get the big players to buy in, you’re not going to get that paradigm shift in the industry, I think.

Leatherman: You got it. They get to taste it a little bit. They get comfortable with it. And then they’re ready to make the big scale move.

Related Articles:

Zang Serves Up a Special Delivery for Your Mom this Mother’s Day

Mother’s Day is the one day in the U.S. when the most phone calls are made. According to this cool Mother’s Day Facts site, 122 million calls are made to mothers on Mother’s Day in the United States alone. Considering there are only 85 million mothers in the U.S., Mom must be pretty busy taking calls from her multiple children, and Dad must be busy making reservations at the favorite family restaurant (Mother’s Day remains the top holiday for dining out).

To help make sure Mom gets that special call on Mother’s Day, Zang today announced a Zang-built service for those who 1) are multiple time zones away from Mom (e.g., military, working or studying abroad), 2) just want to send another thoughtful gift to Mom to let her know she’s loved, or 3) frankly, have a track record for forgetting (you know who you are). With the Zang Forget Me Not service, anyone can record a voicemail for their mom before Mother’s Day, designate the date and time the voicemail should be sent, then receive a text confirming the voicemail was delivered. The new service was created using the cloud-based Zang Comms platform as a service, which allows anyone to create communication applications and services just like Forget Me Not.

How does it work, you ask? Simple. First go to and complete four short steps:

1)  Enter your telephone number
2)  Enter recipient’s telephone number
3)  Pick the time you would like the recording to be delivered
4)  Zang Forget Me Not service will then call your phone number for you to record, review and approve your message for delivery.


Go ahead—give it a try! It’s just one more surprise you can give Mom this Mother’s Day.

Next time you visit Dubai, take public transport

With happiness being a key focus in Dubai, government agencies are looking towards contributing to the goal of raising the quality of life of customers and ensuring public happiness. These agencies are quickly realizing that the key to delivering a better and more personalized experience is technology. Using the latest services and solutions paves the way to guaranteed customer retention and loyalty.

One of the leading organizations in the area of customer care, winning multiple awards for its contact centre operations including a Hamdan bin Mohammed Smart Government Award, is the Roads & Transport Authority (RTA).

The RTA has a wide remit including Dubai’s Metro, public buses, private road vehicle registration, traffic management and more, so it has a diverse customer base negotiating Dubai’s busy transport system, with a volume of customer enquiries to match. It therefore comes as no surprise that the RTA is investing in multiple channels of communications with its customers, to improve standards of service, increase efficiency and gain valuable feedback from its users. It is also looking to technology to help improve the quality of interactions with clients and to improve overall levels of customer satisfaction and engagement. It has utilized a number of different solutions to increase its outreach to customers, and over time the focus of these efforts has evolved, to include voice communications, smart apps and multi-channel engagement.

From a projects and operational perspective, RTA has a big focus on alternative smart channels. It offers 173 smart services under nine apps that can help customers complete their transactions with the tap of a finger through the automation of the main services the authority provides. It is dedicated to opening up more channels of communication, with an omni-channel strategy that includes delivering services through channels such as self-service kiosks. At present the RTA has deployed around 16 kiosks, which offer smart services to users in RTA service centres, and in future it plans to have around 100 kiosks all over the city. The Authority has a well-established customer care line, which handles enquiries across the range of its activities, running on Avaya contact centre solutions. In 2015, the centre handled over 2.5 million calls, with over 80% of calls responded to in 20 seconds, and 90% of issues resolved in one call.

To make this possible, last year the contact centre underwent a major technology refresh, to put in place the latest generation of solutions. With Avaya Aura, RTA is now using the most recent software to increase the efficiency of the contact centre. With the aim to deliver the best possible interaction experience to transport customers, Avaya aligned with RTA’s Customer Resource Management strategy to consolidate channels and mediums into RTA’s first, best-in-class contact center to host multi-channel interactions. Among the capabilities that the new technology has enabled is an advanced Interactive Voice Response (IVR) system, which has helped to improve operations by automatically handling some of the more common customer enquiries. On New Year’s Eve the centre received some 12,000 calls, with the IVR handling one third of all enquiries.

The RTA is a pioneering example of how technology can make the difference in delivering quality to customers through the creation of a seamless and hassle-free experience. As we share the RTA’s vision of excelling in customer experiences to achieve happiness, my advice to you is that, next time you visit Dubai, remember to take public transport.

How to Prevent Media Gateway Split Registrations

Back when Avaya Aura Communication Manager 5.2 was released, I recall reading about this new capability called Split Registration Prevention Feature (SRPF). Although I studied the documentation, it wasn’t until I read Timothy Kaye’s presentation (Session 717: SIP and Business Continuity Considerations: Optimizing Avaya Aura SIP Trunk Configurations Using PE) from the 2014 IAUG convention in Dallas that I fully understood its implications.

What is a Split Registration?

First I need to explain what SRPF is all about. Imagine a fairly large branch office that has two or more H.248 Media Gateways (MG), all within the same Network Region (NR). SRPF only works for MGs within a NR and provides no benefit to MGs assigned to different NRs.

Further, imagine that the MGs provide slightly different services. For example, one MG might provide local trunks to the PSTN, and another might provide Media Module connections to analog phones. For this discussion, it does not matter what type of phones (i.e. SIP, H.323, BRI, DCP, or Analog) exist within this Network Region. During a “sunny day,” all the MGs are registered to Processor Ethernet in the CM-Main, which is in a different NR somewhere else in the network. It aids understanding if you believe that all the resources needed for calls within a NR are provided by equipment within that NR.

A “rainy day” is when CM-Main becomes unavailable, perhaps due to a power outage. When a MG’s Primary Search Timer expires, it will start working down the list trying to register with any CM configured on the Media Gateway Controller (MGC) list. All MGs should have been configured to register to the same CM-Survivable server, which by virtue of their registration to it causes CM-Survivable to become active.
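The failover just described can be sketched as a short loop. This is a simplified illustration, not Avaya gateway logic: the function `pick_controller`, the `reachable` set, and the server names are hypothetical, and the sketch ignores timer values and simply assumes the Primary Search Timer has already expired.

```python
from typing import Iterable, Optional, Set


def pick_controller(mgc_list: Iterable[str], reachable: Set[str]) -> Optional[str]:
    """Return the first controller on the MGC list that answers, or None.

    Models a gateway that walks its MGC list in order and registers with
    the first CM it can reach.
    """
    for controller in mgc_list:
        if controller in reachable:
            return controller
    return None


# Hypothetical branch: every gateway shares the same two-entry MGC list.
mgc_list = ["cm-main", "cm-survivable"]

sunny = pick_controller(mgc_list, reachable={"cm-main", "cm-survivable"})
rainy = pick_controller(mgc_list, reachable={"cm-survivable"})
print(sunny, rainy)
```

Because every gateway in the region works from the same list, they all land on the same survivable server when CM-Main is unreachable for all of them.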

Image 1

In this context a CM server is “active” if it controls one or more MGs. A more technical definition is that a CM becomes “active” when it controls DSP resources, which only happens if a MG, Port Network (PN) or Avaya Aura Media Server (AAMS) registers to the CM server.

Since all the MGs are registered to the same CM, all resources (e.g. trunks, announcements, etc.) are available to all calls. In effect, the “rainy day” system behaves the same as the “sunny day” with the exception of which CM is performing the call processing. Even if power is restored, only the CM-Survivable is active, and because no MGs are registered to CM-Main it is inactive.

In CM 5.2, SRPF was originally designed to work with splits between CM-Main and Survivable Remote (fka Local Survivable Processor) servers. In CM 6, the feature was extended to work with Survivable Core (fka Enterprise Survivable Servers) servers. To treat the two servers interchangeably, I use the generalized term “CM-Survivable.”

A “Split Registration” is where within a Network Region some of the MGs are registered to CM-Main and some are registered to a CM-Survivable. In this case only some of the resources are available to some of the phones. Specifically, the resources provided by the MGs registered to CM-Main are not available to phones controlled by CM-Survivable, and vice versa. In my example above, it is likely some of the phones within the branch office would not have access to the local trunks.
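To make the definition concrete, here is a minimal sketch that flags a split registration from a list of gateway registrations. The function name, the tuple layout, and the gateway and server names are assumptions made for illustration; CM does not expose its registration data in this form.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_split_regions(registrations: List[Tuple[str, int, str]]) -> Dict[int, List[str]]:
    """Map each split network region to the sorted CM servers its gateways use.

    Each registration is (gateway name, network region, CM server).
    A region is split when its gateways are registered to more than one CM.
    """
    servers_by_region = defaultdict(set)
    for _gateway, region, server in registrations:
        servers_by_region[region].add(server)
    return {
        region: sorted(servers)
        for region, servers in servers_by_region.items()
        if len(servers) > 1
    }


# Hypothetical data: region 10 is the branch office from the example.
registrations = [
    ("mg-trunks", 10, "cm-main"),
    ("mg-analog", 10, "cm-survivable"),
    ("mg-hq", 1, "cm-main"),
]
split = find_split_regions(registrations)
print(split)  # {10: ['cm-main', 'cm-survivable']}
```

In this made-up data, region 10 is split: the trunk gateway is on CM-Main while the analog gateway is on CM-Survivable, so the analog phones would lose access to the local trunks.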

Further, the Avaya Session Managers (ASM) would discover CM-Survivable is active. They would learn of CM-Survivable server’s new status when either ASM or CM sent a SIP OPTIONS request to the other. The ASMs then might begin inappropriately routing calls to both CM-Main and CM-Survivable. Consequently, a split registration is even more disruptive than the simple failover to a survivable CM.

What can cause split registrations? One scenario is when the “rainy day” is caused by a partial network failure. In this case some MGs, but not all, maintain their connectivity with CM-Main while the others register to CM-Survivable. Another scenario could be that all MGs failover to CM-Survivable, but then after connectivity to CM-Main has been restored some of the MGs are reset. Those MGs would then register to CM-Main.

How SRPF Functions

If the Split Registration Prevention Feature is enabled, effectively what CM-Main does is to un-register and/or reject registrations by all MGs in the NRs that have registered to CM-Survivable. In other words, it pushes the MGs to register to CM-Survivable. Thus, there is no longer a split registration.

When I learned that, my first question was how does CM-Main know that MGs have registered to CM-Survivable? The answer is that all CM-Survivable servers are constantly trying to register with CM-Main. If a CM-Survivable server is processing calls, then when it registers to CM-Main it announces that it is active. Thus, once connectivity to CM-Main is restored, CM-Main learns which CM-survivable servers are active. This is an important requirement. If CM-Main and CM-Survivable cannot communicate with each other a split registration could still occur.

My second question was how CM forces the MGs back to the CM-Survivable. What I learned was that CM-Main looks up all the NRs for which that Survivable server is administered. The list is administered under the IP network region’s “BACKUP SERVERS” heading. CM-Main then disables the NRs registered to CM-Survivable. That both blocks new registrations and terminates existing registrations of MGs and H.323 endpoints.
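The lookup-and-disable step can be modeled roughly as follows. This is a sketch of the behavior as described here, not of CM internals: `apply_srpf`, the dictionary shapes, and all server and gateway names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def apply_srpf(
    active_survivable: str,
    backup_servers: Dict[int, List[str]],
    main_registrations: Dict[str, int],
) -> Tuple[Set[int], List[str]]:
    """Return (regions CM-Main disables, gateways it un-registers).

    backup_servers: network region -> survivable servers administered for it.
    main_registrations: gateway currently registered to CM-Main -> its region.
    """
    # Find every region that lists the active survivable server as a backup.
    disabled = {
        region
        for region, servers in backup_servers.items()
        if active_survivable in servers
    }
    # Gateways in those regions lose their registration to CM-Main.
    dropped = sorted(
        gateway
        for gateway, region in main_registrations.items()
        if region in disabled
    )
    return disabled, dropped


backup_servers = {10: ["cm-survivable-a"], 20: ["cm-survivable-b"]}
main_registrations = {"mg-trunks": 10, "mg-hq": 1, "mg-branch-b": 20}

disabled, dropped = apply_srpf("cm-survivable-a", backup_servers, main_registrations)
print(disabled, dropped)  # {10} ['mg-trunks']
```

Only region 10 is affected in this example. Gateways in regions backed by a different, inactive survivable server stay registered to CM-Main.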

Image 2

Once the network issues have been fixed, with SRPF there are only manual ways to force MGs and H.323 endpoints to failback to CM-Main. One fix would be to log into CM-Survivable and disable the NRs. Another would be to disable PROCR on CM-Survivable. An even better solution is to reboot the CM-Survivable server because then you don’t have to remember to come back to it in order to enable NRs and/or PROCR.

Implications of SRPF

Enabling SRPF has some big implications to an enterprise’s survivability design. The first limitation is that within an NR the MGC of all MGs must be limited to two entries. The first entry is Processor Ethernet of CM-Main, and the second the PE of a particular CM-Survivable. In other words, for any NR there can only be one survivable server.

Similarly, all H.323 phones within the NR must be configured with an Alternate Gatekeeper List (AGL) of just one CM-Survivable. The endpoints get that list from the NR’s “Backup Servers” list (pictured above). This also means the administrator must ensure that for each NR all the MGs’ controller lists match the endpoints’ AGL.
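An administrator could audit these constraints with a small script along these lines. It is only a sketch under stated assumptions: the MGC lists and the region’s backup server would have to be exported from the system by some other means, and `check_region` and all the names shown are hypothetical.

```python
from typing import Dict, List


def check_region(
    main_pe: str,
    region_backup: str,
    gateway_mgc_lists: Dict[str, List[str]],
) -> List[str]:
    """Return human-readable problems for one network region (empty if none).

    Every gateway's MGC list must be exactly [CM-Main PE, the region's single
    survivable server], which is also what the endpoints' AGL points to.
    """
    problems = []
    expected = [main_pe, region_backup]
    for gateway, mgc_list in sorted(gateway_mgc_lists.items()):
        if mgc_list != expected:
            problems.append(f"{gateway}: MGC list {mgc_list} should be {expected}")
    return problems


problems = check_region(
    main_pe="cm-main-pe",
    region_backup="cm-survivable-pe",
    gateway_mgc_lists={
        "mg-trunks": ["cm-main-pe", "cm-survivable-pe"],
        "mg-analog": ["cm-main-pe", "cm-survivable-pe", "cm-other-pe"],
    },
)
for line in problems:
    print(line)
```

Here the second gateway is flagged because its list carries a third controller, which breaks the two-entry rule.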

Almost always, if SRPF is enabled, Media Gateway Recovery Rules should not be used. However, in some configurations enabling both might be desirable. In this case, all MGs must be using an mg-recovery rule with the “Migrate H.248 MG to primary:” field set to “immediately” when the “Minimum time of network stability” is met (default is 3 minutes). Be very careful when enabling both features because there is a danger that in certain circumstances SRPF and the Recovery Rule will effectively negate each other.

Finally, SRPF only works with H.248 MGs. Port Networks (PN) do not have a recovery mechanism like SRPF to deal with rogue PN behavior.

Enabling SRPF

The Split Registration Prevention Feature (Force Phones and Gateways to Active Survivable Servers?) is enabled globally on the CM form: change system-parameters ip-options.

Image 3

If I had not found Tim Kaye’s presentation, I would not have completely understood SRPF. So, now whenever I come across a presentation or document authored by him, I pay very close attention. He always provides insightful information.