Date: 2026-03-03

A note from the author

This post collects the most important lessons we keep seeing as enterprises bring AI into live customer support environments. When AI moves from a demo to a live, customer-facing system, it introduces practical constraints that teams cannot ignore. The constraints are not theoretical. They show up in call quality, security reviews, legal discussions, and cost models. The goal here is to explain those lessons in plain language so CX and IT leaders can make better architecture decisions.

The Simple Idea: AI Adds New Rules That Customers Can Feel

For years, “cloud first” was sold as the modernization story for contact centers. In practice, large enterprises kept making pragmatic decisions based on latency, data control, and governance, not slogans. Now that AI is moving into real customer conversations, those practical constraints are becoming the main event.

Traditional contact center platforms were built around features, routing, integrations, and scale. AI-driven customer experience adds a different set of requirements, explained here in plain English:

1. Voice needs fast, natural timing.

2. Sensitive data needs tighter control.

3. AI logs create new legal and discovery exposure.

4. Inference economics can become unpredictable at scale.

When those requirements show up, the most resilient architectures tend to look hybrid, because hybrid gives you more control over where the most demanding AI workloads run.

1. Voice AI Has a Latency Budget, and It Is Smaller Than Most People Think

Human conversation is tightly timed. Research often finds turn transitions of about 200 milliseconds, and “no gap” timing is often described as roughly 150 to 250 milliseconds.

Telecom planning guidance reinforces the same reality. ITU-T G.114 treats one-way speech delay of 0 to 150 ms as “good,” 150 to 400 ms as “acceptable” with increasing care, and around 400 ms as an upper bound for general network planning.

Now consider a typical voice AI turn:

telephony ingress

speech to text

model reasoning

text to speech

telephony egress

Each stage adds time. Each network hop adds delay and jitter. Even “only tens of milliseconds” per hop adds up quickly, and jitter buffers add more.
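The way per-stage time and per-hop delay accumulate can be sketched as a simple budget calculation. This is a minimal illustration, not a measurement: the stage names come from the list above, but every millisecond figure, the 600 ms target, and the 40 ms cross-region penalty are assumptions chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    processing_ms: float  # time spent inside the stage (assumed value)
    network_ms: float     # hop delay plus jitter buffering to reach it (assumed value)


def turn_latency_ms(stages):
    """Sum processing and network time across every stage of one voice turn."""
    return sum(s.processing_ms + s.network_ms for s in stages)


def check_budget(stages, budget_ms):
    """Return (total_ms, headroom_ms); negative headroom means over budget."""
    total = turn_latency_ms(stages)
    return total, budget_ms - total


# Hypothetical figures for illustration only; measure your own pipeline.
LOCAL_LOOP = [
    Stage("telephony ingress", 10, 5),
    Stage("speech to text", 120, 5),
    Stage("model reasoning", 250, 5),
    Stage("text to speech", 100, 5),
    Stage("telephony egress", 10, 5),
]

# Same processing time, but each stage is reached across regions or providers.
# The extra 40 ms per hop is an assumption, not a published figure.
MULTI_REGION_LOOP = [
    Stage(s.name, s.processing_ms, s.network_ms + 40) for s in LOCAL_LOOP
]

BUDGET_MS = 600  # assumed response-time target for one turn
```

With these made-up numbers, `check_budget(LOCAL_LOOP, BUDGET_MS)` leaves 85 ms of headroom, while `check_budget(MULTI_REGION_LOOP, BUDGET_MS)` is 115 ms over budget even though no stage got slower. Only the hops changed.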

Public cloud inter-region latency alone can consume a meaningful part of the conversational budget. For example, published cloud RTT tables show tens of milliseconds for domestic hops and higher for transatlantic or transpacific paths.

Plain English takeaway: if your voice AI loop is bouncing between regions, providers, and services, the conversation can start to feel slow even when nothing is “broken.” The customer just feels friction.

2. The “Slow Leak” Risk: Proprietary Data and Metadata Exposure

AI in a contact center produces valuable artifacts: transcripts, summaries, intent signals, and coaching insights. It also produces sensitive artifacts: customer PII, regulated details, and internal strategy content that can show up in prompts and notes.

The research highlights a practical risk pattern:

Provider defaults can vary by tier and channel, and can change over time

Logs can become a legal focus (more on this below)

Employee behavior creates unmanaged leakage paths (copy-paste, ad hoc tools)

A non-trivial portion of enterprise prompts and uploads contain sensitive data

Plain English takeaway: once sensitive CX data is regularly flowing into third-party AI systems, it is harder to be confident about where it lives, how long it persists, and who can compel access to it.
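One common way to reduce this exposure is to redact obvious identifiers before text leaves the enterprise boundary. The sketch below is a minimal example of that idea, assuming a hypothetical `redact` helper and three deliberately simple regular expressions. Real PII detection needs much broader coverage than these patterns provide.

```python
import re

# Illustrative patterns only. They catch simple cases and will miss many real ones.
# Order matters: card numbers are replaced before the looser phone pattern runs.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text):
    """Replace matches with a [LABEL] placeholder and count replacements per label."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, replaced = pattern.subn(f"[{label}]", text)
        counts[label] = replaced
    return text, counts


if __name__ == "__main__":
    sample = "Reach me at jane.doe@example.com or +1 415 555 0100, card 4111 1111 1111 1111."
    clean, counts = redact(sample)
    print(clean)
    print(counts)
```

Returning the counts alongside the cleaned text gives an enterprise-owned record of what was removed, without storing the sensitive values themselves.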

3. Legal Exposure Has Become a Board-Level Topic

AI creates new categories of electronic records: prompts, outputs, and interaction histories. Those records can become relevant in litigation and discovery.

The research points to a major legal inflection:

In February 2026, a U.S. federal court decision (United States v. Heppner) concluded that documents generated via a third-party AI tool were not protected by attorney-client privilege or work product, emphasizing third-party disclosure and the lack of confidentiality assurances.

It also highlights the expanding realities of e-discovery, including that AI logs stored on third-party servers can be subject to preservation orders and subpoenas.

Plain English takeaway: if sensitive legal, compliance, or investigative work is done within third-party AI tools, you may be creating discoverable material outside your control. That risk alone can change infrastructure decisions.

4. Regulation and Data Sovereignty Push Toward Localized Control

Many Global 1500 organizations operate across jurisdictions. AI adds governance obligations that make “controlled inference zones” attractive, especially for sensitive interactions.

The research summarizes key pressures:

GDPR cross-border transfer risk favors localized processing for sensitive interactions

The EU AI Act introduces governance obligations that favor controlled deployments

HIPAA safeguards and vendor governance reinforce conditional requirements for cloud processing of ePHI

Government and defense frameworks create practical constraints that often require sovereign or hybrid patterns

Plain English takeaway: compliance is not abstract. It is about proving control of data flows, retention, and oversight.
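Proving control of data flows usually starts with an explicit, checkable placement policy. The sketch below assumes a hypothetical mapping from a customer's jurisdiction to approved inference zones. The zone names and the mapping are placeholders for illustration and do not describe what any regulation requires.

```python
from dataclasses import dataclass

# Hypothetical policy table: jurisdiction -> inference zones approved for sensitive data.
ALLOWED_ZONES = {
    "EU": {"eu-onprem", "eu-private-cloud"},
    "US": {"us-onprem", "us-private-cloud"},
}


@dataclass(frozen=True)
class PlacementDecision:
    allowed: bool
    reason: str  # kept so each decision can be written to an audit log


def check_placement(jurisdiction, zone, sensitive):
    """Decide whether an interaction may be processed in the given zone."""
    if not sensitive:
        return PlacementDecision(True, "non-sensitive interaction")
    allowed_zones = ALLOWED_ZONES.get(jurisdiction)
    if allowed_zones is None:
        # Deny by default when no policy exists for the jurisdiction.
        return PlacementDecision(False, f"no policy defined for {jurisdiction}")
    if zone in allowed_zones:
        return PlacementDecision(True, f"{zone} is approved for {jurisdiction}")
    return PlacementDecision(False, f"{zone} is not approved for {jurisdiction}")
```

The design choice worth noting is the deny-by-default branch: a sensitive interaction from a jurisdiction with no defined policy is refused rather than silently routed somewhere convenient.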

Enterprises Are Already Adjusting Course

Survey data indicates that 86% of CIOs are planning to move some public cloud workloads back to private cloud or on-premises environments, driven by AI-era cost and risk dynamics.

This is not a rejection of cloud. Public and private clouds still play a role for burst workloads, global routing, experimentation, and baseline digital channels. It does show, however, that treating cloud as an all-or-nothing choice is short-sighted, and that cloud cost governance is intensifying.

Plain English takeaway: the direction is selective, not absolute. You can put what works in the cloud, and place other workloads where they can meet strict requirements consistently.

The Architecture That Fits the AI Era: Hybrid by Default, Workload by Workload

A useful reference pattern is to separate the system into three paths:

1. Real-time voice path (low latency): telephony ingress, streaming STT, local policy plus inference, streaming TTS, barge-in control

2. Sensitive data path (governed): CRM and case data, knowledge bases, redaction, policy enforcement, enterprise-owned audit logs

3. Cloud scaling path (elastic): overflow inference for non-sensitive use, non-real-time analytics, global routing, and resiliency

This pattern helps address:

Latency constraints by keeping the loop local

Data leakage vectors by minimizing third-party exposure

Legal discovery exposure by owning logs and retention posture

Regulatory governance by localizing sensitive processing
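The three-path separation can be sketched as a small placement function. This is only an illustration under stated assumptions: the `Workload` fields, the `place` function, and the rule that real-time voice takes precedence over data sensitivity are hypothetical and are not taken from any product.

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    REAL_TIME_VOICE = "real-time voice path (low latency)"
    SENSITIVE_DATA = "sensitive data path (governed)"
    CLOUD_SCALING = "cloud scaling path (elastic)"


@dataclass(frozen=True)
class Workload:
    name: str
    real_time_voice: bool         # part of the live conversational loop
    touches_sensitive_data: bool  # reads or writes governed data


def place(workload):
    """Pick a path for a workload. Precedence here is an assumption:
    the voice loop stays local first, then governed data, then elastic cloud."""
    if workload.real_time_voice:
        return Path.REAL_TIME_VOICE
    if workload.touches_sensitive_data:
        return Path.SENSITIVE_DATA
    return Path.CLOUD_SCALING


if __name__ == "__main__":
    examples = [
        Workload("streaming STT", True, True),
        Workload("case summary with CRM data", False, True),
        Workload("overnight trend analytics", False, False),
    ]
    for w in examples:
        print(f"{w.name}: {place(w).value}")
```

Classifying each workload this way, one at a time, is the practical meaning of "workload by workload": the decision is made per workload, not per platform.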

Where Avaya Infinity Fits

Avaya Infinity is positioned for this reality because it is built for connection, orchestration, and governance across environments.

Avaya Infinity is designed for workload placement, not blanket assumptions

Infinity supports hybrid patterns in which the most latency-sensitive, high-liability workloads can run in controlled zones. At the same time, the cloud remains available for elasticity, global reach, and non-real-time AI tasks.

Avaya Infinity is built to connect context safely

AI only helps when it can retrieve the right context from systems of record such as CRMs, ticketing systems, and knowledge bases. Hybrid architectures make it easier to apply policy, redaction, and audit controls where the data is most sensitive.

Avaya Infinity is aligned to the AI era’s biggest constraint: trust

In an AI-driven contact center, trust is created by fast conversations, controlled data paths, and an enterprise-owned governance posture. Those are core requirements, not optional extras.

This post was intended to help you understand the real, on-the-ground lessons that show up once AI is added to live customer support systems. If you are planning voice AI, agent assist, or automated summaries, consider starting with a simple architecture exercise: identify which interactions require ultra-low latency, which data must remain tightly governed, and which workloads can safely benefit from elastic cloud scale. Avaya Infinity is designed to support that hybrid reality, so you can deploy AI where it makes sense without taking on unnecessary risk.

FAQ

Does this mean public cloud is the wrong approach for enterprise CX?

No. Cloud remains excellent for burst scaling, global operations, and experimentation. The AI era makes certain workloads, especially real-time voice AI and sensitive-data AI, better suited for controlled environments.

Why is voice AI different from chatbots?

Voice is time-critical. Human turn-taking happens fast, and telecom guidance treats delay as a quality limiter. Multi-hop architectures can quickly consume the latency budget.

What is the biggest hidden risk with third-party AI tools?

Logs and interaction histories. They can create sensitive records that may be retained, preserved, or subpoenaed, and recent legal developments show that privilege can be fragile with third-party AI tools.

What does “hybrid by default” actually mean?

It means separating the real-time voice path, the sensitive data path, and the elastic cloud-scaling path so that each can meet its own requirements consistently.

How does Avaya Infinity help in this environment?

Infinity is built to orchestrate the customer experience across environments and to connect enterprise context with the right governance controls. Together, these address the AI era’s needs for low-latency voice, regulated data handling, and an enterprise-owned risk posture.