AT&T Labs Edge AI Deployment: Real-Time Video AI on Cisco Unified Edge with Cisco AI POD
By Matthew Levesque and Jason Fleagle, OnStak VP and AI Architect
Edge AI discussions often stall at the same point: teams agree the use case is reasonable, but they can’t validate deployment, performance, and operational visibility on the infrastructure that would actually run it. This AT&T Labs engagement was designed to remove that friction. We built a production-realistic in-lab environment running on Cisco’s new Unified Edge platform. It demonstrates real-time video AI and a digital twin workflow, with GPU observability built in from day one.
Here’s a closer look at what we built, how we delivered it, what worked, and why it’s a repeatable motion that Cisco sellers and other organizations can run in other enterprise environments.
What We Built
We deployed on an in-lab Cisco Unified Edge (Avatar) device, a Cisco UCS-based Cisco AI POD with NVIDIA GPUs. On top of that, the joint team implemented:
- A digital twin “drive-thru” demo using NVIDIA Omniverse
- A real-time video AI/computer vision use case
- OnStak GPU performance monitoring dashboard for on-prem AI, LLM, and inference workloads
The purpose was not to create a “demo environment.” It was to run a production-realistic stack that makes it simpler to answer infrastructure and performance questions early. More importantly, it positions AT&T to implement additional AI use cases in its retail stores globally while monitoring and improving performance of the connected AI infrastructure.
Here’s what we did
Working with AT&T, Cisco, and our OnStak team members, we executed a five-milestone delivery program, fully remote, deployed on AT&T’s in-lab Cisco Unified Edge (Avatar) device.
1) Discovery and requirements
We worked with AT&T Labs to assess edge constraints, data flows, and AI priorities, especially around video and computer vision. We selected a drive-thru-style, multi-camera workflow because it was easy for technical and business stakeholders to evaluate within a short timeline.
2) Solution design and lab build-out
We designed the deployment architecture to make full use of Cisco Unified Edge capabilities and prepared the environment with Red Hat Enterprise Linux and NVIDIA GPU drivers so that the implementation reflected production conditions.
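To make the environment-readiness step concrete, here is a minimal sanity-check sketch. It is our own illustration, not the tooling we delivered, and it assumes the nvidia-ml-py (pynvml) Python bindings are installed on the host; it simply confirms the NVIDIA driver and GPUs are visible before any AI workloads land on the box:

```python
# check_gpus.py -- quick sanity check that the NVIDIA driver and GPUs are visible
# on the prepared RHEL host. Illustrative sketch only, not the delivered tooling.
import pynvml  # provided by the nvidia-ml-py package


def as_text(value):
    """NVML bindings return bytes in older versions and str in newer ones."""
    return value.decode() if isinstance(value, bytes) else value


pynvml.nvmlInit()
try:
    print(f"NVIDIA driver version: {as_text(pynvml.nvmlSystemGetDriverVersion())}")
    count = pynvml.nvmlDeviceGetCount()
    print(f"GPUs detected: {count}")
    for index in range(count):
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        name = as_text(pynvml.nvmlDeviceGetName(handle))
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"  GPU {index}: {name}, {mem.total / 1024**3:.1f} GiB memory")
finally:
    pynvml.nvmlShutdown()
```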
3) AI workload deployment and integration
We deployed NVIDIA Omniverse to support a 3D digital twin drive-thru/retail environment and integrated it with real-time video AI workflows.
We also installed OnStak GPU performance monitoring to provide visibility into GPU utilization, latency, and throughput, so teams could evaluate model behavior and infrastructure performance together.
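As an illustration of the kind of telemetry involved, the sketch below samples per-GPU utilization and memory through pynvml on a fixed interval. It is not the OnStak dashboard itself; the polling interval and record format are assumptions, and latency/throughput figures would come from the inference service rather than NVML:

```python
# gpu_sampler.py -- illustrative telemetry loop, not the OnStak dashboard code.
# Samples per-GPU utilization and memory on a fixed interval; in a real setup
# these records would be shipped to a time-series store behind a dashboard.
import time

import pynvml  # nvidia-ml-py

SAMPLE_INTERVAL_SECONDS = 5  # assumed polling interval, for illustration only


def sample_gpus():
    """Return one telemetry record per GPU: utilization % and memory used/total."""
    records = []
    for index in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # .gpu is a percentage
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)          # values in bytes
        records.append({
            "gpu": index,
            "gpu_util_pct": util.gpu,
            "mem_used_gib": mem.used / 1024**3,
            "mem_total_gib": mem.total / 1024**3,
        })
    return records


if __name__ == "__main__":
    pynvml.nvmlInit()
    try:
        while True:
            for record in sample_gpus():
                print(record)  # stand-in for pushing to the monitoring backend
            time.sleep(SAMPLE_INTERVAL_SECONDS)
    finally:
        pynvml.nvmlShutdown()
```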
4) Demonstration and AI workshop
In the workshops with AT&T internal teams, we demonstrated:
- Real-time video AI workflows for people/vehicle movement, queue-length monitoring, service-time estimation, and safety/compliance indicators (a simplified sketch of the queue-monitoring flow follows this list)
- How Omniverse-based digital twins can connect to real-time camera and IoT data to support remote monitoring and operations
- How GPU observability supports sizing decisions for Cisco AI POD and edge infrastructure and helps stakeholders discuss performance in business terms
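The queue-monitoring workflow referenced above can be pictured as a simple loop over camera frames. The sketch below is a hypothetical simplification: the RTSP URL, the alert threshold, and the detect_vehicles() helper are placeholders standing in for the GPU-accelerated detector that actually ran in the lab.

```python
# queue_monitor.py -- simplified sketch of a drive-thru queue-length workflow.
# The camera URL, threshold, and detect_vehicles() helper are hypothetical
# placeholders; any edge-deployed detection model could fill that role.
import time

import cv2  # OpenCV, used here only for camera capture

CAMERA_URL = "rtsp://lab-camera.example/stream"  # hypothetical camera feed
QUEUE_ALERT_THRESHOLD = 6                        # assumed threshold, for illustration


def detect_vehicles(frame):
    """Placeholder for the real-time detector running on the Unified Edge GPUs.

    In the lab this would be a GPU-accelerated model returning bounding boxes;
    here it returns an empty list so the sketch stays self-contained.
    """
    return []


def main():
    capture = cv2.VideoCapture(CAMERA_URL)
    if not capture.isOpened():
        raise RuntimeError(f"Could not open camera stream: {CAMERA_URL}")

    while True:
        ok, frame = capture.read()
        if not ok:
            time.sleep(1)  # transient stream hiccup; retry
            continue

        queue_length = len(detect_vehicles(frame))
        print(f"vehicles in queue: {queue_length}")

        if queue_length >= QUEUE_ALERT_THRESHOLD:
            # In the demo, a signal like this fed the digital twin and operations views.
            print("ALERT: queue length above threshold")


if __name__ == "__main__":
    main()
```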
5) Knowledge transfer and next steps
We delivered a structured knowledge-transfer session so AT&T engineers can expand and replicate the environment. The engagement also produced a backlog of follow-on video AI and computer-vision use cases across retail, hospitality, and telco operations.
What worked
A few things made the engagement effective:
- Using a scenario that both engineering and field teams can explain quickly (drive-thru, multi-camera)
- Keeping the build production-realistic (OS, drivers, deployment approach)
- Treating observability as a first-class requirement, not an afterthought
- Ending with documentation and a path for internal teams to extend the environment
What we learned
First, enterprise customers move faster when they can see the workload running on the same class of edge infrastructure they’re considering. Second, they understand performance better through concrete signals like GPU utilization, latency, and throughput. Third, they leave a workshop with a clear shortlist of “next use cases,” not just a concept. AT&T subsequently requested additional phases based on the outcomes of this initial engagement.
1) Why this matters for Cisco sellers and other enterprise teams
This engagement provides a practical, repeatable motion for Cisco Unified Edge and Cisco AI PODs:
- Start with an initial POC deployment via the Cisco MINT partner program with OnStak
- Deploy a verticalized video AI + digital twin scenario
- Use GPU monitoring to make infrastructure behavior visible
- Exit with a roadmap that supports follow-on work and additional capacity discussions
2) Video AI and computer vision are a strong entry point
Across industries, video AI tends to be one of the fastest ways to validate edge AI value because the data already exists and latency often matters:
- Retail and QSR: queue monitoring, loss prevention, drive-thru optimization
- Telco and utilities: safety monitoring, access control, asset/site visibility
- Healthcare and campuses: footfall analytics, compliance monitoring, secure access workflows
3) Cisco MINT reduces friction
The project was delivered through Cisco MINT, where sellers can lead with a low-risk first step: prove the workload in a lab, then expand into pilots, rollouts, managed services, and additional Cisco AI POD capacity. As the Cisco Account Team put it, “Unified Edge and AI PODs are the perfect stage; OnStak provides the play.”
How to engage OnStak with your customers
Ideal targets
- Large retailers, QSRs, and hospitality brands with distributed sites
- Telcos, utilities, and transportation providers with existing Cisco networks and emerging AI initiatives
- Healthcare, manufacturing, and campus environments where latency, data locality, and compliance make edge AI essential
Typical motion
- Qualify interest in video AI, computer vision, or digital twin scenarios and confirm Unified Edge/UCS alignment
- Sponsor a funded “Cisco + OnStak Edge AI Lab and Workshop” presales initiative
- Co-create an Omniverse scenario aligned to the customer’s KPIs
- Expand into follow-on work: pilots, rollouts, managed services, and additional AI POD capacity
Final thoughts
This engagement delivered a validated edge AI environment running on Cisco Unified Edge with Cisco AI POD, combining NVIDIA Omniverse-based digital twin workflows with real-time video AI and the GPU visibility required to make performance and sizing discussions concrete. For Cisco sellers, the takeaway is the motion: use Cisco MINT to sponsor a low-friction lab engagement, deploy a vertical scenario that stakeholders can evaluate quickly, and use observability to translate infrastructure behavior into decisions and next steps.
If you have an account exploring video AI, computer vision, or digital twin scenarios, and they need a credible first step before a pilot, this pattern is designed to shorten the path from interest to an executable roadmap.
About OnStak
OnStak is an edge-first AI services firm specializing in deploying GPU-accelerated workloads, digital twins, and LLM infrastructure across data center and edge environments. From concept to production, OnStak helps enterprises turn AI infrastructure investments into measurable business outcomes.
About Cisco Unified Edge and AI PODs
Cisco Unified Edge is an integrated platform for distributed AI workloads, bringing together compute, networking, storage, and security at the network edge for real-time inferencing and agentic AI. Cisco AI PODs provide pre-validated, modular AI infrastructure built on Cisco UCS and NVIDIA GPUs, supporting the full lifecycle from training to high-throughput inferencing.
Contact us to learn how we can help support you and your AI projects.