Qualcomm Dragonwing Intelligent Video Suite Modernizes Video Management with Generative AI at Its Core

This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm.

Video cameras generate a lot of data. Companies that use a video management system (VMS) are left wanting to get more value out of all the video data they generate, enabling them to take the actions that really matter. “How can we increase worker safety and security?” they ask. “How can we track vehicles accurately, detect intrusions quickly and make sure people are where they’re supposed to be?”

The traditional VMS is about to graduate, thanks to the power of edge computing, innovations in generative AI and the Industrial and Embedded Internet of Things (IE-IoT). Through our relationships with original design manufacturers (ODMs) and independent software vendors, we now offer the Qualcomm Dragonwing Intelligent Video Suite, a set of AI-enabled video capabilities designed to help businesses improve security, safety and operational efficiency.

Our intelligent video suite offers video surveillance as a service (VSaaS) in two forms: an on-prem appliance that is entirely air-gapped and on site, and direct-to-cloud camera deployments. Camera-based AI handles object detection, people tracking and face detection. Generative AI (GenAI) runs on the on-prem appliance to execute natural-language queries on the content of video feeds from hundreds or thousands of cameras.

To further enhance this offering, we’ve signed an agreement to acquire FocusAI, which brings cutting-edge on-device AI and machine learning (ML) algorithms that enable real-time insights and analysis. Our intelligent video suite is designed to make full use of existing video cameras, in addition to new cameras powered by our Dragonwing processors, giving our broad ecosystem of original equipment manufacturers and ODMs a large portfolio to work with when creating modern video surveillance solutions.

The traditional VMS: Room for improvement

In a wide range of industries, the VMS has been valuable in such compute-intensive use cases as object detection, people tracking and license plate analysis. The traditional VMS is centralized, with dozens or hundreds of camera feeds going to a console monitored by humans.

But there are three main shortcomings with the traditional VMS:

  1. Commercial camera systems are relatively costly and require extensive resources. They involve specialized, purpose-built video monitoring equipment with little applicability to other uses. The footage they capture must be encoded at the source, then decoded, monitored and analyzed at destination. They require large amounts of storage, whether on site or in the cloud.
  2. Searching through the footage is overwhelming and inefficient. A traditional VMS captures more video data than human monitoring can study or even search conveniently. Human monitors can detect and identify threats from camera feeds, but no single human can monitor all feeds all the time. And few systems are equipped to enable person tracking across multiple cameras and zones.
  3. AI analytics is expensive and constrained by bandwidth. As traditional VMS has given way to AI and monitoring in the cloud, the human factor has become less of a bottleneck. But analyzing camera feeds in the cloud incurs processing costs (compute and storage), in addition to network costs (time, throughput). And, in some verticals, security precludes any connection at all to the cloud.

The technology stage is set for modernization of the traditional VMS, and for a system that can:

  • Perform detection using analytics on edge AI cameras.
  • Send back alerts, notifications and footage of incidents.
  • Detect a wide variety of incidents (slips, falls, collisions, hazards), objects (safety gear, license plates, vehicles, weapons) and people (number, faces, paths of movement).
  • Operate with or without a connection to the internet or a cloud (air-gapped).
  • Reside entirely on company premises if needed.
  • Accommodate legacy, non-AI cameras and cabling.
  • Offer lower total cost of ownership (TCO) than traditional VMS or cloud AI approaches.
  • Use GenAI and large language models (LLMs) to conduct natural-language searches (“How many employees wore safety vests yesterday?” “What was that person’s journey through this building last Tuesday?”).

The market is ripe for just such a video surveillance solution.

Figure 1: Basic architecture of a Dragonwing Intelligent Video Suite implementation.

The Dragonwing Intelligent Video Suite

Qualcomm Technologies’ camera and AI ecosystem encompasses GenAI, computer vision, LLMs, hardware accelerators and intelligent camera devices at the edge of the network. The combination of our expertise in AI, high-powered processors and enterprise solutions has led to the creation of our Dragonwing Intelligent Video Suite for enterprises ready to improve their safety and surveillance systems.

Our video suite addresses the shortcomings of traditional and cloud AI-based VMSes:

  1. At the heart of our intelligent video suite is the Dragonwing AI On-Prem Appliance Solution, designed for GenAI inference and computer vision workloads on air-gapped, on-premises hardware. The solution, based on the industry-standard PCI Express (PCIe) architecture, simplifies and streamlines video monitoring by keeping sensitive customer data, fine-tuned models and inference workloads on premises.
  2. An LLM-based intelligent assistant running on the on-prem appliance enables natural-language queries on footage from hundreds or thousands of camera feeds, including multi-camera tracking of subjects across zones. Using ordinary prompts, human monitors can conduct searches on criteria like objects, incidents and vehicles. They can make advanced queries for patterns such as people slipping and falling, holding hands, carrying bags and wearing a hard hat, and for movement through a building.
  3. With on-prem AI, the work of detecting motion and objects takes place as close to the camera as possible. Feeds from AI-enabled cameras (e.g., powered by Dragonwing processors) are analyzed on the collecting device, with metadata sent to the on-prem appliance. Feeds from legacy cameras are first analyzed on an edge box (also powered by Dragonwing processors), with metadata then sent to the on-prem appliance.
  4. An alternative to on-prem is a direct-to-cloud camera deployment, powered by Dragonwing platforms. This option eliminates the need to deploy, manage and maintain complex VMS, making VSaaS a more affordable and accessible solution. Direct-to-cloud cameras also offer more flexibility and scalability than a traditional VMS.
  5. The intelligent video suite features a GenAI command center running on the on-prem appliance, in which users can get answers for specific queries like tracking an object through time or generate meaningful reports (e.g., “List of all the safety infractions that happened over the past week”).
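To make the data flow in item 3 concrete, here is a minimal sketch of how detection metadata might be packaged on a camera or edge box before it is sent to the on-prem appliance. The `Detection` fields, the confidence threshold and the JSON encoding are all assumptions for illustration; the suite's actual metadata schema is not described in this article.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Detection:
    """One object detected in one frame (hypothetical schema)."""
    camera_id: str
    timestamp: float      # seconds since the Unix epoch
    label: str            # e.g. "person", "vehicle", "hard_hat"
    confidence: float     # detector score between 0 and 1
    bbox: tuple           # (x, y, width, height) in pixels
    track_id: int         # per-camera track identifier


def to_metadata_message(detections, min_confidence=0.5):
    """Drop low-confidence detections and serialize the rest as JSON.

    Only this compact message, not the video itself, would leave the device.
    """
    kept = [asdict(d) for d in detections if d.confidence >= min_confidence]
    return json.dumps({"count": len(kept), "detections": kept})


detections = [
    Detection("cam-01", 1700000000.0, "person", 0.91, (120, 80, 60, 160), 7),
    Detection("cam-01", 1700000000.0, "vehicle", 0.32, (400, 200, 180, 90), 8),
]
message = to_metadata_message(detections)
print(message)
```

The design point is that the message is a few hundred bytes per frame at most, so the link between the edge device and the appliance carries metadata rather than raw footage.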

The basic architecture of a Dragonwing Intelligent Video Suite implementation is shown in Figure 1 above. The same architecture applies to use cases as diverse as security, surveillance and worker safety, and can benefit a wide range of industries including oil and gas, manufacturing, retail, hospitality and health care.

The technical benefits and advantages of the intelligent video suite include the following:

  • On edge AI cameras, the AI analytics and computer vision algorithms run on low-compression, high-quality video, before encoding. The results are fewer false positives and more accurate inference than cloud AI, which relies on API calls for inference.
  • The solution is ideal in brownfield environments because it accommodates non-AI cameras and existing cables, minimizing the waste and expense of a rip-and-replace deployment. Our system enhances those older systems by integrating leading edge computing, AI and connectivity.
  • Even without an external connection to the internet or a cloud, this air-gapped VSaaS solution applies AI to the same common use cases as cloud computing. It gives enterprises full use and control of their models for GenAI, natural language processing and computer vision. It enables workflow automation for applications as diverse as intelligent multilingual search, code generation, automated drafting and note-taking, and custom AI assistants and agents.
  • The solution’s on-premises approach enforces the privacy, personalization and customization that companies need to deploy generative AI applications with their own models.

The Dragonwing Intelligent Video Suite enables small and medium businesses, enterprises and industrial organizations to run custom and off-the-shelf AI applications, including GenAI workloads, locally. It can run in the cloud as well as on premises.

Figure 2: Cost comparison of traditional VMS, cloud AI- and edge AI-based VSaaS.

Lower TCO

Running inference with edge AI can deliver significant savings in operational costs and TCO compared to the cost of traditional VMS and cloud AI infrastructure. The charts in Figure 2 above depict the costs in four categories related to installing and maintaining a safety and surveillance system: hardware, labor, network/bandwidth and software/SaaS.

Because traditional VMS does not rely on the data network, its bandwidth cost is negligible. But that economy is overshadowed by the high costs of hardware (for single-purpose equipment), labor (for installation and human-constrained monitoring) and proprietary local software.

By comparison, cloud AI-based VSaaS greatly reduces the costs of hardware and labor. But transmitting camera feeds to the cloud incurs bandwidth costs. The cost of software/SaaS is lower but still appreciable because of cloud compute (inference) and storage (video footage).

In edge AI-based VSaaS, the hardware costs rise to include the edge AI camera and the on-prem appliance. Labor costs are comparable to those of cloud AI because the monitoring is done by AI. But bandwidth cost is lower because the solution resides on premises, and software/SaaS cost is much lower because compute is local and far less storage is required.

Thus, most of the cost savings from VSaaS come from edge AI. Modernization to VSaaS is a whole-platform solution, encompassing hardware, software and services all the way to the edge.
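The bandwidth part of this comparison can be illustrated with back-of-the-envelope arithmetic. The camera count, video bitrate and metadata rate below are hypothetical placeholders chosen only to show the calculation; they are not taken from Figure 2 or from any measurement.

```python
def monthly_upload_gb(num_cameras, kbps_per_camera, hours_per_day=24, days=30):
    """Estimate data sent upstream per month, in gigabytes (10^9 bytes)."""
    seconds = hours_per_day * 3600 * days
    total_bits = num_cameras * kbps_per_camera * 1000 * seconds
    return total_bits / 8 / 1e9


# Hypothetical site: 100 cameras streaming continuously.
VIDEO_KBPS = 2000     # assumed encoded video bitrate per camera
METADATA_KBPS = 20    # assumed metadata-only rate per camera

video_gb = monthly_upload_gb(100, VIDEO_KBPS)
metadata_gb = monthly_upload_gb(100, METADATA_KBPS)

print(f"Full video upstream:    {video_gb:,.0f} GB/month")
print(f"Metadata only upstream: {metadata_gb:,.0f} GB/month")
print(f"Reduction factor:       {video_gb / metadata_gb:.0f}x")
```

Under these assumed rates the upstream volume scales directly with bitrate, so sending metadata instead of video cuts traffic by the ratio of the two rates. Real savings depend on the actual bitrates and on how much incident footage is still uploaded.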

Multiple use cases and industry verticals

The diversity of our product portfolio, combined with the diversity of our ecosystem, means that edge AI and intelligent VSaaS apply to multiple use cases, including the following:

  • Motion and object detection/tracking (e.g., people, vehicles)
  • Vehicle detection and analysis (e.g., delivery vehicles)
  • Location mapping with bird’s-eye view
  • Multi-camera tracking (e.g., journey through building)
  • Detection of loitering
  • Slip-and-fall detection

The solution accommodates the needs of multiple industries/verticals:

  • Oil and gas
  • Automotive
  • Manufacturing
  • Retail
  • Telecommunications
  • Public spaces and smart cities
  • Transportation (air, ground, sea)
  • Hospitality
  • Health care

Intelligent VSaaS and edge AI are flexible enough for venues as diverse as retail stores, quick service restaurants, shopping outlets, dealerships, hospitals, factories and shop floors. They fit where the workflow is well established, repeatable and ready for automation.

Intelligent monitoring, natural-language video search

The most powerful and labor-saving features of the Dragonwing Intelligent Video Suite include AI monitoring and video search using a seamless combination of a vision language model (VLM) and LLMs, which can be configured to work wholly at the edge or in hybrid mode.

The intelligent video suite is designed to track persons of interest across multiple cameras and rooms, with real-time tracking alerts and live streaming. For example, if the system detects a weapon or other dangerous object, it can associate the object with the person carrying it. The system can then automatically track the person’s footprints across rooms, floors or buildings, mapping the journey for security personnel. Or, it can track a customer’s movements across a retail store after an event has occurred.
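As a rough illustration of journey mapping, the sketch below assembles a person's path from per-camera sightings. It assumes each sighting already carries a global person identifier; matching the same person across cameras is the hard part and is not shown. The `Sighting` type and the zone names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    """One observation of a person by one camera (hypothetical schema)."""
    person_id: str
    camera_id: str
    zone: str
    timestamp: float  # seconds since the Unix epoch


def build_journey(sightings, person_id):
    """Return the person's path as ordered (zone, first_seen, last_seen) tuples.

    Consecutive sightings in the same zone are merged into a single visit.
    """
    ordered = sorted(
        (s for s in sightings if s.person_id == person_id),
        key=lambda s: s.timestamp,
    )
    journey = []
    for s in ordered:
        if journey and journey[-1][0] == s.zone:
            zone, first_seen, _ = journey[-1]
            journey[-1] = (zone, first_seen, s.timestamp)
        else:
            journey.append((s.zone, s.timestamp, s.timestamp))
    return journey


sightings = [
    Sighting("p1", "cam-03", "warehouse", 130.0),
    Sighting("p1", "cam-01", "lobby", 100.0),
    Sighting("p2", "cam-01", "lobby", 105.0),
    Sighting("p1", "cam-02", "lobby", 110.0),
    Sighting("p1", "cam-01", "lobby", 160.0),
]
journey = build_journey(sightings, "p1")
for zone, first_seen, last_seen in journey:
    print(f"{zone}: {first_seen} to {last_seen}")
```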

Our intelligent video suite uses a VLM that produces smarter embeddings, which in turn generate better-optimized metadata than conventional convolutional neural network (CNN) models. This VLM can be executed on the camera or edge box itself, which further reduces cloud costs. An LLM then powers an intelligent video assistant, enabling natural-language searches of footage gathered from edge AI cameras. With the power of GenAI, a building manager can now extract more information instantly by chatting with the system, a task that would otherwise require a cumbersome search and many clicks to get to the right data. Some examples of such queries include:

  • “Show person of interest events on Monday between 8 and 9 a.m.”
  • “Show me people wearing white helmet last week”
  • “Show the last time Marco was seen today”
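One plausible way to serve such queries is for the LLM to translate the prompt into a structured filter, which is then applied to the stored metadata. The sketch below shows only that second step, with hand-written arguments standing in for the LLM's output. The `Event` fields and the `search` signature are assumptions for illustration, not the suite's API.

```python
from dataclasses import dataclass
from datetime import datetime

MONDAY = 0  # datetime.weekday() numbers Monday as 0


@dataclass
class Event:
    """One stored metadata event (hypothetical schema)."""
    label: str
    camera_id: str
    time: datetime


def search(events, label, weekday, start_hour, end_hour):
    """Return events with the given label on the given weekday,
    whose hour falls in the half-open range [start_hour, end_hour)."""
    return [
        e for e in events
        if e.label == label
        and e.time.weekday() == weekday
        and start_hour <= e.time.hour < end_hour
    ]


events = [
    Event("person_of_interest", "cam-01", datetime(2025, 1, 6, 8, 30)),   # Monday
    Event("person_of_interest", "cam-02", datetime(2025, 1, 6, 9, 15)),   # Monday, too late
    Event("person_of_interest", "cam-01", datetime(2025, 1, 7, 8, 45)),   # Tuesday
    Event("vehicle", "cam-03", datetime(2025, 1, 6, 8, 10)),              # wrong label
]

# Filter for: "Show person of interest events on Monday between 8 and 9 a.m."
hits = search(events, "person_of_interest", MONDAY, 8, 9)
for e in hits:
    print(e.camera_id, e.time.isoformat())
```

Keeping the filter structured means the language model never has to read the footage or the full event store; it only has to produce a small set of search parameters.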

Intelligent monitoring and natural language video search at Aramco.

Inside the on-prem appliance

AI on-prem appliance solutions are scalable because they are built around the Qualcomm Cloud AI family of accelerators and Dragonwing AI Inference Suite for On-Prem. They offer machine learning capacities of up to 870 TOPS with a single Qualcomm Cloud AI 100 Ultra card.

The Qualcomm Cloud AI 100 Ultra is a PCIe AI inference accelerator card that can run GenAI models with up to 70 billion parameters on a single card. ODMs are working with Qualcomm Technologies to build accelerators into other solutions in different form factors for specialized GenAI and LLM workloads.
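For a sense of scale, the memory needed just to hold a model's weights is roughly the parameter count multiplied by the bytes per parameter. The precisions below are common choices used here for illustration; the article does not state which precision the 70-billion-parameter figure assumes, and this estimate ignores activations and key-value cache.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS_70B = 70_000_000_000

# Illustrative precisions only; not a statement about any specific card.
for name, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{name}: {weight_memory_gb(PARAMS_70B, bits):.0f} GB")
```

The takeaway is that halving the bits per parameter halves the weight footprint, which is why quantization matters when fitting a large model on a single accelerator.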

The solutions are the result of collaboration among ecosystem members. In the case of intelligent VSaaS, ODMs build cameras powered by Dragonwing processors, including our flagship Dragonwing IQ9 Series, and the on-prem appliance powered by Qualcomm Cloud AI accelerator cards. Our intelligent video suite includes the stack for VSaaS monitoring and natural-language video search that runs on the on-prem appliance.

The model is easily extensible to any GenAI or LLM application because the software and stack are flexible. For use cases that require, for example, detection of a different kind of object, the effort is incremental.

Next steps

The Dragonwing Intelligent Video Suite enables businesses of all sizes to run their AI applications, including GenAI workloads, locally or in the cloud.

Ready to see an example of on-device GenAI in video security and surveillance? A project between Qualcomm Technologies and Aramco has resulted in improved monitoring and safety in Aramco’s Abqaiq oil facility. With feeds from AI-enabled cameras connected to the control center and legacy cameras connected to an on-prem appliance, security personnel receive real-time notifications for security breaches, including intruder detection. They take advantage of automated report generation and cross-camera tracking enabled by GenAI.

Kedar Gharat
Director, Product Management, Qualcomm Technologies
