
As per Qualcomm AI research team, AI can play an important role in analyzing videos


Published : Jan 21, 2021, 1:07 PM IST

Updated : Feb 16, 2021, 7:53 PM IST

The Qualcomm AI Research team has developed ways to use AI to analyze videos efficiently. The video perception techniques it has developed center on two key concepts: leveraging temporal redundancy and making early decisions. Leveraging temporal redundancy means taking advantage of the fact that video frames are heavily correlated. Making early decisions means settling easy cases early by dynamically changing the network architecture for each input frame.


San Diego: A picture is said to be worth a thousand words, and a video, which is essentially a sequence of static pictures, adds a temporal element and more context. So what is video perception? It is simply the way we analyze and understand video content. Artificial intelligence (AI) can play an important role in analyzing videos in applications ranging from autonomous driving and smart cameras to smartphones and extended reality.

As per Qualcomm AI research team, AI can play an important role in analyzing videos. Courtesy, Qualcomm

For example, autonomous driving uses video from multiple cameras for a variety of crucial tasks, including pedestrian, lane, and vehicle detection. Video perception is crucial for understanding the world and making devices smarter.

At Qualcomm AI Research, video perception techniques that have been developed are centered around two key concepts: leveraging temporal redundancy and making early decisions.


Leveraging temporal redundancy to reduce computations across frames

Leveraging temporal redundancy means taking advantage of the fact that video frames are heavily correlated. Since there is no need to analyze the entire image in every frame, the Qualcomm AI Research team limits computation to the regions where there are significant changes. Learning to skip regions and recycling features are two novel techniques the team developed to take advantage of temporal redundancy in video.

For learning to skip regions, the team developed skip-convolutions for convolutional neural networks (CNNs). By combining several techniques, the team arrived at a neural network that learns to skip unnecessary computations while maintaining accuracy.

The Qualcomm AI Research team further explained that its skip-convolution technique, applied to state-of-the-art object detection models, resulted in a 3x-5x speed-up over those models without sacrificing accuracy. Also noteworthy is that skip-convolutions are broadly applicable and can replace convolutional layers in any CNN for video applications.
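The idea of restricting work to changed regions can be illustrated with a minimal Python sketch. It is not the team's skip-convolution: the learned skipping is replaced here by a fixed pixel-difference threshold, and the block size, the threshold, and the stand-in "feature" (a block mean) are all assumptions made for illustration.

```python
def block_cells(rows, cols, br, bc, block):
    """Yield the (row, col) cells of block (br, bc), clipped to the frame."""
    for r in range(br * block, min((br + 1) * block, rows)):
        for c in range(bc * block, min((bc + 1) * block, cols)):
            yield r, c


def changed_blocks(prev, curr, block=2, threshold=10):
    """Return the indices of blocks whose largest pixel change exceeds threshold."""
    rows, cols = len(curr), len(curr[0])
    changed = set()
    for br in range((rows + block - 1) // block):
        for bc in range((cols + block - 1) // block):
            diff = max(abs(curr[r][c] - prev[r][c])
                       for r, c in block_cells(rows, cols, br, bc, block))
            if diff > threshold:
                changed.add((br, bc))
    return changed


def block_feature(frame, br, bc, block=2):
    """Hypothetical stand-in for a costly per-region computation: the block mean."""
    rows, cols = len(frame), len(frame[0])
    values = [frame[r][c] for r, c in block_cells(rows, cols, br, bc, block)]
    return sum(values) / len(values)


def update_features(prev, curr, prev_features, block=2, threshold=10):
    """Recompute features only for changed blocks and reuse all the others."""
    features = dict(prev_features)
    stale = changed_blocks(prev, curr, block, threshold)
    for br, bc in stale:
        features[(br, bc)] = block_feature(curr, br, bc, block)
    return features, len(stale)


# Two 4x4 frames that differ in a single pixel.
frame0 = [[100] * 4 for _ in range(4)]
frame1 = [row[:] for row in frame0]
frame1[0][0] = 200

features0 = {(br, bc): block_feature(frame0, br, bc)
             for br in range(2) for bc in range(2)}
features1, recomputed = update_features(frame0, frame1, features0)
print(recomputed, "of", len(features0), "blocks recomputed")
```

In this toy case only the block containing the changed pixel is recomputed; the other three are carried over from the previous frame.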

The recycling features technique computes deep features once and reuses them later, rather than computing the deep features of the neural network repeatedly. It is applicable to any video neural network architecture, including those for segmentation, optical flow, classification, and more. The team said that on a semantic segmentation example, it saw a 78% reduction in computation and a 65% reduction in latency by using feature recycling. A dramatic reduction in memory traffic also saved power.
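A rough sketch of feature recycling, written in Python, is shown here. The article does not say how the team decides when deep features are refreshed, so the fixed `refresh_every` interval is an assumption, as are the class name and the toy `shallow_fn` and `deep_fn` callables.

```python
class RecyclingPipeline:
    """Run a cheap shallow stage on every frame and reuse cached deep features."""

    def __init__(self, shallow_fn, deep_fn, refresh_every=4):
        self.shallow_fn = shallow_fn
        self.deep_fn = deep_fn
        self.refresh_every = refresh_every  # assumed policy: refresh at a fixed interval
        self.deep_calls = 0
        self._cached_deep = None
        self._frames_seen = 0

    def process(self, frame):
        shallow = self.shallow_fn(frame)
        # Compute the expensive deep features only on refresh frames.
        if self._frames_seen % self.refresh_every == 0:
            self._cached_deep = self.deep_fn(shallow)
            self.deep_calls += 1
        self._frames_seen += 1
        return shallow, self._cached_deep


# Toy stages: frames are plain numbers so the reuse pattern is easy to follow.
pipeline = RecyclingPipeline(shallow_fn=lambda f: f * 2,
                             deep_fn=lambda s: s + 1,
                             refresh_every=4)
outputs = [pipeline.process(frame) for frame in range(8)]
print(pipeline.deep_calls, "deep computations for", len(outputs), "frames")
```

Here the deep stage runs on two of the eight frames, and the frames in between reuse the cached result.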


Making early decisions to reduce computation

Making early decisions attempts to make easy decisions early by dynamically changing the network architecture per input frame. Early decisions, in essence, allow us to skip computation that is unnecessary for maintaining accuracy.

Two techniques, early exiting and frame exiting, help in making early decisions.

Early exiting exploits the fact that not all input examples require models of the same complexity to maintain accuracy. For simple input examples, very small and compact models can achieve very high accuracy, failing only on complex examples. Early exiting therefore reduces compute while maintaining accuracy. For an object classification example, exiting at the earliest possible neural network layer resulted in a 2.5x reduction in computation while maintaining accuracy.
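Early exiting might be sketched in Python as a cascade of increasingly expensive stages, where an input leaves as soon as one stage is confident enough. The confidence threshold, the function name, and the two toy models are assumptions for illustration and are not taken from the team's work.

```python
def early_exit_predict(x, stages, threshold=0.9):
    """Run stages from cheapest to costliest and stop at the first confident one.

    Each stage is a callable returning (label, confidence).
    Returns (label, confidence, number_of_stages_used).
    """
    if not stages:
        raise ValueError("at least one stage is required")
    for used, stage in enumerate(stages, start=1):
        label, confidence = stage(x)
        if confidence >= threshold:
            break  # easy input: skip the remaining, more expensive stages
    return label, confidence, used


def small_model(x):
    """Hypothetical compact model: confident only when the input is far from zero."""
    label = "positive" if x >= 0 else "negative"
    return label, min(1.0, abs(x) / 10)


def large_model(x):
    """Hypothetical full model: slower in practice, but always confident here."""
    label = "positive" if x >= 0 else "negative"
    return label, 0.99


stages = [small_model, large_model]
print(early_exit_predict(20, stages))  # easy input, handled by the first stage
print(early_exit_predict(1, stages))   # hard input, falls through to the second stage
```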

Frame exiting uses a similar gating concept but attempts to skip computation on entire input frames by making early decisions. For action recognition tasks, frame exiting not only reduces compute but also improves the accuracy of the model. This gating method also makes it possible to train models that trade off accuracy against efficiency, allowing AI developers to customize the model for the requirements of the use case. There are also open challenges that the team is working to address.
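A simplified Python sketch of frame exiting follows. It accumulates per-frame class scores for a clip and stops as soon as the running average for the leading class crosses a threshold, so later frames are never processed. The averaging rule and the threshold stand in for the learned gate and are assumptions, as are the function name and the toy scores.

```python
def classify_clip(frames, frame_model, threshold=0.8):
    """Classify a clip, exiting before all frames are processed when confident.

    frame_model maps a frame to a dict of {label: score}.
    Returns (label, number_of_frames_processed).
    """
    if not frames:
        raise ValueError("clip must contain at least one frame")
    totals = {}
    for seen, frame in enumerate(frames, start=1):
        for label, score in frame_model(frame).items():
            totals[label] = totals.get(label, 0.0) + score
        best = max(totals, key=totals.get)
        if totals[best] / seen >= threshold:
            break  # confident enough: the remaining frames are skipped entirely
    return best, seen


# Toy clip: each "frame" is already a dict of per-class scores.
clip = [
    {"run": 0.6, "walk": 0.4},
    {"run": 0.9, "walk": 0.1},
    {"run": 0.95, "walk": 0.05},
    {"run": 0.9, "walk": 0.1},
]
identity_model = lambda frame: frame

print(classify_clip(clip, identity_model, threshold=0.7))
print(classify_clip(clip, identity_model, threshold=0.95))
```

Lowering the threshold makes the sketch exit after fewer frames, while raising it makes it process the whole clip, which mirrors the trade-off between efficiency and accuracy described above.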

