What is Intelligent Video Analytics?
Video analysis, or video content analytics (VCA), is the process of extracting useful information from video: identifying stationary or moving objects and individuals under observation (biometric security), counting people, detecting threats or intruders, reading license plates and so on. Intelligent video analytics combines computer vision and artificial intelligence to automate these tasks, and more, across multiple camera feeds in real time. It analyzes digital images and videos frame by frame and extracts meaningful insights from them.
This technology uses deep neural networks to locate and recognize people, objects and actions, either in real time or afterwards, depending on the business requirement. It involves an ecosystem of surveillance cameras and intelligent software that brings vision and AI together to ensure safety across premises and improve overall operational productivity. The global intelligent video analytics market was estimated at around USD 1.2 billion in 2020. It is growing steadily at a CAGR of 28.4% and is expected to reach approximately USD 7 billion by 2027 [Business Wire].
Market size outlook for global intelligent video analytics solutions from 2020-2027
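The projection quoted above can be sanity-checked with the standard compound-growth formula. The sketch below is only an arithmetic check on the quoted figures; the function name is our own and is not taken from the cited report.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound a starting value forward at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


# Figures quoted in the text: USD 1.2 billion in 2020, 28.4% CAGR, horizon 2027.
projected_2027 = project_value(1.2, 0.284, 2027 - 2020)
print(f"Projected 2027 market size: USD {projected_2027:.1f} billion")
```

Seven years of compounding at 28.4% lands a little under USD 7 billion, which is consistent with the estimate above.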
Working of AI Video Analytics
Let us start with the basics of video analytics solutions. Video surveillance refers to the use of a network of display units, analog or digital recorders and cameras operating 24/7 to transmit recorded images and video footage of a particular area. Video files take up a lot of space because they combine audio, image sequences and metadata. A video format describes how the information in a video is stored digitally on servers and devices, and it has two components: a codec and a container. A codec (short for compressor/decompressor) reduces the file size and storage taken up by a video, for example by lowering the resolution, reducing the number of colors and merging similar data, and it decompresses the data when the file is opened and viewed. Compression can be lossless, in which case the original data can be reconstructed exactly and quality is not diminished, or lossy, in which case some information is discarded to obtain smaller files and quality is reduced. Common coding formats include H.264, H.265, Apple ProRes, VP6, VP8, VP9, AV1 and Sorenson Spark.
Containers, on the other hand, are responsible for keeping together all the elements of a file, such as audio, subtitles, metadata and video, for continuous and synchronous playback. AI video analytics systems use advanced computer vision and machine learning tools to extract insights from videos in all common container formats, such as .mp4, .webm, .wmv, .mpeg4, .mov, .flv, .avi and .mkv (Matroska), as well as MPEG-4 Part 12, Ogg and others. The main difference between codecs and containers is that the former is the software that compresses the data during recording, while the latter is the package that holds the end product for playback. Codecs reduce the overall file size through encoding and decoding, while containers package the compressed data, deliver it to the viewing software and keep all data streams synchronized during playback.
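To make the container idea concrete, here is a minimal sketch that guesses the container of a file from its first few bytes rather than from its extension. The function name is our own, and only a handful of well-known signatures are covered. Note that this identifies the container only; it says nothing about which codec was used for the streams inside it.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container format from the first 12 bytes of a file."""
    # Matroska and WebM both start with the EBML magic number.
    if header[:4] == b"\x1a\x45\xdf\xa3":
        return "Matroska/WebM"
    # AVI is a RIFF file whose form type is "AVI ".
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    # MP4 and most MOV files carry an "ftyp" box right after a 4-byte size.
    if header[4:8] == b"ftyp":
        return "MP4/MOV family"
    if header[:3] == b"FLV":
        return "FLV"
    if header[:4] == b"OggS":
        return "Ogg"
    return "unknown"


# Hand-made headers standing in for the start of real files.
print(sniff_container(b"\x00\x00\x00\x18ftypisom"))
print(sniff_container(b"\x1a\x45\xdf\xa3\x01\x00\x00\x00"))
```

In practice the header would come from reading the first 12 bytes of a file opened in binary mode. A full analytics pipeline would normally leave this job to its demuxing library.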
On receiving such videos, video analytics solutions apply machine learning algorithms that detect actions, people and objects, either by scanning live video sequentially in real time or by isolating individual frames of recorded video during post-processing. They also use predictive rule-based algorithms built on decision trees, which apply if/then queries and a process of elimination to each image to generate alerts for further verification. Simpler video content analytics software traverses the entire tree for every image, which brings limitations such as reduced accuracy from excessive rules, cases that fit none of the rules and over-sensitive motion detection. Models based on AI/ML are better at handling such edge cases because they are exposed to millions of images during training. For example, an if/then algorithm asks whether the object has four wheels, or whether it is x pixels high and y pixels wide, whereas an ML algorithm compares the image with its learned, generalized concept of a vehicle across all kinds of lighting conditions, vehicle sizes and so on to conclude that the image shows a car.
The above diagram showcases a decision tree built with XGBoost (open-source software) for detecting driving behavior
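The brittleness of hand-written rules is easy to demonstrate. The sketch below encodes the "four wheels, x pixels high, y pixels wide" style of check described above; the class, the function and every threshold are illustrative assumptions, not values from any real product.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    wheel_count: int  # wheels visible in the image
    width_px: int
    height_px: int


def rule_based_is_car(d: Detection) -> bool:
    """Hand-written if/then chain; all thresholds are made up for illustration."""
    if d.wheel_count != 4:
        return False
    if not 80 <= d.width_px <= 400:
        return False
    if not 40 <= d.height_px <= 200:
        return False
    return True


typical = Detection(wheel_count=4, width_px=220, height_px=110)
side_view = Detection(wheel_count=2, width_px=220, height_px=110)  # two wheels hidden
distant = Detection(wheel_count=4, width_px=30, height_px=15)      # far from the camera

for name, det in [("typical", typical), ("side view", side_view), ("distant", distant)]:
    print(name, rule_based_is_car(det))
```

Only the first case passes, although all three could be cars. Each new exception needs another rule, which is the limitation that trained models are meant to avoid.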
Use Cases of AI Video Analytics
Video content analytics is powered by flexible, scalable and powerful computer vision, IoT and AI/ML technologies for capturing and digitizing scenes, automating human supervision, detecting threats, quantifying risks and providing security assessments. A common term in this context is anomaly detection, where anomalies are aberrations in the behavior of entities in a region (people, objects, vehicles), typically found using unsupervised or semi-supervised learning. Such systems can detect and identify different objects and people across multiple security video streams over long durations. A specific example is virtual fencing, or fence-climbing detection, which combines object tracking with person detection (for example, YOLOv7) and uses 3D sensing and pose estimation algorithms to flag vertical motion as irregular behavior. In contrast, a person walking near the fence (gait biometrics) is classified as horizontal motion, which is not treated as suspicious and raises no alarm by message, email or similar. In such cases, the detection algorithms run on-device, close to the area in view, rather than on external servers, an approach known as edge computing.
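The vertical-versus-horizontal distinction can be sketched very simply once a tracker has produced a sequence of centroids for a person. The version below is a deliberate simplification: it looks only at net centroid displacement, whereas the systems described above rely on pose estimation and 3D sensing. The function names and the 20-pixel threshold are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) centroid in pixels; y grows downward


def classify_motion(track: List[Point], min_displacement: float = 20.0) -> str:
    """Label a track by its dominant direction of net movement."""
    if len(track) < 2:
        return "stationary"
    dx = track[-1][0] - track[0][0]
    dy = track[-1][1] - track[0][1]
    if max(abs(dx), abs(dy)) < min_displacement:
        return "stationary"
    return "vertical" if abs(dy) > abs(dx) else "horizontal"


def should_alert(track: List[Point]) -> bool:
    """Near a fence, only predominantly vertical motion raises an alarm."""
    return classify_motion(track) == "vertical"


walking = [(0.0, 300.0), (50.0, 302.0), (120.0, 298.0)]
climbing = [(100.0, 300.0), (102.0, 240.0), (105.0, 170.0)]
print(classify_motion(walking), should_alert(walking))
print(classify_motion(climbing), should_alert(climbing))
```

Because the logic is this light, it is the kind of check that can plausibly run on the camera itself, in line with the edge computing approach mentioned above.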
Another example is AI-based object classification for video, which detects abandoned or suspicious objects (a local anomaly) in airports, events, metros and similar venues after being trained on the minuscule differences between safe and hazardous objects. X-ray and computed tomography scanners with 3D technology have been implemented worldwide to scan luggage for firearms, sharp metal items and other dangerous objects. Intelligent VCA can also be used for behavior tracking, which follows the movements of humans (or of groups, a collective anomaly) in relation to nearby objects. Other cases for AI video analytics systems include loitering and stopped-vehicle detection (point anomaly), in which people or vehicles remain for unusually long periods in particularly sensitive zones, such as around ATMs, pharmacies, dispensaries and defense quarters. These systems can also help avoid obstruction of movement in dock areas, reduce wait times in parking areas and flag unreported accidents. Alerts are also triggered when the video stream is compromised, or in cases of vandalism and camera sabotage such as paint on the lens or lens removal.
The above diagram shows detection of an unattended object in an airport by video analytics surveillance
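Loitering and stopped-vehicle detection largely come down to measuring how long a tracked object stays inside a sensitive zone. The following is a minimal sketch under the assumption that an upstream tracker already supplies timestamped positions; the zone coordinates, the 120-second limit and all names are hypothetical.

```python
from typing import List, Tuple

Observation = Tuple[float, float, float]   # (timestamp_s, x, y)
Zone = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)


def in_zone(x: float, y: float, zone: Zone) -> bool:
    x_min, y_min, x_max, y_max = zone
    return x_min <= x <= x_max and y_min <= y <= y_max


def longest_dwell(observations: List[Observation], zone: Zone) -> float:
    """Longest continuous time, in seconds, the object stayed inside the zone."""
    longest = 0.0
    entered_at = None
    for t, x, y in observations:
        if in_zone(x, y, zone):
            if entered_at is None:
                entered_at = t          # object has just entered the zone
            longest = max(longest, t - entered_at)
        else:
            entered_at = None           # leaving the zone resets the timer
    return longest


def is_loitering(observations: List[Observation], zone: Zone,
                 max_dwell_s: float = 120.0) -> bool:
    return longest_dwell(observations, zone) > max_dwell_s


atm_zone = (0.0, 0.0, 100.0, 100.0)
parked = [(0.0, 50.0, 50.0), (60.0, 52.0, 50.0), (180.0, 51.0, 49.0)]
passing = [(0.0, 50.0, 50.0), (10.0, 150.0, 50.0)]
print(is_loitering(parked, atm_zone), is_loitering(passing, atm_zone))
```

The acceptable dwell time would differ per zone, so in a deployment it would be a per-zone setting rather than a single constant.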
Intelligent video analytics can assist retailers in streamlining operations and providing a better customer experience without additional expense. One example is intelligent queue management, which can help establish self-checkout systems without exposing the store to shoplifting. To learn more about seamless retail self-checkout systems, visit our blog. AI video analytics systems were especially useful for managing queue sizes during the pandemic. Footfall analysis is another application: it helps retailers count people, measure the time customers spend near product displays, gain operational and branding insights and subsequently devise new marketing strategies. AI camera-based smart parking lots also act as effective retail IT solutions, as they help track parking occupancy using image recognition (driver), automated number plate recognition and object detection (make and model).
The above image shows queue detection and management in a retail store by video analytics surveillance
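People counting for footfall analysis is often described in terms of a virtual line across an entrance: each tracked person who crosses it in the inward direction adds one to the count. The sketch below illustrates that idea only, assuming tracks keyed by a tracker-assigned ID; the names and the line position are hypothetical.

```python
from typing import Dict, List, Tuple

Track = List[Tuple[float, float]]  # one (x, y) centroid per frame; y grows downward


def count_entries(tracks: Dict[int, Track], line_y: float) -> int:
    """Count tracks that cross the horizontal line from above it to on or below it."""
    entries = 0
    for track in tracks.values():
        for (_, y_prev), (_, y_next) in zip(track, track[1:]):
            if y_prev < line_y <= y_next:
                entries += 1
                break  # count each tracked person at most once
    return entries


tracks = {
    1: [(10.0, 80.0), (12.0, 95.0), (13.0, 110.0)],   # walks in across the line
    2: [(40.0, 120.0), (41.0, 105.0), (42.0, 90.0)],  # walks out, not counted
    3: [(70.0, 50.0), (72.0, 60.0)],                  # never reaches the line
}
print(count_entries(tracks, line_y=100.0))
```

Counting exits is the mirror image of the same test, and the difference between the two gives a running estimate of how many people are inside.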
Intelligent VCA helps in detecting unusual positions, falls and the period for which a person is lying on the floor or is incapacitated, when monitoring patients and providing elderly care in homes and hospitals. It offers a hands-free alternative to smart wearables, and can detect not only falls but also successful medication administration. Combining it with ML facial analysis can support clinicians' non-verbal prognostic approach to assessing a patient's mental health: through training, intelligent VCA can detect subtle differences between abnormal and normal facial, physical and emotional behavior (for example, with the DeepFace library). To learn more about facial recognition using OpenCV, visit our blog. Moreover, video analytics solutions are also used to screen for foodborne pathogens using fluorescent labeling and video processing, and to obtain live bacteria feeds for differentiating among various bacterial compositions.
The above image shows pose, fall detection and facial analysis of a patient by AI video analytics
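A very rough way to reason about fall detection is that a standing person's bounding box is taller than it is wide, while a person lying on the floor produces a box that is wider than it is tall. The sketch below uses that heuristic to measure how long someone has been lying down. It is an assumption-heavy simplification: production systems rely on pose estimation, and the ratio threshold and all names here are illustrative.

```python
from typing import List, Tuple

Frame = Tuple[float, float, float]  # (timestamp_s, box_width, box_height)


def seconds_lying(frames: List[Frame], ratio_threshold: float = 1.3) -> float:
    """Time the person has been continuously lying down as of the last frame."""
    lying_since = None
    for t, width, height in frames:
        if height > 0 and width / height > ratio_threshold:
            if lying_since is None:
                lying_since = t   # posture has just become horizontal
        else:
            lying_since = None    # upright again, reset the timer
    if lying_since is None:
        return 0.0
    return frames[-1][0] - lying_since


fall = [(0.0, 40.0, 120.0), (1.0, 45.0, 115.0), (2.0, 110.0, 50.0),
        (8.0, 115.0, 45.0), (15.0, 112.0, 48.0)]
upright = [(0.0, 40.0, 120.0), (5.0, 42.0, 118.0)]
print(seconds_lying(fall), seconds_lying(upright))
```

An alert would then be raised once this duration exceeds a care-specific limit, which separates a person who is incapacitated from one who has briefly bent down.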
Real-time intelligent video analytics of continuous feeds from city-wide camera systems, powered by deep learning algorithms, can be used for various applications in smart cities. It helps in making informed decisions, intra-agency collaboration, driving new revenue streams, enhancing citizen engagement, analyzing crowd behavior and overall economic development, with minimal human intervention and streamlined processes. License plate image capture and recognition for speeding vehicles (contextual anomaly), supported by adequate illumination, OCR and a vehicle database, helps prevent traffic mishaps and penalize drivers for traffic rule violations. AI video analytics systems perform foreground segmentation, background subtraction and deep learning using convolutional neural networks. OCR converts number plate images into text strings, so that VCA systems can recognize number plates and create metadata for use by the authorities. In addition to the above cases, these systems can also be used for monitoring traffic jams, detecting accidents, road and infrastructure maintenance, counting vehicles, analyzing traffic patterns, gaze recognition and other quantitative insights for authorities. For a detailed reading on AI pattern recognition, visit our blog.
The above diagram showcases traffic analysis and congestion detection using deep learning in video analytics surveillance
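Background subtraction, mentioned above, can be illustrated in its simplest form as frame differencing: any pixel that differs from a reference background image by more than a threshold is marked as foreground. The sketch below uses plain nested lists as tiny grayscale frames so that it runs without dependencies. The names and the threshold are assumptions, and real deployments would use the more robust subtractors found in libraries such as OpenCV.

```python
from typing import List

Gray = List[List[int]]  # grayscale frame, pixel values 0-255


def foreground_mask(frame: Gray, background: Gray, threshold: int = 30) -> List[List[bool]]:
    """Mark pixels whose brightness differs from the background by more than threshold."""
    return [
        [abs(pixel - bg_pixel) > threshold for pixel, bg_pixel in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def foreground_pixels(mask: List[List[bool]]) -> int:
    """Number of pixels flagged as foreground."""
    return sum(sum(row) for row in mask)


background = [[10] * 4 for _ in range(3)]  # empty road, uniformly dark
frame = [row[:] for row in background]
frame[1][1] = 200                          # a bright vehicle enters the scene
frame[1][2] = 190

mask = foreground_mask(frame, background)
print(foreground_pixels(mask))
```

The foreground regions found this way are what later stages, such as vehicle classification or plate OCR, would operate on.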
Your Next Steps with KritiKal
KritiKal Solutions is among the leading intelligent video analytics companies, with more than 21 years of experience in this domain. It has developed varied applications of AI-based video analytics, such as aerial surveillance of tea leaves for a bountiful harvest, using multimodal data fusion for enhanced situational awareness and geo-tagged semantic region identification. Another project involved drone image analytics of construction sites that tracked multiple work progress checkpoints. We combined container OCR with video analytics to develop a shipping port automation solution, and developed a trolley management and vehicle management system for a major retail client. Our state-of-the-art traffic analyzer and enumerator (TRAZER) has assisted many cities with traffic surveillance and security. To know more about license plate recognition systems, do read our blog. KritiKal has delivered many more intelligent video analytics projects that overcame challenges related to real-life data collection, illumination, camera angles for pose and perspective, heterogeneous objects, anomaly detection in dense or sparse conditions, occlusion, mass deployment, privacy requirements and more. Please call us or mail us at email@example.com to avail our services.
Sradheya Pattnaik currently holds the position of Senior AI/ML Engineer at KritiKal Solutions. He is skilled in computer vision, neural networks, NLP, Kubernetes, Python, TensorFlow and numerous other programming languages, tools and sub-fields of AI/ML. He has played a key role in delivering cutting-edge deep learning projects for some major clients.