Exploring the cost performance benefits of CCTV analytics technologies.

There are three broad types of video analytics technology available in server-based VMS CCTV solutions.

In descending order of accuracy and capability, they are Neural Analytics; then Deep Learning and Artificial Intelligence (DL and AI) Analytics; and finally Binary Large Object, or BLOB.

Digifort supports all three types and also integrates with the analytics in third-party NVRs, ‘all-in-one’ analytics boxes and cameras. We therefore asked Nick Bowden, Managing Director of Digifort UK, to explore their cost performance benefits.

The most accurate and capable analytics option is neural. This is the most expensive option to deploy because the software costs more and it requires high performance hardware to run it. The other analytics options may be less capable, but they are perfectly suitable for many CCTV applications, where budgets are tighter.

Digifort is a technology partner of Nvidia, and its analytics software is optimized to run on Nvidia Graphics Processing Units (GPUs). These are fitted in a server alongside the operating system (OS) processor. Video analytics (VA) server ‘performance’ is measured in CUDA cores, much as car engines are rated in brake horsepower (BHP). GPUs with 4000 CUDA cores or more are commonplace and affordable. This GPU performance ‘budget’ is distributed across multiple analytics channels, and the analytics functionality is allocated to the required video channels, with the flexibility to reallocate it to different video channels in the system if required. NVRs, boxed analytics solutions and cameras with onboard analytics simply do not have this performance ‘grunt’ or system deployment flexibility.

GPU boards are developing rapidly, with processing performance doubling each year for the same cost. We can therefore expect further large performance improvements and cost reductions in server-based CCTV systems. Dedicating the GPU cores to analytics and the server cores to the OS and video processing is also good practice for optimal server performance, as each workload accesses its processor resources differently.

1. Neural Analytics.

Neural analytics is a relative newcomer to mainstream CCTV. Much as a person recognizes things, the system identifies many different objects within a camera view from a library of known objects; specific new objects can be “introduced” to the system, and other objects are learnt by the system over time. Rules can be applied to individual objects or to combinations of objects, with real-time alarms or events triggered for an operator response. Digifort has three neural networks to choose from:

‘General’ objects, such as vehicles and humans.
Crime, for identifying weapons, suspicious arm positioning and movement (like aiming a gun).
Industrial, for identifying people wearing helmets, masks, goggles and PPE.

Neural analytics lends itself to ‘occupancy’ type applications, such as the number of cars in a car park or people in a queue. It recognizes the objects ‘seen’ in the camera view, or a zone, and counts them. Multiple zones from one or many cameras can be aggregated for a site count. Scene backgrounds are ignored, as they are not recognized objects, reducing false alarms.
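
As a rough illustration of the zone counting and site aggregation described above, the short Python sketch below counts recognized objects per zone and sums selected zones into a site total. It is not Digifort's implementation or API: the detection tuples, the camera and zone names, and the functions zone_counts and site_count are all hypothetical and exist only for this example.

```python
from collections import Counter

# Hypothetical detections for one moment in time: (camera, zone, object class).
# A real analytics engine would supply these; the values here are made up.
detections = [
    ("cam1", "car_park_north", "car"),
    ("cam1", "car_park_north", "car"),
    ("cam2", "car_park_south", "car"),
    ("cam2", "car_park_south", "person"),
    ("cam3", "entrance_queue", "person"),
]


def zone_counts(detections, object_class):
    """Count recognized objects of one class per (camera, zone)."""
    return Counter(
        (camera, zone)
        for camera, zone, cls in detections
        if cls == object_class
    )


def site_count(detections, object_class, zones):
    """Sum the per-zone counts for the chosen zones into one site total."""
    counts = zone_counts(detections, object_class)
    return sum(n for (_camera, zone), n in counts.items() if zone in zones)


if __name__ == "__main__":
    car_zones = {"car_park_north", "car_park_south"}
    print("Cars on site:", site_count(detections, "car", car_zones))
    print("People queuing:", site_count(detections, "person", {"entrance_queue"}))
```

Anything that is not a recognized object class never enters the list of detections, which is why scene backgrounds do not affect the count.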

2. Deep Learning and Artificial Intelligence (DL and AI).

DL and AI analytics may also have a neural element and most commonly recognize people, vans, bikes, cars, trucks, groups of people, bags, cyclists and much more, including objects matching a specific colour profile. As a camera scene is ‘learnt’, the DL / AI analytics self-calibrates to the scene background, minimizing false alarms. Many rules can be applied individually or concurrently, such as presence, entry, exit, appearance or disappearance of an object; direction and tailgating filters; counting over a line; and stopped, loitering, abandoned and removed objects. Digifort’s analytics also uses a metadata reporting framework, which allows recorded video to be searched forensically for objects other than those in the original settings.

Many NVRs, boxed analytics and embedded camera solutions use versions of this analytics type, usually without the neural element, but they often lack the processing capability required to maximize their potential, as it is not practical or cost-effective to fit Nvidia GPUs into NVRs.

3. Binary Large Object / BLOB.

This is the most basic level of analytics. It recognizes object size (number of pixels) and behavior based on motion detection, and supports some simple analytics such as line crossing. Many NVRs use this type of analytics. It is a low-cost option, ideal for driving motion or event recording in a VMS to save on storage.
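
To show how little a BLOB-style rule needs to know about an object, here is a minimal Python sketch of a size filter combined with a line-crossing check. It is an illustration under stated assumptions, not how any particular NVR or VMS implements it: the function names, the 400-pixel minimum area and the treatment of the trip line as infinitely long are all choices made for this example.

```python
def side_of_line(point, line_start, line_end):
    """Signed value: positive on one side of the line, negative on the other."""
    (x, y), (x1, y1), (x2, y2) = point, line_start, line_end
    return (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)


def crossed_line(prev_centroid, curr_centroid, line_start, line_end):
    """True if the blob centroid changed sides between two frames.

    Simplification for this sketch: the trip line is treated as infinite,
    so a crossing beyond its end points also counts.
    """
    before = side_of_line(prev_centroid, line_start, line_end)
    after = side_of_line(curr_centroid, line_start, line_end)
    return before * after < 0


def blob_event(area_px, prev_centroid, curr_centroid,
               line_start, line_end, min_area_px=400):
    """Raise an event only for blobs that are big enough and cross the line.

    min_area_px is a hypothetical threshold used to ignore small movements
    such as leaves or image noise.
    """
    if area_px < min_area_px:
        return False
    return crossed_line(prev_centroid, curr_centroid, line_start, line_end)


if __name__ == "__main__":
    trip_line = ((-5, 0), (5, 0))
    print(blob_event(900, (0, -1), (0, 1), *trip_line))  # large blob crossing
    print(blob_event(50, (0, -1), (0, 1), *trip_line))   # too small to count
```

The rule has no idea whether the blob is a person, a vehicle or a shadow, which is the main reason this type of analytics is cheap to run but less accurate.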

Analytics performance and hardware overhead.

Neural analytics uses D1 (720x576 pixels) video streams for processing, even if the recorded ‘evidential’ stream in the VMS is 4MP, 8MP or more. Some very specific analytics types use 1080p (1920x1080 pixels), usually when analyzing human behaviors. As an indication of capability, a 3000-core GPU costing under £500 will typically process around 40 neural channels.
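
The figures above can be turned into some back-of-envelope arithmetic. The Python sketch below uses only the numbers quoted in this article (D1 and 1080p resolutions, and a 3000-core GPU handling around 40 neural channels). The estimated_channels function is a planning aid that assumes capacity scales linearly with CUDA core count, which is an assumption made for this example and not a published Digifort or Nvidia figure.

```python
# Resolutions quoted in the article, as (width, height) in pixels.
D1 = (720, 576)
FULL_HD = (1920, 1080)

# Indicative capability quoted in the article.
GPU_CORES = 3000
NEURAL_CHANNELS = 40


def pixels(resolution):
    """Total pixels in one frame."""
    width, height = resolution
    return width * height


def pixel_ratio(a, b):
    """How many times more pixels resolution a has than resolution b."""
    return pixels(a) / pixels(b)


def cores_per_channel(cores=GPU_CORES, channels=NEURAL_CHANNELS):
    """Average CUDA cores consumed per neural channel."""
    return cores / channels


def estimated_channels(cores, per_channel=None):
    """Rough channel estimate for another GPU.

    ASSUMPTION: capacity scales linearly with core count. Real capacity
    also depends on GPU generation, memory and the analytics type in use.
    """
    if per_channel is None:
        per_channel = cores_per_channel()
    return int(cores // per_channel)


if __name__ == "__main__":
    print("D1 pixels per frame:", pixels(D1))
    print("1080p vs D1 pixel ratio:", pixel_ratio(FULL_HD, D1))
    print("Cores per neural channel:", cores_per_channel())
    print("Estimated channels on 4000 cores:", estimated_channels(4000))
```

A 1080p frame carries five times the pixels of a D1 frame, which helps explain why analytics that need 1080p take a larger share of the GPU budget per channel.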

A word of warning: some boxed analytics solutions only record the analytics processing stream, which might be at D1 or less, without a concurrent high-resolution (HR) stream for evidence. This means that the only video available as evidence may be low resolution, so do check if you go down a boxed analytics route.

There is a place for each type of analytics when cost and performance are factored in. However, neural analytics outperforms them all in accuracy and capability; its cost to deploy is falling as server and GPU costs drop; and it is future-proof, allowing GPU and performance / accuracy upgrades in line with continuous neural analytics software development.
