Case-Study: Action Recognition

We are proud to present our detailed case study in Action Recognition!

11/01/2023

Problem:

Our client faced a security challenge that required monitoring human actions within a specific area using cost-effective CCTV installations. The goal was to detect suspicious behaviour in real time and flag it for further investigation. The solution needed to be robust yet affordable, as the existing infrastructure relied on basic camera setups without advanced capabilities.

The main challenge was the client’s limited budget, which ruled out upgrading to high-end cameras or deploying complex, GPU-heavy processing systems from the start. Despite these constraints, the solution had to be reliable and capable of detecting actions that could indicate potential security risks. It also had to run efficiently on standard, off-the-shelf graphics processing units (GPUs).

Solution:

Our initial approach to solving the problem focused on using deep learning models. We intended to rely heavily on neural networks and action recognition techniques, which are known for their high performance in large-scale systems with access to abundant, high-quality training data. These systems can process video feeds, classify actions, and identify suspicious behaviour through continuous learning from labelled datasets.

However, as the project progressed, it became clear that the expected quantity and quality of training data could not be supplied. A large-scale deep learning model requires high-resolution video feeds, extensive labelled datasets, and considerable computing power, none of which was available for this project. Without enough real-world examples of suspicious actions, the deep learning model could not be fully trained to recognise specific actions or behaviours.

Recognising this limitation, our team shifted to a hybrid model. Instead of purely relying on neural networks, we decided to integrate transfer learning techniques for the parts of the project that dealt with modelling human bodies. Transfer learning allows us to take advantage of pre-trained deep learning models that have already been exposed to large datasets. We then adapted these models to recognise the basic structure and movement of the human body, without needing to start from scratch with a new training set.
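The transfer learning step described above can be sketched in PyTorch as follows. This is a minimal illustration, not the project's actual model: the tiny `backbone` is a hypothetical stand-in for a pre-trained body model (so the sketch runs without downloading weights), and the 17-keypoint output layout and the random training batch are assumptions made for the example. The point it shows is the mechanism: the reused layers are frozen and only a small new head is trained.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a pre-trained body-model backbone. In practice
# this would be a published network loaded with its trained weights.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone: its weights are reused, not retrained.
for param in backbone.parameters():
    param.requires_grad = False
backbone.eval()

# New task-specific head: 17 keypoints with (x, y) each (assumed layout).
NUM_KEYPOINTS = 17
head = nn.Linear(8, NUM_KEYPOINTS * 2)

# Only the head's parameters are handed to the optimiser.
optimiser = torch.optim.Adam(head.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Random tensors stand in for a small labelled batch of frames.
frames = torch.rand(4, 3, 64, 64)
targets = torch.rand(4, NUM_KEYPOINTS * 2)

backbone_before = [p.detach().clone() for p in backbone.parameters()]
head_before = [p.detach().clone() for p in head.parameters()]

for _ in range(5):
    with torch.no_grad():          # no gradients flow into the backbone
        features = backbone(frames)
    loss = loss_fn(head(features), targets)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

backbone_unchanged = all(
    torch.equal(a, b) for a, b in zip(backbone_before, backbone.parameters())
)
head_changed = any(
    not torch.equal(a, b) for a, b in zip(head_before, head.parameters())
)
```

Because only the head is optimised, a handful of labelled frames is enough to adapt the output, which is the property that matters when training data is scarce.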

To compensate for the lack of data in identifying suspicious actions, we incorporated a rules-based approach into the system. This rules-based method operates based on predefined sets of conditions that represent unusual or suspicious behaviour. These rules can include unexpected movements, actions that violate normal behaviour patterns, or lingering in restricted areas.
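As an illustration of what one such rule might look like, the sketch below flags a person who lingers in a restricted area. The `Zone` class, the `lingering_alert` function, the rectangular zone shape, and the thresholds are all hypothetical choices for this example. The actual rule set used in the project is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Axis-aligned restricted area in pixel coordinates (hypothetical)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def lingering_alert(track, zone, fps, max_seconds):
    """Return True if a track stays inside the zone longer than max_seconds.

    track is a list of (x, y) positions, one per consecutive frame.
    """
    limit = int(max_seconds * fps)
    run = 0  # consecutive frames spent inside the zone
    for x, y in track:
        run = run + 1 if zone.contains(x, y) else 0
        if run > limit:
            return True
    return False


zone = Zone(100, 100, 200, 200)
passing = [(90 + 10 * i, 150) for i in range(14)]  # walks straight through
loitering = [(150, 150)] * 60                      # stands still for 60 frames

passing_flagged = lingering_alert(passing, zone, fps=10, max_seconds=5)
loitering_flagged = lingering_alert(loitering, zone, fps=10, max_seconds=5)
```

A rule like this needs no training data at all: the threshold is set by the operator rather than learned.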

This hybrid model enabled us to process the video feeds on standard, mid-range GPUs. We used PyTorch for the deep learning side (human body recognition) and vectorised NumPy code for the rules-based logic. The rules-based components are far cheaper to compute than the deep learning models. Because NumPy runs on the CPU, they add little or no GPU load, so the system does not need a high-end GPU. This kept the system running smoothly within the existing hardware constraints.
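To show what "vectorised" means for the rules-based side, here is a small NumPy sketch that evaluates a sudden-movement rule for every tracked person at once, with no Python loop over people or frames. The function name, the `(frames, people, 2)` array layout, and the speed threshold are assumptions for illustration. In a real pipeline the positions would be derived from the body model's output.

```python
import numpy as np


def sudden_movement_flags(centroids, fps, max_speed):
    """Flag people whose speed exceeds max_speed (pixels/second) in any frame.

    centroids: array of shape (frames, people, 2) holding (x, y) positions.
    Returns a boolean array of shape (people,).
    """
    centroids = np.asarray(centroids, dtype=float)
    step = np.diff(centroids, axis=0)            # (frames - 1, people, 2)
    speed = np.linalg.norm(step, axis=2) * fps   # (frames - 1, people)
    return (speed > max_speed).any(axis=0)


# Two people over five frames: one walks steadily, one lunges at frame 3.
walker = [(0, 0), (2, 0), (4, 0), (6, 0), (8, 0)]
lunger = [(0, 10), (2, 10), (4, 10), (44, 10), (46, 10)]
centroids = np.stack([walker, lunger], axis=1)   # shape (5, 2, 2)

flags = sudden_movement_flags(centroids, fps=10, max_speed=100)
```

The whole check is a difference, a norm, and a comparison over one array, which is why rules of this kind cost very little next to a neural network forward pass.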

Results:

The proof-of-concept delivery of the system was deemed a success, given the limitations of the available training data. Although the system did not perform with the same level of autonomy as originally planned, the hybrid model allowed for reliable human action recognition. Human supervision was still required to validate the flagged actions, but the system provided a strong foundation for future developments.

The combination of deep learning and a rules-based approach proved effective. The system was able to recognise specific actions and identify when those actions violated the preset rules. While human operators are still necessary for the final verification of suspicious behaviour, this hybrid system significantly reduces the workload by narrowing down the number of incidents they need to review.

Additionally, the successful deployment of this system opened up the opportunity for future improvements. With the system in place, the client can begin to collect more training data from real-world use. Over time, as more suspicious actions are captured and labelled, the dataset will grow, allowing for the deep learning component of the system to become more effective. This will eventually lead to a reduction in the reliance on human supervision, as the model becomes more capable of identifying suspicious actions independently.
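One simple way to accumulate such a dataset is to store every operator decision next to the clip that triggered it. The sketch below is an assumption about how this could be done, not a description of the deployed system: the `ReviewedEvent` fields, the JSON Lines file format, and the label values are all hypothetical.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class ReviewedEvent:
    clip_id: str         # hypothetical identifier of the stored video clip
    rule: str            # which rule raised the flag
    operator_label: str  # "suspicious" or "benign", set by the human reviewer


def append_event(path, event):
    """Append one reviewed event as a JSON line to the dataset file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(event)) + "\n")


def load_labelled(path):
    """Read back all reviewed events as (clip_id, label) training pairs."""
    with open(path, encoding="utf-8") as f:
        return [(r["clip_id"], r["operator_label"]) for r in map(json.loads, f)]


dataset_path = os.path.join(tempfile.mkdtemp(), "reviewed_events.jsonl")
append_event(dataset_path, ReviewedEvent("clip-001", "lingering", "suspicious"))
append_event(dataset_path, ReviewedEvent("clip-002", "sudden_movement", "benign"))
pairs = load_labelled(dataset_path)
```

Keeping the benign verdicts matters as much as the suspicious ones, since a classifier trained later needs examples of both.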

One of the key outcomes was that the system runs on mid-range GPUs, which proved sufficient for both the deep learning and rules-based components. GPU acceleration of the deep learning tasks delivered the required performance without expensive, high-end video cards. The discrete GPUs used in the system handled human body recognition and action classification within the existing hardware constraints.

The system also benefited from tracking the movement of objects and people from frame to frame at a fine level of detail. This allowed it to detect small, subtle movements that might indicate suspicious actions.

Moreover, optical flow helped to track movement and direction within the video feeds. Optical flow is the pattern of apparent motion of objects in a visual scene between consecutive frames. This was crucial in detecting actions like someone moving into restricted areas or behaving in an unusual manner. By applying pre-trained models to real-time video streams alongside this motion information, the system could track and classify human actions more effectively.
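To make the idea concrete, the sketch below estimates a single motion vector between two frames with the Lucas-Kanade method, written in plain NumPy. The choice of Lucas-Kanade is an assumption for illustration, since the specific optical flow algorithm used in the project is not stated, and the synthetic Gaussian blob stands in for a real image patch. The method assumes small motion and constant brightness between frames.

```python
import numpy as np


def lucas_kanade_flow(prev, curr):
    """Estimate one (dx, dy) motion vector for a patch using Lucas-Kanade.

    Assumes small motion and constant brightness between the two frames.
    """
    prev = prev.astype(float)
    curr = curr.astype(float)
    avg = 0.5 * (prev + curr)
    iy, ix = np.gradient(avg)  # spatial gradients (rows are y, columns are x)
    it = curr - prev           # temporal gradient
    # Solve ix * dx + iy * dy = -it in the least-squares sense over all pixels.
    a = np.stack([ix.ravel(), iy.ravel()], axis=1)
    b = -it.ravel()
    (dx, dy), *_ = np.linalg.lstsq(a, b, rcond=None)
    return float(dx), float(dy)


def blob(cx, cy, size=64, sigma=6.0):
    """Synthetic test image: a smooth bright spot centred at (cx, cy)."""
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))


prev_frame = blob(30.0, 32.0)
curr_frame = blob(31.0, 32.0)  # the spot moved one pixel to the right
dx, dy = lucas_kanade_flow(prev_frame, curr_frame)
```

Here `dx` comes out close to one pixel and `dy` close to zero. Applied per region of a CCTV frame, vectors like these indicate where and in which direction people are moving, which is the input a rule such as "entered a restricted area" needs.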

Future Potential:

With the system in place, the client has the opportunity to upgrade it further by enhancing the action recognition and classification aspects. For instance, as the client begins to gather more data from actual incidents, they can use this data to improve the performance of the deep learning models. This would allow the system to detect more complex behaviours and reduce the need for manual intervention.

The system can also be scaled to higher-resolution video feeds or applied to a wider range of security tasks. With better GPUs, it could support large-scale monitoring across more cameras, and potentially virtual reality (VR) applications in security settings.

In the future, the client could implement more advanced neural networks to make the system more autonomous. With the use of dedicated graphics cards, the system could handle real-time analysis of large video streams without the need for human supervision. This could greatly increase the efficiency of the monitoring process, allowing for faster detection and response to suspicious activities.

TechnoLynx’s flexible approach ensures that the system can evolve alongside technological advancements. Our deep understanding of both machine learning and real-world constraints allowed us to deliver a solution that fits within the client’s budget while still offering high performance. The use of pre-trained models, combined with rules-based logic, provided a cost-effective solution that can be further enhanced as the client’s needs grow.

Conclusion:

In summary, our client’s security-related problem was successfully addressed through a combination of deep learning and rules-based logic. The hybrid model allowed for real-time monitoring of suspicious behaviour on cost-effective hardware built around mid-range GPUs. Although the system still requires human supervision, it significantly reduces the workload by pre-screening suspicious actions and flagging them for review.

As the client collects more data from real-world usage, the system can be further improved to provide more autonomous action recognition and classification. By combining transfer learning, optical flow, and GPU acceleration, the system is well-equipped to handle future challenges in security monitoring and action classification.
