TechnoLynx helped V-Nova bring GPU acceleration to their LCEVC external video decoder framework on Apple devices. The work focused on replacing CPU-heavy decoding with Metal-based GPU processing, lowering CPU usage while maintaining high playback quality across iPhone, iPad, and Apple TV, including AVPlayer-based iOS apps.
V-Nova’s MPEG-5 LCEVC codec already performed well on AMD/NVIDIA GPUs, but on iOS the decoder relied on CPU-only processing, creating performance gaps under heavy loads, especially for high-resolution video and fast frame delivery. The goal was to add Metal GPU acceleration to reduce CPU load and improve scalability on Apple hardware without breaking framework compatibility.
CPU-only decoding on iOS
With no Metal shader support, the decoder ran entirely on the CPU, leading to less-than-ideal performance on Apple devices.
Scalability under load
CPU decoding struggled with higher resolutions, rapid frame changes, and real-time playback demands.
Frame handling + latency
CPU-bound image processing increased frame times and latency and caused uneven playback under intensive workloads.
Compatibility requirements
The solution had to work with V-Nova’s external decoder framework and AVPlayer-based iOS apps across Apple devices.
From CPU-only iOS decoding to Metal-based GPU pixel processing across Apple devices
Confirmed the core issue: GPU acceleration existed on non-Apple platforms, but iOS decoding remained CPU-only with no Metal path.
Designed a Metal-based approach to shift pixel processing and decoding workloads onto the GPU while maintaining framework compatibility across iPhone, iPad, and Apple TV.
During testing, running the GPU in short bursts worked better than keeping it always active. Letting the GPU work at full load briefly and then rest helped manage power and heat more effectively.
Implemented a producer-consumer queue for frames so CPU and GPU could work in parallel, and built Metal GPU kernels that combine multiple operations into single passes to reduce memory reads and improve cache usage.
Precompiled kernel variants to avoid runtime stalls and enable future format support, refined memory layout for efficient access, and validated performance under normal and heavy-load playback conditions.
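To illustrate why combining operations into a single pass reduces memory reads, here is a minimal CPU-side sketch in Python. It is not Metal code and not the project's actual kernels: the source does not say which operations were fused, so the three stages (`upscale_2x`, residual addition, `clamp8`) and all function names are assumptions chosen for illustration.

```python
from typing import List


def clamp8(v: int) -> int:
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, v))


def upscale_2x(row: List[int]) -> List[int]:
    """Nearest-neighbour 2x upscale of one row (hypothetical stage)."""
    return [s for s in row for _ in range(2)]


def multi_pass(base: List[int], residuals: List[int]) -> List[int]:
    """Three separate passes, each writing a full intermediate buffer."""
    upscaled = upscale_2x(base)                             # pass 1
    summed = [u + r for u, r in zip(upscaled, residuals)]   # pass 2
    return [clamp8(v) for v in summed]                      # pass 3


def fused_pass(base: List[int], residuals: List[int]) -> List[int]:
    """One pass: each output sample is computed and written once,
    with no intermediate buffers in between.
    Assumes len(residuals) == 2 * len(base)."""
    return [clamp8(base[i // 2] + residuals[i]) for i in range(len(residuals))]


if __name__ == "__main__":
    base = [10, 250]
    residuals = [5, -20, 10, -3]
    # Both versions produce the same output; only the memory traffic differs.
    print(multi_pass(base, residuals) == fused_pass(base, residuals))
```

In the multi-pass version every stage reads and writes a whole buffer, while the fused version touches each input and output sample once. On a GPU, the same restructuring is what keeps intermediate values in registers or cache instead of round-tripping through memory.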
The TechnoLynx team built a GPU-based solution using Apple’s Metal Shading Language. The goal was to move heavy decoding tasks away from the CPU. The team kept compatibility with V-Nova’s external decoder framework and ensured support across iPhone, iPad, and Apple TV.
Built Metal GPU kernels and combined multiple operations into a single pass to reduce memory reads and improve GPU cache usage.
Focused on efficient image processing techniques that use the GPU without draining the battery too quickly, and ran the GPU in short bursts to help manage power and heat.
Used a producer-consumer model with a frame queue so CPU and GPU could work in parallel, precompiled different kernel variants to choose at runtime without delays, and refined memory layout for efficient access.
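The producer-consumer queue and the runtime choice between precompiled variants can be sketched together. The Python below is a simplified stand-in, not the Metal implementation: two threads play the roles of the CPU (producer) and the GPU (consumer), and a dictionary built once at startup stands in for the precompiled kernel variants. The frame shape, the format names `"8bit"` and `"10bit"`, the queue depth, and every function name are hypothetical.

```python
import queue
import threading
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical frame shape: (pixel format name, samples).
Frame = Tuple[str, List[int]]

# Built once at startup, standing in for precompiled kernel variants.
# Format names and per-format functions are illustrative only.
VARIANTS: Dict[str, Callable[[List[int]], List[int]]] = {
    "8bit": lambda s: [min(255, v) for v in s],
    "10bit": lambda s: [min(1023, v) for v in s],
}

_DONE: Optional[Frame] = None  # sentinel: no more frames will arrive


def producer(frames: List[Frame], q: queue.Queue) -> None:
    """CPU side: hand frames to the queue as they become ready."""
    for frame in frames:
        q.put(frame)  # blocks while the queue is full (back-pressure)
    q.put(_DONE)


def consumer(q: queue.Queue, out: List[List[int]]) -> None:
    """GPU side (simulated): take frames and run the matching variant."""
    while True:
        item = q.get()
        if item is _DONE:
            break
        fmt, samples = item
        kernel = VARIANTS[fmt]  # lookup only, nothing is compiled here
        out.append(kernel(samples))


def run_pipeline(frames: List[Frame], depth: int = 3) -> List[List[int]]:
    """Run producer and consumer in parallel over a bounded queue."""
    q: queue.Queue = queue.Queue(maxsize=depth)
    out: List[List[int]] = []
    threads = [
        threading.Thread(target=producer, args=(frames, q)),
        threading.Thread(target=consumer, args=(q, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    frames = [("8bit", [10, 300]), ("10bit", [1000, 2000])]
    print(run_pipeline(frames))
```

The bounded queue lets the producer run ahead by a few frames without unbounded memory growth, and selecting a variant is a table lookup, which is the property that avoids stalls at playback time.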
The Metal-based GPU implementation lowered CPU usage in most conditions and outperformed the original under simulated heavy loads.
Under normal conditions, video playback performance was equal to the original.
Under simulated heavy loads (fast frame changes or higher resolutions), the Metal-based solution had fewer dropped frames and smoother playback.
Power use stayed mostly the same, and overall system heat stayed lower because the GPU finished tasks faster and rested more often.
When video playback needed more processing power, the system responded better and did not freeze or lag.