

STCFormer

STCFormer (Spatio-Temporal Criss-cross Transformer) is an efficient model for 3D human pose estimation (HPE) that uses a decomposed attention mechanism to avoid the quadratic computational cost of full spatio-temporal attention.

Full spatio-temporal attention compares every joint in every frame with every other, so its cost grows quadratically with the number of joint-frame tokens. STCFormer addresses this with the STC block, which decomposes correlation learning into two parallel pathways: a spatial one that relates joints within a frame and a temporal one that relates the same joint across frames. A Structure-enhanced Positional Embedding (SPE) additionally encodes explicit human body structure to improve accuracy. Evaluated on major benchmarks, the model reports a state-of-the-art P1 error (mean per-joint position error under Protocol 1) of 40.5 mm on the Human3.6M dataset, with significantly fewer parameters than prior state-of-the-art methods.
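The decomposition described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it assumes single-head attention with identity query/key/value projections, assumes the two pathways each receive half of the feature channels and are concatenated afterwards, and leaves out SPE, normalization, and feed-forward layers. The names `self_attention`, `criss_cross_attention`, and `score_counts` are hypothetical helpers introduced only for this illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head attention over a list of feature vectors.

    Assumption: queries, keys, and values are the tokens themselves
    (no learned projections), to keep the sketch short.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out


def criss_cross_attention(x):
    """x[t][j] is the feature vector (even length C) of joint j in frame t."""
    T, J, C = len(x), len(x[0]), len(x[0][0])
    half = C // 2
    # Spatial pathway: within each frame, attend across joints (first half of channels).
    spatial = [self_attention([x[t][j][:half] for j in range(J)]) for t in range(T)]
    # Temporal pathway: for each joint, attend across frames (second half of channels).
    temporal = [self_attention([x[t][j][half:] for t in range(T)]) for j in range(J)]
    # Concatenate the two pathways back into C channels per token.
    return [[spatial[t][j] + temporal[j][t] for j in range(J)] for t in range(T)]


def score_counts(T, J):
    """Pairwise attention scores: full spatio-temporal vs. decomposed."""
    full = (T * J) ** 2               # every token attends to every token
    decomposed = T * J * J + J * T * T  # joints per frame + frames per joint
    return full, decomposed
```

As a worked example under assumed sizes (243 input frames and 17 joints, neither of which is stated above), `score_counts(243, 17)` gives 17,065,161 pairwise scores for full attention versus 1,074,060 for the decomposed form, roughly 16 times fewer.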

https://github.com/zhenhuat/STCFormer