Authors
Erwin Wu, Ye Yuan, Hui-Shyong Yeo, Aaron Quigley, Hideki Koike, Kris M Kitani
Publication date
2020/10/20
Book
Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology
Pages
1147-1160
Description
The automatic recognition of how people use their hands and fingers in natural settings, without instrumenting the fingers, can be useful for many mobile computing applications. To achieve such an interface, we propose a vision-based 3D hand pose estimation framework using a wrist-worn camera. The main challenge is the oblique angle of the wrist-worn camera, which makes the fingers scarcely visible. To address this, a special network that observes deformations on the back of the hand is required. We introduce DorsalNet, a two-stream convolutional neural network that regresses finger joint angles from spatio-temporal features of the dorsal hand region (the movement of bones, muscles, and tendons). This work is the first vision-based real-time 3D hand pose estimator using visual features from the dorsal hand region. Our system achieves a mean joint-angle error of 8.81 degrees for user-specific models and 9.77 …
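The description reports accuracy as a mean joint-angle error in degrees. The record does not give the exact definition, so the following is only a minimal sketch, assuming the metric is the absolute difference between predicted and ground-truth joint angles averaged over all frames and joints. The function name `mean_joint_angle_error` and the list-of-lists input layout are hypothetical and chosen for illustration.

```python
def mean_joint_angle_error(predicted, ground_truth):
    """Average absolute joint-angle difference in degrees.

    Assumption (not taken from the source): each argument is a list of
    frames, and each frame is a list of joint angles in degrees.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("frame counts differ")

    total = 0.0
    count = 0
    for pred_frame, true_frame in zip(predicted, ground_truth):
        if len(pred_frame) != len(true_frame):
            raise ValueError("joint counts differ within a frame")
        for pred_angle, true_angle in zip(pred_frame, true_frame):
            total += abs(pred_angle - true_angle)
            count += 1

    if count == 0:
        raise ValueError("no joint angles supplied")
    return total / count


# Two frames with two joints each; the values are made up for illustration.
example_error = mean_joint_angle_error(
    [[10.0, 20.0], [30.0, 40.0]],
    [[12.0, 20.0], [27.0, 44.0]],
)
```

Under this assumed definition, a user-specific model and a general model would each be scored by calling the function on that model's predictions over the same test frames.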
Total citations
2021: 9, 2022: 13, 2023: 21, 2024: 10
Scholar articles
E Wu, Y Yuan, HS Yeo, A Quigley, H Koike, KM Kitani - Proceedings of the 33rd Annual ACM Symposium on …, 2020