TY - GEN
T1 - Elastic functional coding of human actions
T2 - IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
AU - Anirudh, Rushil
AU - Turaga, Pavan
AU - Su, Jingyong
AU - Srivastava, Anuj
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/10/14
Y1 - 2015/10/14
AB - Human activities observed from visual sensors often give rise to a sequence of smoothly varying features. In many cases, the space of features can be formally defined as a manifold, where the action becomes a trajectory on the manifold. Such trajectories are high dimensional in addition to being non-linear, which can severely limit computations on them. We also argue that by their nature, human actions themselves lie on a much lower dimensional manifold compared to the high dimensional feature space. Learning an accurate low dimensional embedding for actions could have a huge impact in the areas of efficient search and retrieval, visualization, learning, and recognition. Traditional manifold learning addresses this problem for static points in ℝ^n, but its extension to trajectories on Riemannian manifolds is non-trivial and has remained unexplored. The challenge arises from the inherent non-linearity and from temporal variability, which can significantly distort the distance metric between trajectories. To address these issues we use the transport square-root velocity function (TSRVF) space, a recently proposed representation that provides a metric with favorable theoretical properties such as invariance to group action. We propose to learn the low dimensional embedding with a manifold functional variant of principal component analysis (mfPCA). We show that mfPCA effectively models the manifold trajectories in several applications such as action recognition, clustering, and diverse sequence sampling while reducing the dimensionality by a factor of ∼250×. The mfPCA features can also be reconstructed back to the original manifold to allow for easy visualization of the latent variable space.
UR - http://www.scopus.com/inward/record.url?scp=84959193616&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959193616&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2015.7298934
DO - 10.1109/CVPR.2015.7298934
M3 - Conference contribution
AN - SCOPUS:84959193616
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 3147
EP - 3155
BT - IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
PB - IEEE Computer Society
Y2 - 7 June 2015 through 12 June 2015
ER -