arXiv:1607.02556 [cs.CV]

Action Recognition with Joint Attention on Multi-Level Deep Features

Jialin Wu, Gu Wang, Wukui Yang, Xiangyang Ji

Published 2016-07-09, Version 1

We propose a novel deep supervised neural network for action recognition in videos, which implicitly takes advantage of visual tracking and combines the robustness of deep Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). Our method uses a multi-branch model to suppress noise from background jitter. Specifically, we first extract multi-level deep features from deep CNNs and feed them into a 3D convolutional network. We then feed the resulting feature cubes into our novel joint LSTM module to predict labels and to generate attention regularization. We evaluate our model on two challenging datasets, UCF101 and HMDB51. The results show that our model achieves state-of-the-art performance using only convolutional features.
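
The abstract does not say how the attention over feature cubes is computed. As a rough illustration only, the sketch below shows generic soft attention pooling: a softmax over per-location scores, followed by a weighted sum of the location feature vectors. It is not the authors' joint LSTM module. The function names, the toy feature values, and the assumption that relevance scores are supplied directly (rather than derived from a recurrent state) are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention_pool(features, scores):
    """Pool K location feature vectors into one vector using soft attention.

    features: list of K vectors (one per spatial location), each of length D.
    scores:   list of K unnormalized relevance scores. Supplying them directly
              is an assumption of this sketch; a real model would typically
              compute them from its recurrent state.
    Returns (weights, pooled), where weights sum to 1 and pooled has length D.
    """
    if len(features) != len(scores):
        raise ValueError("need exactly one score per feature vector")
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [
        sum(w * f[d] for w, f in zip(weights, features))
        for d in range(dim)
    ]
    return weights, pooled


# Toy 2x2 feature map flattened to four locations, each with two channels.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Equal scores give uniform attention, so pooling reduces to a plain average.
uniform_weights, uniform_pooled = soft_attention_pool(features, [0.0, 0.0, 0.0, 0.0])

# A dominant score concentrates almost all attention on the first location.
peaked_weights, peaked_pooled = soft_attention_pool(features, [10.0, 0.0, 0.0, 0.0])
```

With equal scores the pooled vector is the mean of the location features; raising one score shifts the pooled vector toward that location, which is the mechanism by which attention can down-weight background regions.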

Comments: 13 pages, submitted to BMVC

Categories: cs.CV

Keywords: action recognition, joint attention, deep convolutional neural network, joint LSTM module, multi-level deep features

Related articles:

arXiv:1809.03669 [cs.CV] (Published 2018-09-11)

Temporal-Spatial Mapping for Action Recognition

Xiaolin Song, Cuiling Lan, Wenjun Zeng, Junliang Xing, Jingyu Yang, Xiaoyan Sun

arXiv:1801.10304 [cs.CV] (Published 2018-01-31)

Action Recognition with Visual Attention on Skeleton Images

Zhengyuan Yang, Yuncheng Li, Jianchao Yang, Jiebo Luo

arXiv:1512.03980 [cs.CV] (Published 2015-12-13)

Action Recognition with Image Based CNN Features