Effectively characterizing video-level spatio-temporal features is a long-standing challenge in human action recognition. This is attributable in part to the inability of CNNs to ...
This paper proposes a faster and more accurate network for spatiotemporal human action localization. As in the YOWO model, we use convolutional neural networks (CNNs) for feature ...