Prediction and Description of Near-Future Activities in Video

Can you predict an activity that may occur in the near future and provide a natural language description of it? See our recent journal paper on this topic.

The paper proposes a system that infers the labels and captions of a sequence of future activities from current observations. It is one of the earliest works on captioning near-future events in videos. The proposed network predicts the labels of a future activity sequence using spatio-temporal relationships between activities and objects. The predicted labels and the observed scene context are then mapped to meaningful captions using a sequence-to-sequence learning method.

Title: Prediction and Description of Near-Future Activities in Video.
Tahmida Mahmud, Mohammad Billah, Mahmudul Hasan, Amit K. Roy-Chowdhury. Computer Vision and Image Understanding (CVIU), 2021.