Prediction and Description of Near-Future Activities in Video
Can you predict an activity that may occur in the near future and provide a natural language description of it? See our recent journal paper on this topic.
The paper proposes a system that infers the labels and captions of a sequence of future activities from current observations. It is one of the earliest works on captioning near-future events in video. The proposed network predicts the labels of a future activity sequence using the spatio-temporal relationships between activities and objects. The predicted labels and the observed scene context are then mapped to meaningful captions using a sequence-to-sequence learning method.
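To make the two-stage idea concrete, here is a toy sketch: first predict a sequence of future activity labels, then turn each label plus scene context into a sentence. It is not the paper's method. As stand-ins, it assumes a first-order transition-count model in place of the label-prediction network and a fixed sentence template in place of the sequence-to-sequence captioner. All function names and the example activities are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Count how often each activity label is followed by each other label."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def predict_future_labels(counts, observed_label, horizon):
    """Greedily roll out the most frequent next label, up to `horizon` steps.

    Toy stand-in for a learned label-prediction network.
    """
    labels = []
    current = observed_label
    for _ in range(horizon):
        followers = counts.get(current)
        if not followers:
            break  # no known continuation for this label
        current = followers.most_common(1)[0][0]
        labels.append(current)
    return labels


def describe(label, scene_objects):
    """Template-based caption; stand-in for a sequence-to-sequence captioner."""
    context = ", ".join(scene_objects)
    return f"Next, the person will likely {label} (scene: {context})."


# Hypothetical training sequences of activity labels.
training = [
    ["open fridge", "take milk", "pour milk"],
    ["open fridge", "take milk", "close fridge"],
    ["open fridge", "take milk", "pour milk"],
]
model = fit_transitions(training)
future = predict_future_labels(model, "open fridge", horizon=2)
captions = [describe(label, ["fridge", "milk"]) for label in future]
```

In this sketch the observed scene context is simply a list of object names passed to the captioning step, mirroring the flow from predicted labels and context to captions described above.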
Title: Prediction and Description of Near-Future Activities in Video.
Authors: Tahmida Mahmud, Mohammad Billah, Mahmudul Hasan, Amit K. Roy-Chowdhury. Computer Vision and Image Understanding (CVIU), 2021.