Online Action Detection

Roeland De Geest1, Efstratios Gavves2, Amir Ghodrati1, Zhenyang Li2, Cees Snoek2, Tinne Tuytelaars1

ECCV 2016

1 PSI, ESAT, KU Leuven
2 QUVA-Lab, University of Amsterdam


The goal of online action detection is to detect an action as it happens, ideally even before the action is fully completed. A decision must be made early, without having seen the complete video; traditional action detection, by contrast, has access to the whole video. Being able to detect an action at the time of its occurrence can be useful in many practical applications.
We introduce the online action detection problem in our ECCV 2016 paper (read it on arXiv). Essentially, an online action detection method must answer the following question: based on all frames seen up to now, what action (if any) is happening in the current frame? Because a decision is required for every frame, we use per-frame average precision for evaluation. We collected the TVSeries dataset, a new dataset that can be used to evaluate online (as well as traditional) action detection methods. We evaluate three popular video interpretation methods on this dataset, in both an online and an offline action detection setting: Fisher vectors with an SVM, a frame-based CNN, and an LSTM that takes the output of this CNN as input. None of the methods performs well, indicating that more research on this relevant, challenging problem is needed.
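To make the per-frame evaluation concrete, here is a minimal sketch of average precision computed over frames for a single action class. It is a generic ranking-based AP, not necessarily the exact protocol used in the paper. The function name `per_frame_average_precision` and the input layout (one score and one binary label per frame) are assumptions made for illustration.

```python
from typing import Sequence


def per_frame_average_precision(scores: Sequence[float],
                                labels: Sequence[int]) -> float:
    """Average precision over frames for one action class.

    scores[i] is the (hypothetical) detector confidence that the action
    is happening in frame i; labels[i] is 1 if it is, 0 otherwise.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    # Rank all frames by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            true_positives += 1
            # Precision at the rank of each positive frame.
            precision_sum += true_positives / rank

    # With no positive frames, AP is undefined; return 0.0 by convention here.
    if true_positives == 0:
        return 0.0
    return precision_sum / true_positives


if __name__ == "__main__":
    # Four frames; the action occurs in frames 0 and 2.
    print(per_frame_average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

In an online setting, the score for frame i would be produced using only frames up to and including i. Averaging this value over all action classes gives a mean per-frame AP.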

Citation:
De Geest, R., Gavves, E., Ghodrati, A., Li, Z., Snoek, C. & Tuytelaars, T. (2016). Online Action Detection. ECCV 2016.