Online Action Detection
Roeland De Geest1, Efstratios Gavves2, Amir Ghodrati1, Zhenyang Li2, Cees Snoek2, Tinne Tuytelaars1
ECCV 2016
1 PSI, ESAT, KU Leuven
2 QUVA-Lab, University of Amsterdam
The goal of online action detection is to detect an action as it happens, and ideally even before the action is fully completed. The decision is made early, without having seen the complete video, whereas traditional action detection assumes the complete video is available. Being able to detect an action at the time of its occurrence can be useful in many practical applications, e.g.,
- a pro-active robot offering a helping hand,
- a surveillance camera raising an alarm not just after the fact but well in time to allow for intervention,
- a smart active camera system zooming in on the action scene and recording it from the optimal perspective,
- an autonomous car stopping for a child chasing a ball.
We introduce the online action detection problem in our ECCV 2016 paper (read it on arXiv). Essentially, an online action detection method must answer the following question: based on all frames seen up to now, what action (if any) is happening in the current frame? Because a decision is required for every frame, we use the per-frame average precision for evaluation. We collected the TVSeries dataset, a new dataset that can be used to evaluate online (as well as traditional) action detection methods. We evaluate three popular video interpretation methods on this dataset, in both an online and an offline action detection setting: Fisher vectors with SVM, a frame-based CNN, and an LSTM that takes the output of this CNN as input. None of the methods performs well, indicating that more research on this relevant, challenging problem is needed.
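The online constraint and the per-frame metric can be illustrated with a minimal sketch. This is not the evaluation code from the paper: the function names `online_scores` and `per_frame_average_precision`, the scorer interface `score_fn`, and the tie-breaking by stable sort are all assumptions made for illustration. The sketch scores each frame using only the frames seen so far, then computes the average precision for a single action class by ranking all frames by confidence.

```python
import numpy as np


def online_scores(frames, score_fn):
    """Score every frame causally (hypothetical interface).

    score_fn receives only the frames seen up to and including the current
    one, so it can never look at future frames.
    """
    return [score_fn(frames[: t + 1]) for t in range(len(frames))]


def per_frame_average_precision(scores, labels):
    """Average precision for one action class, computed over frames.

    scores: per-frame confidence that the action is happening.
    labels: 1 if the action is happening in that frame, else 0.
    Returns NaN when the class has no positive frames.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = labels.sum()
    if n_pos == 0:
        return float("nan")

    # Rank frames from most to least confident (stable for tied scores).
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]

    # Precision at each rank, averaged over the ranks of positive frames.
    true_positives = np.cumsum(hits)
    precision = true_positives / np.arange(1, len(hits) + 1)
    return float((precision * hits).sum() / n_pos)


if __name__ == "__main__":
    # Toy example: four frames, the action occurs in frames 0 and 2.
    ap = per_frame_average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
    print(f"per-frame AP: {ap:.4f}")
```

With several action classes, one would typically compute this value per class and average the results; whether and how to do that averaging is left open here.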
Citation:
De Geest, R., Gavves, E., Ghodrati, A., Li, Z., Snoek, C. & Tuytelaars, T. (2016). Online Action Detection. ECCV 2016.