Edinburgh office monitoring video dataset,
2021.
WWW Link.
Dataset, Office Monitor.
2103
This dataset consists of video, image frames, and ground truth for 20
days of monitoring people in 4 different offices. The data was
acquired with a fixed camera as a set of 1280x720 pixel color images
captured at an average of about 1 FPS. The dataset is notable for its
roughly 450K labeled frames of people doing standard
office activities. The ground truth gives the position of each person in
each image as a bounding box, plus their behavior. Four behaviors
are annotated: standing/walking, sitting, two or three people
talking, and a person in the room having fallen.
Paper to appear CVPR21.
Ayers, D.[Douglas],
Shah, M.[Mubarak],
Monitoring human behavior from video taken in an office environment,
IVC(19), No. 12, October 2001, pp. 833-846.
Elsevier DOI
0110
BibRef
Earlier:
Scenario Recognition from Video Using a Hierarchy of
Dynamic Belief Networks,
ICPR00(Vol I: 835-838).
IEEE DOI
0009
BibRef
Earlier:
Recognizing Human Actions in a Static Room,
WACV98(42-47).
IEEE DOI
9809
indoor
BibRef
Shah, M.[Mubarak],
Understanding human behavior from motion imagery,
MVA(14), No. 4, September 2003, pp. 210-214.
Springer DOI
0309
BibRef
Shah, M.[Mubarak],
Recognizing human actions,
VSSN05(1-2).
WWW Link.
0511
BibRef
McCowan, I.[Iain],
Gatica-Perez, D.[Daniel],
Bengio, S.[Samy],
Lathoud, G.[Guillaume],
Barnard, M.,
Zhang, D.[Dong],
Automatic Analysis of Multimodal Group Actions in Meetings,
PAMI(27), No. 3, March 2005, pp. 305-317.
IEEE Abstract.
0501
Group actions result from interactions of individuals.
Visual in addition to audio analysis.
See also Audio-visual speaker tracking with importance particle filters.
See also Automatic nonverbal analysis of social interaction in small groups: A review.
BibRef
Zhang, D.[Dong],
Gatica-Perez, D.[Daniel],
Bengio, S.[Samy],
McCowan, I.[Iain],
Lathoud, G.[Guillaume],
Modeling Individual and Group Actions in Meetings:
A Two-Layer HMM Framework,
EventVideo04(117).
IEEE DOI
0502
BibRef
Gatica-Perez, D.,
McCowan, I.,
Barnard, M.,
Bengio, S.,
Bourlard, H.,
On automatic annotation of meeting databases,
ICIP03(III: 629-632).
IEEE DOI
0312
BibRef
Zhang, D.[Dong],
Gatica-Perez, D.[Daniel],
Bengio, S.[Samy],
McCowan, I.[Iain],
Semi-Supervised Adapted HMMs for Unusual Event Detection,
CVPR05(I: 611-618).
IEEE DOI
0507
BibRef
Nait-Charif, H.[Hammadi],
McKenna, S.J.[Stephen J.],
Tracking the activity of participants in a meeting,
MVA(17), No. 2, May 2006, pp. 83-93.
Springer DOI
PDF File.
0605
BibRef
Popescu-Belis, A.[Andrei],
Lalanne, D.[Denis],
Bourlard, H.[Herve],
Finding Information in Multimedia Meeting Records,
MultMedMag(19), No. 1, January-March 2012, pp. 48-57.
IEEE DOI
1202
BibRef
Lepri, B.[Bruno],
Subramanian, R.[Ramanathan],
Kalimeri, K.[Kyriaki],
Staiano, J.[Jacopo],
Pianesi, F.[Fabio],
Sebe, N.[Nicu],
Connecting Meeting Behavior with Extraversion: A Systematic Study,
AffCom(3), No. 4, 2012, pp. 443-455.
IEEE DOI
1302
BibRef
Si, Z.Z.[Zhang-Zhang],
Zhu, S.C.[Song-Chun],
Learning AND-OR Templates for Object Recognition and Detection,
PAMI(35), No. 9, 2013, pp. 2189-2205.
IEEE DOI
1307
BibRef
Earlier:
Unsupervised learning of stochastic AND-OR templates for object
modeling,
SIG11(648-655).
IEEE DOI
1201
See also Learning mixed templates for object recognition.
See also Learning Hybrid Image Templates (HIT) by Information Projection.
Animals
BibRef
Pei, M.T.[Ming-Tao],
Si, Z.Z.[Zhang-Zhang],
Yao, B.Z.[Benjamin Z.],
Zhu, S.C.[Song-Chun],
Learning and parsing video events with goal and intent prediction,
CVIU(117), No. 10, 2013, pp. 1369-1383.
Elsevier DOI
1309
Temporal And-Or Graph (T-AOG)
BibRef
Pei, M.T.[Ming-Tao],
Jia, Y.D.[Yun-De],
Zhu, S.C.[Song-Chun],
Parsing video events with goal inference and intent prediction,
ICCV11(487-494).
IEEE DOI
1201
BibRef
Si, Z.Z.[Zhang-Zhang],
Pei, M.T.[Ming-Tao],
Yao, B.[Benjamin],
Zhu, S.C.[Song-Chun],
Unsupervised learning of event AND-OR grammar and semantics from video,
ICCV11(41-48).
IEEE DOI
1201
Office scenes.
See also Learning explicit and implicit visual manifolds by information projection.
See also Learning Hybrid Image Templates (HIT) by Information Projection.
See also Learning mixed templates for object recognition.
BibRef
Park, H.,
Park, J.,
Kim, H.,
Jun, J.,
Son, S.H.[S. Hyuk],
Park, T.,
Ko, J.,
ReLiSCE: Utilizing Resource-Limited Sensors for Office Activity
Context Extraction,
SMCS(45), No. 8, August 2015, pp. 1151-1164.
IEEE DOI
1506
Acoustics
BibRef
Yokoyama, H.[Hitomi],
Nakayama, M.[Masano],
Murata, H.[Hiroaki],
Fujita, K.[Kinya],
Development of Acoustic Nonverbal Information Estimation System for
Unconstrained Long-Term Monitoring of Daily Office Activity,
IEICE(E102-D), No. 2, February 2019, pp. 331-345.
WWW Link.
1902
BibRef
Liu, C.X.[Cheng-Xu],
Zhang, Y.[Yaru],
Xue, Y.[Yao],
Qian, X.M.[Xue-Ming],
AJENet: Adaptive Joints Enhancement Network for Abnormal Behavior
Detection in Office Scenario,
CirSysVideo(34), No. 3, March 2024, pp. 1427-1440.
IEEE DOI
2403
Behavioral sciences, Feature extraction, Head, Detectors,
Object detection, Surveillance, Adaptive systems, feature enhancement
BibRef
Qiao, M.L.[Ming-Lang],
Liu, Y.F.[Yu-Fan],
Xu, M.[Mai],
Deng, X.[Xin],
Li, B.[Bing],
Hu, W.M.[Wei-Ming],
Borji, A.[Ali],
Joint Learning of Audio-Visual Saliency Prediction and Sound Source
Localization on Multi-face Videos,
IJCV(132), No. 6, June 2024, pp. 2003-2025.
Springer DOI
2406
BibRef
Esmaeilzehi, A.[Alireza],
Khazaei, E.[Ensieh],
Wang, K.[Kai],
Kaur Kalsi, N.[Navjot],
Ng, P.C.[Pai Chet],
Liu, H.[Huan],
Yu, Y.H.[Yuan-Hao],
Hatzinakos, D.[Dimitrios],
Plataniotis, K.[Konstantinos],
HARWE: A multi-modal large-scale dataset for context-aware human
activity recognition in smart working environments,
PRL(184), 2024, pp. 126-132.
Elsevier DOI
2408
Human activity recognition, Multi-modal data collection
BibRef
Lv, J.M.[Jian-Ming],
Chen, C.J.[Chu-Jie],
Liang, Z.Q.[Ze-Quan],
Automated Scoring of Asynchronous Interview Videos Based on
Multi-Modal Window-Consistency Fusion,
AffCom(15), No. 3, July 2024, pp. 799-814.
IEEE DOI
2409
Interviews, Videos, Artificial intelligence, Task analysis,
Computational modeling, Recruitment, Predictive models,
user modeling
BibRef
Bhattacharya, I.,
Eshed, N.,
Radke, R.J.,
Privacy-Preserving Understanding of Human Body Orientation for Smart
Meetings,
PBVS17(284-292).
IEEE DOI
1709
Cameras, Estimation, Microphones, Sensor arrays, Speech, Training
BibRef
Brena, R.F.[Ramon F.],
Nava, A.[Armando],
Activity Recognition in Meetings with One and Two Kinect Sensors,
MCPR16(219-228).
Springer DOI
1608
BibRef
Shivappa, S.T.[Shankar T.],
Trivedi, M.M.[Mohan M.],
Rao, B.D.[Bhaskar D.],
Hierarchical audio-visual cue integration framework for activity
analysis in intelligent meeting rooms,
VCL-ViSU09(107-114).
IEEE DOI
0906
BibRef
Al-Hames, M.[Marc],
Lenz, C.[Claus],
Reiter, S.[Stephan],
Schenk, J.[Joachim],
Wallhoff, F.[Frank],
Rigoll, G.[Gerhard],
Robust Multi-Modal Group Action Recognition in Meetings from Disturbed
Videos with the Asynchronous Hidden Markov Model,
ICIP07(II: 213-216).
IEEE DOI
0709
See also Submotions for Hidden Markov Model Based Dynamic Facial Action Recognition.
BibRef
Al-Hames, M.,
Rigoll, G.,
A Multi-Modal Graphical Model for Robust Recognition of Group Actions
in Meetings from Disturbed Videos,
ICIP05(III: 421-424).
IEEE DOI
0512
BibRef
Wallhoff, F.,
Zobl, M.,
Rigoll, G.,
Action segmentation and recognition in meeting room scenarios,
ICIP04(IV: 2223-2226).
IEEE DOI
0505
BibRef
Zobl, M.,
Laika, A.,
Wallhoff, F.,
Rigoll, G.,
Recognition of partly occluded person actions in meeting scenarios,
ICIP04(I: 333-336).
IEEE DOI
0505
BibRef
Bauckhage, C.,
Hanheide, M.,
Wrede, S.,
Sagerer, G.,
A cognitive vision system for action recognition in office environments,
CVPR04(II: 827-833).
IEEE DOI
0408
indoor
BibRef