Temporal Order-Preserving Dynamic Quantization for Human Action Recognition from Multimodal Sensor Streams

Jun Ye, Kai Li, Guo-Jun Qi, Kien A. Hua
University of Central Florida

Outline
- Background: problem, existing methods, challenges
- Our algorithm: Dynamic Temporal Quantization, Multimodal Feature Fusion
- Performance study: MSR-Action3D, UTKinect-Action, MSR-ActionPairs
- Conclusions

Background
- Depth sensors have become affordable and popular
- New human-computer interaction: gesture recognition, speech recognition
- Application domains: video games, education, business, healthcare

Problem and Challenges
- Key problem: modeling the temporal dynamics of 3D human actions/gestures
- Existing methods
  - Histogram-based methods do not preserve temporal order (bag-of-3D-words [5, 21], HOJ3D [16], HON4D [9])
  - Temporal modeling methods suffer from video misalignment (motion template [7, 20], temporal pyramid [9, 14])
- Challenge: temporal misalignment due to
  - Temporal translation
  - Execution rate variation

Objective
Model the temporal patterns of 3D actions according to the transitions between sub-actions, subject to two constraints:
1. Frames with similar postures are clustered together (sub-action constraint)
2. The temporal order of the sequence is preserved (order-preserving constraint)
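To make the two constraints concrete, a frame assignment can be written as a list a in which a[i] is the index of the sub-action that frame i belongs to. The short Python sketch below is illustrative only: the helper names are hypothetical, and it assumes that every sub-action receives at least one frame, which the slides do not state.

```python
from typing import Dict, List


def is_order_preserving(assignment: List[int], m: int) -> bool:
    """Check that frames map to sub-actions 0..m-1 in temporal order.

    ASSUMPTION (not from the slides): every sub-action gets at least one
    frame, so consecutive frames either stay in the same sub-action or
    move to the next one.
    """
    if not assignment or assignment[0] != 0 or assignment[-1] != m - 1:
        return False
    return all(0 <= b - a <= 1 for a, b in zip(assignment, assignment[1:]))


def group_frames(assignment: List[int]) -> List[List[int]]:
    """Group frame indices by sub-action (the sub-action constraint)."""
    groups: Dict[int, List[int]] = {}
    for frame_index, sub_action in enumerate(assignment):
        groups.setdefault(sub_action, []).append(frame_index)
    return [groups[j] for j in sorted(groups)]
```

For example, the assignment [0, 0, 1, 2, 2] keeps temporal order and yields three contiguous groups of frames, whereas [0, 1, 0, 2] jumps back to an earlier sub-action and is rejected.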
Dynamic Temporal Quantization Algorithm

Dynamic Temporal Quantization
- Quantization: a video x1, x2, ..., xn of varied length n -> quantized vectors v1, v2, ..., vm of fixed length m
- Optimal frame assignment a
- Objective function: [formula not preserved in the slide text]
- The optimal quantization is obtained by jointly optimizing a and V

Dynamic Temporal Quantization (cont'd)
- Jointly solving for the frame assignment a and the quantized vectors V is nontrivial, so the two are optimized alternately
- Initialization: uniform partition
- Aggregation step: given a fixed assignment a, each vj is computed by aggregating the frames assigned to it
- Assignment step: given fixed quantized vectors V, the assignment a is updated by dynamic time warping (DTW)
- Iterate until convergence

Hierarchical Representation
- Multiple layers of dynamic quantization
- Top layers: global temporal patterns
- Bottom layers: local temporal patterns
- Concatenate all layers

Multimodal Feature Fusion
- Multimodal features: joint coordinates, pairwise angles, joint offsets [21], histogram of velocity components (HVC)
- Supervised learning for all quantized vectors: multiclass SVM
- Fusion by regression (softmax)

Experiments
- Experiments on three public 3D human action datasets, including MSR-Action3D
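The alternating procedure from the Dynamic Temporal Quantization slides (uniform initialization, aggregation step, DTW-based assignment step, iterate until convergence) can be sketched in plain Python as below. This is a minimal sketch, not the authors' implementation. Several details are assumptions because the slide text does not give them: aggregation is taken to be the mean, the cost is squared Euclidean distance, the assignment step is a simplified DTW-style dynamic program in which each frame either stays in the current segment or moves to the next, and layer l of the hierarchy is assumed to use 2**l segments. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def sq_dist(x: Frame, v: Frame) -> float:
    # ASSUMPTION: squared Euclidean distance as the frame-to-segment cost.
    return sum((a - b) ** 2 for a, b in zip(x, v))


def uniform_assignment(n: int, m: int) -> List[int]:
    # Initialization: split n frames into m nearly equal, ordered segments.
    return [i * m // n for i in range(n)]


def aggregate(frames: Sequence[Frame], assign: List[int], m: int) -> List[List[float]]:
    # Aggregation step. ASSUMPTION: each v_j is the mean of its frames.
    dim = len(frames[0])
    sums = [[0.0] * dim for _ in range(m)]
    counts = [0] * m
    for x, j in zip(frames, assign):
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += x[d]
    return [[s / counts[j] for s in sums[j]] for j in range(m)]


def assign_dtw(frames: Sequence[Frame], V: Sequence[Frame]) -> List[int]:
    # Assignment step: monotone alignment of n frames to m segments.
    # Frame 0 maps to segment 0, frame n-1 to segment m-1, and each
    # frame either stays in the previous segment or advances by one.
    n, m = len(frames), len(V)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    cost[0][0] = sq_dist(frames[0], V[0])
    for i in range(1, n):
        for j in range(min(i, m - 1) + 1):
            stay = cost[i - 1][j]
            advance = cost[i - 1][j - 1] if j > 0 else inf
            if advance < stay:
                best, back[i][j] = advance, j - 1
            else:
                best, back[i][j] = stay, j
            cost[i][j] = best + sq_dist(frames[i], V[j])
    assign = [0] * n
    j = m - 1
    for i in range(n - 1, -1, -1):  # backtrack from the last frame
        assign[i] = j
        j = back[i][j]
    return assign


def dynamic_temporal_quantization(
    frames: Sequence[Frame], m: int, max_iter: int = 20
) -> Tuple[List[List[float]], List[int]]:
    if not 1 <= m <= len(frames):
        raise ValueError("need 1 <= m <= number of frames")
    assign = uniform_assignment(len(frames), m)
    for _ in range(max_iter):
        V = aggregate(frames, assign, m)
        new_assign = assign_dtw(frames, V)
        if new_assign == assign:  # converged: assignment is stable
            break
        assign = new_assign
    return aggregate(frames, assign, m), assign


def hierarchical_representation(frames: Sequence[Frame], num_layers: int) -> List[float]:
    # ASSUMPTION: layer l uses 2**l segments; all layers are concatenated.
    out: List[float] = []
    for layer in range(num_layers):
        V, _ = dynamic_temporal_quantization(frames, 2 ** layer)
        for v in V:
            out.extend(v)
    return out
```

As a usage example, calling dynamic_temporal_quantization on the one-dimensional toy sequence [[0.0], [0.1], [0.2], [5.0], [5.1], [9.0], [9.2], [9.1]] with m=3 starts from the uniform partition [0, 0, 0, 1, 1, 1, 2, 2] and converges to [0, 0, 0, 1, 1, 2, 2, 2], so the frame at 9.0 moves into the last segment while temporal order is kept. Whatever the input length, the output has fixed length m, which is what allows videos of different durations to be compared by a standard classifier.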
Accuracy: 90.42% vs. 83.15% (labels not preserved in the slide text)
Similar performance can be observed on the other two datasets.

Experiment: Hierarchical Representation
MSR-Action3D dataset with the joint coordinate feature

Layers      1        2        3        4        5
Accuracy    66.28%   67.82%   71.26%   81.61%   77.39%

More layers generally yield higher accuracy, although overfitting must be kept in check (accuracy drops from 81.61% with four layers to 77.39% with five).

Experiment: Comparison with State-of-the-Art Results

MSR-Action3D dataset
Method                        Accuracy
Actionlet Ensemble [14]       88.2%
HON4D [9]                     88.89%
DCSF [15]                     89.3%
Lie Group [13]                89.48%
Super Normal Vector [18]      93.09%
Proposed method               90.42%

MSR-ActionPairs dataset
Method                        Accuracy
Actionlet Ensemble [14]       82.22%
HON4D [9]                     93.33%
HON4D + Ddisc [9]             96.67%
Super Normal Vector [18]      98.89%
Proposed method               93.71%

UTKinect-Action dataset (100% accuracy)
Method                                        Accuracy
Histogram of 3D joints [17]                   90.92%
Combined features with random forest [21]     91.9%
Lie Group [13]                                97.08%
Proposed method                               100%

Conclusions
- A novel algorithm for 3D human action sequence recognition from the perspective of dynamic temporal quantization.
- Extensive experiments on three public datasets demonstrate the effectiveness of the proposed technique for temporal modeling.

Thank you. Questions?