, . . . http://cvsp.cs.ntua.gr . , .

, . . . http://cvsp.cs.ntua.gr . , . . CVSP --

. () 3 7 . + 2-5 . + . : . , . , / ( )

- & / & : http://cvsp.cs.ntua.gr

(McGurk & MacDonald) () : / -

/ , :

- (King et al., Deng) : N (Articulatory Gestures, Browman & Goldstein) ... (.. Bell, 1867))

G. Papandreou, A. Katsamanis, V. Pitsikalis, and P. Maragos, Adaptive Multimodal Fusion by Uncertainty Compensation with Application to Audio-Visual Speech Recognition, IEEE Trans. ASLP, 2009

() :

1 2 :

System Overview
- Image Acquisition: Firewire color camera, 640x480 @25 fps
- Face detector: Adaboost-based, @5 fps, (Re)initialization
- Face tracking & feature extraction: Real-time AAM fitting algorithms, GPU-accelerated processing, OpenGL implementation
- HMM-based backend: Transcription

; (.. ) , , , ...

: , , (Knill & Richards) (.. Ernst et al.) // Maragos et al., Cross-Modal Integration, Springer 2008 : :

: Wiener Kalman ; : :

SNR = 20 dB, SNR = 5 dB : (Gaussian Mixture Model - GMM)

. : : C X : C X

Y !

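GMM, C → X:

$$p(c \mid x_{1:S}) \propto p(c) \prod_{s=1}^{S} \mathcal{N}\left(x_s;\ \mu_{s,c},\ \Sigma_{s,c}\right)$$

X → Y:

$$p(y_s \mid x_s) = \mathcal{N}\left(y_s;\ x_s + \mu_{e,s},\ \Sigma_{e,s}\right)$$

GMM, C → X → Y:

$$p(c \mid y_{1:S}) \propto p(c) \prod_{s=1}^{S} \sum_{m=1}^{M_{s,c}} \rho_{s,c,m}\ \mathcal{N}\left(y_s;\ \mu_{s,c,m} + \mu_{e,s},\ \Sigma_{s,c,m} + \Sigma_{e,s}\right)$$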
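A minimal sketch of how the class posterior on noisy features might be evaluated: each GMM component has its mean shifted by the measurement-error mean and its covariance increased by the measurement-error covariance. Diagonal covariances, the function names, and the data layout (`models`, `noise`) are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

def log_gauss_diag(y, mean, var):
    """Log-density of a Gaussian with diagonal covariance diag(var)."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (y - mean) ** 2 / var)

def class_log_posteriors(y_streams, priors, models, noise):
    """
    y_streams : list of S noisy feature vectors y_s
    priors    : array of class priors p(c)
    models    : models[c][s] = (rho, mu, var) with shapes (M,), (M, D_s), (M, D_s)
    noise     : noise[s] = (mu_e, var_e), measurement-error mean and variance
    Returns the normalized log-posteriors log p(c | y_1:S).
    """
    scores = np.log(np.asarray(priors, dtype=float))
    for c in range(len(priors)):
        for s, y in enumerate(y_streams):
            rho, mu, var = models[c][s]
            mu_e, var_e = noise[s]
            # Uncertainty compensation: shift the mean, inflate the variance.
            comp = np.array([
                np.log(rho[m]) + log_gauss_diag(y, mu[m] + mu_e, var[m] + var_e)
                for m in range(len(rho))
            ])
            scores[c] += np.logaddexp.reduce(comp)
    return scores - np.logaddexp.reduce(scores)

# Toy example (hypothetical numbers): two classes, one 1-D stream, one component each.
priors = np.array([0.5, 0.5])
models = [
    [(np.array([1.0]), np.array([[0.0]]), np.array([[1.0]]))],  # class 0
    [(np.array([1.0]), np.array([[2.0]]), np.array([[1.0]]))],  # class 1
]
y = [np.array([0.0])]
reliable = np.exp(class_log_posteriors(y, priors, models, [(np.zeros(1), np.zeros(1))]))
unreliable = np.exp(class_log_posteriors(y, priors, models, [(np.zeros(1), 10.0 * np.ones(1))]))
```

In this toy case the posterior of class 0 is sharper with a reliable measurement than with an unreliable one, where it moves toward the prior.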

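Stream weights w_s:

$$b(c \mid y_{1:S}) = p(c) \prod_{s=1}^{S} p(y_s \mid c)^{w_s}$$

$$p(c \mid y_{1:S}) \propto p(c) \prod_{s=1}^{S} \mathcal{N}\left(y_s;\ \mu_{s,c},\ \Sigma_{s,c} + \Sigma_{e,s}\right)$$

PoG:

$$\mathcal{N}(x;\ \mu,\ \Sigma)^{w} \propto \mathcal{N}\left(x;\ \mu,\ w^{-1}\Sigma\right)$$

$$b(c \mid y_{1:S}) \propto p(c) \prod_{s=1}^{S} \mathcal{N}\left(y_s;\ \mu_{s,c},\ w_{s,c}^{-1}\Sigma_{s,c}\right), \qquad w_{s,c} = \left(1 + \frac{\Sigma_{e,s}}{\Sigma_{s,c}}\right)^{-1}$$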
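The link between uncertainty compensation and stream weights can be checked numerically. The sketch below assumes scalar variances per stream and class; the function name and the numbers are illustrative only. It computes the effective weight w = (1 + var_e / var_model)^(-1) and confirms that dividing the model variance by w gives the compensated variance var_model + var_e.

```python
import numpy as np

def effective_stream_weight(var_model, var_noise):
    """Effective weight of a stream, assuming scalar variances (an assumption of this sketch)."""
    return 1.0 / (1.0 + var_noise / var_model)

var_model = 1.0                                # hypothetical class-conditional variance
var_noise = np.array([0.0, 1.0, 3.0])          # increasing measurement uncertainty
weights = effective_stream_weight(var_model, var_noise)

# Scaling the model variance by 1/w reproduces the compensated variance.
scaled_variance = var_model / weights
compensated_variance = var_model + var_noise
```

With these numbers the weights are 1, 0.5 and 0.25: the less reliable the stream measurement, the smaller its contribution to the fused decision.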

EM

C → X:

$$Q(\theta, \theta') = E\left[\log p(X, \{C\} \mid \theta) \mid X, \theta'\right]$$

C → X → Y:

$$Q(\theta, \theta') = E\left[\log p(Y, \{X, C\} \mid \theta) \mid Y, \theta'\right]$$

Markov () & Viterbi () - () ( frame)

[Figure: graphical models over frames. First model: states C1-C4 with features X1-X4. Second model: states C1-C4, features X1-X4 and observations Y1-Y4.]

Mel Frequency Cepstral Coefficients (MFCCs): Pre-emphasis -> STFT -> | . | -> Mel-scale -> log( . ) -> DCT -> MFCC

(e.g. SPLICE, ALGONQUIN)

(VTS) X_noisy = f(X_clean, N)
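The MFCC chain listed on this slide can be sketched in a few lines of NumPy. All parameter values below (16 kHz sampling rate, 25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients, pre-emphasis factor 0.97) are common defaults chosen for illustration, not values taken from the slides, and the helper names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced uniformly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_ceps=13, preemph=0.97):
    # 1. Pre-emphasis
    x = np.append(signal[0], signal[1:] - preemph * signal[:-1])
    # 2. STFT: overlapping frames, Hamming window, FFT
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(frame_len)
    # 3. | . |: squared magnitude spectrum
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 4. Mel-scale filterbank energies
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sr).T
    # 5. log( . ), with a small floor to avoid log(0)
    log_mel = np.log(mel_energy + 1e-10)
    # 6. DCT-II (unnormalized), keeping the first n_ceps coefficients
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return log_mel @ basis.T

# One second of a 440 Hz tone as a stand-in for speech.
t = np.arange(16000) / 16000.0
feats = mfcc(np.sin(2 * np.pi * 440.0 * t))
```

Each row of `feats` is the cepstral vector of one frame. Front ends used in practice typically add liftering, energy and delta features, which this sketch leaves out.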

+ X clean X E Deng, Droppo, Acero, IEEE Tr. SAP, 2005 - 1 2 3

C1 C2 C3 X1 X2 X3 Multistream-

Product- : Asynchronous-HMM, Coupled-HMM, Dynamic Bayesian Networks, CUAVE

. : CUAVE: 36 (30 , 6 ) 5 10 : 1500 (30x5x10) : 300 (6x5x10) babble - NOISEX HMMs (- , 8 , 1 /, ) HTK (

) AV A

AV-W-UC vs. A-UC: 28.7 %

AV-UC vs. AV, AV-W-UC vs. AV-W: 20 %

Product-HMM vs. Multistream-HMM: 1.2 %

: MUSCLE (NoE) & HIWIRE (STREP) - A. Katsamanis, G. Papandreou, and P. Maragos, Face Active Appearance Modeling and Speech Acoustic Information to Recover Articulation, IEEE Trans. ASLP, 2009 -

: : () : , , MOCHA CSTR, Univ. Edinburgh

(, 1 /1 ), 460 TIMIT (2- 9 ) 30 - phoneme

37 y, x : prior : Yehia, Rubin & Vatikiotis-Bateson, Speech Comm., 1998

CCA . (CCA) CCA : : . : 40

Viterbi Markov -> Hiroya & Honda, IEEE TSAP 2004 : /

: . : HMM / MS-HMM: () : / . : Visemes ( ) ( ) MOCHA

- ( ) (//) : :

: . 51 Katsamanis et al. EUSIPCO 2008 / CVSP (. )

: X-rays, (. . ) Audiovisual Speech Inversion Articulatory Parameter Extraction Articulatory Speech Synthesis Articulatory Model

Training - : : ()

: , , : ASPI (FET) & ()

!

: http://cvsp.cs.ntua.gr
