The Selective Tuning Model of Visual Attention

Studying Visual Attention with the Visual Search Paradigm

Marc Pomplun
Department of Computer Science
University of Massachusetts at Boston
E-mail: [email protected]
Homepage: http://www.cs.umb.edu/~marc/

Overview:

- The Feature Integration Theory
- Visual Search
- The Guided Search Theory
- The Area Activation Model

The Binding Problem

Different features of the visual scene are coded by separate systems, e.g., direction of motion, location, color, and orientation.

How do we know this?

- Anatomical and neurophysiological evidence
- Brain imaging (fMRI and PET)

So how do we experience a coherent world?

Feature Integration Theory (Treisman et al.)

Attention is used to bind features together:

- Code one object at a time on the basis of its location.
- Bind together whatever features are attended at that location.

Sensory features (color, size, orientation, etc.) are coded in parallel by specialized modules. These modules form two kinds of maps:

- Feature maps (e.g., color maps, orientation maps)
- A master map of locations
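The two kinds of maps and the binding step can be illustrated with a small Python sketch. It is only a toy illustration under stated assumptions, not an implementation taken from the FIT literature: the scene contents and the names `build_feature_maps`, `feature_present`, and `bind_at` are made up for this example.

```python
# Toy scene (made up for illustration): each object has a location and
# one value per feature dimension.
scene = [
    {"location": (0, 0), "color": "red", "orientation": "vertical"},
    {"location": (2, 1), "color": "green", "orientation": "horizontal"},
    {"location": (1, 3), "color": "green", "orientation": "vertical"},
]


def build_feature_maps(objects):
    """One map per feature value: the set of locations where it is active."""
    maps = {}
    for obj in objects:
        for dimension, value in obj.items():
            if dimension == "location":
                continue
            maps.setdefault((dimension, value), set()).add(obj["location"])
    return maps


def feature_present(maps, dimension, value):
    """Query without attention: is this feature active anywhere in the field?"""
    return bool(maps.get((dimension, value)))


def bind_at(maps, location):
    """Attention step: collect every feature that is active at one location."""
    return {dim: val for (dim, val), locs in maps.items() if location in locs}


feature_maps = build_feature_maps(scene)
# Master map of locations: every position that holds some feature.
master_map = {obj["location"] for obj in scene}

print("red present:", feature_present(feature_maps, "color", "red"))
for location in sorted(master_map):
    print(location, bind_at(feature_maps, location))
```

In this sketch, asking whether "red" is present needs no location at all, whereas finding out which orientation goes with a particular green item requires selecting that item's location first.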

Feature maps contain two kinds of information:
- the presence of a feature anywhere in the field ("there's something red out there")
- implicit spatial information about the feature

Activity in the feature maps can tell us which features are contained in the visual scene. It cannot tell us, for example, which other features a green blob has. The master map codes the location of features.

The basic idea of the FIT is that visual attention is used for

- locating features
- binding appropriate features together

There are two stages of object perception:

- Preattentive stage: Individual features are extracted in parallel across the whole visual scene.
- Attentive stage: When attention is directed to a location, the local features are combined to form a whole.

Attention moves within the location map. The focus of attention selects whatever features are linked to that location, and features of other objects are excluded. Attended features are then entered into the current temporary object representation.

Empirical evidence for the FIT has been obtained through

- visual search tasks
- illusory conjunctions

We will focus on the paradigm of visual search.

Visual Search

Feature Search

Is there a red T in the display?

[Example display not reproduced: an array of Ts.]

- The target is defined by a single feature.
- According to FIT, this should not demand attention.
- The target should pop out.

Conjunction Search

Is there a red T in the display?

[Example display not reproduced: an array of Ts and Xs.]

- The target is now defined by its shape and color.
- This involves binding features and so should demand attention.
- Each item must be attended in turn until the target is found.
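The distinction between the two search types can be stated as a simple rule: a search is a feature search if a single dimension is enough to separate the target from every distractor, and a conjunction search otherwise. The Python sketch below is one way to write that rule down. The function name `search_type` is hypothetical, and the distractor colors are assumptions, since the example displays are not available here.

```python
def search_type(target, distractors):
    """Return 'feature' if one dimension alone separates the target from
    every distractor, otherwise 'conjunction'."""
    for dimension, value in target.items():
        if all(d[dimension] != value for d in distractors):
            return "feature"
    return "conjunction"


# Item definitions; the distractor colors are assumed for illustration.
red_t = {"color": "red", "shape": "T"}
black_t = {"color": "black", "shape": "T"}
red_x = {"color": "red", "shape": "X"}

# A red T among black Ts: color alone identifies the target.
print(search_type(red_t, [black_t] * 10))

# A red T among black Ts and red Xs: neither color nor shape alone is enough.
print(search_type(red_t, [black_t] * 5 + [red_x] * 5))
```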

Feature Search

Changing the number of distractors:

[Example displays not reproduced: arrays of Ts with increasing numbers of distractors.]

Conjunction Search

Changing the number of distractors:

[Example displays not reproduced: arrays of Ts and Xs with increasing numbers of distractors.]

Visual Search Experiments

- Record the time taken to determine whether the target is present or not.
- Vary the number of distractors.
- Search time for feature targets should be independent of the number of distractors.
- Conjunction search should get slower with more distractors.
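These two predictions can be made concrete with a small Python sketch. It assumes the standard serial, self-terminating reading of conjunction search: with n items inspected in random order, (n + 1) / 2 items are checked on average when the target is present, and all n when it is absent. The baseline time `base_ms` and the per-item cost `ms_per_item` are made-up values, not measurements, and the function names are hypothetical.

```python
def predicted_rt(display_size, search, target_present=True,
                 base_ms=450.0, ms_per_item=50.0):
    """Predicted reaction time in ms (base_ms and ms_per_item are made up)."""
    if search == "feature":
        return base_ms  # parallel stage: no cost per additional item
    if search != "conjunction":
        raise ValueError("search must be 'feature' or 'conjunction'")
    # Serial, self-terminating search: average number of items inspected.
    inspected = (display_size + 1) / 2 if target_present else display_size
    return base_ms + ms_per_item * inspected


def slope(sizes, rts):
    """Least-squares slope of reaction time against display size (ms/item)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(rts) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, rts))
    denominator = sum((x - mean_x) ** 2 for x in sizes)
    return numerator / denominator


display_sizes = [1, 5, 15, 30]
feature_rts = [predicted_rt(n, "feature") for n in display_sizes]
present_rts = [predicted_rt(n, "conjunction") for n in display_sizes]
absent_rts = [predicted_rt(n, "conjunction", target_present=False)
              for n in display_sizes]

print("feature slope:", slope(display_sizes, feature_rts))
print("conjunction slope, target present:", slope(display_sizes, present_rts))
print("conjunction slope, target absent:", slope(display_sizes, absent_rts))
```

Under these assumptions the feature-search slope is zero, and the target-absent slope for conjunction search is twice the target-present slope.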

Visual Search

- Feature targets pop out: flat display size function.
- Conjunction targets demand serial search: significant slope.

[Plot not reproduced: one curve each for Feature Target and Conjunction Target as a function of Display Size (1, 5, 15, 30); vertical axis ticks from 0 to 3000.]

Problem with FIT: Pop-Out of Conjunction Targets

A moving X pops out of a display of moving Os and static Xs.

[Example display not reproduced: an array of Os and Xs.]

- The target is defined by a conjunction of movement and form.
- At least some conjunctions do not require focal attention.

Guided Search Theory

The Guided Search Theory (GST) is similar to the FIT in that it also assumes two successive stages of visual search performance:

- a preattentive, parallel stage
- an attentive, serial stage

However, the main difference from FIT is that GST assumes that the preattentive stage yields spatial saliency information, which is used to guide attention in the serial stage.

According to GST, saliency is encoded in an additional map, called the saliency map. The saliency map is created during the preattentive stage and can combine multiple features if necessary. In the subsequent serial search process, attention is first directed to the highest peak in the saliency map, then to the second-highest, and so on. This visual guidance allows efficient search even for some conjunction targets.

Support for the GST comes from eye-movement research. Eye-movement recording allows researchers to determine the items that a subject looks at during visual search.

[Figure slide not reproduced.]

In the previous example (figure not reproduced here), 80% of fixations were closest to an item sharing color with the target, and 20% of fixations were closest to an item sharing orientation with the target. It seems that the color dimension is guiding the subject's visual search process.

Of course, due to the imprecision of eye movements and their measurement, better statistics are necessary to determine the guiding dimension.
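The guidance mechanism that GST proposes (a saliency map that combines feature dimensions, with attention visiting its peaks in descending order) can be sketched in Python as follows. This is a deliberately simplified illustration, not a published Guided Search implementation: it scores items only by their similarity to the target and ignores bottom-up activation, and the item set, the dimension weights, and the function names are all assumptions.

```python
def saliency_map(items, target, weights):
    """Score each item by the weighted number of features shared with the target."""
    return {
        name: sum(weights[dim] for dim, value in target.items()
                  if features[dim] == value)
        for name, features in items.items()
    }


def attention_order(saliency):
    """Visit items from the highest saliency peak downwards (ties by name)."""
    return sorted(saliency, key=lambda name: (-saliency[name], name))


# Hypothetical display: the target is the red vertical item.
target = {"color": "red", "orientation": "vertical"}
items = {
    "item_a": {"color": "red", "orientation": "horizontal"},
    "item_b": {"color": "green", "orientation": "vertical"},
    "item_c": {"color": "red", "orientation": "vertical"},
    "item_d": {"color": "green", "orientation": "horizontal"},
}
# Assumed weights: color guides more strongly than orientation.
weights = {"color": 2.0, "orientation": 1.0}

saliency = saliency_map(items, target, weights)
print(saliency)
print(attention_order(saliency))
```

With these weights the target is visited first, followed by the item that shares its color, which is the pattern a color-guided search would produce.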

In visual search tasks, subjects are usually guided by one target feature or a combination of target features. This supports the idea of GST that preattentively derived information from multiple dimensions guides and thereby facilitates the subsequent serial search process.

There are two problems with GST:

- According to GST, grouping the guiding distractors should result in reduced guidance (less bottom-up activation). However, the opposite happens.
- There is no quantitative implementation of a Guided Search model that could predict guidance, i.e., saccadic selectivity, for a given search task.

To overcome these problems, we proposed the Area Activation Model of saccadic selectivity in visual search tasks.

Area Activation

Assumptions:

- Processing resources during a fixation are distributed like a two-dimensional Gaussian function centered at fixation.
- Fixation positions are chosen to allow a maximum of information processing according to the assumed processing resources.
- Scan paths are chosen in such a way that they connect the optimal fixation positions with minimal eye-movement cost (path length).

[Figures not reproduced: Area Activation - Strong Guidance (two slides).]
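The three Area Activation assumptions can be illustrated with a short Python sketch. It is a rough approximation under stated assumptions, not the model's actual implementation: the item coordinates, the width `sigma`, the candidate grid, and all function names are made up, and the shortest scan path is found by brute force, which is only practical for a handful of fixations.

```python
import itertools
import math


def gaussian_weight(fixation, item, sigma):
    """Share of processing resources an item receives from a given fixation."""
    dx = fixation[0] - item[0]
    dy = fixation[1] - item[1]
    return math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))


def processed_information(fixation, items, sigma):
    """Total resources falling on the (guiding) items from one fixation."""
    return sum(gaussian_weight(fixation, item, sigma) for item in items)


def best_fixation(candidates, items, sigma):
    """Candidate position that allows a maximum of information processing."""
    return max(candidates, key=lambda c: processed_information(c, items, sigma))


def shortest_scan_path(start, fixations):
    """Visiting order with minimal total path length (brute force)."""
    def length(order):
        points = [start, *order]
        return sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return min(itertools.permutations(fixations), key=length)


# Hypothetical display: three clustered items and one isolated item.
items = [(1, 1), (2, 1), (1, 2), (8, 8)]
grid = [(x, y) for x in range(10) for y in range(10)]

print(best_fixation(grid, items, sigma=1.5))
print(shortest_scan_path((0, 0), [(8, 8), (1, 1), (5, 5)]))
```

In this toy display the best single fixation falls inside the cluster of three items rather than on the isolated one, because a fixation there covers more items with its Gaussian resource distribution.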

[Figures not reproduced: Area Activation - Weak Guidance (two slides); Area Activation - Empirical Results.]

Problems with the Area Activation Model:

- The empirical number of fixations per trial needs to be known in advance.
- Only very basic factors influencing visual search have been implemented so far.

Nevertheless, Area Activation can be considered a very first step towards a quantitative model of visual search.

Conclusions

We have discussed how the visual search paradigm can be employed to investigate the mechanisms of visual attention. Various models of attention have been developed and evaluated with visual search tasks; in more recent studies, this was done based on eye-movement data.

In the next lecture, we will look at slightly different paradigms, which are aimed at identifying factors that determine visual scan paths. See you then!
