1. Introduction
2. Spatial Attention
3. Selective Attention
4. Effortful versus Automatic Attention
5. Summary

1. Introduction
We are constantly bombarded with a massive amount of perceptual information, more than we can process at one time. Selecting relevant information from this stream and filtering out irrelevant information is critical to our ability to perform goal-directed behavior. If we filter out too much information (as may be the case in catatonia), then we fail to respond to important events. If we fail to filter out the irrelevant (as may be the case in attention deficit disorder), then we bounce around like a ball in a pinball machine, responding to everything. The cognitive operations that perform this selecting and filtering are known collectively as attention. Understanding the nature of attention and its neural bases is one of the enduring problems of cognitive psychology and cognitive neuroscience.

Attention is more than one cognitive operation and is supported by more than one neural system. For example, Posner postulates three separate attention systems performing different functions and supported by separate neural systems. In this model of attention there is a right frontal system that maintains vigilance, a posterior parietal system that orients attention in space, and an anterior cingulate system that is active in target detection.

An important distinction has been drawn between what is known as early attention and late attention. Early attention is thought to operate at a stage of perceptual processing when the internal representations are being formed. Late attention is thought to operate on fully formed perceptual representations. These attention operations can either enhance (select) or filter (reject) certain aspects of a stimulus or the entire stimulus representation. For example, in a selective attention paradigm subjects might be instructed to press a key to any red-colored stimulus, regardless of shape or location. Subjects in this task could ignore the shape and location aspects of the stimulus and attend only to the color channel.

2. Spatial Attention
Studies of selective attention have shown that spatial location plays a special role in attention. In one model of visual perception, developed by Treisman, spatial attention is required to create a complete visual percept. In this model, different aspects of a stimulus are stored in separate spatial (retinotopic) maps. Support for this model has come from visual search and illusory conjunction studies.

In a visual search study, subjects are asked to find a target in an array of stimuli. If the target is defined by a single feature such as color, then it does not matter how many distracters are in the array; the target “pops out.” However, if the target is defined by a conjunction of two or more features, e.g. color and shape (find the red “X” in a field of red and black “X”s and “O”s), then response time increases as a function of the number of distracters. Thus when subjects have to put together multiple features to form the perceptual representation, each item in the array must be scanned, one at a time, by moving spatial attention around the array.
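The contrast between parallel "pop-out" search and serial conjunction search can be sketched with a toy reaction-time model (a hypothetical illustration; the base time and per-item scan time below are made-up parameters, not fitted to any data):

```python
import random

def search_rt(set_size, conjunction, base_ms=450, scan_ms=50):
    """Toy model of visual search reaction time on a target-present trial.

    Feature search: the target pops out, so RT does not depend on set size.
    Conjunction search: items are scanned serially, one at a time, until
    the target is found. All parameters are illustrative.
    """
    if not conjunction:
        return base_ms  # parallel pop-out: flat across set sizes
    scanned = random.randint(1, set_size)  # scan position where target is found
    return base_ms + scan_ms * scanned

def mean_rt(set_size, conjunction, trials=5000):
    """Average simulated RT over many trials."""
    return sum(search_rt(set_size, conjunction) for _ in range(trials)) / trials
```

In this sketch, mean conjunction RT grows roughly linearly with set size (about scan_ms / 2 per added item on target-present trials), while feature-search RT stays flat, mirroring the pattern described above.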

In the illusory conjunction paradigm subjects are presented with shapes of different colors, for example a red X, a green O, and a yellow T. Subjects correctly report the colors and the shapes, but if presentation times are very short subjects will often put the features together incorrectly, e.g. report the presence of a yellow O. If the presentation time is too short for visual attention to orient to the items in the display, then the features do not get properly conjoined into fully formed percepts.

A second commonly used way to study how spatial attention works is to instruct or cue subjects to attend to specific locations in visual space prior to the presentation of a stimulus. Results from these studies indicate that stimuli appearing at attended locations receive preferential perceptual processing. In one such paradigm, the cued spatial orienting paradigm (also known as the "Posner paradigm" after its developer), subjects are presented with two boxes to the right and left of a central fixation point. In one form of the paradigm, one of the boxes brightens (the cue). A few hundred milliseconds later a target appears at one of the boxes. Subjects are to respond to the target without moving their eyes from the central fixation point. If the target appears at the box that brightened, the trial is called a "valid" trial. If the target appears at the other box (the one that did not brighten), the trial is called an "invalid" trial. Usually, most of the trials (about 80%) are valid trials, so that the cue is a good predictor of where the target will appear. Subjects are faster to respond to validly cued targets and slower to respond to invalidly cued targets than to targets on neutral or no-cue trials.
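The trial structure of such a cueing experiment can be sketched in a few lines (a hypothetical illustration; the function name is made up, and the 80/20 split simply mirrors the description above):

```python
import random

def make_trials(n_trials=200, p_valid=0.8, seed=0):
    """Generate an illustrative trial list for a cued spatial orienting task.

    On each trial one of two boxes brightens (the cue); the target then
    appears at the cued box (a valid trial, about 80% of trials) or at
    the other box (an invalid trial, about 20%).
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        cue = rng.choice(["left", "right"])
        valid = rng.random() < p_valid
        target = cue if valid else ("right" if cue == "left" else "left")
        trials.append({"cue": cue, "target": target, "valid": valid})
    return trials
```

Because the cue predicts the target location on most trials, subjects benefit from orienting attention to the cued box, and the reaction-time difference between valid and invalid trials measures the cost and benefit of that orienting.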

It is thought that in order to respond to the target, subjects must attend to it first. This would involve moving attention from the central fixation point to the location where the target appeared. In the cued orienting paradigm, the cue attracts attention even if the eyes stay fixed on the central fixation point. Thus in the valid trials reaction times are faster because the subject is already attending to the location at which the target appears. In the invalid trials the subject has been cued to the wrong location, and the reaction time is longer because the subject must move their attention from the incorrect to the correct location in order to process the target. By examining the performance of patients with brain damage in the orienting task, we have learned something about the neural systems that support the shifting of visual spatial attention.

2.1 Neural Systems of Spatial Attention

2.1.1 Lesion Studies
Using patients with brain damage and the cued spatial orienting paradigm, Posner has developed a three-stage model of spatial attention. He postulates that there are three separate operations in the orienting of attention to a location: (1) the subject must Disengage from wherever they are currently attending, (2) they must Move their attention to the new location, and (3) they must Engage their attention at the new location. Patients with damage to specific areas of the brain experience deficits in these specific operations. Damage to the posterior parietal cortex appears to impair the Disengage operation; damage to the superior colliculus impairs the Move operation; and damage to the lateral pulvinar nucleus of the thalamus impairs the Engage operation. Perhaps the best known result of damage to the spatial attention system is a syndrome known as neglect.

Patients who have damage to their posterior parietal lobes, particularly on the right side, sometimes exhibit a disorder known as contralateral hemifield neglect. Patients with neglect have normal visual acuity but are unable to properly attend to locations in visual space opposite to the lesioned side of the brain. The classic test for neglect is called "double simultaneous extinction." In this test the neurologist sits facing the patient and holds up both hands with the index fingers extended. The neurologist then wiggles one of his fingers and asks the patient to point to the wiggled finger. A patient with neglect will correctly point to either finger. However, if the neurologist wiggles BOTH fingers, the patient will report only the finger ipsilateral to the lesion; the patient will not detect the contralateral finger if the ipsilateral one is wiggling at the same time. It is as if the ipsilateral stimulation has captured the patient's attention.

Although the study of spatial attention in patients with brain damage has been informative, it is also important to study these neural systems in normal subjects. The most common method of studying the neural systems of attention in humans is by measuring the electrical fields generated by the brain as it functions.

2.1.2 ERP studies

The brain is an electrochemical engine, and as it operates it generates electrical fields. Some of these fields can be measured at the surface of the scalp as an oscillating waveform, the electro-encephalogram (EEG). If a single stimulus is presented to a subject while recording the EEG, little or no change is observed in the EEG. This is because most of the EEG reflects neural activity that has nothing to do with the presented stimulus. However, if a stimulus is presented multiple times and the EEG is recorded time-locked to the stimulus, then these multiple epochs of EEG can be averaged together, synchronized to the stimulus. All of the EEG activity that is random with respect to the stimulus will, because it is random, tend toward zero. In the limit, all that will remain is the brain's electrical activity, the field potentials, that is related to the stimulus event: the event-related potential or ERP.
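Signal averaging of this kind is straightforward to sketch. In the hypothetical demo below, a made-up waveform stands in for the stimulus-locked ERP, buried in much larger random background activity; averaging across trials shrinks the random part by roughly the square root of the number of trials:

```python
import numpy as np

def average_erp(epochs):
    """Average stimulus-locked EEG epochs.

    epochs: array of shape (n_trials, n_samples), each row one epoch of
    EEG recorded time-locked to a stimulus. Activity that is random with
    respect to the stimulus averages toward zero; the stimulus-locked
    ERP remains.
    """
    return epochs.mean(axis=0)

# Hypothetical demo: a fixed "ERP" plus large random background EEG
rng = np.random.default_rng(0)
n_trials, n_samples = 500, 256
t = np.arange(n_samples)
true_erp = np.sin(2 * np.pi * t / n_samples)          # illustrative signal
background = rng.normal(scale=5.0, size=(n_trials, n_samples))
epochs = true_erp + background
erp_estimate = average_erp(epochs)
# Residual noise in the average is about 5 / sqrt(500), versus 5 per trial
```

A single epoch is dominated by the background activity; the 500-trial average recovers the underlying waveform.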

Figure 1. ERP waveforms from the scalp over auditory cortical areas (superior temporal gyrus) in an auditory oddball design in which one tone is presented frequently (on 80% of the trials) and another tone rarely (on 20% of the trials). In some blocks, subjects passively listen to the tones; in other blocks subjects attend to and silently count the number of rare tones in the block. The waveforms are subtractions of the frequent-tone response from the rare-tone response. About 100 milliseconds after the stimulus, in both the active and passive conditions, there is a negative deflection in the waveform, the MisMatch Negativity (MMN). The MMN is equivalent in the passive and active conditions, indicating that it indexes an automatic detection of novelty in auditory cortex. This may relate to the exogenous orienting of attention to novel events. About 200 milliseconds after the stimulus, there is a second deflection, the N2b. The N2b is present only in the active condition, i.e. only when the subject must direct attention to the task-relevant stimulus. Thus the N2b may index the endogenous orienting of attention to task-relevant items.

The ERP is composed of different peaks occurring at different times relative to the stimulus. These components are named for their polarity (P for positive, N for negative) and either their latency or their position in the sequence of peaks. Thus the first positive peak in the visual ERP, occurring at about 100 ms after the stimulus, is called the P1 or the P100; the first negative peak, at about 180 ms, is called the N1 or N180, and so on. Early components are thought to index lower-level perceptual processes, while later components are thought to index higher-level cognitive operations. Thus attention effects which alter early ERP components are thought to index changes in the formation of the percept (early selection), while effects which alter late ERP components are thought to index changes in cognitive operations on the formed percept (late selection). ERPs have been used to investigate the orienting of spatial attention in the cued spatial orienting paradigm.

If ERPs are collected while presenting stimuli in the spatial cueing paradigm and the waveforms from the valid and invalid conditions are compared, the earliest components, the P1 and N1, are larger to the validly cued targets. These components are thought to index stages in the initial formation of the visual percept. For example, the P1 is located over the occipital scalp and is thought to result from neural activity in secondary visual cortex. Thus the enhanced P1 to stimuli appearing at an attended location may index an increased gain in the prestriate neurons which respond to the attended portion of retinotopic space. It appears from the ERP data that spatial attention functions by amplifying the responses of the neurons with receptive fields covering the attended portion of retinotopic space. This amplification appears to be specific to spatial attention; similar effects do not occur for attention to other visual features like color or shape.
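The gain account can be illustrated with a minimal sketch (a hypothetical model, not a fitted one): responses of neurons whose receptive fields fall at the attended location are simply multiplied by a gain factor greater than one.

```python
def apply_spatial_gain(rates, attended, gain=1.5):
    """Toy multiplicative-gain model of spatial attention.

    rates: firing rates of a population of visually responsive neurons.
    attended: booleans marking neurons whose receptive fields cover the
    attended location. The gain value of 1.5 is illustrative.
    """
    return [r * gain if a else r for r, a in zip(rates, attended)]

# Neurons at the attended location respond more strongly to the same input
rates = [10.0, 20.0, 30.0]
boosted = apply_spatial_gain(rates, [True, False, True])
```

In this sketch the same stimulus evokes a larger response only from neurons covering the attended location, which is one way an enhanced P1 to validly cued targets could arise.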

3. Selective Attention
In selective attention tasks, subjects are told to respond to a specific stimulus feature while ignoring other features (e.g. color: respond to all red stimuli regardless of shape or location). In these tasks, when some feature other than location defines the target, these very early enhancements are absent: the P1 and N1 occur, but they are the same amplitude to both targets and non-targets. Only attention to spatial location produces the early gain enhancement. Thus non-spatial selective attention seems to be a late selection process. There are ERP effects to targets defined by non-location features, starting between 200 and 300 ms after the stimulus. These effects consist of negativities over the posterior brain, the N2 and the selection negativity (SN), and a positive deflection over the frontal brain, the frontal selection positivity (FSP). The exact latency of these components depends upon the attended feature, as does the topography of the posterior SN. The differential topography of the SN may reflect activity in feature-specific perceptual areas in the posterior brain that are enhanced by attention. This attentional enhancement may be regulated by neural systems in the frontal cortex, indexed by the FSP. These ideas are supported by single unit recordings in the alert monkey and by neuroimaging studies in humans.

Single unit recording studies in the monkey have shown changes in the firing of visual perceptual neurons due to attention. Neurons in the inferior temporal (IT) lobes which code the features of a stimulus show enhanced firing if those features are a target in the monkey's task, that is, if those features are task-relevant. This task-relevant enhancement may be the source of the selection negativity or N2, appears to require input from the frontal lobes, and may be the neural basis of selective attention.

Neuroimaging studies in humans have supported the idea of combined activity in frontal cortical areas and posterior perceptual areas in selective attention. Studies using PET and fMRI with selective attention tasks have shown activation in multiple areas of frontal cortex, including dorsolateral prefrontal cortex, orbito-frontal cortex, premotor cortex, the frontal eye fields, and the anterior cingulate. These seem to be areas concerned with assessing the task relevance of stimuli and with sequencing and regulating behavioral output. These studies have also found activity in posterior areas of perceptual representation, e.g. the human analog of the monkey color area V4 in tasks requiring selection by color.

Figure 2. Source model of the ERP at about 300 milliseconds after stimulus onset in a visual selective attention task. Subjects must attend to the shapes of objects while ignoring their location. Dipoles locate bilaterally to inferior temporal cortex, which processes information about the visual features of objects, and prefrontal cortex, which may be involved in directing attention to specific perceptual representations.

Figure 3. fMRI activations from a visual selective attention study in which the subject had to attend to some visual objects while ignoring others. There were activations in right and left parietal cortex, which may play a role in integrating visual information, and in prefrontal cortex, which may participate in networks that direct attention to perceptual representations.

4. Effortful versus Automatic Attention
A distinction has also been drawn between effortful attention (internally directed) and attention that is captured by an external event (e.g. the orienting reflex). One property of a stimulus that can capture attention is its relative frequency. Novel environmental events attract attention -- we orient to novelty. This makes adaptive sense -- something changing in the environment might be something good to eat, or it might be something trying to eat me; in either case, I'd better pay attention to it. The brain has evolved perceptual systems to detect novelty. The operation of these systems can be seen in single unit recordings in the monkey. Inferior temporal (IT) neurons which code the features of a stimulus will show enhanced firing on repeated presentations of that stimulus. However, the majority of IT neurons do not code the features of that stimulus, and those neurons show suppressed firing on repeated presentations. Thus the neural representation of the stimulus may become "sharpened" by repeated presentations. Then, when a novel stimulus is presented, there is an overall increase in firing as the representation of the presented stimulus fails to conform with the existing state of the IT neurons. This type of neural computation can account for both habituation effects, where the animal stops attending to repeated stimulation (the way we can feel our watch when we first put it on in the morning but soon forget that it's there), and the orienting of attention to novel events.

The early neural detection of novelty can also be measured in human subjects with the ERP. If a series of tones of two types is presented, one type frequently and the other infrequently, there is a negative deflection in the ERP (known as the mismatch negativity or MMN) to the infrequent tone. This deflection can occur as early as 100 ms after the stimulus. Thus the brain can detect novelty and orient attention at a very early stage in perceptual processing.

5. Summary
Attention comprises multiple cognitive operations subserved by multiple neural systems. Attention can operate on the early stages of percept formation or later on the fully formed internal representation of an external stimulus. Attention can be drawn to objects or events or can be directed under voluntary control. In visual attention, spatial location has a special role, operating at an early stage of perceptual processing and necessary to create a complete object representation. Some attention operations take place in the same cortical areas responsible for perception; some require input from other areas of the brain, particularly the frontal lobes. Understanding the neural bases of attention will allow us to better understand how we organize our behavior to best meet the demands of current environmental challenges.