BEING ABOUT  Chapter 4. What, where and how

Spatial aboutness

Functional and physiological contrasts

Characteristics of the ventral stream

Characteristics of the dorsal stream

Reach and grasp

Imagining a structure sense


Organism perspective ... is essential for the preparation of acts Damasio 1999, 147-8

Spatial aboutness

Spatial directedness is the core of organism aboutness. A living thing is most thoroughly about -- keyed, tuned, ready, structurally adapted -- when it is dynamically engaged with located things.

When sensory surfaces (or whole bodies) are oriented to an object at a location, that centeredness has immediate consequences for the organization of a network of foci distributed throughout the cortex. As described in the last chapter, subnets involved in segregating objects at visual center will have several kinds of advantage: there is facilitated early response to whatever is discovered at the center of an attentional field, and in primates with foveas there is sensory magnification for the center of the gaze. Visual tracking maintains an object at foveal center and so maintains subnet formation in relation to that object long enough to refine and accumulate structure. These characteristics of a perceptual axis are important also to action.

Different cortical substreams coordinate subtypes of spatial action. In this chapter, I will describe an intermediate stage of several of these kinds of spatial action. I have chosen to concentrate on this intermediate stage of spatial action because its distributed organization in the covariant matrices of parietal cortex makes it the clearest possible example of the difference a wide net vision makes to our understanding of spatial aboutness.

The parietal lobe is a region where vision, audition, touch, and kinetic sensation work separately or together to prepare, monitor, and coordinate many kinds of act. Parietal cortex has therefore been thought of as the probable locus of the representation of space. Understanding the parietal even a little corrects the idea that there could be such a thing. We are spatially formed and spatially located bodies, and we perceive and act by relating to located things in spatially organized locales. Action and perception together are spatial function by being spatially related function. There are such things as representations of space -- maps, diagrams, drawings, mathematical descriptions -- but a creature that is able to orient, perceive and act does not need internal versions of these human artifacts in addition to the sensor-effector connections that integrate spatial function.

Functional and physiological contrasts

The general idea of the "what"/"where" distinction is important; philosophers ought to pay more attention to it than they have. Hurley 1998, 180

Binary contrasts are common at early stages in the description of a domain. A number of dichotomies have been proposed for functions and areas discussed in this chapter. Early contrasts were based on differences in kinds of damage found with lesions in human post-primary association areas. Later contrasts have been based on experimental work with non-human primates. The most recent versions are supported by functional imaging studies of humans.

Dichotomous classes that have been suggested have included

different terms that can be equally well defended: among these are evaluating-orienting (Ingle, 1967), what-where (Schneider, 1967), focal-ambient (Trevarthen, 1968), examining-noticing (Weiskrantz, 1972), figural-spatial (Breitmeyer and Ganz, 1976), foveal-ambient (Stone, Dreher and Levanthal, 1979), and object-spatial (Mishkin, Ungerleider, and Macko, 1983). Maunsell and Ferrera 1996, 458

As is usual when a topic is dichotomized in many ways, there have been tendencies to align the different contrasts (Livingstone and Hubel 1989, for example) and search for a single underlying principle of difference. It hasn't been possible to find such a principle, because the various contrasts have been found to cut across one another.

Where the contrast has been primarily vision-based, theorists have spoken in terms of a focal-peripheral or foveal-ambient contrast, or a color vision versus motion vision contrast. Visual function differences have subsequently been found to result from partly segregated vision subsystems with different sensitivities. (There are similar subsystem differences within other sensory modes too: bat CF-CF and DSCF regions are examples of biosonar subsystems.)

Other theorists have posited a more general what-where contrast between perceiving, remembering and imagining objects, and perceiving, remembering and imagining locations (Ungerleider and Mishkin 1982, Ungerleider and Haxby 1994). For this group, what areas, working on object recognition and memory, are concerned with intrinsic object properties that continue to be present when the object is at different distances or orientations, or is detected with different sense modalities. Where areas would be concerned with extrinsic and temporary locational characteristics.

A third group (Goodale and Milner 1992, 1993, Goodale et al 1994) makes this distinction a perception-action, or a what-how contrast, reasoning that locational defects are really action deficits rather than perception defects. Tasks Goodale and Milner think of as perceptual include recognition, naming, and same/different judgments made on the basis of shape -- skills whose loss is a visual form agnosia. Action tasks include reaching for, pointing at, grasping, and manipulating objects. Loss of these skills is considered an ataxia rather than an agnosia.

The anatomical basis for these functional contrasts is being described in several ways. Modular theorists tend to think of both what-where and what-how distinctions in terms of the two cortical lobes whose lesion damage doubly dissociates, that is, the temporal and parietal lobes. A less modular version in terms of cortical through-function speaks instead about ventral and dorsal streams.

Our understanding of structure and function in the cortical areas under discussion is changing very rapidly; both data and interpretation are under revision. Given new functional imaging work and recent physiology, two sets of differences seem to continue to be important. They are differences between magnocellular and parvocellular vision subsystems, and dorsal and ventral sensory-motor through-streams. The two are different but related contrasts, with partial overlap: the dorsal stream is mainly magno-derived, but magno- as well as parvo-derived units have recently been found in the ventral stream.

Two visual systems

Magnocellular and parvocellular visual subsystems originate at retinal ganglion cells with different response characteristics, are segregated at thalamic nuclei, and include widely separated regions of different cortical lobes. They are named for the cell types at their origins -- magno cells are larger than parvo cells. Since the parvocellular system has foveal magnification and the magnocellular system does not, the parvocellular and magnocellular systems are sometimes also called focal and peripheral systems.

The magno system is thought to be much older than the parvo system and possibly homologous to the entire visual system of nonprimate mammals (Livingstone and Hubel 1988, 748). The parvo system is approximately ten times more massive than the magno system and evolutionarily so new that it is well developed only in primates. It adds an ability to see in more detail and in more kinds of detail.

Magnocellular neurons are fast, they are insensitive to wavelength differences but very sensitive to luminance differences, and their spatial resolution is relatively coarse. Parvocellular neurons are slow, they are wavelength-sensitive but relatively insensitive to brightness differences, and they have fine spatial resolution -- they respond to light from small regions of the visual field.

Both streams are tied into both rods and cones in the retina. Parvo structures propagate the wavelength-sensitivities of the cones, while magno structures, very generally speaking, sum them.

The two systems are segregated in the lateral geniculate nucleus of the thalamus, which they pass through on the way to visual cortex; blocking either stream at that point has clearly different effects. In normal vision, the magno stream into the parietal and the magno/parvo stream into the temporal function in parallel, but there are reciprocal connections at various junctures along their diverging paths.

The evolution of the parvocellular system in the context of an existing magnocellular system accounts for certain structural features of the visual system as a whole. In primary visual cortex, new through-lines from parvo areas of the thalamus are bundled as they pass through older magno regions responding to the same retinal locations: patches of cortex that are parvocellular in origin are visible blobs interspersed among magno-based structures. Further into the parvo stream there is extensive in-fill of new association cortex integrating parvo capabilities with earlier function.

Two cortical streams

The other currently relevant contrast -- with the magno/parvo contrast -- is the contrast between two broad streams from primary sensory cortex to frontal cortex, one passing through the parietal lobe and the other passing through the temporal lobe. The two streams are called the dorsal and ventral streams, for their positions, one at the back and one at the belly of the cortex. The two routes remain separate all the way to frontal cortex.

[4-1 Two visual streams in macaque]

The existence of two streams has recently been confirmed by correlation analyses of PET data for vision experiments. Combined with what is known about axonal connectivity, cluster analysis confirms an overall dorsal/ventral segregation, as well as a posterior/anterior axis (Young 1992, 154).

[4-2 Cluster analysis]

Within this general topology of visual function, object recognition tasks do preferentially activate a fairly loose complex of ventral areas, and dot location tasks preferentially activate a dorsal complex (Haxby et al 1991b). Temporal damage results in inabilities to recognize or name things, or to tell objects apart on the basis of perceived form. Parietal damage results in spatial action inabilities.

Since the same object may appear in many places, and many kinds of objects may appear in the same location, it makes sense to think that object and location response are partially decoupled. Response to spatial characteristics of object form that change when it or the perceiver moves should also be partially decoupled from recognition-relevant characteristics constant across orientation and distance: both matter to action; the constant characteristics matter to what is done, the changing characteristics to how muscles must alter to do it.

The functional basis for segregation is clear, but the notion of two visual streams is nonetheless approximate. Different object-recognition and location tasks will set up through-nets that are somewhat different from each other even when they involve the same nodes. More important, task dissociations set up for experimental purposes do not exist in normal spatial function. We see locations by looking at objects, and we see objects by orienting to locations. Our muscular mode depends on what something is as well as how it is oriented in the moment; we do not grasp a wasp and a raspberry in the same way.

Characteristics of the ventral stream

The ventral stream and its temporal lobe matrices will come up later in chapters concerned with simulation and language. Since the emphasis in this chapter is on active spatial engagement, I want at this point only to give a general sense of ventral stream organization and function.

The ventral stream has both magnocellular and parvocellular roots, but its character is dominated by the latter. Ventral stream visual characteristics resulting from the evolution of parvo structures include at least four kinds of added resolution:

Temporal resolution. Parvocellular neurons are relatively slow but, unlike magnocellular neurons, they sustain response so that structure downstream from them can stabilize over the entire period of a fixation.

Spatial resolution. Because there are 8 times more parvocellular than magnocellular retinal ganglion cells in the foveal area, and because their dendritic arbours are relatively restricted so that these cells have relatively small receptive fields, parvocellular projections show spatial resolution 4 times finer than the resolution found in magnocellular projections.

Spectral resolution. Like magno cells, cells in the parvo system propagate response to illumination contrasts, but many of them do so at a finer grain: by resolving broadband illumination into narrowband frequency ranges, the parvo system increases the number of contrasts that can be seen.

Subcortical hyping. Where the magnocellular-derived structures in the posterior parietal are heavily connected with action-related subcortical structures like the superior colliculus, the parvo-derived structures in the temporal are, instead, heavily interconnected with limbic structures that organize memory and emotion. Connections to limbic saliency systems can have the effect of hyperactivating ventral matrices, increasing their participation in core dynamics. Goodale and Milner have speculated that activation of the ventral system is a necessary condition for conscious vision (1992, 24).

Cortical magnification in bat sonar offers an interesting parallel with foveal resolution in the visual system. As described in Chapter 3, the bat CF-CF matrix coordinates the bat to intercept an insect flying at a particular velocity, and additional correlation of amplitude with frequency response prepares the bat for prey of a particular size. The DSCF bulge in the CF-CF matrix discriminates the relative velocities of parts of the target, which would allow the bat to recognize an insect by its wingbeat. Convolving amplitude with frequency response in the DSCF region allows the bat to determine also the size of the parts of the target (Suga 1990, 60). It is as if the CF-CF system is a large-scale where system and the DSCF system is a what system by being a high resolution where system.

Temporal vision, like parietal vision, occurs in the context of other sense modalities. Temporal cortex includes primary and secondary auditory areas, and its junction with parietal cortex is at the lower end of the somatosensory strip, that is, the somatosensory area facilitating touch and proprioception particularly of the hand, mouth and throat. As it happens, all of these body regions have expanded sensory areas; Colby et al describe the mouth as "the 'fovea' of the facial somatosensory system" (1996, 49). In this neighbourhood, neuronal groups correlating responses across visual and somatic modalities would be crossmodally magnificatory.

Color and form

The ventral stream is usually thought of as divided into two substreams responding separately to object color and object form. This description is more accurate in relation to color than it is to form: there does seem to be a well-defined color-sensitive substream, but the status of the second substream is less clear.

[4-3 Visual subsystems]

The color-sensitive substream is sometimes called the P-B, or parvo-blob, system, since its through-lines in V1 are visible as segregated patches. In contemporary contexts, where paints and dyes modify surfaces at random, color is often irrelevant to object recognition, and it is difficult to appreciate the uses of the color system's increase in frequency resolution. In evolutionary contexts, frequency resolution is highly relevant -- natural objects absorb and reflect (or transmit or refract) different wavelengths because their surfaces are materially different. Narrowband vision is thus vision of surface microtexture, and color vision can be understood as part of the ventral stream's general expansion of visual resolution in the service of object perception and recognition. Color vision is a segregable subsystem in the sense that it can be lost without loss of other sorts of vision.

There is no selective loss of form vision (Zeki 1992, 73). A less visible parvo subsystem called the P-I, or parvo-interblob, system is thought to be important to form vision, but form vision also seems to be widely distributed; there are orientation selective cells in many areas, at many depths into both magno and parvo streams. Object form is visible at both broadband and narrowband resolution; we can distinguish forms at equiluminance, when all frequencies are equally bright, and we can recognize forms when it is too dark to see color.

Nonetheless, the ventral parvo-interblob system is often described as the form vision system, since the identifying characteristic of P-I units, orientation selectivity, can be found in parvo structures that are not frequency sensitive. In area V4, downstream in the P-I substream, receptive field sizes for orientation-sensitive cells are 16-50 times greater than at corresponding eccentricities in V1, a fact that suggests these areas are able to respond to forms over a range of locations. 3D form vision -- binocular stereoscopy -- would also seem to be part of this ventral substream since it relies on foveal resolution (think of trying to thread a needle without binocular focus) but not frequency discrimination.

Object constant response

The evolutionary significance of parvo-derived ventral magnification is suggested by the fact that rodents and primates do not differ very much in route-finding or object retrieval capabilities, but they differ spectacularly in their ability to identify, classify, and remember objects (Ingle et al 1985, 247).

Hyped attention to focalized objects is a precondition for attention to conspecifics, and hyped attention to conspecifics, to their faces in particular, is one precondition for kinds of social organization in which communication can develop. It is probably significant that the ventral area found to be particularly responsive to faces is near areas particularly responsive to landmarks and to color, and that a disability in one of these sorts of perception is often accompanied by deficits in the others.

Hyped attention to objects is also a precondition for remembering them and talking about them. It is thus not surprising that in humans the ventral stream's forward projections into frontal cortex overlie the Wernicke-Broca through-path for language function. Knowing what something is, what it is good for, and what it is called are closely interrelated in humans; "What is that yellow stuff you put on rice, you know, ... saffron." It seems that the look of a thing, its use, its name, and other facts about it, can call each other up because ventral wide nets facilitate all of these evocations.

Object recognition seems to be highly redundant in normal conditions. We can recognize something by its sound, its motion, its color, its smell, in addition to or instead of its form. Object recognition can also fail independently of other aspects of perception. The Damasios report a patient able to see the age, gender and expression of a face without being able to identify it (Tranel, Damasio and Damasio 1988). This combination of redundancy and segregability suggests a separate network with many sorts of convergent connections (Zeki 1999).

Areas quite deep into the ventral stream (they are at the temporal pole in monkeys and at the base of temporal cortex in humans) are in fact thought to respond to object identities rather than object form as it is being perceived at any particular moment. Neuronal groups in TEO and TE, whose receptive fields are 1000 times greater than in V1, show maximum response to moderately complex shape, and to combinations of such shapes with color and texture (Tanaka 1993); they can respond categorically to many variants of important kinds like hands or faces. Unlike many sensory cells in the dorsal stream, their response does not depend on the perceiver's current behavior. It also does not depend on the full presence of normal conditions of vision: it does not require two eyes and it does not require frequency contrasts. It is reliable across differences of size, color, position, orientation. It is cross-modal.

Shape seems to be privileged in recognizing, categorizing, and name-learning, but it can be quite schematic. Object constant columns will respond to profiles if they are not atypical. They may even be insensitive to the difference between things and representations of things. Neurons of the ventral stream of anesthetized monkeys responded almost as strongly to simplified 2D images of objects as to the sight of real objects (Sakata et al 1996, 251). Plastic ducks and Oldenburg's 100 foot pencils come to mind.

The relation between the ventral stream and sentient visual perception is not completely clear. Form perception survives ventral lesions, which cause difficulties mainly with object recognition. But object perception is not separate from object recognition either. Object recognition response deep in the ventral stream must occur concurrently and interactively with figure-ground organization in V1 and V2 (Pollen 1999). That is, back-projections from form, color and identity areas in the ventral stream would codetermine primary visual structure, along with response propagated from receptors. In this way, discovering what something is would be an integrated part of being able to see it at all, as well as of knowing what it looks like.

Characteristics of the dorsal stream

Because the dorsal stream is magno-derived, its visual response shows low spatial resolution but high temporal resolution.

Magno cells make up only ten percent of the total number of retinal ganglion cells, and, unlike parvo cells, they are evenly distributed over the whole retinal area. Like parvocellular units their retinal receptive fields vary with distance from the fovea, but because there are few of them and because their receptive fields are relatively large, their ability to resolve spatial detail is quite poor. They also have relatively little frequency resolution; it is approximately true to say that they respond only to broadband luminance contrast. They are capable of finding contrasts in low light because their contrast gain is steep.

Magno cells are fast, able to resolve flicker at 5-40 Hz. Magnocellular neurons at particular locations respond transiently -- only to rapid changes in contrast. Cells responding to contrast onset and offset are segregated from each other. An integrated array of magno units can thus be thought of as responding because an object is moving, because the eye is moving, or both.

Dorsal visual areas propagating magno response are found to be directly connected to the superior colliculus in the midbrain and to the pons, which gives them fast access to evolutionarily early motor systems setting up reflex motion and whole body orientation. These dorsal connections bypassing the major thalamic routes into primary sensory cortex can account for our ability to catch something before we've seen it falling, and their existence may explain how people with Anton's syndrome can focus their eyes on objects they cannot see (Damasio 1999, 269).

The even distribution of magno receptive fields suggests that when we see something with the corner of our eye, we are using the magno system. Much of the dorsal system shares this sensitivity to the peripheral field. Visual receptive fields in dorsal cortex are often bilateral and may include even the far periphery; some bilateral fields even have a doughnut shape that excludes the fovea -- they will respond even if the central 40 degrees of the visual field are occluded (Baizer et al 1991).

Early versions of the what-where contrast describe the distinguishing characteristics of ventral and dorsal streams as color sensitivity versus motion sensitivity. Like the other contrasts invoked in this domain, this contrast is only part right. Frequency response can contribute to certain kinds of motion perception.

Perception of motion (one's own or an object's) and perception by means of motion (eye, head or whole-body displacement) both make use of the rapid, transient response of magnocellular neurons, propagated into the dorsal stream through areas MT and MST. Cells in these areas can select for motion in different directions, including depth motion and rotation, and motion at different speeds.

If areas MT and MST are lesioned we cannot see object motion. Object form perception does not similarly depend on these areas. However, observer motion may be important to object perception; in the dark, we don't know what something is unless we run our hands over it. With the magno system alone we probably can't see a motionless thing unless we move our gaze across it.

Response in the dorsal stream is often conditional on concurrent action. Visual response to an object may be stronger if it is being targeted by a hand movement, and the dorsal stream is in fact thought to be silent to vision under anaesthesia (Milner and Goodale 1995, 43-44).

Magno-based dorsal vision has been described as "capable of what would seem to be the essential features of vision for an animal that uses vision to navigate in its environment, catch prey, and avoid predators" (Livingstone and Hubel 1988, 748). This description should be modified in two ways. Ancient sorts of whole-body reflex are subcortical rather than dorsal. And, while it is true that dorsal vision in primates is fast, works at night, and is inherently action-relevant, it also makes use of association areas that do not exist in subprimate mammals.

Historical dichotomies that have been evoked to describe dorsal function have a qualified usefulness:

1) The dorsal stream does respond to locations, but it also responds to the motion, three dimensional form, and orientation of located objects.

2) Its visual response seems to be largely magnocellular in origin, but aspects of its response are focal rather than peripheral.

3) Some cells respond to object motion, but some respond to the presence of objects that are not moving.

4) Some are predominantly sensory, some are predominantly motor, and many are sensory-motor.

5) The dorsal stream organizes ancient behaviors critical to simpler animals, but, as I will outline in later chapters, it also participates in simulational networks evoked with high cultural skills.

The parietal

The four lobes of primate cortex, named in 1838, can be seen developing through the course of mammal evolution.

[4-7 All lobes]

The occipital at the back of the head is primarily visual. The frontal lobe opposite it at the forehead has primary motor structure at the central fissure grading forward to multipurpose coordinative structure. The temporal lobes (next to the ears) and the parietal lobe at the top and back of the head are complex associative structures with many specialized subareas. The temporal lobe is, very generally speaking, a gradient from secondary visual and primary auditory cortex to very high-order memory- and language-essential association areas.

The parietal area is named for its location under the parietal bones of the skull, and these, in turn, seem to have been named for the way they enclose the brain: parietal derives from the Latin paries, a wall.

Critchley, whose 1953 work has been the standard reference, remarks that

The parietal lobe cannot be regarded as an autonomous anatomical entity. Its boundaries cannot be drawn with any precision except by adopting conventional and artificial landmarks and frontiers...it will also be seen that it is not possible to equate the parietal lobe with any narrowly defined physiological function. In other words, the parietal lobe represents a topographical convenience pegged out empirically upon the surface of the brain. Critchley 1953, 55

The parietal is important in an account of the dorsal stream because it contains midway sectors of many sensor-effector through-lines. The dorsal stream has its origin at various cortical gateways including primary sensory cortex for all sensory modalities. After much branching and braiding, it terminates in primary motor cortex. Much of this branching and braiding occurs in parietal cortex.

In humans, the parietal takes up about a fifth of total cortical area. Much of its development is recent.

Associated with the occasional use of the forelimbs for purposes other than pure locomotion (as for example, in the bear), and with a trend toward arboreal habits, the cortex expands particularly in the regions immediately caudal to the coronal sulcus ... But it is in the arboreal mammals (tree-shrews, lemurs and especially the lower primates) that the parietal area shows its most conspicuous advancement. This rather sudden parietal elaboration concerns mainly the upper and anterior region of the sensory cortex, for the postero-inferior development is delayed until a further stage. The expansion is, however, sufficient to crowd the purely visual area backwards, and the auditory area downwards. Critchley 1953, 2-3

In most mammals and even New World monkeys such as the owl monkey, nearly all the parietal area seems to be organized as unimodal secondary sensory cortex: that is, it shows retinotopic, cochleotopic or somatotopic organization. Old World monkeys have new multimodal association areas between areas of secondary somatosensory, visual and auditory cortex. In the macaque, parietal areas are divided by the intraparietal sulcus: the area above the sulcus is called the superior parietal, and the area below it, the inferior parietal. The SPL in macaques is mostly somatosensory, while the IPL, and the tissue within the sulcus itself, are important in organizing spatial action on the basis of spatial sensing.

[4-8 Macaque and human parietal areas]

The SPL/IPL naming system means something different when it is applied to humans. Human superior parietal includes Brodmann's areas 5 and 7, which would be macaque SPL and IPL respectively, while the human inferior parietal is made up of Brodmann's 39 and 40, both of which are "ill-developed even in the higher apes" (Critchley 1953, 16).

Naming conventions are even more confusing in relation to the posterior parietal. In humans the anterior parietal, which is the cortex forward of the postcentral sulcus, is occupied by primary and secondary somatosensory structures. The posterior parietal would then reasonably be thought to be the area behind the postcentral sulcus, that is, the superior parietal in humans, and the SPL, the IPL and the sulcus between them in monkeys. Clinical use has been inconsistent, however; especially in lesion studies, the posterior parietal sometimes seems to include areas V4, MT, MST, DP, and PO otherwise classed as occipital, temporal, or temporo-parieto-occipital. The reason for terminological differences is presumably that functional networks do not stop at the sulci we use as cortical markers.

[4-9 Illustration of anterior and posterior parietal areas named]

The most immediate way to understand the primate parietal is to notice its place in the over-all topography of the cortex. The parietal is bounded on three sides by cortex serving all three of the major spatial senses: it is in very broad contact with visual cortex, is bordered by and contains patches of auditory cortex, and includes skin and other proprioceptive projections from all parts of the body. Since all of these sensory areas project through to motor areas in the frontal lobe, the parietal is like a three dimensional field crossed by many paths which cross each other at many points.

From back to front, posterior to anterior, parietal cortex can be thought of as a gradient between visual cortex and somatosensory cortex, which facilitates proprioception of the whole of the body as well as haptic or touch perception of things outside the body. From the top of the brain down its lateral surface, medial to lateral, somatosensory cortex grades from leg, to trunk, to arm, to hand, and then to face areas. The gradient from vision to somatic sense (in both hemispheres) can thus be thought of as fanning forward to connections with all parts of the body.

[4-10 Macaque motor and somatosensory strips]

The vision-to-somatic gradient, which is a sensor gradient, can be thought of as overlaid by another gradient, a sensor-effector or perception-action gradient. Premotor cortex (which is secondary motor cortex) grades forward from primary motor cortex, just anterior to the central fissure that bounds the frontal lobe. This sets it at some distance from the parietal, but premotor cortex is nonetheless very intensively connected with parietal cortex.

[4-11 Premotor, motor and parietal areas in human and macaque]

I will have more to say about the three-dimensional area where occipital, parietal and temporal cortex meet at the top of the Sylvian fissure -- the human IPL -- in Chapter 8, since this area is particularly important to communication and representation. But it is worth noticing here that, particularly in the broad strip of cortex running up from auditory areas in the Sylvian fissure, there is yet another gradient present in the organization of the parietal, this one from areas important to language and other sorts of representation, upward to areas important in nonlinguistic action.

Frontal-parietal connections

The human superior parietal lobule forms the roof of the brain bilaterally and contains many foci active in dorsal through-streams. It is often described as the locus of our 'representation' of space. It has not been clear how this description should be understood. What has been clear, however, is that the parietal is essential to the organization of spatially-directed action. Our notions of space and our notions of action are deeply interrelated, and it is this interrelation that can be seen embodied in the multifunctional neural populations of the SPL.

The comparable structure in macaques, the inferior parietal lobule, areas PG and PF, has been very intensively studied. Single-cell testing with monkeys has required an immobilized animal, and so the neural sensitivities discovered have often involved interactions with biologically desirable objects in near space. Even given these experimental limits, the response conditions of neuronal groups in this area have been found to be extraordinarily diverse.

Cells within a small area (the macaque 7b) have, for instance, been found to respond to touch on the face, the hand, the neck, or the shoulder. Some tactile neurons respond also to visual stimuli presented near the face. Others, some primarily tactile and some primarily visual, also respond to perceived motion of the animal's own arm and hand.

[4-12 Macaque frontal and parietal areas]

In macaque 7b there are many sorts of neurons whose response is specific to categories of object interaction such as grasping, grasping and holding, grasping and placing, or bimanual manipulation. Some of these neurons are also motor neurons: some respond during hand actions, some during mouth action, some during both mouth and hand action, and some during other types of movement. Some respond during both observed and performed action, if they are of the same kind; in some, observed and performed actions need only be similar; in others, observed and performed actions may differ in effector or in goal (Fogassi et al 1996, Mountcastle 1975).

There has been frequent surprise that an area supposedly specialized for the perception 'of space' has turned out to be a "vast area of neocortex in which we could make out no topographic pattern of any sort" (Milner and Goodale 1995, 385). Milner and Goodale suggest we think of it as a "series of relatively independent visuomotor channels" comparable to the nearly independent action-organizing through-lines found in frogs (1993, 320). As always, however, it is the space of the brain itself that must be understood. Critchley points out that

Parietal expansion is not the only feature of the arboreal .. for frontal lobe development is also conspicuous. A rapid increase, moreover, takes place in fibre-structure, particularly in re .. the complex intercommunications between one part of the brain and another 1953, 2-3

Parallel evolution of parietal areas and premotor areas in frontal cortex, and intensive associated development of interlobe connectivity, suggest that the functionally miscellaneous cells described above are units in a system for act sequencing that is broadly distributed across parietal and frontal cortex. Areas in the SPL responsive during some category of activity, finger prehension for example, project to areas in prefrontal cortex that are responsive during the same kind of activity. The extraordinary heterogeneity we find in the parietal makes sense if we think in terms of a progressive organization of contexted action integrated at many points along sensor-effector through-streams.

Motor cortex is usually thought to organize muscle response directly, and parietal and premotor areas are thought to select and pre-organize action on the basis of current sensory response; Sakata and Taira (1994, 853) suggest these areas are also important in learning motor sequences. Caminiti reports, however, that if motor areas in frontal cortex are lesioned the superior parietal can take over motor control (1996, 320).

The vision of acts as organized by progressive integration at many nodes across a gradient is taken from Mesulam, who gives this example:

Our results do not imply that the visual-to-motor transformation in reaching is achieved in discrete steps in different cortical areas by a hierarchical network processing from a retinal frame (V1), through a body frame (V6), an arm-centered frame and a motor command (M1) step. The current results, together with our previous data on cortical function and connectivities, suggest that this transformation is accomplished in a distributed fashion along a visual-to-somatic gradient in the parallel parietal and frontal pathway (caudal to rostral in the parietal lobe, and symmetrically rostral to caudal in the frontal lobe) ... Distributed along this gradient, in the visual-to-somatic direction, is a matching operation that combines retinal, eye-position, oculomotor, and somatic hand position information. In the opposite, somatic-to-visual direction, a synergy operation that links together the results of sensory combinations that correspond to the same motor command is computed in a distributed manner. Thus, the neural populations along the visual-to-somatic gradient effect a progressive match between the two sensory modalities and the appropriate motor command. A simple interpretation is that the population activity in the whole parietal-frontal network is modified through an iterative process due to intra- and interareal cortico-cortical connections, resulting in a progressive computation of the movement direction consistent with the different available sources of information along the visual-to-somatic network. Mesulam et al 1996, 342

(For 'computation' in a passage such as this we can read 'coordination' or 'organization.')
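The iterative, distributed matching described in the passage above can be caricatured with a relaxation loop: a chain of nodes runs from a visual end to a somatic end, each end stays anchored to its own sensory estimate of movement direction, and every interior node repeatedly averages its neighbours until the chain settles on a consistent interpolation between the two anchors. This is a cartoon of the idea of progressive computation along a gradient, not a model of the actual circuitry; every number and name below is invented for illustration.

```python
def relax_gradient(visual_estimate, somatic_estimate, n_nodes=6, n_iters=200):
    """Iteratively reconcile two anchored estimates along a chain of nodes.

    The first node is anchored to the visual (retinal/eye-position)
    estimate and the last to the somatic (hand-position) estimate;
    interior nodes repeatedly average their neighbours. The chain
    converges toward a smooth interpolation between the two anchors.
    """
    estimates = [0.0] * n_nodes
    estimates[0] = visual_estimate      # visual end of the gradient
    estimates[-1] = somatic_estimate    # somatic end of the gradient
    for _ in range(n_iters):
        new = estimates[:]
        for i in range(1, n_nodes - 1):
            new[i] = (estimates[i - 1] + estimates[i + 1]) / 2.0
        estimates = new
    return estimates

# Two discrepant direction estimates (in degrees), progressively
# reconciled; the chain settles toward 30, 34, 38, 42, 46, 50.
chain = relax_gradient(30.0, 50.0)
```

No single node holds the answer; the consistent movement direction emerges from repeated local exchanges, which is the point of reading 'computation' here as 'coordination.'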

SPL and attention

Parietal and prefrontal areas, heavily interconnected as described above, are two of four nodes in the large-scale attentional net Mesulam describes (Mesulam 1981, 309). In Mesulam's schema they are the motor and sensory-motor nodes that together coordinate orientation and exploratory motion.

There are many kinds of evidence for the importance of parietal foci to maintaining and shifting attention. In binocular rivalry tasks, in which the two eyes of an experimental subject are shown different images or objects, subjects will normally report seeing only one, with alternation at some regular interval; functional imaging studies during binocular rivalry trials find that activity in parietal and prefrontal areas covaries consistently with shifts in what is seen. Lesions in the superior parietal are particularly damaging to the ability to disengage and shift visual focus (Posner 1991, 1625).

For spatial attention to the surrounding world, the network is predominantly right hemisphere, Mesulam says (1981, 309). Corbetta reports that in dextrals -- the right-handed -- PET studies find two distinct foci in the right SPL, one involved in directing attention into the right visual field and the other involved in directing attention into the left. In the left hemisphere SPL, only one focus is found, active mainly when directing visual attention to the right (Corbetta et al 1993, 1224).

Act segregation

At birth the superior parietal is surprisingly simple in structure, but it gains organization rapidly between birth and two years, the greater part of its eventual organization being attained by six (Critchley 1953, 14). Babies between birth and two are working on basics like hand-eye coordination and walking, and on combinations of these basics. The structures that will be their coordinative means are being built progressively, by distributed, iterative activity in the parietal-frontal gradient described above.

There is a lot to be learned. We often need to be simultaneously related to spatial facts at different spatial and temporal scales and for different purposes. Basic kinds of spatial action -- posture, orientation, locating and targeting, handling, locomotion, and wayfinding -- often occur together, and when they do they must be organized separately as well as coordinated. Each involves several time scales and many stages. Nested cycles must be maintained for each separate action system. Different sorts of action may have to be organized from different senses, or different action systems may use the same sensors in different ways. The interplay of focal and peripheral vision in targeting is an example. Think of running for a fly ball: we use focal vision to stay aimed at the ball and peripheral vision to get us across the field, dodging obstacles, keeping a sense of the runner on third base.

Gaze calibration, attention and task axis

In an experienced organism the connective organization of the superior parietal is the calibrational framework for spatial action. Within the SPL gradients, networks involved in eye-related motor preparation are considered particularly important. It is thought, for instance, that limb coordination makes calibrational use of gaze. There is single-cell evidence that at various positions in the dorsal stream, point, reach, hand shaping, and trunk orientation may all be separately calibrated to gaze position. If these partially separate systems are all calibrated to gaze, they will automatically be calibrated to each other.
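The transitivity claimed here -- calibrate each subsystem to gaze and they are automatically calibrated to each other -- can be made concrete with a toy coordinate sketch. Nothing below corresponds to neural machinery; the frames, origins, and numbers are all invented for illustration.

```python
# Toy sketch of calibration-to-gaze transitivity: if each effector knows
# only its own relation to the gaze frame, any effector-to-effector
# relation can be recovered by composing the two calibrations.

def to_gaze_frame(point, effector_origin):
    """Express a point given in an effector's frame in the gaze frame.

    effector_origin is that effector's origin stated in gaze coordinates.
    """
    return (point[0] + effector_origin[0], point[1] + effector_origin[1])

def between_effectors(point_in_a, origin_a, origin_b):
    """Re-express a point from effector A's frame in effector B's frame,
    using only each effector's separate calibration to gaze."""
    gx, gy = to_gaze_frame(point_in_a, origin_a)
    return (gx - origin_b[0], gy - origin_b[1])

# Invented origins: where the hand and the shoulder sit in gaze space.
hand_origin = (2.0, -1.0)
shoulder_origin = (0.5, -3.0)

# A target at (1, 1) in hand coordinates, re-expressed in shoulder
# coordinates although no direct hand-shoulder calibration exists anywhere.
target_for_shoulder = between_effectors((1.0, 1.0), hand_origin, shoulder_origin)
```

Each subsystem maintains only its own offset to the gaze frame; the hand-shoulder relation falls out of composing the two. That is the sense in which systems separately calibrated to gaze are thereby calibrated to each other.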

Rizzolatti considers 'spatial attention' to be, in fact, an aspect of eye-related motor preparation:

... visual attention to a particular part of space is nothing more nor less than the facilitation of particular subsets of neurones involved in the preparation of particular visually guided actions directed at that part of space. Thus, different attentional phenomena will be associated with the activation of visuomotor circuits including the superior colliculus and the various parietofrontal circuits ... Sheliga, Riggio and Rizzolatti 1994, 185

The calibrational importance of gaze direction seems to continue even when there is no overt eye movement. In a functional imaging study by Nobre et al (2000), the parietal network active when we attend to a location without looking at it (covert spatial attention) was found to overlap the network by which we direct our gaze toward a spatial location. There were differences in levels of activation in some of the shared foci during one or the other task, but the network was similarly distributed. Corbetta had earlier reported that, while frontal motor activity occurs only when there is overt orienting behavior, SPL activity occurs also with covert spatial attention, that is, with voluntary or automatic shifts of spatial attention not accompanied by eye or other motion (1993, 1223). Thompson and Kosslyn (2000, 557) propose a network that mediates interactions among ocular fixation, eye movements, and directed visual attention.

Act constants and mirror cells

Areas deep in the temporal stream are sometimes described as accomplishing object constancies, since they respond preferentially to categories of objects, or even to particular individual objects, no matter where they are or how they are being perceived. Segregated populations of neurons in the dorsal stream can similarly be thought of as act constant, since they begin the process of setting up kinds of action that may be carried out in different ways.

Cells with the properties of act constants have been found in both parietal and prefrontal areas. The response of these kinds of cells can be categorical in various ways. It can be specific to an act type but generalize across effector types: some cells respond in the same way whether tearing is accomplished by the hand or by the mouth. Some cells are specific to types of grip no matter which hand will be used. Some begin to respond when the action begins, and do not stop responding until the act is completed. Parietal cells are more likely than frontal cells to be of this type: parietal neurons most often start to respond during hand shaping, for instance, and continue to respond during object holding (Rizzolatti et al 1995, 440). Other cells are active during segments of a sequence.

Cells active during a particular category of grip may respond to objects of the appropriate size, no matter where they are -- whether they are seen or touched, and whether the animal is going to grasp them or not. Some of these kinds of cells respond while an animal is waiting to be able to act: they continue through delays. Some respond when a graspable object is seen but not when it is touched; others respond either way.

The most extraordinary finding has been that there are cells (in frontal, posterior parietal and superior temporal areas) whose response is specific to types of hand-object interaction, not only when the monkey itself is performing them but also when it sees another monkey, or even the experimenter, performing that action. These neurons will for instance respond when the monkey grasps a piece of food, or when it sees the experimenter do so (Fogassi et al 1996).

[4-13 Mirror cell locations]

These very interesting cells have been called mirror cells, but, unlike mirrors, they respond also when there is no observation, when the monkey is alone and prevented from seeing its own action. The point about their response seems to be that it is specific to kinds of hand-object interaction (tearing or bimanual manipulation, for instance), no matter who performs it. They are part of a network by which the monkey is perceptually, actively, and socially about a category of object.

The unexpected mingling of cell sensitivities in premotor-parietal gradients suggests that acting, observing action, perceiving an object that invites action, preparing to act, monitoring action, and feeling oneself acting, are not distinct. They are, instead, simultaneous and overlapping accomplishments of a wide net whose multimodal sensitivities include many kinds of sensory and many kinds of motor response.

Reach and grasp

One possibility is that the "where" system is drawn on for the general trajectory of one's reach, whereas the "what" system is drawn on for the grasp. Another is that the 'whats" and "wheres" are described in two different ways ... and that these are drawn on differentially for different tasks. Landau and Jackendoff 1993, 261

One monkey grooming another, poking, peering and popping nits into its mouth, is demonstrating high evolutionary skill. "A typical motor pattern of monkeys which apparently brings pleasure to both groomer and groomee" (Mountcastle 1975, 885), grooming has needed the coordinated development, in stages, of arm, hand and eye -- development both of the organ and of its facilitating neural structure.

Forelimb joints had to specialize to support reaching. Before there could be binocular depth vision, eyes had to shift to the front of the head so they could converge on a location. Forepaws had to be replaced by prehensile hands with opposable thumbs. Fingertip skin sensation had to intensify. Parvocellular vision had to add its focal and foveal hypersensitivities to existing magno-based vision.

To be able to groom, the monkey must integrate motor control of two eyes and a mouth, two arms that start from different positions, and two hands that are doing different things; and it must do so with sensory guidance by touch, proprioception, and several kinds of vision. In the cortex, therefore, the dorsal stream has had to increase parietal and frontal structure to coordinate the new motor capabilities, and, after the addition of new parvo-based ventral structure, ventral-dorsal connections have had to be added to integrate detailed object vision with existing action-related location and motion vision.

Dorsal wide nets that facilitate grooming and other complex sensory-motor behaviors include many partially segregated parietal-frontal through-streams; eye, head, mouth, arm, hand, and finger movements must be organized independently because they occur in different combinations and from many possible starting points to many possible destinations. At the same time, these separate motions must be coordinated: the mouth must open as the hand approaches, and the hand must slow its arc as thumb and forefinger prepare their pounce.

Reach and grasp are employed together in fine-scale object-handling tasks, but like other dorsal stream specializations, they are partly segregated sensory-motor subsystems which remain segregated through their projections into frontal areas. Both are visually and/or somatically guided. Each involves both object perception and proprioception. But reach and grasp also use these senses differently, and to different degrees.

Like everything else in the dorsal hinterland, reach and grasp organization is complex and obscure. Macaque and human structures have been studied in different ways, and structural parallels are uncertain. Nomenclature and theoretical frameworks are confusing. The story is under daily revision.

Nonetheless, I want to describe a little of what is coming to be known about reach and grasp, because a body reaching and grasping in a spatial world is structurally about the thing it reaches for and grasps and the space in which it does so. Provisional as it may be, a more detailed understanding of reach and grasp can give us a sense of what embedded and situated mean, when we describe knowing in those ways. In addition, the differences between reach and grasp will be important in later discussions of representing practices.

Before saying more about reach and grasp, I will have to say a little about motor organization in general, and then about the two areas in the dorsal stream where reach and grasp differences have been most apparent, the intraparietal sulcus in parietal cortex, and premotor areas in frontal cortex.

Motor organization

The motor ensemble includes the skeleton, muscles and their peripheral innervation, and many parts of the brain. Motor behavior is facilitated by through-streams rooted in subcortical nuclei, traversing the parietal to premotor areas in frontal cortex and then via primary motor cortex back down through the spinal cord to muscles.

In early mammals frontal cortex is unimodal motor control cortex. In later mammals such as primates there is increased multimodal associative tissue anterior to primary motor cortex, mediating more complex activity. In monkeys and humans there are three main divisions of frontal or precentral cortex: M1, premotor areas, and prefrontal areas.

[4-14 Human frontal cortex subdivisions]

M1 is primary motor cortex, which does not show sensory response, which is always active when muscles are in use, and which is therefore considered the go/no-go control for overt behavior.

Premotor areas show sensory response as well as being active in motor selection and organization. There are many subdivisions of premotor cortex. The most important is a basic division into dorsal and ventral premotor areas. In monkeys, dorsal and ventral premotor areas are called F4 and F5 respectively. In humans, the lower part of Brodmann's area 6 seems to be homologous to F4, and portions of Brodmann's 44 and 45 to F5.

If premotor cortex is understood as motor association cortex, prefrontal cortex can be understood as heteromodal or higher-order cortex connecting both sensory and motor association areas. It is particularly important to complex action, including deliberate, conscious control of movement. In monkeys it is inferior pre-arcuate cortex -- that is, the lower, most frontal part of the frontal lobe. Extended development of prefrontal association tissue in human frontal cortex has pushed premotor areas backward, so that premotor eye fields, inside the lower edge of the arcuate sulcus in monkeys, are for instance pushed back to the edge of the precentral sulcus in humans.

Intraparietal junctures

Through-lines for many kinds of sense-guided act organization have been found routed to motor and premotor cortex through small areas in and around the intraparietal sulcus in macaques. The area seems to be irreducibly complex; although zones of the intraparietal have been given separate names, separate functions are not necessarily implied. To illustrate the complex, multimodal, multifunctional character of these regions, all of which have direct, reciprocal connections to similar regions in frontal cortex, I will describe what are thought to be the dominant functions of a few named regions in the IPS.

These descriptions apply primarily to the macaque, in which the intraparietal sulcus lies between somatosensory cortex and the sensory-motor areas of the macaque inferior parietal. In humans, the intraparietal sulcus lies between the analogous sensory-motor areas of the human superior parietal and the representation-related areas of the human inferior parietal. The human intraparietal sulcus cannot be studied by the single-cell probes used on monkeys, and so it is only very generally known.

[4-16 Macaque intraparietal]

Vision cells organizing gaze direction and binocular fusion are found (along with reach and grasp and other sorts of cell) in areas LIP and MIP of the macaque -- lateral and medial intraparietal, respectively -- and in 7a in the area behind the sulcus. These areas are like the frontal eye fields in premotor cortex in having both visual and oculomotor response: they respond to light and they are active during saccadic eye movements. Response in 7a seems to be predominantly eye-position or eye-movement dependent. Cells in LIP are thought to set up response needed to saccade to an object at a location; but response in this area may also depend on head, body and neck proprioception. MIP, which is next to arm somatosensory populations in area 5, has arm-centered visual and somatosensory response.

VIP, the ventral intraparietal area, is considered a reach region, but it has many cells responsive during saccades. There are cells that respond somatically to movement on the skin, or visually to an object seen moving near the skin (Duhamel et al 1998, Colby 1996, 46-50). The area may also be involved in visual monitoring of arm and hand motion.

Like VIP, area 7b (as described earlier) seems to be involved in coordinating visual fixation and reach toward a desired object in near space. Its sensitivities are visual, tactile and motor, and it includes high level act-constant neuronal groups.

AIP, the anterior intraparietal at the lower end of the sulcus (next to the hand area in somatosensory cortex), is thought to organize vision for purposes of grasp. Some cells are exclusively motor-responsive, some exclusively visual, and some bimodally visuomotor. There is visual response when the monkey looks at a graspable and desired object without moving, no matter where the object is. Some cells respond visually to the monkey's own hand. Many of these cells respond during the whole course of an act, that is, they begin to respond while the hand is being shaped during transport, and they continue to respond while the animal holds the object.

Areas PO and 7m on the medial or inner surface of parietal cortex are thought to coordinate eye, arm and hand in the service of reaching and grasping (Caminiti 1996, 325).

Reach

Reach is an older ability than grasp, more somatosensory and more dorsal. Arm movements are guided by feeling more than by vision (Fogassi 1996, 156), and parietal reach regions in the macaque include small predominantly somatosensory areas in 5 and 7b. Also called arm transport, reach is centered on the shoulder and involves control mainly of arm muscles; 5 and 7b include cells active during shoulder joint flexion.

Visually, reach shows characteristics of magnocellular response. When we guide the arm visually, we do so without consciously seeing it: arm motion is monitored with fast, peripheral, nonfocal vision. Reach areas in the parietal are connected to primary visual cortex through motion-vision area MT at the occipito-parietal junction.

In keeping with a dorsal/ventral division, reach guided by peripheral vision may be selectively impaired by lesions of the superior parietal lobule, whereas reach guided by central vision is impaired by lesions of the inferior parietal lobule (Sakata and Taira 1994, 853). In keeping with magno characteristics for the reach substream, peripherally guided arm transport is fast; when the arm arrives in the area observed with focal vision its motion actually slows.

The reach through-stream from the parietal to premotor cortex terminates in dorsal premotor cortex F4 in the macaque, and in the homologous inferior area 6 in humans.

Grasp

Grasp, or hand-shaping, requires control of the many fine muscles of the hand and fingers; it is more dependent on focussed object vision, and therefore more ventral.

To reach for something, we do not need a very exact sense of its shape, its size, its surface texture, its temporary orientation or its identity. We need only its angle and distance in relation to our arm. We do not have to look at the arm as we move it. Usually we are, instead, looking at the object we are going to grasp. We fixate this object and then we bring our hand to the location where we have centered our gaze.

Grasp preparation requires fine visual resolution for object shape and texture, and it needs three dimensional depth vision for object size and orientation. Because we are looking at the object before the arm arrives at its destination, the hand begins to shape itself while it is still being transported.

The fine resolution for object shape and texture implied by the precision of hand shape, and the fact that skilled use of our hands is monitored with central vision, suggest that grasp areas must be well integrated with ventral vision, or else that dorsal vision also includes stereoptic perception of the three-dimensional shape and orientation of objects (Sakata et al 1996, 251). There is some uncertainty about this point. Jeannerod (1994, 236) and Rizzolatti (Rizzolatti et al 1997, 190) and others suggest that parts of the dorsal where system must also be what systems, responding not only to the location of objects, but also to object-intrinsic spatial characteristics such as volume, and to object-extrinsic action-relevant characteristics such as orientation.

In monkeys, the temporal lobe is more integrated with parietal and frontal cortex than it is in humans. The macaque grasp substream through AIP in the intraparietal sulcus dips into ventral territory before it arrives at a premotor area which is also much more ventral than the premotor terminus of the reach substream -- area F5, homologous to human area 44.

[4-17 Reach and grasp streams in monkeys]

In humans this ventral/dorsal grasp substream seems to be the basis for the evolution of the human IPL, which allows cultural developments of many kinds. (More in Chapter 8.)

Macaque premotor F5 neurons have motor response during goal-directed mouth and hand actions as well as visual response to objects graspable by mouth or hand. F5 also has a subarea with mirror cells -- cells that have visual response to mouth and hand actions performed by other monkeys and humans in addition to the sorts of response present in the rest of F5. Rizzolatti et al (1996a) suggest that the division of F5 into mirror and nonvisual areas could be the beginning of a segregation within human areas 44 and 45 -- the mirror cell area being the precursor of Broca's area, which controls oral and manual speech movements in humans.

Imagining a structure sense

A redescription of aboutness or intentionality not based on metaphors of inner representation must be centered on an account of environmentally embedded spatial function. The most important points in my redescription, so far, have been these:

Basic aboutness is task axial. A living thing must act or be ready to act in relation to things at locations. Perceiving and acting are structural responses of the whole body to located things and events. They are neither spatially nor temporally separable, being organized simultaneously by overlapping cortical networks. Current action determines sensory response as well as being determined by it.

Being physically oriented is the first requirement in attending to something; for both sensing and acting we must first be oriented by movements of at least our sensors, and often also of head, limbs and the rest of the body. Once oriented we adjust in more comprehensive ways by altering sensor and central nervous structures so they are increasingly relevant both to our motivational state and to the thing attended. Once oriented and engaged in these ways, we are able to accumulate and fine-tune the structures by means of which we are about that thing; over time we are able to be more exactly related to it.

In primates, as in earlier mammals, the dorsal stream coordinates action, perception for purposes of action, and action for purposes of perception. Motor behavior is organized by activity distributed over an area of the dorsal stream reaching from sensory association areas to primary motor populations in the frontal lobe. The superior parietal lobule is a mid-stream area in which through-lines from audition, vision, touch, and proprioception converge on the way to motor cortex. It also has abundant back-connections from motor areas. Its sensitivities thus include many kinds of sensor-sensor and sensory-motor covariance. Complexly covariant matrices in the SPL are active in preparing, staging, and monitoring motion.

Visual response in the dorsal stream tends to be response to large areas of the visual field. It is rapid and transient, suited to perception of motion or perception by means of motion. It includes peripheral vision, which can for example guide arm motion when the center of the eye is directed elsewhere. It is frequency-insensitive and has relatively coarse spatial resolution, but it is sensitive to small contrasts in illumination intensity. These response characteristics suggest that in addition to motion perception it is mainly effective for picking out the relative locations and outer edges of objects and large-scale environmental forms.

Parallel substreams in the dorsal stream organize subactivities such as eye motion, reach and grasp, which must be segregated because they use different muscle groups and because they may be guided by different sensory modalities or by different aspects of sensor response. While reach is, for instance, being guided by muscle and joint proprioception and/or by peripheral vision, grasp may be guided by focal vision. In monkeys, the sensory-motor through-line for grasp organization is correspondingly more ventral than through-lines for reach.

Subactivities such as reach and grasp must, at the same time, be coordinated for joint effect in present circumstances. To satisfy both constraints -- structural segregation and momentary functional integration -- complex action is organized progressively by recurrent wide nets anchored at many points in the broad areas of dorsal cortex that have multimodal sensitivities. Cycled stabilization of these recurrent nets finds covariances relevant to the action in progress and guides behavior concurrently adapted to object, organism and spatial context.

When we have understood (only) a little about the segregated and integrated organization of action components in the SPL, it becomes somewhat clearer what task axis means in practice. Whatever it is that happens in dorsal cortex, in the recurrent cycling of activity through the complicated mysterious matrices of parietal and frontal gradients, must be a working out of the multiple covariant dependencies that set up these coordinations of axial aboutness.

If we avoid representation metaphors and instead try to imagine how a body manages itself spatially, these are relatively secure starting points. The rest of this chapter will be more speculative.

Place sense and structured wholes

Place sense is also a basic but not a simple ability:

Animals and children learn the habitat by locomotor exploration ... they see where to go and how to go, and what places are good for what. They connect the hidden places with the place that is seen now from here. They can 'orient' (turn to) relevant places. They see where they are relative to where they might be. Is this perceiving, or remembering, or knowing, or is it behaving? Or what? Gibson 1982, 293

Aguirre's 1997 PET study of human wayfinding showed that large-scale spatial function is like near-space object manipulation in having subabilities: navigation by landmark and navigation by path integration have been found to set up overlapping but clearly different nets that tap into dorsal and ventral streams to different degrees.

The most immediate way of moving in a large space is by panoramic perception, auditory or visual. We can move around in a space because we see and hear what's where. Navigation by landmarks uses panoramic perception in combination with object memory; a visual landmark or a soundmark is recognized, remembered, often named. As we would expect, landmark-based navigation makes intensive use of object vision areas and object memory areas in the parvo-based temporal stream.

Path integration (also called heading, dead reckoning or inertial homing) is a more abstract form of place sense based on some sort of cumulative tracking of distances and angles. It allows us to know where something is from where we are, even when we cannot use landmarks. It allows us, for instance, to take short cuts across unfamiliar terrain or retrace a path whose landmarks are unrecognizable when we approach from another direction. Path integration normally works in conjunction with panoramic and landmark navigation, but it also is a segregable subability; even young children accurately localize targets on the floor of a large room after walking and turning along a disjointed route blindfolded (Ingle 1993, 248).

Non-primate mammals without parvocellular vision often have excellent path integration abilities and are able to find their way in extensive ranges. The evolutionary earliness of this ability suggests it must be anchored in the dorsal stream. Aguirre found that human experimental subjects navigating by path integration rather than by landmark do make more use of parietal areas (1997, 2512).

Path integration is often imagined as calculation and mapping (multiplying rate of motion by time spent and laying down a record of turns relative to a starting axis). We know animals do not literally calculate, and blindfolded children who can point to the spot where the target is probably do not explicitly remember their turns, speeds and times. They just point to the target. Our ability to draw and use maps depends on a developed place sense, and so it is generally damaged by the kinds of parietal lesions that damage place competence, but children are able to find their way and retrace their steps long before they are able to draw or use maps. Is this perceiving, or remembering, or knowing, or is it behaving? Or what?
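As an illustration only -- the point above is precisely that animals do not perform this calculation explicitly -- the bookkeeping that path integration is often imagined as can be sketched in a few lines. The function and route here are hypothetical, chosen to show the form of the calculation, not any organism's method:

```python
import math

def path_integrate(legs):
    """Dead-reckoning sketch: accumulate position over (turn, speed, time) legs.

    Each leg is (turn_degrees, speed, duration): a turn relative to the
    current heading, then forward motion at speed for duration.
    Returns (x, y, heading) relative to the starting point and starting axis.
    """
    x = y = 0.0
    heading = 0.0  # degrees; 0 = along the starting axis
    for turn, speed, duration in legs:
        heading += turn              # record of turns relative to the starting axis
        distance = speed * duration  # rate of motion multiplied by time spent
        x += distance * math.cos(math.radians(heading))
        y += distance * math.sin(math.radians(heading))
    return x, y, heading

# A disjointed route: out 4 units, left 90 degrees, across 3, left 90, back 4.
x, y, heading = path_integrate([(0, 1.0, 4), (90, 1.0, 3), (90, 1.0, 4)])
# The homeward vector is simply (-x, -y): the integrator "points" at the
# start without the record of individual turns ever being consulted again.
```

The only point of the sketch is that the cumulative state is tiny -- a position and a heading -- which is consistent with the blindfolded child who points at the target without recalling the route.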

Knowing what angle to take from a starting point in order to arrive at a place presently out of sight is not simply a matter of geometry. It also involves knowing what you will encounter on the way, whether for instance there is a lake you will have to skirt. Scott Mainwaring suggests that survey knowledge may treat large spaces as "a single complex object" we are inside (Mainwaring 1993, 249). It is an object whose perception must include remembering or imagining; even the moment's panoramic perception requires something like memory, because when we are in a large open space we can never look or listen to everything at once.

This suggestion seems to apply to place sense in general, whether we are navigating by landmarks, by path integration, or by some combination. It takes time to see, Gibson says: we glance about and thereby maintain and accumulate the structure by means of which we are able to be about that place.

Mainwaring's point is that this temporally extended perception of place is different in scale but not necessarily different in character from the sense of object form we have in relation to a smaller thing. We can shape our hand to grasp a cup by a handle we can't see from our present visual angle. Along with the things we are presently seeing in a landscape, we are able to be about or related to parts of a place we were seeing focally a second ago, an hour ago, or years ago. Along with the sense of present motion in this terrain, we have a sense of possible paths. We are related to structural wholes by means that must be active in the moment but need not be built from scratch in the moment, and this sort of relation to wholes can be constructed for large territories as well as same-size or smaller things, or even conspecifics.

The actual structural reconfigurations that enable complex place learning are poorly understood. Researchers have been concentrating on the hippocampus because large-scale navigation differs from object handling and mid-scale object targeting primarily in the amount of hippocampal and parahippocampal involvement. The hippocampus and parahippocampus are medial areas (cortical areas inside the fissure that divides the brain into two hemispheres) intensively interconnected with saliency areas. Their medial position puts them in contact with both ventral and dorsal streams.

The hippocampus has long been known to be essential to memory (Haxby et al 1996); hippocampal lesions interrupt the ability to set new memory, but they do not harm the ability to re-evoke established memories. The hippocampus has also been found to be particularly important to knowledge of large spaces. London cabbies have recently and famously been found to show hyperdevelopment of the posterior hippocampus, which has intensive recurrent connections with the posterior parietal (Gatehouse 2000).

Familiarization with an environment sets up place fields in the rat hippocampus. When a rat enters a new maze, single hippocampal cells develop place-specific sensitivities over the course of several minutes: that is, the cells will respond more actively whenever the rat is at a particular position in the maze, regardless of its orientation. If the rat is moved to another environment, the cells will have other sorts of sensitivity, but as soon as the rat is returned to the original environment the original place sensitivities are restored (Ungerleider 1995, 773).

Place response in the hippocampus is said to be allocentric -- absolute or non-egocentric -- because it is independent of the rat's orientation; but absolute location alone does not account for all the variance in place cell response. Position response also varies with interoception of self-motion and with the presence of visual markers at the environmental periphery (Gallistel 1990). In rats, that is, a place sense that resembles path integration seems to include a systematic covariation of motion perception with the transformational geometry of surrounding surfaces defined by edges.

As suggested by its name, the parahippocampus is a structure next to the hippocampus. It is also a medial structure, more anterior than the hippocampus and therefore more tied to the ventral stream. A focus responsive to houses has been found in the parahippocampus near a face perception area in temporal cortex, and it is thought that the parahippocampus may be active when we move about in smaller spaces. The parahippocampus is more active in landmark navigation than it is in path integration (Aguirre 1997).

Women and girls are in general better at navigation by landmark than men and boys, and they also have better memory for the location of objects on a small surface such as a table top. Boys and men are more likely to use path integration, and they have an advantage in spatial targeting behaviors (Kimura 1992). It may be that the genders use the hippocampus and parahippocampus to different degrees.

In Aguirre's studies, which used computer-generated virtual reality scenes to simulate navigation in large spaces, both genders were found to be using the right hippocampus. Most males were also using the left hippocampus, but females were more likely also to be using an area in right hemisphere frontal cortex.

Ventral areas active in perceiving wholes with parts have shown some lateralization of function. The right hemisphere is thought to respond to wholes in which parts have fixed and permanent relations (living things, cars, buildings, faces) (Kosslyn et al 1995). This sort of whole is recognizable on the basis of a part: if we see a nose we can be sure there is a face nearby (whereas if we see a t or an e we must wait to see more). The use of the right hemisphere for navigation makes sense if locations in a place are being understood as parts of a permanent whole.

The left hemisphere is thought to be the hemisphere specialized for recognizing compositional wholes, such as words made of letters that can appear in other kinds of wholes. Left hemisphere involvement in male navigation could suggest that mobile points are more important in the male sense of place.

Points, surfaces and marks on surfaces

The perception of surfaces, I argue, is radically different from the perception of markings on a surface. The former kind of perception is essential to the life of animals, but the latter is not. The former is presupposed when we talk about the latter, and we cannot understand the latter unless we understand the former. Gibson 1951/1980

We can hear sound sources at distinct distances and angles, and we can hear the motion of sounding objects relative to ourselves and relative to other sounding objects. We can in fact hear an ordered circumambient world, with sounding points -- the clink of a cup -- sounding trajectories -- a helicopter up there, a mosquito just here -- and sounding regions -- seashore in front, trees behind. We cannot ordinarily hear shapes or oriented planes.

'Facial vision', which uses the skin around the ears to sense changes in pressure, is a sense intermediate between touch and hearing. People who rely on facial vision because they cannot see are able to feel/hear surfaces if they use reflected rather than emitted pressure waves. Like bats, they can detect the shape and orientation of a large surface like a wall (Mehta 1957, 250). Unlike bats, they probably cannot hear fine surface textures.

A wall can be felt/heard by the use of reflected pressure waves in the same way it can be seen using reflected electromagnetic waves. The shape of a surface can be perceived because there are abrupt drop-offs of acoustic or optical values at its edges. The slope or orientation of a surface can be perceived because an oriented surface reflects an array of continuously changing optical or acoustic values.

An object's form, the disposition of its edges and surfaces, is its most important perceivable characteristic for most human purposes, Gibson says (1951/1980). What we see when we see edges and surfaces is the object's mass geometry, its oriented volume.

Because three-dimensional relations among surfaces and edges of a solid object are fixed, a continuous shift of viewpoint onto that object will result in continuous changes of perspective. There will be very rapid changes in relation to a small object we pick up or walk past, and slower changes in relation to a large object that encloses us, but in either case the changes will be systematic. This systematicity is part of what it is to perceive an object or space as a whole thing independent of other things.
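The systematicity described above can be illustrated with a toy pinhole projection -- all numbers and names here are hypothetical, chosen only to exhibit the form of the dependence. As the viewpoint shifts continuously past a fixed corner of a solid object, the corner's projected position changes continuously and in a consistent direction, never in jumps:

```python
def project(point, camera_x):
    """Pinhole projection of a 3-D point onto an image plane.

    Toy setup: camera at (camera_x, 0, -5) looking down +z, focal length 1.
    """
    x, y, z = point
    x -= camera_x          # viewpoint shift along the x axis
    z += 5.0               # distance from camera to the point
    return (x / z, y / z)

corner = (1.0, 1.0, 1.0)   # a fixed corner of some rigid object
# Slide the viewpoint in small steps; the image position slides too.
images = [project(corner, cx / 10.0) for cx in range(11)]
steps = [abs(a[0] - b[0]) for a, b in zip(images, images[1:])]
# Every step in the image is small and in the same direction: the change
# of perspective is continuous and systematic, because the object's
# three-dimensional relations are fixed.
```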

Mass geometries discovered through peripheral vision and motion sensitivities are particularly important at the scale of whole environments, because movement perspective -- the layout of velocity gradients that invariably accompanies our own motion -- is important to the sense of interrelated relative distances that is the panorama.

[4-18 Gibson's illustration of movement perspective]

Systematic changes in the relations of edges are also important to the feeling of our own motion as continuous within a connected space. If we turn our head to the side while moving forward, velocity gradients will transform continuously: there will be smooth changes in the direction and rate of flow of contrasts across the retina as they shift from radial expansion to lateral translation. Above all, optical flow geometries give us the sense of being centered in the midst of a space we can move into.

[4-19 Radial expansion]

Retinotopic matrices at many cortical locations are capable of responding to layouts of oriented discontinuities. Matrices in many parts of visual through-streams are sensitive to velocities and directions of change of these discontinuities. Cells in areas MT and MST, on the way into the parietal, also respond to types of overall change: radial flow (contractions or expansions), deformations, rotations, and rigid translations in various directions.
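A hedged sketch of what sensitivity to these types of overall change could amount to computationally -- this is not a model of MT/MST cells, only an illustration of the flow-field categories just listed: the first-order structure of a sampled patch of optical flow separates into divergence (radial expansion or contraction), curl (rotation), and mean flow (rigid translation).

```python
import numpy as np

def flow_summary(x, y, u, v):
    """Decompose a sampled 2-D flow field (u, v) into overall change types.

    Divergence signals radial expansion/contraction, curl signals rotation,
    and the mean flow is the rigid-translation component.
    """
    dx = x[0, 1] - x[0, 0]
    dy = y[1, 0] - y[0, 0]
    divergence = np.gradient(u, dx, axis=1) + np.gradient(v, dy, axis=0)
    curl = np.gradient(v, dx, axis=1) - np.gradient(u, dy, axis=0)
    return divergence.mean(), curl.mean(), (u.mean(), v.mean())

x, y = np.meshgrid(np.linspace(-1, 1, 21), np.linspace(-1, 1, 21))
# Radial expansion, as during forward self-motion: flow points away from center.
div, curl, translation = flow_summary(x, y, x, y)
# div is positive (expansion); curl and mean translation are (near) zero.
```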

Different retinotopic matrices can be responding simultaneously to different retinal field geometries -- distinguishable at different frequencies or combinations and contrasts of frequencies -- for different purposes. Gibson makes this point in relation to movement around a rectangular surface:

The perspective transformation of a rectangle, for example, was always perceived as both something rotating and something rectangular. This suggests that the transformation, as such, is one kind of stimulus information for motion and that the invariants under transformation are another kind of stimulus information for the constant properties of the object. Gibson 1982, 285

One sort of optical flow may be used in tracking an object, while another is being used to bring the arm into line with the gaze and yet another is being used to move through terrain. We do not need to see these different sorts of optical structure, although they are important in organizing us to be able to see.

Earlier in this chapter I described reach and grasp as organized by two parietal-frontal sensorimotor subsystems, the latter more ventral than the former. The two subsystems are segregated because they use different muscles, but also because they use the senses in different ways, reach using peripheral vision and shoulder and arm somatosense, and grasp using central vision and hand somatosense.

If vision for purposes of reach is vision simply in terms of a spatial location, while vision for purposes of the hand is volumetric -- also abstract, but less so -- we could go on to imagine vision for the purposes of the fingertips as a kind of texture vision, vision in terms of marks on surfaces.

I have described two vision systems: one, the older, responsive to edges, with less spatial resolution but more contrast sensitivity; the other, newer, color-sensitive and with high spatial resolution, responsive to qualities of surfaces. The two systems normally work together to give us the sort of vision we know and love, with wondrous surface detail clothing a solid geometry of independent objects moveable like us in an enclosing place. Now suppose we could turn off the vision of marks on a surface and be left with nothing but vision of surfaces as defined by edges. How would the world look?

Like a wireframe animation, presumably, with spatial layouts very adequately understood on the basis of edges, perspectives, and the optical occlusion of some things by other, nearer, things. A minimal volumetric vision would be enough to allow us to find our way across a terrain, aim at landmarks, grasp small objects.

It would not allow us to see that our child was flushed. It would not allow us to tell the difference between ripe and green fruit, or between a burning coal and a cold one. It would be enough for many kinds of spatial action but not enough for many sorts of action decision.

Sensed spatial structure

With landmark navigation, we are aware of aiming toward our marker, which we are seeing surrounded by shapes and textures of an extended landscape. We see it, and we sense our movement toward it. Path integration or heading is less visual, it seems: we also aim ourselves toward a point but it is a point we do not see. We might say we feel it as stably there, and we feel ourselves moving in relation to it (Ingle 1993, 248).

In the section above, I described two similarly abstract kinds of spatial aboutness: audition's alertness to a surrounding world of coexistent stable and moving points, and active vision's interest in three-dimensional volumes defined by edges, corners and oriented surfaces.

What does abstract mean in these instances? Path integration seems to be a sense of related points embedded in a place as we know it, a bare where of the place, a where without the many kinds of what normally included in place knowledge. Similarly, the reach system seems to need only the point of object location relative to the shoulder, and handshaping needs only located surfaces, oriented and bounded by edges. It is as if, while temporal object vision matrices are effecting a detailed perception of the thingness of things, parietal systems are tracking them as points and surfaces related to other points and surfaces.

Perception for purposes of reach and grasp is structural in the sense that, like path integration, it is a sensing of spatial order. But it is not clear how to think of this structural sensing. Should we think of it as included in, but segregable from, the normal fullness of perception -- a sensing of related points, navigable surfaces, paths to intended points, oriented volumes with graspable surfaces, and so on -- something like path integration, something like the wireframe vision described above? Could we think of it as experienced, part of sentient perception, the way color vision is experienced as part of seeing? Would it make sense to say there is a higher-order, multimodal, but integral sort of dorsal perception that is not vision as we usually imagine vision, not audition although like audition, not touch, since it occurs at a distance and without physical contact, and yet more like perception than it is like action, although action is necessary to it?

In Chapter 9 I will describe evidence of an experiencable structure subsystem of this kind -- evidence drawn from facts of grammatical form, from the phenomenology of music and language, from outline and perspective drawings, and from the early development of mathematics. This material suggests a structural or spatial sense participating in sentient vision in an effective but meagerly visual way, and also including touch, somatic self-monitoring, muscle memory, action planning, and action for purposes of perception.

We can probably assume that the core dynamical subnet of conscious vision is dominated by energetically hyped ventral matrices dealing with color, texture, object identity, and all the factual and emotional associations recruited by those energized circuits (Farah 1990, Pollen 1999). We've seen that dorsal vision, constantly active in organizing the muscular means both of action for purposes of perception and of action per se, is definitely a kind of perception; but at the same time, being magno-based, it is a kind of vision minimally included in the visual hyperactivity of the parvo stream.

If we imagine dorsal vision as part of the core dynamical subnet active when we deal with located things, but in a manner that is minimally visual and mainly something else, what might the something else be?

Touch and distance

It is significant that this specifically or abstractly spatial sense seems to be embodied at an intersection of vision, touch, proprioception and action.

Cell populations of a macaque intraparietal area have combinations of sensitivity that suggest spatial perception may be inherently multimodal. Many cells in the area have somatic fields; they respond when a small area of skin is touched. These somatic cells are often also motion-sensitive: they respond preferentially when something is being moved across the skin. A large percentage have direction preferences: they may have facilitative response with movement toward some point but inhibitory response with movement away from it.

The receptive field of this sort of somatic cell is a region of skin surface, but some of these somatic cells also have a visual receptive field: they respond visually when an object moves toward their somatic receptive field. The combined visual-somatic receptive field of these cells is thus a region of skin along with a region of visible space next to the skin -- a region of space on and near the surface of the body. Spatial regions on and around the monkey's muzzle are small; visual response in these areas is to moving objects seen with the center of the eye. Spatial areas on and around the side of the head or body are larger, and visual response is to motion seen peripherally. The cells respond sooner when the object is moving faster; that is, the visual receptive field is deeper for more quickly moving objects. The sensitivities of these cells thus seem to have something to do with preparing for contact. Like many parietal cells, these somatic-visual cells may also respond during the animal's own motion. Motor response sensitivities include various kinds of arm and mouth motion; motor connections are to arm and mouth fields in primary motor cortex (Duhamel, Colby and Goldberg 1998).

Imagine a baby setting out to get its thumb into its mouth at some early stage while multimodal parietal connectivities are still being worked out. The eye feels itself focusing on a there where the thumb is feeling itself at the end of a stretched arm, while the mouth is feeling itself here, where the thumb isn't. From the here where the mouth is feeling itself, to the there where the thumb is feeling itself, is a spatial tension, which is the eye's tension as well as the arm's. Then the eye feels itself seeing the thumb getting closer, the arm feels itself folding, the torso feels the arm arrive, and the mouth feels the thumb touch down. And then arm, thumb and mouth all feel themselves here, and the eyes relax.

And then imagine, a little later, the feeling of a pointing or a reaching gesture, where the eye pulls focus to the distant object while the hand is stretched as far in that direction as it will go. The sensed thereness of the distant point will include the arm's sensed stretch from here to there, shoulder to hand, as well as the eye's sensed stretch of confocal tension maintained.

Distance in these instances is a muscular relation; different distances evoke different states of at least the muscles controlling the lens, and probably also of muscles ready for the burst of action that would be needed to attain something at that distance. Doesn't the sense of space go on having a muscular feel about it, a tension of here to there?

More than one thing

When we reach for a located object and grasp it, there is one task axis organizing both reach and grasp subsystems. Reach and grasp are two ways of being about a single object. We also very often have to be about more than one thing at a time. Integrated experience of foreground and background objects, or parts and wholes, is a common perceptual state. The monkey grooming her baby has to be about the baby while she is also, more focally, more attentively, about the nit. She will be maintaining two axes, each of which may set up an object subnet. The two axes would be organizing themselves simultaneously, maybe even on the basis of response by the same sense modality, but probably by means of networks including different matrices with different sensory mixtures, so that the two objects will be perceived in different ways. One may for instance be perceived as a foreground object while the other is perceived as a background object. She is spatially related to her baby by her posture, holding the baby with one of her arms. Her focal relation to the nit includes the other arm and its hand, as well as the center of her gaze.

The mother monkey will have to be about background and foreground objects in an integrated way. Each motor subsystem must be guided by the correct combination of senses. The timing of separate motions must be correct, their separate cycles nested. The whole body must hold itself stable while a point is held in focus by eye muscles, while an arm transports a hand to that point, and while the fingers form themselves with precision.

It is not clear how multiple task axes are established and coordinated. It seems likely that, although any task axis requires dorsal gaze system involvement, networks relative to foreground objects have more ventral participation, and networks relative to background objects are more exclusively dorsal. Would this arrangement also work for wholes and parts? In part-whole perception is there a dorsal axis tracking the whole object as if it were a background object, while a more ventrally connected subnet responds to parts or features?

Parietal lesions, particularly in the right hemisphere, can damage the ability to maintain more than one task axis; they result in various kinds of dorsal simultanagnosia. If two pencils are held up in front of the simultanagnosic, he or she may see only one of them. Shown a reversible figure, the simultanagnosic is unable to reverse it. If a single object (a flag for instance) has parts with strong forms of their own, the simultanagnosic may not be able to see the object as a whole (Kosslyn 1987, 152).

Shifts and sequences of shifts

In talking about spatial function we need above all to account for the sense we have that things or points are related -- to each other as well as to ourselves: the bird chirping above the stream over there, the handle on this side of the cup in front of us, those two people talking to each other. Aguirre calls it configural ability. Configural ability is a crucial basic-level ability, and very little is known about it.

Because dorsal simultanagnosics cannot see more than one item at a time, they also cannot see spatial relation. Simultanagnosics shown a drawing or photo including more than one object may be able to see first one object and then another, but they will be unable to understand the picture as a narrative, because they will not be able to see those objects as related to each other.

Perceiving relatedness, which is to say perceiving things and parts of things as related, is normally not separate from perception of multiple things or whole scenes. If we are able to see different objects at the same time, each at its own location, do we automatically see them as related? Or is the sense of relation something else, something added?

Sequences of gaze shifts must be a way of maintaining structure relevant to separate parts of a task. We look from the cutting board to the salad bowl and back. Social tasks often need this sort of sequence too: we look at someone, and then look at the thing we are talking to each other about. It is a way of maintaining relatedness to both.

Maybe we register relation in the process of looking from one thing to the other, so that spatial relatedness would have the feel of a gaze trajectory. If spatial relation is sensed by means of eye motion, it would be sensed by means of the eyes but not exactly seen -- maybe more like felt and remembered, a combination of felt tensions of the eye and other orienting parts of the body, and some kind of integrated muscle or act memory. We would see the relation of the table and chair by seeing the table, seeing the chair, looking back and forth between the two of them, and accumulating a sort of integrated memory of the direction, distance, sequence and feeling of those saccades and fixations.
