Vision is the sense we depend on most in our daily lives, and it is complex. Despite the huge strides recently made in artificial intelligence and image processing, our brains still process images far better than any machine. So how do we do it?

From the eye to the brain

The axons of ganglion cells exit the retina to form the optic nerve, which travels to two places: the thalamus (specifically, the lateral geniculate nucleus, or LGN) and the superior colliculus. The LGN is the main relay for visual information from the retina to reach the cortex. Despite this, the retina only makes up about 20% of all inputs to the LGN, with the rest coming from the brainstem and the cortex. So more than simply acting as a basic relay for visual input from retina to cortex, the LGN is actually the first part of our visual pathway that can be modified by mental states.

The superior colliculus helps us to control where our head and eyes move, and so determines where we direct our gaze. Saccades, the jumpy eye movements that you are using as you read this text, are also controlled by the superior colliculus. As with the LGN, the superior colliculus receives strong input from the cortex, which provides the dominant command as to where our gaze moves.

Cortical processing of visual input

From the thalamus, visual input travels to the visual cortex, located at the rear of our brains. The visual cortex is one of the most-studied parts of the mammalian brain, and it is here that the elementary building blocks of our vision – detection of contrast, colour and movement – are combined to produce our rich and complete visual perception.

Most researchers believe that visual processing in the cortex occurs through two distinct 'streams' of information. One stream, sometimes called the What Pathway (purple in the image below), is involved in recognising and identifying objects. The other stream, sometimes called the Where Pathway (green), concerns object movement and location, and so is important for visually guided behaviour.

[Image: the brain's visual pathway. Credit: Selket/Wikimedia]

Building our visual world step by step

Our visual cortex is not uniform, and can be divided into a number of distinct subregions. These subregions are arranged hierarchically, with simple visual features represented in 'lower' areas and more complex features represented in 'higher' areas.

At the bottom of the hierarchy is the primary visual cortex, or V1. This is the part of the visual cortex that receives input from the thalamus. Neurons in V1 are sensitive to very basic visual signals, like the orientation of a bar or the direction in which a stimulus is moving. In humans and cats (but not rodents), neurons sensitive to the same orientation are located in columns that span the entire thickness of the cortex.

That is, all neurons within a column would respond to a horizontal (but not vertical or oblique) bar. In a neighbouring column, all neurons would respond to oblique but not horizontal or vertical bars (see image below). As well as this selectivity for orientation, neurons throughout most of V1 respond only to input from one of our two eyes. These neurons are also arranged in columns, although they are distinct from the orientation columns. This orderly arrangement of visual properties in the primary visual cortex was discovered by David Hubel and Torsten Wiesel in the 1960s, for which they were later awarded the Nobel Prize.

[Image: primary visual cortex]

Orientation columns in primary visual cortex, as viewed from above. All neurons within a column respond preferentially to bars of a specific orientation, denoted here by colour. Crair et al./Wikimedia
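
The idea of orientation-tuned columns can be sketched as a toy model. The Python below is an illustration only, not a model taken from Hubel and Wiesel's work: the Gaussian tuning curve, the 20-degree tuning width, the three column names and all function names are assumptions chosen for clarity.

```python
import math


def orientation_difference(a_deg, b_deg):
    """Smallest angle between two orientations, in degrees (0 to 90).

    Orientation repeats every 180 degrees: a bar at 0 degrees looks
    the same as a bar at 180 degrees.
    """
    diff = abs(a_deg - b_deg) % 180.0
    return min(diff, 180.0 - diff)


def column_response(preferred_deg, bar_deg, tuning_width_deg=20.0):
    """Toy response (between 0 and 1) of a column to a bar.

    Uses a Gaussian tuning curve. tuning_width_deg is an illustrative
    assumption, not a measured value.
    """
    diff = orientation_difference(preferred_deg, bar_deg)
    return math.exp(-0.5 * (diff / tuning_width_deg) ** 2)


# Three hypothetical columns and their preferred orientations in degrees.
COLUMNS = {"horizontal": 0.0, "oblique": 45.0, "vertical": 90.0}


def most_active_column(bar_deg):
    """Name of the hypothetical column that responds most strongly."""
    return max(COLUMNS, key=lambda name: column_response(COLUMNS[name], bar_deg))


if __name__ == "__main__":
    for bar in (0.0, 45.0, 90.0, 170.0):
        print(bar, "degrees ->", most_active_column(bar))
```

In this sketch a bar at 170 degrees drives the 'horizontal' column hardest, because it is only 10 degrees away from horizontal once the 180-degree repeat is taken into account.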

Moving up the visual hierarchy, neurons represent more complex visual features. For example, in V2, the next area up in the hierarchy, neurons respond to contours, textures, and the location of something in either the foreground or background.

Beyond V1 and V2, the pathways carrying What and Where information split into distinct brain regions. At the top of the What hierarchy is inferior temporal (IT) cortex, which represents complete objects – there is even a part of IT, called the fusiform face area, which specifically responds to faces. The top regions in the Where stream are involved in tasks like guiding eye movements (saccades) using working memory, and integrating our vision with our body position (e.g. as you reach for an object).

In summary, the visual cortex shows a clear hierarchical arrangement. In lower areas (those closest to the incoming visual input, like V1), neurons respond to simple visual features. As the visual input works its way up the hierarchy, these simple features are combined to create more complex features, until at the top of the hierarchy, neurons can represent complete visual objects such as a face.

Visual processing isn’t all one way

This bottom-to-top processing of our visual world may seem the logical path, but it isn’t the whole story. Such a 'bottom-up' approach would be far too slow and laborious, but more importantly, it would render our visual world full of ambiguity and we would struggle to survive. Instead, our perception relies to a very large extent on our previous experience and other 'top-down' mechanisms such as attention. QBI Professors Jason Mattingley and Stephen Williams are both studying how attention can alter visual processing, using cognitive and cellular approaches, respectively.

As an example of top-down processing, consider the image below:

[Image: top-down processing. Credit: Wuhazet - Henryk Żychowski]

Square A looks lighter, but is actually darker than square B. Clearly, our visual system is doing a terrible job at seeing reality. But that isn’t its purpose. Instead, our brains are trying to make sense of what they are seeing, rather than seeking the truth.

In the case of the above image, we automatically see – based on past experience – light and dark squares arranged in a checkerboard fashion, with a centrally lit portion and a shadow cast around the edges. With all of this information, we interpret A as a light square in shadow, and B as a brightly lit dark square. It isn’t reality, but it is the most likely explanation given all of our previous experience and the data at hand. This is how our visual system works, ultimately to help us understand the world and so promote our survival.

Here are the squares side by side: 

[Image: squares A and B side by side]
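
The interpretation of squares A and B can be sketched with a little arithmetic. The light reaching the eye from a surface is roughly the surface's reflectance multiplied by the illumination falling on it, so one simplified way to estimate how light or dark a surface is would be to divide the incoming light by an estimate of the illumination. The Python below assumes this simplification and is not a claim about how the brain actually computes lightness. The numbers are made up for illustration rather than measured from the image, and the function name is hypothetical.

```python
def inferred_reflectance(luminance, estimated_illumination):
    """Estimate surface lightness by discounting the assumed illumination.

    Simplified physical relation (an assumption of this sketch):
        luminance = reflectance * illumination
    """
    if estimated_illumination <= 0:
        raise ValueError("estimated_illumination must be positive")
    return luminance / estimated_illumination


# Made-up values in arbitrary units, not measured from the image.
square_a = {"luminance": 30.0, "estimated_illumination": 0.5}  # judged to be in shadow
square_b = {"luminance": 40.0, "estimated_illumination": 1.0}  # judged to be brightly lit

lightness_a = inferred_reflectance(**square_a)
lightness_b = inferred_reflectance(**square_b)

print("Less light arrives from A than from B:", square_a["luminance"] < square_b["luminance"])
print("A is interpreted as the lighter surface:", lightness_a > lightness_b)
```

With these example values, less light arrives from A than from B, yet once the assumed shadow is discounted A comes out as the lighter surface, which matches the impression described above.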


And to end, just because it’s fun, here’s a video with a great demonstration and explanation of how powerful top-down processing can be in visual perception.