As research in the field of perceptual experience advances, neuroscientists continue to uncover new methods to see through the eyes of human beings – almost literally. Using functional magnetic resonance imaging (fMRI), neuroscientists can observe brain activity while someone performs a task or responds to a stimulus.

This method of analyzing the brain allows researchers to create categories of perception by matching brain activity, measured by fMRI, to a specific stimulus, such as an image. For example, if a student is shown an image of a husky, this stimulus and the corresponding fMRI data collected from the student are combined to form the category ‘husky’. This pairing forms a unique category that can be used as a template for examining and, in a sense, translating future patterns.
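The pairing-and-matching idea above can be sketched in code. This is a minimal illustration with synthetic data, not the actual analysis pipeline: each "category" is just the voxel pattern recorded for one training image, and a new pattern is labeled by whichever stored template it correlates with best.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each category is defined by the voxel pattern
# recorded while a subject viewed one training image.
n_voxels = 200
templates = {
    "husky": rng.normal(size=n_voxels),
    "basket": rng.normal(size=n_voxels),
    "airplane": rng.normal(size=n_voxels),
}

def classify(pattern, templates):
    """Return the category whose stored template correlates best
    with the new activity pattern."""
    scores = {
        label: np.corrcoef(pattern, tmpl)[0, 1]
        for label, tmpl in templates.items()
    }
    return max(scores, key=scores.get)

# A noisy re-presentation of the husky image should still match its template.
noisy_husky = templates["husky"] + 0.5 * rng.normal(size=n_voxels)
print(classify(noisy_husky, templates))
```

The weakness discussed next is already visible here: every new stimulus needs its own stored template, so the dictionary grows without bound.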

Though the technique can yield important data and lead to interesting findings, it has limitations, foremost among them the virtually infinite number of possible categorizations. Developing a separate pairing for every stimulus (e.g. husky, basket, airplane, run) would severely impede fMRI research, making such an approach all but unworkable.

In addition, establishing a baseline of neural activity and distinguishing it from stimulus-induced activity further complicates the creation of perceptual categories.

By using quantitative receptive field models, researchers have developed a possible solution to the categorization problem. These models take into account various elements of the visual receptive field, such as orientation and spatial frequency, and measure them quantitatively to better reconstruct the visual stimuli we perceive. Dr. Kay and his colleagues at the University of California, Berkeley, recently developed a broader quantitative receptive field model by gathering data from people in an fMRI machine who viewed over 1,000 natural images.

After integrating this data and creating a visual decoder that translates fMRI data into a reconstructed image, they demonstrated that any novel image could be identified based on the data from the model’s initial set of images.
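The identification step can be illustrated with a toy linear encoding model. This sketch uses random feature vectors and weights purely for demonstration (the real model derived its features from receptive field properties of visual cortex): the decoder predicts the voxel pattern each candidate image should evoke, then picks the candidate whose prediction best matches the measured pattern.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear encoding model: each voxel's response is a
# weighted sum of image features. All values here are synthetic.
n_images, n_features, n_voxels = 50, 30, 100
images = rng.normal(size=(n_images, n_features))   # feature vectors of candidate images
weights = rng.normal(size=(n_features, n_voxels))  # fitted receptive-field weights

def identify(measured, images, weights):
    """Pick the candidate image whose predicted voxel pattern
    best matches the measured fMRI pattern."""
    predicted = images @ weights  # predicted response for each candidate
    scores = [np.corrcoef(measured, p)[0, 1] for p in predicted]
    return int(np.argmax(scores))

# Simulate a measurement: the subject viewed image 7, plus scanner noise.
measured = images[7] @ weights + 0.3 * rng.normal(size=n_voxels)
print(identify(measured, images, weights))
```

The key advantage over simple template matching is that the model generalizes: it can score images it was never trained on, because the prediction comes from image features rather than stored examples.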

Gathering fMRI data for 1,000 images is time-consuming, however, and still limits the utility of the model. Thus the current primary goal of visual image reconstruction is to produce an optimal visual decoder that can better predict a novel image from a limited amount of fMRI data.

A group of Japanese neuroscientists, headed by Yoichi Miyawaki and Hajime Uchida at the National Institute of Information and Communications Technology, Kyoto, has broken new ground by developing a system based on a combination of multi-scale local image decoders. Initially, a person is shown an image that triggers neural activity in multiple regions of the occipital cortex.

These signals can be broken down into patterns of 3D pixels, or voxels, which are further decomposed by visual decoders called multi-scale local image decoders. These decoders quantitatively translate fMRI data into a pattern of contrasts, like a mosaic with squares of varying shades. One mosaic, or scale, can be composed of large squares organized in a 2×2 manner, giving a rather blurry general impression. Another mosaic might be more refined, with a 2×1 layout of squares, or finer still, with 1×1 squares. These multiple scales can be integrated to better reconstruct a contrast pattern of the image, which provides a close approximation of the actual image the person was shown.
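Combining scales can be sketched as follows. This toy example assumes two decoder scales over a small binary image: one decoder per pixel (1×1) and one per overlapping 2×2 patch, each patch decoder reporting the mean contrast of its patch. Here every prediction covering a pixel is simply averaged; the actual system learns weights for the combination.

```python
import numpy as np

def combine_scales(pred_1x1, pred_2x2):
    """Merge contrast predictions from two decoder scales into one image.

    pred_1x1: (H, W) array, one decoder per pixel.
    pred_2x2: (H-1, W-1) array, one decoder per overlapping 2x2 patch,
              each predicting the mean contrast of its patch.
    """
    H, W = pred_1x1.shape
    total = np.array(pred_1x1, dtype=float)
    count = np.ones((H, W))
    for i in range(H - 1):
        for j in range(W - 1):
            total[i:i + 2, j:j + 2] += pred_2x2[i, j]  # spread patch estimate
            count[i:i + 2, j:j + 2] += 1
    return total / count  # average all predictions covering each pixel

# Toy stimulus: a bright cross on a dark 4x4 background.
true = np.zeros((4, 4))
true[1:3, :] = 1.0
true[:, 1:3] = 1.0

rng = np.random.default_rng(2)
noisy_1x1 = true + 0.4 * rng.normal(size=(4, 4))
# Each 2x2 decoder reports the (noisy) mean contrast of its patch.
means_2x2 = np.array([[true[i:i + 2, j:j + 2].mean() for j in range(3)]
                      for i in range(3)])
noisy_2x2 = means_2x2 + 0.2 * rng.normal(size=(3, 3))

recon = combine_scales(noisy_1x1, noisy_2x2)
```

The coarse decoders pull each pixel toward its neighborhood average, which suppresses noise at the cost of some blur around edges, consistent with the fuzziness described below.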

This new method enabled scientists to accurately reconstruct arbitrary images (drawn from a set of up to 2^100 possible images) in a single trial. For example, a person in an fMRI scanner was shown an image of the word “neuron” in white letters on a black background. Various fMRI activity patterns were recorded from the person’s visual cortex, and the new model analyzed, interpreted, and integrated these patterns to produce a reconstructed image.

After this process was repeated over five trials, the mean reconstructed image resembled the word “neuron”, although there was a great deal of fuzziness around the edges of the letters. This same process was repeated for simple geometric images like black and white squares and crosses, all with similar results.
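The benefit of averaging across trials can be shown with a small simulation. This is an illustrative sketch, not the study's procedure: each trial produces a noisy single-trial reconstruction of a hypothetical 10×10 binary stimulus, and averaging five of them reduces the per-pixel error.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 10x10 binary stimulus.
true = rng.integers(0, 2, size=(10, 10)).astype(float)

# Each trial yields the true pattern corrupted by independent noise.
trials = [true + 0.8 * rng.normal(size=true.shape) for _ in range(5)]

single_err = np.abs(trials[0] - true).mean()
mean_recon = np.mean(trials, axis=0)
mean_err = np.abs(mean_recon - true).mean()

# Averaging n independent trials shrinks the noise roughly by sqrt(n).
print(single_err, mean_err)
```

The residual fuzziness around letter edges is exactly what this model predicts: averaging reduces noise but cannot remove it entirely within a handful of trials.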

While this new model has provided some of the clearest images in the field of visual image reconstruction, it is far from perfect. The system lacks precision, and as such produces noisy images with low resolution. Despite these limitations, this research can be considered a step forward in refining current models of reconstruction. With any luck, the continued efforts of scientists may bring things into focus.