(Very) Quick summary of recent papers

Language processing: We use naturalistic neuroimaging experiments and encoding models to study language comprehension. We built on our 2014 work aligning neural network language models with brain activity by showing that increasing this alignment can lead to better NLP performance, and by directly fine-tuning language models on brain activity recordings. We have also used naturalistic experimentation and encoding models to show that language comprehension difficulty recruits the language network and not the multiple demand regions, that syntactic information is distributed in the language network in places that also process semantics, and that a naturalistic context leads to much broader representations of word meaning than simpler contexts. We have proposed the use of post-hoc computational controls, which have revealed that the anterior and posterior temporal lobes are predicted by the new meaning that arises from combining words, but that this relationship is only visible in fMRI and not in MEG. Further, we show that the encoding models we build can be used to perform in silico replication experiments that combine the interpretability of controlled experiments with the generalizability and broad scope of natural stimulus experiments. Finally, we propose hypotheses for which types of language input are processed differently by language models and the brain, and validate these hypotheses by showing that fine-tuning on relevant tasks makes a language model more aligned with the brain.
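
As a concrete illustration of the recipe shared by these studies, the sketch below ridge-regresses fMRI responses onto language-model features of the stimulus and scores each voxel by its held-out prediction accuracy. All shapes, names, and data are synthetic placeholders, not taken from any specific paper.

```python
# Minimal sketch of the encoding-model recipe: ridge-regress fMRI responses
# onto language-model features of the stimulus, then score each voxel by its
# held-out prediction accuracy. Shapes, names, and data are synthetic.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trs, n_features, n_voxels = 1000, 768, 500   # hypothetical sizes
X = rng.standard_normal((n_trs, n_features))   # stand-in LM embeddings per TR
true_W = rng.standard_normal((n_features, n_voxels))
Y = X @ true_W * 0.1 + rng.standard_normal((n_trs, n_voxels))

# Keep the split contiguous in time, as is standard for fMRI time series.
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, shuffle=False)

# One ridge weight vector per voxel; regularization picked by cross-validation.
model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, Y_tr)
Y_hat = model.predict(X_te)

# Encoding performance: per-voxel correlation of predicted vs. held-out activity.
r = [np.corrcoef(Y_hat[:, v], Y_te[:, v])[0, 1] for v in range(n_voxels)]
print(f"mean held-out voxel correlation: {np.mean(r):.3f}")
```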

Vision: We have studied the effect of a convolutional neural network's (CNN's) training task on its alignment with brain activity, and used this to derive relationships between different computer vision tasks from the perspective of their brain alignment, and to show that training CNNs with language supervision and high data diversity leads to powerful models of the higher visual cortex. We have also constructed powerful hypothesis-neutral models of high-level visual cortex by training CNNs end-to-end to predict fMRI activity, and used them to provide strong evidence of categorical selectivity and to showcase the spatial preferences of different brain regions. We have shown that higher visual cortex exhibits biases for low-level features related to preferred categories, and have identified a new region in the ventral stream that processes food. Further, we have characterized features that are important for the visual system by studying the representation of mid-level features, investigating the representation of object size, and uncovering the spatial features most important for individual voxels. Finally, we have worked on generating optimal images for different brain regions using a diffusion model, as well as generating captions for each voxel's optimal images, with both methods providing a way to home in on the semantic selectivity of sub-areas of the visual system.
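
The hypothesis-neutral models mentioned above are trained end-to-end on brain data rather than on a proxy task. The toy sketch below shows the basic setup with a small PyTorch CNN and a linear voxel readout; the architecture, sizes, and data are illustrative only, not the networks used in the actual papers.

```python
# Toy version of the hypothesis-neutral approach: train a CNN end-to-end to
# predict voxel responses to images, rather than reusing a task-pretrained
# network. Architecture, sizes, and data are illustrative placeholders.
import torch
import torch.nn as nn

class VoxelCNN(nn.Module):
    def __init__(self, n_voxels: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        # Linear readout mapping shared features to every voxel's response.
        self.readout = nn.Linear(64 * 4 * 4, n_voxels)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.readout(self.backbone(images))

model = VoxelCNN(n_voxels=500)
images = torch.randn(8, 3, 64, 64)   # stand-in image batch
voxels = torch.randn(8, 500)         # stand-in fMRI responses to those images
loss = nn.functional.mse_loss(model(images), voxels)
loss.backward()                      # gradients reach every layer: end-to-end
print(f"batch MSE: {loss.item():.3f}")
```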

Methods: We believe that naturalistic experiments and computational modeling are promising tools for investigating brain function. We have proposed multiple extensions to encoding models, such as the use of stacking to combine multiple feature spaces, specific instantiations of variance partitioning to test more precise hypotheses, and the use of data from multiple subjects to denoise MEG responses. In older work, we showed that the spatial pattern of regularization parameters learned by cross-validation of many types of encoding models closely follows the pattern of prediction accuracy. We have also extended encoding models by proposing approaches to compare the learned representations between two brain regions and to draw conclusions more confidently about the effect of the stimulus on brain activity. Finally, we have proposed an approach for incorporating task effects into a computational model as an attention mechanism.
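
For concreteness, here is a minimal sketch of stacking over feature spaces, under the assumption of a two-stage recipe: fit one ridge encoding model per feature space, then learn per-voxel non-negative weights over the base models' held-out predictions. Actual stacking implementations may further constrain the weights (e.g., to sum to one); the feature spaces and data below are synthetic.

```python
# Minimal sketch of stacking over feature spaces: fit one ridge encoding model
# per feature space, then learn per-voxel non-negative weights over the base
# models' held-out predictions. Feature spaces, sizes, and data are synthetic.
import numpy as np
from scipy.optimize import nnls
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_tr, n_val, n_vox = 600, 200, 100
spaces = {"semantic": 300, "syntactic": 50}            # hypothetical feature spaces
X_tr = {k: rng.standard_normal((n_tr, d)) for k, d in spaces.items()}
X_val = {k: rng.standard_normal((n_val, d)) for k, d in spaces.items()}

# Synthetic responses driven by the "semantic" space only, plus noise.
W_sem = rng.standard_normal((spaces["semantic"], n_vox))
Y_tr = X_tr["semantic"] @ W_sem * 0.1 + rng.standard_normal((n_tr, n_vox))
Y_val = X_val["semantic"] @ W_sem * 0.1 + rng.standard_normal((n_val, n_vox))

# Stage 1: an independent ridge model per feature space.
preds = np.stack([Ridge(alpha=10.0).fit(X_tr[k], Y_tr).predict(X_val[k])
                  for k in spaces])                    # (n_spaces, n_val, n_vox)

# Stage 2: per-voxel non-negative combination weights on held-out predictions.
weights = np.stack([nnls(preds[:, :, v].T, Y_val[:, v])[0]
                    for v in range(n_vox)])            # (n_vox, n_spaces)
print("mean stacking weight per space:",
      {k: round(w, 3) for k, w in zip(spaces, weights.mean(0))})
```

In this synthetic setup, the "semantic" model should receive most of the weight, illustrating how the learned weights themselves can be read as a per-voxel measure of which feature space matters.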

Health and real-world application: We have shown that encoding models, beyond being useful for identifying commonalities among subjects, can also be used to identify individual differences that predict behavior and clinical diagnoses. We have also shown that an individual's MEG data can be used to identify them, and that this identification ability is maximized during a task and in areas engaged in that task. In a more clinical setting, we have worked on making machine learning useful for classifying intra-operative neuromonitoring signals to prevent nerve damage. Finally, we have proposed a transformer model for motor BCI data that is pretrained on neural spiking data from different subjects, sessions, and experimental tasks, and is rapidly adaptable to downstream decoding tasks.
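
The identification result can be illustrated with a simple fingerprinting scheme: match each subject's held-out response pattern to the closest subject template by correlation. The toy version below uses synthetic vectors in place of real MEG sensor or source time courses.

```python
# Toy fingerprinting scheme: match each subject's held-out response pattern to
# the closest subject template by correlation. Synthetic vectors stand in for
# real MEG sensor or source time courses.
import numpy as np

rng = np.random.default_rng(2)
n_subjects, n_features = 20, 1000
templates = rng.standard_normal((n_subjects, n_features))        # session 1
probes = templates + 0.8 * rng.standard_normal(templates.shape)  # session 2

def corr_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise Pearson correlation between two matrices."""
    a = (a - a.mean(1, keepdims=True)) / a.std(1, keepdims=True)
    b = (b - b.mean(1, keepdims=True)) / b.std(1, keepdims=True)
    return a @ b.T / a.shape[1]

sim = corr_matrix(probes, templates)     # (n_subjects, n_subjects)
accuracy = (sim.argmax(axis=1) == np.arange(n_subjects)).mean()
print(f"identification accuracy: {accuracy:.2f}")
```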





Aligning representations from artificial networks and real brains



Success in AI is often defined as achieving human-level performance on tasks such as text or scene understanding. To perform like the human brain, is it useful for neural networks to have representations that are similar to the brain's?

In these projects, we use brain activity recordings to interpret neural network representations, to attempt to find heuristics that improve them, and even to change the weights learned by networks to make them more brain-like. The results point to an exciting research direction.





The spatial representation of language sub-processes



In this project, we use functional magnetic resonance imaging (fMRI) to record the brain activity of subjects while they read an unmodified chapter of a popular book. We model the measured brain activity as a function of the content of the text being read. Our model is able to extrapolate and predict brain activity for novel passages of text, beyond those on which it was trained. Not only can our model be used to decode which passage of text was being read from brain activity, but it can also report which type of information about the text (syntax, semantic properties, narrative events, etc.) modulates the activity of each brain region. Using this model, we found that the different regions usually associated with language appear to process different types of linguistic information. We were able to build detailed reading representation maps, in which each voxel is labeled by the type of information the model suggests it is processing.
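
The decoding use of the model works roughly as sketched below: predict brain activity for each candidate passage from its text features and pick the passage whose prediction best matches the observed activity. Sizes, features, and data here are synthetic placeholders rather than the actual stimulus annotations.

```python
# Sketch of decoding with an encoding model: predict activity for each
# candidate passage and pick the one whose prediction best matches the
# observed activity. Features, sizes, and data are synthetic placeholders.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
n_train, n_feat, n_vox, n_candidates = 800, 200, 300, 10
W = rng.standard_normal((n_feat, n_vox))         # hidden "true" text-to-brain map
X_train = rng.standard_normal((n_train, n_feat))
Y_train = X_train @ W + rng.standard_normal((n_train, n_vox))
enc = Ridge(alpha=1.0).fit(X_train, Y_train)     # the fitted encoding model

# Candidate passages summarized by text features; the subject read passage 0.
candidates = rng.standard_normal((n_candidates, n_feat))
observed = candidates[0] @ W + rng.standard_normal(n_vox)

predicted = enc.predict(candidates)              # (n_candidates, n_vox)
scores = [np.corrcoef(p, observed)[0, 1] for p in predicted]
print("decoded passage:", int(np.argmax(scores)))  # should recover passage 0
```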

Our approach is important in many ways. We are able not only to detect where language processing increases brain activity, but also to reveal what type of information is encoded in each of the regions classically reported as responsive to language. From just one experiment, we can reproduce multiple findings. Had we chosen to follow the classical method, each of our results would have required its own experiment. This approach could make neuroimaging much more flexible: if a researcher develops a new reading theory after running an experiment, they can annotate the stimulus text accordingly and test the theory against the previously recorded data, without having to collect new experimental data.





The timeline of meaning construction



To study the sub-word dynamics of story reading, we turned to magnetoencephalography (MEG), which records brain activity with millisecond time resolution. We recorded MEG activity while subjects performed the same naturalistic task of reading a complex chapter from a popular novel. We were interested in identifying the different stages of continuous meaning construction as subjects read a text. We noticed a similarity between the human brain and neural network language models, which can "read" a text word by word and predict the next word in a sentence: both must maintain a representation of the previous context, represent the features of the incoming word, and integrate that word with the previous context before moving on to the next word.
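
A toy word-by-word reader makes this analogy concrete: at each word one can extract the pre-word context vector, the incoming word's features, and the updated context after integration, which are the kinds of quantities we relate to MEG activity at different latencies. The GRU below merely stands in for the actual neural network language model; vocabulary and dimensions are made up.

```python
# Toy word-by-word reader: at each word we expose the pre-word context vector,
# the incoming word's features, and the updated (integrated) context. The GRU
# stands in for the actual language model; vocabulary and sizes are made up.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, dim = 50, 32
embed = nn.Embedding(vocab_size, dim)
reader = nn.GRUCell(dim, dim)

sentence = torch.randint(0, vocab_size, (6,))  # stand-in token ids
context = torch.zeros(1, dim)                  # representation of prior context
for t, token in enumerate(sentence):
    word_vec = embed(token.view(1))            # features of the incoming word
    new_context = reader(word_vec, context)    # integrate word into context
    update = (new_context - context).norm()    # size of the context update
    print(f"word {t}: context update = {update.item():.3f}")
    context = new_context
```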

We used the neural network language model to detect these different processes in brain data. Our novel results include a suggested timeline of how the brain updates its representation of context. They also demonstrate the incremental perception of every new word, starting early in the visual cortex, moving next to the temporal lobes, and finally to the frontal regions. Furthermore, the results suggest that the integration process occurs in the temporal lobes after the new word has been perceived.

