Vectors of Cognitive AI: Attention

Jan 20, 2022, 9-11 am PST

YouTube link to panel recording

Panelists: Michael Graziano, Jonathan Cohen, Vasudev Lal, Joscha Bach

The seminal contribution "Attention Is All You Need" (Vaswani et al. 2017), which introduced the Transformer architecture, triggered a small revolution in machine learning. Unlike convolutional neural networks, which construct each feature from a fixed neighborhood of signals, Transformers learn which data a feature on the next layer of a neural network should attend to. However, attention in neural networks is very different from the integrated attention in a human mind. In our minds, attention seems to be part of a top-down mechanism that actively creates a coherent, dynamic model of reality, and plays a crucial role in planning, inference, reflection and creative problem solving. Our consciousness appears to be involved in maintaining the control model of our attention.
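The learned selection described above can be sketched as scaled dot-product attention, the core operation of the Transformer. This is a minimal single-head numpy illustration (the full architecture adds learned projections, multiple heads, and positional information):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to all keys and receives a softmax-weighted
    mixture of the corresponding values."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # (n_q, n_k) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax over keys
    return weights @ V, weights                            # mixed values, attention map

# Toy example: 2 query tokens attending over 3 key/value tokens
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape, w.shape)  # (2, 4) (2, 3)
```

Each row of `w` sums to 1, so every output token is a convex combination of the value vectors; which values dominate is learned, not fixed by a neighborhood as in a convolution.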

In this panel, we want to discuss avenues into our understanding of attention, in the context of machine learning, cognitive science and future developments of AI.

Chat transcript can be found here.

Program

Vasudev Lal

Presentation slides can be viewed here.

Attention in Transformers: Transformers with the paradigm of key, value, query (k, v, q) based attention have had a disruptive impact on Natural Language Processing (NLP). Recently, they have been applied with great success to multimodal problems, e.g. vision-language tasks such as Visual Question Answering (VQA), Image/Text Retrieval, Image Captioning, etc. In the context of multimodal Transformers, attention can be viewed as a mechanism that helps in multimodal fusion: aligning the same concept across different modalities. Pre-training tasks of multimodal Transformers are designed to bring low-level unimodal inputs into a common ‘amodal’ semantic space. Multimodal attention serves as a convenient mechanism to mix the representations of these unimodal tokens. I will use interactive tools developed at Intel Labs to show how attention in multimodal Transformers effectively brings about concept-level vision-language alignment. I will also review recent trends that seek to make attention in artificial neural networks more top-down, e.g. trajectory attention for videos, as well as using attention to fuse external knowledge into neural networks.
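The multimodal fusion described above can be illustrated with cross-attention, where queries derived from one modality attend over keys and values derived from another. The sketch below is a simplified, hypothetical illustration (real multimodal Transformers use many heads, layer norms, and pre-trained projections):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_tokens, image_tokens, Wq, Wk, Wv):
    """Text queries attend over image keys/values, mixing visual
    information into each text token's representation."""
    Q = text_tokens @ Wq
    K = image_tokens @ Wk
    V = image_tokens @ Wv
    weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # (n_text, n_image)
    return weights @ V                                 # (n_text, d)

# Toy example: 5 text tokens fused with 7 image-patch tokens
rng = np.random.default_rng(1)
d = 8
text = rng.normal(size=(5, d))
image = rng.normal(size=(7, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
fused = cross_attention(text, image, Wq, Wk, Wv)
print(fused.shape)  # (5, 8)
```

In a trained model, the projections Wq, Wk, Wv are learned so that a text token such as "dog" places high attention weight on the image patches depicting the dog, which is one concrete sense in which attention aligns the same concept across modalities.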

Speaker Bio:

Vasudev is a Research Scientist at Intel Labs where he leads the Multimodal Cognitive AI team. He and his team develop AI systems that can synthesize concept-level understanding from multiple modalities: vision, language, video and audio. His current research interests include equipping deep learning with mechanisms to inject external knowledge, and self-supervised training at scale for continuous and high dimensional modalities like images, video and audio. Prior to joining Intel, Vasudev obtained his PhD in Electrical and Computer Engineering from the University of Michigan, Ann Arbor.

_________________________________________________

Jonathan Cohen

Presentation slides can be viewed here.

Attention, an intuitively accessible construct, has proven ephemeral when subjected to close scientific scrutiny. Work using neural network architectures to develop models of attention, and of the closely associated construct of cognitive control, that address empirical phenomena of human performance suggests an intimate relationship between attentional capabilities and the structure of the representations on which attention depends and over which it presides. This relationship is an instance of a broader, more fundamental relationship between the statistical structure captured by semantic representations and the capacity for control, a relationship that is productively informed by studying the kinds of representations that develop in neural network architectures, and how control operates in such architectures to guide processing.

Speaker Bio:

Jonathan Cohen is Robert Bendheim and Lynn Bendheim Thoman Professor in Neuroscience and Co-Director of Princeton Neuroscience Institute.  

Professor Jonathan Cohen’s research focuses on the neural mechanisms underlying cognitive control, and their relationship to the human capacity for general intelligence. Cognitive control is the ability to guide attention, thought and action in accord with goals or intentions. One of the fundamental mysteries of neuroscience is how this capacity for coordinated, purposeful, and flexible behavior arises from the distributed activity of many billions of neurons in the brain. Several decades of cognitive and neuroscientific research have focused on the mechanisms by which control influences processing (e.g., attentional effects in sensory processing, goal directed sequencing of motor output, etc.), and the brain structures upon which these functions depend. However, we still have a poor understanding of how these systems give rise to cognitive control and intelligence. Our work seeks to develop formally rigorous, mechanistically explicit hypotheses about the functioning of these systems, and to test these hypotheses in empirical studies. Understanding how the human brain gives rise to the remarkable flexibility of the human mind is one of the greatest challenges in science, and work in our laboratory is leveraging the convergence of research in neuroscience, psychology, and computer science that is addressing this challenge. Progress in this area promises both to deepen our understanding of how the human brain gives rise to the mind, and to serve as the foundation for long sought rational approaches to the treatment of neuropsychiatric disorders, as well as the design of machines that can interact more naturally and productively with humans. Professor Cohen holds a B.A. in Biology and Philosophy from Yale University, an M.D. from University of Pennsylvania, and a Ph.D. in Cognitive Psychology from Carnegie Mellon University. He joined the Princeton faculty in 1998. 
He has been conferred the highest awards for research in psychology, including the American Psychological Association’s Distinguished Scientific Contribution Award and the William James Fellow Award from the Association for Psychological Science.

_________________________________________________

Michael Graziano

Presentation slides can be viewed here.

A Conceptual Framework for Consciousness: Neuroscientists understand the basic principles of how the brain processes information. But how does it become subjectively aware of at least some of that information? What is consciousness? In my lab we are developing a theoretical and experimental approach to these questions that we call the Attention Schema theory (AST). The theory seeks to explain how an information-processing machine could act the way people do, insisting it has consciousness, describing consciousness in the ways that we do, and attributing similar properties to others. AST begins with attention, a mechanistic method of handling data. In the theory, the brain does more than use attention to enhance some signals at the expense of others. It also monitors attention. It constructs information – schematic information – about what attention is, what the consequences of attention are, and what its own attention is doing at any moment. Both descriptive and predictive, this “attention schema” is used to help control attention, much as the “body schema,” the brain’s internal model of the body, is used to help control the body. The attention schema is the hypothesized source of our claim to have consciousness. Based on the incomplete, schematic information present in the attention schema, the brain concludes that it has a non-physical, subjective awareness. In AST, awareness is a caricature of attention. In addition, when people model the attention of others, we implicitly model it in a schematic, magicalist way, as a mental energy in people’s heads. Our deepest intuitions about consciousness as a hard problem, or as a mystery essence, may stem from the brain’s sloppy but functionally useful models of attention.

Speaker Bio:

Michael Graziano is a professor of neuroscience and psychology at Princeton University. He is also a writer, composer, and occasional ventriloquist. He is the author of many books (both novels and neuroscience books), and has written for the Atlantic, the New York Times, the Wall Street Journal, and other media outlets. His research at the Princeton Neuroscience Institute has spanned topics from movement control to how the brain processes the space around the body. His current work focuses on the brain basis of consciousness. He has proposed the Attention Schema Theory of consciousness, a mechanistic explanation of how brain-based agents believe and insist that they contain consciousness inside them, and how that self-model contributes to effective functioning.

_________________________________________________

Joscha Bach

Current progress in machine learning is in no small part driven by attention-based models, especially the Transformer architecture. The core insight embodied in the Transformer is the need to learn which features to select in a given context. While sensitivity to individual elements of perception is an important aspect of attention, it is far from its only function. Can we arrive at a deeper perspective on attention by drawing on insights from psychology, neuroscience and philosophy of mind, and translate it into AI architectures that are capable of more efficient learning and more powerful inference? Here, we propose to treat attention as an agent, complementing the distributed nature of perception with centralized abilities for the construction of a coherent reality.

Speaker Bio:

Joscha Bach, PhD, is a cognitive scientist and AI researcher with a focus on computational models of cognition and neuro-symbolic AI. He has taught and worked in AI research at Humboldt University of Berlin, the Institute of Cognitive Science in Osnabrück, the MIT Media Lab, and the Harvard Program for Evolutionary Dynamics, and is currently a principal AI researcher at Intel Labs, California.