Support the show to get full episodes and join the Discord community.
Mike and I discuss his modeling approach to study cognition. Many people I have on the podcast use deep neural networks to study brains, where the idea is to train or optimize the model to perform a task, then compare the model properties with brain properties. Mike’s approach is different in at least two ways. One, he builds the architecture of his models using connectivity data from fMRI recordings. Two, he doesn’t train his models; instead, he uses functional connectivity data from the fMRI recordings to assign weights between nodes of the network (in deep learning, the weights are learned through lots of training). Mike calls his networks empirically-estimated neural networks (ENNs), and/or network coding models. We walk through his approach, what we can learn from models like ENNs, discuss some of his earlier work on cognitive control and our ability to flexibly adapt to new task rules through instruction, and he fields questions from Kanaka Rajan, Kendrick Kay, and Patryk Laurent.
Paul Michael Cole. Uh, you and I go back a few years, welcome to the podcast.
Michael 00:04:01 Thanks for having me on.
Paul 00:04:04 So, uh, we, well, I say we go back a few years. Uh, it’s more like, uh, I’ve just been admiring you from afar. I guess you were one year ahead of me in graduate school at the CNBC at Pitt and CMU. And you’ve gone on to be many years ahead of me. It turns out,
Michael 00:04:21 I don’t know about that. You’re, you’re pretty — I don’t know. Like, intellectually, I feel like I’ve been following you from afar, I guess I should say, in the form of the podcast. As soon as I heard you had this podcast, I started listening. I haven’t heard all your episodes yet — there are so many — but, uh, yeah, I dunno, I can see that you’ve really expanded your horizons, and I’m a little jealous that you have the time and, I guess, space to be having these really awesome conversations with such a variety of people.
Paul 00:04:58 Well, uh, well today’s topic is about, uh, the jealousy that I have for you and what you’re doing, so
Michael 00:05:04 Let’s kill it.
Paul 00:05:05 Yeah. Focus. Let’s focus on that. So back when I knew you, um, you were a cognitive control guy with Walt Schneider early on, but, uh, I’d like to just pick your brain about how you see your sort of career trajectory and, alongside that, just how your interests have changed over time.
Michael 00:05:28 Um, yeah, sure. So how far back should we go? I actually got into cognitive control in Mark D’Esposito’s lab at UC Berkeley. That’s where I went to undergrad, and I didn’t know much about it when I volunteered in that lab — I mean, I learned about it in class. I was a cognitive science major, which was a multidisciplinary major at UC Berkeley, and I actually started out with more interest in psychology and computer science. Then I was forced to take these neuroscience classes and ended up shifting my interests toward neuroscience there. So I had this kind of computational and, you know, cognitive psychology bent to my interest in neuroscience early on. And I volunteered in Mark D’Esposito’s lab — I was a full-time RA for a little bit — and then started working with Walt Schneider. And while I was there, I got more and more into computational topics.
Michael 00:06:34 But kind of indirectly, I guess. I’d been reading papers and building little models for a long time, but not publishing much on that — it was really shaping my thinking over the years. And so while I was there — Walt, back in the 1970s, had come up with this controlled versus automatic processing dichotomy that really influenced things a lot. And so we would talk a lot about the real basics of, like, what is controlled processing, what’s automatic processing. And out of that, I realized there hadn’t been much exploration of one of the definitions of controlled processing: novel task performance. And so that led me into rapid instructed task learning, where there wasn’t much — almost no — work done at the time when I first started thinking about it and talking to Walt about it. And so, yeah, that ended up being my dissertation. So that’s technically, you know, almost by definition cognitive control — it’s just different from the conflict kind of stuff that’s typically talked about with cognitive control.
Paul 00:07:44 So we’re going to talk a little bit about RITL — rapid instructed task learning. So what is that? And, you know, in the sort of big picture, what have you found? I know that it has stayed with you throughout your career.
Michael 00:07:59 Right, yeah. So RITL stands for rapid instructed task learning, and it’s something that we actually do all the time in everyday life. So for instance, playing a new game that you’ve never played before, like Monopoly — there are some rules that someone tells you about the game, and then you can rapidly integrate those all together and play the game. And it may sound kind of trivial because I’m using a game example, but really, in everyday life we do all sorts of things like this — cooking a new recipe, or using new technologies. So it’s not just about being able to understand the words; it’s about being able to transfer previous knowledge into new contexts. So you get a new smartphone, you don’t have to start from scratch and do this trial-and-error learning. You can transfer knowledge that you already had, and also have the instructions kind of prompt you on what transfers — what kinds of information that you’ve actually learned before could be relevant in a new context. And it’s really interesting to me, both from a computational perspective and also just in terms of contrasting humans with machines and humans with animals — animals can’t do it, it turns out. I ended up adding this as a figure to my dissertation, because I thought it was interesting: there’s one bonobo named Kanzi who can do it. So it’s possible in animals
Paul 00:09:34 Not with sign language, right?
Michael 00:09:35 English words — English words for simple little tasks. Yeah, he’s like a genius chimp. So it’s possible, but yeah, there’s pretty much just that one genius animal that can do it.
Paul 00:09:50 So the idea right is to have a massive set of different possible tasks that an organism or a machine could perform. And then you instruct, uh, whatever task to perform. You instruct it. And this happens just kind of interleaved, right? Where you say, do this task, all right. Now do this task. Now do that task. And the idea is you have to be able to switch between them, which takes a lot of cognitive control.
Michael 00:10:15 Yeah. I mean, there is a distinction between the theoretical topic — or construct, or however you want to put it — of RITL, which is just things that we do every day. Like, I dunno, getting directions to go to a new grocery store or something, or kind of arbitrary things you could have people do — like sing the national anthem while jumping on one leg — things that you’ve never done before that you could clearly do immediately. There’s this whole set of them, but there are of course limits to that. And so the problem was, how do we translate that into an actual systematic way to research it empirically? And so that’s where this paradigm that I developed with Walt started from — an idea for a cognitive test that we came up with to investigate RITL systematically.
Michael 00:11:12 The idea is that we have these little tasks with three rules each. And the key is that we want them to be kind of arbitrary and complex enough that we’re really sure that participants haven’t actually done them before — so they’re novel, right, there’s something to be learned — but we also want them to be learnable very rapidly for humans. So an example here would be “both vertical, left index.” Those are just brief cues that you’d see on the screen, and what that means is: if both stimuli are vertical, press your left index finger. So then two stimuli would come up — in this case, they’re these vertical or horizontal bars. You see two vertical bars in this example, so the answer would be true, and you’d press your left index finger. And so we can swap out these different stimuli and different rules.
Michael 00:12:05 So another example task would be “neither red, left index.” That means if neither stimulus is red, press your left index finger; and if it’s not true, you’d press your left middle finger. So say you see a blue vertical bar and a red horizontal bar — it’s not true that neither stimulus is red, so you’d press your left middle finger in that case. And so, you know, it’s not super easy to do, but humans can do this on the first try, if you can believe it, above chance. And the key is that we’re moving across — think of it like a state space, right? You have all these different combinations of rules, and we’re systematically traversing it cognitively, in terms of information processing. And we want to be looking at the brain while that’s happening; you can see the brain updating in a systematic way, and we can do all sorts of things, like look at changes in activity patterns and in functional connectivity patterns with task-state functional connectivity. And that’s led to what we call the flexible hub theory, which we’re really testing in some of these papers, where we have these kind of control networks that are really highly distributed, and we’ve found evidence that they have hubs in them that update global processing systematically as you perform different tasks like this.
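[For readers who want to see the structure of these three-rule tasks concretely, here is a minimal Python sketch of how a rule triplet like “neither red, left index” could be composed and evaluated. The rule names, stimulus encoding, and function names are illustrative stand-ins, not the actual experiment code from the PRO paradigm.]

```python
# Minimal sketch of composing and evaluating PRO-style task rules.
# The logic/sensory/motor rule names below are illustrative, not the
# actual stimulus code used in the published experiments.

LOGIC_RULES = {
    "both":    lambda a, b: a and b,
    "neither": lambda a, b: (not a) and (not b),
    "either":  lambda a, b: a or b,
}

SENSORY_RULES = {
    "vertical": lambda stim: stim["orientation"] == "vertical",
    "red":      lambda stim: stim["color"] == "red",
}

def respond(task, stim1, stim2):
    """Apply a (logic, sensory, finger, alternate-finger) rule set to two stimuli.

    Returns the finger to press: the instructed finger if the logical
    statement is true, otherwise the alternate finger.
    """
    logic, sensory, true_finger, false_finger = task
    feature = SENSORY_RULES[sensory]
    statement_true = LOGIC_RULES[logic](feature(stim1), feature(stim2))
    return true_finger if statement_true else false_finger

# "Neither red -> left index" example from the conversation:
task = ("neither", "red", "left index", "left middle")
stim1 = {"orientation": "vertical", "color": "blue"}
stim2 = {"orientation": "horizontal", "color": "red"}
print(respond(task, stim1, stim2))  # -> "left middle" (one stimulus IS red)
```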
Paul 00:13:29 Say a few more words about flexible hub theory, because I’m not sure that we’re going to be talking about it a ton, but I think it’s neat and important. Can you just talk a little bit more about what it is and what you found?
Michael 00:13:41 Yeah, so the flexible hub theory is really building on some older theories. One is called the guided activation theory, by Miller and Cohen — that was back in 2001, and it actually goes all the way back to some artificial neural networks in the early nineties. They were really focused on lateral prefrontal cortex and how it represents contexts or task rules. And so what we’ve done with this theory is really expand that to entire control networks, so it’s much more distributed. We’re also emphasizing global brain connectivity, or hubness, because we’re really thinking, in the context of something like RITL, you need to be able to rapidly update your global information processing. And so how is that going to happen? How are you going to even coordinate that sort of update? And there’s a lot of evidence from neuroimaging that these kinds of control networks are involved in that sort of thing.
Michael 00:14:38 And also from lesion studies, from neurology and neuropsychology — so if you ablate these regions, you have major problems with things like RITL and fluid intelligence generally. So that’s one thing: we’re really emphasizing the connectivity here. There’s also flexible connectivity — we’re looking at task-state functional connectivity, how the connectivity, how the brain regions interact, updates. And then finally, more of a computational property called compositional coding. The idea is that you don’t just totally update with new connections and new activity patterns every time you do a new task; you actually reuse activity patterns and connectivity patterns from tasks you’ve done before. And the format of the PRO paradigm lends itself to this, right, because we’re reusing different rules in different contexts, making new task sets that have never been performed before. And you can see that a lot of the same information patterns are being reused when we actually look at the connectivity and activity patterns. So altogether, yeah, that builds the theoretical framework that we call the flexible hub theory.
Paul 00:15:53 All right. Well, take a deep breath, because I’m going to play you the first guest question here — because it’s from the rapid instructed task learning era of your life, and I think you’ll recognize who it is. I’ll say who it is after the question.
Speaker 4 00:16:12 Hi Mike, Patryk here — big fan of your work ever since graduate school at the Centers for Neuroscience and the Neural Basis of Cognition in Pittsburgh. Wondering if you could share ideas on how RITL could be used to study a couple of phenomena. On the one hand, free will, which I’ll narrowly define as the emergence of an instruction — rather than an explicitly given instruction through language — perhaps synthesized by activity in another brain region, or even a neuromodulatory process. So not quite a random goal, but rather an intended goal. Do free will representations, like the decision to go to a particular goal or perform a particular action, look like commanded instructions? And on the other hand, stubbornness, or steadfastness: the preservation of an instruction or goal in the face of distracting, disruptive, or even goal-related inputs. Do those representations look similar to the commanded representations?
Paul 00:17:26 All right, Mike, we’ve only got three hours to do this. So that’s Patryk Laurent, an old friend of both of ours, who is the director of emerging technology at DMGT, which is a British private holding company. All right.
Michael 00:17:46 Starting
Paul 00:17:46 With an easy one, right? That’s why I wanted to introduce RITL, because —
Michael 00:17:56 Um, so yeah, there are a few things I could say about that. One is that we could think of our ability to flexibly switch our goals in novel scenarios as on a continuum with RITL. The way we’ve been studying it is kind of taking a shortcut, for experimental convenience, of giving the instructions and just having the participants do the task correctly or incorrectly while we systematically explore the space. But I have thought quite a bit about what if we had the participants select their own tasks, or explore some space of tasks, and, yeah, I guess that would open things up more to free will. There is a literature on task switching — so there’s a relationship between all this stuff and task switching — but those switches would be between two familiar tasks, as opposed to a novel task.
Michael 00:18:55 And there’s a whole literature on what happens if you let the participant choose whether to switch to the other task or not, and there are different brain responses to those two things. And, you know, it’s not my exact area, so I can’t really describe the differences in detail, but from what I recall they tend to be in what we call cognitive control networks. So there are some kind of additional control processes involved in making that decision. Um, what was the second part of the question? It was stubbornness, right? Yeah. I mean, there’s a whole literature on perseveration that’s related to cognitive control. It’s the inability to switch the current rule that you’re supposed to be applying in a given task — you kind of get stuck in one state.
Michael 00:19:58 So I don’t know, I kind of wonder if stubbornness might actually be the default, and you need cognitive control and higher-order cognition to kind of jump out of that, unless there’s some kind of stimulus-driven reward thing that would pop you out of the state. But, I guess, trying to read between the lines, I think Patryk might be asking about self-instruction, kind of, and I think that is pretty compelling. And maybe in evolution, when we evolved the ability to represent these kinds of task instructions in this really flexible way, we probably got both some ability to think of novel tasks and perform them, and also to be instructed by others, either through imitation or language. Yeah.
Paul 00:20:53 With the rapid instructed task learning — I mean, like you already said, you create these artificial scenarios where it’s really instruction dependent, and there’s a lot of control involved in even understanding the instructions and putting them together. So free will, right — internally generated, internally motivated, whatever free will really is, but that internal — wherever it comes from, some sort of self-organization process where it’s generated internally — I don’t know, how much do you think that would overlap, network-wise, with the RITL types of networks?
Michael 00:21:35 I guess I’m kind of thinking there’s some set of mechanisms that, you know, the language network would interact with, and these kinds of control networks — so lateral prefrontal cortex, posterior parietal cortex, mid-cingulate cortex. And what would happen in the case of instruction — external instruction — would be: through auditory cortex, to the language network, and then to the control networks. And then, I don’t know where it would start — maybe orbitofrontal cortex or something, where you’re representing the reward that you might receive if you do this other thing, and that might drive selection of a set of task rules or strategies for getting those rewards. Yeah, I mean, free will is a tricky thing, obviously, but I guess my own — you know, as a neuroscientist — I think most neuroscientists, I don’t want to speak for everyone, but there’s this sense I get when I speak to neuroscientists about free will that there is no free will, or that it’s mechanisms all the way down, largely determined just by one’s history and physical reality.
Michael 00:22:57 And so I don’t think Patryk intended us to go down that rabbit hole that deep, but, um —
Paul 00:23:04 Oh, I bet he did, but
Michael 00:23:06 He did bring it up, pretty much. But yeah, I guess I would say that it’s probably going to come down to reward predictions for selecting the goal, and maybe the tasks that you want to perform, if you have free rein. There’s also this funny thing a lot of the free will experiments do, where they give you two choices and you just kind of choose randomly — or like the Libet experiments, where you just press a button whenever you want. There’s something called demand characteristics in the literature: the task is to press the button whenever you want, but if you really ask the person, do they even want to press the button? If they really were free, they’d just walk out of there and do something more fun than that. So what they’re really doing is forcing you into this situation where you need to decide if you’re going to press a button, and you’re sitting there and you’re like, well — say you didn’t even want to press the button at all, but you feel this social pressure to press the button.
Michael 00:24:13 So I guess I’ll press it every once in a while. And so there’s some little element of free will, but it’s also just like, okay, I have all these constraints on my behavior, actually. And then there’s also this sense that it’s got to appear random, which is not normal — it’s not normal for people to be random. So —
Paul 00:24:33 Right, we’re not random. I hadn’t heard that one — I knew there had been other criticisms of the Libet experiments, and we don’t need to go through all that because we need to move on, but I hadn’t heard that one: that you are constrained to that particular response, and so that in itself is a bit of a confound, I suppose, for free will. I mean, most of these solutions to free will are really reconceptualizations of the concept of free will, so it’s almost like moving the goalposts. But on the other hand, the free will that we commonly want — this sovereignty over all of our actions and thoughts — I think everyone agrees that that doesn’t exist... well, most people agree that that doesn’t exist, because it would require some sort of quantum indeterminacy, and then we’d have to somehow be in that nexus and be responsible for our own behaviors, or something like that, right? So anyway, most of the solutions that I see reconceptualize it — rightly so, I think.
Michael 00:25:39 Yeah. I mean, we should move on, but I will say that I do think we have pretty much the kind of free will that we actually want. It’s just that I think of it as: as long as who I am — my self representation, which is in my brain — is properly controlling or influencing how I behave and what goals I pursue, then that’s the kind of free will that we want. It’s just that who we are is determined by genetics and our past, right? That’s my conceptualization of it.
Paul 00:26:15 Yeah. All right, very good. Well, let’s back out — so thanks, Patryk. I’m going to go ahead and play another question, because it has to do with fMRI, and then this will bring us up to speed with the work you’ve done that we really want to talk about, because a lot of your empirical work is based on fMRI measurements, right — how you construct these networks that we’re going to talk about. So let me just play this question for you.
Speaker 5 00:26:46 Hey Mike, this is Kendrick Kay, and I have a question for you. It’s sort of an open-ended question, so really any remark you have along these topics would be appreciated. I was thinking about your research and your focus — I mean, obviously you use fMRI as a measurement technique, and you are thinking about computational models of cognition. So I guess my question has to do with the limitations, or your wishlist, so to speak, for neuroimaging — specifically fMRI — in terms of informing your network models of computation. What do you desire out of fMRI, or what do you see as the current limitations? For example, is it spatial resolution? Obviously fMRI is limited, at least compared to single-neuron or multiunit activity recordings, and, you know, fMRI is trying to push the limits there and get higher and higher resolution, but in a practical sense it’s pretty far away from small populations of neurons.
Speaker 5 00:27:49 So do you feel that’s a major bottleneck to, say, developing the types of network models that you do? Or, alternatively, things like artifacts and head motion — do you think that’s a major problem, and is that limiting your progress in the type of research that you do? So I would just be curious to hear your thoughts on what you’re worrying about currently in trying to make fMRI better, either in terms of the raw acquired data and/or the analyses we can make of it — again, with the ultimate goal of trying to inform what we can learn about computation in the brain.
Paul 00:28:28 So he mentioned your network coding models — I think he just called them network models — and we’re going to talk about those, the types of models that you build. That was Kendrick Kay, by the way. If you feel like you need to explain those to answer his questions, go for it; otherwise we can hold off and you can answer the question about your wishlist for fMRI.
Michael 00:28:54 So, yeah, I’ll say this: in general, in my career I’ve had these kinds of oscillations between pessimism and optimism, and it’s actually pretty useful, because when I’m in one or the other state — when I’m maybe overly optimistic — I’ll think back to my pessimistic views. So at various times I’ve been quite pessimistic about fMRI, and at other times I’m quite optimistic. And I have to say, overall, I’ve learned a lot more about computation from fMRI than my most pessimistic phases would suggest. The things that frustrate me most about fMRI at this point, I think, have to do with the temporal resolution, because with these network models, which we can get into in a bit, I’ve really come to the conclusion that we need to understand causal relationships between neural populations, and that’s going to be the key.
Michael 00:30:01 And temporal information is very useful, right, for making causal inferences. It’s not the only way, and it’s not the only piece of information, but it’s quite useful. So I actually have done a bit of work with MEG, and I’ve started doing more EEG work — high-density EEG — because of course it has the opposite problem: there I’m more frustrated with the poor spatial resolution. And then, yeah, my frustration has led me to work with some non-human primate datasets recently, with multi-unit recordings — not collecting the data in my lab, but really in terms of theory and method development. The kind of thought experiment that led to a lot of this stuff we’re going to talk about — the network models — was actually thinking about the perfect kind of neuroimaging technique: if we could record all neurons in real time, what would I do with that?
Michael 00:31:09 Right. And so that actually keeps coming up again and again, where I’m just like, oh, well, I wish I had that; here’s what I actually have — how do I make do and try to get closer to that? Would I know what to do with that? I think so. It’s one of those things where you don’t know for sure until you go and try it. It’d probably be too much data; I’d have to do data reduction, to be honest. But what would be cool about it is you could pick your data reduction based on theory or something, get ensembles, and do a different analysis or get different ensembles, things like that. But yeah, I guess we’ll get into the network model approach — I thought about it pretty abstractly, doing these thought experiments about what if it were real spiking data or LFPs or something.
Michael 00:32:11 But the idea is that we have these artificial neural networks with these algorithms that can dictate how the dynamics play out on a network architecture. You know, if we had the data, we would go in and take all the detailed connections between all the neurons and maybe simulate those dynamics on that network. Now, since we don’t have that sort of data, especially in humans, we’re using fMRI, and that’s given us somewhat decent spatial resolution compared to something like EEG, and not great temporal resolution, but we have a lot of tricks up our sleeves for making do. And the main trick is really experimental control. You can control the timing of stimuli and responses and so forth, so you can separate different neural events from each other. And then there are a lot of useful connectivity techniques.
Michael 00:33:20 We could use structural connectivity, but we typically use functional connectivity — specifically, resting-state functional connectivity. And the way I think about that might be a little different than how a lot of people think about it — I should do a little survey and ask how people think about this — but I think of resting-state connectivity almost as if you could inject noise into each neural population and see what happens downstream. It’s spontaneous activity, so we’re looking at the effects on the statistics of the signal from this spontaneous activity flowing between different neural populations. And that gives us a sense of what’s called intrinsic functional connectivity. And what we found is that it’s really similar across a bunch of different brain states — so resting state isn’t necessarily that special. It’s just that you’ve removed some confounds that might be coming from task stimuli, and maybe it’s closer to something like structural connectivity, with the synaptic weights kind of influencing things. That’s one thing I like about it, too: relative to task-state functional connectivity, we might be getting closer to the actual — well, ultimately, causal influences between regions, though it’s really hard to make strong causal claims.
Paul 00:34:55 All right, Mike. Well, I’ve buried the lede already, but I thought that those two guest questions would fit better toward the beginning. And in fact, I don’t really know exactly where to start, because — so what you’ve done in the paper is to take functional connectivity data, and in contrast to training up a model made of some architecture with otherwise fairly random units and random connectivity, you guys have built models and used functional connectivity data to decide the architecture and also to decide the weights between the nodes. So you don’t train the model. There are sort of two axes we could talk about. One is the difference between your approach and the deep learning, Jim DiCarlo, Dan Yamins type approach — and also recurrent neural networks — where you train the network on a cognitive task, like you would train an organism on a cognitive task, optimize the network, and then compare the network to your neural recordings, however you’re recording data. The other axis we could talk about — and I’ll let you decide how you’d like to introduce the network coding models — is this interplay between encoding and decoding models, which we talked about on the podcast a long time ago; it would be good to refresh people’s memory and then use that to talk about what network coding models are. Is that a fair enough summary?
Michael 00:36:32 Yeah. So we’re trying to bypass the whole question of, like, what’s the right learning rule, how do you update the weights in these networks — to just say, let’s go look in the actual human brain (or even animal brains would work, if you had the right data) and parameterize the network that way. And there are lots of different things we’re using it for and thinking it’s useful for. One is that we can go and test these artificial neural network theories, because instead of just doing the same thing of optimizing for task performance, we can go and see whether, with the weights that are there from functional connectivity in the brain, the performance or the cognitive effects of interest will just emerge when you go ahead and simulate these things.
Michael 00:37:34 And so — I used the word emergence, and I really do think that’s what we’re doing, but I know there’s a lot of philosophical baggage with that term. So I started saying “generate” — just generate the cognitive process of interest — but it’s really the same thing. I mean, emergence in a very simple sense: like, the property of going 60 miles an hour down a highway emerges from mechanisms in the car that we understand. It’s not some sort of very mysterious thing.
Paul 00:38:10 Yeah, I’ve been trying to say “emergent properties” just because it sounds less like strong magical imagery, but I don’t know how to turn emergent properties into a verb. So —
Michael 00:38:25 Emerging
Paul 00:38:25 Properties, I guess.
Michael 00:38:27 Yeah, I’m trying to think. So there’s that: you can use these models to test these theories. We can also use them to make sense of the neural data. So if you have a connection and you say, oh, I think this connection is important, or it’s connecting these two regions — you can have all these different ideas about what it does or what it’s for. But when you build one of these models, they literally have these — what we call activity flows — over those connections, and you can go and see; you can even lesion inside the model and see, oh, what did that do downstream? And the same goes for the activations. So the classic neuroimaging approach of just asking where in the brain this kind of process is — you do that, and, yeah, you learn something, but then, well, what does that activity do mechanistically? It’s not really clear.
Michael 00:39:22 And there’s typically all this kind of hand-waving in trying to interpret it, including in my own work, right, because you’re trying to make some bigger narrative that you understand what’s going on. But if you actually link it up with connectivity, then you can say, oh, this activity plausibly influences activity over here, and that could lead to motor responses and behavior. And so you start to maybe have something like an integrated understanding of what’s going on. It’s not as easy as all that, of course. Like I said before, I think it comes down to causal inferences, and causal inferences are super hard. A lot of people have kind of given up on causality and will just use correlations; we’ve been trying to move past correlations for the connectivity estimates because of this, and we have made some progress toward more causally valid estimation, but there are no perfect causal inferences. So we’re always pushing toward more valid measures and trying to make clear inferences here. I think it’s a good starting point — I think we are learning a lot — and it’s just a matter of keeping on, advancing the methods while we’re advancing the theory, and making a nice feedback loop between them.
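[The core of the activity flow idea can be written compactly: a held-out region’s task activation is predicted as the sum of all other regions’ activations weighted by their connectivity to it. Below is a minimal numpy sketch with toy data — the matrix shapes and the lesion example are illustrative, not the published analysis code.]

```python
import numpy as np

def activity_flow_predict(activations, fc):
    """Predict each region's task activation from all *other* regions.

    activations : (n_regions,) empirical task activations
    fc          : (n_regions, n_regions) connectivity; fc[i, j] is taken as
                  the influence of region i on region j
    Each region is predicted only from the others (its own term is excluded).
    """
    n = len(activations)
    pred = np.empty(n)
    for j in range(n):
        mask = np.arange(n) != j          # exclude the held-out region itself
        pred[j] = activations[mask] @ fc[mask, j]
    return pred

# Toy example with random data (stand-in for empirical fMRI estimates)
rng = np.random.default_rng(0)
n_regions = 10
fc = rng.normal(scale=0.1, size=(n_regions, n_regions))
act = rng.normal(size=n_regions)

pred = activity_flow_predict(act, fc)

# "Lesioning" a connection in the model: zero it out and see what changes downstream
fc_lesioned = fc.copy()
fc_lesioned[2, 7] = 0.0                   # remove the region-2 -> region-7 pathway
pred_lesioned = activity_flow_predict(act, fc_lesioned)
print(pred[7] - pred_lesioned[7])         # contribution that flowed over that connection
```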
Paul 00:40:51 How — I mentioned that, like, Jim DiCarlo’s group uses convolutional neural networks to study the ventral visual stream, right, and object recognition. How do you think of network coding models in relation to that? Because, you know, one of the strengths of convolutional neural networks — which of course were inspired by the visual system, from way back with Fukushima through Yann LeCun — is that those models are now the quote-unquote best predictive models for brain activity in those regions. And they were roughly modeled on it: they were built to roughly recreate the hierarchical layers within the ventral visual stream, both in sort of their magnitude — the size of each layer — and of course their ordering. So how does your approach differ, and how do you think about what you’re doing relative to that kind of approach?
Michael 00:41:49 I’d say that our approach is probably more empirically constrained, because we not only have the activity patterns — we’re constrained, quote unquote, in the sense of both holding us in and also telling us how the brain is computing things, right? So the constraints are also a good thing: we have the activity patterns and the connectivity patterns. And so if it works, if it predicts well on each layer — say, we haven’t done this with the visual model with the multiple layers, but that would be interesting — if it actually did work, then we would say, well, maybe we understand more directly how the neural populations are interacting, because there’s actual empirical constraint on the connectivity. If you don’t have that, then it’s an optimization problem, and there are a lot of different solutions that would lead to the same predictions without that actually being how it works in the brain.
Michael 00:42:52 Of course, you could have more constraints than we have, and that would be nice — then you’d be even more confident, right, that this is exactly how it works in the brain. But the point is that this key constraint — here’s actually how the neural populations interact with each other — is in there. And that could allow for emergence of things, you know, the generation of processes that we don’t even think of, because if it really is how the brain is working, then you could put in maybe some stimulus that the person you’re modeling — I guess you could take one individual, or you could take group data from fMRI or whatever modality and parameterize the same way — maybe that person has never seen that stimulus before, and you’d see what happens in the model. That would be interesting to see. And then you actually take that person and have them see the stimulus —
Michael 00:43:50 Would it do the same thing? But yeah, I think they’re both super useful approaches. There’s this added something about the inferences you can make, and then there’s this potential for — something the connectivity is doing. Maybe evolution specified it; there’s some kind of bias in the connectivity weights that does something a model optimized for the particular stimuli presented during training wouldn’t do — maybe it wouldn’t be optimized in the same way. Or maybe it’s from the person’s development or experience that the connectivity weights have been biased a particular way, maybe to generalize better. So yeah, there are a lot of questions like that that would be really interesting. It’d be really interesting just to compare them. The thing is, though, because it’s not optimized for task performance, it’s probably going to do worse, just because there’s noise in the data.
Michael 00:44:51 Right. Like, if we had perfect data, then I would think it would do better, just because humans have a ton of training and have evolution setting things up for optimal performance, to some extent. But yeah, there is this idea that we haven’t actually explored yet of also just starting with the connectivity and then training on top of that. So that could be interesting too, right — maybe it would speed up training to start from the connectivity, and maybe it would push the model in a certain direction. One reason we’re really excited about this activity flow approach and the whole ENN approach is applications to mental health and brain diseases. So we actually had a paper come out recently in Science Advances that looks at schizophrenia, and we build these — what you could think of as simple computational models — that predict activity during a working memory task.
Michael 00:45:53 And what we found is that we can predict the abnormal activations during the working memory task in schizophrenia patients, and it’s also predictive of their working memory performance and the deficit they have in working memory performance. And just to illustrate the power of these kinds of models, we actually took it a step further and made a hypothetical treatment: if we could get in and change the connectivity however we wanted, what would happen? We have a machine learning algorithm that predicts, from the healthy individuals and the patients, what their working memory performance would be, and once we implemented this hypothetical treatment and applied the activity flow algorithm to generate what activations would happen in the context of this treatment, we actually predict a 12% increase in working memory performance, which puts the patients just about in the normal range. And so, yeah, we’re excited — I mean, that’s illustrative of the power of this kind of approach for real-world applications, potentially. And so we’re excited about the potential for that, but we also now want to actually start to learn how to change connectivity systematically, in other research, so that we can actually go and test this stuff.
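[A heavily simplified sketch of the “hypothetical treatment” logic he describes — alter a patient’s connectivity estimates, re-run the activity-flow step, and ask a behavior-predicting model what performance would result. The data shapes, the ridge regression, and the “nudge toward the group mean” treatment are all assumptions for illustration; the actual Science Advances analysis differs in its details.]

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_subjects, n_regions = 50, 20

# Stand-ins for empirical estimates (task activations, FC, working-memory scores)
activations = rng.normal(size=(n_subjects, n_regions))
fc = rng.normal(scale=0.05, size=(n_subjects, n_regions, n_regions))
wm_performance = rng.uniform(0.5, 1.0, size=n_subjects)

def activity_flow(act, fc_single):
    """Predict activations from FC-weighted sums of the other regions."""
    pred = act @ fc_single
    pred -= act * np.diag(fc_single)      # remove each region's self-term
    return pred

# Train a model mapping (simulated) activations to behavior
simulated = np.stack([activity_flow(a, f) for a, f in zip(activations, fc)])
behavior_model = Ridge(alpha=1.0).fit(simulated, wm_performance)

# "Hypothetical treatment" for one patient: nudge their FC toward the group mean,
# re-simulate, and ask the behavior model what performance would result.
patient = 0
fc_treated = fc[patient] + 0.5 * (fc.mean(axis=0) - fc[patient])
before = behavior_model.predict(activity_flow(activations[patient], fc[patient])[None])[0]
after = behavior_model.predict(activity_flow(activations[patient], fc_treated)[None])[0]
print(f"predicted working-memory change: {after - before:+.3f}")
```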
Paul 00:47:16 So I’m really glad that there are people like you working on these things, because diseases are super important, and they’re not something that I ever cared about in my research — but I know that that’s sort of the point. So it’s really great that you’re focusing on that. Now I’m going to have to go ahead and play our last guest question; I think this is a good time. Although — I was just talking about these convolutional neural networks; obviously something that you’ve worked on is having recurrent neural networks and setting them up in an architecture so that they’re talking to each other, like different brain areas would talk to each other, and where you can perform multiple tasks. And we can come back to this idea of multiple tasks. But you just saying that you’ve been thinking about training on top of the functional connectivity models made me think of this next question. So, final guest question here, from one of your co-authors.
Speaker 6 00:48:19 Hi. Thank you for asking my opinion — I’m always happy to chat. First of all, Mike is great; he and I co-authored a review on multitask learning in RNNs with Robert Yang a few years ago. And two, this is such a clever paper. One of the many holes in the field of computational neuroscience, in my opinion, is that there aren’t too many models of RNNs based on human data, fMRI data in particular. Mike is one of the few people thinking deeply in this space, and, you know, selfishly, I hope to be working alongside him again. Scientifically, both of the approaches — using connectivity motifs from fMRI in a generative sense in neural network models, like Mike and his team do in this paper, and training RNNs based on time series or dynamics data directly, like I do, and inferring connectivity motifs from that second type of network model — I think both of those approaches are perfect complements.
Speaker 6 00:49:20 The two types of models should also be able to work as constraints for one another. And the reason for my question is, you know, functional connectivity is often inferred using network analysis or graph-theoretic methods on the covariance matrix of time series data — an N-by-N object for N units or N voxels. Now, in the types of networks that I build and train to match unit activity to time series directly, such a covariance matrix should come along for free, because every neuron or every voxel is kind of being fit. But in addition, in my type of models, the recurrent weight matrix should also be dynamically stable, and you should be able to find one even if the underlying distribution were to change over time, as it does in the brain.
Speaker 6 00:50:13 So if you buy both of these things that I said, then by knowing just the initial condition, we should be able to use this recurrent weight matrix from an RNN fit to dynamics in a generative sense also — almost as if it were hooked up to an actuator. So my question to Mike would be: when would this approach, in his opinion, work or fail? And when I say work, I want that to mean capturing dynamics and maybe some features of behavior. And also, how would this depend on task complexity and the number of tasks being performed? Now, I obviously haven’t shown any of this directly yet, or at all for human data, but I really would like Mike’s thoughts on these — and then also, you know, would he please work with us on this problem? Thanks, Paul.
Paul 00:51:05 All right. Kanaka Rajan. So, did you get all of that?
Michael 00:51:09 Yeah, I don’t know if I got all, all of it, but it sounds awesome. And an invitation
Paul 00:51:16 For, for
Michael 00:51:16 Collaboration — I’m flattered. Yes, I would like to work on that, I’ll say that. Let’s see. Some problems I worry about are relevant here — I do worry about this type of approach. So I’m focusing on the negative, but I think it is awesome, and I’ll say that up front: that is a really good way to go. The things I worry about are the limitations of fMRI, the temporal resolution in particular. The kind of recurrent dynamics are going to be difficult to pick up when the neural activity is being filtered through the hemodynamic response function. So an event that’s a hundred milliseconds long will be spread out over 18 seconds or so, and you can kind of infer when it happened, but it’s a rough approximation.
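[To see the temporal smearing he’s describing, here is a small sketch that convolves a brief (~100 ms) neural event with a canonical double-gamma hemodynamic response function. The HRF parameters are the commonly used SPM-style defaults, not anything specific to Mike’s pipeline.]

```python
import numpy as np
from scipy.stats import gamma

dt = 0.1                                   # seconds per sample
t = np.arange(0, 30, dt)

# Canonical double-gamma HRF (SPM-style defaults: positive peak early, undershoot later)
hrf = gamma.pdf(t, 6) - 1.0 / 6 * gamma.pdf(t, 16)
hrf /= hrf.max()

# A single 100 ms burst of neural activity at t = 1 s
neural = np.zeros_like(t)
neural[(t >= 1.0) & (t < 1.1)] = 1.0

bold = np.convolve(neural, hrf)[: len(t)] * dt   # predicted BOLD response

# The 100 ms event now spans many seconds of measurable signal
above = t[bold > 0.05 * bold.max()]
print(f"BOLD response above 5% of peak from ~{above.min():.1f}s to ~{above.max():.1f}s")
```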
Michael 00:52:17 Yeah. So we could use something called deconvolution to help with that, potentially — a postdoc in my lab is currently working on exploring and trying to validate those approaches more, so that could help. They aren’t perfect, but there are still a lot of constraints there. So it’s possible we could use fMRI data for that — for fitting recurrent neural networks. The other thing I worry about is model complexity: between, say, two neural populations, there are a bunch of different functions that could equally well predict downstream. So you need to take a certain strategy for dealing with that, and one of our strategies has been simplicity — kind of an Occam’s razor approach — and then adding complexity as necessary. So we start out with correlation; it’s probably the simplest thing — actually, covariance without normalization would be even simpler — but, you know, you move up to correlation, and then you move on.
Michael 00:53:27 But then we want to deal with the confounding problem and causality. So there are confounders: say one region is influencing two others — you’ll infer a false connection between those two others. So we typically use multiple regression to deal with that: you fit all the time series simultaneously. But then there are nonlinearities, which we haven’t fully gone into, though we’re finding there are cases where nonlinearities are really important. I don’t know the nitty-gritty details of how the recurrent neural networks are fit — is there some way to do this? Like with multiple regression, for instance, we use regularization to also deal with some of this; it’s a way of putting a bias into the model to simplify things. Basically, you don’t fit noise as much — you put a bias in there so you’re not doing as much overfitting.
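[A minimal sketch of the progression he describes for connectivity estimation — correlation, then multiple regression (each region’s time series regressed on all the others, which controls for shared confounders), then a regularized version that trades some fit for less overfitting. Generic numpy/scikit-learn calls are used here rather than the lab’s actual tools.]

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_timepoints, n_regions = 500, 30
ts = rng.normal(size=(n_timepoints, n_regions))   # stand-in for region time series

# 1) Correlation FC: simple, but confounded (a common driver creates false edges)
fc_corr = np.corrcoef(ts.T)

# 2) Multiple-regression FC: predict each region from all the others at once,
#    so shared influences are partialed out
fc_mreg = np.zeros((n_regions, n_regions))
for j in range(n_regions):
    others = np.delete(np.arange(n_regions), j)
    beta, *_ = np.linalg.lstsq(ts[:, others], ts[:, j], rcond=None)
    fc_mreg[others, j] = beta

# 3) Regularized regression: the ridge penalty biases the fit toward simpler
#    solutions so noise isn't fit as aggressively (less overfitting)
fc_ridge = np.zeros((n_regions, n_regions))
for j in range(n_regions):
    others = np.delete(np.arange(n_regions), j)
    model = Ridge(alpha=10.0).fit(ts[:, others], ts[:, j])
    fc_ridge[others, j] = model.coef_
```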
Michael 00:54:27 So I wonder if there’s some way to do that with however the recurrent neural networks are fit. But yeah, I definitely think — I mean, there’s evidence that the actual brain uses recurrent connectivity a ton, and there are a lot of really good computational things that come out of that, just from artificial neural networks, like the old Elman nets and so forth for language. And I could imagine it for the kind of paradigms we were talking about with rapid instructed task learning — I forgot to mention Todd Braver; I was in his lab for my postdoc, and Todd helped me — we together developed the rapid instructed task learning paradigms and the theory. So I shouldn’t just mention Walt Schneider — Todd played a big role in that, and also in the network theories. But yeah, that kind of task requires these sequential processes — it’s kind of like you’re being programmed to do this little three-rule program. And that’s very different from what artificial neural networks are really good at, which is more like pattern recognition. This is actually a sequence, and it requires temporal control and maintenance of information and updating of information in time. And so that is really compatible with the things that recurrent neural networks can do.
Paul 00:56:05 I was going to say, by the way, it’s fun to watch you think about a proposed collaboration in real time and immediately go to the negatives, like a good scientist.
Michael 00:56:16 I kind of bookended it though. I said positive and then a bunch of negative. And I was like, no, this is totally the way to go.
Paul 00:56:26 In fact, what happened is you started saying something negative and then said, oh, I think it’s a really good idea — which is good on you. All right. Well, thanks, Kanaka, for the
Michael 00:56:36 Question. Thanks for the question, Kanaka.
Paul 00:56:38 Thinking about these — so the thing that you and Kanaka and, like, Robert Yang are working on are these sort of inter-regional, multi-region kinds of models, right? Whereas — I mean, I think that you could think of a convolutional neural network as multi-regional, but if you train a convolutional neural network to perform object recognition, you’re training it on one thing, essentially. And of course catastrophic forgetting is a problem in artificial networks, and so is continual learning. Do you see the advent of these multi-regional kinds of networks — whether they’re inferred from empirical data like yours are, or trained on the neural dynamics like Kanaka’s are, or the more traditional approach of training a recurrent network on a set of cognitive tasks like Robert Yang is doing — do you think that the interplay between these regions will help us explain, especially in a multi-task sort of environment, properties of empirical data that wouldn’t be explained by training one network on one task?
Michael 00:57:59 Yeah, I think that’s plausible. I don’t know exactly why, mechanistically, though. I’m trying to think — well, I think actually it’s what they call inductive biases; that’s one term that’s out there for the kinds of things that evolution brings to the table in actual biological systems, and maybe those biases are toward generalization. And so that might be the way we would discover what those are, and then we can start using those in artificial neural networks too. I kind of alluded to that sort of idea — like, if we did the Jim DiCarlo style network but using empirical connectivity, maybe that would generalize better for vision, I don’t know. But certainly, yeah, I can imagine there are all sorts of different processes for generating flexible behavior that would have been, supposedly, selected for during evolution, that would maybe shape how development happens, or how the brain is organized as a whole.
Michael 00:59:09 And then on top of that, there are these learning algorithms that fine-tune things, but maybe these biases in the network organization are key — that’d be my guess. I don’t know whether it’s important to have a lot of regions per se, or whether it’s really about the number of units, or — one thing that I’ve kind of wondered about, actually: what’s different about what we do is we look at the empirical brain connectivity, and it’s quite sparse, at least if you’re not using correlation — like the structural connectivity at the large scale. Whereas artificial neural networks will start out with all the connections there, randomly weighted. And I do wonder if sparsity plays a big role here. That’s just the beginning, though, right? Like, sparsity, and then — what is it about the particular organization that’s helping shape activity flow and create these computations that generalize?
Paul 01:00:17 One of the reasons why I’m asking — and I’m going to keep pushing on this just a little bit, just to build it up, I suppose — is that something like a Jim DiCarlo convolutional neural network is trained to perform object recognition, and that’s not really what vision is, right — solving static objects — because we’re in this constant flow of doing quote-unquote vision while we’re doing seven other things: paying attention to our own internal homeostatic signals, et cetera, et cetera. And it’s not enough just to show movies, because, yes, that’s movement, but it’s also still embedded in this sort of here-is-a-task framework, whereas — and I guess I could allude to the push for ecologically valid tasks, but I still say “task” — our interaction with the world is much more dynamic and flowing. And so I’m wondering whether you think — and here I’ll say emergent properties, right — whether using these kinds of inter-regional approaches, where you have more dynamic interactions among the different regions, however they’re connected, et cetera, we might inch closer to explaining more of our subjective awareness, or our internal cognitive flow of processing that we experience. That was a mouthful, sorry.
Michael 01:01:52 That’s really interesting. So, yeah, I think in order to really get the kind of dynamic interactions with the world, we’re really going to need to be modeling multiple brain regions at the same time — and not just that, but how they interact with each other. And so we’ve really emphasized going, ideally, all the way from stimulus to response. We focus on that feedforward process for now, and that’s really about experimental tractability. But the key is, right, there’s no one brain region that’s going to go all the way from stimulus to response, so we’re really going to need all these inter-region interactions. And then, yeah, once we get the feedforward process figured out in some probably limited context — because it’s a huge challenge — then I can imagine worrying more about feedback. Let’s say the feedforward process, in a lot of contexts, is most of the problem, right?
Michael 01:02:53 If you’re just kind of passively, I don’t know, watching TV, or playing a video game or something, maybe that’s most of it, but in other contexts it’s just a small part of it. In reality, in most contexts feedforward and feedback are constantly, dynamically updating — the action-perception cycle. But yeah, at a minimum you’d want multiple brain regions involved in your model. And what we found is that if we took the activity from the things I just described — the sensory input, the task context or rule representations, and also the motor responses — then we were able to actually simulate that and generate a task-performing model from empirical brain data. The trickiest part was in the middle: how do you integrate the task rule representations? There’s activity flowing through the resting-state connections somewhere.
Michael 01:03:56 And there’s sensory information flowing through the resting-state connections somewhere, and we want to know: where is that? That’s equivalent to the hidden layer in an artificial neural network — it’s just thrown out there, like, oh, clearly there’s this hidden layer. And in the literature it’s talked about as association cortex, which is most of cortex in humans, right? So it’s like, where is that exactly? This is kind of a major issue, and actually an opportunity for advancing understanding by saying, no, let’s actually figure out where this theoretical construct — the hidden layer — is. These are what we call the conjunction hubs. The hidden layer actually plays a lot of different roles in a lot of different networks; in this particular situation, it’s at the conjunction between the context — the task rule representations — and the stimulus input.
Michael 01:04:57 So there were a lot of different strategies we thought of. What we ended up doing is actually building an artificial neural network that could perform the task, then looking at what's called the representational geometry of its hidden layer, and then using representational similarity analysis to find which brain regions have a similar representational geometry, so that the similarity structure of their activity patterns matches what's going on in the hidden layer. So just to go over this preprint, the Ito et al. 2021 preprint on the ENN, there are basically three, let's say four, steps to it. The big idea is that we wanted to take the actual empirical brain data for the activity patterns and use empirical functional connectivity to link together these different brain regions all the way from stimulus to response.
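As a rough illustration of the representational similarity step Michael describes here, a minimal Python sketch follows. The array shapes, the correlation-distance metric, and the function names are assumptions made for the example, not the exact analysis in the preprint.

```python
# Minimal RSA sketch (illustrative only): which brain regions share the
# representational geometry of a task-trained ANN's hidden layer?
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns):
    # patterns: (n_task_conditions, n_units) activity, one row per condition.
    # Returns the condensed representational dissimilarity matrix
    # (pairwise correlation distances between condition patterns).
    return pdist(patterns, metric="correlation")

def match_regions_to_hidden_layer(hidden_acts, region_acts):
    # hidden_acts: hidden-layer activations of the ANN, (n_conditions, n_units)
    # region_acts: dict of region name -> (n_conditions, n_voxels) fMRI patterns
    target = rdm(hidden_acts)
    scores = {}
    for region, acts in region_acts.items():
        rho, _ = spearmanr(target, rdm(acts))  # rank-correlate the two RDMs
        scores[region] = rho
    # Regions with the highest rho have an activity-pattern similarity structure
    # most like the hidden layer's; these are candidate "conjunction hubs".
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```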
Michael 01:05:57 And so we start with the sensory input. We decode sensory areas to ensure that we actually have the information that's relevant to the task in these regions. We then also decode the task context; this is all using that PRO paradigm that I talked about earlier, by the way. So you have all these 64 different tasks whose rules are recombined, and we decode each of those tasks and find brain regions that actually have that information in them. We then use functional connectivity to simulate the activity flow that would go into what we might call the hidden layer, or what we specifically call conjunction hubs, because it's the conjunction between the sensory input and the task context. We then apply a nonlinearity there, which turns out to be pretty important. And after that, we do another activity flow step up to M1.
Michael 01:06:54 So the output regions, and that's our prediction of behavior, right? We've gone all the way from sensory input to motor output in a context-dependent decision-making task. And then we decode what motor response is happening. And it's not just a normal decoding, by the way: it's trained on actual empirical data, on how people press buttons and what happens in primary motor cortex when they do. So we're decoding in the format that M1 uses to represent these button presses, and we get above-chance accuracy. So that's actually a task-performing brain model built from empirical data.
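The stimulus-to-response pipeline Michael just walked through can be sketched roughly as below. The variable names, shapes, and the specific thresholded nonlinearity are assumptions made for the illustration; the published ENN procedure may differ in its details.

```python
# Illustrative activity-flow forward pass: decoded sensory and rule activity are
# propagated over resting-state functional connectivity (FC) to conjunction
# hubs, passed through a nonlinearity, then propagated on to motor cortex.
import numpy as np

def threshold_nonlinearity(x, thresh=0.0):
    # Stand-in for the nonlinearity applied at the conjunction hubs.
    return np.where(x > thresh, x, 0.0)

def enn_forward(stim_act, rule_act, fc_stim_to_hub, fc_rule_to_hub, fc_hub_to_motor):
    # stim_act: (n_stim,) decoded sensory activity for one trial
    # rule_act: (n_rule,) decoded task-rule activity for the same trial
    # fc_stim_to_hub: (n_stim, n_hub) resting-state FC weights; likewise the others
    hub_input = fc_stim_to_hub.T @ stim_act + fc_rule_to_hub.T @ rule_act
    hub_act = threshold_nonlinearity(hub_input)   # nonlinearity at conjunction hubs
    motor_act = fc_hub_to_motor.T @ hub_act       # second activity-flow step to M1
    return motor_act

# The predicted motor activity would then be read out by a decoder trained on
# participants' empirical M1 responses to actual button presses.
```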
Paul 01:07:36 No training, with zero training, using the...
Michael 01:07:41 That's awesome. I'm trying to think... Oh yeah, the other thing that theory predicted, that made us think we were really going to have to do this, though we weren't totally sure, was whether we needed a nonlinearity at the hidden layer. There's a model by John Cohen, Dunbar, and Jay McClelland from 1990, the Stroop model, where they introduced this context layer to complement the hidden layer; we think of the context layer as where the rules are represented. They made a big deal in that paper about the nonlinearity in the hidden layer being really important. It's kind of like an attention mechanism, where you're selecting the representations that will basically filter the stimuli according to the task context so that you select the correct motor responses. And lo and behold, we did need a nonlinearity, just like we thought we would. I mean, for theoretical reasons you'd think so, right? Because it's context-dependent decision making, you need this interaction so that it's contingent. The same exact stimulus can go to totally different motor responses; it depends on the rule. So there's a nonlinear interaction that has to happen so you can select the correct one. So I thought that was pretty cool that that came out of the work.
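A toy numerical illustration of that point (made-up numbers, not the Stroop model itself): when the correct response is an XOR-like function of stimulus and rule, no purely linear stimulus-to-motor mapping can produce it, but a small nonlinear conjunction layer can.

```python
# Context-dependent toy task: respond with the stimulus under rule 0 ("same")
# and with its opposite under rule 1 ("opposite"). The correct response is
# XOR(stimulus, rule), which a linear readout of stimulus + rule cannot compute,
# but two ReLU "conjunction" units can.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conjunctive_response(s, r):
    x = np.array([s, r, 1.0])                 # stimulus, rule, bias
    W_hidden = np.array([[1.0, 1.0, -1.0],    # unit ~ AND(s, r)
                         [1.0, 1.0,  0.0]])   # unit ~ s + r
    h = relu(W_hidden @ x)                    # nonlinear conjunction layer
    w_out = np.array([-2.0, 1.0])             # (s + r) - 2*AND(s, r) = XOR(s, r)
    return float(w_out @ h)

for s in (0, 1):
    for r in (0, 1):
        print(f"stimulus={s}, rule={r} -> response {conjunctive_response(s, r):.0f}")
# The same stimulus maps to different responses depending on the rule,
# which only works because of the nonlinearity at the hidden units.
```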
Paul 01:09:09 One of the things that I like about what you're in pursuit of is this: you have connections, right, and that's all networks, and you can talk about the properties of those connections. This is like network neuroscience, right, where you talk about path length and different metrics of how to characterize an essentially static network. And then you have functional connectivity between regions, and what your work is doing is bringing those two things together. But it's still essentially all networks, right? Do you think that this sort of network vernacular and approach, also looking at the dynamics, like the nonlinearities you were just talking about, and looking at activity flows within networks, is going to be enough to quote-unquote explain cognition? Or will we need to talk about multi-scale, multi-level organizational components?
Michael 01:10:13 Yeah. So one reason I went down this path of making these empirically estimated neural networks, or these network coding models, or whatever term we want to use, was to make that an empirical question, basically. I really had a couple of moments like that, kind of like, do I think this is really the kind of activity they use in models? And I said, I don't know, maybe not, but I should try it and see. And then I've been surprised; I'm sure there are going to be limits to it, but there does seem to be some sort of equivalence there. And like I said earlier, we probably won't be able to model someone playing a complex piano piece using fMRI; there are going to be similar limits at whatever scale we're at. But I'm hopeful.
Michael 01:11:19 Right. I think it's plausible to say we could make these tasks that are a little bit artificial but still informative enough. You know, it's a forced choice between two button presses, because maybe we can decode right versus left hand really easily, or something like that. But you can still get the key network computations, the network mechanisms, as long as you construct the task appropriately. If we were able to do that, I would be very happy. And then it might be, oh no, we can't do this really subtle thing, and then you'll have to get into very fine-grained things. There's also the question of, when I say there's this connectivity pattern between these two regions and I have all these voxels inside them, so it's pretty fine-grained in some sense, you could always ask, between these two voxels, what exactly is the physical basis of that? And you could go all the way down to individual synapses in explaining that, right? So there are always levels here. It's just whether we're at a level where we can say we're pretty satisfied with our explanation of this cognitive process. And I'm hopeful that we'll get pretty far at this level, but you never know till you try.
Paul 01:12:42 Oh, see, there's more optimism.
Michael 01:12:46 That was overall optimistic, wasn't it?
Paul 01:12:50 So Mike, this is ostensibly a show about neuroscience and AI, and often what gets left off the table in these conversations, and I'm going to make sure to include it in ours, is the potential for your work, for instance, and this kind of approach, to actually influence and benefit AI. Because right now we're in this place where you guys are using all these deep learning models, even though you hate learning and don't use deep learning, of course. But with the deep learning model approach, the flow, I'll say, is much more toward neuroscience and benefiting how we understand brains. Of course, the whole deep learning approach began with the concept of neural networks, right? So the activity flow does go both ways. Do you feel like these models that you're building, for instance, will have implications for, or benefits for, AI?
Michael 01:14:01 Yeah, actually on multiple fronts, I guess, to zoom out a little bit. One reason I was interested in the rapid instructed task learning stuff was because I am actually interested in learning, but I'm interested in how humans learn some things much more rapidly than artificial neural networks. And so it's possible that some of the insights we get from the RITL work will translate into being able to just instruct a machine verbally to do some task, like you would another person. And also just the general ability to flexibly reorient to, and reuse, concepts and task rules or task information. And then in terms of the activity flow models, like the ENN, that's a little bit more, as I already described, about whether something will emerge from these things, something that's in biological tissue that we're able to simulate, and we're just kind of surprised by its ability to generalize. It's a little more of a bottom-up kind of thing than the RITL work, where we have this kind of cognitive theoretical target. And I guess, because I'm trying to merge the two whenever I can, that would be the ultimate, right? If we could simulate RITL, and it works, and then we dig into how the model is working and we say, oh, this AI model just did this one thing that allowed generalization.
Paul 01:15:46 Along the same lines as the system one versus system two difference, or the "AI needs a prefrontal cortex" push from Bengio and O'Reilly and those sorts of folks, do you see it that way?
Michael 01:16:03 Yeah, it's totally related to that. So, like, I worked with Walt Schneider, who had the controlled versus automatic processing distinction, which maps, I believe Kahneman even said it maps one to one, onto the system one/system two concept. So yeah, controlled processing, but this particular flavor of controlled processing that is really about novel task behavior and transferring abilities to novel situations, which is directly related to general human intelligence. That's another topic that I really dug into when I was working with Todd Braver. So general fluid intelligence is this really fascinating concept in psychology. It's really about individual differences and is directly related to RITL abilities; they actually correlate quite strongly. And so if we could really figure out what's going on, like, why do humans have this?
Michael 01:17:05 It's a factor-analytic thing that you can see in the statistics: each individual seems to have this general ability that generalizes across a bunch of different tasks. What is that? Where is that in the brain, and what's the mechanism behind it? Maybe once we figure that out, we can copy it over for AI. And I guess there's the term artificial general intelligence, and I'm talking about natural general intelligence, right? Maybe there's some way to learn from one and take it over to the other.
Paul 01:17:37 Well, with those control processes, are we going to be talking more in symbols rather than lower-level network properties? Are we going to end up having this mesh between symbolic and neural network types of architectures?
Michael 01:17:59 From what I understand, that was a really hot topic in the late 1980s and early 1990s, and it was seemingly going that way. That's when I read about it more, you know, the older literature, so I haven't been following the recent stuff. But having thought about it for a long time now, from that older literature, my thought was, let's just figure out how the actual biological tissue does the symbol-like stuff. Then we can still stay in this distributed architecture, and you have the benefit of potentially mapping one to one onto the human brain, like we're trying to do with ENNs. If we start putting in these abstract symbolic modules, then it'd be like, well, wait, where exactly does that map onto? And can we go any deeper into that? Not really. I guess you might be able to find that it maps onto a brain region, but the inner workings, I bet, wouldn't map very well.
Paul 01:19:04 Very good. Well, in our final few minutes here, and thanks for hanging with me for so long: I know you're working on multiple fronts, and we talked mostly about just one of them, but I want to ask, last night, after you brushed your teeth and flossed and put your anti-aging cream on and laid down, what did you think about? What kept you up longer than you should have been up?
Michael 01:19:32 You know, I mentioned earlier something about causal inference, and that keeps coming up for me as central to not just what I'm working on, but really neuroscience and science in general. It's a really hard problem, especially in complex systems like the brain, and even these AI systems. So one big idea that we've been pursuing in my lab is using causality as a kind of common ontology for different areas of neuroscience. It's really based on a general hypothesis about causal interactions among neural populations: we're really thinking that those will end up being the most critical features for explaining the neural basis of cognition. Of course there are a lot of other things, but if you have neural processes described in terms of causal properties and these kinds of activity flow processes I've been talking about, that's maybe going to be the main way of describing an explanation for how some kind of cognitive process emerges, or is generated.
Michael 01:20:52 So there are tons of other details, of course, but you could think of them more as modifying that process, right? There's a set of processes: you have a nonlinearity at one step that's about selecting a subset of the activity flows, which then changes how things happen downstream. You also have lots of concepts like confounders and colliders that would take a while to get into, but all of those things together, I think, are going to be really important for getting explanations of brain function and of how cognition emerges from neural populations, the kind of explanations we would actually be satisfied by, potentially. Oh yeah, one thing I will say that I've been thinking about recently along these lines is the concept of what I call causal sufficiency. I don't know, maybe this is already out there and I just haven't come across it.
Michael 01:21:53 But the idea is, even if you ablate or lesion a region, you can show that it's causally necessary, but you don't know if that brain region, say, was causally sufficient to make the cognitive process. And that's where these models can come in, right? Like the ENN or even an ANN, or any sort of model. You actually generate the process, and you can show, especially if it's empirically constrained, that this is equivalent in all these ways to the actual biology, and it generates the process of interest. So at the very least it's causally sufficient. Then you also would like to have some of these lesions and stimulations to show causal necessity, potentially. But you could even imagine, say there are two different pathways that can accomplish the same cognitive process: you ablate one and it does nothing; potentially you ablate the other and it does nothing; but really they're both, maybe, causally sufficient for generating the process.
Paul 01:23:00 That speaks to work like Eve Marder's and the idea of multiple realizability, and how, in the end, anyway, we're activating our muscles, right, to perform some task. So you might get away with being pretty ugly internally and still come out with the right behavior. And this is what everyone's interested in, I suppose, or what we're testing; the vast majority of it is behavior anyway.
Michael 01:23:28 Right, yeah. I did have some interesting reviews for something I was working on with the ENN, where I was emphasizing behavior. I felt like that was the holy grail: if I can predict behavior well, that's the real index that things are working. But then I had a reviewer sort of say, well, all you're doing is predicting motor behavior; what about cognitive processes? And I'm like, oh, what? That's what we have been doing, and the innovation is that we're getting all the way to behavior now. So all I have to say is, no, we've been doing that; we have to model the cognitive process to predict the M1 behavior.
Paul 01:24:14 And they accepted it?
Michael 01:24:16 Oh, I'm still in the process, so we'll see. Maybe they'll hear this explanation and be like, oh yeah.
Paul 01:24:26 Well, I'll push the... oh, I can't, I've got to air this in a few days, man. I can't push this out until it's accepted. So, sorry, we'll bleep that, I suppose. So finally, Mike, I want to ask you a career-type question. I knew you back in graduate school, I know you did a lot of stuff before that, you've had a lot of good advisors throughout, and I know you've worked extremely hard, which I've always been impressed with; it seems like you're always on focus and on point. I'm wondering if there's a time in your career, a specific time, or I'm sure there are multiple times, but could you tell a story about some time when you feel like luck played an integral part in some success in your career?
Michael 01:25:17 Yeah. So I guess the early interest in what later became known as network neuroscience really started in Mark D'Esposito's lab. I was just lucky that I ended up in his lab and then continued along that line. And the reason it's lucky is because it was beyond my control that the rest of the field really went in that direction, so I didn't have to swim upstream, I guess, to make progress on it; there was a real current going on. And then also, I was at Washington University, working with Todd Braver and also Steve Petersen, when the Human Connectome Project was started there. I wasn't actually involved in it, but I was right there, and I had all these advantages for knowing about it and what it involved, and being able to ask questions about the data early on. That was just this treasure trove of questions we could ask without having to even collect new data. The analyses took a long time and were a lot of work, but it wasn't nearly as hard as designing experiments and collecting data, and the large samples actually made for much more robust conclusions and statistics. So anyway, all that is to say, that was luck.
Paul 01:26:54 Is it possible to parlay that kind of serendipity into advice for aspiring people, maybe people who feel like they haven't been so lucky or who are swimming upstream? Is it even possible, or is the only thing to say that those are just lucky events?
Michael 01:27:18 I'll say that there were a lot of people at Wash U when I was there who didn't work with the Human Connectome Project data. So I guess it's like, what's that saying, chance favors the prepared mind? Or seize the day; that works too. You know, I don't know, just really look for opportunities wherever you are. And it required me to change what I was going to do, right? Even if it wasn't the plan, I made that my plan instead of something else. So it wasn't purely passive; there was some kind of seizing the opportunity. And then there's also, I guess, in this particular case, some intuition. So I don't know if you can totally plan on that, but...
Paul 01:28:17 And also be smart.
Michael 01:28:21 Just think plausibly, you know: if this trend, or this little idea, actually, because it was before it was a trend, I guess, or the early days of the trend, if this kept going, is it even plausibly going to lead to anything? It was like, okay, the brain is a network; we've known that forever. So, yeah, studying the brain as a network seems like a good idea. That kind of general logic I think could help, but yeah, you can't really make general advice out of this, I don't think. It's just that, in this case, those were important factors.
Paul 01:29:02 Yeah. The only advice you can give is that you have to work super hard and develop skills in whatever you're doing, and I guess be willing to change, right, and seize the day when something like that comes along and it feels right and seems right. I don't know.
Michael 01:29:18 Yeah. Yeah. What is that called, the exploration-exploitation trade-off, or...
Paul 01:29:25 Yeah, but boy, that's a whole other bag to open. You have to explore and then exploit, and then explore again, and I don't know the perfect pattern for that either. That's a recurring theme, actually: I don't know that I can write out that algorithm. All right, I won't keep you any longer. Thank you, Mike, for coming on. Thanks for answering those guest questions as well. I love the work, and continued success to you.
Michael 01:29:56 Oh, thank you. Thanks for having me on. It's been great talking, and the guest questions were a real highlight. It was great to hear from some old friends.
Paul 01:30:10 Brain Inspired is a production of me and you. I don't do advertisements. You can support the show through Patreon for a trifling amount and get access to the full versions of all the episodes, plus bonus episodes that focus more on the cultural side but still have science. Go to braininspired.co and find the red Patreon button there. To get in touch with me, [email protected]. The music you hear is by The New Year. Find [email protected]. Thank you for your support. See you next time.
0:00 – Intro
4:58 – Cognitive control
7:44 – Rapid Instructed Task Learning and Flexible Hub Theory
15:53 – Patryk Laurent question: free will
26:21 – Kendrick Kay question: fMRI limitations
31:55 – Empirically-estimated neural networks (ENNs)
40:51 – ENNs vs. deep learning
45:30 – Clinical relevance of ENNs
47:32 – Kanaka Rajan question: a proposed collaboration
56:38 – Advantage of modeling multiple regions
1:05:30 – How ENNs work
1:12:48 – How ENNs might benefit artificial intelligence
1:19:04 – The need for causality
1:24:38 – Importance of luck and serendipity