Episode Transcript
[00:00:03] Speaker A: In fact, I'll go as far as to say, as far as I can tell, transformers are almost a counterexample to the successes of neuro AI, in that they bear, as far as I can tell, very little resemblance to anything that I expect to find in the brain.
I'm particularly interested in three sort of large questions about, you know, where you can hope to gain insights from biology, from neuroscience, and port them over to artificial systems.
And alignment is sort of, in some sense the most fundamental.
You know, this is crazy.
A crazy amount of flexibility.
[00:00:52] Speaker B: Yeah.
[00:00:53] Speaker A: And so how does that happen? I mean, we don't know the details, but the answer is I think that there is a developmental process, you might even call it a developmental curriculum.
[00:01:06] Speaker B: This is Brain Inspired, powered by The Transmitter. Hello, I'm Paul. I do neuroscience things at Carnegie Mellon University.
Tony Zador runs the Zador lab at Cold Spring Harbor Laboratory. You've heard him on Brain Inspired a few times in the past, most recently in a panel discussion that I moderated at this past Cosyne conference, a conference that Tony co-founded 20 years ago. As you'll hear, Tony's current and past interests and research endeavors are of a wide variety. But today we focus mostly on his thoughts on neuro AI, roughly the interplay between neuroscience and artificial intelligence. So we are in a huge AI hype cycle right now, and for good reason. And there's a lot of talk in the neuroscience world about whether neuroscience has anything of value to provide to AI and how much value, if any, neuroscience has provided in the past. Tony is team neuroscience in this regard. You'll hear him discuss why in this episode, especially when it comes to using biological processes like development and evolution to improve data efficiency in AI models.
Also looking to animals in general to understand how they coordinate their numerous objective functions to achieve their intelligent behaviors, something Tony calls alignment, and using spikes in AI models to increase energy efficiency. If you like written essays, by chance, Tony has written two essays on the past and the future of neuro AI, respectively, and those are available on The Transmitter website, and I think they nicely complement our discussion on this episode. I will link to those essays in the show notes, where I also link to a couple of the papers that we discuss and Tony's previous Brain Inspired episodes. Thank you to my Patreon supporters. I think we need to have a discussion in the near future, maybe some sort of live discussion. I'll be on Discord soon to probe your interest in what topics we might talk about when we meet next. If you want to support Brain Inspired through Patreon, just go to braininspired.co and you will find a link to the Patreon there. All right, here is Anthony "Tony" Zador.
Hey. First, up front, I just want to thank you for essentially hooking me up with the good folks at The Transmitter.
[00:03:37] Speaker A: I'm so glad that worked out.
[00:03:38] Speaker B: Yeah. I mean, just such a, not random occurrence, but you and I were passing at Cosyne in the hall and just chatted for a minute, and then next thing you know, I'm speaking with Emily at The Transmitter and it just all really worked out. So I'm really grateful. So thank you.
[00:03:52] Speaker A: I'm glad that worked out. I thought it would be a good match.
[00:03:56] Speaker B: So you have NAISys coming up. I'm not going to be able to be there, but I will see you maybe a month later in Bethesda at the BRAIN NeuroAI workshop.
[00:04:06] Speaker A: Oh, nice.
[00:04:08] Speaker B: Which, I mean, you're just. You're all over the place. You're organizing all sorts of neuro AI stuff.
[00:04:15] Speaker A: That's right.
I think. I think this is the time for. Are we officially interviewing now?
[00:04:22] Speaker B: Yeah, sure.
[00:04:23] Speaker A: We didn't. We didn't have a transition. Yeah, Yeah. I hope this.
[00:04:28] Speaker B: Tony. Tony, thank you for being on. Welcome. I appreciate you being on.
[00:04:32] Speaker A: Sorry about that. Yeah, I'm pretty excited about neuro AI these days, both, you know, the work that I'm doing myself and sort of the broader field, and seeing what other people are doing, getting people with these shared interests together in a variety of settings. So the NAISys meeting, the NIH meeting, which brings together not only a bunch of researchers, but also, as they say, stakeholders: various branches of government funding agencies, NIH, NSF, a couple of others.
And then there's also one coming up at NeurIPS.
So, yeah. That's only in the next two months.
[00:05:22] Speaker B: Right. I know. It's all over the place. You may be the most excited person about neuro AI that I know.
Actually, I was just on a boat in Norway at a neuro AI workshop and at the very end, I had. I was tasked.
[00:05:37] Speaker A: Wait, what? I thought. I thought they couldn't hold one without inviting me. What the.
[00:05:42] Speaker B: I was surprised. I was surprised, to say the least. I was assuming that you were.
[00:05:46] Speaker A: I am hurt. Yeah, no, I'm hurt.
[00:05:48] Speaker B: There was. Yeah, there was a no-Tony rule for sure. But I sort of polled the participants, there were about 30 people, on whether they liked the term neuro AI, and a few of them advocated for it somewhat passionately, which I was surprised by. And of course, some people don't like it, but how is it sitting with you? I mean, you love the term, right?
[00:06:15] Speaker A: You know, honestly, I wasn't a big fan at the beginning.
It was actually like five years ago or so that I started that. Actually, somebody I was working with here at Cold Spring Harbor, who was trying to help me put together a program here at Cold Spring Harbor on neuro AI. We tossed around various ideas, and I think she, Kat Donaldson, who's no longer here at Cold Spring Harbor, can actually take credit, at least from my point of view, for really getting me to land on that as the term, because I had to write a couple of proposals. And yeah, I was using these, like, three- or four-word phrases, "the intersection of neuroscience and AI." And she's like, that is just so clumsy. That is just so awkward. So I actually did a Google Scholar search a couple of weeks ago to figure out when the earliest occurrence was, and there's been a tremendous uptick. There are essentially no occurrences of the term until sort of the mid-to-late 2010s. There are a handful, but it's often used to mean something different, actually something closer to what these days is called something like brain-machine interface or brain-computer interface. To the extent that it was being used 20, 30 years ago, apparently that's how people were using it.
[00:07:49] Speaker B: I mean, part of my hesitation in adopting it, I agree it is the least clunky, clumsy way to say it. Right. And it's kind of clear what it means, I think. But part of my issue is, you know, even in the early days of AI, was it John McCarthy who coined the term artificial intelligence?
[00:08:10] Speaker A: I think that's right. It was that meeting that he organized. I'm not sure who actually coined it. There was, like, a 1956 Dartmouth meeting on what is now called AI.
[00:08:21] Speaker B: But maybe it was him who disliked it. There was some. Oh, yeah, people who dislike the term. And I appreciate that going back. So there's kind of been. Not controversy, but there's been some debate on whether it's. Artificial intelligence itself is a good term to use, and I'm sort of on board with that criticism.
[00:08:41] Speaker A: Yeah. I mean, when I was a kid, when I was in graduate school, AI was the boogeyman. AI referred exclusively to symbolic AI.
And by contrast, the thing I was interested in was artificial neural networks, computational neuroscience and machine learning. And Very few people in my circles would have described what they were doing as AI. What they would say is that what they were doing was the opposite of AI. And so it's, it actually took me a while to start even being willing to say that what I'm interested in is AI.
[00:09:26] Speaker B: What would they, would they call it connectionism? What would the term.
[00:09:29] Speaker A: Well, yeah, it's funny. So the term connectionism was more specifically for a particular subset of, of people who did, who did neural networks. The people who came sort of more from the, you know, the PDP books, the people who came from, mostly from psychology and who wanted a connectionist description of human cognition.
[00:09:59] Speaker B: Right.
[00:10:00] Speaker A: They wanted to. So, and again, we could, we could sort of talk about all the substreams and the various approaches. But you know, part of the research program there was to jump to skip rule based approaches, for example, in understanding verb conjugations and to use a so called connectionist approach. And there was a lot of debate in sort of the cognitive science literature about which of those is right.
[00:10:32] Speaker B: So you've seen that historically developed in, and you've made your own transition into accepting the term more or less. Right. But, but so why, why are you particularly excited these days? I mean, I know it hasn't, it wasn't last week that you became excited, but. Because this has been a while now.
[00:10:51] Speaker A: Yeah. So I mean, my own trajectory was that when I was starting graduate school and I was in graduate school sort of late 80s, early 90s, I was excited really about what at that point was one field, which was computational neuroscience, slash artificial neural networks.
They were two sides of the same coin. And there, the idea was that you would build models of how neural circuits compute and then you would extract your understanding of what you thought or your beliefs. You would, you would abstract your beliefs about how you thought neural circuits computed and apply them to build better machines. And so there was no, really, especially, you know, in the late 80s, early 90s when, when I started getting involved with this, there was, there was no clear distinction between those, those two. Right. And so what I was excited about was both sides of it because there was a long tradition in neuroscience of using quantitative models, and that was certainly part of computational neuroscience. But typically those models were not constrained by the recognition that neural circuits not only have dynamics and have behavior, but they have to actually perform a function. They have to enable the organism in which the circuit is embedded to solve a problem.
And I think it was really that constraint that was crystallized by sort of the early neural network meeting with computational neuroscience. So, you know, sort of very naively, I would say that, you know, if you asked a lot of people who worked on vision, and this is now before my time, and so I was just recently trying to talk to people who were around back then to see if my sort of inference about the history is true. But my belief is that a lot of the early vision scientists, you know, the people who were inspired by Hubel and Wiesel, didn't really see that there was a hard problem underlying vision. Right? In the same way that, for example, the early people who started working with computers thought that, okay, chess was going to be hard, but getting a computer to control an arm to pick up a chess piece was going to be easy. And, you know, that turned out obviously, even by the early 60s, to be false. But I think that vision scientists took a while to recognize that the circuits they were studying were doing something really hard. And so the research program that I see, based on published work following from Hubel and Wiesel through the 60s, 70s and 80s, was: let us identify the representations in different brain areas of visual scenes, and that will be an explanation for vision.
Right?
[00:14:17] Speaker B: Like, are you talking like the, like neocognitron kind of days or.
[00:14:23] Speaker A: Well, no. So that, so that is an exception, right? Because that is somebody who's actually saying, okay, let's take what we believe is happening and put it together and see if we can build an artificial vision system. I'm talking on the physiologist side, right?
[00:14:37] Speaker B: Yeah, but that sort of has kind of stayed with us, right? I mean, it's not like that has really left altogether.
[00:14:44] Speaker A: I mean, I think that there. I think you would be hard pressed to find a visual physiologist who does not recognize that vision is a hard computational problem. They might say, look, what I can contribute to it is a characterization of neuronal receptive fields. And I'm going to characterize those receptive fields because that's what I can do.
And I'm going to tell a story about the role that these receptive fields might play in visual processing. But I think it would be very naive for a modern visual neuroscientist to think that that by itself was the answer. Right. Because when I was in graduate school, one might have imagined that, well, you know, it's pretty straightforward. You build a system with these receptive fields and you've got a working vision system. And the reason we haven't been able to do that so far is we just haven't bothered.
We haven't had enough engineers.
We didn't have enough computer power. Right. And you know that, you know, 30, 40 years ago, that might have been a viable argument. But now we know that, like, if you try really, really, really hard and devote lots of computer power and lots of engineers, you get something now that works pretty well. But for a long time you didn't. It remains a hard problem.
[00:16:12] Speaker B: Well, and in those early days, it was all Gabor filters and edge detection in the receptive fields. And it's like, can you build up to vision from Gabor filters?
[00:16:22] Speaker A: Yeah, exactly. Right. Like, basically you would break the problem into, you know, 12 or 24 or 50 sub-problems and solve those individually. Right. Like shape from shading and, you know, optimal. I can't tell you how many papers I read at various times about, you know, different variations of optimal edge detection.
Right.
[00:16:49] Speaker B: So part of the reason you're excited is because. Because of that history of thinking that it was an easier problem than it turns out to be.
[00:17:01] Speaker A: Well, what got me excited so when I was in graduate school, I thought I could do both things.
I thought that I could simultaneously learn about neuroscience and develop better, you know, take what I learned and apply it to building better systems. By the time I got to doing a postdoc, I decided that you kind of had to pick one. At that point, the. The interest within the neural network. And first of all, interest in neural networks was beginning to wane a little bit. But more relevant to me, I felt like there were real opportunities to learn more about neuroscience. And I felt like I should just take this approach that I had and do the right neuroscience experiments. And so I retooled as a pure experimentalist or as an experimentalist driven by theoretical and computational questions. And for my postdoc, I worked on synaptic physiology. Now, I still was interested in quantitative approaches.
There was an excitement at that time about using information-theoretic techniques pioneered by Bill Bialek. And so I was pretty excited by those.
[00:18:26] Speaker B: How do you think about those? What do you think about information these days?
[00:18:30] Speaker A: Well, I think it is the core of how I approach problems.
I think the exercise of measuring information, which is what a lot of us, including me, were doing at that time, it is useful for thinking clearly about how collections of neurons represent information, at.
[00:18:59] Speaker B: Least their capacity to.
[00:19:01] Speaker A: Their capacity to do so. Yeah, there are lots of limitations to it, but I think the actual numbers that you get out turn out to be boring. And sterile. And I guess that's. That's sort of the complaint that people have is that what's for me, compelling is, you know, the framework, the way of thinking about the world, the recognition that, you know, among other things, if you're going to process information, that information has to be propagated and transformed and. Yeah, right.
[00:19:35] Speaker B: So you have the bits and then what the hell do you do with them? Is the.
[00:19:39] Speaker A: That's right. That's right. So it's sort of part of the problem. I mean, it addresses part of the problem that we have in the nervous system. I guess one thing that changed for me, and I think for a lot of the field, was that back then a lot of us were focused more on sensory processing. And so information theory covers a lot of how you take information in from the outside world and you represent it. Right. How do you characterize those representations?
What it left out was something that I didn't come to until really I started my own lab, which was the recognition that, you know, animals behave. This is not a shocker, but that back then, many of us were studying anesthetized animals.
[00:20:36] Speaker B: Yeah.
[00:20:39] Speaker A: And so the behavior that you get out of an anesthetized animal is usually somewhat less interesting. And if it's.
[00:20:47] Speaker B: Can I just pause there for a second? Because there's so many instances and I can't think of one. An episodic memory of one where people point to historical experimental work like anesthetized animals. Right. And then they say talk about how it was a necessary step to get to where we are these days. Right. Using different animal models, using them in controlled experiments versus natural, et cetera. Do you. I mean, do you think that that's true in this case with the anesthetized animals?
[00:21:17] Speaker A: I mean, I think people going back to ancient philosophers have asked whether, you know, any particular path through history is an inevitable path. Right. Like, so was it inevitable? I mean, I'm not sure it was inevitable, but to this day, right, there are experiments that are. I mean, let's even go more extreme. For my postdoc, I studied brain slices.
[00:21:46] Speaker B: Yeah, yeah, right.
[00:21:50] Speaker A: We certainly can learn a lot of things in brain slices that would have been hard to learn in the intact preparation, especially back then.
You know, a crazy thing is that when a neuroscientist refers to an in vivo prep, we refer to one in which the brain is still inside the skull.
If a biochemist refers to an in vivo prep, they are talking about one in which all the proteins are still inside the cell membrane.
Right. I mean, so, you know, different preparations for different questions. I think what, you know, so I'll never. I'll never ding a preparation. I'll never ding. I'll never. I'll never, you know, find fault with a line of research. I think what gets problematic is when the community forgets that this is a model. This is a limited model of something. Right. And, you know, when the community becomes sufficiently large, it begins to talk only to itself.
And then those questions sort of take on a life of their own.
[00:23:10] Speaker B: Right.
[00:23:10] Speaker A: That is independent of sort of how they were originally formulated as part of sort of an overall research program for a field of, you know, I would characterize it as understanding how brains work to control behavior. Right. Or how.
I mean, there are lots of different ways of characterizing it. But I don't think that most people, when they started their careers, would have homed in on the representation of a static visual image in the anesthetized brain of a cat as sort of the central question that they wanted to set out to answer. That is a super useful question to answer. And it's great to have a model system where lots of people agree on the preparation. You can really make progress. But ultimately, I don't think that was the fundamental question that drove all these people to work on that preparation. It was sort of a model of a larger question. Things like, how do you represent the world outside of you? And then probably people were asking, you know, the original people who wanted to know, how do those visual representations get used by the animal, or what is thought? Right, things like that.
[00:24:33] Speaker B: But do you think that the modern deep learning approach, neuro AI approach runs any risk of falling into that same error of sort of mistaking the map for the territory?
[00:24:48] Speaker A: Because it. That's a leading question. Do you have anything in mind there?
[00:24:52] Speaker B: What do you mean, like specific criticisms?
[00:24:54] Speaker A: I just.
[00:24:54] Speaker B: Because people think, right, that for example, a transformer is like doing cognition or something. And, you know, that's a very simplistic way of saying it, but. But we're still.
It's still a model and it's not the thing. Right. It's not the ultimate question. It's still a model, but it's a lot closer. Perhaps. I don't know. What do you think?
[00:25:17] Speaker A: Well, transformers in particular. Right. I don't. Yeah, Transformers I don't think of as.
In fact, I'll go as far as to say, as far as I can tell, transformers are almost a counterexample to the successes of neuro AI, in that they bear, as far as I can tell, very little resemblance to anything that I expect to find in the brain. And their success basically derives from the fact that they are extremely well matched to our current generation of GPU hardware. Right? And that's great. Like, I am blown away by ChatGPT. That is, you know, everyone.
[00:26:06] Speaker B: It's necessary to state that. That everyone's blown away.
[00:26:10] Speaker A: But, yeah, well, I mean, I, Well, I, you know, I am more blown away than that. Well, I'm especially blown away, I'll say, because right up until the week that I played with it, I was doubling down on my belief that you would never have a large language model that was any good at all without grounding.
In fact, I was. I was having arguments with people who, you know, worked at DeepMind and Google who were already playing with these. I was like, no, no, it won't work. And they were like, no, no, it does work. I was like, nah, nah, nah, nah, nah. And they're like, no, no, you're wrong.
And, yep, I was wrong.
[00:26:52] Speaker B: Yeah, but what, so what do you think the role of ground. I'm sorry, I'm jumping around, but what do you think the role of grounding is now? Do you think it has an importance?
[00:27:00] Speaker A: Yeah, so we can jump to that. I mean, what I think is that ChatGPT taught us something that I don't think we would have learned any other way, and that was not obvious to me, which is that in some sense, language is a closed system, sort of in the same way that, you know, arithmetic is closed under the integers. Right?
[00:27:24] Speaker B: Yeah.
[00:27:25] Speaker A: So, you know, basically, it's very hard to break ChatGPT by asking it questions. Right? It gives reasonable answers and has reasonable things to say about almost anything. And I know there are countless examples. Honestly, the fact that it can't do arithmetic, it literally can't do arithmetic, I don't see as a condemnation at all. I think that's a ridiculous claim. I can only do arithmetic for large numbers if I follow the algorithm for how to add numbers. And that's not what we're asking. That's not what ChatGPT does. So, yeah, I think that what we've learned there is that it can give a reasonable answer to almost any question that's formulated as a string of words.
And that is super interesting and, to my mind, surprising. You know, early on, like with 3.5, you could still break it, which I thought was significant, by asking it something like, what is the problem with making shoelaces out of uncooked spaghetti?
[00:28:49] Speaker B: Oh, that's a good one. Yeah.
[00:28:50] Speaker A: Right. Because that required some knowledge of the physical world and I thought that's what you wouldn't have.
[00:28:57] Speaker B: That's the grounding. That's your grounding.
[00:28:59] Speaker A: That was an instance of grounding. And I was pretty smug. I was like, ha ha.
But no, GPT-4 can give you a long exposition on the problem of using uncooked spaghetti for shoelaces.
[00:29:16] Speaker B: Yeah.
[00:29:16] Speaker A: So I cannot find anything like that anymore that breaks it. You know, there, there are a couple of like the goat, you know, the, the goat, the cabbage and the man crossing a river that is said to break it. Do you know?
[00:29:32] Speaker B: No.
[00:29:33] Speaker A: Oh, okay. You know the classic logic puzzle. The you have a goat, a cabbage and a wolf and you want to get them across a river and you have to like figure out the right sequence to do this.
[00:29:45] Speaker B: Oh, okay.
[00:29:46] Speaker A: It will give you that. What's that?
[00:29:48] Speaker B: Nothing. That. Yeah, I. That's vaguely familiar to me.
[00:29:51] Speaker A: Yeah, yeah, yeah. Okay.
Yeah. And it will, it will give you the right answer, but then if you give it a variant of that, it will not pay attention.
[00:30:01] Speaker B: Oh.
[00:30:01] Speaker A: And it'll sort of answer reflexively and people, various people like Gary Marcus will hold that up triumphantly and say that that's an exception. I think a lot of lazy people who aren't thinking clearly would also make the same mistake. Right.
Or, you know, I can't operate on him. He's my, he's my son. Right, but how is that possible? Right. You know that one.
[00:30:28] Speaker B: No, no, you don't know.
[00:30:29] Speaker A: You don't know these paradoxes. Right?
[00:30:31] Speaker B: Yeah, yeah. Well, I don't. Yeah. I'm not part of the cottage industry of breaking it.
[00:30:37] Speaker A: These are. Yeah, yeah, no, these are, this is the. Anyway, I don't. It's a two minute digression. Not worth. Anyway, the point is that I don't find those breakages interesting.
So, you know, if you ask what is the limitation? It's the things that can't be done and there are hallucinations which are interesting and we can, you know, you know that, that may or may not be. Turn out to be a solvable problem. It seems like people are making a lot of progress on solving them. Solving it.
But the point is, and this I think is important, is that language is only a tiny bit of what we do. Right? And I think for me that's really the key point: we do an awful lot of things. In fact, we are the product of 500 million years of evolution, and language, although I'm very impressed with it and I think it's key to our success, probably emerged, depending on whether you think Neanderthals had language, somewhere in the last couple hundred thousand, maybe million years at most. So we've had 499 million years of evolution, and language is just this sort of extra on top of that. And all that other stuff is the stuff that remains incredibly hard for artificial systems. Right? Like, we have vision systems that kind of work on static images. They're really impressive, right? One fun thing is, someone sends you a picture of themselves standing with some background in some random city, you can upload that picture and it'll say, oh, that's Sofia, Bulgaria. Wow, who would have known?
But, yeah, go ahead.
[00:32:29] Speaker B: Well, yeah, but I shouldn't have said Transformers because like you said, that's sort of an example of the opposite. Right. But I was thinking more in terms of the convolutional neural networks with recurrence and that neuro AI push in terms of understanding our brains.
[00:32:46] Speaker A: Right. So I think, you know, convolutional neural networks are the example that those of us who advocate for neuro AI hold up every single time to the point where anyone who doesn't finds them that example annoying. But it is sort of. And I'll say, actually there's an even more fundamental example which is the idea of neural networks in the first place, right? Like the idea that you're going to compute with a whole bunch of elements that are connected with variable parameters. Right. Like, I don't. It's not obvious that we would have gotten there had we not sort of been inspired by sort of squinting and making an abstract model of the brain.
I get, you know, so if the question is what the question that usually comes up, and maybe this is where, you know, what you're getting to as well. What, what are we going to do in the future? Right? Like what. What is. So here, here's the argument that people usually have, right? Which is, okay, sure, brains early on inspired artificial neural networks in the same way that plane birds inspired planes. Right? But we do not design planes based on careful study of birds. Right? And that's, that's the argument. So one, one argument, one counterargument that Jan Lecun likes to bring up is that apparently aerospace engineers do actually study birds and get cool ideas from them. I'LL I'm, I'm agnostic to that.
You know, I'll defer to him. As someone who apparently has read up on aerospace engineering, for me the more fundamental counter to that is simply that we are not, we are building systems in this analogy that we want to be as bird like as possible, right? We will def, we define success as building the most bird like bird we can, right? So it's true that, you know, a 747 can do amazing things that a bird can't do, right? It can fly whatever 10,000 miles, it can carry lots of tons of cargo, it can go 500 miles an hour, but it cannot. And in the same way computers can do all sorts of things people can't do, right? They can multiply big numbers, they can serve up queries for Google, whatever. But that's not what we're asking. That's not what we would like artificial systems to do better. We would like artificial systems to do better at what birds do, which is to swoop from the sky and pick up a fish, right? And so if we wanted to build a bird like bird that could fly through the forest fairly quietly without using too much energy and swoop out of the sky and grab a fish out of the water, right. Then we would probably do well to look very carefully at how birds do all these things. And in the same way, if our goal is really artificial intelligence, which is the ability to do anything a person could do, I mean that is sort of the, the, the most generic explanation, then we should probably figure out how people do what they do. And I would argue that the path to understanding how people do what they do is to look at how animals do what they do because people don't. People have very little, I would say that's novel over what our ancestors did.
[00:36:42] Speaker B: Separate language that what you refer to as alignment.
So I asked you offline like what you're excited about and one of the things that you listed was alignment that you wrote to me that our current models for how to formulate objective functions for like reinforcement learning and stuff are very limited. And you think that we should look to the animals for that?
[00:37:05] Speaker A: I think that's exactly right. So I think there are, yeah, like I'm particularly interested in three sort of large questions about, you know, where you can hope to gain insights from biology, from neuroscience and bring port them over to artificial systems.
And alignment is sort of, in some sense, the most fundamental, right? So we currently are very good at building systems, or pretty good at building systems, that maximize a well-defined objective function. Right.
And if we say, well, we want multiple objective functions, the answer is usually, okay, we'll just add a couple of terms to the original objective function. Right. Lambda one times objective one, plus lambda two times objective two, and so on. Yeah. Like, choosing those lambdas, that's a very impoverished way of representing objectives.
And you know, I think in most cases that particular expansion hasn't worked particularly well.
It's brittle and it just, it hasn't been effective. So you know, by contrast, animals are necessarily expert, through evolution, at balancing multiple objectives. You know, the so-called four Fs: feeding, fleeing, fighting and romance.
Yeah. So you know, we have to balance all of those. When we're hungry, that might take precedence. But you know, no matter how hungry we are, if we're about to get eaten by a predator, we should probably put our hunger on hold and flee. Right. And you know, romance sometimes takes a backseat to all of those, and comes up only when the other three are dealt with. And you know, these are sort of the top-level objectives, but those are broken down into sub-objectives and sub-sub-objectives. And you know, like humans and other social animals have social objectives that are as compelling and profound as hunger. Right.
There's a sort of biological architecture that allows evolution to sort of introduce new objectives that interact appropriately with the existing ones. I don't think we know how that works in biology and I don't think we understand how to do that in artificial systems.
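As a rough illustration of the contrast drawn above, the sketch below sets the weighted-sum formulation (lambda one times objective one plus lambda two times objective two) next to a simple priority scheme in which an urgent drive such as fleeing preempts hunger. The function names, the drive names, the priority order, and the 0.5 threshold are all hypothetical choices made for this example; nothing here is claimed to be how animals, or any existing system, actually arbitrate between objectives.

```python
from typing import Dict, List


def weighted_sum(objectives: Dict[str, float], lambdas: Dict[str, float]) -> float:
    """Scalarize several objectives: J = sum_k lambda_k * J_k.

    All of the trade-off structure lives in the hand-picked lambdas.
    """
    return sum(lambdas[name] * value for name, value in objectives.items())


def prioritized_drive(urgency: Dict[str, float],
                      priority: List[str],
                      threshold: float = 0.5) -> str:
    """Pick the drive to act on (hypothetical arbitration rule).

    Walk the drives in priority order and return the first one whose
    urgency reaches the threshold; if none does, fall back to the most
    urgent drive overall.
    """
    for name in priority:
        if urgency[name] >= threshold:
            return name
    return max(urgency, key=urgency.get)


# Hypothetical priority order: fleeing preempts feeding, which preempts romance.
PRIORITY = ["flee", "feed", "romance"]

# Very hungry, but a predator is close: fleeing wins despite the higher hunger value.
print(prioritized_drive({"flee": 0.9, "feed": 0.95, "romance": 0.1}, PRIORITY))
# No predator around: hunger takes over.
print(prioritized_drive({"flee": 0.1, "feed": 0.95, "romance": 0.1}, PRIORITY))
```

The weighted sum needs a new lambda, and usually a retuning of the old ones, every time an objective is added, which is one way to read the "brittle" complaint above. The priority rule is only a toy stand-in for whatever richer architecture biology uses.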
[00:40:07] Speaker B: Yeah. If you ask, some of the cognitive architectures were a big thing for a long time and they still are. But one of the issues or one of the things that those people learned in trying to build those systems is that the coordination between the modules is like a harder problem than the actual objective functions in the modules.
[00:40:27] Speaker A: Exactly, exactly. I think that's exactly right. And yet. Right. I think that this is a case, you know, so I make the argument that neuro AI is really, you know, this virtuous circle, this virtuous cycle between taking insights from neuroscience, applying them to AI, using AI as a model of neuroscience. Right. In the same way that I don't think vision scientists understood just how hard vision was until they sort of took their fuzzy ideas and tried to build a system on them, I don't think even the people who work on, you know, various motivations understand how hard it is, as you say, to coordinate them. And we won't really understand how hard it is to sort of generate behavior from an agent that has a whole bunch of objectives until we start trying to build such agents.
[00:41:23] Speaker B: Okay.
[00:41:24] Speaker A: And that will sort of define the problem better even for the experimentalists. Right. We're still at the experimental level, still trying to define what are the signals for reward. Right. And you know, obviously important groundwork, but it doesn't necessarily get us to the really hard problem or what may turn out to be the really hard problem of coordinating a bunch of these things simultaneously.
[00:41:49] Speaker B: This goes back to the question of what you want your artificial intelligence system to do. And I don't know, replicating us, maybe that's not the best use of building these things. You know, why would we want a system to have all the hard-won evolutionary coordination dynamics among our, whatever, 16, 17 objective functions? Why would we want them to have to battle that out? Have to implement all that?
[00:42:20] Speaker A: I mean, I want a robot that washes my dishes.
[00:42:25] Speaker B: Yeah, yeah, right.
[00:42:28] Speaker A: But I want that robot also not to like step on me.
I want it to be aware if my house catches on fire and do something appropriate about that.
[00:42:45] Speaker B: What about the romance part? Do you want that too?
[00:42:47] Speaker A: I do not want romance. But I will tell you that what drives technology traditionally has been romance. Right. If you look at the history of various technologies, it turns out that the rise of VHS was apparently driven by, let's say, romantic movies.
[00:43:07] Speaker B: Oh yeah. Is that true? I didn't know that.
[00:43:10] Speaker A: That is. It is apparently, yeah.
Movies featuring private romance were one of the main drivers of vhs, certainly one of the main drivers of the early Internet. I believe it's still like one of the main sources of traffic for the Internet. I think so, yeah. So. So I have no doubt that romantically inclined robots will, will be a huge market. Not, not my own personal dream, but like.
[00:43:44] Speaker B: Right, yeah. Wash the dishes and have a little romance.
[00:43:48] Speaker A: That's right, but yeah, so, you know, I think in practice the particular objectives that we want that robot to be guided by will undoubtedly turn out to be very different from our objectives. But what's important is not the sort of content of those objectives, but the sort of computational framework for trading them off appropriately. Right. Like, you know, in our objective function, survival features pretty heavily for most humans, but even that can be relaxed. Right. Like an individual ant does not really care that much about its own, I guess, her own survival. Right. Like an individual ant, in large part, I guess, because it's a clone of all the other ants in its colony, is very willing to sacrifice itself for the good of the colony. Right. Yeah, you know, all other things being equal, it'd rather not die. But it's not mostly focused on not dying.
[00:45:02] Speaker B: But there's still something at stake.
[00:45:05] Speaker A: Sure. Yeah. And same thing with a robot. Right? Like, you don't want your robot walking into. Randomly walking into a lava pit.
Right. But you, you do want your robot not to value, you know, I guess, Asimov's three Laws of Robotics. Right. I guess survival, I believe was. Yeah, yeah, don't harm. Don't harm another.
Do. Do what other people tell you unless it violates the first and try to survive unless it violates the other two. I think that's roughly speaking, all right, what they were. But, you know, maybe we need something richer than that and certainly we don't necessarily want it.
Like, given how we.
How we envision programming or instructing our agents these days, it's unlikely that we'll lay it out in words.
Although maybe with LLMs, one of the things.
[00:46:02] Speaker B: I mean, I had you on a really early episode and we talked about your paper that argues that most of what's useful is actually innate due to evolution over time and more recently. So you're still on that.
Not bandwagon, you're still on that idea, but also you've incorporated development as something that would be interesting to study in terms of how is this related, your interest in development? Is it related to this coordination of the objective functions?
[00:46:33] Speaker A: Absolutely.
[00:46:33] Speaker B: Yeah.
[00:46:34] Speaker A: So that's right. The original idea that the original bandwagon upon which I hopped was.
[00:46:41] Speaker B: You built the bandwagon, right? You helped build it.
[00:46:44] Speaker A: I helped, yeah. Well, it turns out other people were on this bandwagon, but yeah, I've been.
I don't know.
[00:46:51] Speaker B: You gave it a fresh set of wheels that there.
[00:46:54] Speaker A: Well, I think, I think, yeah, maybe I put some flags on it. I don't know. Anyway, yeah, so I'm excited by the idea that much of all behavior, all animal behavior, and humans are animals, derives from, you know, deep innate drives.
And this is true at every level, that we just don't have time to learn everything from scratch.
In fact, I, you know, I would go as far as to say that, you know, learning can be seen as sort of on a continuum with, and really an extension of, simply development. So if you buy the idea that most of what we have, most of our neural circuits, and therefore most of our behavior, is determined by our genome, which specifies a neural circuit, then that sort of... It took me a while to come around to the idea that that really requires that you pay some attention to the relationship between the genome and the final brain you get. And the process by which biology takes a genome and produces a brain is called development.
I thought for a while that I could sort of ignore the biology of that, but it turns out that it, number one. Well, mostly I wanted to ignore it because I didn't know much about development and I'm still woefully ignorant.
[00:48:33] Speaker B: Me too.
[00:48:34] Speaker A: Yeah, yeah. But it turns out to be pretty interesting, and there are principles that you can sort of abstract from it and that maybe can help guide your sort of how we approach these problems. And so sort of one of the core principles, for example, is the idea that the process by which you derive a brain from a single cell involves sort of the recursive application of a relatively simple set of rules, and then, when necessary, those rules are modified across developmental time.
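The principle just described, recursive application of a small rule set, with the rules modified across developmental time, can be sketched with a toy rewriting system. Everything in it (the cell-type names, the two rule tables, the five-step schedule) is an assumption invented for illustration; it is not a model of real developmental biology or of any method discussed in this episode.

```python
from typing import Dict, List, Tuple

# A rule table maps a cell type to the daughter types it divides into.
Rules = Dict[str, Tuple[str, ...]]


def grow(cells: List[str], rules: Rules) -> List[str]:
    """One round of division: each cell is replaced by the daughters its rule
    specifies; a cell type with no rule is carried over unchanged."""
    return [daughter for cell in cells for daughter in rules.get(cell, (cell,))]


def develop(schedule: List[Rules], start: str = "zygote") -> List[str]:
    """Apply one rule table per developmental step, starting from a single cell."""
    cells = [start]
    for rules in schedule:
        cells = grow(cells, rules)
    return cells


# Hypothetical early rules: symmetric proliferation.
EARLY: Rules = {
    "zygote": ("progenitor", "progenitor"),
    "progenitor": ("progenitor", "progenitor"),
}
# Hypothetical late rules: the rule for progenitors changes to differentiation.
LATE: Rules = {"progenitor": ("excitatory", "inhibitory")}

# Four rounds of proliferation, then one round of differentiation.
cells = develop([EARLY] * 4 + [LATE])
print(len(cells), cells.count("excitatory"), cells.count("inhibitory"))
```

Three short rules and a five-step schedule specify a 32-cell population, which is the flavor of the argument: the description stays small while the structure it generates grows.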
[00:49:19] Speaker B: But so people like Robin Hiesinger, who would say, and he's very focused on development, and you make this point that there's not enough coding capacity in our DNA to specify the entire structure of, for example, our human brains.
And he would make the point, and I don't know how you think about this, that what the DNA is doing is actually encoding those recursive rules.
And you have to have that development is necessary. You can't just go from DNA to the connection.
[00:49:48] Speaker A: 100%. 100%. So, in fact, Robin was just here for a meeting at Cold Spring Harbor last week, and we had a wonderful meeting of minds. And I'd never met him in person, I'd read his stuff. But we.
And, you know, we're. We're 100% aligned on that. And I would say I'm now hopping on his bandwagon.
Yeah, the development bandwagon. Yeah, the recognition. Exactly as you said that.
So just to back up, like. So the idea that I was pursuing was that the genome is a compressed representation of our wiring diagram.
[00:50:31] Speaker B: The bottleneck.
[00:50:32] Speaker A: The bottleneck. And it represents a genomic bottleneck. And in fact, just this week, a paper, our very first attempt to sort of formulate that rigorously, that paper was finally published. It was out on BioRxiv for, I don't know, the last four years.
[00:50:52] Speaker B: Wow.
[00:50:53] Speaker A: But it was finally published in PNAS, work with Alex Koulakov, who's a fellow professor here at Cold Spring Harbor.
And in that vision, in that version of it, we formulated the problem of compressing a weight matrix by using another smaller neural network. So we're compressing the weight matrix of a neural network by using another smaller neural network to predict the weights of the final neural network. And so like an autoencoder, it's not quite an autoencoder.
Basically you have a weight matrix and the weight matrix has, you know, is N by n. So you have N squared elements.
[00:51:53] Speaker B: Yeah.
[00:51:53] Speaker A: And then you have a smaller neural network whose input is a pair of elements, two indices of the larger weight matrix, and its output is a prediction of that weight. Yeah, and that worked. I mean, we got great results. We were able to compress big matrices from, you know, MNIST and CIFAR and ImageNet by, you know, a factor of 100, a factor of 1000, and the compressed weight matrix basically could perform almost at the same level as the uncompressed one, the one after learning, just out of the box. Right.
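To make the setup concrete, here is a minimal sketch of the idea as described in the conversation: a small network takes the two indices of an entry in an N by N weight matrix and outputs a predicted weight, so the small network's parameters stand in for the N squared entries. The class name `WeightPredictor`, the binary encoding of the indices, the single tanh hidden layer, and the sizes are all assumptions made for this example, and the predictor is left untrained. It shows only the input/output shape and the parameter-count arithmetic, not the procedure or the results of the published paper.

```python
import math
import random
from typing import List


def index_bits(i: int, n_bits: int) -> List[int]:
    """Binary code for a neuron index, least significant bit first."""
    return [(i >> b) & 1 for b in range(n_bits)]


class WeightPredictor:
    """Tiny one-hidden-layer MLP: (pre index, post index) -> one predicted weight.

    Hypothetical architecture for illustration; weights are random (untrained).
    """

    def __init__(self, n_neurons: int, hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.n_bits = max(1, math.ceil(math.log2(n_neurons)))
        n_in = 2 * self.n_bits  # bits of the row index plus bits of the column index
        self.w1 = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        self.b2 = 0.0

    def n_params(self) -> int:
        """Number of parameters in the small network (the 'compressed' size)."""
        return sum(len(row) for row in self.w1) + len(self.b1) + len(self.w2) + 1

    def predict(self, pre: int, post: int) -> float:
        """Predicted weight for the connection at indices (pre, post)."""
        x = index_bits(pre, self.n_bits) + index_bits(post, self.n_bits)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        return sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def expand(self, n: int) -> List[List[float]]:
        """Unpack the top-left n x n block of the predicted weight matrix."""
        return [[self.predict(i, j) for j in range(n)] for i in range(n)]


N = 256                                  # the big matrix is N x N
g = WeightPredictor(n_neurons=N, hidden=16)
full_size = N * N                        # N squared entries in the big matrix
ratio = full_size / g.n_params()         # compression factor for these sizes
print(g.n_params(), full_size, round(ratio, 1))
```

In a real experiment the small network would be fit so that the expanded matrix performs the task; that training loop is deliberately left out here.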
And so on top of that, we showed that these compressed representations sort of led to better transfer learning, suggesting that when you compress a weight matrix, you're throwing away the junk and you're keeping the stuff that's important, generalizable. More generalizable. Yeah, exactly, exactly. So we saw compression as a regularizer, and so that worked. And then more recently with Blake Richards, we had another version of this inspired by some, which is now on bioRxiv and under review, be another six years before it's published, where we use cell types and stochastic connectivity among cell types.
And that also works, and it has somewhat different properties. So.
[00:53:38] Speaker B: So in a sense, you've solved that bottleneck problem.
[00:53:41] Speaker A: Well, so I would say that both of those were fun and great learning experiences. But the stuff that I'm working on right now with a postdoc named Stan Kerstjens, it's really sort of driven by some ideas he had for how you can formulate that developmental process recursively, how you can grow a network using very simple rules. And this network grows and can be guided to produce a final result that solves tasks, and that, to my mind, has captured some of the key elements of development.
And these recursive. What's that?
[00:54:25] Speaker B: Would that process also then be more efficient?
[00:54:28] Speaker A: Well, that's what we're interested in finding out. So what I think is that it represents a prior on the possible circuits. Like any set of rules you have for compression represents a prior over the circuits that you can generate. Right. You might not be able to articulate that prior, but if you have a small thing making a big thing, right, then it's only going to be able to make a subset of the big things. Right. Just by information-theoretic arguments. And the subset of things that it can produce, and which ones it learns most quickly, that represents a constraint on the set of networks and a prior over those.
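The information-theoretic point can be written as a simple counting bound. The symbols are assumptions introduced for this note: k is the number of bits available to the small generating description, N is the number of neurons, and b is the number of bits used per weight.

```latex
\underbrace{2^{k}}_{\text{distinct outputs of a $k$-bit description}}
\;\ll\;
\underbrace{2^{\,bN^{2}}}_{\text{all $N \times N$ weight matrices at $b$ bits per weight}}
\qquad\Longrightarrow\qquad
\frac{\#\,\text{reachable circuits}}{\#\,\text{possible circuits}}
\;\le\; 2^{\,k - bN^{2}} .
```

Whenever k is much smaller than bN², almost all circuits are unreachable, so the choice of generating rules acts as a hard constraint on which circuits can exist at all; which of the reachable ones are found most quickly adds the softer, prior-like preference described above.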
[00:55:17] Speaker B: Okay, yeah, I was, you said the word. I was thinking constraint and.
[00:55:21] Speaker A: Yeah, exactly.
[00:55:22] Speaker B: Yeah. I've come to think of constraints as. It's sort of like that coordination problem. Like constraints are everywhere and like in some sense they're more important than the process is the constraints.
[00:55:31] Speaker A: Exactly, exactly. So, you know, people will argue that the success of artificial neural networks is really that they represent, well, a prior, a smooth prior over data. Right. Like, there's a lot of work on that, but it's the same sort of idea. And you know, you can't know ahead of time what the right set of constraints is, what the right set of priors is. This is, you know, sort of an experimental question. You know, the proof is in the pudding. But in this case, right, the fact is that the prior, at a very high level, from 30,000 feet, kind of looks like the prior that guides the formation of actual neural circuits. Right. The idea that every neural circuit ever in existence in biology arose from a single cell.
Right. And then the set of rules that took you from one cell to many cells has to have fit in the genome that maybe, you know, is one of the key constraints. So anyway, that's something I'm pretty excited about now.
We'll see. We'll see how that plays out. But I feel like that's sort of the right way to go. And there's also a very natural sort of interpretation of that for evolution, because what evolution does, right, is it produces a circuit. So evolution.
So you start with a circuit, and then the circuit is good at doing some things. The, the organism, well, the organism in whose brain that circuit is, is good at performing some behaviors and perhaps not others. Then you select for animals that are maybe somewhat good at performing some other behavior and then they will develop possibly in the next generation. If the rules of development are such that the animals circuits get even better at that behavior, then you select for those. Right. And then you sort of add on to the existing circuit. You add on new abilities, new circuits, but you have to make sure that every generation you can produce an organism that has a brain that can do all the things you want it to do.
[00:57:51] Speaker B: Why do you need development in there though?
[00:57:55] Speaker A: Well, in biology you need development because you start every generation with a single cell. So you have to give a plan for how you go from One cell to a body and a brain connected to the body. I personally focus, I'm at this point more interested in the rules for generating the brain. But honestly, if we're ever going to understand robotics, we might want to think about the fact that bodies also are built that way.
[00:58:30] Speaker B: Yeah, but the AI field, earlier you alluded to your reluctance to take on development and I have felt it scares me essentially because it seems so hard.
But then I imagine the AI field would want nothing to do with development or think that it's something that biological organisms have to go through because they're coming from one cell, but might not contribute to, to AI, for example.
[00:59:01] Speaker A: Sure.
Let me tell you about a case that I've been thinking of. We haven't made any progress on it. So, you know, I'm giving away my research ideas, but that's okay. I think I'd be thrilled if other people pursued them as well. So let's. I've been thinking a lot about robots recently, right.
There's a problem, right, that we don't really have terribly good robots. They're not very good at interacting with the world. For a while, I think, you know, there's been a community of people who are excited about these physics simulators. I've been playing around with them, like MuJoCo, a lot, right? So in these physics simulators you can specify agents that have arms and legs and they're connected by sort of something like muscles and you can apply forces on them and then you can learn policies to control them. Right. In these artificial environments. And you know, I think they're thought to be pretty realistic simulators. They try to make them as realistic as they can, and it turns out to be remarkably hard to build.
Even in these reduced physics simulators, agents that can walk around, right, like that, that it was a real sort of teaching point to me that how hard it is to make a simple agent walk around in a physics simulator.
And so, you know, we fooled around with that a little bit and many other people have done sort of serious work in these. But my understanding is that relatively few serious roboticists spend a lot of time in simulation because the problem of taking the simulated agent and bringing it into the real world is basically unsolved. It doesn't work. It's the so-called sim-to-real problem, right? And so here's my thinking on that: there is a kind of similar, if you like, sim-to-real problem that confronts us, in that we have a genome that specifies a body and a neural circuit, and the specification of the body need not be terribly closely coupled to the specification of the neural circuit. So you are born with a brain that had better be able to fairly quickly learn to control whatever body you are born with. Right.
[01:01:50] Speaker B: Well, there's an exception to this. So I was just talking with Karen Adolph. She studies human motor development and has spent years, you know, studying children and the rate at which they fall and run into stuff. It's just super high, you know, starts super high. Because they're exploring this space, so they don't necessarily have to come out, you know, with. So like a horse, you use the example, can walk within a minute or two of being born. But we're humans, we're awful.
[01:02:20] Speaker A: Yeah. And it's a great question why, why human. Humans are always the exception. So I, and I think it's. It will. Once we understand animals, it will be interesting to understand why humans are the exception to a lot of what I'm saying. Right. Like why is it that it takes us many years to learn how to walk? It's. I think it's pretty clear that it's not because we couldn't have learned to walk more quickly. In fact, I'll go as far as to say as a, as a. Well, my kids are older now, but I remember when my kids were younger and I would have preferred them to take even longer to learn to walk because if they don't have the common sense not to do stupid things. Yeah, right. There's this long. I don't know how. I think you have kids. How old are they?
[01:03:10] Speaker B: 9 and 11. And I'm just. Okay, there's a breakthrough where I'm trying to. I'm trying to make my son ambidextrous, which is impossible. So we've been doing a lot of throwing with left handed and stuff. It's hard but.
[01:03:21] Speaker A: But you remember, you remember that period where they're toddlers, right? Where they now have the ability to locomote but not the good sense not to locomote to the wrong place.
[01:03:29] Speaker B: Right.
[01:03:30] Speaker A: They have the ability to pick something up and put it in their mouths, but not the good sense not to put the wrong thing in their mouth. So you know, I think this long delay before we can even learn to stand is just a reflection perhaps of the fact that the ones who learned to stand too early made bad decisions. Anyway, going back to animals, I would go and point out that you can mate a Chihuahua and a Great Dane. Might be a little awkward, but it can happen. And basically their DNA is compatible, the same circuit. Basically what that tells you is that the instructions that build a Chihuahua brain are essentially indistinguishable from the instructions that build a Great Dane brain. And that's two orders of magnitude, right? When you run MuJoCo, if you build a typical agent in a MuJoCo physics simulator, at least in our hands, if you train the agent to control its body and then you change the body by 10, 20%, it breaks. And we're not talking 10, 20%, we're talking, you know, a Chihuahua is a couple pounds, a Great Dane is 200 pounds. So, you know, this is crazy.
A crazy amount of flexibility.
[01:04:55] Speaker B: Yeah.
[01:04:56] Speaker A: And so how does that happen? Right? I mean, we don't know the details, but the answer is, I think that there is a developmental process, you might even call it a developmental curriculum.
Right. And at each step, you know, you solve sub problems of the overall problem that somehow enable this entire brain body combination to learn to walk and run within, you know, a few months, even a few weeks. You know, I, you know, I think that there are other, you know, I'm picking Great Danes, but I think even those, you know, dogs, and I think even those. Probably the evolutionary pressure wasn't as high to get things moving immediately. Right. You could probably use a pony and a, I don't know what, what large horse. I don't know the names of a large horse.
[01:05:52] Speaker B: Anheuser Busch horses, whatever those are.
[01:05:54] Speaker A: Yeah, exactly. Exactly. So that's probably only one order of magnitude difference in size, but I think the same argument holds. And those guys can walk within a day or two.
[01:06:03] Speaker B: Yeah. This reminds me of. So you use the word curriculum. So curriculum learning, and you can correct me, is the idea, is that idea that you just mentioned of instead of learning to, oh, let's use tennis, tennis serve as, you know, the common example. Right. You don't just go and just do the serve. You, you learn how to stand, you learn how to bend your knees and, and then you put all these, then you do those kind of separately. And that's curriculum learning. And then you can put them together. Exactly. Someone at the was just. Alexander Mathis was just talking about this and how it actually helps to teach an artificial system how to do something like that. So I guess that's what you're.
[01:06:43] Speaker A: That's right. That's right.
You know, there has been a fair amount of work in ML and Machine learning on curriculum learning. Yeah. Usually people push back and say that the failure there is that it's too hard to choose a good curriculum.
[01:07:04] Speaker B: Yeah. Right.
[01:07:07] Speaker A: And I guess I would say that for probably many problem domains,
that's kind of right. Like, if your goal is to learn image recognition, you know, to build a system that does image recognition, you know, and your goal is to train it faster, I guess it's not clear what role curriculum would play there. You might take a guess as to what are good building blocks, but by the time you've tried a whole bunch of curricula, you may as well just have, you know, used your data and trained end to end once. So I think that's kind of how it got a bad name, because it's been applied to solving a different problem from the one that I outlined here. Right. Because where the curriculum is potentially useful, at least one place where I see it as potentially really useful, is if you have essentially the same problem that you need to solve over and over again with slightly different constraints, a slightly different formulation. Right. A slightly different brain, a slightly different body.
Now we're in a range, I think now we're in a domain where that curriculum could really pay off. Right. Because I think. Yeah, go ahead.
[01:08:30] Speaker B: Well, so you're thinking in terms. And I'm thinking in terms of like Karen Adolph's studies and stuff just because she's been on my mind. You're thinking in terms of like taking inspiration from how, let's say an example, like as a baby crawls, the way it can even hold its head is different. Right. So it looks at different things at different times, and then as soon as it can like sit up and move around better, then its outlook on the world, the way it actually takes in the world, changes, and it's scanning a different set of things. But again it's the same sort of problem of like getting visual information in. Right. And then by the time they're walking, so there are like these. Is that what you're talking about? Kind of using that from development as inspiration? Because they're solving slightly different. They're.
[01:09:22] Speaker A: That's right.
[01:09:22] Speaker B: Yeah.
[01:09:23] Speaker A: So. So that's right. They're solving different problems.
And the solution to, to one problem provides a foundation for the solution to the next problem.
[01:09:35] Speaker B: Right. Okay, right.
[01:09:37] Speaker A: And the particular. So this is where evolution comes in is that evolution in some sense is, provides guidance as to the sequence of Problems that were solved. Right. So I don't know if you've yet had Max Bennett on as a guest.
[01:09:55] Speaker B: Yeah, yeah. He's a good book that he wrote.
[01:09:57] Speaker A: Yeah, yeah, yeah, yeah. So he's, you know, he lays out a particular collection of five problems. A Brief History of Intelligence. I recommend it to all.
[01:10:06] Speaker B: He's one of the speakers at NAISys.
[01:10:07] Speaker A: He's one of the speakers at NAISys. Yeah. Beautiful book. Just frustratingly good in that I was thinking of writing a book and now, like, I don't know what I'd write. He wrote a better book than the one I was, a much better book than the one I was envisioning writing.
But in any case, he lays out sort of five big problems that. Yeah, there you go. That's right.
At this point, I've recommended his book so widely, I feel like I deserve like, a fraction of the royalties that he's getting.
[01:10:39] Speaker B: No, it's a very well written book. Yeah. Okay.
[01:10:41] Speaker A: Yeah.
[01:10:41] Speaker B: So the five breakthroughs.
[01:10:44] Speaker A: The five breakthroughs. Right. So, you know, I don't know if it's exactly right, but it has the flavor of, like, a really nice framework to think about the problem. And, you know, the idea is that you can't get to the second breakthrough until you've sort of had the first one.
[01:11:05] Speaker B: Right.
[01:11:06] Speaker A: And then, you know, there's this old adage which has a great. Some truth in it, and we could even talk about why it's not quite true. But ontogeny recapitulates phylogeny. In other words, development.
Development replays the evolutionary history up to a point. And so then sort of the. The natural interpretation is that, well, you have an agent, you have an animal that can perform a bunch of things, and those things are so important, it can sort of do them at birth. And then it kind of learns other things that turn out to be useful. And those animals that are particularly good at learning those other things quickly get selected for. And the fastest way to learn something quickly is to not have to learn it at all, but just stuff it into your genome or stuff it into your genome as much as is possible. And that's how things. That's. That's sort of what led me to say earlier that it's extremely hard to distinguish between development and learning. Basically, the faster you are at learning something, sort of the better your priors for learning it are, the less information you need from the world to learn it. And that all happens by stuffing it into your. Packing it into your genome.
[01:12:23] Speaker B: Are you familiar with Justin Wood? His work He.
You should check him out. So he used to be a staunch nativist, and now he is a staunch empiricist. And it's, it's because of his work he does controlled rearing of chicks. So, like, when they're hatched, they go straight into this automated box where they have complete control over, like, what they see and what they can do and stuff. And some of his research findings have, you know, everything he's finding is leading him to think that everything is learned and nothing is innate. And so you guys should have a conversation, perhaps.
[01:13:01] Speaker A: Yeah. I mean, yeah, I don't think I ever want to have an arm wrestle with the nature-nurture crowd.
The answer is it's both. Right. Of course. Famously, I had a friend who used to always ask me, do you walk to school or carry your lunch?
[01:13:18] Speaker B: Right. That's good.
[01:13:19] Speaker A: I mean, yeah.
But that said, he's wrong.
I have no idea. No idea.
I just googled some of his stuff. So after we're done, I'll take a look at it.
[01:13:34] Speaker B: Sure. Yep.
I don't know. I interrupted us. You were talking about packing information into the genome.
[01:13:41] Speaker A: Yeah, yeah. And I would say that part of going back to AI. Right.
So one problem this curriculum issue could solve or address or provide a way forward on is, like, you could imagine in simulation figuring out what the right developmental curriculum is. Right. And what success would look like is you pick a series of 10 things or 20 things that an agent would have to learn, sub-problems, and the hope would be, right, the expectation would be that if there exists such a curriculum, the sum of the time it takes to learn those 10 things is shorter than the amount of time it takes to learn the thing that you're ultimately trying to learn. For the sake of argument, walking. Right. And then if those 10 things are relatively straightforward, right, you could maybe follow that same developmental curriculum in an actual agent with the idea that, yeah, it's still going to have to like learn the body. Like all the differences between the body it would have thought it had from development or from the simulator and the one it actually got in the real world.
But maybe, maybe those differences, if you break that into pieces, are smaller than trying to learn the whole thing end to end.
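The success criterion stated above can be written as a single inequality. The notation is an assumption added for this note: K is the number of sub-problems in the curriculum (10 or 20 in the example), T_i is the time needed to learn sub-problem i given that the earlier ones are already learned, and T_end-to-end is the time needed to learn the final skill, such as walking, directly.

```latex
\sum_{i=1}^{K} T_i \;<\; T_{\text{end-to-end}}
```

The hoped-for transfer to a physical agent is the same inequality with each term measured in the real body rather than in the simulator; whether it holds there is the open, empirical part of the proposal.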
[01:15:16] Speaker B: But you wouldn't want to create a robot that develops.
[01:15:23] Speaker A: I might want to create a robot that maybe is born into its body, but then learns how to control the nuances of its body. Because really what I want is to be able to build many robots and not have to spend a year each time I build a new robot body.
[01:15:42] Speaker B: Yeah.
[01:15:43] Speaker A: Right, right.
[01:15:44] Speaker B: So you're not 100% on board with all facets of development then?
[01:15:49] Speaker A: No, no, no, look, I mean, that's always the rule, or at least that's always my idea when looking to neuroscience for guidance, for inspiration. Right. I don't even know what it would mean to incorporate all the details from biology into an artificial system. The only thing that has all the details of biology is biology. Right.
[01:16:18] Speaker B: Yeah. Yeah.
[01:16:19] Speaker A: So, you know, I spent a fair amount of my graduate work studying single channels, and I think they're fascinating. I like studying channel kinetics. I think these are fun problems. I don't think they're at all relevant, as far as I can tell, to anything I'd ever want to put in an artificial system. Right. Like, how you make an action potential is a cool problem that stands on its own. I mean, maybe someone else will come around and explain why it is relevant, but I will certainly not want to build an artificial system that has sodium and potassium channels. At least it's not obvious to me why I would. Yeah, yeah, right.
[01:16:58] Speaker B: That's even, that's below my line as well.
[01:17:00] Speaker A: Yeah. And, you know, so sure, I'm open to the idea that neuromodulation is important. Right. Obviously it's super important for how we, you know, how animals work.
I don't personally yet know how to, and this just reflects my ignorance, I'm not saying that other people don't know how to do this well, but I don't know how to abstract the principles of neuromodulation in a way that makes them useful for an artificial system. So I'm not just going to...
[01:17:34] Speaker B: We don't know the principles of how they're useful in biological systems. Right. So I think we're still pretty far away from feeling like we have that even close to being tackled.
[01:17:46] Speaker A: Exactly, exactly.
You know, the tremendous success of convolutional neural networks, which were inspired by Hubel and Wiesel. Right. That is a great example in that not all aspects of receptive fields were stuffed into a neural network. In retrospect, with 20/20 hindsight, we can boil it down and just say, look, it's the idea of translational invariance, which maybe could have come from some other angle. Right. Like, maybe you didn't need to study receptive fields, but that's how the idea was hatched.
[01:18:31] Speaker B: Yeah.
[01:18:31] Speaker A: So that's my attitude toward these things. And I'll just make one other comment, though, about the appeal of a curriculum as opposed to end-to-end learning, and the idea of a curriculum that is rooted in the actual evolutionary path that humans followed to get where we are, and why that would be useful. I think a lot about the example of self-driving cars. Right. So self-driving cars don't work that well.
You know, I was surprised. I just recently read an article. Apparently even at Waymo, which is pretty widely deployed, there's basically literally a room full of people who are helping the Waymo cars out. When they get into trouble, they need grounding.
[01:19:25] Speaker B: Right? Is that what they need, grounding? That's.
[01:19:27] Speaker A: I don't know. Well, we, but no, no, like Waymo, apparently there's like a control center and a bunch of people.
[01:19:35] Speaker B: Yeah.
[01:19:36] Speaker A: Yes. So there are people sitting in a, in a room. This was a New York Times article a couple weeks ago.
[01:19:43] Speaker B: Okay.
[01:19:44] Speaker A: There are people sitting in a room somewhere and every, I think it said every three to four minutes they are called upon to help out one of their cars, which is having trouble.
[01:19:55] Speaker B: I said online, what I meant is real time. So they're like doing this in real time.
[01:19:58] Speaker A: Like real time. Like Waymo is driving around and then, oh no, there's a yellow cone. What do I do?
And then, you know, there's a guy sitting there saying, okay, veer to your left, go. Okay. And then, I mean, I think he has a little mouse or something.
[01:20:13] Speaker B: Yeah.
[01:20:14] Speaker A: So. And you know, I've white-knuckled it in a self-driving Tesla before, and it's an exciting experience. But so why doesn't it work? Right.
You know, the hope has been that, you know, just increase the amount of data by another order of magnitude and another order of magnitude and you'll start fixing all these problems on the long tail. And I guess my argument is twofold.
One is that, again, the reason driving is pretty easy for us is that we already understand everything we need to know about how to parse visual scenes. Right.
We didn't have to learn how to parse visual scenes by sitting behind the wheel of a car. Right? We didn't have to. Yeah. Yeah.
[01:21:05] Speaker B: Well, there's also the frame problem, which is still a problem, and we have solved the frame problem, to know what's relevant when we're driving.
[01:21:15] Speaker A: Sure. Well, more generally, we have solved the problem of being able to figure out, in a novel situation, what's relevant and what isn't. Right. And then we learn the details. You know, I'm teaching my 14-year-old how to drive and it's...
[01:21:34] Speaker B: Talk about white knuckling.
[01:21:37] Speaker A: He's very good.
[01:21:38] Speaker B: Yeah.
[01:21:38] Speaker A: Okay.
But, yeah, so, so, you know, it's definitely a learning experience, but he's, you know, he's. We've gone out a half dozen times and there's been dramatic improvement. Right, sure, yeah.
So, you know, he has to fill in some of the blanks. But I guess the other problem, and in some sense I would say this is an even more fundamental problem, is this. Even if we get to the point where self-driving cars on average make fewer mistakes than people, which we may get to, we're going to be very disappointed if the mistakes they make aren't kind of similar to the mistakes we make. Right. So if a self-driving car on average has a much better track record than a human, but it occasionally just runs down a kid in the middle of the street, and you review the video and you're like, what the heck happened there? Anyone should have seen that. That is going to limit our enthusiasm about adopting these self-driving cars. And I think this is a general principle. So what is going on there?
[01:22:56] Speaker B: Right.
[01:22:56] Speaker A: If the objective function is don't run things over, you're going to get a system, out of many possible systems, that runs things over as little as possible. Right.
In order to ensure that it not only does a good job, but also fails in the same way we do, we need a system that is as isomorphic to the way we do something as possible. Right. It can't just make as few mistakes as possible. When it makes mistakes, they have to be human-like errors. And that... Yeah, go ahead.
[01:23:36] Speaker B: ...with that, though, is that we're not great drivers.
I mean, you and I are above average. Right. Everyone's above average.
[01:23:43] Speaker A: Right. But I mean, we can mostly agree on what are reasonable mistakes to have made. Right?
[01:24:01] Speaker B: Yeah, yeah.
[01:24:03] Speaker A: And I think that's sort of the key point here. Right. The fact that we do make mistakes, those are mistakes of lapses of attention, et cetera. And I expect that those are mistakes that the artificial agents won't make. Right. They hopefully won't get distracted. They hopefully won't be talking on their cell phones. They won't be driving drunk.
[01:24:23] Speaker B: Romancing and driving.
[01:24:24] Speaker A: They will not be romancing and driving.
[01:24:27] Speaker B: That is, that's what we can do while they're driving. Right?
[01:24:32] Speaker A: That's right, that's right.
Just to that point. Right. When we have robots that drive and romance, they will know that they shouldn't mix the romance and the driving, if we can get their objective functions right, if we have a complex multi-objective function. Right. Then they will know.
[01:24:53] Speaker B: Yeah. You're on board with thinking that that is a harder problem than implementing an objective function itself.
[01:25:00] Speaker A: Yeah, yeah, yeah, yeah.
[01:25:02] Speaker B: Tony, what have we missed here? We've gone through those three main things, the reasons that you're excited, maybe the most excited, about neuro AI.
[01:25:13] Speaker A: Yeah.
[01:25:14] Speaker B: Did that paper, the one that maybe was it a year ago now, the catalyzing the next generation of artificial intelligence from neuroscience principles, something like that.
[01:25:23] Speaker A: Oh wow. You know the titles of my papers.
[01:25:25] Speaker B: Something. Yeah, that's close. Did that get much pushback?
[01:25:30] Speaker A: No, no, I don't think anyone bothers. So you know, my take on it is that.
So when I started, like I said, there was widespread agreement among the leading AI researchers, you know, Geoff Hinton, Yann LeCun, Yoshua Bengio, people like that. They were excited about neuroscience, they wanted to learn about neuroscience.
They were very clear that these things were important. I think what has happened is there have been several generations of engineers, AI scientists, who never actually had direct contact with neuroscience. Maybe in an introductory lecture they heard that in prehistoric days, AI and machine learning had something to do with neuroscience, but it was kind of prehistoric for them. And so if you go to NeurIPS today, my guess is that fewer than 1% of the people there have any interest in this potential for neuro to have any impact on AI. And I think that's fine. If my goal is to deploy a commercial LLM for, you know, a recommender system or whatever, or something that digests law documents, there's absolutely no reason that any of these people should be learning about neuroscience. I think even the next level of optimizing algorithms, right, which takes a lot of work and a lot of theory.
It's not clear to me why those people should care about neuroscience. I think the bet that I'm placing, and I think some very small fraction of the overall community is willing to place is that just continuing on our current trajectories without some really new ideas may asymptote before we get to where we want or where they want.
[01:27:50] Speaker B: But you know, I was just talking with Kim Stachenfeld, and one of the things I brought up is, didn't DeepMind in a sense fail, because their original mission was to use neuroscience principles and now they've outgrown that with scale, et cetera? Right. So mostly they've moved on from trying to find inspiration in that sense.
[01:28:09] Speaker A: I mean, have they failed or have they abandoned their mission?
[01:28:14] Speaker B: I use the word failed because it's a hotter take.
[01:28:17] Speaker A: Yeah, well, I think there's no world in which you'd say DeepMind...
[01:28:22] Speaker B: No, no, no. I know, I was just being.
[01:28:25] Speaker A: You're being provocative, but even there. Yeah, no, I mean, my understanding, and she and others know this better than I do, is that they're under some pressure to deliver in the shorter term. And I think there are people there who are disappointed. I think this is happening also at Google Brain and elsewhere: people are being asked to actually get away from the basic research that they were doing and, you know, help build a better... And look, from a corporate strategy point of view, when your profits are only 40 billion a year, you could see how you might get nervous.
Right?
[01:29:07] Speaker B: Yeah, yeah, yeah.
[01:29:11] Speaker A: Yeah. So, yeah, I don't, I, yeah, I think these are strategic decisions.
I think basically there's a gold rush going on with LLMs.
Each of these companies is trying to figure out how to capitalize, how to cash in on that. I don't blame them. Right. Like, if they miss the boat on that, they're cooked. You don't want to fail on that. But I don't think that necessarily has much to say about the sort of medium, 5-to-10-year time horizon of potential impact of the intersection of neuroscience and AI.
[01:29:52] Speaker B: Well, I see that paper getting cited a lot. I've not looked at citation counts. You probably know that, but.
[01:29:59] Speaker A: No, I don't look at them either.
[01:30:00] Speaker B: Oh, okay. It seems to be a pretty popular.
[01:30:02] Speaker A: Yeah, no, I think for a small group of people, it sort of helps galvanize their interest, it helps focus their interest. And I don't think anyone cares enough to say, no, this is a waste of time. Right. The people who do think it's a waste of time, and like I say, I think that's a large fraction of the community, they just ignore it.
[01:30:29] Speaker B: Sure.
[01:30:29] Speaker A: Right.
[01:30:29] Speaker B: Yeah.
[01:30:31] Speaker A: And, you know, the only reason that people would push back, I guess, sort of, now that I think of the sociology of it, is if we were, you know, fighting for limited resources by doing this, we were, you know, getting a larger slice of the pie. And, you know, that pie is so enormous that this is barely even a crumb.
So there's. Right now, it's not in anyone's interest to argue against this research program.
[01:31:01] Speaker B: Yeah. All right. Well, thank you for taking me on that meandering walk through your recent work. Keep up the good work.
I'm glad that you're delving into development and that I'm not, because it's still scary.
[01:31:15] Speaker A: Can I just say one final word?
[01:31:16] Speaker B: Yeah.
[01:31:17] Speaker A: Two words. One is.
Delving is a word that's overused by ChatGPT, and I'm surprised to hear it. Oh, you didn't know that? No, it is the most overused word by ChatGPT. Like, I'm now suspicious that you are being powered by ChatGPT.
[01:31:35] Speaker B: Oh. And so should I remove it from my vocabulary, or do people say it?
[01:31:39] Speaker A: If you don't want. Certainly from your written vocabulary, because that is the.
Yeah. The marker of ChatGPT. Anyway.
[01:31:49] Speaker B: All right, I will undelve.
[01:31:51] Speaker A: Yeah. But what I was going to say about development, and this was the one point I wanted to make, is that I realized that the reason that I was turned off to development is that over most of my professional career, development has been a pretty uninteresting list of molecules.
And so I just thought of it as some of the least interesting, like, important, that list of molecules is super important, but unless you're in the field, right, you don't want to, at least I don't want to, just memorize a list of molecules or learn any new ones. But there's a previous generation, going back even to Turing, who worked on development. Von Neumann and Turing both worked on development. The question of how you build a system from a single building block, how that thing can make copies of itself and self-organize into a global structure, I think is a really interesting problem and not utterly unrelated to the kinds of problems that neuroscientists and AI people think about.
[01:33:10] Speaker B: Yeah.
[01:33:11] Speaker A: And that has sort of been the realization that I've had as I, as I went back to generations of literature before I started, you know, watching or not watching talks about development.
[01:33:23] Speaker B: I didn't know or I had forgotten that about von Neumann. But for Turing, that's like his lesser known but still really cool work on those cascades. Instability.
[01:33:34] Speaker A: Yeah, yeah, yeah.
[01:33:36] Speaker B: Work. Yeah.
[01:33:37] Speaker A: Anyway, okay.
[01:33:39] Speaker B: All right, so thanks again, Tony. Have a great conference. By the way, I'm sorry, I'm not.
[01:33:42] Speaker A: Yeah, I'm sorry you can't make it.
[01:33:44] Speaker B: But I'll see you in a month or so.
[01:33:45] Speaker A: See you in a month.
[01:33:52] Speaker B: Brain Inspired is powered by the Transmitter, an online publication that aims to deliver useful information, insights and tools to build bridges across neuroscience and advance research. Visit thetransmitter.org to explore the latest neuroscience news and perspectives written by journalists and scientists. If you value Brain Inspired, support it through Patreon to access full-length episodes, join our Discord community, and even influence who I invite to the podcast. Go to braininspired.co to learn more. The music you're hearing is "Little Wing," performed by Kyle Donovan. Thank you for your support. See you next time.