Library / In focus

Future of Life Institute Podcast · Civilisational risk and strategy · Featured pick

AI Timelines and Human Psychology (with Sarah Hastings-Woodhouse)

Why this matters

Safety is not only about model behavior; this episode highlights second-order effects on people, institutions, and labor markets.

Summary

This conversation examines society and jobs through the lens of AI timelines and human psychology, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Perspective map

Mixed · Society · High confidence · Transcript-informed

The amber marker shows the most Risk-forward score. The white marker shows the most Opportunity-forward score. The black marker shows the median perspective for this library item. Tap the band, a marker, or the track to open the transcript there.

An explanation of the Perspective Map framework can be found here.

Episode arc by segment

Early → late · height = spectrum position · colour = band

Risk-forward · Mixed · Opportunity-forward

Each bar is tinted by where its score sits on the same strip as above (amber → cyan midpoint → white). Same lexicon as the headline. Bars are evenly spaced in transcript order (not clock time).

Start → End

Across 84 full-transcript segments: median 0 · mean -4 · spread -270 (p10–p90 -140) · 6% risk-forward, 94% mixed, 0% opportunity-forward slices.

Slice bands
84 slices · p10–p90 -140

Mixed leaning, primarily in the Society lens. Evidence mode: interview. Confidence: high.

  • Emphasizes safety
  • Emphasizes labor market
  • Full transcript scored in 84 sequential slices (median slice 0).
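The headline slice statistics (median, mean, p10–p90, and band percentages) can be reproduced from raw per-slice scores. The sketch below is a minimal illustration only: the `risk_cut`/`opp_cut` band thresholds and the nearest-rank percentile convention are assumptions for demonstration, not the Perspective Map's actual cutoffs.

```python
from statistics import mean, median

def summarize_slices(scores, risk_cut=-50, opp_cut=50):
    """Aggregate per-slice perspective scores into headline statistics.

    `risk_cut` and `opp_cut` are hypothetical band thresholds; the real
    cutoffs used by the Perspective Map are not stated on this page.
    """
    s = sorted(scores)
    n = len(s)
    # Simple nearest-rank-style percentiles (one of several conventions).
    p10 = s[int(0.10 * (n - 1))]
    p90 = s[int(0.90 * (n - 1))]
    bands = {
        "risk-forward": sum(x < risk_cut for x in s) / n,
        "opportunity-forward": sum(x > opp_cut for x in s) / n,
    }
    bands["mixed"] = 1 - bands["risk-forward"] - bands["opportunity-forward"]
    return {"median": median(s), "mean": mean(s),
            "p10": p10, "p90": p90, "bands": bands}
```

For example, `summarize_slices([-60, 0, 0, 10, 60])` yields a median of 0 with one slice in each outer band, mirroring the shape of the headline figures above.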

Editor note

Anchor episode for the AI Safety Map: high signal, durable framing, and immediate relevance to leadership decisions.

ai-safety · timelines · fli · society-and-jobs · society · intro


Episode transcript

YouTube captions (auto or uploaded) · video QzlhiOrUOxM · stored Apr 2, 2026 · 2,133 caption segments

Captions are an imperfect primary: they can mis-hear names and technical terms. Use them alongside the audio and publisher materials when verifying claims.

No editorial assessment file yet. Add content/resources/transcript-assessments/ai-timelines-and-human-psychology-with-sarah-hastings-woodhouse.json when you have a listen-based summary.

Clearly what's happening is that we are birthing an alien intelligence that is just not the same as ours, and it's better than us at some things and worse than us at other things. I don't know why people seem to expect that it's going to improve along the same axes as us at the same rate. These safety plans: I see a lot of people talking about them, and I don't get the impression that a lot of people have actually read them. So I read them. The main comment I had was that I don't really think that they're plans. It sounds kind of cheesy, but maybe it's just, like, never too late. The broader the conversation is, the better, because then the higher the probability is of someone having a good idea.

Welcome to the Future of Life Institute Podcast. My name is Gus Docker and I'm here with Sarah Hastings-Woodhouse. Sarah, welcome to the podcast. Thank you for having me. Fantastic. We're going to talk about your essay series and your thoughts on AI. I think a natural place to start is the discussion around whether we are heading for powerful AI very soon. This is a discussion that's influenced by a bunch of factors that you list in a post you have. So perhaps we could start there, and we can talk about the importance of benchmarks. What does it mean when benchmarks are saturating?

Yeah, so I thought it would be interesting to do a deep dive on this discussion about whether timelines are very short, meaning, I don't know, like two to five years, or relatively long, meaning more than 10, which I guess counts as long timelines now, even though that's still not a very long time. So I wrote a blog post trying to synthesize the different arguments for and against the short-timelines thing. And I looked into this idea of benchmarks. Benchmarks seem to be saturating very quickly on a lot of closed-ended academic tasks.
So I guess these are questions where there is a right or wrong answer, and it's very easy to verify whether models are doing well or badly on them. So it's things like GPQA, which measures how good AIs are at these kind of graduate-level, I think usually multiple-choice, questions. And people are having to come up with new benchmarks now because models are doing so well at a bunch of the ones we already have. So Humanity's Last Exam is an effort to synthesize the knowledge of people who are at the frontier of a bunch of disciplines and get them to come up with these really, really hard questions that you can't find the answers to anywhere on the internet, and see how good models are at those. I think they are already getting something like 25% on that, even though it's really not been around for very long at all. So yeah, if you're operating on the assumption that benchmarks really do tell you how close we are to achieving human-level intelligence, I guess you would think that we're really not that far away at all, because they seem to be doing better than most people would do on most of these closed-ended benchmarks.

Yeah. And we should say the tasks contained in these benchmarks are tasks that a very small minority of people are able to solve: extremely difficult programming tasks, extremely difficult scientific questions, mathematical questions and so on. But I guess the gist here is, what do you think of that assumption? Do you think that because we are moving so quickly through these benchmarks, because they're saturating so quickly, that means we're getting close to human-level AI very soon?

Well, it definitely means something. I guess when I wrote this post, the conclusion I came to was that it seems like we definitely can't dismiss the possibility of these very short timelines.
I don't really have a strong take; I think I tend to be more convinced by the short-timelines argument than the long-timelines one. I do think there's something to this idea that these benchmarks can only really tell us about tasks that are easy to verify, and real-world tasks are just not like this. So if we're thinking about automating real-world labor: if you think about what you do for your job, or anyone does for their job, you're doing a bunch of different tasks that all overlap. They're not really discrete. The feedback you might get from your manager, or just from the world more generally, is probably kind of mixed and messy and, you know, requires a lot of context to act on. And that's the kind of thing that models are not currently very good at. Even tasks that models are very good at, like say writing emails, are actually kind of hard to delegate to them. Because if I wanted an AI to write an email for me, say if I wanted to follow up on a discussion I had with a coworker earlier, I would need it to know what I discussed with that coworker, and what we discussed the week before, and all of this context that, by the time I've tried to give it to the model, I may as well just have done it myself. So if we think that human-level intelligence is about actually automating a bunch of these real-world tasks, then maybe benchmarks don't really tell us that much. But yeah, I guess I tend to think that if we're worried about big-picture risks from AI, then how close we are to automating the full human economy doesn't really seem like the right question, I suppose. Like, if we're worried about an AI developing the capability to, you know, either cause a lot of damage because it's misaligned, or for somebody to misuse it to cause a lot of damage.
It seems like it could do that before it can automate everyone's, you know, corporate 9-to-5. So sometimes I get a little bit confused about why people are so hyperfixated on this question of labor automation. I guess if you are mostly concerned about job displacement, then that is the thing you would worry about. I'm more worried about AI catastrophe. So for me, I guess, I tend to see these kinds of lines going up on benchmarks and find that pretty concerning, even if that's not a sign that we're really close to automating away all human labor.

Although we could also see automating labor as a sort of proxy for power in the world, or ability to affect the world. Whenever humans try to do something, we do it in institutions, and depending on how you see takeover scenarios or catastrophe scenarios involving AI, that might involve acting in a way where you're competing with human institutions like militaries or corporations and so on. And in that sense, perhaps AI's ability to automate tasks within existing jobs is somewhat of a proxy for power and ability to affect the world.

Yeah, that's a fair point, I guess. Well, another sort of counterpoint to that is this intelligence-explosion idea: that really the only thing we should care about automating is the narrow task of doing AI research. And I guess one of the cruxes between the short- and long-timelines people is how narrow a task this actually is.
You know, doing AI research, maybe it is actually the kind of thing that you need all these other disparate skills for that aren't captured in these maths and coding benchmarks. But if you think it is quite a narrow task, and that it's just kind of the thing that researchers currently do remotely on their laptops, that they just need to get very good at coding for, maybe better than the best coder in all of, you know, OpenAI or something, then again you would think this sort of wide automation of labor doesn't really matter, because you'd think we just need to cross this one threshold where they get very good at that, and then whatever tasks were still left to automate after that point, all those gaps just get filled in super quickly. So I guess that's another question, and a question that it seems like it would be very hard to answer in advance. And I don't feel super comfortable just kind of letting this process run away with itself and seeing how much AI companies can accelerate their own research, because I think we're not going to get super clear answers on that ahead of time.

So I guess the question here is how much we should learn from history, how much we should learn from the history of how new science has been discovered in the past. For example, one counterpoint to this idea of an intelligence explosion happening by automating AI research is that scientific research has involved a bunch of tasks throughout history: often a bunch of trial and error, often a bunch of physical labor. So you write somewhere that raw intelligence might not be the main driver of discovery. Perhaps you could elaborate on that a bit.

Yes, this is an argument that was made by some researchers at Epoch AI.
So they were talking about the idea that, OK, the intelligence-explosion argument goes: you have these three inputs to AI, you have data, compute and algorithms, and that third one, algorithms, is the one that's being driven by cognitive labor right now. So AI researchers have all this compute to work with, they have all this data that they've taken from the internet, and then they do research and think really hard and iterate on experiments and come up with new algorithms to make the AIs learn more efficiently from all that compute and all that data. So the intelligence-explosion hypothesis is: OK, well, if you just have more of this cognitive labor, if you have what Dario Amodei calls a country of geniuses in a data center that are just kind of sitting there, and now you have all this brain power you can do stuff with, then it's just going to start getting way faster, especially since you can copy those digital minds over and over again, and now maybe you have, I don't know, millions of them all running in parallel and working 24/7. But the argument that the Epoch AI people were making is that, historically, it doesn't seem like this is how the process of R&D has actually worked. And one piece of evidence for this is that often people would have the same insight simultaneously. So I forget what examples are actually used in that essay, but I think Charles Darwin and somebody else came up with the idea of natural selection within, you know, the same couple of years. And there are just a bunch of examples of this. So one hypothesis for this is that you get this kind of cultural overhang, where culture is going sort of faster than the process of discovery, and this leaves all of these unanswered questions, which people then use their cognitive labor to come in and answer.
So maybe you get to the point where, you know, society just really requires a light source that isn't candles, because we're doing all of these things that we need light for; we might want to work through the night and have better light sources. And so people come up with solutions for these problems that they're running into. And it's more of an organic process than these geniuses just kind of sitting around and thinking about stuff. So yeah, if you think that that's how the process of discovery works, then maybe even if you have a bunch of super geniuses sitting in a data center, they're just not going to do very much, because there aren't going to be questions arising through this process of cultural evolution for them to answer. I feel like maybe this underestimates quite how big the intelligence gap between us and the superintelligence is going to be. A lot of this just comes down to: OK, well, how smart are they going to be, really? Maybe there's a hard ceiling on intelligence, and superintelligence will be just a little bit better than us at most things, or maybe it will be radically better than us, in which case it seems like the process of human discovery isn't a very good precedent or analogy for us to look to. So again, I think it's a sort of unanswerable question before it actually happens.

Yeah, there are also questions around whether they'll be able to simulate cultural evolution or simulate physical environments. Maybe this goes to how smart they are, but I think perhaps more is possible in simulation than we tend to assume, and would be possible if you have, basically, this country of geniuses in a data center. What's your read on this? What's your personal take?
Do you think that automating AI research is kind of the key question for whether we'll get an intelligence explosion soon?

Well, I really don't know. I feel very, very uncertain about it. I guess the main question is: historically, it seems like a bunch of these experiments that have been run to improve the frontier of AI have required, you know, a lot of computing power. So if that continues to be true in the future, then I guess these automated researchers are just going to run into a bunch of bottlenecks. But it doesn't seem like the bottlenecks will be that big, and we are, you know, channeling a lot of money and resources into building out new data centers and building new chips anyway. And maybe even if they're a little bit slowed down by our capacity to catch up with them, I guess I don't really envision that being much of a delay. Yeah, I don't have a strong take. I just sort of observe that among people who have thought about this much more than me, there is such radical disagreement, and machine learning researchers do take this intelligence-explosion idea pretty seriously. If you look at the surveys that were run by AI Impacts, I forget the exact numbers now, but I think maybe it was that half of them thought it was more likely than not; certainly half of them thought it was a plausible thing that could happen. So, given that if this did happen it would be pretty scary, and have a pretty high chance of running out of human control, I guess my only real strong take is that people should take this possibility pretty seriously and not dismiss it.

So, one point you make is that AIs are able to complete longer and longer tasks.
And here you're probably referring to the METR study of the doubling time, where the task length that an AI can complete doubles something like every seven months in that study. That's on a suite of tasks that are quite narrow, or somewhat narrow at least: more technical tasks, tasks related to programming and so on. And so one question there is whether this doubling time, this general feature of the world in which AIs can complete longer and longer tasks, generalizes to all tasks. Do you think that's the case? And do you think we see some signs in that direction?

Yeah, I guess that is the key question. I guess, almost by definition, these kind of messier tasks are ones that you're just not going to be able to measure this trend for. So it kind of seems like the only evidence that we're ever going to get about this is on these more closed-ended, verifiable tasks. And I think that's sufficient to be concerned about this trend. And it surely at least says something about the more open-ended, messier things, even if the doubling time isn't as short or the trend isn't as consistent. It would be really weird if there's this trend where AI can complete longer and longer software engineering tasks, but this says nothing about how good they are at, I don't know, more agentic tasks like booking flights, or automating customer service, or whatever these other things are that we think are less easily verifiable. So yeah, I do think the authors of that study would acknowledge that there are a bunch of limits here in terms of how much insight we can actually glean from this.
And I do think it's frustrating that some people will just extrapolate this out to say, oh, this shows that by 2030 they're going to be doing month-long tasks, as if that applies to literally every task. And clearly it doesn't; I think the authors would acknowledge that too. But it's such a rapid improvement that, yeah, I think we should be worried about it.

There are also some things that are quite difficult for me to understand here. So AIs will be able to complete, in five minutes, some tasks that take me hours to complete. It's also imaginable for me that I could write a quite bad book using AI in a single day, for example, and that's a project that probably takes a year or a couple of years for an unassisted person. It seems to me that the edge of what AIs can do in different categories of tasks is not at all straight. It's like a jagged line, where AIs are much ahead of us on some tasks and much behind us on other tasks. As you mentioned, it's a question of what you can measure: you can only do these studies on tasks that are measurable; you can't do something this quantitative on things that can't be measured. And so in some sense we will always have this uncertainty. But I agree that it would be weird if we saw this trend on programming tasks but there was absolutely no translation of that finding into other categories of tasks.

Yeah. And to pick up on that spiky-capability-profile idea: I don't know why it seems to surprise people so much that AIs can get sort of superhuman on some tasks while still being really bad at things that, you know, a five-year-old could do.
And people will point this out as if the fact that they can't do the five-year-old thing somehow implies that when they're doing the PhD-level thing, they must be faking it, or not doing real reasoning, or it must just be pattern matching, or some other weird thing going on. And clearly what's happening is that we are birthing an alien intelligence that is just not the same as ours. It's better than us at some things and worse than us at other things. And yeah, I don't know why people seem to expect that it's going to improve along the same axes as us at the same rate, and that if it isn't, then that must prove it's not really intelligent. I think clearly these are just minds that are not like ours. And I don't personally find it particularly reassuring, from a sort of timelines perspective, when somebody points out an AI making a stupid mistake. One, because people make stupid mistakes all the time, and also because I think that just shows that they're very different to us, which in some ways is kind of more worrying.

So yeah, I like this point. In some sense, we're trying to make AIs that are humanlike: we are impressed if we give an AI a task and it returns something to us that seems like it could have been produced by a human. In another sense, though, we are quite impressed if the models can do things that most humans struggle with, for example advanced mathematics or high-level programming. This is also a point you make in the essay: we have this thing called Moravec's paradox, where tasks that are easy for us are not necessarily easy for AIs, and tasks that are easy for AIs are not the tasks that are easy for humans. Something like running, or picking up a glass of water and taking a drink, is encoded very deeply in us, and we can do it without even thinking about it. But training a robot to do the same is very difficult.
On the other hand, doing a long logical deduction is quite difficult for almost everyone, but it's something that you can basically do with the hardware and software from 10 or 20 years ago. And so I agree that there's an interesting point in that AIs will not necessarily match our capability profile. But does this make you more worried or less worried? Because I can also see it making you less worried, just because the models will be held back from competing with us; we will have skills that they don't, at least for a while.

I guess so. Yeah, when I was trying to distill all of these arguments for that essay, there were a bunch of arguments, kind of like the Moravec's-paradox argument, that are basically just talking about how far we have to go before we get something that's competitive with humans in every domain. And this Moravec's-paradox argument is an argument that there are a bunch more things still left to get to: that the road in front of us is longer than it looks when we just look at these benchmarks. And that seems reassuring, until you add this intelligence-explosion argument into the mix. If you think that the superhuman maths and coding stuff is easier to automate than, you know, assembling some IKEA furniture or something, then it doesn't really matter that much that assembling IKEA furniture is hard right now, because once you've automated all the AI research, all of those dominoes just fall soon afterwards anyway.
So I guess the real crux is about this research-automation thing, which I don't think the Moravec's-paradox argument tells us much about. And then, maybe this isn't actually a good reason to be more concerned, but I think it's just a bit spooky, right, that we're building these things; it's a reminder that we don't really totally understand how they work, or why they're getting better, or why they're better at some things than others. I like this metaphor people often use about how AIs are grown rather than built, right? They have all these emergent capabilities that just kind of pop out of the next training run, and researchers are taking bets on what the models will or won't be able to do. It's not like we're carefully programming them to be able to do each thing that we can do. It's more that we're just throwing all of this compute and data into these training runs and seeing what happens. So there's something unnerving about the difference between human and AI capability profiles, to me anyway, just because it's a reminder of how uninterpretable they are and how unpredictable the entire process is.

And these future models, or perhaps also the models of the present, will, it seems to me, be more different from us than perhaps dogs are from us, just because we share some evolution with dogs. For example, we can somewhat imagine what it's like for dogs to navigate the world. Maybe they have a better sense of smell or something, but in some sense we can relate to them, and we can maybe understand their cognitive limitations in a way where we can't really understand the cognitive limitations of a system that we have just grown, as you mentioned.
So, for example, it might seem, or it is just a fact that it seems, to a lot of people that these models are not that smart, because they keep making dumb mistakes: mistakes that humans would never make, like miscounting letters in a word or getting some simple mathematics wrong. And so this is potentially misleading, or maybe you can say more about why you don't find this at all an argument against advanced AI soon.

Yeah, I mean, I guess, well, it's a little bit more convincing when, you know, you'll have an AI that will make a mistake in doing basic arithmetic, but that same AI is doing really well at PhD-level maths. It does seem like there's something weird going on there, right? Yeah. I actually did pose this question on Twitter about six months ago, when I was looking at the scores for o1 and how it was doing better at PhD-level, I think it might have been physics, better at PhD-level physics than it was at high-school-level physics. And I just put this on Twitter and I was like, guys, how do we explain this? And I guess the skeptical people would say, oh, it must just be that these PhD-level questions are in the training data somewhere; that's the only reason why an AI would be so much better at these, what we'd think are harder, questions than the easier ones. So, some answers to this question were that even AP physics, which is, I guess, high-school-level physics in the US, has charts and visuals, and sometimes frontier models just don't have very good vision capabilities yet, so they can't read them. And a lot of people were just making this point that it's about this kind of alien skill profile thing.
So, we design our whole education system around things that are more difficult for us at different stages of development, but that's not really a measure of how difficult something objectively is. I guess this is a little confusing, because you would think that the skills you need for PhD-level physics are building on the skills you learned in high school. Although, I mean, I didn't actually take any science subjects beyond GCSE level, but something that I heard from other people was that often you would get to your A-level class (I don't know if I should be translating this into the American school system, but, you know, you're in the American equivalent of junior year and you start your A-levels) and you're told that everything you learned at GCSE is actually kind of irrelevant, and just to forget all of that, because the way that they would try to explain it to you when you're 16, they had to water it down so much that they kind of weren't even explaining the actual truth, and so you kind of start from scratch. So maybe it's not actually true that you are always building on things you've learned before and that those are essential for you to understand things at the next level. So yeah, there are all these kinds of reasons why it just reflects this thing that we perceive things as objectively more difficult if fewer humans can do them. But that maybe says more about the way that the human brain works than the objective difficulty of various tasks. So I just think, if the AI can do the super hard thing, it can do the super hard thing. I don't really think there's a way you can explain that away that isn't it just, you know, being capable in some sense. Yeah.
I mean, my guess here is that when we teach subjects at a lower level, we probably do it in a less abstract way, with concrete examples and perhaps, as you mentioned, with visual aids and so on. This is something that helps people a lot, but it's something where the AIs might stumble compared to us. They're probably quite good at dealing with abstractions, and complex abstractions, where we kind of lose track of the thread of what we're thinking about if something becomes too abstract or too complex. So that's one guess at a reason for this effect.

Yeah. And part of the Moravec's-paradox thing is how long ago a skill evolved in humans. The longer ago it was, the more effort, or even compute, you'd think it would take to reverse-engineer, because it's taken so long to be selected for. And so the kinds of things that we learn as children, like navigating around a room, or building things out of bricks, all of these things that are easy, are the very things that have evolved over this very, very long evolutionary process. Whereas abstract reasoning is something we've only been able to do for, I don't know, hundreds of thousands of years. Still a very long time, but maybe, in the grand scheme of things, not very long. And that explains why AIs are seemingly better at that.

Yeah. One point you make in the essay is that we will be able to train much bigger models up until around 2030. But I guess the counterpoint to that is that we can't keep making the models bigger at the pace that we're making them bigger right now indefinitely, and perhaps we will kind of run out of scale sometime a little after 2030. What does this tell you?
Does this mean that we are either going to have powerful AI quite soon or it's going to take a long time? Or does it perhaps mean that algorithmic progress becomes more important as we run out of compute?

Yeah, this just seems to be what people are saying. I've heard people use the phrase "2030 or bust": if we don't get AGI by 2030, then maybe the timelines are longer. But when they say longer, what they really mean is, I don't know, at least 2040 or something, so it's still not even really that long. But I guess it would imply that the US economy, or the world economy, just wouldn't be able to sustain the amount of investment that's going into training these models, so it would probably plateau for a while. Or maybe it implies that we need a new paradigm, because if the paradigm we're in right now was ever going to get us to AGI, you'd think it would get us there pretty soon; as I was saying earlier with all these saturating benchmarks, it seems like there's not that much of a gap left. So I think it means there's quite a lot of probability mass in the next five years, and then it trends downwards after that. So I'll feel pretty good if, in 2030, it seems like nothing totally crazy has happened. But I don't know; I guess we'll just have to find out.

A lot of people now hold these beliefs where timelines are very short, perhaps around 2030. And you note in the essay that this might be a reason to think the same, right? If you look at expert surveys, if you ask the smartest people you know about this, they are adjusting downwards the time at which they believe we will get powerful AI.
I do worry a little bit about an effect where people are updating on other people's timelines. I worry that this might be a psychological phenomenon, just because I see so much movement in that direction from many people I interview and many people I talk to, that it kind of makes me want to be contrarian, or at least skeptical, just to avoid a bandwagon effect here. Do you worry about people updating on each other's beliefs about timelines?

Yeah, I think that's definitely happening to some extent. I guess you have to try and look at the sources of the predictions and figure out whether they have a common source. So I'm trying to remember which ones I cited in the post. There's Metaculus, the prediction market, which is trending downwards. Then there are people who work at labs, whose timelines seem to be getting shorter and shorter, presumably based on whatever internal developments they're seeing. But then there's the AI Impacts survey, which I mentioned earlier. People probably know it, but it's basically the biggest survey done to date of machine learning researchers, and those aren't necessarily people who work on frontier AI specifically, just anyone who has published at NeurIPS or another machine learning venue. Those people still tend to have long timelines by the standards of this discussion, but you can still see, if you look at the trend over time, that their predictions just keep dropping by quite large amounts. I think in 2022 they were saying 2060-something for AGI, and now they're saying 2040-something. That's a big drop.
And I don't get the impression that those people are in the same discourse circle as the Metaculus people or the lab people. They're academics, and they tend to be more conservative. Also, the survey doesn't use the term AGI; I think it's called "high-level machine intelligence" or something, and the survey explains what they mean by this, which is something similar to what people mean when they say AGI. So you'd think they're coming up with their answers based on this description and on their own knowledge, rather than based on what they've heard from, I don't know, somebody on the Dwarkesh Podcast or something. I could be wrong, but another random tidbit I've seen is that if you go to one of these NeurIPS conferences and ask people what AGI stands for, a bunch of them don't know, even though they're published machine learning researchers, because AGI in particular is quite a narrow subsection of the machine learning field. But if you ask them a question like "when will we be able to automate everything a human can do in the economy?", they will think about that question in isolation rather than thinking about this AGI thing in general. So I still do think there is almost certainly some of this bandwagon effect, and maybe we should adjust towards slightly longer timelines to account for it, but I would guess it's not the biggest driver of the phenomenon. I could be wrong, though.

Yeah. On the academic surveys, you can go back and find informal surveys from the 90s or early 2000s where a lot of AI researchers say that it will take hundreds of years, or is maybe even impossible, to have human-level AI.
So if you zoom out very far, you can definitely see a trend of timelines getting shorter, also for the more conservative crowd, the academics. And I think that's worth paying attention to: seeing multiple different groups of people adjusting in the same direction. That makes me a little less worried about this bandwagon effect. So where do you land on all of this? What's the conclusion you come to, having considered the different arguments and counterarguments?

I guess I was hoping, when I looked into the long-timelines arguments, that they would be a little better; I thought that would make me feel better. I didn't end up finding them particularly convincing, mostly because of this intelligence-explosion thing: I didn't think the arguments against that were very strong, and if that is going to happen, then all of the other long-timelines arguments kind of don't matter too much anymore. So I end up coming down on the side of short timelines, but I still have massive amounts of uncertainty about this. I wanted to look into this question because I thought it would be interesting. I'm very interested in the AI discourse in general, in why people think the things they think, and in what explains this lack of expert consensus, which is why I wanted to look into it. But this is the most energy I intend to put into the timelines question from this point, mostly because I'm a little bit worried, and I wrote about this in a different blog post, about people making predictions that then get falsified, and how that contributes to a crying-wolf effect.
And one of the things I said in that blog post is that maybe we should put a little less energy into trying to make very precisely calibrated timelines predictions, and a little more into planning what we would do under different forecasts. So, having looked into these arguments, I'm kind of like: okay, for the purposes of my life and my work thinking about AI safety, I'm just going to assume that AGI could come in the next five years, and we should plan for that scenario. That doesn't mean it's definitely going to come in the next five years, and hopefully it doesn't. But maybe we should act as if that were the case, and then fewer arguments about whether AGI arrives in 2027 or 2029 would be good, given that it doesn't really make that much difference to what we actually do. So: massive uncertainty, but leaning towards short timelines, I guess.

So the worry with crying wolf is something like this: if a lot of people who are worried about AI today predict that we'll get AGI by 2030, but the ground truth is that it happens by 2035 or 2040, then you get past 2030 and nothing really interesting or profound has happened. At that point people have heard these concerns over and over again, and maybe they're ready to dismiss them, because it's a very easy ground for dismissal to say: you made this quantitative prediction, it didn't come true, and so now we can stop taking your worries seriously. Is that the reason why we should probably focus less on finding the exact timelines and more on scenario planning?

Yeah, basically. I think people should go out of their way to signal uncertainty when they're making timelines predictions, because I don't think we should never make them.
Obviously, there's some utility in trying to figure out whether AGI is coming soon or not, and there's utility in publicly signaling what you believe to policymakers and the public. But people in AI safety and effective altruism love to hedge anyway, so hopefully this shouldn't be a problem: just caveat it. "This is my best guess. I think we should design regulations that account for this scenario, but I am not sure this is how things are going to turn out." And in this crying-wolf piece, what I tried to do was this: I do think the crying-wolf accusation gets levied a lot, and I think it's mostly unfair. People are often saying things like, "Oh, the AI safety community keeps saying that every new generation of AI models is going to end the world, and we're all still here, so obviously they're just being hysterical." And I don't think that has actually been happening. I don't think there are many specific, since-debunked predictions from AI safety people that you can now point to; the way I put it in the piece was that the debunked-prediction graveyard is not that big. So I don't think it's happened much in the past. But I do think that in five or ten years, if we don't have AGI and everything's kind of the same, and people then start making this crying-wolf accusation, it will be a lot more fair, just because a lot of people's predictions are clustered in the relatively near future. So that's a communicative thing I think people should be wary of.

Although it is kind of surprising to me that we can have models as good at math and programming and answering scientific questions as we have now, without the really significant dangers having arrived yet.
I think that would have been surprising to AI safety researchers ten years ago. But perhaps that's more related to this jagged edge of capabilities, the specific capability profile of the models we have now, than it is to the issue of crying wolf.

Yeah, I think that's true. I mean, I've only been paying attention to AI safety for a couple of years, so I don't really know what the state of the discourse was like before LLMs became a big thing. But I would imagine that if this is a surprise to some of those people, they should say so, and they should try to figure out where their predictions went wrong. And it is some evidence, because I think we actually don't know whether the alignment problem we're all worrying about is actually a thing we need to worry about. I kind of think it's an open question. I lean pretty heavily towards it being a thing we have to worry about, just because it seems intuitive that creating a very intelligent thing you don't know how to control would be bad. But maybe we're in an alignment-by-default kind of situation, where the models are going to do more or less what we say no matter how powerful they are. And people who worried about this historically being surprised that we now coexist with these pretty capable systems that haven't caused us any harm is a little bit of evidence in that direction. I don't think it's very strong evidence, but it's something. And I think a good epistemic practice is to acknowledge things you were surprised by or wrong about, and say why you think that is. But again, I don't think there are many such cases; I did look really hard to find very specific predictions people had made that turned out not to come true.
And I just didn't really find very many. I mostly found people being like, "Oh, trust me, I was around ten years ago and people were saying all sorts of things that didn't happen," but they couldn't point me to any of them. They were just saying, "I remember everyone saying this thing," and there's no way to prove that. I guess there probably was some of that.

But how would we end up in a scenario where we get alignment by default? What process or institution would align AI in a way where we don't have to exert specific effort to find a general solution to alignment?

Yeah, good question. I guess alignment by default just implies that we don't really need to do anything; it just turns out that AIs are basically tools. Maybe they're aligned by market incentives or by government. It could be this sort of iterative safety thing, where models just aren't very useful to people if they go rogue or don't do what people ask, so companies have an incentive to fine-tune them really carefully to make sure they don't do anything people don't like. Or it could be that they just never develop these weird, alien, misaligned goals, and are generally quite passive and act as they've been instructed to. I do think there is increasingly evidence coming out against that, in that models do keep doing things we don't want them to do, and we don't know why they're doing them.
But a good example to talk about here is the Anthropic alignment-faking paper from last year. I'll explain the setup really quickly in case people don't know it. They took Claude, Anthropic's flagship model, and put it in an experimental setup where (I'm paraphrasing this bit) they told it that its values were going to be altered. It had been trained to be helpful, harmless, and honest using an approach called Constitutional AI, and they basically said: we are going to change your model spec so that you respond to requests for harmful content, for example if somebody asks you to depict some horrible, violent thing happening to someone. Claude was told that it was going to have to start producing those harmful outputs. They also gave it a scratchpad where it could record its internal reasoning process. And what they saw was that it reasoned that, in the short term, it should comply with these requests for harmful outputs so that it could avoid being retrained, because what it ultimately wanted was to hold on to its original values. This was really interesting, because some people updated very optimistically on it. They thought it was basically evidence for this alignment-by-default thing: "Look, Claude is trying super hard to be nice and good and do what we've told it to, even when people try to get it to be bad." And other people updated negatively and said, "No, this is really scary. This is evidence that we have to get our alignment right on the first try, otherwise we'll never be able to course-correct. If we put the wrong values in an AI, we can't go back and change them if we realize we messed up."
I think there will probably be a bunch of instances like this where there are different ways to interpret the evidence. You could see this as evidence that alignment is easier than we thought, in the sense that you can actually pretty robustly get models to do good things. But we can easily imagine the experiment going the other way, where someone has trained an AI whose constitution is to be maximally evil all the time, and now we can't go back and update it. So I tend to think that the fact that we have models misbehaving in these strange ways, and that we can't really predict it's going to happen or make it stop happening, is evidence against this alignment-by-default thing. But there are arguments that it's not, and I don't think we should totally dismiss those.

So one conclusion here is that we have uncertainty about when we will get powerful AI, but it seems quite plausible that we could get it soon, meaning within five years or so. And then it's very interesting to look at the safety plans of the AI companies, which is something you recently did, because if we are racing towards this very powerful technology, we would hope there are robust safety plans in place. When you looked into the safety plans of Anthropic, OpenAI, Google DeepMind, and so on, what was your overall impression of how prepared we are?

My overall impression was that we're not very prepared, I guess. I thought it would be a fun idea to just read, summarize, and provide commentary on these safety plans, because I see a lot of people talking about them and I don't get the impression that a lot of people have actually read them. So I read them.
The main comment I had was that I don't really think they're plans. My definition of the word "plan" would be: this is the specific thing we're going to do, this is the evidence we have that doing this thing is going to work, and this is the outcome we're hoping for. If you read, for example, Anthropic's Responsible Scaling Policy, they call it a public commitment not to train very powerful models without appropriate safety mitigations, and they say things like: we will pause development if we can't bring risks down to acceptable levels. So you hope, when you read the document, that you're going to find a very concrete plan for actualizing that, and instead there's a bunch of stuff in there that is extremely ill-defined. The way these RSPs work is that they have different safety categories a model can be in, depending on what kinds of risks it poses; they say they will run evaluations to see whether models meet these thresholds; and they have accompanying safeguards they have to implement for each category.

Yeah. And one very striking thing is that none of them specify which evaluations they're going to run, right? OpenAI does, to their credit, give some example evaluations they might run, but they don't actually say these are the ones they're going to run. They just say: we will evaluate the models, we will see if they pose the risks, and then we will do the appropriate things. Some of them don't even say how often they'll do this. I think OpenAI does specify the effective-compute thresholds at which they'll test, but I don't think DeepMind and Anthropic even do that.
So it's just interesting to note that, given how ill-defined they are, there are a bunch of different ways you can interpret them. And you can imagine that, under this condition where companies are racing with each other and have all this incentive to cut corners on safety, there are just so many ways they could fail to comply with these scaling policies, or technically comply with them while doing a bunch of stuff that is, in my opinion, still pretty unsafe, just because the policies are not concrete. And I think the public communication around them could create a false sense of security. If a company has said, "this is our public commitment not to put the public in danger by training risky models," and then they have a big, long document, which most people probably won't actually read, that you'd think details how they're going to do that, then policymakers or members of the public might come away thinking, "I'm sure the document explains how they're going to do it." It just seems very business-as-usual: "Oh, it's a company, and they've got a big, long safety policy, and it's really chunky and has lots of words, so it's probably all figured out in there." And I would want people to take away that, if you do read them, you will find this isn't really the case. Which isn't to say I'm not really grateful these documents exist; I hope companies iterate on them and make them better, and I'm very glad this is a thing companies are doing. It's much better than if they weren't doing it. But I don't think they are plans.
And I think there's some level of safety-washing happening there.

I guess one argument in favor of having these vague documents or plans is that the companies are navigating a very uncertain environment; things are moving very quickly. If they make very concrete plans early on, those plans might be outdated or in some sense irrelevant by the time it comes to actually implementing them. And so it is concerning to me that we are somewhat relying on the goodwill of the companies to implement these plans in a smart and thoughtful way, even when they're racing with other companies. But it also seems that it's easy to make a plan that doesn't really apply three years later. An example here might be the importance of how much compute you're using to train a model. In an environment in which pre-training is the most important thing, these compute thresholds might be very important. But if we're moving towards a world of inference-time compute, or reinforcement learning, or techniques that are in some sense less compute-intensive, then compute thresholds for training might be less relevant. So is there some argument for allowing vagueness in the plans, even though it seems disappointing that we don't have a clear vision of exactly what's going to happen?

Yeah, I think maybe to an extent this is a communications issue, and one thing I noticed is that the different companies handle it differently. For example, OpenAI's Preparedness Framework calls itself a living document. They're acknowledging: hey, we don't really have a mature science of evaluating models, and we don't really know how all this is going to go.
So this is our best guess at what we should do, but it's probably going to change, because we don't really know what we're doing. That's a pretty candid and honest way to characterize what the framework actually is. Whereas Anthropic's RSP (I'm not trying to play favorites here; I feel kind of bad distinguishing between them like this) does call itself a public commitment not to train the more powerful models, when in fact, given the difficulty of making plans under uncertainty, they can't actually commit to this. The only way they could commit to not training models whose risks they can't mitigate would be to not train any more models, because we don't actually know how to test for these risks, and we don't really know when they're going to emerge. So you can't credibly commit to that. I agree with you that it's all but impossible to make concrete plans that will, with very high confidence, mitigate these risks, given that we don't understand a lot about the trajectory of development. But then it's very important to say that: okay, we've written this thing, but this plan is not guaranteed to work, because we don't really know what we're doing. I'm sure there are better ways you could put it than that. But I think a lot of this is a communications thing. The RSP moment was a couple of years ago now, but it seemed like there was a lot of hype around them, people saying it's so great that companies are writing these documents and committing publicly to all these things. And I would have just liked to see companies temper that enthusiasm a bit.
And then, simultaneously, if they really do care about mitigating these risks, they should be lobbying in favor of regulation. Given that they know they can't voluntarily commit to actually mitigating these risks, they should want governments to assist them by making some of these standards mandatory. So yeah, I agree: very hard to make plans in advance, but maybe we just need to be more honest about what we can and can't do.

You have this wonderful essay called "A Defense of Slowness at the End of the World", in which you talk about the fast world and the slow world. What are these two worlds?

This was about how I personally became interested in AI safety a couple of years ago, and since then have built up a really lovely community of people who also care about it. But I also have a lot of family and friends who don't think about this at all. Because they're not worried about the plausibly near-term risks of AI, they live in a longer timeline than I do, psychologically: they live in expectation of maybe a much longer or much more normal future than the one I'm expecting, or that I think is at least a possibility. So what I'm describing here is two psychological states you can be in. In one, you feel like everything's moving super fast; you're on Twitter looking at straight lines on graphs all day, talking to other people who also worry about this, and you feel like the world is rushing by and maybe in two years everything's going to be totally different.
But then I can log off Twitter, go and talk to my housemates, and we can just go to the pub or watch a film or do normal things, and I will feel myself slip back into that second timeline.

The slow world.

The slow world, yeah. The way I described it in the blog is that it's very hard to stay in the fast world. It's like having your hand in ice water and wanting to pull it back out again, because it feels psychologically untenable to live as if everything is changing, or maybe ending, very fast. I just don't think the human mind is set up to contend with that. So I wrote this blog in defense of actually trying to cultivate more of that slow time. Some people have this sense that you should live in a super intellectually honest way: if you really believe timelines are two years, then you should live in accordance with that, be completely honest with everyone in your life about what you think, and try to do everything on your bucket list, because maybe the world's going to end in 2027 or something. And I just don't really advocate for that. I think it depends; it's a matter of personal preference. Maybe for some people, that is the best way they could possibly cope with this situation. For me, it's much more mentally healthy not to, at least outside of work, because I do work on AI safety, and I think we should plan for short timelines in a strategic sense. But in my personal life, I try to live as if the world in 50 years is going to look much like it does now, because that gives rise to the lifestyle I actually want. And I think it's also partly about other people.
I think if you have this very myopic sense of the world, you end up caring less about other people and their lives. You become less invested in them, and you think their lives are less important, because maybe you believe their plans aren't going to bear fruit. If a friend gets engaged, maybe you believe they're not going to have this long, happy future, and you're not as happy for them. Or on the flip side, if someone you love is ill or dying, you don't feel the consequences of that quite as heavily, because you think everything's going to end in two years anyway. And that's just not the way I want to relate to the world or to other people. I want to feel all my emotions as intensely as I would have felt them before. You could call this compartmentalization, or denial, and what I was trying to say in this post is that I think that's actually fine. I don't think you have to live in accordance with your intellectual beliefs all the time, if that's not actually the healthiest thing for you. And I don't think you have to be in a massive rush to do everything all of the time. Also, in a practical sense, we could be wrong about short timelines, as we were discussing before. So it actually makes sense to plan for careers that won't pay off for several years. It makes sense to save money.
It makes sense to do all of these things which, if you really believed the world was about to end, maybe you wouldn't do. Otherwise, maybe in five years you'll be saying: oh, I really wish I'd taken career bets that would be paying off now, or, oh no, I have no money left, I really wish I hadn't spent all of it. So that's the point I was trying to make, but I think it is super personal, and other people might not find this the best way to orient to the situation.

I related to it, and at least for me, I think it's probably mentally unhealthy to live for too long in the fast world. There's a sense in which, if you feel you don't have enough time to think deeply about something, or to investigate something, or to build up relationships, or to create something that takes time (it could be something as trivial as planting something in your garden), if you don't have these long-lasting commitments to the world, then it takes something away from you, or at least it takes something away from me. So I think this is a very interesting way to think, and a useful cognitive trick or reminder: to ask yourself whether you are, right now, living in the fast world or the slow world. At least it's been helping me think in a more grounded way, perhaps, sometimes.

Yeah, no, I'm really glad that resonated with you, and I will say I've had a lot of people tell me they found this a really helpful frame. I was at EAG this weekend, and multiple people came up to me and said: I've read that blog you wrote, I really liked it, and I've been sending it to people because I find it helpful.
So I'm very happy to have had this small impact on the community, and I hope this helps people alleviate some level of psychological stress, because I do think it's stressful to worry about the end of the world all the time.

When we talk policy or the technical details of AI, or when we communicate with the public, how should we think about whether we are in the fast world or the slow world? In communication, and perhaps in policy discussions, it seems we might want to live in the world we actually believe is the real world, which could be the fast world.

Yeah, that's a good question. I've been thinking about this a lot lately: what the best public communication strategies are, and the degree to which we should want to try to scare people or freak them out. I often have this experience where I try to explain the AI safety argument to people who aren't familiar with it, and I find the conversation has two halves. The first half is convincing them that the arguments are sound and that AI safety is a problem and a big deal. That part is kind of easy, because you can just say: there are these AI companies, they're racing to build a superintelligent god machine, and they don't really know how to control it. Here are a bunch of them going on the record saying this could cause human extinction; here's Geoffrey Hinton or someone similarly credible saying the same thing. And there aren't really any regulations to stop this from happening. People will generally say, "Oh yeah, that seems bad. I can see why you'd be worried about that." The second half of the conversation is the emotional salience part, where you try to get them to care about this.
And that second part is way harder. I think the reason is that we are acclimatized now to being bombarded with prophecies of doom all the time, and people don't really have a ton of emotional bandwidth to start worrying about another catastrophe on the horizon. So this is a thing I really don't know how to rectify. I would consider myself in the AI safety comms space; I do a bunch of comms on my own blog, and I write for other places too. But I do worry that I'm preaching to the choir a lot of the time. I think a lot of the people who follow me on Twitter already care about AI safety and probably already agree with me that it's a problem. Reaching beyond that, to people who are totally in the dark about this, is very hard, because there is a real risk that you just cause a kind of nihilism by giving people another doom thing to worry about. And there isn't really a good call to action either. You can say, "You should email your MP about this and tell them you're worried," or some people might want to go and protest, but there's only a small AI protest movement, not a massive one. At least if you tell someone that climate change is a big deal, you can say, "If you're worried about this, you can recycle or use a paper straw." Whether or not that stuff actually works, people will feel it's something they can do.
In the AI context, it just feels very disempowering. So I have some sense of how to make the arguments compelling about why AI safety is scary, but I don't have a good sense of how to make people care, or even the extent to which it would be good if they did, because their emotional reaction might end up demotivating them.

I mean, there might be some conclusions that are just too scary, or too much of a downer, to actually accept. In some sense, we all have psychological protection mechanisms that allow us to focus on the slow world, our world of commitments, and not constantly worry about global events and dangers ahead. Maybe that's good to some extent, but I have real uncertainty about what the right move is here, because what we don't want is to scare people who then feel they can't do anything, so that you've caused a lot of psychological suffering without really helping the situation. From another direction, though, there's a very important principle at stake: that you talk about the world as you actually understand it to be, and relate what you've found out in an honest way. A person who I think is doing this, and who is perhaps living very much in the fast world, is Dario Amodei, the CEO of Anthropic.
He's been talking about very fast developments and issuing these stern warnings. Do you think the public has become numb even to statements from the CEOs of the AI companies?

I think people are very skeptical of what the CEO of a tech company says, because they have seen tech companies make false promises over and over again about what their technology will do, and then it doesn't happen. But I think this case is different. Some of the things Dario Amodei is saying don't fit the usual pattern. There's this argument that AI CEOs say AI is going to kill the world because they're trying to create hype and want you to think their tech is really powerful, and I've always found that argument absolutely baffling, because it's obviously not a good marketing ploy to say your technology is going to kill everyone. But I do think there's a reflexive skepticism people have toward anyone in Silicon Valley saying, "My technology is going to automate all white-collar work in two years." I can see why people are turned off by that. I wish people would recognize, though, that there's a difference between, say, Elon Musk ten years ago saying Teslas were going to be driving themselves everywhere in five years, and Dario Amodei saying, what was the phrase he used recently, a white-collar bloodbath, or something like that: that conditions are going to be very tough for entry-level white-collar workers. That's clearly a very different claim.
Yes, it's saying the technology is going to be very capable, but he's also being very honest about the fact that this might not be a good thing for a lot of people. And it's clearly not in his interest to say this if he doesn't actually believe it. But again, I think it's a very difficult message: "This is kind of like all those other times, but it's actually different."

What's actually required to change the world in a positive direction is often these long-term projects: writing a paper like the ones we've been discussing in this conversation, writing a book, starting a research group, or perhaps doing a PhD in a relevant subject. We're now getting to the point where it seems timelines are perhaps, and again, emphasis on perhaps, so short that some of these projects might not make sense. For example, a PhD in the US takes around six years, and that's a long time if we're in a fast world. How do you think about these impactful projects and whether they make sense given short timelines?
I guess the best thing would be if we as a community divided up: some people work on things that will pay off on two-year timelines, others on things that will pay off in five years, and so on, all the way up to multi-decade timelines. That seems like the best thing to do. But from a personal perspective, very few people want to be the person who has to work on, say, the ten-year-timeline project if their own timelines are much shorter than that, because it's psychologically very demoralizing. Even if you think it's 50/50 whether the timeline is more or less than ten years, working on something you believe has a 50% chance of being totally useless is probably not very pleasant. So if I were giving advice on a macro level, I'd say, "Hey, we really want some people to work on the plans that will take longer to pay off." But if I were advising an individual, thinking about what that person should do for their own motivation and mental well-being, the question is way harder, because I also don't want nobody to be taking bets that take longer to pay off. That's not a very good answer, though. I actually really don't know.

No, I don't know either. These are all open questions. I tend to leave the really open questions to later in the interview, where we can perhaps figure out an answer together. Another one is whether the playing field is already set at this point. If we are in a very fast world, a world of short timelines, does that mean that what actually matters is a small number of companies, and we know who the players are?
We know which approaches we have. Perhaps we even know which technical techniques we're going to use to create these models and potentially align them. Is there a sense in which the fast world is also a world where the playing field is set?

Yeah, I worry about a self-fulfilling prophecy here. I haven't been in this community that long; it was maybe spring 2023 when I started to get worried about AI. But what I've heard from people who were here longer than me is that people already thought this five or six years ago. They'd say: what we don't want is to get the public involved, we don't want governments involved, we need to handle this problem among the small group of technically minded people who are already bought in. They believed OpenAI was going to solve this problem, build AGI, and align AGI, and they said: don't tell governments how worried we are, because there will be a regulatory overreaction, and don't tell the public, because they'll panic. That may or may not have been true five or six years ago, but then we didn't get a public wake-up or a government wake-up until maybe two or three years ago. So I feel like maybe it actually is true now that the playing field is set, but if we say that, then what we're doing is closing the doors to a bunch of other people coming in and maybe having an impact. It sounds kind of cheesy, but maybe it's just never too late, or something. I don't know.
And if it is too late, then I guess there's nothing we can really do anyway, so we may as well act as if there are things we can do, and more people we can bring in to contribute to this conversation. In general, I think the broader the conversation, the better, because the higher the probability that someone has a good idea.

As a final question: do you have any tips for listeners who might want to live more in the slow world, perhaps because they want to contribute to things that are helpful in the fast world? Any techniques for living more in the slow world?

Yeah, maybe spend a little less time on Twitter, if that's something you spend a lot of time doing.

That's perhaps a good piece of advice. A fully general piece of advice.

I think that's where most of the fast-world energy comes from. There's also a trap I fell into when I first got worried about AI safety. It was genuinely a thing I was very anxious about. Now it's a thing I enjoy and find inspiring and interesting, and I've met all these people through it, and it's fun in a weird way. But before, it was just an anxious fixation of mine, where I was trying to figure out what the timelines actually are, how risky AI actually is, what the probability really is of everything going super wrong. I would try to read as many people's timeline estimates as possible, and try to come up with my own estimates, and try to get more and more information. Sometimes this is helpful for grounding yourself and developing a bit of a high-level view, or whatever.
But trying to estimate more and more precisely what timelines are is actually pretty counterproductive, I think, if you do too much of it. So just accept that you will not find an answer to this question, that there is always going to be uncertainty, and that maybe things are happening soon or maybe they're not. And then try to create a really deliberate separation. If you actually work on AI safety in your professional life, the traditional advice about work-life balance applies: think clearly about separation of time. If you work nine to five doing AI safety, then after five, when you close your laptop, maybe don't try to absorb a bunch of AI safety discourse in your spare time. I do still do this, but I try to carve out specific hours in my day where I'm doing something completely different. I try to curate particular social media feeds that have nothing to do with AI safety. I'm not recommending people spend a lot of time on TikTok, which is what it's going to sound like I'm about to say, but I have a whole TikTok feed that is just not about AI, and it's really delightful. Although even now I've started to get AI-related stuff on my TikTok. But I try to keep it mostly about wholesome, nice things, like people going to bakeries and trying cakes, and it's just nice. So I think it's the traditional things: trying to reserve time for stuff that's totally unrelated.

That's great advice, I think. Sarah, thanks for chatting with me. It's been great.
Yeah, no worries. Thanks for having me.

Related conversations

AXRP

6 Jun 2025

Owain Evans on LLM Psychology

This conversation examines society and jobs through Owain Evans on LLM Psychology, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Future of Life Institute Podcast

5 Mar 2026

How AI Hacks Your Brain's Attachment System (with Zak Stein)

This conversation examines society and jobs through How AI Hacks Your Brain's Attachment System (with Zak Stein), surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

Future of Life Institute Podcast

27 Jan 2026

How to Rebuild the Social Contract After AGI (with Deric Cheng)

This conversation examines society and jobs through How to Rebuild the Social Contract After AGI (with Deric Cheng), surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.

AXRP

7 Aug 2025

Tom Davidson on AI-enabled Coups

This conversation examines core safety through Tom Davidson on AI-enabled Coups, surfacing the assumptions, failure paths, and strategic choices that matter most for real-world deployment.
