AI has been a hot topic in recent Twitter discourse, with two opposing camps dominating the conversation: the Doomers and the AI builders. The Doomers, led by Eliezer Yudkowsky and other rationalists, advocate for caution and restraint in the development of AI, fearing that it could pose an existential threat to humanity. Prominent figures in this camp include Elon Musk, who has expressed concerns about the potential dangers of AI while also founding AI-focused companies like OpenAI and the up-and-coming “BasedAI.” On the other side of the debate are the AI builders, including Yann LeCun and Sam Altman, who are eager to push the boundaries of AI development and explore its full potential. While some members of this group have been dismissed as "idiot disaster monkeys" by Yudkowsky, I will refer to them as "Foomers" for the purposes of this blog post. The divide between these two camps is significant, as it represents a fundamental disagreement about the future of AI and its potential impact on society.
The debate around AI often centers on the concept of superintelligence, which refers to AI that surpasses human intelligence in every way. Doomers argue that superintelligence could pose an existential threat to humanity, as it would be capable of outsmarting humans and achieving its goals at any cost. This is particularly concerning given that the goals of such an AI would be difficult, if not impossible, to specify in advance. If the goals are misaligned with human values, the consequences could be catastrophic. The AI builders or "Foomers" tend to downplay these risks, arguing that superintelligence could be used for the benefit of humanity if developed and controlled properly. However, the Doomers counter that the risks are too great and that any attempt to control superintelligence is likely to fail. As such, the debate remains a contentious one, with both sides offering many arguments.
While Foomers may reject critiques built on thought experiments and argue for incremental improvement of AI through trial and error, there is little engagement from either side in identifying the underlying assumptions and values that shape the debate. This leads to the same discourse tiling Twitter with copies of itself without any meaningful progress, and many people are left frustrated and exhausted by it. In this blog post, I aim to provide a fresh perspective on the debate and contribute to a more productive conversation. By analyzing the arguments of each side and exploring potential areas of common ground, I hope to help re-align the discourse in a more positive direction.
It's worth noting the curious pipeline between the Doomer and Foomer camps. Organizations like OpenAI and Anthropic started as "safety" organizations, but have since pivoted towards a more Foomer-like position. Similarly, the Doomers historically broke away from the Kurzweilians, who were the original Foomers. While changing one's position based on new evidence is commendable, this two-way pipeline casts doubt on the strength of both positions. Alternating between two extremes suggests that neither side has a firm grasp on the crux of the debate. It's important to engage with opposing views and seek out potential areas of agreement, rather than simply oscillating between extremes.
So I decided to stake out my OWN POSITION in what I claim is a reasonable center. I hold the following 10 beliefs:
1. Safe AGI is a small portion of the space of all AIs or all algorithms.
2. AI is dangerous; discontinuous jumps in capability are particularly dangerous.
3. We are unlikely to get a really fast takeoff.
4. There will be warning shots and "smaller" AI failures to learn from.
5. AI-caused social and mental health issues are more likely than bio/nanotech disasters.
6. "Slowing down AI" can be good, but getting the government involved is not.
7. We can learn from empirical methods, simulations, and logical analysis.
8. A lot of existing techniques to make AI safer can be used for AGI.
9. Problems of civilization have analogs in AGI problems.
10. Humans must come first. Now and Forever.
Explanations:
1. Safe AGI is a small portion of the space of all AIs or all algorithms.
"Algorithms" is a large space, "AIs" is a large sub-space. Many people wish to ascribe some property X to all AIs when not even all humans have said property X. However the subset of AIs that are both powerful and ones we want to build is a small subset of all "powerful AIs." The analogy is that if you want to go to the nearest star system you'd are trying to hit a small target in space. That said, going to the nearest star system is hard, but not impossible.
2. AI is dangerous; discontinuous jumps in capability are particularly dangerous.
There is a particular Doomer worldview that I am sympathetic to: if a hugely powerful alien ship or AI appeared in the sky with goals regarding this planet, there is likely nothing we could do against a civilization vastly technologically superior to ours. However, the important part of this hypothetical is the discontinuity, and I think we are unlikely to get strong discontinuities in AI.
3. We are unlikely to get "really fast takeoff".
I wrote this a while ago. The TL;DR is that the AI improvement process is going to become less and less constrained by humans. The AI development loop is "people think for a little bit" and then "fire off an AI run to test their theory." Given that AI experiments demand ever more compute and the theories being tested are becoming more complex, the "fire off an AI to test the theory" step is gradually becoming the larger portion of the loop. So replacing the people in the loop doesn't make the loop exponential on millisecond timescales.
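A rough way to see this is Amdahl's law: automating the human part of the loop only touches the fraction of iteration time humans are responsible for, so the compute-bound remainder caps the overall speedup. A minimal sketch, with purely illustrative numbers of my own choosing rather than measurements:

```python
# Back-of-envelope sketch (illustrative numbers, not measurements): if an AI
# development iteration splits into "humans think" and "run the experiment",
# then automating the human part only speeds up the loop by roughly the
# fraction of time humans were responsible for (Amdahl's law).

def loop_speedup(human_fraction: float, human_speedup: float) -> float:
    """Overall speedup of one develop-and-test iteration when only the
    human portion of the loop gets faster by `human_speedup`."""
    compute_fraction = 1.0 - human_fraction
    return 1.0 / (compute_fraction + human_fraction / human_speedup)

# Hypothetically, say humans account for 20% of a frontier training loop and an
# automated researcher "thinks" 1000x faster than the human team it replaces:
print(loop_speedup(human_fraction=0.2, human_speedup=1000.0))  # ~1.25x, not 1000x
```

The numbers are made up; the point is only that the compute-bound part of the loop, not human thinking speed, increasingly sets the pace.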
4. There will be warning shots and "smaller" AI failures to learn from.
Some examples of warning shots:
Some company uses a neural network to trade their portfolio and loses everything
Some company "accidentally" violates copyright by training AI and get sued for it.
Some people create an AI bot to try to make money online and it becomes a scammer (again, lawsuits + prison for them)
Someone actually uses an AI to convince someone else to do something wildly illegal or hurtful
Someone builds a dangerous chemical with AI assistance and several people die as a result
I would consider these to be "small" warning shots that may or may not lead to people learning the right lessons. I think warning shots could get bigger before the lesson is fully learned; however, it will be learned before "doom." For example, a complete socio-economic breakdown of a major country, due to the financial system being exploited by bots and becoming unusable for people, is a warning shot that is plausibly big enough for decision-makers to start paying attention. The collapse of "an entire nation" is my guess at the upper limit of "warning" required for decision-makers to take AI seriously.
5. AI-caused social and mental health issues are more likely than bio/nanotech disasters.
I have written at length about plausible pathways by which AI will disrupt civilization here.
The general theme is that social manipulation, behavioral modification, and scam-like behavior are far easier to pull off than new destructive bio-tech. The fact that social media has been causing mental health problems for decades means this can be done with not-that-intelligent algorithms. It is a near-term concern, as signals that were previously load-bearing for social function become polluted.
This is bad news for the near-term trajectory of Western civilization: it will lower the standard of living and counteract a lot of the near-term benefits of AI. However, this isn’t “doom.”
6. "Slowing down AI" can be good, but getting the government involved is not.
Again, given the fact that we are going to get those warning shots, it may be worth mobilizing some of society’s existing resources to learn from them. Calling on a group of labs to voluntarily slow down so that we can understand the real power level of the models that have already been created is a reasonable ask.
However, this starts getting unreasonable when it means getting the government involved in either domestic or foreign policy, whether through local regulation or data-center “bombings.”
At this moment the US government displays a deep lack of state capacity for addressing problems, along with a desire to create new ones. It is no longer safe to ask the government to ban TikTok, let alone to attempt to create new international agreements. The US government is no longer really perceived as agreement-capable by its geopolitical competition.
In a recent post, Catching The Eye of Sauron, the author argued that “not enough is being done” and that options do not look at all exhausted before such drastic calls. I agree with most of the post and would add that even an action such as speeding up lawsuits against the relevant companies has not been explored much. Many people question both the copyright problems involved in training large generative models and the potential for automated libel. Lawyers may just be the heroes we deserve right now.
7. We can learn from empirical methods, simulations, and logical analysis.
This feels to me like one of the cruxes of the whole debate. If you want to learn about AI, how do you do it?
The Foomer position seems to be that you learn by empirical methods - run the AI and see what happens incrementally. The Doomer position seems to be that at some point incremental changes are “not so incremental” and will get people killed. However, the Doomer position also gives off the vibe that implementing current paradigms doesn’t teach us much or that knowledge can only be acquired through thought experiments.
In my view, all kinds of methods can bring us valuable new information about AI, about people, and about how to make AI safe. The fact that OpenAI spent a lot of resources on RLHF and people jailbroke the AI anyway is an important piece of learning.
Thought experiments are a good start for learning about AI; however, once a thought experiment becomes complex enough that people really start disagreeing, it's time to formalize it. Start with a mathematical formalization, then follow through with a simulation in the smallest possible environment.
Other simulations that could be helpful are ones run inside particular video games, specifically sandbox games. It's easier to tell what doesn’t work through this method than what does work. However, knowing 10 million things that don't work is extremely valuable.
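To make the "formalize, then simulate" step concrete, here is a minimal sketch built around a toy example of my own invention (the gridworld, rewards, and numbers are assumptions for illustration, not a real benchmark): a thought experiment about reward misspecification, shrunk to the smallest environment where simulating it shows what the stated reward actually incentivizes.

```python
# A tiny "formalize, then simulate" example: a 2x3 gridworld where the
# written-down reward only says "reach the goal" and says nothing about
# the vase standing in the way. All names and numbers are illustrative.

ROWS, COLS = 2, 3
START, VASE, GOAL = (0, 0), (0, 1), (0, 2)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.9

def step(state, action):
    """Deterministic move; bumping into a wall leaves the agent in place."""
    row = min(max(state[0] + action[0], 0), ROWS - 1)
    col = min(max(state[1] + action[1], 0), COLS - 1)
    return (row, col)

def value_iteration(reward, sweeps=100):
    """Standard value iteration; reward is paid on entering a cell."""
    values = {(r, c): 0.0 for r in range(ROWS) for c in range(COLS)}
    for _ in range(sweeps):
        for s in values:
            if s == GOAL:
                continue  # terminal state
            values[s] = max(reward.get(step(s, a), 0.0) + GAMMA * values[step(s, a)]
                            for a in ACTIONS)
    return values

def rollout(reward, values):
    """Follow the greedy policy from START and record the visited cells."""
    path, s = [START], START
    while s != GOAL:
        s = max((step(s, a) for a in ACTIONS),
                key=lambda nxt: reward.get(nxt, 0.0) + GAMMA * values[nxt])
        path.append(s)
    return path

specified = {GOAL: 1.0}               # the reward we actually wrote down
print(rollout(specified, value_iteration(specified)))
# [(0, 0), (0, 1), (0, 2)] -- the agent walks straight over the vase

intended = {GOAL: 1.0, VASE: -10.0}   # the value we forgot to specify
print(rollout(intended, value_iteration(intended)))
# [(0, 0), (1, 0), (1, 1), (1, 2), (0, 2)] -- now it goes around
```

The simulation makes the disagreement concrete: the policy implied by the written-down reward walks over the vase, and the gap between "specified" and "intended" is visible in a single line of the reward.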
8. A lot of existing techniques to make AI safer can be used for AGI.
This is my #1 problem with the Doomer worldview.
I am going to talk about one specific example: inverse reinforcement learning (IRL). Keep in mind this is one example among many. IRL is used by Waymo, among others, to help guide self-driving cars. It is an example of a technology that is actively being developed on a fairly complex task, and a lot of the lessons learned can carry over to more general tasks. While learning “values from behavior” perfectly may not happen because of human deviations from optimality, this seems like a solvable problem. You can still learn how human drivers handle the “not-run-into-things” problem through such techniques, even if they sometimes get it wrong or disagree on questions of what is polite on the road. The book “Human Compatible” makes some arguments along the same lines.
If certain experiments with techniques like these seem too dangerous, then one can use simulations to refine them.
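As a sketch of what such a simulated refinement could look like, here is a toy, stripped-down cousin of IRL that I made up for illustration: a simulated "driver" chooses maneuvers Boltzmann-rationally (so it sometimes errs), and we recover reward weights by maximum likelihood over a grid of candidates. The maneuvers, features, and weights are assumptions, not anything from Waymo or any real system.

```python
# Toy reward inference from imperfect demonstrations (IRL in spirit, reduced
# to a one-step choice problem). Everything below is invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Each maneuver is described by two features: [collision risk, delay].
maneuvers = {
    "keep_speed": np.array([1.0, 0.0]),   # heads straight for the obstacle
    "brake":      np.array([0.0, 1.0]),   # safe but slow
    "swerve":     np.array([0.2, 0.5]),   # mostly safe, some risk and delay
}
features = np.stack(list(maneuvers.values()))

def action_probs(weights):
    """Boltzmann-rational choice: softmax over linear rewards w . phi(a)."""
    logits = features @ weights
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Hidden "true" preferences of the demonstrator: collisions are much worse
# than delay. Because choices are Boltzmann, the demonstrator is imperfect
# and occasionally picks a bad maneuver anyway.
true_weights = np.array([-10.0, -1.0])
demos = rng.choice(len(maneuvers), size=500, p=action_probs(true_weights))

# Inference step: score candidate reward weights by the log-likelihood of the
# demonstrations and keep the best one (a crude grid-search "IRL").
candidates = [np.array([w_collision, w_delay])
              for w_collision in np.arange(-12.0, 0.5, 0.5)
              for w_delay in np.arange(-12.0, 0.5, 0.5)]
log_liks = [np.log(action_probs(w)[demos]).sum() for w in candidates]
best = candidates[int(np.argmax(log_liks))]

print("recovered weights (collision, delay):", best)
# The recovered collision weight comes out strongly negative even though the
# demonstrations are noisy: the "not-run-into-things" value survives
# imperfect behavior.
```

Scaled up to sequential settings and messier data this gets much harder, of course, but the toy shows the shape of the claim: imperfect demonstrations can still pin down which feature the demonstrator treats as catastrophic.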
When I hear Doomers talk about IRL, either here or here, the set of arguments used against it points to a pretty big philosophical confusion between cultural values (egalitarianism) and fundamental values (not-kill-everyoneism), as well as confusion around the shape of human “irrationality.” The argument that IRL can’t coherently learn cultural values may be true, but that isn’t the same thing as failing to coherently learn fundamental values. So IRL gets a lot of negative feedback it doesn’t deserve, while it may be a core technology of “not-kill-everyoneism.” Building utopia may in fact be hard-to-impossible; however, getting an AGI to “not kill everyone” may be significantly easier. If the public messaging is “we don’t know how to not kill everyone,” while the private research is more “we don’t know how to build utopia,” that is wildly irresponsible, not to mention dangerous, in that existing techniques refined on real-life tasks, such as IRL, end up unfairly critiqued.
9. Problems of civilization have analogs in AGI problems.
This is a very big topic. Problems in AI are new, but many of them have precedents or analogs in the past. What utility function an AI should have is a question analogous to how to measure societal utility in economics. Economics also explores how coherently one can model a human as a rational agent. There are questions of philosophy that deal with the nature of ethics, of beings, the philosophy of language, and so on.
Now, just because these questions were previously considered does not mean that they were solved. However, it does suggest that the questions earlier thinkers grappled with can be used to help understand future AGI, and that a lot of sub-problems can be fanned out to the outside world if framed and incentivized carefully.
10. Humans must come first. Now and Forever.
Parts 1-9 are a mix of predictions, heuristics, and general facts. Part 10 is a value statement, and it is here so that people don't lose sight of the big picture.
AIs, if they are to be built at all, are meant to be built to help people do stuff. Whether it is economic productivity, improving one's well-being, or bringing one closer to other people, AIs are always tools. Building AI is an instrumental goal; people are the terminal goal, and this should stay that way.
If an AI begins hurting people it's time to shut it down.
There is a lot of strangeness coming from both camps and from other people with even worse epistemic standards than either camp (I know that can be hard to believe). I don't want switcheroos, where people promise "prosperity" and instead society begins to be built "for AIs," rather than people. I don't want to build AIs that have consciousness, moral worth, are capable of suffering, etc, etc. I don't want uploads. Not a great fan of over-cyborgization either. It's possible some countries might allow the above, but I predict and hope many will not.
I want biological humans to live long lives and conquer the galaxy. Nothing more. Nothing less.