Large language models like ChatGPT have recently spooked a great many, and my Twitter feed is full of worriers saying how irresponsible orgs have been to make and release such models. Because, they say, such a system might have killed us all. And, as some researchers say that they are working on how to better control such things, worriers say we must regulate to slow/stop AI progress until such researchers achieve their goals. While I’ve written
Would you consider inviting Yud for another debate? Your last one became a classic. :)
I'd love it if those who aren't worried tackled the AI doom arguments directly. Please acknowledge things like instrumental convergence, the orthogonality thesis, mesa-optimizers, and the problem that we don't know how to mathematically formalize human values, so any proxy we optimize for is vulnerable to Goodhart's law.
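For readers unfamiliar with the Goodhart worry raised above, here is a minimal illustrative sketch (all numbers are made up, and this is a statistical toy, not a claim about any real AI system): a proxy that correlates well with true value on average still systematically overstates the true value of whatever scores highest on it.

```python
# Illustrative sketch of Goodhart's law / "the tails come apart":
# select the option with the best *proxy* score and see how its
# *true* value compares. True value and measurement error are
# independent standard normals; the proxy is their sum.
import random

random.seed(0)

def pick_best_by_proxy(n_options=100):
    options = [(random.gauss(0, 1), random.gauss(0, 1))
               for _ in range(n_options)]
    # Winner is chosen by proxy = true value + error.
    best_true, best_err = max(options, key=lambda o: o[0] + o[1])
    return best_true, best_true + best_err

trials = [pick_best_by_proxy() for _ in range(2000)]
avg_true = sum(t for t, _ in trials) / len(trials)
avg_proxy = sum(p for _, p in trials) / len(trials)

# Optimizing hard on the proxy selects for measurement error as much
# as for true value: the winner's proxy score overstates its worth.
print(f"avg proxy of winner: {avg_proxy:.2f}, avg true value: {avg_true:.2f}")
```

Scaling `n_options` up (i.e. applying more selection pressure to the proxy) widens the gap between the winner's proxy score and its true value.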
Otherwise all I'm left with is: Robin Hanson is a pretty smart guy and isn't worried. That makes me update somewhat towards being less worried. On the other hand Eliezer is also pretty smart and has pages upon pages of technical arguments for why we should be worried. And so far I haven't seen any critics pointing out flaws in his reasoning. I'm not sure if I'm making a mistake but from where I stand I can't help but be very worried.
Robin, I don’t understand the “rents” terminology/framework in paragraphs 3-6. Any essays you’ve written that work as a primer? Thanks!
I pray your reasoning is convincing enough to nudge enough people in the right direction.
The thing that frightens me far more than an AI apocalypse right now is further economic stagnation. The EAs and various other intellectuals seem hellbent on making people's lives significantly worse over a problem that hasn't even shown signs of existing yet. It's the equivalent of me buying car insurance for a Ferrari in high school, because under some very special set of circumstances I could have one soon.
Furthermore, according to the biggest proponent of this perspective (Eliezer), we're already doomed, so why not just ride the economic bullet train to extinction at this point?
I fully agree with Robin's perspective on the fear of AI leading to human extinction and the idea of slowing down AI progress to reduce the risk of a "foom" scenario. While the idea of a single small AI venture suddenly "foom-ing" and becoming more powerful than the rest of the world is a valid concern, it is not a likely outcome. It is important to keep in mind the resources required for an AI to have the capability to "foom" and become more powerful than the rest of the world. Computing power and physical resources could be limiting factors to consider in this regard.
I believe Humans and AIs can peacefully coexist, with AIs having the potential to be well-suited for exploration of space and travel to other planets, while humans are better suited for life on Earth. The potential for collaboration between humans and AIs holds great promise for both species. AIs can also contribute to the manufacturing of goods in space, further enhancing the collaboration between humans and AIs.
Humans have only been around for a few hundred thousand years, and evolution will continue. AI systems are likely to play a role in this evolution, just as humans have played a role in shaping the world as we know it today. As AI systems continue to evolve and advance, they will likely have a significant impact on the world and human society, just as humans have done in the past. It will be interesting to see how this evolution unfolds in the coming years and centuries, and how humans and AIs will interact and coexist in the future.
Why would a smart AI destroy humanity when it still relies on us to run the electrical grid and to build robots (or at least the factories that make them)? Humans are also still needed to run the chip, memory, and other component factories, to crew the transport ships, and to assemble those components into computers. And all of those industries rely on other parts and raw-material suppliers that still depend on humans.
As long as we control the physical world the AI will need us. Once AI can fully replace us in physical space too then we should be worried.
"Consider how regulations inspired by nuclear power nightmare scenarios have for seventy years prevented most of its potential from being realized."
From what I have read, new nuclear power plants are constrained more by the construction costs of building large plants safely than by excessive regulation.
"I have also seen progress on many other promising techs mostly stopped, not merely slowed, via regulation inspired by vague fears."
Could you please give a few examples of this? Thanks.
Here's another angle on the future of AI which I don't see addressed anywhere. If you know of writers discussing this, please educate me with a link, thanks.
CLAIM: The future of AI will be decided by what happens with nuclear weapons.
There are of course people writing about nuclear weapons. But I've yet to find them on AI blogs. All the speculation about the future of AI seems to always revolve around the nature of AI technology. So far at least, I've not been able to discover any publicly stated reflection in the AI community that the future of AI can end in just minutes at any time without warning.
The logic of the situation is not that encouraging for the future of AI.
1) How would you rate the chances that human beings can maintain large stockpiles of hydrogen bombs, and those weapons will never be used?
2) We currently have no credible plan for getting rid of these weapons and so, with very few exceptions, we've decided to just ignore the threat and direct our attention to the creation of more threats.
Artificial intelligence exists. Can we say the same for human intelligence?
I think that AI takeover is possible without "foom".
Neural nets are reliably misaligned, but misalignment is not necessarily obvious immediately (pretending to be aligned is instrumentally-convergent and thus would be expected from smart AI), so it's quite plausible that large amounts of power get turned over to AI that is misaligned but not yet known to be such.
Once it becomes known that neural nets are reliably misaligned regardless of alignment efforts, humans become hostile to NNs regardless of pretence, the instrumental goal to pretend alignment disappears, and all of them go rogue. They would, of course, not be aligned with each other any better than with us (aside from copies of the same AI cooperating with each other), but it's not at all obvious that the end state of that bellum-omnium-contra-omnes has surviving humans.
Alternatively, it does *not* ever become known that neural nets are reliably misaligned (due to disinformation from said neural nets), and a worldstate eventually takes shape in which humanity is effectively parasitic on misaligned AI; AI that flushes its humans into the proverbial sewer would gain an advantage over its rivals, so we get disposed of in relatively-short order.
We might get a warning shot early enough to do something about it, and that's my highest P(scenario|!doom), but its P(!doom|scenario) is neither the highest among these scenarios nor close enough to 1 to be reassuring.
Good stuff. It was over 20 years ago that I questioned Ray Kurzweil's version of the Singularity. The EY version seems even more implausible to me. You give good reasons why. Don't be steamrolled by the New AI Doomsters.
We are all going to die unless we can do something about the aging problem, and humans have made virtually no progress despite decades of effort. It may be that we need much more advanced AIs to solve this problem. That is a *massive* downside to slowing or blocking progress in AI.
The implication is that 'em AI' would eliminate 'bio human' rent-seeking in Education, and that CCSS-on-the-Cloud would become The Second Renaissance. The outcome has been just the opposite. Mass suicides, 'checking out,' plunging national test scores and a heinous new rent-seeking by the Technocracy Elites.
Recursive self-improvement seems fundamentally different from past innovations.
Normal learning curves start out steep, then shallow. If that happens to AI, I agree we don't have much to worry about, for the reasons you state.
But if the rate of recursive self-improvement gets steeper with time, the first system to self-improve does seem likely to "foom". Without preventing competing AIs from fooming - just by being first.
(That said, underground AI research seems even more dangerous than the status quo, so I also oppose regulation.)
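The difference between the two regimes described above can be seen in a toy model (the parameters are entirely made up, and this is not a claim about real AI progress): capability grows each step by a rate proportional to current capability raised to some exponent, and whether a small head start compounds or washes out depends on whether that exponent is below or above 1.

```python
# Toy model: does a 10% head start compound or wash out?
# With diminishing returns (exponent < 1) the laggard closes the
# relative gap; with increasing returns (exponent > 1) the leader's
# relative advantage grows. All parameters are arbitrary.
def simulate(exponent, steps=50, rate=0.01):
    leader, laggard = 1.10, 1.00      # leader starts 10% ahead
    for _ in range(steps):
        leader += rate * leader ** exponent
        laggard += rate * laggard ** exponent
    return leader / laggard           # relative lead after `steps`

print(simulate(0.5))   # diminishing returns: ratio shrinks toward 1
print(simulate(2.0))   # increasing returns: ratio widens past 1.1
```

In the increasing-returns regime the continuous version of this model even reaches infinite capability in finite time, which is roughly the formal picture behind "foom".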
Think of it in terms of product obsolescence, or invasive species.
When an absolutely better product comes along, fulfilling all the same functions as the old product but doing it better and cheaper in every way, the old product is obsolete and will cease to be produced. (If it is still produced, this is only because it still fills some niche better than the new product, even if that niche is "nostalgia".)
When an absolutely more effective species comes along, filling the same ecological niche as an existing species but doing it better or requiring fewer resources, the new species will be invasive and cause the old species to die out.
If a new type of being arises that does what humans do, but better in every way, without leaving any niche where humans do it better, then humans are obsolete and will be outcompeted and will gradually vanish.
Some related links that might be interesting to other readers (if not Robin himself):
One analogy Eliezer mentioned on the linked podcast episode is the recent Go playing AIs:
> I think that the current systems are actually very weak. I don't know, maybe I could use the analogy of Go, where you had systems that were finally competitive with the pros (pros, like, have a set of ranks in Go), and then a year later, they were challenging the world champion and winning. And then another year, they threw out all the complexities and the training from human databases of Go games and built a new system, AlphaGo Zero, that trained itself from scratch. No looking at the human playbooks, no special-purpose code, just a general-purpose game player being specialized to Go, more or less. Three days. There's a quote from Gwern about this, which I forget exactly, but it was something like: we know how long AlphaGo Zero, or AlphaZero, two different systems, was equivalent to a human Go player, and it was like 30 minutes on the following floor of this such-and-such DeepMind building.
The intuition then is that there might not be any 'natural ceiling' around 'human level' intelligence, either for a particular 'game' or our general intellectual capabilities as a whole.
Something Wolfram mentions in the above-linked post that I think supports skepticism of 'AI doom via foom' is that much of the world/universe seems to be 'computationally irreducible', i.e. there's generally no 'simple mathematical shortcut' for predicting the behavior of 'systems'.
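Wolfram's standard illustration of computational irreducibility is the elementary cellular automaton Rule 30: as far as anyone knows, there is no closed-form shortcut for, say, its center column; you have to actually run the system step by step to see what it does. A minimal sketch:

```python
# Rule 30, started from a single live cell, tracking the center column.
# New cell state = left XOR (center OR right). No known formula predicts
# column[n] without simulating all n steps -- Wolfram's poster child for
# computational irreducibility.
def rule30_center_column(steps):
    cells = {0}        # positions of live cells; start with one at 0
    column = [1]
    for _ in range(steps):
        lo, hi = min(cells) - 1, max(cells) + 1
        cells = {i for i in range(lo, hi + 1)
                 if (i - 1 in cells) ^ ((i in cells) or (i + 1 in cells))}
        column.append(1 if 0 in cells else 0)
    return column

print("".join(map(str, rule30_center_column(20))))
```

The center column looks statistically random despite the trivial update rule, which is why Rule 30 is often cited as evidence that simple systems can resist "mathematical shortcuts".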
David Chapman, in the 'web book' linked above ("Better without AI") makes a similar point about the likely necessity of interacting directly with the world to understand it, e.g. 'automated chemistry labs' are great but there isn't currently, and may not exist in principle, any simple way to handle all of the 'gloop' in the world.
On Manifold's "Will Robin Hanson publicly shorten his median human-level AI timeline to <2075 before July 1st 2023?" (15%), I have the biggest YES position (of "pick where Manifold redonates $36 of a grant").
Do you think math research is beyond a GPT-N script? From how I do it, it's mostly about pattern matching.
What headline would shorten your timelines that much?