That's not how I've been using the word for many years, nor how I hear many others use it.
Surely we will be in a much better position to learn to control such things when actual versions exist around us. They will start small and weak and gradually increase in ability. If they can control each other to keep the peace among themselves, then we can use friendlier ones to help us with our control problems. The party that deploys a version it loses control of directly loses value; that is not a common-pool problem.
What gives you the impression that this shift has occurred? My impression over the last ~10 years has always been that rationalist types are mostly moral relativists.
Every X-risk is a tragedy of the commons, because people don't value other people's lives as much as their own, so whoever imposes the risk bears only a small share of its total expected cost.
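One way to make that commons framing concrete is a toy expected-value calculation; every number below is invented purely for illustration:

```python
# Toy illustration of the externality argument (all numbers invented).
# A party deploying a risky AI captures the benefit privately but spreads
# the catastrophe risk across everyone.

private_benefit = 10.0          # gain to the deploying party (arbitrary units)
p_catastrophe = 0.01            # chance deployment causes an existential catastrophe
value_own_lives = 100.0         # what the deployer stands to lose personally
value_all_lives = 1_000_000.0   # what the world as a whole stands to lose

# The deployer weighs its private benefit against only its private share of the risk.
private_expected_cost = p_catastrophe * value_own_lives   # 1.0
social_expected_cost = p_catastrophe * value_all_lives    # 10,000.0

print(private_benefit > private_expected_cost)  # True:  individually worth deploying
print(private_benefit > social_expected_cost)   # False: collectively a bad deal
```

The deployer's private calculation favors deploying while the social calculation does not, which is the defining structure of a commons problem.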
"Many seem to think it obvious that if one group lets one AI get out of control, the whole world is at risk. It’s not (obvious)." <-- This I agree with; it's not obvious. But we don't have a good reason to believe that we will know how to control advanced AIs, or recognize whether we are successfully doing so. So unless we assume that people will solve that problem in the future, then we should be worried not that a single superintelligent AI will go out of control, but that *everybody's* superintelligent AIs will be out of control.
There will be major economic and geopolitical incentives to deploy partially/uncertainly aligned AI systems if a solution to alignment is not found, or if it carries a significant performance cost. There is likely to be a trade-off between safety and performance. A race to the bottom WRT safety seems very likely.
If AIs are only partially aligned, we are likely to have resource conflicts with them. Emergent cooperation between AIs also seems likely (presumably OP agrees). If AI systems are not fully aligned, and are able to cooperate effectively, they can form their own institutions to figure out how to split the pie and prevent any one of them from becoming dominant, but I don't see why humans would get a seat at the table unless we actually have leverage, which seems unlikely to be the case if AIs are superintelligent.
Bostrom doesn't assume FOOM; he covers many scenarios.
FOOM does NOT cover all cases where a single AI takes over the world. FOOM specifically refers to "hard take-off". There are plenty of ways a single AI could take over the world without that happening. For instance, you could have a superintelligence that is successfully contained, and then breaks out and rapidly takes over the world. That's not FOOM if we knew that it was a superintelligence in the box.
This is Ricardo's theory of comparative advantage.
Right. Efficiency will win out in a competitive environment unless there's FOOM.
Hi Robin, I'm assuming that in an almost-equal scenario where a paperclip-type AI is confronted by a human-values-aligned AI, the paperclip-type AI loses. The reason is Ricardian comparative advantage: the human-values-aligned AI wins the fight because it can farm out the low-end work to the humans, while the paperclip maximizer is fighting the humans and has to do all the work itself, and is thus less efficient economically.
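To spell out the comparative-advantage point with numbers, here is a toy sketch; the productivities are made up and only meant to show the mechanism, not realistic magnitudes:

```python
# Toy Ricardo-style illustration (all numbers invented). Suppose one unit of
# final output needs 1 unit of frontier work plus 1 unit of routine work.
#
# Productivity per unit of time:
#   AI:     100 frontier  or  20 routine
#   Human:    2 frontier  or  10 routine
#
# The AI is better at both tasks (absolute advantage), but its edge is far
# larger in frontier work, so it gains by leaving routine work to humans.

def balanced_output(frontier, routine):
    """Final output is limited by whichever input is scarcer."""
    return min(frontier, routine)

# Paperclipper working alone: split time t on frontier, 1 - t on routine.
# Optimum is where 100*t = 20*(1 - t), i.e. t = 1/6.
t = 1 / 6
alone = balanced_output(100 * t, 20 * (1 - t))            # ~16.7

# Aligned AI trading with one human: the human does routine work full time,
# and the AI splits so that 100*t = 10 + 20*(1 - t), i.e. t = 0.25.
t = 0.25
with_trade = balanced_output(100 * t, 10 + 20 * (1 - t))  # 25.0

print(f"paperclipper alone: {alone:.1f}, aligned AI + human: {with_trade:.1f}")
```

Of course the aligned side also simply has more total labor here; the specifically Ricardian point is that trading with humans is worthwhile for the AI even though it is better at both tasks.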
Your link is broken.
I see a distribution of possible competitive costs for an AI having aligned values, ranging from negligible, to significant but compensable by other advantages (such as first-mover advantage, economies of scale, or a coordinated effort to tilt the playing field in favor of aligned AIs), to virtually disabling. I don't see a reason to put so much weight on the last bucket that working on AI alignment now would be pointless. But I do put enough weight on it that I think there should also be a big parallel effort to prevent competitive scenarios from being realized. (Edited to fix link.)
(Not sure if this addresses the reasoning behind your statement. Maybe you can be a little more verbose in the future? Alternatively, please let me know if, when you make short statements like this, you just want to put your position down for the record and don't necessarily expect further engagement.)
Even if there is some optimum competitive rate of using resources, once all resources are taken, that doesn't imply that initial values determine a large fraction of future behavior.
I'll dispute it; it isn't clear to me that in a competitive scenario a large fraction of future outcomes are determined by a fraction of initial values.
That sounds pretty close to a foom scenario to me.
Let's say foom doesn't happen and we don't solve the value alignment problem. Then we have a world full of AIs in competition whose values are misaligned with ours (this may not be the case if we have Ems or something more or less like the human mind that has human values by default, but I think AI risk people generally don't believe that, so let's also assume that's not the case). Given all the other assumptions I'm consciously or unconsciously making, this scenario leads to the AIs outcompeting humanity and using politics, competition, and social norms to deal with us instead of the other way around, since I don't expect humans to be able to get smarter anywhere near as fast as the AGIs, so the AGIs will gain more power with time. A future where the agents that have most of the power have values not aligned with ours seems really bad to me, and I'm not sure where you differ.

Your comments make me even more confused, because the points about wanting to control future generations, and about how wanting to decide what values the SIs will get sounds like slavery, don't seem to make sense unless we assume that the AGIs will be built based on a human mind and will be like a human. If you think that, and your intuitions are built around that, then I would agree with you that it's probably better to wait and see what happens. But the AI safety people don't think that (I don't have data on this, but I think it explains why people disagree with you): they think we will make the AI from scratch, that we will therefore get to program in whatever values we want, and that if we don't solve AI safety we won't be able to do it correctly.

This is one possible point of disagreement, but it is also possible that you are fine with the future having AGIs with really different values. I would find it really weird if you actually think that; if the AGIs have values that are really misaligned with ours, I find it hard to believe you would think that's fine. I understand the desire not to control future generations, but only if those future generations have humanlike enough values, which is what my brain supplies when I hear "future generations". If I start thinking about all the things a civilization consisting of AGIs could value, there are a lot of things (and since my values are complicated, probably most things) that I wouldn't want them to value. Even if I just valued not enslaving future generations, I would want future generations to also value…
How different are you willing to accept the values of future generations to be? Where would you put the line? And how is a society with millions of AIs, whose different preferences aggregate to something not aligned with human values, different from a foom scenario?
You may need to get the initial AGI right even if there's no foom. If it's not aligned, it could still copy itself onto other computers (a virus can already do this) and then gradually assemble the innovations from other AI teams, as well as other resources. (It may even be able to gradually outpace the research of others who, out of safety concerns, don't use AI to help with their progress.) Then by the time some people are ready to build an aligned AI with significantly-beyond-human intelligence, one that could cause very serious damage, we would also have an unaligned version around too.
My expectation is that an aligned superintelligent AI can build and maintain the kind of computer simulation I described without having to do much work (or rather, without having to use up a large fraction of resources), so most of the resources of a solar system can go towards actually computing what happens in the simulation.
>But compared to a solar system run in a way designed to maximise wealth, I think the aligned scenario has much much less conscious experience.
Each unit of matter/energy can be used to do a certain amount of computation before it's degraded into, e.g., waste heat that is radiated into interstellar space. Then each conscious experience presumably takes a certain amount of computation to create. If an aligned solar system has much less conscious experience (summed over time) there must be matter/energy left over unused. Why doesn't the aligned AI use them to run more of the simulation?
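As a rough way to see the accounting behind this, here is a back-of-envelope sketch; every figure is a placeholder, not an estimate:

```python
# Back-of-envelope accounting behind the argument (all numbers are placeholders).
# Total conscious experience is bounded by the energy captured, times the
# computation extractable per unit of energy, divided by the cost of one experience.

energy_budget = 1.0e40          # total usable energy over the system's lifetime (placeholder)
ops_per_unit_energy = 1.0e20    # computation extractable per unit of energy (placeholder)
ops_per_experience = 1.0e15     # computation needed for one unit of experience (placeholder)

max_experiences = energy_budget * ops_per_unit_energy / ops_per_experience

# If the aligned solar system realizes only a fraction f of this bound, the
# remaining (1 - f) of the energy budget is simply left unused; that is the
# puzzle: why wouldn't the aligned AI spend it on running more of the simulation?
f = 0.1
unused_energy = (1 - f) * energy_budget
print(max_experiences, unused_energy)
```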