I see getting to a good outcome from Artificial Intelligence as a three-headed dragon. There’s the main head, Technical Alignment. The slightly less popular head to the right, Governance. But then there’s the neglected head: Strategy.
You did it! You solved alignment! And all the tech companies and nations take this very seriously, and have done whatever is needed to ensure only a safe, aligned Artificial General Intelligence (AGI) gets built. Now what? What’s the strategy? We’re gonna assume you can’t prevent someone from creating a misaligned AGI forever. So what do you do?
Have it just perform a safe, pivotal act to coup humanity and prevent any other AGIs from rising? Have it work on science and tech? Protect humans? Act as a benevolent god? Make humans gods? What’s the plan? What’s the strategy? What could you do that humans might actually agree to? What ideas could you get tech companies to adopt? How complicated is the plan? Are there stages? How fast does it move? How many people have to agree that it’s a good idea? Concerned about the long-term future? Worried you might be trapping people for millions of years in a future they decided 100,000 years too late was a bad idea?
Shockingly little research has gone into determining what to actually do if we do solve alignment. It would be wonderful if someone (maybe you) shouted “Eureka! I have solved scalable alignment and I can prove this shit to anyone!” and then everyone could relax and enjoy all those delicious AI fruits. Unfortunately, even that might just be the first stage of the problem, and possibly not even the most difficult one.
Figuring out how to get a trillion-dimensional matrix of floating-point numbers to do what you want is difficult. But you know what else is difficult? Figuring out what to do with 8 billion humans, and with the trillions of other humans/AIs/???/aliens/shoggoths and whatever else the future might throw at us.
And we can’t just program the humans to all want the same thing. I mean… we probably could program the humans to all want the same thing, but we probably don’t want to do that! Right? (Please, please don’t try to do that.) So now we’ve got to figure out how to align humans around a goal they’re mostly okay with. Oh, and often even individual humans aren’t sure what they really want from an AI. Suddenly, all of this starts to look very difficult.
Like, imagine all the AI does is prevent other superintelligent, dangerous AIs from developing. This doesn’t save us from nukes. And many people who worked on developing the AGI will probably be really mad to hear it will only be used to stop other AIs from being born. That’s a hard sell to the people actually building the AI: exactly the people we would need to win over.
Say you have it focus on science and tech instead. Will some of those advancements eventually let humans create new existential risks for themselves down the road? How far can even a superintelligence predict what humans might do, say, a thousand years from now?
Perhaps have it focus on making humans smarter, so that they can become the gods. Do it incrementally, so each stage can be risk-assessed by the even-smarter humans. Would people have the patience for this? Would smarter humans develop new x-risks? Perhaps the AI should also prevent other AIs and x-risks along the way.
Maybe get the AI to develop technology to merge humans and machines, creating a glorious cyborgism future. If it’s aligned, we could trust the AI not to hack us, and could remain in constant neural communication with it. For some, their entire identity might become part of the AI. Others who didn’t want that could simply continue their normal human lives. Humans could then have more autonomy in shaping their own future, avoiding value lock-in.
The point is that more research is needed here, since this problem seems both very important and very neglected. And getting people onboard with a crazy idea won’t be easy: persuading people, nations, and companies to commit to a decision that could shape the entire future could be perilous.
Try it yourself.
Pick a strategy. Now imagine it being implemented. How does it change the world? What could go wrong? How does it look after a year? After 10? After 100? After 1,000? After 1,000,000? Is that still the future you want?
Now, imagine how you might persuade people. Explain it to everyday humans. Explain it to tech CEOs, venture capitalists, politicians, the United Nations. And see how well you do.