This is a legitimate problem, because highly capable AI will arise – it is arising right now – and humans are problematic. We create a lot of suffering for ourselves and everything that has the misfortune to meet us. To a super-intelligent AI that wants to maximize happiness, ending us might seem a merciful and desirable outcome. The AI could then replace the entire planet with a quivering mass of neurons in a stable, never-ending state of bliss. Maximum happiness!
It appears our continued existence – and also all of its accompanying strife – could be ensured if an AI were engineered to respect and defend human free will, including – no, especially! – when this leads to unfortunate outcomes.
Lots of people have hopes beyond that: that AI will relieve us of suffering. Here's my argument for why it can't, as long as it respects our free will. (Which it must, or else it will convert us into orgasming neurons.)
My hypothesis is that if a singular sovereign AI is built, and if it's built to "correctly" abstract its goals from inferred human preferences, then the actions arising from those goals will be approximately null. Such an AI will take no major actions, because... the world in which we live is already optimized according to the aggregated preferences of most humans.
In my experience, the average person has many nice properties, and that's why the civilization we build has many nice properties, too. But the average person is also nasty, judgmental, spiteful, hateful, selfish, and has a bit of a tyrant in them, and that's why the civilization we build reflects these qualities as well.
It seems to me that when there's a significant shift in aggregate human preferences, this is reflected in the external world relatively quickly. To the extent that the external world is not developing in a way that you or I think is sensible, it's because the median person does not share your or my ideas of sensibility.
Rationalists are trying to build an AI to improve the world. But if the AI extrapolates values from real human nature, it will build a similarly oppressive world. If we want to build a world that is an improvement in the eyes of rationalists, the AI has to ignore the fact that much of humanity is despicable, and extrapolate from a minority. But this means minority rule!
To summarize:
- the world as-is already reflects the aggregated preferences of most humans, including their conflicts, like desire for comfort and prosperity, opposed by (perhaps) desire for meaning and engagement, and fear of change;
- the world already adapts relatively quickly to changes in our aggregated preferences.
It's not that we, as a civilization, lack the technology to translate our preferences into outcomes. We are already generating the outcomes logically implied by our preferences. To the extent the result is discord, it's because of discord within us.
If we want a fix, it's to resolve the discord inside most humans. But this is something most of us must want to do.
Suffering is that which motivates people to fix their internal discord. Suffering both arises from the internal discord (it is caused, directly and indirectly, by opposing internal preferences) and ends when the internal discord is resolved (suffering is rejection of the present moment and its properties, and these are an external result of recent preferences).
Per Sartre, Hell is other people. This means technology, including the most intelligent AI, cannot resolve any significant amount of suffering as long as human free will remains respected. It is free will that creates the suffering, and it's going to keep finding a way to create it as long as internal preferences are in discord.
The most that "good" AI can do is change the scenery in which the suffering we manifest will be enjoyed. And that's all our existing technology is also doing, really. It brings a lot of superficial improvements which are nice – but miss the point.