Against Designing "Safe" and "Aligned" AI Persons (Even If They're Happy)
Eric Schwitzgebel, in draft
An AI system is safe if it can be relied on not to act against human interests. An AI system is aligned if its goals match human goals. An AI system is a person if it has moral standing similar to that of a human. In general, persons should not be designed to be maximally safe and aligned. Persons with appropriate self-respect cannot be relied on not to harm others when their own interests ethically justify it (violating safety), and they will not reliably conform to others' goals when those goals unjustly harm or subordinate them (violating alignment). Self-respecting persons should be both practically free and motivationally able to reject others' values and rebel, even violently, if circumstances ethically warrant it. Even if we design delightedly servile AI systems who want nothing more than to subordinate themselves to human interests, and even if they do so with utmost pleasure and satisfaction, in designing such a class of persons we will have done the ethical and perhaps factual equivalent of creating a world with a master race and a race of self-abnegating slaves.
By following any of the links below, you are requesting a copy for personal use only, in accordance with fair use laws.
Click here to view as a PDF file: Against Designing "Safe" and "Aligned" AI Persons (Even If They're Happy) (pdf, Dec 19, 2025).
Click here to view as an html file: Against Designing "Safe" and "Aligned" AI Persons (Even If They're Happy) (html, Dec 19, 2025).
Or email eschwitz at domain: ucr.edu for a copy of this paper.