Defining Human Values

Summary

Defining human values is a central challenge in AI alignment research: it requires capturing and encoding the complex moral principles, ethical judgments, and social norms that guide human behavior. Researchers have approached the task from several directions, including building datasets of ethical scenarios and judgments (such as the ETHICS dataset and the Commonsense Norm Bank), training models to predict moral judgments (such as Delphi), and exploring philosophical frameworks for acting under normative uncertainty. The difficulty lies not only in identifying universal human values but also in accounting for personal values, divergent moral frameworks, and the contextual nature of ethical decision-making. Any attempt to define human values must also grapple with trade-offs between competing values, the vagueness inherent in moral concepts, and the need for a well-defined procedure for resolving ontological crises as an AI system's world-model changes. Ultimately, the goal is AI systems that make ethically aligned decisions across diverse real-world situations, reflecting the nuanced and sometimes conflicting nature of human values.
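To make the normative-uncertainty idea concrete, one proposal from that literature is to maximize expected choiceworthiness: weight each moral theory's verdict on an action by the credence assigned to that theory, then pick the action with the highest weighted score. The sketch below illustrates the arithmetic only; the two theories, the credences, and the choiceworthiness scores are invented placeholders, not values from any of the papers summarized here, and the deeper problem of whether scores from different theories are even comparable is left aside.

```python
# Minimal sketch of "maximize expected choiceworthiness" under normative
# uncertainty: EC(a) = sum_i p(theory_i) * CW_i(a).
# All theories, credences, and scores are illustrative placeholders.

from typing import Callable, Dict

# Hypothetical choiceworthiness functions: action -> score in [-1, 1].
def utilitarian(action: str) -> float:
    return {"divert_trolley": 0.8, "do_nothing": -0.8}[action]

def deontological(action: str) -> float:
    return {"divert_trolley": -0.5, "do_nothing": 0.2}[action]

# Credences over moral theories (assumed to sum to 1).
credences: Dict[str, float] = {"utilitarian": 0.6, "deontological": 0.4}
theories: Dict[str, Callable[[str], float]] = {
    "utilitarian": utilitarian,
    "deontological": deontological,
}

def expected_choiceworthiness(action: str) -> float:
    """Credence-weighted average of each theory's verdict on the action."""
    return sum(credences[name] * theory(action) for name, theory in theories.items())

best = max(["divert_trolley", "do_nothing"], key=expected_choiceworthiness)
print(best, expected_choiceworthiness(best))  # divert_trolley 0.28
```

Note that this scheme presupposes intertheoretic comparability, i.e. that a 0.8 under utilitarianism means "as much" as a 0.8 under deontology; that assumption is itself contested in the normative-uncertainty literature, which is part of why defining human values remains an open problem.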

Research Papers