Agency and Intentionality

Summary

Research on agency and intentionality in AI alignment asks when a system is best described as an agent and when it is merely a device. Drawing on Dennett’s physical and intentional stances, researchers have formalized this distinction in computational terms: an agent is characterized by the goal or function it optimizes, while a device is characterized by its input-output mapping. The distinction matters for understanding and predicting system behavior, because Bayesian reasoning can then be used to estimate, from observed actions, the probability that a system is an agent rather than a device. For AI alignment, this framework clarifies when an artificial system should be expected to show goal-directed behavior, which is essential for ensuring that AI systems act in accordance with human values and intentions.
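
The Bayesian comparison described above can be illustrated with a small toy sketch in Python. It is not the formalization used in the research itself: it compares just one hypothetical agent hypothesis (a noisily rational goal-seeker on a five-cell line) against one hypothetical device hypothesis (a fixed "always move left" mapping). The world, the names `utility`, `DEVICE_MAPPING`, and `posterior_agent`, and the parameters `beta`, `noise`, and `prior_agent` are all assumptions made for illustration.

```python
import math

GOAL = 4                      # assumed goal cell on a line of cells 0..4
ACTIONS = ("left", "right")   # assumed action set


def step(position, action):
    """Move one cell left or right, staying inside 0..GOAL."""
    move = 1 if action == "right" else -1
    return min(max(position + move, 0), GOAL)


def utility(position, action):
    """Assumed agent goal: end up as close to GOAL as possible."""
    return -abs(GOAL - step(position, action))


# Assumed device: a fixed input-output mapping that always outputs "left".
DEVICE_MAPPING = {position: "left" for position in range(GOAL + 1)}


def agent_likelihood(observations, beta=3.0):
    """P(observed actions | agent), with softmax-rational action choice."""
    p = 1.0
    for position, action in observations:
        weights = {a: math.exp(beta * utility(position, a)) for a in ACTIONS}
        p *= weights[action] / sum(weights.values())
    return p


def device_likelihood(observations, noise=0.05):
    """P(observed actions | device): follows the mapping, rarely deviates."""
    p = 1.0
    for position, action in observations:
        if DEVICE_MAPPING[position] == action:
            p *= 1.0 - noise
        else:
            p *= noise / (len(ACTIONS) - 1)
    return p


def posterior_agent(observations, prior_agent=0.5):
    """Bayes' rule over the two hypotheses: agent versus device."""
    agent_term = prior_agent * agent_likelihood(observations)
    device_term = (1.0 - prior_agent) * device_likelihood(observations)
    return agent_term / (agent_term + device_term)


# Observed (position, action) pairs for two hypothetical systems.
goal_seeking = [(0, "right"), (1, "right"), (2, "right"), (3, "right")]
habitual = [(3, "left"), (2, "left"), (1, "left")]

p_goal_seeking = posterior_agent(goal_seeking)  # close to 1: agent-like
p_habitual = posterior_agent(habitual)          # close to 0: device-like
print(p_goal_seeking, p_habitual)
```

In this sketch, behavior that steadily approaches the goal is far better explained by the agent hypothesis, while behavior that matches the fixed mapping is better explained by the device hypothesis. A fuller treatment would weigh many candidate goals and many candidate mappings rather than one of each.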

Research Papers