Towards AIs with Deep Human-Compatible Values

Recent advances in generative artificial intelligence (AI) have prompted extensive public discussion of AI risks, guardrails, and regulatory proposals. A major concern is that AIs lack human-compatible values.

Meanwhile, the nature of values has received surprisingly little consideration in AI research, education, and public discussion. Instead, the most salient questions and insights about values come from other disciplines. For example:

  • What are values?
  • What are the origins and utilities of values?
  • What are human-compatible values?
  • Why do different groups of people have different values?
  • How are values acquired?
  • When do groups change their values?
  • How can AIs be created that maintain alignment with human-compatible values?

Contrary to common presumptions, human values are neither universal nor static. Different groups of people operate in different contexts, do different things, and have different values. People learn deep competences and deep values by interacting with different environments and with each other. The utility of many important values is that they enable people to share work by collaborating effectively. To work gracefully with people and groups, future AIs will need to do likewise. Such AIs could potentially help groups create and evaluate novel actions, prioritize competing values, and thrive in changing contexts and situations.

Because they do not comprehend the rich, nuanced, diverse, and changing nature of competences and values, AI regulators and other stakeholders often overestimate the near-term prospects for manually created guardrails that are robust and general. They may also underestimate the potential power of value-aware collaborative AIs to help groups thrive during change.

Stefik, M. (2024). Towards AIs with Deep Human-Compatible Values. [Dropbox link]
