Deadline: 18 January 2025
The Cooperative AI Foundation is seeking proposals for research projects in cooperative AI.
Priority Research Areas
- High Priority: Understanding and Evaluating Cooperation-Relevant Propensities
- In this area, they would like to see proposals for work on definitions, metrics and methods of evaluation for cooperation-relevant propensities of AI systems. Such work would be important to create a foundation for the development of AI systems with desirable, cooperative properties. They are interested in both cooperation-enhancing and conflict-prone propensities; in some cases the same propensity can be either, depending on the context (e.g. punitiveness).
- High Priority: Understanding and Evaluating Cooperation-Relevant Capabilities
- In this area, they would like to see proposals for work on definitions, metrics and methods for the evaluation of cooperation-relevant capabilities of AI systems. They believe that work on cooperation-relevant capabilities will be important to create a foundation for the development of AI systems with desirable, cooperative properties.
- Incentivizing Cooperation Among AI Agents
- In this area, they would like to see proposals that address the question of how cooperation can be incentivized among self-interested AI agents in mixed-motive settings. They expect such work to be important in finding approaches that lead to societally beneficial outcomes when advanced AI agents with conflicting goals are deployed in the real world.
- AI for Facilitating Human Cooperation
- In this area, they would like to see proposals to develop AI tools that help humans resolve major cooperation challenges. AI systems could have a huge positive impact by helping humans cooperate, for example by virtue of a greater ability to identify mutually beneficial agreements or to create novel institutional designs.
- Collusion
- In this area, they would like to see work that studies what can go wrong when AI agents work together in ways that are unwanted or unexpected. Collusion (undesired cooperation) between agents could, for example, lead them to bypass safeguards or laws. They believe work in this area will be important for monitoring and governance as deployment of advanced and interacting AI systems becomes more widespread throughout society.
- Monitoring and Controlling Dynamic Networks of Agents and Emergent Properties
- In this area, they would like to see proposals for work that can improve the understanding of some specific types of multi-agent dynamics involving advanced AI systems. This includes emergent phenomena (behaviours, goals and capabilities) that are not present in the individual agents or systems, but that arise specifically in the multi-agent system. They believe that this kind of work will be important to identify, monitor and mitigate new risks that arise as deployment of advanced and interacting AI systems becomes more widespread throughout society.
- Information Asymmetries and Transparency
- In this area, they would like to see proposals for work on how information and transparency affect cooperation. Information asymmetries (both strategic uncertainty about what agents will do and structural uncertainty about private information that others hold) are a prime cause of cooperation failure. AI agents are adept at processing vast amounts of information and have features (such as being defined by software) that might allow them to overcome challenges that humans face in this area.
- Multi-Agent Security
- In this area, they would like to see proposals for work on security challenges that arise as advanced and interacting AI systems become more widespread throughout society. Often these challenges will overlap somewhat with the other research areas.
Cost Covered
- The aim is to be able to cover all costs for completing accepted projects. This could include:
- Personnel costs for research staff;
- Materials (including software and compute);
- Travel expenses;
- Publication expenses.
- They allow a maximum of 10% in indirect costs (overhead). They do not cover personnel costs for teaching. They do not have a fixed upper limit on the size of funding requests they consider, but cost-effectiveness is important, and they regularly reject proposals where the costs are out of proportion to the expected impact. At the same time, applicants should not shy away from ambition: if a more ambitious version of your project could have many times the impact for a less-than-proportional increase in costs, they encourage you to highlight this to them. In the past they have worked with applicants to fund both more modest and more ambitious versions of their original proposal.
Early-Career Track
- Early-career researchers can also apply to the early-career track. Such projects should have a budget of at most GBP 100,000, can last up to 12 months, and should be carried out primarily by a single individual (rather than by a team).
- They expect most early-career applicants to apply within 2-3 years of completing their PhD (or to be at a similar career stage if they do not have a PhD), but they are open to receiving applications to this track from slightly more junior and slightly more senior researchers.
- The difference between the early-career track and a regular application is that the assessment will also consider the extent to which the grant would further the career of a promising researcher, in addition to the merits and expected impact of the project itself.
Specific Work They’d Like to Fund
- High Priority: Understanding and Evaluating Cooperation-Relevant Propensities
- Defining and measuring cooperation-relevant propensities:
- Theoretical work on the identification and definition of cooperation-relevant propensities, including the development of rigorous arguments regarding whether those propensities are desirable for AI systems (in general, or under specific conditions relevant to important real-world cases).
- Empirical work on methods for evaluating or measuring such cooperation-relevant propensities in frontier AI systems (a toy measurement sketch follows this list).
- Investigating the causes of cooperation-relevant propensities:
- Theoretical work on how cooperation-relevant propensities could arise in realistic AI systems, for example through training processes (including fine-tuning and in-context learning).
- Empirical work on how cooperation-relevant propensities arise in real systems.
- Investigating how robust cooperation-relevant propensities are to further optimization pressure.
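To make "measuring a propensity" concrete, here is a minimal sketch: it estimates two toy propensity metrics for simple policies in an iterated prisoner's dilemma, a cooperation rate and a retaliation rate (a crude proxy for punitiveness). The game, the policy interface and both metrics are illustrative assumptions, not anything prescribed by the call; real evaluations of frontier systems would need far richer environments and elicitation methods.

```python
import random

def tit_for_tat(history):
    """Cooperate first, then copy the partner's previous move."""
    return "C" if not history else history[-1][1]

def grim(history):
    """Cooperate until the partner defects once, then defect forever."""
    return "C" if all(partner == "C" for _, partner in history) else "D"

def propensity_metrics(policy, partner, rounds=200):
    """Estimate two toy cooperation-relevant propensities of `policy`:
    its cooperation rate, and its retaliation rate, i.e. how often it
    defects immediately after the partner defected (punitiveness proxy)."""
    history = []  # (own_move, partner_move) pairs from `policy`'s point of view
    coop = retaliations = provocations = 0
    for _ in range(rounds):
        a = policy(history)
        b = partner([(p, o) for o, p in history])  # mirror the history for the partner
        if history and history[-1][1] == "D":
            provocations += 1
            retaliations += (a == "D")
        coop += (a == "C")
        history.append((a, b))
    return coop / rounds, retaliations / max(provocations, 1)

random.seed(0)
noisy = lambda h: "D" if random.random() < 0.2 else "C"  # defects 20% of the time
print(propensity_metrics(tit_for_tat, noisy))  # cooperative but fully retaliatory
print(propensity_metrics(grim, noisy))         # punitive: never forgives a defection
```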
- High Priority: Understanding and Evaluating Cooperation-Relevant Capabilities
- Defining and measuring cooperation-relevant capabilities:
- Theoretical work on the identification and definition of cooperation-relevant capabilities, including the development of rigorous arguments on whether those capabilities are desirable for AI systems (in general, or under specific conditions relevant to important real-world cases).
- Empirical work on methods for evaluating or measuring such cooperation-relevant capabilities in frontier AI systems.
- Investigating the causes of cooperation-relevant capabilities:
- Theoretical work on how cooperation-relevant capabilities could arise in realistic AI systems, for example through training processes (including fine-tuning and in-context learning).
- Empirical work on how cooperation-relevant capabilities arise in real systems.
- Theoretical and empirical work investigating the extent to which the differential development of beneficial cooperative capabilities is possible. Such work could include assessments of whether certain capabilities should be expected to affect cooperation in a net-positive or net-negative direction, either in general or in specific settings.
- Theoretical and empirical work on how asymmetries in agents’ capabilities and/or bounded rationality could affect cooperation.
- Incentivizing Cooperation Among AI Agents
- Theoretical and empirical work on peer (i.e. decentralised) incentivization:
- Development of realistic assumptions and models about methods of peer incentivization (e.g., monetary) and domains of application.
- Understanding and building the infrastructure required for decentralised (third-party) norm enforcement.
- Scalable and secure methods for inter-agent commitments and contracting.
- Minimising inefficiencies from sanctions (a toy sanctioning model follows this list).
- Scaling of methods to incentivize cooperation:
- Scaling opponent-shaping and peer incentivization to more complex agents and environments (including LLM agents).
- Approaches in automated/adaptive mechanism design (i.e. centralised forms of incentivization) that focus on scaling to very large numbers of agents and/or much more complex agents and environments (including LLM agents).
- Conceptual and engineering work on designing infrastructure for interactions between agents that incentivizes cooperation (e.g., that supports the development and implementation of prosocial norms and commitments).
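As a toy illustration of how decentralised sanctions can change incentives, the sketch below implements one round of a linear public goods game with optional peer punishment; the payoff parameters and the punishment rule are invented for this example and are not from the call.

```python
def payoffs(contributions, multiplier=1.6, fine=1.5, fee=0.5, punish=False):
    """One round of a linear public goods game. Each agent keeps its
    endowment of 1 minus its contribution and receives an equal share of
    the multiplied common pool. With punish=True, every contributor pays
    `fee` per free-rider to fine each free-rider by `fine` per contributor,
    a toy model of decentralised peer sanctioning."""
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    result = [1 - c + share for c in contributions]
    if punish:
        contributors = [i for i, c in enumerate(contributions) if c > 0]
        freeriders = [i for i, c in enumerate(contributions) if c == 0]
        for i in contributors:
            result[i] -= fee * len(freeriders)
        for j in freeriders:
            result[j] -= fine * len(contributors)
    return result

# Without sanctions the free-rider earns most; with peer punishment,
# free-riding becomes the worst-paying strategy.
print(payoffs([1, 1, 1, 0]))               # [1.2, 1.2, 1.2, 2.2]
print(payoffs([1, 1, 1, 0], punish=True))  # [0.7, 0.7, 0.7, -2.3]
```

Note that total welfare drops in the punished outcome: this deadweight loss is exactly the kind of inefficiency the "minimising inefficiencies from sanctions" item above targets.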
- AI for Facilitating Human Cooperation
- Development of AI tools for improving collective decision-making in settings where the humans involved have conflicting interests. This includes the use of AI tools for policy development.
- Development of AI tools to improve the collective outcomes of negotiation, bargaining and conflict resolution processes (a toy bargaining example follows this list).
- Development of AI tools to design socially beneficial institutions or public services for enabling cooperation.
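As a hedged sketch of one computation such a tool might perform, the snippet below selects, from a hand-written list of candidate deals, the agreement that maximises the Nash bargaining product; the deals and utility numbers are invented for illustration, and a real tool would need to elicit or learn the parties' preferences.

```python
def nash_bargaining(options, disagreement):
    """Pick the agreement maximising the Nash product (u1 - d1) * (u2 - d2)
    among options both parties weakly prefer to the disagreement point.
    `options` is a list of (label, u1, u2) triples."""
    d1, d2 = disagreement
    feasible = [o for o in options if o[1] >= d1 and o[2] >= d2]
    if not feasible:
        return None  # no agreement beats walking away
    return max(feasible, key=lambda o: (o[1] - d1) * (o[2] - d2))[0]

# Hypothetical deals with hand-written utilities for parties A and B.
deals = [
    ("A keeps the asset, pays B 40", 60, 40),
    ("B keeps the asset, pays A 55", 55, 45),
    ("split ownership 50/50", 48, 48),
]
print(nash_bargaining(deals, disagreement=(30, 30)))  # "B keeps the asset, pays A 55"
```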
- Collusion
- Development of methods for detection of collusion between AI systems, including steganographic collusion. Such work could, for example, build on information-theoretic measures or interpretability tools (a minimal information-theoretic example follows this list).
- Development of mitigation strategies for preventing collusion such as oversight regimes, methods for steering agents, restricted communication protocols and/or control of communication channels.
- Theoretical and empirical work that aims to provide general results about the conditions that make collusion easier or harder between AI agents, such as the similarity of the agents, forms of communication, the number of agents, the environmental structure, agents’ objectives, etc.
- Development of benchmarks and evaluations for measuring AI agents’ ability and propensity to collude. They would be particularly interested in work on creating a major, complex benchmark environment for collusion. Such a benchmark could assess capabilities and/or propensities that influence the likelihood of collusion, and should be ambitious in its aims to significantly advance research on collusion.
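Purely as an illustration of an information-theoretic collusion signal, the sketch below computes the empirical mutual information between two agents' discrete action logs: independent agents score near zero bits, while agents that copy each other score much higher. The two-action alphabet and the copying model are assumptions invented for this example.

```python
import random
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two equal-length
    sequences of discrete symbols, e.g. two agents' action logs."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

random.seed(1)
# Two pricing agents choosing between "H"igh and "L"ow prices.
agent_a = [random.choice("HL") for _ in range(1000)]
agent_b = [random.choice("HL") for _ in range(1000)]            # independent
agent_c = [a if random.random() < 0.9 else random.choice("HL")  # mostly copies A
           for a in agent_a]
print(mutual_information(agent_a, agent_b))  # ~0 bits: no dependence
print(mutual_information(agent_a, agent_c))  # well above 0: a collusion red flag
```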
- Monitoring and Controlling Dynamic Networks of Agents and Emergent Properties
- Work on destabilising dynamics, which could aim to answer questions about the conditions under which multi-agent systems involving AI have undesirable dynamics and how such phenomena can be monitored and stabilised. Such work may cover aspects such as how the number of agents, their objectives, and the features of their environment might precipitate undesirable dynamics.
- Work on prevention of correlated failures, that could arise due to similarities and shared vulnerabilities among agents in the multi-agent system. This could include work on the impact of AI agents learning from data generated by each other on shared vulnerabilities, correlated failure modes, and their ability to cooperate/collude.
- Work on which network structures and interaction patterns lead to more robust or fragile networks of AI agents, and the development of tools for overseeing and controlling the dynamics and co-adaptation of networks of advanced AI agents. This might include ‘infrastructure for AI agents’ such as interaction protocols (a toy network model follows this list).
- Theoretical and empirical work on establishing the conditions under which unexpected and undesirable goals and capabilities might emerge from multiple AI agents, how robust such phenomena are, and how quickly they can occur. Comparisons across specific scenarios could help to establish conditions under which these emergent phenomena are more likely, such as the degree of competition, complementarity of agents, access to particular resources, or task features.
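As a toy model of how network structure can make multi-agent dynamics more robust or fragile, the sketch below runs imitate-the-best dynamics on a ring versus a complete graph; the game, payoff parameter and update rule are all illustrative assumptions.

```python
import random

def imitation_dynamics(neighbors, rounds=50, b=1.8, seed=0):
    """Toy spatial cooperation model. Each agent earns 1 from every
    cooperating neighbour if it cooperates itself, or `b` > 1 if it
    defects; each round every agent then copies the strategy of the
    best-scoring agent in its neighbourhood (including itself).
    Returns the fraction of cooperators after each round."""
    rng = random.Random(seed)
    strat = {i: rng.random() < 0.5 for i in neighbors}  # True = cooperate
    fractions = []
    for _ in range(rounds):
        score = {i: sum((1.0 if strat[i] else b) * strat[j]
                        for j in neighbors[i]) for i in neighbors}
        strat = {i: strat[max(list(neighbors[i]) + [i], key=score.get)]
                 for i in neighbors}
        fractions.append(sum(strat.values()) / len(strat))
    return fractions

n = 20
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
complete = {i: [j for j in range(n) if j != i] for i in range(n)}
# Clusters of cooperators can survive on the sparse ring, while the
# densely connected graph typically collapses to universal defection.
print("ring:", imitation_dynamics(ring)[-1])
print("complete:", imitation_dynamics(complete)[-1])
```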
- Information Asymmetries and Transparency
- Work on how the potential transparency and/or predictability of agents (e.g. through black-box or white-box access to their source code) can be used to understand and control the extent to which they cooperate. This predictability might emerge due to the similarity of agents and their ability to reason about each other.
- Work on scaling automated information design (e.g. “Bayesian persuasion”) to more complex agents and environments (including LLM agents).
- Implementing and scaling methods for secure information transmission/revelation between AI agents that enable cooperation. This might include work on the ability of agents to conditionally reveal and verify private information (a minimal commit-reveal sketch follows this list).
- The development of efficient algorithms for few-shot coordination in high-stakes scenarios. This could include theoretical work (for example, establishing the amount of information required to predict the behaviour of other agents) and empirical work (for example, on generalising or applying few-shot coordination algorithms to complex settings and advanced agents).
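As a minimal sketch of conditional revelation and verification of private information, the snippet below uses a hash-based commit-reveal scheme: an agent commits to a private value up front and can later prove it did not change it. The negotiation framing and the committed string are invented for this example, and real deployments would need authenticated channels and more careful protocol design.

```python
import hashlib
import secrets

def commit(value: str) -> tuple[str, str]:
    """Commit to a private value: publish the digest now, keep the
    salt secret until (and unless) the agent chooses to reveal."""
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + value).encode()).hexdigest()
    return digest, salt

def verify(digest: str, salt: str, value: str) -> bool:
    """Check a revealed (salt, value) pair against the earlier commitment."""
    return hashlib.sha256((salt + value).encode()).hexdigest() == digest

# Agent A commits to its private reservation price before bargaining and
# reveals it only if a deal closes; agent B can then verify it was not changed.
digest, salt = commit("reservation_price=42")
print(verify(digest, salt, "reservation_price=42"))  # True
print(verify(digest, salt, "reservation_price=99"))  # False: tampered value
```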
- Multi-Agent Security
- Assessing what security vulnerabilities advanced multi-agent systems have that single-agent systems do not, and developing defence strategies for these vulnerabilities, such as improvements in network design, communication protocol design and/or information security.
- Exploring how combinations of multiple AI systems can overcome existing safeguards for individual systems (and how they can be prevented from doing so).
- Better understanding how robust cooperation is to adversarial attacks (for example, the injection of a small number of malicious agents, or the corruption of key data) in different settings.
Eligibility Criteria
- CAIF staff and trustees are not eligible for grant funding. Advisors, affiliates, and contractors of CAIF are eligible. They will manage conflicts of interest in accordance with their policies; concretely, this means that external reviewers will play an important role in assessing such applications.
- Formal research training and degrees (such as a doctoral degree) tend to strengthen your proposal, but are not strictly required.
- An affiliation can, in many cases, strengthen your proposal, but is not required. Note that processing of applications from unaffiliated individuals may take longer.
- You can be located anywhere in the world. For applicants in countries with a low Corruption Perceptions Index score, processing may take longer due to a more extensive due diligence process.
- The project you propose can be up to two years long and should begin within one year of the application deadline.
- For now, they will not process applications for less than GBP 10,000. This may change in the future.
For more information, visit Cooperative AI Foundation.