Publication No.: US20250045610A1
Publication Date: 2025-02-06
Application No.: US18673036
Filing Date: 2024-05-23
Applicant: SRI International
Inventors: Pedro Daniel Barbosa Sequeira, Haochen Wu
Abstract: In an example, a method includes obtaining data indicating a plurality of trajectories representing a behavior of a team comprising a plurality of agents; obtaining a plurality of baseline profiles, wherein each of the plurality of baseline profiles encodes at least one of a preference and/or a goal that is relevant to a task performed by the team; generating a probability distribution of each agent of the plurality of agents over the plurality of baseline profiles, wherein the probability distribution of each agent describes a behavior of the agent; updating the corresponding probability distribution of each agent of the plurality of agents; and generating, based on the updated probability distributions of the plurality of agents, reward functions that explain the observed joint actions performed by the team, wherein each of the reward functions describes the behavior of a corresponding one of the plurality of agents.
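The steps in the abstract (a per-agent distribution over baseline profiles, an update of that distribution from observed team behavior, and a reward function derived from the updated distribution) can be illustrated with a small sketch. The abstract does not say how the update or the reward generation is performed, so everything below is an assumption made for illustration: the update is a Bayesian one with a softmax (Boltzmann) action likelihood, each agent's reward is the posterior-weighted mixture of the profile rewards, and the task is stateless with three actions. The profile names, reward values, and function names are hypothetical and do not come from the application.

```python
import math

def softmax(values, temperature=1.0):
    """Turn per-action rewards into action probabilities (assumed Boltzmann policy)."""
    m = max(values)
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def update_profile_distribution(prior, profile_rewards, observed_actions, temperature=1.0):
    """Bayesian update of one agent's distribution over baseline profiles.

    prior            -- strictly positive probabilities, one per profile
    profile_rewards  -- profile_rewards[k][a] is the reward of action a under profile k
    observed_actions -- action indices this agent took in the team trajectories
    """
    log_post = [math.log(p) for p in prior]
    for k, rewards in enumerate(profile_rewards):
        policy = softmax(rewards, temperature)
        for a in observed_actions:
            log_post[k] += math.log(policy[a])
    # Normalize in log space for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def mixture_reward(posterior, profile_rewards):
    """Per-action reward as the posterior-weighted mixture of profile rewards (an assumption)."""
    n_actions = len(profile_rewards[0])
    return [
        sum(p * rewards[a] for p, rewards in zip(posterior, profile_rewards))
        for a in range(n_actions)
    ]

# Hypothetical baseline profiles over three actions (values are illustrative only).
profile_rewards = [
    [1.0, 0.0, 0.0],  # profile 0: prefers action 0
    [0.0, 0.0, 1.0],  # profile 1: prefers action 2
]

# Team trajectory as a sequence of joint actions, one entry per agent per step.
joint_actions = [(0, 2), (0, 2), (0, 1)]
n_agents = 2

uniform_prior = [0.5, 0.5]
posteriors = []
rewards = []
for agent in range(n_agents):
    actions = [step[agent] for step in joint_actions]
    posterior = update_profile_distribution(uniform_prior, profile_rewards, actions)
    posteriors.append(posterior)
    rewards.append(mixture_reward(posterior, profile_rewards))

for agent in range(n_agents):
    print(f"agent {agent}: posterior={posteriors[agent]}, reward={rewards[agent]}")
```

In this toy setup the first agent's posterior concentrates on profile 0 and the second agent's on profile 1, so the two derived reward functions differ even though both agents started from the same prior. A full treatment would need states, transition dynamics, and a model of how teammates' actions interact, none of which the abstract specifies.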