MIT researchers have developed an AI-powered bot that can beat human players at online multiplayer games. It uses deductive reasoning to work out which players are friends and which are foes.
DeepRole is described as the first gaming bot that can win online multiplayer games where participants’ team allegiances are initially unclear.
“If you replace a human teammate with a bot, you can expect a higher win rate for your team. Bots are better partners,” said first author Jack Serrino, who majored in electrical engineering and computer science at MIT.
The work is part of a broader project to better model how humans make socially informed decisions. Doing so could help build robots that better understand, learn from, and work with humans. In the games, player roles and motives are kept secret.
“Games like Avalon better mimic the dynamic social settings humans experience in everyday life,” said co-author Max Kleiman-Weiner. “You have to figure out who’s on your team and will work with you, whether it’s your first day of kindergarten or another day in your office.”
In Avalon, three players are randomly and secretly assigned to a ‘resistance’ team and two players to a ‘spy’ team. Both spy players know all players’ roles. During each round, one player suggests a subset of two or three players to execute a mission.
All players simultaneously and publicly vote to approve or disapprove the subset. If a majority approve, the members of the subset each secretly choose whether the mission should succeed or fail. If every member chooses ‘succeed’, the mission succeeds; if even one ‘fail’ is selected, the mission fails.
Resistance players must always choose to succeed, but spy players may choose either outcome. The resistance team wins after three successful missions; the spy team wins after three failed missions.
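The rules above can be captured in a few lines of Python. This is only an illustrative sketch of the five-player game as described here, not code from DeepRole; the function names (vote_passes, mission_succeeds, winner) are made up for the example.

```python
from collections import Counter


def vote_passes(votes):
    """Return True if a strict majority of the public votes approve the team."""
    return sum(votes) > len(votes) / 2


def mission_succeeds(choices):
    """A mission succeeds only if no team member secretly chose 'fail'."""
    return all(choice == "succeed" for choice in choices)


def winner(mission_results):
    """Return 'resistance', 'spy', or None for a list of outcomes (True = success)."""
    tally = Counter(mission_results)
    if tally[True] >= 3:
        return "resistance"
    if tally[False] >= 3:
        return "spy"
    return None  # game still in progress
```

For example, a three-player mission in which a single spy chooses ‘fail’ is a failed mission, and the game ends as soon as either side reaches three missions in its favour.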
Winning the game comes down to deducing who is resistance and who is spy, and voting for your collaborators. “It’s a game of imperfect information,” said Kleiman-Weiner. “You’re not even sure who you’re against when you start, so there’s an additional discovery phase of finding whom to cooperate with.”
DeepRole uses a game-planning algorithm called counterfactual regret minimisation (CFR), which learns to play a game by repeatedly playing against itself, augmented with deductive reasoning. At each point in a game, CFR looks ahead to create a decision ‘game tree’ of lines and nodes describing the potential future actions of each player.
In playing out potentially billions of game simulations, CFR notes which actions increased or decreased its chances of winning, and iteratively revises its strategy to include more good decisions.
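The core idea of revising a strategy from regret can be shown with regret matching, the update rule that CFR applies at each decision point. The sketch below is a toy, not DeepRole’s implementation: it uses rock-paper-scissors instead of Avalon, has no game tree, and the starting bias toward rock is an arbitrary assumption chosen so that self-play has something to correct.

```python
# Payoff to a player in rock-paper-scissors: PAYOFF[my_action][their_action].
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=10000):
    """Run two regret-matching players against each other; return average strategies."""
    # Player 0 starts with a bias toward rock (illustrative assumption).
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3)) for a in range(3)
            ]
            expected = sum(strategies[player][a] * action_values[a] for a in range(3))
            for a in range(3):
                # Regret: how much better action a would have done than the current mix.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]
```

Calling self_play() returns each player’s average strategy over all iterations. In a zero-sum game like this one, those averages approach the equilibrium of playing each move about a third of the time, even though the strategy on any single iteration may swing between moves.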
The bot is trained by playing against itself as both resistance and spy. At each mission, the bot looks at how each person played in comparison with the ‘game tree’.
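One way to picture this kind of deduction is as a belief over every possible pair of spies, updated after each mission. The sketch below is a simplified illustration under stated assumptions, not the method from the DeepRole paper: the names are hypothetical, and the fixed spy_fail_prob of 0.8 stands in for the action probabilities a real bot would read from its learned strategy.

```python
from itertools import combinations
from math import comb

PLAYERS = range(5)


def uniform_prior():
    """One hypothesis per way of choosing 2 spies out of 5 players."""
    hypotheses = list(combinations(PLAYERS, 2))
    return {spies: 1.0 / len(hypotheses) for spies in hypotheses}


def likelihood(spies, team, fails, spy_fail_prob):
    """Probability of seeing `fails` fail cards if `spies` are the true spies."""
    spies_on_team = len(set(spies) & set(team))
    if fails > spies_on_team:
        return 0.0  # resistance players can never choose 'fail'
    return (comb(spies_on_team, fails)
            * spy_fail_prob ** fails
            * (1 - spy_fail_prob) ** (spies_on_team - fails))


def update(belief, team, fails, spy_fail_prob=0.8):
    """Bayesian update of the belief after observing a mission's fail count."""
    posterior = {s: p * likelihood(s, team, fails, spy_fail_prob)
                 for s, p in belief.items()}
    total = sum(posterior.values())
    if total == 0:
        raise ValueError("observation is impossible under every hypothesis")
    return {s: p / total for s, p in posterior.items()}


def spy_probability(belief, player):
    """Marginal probability that `player` is a spy."""
    return sum(p for spies, p in belief.items() if player in spies)
```

If players 0 and 1 go on a mission and one ‘fail’ comes back, every hypothesis in which neither of them is a spy drops to zero, and both become more suspicious than the three players who stayed behind.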
It also uses the same technique to evaluate how a third-person observer might interpret its own actions.