arXiv:1809.07066 [cs.LG]

Prosocial or Selfish? Agents with different behaviors for Contract Negotiation using Reinforcement Learning

Vishal Sunder, Lovekesh Vig, Arnab Chatterjee, Gautam Shroff

Published 2018-09-19, Version 1

We present an effective technique for training deep learning agents capable of negotiating over a set of clauses in a contract agreement using a simple communication protocol. We use multi-agent reinforcement learning to train both agents simultaneously as they negotiate with each other in the training environment. We also model selfish and prosocial behavior to varying degrees in these agents, and we provide empirical evidence that the agents' behaviors are consistent. We further train a meta agent with a mixture of behaviors by learning an ensemble of different models using reinforcement learning. Finally, to ascertain the deployability of the negotiating agents, we conduct experiments pitting the trained agents against human players. Results demonstrate that the agents are able to hold their own against human players, often emerging as winners in the negotiation. Our experiments also demonstrate that the meta agent is able to reasonably emulate human behavior.
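
As a rough illustration of what modeling selfish and prosocial behavior "to varying degrees" could look like, the sketch below blends an agent's own payoff with its partner's payoff. This is an assumption made for illustration only: the abstract does not state the reward definition, and the function name `blended_reward` and the `prosociality` weight are hypothetical, not taken from the paper.

```python
def blended_reward(own_utility: float, partner_utility: float,
                   prosociality: float) -> float:
    """Hypothetical reward for one negotiating agent.

    prosociality = 0.0 gives a purely selfish reward (own utility only);
    prosociality = 1.0 weighs the partner's utility as much as the agent's own.
    This weighting scheme is an illustrative assumption, not the paper's formula.
    """
    if not 0.0 <= prosociality <= 1.0:
        raise ValueError("prosociality must lie in [0, 1]")
    return own_utility + prosociality * partner_utility


# Example: a deal worth 3 points to this agent and 5 points to its partner.
selfish = blended_reward(3.0, 5.0, prosociality=0.0)
prosocial = blended_reward(3.0, 5.0, prosociality=1.0)
```

Under this assumed scheme, a selfish agent is indifferent to how the partner fares, while a prosocial agent prefers deals that raise the combined payoff.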

Comments: Proceedings of the 11th International Workshop on Automated Negotiations (held in conjunction with IJCAI 2018)

Categories: cs.LG, cs.AI, cs.MA, stat.ML

Keywords: reinforcement learning, contract negotiation, deep learning agents, human players, meta agent

Related articles:

arXiv:1809.10679 [cs.LG] (Published 2018-09-27)

Definition and evaluation of model-free coordination of electrical vehicle charging with reinforcement learning

Nasrin Sadeghianpourhamami, Johannes Deleu, Chris Develder

arXiv:1809.09095 [cs.LG] (Published 2018-09-23)

On Reinforcement Learning for Full-length Game of StarCraft

Zhen-Jia Pang, Ruo-Ze Liu, Zhou-Yu Meng, Yi Zhang, Yang Yu, Tong Lu

arXiv:1811.01483 [cs.LG] (Published 2018-11-05)

Contingency-Aware Exploration in Reinforcement Learning