
A Lightweight Agent-Based Multi-Collaboration System

We hope to build a system that is lightweight, customizable, and capable of social reasoning: anticipating what other agents will do through the lens of adversarial collaboration. We are building a generalized agent model for every agent in this environment, hopefully moving a step closer to a level-one agent.

Using language models' in-context learning abilities, we are developing a framework for multi-agent collaboration using a Role-Playing Leader-Hallucinations system, or RPLH for short. It is well known that performance tends to improve as parameter count increases; however, only limited work has been done on bringing this kind of in-context learning ability to less parameterized models. Our work addresses this issue: we design an efficient, lightweight system that runs directly on a MacBook laptop or through API calls to smaller/cheaper GPT models such as GPT-4o-mini. More importantly, the system scales up in both the complexity of the environment and the number of agents.

Intelligence In Data Flow & Structured Environment

We believe that the key to collaboration lies in the communication process and in the data we can extract from that communication, which lets an agent gradually build up its model of the other agents.

We study a task that is particularly hard for smaller, less parameterized models but interesting because the LM is embodied in a structured environment where actions are not unstructured strings (as in, e.g., a Mafia-style reasoning environment). The HCA agent must therefore reason from each local agent's actions to deduce what they may be thinking and whether or not they are the spy agent. Furthermore, beyond the scalability of the environment itself (2x2, 3x3, 4x4, 5x5, ...) even for smaller language models, the structured setting matters: since most social-reasoning or agent-simulation tasks require the agent to operate in a structured environment, ours can be viewed as a base case for scaling and transferring to other, more complicated structured virtual environments for social reasoning (e.g., VirtualHome).
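For concreteness, here is a minimal sketch, with illustrative names only (GridState and its fields are not the actual RPLH code), of how such a scalable box-and-target grid state might be represented. Squares are indexed by their centers, as in the agent example below, and the task is solved when every box sits on a matching-colored target:

from dataclasses import dataclass, field

@dataclass
class GridState:
    size: int                                    # e.g. 2 for a 2x2 grid
    boxes: dict = field(default_factory=dict)    # square center -> box colors
    targets: dict = field(default_factory=dict)  # square center -> target colors

    def solved(self) -> bool:
        """True when every box shares a square with a matching target."""
        return all(
            color in self.targets.get(sq, [])
            for sq, colors in self.boxes.items()
            for color in colors
        )

# A 2x2 instance: one green box that must reach the green target.
state = GridState(
    size=2,
    boxes={(0.5, 0.5): ["green"]},
    targets={(1.5, 1.5): ["green"]},
)
print(state.solved())  # False until the box is moved to square (1.5, 1.5)

Scaling the environment then amounts to increasing size and adding more boxes and targets, without changing the state representation.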

Social Reasoning Agent

In a decentralized environment, each agent must usually communicate what it can see and what it can do in order to collaborate and act under partial observability. Here is an example of such partial information, describing what a specific local agent can see and do.

Agent[0.5, 0.5]:
- I am in square[0.5, 0.5],
- I can observe ['box_green', 'target_orange'],
- I can do one of the following actions: ['move(box_green, square[1.5, 0.5])', 'move(box_green, square[0.5, 1.5])']
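As a small illustration (this is a sketch, not the actual RPLH serialization code), a local observation like the one above could be rendered into a message with a helper along these lines:

def describe_agent(square, observed, actions):
    """Serialize one agent's local view into the message format above."""
    x, y = square
    lines = [
        f"Agent[{x}, {y}]:",
        f"- I am in square[{x}, {y}],",
        f"- I can observe {observed},",
        f"- I can do one of the following actions: {actions}",
    ]
    return "\n".join(lines)

print(describe_agent(
    (0.5, 0.5),
    ["box_green", "target_orange"],
    ["move(box_green, square[1.5, 0.5])", "move(box_green, square[0.5, 1.5])"],
))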

In contrast, we use a fully observable environment, meaning every agent can see everything (where all the boxes and all the targets are). However, the environment remains only partially actuatable, since there are limits on what each agent can do. Even in this setting, a common problem in the usual decentralized collaboration setup (where each agent has its own perspective and does not simply do whatever the central agent says) is that it is extremely hard for different agents to reach agreement on a plan, especially as the number of agents scales.

Our system proposes a novel approach to this problem: each HCA agent conducts social reasoning about what the other agents would think and do, and how they would react to its plan, then receives actual sensory feedback from the local agents to understand what was wrong with its previous reasoning. Ideally, over more iterations, the HCA gradually builds a more accurate model of the other agents. The setting becomes even more complex when different local agents take turns playing the role of the HCA, meaning they can incorporate their own attitudes when issuing actions.

Crucially, an adversarial agent sometimes does not "speak much," so most of the social reasoning may need to come from judging agents' behaviors.
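As a rough illustration of the propose-critique-revise loop described above, the round structure might look like the following sketch; call_llm, the prompt plumbing, and the agreement check are placeholders we introduce here, not the real RPLH API:

def call_llm(prompt: str) -> str:
    """Placeholder for an API call (e.g. GPT-4o-mini) or a local model."""
    raise NotImplementedError

def rplh_round(hca_prompt: str, local_prompts: list[str], max_iters: int = 3) -> str:
    # The HCA proposes a joint plan, hallucinating how each local
    # agent would think about and react to it.
    plan = call_llm(hca_prompt)
    for _ in range(max_iters):
        # Each local agent critiques the plan from its own perspective.
        feedback = [call_llm(p + "\nProposed plan:\n" + plan) for p in local_prompts]
        if all("agree" in f.lower() for f in feedback):
            break  # consensus reached; stop hallucinating and act
        # The HCA revises its model of the other agents using the feedback.
        plan = call_llm(hca_prompt + "\nFeedback:\n" + "\n".join(feedback))
    return plan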

Adversarial Reasoning Agent

Beyond the standard reasoning that LMs perform in multi-agent collaboration, our system (specifically h_efficient and d_efficient) supports agent-based reasoning with adversarial players in the environment whose objective differs from that of all the other agents. Hypothetically, in an adversarial situation (a spy exists in the environment), a hallucination-efficient system should improve social understanding and hence the convergence rate (since adversarial players will try their best to disrupt the system). With more interaction, the HCA may get a better idea of who the spy is and take action to thwart the spies.

Unlike traditional methods that learn from an adversarial environment, our agent learns adversarially within a collaborative environment.
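Since the spy may say little, behavior is often the main signal. One simple way (our illustration, not the system's actual mechanism) for the HCA to accumulate suspicion is to count how often each agent passes up every action that would help the shared objective:

from collections import defaultdict

suspicion = defaultdict(int)  # agent id -> count of unhelpful choices

def update_suspicion(agent_id: str, chosen_action: str, helpful_actions: set):
    """Increment suspicion when an agent passes up every helpful action."""
    if chosen_action not in helpful_actions:
        suspicion[agent_id] += 1

# Example: this agent moved the box away from its target square.
update_suspicion("Agent[0.5, 0.5]",
                 "move(box_green, square[1.5, 0.5])",
                 {"move(box_green, square[0.5, 1.5])"})
likely_spy = max(suspicion, key=suspicion.get)  # most-suspected agent so far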

RPLH System Demos:

We have made a demo notebook of how LMs talk to each other.

Here is a demo of our RPLH performing multi-agent reasoning, with the first HCA (central) agent hallucinating about future steps:

  • Conversation demos are here
  • Actual running data are here

The rendering notebook can also be accessed here.