Benchmarking Large Language Models for Multi-Agent Systems: A Comparative Analysis of AutoGen, CrewAI, and TaskWeaver

14/08/2024

The conference paper “Benchmarking Large Language Models for Multi-Agent Systems: A Comparative Analysis of AutoGen, CrewAI, and TaskWeaver” was presented at the 22nd International Conference on Practical Applications of Agents and Multi-Agent Systems (PAAMS), held in Salamanca, Spain, in June 2024. ISEP's participation allowed the dissemination of knowledge acquired and explored within the scope of the PRODUTECH R3 – WP13 project.

The paper presents a comparative study on the potential of integrating large language models (LLMs) into multi-agent systems, demonstrating significant advances in their capacity for collaborative problem solving and code generation.

The study, which focused on generating code for energy forecasting models, evaluated three open-source multi-agent frameworks: AutoGen, CrewAI, and TaskWeaver. Each framework was powered by different LLMs and tested on its ability to create deep learning models that predict energy consumption. The results were promising: all three frameworks managed to generate functional code.

The TaskWeaver framework, using the GPT-3.5 LLM, achieved a root mean square error (RMSE) of 25.04. The study demonstrates that combining multi-agent systems with LLMs makes it possible to solve complex problems and to validate the LLMs' responses.
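For readers unfamiliar with the metric, RMSE is the square root of the mean of the squared differences between predicted and observed values, so lower is better. The short Python sketch below illustrates the calculation only. The function name `rmse` and the sample consumption values are hypothetical and are not taken from the paper, whose data and evaluation code are not reproduced here.

```python
import math


def rmse(actual, predicted):
    """Return the root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each forecast error, average the squares, then take the root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical energy consumption readings and forecasts (illustrative only).
actual = [100.0, 120.0, 90.0, 110.0]
predicted = [110.0, 115.0, 95.0, 100.0]

print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large forecasting misses raise the RMSE more than many small ones.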

The results could allow code for problem solving or process automation to be generated automatically, without resorting to developers. This publication by the ISEP team, Rafael Barbarroxa, Luis Gomes, and Zita Vale, will be available in the PAAMS 2024 proceedings.

This work has been supported by the European Union under Next Generation EU, through a grant of the Portuguese Republic's Recovery and Resilience Plan (PRR) Partnership Agreement, within the scope of the project PRODUTECH R3 – "Agenda Mobilizadora da Fileira das Tecnologias de Produção para a Reindustrialização". The work is part of WP13 of the PRODUTECH R3 project.