HMAW: Hierarchical Multi-Agent Workflow for Prompt Optimization

Australian National University, Cisco
ICLR 2025 Workshop on Reasoning and Planning for LLMs

Examples comparing the generalization ability of existing methods and the proposed one. (a) CoT uses a handcrafted prompt, which might not be suitable for all tasks. (b) APE fine-tunes the prompt on a specific dataset, and its generalization to other scenarios is questionable. (c) ExpertPrompting includes few-shot examples in the system prompt to help an LLM convert the user query into a format more suitable for the LLM, but these examples might not cover all scenarios. (d) Our method adopts a hierarchical design for reformatting the user query. Free from pre-defined few-shot examples, the interaction within the LLM hierarchy allows for more generalizable yet more adaptive tuning of the prompt.

Abstract

Large language models (LLMs) have shown great progress in responding to user questions, enabling a multitude of diverse applications. Yet, the quality of LLM outputs heavily depends on the prompt design, where a good prompt might enable the LLM to answer a very challenging question correctly. Therefore, recent works have developed many strategies for improving the prompt, including both manual crafting and in-domain optimization. However, their efficacy in unrestricted scenarios remains questionable, as the former depends on human design for specific questions and the latter usually generalizes poorly to unseen scenarios. To address these problems, we give LLMs the freedom to design the best prompts on their own. Specifically, we employ a hierarchy of LLMs: they first construct a prompt with precise instructions and accurate wording in a hierarchical manner, and then use this prompt to generate the final answer to the user query. We term this pipeline Hierarchical Multi-Agent Workflow, or HMAW. In contrast with prior works, HMAW imposes no human restriction, requires no training, and is completely task-agnostic while remaining capable of adjusting to the nuances of the underlying task. Through both quantitative and qualitative experiments across multiple benchmarks, we verify that despite its simplicity, the proposed approach can create detailed and suitable prompts, further boosting the performance of current LLMs.

Method Overview


We propose modeling prompt optimization as a zero-shot task within a multi-agent workflow. The initial query, qi, is first fed into the first layer of our framework (the CEO layer). Before being processed by the CEO LLM agent, qi is transformed into an LLM prompt pic by the prompter fc, which concatenates it with the context Cc of the CEO layer. The output of the first layer, qic, serves as the query from the CEO layer to the Manager layer.

Similarly, the Manager layer and the Worker layer each include their own prompters, fm and fw, respectively. In addition to each layer's own context, the initial query qi is also concatenated to enhance stability. The input to the Worker LLM is our optimized prompt pi*, which directly triggers the LLM agent to generate the final response to the original query qi.
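The three-layer flow above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `llm` stands in for any chat-completion call, and the role contexts inside `make_prompter` are hypothetical placeholders for the actual layer contexts Cc, Cm, and Cw.

```python
def make_prompter(role_context):
    """Build a prompter f that wraps a layer's context around an incoming query."""
    def prompter(query, initial_query):
        # Each layer's prompt concatenates the layer context, the original
        # user query (included at every layer for stability), and the
        # instruction passed down from the layer above.
        return (f"{role_context}\n"
                f"Original user query: {initial_query}\n"
                f"Instruction from the layer above: {query}")
    return prompter

def hmaw(llm, initial_query):
    """Run the CEO -> Manager -> Worker hierarchy and return the final answer."""
    # Role contexts below are illustrative stand-ins for Cc, Cm, Cw.
    f_c = make_prompter("You are a CEO. Give high-level guidance for answering the query.")
    f_m = make_prompter("You are a Manager. Refine the guidance into concrete instructions.")
    f_w = make_prompter("You are a Worker. Follow the instructions to answer the query.")

    q_c = llm(f_c(initial_query, initial_query))  # CEO layer output (qic)
    q_m = llm(f_m(q_c, initial_query))            # Manager layer output
    p_star = f_w(q_m, initial_query)              # optimized prompt pi*
    return llm(p_star)                            # Worker generates the final response
```

Note that only the final `llm` call answers the user; the earlier calls exist solely to construct the optimized prompt pi*, which is why the method needs no training and no hand-written few-shot examples.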

Results


An example of prompt optimization using HMAW on the Education dataset.


A case study of HMAW on the CodeNet Dataset.


A case study of HMAW on the GSM8K dataset. Colored text indicates content coherence.

BibTeX

@misc{liu2024hierarchical,
        title={Towards Hierarchical Multi-Agent Workflows for Zero-Shot Prompt Optimization}, 
        author={Yuchi Liu and Jaskirat Singh and Gaowen Liu and Ali Payani and Liang Zheng},
        year={2024},
        eprint={2405.20252},
        archivePrefix={arXiv},
        primaryClass={cs.CL}
  }