HawkEyes: Spotting and Evading Instruction Disalignments of LLMs

Abstract

LLM agents have demonstrated strong capabilities in vision-language planning (VLP) tasks. However, they often struggle with sequential VLP tasks, particularly in adhering to the instructions given in prompts, which limits their overall efficacy. To unleash the potential of LLM agents in the face of instruction disalignments, this paper proposes HawkEyes, an LLM-based approach that enables any given LLM agent to self-identify and self-avoid instruction disalignments. Instead of altering the intrinsic mechanism of the LLM agent, HawkEyes operates externally on its input and output sequences. Specifically, HawkEyes uses LLMs to decompose the instructions in the LLM agent's workflow into primitive constraints, creates oracles that detect disalignments with these primitive constraints, and synthesizes avoiding actions to preempt potential disalignments. This paper also demonstrates the application of HawkEyes to enhance three state-of-the-art LLM agents, assessing its effectiveness on two challenging VLP tasks: WebShop and MoTIF. Evaluation results show that HawkEyes significantly boosts the performance of LLM agents across various agents and tasks. Notably, HawkEyes doubles the success rate of LLM-Planner, a state-of-the-art LLM agent dedicated to sequential VLP, from 17.2% to 34.5% on the MoTIF dataset, showcasing its capability to make LLM planning more flexible and effective in sequential VLP scenarios.
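
To illustrate the workflow the abstract describes (decompose instructions into primitive constraints, use oracles to detect disalignments, and synthesize avoiding actions, all external to the agent), the following is a minimal Python sketch. Every name here (`decompose_instruction`, `violates`, `guarded_step`, the `llm` callable, the agent's `plan` method, and the prompts) is an illustrative assumption, not the paper's actual interface.

```python
# Hypothetical sketch of an external wrapper around an existing LLM agent:
# the agent itself is untouched; only its inputs and outputs are inspected.
# `llm` is assumed to be any text-in/text-out callable; `agent` is assumed
# to expose a `plan(observation) -> action` method.

def decompose_instruction(llm, instruction: str) -> list[str]:
    """Use an LLM to split a task instruction into primitive constraints."""
    response = llm(f"List the atomic constraints implied by: {instruction}")
    return [line.strip("- ").strip() for line in response.splitlines() if line.strip()]

def violates(llm, action: str, constraint: str) -> bool:
    """Oracle: ask the LLM whether a proposed action breaks a constraint."""
    verdict = llm(
        f"Does the action '{action}' violate the constraint '{constraint}'? Answer yes or no."
    )
    return verdict.strip().lower().startswith("yes")

def guarded_step(llm, agent, observation: str, instruction: str) -> str:
    """Wrap one planning step: detect disalignments and synthesize an avoiding action."""
    constraints = decompose_instruction(llm, instruction)
    action = agent.plan(observation)          # the agent's own proposed action
    for c in constraints:
        if violates(llm, action, c):          # disalignment detected by the oracle
            action = llm(                     # synthesize an avoiding action instead
                f"The action '{action}' violates the constraint '{c}'. "
                f"Propose an alternative action for the observation: {observation}"
            )
    return action
```

The key design point reflected here is that the wrapper operates purely on input and output sequences, so it can be layered onto any agent without modifying its internals.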

Publication
In the First International Workshop on Large Language Models for Code, co-located with ICSE 2024.