Evoflux: A New Method to Help Small AI Agents Build and Execute Tool Workflows at Inference Time

A new research paper introduces Evoflux, a method that helps compact AI agents create and adjust executable tool workflows during inference, rather than relying solely on pre-training data. This could make small models more reliable for MCP-style tasks.

Evoflux targets compact language models that have to build and refine executable tool workflows during inference. The core problem it addresses is that small AI agents often struggle with tool use in the style of the Model Context Protocol (MCP), which involves more than simple function calling. An agent has to discover tools from live catalogs, satisfy their schemas, preserve dependencies across intermediate outputs, and ground its final response in executed evidence.

The paper argues that small planners frequently generate workflow graphs that look plausible but fail execution-time checks such as tool resolution, parameter validation, and dependency tracking. The authors contend that distillation from small corpora of examples handles this failure mode poorly.
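To make those three checks concrete, here is a minimal Python sketch of a validator that a runtime might apply to a planner's output. It is an illustration only, not Evoflux's algorithm or any MCP library's API: the `Step` class, the `CATALOG` dictionary, the tool names, and the `$step_id` convention for referring to an earlier step's output are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One node in a hypothetical workflow graph."""
    id: str
    tool: str
    args: dict = field(default_factory=dict)


# Hypothetical live catalog: tool name -> names of its required parameters.
CATALOG = {
    "search_docs": {"query"},
    "summarize": {"text"},
}


def validate(steps, catalog):
    """Return a list of problems found in the plan; an empty list means it passed."""
    problems = []
    seen = set()  # ids of steps that appear earlier in the plan
    for step in steps:
        # Tool resolution: the named tool must exist in the catalog.
        required = catalog.get(step.tool)
        if required is None:
            problems.append(f"{step.id}: unknown tool '{step.tool}'")
        else:
            # Parameter validation: every required parameter must be supplied.
            missing = sorted(required - set(step.args))
            if missing:
                problems.append(f"{step.id}: missing parameters {missing}")
        # Dependency tracking: "$s1" (assumed convention) must name an earlier step.
        for name, value in step.args.items():
            if isinstance(value, str) and value.startswith("$"):
                if value[1:] not in seen:
                    problems.append(
                        f"{step.id}: '{name}' depends on unknown step '{value[1:]}'"
                    )
        seen.add(step.id)
    return problems


# A plan that looks reasonable on paper but fails all three checks.
plan = [
    Step("s1", "search_docs", {"query": "evoflux"}),
    Step("s2", "summarise", {"text": "$s1"}),   # misspelled tool name
    Step("s3", "search_docs", {}),              # required parameter omitted
    Step("s4", "summarize", {"text": "$s9"}),   # refers to a step that never ran
]
problems = validate(plan, CATALOG)
for problem in problems:
    print(problem)
```

Each step in the example plan reads plausibly on its own. The errors surface only when the plan is checked against the catalog and against the steps that precede it, which is the gap between a plausible workflow graph and an executable one.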

The key takeaway is that Evoflux operates at inference time: an agent can evolve and refine its workflow on the fly rather than being limited to what it learned during training. This approach could make AI assistants more reliable when interacting with external tools and APIs.