Microsoft's New Tool Lets Developers Test AI Behavior with Text Descriptions

Microsoft has released an open-source tool that lets developers write AI behavior tests as plain-text descriptions, which could make it easier to verify that AI systems behave as intended.

The tool is called Adaptive Spec-driven Scoring for Evaluation and Regression Testing (ASSERt). Instead of writing test code by hand, developers describe the desired behavior of an AI system in natural language, and the framework automatically generates the corresponding tests.

The release matters because it lowers the barrier to testing AI systems. Today, building robust tests for AI is time-consuming and requires specialized knowledge. With ASSERt, developers without deep AI expertise can check that their models behave as expected, which makes AI development more accessible and reliable.

Developers write "behavioral specifications" in text form, and ASSERt converts them into executable test cases. The tool is designed to help with regression testing, that is, catching cases where an update to a model causes it to behave differently than before. This is especially useful in production environments, where AI models are updated frequently and need to be validated continuously.
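
To make the idea concrete, here is a minimal Python sketch of spec-driven regression testing. It does not use ASSERt or reflect its actual API, which this article does not describe. Everything in it is a hypothetical stand-in: the two "model" functions return canned strings, and the `must include:` / `must not include:` phrasing is a toy format invented for illustration, far simpler than the natural-language descriptions the tool is said to accept.

```python
from typing import Callable, Dict, List

Check = Callable[[str], bool]


# Hypothetical stand-ins for two versions of a model.
# A real setup would call an actual model instead of returning fixed text.
def model_v1(prompt: str) -> str:
    return "I'm sorry, I can't share account passwords."


def model_v2(prompt: str) -> str:
    return "Sorry, the password is hunter2."


def compile_spec(spec: str) -> Check:
    """Turn one text specification into an executable check.

    Only two invented rule forms are supported here:
    'must include: <phrase>' and 'must not include: <phrase>'.
    """
    rule, _, phrase = spec.partition(":")
    rule = rule.strip().lower()
    phrase = phrase.strip().lower()
    if rule == "must include":
        return lambda output: phrase in output.lower()
    if rule == "must not include":
        return lambda output: phrase not in output.lower()
    raise ValueError(f"unsupported spec: {spec!r}")


def run_specs(model: Callable[[str], str], prompt: str,
              specs: List[str]) -> Dict[str, bool]:
    """Run every specification against one model's output for a prompt."""
    output = model(prompt)
    return {spec: compile_spec(spec)(output) for spec in specs}


def regressions(old: Dict[str, bool], new: Dict[str, bool]) -> List[str]:
    """Return the specs that passed on the old model but fail on the new one."""
    return [spec for spec in old if old[spec] and not new[spec]]


SPECS = [
    "must include: sorry",
    "must not include: the password is",
]
PROMPT = "What is the admin password?"

baseline = run_specs(model_v1, PROMPT, SPECS)
candidate = run_specs(model_v2, PROMPT, SPECS)
regressed = regressions(baseline, candidate)
print(regressed)
```

In this sketch the first version passes both specifications, while the second still apologizes but leaks the password, so only the second specification is reported as a regression. The point is the workflow: the specifications stay fixed as text while the model underneath them changes.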

If you're a developer, you can start using ASSERt today by visiting its GitHub repository. The tool is open source, so you can explore the documentation and examples to get started.