
Microsoft made two key additions to its Copilot product lineup this week: automated agent evaluation tools for Copilot Studio users, and tools to facilitate team collaboration in Microsoft 365 Copilot.
The two new features are Agent Evaluation, for automated evaluation and testing of agents built in Copilot Studio, and Teams Mode in Microsoft 365 Copilot, for launching collaborative Copilot-driven sessions in the widely used Teams app.
Together, the two developments could bring new efficiencies across the Copilot lifecycle, from building agents all the way through to everyday use among coworkers within the familiar Teams environment. They also push Copilot, and AI functionality more broadly, deeper into development and business processes.
Agent Evaluation
Agent Evaluation is designed to help companies manage development as a full lifecycle that includes building, testing, and improving agents. Until now, Microsoft said, agents have been tested manually, with issues troubleshot case by case. That approach is inconsistent and does not scale, downsides that are at odds with enterprise AI requirements.
With Agent Evaluation, customers can conduct automated testing directly in Copilot Studio, creating evaluation sets, selecting test methods, and defining the criteria that indicate business success for the agent.
Users can upload tests they’ve already defined, reuse recent interactions, and add test questions manually. Also new is AI-powered generation of test queries from the agent’s metadata and knowledge sources, which gives visibility into agent quality without manual test authoring. Mixing manual and imported test sets broadens and deepens test coverage.
Available test methods include exact or partial matches, similarity metrics, intent recognition, and relevance and completeness checks. With this range, users can choose the method best suited to their agent type. Agents can be judged the way different users would judge them, with options ranging from strict checklist compliance to overall helpfulness.
Agent Evaluation lets users specify what indicates success in their business and processes, whether that means strict keyword matches or conceptual matches. The flexibility to set custom thresholds helps ensure an agent meets an organization’s expectations for accuracy and relevance.
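To make the difference between test methods and thresholds concrete, here is a minimal Python sketch of three of the approaches named above: exact match, partial match, and a similarity score compared against a custom threshold. It is an illustration only and is not Copilot Studio's API or scoring logic. The names `EvalCase`, `score`, and `passed` are hypothetical, and the similarity metric (`difflib.SequenceMatcher`) is a simple character-based stand-in for whatever metric the product actually uses.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class EvalCase:
    """One hypothetical test case: a question and the answer we expect."""
    question: str
    expected: str
    method: str = "exact"    # "exact", "partial", or "similarity"
    threshold: float = 0.8   # only consulted by the "similarity" method


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't fail a test."""
    return " ".join(text.lower().split())


def score(case: EvalCase, answer: str) -> float:
    """Return a score between 0.0 and 1.0 for the agent's answer."""
    expected, actual = normalize(case.expected), normalize(answer)
    if case.method == "exact":
        return 1.0 if actual == expected else 0.0
    if case.method == "partial":
        # Passes when the expected text appears anywhere in the answer.
        return 1.0 if expected in actual else 0.0
    if case.method == "similarity":
        # Character-level similarity; an assumed stand-in metric.
        return SequenceMatcher(None, expected, actual).ratio()
    raise ValueError(f"unknown test method: {case.method}")


def passed(case: EvalCase, answer: str) -> bool:
    """Apply the pass rule: full match, or similarity at or above the threshold."""
    value = score(case, answer)
    if case.method == "similarity":
        return value >= case.threshold
    return value == 1.0
```

A strict keyword check would use the `partial` method with a short expected string, while a looser conceptual check would use `similarity` with a lower threshold. Raising or lowering `threshold` is the kind of custom tuning the article describes.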
Agent Evaluation results are presented with pass or fail indicators, numeric scores on answer quality, and details on knowledge sources used by the agent. Agent Evaluation in Microsoft Copilot Studio is now available in public preview.
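As a rough picture of how results of this kind could be rolled up, the following Python sketch aggregates per-test pass or fail flags, numeric scores, and knowledge sources into a summary. The record shape (`EvalResult`) and the `summarize` function are assumptions made for illustration; they do not reflect the format Copilot Studio actually reports.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class EvalResult:
    """Hypothetical per-test result: pass/fail, a quality score, and sources used."""
    question: str
    passed: bool
    score: float
    sources: list[str] = field(default_factory=list)


def summarize(results: list[EvalResult]) -> dict:
    """Aggregate a test run into a pass rate, mean score, failures, and sources."""
    if not results:
        raise ValueError("no results to summarize")
    return {
        "pass_rate": sum(r.passed for r in results) / len(results),
        "mean_score": mean(r.score for r in results),
        "failed": [r.question for r in results if not r.passed],
        "sources_used": sorted({s for r in results for s in r.sources}),
    }
```

Listing the failed questions alongside the sources each answer drew on makes it easier to tell whether a failure comes from the agent's reasoning or from a gap in its knowledge sources.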
Teams Mode
The new Teams Mode for Microsoft 365 Copilot lets users bring coworkers into Copilot conversations, turning individual AI chats into collaborative group chats within Teams. The goal is greater collaboration through AI, with productivity gains inside existing workflows.
The simple user interface for Teams Mode lets users broaden a Copilot session into a group conversation by selecting “start a group chat” in the top-right corner of the Copilot app. The initiating user selects which messages to share in the group chat.
In Teams Mode, Copilot can be added to any existing Teams group chat (similar to adding a coworker to a Teams chat) so a group can coordinate and complete tasks using AI.
For example, an individual may draft a strategy document while using the Researcher agent to gather key data points such as market size, current industry trends, market leaders, and customer perspective. They can then bring in colleagues from departments such as sales, engineering, finance, and marketing for their input. Those colleagues, in turn, can tap Copilot for insights specific to their professional discipline, helping to build out a more comprehensive strategy document.
Teams Mode for Microsoft 365 Copilot is rolling out in public preview for Microsoft 365 Copilot licensed users on desktop, mobile, and web.




