- Change filtering from >= MAX_TOOL_CALL_TURNS to == MAX_TOOL_CALL_TURNS so only exact matches are kept
- Add VALIDATE_THINK_BLOCKS flag for optional <think> block validation
- Refactor data structure from flat expected_calls to turn-based expected_calls_by_turn
- Extract helper methods from collect_trajectories for better code organization
- Fix Turn 3 issue where prompts ended with a tool response instead of prompting for a tool call
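
For reviewers, a minimal sketch of the filter and data-structure changes listed above. Only the names MAX_TOOL_CALL_TURNS, expected_calls, and expected_calls_by_turn come from this change. The value 3, the tool names, the list-of-lists layout, and the helper has_exact_turn_count are assumptions made for illustration and may not match the actual code.

```python
MAX_TOOL_CALL_TURNS = 3  # assumed value, for illustration only

# Hypothetical flat layout (before): one list with no turn boundaries.
expected_calls = ["search", "fetch", "summarize"]

# Hypothetical turn-based layout (after): one inner list per turn.
expected_calls_by_turn = [["search"], ["fetch"], ["summarize"]]


def has_exact_turn_count(calls_by_turn, max_turns=MAX_TOOL_CALL_TURNS):
    """Return True only when the turn count equals max_turns (== rather than >=)."""
    return len(calls_by_turn) == max_turns


# Three illustrative items with 1, 3, and 4 turns respectively.
items = [
    [["search"]],
    expected_calls_by_turn,
    [["a"], ["b"], ["c"], ["d"]],
]

# With ==, the 4-turn item is dropped; with >= it would have been kept.
kept = [item for item in items if has_exact_turn_count(item)]
```

Grouping calls per turn lets the exact-match check count turns directly instead of inferring them from a flat list.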
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>