AI systems do not behave like traditional software, which makes life interesting for testers. When the same input can lead to different outputs, and “correct” is sometimes open to debate, how do we decide what deserves our trust?
This talk looks at AI evaluation through a tester’s lens and makes a simple case: trust should not be assumed, it should be earned, and earned in proportion to the evidence.
This is not a deep dive into tools, metrics, or complex frameworks. In 25 minutes, we will focus on the mindset shift that AI asks of testers. What changes when software stops behaving predictably? What stays the same? And why are testers actually well suited for this moment?
If you have ever looked at an AI response and thought, “Well… that seems confident,” this talk is for you.
You will leave with a clearer way to think about evaluating AI systems, asking better questions, and applying your testing instincts in a world where software occasionally sounds very sure of itself.
Where agentic workflows actually earn their keep in a real QA pipeline — and the two places they quietly fail.
The four control surfaces to set up before an agent touches production: scope, evaluation, failure cataloguing, human-in-the-loop.
Patterns for flaky-test triage, regression pruning, and visual-diff arbitration with receipts from three production systems.
A reference architecture you can take back to Monday’s sprint planning, plus the metrics that prove it’s working.
Sr. Consultant & AI Coach
Rahul Verma is an award-winning thought leader in the testing community, working as a Senior Coach and Consultant at trendig technology services gmbh. He created Swayam, an LLM framework for layered prompting, and Arjuna, a free, open-source Python test automation framework. He has contributed as an author and reviewer for certification bodies like Artificial Intelligence United, Selenium United, ISTQB, and CMAP.
His testing experience covers LLMs in testing, Python automation frameworks, web security, white-box testing, and web performance testing. His research explores meta-programming and object-oriented design patterns for automating these areas. He has presented and published widely and has trained thousands of testers, and his work is deeply influenced by his interest in poetry and spirituality.
One pass, every talk, no parallel tracks. Super Early Bird ends when the next 200 seats are gone.