< Speaker / Facilitator >
Csaba Tamas
Parloa, AWS, FinLeap, KEBA

Csaba Tamas is CPO at Parloa, leading product management, UX, and enablement. Previously, he held senior roles at AWS, guiding top fintechs on AI/ML strategy and leading global GTM strategy. He was also CPO/CTO at FinLeap Connect and held leadership roles at KEBA, scaling digital products across banking and mobility. He brings 15+ years of experience, an MBA, and data science studies at MIT.
< About >
Stanford HAI's 2026 AI Index reports that AI agents now achieve a 66.3% success rate on real-world computer tasks - up from 12% in March 2025. That still means they fail roughly one time in three. And that's under benchmark conditions. Your production environment is harder.
Meanwhile, AI agents are already making decisions inside your product and across the broader business context - some of them with expensive consequences.
Unlike traditional software, AI systems are inherently non-deterministic. Agents can feel magical at first, but they also fail in unexpected and difficult-to-predict ways. Most leaders and administrators of agentic systems are still unaware of the operational reality: today's AI agents often operate at only 60-70% reliability.
This is not a minor issue. It is a liability problem.
Some leaders recognize this risk and block pilots from reaching production at all - but that means lost opportunity. Others are discovering, often too late, that manual testing produces statistically insignificant results, and that agentic transformation requires entirely new skills, operational disciplines, and organizational muscle memory.
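To make the point about manual testing concrete, here is a minimal Python sketch, not taken from the talk, that computes a 95% Wilson score interval for an observed task success rate. The function name `wilson_interval` and the sample counts (14 of 20 versus 1,400 of 2,000) are illustrative assumptions, chosen only to show how wide the uncertainty stays when a test run is small.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for a binomial success rate.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")

    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


# Hypothetical numbers: the same 70% observed success rate,
# once from a small manual test and once from a large automated run.
for successes, trials in [(14, 20), (1400, 2000)]:
    low, high = wilson_interval(successes, trials)
    print(f"{successes}/{trials}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

With only 20 manual runs, the interval spans well over 30 percentage points, so the test cannot tell a mostly reliable agent from a coin flip; at 2,000 runs the same observed rate is pinned down to within a few points.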
This talk is a field guide, not a vision deck.
Csaba Tamas draws on real production architectures and hard-earned lessons from deploying AI agents in practice - including where and how they fail. He then demonstrates what deterministic governance infrastructure for non-deterministic AI systems actually looks like.
The question is not whether your agents are failing. It's whether you have the instrumentation to know when - and the architecture to prevent those failures from mattering.
< Key learnings >
Why most agentic projects fail and how choosing the right ROI project from the start changes everything
What 'working' actually means at scale: task success rate, evaluation frameworks, and the silent failures traditional software never had
How to see inside the machine: observability, and why governance must come before scale, not after
The hidden risk in the knowledge layer: why RAG failures are curation problems, not engineering ones
How to measure customer trust, not just satisfaction, and why that distinction defines whether your agentic system survives contact with reality