39C3

Leo Meyerovich

Leo is the founder and CEO of Graphistry and has spent the last decade advancing GPU, graph, and AI technologies for cyber investigations. He holds a PhD in Computer Science from UC Berkeley and pioneered GPU-accelerated visual analytics, helping launch Apache Arrow, NVIDIA RAPIDS, and the GFQL graph dataframe language. He led the first agentic AI speed-runs of Splunk Boss of the SOC (BOTS), where AI auto-solved the majority of challenges faster than human teams. Earlier research includes the first parallel browser at Berkeley, the first functional reactive web framework (Flapjax) at Brown, and Project Domino for citizen data science to track COVID misinformation. He has received multiple best paper awards, including the SIGPLAN 10-Year Test of Time award. He regularly works with enterprises, financial institutions, law firms, and technology companies on data-intensive investigations across cybersecurity, fraud, and intelligence.


Talk

30.12
13:50
40min
Breaking BOTS: Cheating at Blue Team CTFs with AI Speed-Runs
Leo Meyerovich, Sindre Breda

After we announced our results, CTFs like Splunk's Boss of the SOC (BOTS) started prohibiting AI agents. For science & profit, we keep doing it anyway. In BOTS, the AIs solve most of the challenges in under 10 minutes instead of taking the full day. Our recipe was surprisingly simple: teach AI agents to self-plan their investigation steps, adapt their plans to new information, work with the SIEM DB, and reason about log dumps. No exotic models, no massive lab budgets - just publicly available LLMs mixed with a bit of science and perseverance. We'll walk through how that works, including videos of the many ways AI trips itself up that marketers would rather hide, and how to do it at home with free and open-source tools.

CTF organizers can't detect this - the arms race is probably over before it really began. But the real question isn't "can we cheat at CTFs?" It's what happens when investigations evolve from analysts-who-investigate to analysts-who-manage-AI-investigators. We'll show you what that transition already looks like today and peek into some uncomfortable questions about what comes next.

Security
One