Arduin Findeis

Hi there! I am a machine learning (ML) researcher who likes to build software. My research focuses on the evaluation of ML systems: determining which systems are “better” or “worse”, in particular for language model and reinforcement learning applications. Recent projects include KraspAI Kompass, Beobench and Bauwerk. I am a PhD candidate in the Department of Computer Science at the University of Cambridge and a member of the AI4ER CDT. Reach out by sending an email or scheduling a call.
🗞️ News
01/2024: New blog post released:
The benchmark problems reviving manual evaluation