I am currently a postdoc at the Toyota Technological Institute at Chicago, where I am working with Karen Livescu and other awesome people on multi-modal machine learning models that combine speech and vision. I completed my PhD at the University of Edinburgh, where I was supervised by Sharon Goldwater, Aren Jansen and Simon King. Before starting my PhD, I worked with Thomas Niesler at Stellenbosch University, South Africa.

My main research interests are in machine learning, speech and language processing, and computer vision. I am particularly interested in machine learning methods that can learn from small amounts of labelled data, and in unsupervised methods that can learn directly from raw unlabelled data. Can an algorithm find meaningful units and structures in a corpus of speech audio, with only minimal guidance? How much supervision is required to build a useful speech processing or computer vision system? These questions are central when building language, speech and vision systems in low- and zero-resource settings.

Email GitHub LinkedIn Scholar