Resources
Videos and teaching
My videos are all available on YouTube. I provide links to selected video lists below.
- Introduction to machine learning (Data Analytics 414)
- Introduction to natural language processing (NLP817)
- Introduction to neural networks [slides, notes]
- Introduction to speech features [slides]
- Dynamic time warping [slides, notebook]
- Our Git workflow [notes]
Code and data
My code is available on GitHub. To find the code for a particular paper, look for the [code]
link given with each of the papers on the publications page. I provide links to selected projects below.
- YFACC: Yorùbá Flickr Audio Caption Corpus. The dataset is described in (Olaleye et al., 2023).
- semantic_flickraudio: A dataset of labels for semantic speech retrieval, as described in (Kamper et al., 2019). The dataset is an extension of the Flickr Audio Captions Corpus from MIT.
- ES-KMeans: The embedded segmental K-means (ES-KMeans) algorithm for unsupervised word segmentation and clustering of speech. We use it in (Kamper et al., 2017), as shown in this recipe. To get started, this notebook is a good reference.
For Stellenbosch University students
- Getting started with the Stellenbosch HPC by Matthew Baas
- Electrical and Electronic Engineering LaTeX template [Overleaf]
- Stellenbosch University Style Guide
Invited talks
- Speech systems that emulate language acquisition in humans
  SLS, MIT, 2024.
  ILCC, University of Edinburgh, 2024.
- Multimodal few-shot learning & probing self-supervised speech models
  LSCP, Ecole Normale Supérieure, 2023.
- What can large spoken language models tell us about speech?
  IndabaX South Africa, University of Cape Town, 2023. [video]
- Unsupervised word segmentation using dynamic programming on self-supervised speech representations
  AAAI SAS Workshop, Invited Talk, 2022. [video]
- Learning acoustic units and words from unlabelled speech (with a bit of vision)
  CLSP Seminar, Johns Hopkins University, 2020.
- (Outrageously) low-resource speech processing
  Deep Learning Indaba, Nairobi, 2019. [video]
- Multimodal learning from images and speech
  Aalto University & Tampere University, 2019.
  KU Leuven & UPF Barcelona, 2019.
- Acoustic word embeddings for low resource speech processing
  TWiML&AI Podcast, 2018.
- Frontiers of natural language processing
  With Sebastian Ruder. Deep Learning Indaba, Stellenbosch, 2018.
- Deep learning for (more than) speech recognition
  IndabaX Western Cape, University of Cape Town, 2018. [video]
- Learning from unlabelled speech, with and without visual cues
  Ohio State University, 2017.
  CLIP Colloquium Speaker, University of Maryland, 2017.
- Unsupervised neural and Bayesian models for zero-resource speech processing
  CSAIL, MIT, 2016.
- Unsupervised speech processing using acoustic word embeddings
  MLSLP Workshop, Spotlight Speaker, 2016.