This project investigates how infants may learn phonetic categories from naturalistic speech, focusing on the Cantonese tone system. Traditional distributional learning theories suggest that infants identify contrastive dimensions by detecting multimodal distributions. However, such patterns rarely occur in variable, real-world speech. Building on a recent study of distributional learning across contexts (DLAC), we ask whether the ease of learning a sound contrast is influenced by the degree of contextual variation in the input.
Using the Multi-ethnic Hong Kong Cantonese Corpus (MeHKCC), we examined approximately 65,000 tone tokens spanning the six Cantonese tones. We extracted nine F0-based acoustic features and used t-SNE and Earth Mover’s Distance (EMD) to measure how tone distributions differ across contexts (e.g., neighboring consonants, syllable position, prosody).
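To make the EMD comparison concrete, the following is a minimal sketch, not the study's actual pipeline. It assumes a single one-dimensional F0 feature (the study used nine features and t-SNE, which are not reproduced here), and the data are synthetic and deliberately contrived: the function name `emd_1d`, the context labels `context_A` and `context_B`, and all values are hypothetical. It illustrates how two tones can overlap almost completely when tokens are pooled, yet be well separated within each context.

```python
import random
from bisect import bisect_right


def emd_1d(a, b):
    """Earth Mover's Distance between two 1-D samples.

    Computed as the area between the two empirical CDFs.
    """
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


rng = random.Random(0)  # fixed seed so the sketch is reproducible


def sample(mean, sd, n=500):
    """Draw n synthetic F0 values (arbitrary units) from a Gaussian."""
    return [rng.gauss(mean, sd) for _ in range(n)]


# Hypothetical toy data, NOT drawn from MeHKCC: the two tones swap
# their relative F0 height between the two contexts.
tokens = {
    ("T3", "context_A"): sample(0.0, 1.0),
    ("T5", "context_A"): sample(1.5, 1.0),
    ("T3", "context_B"): sample(1.5, 1.0),
    ("T5", "context_B"): sample(0.0, 1.0),
}
contexts = ("context_A", "context_B")

# EMD between the tone pair within each context.
by_context = {
    c: emd_1d(tokens[("T3", c)], tokens[("T5", c)]) for c in contexts
}

# EMD between the tone pair when contexts are ignored (pooled).
pooled_t3 = [x for c in contexts for x in tokens[("T3", c)]]
pooled_t5 = [x for c in contexts for x in tokens[("T5", c)]]
pooled = emd_1d(pooled_t3, pooled_t5)

for c in contexts:
    print(f"{c}: EMD(T3, T5) = {by_context[c]:.2f}")
print(f"pooled: EMD(T3, T5) = {pooled:.2f}")
```

In this toy setup the pooled EMD is small while the within-context EMDs are large, which is the kind of pattern a context-sensitive learner could exploit. For real analyses, `scipy.stats.wasserstein_distance` computes the same one-dimensional quantity; multi-dimensional features would need a different EMD formulation.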
We found that tone pairs that are acquired more readily exhibit greater distributional separation and contextual variation than pairs that are harder to learn. These results provide empirical support for the generalizability of the DLAC framework to both complex sound contrasts and tonal systems, underscoring the framework’s potential as a mechanism for learning in the absence of invariance in speech signals.
Figure: Even when the distributions for the tone pair T3T5 were unimodal overall, their shapes differed across contexts. This variation across contexts may provide cues for learning.