About
The Cornell Phonetics Lab is a group of students and faculty who are curious about speech. We study patterns in speech, in both movement and sound. We do a variety of research: experiments, fieldwork, and corpus studies. We test theories and build models of the mechanisms that create patterns. Learn more about our Research. See below for information on our events and our facilities.
3rd December 2020 04:10 PM
CLC Speaker Series: Morgan Sonderegger, McGill University
The Cornell Linguistics Circle proudly presents Associate Professor Morgan Sonderegger from McGill University, who will talk about The SPADE project: large-scale analysis of a spoken language across space and time.
Abstract:
Speech corpora of many languages, styles, and formats exist in the world, representing significant potential for scientific studies of speech -- by language scientists but also for health and technological applications. However, in practice there are significant practical and methodological barriers to conducting the “same study” across corpora.
The Speech Across Dialects of English (SPADE) project (2017-present), a collaboration between universities in Canada, Scotland, and the US, aims to develop and apply user-friendly software for large-scale speech analysis of existing public and private English speech datasets, and to understand how English speech has changed over time and space. This talk summarizes the project's results thus far.
We briefly introduce our open-source software system for Integrated Speech Corpus ANalysis (ISCAN), then turn to datasets collected for the project and case studies carried out so far analyzing segmental realization (e.g. sibilants, vowels) across Old World (British Isles) and New World (North American) English, which illustrate two possibilities of large-scale studies:
Location: Event Information
4th December 2020 11:30 AM
Jason Baldridge of Google Research (topic TBD)
Jason Baldridge is a research scientist at Google, where he works on natural language understanding. He was previously an Associate Professor of Computational Linguistics at the University of Texas at Austin. His main research interests include categorial grammars, parsing, semi-supervised learning for NLP, reference resolution and text geolocation. He has long been active in the creation and promotion of open source software for natural language processing, including co-creating the Apache OpenNLP Toolkit and OpenCCG. Jason received his Ph.D. from the University of Edinburgh in 2002, where his doctoral dissertation on Multimodal Combinatory Categorial Grammar was awarded the 2003 Beth Dissertation Prize from the European Association for Logic, Language and Information.
Location: Learning Machine Seminar Series - Jason Baldridge
11th December 2020 09:55 AM
Invited Speaker Stefan Frank: Neural models of bilingual sentence processing
As part of Computational Psycholinguistics Discussion, invited speaker Stefan Frank (Radboud University) will give a talk on Neural models of bilingual sentence processing.
Abstract: A bilingual’s two grammars do not form independent systems but interact with each other, as is clear from phenomena such as syntactic transfer, cross-linguistic structural priming, and code-switching. I will present neural network models of bilingual sentence processing that show how (some of) these phenomena can emerge from mere exposure to two languages, that is, without requiring cognitive mechanisms that are specific to bilingualism.
Bio: Stefan is a computational psycholinguist who has worked on a wide range of topics in sentence processing, studying both native and non-native language users. He is an associate professor of psycholinguistics and data science in the Center for Language Studies at Radboud University, Nijmegen.
Contact Cornell's Dr. Marten van Schijndel for the Zoom link.
Location:
The Cornell Phonetics Laboratory (CPL) provides an integrated environment for the experimental study of speech and language, including its production, perception, and acquisition.
Located in Morrill Hall, the laboratory consists of six adjacent rooms and covers about 1,600 square feet. Its facilities include a variety of hardware and software for analyzing and editing speech, for running experiments, for synthesizing speech, and for developing and testing phonetic, phonological, and psycholinguistic models.
Web-Based Phonetics and Phonology Experiments with LabVanced
The Phonetics Lab licenses the LabVanced software for designing and conducting web-based experiments.
LabVanced has particular value for phonetics and phonology experiments because of its:
Students and faculty are currently using LabVanced to design web experiments involving eye-tracking, audio recording, and perception studies.
Subjects are recruited via several online systems:
Computing Resources
The Phonetics Lab maintains two Linux servers that are located in the Rhodes Hall server farm:
In addition to the Phonetics Lab servers, students can request access to additional computing resources of the Computational Linguistics lab:
These servers, in turn, are nodes in the G2 Computing Cluster, which currently consists of 195 servers (82 CPU-only servers and 113 GPU servers) with a total of ~7400 CPU cores and 698 GPUs.
The G2 Cluster uses the SLURM Workload Manager for submitting batch jobs that can run on any available server or GPU on any cluster node.
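As a rough illustration of what a SLURM batch job looks like, the sketch below builds the text of a batch script in Python. It is not taken from the lab's own documentation: the function name `render_sbatch_script`, the job name, the resource values, and the `align.py` command are all hypothetical. Only the `#SBATCH` options themselves (`--job-name`, `--output`, `--cpus-per-task`, `--mem`, `--time`, `--gres`) are standard SLURM options. Cluster-specific settings such as the partition are deliberately left out.

```python
def render_sbatch_script(job_name, command, gpus=0, cpus=1, mem_gb=4,
                         time_limit="01:00:00"):
    """Return the text of a SLURM batch script as a string.

    All defaults are illustrative assumptions, not G2 cluster policy.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        # %j is replaced by SLURM with the numeric job ID.
        f"#SBATCH --output={job_name}_%j.out",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
    ]
    # Request GPUs only when the job needs them.
    if gpus > 0:
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    # Blank line, then the command the job should run.
    lines += ["", command, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical job: run a script called align.py on one GPU.
    print(render_sbatch_script("align_corpus", "python align.py", gpus=1))
```

The resulting text can be saved to a file and submitted with `sbatch`. Partition names, resource limits, and any other required options depend on the cluster configuration and should be checked against the G2 documentation.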
Articulate Instruments - Micro Speech Research Ultrasound System
We use the Articulate Instruments Micro Speech Research Ultrasound System to investigate how fine-grained variation in speech articulation connects to phonological structure.
The ultrasound system is portable and non-invasive, making it ideal for collecting articulatory data in the field.
BIOPAC MP-160 System
The Sound Booth Laboratory has a BIOPAC MP-160 system for physiological data collection. This system supports two BIOPAC Respiratory Effort Transducers and their associated interface modules.
Language Corpora
Speech Aerodynamics
Studies of the aerodynamics of speech production are conducted with our Glottal Enterprises oral and nasal airflow and pressure transducers.
Electroglottography
We use a Glottal Enterprises EG-2 electroglottograph for noninvasive measurement of vocal fold vibration.
Real-time vocal tract MRI
Our lab is part of the Cornell Speech Imaging Group (SIG), a cross-disciplinary team of researchers using real-time magnetic resonance imaging to study the dynamics of speech articulation.
Articulatory movement tracking
We use the Northern Digital Inc. Wave motion-capture system to study speech articulatory patterns and motor control.
Sound Booth
Our isolated sound recording booth serves a range of purposes--from basic recording to perceptual, psycholinguistic, and ultrasonic experimentation.
We also have the necessary software and audio interfaces to perform low-latency, real-time auditory feedback experiments via MATLAB and Audapter.