Linguistics Research Spotlight: Yue Wang
Congratulations to Dr. Yue Wang on her recent appointment as an Acoustical Society of America (ASA) Fellow. ASA Fellows are those members of the society who have rendered conspicuous service or made notable contributions to the advancement, promotion, or dissemination of the knowledge of acoustics or the fostering of its practical applications. Dr. Wang was recognized for studies of the behavioural and neural mechanisms underlying speech learning and processing.
Collaborative work on the various features of hyperarticulated or clear speech
Dr. Wang, Director of the Language and Brain (LAB) Lab, focuses her research on how to enhance speech intelligibility. With funding from three major sources (SSHRC, NSERC, SFU Big Data Initiative), Dr. Wang has worked with a team of collaborators to investigate which features, both auditory and visual, speakers use when trying to improve the intelligibility of their speech.
Dr. Wang’s collaborators are experts in different fields and conduct research at a variety of universities. On her long list of collaborators are linguists like Dawn Behne (Norwegian University of Science and Technology), Allard Jongman and Joan Sereno (University of Kansas), and Bob McMurray (University of Iowa). She also works with local computer scientists and mathematicians like Ghassan Hamarneh (Simon Fraser University; Computing Science), Lisa Tang (University of British Columbia), and Paul Tupper (Simon Fraser University; Mathematics). Her team also includes graduate and undergraduate students, such as Sylvia Cho, Beverly Hannah, and Keith Leung. “I don’t think any of this can be done by myself,” says Dr. Wang. “I want to stress that all these projects are collaborative.”
In our day-to-day lives, we adjust aspects of how we speak based on who we are communicating with. Most of the accommodations we make depend on the speech context, such as the physical environment, the purpose of communication, and the listener. As listeners, we also adjust our expectations of speech style based on who is speaking to us and what environment we are in. For example, we may try to speak more clearly than usual in a noisy cafeteria or when speaking with non-native speakers. However, although we can change how loudly or slowly we speak, shouting or slowing down may not necessarily help others understand us better. There may be a trade-off in intelligibility if the modifications we make violate intrinsic properties of speech that a listener would expect. How we adjust our speech production to improve, rather than hurt, intelligibility has been a focus for Dr. Wang and the LAB Lab in their recent experiments.
Dr. Wang’s ongoing projects look at which specific aspects of speech are affected when we try to increase intelligibility. One project, “Hyperarticulation in audio-visual speech communication,” focuses on hyperarticulated or clear speech and asks whether speakers modify specific aspects of speech sounds, and which voice cues are useful for the listener in speech perception. It also investigates the visual domain to determine how hyperarticulated speech is displayed through facial movements, a topic that is increasingly relevant as more of our interactions take place through video calls. The project incorporates cross-linguistic aspects, involving English, Korean, Mandarin, and Norwegian, to observe how changes in voice and facial cues can make words more contrastive.
Another project, “Communicating pitch in clear speech: articulation, acoustics, intelligibility, neuro-processing, and computational modeling,” focuses on another interesting characteristic of speech that is common in languages spoken throughout Asia and Africa: tone. This work, which looks at Mandarin tones, is unique because tone has traditionally been thought to lack a corresponding visual component in production. However, initial results from Dr. Wang’s work suggest that eyebrow and head movements are both involved in signaling tonal contrasts in clear and plain speech.
One application of research in speech intelligibility is the focus of a third project being conducted in the LAB Lab. “Automated lip-reading: extracting speech from video of a talking face” is funded by SFU’s Big Data Initiative Next Big Question Fund and is a collaboration between linguists and computer scientists. This project draws on the experimental work investigating intelligibility to develop an audio-visual machine lip-reading system that can reconstruct a speaker’s voice information based on visual information from that speaker’s facial movements. The goal of this project is to enhance speech signals and improve the intelligibility of videos, particularly in adverse listening environments, by combining linguistic knowledge with deep learning approaches.
Next up for Dr. Wang is a shift in focus from the segmental level of intelligibility to more general conversational contexts and how interlocutors adapt to achieve communicative success in spontaneous verbal interactions. She is also keen to incorporate neurolinguistic perspectives to look at brain activity while we make these accommodations in our own speech and experience them as listeners. When talking about these future directions, Dr. Wang says, “the general direction is looking at how to relate all these behaviour findings to a new level and how to make use of big data to answer these questions in a different way.”
Tupper, P., Leung, K.W., Wang, Y., Jongman, A., & Sereno, J.A. (2021). The contrast between clear and plain speaking style for Mandarin tones. Journal of the Acoustical Society of America, 150(6), 4464-4473.
Wang, Y., Sereno, J.A., & Jongman, A. (2020). Multi-modal perception of tone. In H-M. Liu, F-M. Tsai, & P. Li (Eds.), Speech perception, production and acquisition: Multidisciplinary approaches in Chinese languages (pp. 159-176). Singapore: Springer Nature Singapore.
Garg, S., Hamarneh, G., Jongman, A., Sereno, J.A., & Wang, Y. (2020). ADFAC: Automatic detection of facial articulatory features. MethodsX, 7.
Redmond, C., Leung, K., Wang, Y., McMurray, B., Jongman, A., & Sereno, J.A. (2020). Cross-linguistic perception of clearly spoken English tense and lax vowels based on auditory, visual, and auditory-visual information. Journal of Phonetics, 81.
Garg, S., Hamarneh, G., Jongman, A., Sereno, J.A., & Wang, Y. (2019). Computer-vision analysis reveals facial movements made during Mandarin tone production align with pitch trajectories. Speech Communication, 113, 47-62.