TECHNICAL PROGRAM


Keynote speeches

 

K1: Brain Plasticity and Language Processing

Date: Tuesday, Oct 18, 2016

Time: 9:00-10:00

Chair: Prof. Jianwu Dang

SPEAKER: Prof. Li-Hai Tan

Shenzhen Institute of Neuroscience and Shenzhen University School of Medicine

 

K2: Speech Processing for Unwritten Languages

Date: Wednesday, Oct 19, 2016

Time: 11:00-12:00

Chair: Prof. Helen Meng

SPEAKER: Prof. Alan W Black

Carnegie Mellon University, USA

 

K3: Rendering Speech Across Speaker and Language Barriers

Date: Thursday, Oct 20, 2016

Time: 11:00-12:00

Chair: Prof. Jen-Tzung Chien

SPEAKER: Prof. Frank K. Soong

Speech Group, Microsoft Research Asia (MSRA), Beijing, China

 

Tutorials:

Date: Monday, Oct 17, 2016  

 

T1: Deep Learning for Statistical Parametric Speech Synthesis

Time: 9:30-11:30

Chair: Prof. Jinsong Zhang

PRESENTER: Zhen-Hua Ling

 

T2: Speech Front-End Processing Under Multi-Sources Reverberant Acoustic Environments 

Time: 9:30-11:30

Chair: Prof. Yanmin Qian

PRESENTERS: Qiang Fu, Xiaofei Wang

Institute of Acoustics, Chinese Academy of Sciences

 

T3: Techniques & Applications for Speech Interaction Between Human and Cloud Robot

Time: 9:30-11:30

Chair: Prof. Kai Yu

PRESENTERS: Min Chu, Zhijie Yan, Jian Sun, Yining Chen

 

T4: Deep Learning: Recent Advances and Moving Forward

Time: 13:30-15:30

Chair: Prof. Kai Yu

PRESENTER: Dong Yu

Microsoft Research

 

T5: Undirected Graphical Models: Theory and Applications to Speech and Language Processing

Time: 13:30-15:30

Chair: Prof. Jinsong Zhang

PRESENTER: Zhijian Ou

Department of Electronic Engineering, Tsinghua University, Beijing, China

 

T6: Emotion Recognition in Speech, Text and Conversational Data

Time: 16:00-18:00

Chair: Prof. Kai Yu

PRESENTERS: Junlan Feng, Chaomin Wang, Yanmeng Wang

 

T7: Speaker Verification: State of the Art, Spoofing and Countermeasures

Time: 16:00-18:00

Chair: Prof. Jinsong Zhang

PRESENTERS: Zhizheng Wu, Haizhou Li

 

Oral Session O1: Speech Enhancement

Time: 10:20–12:00, October 18, 2016 (Tuesday)

Session Chairs:  Changchun Bao, Xiang Xie



O1-1: Multi-Channel Feature Adaptation for Robust Speech Recognition

Zhaofeng Zhang1, Xiong Xiao2, Longbiao Wang3, Jianwu Dang3, Masahiro Iwahashi1, Eng Siong Chng2, 4, Haizhou Li4, 5

1Nagaoka University of Technology, Japan; 2Temasek Laboratories, Nanyang Technological University (NTU), Singapore; 3Tianjin University, China; 4School of Computer Science and Engineering, NTU, Singapore; 5Department of Electrical and Computer Engineering, National University of Singapore



O1-2: Speech Intelligibility Enhancement in Noisy Reverberant Conditions

Junfeng Li1, Risheng Xia1, Qiang Fang2, Aijun Li2, Yonghong Yan1

1Institute of Acoustics, Chinese Academy of Sciences; 2Institute of Linguistics, Chinese Academy of Social Sciences



O1-3: Speech Enhancement with Binaural Cues Derived from a Priori Codebook

Nan Chen, Changchun Bao, Feng Deng

Speech and Audio Signal Processing Laboratory, School of Electronic Information and Control Engineering, Beijing University of Technology, China



O1-4: Speech Enhancement Based on Nonparametric Factor Analysis

Lin Li1, Jiawen Wu1, Xinghao Ding1, Qingyang Hong1, Delu Zeng2

1School of Information Science and Technology, Xiamen University, China; 2School of Mathematics, South China University of Technology, China



O1-5: Deep Neural Network for Robust Speech Recognition with Auxiliary Features from Laser-Doppler Vibrometer Sensor

Zhipeng Xie1, Jun Du1, Ian McLoughlin2, Yong Xu3, Feng Ma3, Haikun Wang3

1The University of Science and Technology of China; 2University of Kent, Medway, UK; 3iFlytek Research



Oral Session O2: Speech Perception

Time: 10:20–12:00, October 18, 2016 (Tuesday)

Session Chairs:  Jiangping Kong, Gang Peng



O2-1: Cognitive Representation of Phonological Categories: The Evidence from Mandarin Speakers' Learning of Cantonese Tones

Kaile Zhang1, Yonghong Li2, Gang Peng1, 3

1Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University; 2Key Lab of China's National Linguistic Information Technology, Northwest University for Nationalities; 3Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences



O2-2: Effects of Preceding Vocabulary Context on the Perception of Mandarin Vowels

Xunan Huang1, 2, Caicai Zhang2, Fei Chen1, Jonathan Sieg3, Lan Wang1, Feng Shi3

1Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China; 2Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong; 3School of Literature, Nankai University, China



O2-3: The Preliminary Study of Influence on Tone Perception from Segments

Chong Cao1, Yanlu Xie1, Ju Lin1, Qian Li2, Jinsong Zhang1

1Beijing Language and Culture University, Beijing, China; 2Chinese Academy of Social Sciences, Beijing, China



O2-4: The Effects of Tone Categories on the Perception of Mandarin Vowels

Hao Zhang1, 3, Fei Chen1, Nan Yan1, Lan Wang1, Yu Chen2, Feng Shi3

1Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; 2School of Chinese Language and Culture, Tianjin University of Technology, Tianjin, China; 3College of Chinese Language and Culture, Nankai University, Tianjin, China



Oral Session O3: Speaker and Emotion Recognition

Time: 16:00–18:00, October 18, 2016 (Tuesday)

Session Chairs:  Zhenhua Ling, Kai Yu



O3-1: Max-Margin Metric Learning for Speaker Recognition

Lantian Li, Dong Wang, Chao Xing, Thomas Fang Zheng

Center for Speech and Language Technologies, Tsinghua University, China



O3-2: Digit-Dependent Local I-Vector for Text-Prompted Speaker Verification with Random Digit Sequences

Peixin Chen1, Wu Guo1, Guoping Hu2

1National Engineering Laboratory of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; 2Key Laboratory of Intelligent Speech Technology, Ministry of Public Security



O3-3: Significance of Automatic Detection of Vowel Regions for Automatic Shout Detection in Continuous Speech

Vinay Kumar Mittal1, Anil Kumar Vuppala2

1Indian Institute of Information Technology Chittoor, Sri City, Andhra Pradesh, India; 2International Institute of Information Technology, Hyderabad, Telangana, India



O3-4: Cross-Corpus Speech Emotion Recognition Using Transfer Semi-Supervised Discriminant Analysis

Peng Song1, Xinran Zhang2, Shifeng Ou3, Jingjing Liu2, Yanwei Yu1, Wenming Zheng4

1School of Computer and Control Engineering, Yantai University, Yantai, China; 2School of Information Science and Engineering, Southeast University, Nanjing, China; 3School of Opto-electronic Information Science and Technology, Yantai University, Yantai, China; 4Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing, China



O3-5: Neural Networks based Channel Compensation for I-Vector Speaker Verification

Wei Rao1, Xiong Xiao1, Chenglin Xu1, 2, Haihua Xu1, Kong Aik Lee4, Eng Siong Chng1, 2, Haizhou Li2, 3, 4

1Temasek Laboratories, Nanyang Technological University, Singapore; 2School of Computer Science and Engineering, Nanyang Technological University, Singapore; 3Department of Electrical and Computer Engineering, National University of Singapore, Singapore; 4Human Language Technology Department, Institute for Infocomm Research, A*STAR, Singapore



O3-6: Senone I-Vectors for Robust Speaker Verification

Zhili Tan1, Yingke Zhu2, Man-Wai Mak1, Brian Kan-Wing Mak2

1Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University; 2Dept. of Computer Science and Engineering, The Hong Kong University of Science and Technology



Oral Session O4: Source Localization and Separation

Time: 16:00–18:00, October 18, 2016 (Tuesday)

Session Chairs:  Tai-Shih Chi, Qiang Fu



O4-1: Multi-Task Joint-Learning for Robust Voice Activity Detection

Yimeng Zhuang, Sibo Tong, Maofan Yin, Yanmin Qian, Kai Yu

Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, SpeechLab, Department of Computer Science and Engineering, Brain Science and Technology Research Center, Shanghai Jiao Tong University, Shanghai, China



O4-2: A Regression Approach to Binaural Speech Segregation via Deep Neural Network

Nana Fan, Jun Du, Lirong Dai

National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, China



O4-3: Interaural Coherence Induced Ideal Binary Mask for Binaural Speech Separation and Dereverberation

Yi-Ting Chen, Tzu-Hao Chen, Mao-Chang Huang, Tai-Shih Chi

Department of Electrical and Computer Engineering, National Chiao Tung University, Hsinchu



O4-4: Voice Activity Detection Based on Sequential Gaussian Mixture Model with Maximum Likelihood Criterion

Zhan Shen1, Jianguo Wei1, Wenhuan Lu2, Jianwu Dang1, 3

1School of Computer Science and Technology, Tianjin University; 2School of Computer Software, Tianjin University; 3School of Information, JAIST, Japan



O4-5: Robust Multiple Speech Source Localization Based on Phase Difference Regression

Zhaoqiong Huang, Ge Zhan, Dongwen Ying, Ruohua Zhou, Jielin Pan, Yonghong Yan

Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences



O4-6: Improvement of Mask-Based Speech Source Separation Using DNN

Ge Zhan, Zhaoqiong Huang, Dongwen Ying, Jielin Pan, Yonghong Yan

Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences



Oral Session O5: Applications of Spoken Language Technologies

Time: 8:30–10:30, October 19, 2016 (Wednesday)

Session Chairs:  Zhiyong Wu, Hsin-Min Wang



O5-1: Investigation of the Effects of Automatic Scoring Technology on Human Raters' Performances in L2 Speech Proficiency Assessment

Dean Luo1, Wentao Gu2, Ruxin Luo3, Lixin Wang4

1Shenzhen Institute of Information Technology; 2Nanjing Normal University; 3Shenzhen Polytechnic; 4Shenzhen Seaskyland Technologies



O5-2: THear: Development of a Mobile Multimodal Audiometry Application on a Cross-Platform Framework

Wai-Kim Leung1, 2, 3, Jia Jia1, 2, 3, Yuhao Wu1, 2, 3, Jiayu Long2, 3, 4, Xiulong Zhang1, 2, 3, Lianhong Cai1, 2, 3

1Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; 2Key Laboratory of Pervasive Computing, Ministry of Education; 3Tsinghua National Laboratory for Information Science and Technology (TNList); 4Department of Information Art & Design, Tsinghua University, Beijing 100084, China



O5-3: Dialog State Tracking for Interview Coaching Using Two-Level LSTM

Ming-Hsiang Su, Chung-Hsien Wu, Kun-Yi Huang, Tsung-Hsien Yang, Tsui-Ching Huang

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan



O5-4: Evaluation of a Multimodal 3-D Pronunciation Tutor for Learning Mandarin as a Second Language: An Eye-Tracking Study

Ying Zhou1, Fei Chen2, Hui Chen3, Lan Wang2, Nan Yan2

1School of Information Engineering, Wuhan University of Technology, and Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; 2Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; 3Institute of Software, Chinese Academy of Sciences



O5-5: A Multi-Channel/Multi-Speaker Interactive 3D Audio-Visual Speech Corpus in Mandarin

Jun Yu1, 2, 3, Rongfeng Su1, 2, Lan Wang1, 2, Wenpeng Zhou1, 2

1Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; 2The Chinese University of Hong Kong, Hong Kong; 3School of Information Science and Engineering, Lanzhou University, Lanzhou, China



O5-6: Realizing Speech to Gesture Conversion by Keyword Spotting

Na Zhao, Hongwu Yang

College of Physics and Electronic Engineering, Northwest Normal University



Oral Session O6: Neural Networks for Speech and Language Processing

Time: 13:30–15:30, October 19, 2016 (Wednesday)

Session Chairs:  Jun Du, Dong Yu



O6-1: Long Short-term Memory Recurrent Neural Network based Segment Features for Music Genre Classification

Jia Dai1, Shan Liang1, Wei Xue1, Chongjia Ni2, Wenju Liu1

1National Laboratory of Pattern Recognition, Institute of Automation, University of Chinese Academy of Sciences, Beijing, China; 2School of Mathematic and Quantitative Economics, Shandong University of Finance and Economics, Shandong, China



O6-2: Investigating Gated Recurrent Neural Networks for Acoustic Modeling

Yuanyuan Zhao, Jie Li, Shuang Xu, Bo Xu

Interactive Digital Media Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China



O6-3: Detection of Mood Disorder Using Speech Emotion Profiles and LSTM

Tsung-Hsien Yang, Chung-Hsien Wu, Kun-Yi Huang, Ming-Hsiang Su

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan



O6-4: Gated Recurrent Units Based Hybrid Acoustic Models for Robust Speech Recognition

Jian Kang, Weiqiang Zhang, Jia Liu

Tsinghua University



O6-5: Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features

Ju Lin, Yanlu Xie, Yingming Gao, Jinsong Zhang

Beijing Language and Culture University



O6-6: Investigating LSTM for Punctuation Prediction

Kaituo Xu1, Lei Xie1, Kaisheng Yao2

1School of Computer Science, Northwestern Polytechnical University, Xi'an; 2Microsoft Corporation, Redmond, WA



Oral Session O7: Acoustic Modeling

Time: 13:30–15:30, October 19, 2016 (Wednesday)

Session Chairs:  Jen-Tzung Chien, Dong Wang



O7-1: Investigating Neural Network based Query-by-Example Keyword Spotting Approach for Personalized Wake-up Word Detection in Mandarin Chinese

Jingyong Hou, Lei Xie, Zhonghua Fu

School of Computer Science, Northwestern Polytechnical University, Xi'an



O7-2: Robust Front-End for Speech Recognition by Human and Machine in Noisy Reverberant Environments: The Effect of Phase Information

Yang Liu, Naushin Nower, Shota Morita, Masashi Unoki

School of Information Science, Japan Advanced Institute of Science and Technology



O7-3: Cluster-Based Senone Selection for the Efficient Calculation of Deep Neural Network Acoustic Models

Junhua Liu1, 2, Zhenhua Ling1, Si Wei2, Guoping Hu2, 3, Lirong Dai1

1National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; 2Research Department, iFLYTEK Co., LTD., Hefei, China; 3Key Laboratory of Ministry of Public Security for Intelligent Speech Technology, Hefei, China



O7-4: CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition

Jiangyan Yi1, 3, Hao Ni1, 3, Zhengqi Wen1, Bin Liu1, Jianhua Tao1, 2, 3

1National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 3School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing, China



O7-5: Senone Log-Likelihood Ratios Based Articulatory Features in Pronunciation Erroneous Tendency Detecting

Leyuan Qu, Yanlu Xie, Jinsong Zhang

Beijing Language and Culture University, Beijing, China



O7-6: Exploiting Language-Mismatched Phoneme Recognizers for Unsupervised Acoustic Modeling

Siyuan Feng1, Tan Lee1, Haipeng Wang2

1Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong; 2Noah's Ark Lab, Huawei Technologies Co., Ltd., Hong Kong



Oral Session O8: L2 Speech Perception

Time: 13:30–15:30, October 19, 2016 (Wednesday)

Session Chairs:  Jiahong Yuan, Jinsong Zhang



O8-1: The Perception of the English Alveolar-Velar Nasal Coda Contrast by Monolingual versus Bilingual Chinese Speakers

Minghui Wu1, Marjoleine Sloos2, Jeroen van de Weijer3

1Shanghai University of International Business and Economics; 2Fryske Akademy (KNAW, Royal Netherlands Academy of Sciences), Ljouwert, The Netherlands; 3Shanghai International Studies University



O8-2: Recognition of Spoken Words in L2 Speech Using L1 Probabilistic Phonotactics: Evidence from Cantonese-English Bilinguals

Michael C. W. Yip

Department of Psychological Studies, The Education University of Hong Kong, Hong Kong



O8-3: Acoustic Correlates and Gender Effects in Production and Perception of Japanese Polite Speech

Shuju Shi1, 2, Chiharu Tsurutani3, Xiaoli Feng1, Jinsong Zhang1, Nobuaki Minematsu2

1Beijing Language and Culture University, China; 2The University of Tokyo, Japan; 3Griffith University, Australia



O8-4: The Examination of the Relationship between Perception and Production of Mandarin Tone of Kazak Students

Yali Liu, Zihou Meng

Communication Acoustic Laboratory, Communication University of China, Beijing, China



O8-5: Production and Perception of Focus in L2 Mandarin of Qiang Speakers

Xiaxia Zhang, Bei Wang

Minzu University of China



O8-6: A Study on Perceptual Training of Japanese CSL Learners to Discriminate Mandarin Lexical Tones

Feiya Li, Yanlu Xie, Xiaomin Yu, Jinsong Zhang

College of Information Sciences, Beijing Language and Culture University



Oral Session O9: Speech Synthesis

Time: 8:30–10:30, October 20, 2016 (Thursday)

Session Chairs:  Chen-Yu Chiang, Wentao Gu



O9-1: Improvements on Punctuation Generation Inspired Linguistic Features for Mandarin Prosody Generation

Chen-Yu Chiang1, Yu-Ping Hung1, Guan-Ting Liou2, Yih-Ru Wang2

1Dept. of Communication Engineering, National Taipei University, New Taipei City; 2Dept. of Electrical Engineering, National Chiao Tung University, Hsinchu



O9-2: Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network

Chin-Cheng Hsu1, Hsin-Te Hwang1, Yi-Chiao Wu1, Yu Tsao2, Hsin-Min Wang1

1Institute of Information Science, Academia Sinica, Taipei; 2Research Center for Information Technology Innovation, Academia Sinica, Taipei



O9-3: A Bi-directional LSTM Approach for Polyphone Disambiguation in Mandarin Chinese

Changhao Shan1, Lei Xie1, Kaisheng Yao2

1School of Computer Science, Northwestern Polytechnical University, Xi'an; 2Microsoft Corporation, Redmond



O9-4: DNN-Based Unit Selection Using Frame-Sized Speech Segments

Zhiping Zhou, Zhenhua Ling

University of Science and Technology of China



O9-5: Text-Based Sentential Stress Prediction Using Continuous Lexical Embedding for Mandarin Speech Synthesis

Yibin Zheng1, 3, Ya Li1, Zhengqi Wen1, Bin Liu1, Jianhua Tao1, 2, 3

1National Laboratory of Pattern Recognition; 2CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 3School of Computer and Control Engineering, University of Chinese Academy of Sciences



O9-6: Learning Auxiliary Categorical Information for Speech Synthesis based on Deep and Recurrent Neural Networks

Zhengqi Wen1, Kehuang Li2, Zhen Huang2, Jianhua Tao1, Chin-Hui Lee2

1National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA



Oral Session O10: Speech Production

Time: 13:30–15:30, October 20, 2016 (Thursday)

Session Chairs:  Kiyoshi Honda, Yuan Jia



O10-1: A Post-Thyroidectomy Aerodynamic Study in Patients Suffering or not from Recurrent Laryngeal Paralysis

Ming Xiu1, Camille Fauth1, Béatrice Vaxlaire1, Jean-François Rodier2, Pierre-Philippe Volkmar2, Rudolph Sock1, 3

1U.R. Parole & Cognition, E.A. 1339 Linguistique, Langues et Parole (LiLPa) et Institut de Phonétique de Strasbourg (IPS), Université de Strasbourg; 2Groupement Hospitalier Saint Vincent, Clinique Sainte Anne, Strasbourg; 3Language, Information and Communication Laboratory, LICOLAB, Université Pavol



O10-2: Individual Difference and Acoustic Effect of Female Laryngeal Cavities

Jing Li, Kiyoshi Honda, Ju Zhang, Jianguo Wei

School of Computer Science and Technology, Tianjin University, Tianjin, China



O10-3: Contributions of the Piriform Fossa of Female Speakers to Vowel Spectra

Congcong Zhang, Kiyoshi Honda, Ju Zhang, Jianguo Wei

School of Computer Science and Technology, Tianjin University, Tianjin, China



O10-4: An Interface Research on Rhetorical Structure and Prosody Features in Chinese Reading Texts

Liang Zhang, Yuan Jia, Aijun Li

Institute of Linguistics, Chinese Academy of Social Sciences, Beijing



O10-5: EEG Evidence for a Three-Phase Recurrent Process during Spoken Word Processing

Bin Zhao1, Jianwu Dang1, 2, Gaoyan Zhang1

1Tianjin Key Laboratory of Cognitive Computing and Application, School of Computer Science and Technology, Tianjin University, Tianjin, China; 2Japan Advanced Institute of Science and Technology, Japan



O10-6: The Influence of Syllable Structure and Prosodic Strengthening on Consonant Production in Shanghai Chinese

Bijun Ling, Jie Liang

Tongji University



Oral Session O11: Automatic Speech Recognition

Time: 13:30–15:30, October 20, 2016 (Thursday)

Session Chairs:  Wenju Liu, Chung-Hsien Wu



O11-1: Unsupervised Speaker Adaptation of BLSTM-RNN for LVCSR Based on Speaker Code

Zhiying Huang1, Shaofei Xue2, Zhijie Yan2, Lirong Dai1

1National Engineering Laboratory of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; 2Alibaba Inc.



O11-2: Applying Connectionist Temporal Classification Objective Function to Chinese Mandarin Speech Recognition

Pengrui Wang, Jie Li, Bo Xu

Interactive Digital Media Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China



O11-3: Rapid Speaker Adaptation Based on D-code Extracted from BLSTM-RNN in LVCSR

Shaofei Xue1, Zhijie Yan1, Zhiying Huang2, Lirong Dai2

1Alibaba Inc.; 2National Engineering Laboratory of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China



O11-4: Automatic Acoustic Segmentation in N-best List Rescoring for Lecture Speech Recognition

Peng Shen, Xugang Lu, Hisashi Kawai

National Institute of Information and Communications Technology, Japan



O11-5: An Investigation of Adaptation Techniques for Building Acoustic Models for Hearing-impaired Children in a CAPT Application

Yingke Zhu, Brian Mak

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology



O11-6: Lattice Based Transcription Loss for End-to-End Speech Recognition

Jian Kang, Weiqiang Zhang, Jia Liu

Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing, China



Special Session S1: Speech Processing for Biomedical Applications

Time: 10:20–12:00, October 18, 2016 (Tuesday)

Session Chairs:  Qingyang Hong, Ming Li



S1-1: Towards Automatic Assessment of Aphasia Speech Using Automatic Speech Recognition Techniques

Ying Qin1, Tan Lee1, Anthony Pak Hin Kong2, Sam Po Law3

1Department of Electronic Engineering, The Chinese University of Hong Kong; 2Department of Communication Sciences and Disorders, University of Central Florida; 3Division of Speech and Hearing Science, University of Hong Kong



S1-2: Classification Between Normal and Adventitious Lung Sounds Using Deep Neural Network

Lin Li1, Wenhao Xu1, Qingyang Hong1, Feng Tong2, Jinzhun Wu3

1School of Information Science and Technology, Xiamen University, China; 2Key Lab of Underwater Acoustic Communication and Marine Information Technology of MOE, Xiamen University, China; 3The First Affiliated Hospital of Xiamen University, China



S1-3: Speaker Diarization System for Autism Children's Real-Life Audio Data

Tianyan Zhou1, Weicheng Cai1, Xiaoyan Chen3, Xiaobing Zou3, Shilei Zhang4, Ming Li1, 2

1SYSU-CMU Joint Institute of Engineering, Sun Yat-sen University, China; 2SYSU-CMU Shunde International Joint Research Institute, China; 3The Third Affiliated Hospital, Sun Yat-sen University, China; 4Speech Technology and Solution Group, IBM China Research Laboratory



S1-4: Discriminating Features of Infant Cry Acoustic Signal Towards Automated Diagnosis of Cause of Crying

Vinay Kumar Mittal

Indian Institute of Information Technology Chittoor, Sri City



S1-5: Exploratory Data Analysis on Nuclei in Cantonese Dysarthric Speech

Ka Ho Wong1, Hoi Kiu Kristy Mok1, Helen Meng1, 2

1Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management; 2Stanley Ho Big Data Decision Analytics Research Centre, The Chinese University of Hong Kong, Hong Kong



Special Session S2: Prosody in Interaction—Studies of Spoken Dialogues and Discourses

Time: 8:30–10:30, October 19, 2016 (Wednesday)

Session Chairs:  Aijun Li, Bo Xu



S2-1: Advance Prosodic Indexing - Acoustic Realization of Prompted Information Projection in Continuous Speeches and Discourses

Helen Chen, Weite Fang, Chiu-yu Tseng

Phonetics Lab, Institute of Linguistics, Academia Sinica, Taipei



S2-2: Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin

Yibin Zheng1, 3, Ya Li1, Zhengqi Wen1, Bin Liu1, Jianhua Tao1, 2, 3

1National Laboratory of Pattern Recognition; 2CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 3School of Computer and Control Engineering, University of Chinese Academy of Sciences



S2-3: Prosodic Annotation Enriched Statistical Machine Translation

Peidong Guo, Heyan Huang, Ping Jian, Yuhang Guo

Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Department of Computer Science and Technology, Beijing Institute of Technology, China



S2-4: The Effect of Information Structure on the Distribution of Stress Degree in Chinese Reading Texts

Yuan Jia

Institute of Linguistics, Chinese Academy of Social Sciences, Beijing, China



S2-5: A Linguistic Annotation Scheme of Chinese Discourse Structures and Study of Prosodic Interactions

Yuan Jia, Aijun Li

Institute of Linguistics, Chinese Academy of Social Sciences



S2-6: Gender and Prosodic Entrainment in Mandarin Conversations

Zhihua Xia1, Qiuwu Ma2

1Jiangsu Normal University, Jiangsu Province, China; 2Tongji University, Shanghai, China



Special Session S3: Speech Recognition at Adverse Acoustic Conditions

Time: 8:30–10:30, October 20, 2016 (Thursday)

Session Chairs:  Fei Chen, Jeih-weih Hung, Yu Tsao



S3-1: Vector Taylor Series Expansion with Auditory Masking for Noise Robust Speech Recognition

Biswajit Das, Ashish Panda

TCS Innovation Labs-Mumbai, Yantra Park, Thane, Maharashtra, India



S3-2: Deep Long Short-Term Memory Networks for Speech Recognition

Jen-Tzung Chien, Alim Misbullah

Department of Electrical and Computer Engineering, National Chiao Tung University, Hsinchu



S3-3: Employing Median Filtering to Enhance the Complex-Valued Acoustic Spectrograms in Modulation Domain for Noise-Robust Speech Recognition

Hsin-Ju Hsieh1, 2, Berlin Chen1, Jeih-weih Hung2

1National Taiwan Normal University; 2National Chi Nan University



S3-4: Comparison of Regularization Constraints in Deep Neural Network based Speaker Adaptation

Peng Shen, Xugang Lu, Hisashi Kawai

National Institute of Information and Communications Technology, Japan



S3-5: Improving the Performance of Speech Perception in Noisy Environment Based on a FAME Strategy

Ying-Hui Lai1, Syu-Siang Wang2, Yu-Ting Su3, Cheng Han-Che3, Fan Kang Fu3, Yu Tsao2

1Department of Electrical Engineering, Yuan Ze University, Taoyuan; 2Research Center for Information Technology Innovation, Academia Sinica, Taipei; 3Department of Mechatronic Engineering, National Taiwan Normal University, Taipei



S3-6: A Speaker-Dependent Deep Learning Approach to Joint Speech Separation and Acoustic Modeling for Multi-Talker Automatic Speech Recognition

Yan-Hui Tu1, Jun Du1, Lirong Dai1, Chin-Hui Lee2

1University of Science and Technology of China; 2Georgia Institute of Technology



Poster Session P1: Speech Prosody and Speech Generation

Time: 16:00–18:00, October 18, 2016 (Tuesday)

Session Chairs:  Minghui Dong, Zhiyong Wu



P1-1: Tongue Shape Variation Model for Simulating Mandarin Chinese Articulation

Jinguang Zhang, Xiyu Wu, Jiangping Kong

Department of Chinese Language and Literature, Peking University, Beijing, China



P1-2: Rich Prosodic Information Exploration on Spontaneous Mandarin Speech

Cheng-Hsien Lin1, 3, Chung-Long You1, Chen-Yu Chiang2, Yih-Ru Wang1, Sin-Horng Chen1

1Dept. of Electrical Engineering, National Chiao Tung University, Hsinchu; 2Dept. of Communication Engineering, National Taipei University; 3Information & Communications Research Labs, Industrial Technology Research Institute



P1-3: Prosodic Strength Intrinsic to Lexical Items: A Corpus Study on Tone Reduction in Tone4+Tone4 Words in Mandarin Chinese

Wei Lai1, Mark Liberman1, Jiahong Yuan1, Xiaoying Xu2

1University of Pennsylvania; 2Beijing Normal University



P1-4: The Design and Implementation of HMM-based Dai Speech Synthesis

Zhan Wang, Jian Yang, Xin Yang

School of Information Science and Technology, Yunnan University, Kunming, China



P1-5: Study on the Relation of Fundamental and Formant Frequencies for Affective Speech Synthesis

Bogu Li1, Zhilei Liu1, Jianwu Dang1, 2

1Tianjin Key Lab. of Cognitive Computing and Application, Tianjin University, Tianjin, China; 2Japan Advanced Institute of Science and Technology, Ishikawa, Japan



P1-6: Spatial Co-variation of Lip and Tongue at Strong and Weak Syllables

Ju Zhang1, Kiyoshi Honda1, Jianguo Wei2, Jianrong Wang1, Jianwu Dang1, 3

1School of Computer Science and Technology, Tianjin University, Tianjin; 2School of Computer Software, Tianjin University, Tianjin, China; 3School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan



P1-7: Discourse Prosody and Its Application to Speech Synthesis

Na Hu, Pengfei Shao, Yiqing Zu, Zuyan Wang, Wei Huang, Shijin Wang

iFLYTEK Research, Hefei, China



P1-8: Perceptual Evaluation of Natural and Synthesized Speech with Prosodic Focus in Mandarin Production of American Learners

Ying Chen, Li Liu, Xueqin Zhao

Nanjing University of Science and Technology



P1-9: DBLSTM-Based Multi-Task Learning for Pitch Transformation in Voice Conversion

Runnan Li1, Zhiyong Wu1, 2, Helen Meng1, 2, Lianhong Cai1

1Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems, Graduate School at Shenzhen, Tsinghua University; 2Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong



P1-10: The Singing Voice Before and After Vocal Warm-up by Students of Chinese National Singing

Yu Chen1, Weifeng Kong2, Yujie Chi3, Yanting Chen1, Jianguo Wei3, Jianwu Dang3

1Tianjin University of Technology, Tianjin, China; 2Tianjin Conservatory of Music, Chinese Academy of Sciences, Tianjin, China; 3Tianjin University, Tianjin, China



P1-11: The Correlation Between Signal Distance and Consonant Pronunciation in Mandarin Words

Huijun Ding1, Chenxi Xie1, Lei Zeng1, Yang Xu1, Guo Dan1, 2

1Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University, 3688 Nanhai Ave., Shenzhen, Guangdong, China; 2Center for Neurorehabilitation, Shenzhen Institute of Neuroscience, Shenzhen, China



P1-12: Automatic Mandarin Prosody Boundary Detecting Based on Tone Nucleus Features and DNN Model

Ju Lin, Yanlu Xie, Wei Zhang, Jinsong Zhang

Beijing Language and Culture University



P1-13: Prosodic Cues in Polite and Rude Mandarin Speech

Ping Fan, Wentao Gu

Nanjing Normal University



Poster Session P2: Topics on Speech Signal Processing

Time: 8:30–10:30, October 19, 2016 (Wednesday)

Session Chairs: Dong Wang, Nengheng Zheng



P2-1: Microphone Array Speech Denoising Modeled by Tensor Filtering

Jing Wang, Yahui Shan, Shequan Jiang, Xiang Xie

School of Information and Technology, Beijing Institute of Technology, China



P2-2: First Investigation of Universal Speech Attributes for Speaker Verification

Sheng Zhang1, Wu Guo1, Guoping Hu2

1National Engineering Laboratory of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China; 2Key Laboratory of Intelligent Speech Technology, Ministry of Public Security, Hefei, China



P2-3: Binary Speaker Embedding

Lantian Li, Chao Xing, Dong Wang, Kaimin Yu, Thomas Fang Zheng

Center for Speech and Language Technologies, Tsinghua University, China



P2-4: A Sparse Representation of the Excitation Source Characteristics of Nonnormal Speech Sounds

Vinay Kumar Mittal1, B. Yegnanarayana2

1Indian Institute of Information Technology Chittoor, Sri City, Andhra Pradesh, India; 2International Institute of Information Technology, Hyderabad, Telangana, India



P2-5: Interferences Suppression Using Two Closely-Spaced Microphones

Zhonghua Fu

School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, China



P2-6: A Study of Variational Method for Text-Independent Speaker Recognition

Liang He1, Yao Tian1, Yi Liu1, Fang Dong2, Weiqiang Zhang1, Jia Liu1

1Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing, China; 2School of Information and Electrical Engineering, Zhejiang University City College, Hangzhou, China



P2-7: Evaluation of the Deep Nonlinear Metric Learning Based Speaker Identification on the Large Scale of Voiceprint Corpus

Feng Yong1, Cai Xinyuan2, Ji Ruifang2

1School of Automation, Chongqing University, Chongqing, China; 2Interactive Digital Media Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China



P2-8: Learning FOFE Based FNN-LMs with Noise Contrastive Estimation and Part-of-Speech Features

Junfeng Hou, Shiliang Zhang, Lirong Dai

National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui, China



P2-9: A Pseudo-Task Design in Multi-Task Learning Deep Neural Network for Speaker Recognition

Xugang Lu1, Peng Shen1, Yu Tsao2, Hisashi Kawai1

1National Institute of Information and Communications Technology, Japan; 2Research Center for Information Technology Innovation, Academia Sinica



P2-10: Exploring Tonal Information for Lhasa Dialect Acoustic Modeling

Jian Li1, Hongcui Wang1, Longbiao Wang1, Jianwu Dang1, 2, Kuntharrgyal Khuru1, Gyaltsen Lobsang1

1Tianjin Key Laboratory of Cognitive Computing and Application, School of Computer Science and Technology, Tianjin University, Tianjin, China; 2Japan Advanced Institute of Science and Technology, Japan



P2-11: Spatial Dispersion Constrained NMF for Monaural Source Separation

Viet-Hang Duong1, Yuan-Shan Lee1, Bach-Tung Pham1, Seksan Mathulaprangsan1, Pham-The Bao2, Jia-Ching Wang1

1Department of Computer Science and Information Engineering, National Central University, Jhongli; 2Faculty of Mathematics & Computer Sciences, University of Science, Ho Chi Minh City, Viet Nam



P2-12: An Adaptive Filter with Gain and Time-shift Parameters for Echo Cancellation

Zhiping Zhang, Zhiqiang Wu

Wright State University



P2-13: F0 Estimation of Speech Based on IRAPT Using WLP-Based TV-CAR Analysis

Wei Shan1, Keiichi Funaki2

1Graduate School of Engineering & Science, University of the Ryukyus; 2C&N Center, University of the Ryukyus



P2-14: Phone Recognition for Lhasa-Tibetan Based on Articulatory Features Augmentation Learning

Yue Zhao1, Rui Zhao1, Xiaona Xu1, Licheng Wu1, Qiang Ji2

1School of Information Engineering, Minzu University of China, Beijing; 2Rensselaer Polytechnic Institute, Troy, USA



P2-15: HMM-Based Cue Parameters Estimation for Speech Enhancement

Feng Deng, Changchun Bao, Mao-shen Jia

Speech and Audio Signal Processing Laboratory, School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, China



Poster Session P3: Speech Recognition and Related Topics

Time: 8:30–10:30, October 20, 2016 (Thursday)

Session Chairs: Hung-yi Lee, Lan Wang



P3-1: Improving Accented Mandarin Speech Recognition by Using Recurrent Neural Network based Language Model Adaptation

Hao Ni1, 3, Jiangyan Yi1, 3, Zhengqi Wen1, Bin Liu1, Jianhua Tao1, 2, 3

1National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 3School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing, China



P3-2: Unsatisfied Customer Call Detection with Deep Learning

Pengyu Cong, Chaomin Wang, Zhijie Ren, Huixin Wang, Yanmeng Wang, Junlan Feng

China Mobile Research



P3-3: Mismatched Training Data Enhancement for Automatic Recognition of Children's Speech Using DNN-HMM

Mengjie Qian1, Ian McLoughlin2, Wu Guo1, Lirong Dai1

1National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui, China; 2School of Computing, The University of Kent, Medway, Kent, UK



P3-4: Pronunciation Error Detection Using DNN Articulatory Model Based on Multi-Lingual and Multi-Task Learning

Richeng Duan1, Tatsuya Kawahara1, Masatake Dantsuji2, Jinsong Zhang3

1School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan; 2Academic Center for Computing and Media Studies, Kyoto University; 3School of Information Science, Beijing Language and Culture University, Beijing



P3-5: Rich Punctuations Prediction Using Large-Scale Deep Learning

Xueyang Wu, Su Zhu, Yue Wu, Kai Yu

Key Lab. of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, SpeechLab, Department of Computer Science and Engineering, Brain Science and Technology Research Center, Shanghai Jiao Tong University, Shanghai, China



P3-6: Confidence Estimation for Speech Recognition Systems Using Conditional Random Fields Trained with Partially Annotated Data

Sheng Li1, Xugang Lu2, Shinsuke Mori1, Yuya Akita1, Tatsuya Kawahara1

1Kyoto University, Sakyo-ku, Kyoto, Japan; 2National Institute of Information and Communications Technology, Kyoto, Japan



P3-7: On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation

Tianxing He1, Yu Zhang2, Jasha Droppo3, Kai Yu1

1Shanghai Jiao Tong University; 2MIT; 3Microsoft Research



P3-8: Comparison of DCT and Autoencoder-Based Features for DNN-HMM Multimodal Silent Speech Recognition

Licheng Liu, Yan Ji, Hongcui Wang, Bruce Denby

Tianjin University



P3-9: Directed Automatic Speech Transcription Error Correction Using Bidirectional LSTM

Da Zheng, Zhehuai Chen, Yue Wu, Kai Yu

Shanghai Jiao Tong University



P3-10: End-to-end Keywords Spotting Based on Connectionist Temporal Classification for Mandarin

Ye Bai1, 3, Jiangyan Yi1, 3, Hao Ni1, 3, Zhengqi Wen1, Bin Liu1, Ya Li1, Jianhua Tao1, 2, 3

1National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China; 3School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing, China



P3-11: Exploiting Noisy Web Data by OOV Ranking for Low-Resource Keyword Search

Zhipeng Chen, Ji Wu

Multimedia Signal and Intelligent Information Processing Lab, Department of Electronic Engineering, Tsinghua University, Beijing, China



P3-12: Automatic Detection of Rhythmic Patterns in Native and L2 Speech: Chinese, Japanese and Japanese L2 Chinese

Shuju Shi1, 2, Yanlu Xie1, Xiaoli Feng1, Jinsong Zhang1

1Beijing Language and Culture University, P.R. China; 2University of Illinois at Urbana-Champaign, USA



P3-13: Incorporating Local Environment Information with Ensemble Neural Networks to Robust Automatic Speech Recognition

Chia-Yung Hsu1, Ryandhimas E. Zezario1, Jia-Ching Wang1, Chin-Wen Ho1, Xugang Lu2, Yu Tsao3

1Department of Computer Science and Information Engineering, National Central University; 2National Institute of Information and Communications Technology, Japan; 3Research Center for Information Technology Innovation, Academia Sinica, Taipei



Poster Session P4: Atypical Speech Perception and Production

Time: 13:30–15:30, October 20, 2016 (Thursday)

Session Chairs: Longbiao Wang, Nan Yan



P4-1: Categorical Perception of Two Pairs of Mandarin Tones in Bimodal Cochlear Implanted Children

Wentao Gu1, Jiao Yin1, James Mahshie2

1Institute of Linguistic Science and Technology, Nanjing Normal University, China; 2Department of Speech and Hearing Science, George Washington University, USA



P4-2: L1/L2 Difference in Phonological Sensitivity and Information Planning - Evidence from F0 Patterns

Chao-yu Su1, 2, 3, Chiu-yu Tseng1

1Institute of Linguistics, Academia Sinica; 2Taiwan International Graduate Program, Academia Sinica; 3Institute of Information Systems and Application, National Tsing Hua University, Hsinchu



P4-3: Cantonese Spoken Word Retention by Speakers with and without Congenital Amusia: Implications from Phonological Similarity and Cognitive Load Effects

Xiao Wang1, Gang Peng2, 3

1The Chinese University of Hong Kong; 2The Hong Kong Polytechnic University; 3Shenzhen Institutes of Advanced Technology



P4-4: Tongue Performance in Articulating Mandarin Apical Syllables by Prelingual Deaf Adults Using Ultrasonic Technology: Two Case Studies

Quan Zhou1, Yu Chen1, Yanting Chen1, Hao Zhang2, Jianguo Wei3, Jianwu Dang3

1Tianjin University of Technology, Tianjin, China; 2Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; 3Tianjin University, Tianjin, China



P4-5: Investigation of the Spatiotemporal Dynamics of the Brain during Perceiving Words

Yuke Si1, Jianwu Dang1, 2, Gaoyan Zhang1

1Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin, China; 2Japan Advanced Institute of Science and Technology, Japan



P4-6: The Effect of Gain Thresholds on Speech Intelligibility for Statistical Model Based Noise Reduction for Cochlear Implants: A Simulation Based Verification

Wenzhi He1, Nengheng Zheng1, Qinglin Meng2

1College of Information Engineering, Shenzhen University; 2Acoustic Lab., School of Physics and Optoelectronics, South China University of Technology



P4-7: The Perceptual Cues of Nasal Finals in Standard Chinese

Yanping Li, Yanlu Xie, Luoduo Feng, Jinsong Zhang

Beijing Language and Culture University, Beijing, China



P4-8: English Stress Acquisition by Native Speakers of Tibetan

Dan Hu1, Hui Feng1, 2, 3, Tongyu Wu4

1School of Foreign Languages and Literature, Tianjin University, Tianjin; 2Research Center for Linguistic Sciences, Tianjin University, Tianjin; 3Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin; 4Linguistics and English Language Department, Lancaster University, Lancaster, UK



P4-9: A Study on Functional Load of Chinese Prosodic Boundaries under Reduction of Syllable Information

Yue Chen, Yanlu Xie, Bin Wu, Jinsong Zhang

Beijing Language and Culture University



P4-10: Effects of Background Noise and Tonal Target Stimulus on Human Auditory Evoked Potential

Lei Wang, Fei Chen

Southern University of Science and Technology



P4-11: Relationship between Perception and Production of English Vowels by Chinese English Learners

Aihui Zhang1, Hui Feng1, 2, 3, Siyu Wang3, Jianwu Dang3, 4

1School of Foreign Languages and Literature, Tianjin University, Tianjin; 2Research Center for Linguistic Sciences, Tianjin University, Tianjin; 3Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin; 4School of Information Science, Japan Advanced Institute of Science and Technology, Japan



P4-12: PVD: A New Pathological Voice Dataset for Intra-Speaker Recognition Research Interest

Dongdong Li1, Jianyu Wang1, Yingchun Yang2

1School of Information Science and Engineering, East China University of Science and Technology, Shanghai, P.R. China; 2College of Computer Science and Technology, Zhejiang University, Hangzhou, P.R. China



P4-13: Mandarin Neutral Tone by Native Speakers and Cantonese L2 Learners

Lei Liu, Nan Huang, Wentao Gu

Nanjing Normal University



P4-14: Vowels as Acoustic Cues for Sub-Dialect Identification in Chinese

Huangmei Liu, Jie Liang

Tongji University