Special Sessions

The Speech Prosody 2012 Organizing Committee is pleased to announce acceptance of the following two Special Sessions at Speech Prosody 2012 to be held on 22-25 May, 2012, in Shanghai.
The deadline for paper submission for special sessions is December 15, 2011.


SS-1: Control of prosodic features for expressive speech synthesis with less speech corpus
SS-2: Prosody and languages in contact


SS-1: Control of prosodic features for expressive speech synthesis with less speech corpus

Selection-based methods that concatenate segments of human speech already achieve high quality in speech synthesis. Although such methods can produce synthetic speech with various voice qualities and speaking styles, they require large speech corpora of the targeted quality and style. For this reason, speech conversion techniques are now attracting considerable interest among researchers. HMM/GMM-based methods are widely used, but they have several major problems from the standpoint of prosody: prosodic features span a wider time range than segmental features, so frame-by-frame processing is not always appropriate. In this special session, we aim to identify these problems and their solutions. Topics include: prosodic control in HMM-based speech synthesis, conversion of prosody for voice/style adaptation, expression of linguistic and para-/non-linguistic information, prosody modeling, under-resourced languages, etc.


Keikichi Hirose (University of Tokyo)
He received the B.E. degree in electrical engineering in 1972, and the M.E. and Ph.D. degrees in electronic engineering in 1974 and 1977, respectively, from the University of Tokyo. He has been a faculty member at the University of Tokyo since 1977, and became a Professor in the Department of Electronic Engineering in 1994. He is currently a Professor at the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, University of Tokyo. From March 1987 to January 1988, he was a Visiting Scientist at the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, U.S.A. He has been engaged in a wide range of research on spoken language processing, including analysis, synthesis, recognition, dialogue systems, and computer-assisted language learning. From 2000 to 2004, he was Principal Investigator of the national project "Realization of advanced spoken language information processing utilizing prosodic features," supported by the Japanese Government. He served as Chair of the Speech Committee, Institute of Electronics, Information and Communication Engineers (IEICE)/Acoustical Society of Japan (ASJ), from 2003 to 2005. He has been Chair of the Speech Prosody Special Interest Group (SPro-SIG), ISCA, since October 2010. He has been on the editorial board of the journal Speech Communication since 2004. He is a Fellow of Institute of Information and Communication Engineering and a member of a number of academic societies, including IEEE, International Speech Communication Association (Board member), Acoustical Society of America, Acoustical Society of Japan, Information Processing Society of Japan, Japanese Society for Artificial Intelligence, and Research Institute of Signal Processing Japan (Board member).


Jianhua Tao (CAS)
He received the M.S. degree from Nanjing University in 1996 and the Ph.D. in Computer Science from Tsinghua University in 2001. He is currently a professor at the National Laboratory of Pattern Recognition (NLPR) of the Chinese Academy of Sciences, where he chairs the human-computer speech interaction group. He developed several of the earliest speech systems and multimodal interaction systems in China, and has published more than 90 papers in IEEE Trans. on ASLP, ICASSP, Interspeech, ICME, ICPR, ICCV, ICIP, etc. He has been the main researcher and contributor on several national scientific projects supported by the National Natural Science Foundation of China (NSFC), National High-Tech Program and International Cooperation Projects (863). Currently, he is an editorial board member of the "International Journal on Computational Linguistics and Chinese Language Processing", the "Journal on Multimodal User Interfaces (JMUI)", and the "International Journal of Synthetic Emotions (IJSE)", and a Steering Committee Member for the IEEE Transactions on Affective Computing. He has been vice-chair of the ISCA Special Interest Group on Chinese Spoken Language Processing since 2006, an executive committee member of the HUMAINE association since 2007, and a board member of COCOSDA since 2007. He is also a Council member of the Chinese Speech Information Processing Society and the Acoustical Society of China.



SS-2: Prosody and languages in contact

Description (importance of the topic and rationale for the session) 
This session invites researchers working on prosody and languages in contact to present the results of their studies. The focus is on studies concerned with the analysis and description of the prosody observed in languages/dialects used in specific linguistic situations:

  • L2 acquisition and any question related to prosodic transfer from L1 to L2;
  • Prosodic characteristics observed in languages/dialects spoken in multilingual countries: are some of the characteristics due to other languages in contact? (For example, in the French dialect spoken in Senegal, some observed accentual patterns clearly come from Wolof.);
  • Prosodic transcription tools and systems, as well as language resources available to study prosody and languages in contact.

The question of "prosody and languages in contact" is of interest not only for what it reveals about how prosodic transfer and influence operate in various multilingual situations, but also for the better knowledge it gives of the grammar of the target language (for example, which features are categorical in nature).


Elisabeth DELAIS-ROUSSARIE, Director of Research at the CNRS, works on phrasing and intonation in French. Recently, she has worked in collaboration with Brechtje POST on prosodic transcription, in order to evaluate the advantages and limitations of the most commonly used systems (ToBI, IViE, INTSINT and IPA), in particular when dealing with learner data and dialectal variation. She has also worked on the development of a learner corpus to study the acquisition of prosody in French as a foreign/second language.


Mathieu AVANZI completed a PhD on the prosody/syntax interface in spontaneous French. He currently holds a postdoctoral position at Neuchâtel University, where he works on the prosody of French dialects spoken outside France, with a special focus on contact language varieties.


Guri BORDAL is a PhD student at the University of Oslo and Université Paris Ouest. Her research is concerned with the role of transfer in prosodic variation in contact situations. Her thesis is about the contact between French and the African tone language Sango in the Central African Republic.


