A Corpus-based Learning Method for Prominence Detection in Spontaneous Speech


Mathieu Avanzi, Neuchâtel
Anne Lacheret-Dujour, Paris Ouest Nanterre
Bernard Victorri, Lattice, ENS, Paris

This paper presents ANALOR, a software tool for semi-automatic prominence detection in spontaneous French. Starting from a manual annotation made by two experts on a 70-minute corpus that covers different regional varieties of French (Belgian, Swiss and metropolitan French) and various discourse genres (from read speech to spontaneous conversations), our system applies a learning procedure to determine the best thresholds for prominence prediction. This procedure refines the detection: the consistency between the automatic identification and the human labeling rises from an F-measure of 75.3 without training to 79.1 after corpus-based learning.
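The idea of learning thresholds from an annotated corpus can be illustrated with a minimal sketch. It is not ANALOR's implementation: it assumes a single acoustic score per syllable and a single threshold, and the names `f_measure` and `learn_threshold`, as well as the toy data, are hypothetical. The search simply keeps the candidate threshold whose predictions give the highest F-measure against the expert labels.

```python
from typing import List, Sequence, Tuple


def f_measure(predicted: Sequence[bool], reference: Sequence[bool]) -> float:
    """Harmonic mean of precision and recall for binary prominence labels."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    if tp == 0:
        return 0.0  # no correct detection: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def learn_threshold(
    scores: Sequence[float],
    reference: Sequence[bool],
    candidates: Sequence[float],
) -> Tuple[float, float]:
    """Return the candidate threshold with the highest F-measure, and that F-measure."""
    best_threshold, best_f = candidates[0], -1.0
    for threshold in candidates:
        # A syllable is predicted prominent when its score reaches the threshold.
        predicted: List[bool] = [s >= threshold for s in scores]
        f = f_measure(predicted, reference)
        if f > best_f:
            best_threshold, best_f = threshold, f
    return best_threshold, best_f


# Toy data, invented for illustration only (one score and one expert label per syllable).
toy_scores = [0.2, 0.9, 0.4, 0.7, 0.1, 0.6]
toy_labels = [False, True, False, True, False, False]
toy_candidates = [0.3, 0.5, 0.65, 0.8]

best_threshold, best_f = learn_threshold(toy_scores, toy_labels, toy_candidates)
```

On this toy data the search selects the threshold 0.65, the only candidate that separates the two labeled prominences from the other syllables. A real system would presumably combine several acoustic parameters, each with its own threshold, and evaluate on data held out from training.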