Long-Range Prosody Prediction and Rhythm


Greg Kochanski, Anastassia Loukina, Elinor Keane, Chilin Shih and Burton Rosner, University of Oxford

Rhythm is expressed by recurring, and hence predictable, beat patterns. Poetry in many languages is composed with attention to poetic meter, while prose is not. One way to investigate speech rhythm is therefore to use a quantitative measure of predictability to evaluate how prose reading differs from poetry reading. We use linear regression to predict the acoustic properties of segments from the properties of up to seven preceding segments. This accounts for as much as 41\% of the variance in our full (prose) corpus and up to 79\% in a subcorpus of poetry. While roughly half of the predictive power comes from the segment immediately preceding the target, the predicted variance increases by 6\% (for the full prose corpus) or by 25\% (for the poetry subcorpus) when the predictor is extended to include all seven preceding segments. Interactions between segments therefore extend well beyond the immediate vicinity. Potentially, these longer-range regressions capture the rhythms of the poetry. This approach could form the foundation of a general method for characterizing the statistical properties of spoken language, especially with reference to prosody and speech rhythm.
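
The comparison between a one-segment and a seven-segment predictor can be illustrated with a minimal sketch. This is not the authors' code or data. It assumes a single scalar property per segment (the study predicts several acoustic properties), uses ordinary least squares, and runs on two synthetic sequences invented for illustration: a "metered" one that repeats a four-segment pattern with noise, and an "unmetered" one where each segment depends only on its immediate predecessor. The function names `lagged_design` and `explained_variance` are hypothetical.

```python
import numpy as np

MAX_LAGS = 7  # longest context considered, as in the abstract


def lagged_design(x, n_lags, start=MAX_LAGS):
    """Build a regression problem that predicts x[t] from x[t-1] .. x[t-n_lags].

    `start` is the index of the first target. Keeping it fixed means models
    with different n_lags are scored on exactly the same targets.
    """
    x = np.asarray(x, dtype=float)
    if n_lags > start:
        raise ValueError("n_lags must not exceed start")
    cols = [x[start - k: len(x) - k] for k in range(1, n_lags + 1)]
    cols.append(np.ones(len(x) - start))  # intercept column
    return np.column_stack(cols), x[start:]


def explained_variance(x, n_lags):
    """In-sample fraction of target variance explained by a least-squares fit."""
    X, y = lagged_design(x, n_lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()


# Synthetic data only; these sequences are assumptions, not speech measurements.
rng = np.random.default_rng(0)
n = 2000

# "Metered": a four-segment pattern repeated, plus small noise.
pattern = np.array([1.0, 0.0, 0.5, 0.0])
metered = np.tile(pattern, n // 4) + rng.normal(0.0, 0.05, n)

# "Unmetered": first-order dependence only (each value leans on the previous one).
prose = np.zeros(n)
for t in range(1, n):
    prose[t] = 0.6 * prose[t - 1] + rng.normal()

r2_metered_1 = explained_variance(metered, 1)
r2_metered_7 = explained_variance(metered, MAX_LAGS)
r2_prose_1 = explained_variance(prose, 1)
r2_prose_7 = explained_variance(prose, MAX_LAGS)

print(f"metered:   1 lag {r2_metered_1:.3f}, 7 lags {r2_metered_7:.3f}")
print(f"unmetered: 1 lag {r2_prose_1:.3f}, 7 lags {r2_prose_7:.3f}")
```

In this toy setting the gain from the six additional segments is large for the repeating sequence and close to zero for the first-order one, which is the kind of contrast the abstract reports between poetry and prose. Because the scores are computed in-sample, adding predictors can never lower them; a held-out evaluation would be needed before reading small gains as meaningful.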