
Tonal Complexes and Tonal Alignment

Akin Akinlabi, Rutgers University

Mark Liberman, University of Pennsylvania

1. Introduction

Tone was one of the first areas of generative phonology where constraint-based approaches were proposed (Leben 1973, Goldsmith 1976, Williams 1976). Such works raised the hope that tonal phonology could be explained in terms of universal (if parameterized) constraints operating on appropriate representations. However, derivations based on language-specific rules were later argued to be necessary, in order to handle the wide range of attested tonological phenomena (Hyman and Ngunga 1994).

We propose that a successful account of tonal phonology, constraint-based or not, requires enriching tonal representations to include some simple kinds of structures, which organize tones in somewhat the same way that segments, syllables and feet organize nontonal features. Our proposal is similar in spirit to that of Bamba (1991). Constraints mentioning such tonal structures can motivate deletion, epenthesis, spreading or reordering of tonal features, just as constraints on syllable or foot structure may motivate such processes in well-known cases of segmental phonology. Such structures license tones just as syllables license segments.

Specifically, this paper proposes a tonal unit consisting minimally of paired HL or LH tones. We argue that such units, long postulated as underlying elements in accentual systems, also play a crucial role in tonal phonology more generally.

© Akin Akinlabi and Mark Liberman, NELS 31

1.1 Contour Formation in Benue Congo

In the Benue Congo languages of West Africa, we often see a process in which the transition from one tonal level to another is delayed so as to create a salient tonal contour on the syllable beginning the new level. In a notation in which vowels are tagged as "High", "Low", "Rising", "Falling", etc., this leads to the creation of rises and falls, as in the examples below from Yoruba1 and Edo (Bini):

(1) (a) Yoruba rising example: ala (LH) → ala (L LH) 'dream'

    (b) Yoruba falling example: rara (HL) → rara (H HL) 'elegy'

    (c) Edo (Bini) falling example: ekpo (HL) → ekpo (H HL) 'bag'

Phonetically, F0 transitions are always contours, not step functions; however, the effects symbolized by the notations in (1) are not just the necessary phonetic consequence of tonal co-articulation. The most striking evidence for this assertion is the fact that a language may form phonologically-significant contours for some tonal sequences and not others. Thus Edo forms phonologically-significant contours in the case of High Low but not Low High sequences, while Yoruba does so for both Low High and High Low, but not for any case where Mid tone precedes or follows High or Low.

The difference between the Yoruba H L and M L cases can be seen in the pitch tracks in (2). Each plot shows the F0 tracks for six VCV sequences, taken from the middle of longer phrases. The vertical line marks the point of release of the medial consonant, here always a nasal or liquid. The HL cases were spoken in a relatively narrow pitch range, while the ML examples were spoken in a relatively wide pitch range. As a result, the amount of pitch change is similar, but the timing of the pitch change is different: in the HL examples, the pitch fall does not begin until after the release of the medial consonant and continues throughout most of the following vowel, while in the ML examples, the fall is about half complete by the release of the medial consonant, and generally ends early in the following vowel. Informal experimentation suggests that this difference in timing, roughly half a segment long, can be crucial for native-speaker perception of Yoruba tonal identity.2

In some level-tone languages, for example Igbo and Ijo (Kalabari, Nembe, Kolokuma), phonologically-significant contour tone formation does not appear to occur. Note however that in a language where there is no phonological contrast between the presence and absence of contour formation, it may remain to be determined whether the situation should be described as invariable absence of contour formation, or rather invariable formation of contours.

For the past few decades, phonologists have generally followed Hyman and Schuh (1974:88) in treating this process of tonal contour formation as "tone spreading."

In this approach, all tonal specifications are built up from a small number of level tone primitives, such as High and Low, with rises and falls treated as Low+High or High+Low sequences respectively. Strings of tones are represented on a separate tier from strings of non-tonal segments, with the alignment of tonal and segmental strings indicated by association lines connecting them. In this perspective, a falling tone is just a tonal sequence HL associated with a single segment, as in (3a) below. When L follows H on successive syllables, as in (3b), such a contour can be created on the second syllable simply by adding an association line, as in (3c).
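The spreading analysis just described can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not a formalization taken from the literature: the names SPREADING_PAIRS and form_contours are hypothetical, each syllable is assumed to start with exactly one level tone, and spreading is assumed to be computed from the underlying tone sequence. The language settings encode the pattern reported for (1): Yoruba forms contours for both High Low and Low High, and Edo for High Low only.

```python
# Hypothetical illustration of contour formation as "tone spreading".
# Names and data layout are assumptions made for this sketch.

# Tone pairs (preceding, following) that trigger spreading, per language.
SPREADING_PAIRS = {
    "Yoruba": {("H", "L"), ("L", "H")},  # both directions, never with Mid
    "Edo": {("H", "L")},                 # High Low only
}


def form_contours(tones, language):
    """Return, for each syllable, the tuple of tones associated with it.

    `tones` holds one underlying level tone per syllable, e.g. ["H", "L"].
    Spreading adds an association line from a tone to the FOLLOWING
    syllable, so the pitch transition is delayed, never advanced.
    """
    triggers = SPREADING_PAIRS[language]
    associations = [(tone,) for tone in tones]
    for i in range(1, len(tones)):
        if (tones[i - 1], tones[i]) in triggers:
            # The preceding tone also links to syllable i, creating a contour.
            associations[i] = (tones[i - 1], tones[i])
    return associations


if __name__ == "__main__":
    print(form_contours(["L", "H"], "Yoruba"))  # rise on the second syllable
    print(form_contours(["H", "L"], "Edo"))     # fall on the second syllable
    print(form_contours(["L", "H"], "Edo"))     # no contour formed
```

Because the added association line always runs from a tone to the following syllable, the sketch builds in the direction of the delay by stipulation; it does not explain it.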

Using arbitrary phonological rules or constraints, formal specification of the desired outcome in such cases is easy: in certain contexts, certain association lines are added; in other contexts, they are not.

2 Observations of these phenomena go back to the earliest linguistic treatments of the languages in question. Thus in her 1952 work “An Introduction to the Yoruba Language”, Ida Ward remarks (p. 34) that “[t]he juxtaposition of high and low tones, either high-low or low-high, needs some comment.” Citing an example of the form HL, she observes that there is a fall on the second syllable that is “heard as a more or less deliberate glide,” and warns that “[u]nless the English speaker makes it, he is apt to give the impression of gliding down on the first syllable …, which does not satisfy the Yoruba.”

As Hyman and Schuh observed in 1974, there is a general pattern to be accounted for. Such tone contour formation is common, but by no means universal. When it happens, the change is always in the direction of a delay rather than an advance of the F0 transition. It is possible to speculate about general articulatory reasons why F0-transition delays might be more likely to happen than F0-transition advances, or general acoustic reasons why delays might be more salient than advances. However, we will argue instead for an account of the asymmetry in terms of formal properties of the phonological representation of tone.

In current approaches to phonology, systematic changes like the one exemplified in (3) are attributed to the existence of general constraints that prefer the output to the input. However, there is no generally-recognized constraint that would motivate the kind of contour formation shown in (3). In fact, as we will see in detail later on, the facts of Yoruba seem to motivate a constraint that is ironically opposite to the well-known "wellformedness constraint" proposed by Goldsmith for tonal association: a tone spreads if and only if the adjacent syllable already has a tone.

Why then does tone contour formation tend to occur? We believe that the answer to this question is tied up with the answer to several other, apparently unrelated questions about tone and accent: Why do tone polarization and polarized tone epenthesis tend to occur? Why do multiple-tone sequences sometimes but not always simplify? Why do high-before-low raising and low-before-high lowering tend to occur?

None of these phenomena are inevitable, but all are commonplace, and typologically typical of tonal systems. We believe that all of these cases are symptoms of the same cause, namely the formation of tonal complexes. Tonal complexes are "bound states" of (two or more) unlike tones, such as [high low] or [low high], and we propose that they have a role in organizing tonal features somewhat analogous to that of moras and syllables in organizing segmental features.

On this analogy, tone contour formation is like re-syllabification, in which a coda consonant becomes an onset for a following syllable. Tone polarization and polarized tone epenthesis are like the epenthesis of vowels and consonants in rescuing forbidden or marked syllable structures. And the phonetic dissimilation of tone sequences is like the different phonetic interpretation of high vowels or nasals in onset vs. rhyme positions in syllables.

This is enough to suggest why (3c) is sometimes preferred to (3b). After spreading, the adjacent H and L tones form a tonal complex, and a constraint requiring tones to be bound into tonal complexes will then be satisfied.

As we have explained things so far, however, we have left a crucial observation of Hyman and Schuh (1974) unexplained. We have shown how to motivate the particular kind of tone-onto-tone spreading that results in tonal contour formation. However, we have not explained their important generalization that this process always delays and never advances the point of pitch fall or rise. For example, HIGH LOW always becomes HIGH FALLING, never FALLING LOW, even though both outcomes produce a tonal complex in our sense.

Using autosegmental notation, the result of an input like (4a) is always as in (4b) below, never as in (4c):

In order to explain this generalization, we will need to look further into the kind of thing that a "tonal complex" is. Hints are provided by two other general observations about the phonetic realization of tone, namely that the F0 target for a single static tone tends to occur at the (temporal) end of the associated phonetic region, and that the simplest cases of "dynamic" or contour tones force us to posit a second, earlier (phonetic) alignment point within the associated time region.

The plot in (5a) shows fundamental frequency as a function of time for the Igbo word ya meaning “he”. Although this monosyllabic word has high tone, the pitch is not a uniform level high. Instead, the pitch rises throughout the syllable, with the peak value found near the end. In languages like Igbo and Yoruba, other things equal, the phonetic target value of a tone – the highest F0 of a High tone, or the lowest F0 of a Low tone – is found at the end of the span of time corresponding to the associated tone-bearing unit.

(5) a. Igbo: ya (“he”)   b. Igbo: ama na ke (“Ama and Ike”)

We need an additional F0 target at the start of the utterance. We can think of this as a junctural value, or as a default value, but in any case, it usually falls in between the target values of an initial High and an initial Low tone, so that a low-tone stretch would be falling, just as this high-tone syllable is rising.

When we look at tone sequences, and at phrases involving longer stretches of tone-bearing units with the same tone, the same pattern generally holds. For example, consider the pitch track shown in (5b) for the Igbo phrase ama na ke “Ama and Ike”.

The first two syllables are High, but they are neither uniformly high, nor rising followed by high. Instead, there is a rise distributed over the two-syllable high-tone region, with the highest point falling at the end of that region. The next syllable is Low, and the low target is unsurprisingly at its end. The last two syllables are High again, and again the high target (lower than before because of downdrift and final lowering) is at the end of the two-syllable high-tone region.

Thus a crude but roughly correct way to synthesize sentential pitch contours for a language like Igbo is:

1. Divide the utterance into maximal regions of like tone.

2. Place a mid-valued tonal target at the start of the utterance.

3. Place a tonal target at the end of each region, choosing an F0 value determined by the tonal type, downdrift/downstep, and final boundary effects if any.3

4. Interpolate linearly from target to target.
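The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a tested synthesis model: the F0 values in TARGET_HZ and START_HZ are invented for the example, every syllable is given the same duration, the function names tonal_targets and f0_at are hypothetical, and downdrift, downstep and final boundary effects are left out entirely.

```python
from itertools import groupby

# Assumed, purely illustrative F0 values in Hz; the text gives no numbers.
TARGET_HZ = {"H": 140.0, "L": 100.0}
START_HZ = 120.0  # mid-valued target at the start of the utterance (step 2)


def tonal_targets(tones, syllable_dur=0.2):
    """Return a list of (time, f0) targets for one tone per syllable.

    Step 1: group the tones into maximal regions of like tone.
    Step 2: place a mid-valued target at time 0.
    Step 3: place one target at the END of each region.
    Downdrift, downstep and final lowering are NOT modeled here.
    """
    targets = [(0.0, START_HZ)]
    syllables_so_far = 0
    for tone, region in groupby(tones):
        syllables_so_far += len(list(region))
        targets.append((syllables_so_far * syllable_dur, TARGET_HZ[tone]))
    return targets


def f0_at(targets, time):
    """Step 4: interpolate linearly between neighboring targets."""
    for (t0, f0), (t1, f1) in zip(targets, targets[1:]):
        if t0 <= time <= t1:
            return f0 + (f1 - f0) * (time - t0) / (t1 - t0)
    raise ValueError("time lies outside the utterance")


if __name__ == "__main__":
    # High High Low High High, the tone pattern described for (5b).
    targets = tonal_targets(["H", "H", "L", "H", "H"], syllable_dur=1.0)
    for t in (0.0, 1.0, 2.0, 2.5, 3.0, 5.0):
        print(t, f0_at(targets, t))
```

With these assumptions, a two-syllable High region yields a single rise spread over both syllables, with the peak at the end of the region, which is the qualitative shape described for (5b).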

When we look at tone-bearing units that carry a contour tone, as in the Yoruba HL examples in (2), we can see the need to postulate additional phonetic anchor points at the beginning of some phrase-internal tone-bearing regions. We believe that these phonetic modeling practices point the way to an appropriate phonological structure for tone.

Let’s assume that in Igbo or Yoruba every tone-bearing region has two potential tonal targets. The one at the end is obligatory, while the one at the beginning may remain unfilled. We propose to reify this pair of potential tonal targets, and call it a “tonal complex.” In this sense, a tonal complex (TC) is an entity that binds two (or perhaps more) tonal positions to a tone-bearing region (which is one or more moras or syllables).

These two tonal positions are a bit like the onset and rhyme of a syllable: whenever one is present, the other "ought" to be there, even though it may often be lacking. And as with the onset and rhyme of a syllable, these structurally differentiated positions are not equally optional.

In the simple binary tonal complexes we are considering, the second tonal "hook" is the stronger of the two positions, and therefore is the default. If only one tone is available, it will go there, with the earlier tonal position in the TC remaining empty.

Therefore we can reformulate (6a) below as (6b):

In both (6a) and (6b), the σ symbols in the top row stand for syllables (or other tone-bearing units). In (6b), we’ve used boxes to symbolize two-element tonal complexes. Each tonal complex as a whole connects to a tone-bearing unit, as indicated in the diagram by the fact that the upper association line connects to the box. (Although it is not shown in (6b), a tonal complex might also link to a string of adjacent tone-bearing units.) Within the tonal complex, we’ve used an asterisk and a dot to symbolize the primary and secondary tonal association points. The tones link to these association points within the tonal complex.
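One way to make the structure of a binary tonal complex concrete is as a small data structure. The sketch below is an assumption-laden illustration, not a definitive formalization: the class name TonalComplex, its field names, and the helper build_complex are hypothetical, and the sketch is restricted to the binary case, although the proposal leaves open complexes with more than two positions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TonalComplex:
    """A binary tonal complex linked to a tone-bearing region.

    `primary` is the later, stronger position (the asterisk) and must
    be filled; `secondary` is the earlier position (the dot) and may
    stay empty. All names here are hypothetical.
    """
    region: Tuple[str, ...]          # one or more tone-bearing units
    primary: str                     # final position, obligatory
    secondary: Optional[str] = None  # initial position, optional


def build_complex(region, tones):
    """Place one or two tones into a binary tonal complex.

    A single tone goes to the primary (final) position and the earlier
    position is left empty; with two tones, the first fills the
    secondary position and the second fills the primary one.
    """
    if not 1 <= len(tones) <= 2:
        raise ValueError("this sketch only handles one or two tones")
    if len(tones) == 1:
        return TonalComplex(tuple(region), primary=tones[0])
    return TonalComplex(tuple(region), primary=tones[1], secondary=tones[0])


if __name__ == "__main__":
    print(build_complex(["syl1"], ["H"]))          # level High
    print(build_complex(["syl1"], ["H", "L"]))     # falling contour
    print(build_complex(["syl1", "syl2"], ["L"]))  # Low over two units
```

Making the primary position a required field and the secondary one optional mirrors the asymmetry between the two positions: the final target is obligatory, while the initial one may remain unfilled.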

