Journal of Phonetics 77 (2019) 100935
Contents lists available at ScienceDirect
Journal of Phonetics journal homepage: www.elsevier.com/locate/Phonetics
Special Issue: Integrating Phonetics and Phonology, eds. Cangemi & Baumann
Visible amplitude: Towards quantifying prominence in sign language Oksana Tkachman *, Kathleen Currie Hall, Robert Fuhrman, Yurika Aonuki The University of British Columbia, Vancouver, Canada
a r t i c l e
i n f o
Article history: Received 3 April 2018 Received in revised form 4 October 2019 Accepted 7 October 2019
Keywords: Prominence Sign language Visible amplitude Location Movement Hand quantity Optical flow analysis
a b s t r a c t While there has been some prior work on what characteristics can increase or decrease the phonetic prominence of a sign in a signed language, there is not yet an easily obtainable, objective measure that can be used to help quantify signal-based aspects of sign language prominence. This paper introduces a novel measure, visible amplitude, which provides a way to quantify the amount of movement contained on a frame-by-frame basis in a video, and as such, can be used as one measure of prominence. After a review of the literature that demonstrates how certain sub-lexical characteristics of signs (location, movement, and the number of hands employed) make signs ‘stand out’ phonetically, phonologically, and prosodically, the ability of visible amplitude to capture the effects of these characteristics is examined. It is shown that within a particular database of American Sign Language (ASL-Lex: Caselli et al., 2017), the number of hands involved in a sign’s production along with movement—either transitional movement due to differences in major location, or major movement due to phonologically contrastive differences in the sign’s identity—each contribute significantly to the overall visible amplitude in the sign. We review some long-standing claims about the lexical distribution of signs in light of this new measure, as well as propose possible future applications. Ó 2019 Elsevier Ltd. All rights reserved.
1. Introduction
Speech, being a product of the motion of physiological systems, is inherently multimodal. For example, the motion of the structures of the vocal tract during speech have visible consequences for the structure of motions of the face (Yehia, Kuratate, & Vatikiotis-Bateson, 2002) and head (Munhall, Jones, Callan, Kuratate, & Vatikiotis-Bateson, 2004), and gestural motions of the limbs are also correlated with the acoustic signal (Wagner et al., 2015). Perhaps it is unsurprising then that this multimodality extends to linguistic prominence as well, with manual gestures (Wagner et al., 2015), head nods, eyeblinks (Ambrazaitis & House, 2017; Prieto, Puglesi, BorràsComes, Arroyo, & Blat, 2015), and changes of facial expression (Swerts & Krahmer, 2010) exhibiting correspondence with pitch accents in the acoustic signal. The multimodality of speech and language would also suggest that we should expect to find many parallels between spoken language and languages realized in other modalities, for example, in sign languages. Indeed, sign languages, though * Corresponding author at: 2613 West Mall, Vancouver, British Columbia V6T 1Z4, Canada. E-mail address:
[email protected] (O. Tkachman). https://doi.org/10.1016/j.wocn.2019.100935 0095-4470/Ó 2019 Elsevier Ltd. All rights reserved.
soundless, have phonological structure, in the sense that they employ meaningless sub-lexical parameters such as handshape, location, and movement to create meaningful signs (Stokoe, 1960; Stokoe, Casterline, & Croneberg, 1965; for a more recent overview, see e.g., Sandler, 2005, 2012). They also employ intonation via a variety of manual but mostly non-manual means (Baker-Shenk, 1983; Boyes Braem, 1999; Brentari & Crossley, 2004; Dachkovsky & Sandler, 2009; Dachkovsky, 2004; Engberg-Pedersen, 1990; Nespor & Sandler, 1999; Pfau & Quer, 2010; Sandler, 1999; van der Kooij, Crasborn, & Emmerik, 2006; Wilbur, 1999). Prominence is another property of linguistic communication that is found in sign language. When it comes to prominence, it is important to recognize that it is inherently a perceptual phenomenon; that is, a unit (such as a word or syllable) is considered “prominent” if it in some way “stands out” from otherwise comparable surrounding units (see, e.g., Baumann & Winter, 2018; Himmelmann & Primus, 2015; Streefkerk, 2002; Turnbull, Royer, Ito, & Speer, 2017). Prominence is crucially context-dependent (Himmelmann & Primus, 2015) and can be influenced by a variety of characteristics, only some of which are signalbased. Cole, Mo, and Baek (2010), Bishop (2012), Turnbull
2
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
et al. (2017), and Baumann and Winter (2018), for example, all clearly demonstrate that prominence perception is influenced by signal-external factors such as listener expectations, which can come from e.g. word frequency as well as syntactic and discourse contexts of various kinds (e.g., edges of phrases being more prominent than centers; contrastive information being more prominent than non-contrastive). That said, it is also clearly the case that prominent units are often associated with signal-based enhancements, such as higher fundamental frequency, increased amplitude, and/or increased duration (see e.g., Shattuck-Hufnagel & Turk, 1996; Beckman & Venditti, 2010; Cole et al., 2019; along with Schweitzer; Smith, Erickson, & Savariaux; and Smith & Rathcke, in press, this Special Issue), and that these signal-based properties also influence the perception of prominence (see e.g. Baumann & Winter, 2018 for direct investigation of the relative importance of various factors). Sign language prominence is undoubtedly also a perceptual phenomenon, but most work to date has focused on the ways in which signers make the signal more or less prominent by manipulating certain features of signs (see, e.g., Sandler, 1999; Nespor & Sandler, 1999 for Israeli Sign Language; Boyes Braem, 1999 for Swiss German Sign Language; Wilbur, 1999 for American Sign Language; van der Kooij & Crasborn, 2008 for the Sign Language of the Netherlands, among others). Even within this signal-based examination of prominence, however, there is no consistent measure employed in those studies that would allow comparison across studies or even a way of quantifying the relative prominence of individual signs. This contrasts sharply with spoken language research, where the acoustic correlates of prosody are widely recognized to include some combination of F0, amplitude (intensity), and duration,1 with the relative prominence of linguistic elements being at least influenced by syntagmatic differences in these time-varying parameters (along with the nonsignal based parameters mentioned above). There have been some attempts in the past to come up with objective and reliable measurements of the signal-based aspects of sign language prominence. For example, Wilbur (1999) discusses the fact that characteristics such as duration, displacement, and velocity are all measurable components of signing and describes a prior (unpublished) kinematic study, Wilbur and Zelaznik (1997), in which infrared diodes on the hands are used to track the three-dimensional movement and velocity of a signer in action, precisely to obtain objective, quantifiable phonetic measures of stress. Indeed, stressed signs were found to be faster and involve more displacement than unstressed signs; duration was not found to be statistically significant. As Ormel and Crasborn (2012: 302) point out, however, “among the disadvantages of using kinematic
equipment are the unnatural signing environment and the impossibility of analyzing the growing number of video corpora of signed languages.” In this paper, we address these issues by introducing the notion of visible amplitude, a measure which functions as a visual analogue of acoustic amplitude.2 We outline a method for computing visible amplitude directly from video, without the use of intrusive and costly marker-based motion capture systems. We offer this as a measure that can be reliably computed from video of a signer’s motions, and enable the quantification of the potential perceptibility of a sign in terms of properties of the motions used in a sign’s production.3 As such, visible amplitude may be useful when used in combination with other parameters in quantitative models of prominence in sign language. As an initial step toward this larger goal, we present the results of an analysis of the visible amplitude of signs in the ASL-Lex database (Caselli, Sehyr, Cohen-Goldberg, & Emmorey, 2017) detailing the distribution of visible amplitude as it relates to certain sub-lexical parameters that contribute to the phonological structure of a sign (location, movement type, and the number of hands used). Although our analysis is limited to single signs produced in citation form, we show that the ways in which sign prominence have qualitatively been described in the literature correspond well with differences in visible amplitude. We also demonstrate that a long-standing proposal about the relationship between the lexical distribution of signs in signing space and their relative perceptibility (Siple, 1978) may be true for ASL and partially accounted for by visible amplitude. While the limitation to single lexical items prevents us from demonstrating that visible amplitude can account for syntagmatic differences in prominence, we suggest ways in which the measure could be used in this manner in the future. We also note that visible amplitude may be of use not just to sign language researchers, but could also be applied to research in spoken languages as a means of quantifying certain aspects of co-speech gesture, which itself can be a marker of prominence.4 Gesture research suggests that gestural cooccurrence with speech is governed by its own rules, which has led to the suggestion that speech and gesture are in fact parts of the same communicative system (see, e.g., McNeill, 1992, 2000; Kendon, 1972, 2000). Specifically with respect to prominence, research has shown that gestural apices have been shown to co-occur with prosodically prominent words in French (Roustan & Dohen, 2010), with pitch-accented and stressed syllables in Dutch (de Ruiter, 1998), English (Yasinnik, Renwick, & Shattuck-Hufnagel, 2004), and Brazilian Portuguese (RochetCapellan, Laboissière, Galván, & Schwartz, 2008), and with durational lengthening in Hong Kong Cantonese (Fung & Mok, 2018). It has also been shown that disrupting typically cooccurring patterns can disrupt prominence comprehension. For
1 It should be noted that in one of the most comprehensive examinations to date of different competing factors that influence prominence perception, Baumann and Winter (2018) find that discrete categories of signal-based properties were in fact the most predictive of perceived prominence by naïve German speakers – for example, the categorical presence (vs. absence) of a pitch accent, the accent position (e.g., nuclear / pre-nuclear / post-nuclear), and the accent type (e.g., rising / falling / high / low) were most predictive of their participants’ perception of prominent units. Of course, as Baumann and Winter point out, these are categorical labels for various combinations of more continuous characteristics, such as F0, amplitude, and duration, suggesting that signal-based characteristics are indeed extremely important in understanding the perception of relative prominence.
2 Although our primary intention is for this measure to be thought of as analogous computationally to acoustic amplitude, we note that it has interesting and perhaps even more direct parallels in the articulatory measure of displacement discussed in Smith, Erickson, and Savariaux (in press, this Special Issue), which in turn is reflected in acoustic differences in F1 (see also discussion in Hall, Tkachman, & Aonuki, 2019). 3 Note that testing the ways in which visible amplitude relates to perceptual salience is beyond the scope of the current paper. Our hypothesis, however, is that greater visible amplitude would be more perceptible, though whether the relationship is linear or not is an empirical question. 4 We are grateful to the journal editor, Taehong Cho, for pointing out this additional application.
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
instance, Swerts and Krahmer (2008) show how conflicting auditory information (pitch range and duration values) and visual information (manual beat gestures and head nods) affect the perceived location of prominent words in a phrase; that is, displacing visual cues hinders prominence perception. Fung and Mok (2018) suggest that for prosodic typology, research should be expanded to include both studies on prosody-gesture alignment and on languages with more diverse prosodic patterns. We agree with this claim and suggest that studies on gestural and prosodic prominence cooccurrence should investigate beyond temporal cooccurrences. For example, one might hypothesize that gestural apices display variation in magnitude depending on the relative prominence of the co-occurred focused spoken counterpart, and this could be demonstrated through the use of visible amplitude measures. Moreover, visible amplitude can be applied to non-manual kinds of visible information as well, such as head nods and eyebrow movements, both features that can co-occur with prosodic prominence in speech (Borràs-Comes, Vanrell, & Prieto, 2014; Loehr, 2007). Visible amplitude, therefore, could be a powerful tool for quantitative measurement of all types of multichannel communication. The remainder of the paper is structured as follows. First, we provide an overview of sign language phonological structure (§2). This is followed by a discussion of potential ways of measuring prominence in sign languages, including justification for our method (§3), and a detailed description the measurement of visible amplitude (§4). In §5 and §6, we provide an example application of this measure to the ASL-Lex database (Caselli et al., 2017), showing that differences in hand quantity, location, and movement can each be measured using visible amplitude. Finally, we discuss the findings and future applications in §7 and conclude in §8. 2. The building blocks of signs: parameters of interest and relevant terminology
To understand how signers might make signs more prominent (§3), we first need to introduce the reader to the basic notions of sign language phonology, i.e., a sign’s sub-lexical parameters. Three parameters (handshape, location, and movement) are believed to be cross-linguistically universal (Tatman, 2015), and play the most significant role in sign phonetics and phonology. Minimal and near-minimal pairs5 can be based on a contrast in one of these parameters (see, e.g., Figs. 1 and 2).6 Handshape is perhaps the most complex parameter: the human hand has between 25 and 30 degrees of freedom (Flanagan & Johansson, 2002; van Duinen & Gandevia, 2011), which allows for great intricacy. With all its complexity, however, handshape is more about the static shape of the hand rather than about movement during the sign. Previous studies on phenomena that might be related to sign prominence (which 5 It will be noted that in many of the images used here, there are, in addition to the primary contrastive differences the images are being used to illustrate, other differences, especially with respect to facial expression and head tilt. While non-manual markers can indeed be used contrastively in sign languages, none of the variation shown in the images here is, to our knowledge, conveying any crucial lexical information about the sign. 6 For practical reasons, we do not always include images or full descriptions of the signs being described. However, unless otherwise noted, all examples used in the paper are from ASL and can be seen in the ASL-Lex database (Caselli et al., 2017, freely available with all videos at www.asl-lex.org).
3
will be detailed in §3) do not show any indication that manipulations of handshape might be used to influence prominence. Thus, while it is not impossible that handshape might interact with movement, and thus interact with the measure of prominence proposed here, i.e., visible amplitude, we leave the investigation of handshape to future studies and focus instead on parameters believed to have a direct impact on prominence. Other sign parameters, such as hand orientation, nonmanual components (such as raising the eyebrows or tilting the head), and hand quantity (in the sense of whether one hand or two hands are involved in the sign’s production; see §2.3) are more language-specific. That is, only some sign languages have been reported to have minimal pairs based on these parameters (see Fig. 3 for a hand quantity near-minimal pair in ASL). Of these three, only one, hand quantity, has been noted to play a role in sign prominence (see §3). Thus, in this study, we focus on only three sign parameters, location, movement, and hand quantity, each discussed in more detail below. 2.1. Location
The location of a sign refers to where the sign is articulated relative to the signer’s body, and as will be shown in §3, the location of a sign may be manipulated to adjust a sign’s prominence. Signs may be classified as having both a major location (e.g., a region of signing space, such as the head) as well as a minor location within the major location area (e.g., the chin). Whereas phonetically the sign’s location is bounded only by the arms’ length, phonologically only a limited number of locations are exploited systematically; see Liddell and Johnson (1989) for relevant discussion. The set of phonologically possible locations appears to be quite large, however: Liddell and Johnson (1989: 148) list 51 locations on the body, 70 on the non-dominant hand, and 27 in space. These locations are not all necessarily contrastive, however; Sandler (1989) specifies only 16 phonologically important locations. Even though the specific number and identity of phonologically relevant locations depends on the theoretical model one adopts (see e.g. Brentari, 1998; Liddell & Johnson, 1989; Sandler, 1989; Stokoe, 1960; van der Hulst, 1993), signs can in general be grouped into four categories. First, signs can be articulated in neutral space, that is, the space in front of the signer’s upper body (“neutral-space” signs). Second, signs can be articulated with the dominant hand acting on the nondominant hand (typically called “unbalanced” signs though see discussion in §5.2 and in Battison, 1974). Third, signs can be body-anchored, that is, articulated with direct contact to some body location, such as the temple or the shoulder (“body-anchored” signs; see Fig. 1 for examples). Fourth, some signs that are not anchored to the body in terms of physical contact are still crucially signed at specified locations other than neutral space, usually somewhere around the head (“other-space” signs). The locations of such signs are usually specified in terms of the body location they are closest to (see Liddell & Johnson, 1989 for discussion). 2.2. Movement
As with location, there are both major and minor movements that are typically identified in a sign. Major movement
4
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
Fig. 1. Examples of a handshape- (left vs. central images) and location-based minimal pair (central vs. right images) in ASL: CANDY (left image), APPLE (central image), and ONION (right image). (We are following the convention in sign language literature to gloss signs with small caps (e.g., SIGN).) All three signs share the same twisting movement.
Fig. 2. Example of a movement-based minimal pair in ASL: CHURCH (right image).
CHOCOLATE
(left image) and
(also called primary or path movement) involves changing the position of the hand in signing space, and thus involves either the elbow or shoulder joint (e.g., POWER). Minor movement (also called secondary, local, or handshape-internal movement) does not involve a change in the hands’ location in space but does involve movement in some way, generally due to wrist and / or finger movements (e.g., PEACH). Some signs involve both major and minor movement simultaneously (e.g., WOLF). As with location, only a few kinds of movement are exploited within each of these two categories: in ASL, Liddell and Johnson (1989) cite four major movement options, which can themselves be produced across different planes of space, and nine minor movement options (though see Sandler, 1989, 1990; Corina, 1990). Potentially both types of movement
Fig. 3. A hand-quantity-based minimal pair in ASL:
could be modified for prominence, although minor movement is more restricted since it is confined to the hand itself (see §3). There is, however, another type of movement that is of particular relevance to the discussion of sign prominence, namely, transitional movement. Traditionally, signs have been analyzed as consisting of movements that are relevant to the sign’s lexical interpretation (the stroke, which may include both major and/or minor movement if the sign is specified for them) and also of transitional movements that exist by necessity of moving the hands from the place of articulation of one sign to the place of articulation of the next sign (see Blondel & Miller, 2001; Jantunen, 2013). The stroke starts when the hand(s) are in the sign’s place of articulation, the handshape is fully formed, and the major movement starts its course, and ends when the hands start moving away, towards the next sign’s location or the resting position. Transitional movements encompass any other non-minor movements involved in a sign’s production. In most phonological models of sign languages, transitional movements are excluded as irrelevant to a sign’s recognition and interpretation, since the sign’s formational features are most well-formed during the stroke (see e.g., Perlmutter, 1990; van der Hulst, 1993; Brentari, 1998, among others). This theoretical standpoint results in practices and guidelines for sign data collection and annotation where transitional movements are systematically not included (e.g., Crasborn & Zwitserlood, 2008; Johnston, 2009). Recently, however, evidence that even transitional movements may not be so irrelevant for signs’ perception and recognition has been
FAULT
(left; one-handed) and
TIRED
(right; two-handed).
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
accumulating (Jantunen, 2013; ten Holt, van Doorn, de Ridder, Reinders, & Hendriks, 2009). For example, signers are capable of recognizing (some) signs based on only their transitional movement (Clark & Grosjean, 1982; Emmorey & Corina, 1990; Grosjean, 1981; ten Holt et al., 2009). Specifically, gating studies such as Grosjean (1981) demonstrate that while seeing at least some of the major movement is required for signers to identify the sign with high confidence, location and handshape information are often available prior to isolating movement and this information helps narrow down the range of potential candidates. In studies such as that of ten Holt et al. (2009), 66% of signs were recognized based on their transitional movement alone (p. 229), and information such as handshape, orientation, and location can be inferred early on during the transitional movement. Of course, other factors such as the neighbourhood density of the sign play a role in this early recognition (e.g., signs with very common handshapes that are articulated in the neutral space are harder to predict; ten Holt et al., 2009), but one of the interesting characteristics of transitional movement is the fact that often the formational features of the sign (e.g., handshape, orientation) may already be present in the transitional movement (Jantunen, 2013; Xavier, Tkachman, & Gick, 2015). For example, Grosjean (1981) reports that whereas it took his participants approximately 396 msec to guess the sign correctly (with mean sign duration of 817 msec, p. 202), the location, orientation, and handshape were in fact isolated prior to the isolation point (at 307, 309, and 322 msec, respectively, p. 210). Emmorey and Corina (1990) report that it took their participants only 239 msec to recognize a sign (mean sign duration 703 msec, p. 1233), but that location and orientation were isolated sooner than handshape (150 msec for location and orientation vs. 170 msec for handshape, though handshape was still isolated prior to sign recognition, pp. 1236–1237).7 Furthermore, transitional movements are not entirely involuntary movements; evidence from studies on sign poetry shows that they are often modifiable (Jantunen, 2013) and may acquire properties typical for lexical movement (Wilbur, 1990). They can have syllabic status at the phonetic level, as in counting syllables or tapping (Wilbur & Nolen, 1986). In sign creation, they can be reanalyzed as major movement (Geraci, 2009). In continuous signing, transitional movements, especially transitions across major locations, help sign segmentation (Orfanidou, McQueen, Adam, & Morgan, 2015). Moreover, disambiguation of ambiguous argument structure in sign 7 Emmorey and Corina (1990) report that they did not find evidence of handshape anticipation in transitional movement. This conclusion, however, was based on the assumption that in signs produced in higher locations and thus with longer transitional movement, handshape information, if available, would be isolated sooner than in signs produced in lower locations, an assumption that was not supported by their findings (p. 1239). The fact that handshape was not isolated sooner in longer transitions, however, does not necessarily mean there was no handshape information available during the transitional movement. As both Grosjean (1981) and Emmorey and Corina (1990) report, signs articulated near the face took longer to identify than signs articulated in the neutral space. 
This could happen because during the large transition to the face area, multiple potential candidates produced with the same handshape but in different locations in the face area may compete with each other, which would delay recognition until the correct location is reached (e.g., signs such as APPLE and ONION in Fig. 1). Frequency is another possible factor; as Grosjean (1981) notes, the distance the hand(s) had to move in transitional movement was highly correlated with the sign’s frequency. Many if not most one-handed signs are produced in higher locations (e.g., in ASL-Lex, 55% of 1H signs were produced in the head area, see Table 2), which could again delay their recognition.
5
sentences takes place before the major movement and/or the facing of the disambiguating sign is visible (Krebs, Wilbur, Alday, & Roehm, 2018). Furthermore, electrophysiological research suggests that semantically unexpected signs elicit a typical N400 event-related potential (ERP) that is triggered prior to the onset of the critical sign, i.e., during the transitional movement (Hosemann, Herrmann, Steinbach, BornkesselSchlesewsky, & Schlesewsky, 2013). All of these considerations suggest that transitional movements may be a vital part of sign production and perception (cf. Xavier et al., 2015). This is perhaps not surprising, given the fact that sign languages are different from spoken languages in having articulators that are clearly visible to addressees and in having slower articulations. All of this said, however, we do not take a strong stand on this issue in the current paper; instead, we simply report how the measure of visible amplitude can be used to capture differences in signs, either including or excluding transitional movement. 2.3. Hand quantity
Another factor to consider is what is referred to in this paper as hand quantity, i.e., whether one or two hands are involved in the production of a sign. Fig. 3 shows an example of a hand quantity-based minimal pair in ASL (see Padden & Perlmutter, 1987, for more examples). As mentioned above, hand quantity is not always treated as one of the phonological parameters of sign languages, for a number of reasons. First, not all languages exhibit minimal pairs based on the number of hands employed (e.g., Tatman, 2015, indicates that only around 14% of the sign languages included in her survey of 87 signed languages are actively claimed to have hand quantity as a phonological parameter). Second, even those languages that do have hand quantity-based minimal pairs tend to have very few such pairs (e.g., Johnston & Schembri, 1999, claim that in Australian Sign language (Auslan) there is only one such pair, SCISSORS and CRAB; see http://www.auslan.org.au/dictionary/ for videos). Third, the behaviour of the non-dominant hand leads many researchers to conclude that it is not an independent articulator in signing (see discussion in Crasborn, 2011). It is true, however, that it is easier to notice two moving objects than just one (Bruce & Green, 1990), and this simple observation has consequences for the notion of prominence (for example, in sign shouting two-handed versions of signs are preferred over one-handed versions; see §3.2). We thus include hand quantity as a parameter of interest in our measure of prominence, visible amplitude, without intending to make any claim about the exact linguistic status of the nondominant hand. 3. Prominence in sign languages
As mentioned in §1, there are a variety of signal-based acoustic characteristics in spoken languages (primarily, F0, duration, and amplitude) that can be manipulated to contribute to an increase or decrease in the perceived prominence of syllables or words (see, e.g., Shattuck-Hufnagel & Turk, 1996, for an overview and Gordon, 2011, for more on stress; van der
6
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
Hulst, 2011, for more on pitch accent; and Parker, 2011, for more on sonority; as well as more general discussion of the wide variety of approaches to prominence in e.g. Wagner et al., 2015; Baumann & Winter, 2018). Over the years of sign language description and sign language research, a number of observations have been made about ways in which a sign, too, can be made physically more prominent, compared to (a) its own citation form (as given by signers in isolation and whose meaning is clear without context, or as it is listed in sign dictionaries, Johnston & Schembri, 1999) and / or (b) other signs surrounding it. This literature is summarized below, but importantly, most of the studies point to the same few manipulations of a sign’s location, movement, and hand quantity: the height of the sign’s place of articulation, the size of its movement, the number of movement repetitions, and / or the number of hands employed in its production. 3.1. Quantitative approaches
One issue is that much sign language research treats these parameters of sign language phonology as categorical and subjective, as described above in §2. That is to say, which time-varying and quantifiable parameters of the visible motion signal are relevant to understanding prominence from the ‘physical perspective’ (Wagner et al., 2015) is still a relatively open question. Therefore, how such parameters might interact with the categorical properties thought to play a role in sign language prominence is likewise an unsettled issue. That said, the question of which physical / signal-based parameters are most important for sign language prominence has received some attention in the literature. As mentioned above, Wilbur and Zelaznik (1997) (discussed in Wilbur, 1999) showed that stressed signs reach higher peak velocity than unstressed signs, and that final signs have a higher peak velocity than non-final signs.8 They also showed that both final and stressed signs are larger than non-final and unstressed signs because they travel longer distances (displacement). But a subsequent study which sought to clarify whether perceivers utilize information from the velocity signal specifically for interpreting prosodic structure, as opposed to other physical properties of motion such as acceleration and jerk (the third derivative of displacement), was inconclusive (Wilbur & Martínez, 2002). Although deaf perceivers exhibited an overall preference for either velocity or acceleration, whether they preferred one or the other depended on the producer. Indeed, the authors conclude that the ‘open question’ posed by their study is which prosodic information is encoded in acceleration, and which is encoded in velocity. Relatedly, Jantunen (2013) reports that transitional movements exhibit higher velocities on average than strokes, but strokes exhibit higher acceleration. If we accept these results at face value, then this might suggest that both transitional movements and major movements contribute to prosodic structure in sign languages. However, it is worth noting that that study used a single flesh-point marker placed on one finger of a signing hand to derive its motion signals, rather than 8
Note, of course, that structural position can itself be an indication of prominence in both signed and spoken languages (e.g. Baumann & Winter, 2018; Himmelmann & Primus, 2015; Wilbur, 1999).
considering the motion of the whole signer. In any case, which aspects of the visual signal are most important for the perception of prosody, and how these properties interact with signs and transitional movements, is not clear at this point. 3.2. Qualitative approaches
In this section, we review several phonetic and phonological phenomena that can be manipulated to increase of descrease a sign's loudness, prominence, and/or perceptibility. The goal is to determine which sign parameters are manipulated to that end and in what way. Unlike the measures mentioned in the previous section, most of the measures here have been discussed from a qualitative perspective; this is not to say that they could not be quantified, just that so far, they have mostly not been. Indeed, we are attempting to understand which characteristics of prominence could in fact be usefully quantified. First, some studies have investigated lexical stress in signed languages, i.e., the process of highlighting a sign at the phrase or utterance level,9 which has an obvious relationship to prominence in terms of also being about relational differences between adjacent signs. Friedman (1974) claimed that in ASL, the major changes in signs under stress are to movement and body contact, and that the type of body contact in the unstressed form of the sign may affect how the movement is modified under stress. Specifically, signs with a separate contact before and after the major movement tend to gain an arching movement under stress, while signs with continuous contact retain their contact under stress and involve a slowing of their movement. Wilbur and Nolen (1986) showed that stressed signs in ASL also have a larger number of syllables (that is, more iterations of the stem). Wilbur and Schick (1987) confirmed this earlier result and also showed that stressed signs in ASL tend to have locations higher in the signing space relative to unstressed productions (see also Coulter, 1993), with more muscle tension and with sharper boundary transitions. Similarly, van der Kooij and Crasborn (2013) studied prosodic correlates of focus in NGT, and found that some signs that lack a major movement in their lexical form are articulated higher in the signing space when in a focused position. Sign languages also use systematic modifications to reflect prosodic structure. Duration, repetition, and movement size are often used to mark phrase boundaries (see Ormel & Crasborn, 2012, for a review). Numerous studies have noted that utterance-final signs in ASL have longer duration (Coulter, 1990; Grosjean, 1979; Liddell, 1980; Perlmutter, 1992, 1993; Wilbur & Nolen, 1986; Wilbur, 1999). Similar observations have been made for other sign languages (e.g., see Boyes Braem, 1999 for Swiss German Sign Language and Nespor & Sandler, 1999 for Israeli Sign Language). Furthermore, at the ends of intonation phrases, holds and pauses may be longer; reduplicated signs may have more iterations; rate may be slowed; and size may be increased (see also Sandler, 1999; Boyes Braem, 1999 for Swiss German Sign Language; Miller, 1996 for the Sign Language of Quebec). 9 An anonymous reviewer points out that in contemporary intonational phonology (on spoken language), this would usually be termed accent. But since here we are reviewing studies that employed the term stress rather than accent, we follow the terminology of these original studies and refer to lexical stress.
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
Finally, like spoken languages, signed languages show phonetic variability that could be considered a manipulation of some components of prominence, i.e., phonetic enhancement and reduction. Increased duration, repetition, and movement size are sometimes used paralinguistically for emphasis, supporting Nespor and Sandler's (1999) suggestion that they add prominence. Brentari (1998) proposes a phonological feature hierarchy for signed languages that draws on Enhancement Theory (Stevens & Keyser, 1989; Stevens, Keyser, & Kawasaki, 1986), making the argument that certain phonological characteristics serve to provide phonetic enhancement to more basic or primary features. Specifically, she argues that proximalization, that is, increasing the size of the sign by employing joints closer to the body, e.g., an elbow instead of a wrist, serves as an enhancement to more basic aperture features. For instance, she shows that proximalization can be used to enhance the sonority of a sign, specifically claiming that the perceptual consequence of increasing sonority is greater visibility (cf. greater audibility for increased sonority in spoken languages) (Brentari, 1998: 217). More generally, one can also consider signed shouting and whispering as phonetic enhancements and reductions. Even though shouting and whispering are not necessarily manipulations of phonological prominence, they do involve an increase or decrease in acoustic amplitude in spoken languages, concomitant with some kind of contextual prominence, and similar observations have been made for signed languages. Because these phenomena are well documented in terms of their consequences for the linguistic signal, they provide a useful starting point for understanding what phonetic characteristics are available for signal-based prominence manipulation. Sign whispering refers to situations when the signer addresses just one interlocutor and tries to conceal their signing from other people, while sign shouting refers to situations when the signer addresses several interlocutors, such as signing to a group of friends, or in more formal registers, such as giving a lecture, thus making the signing more salient at a global level than it would otherwise be. In sign whispering, onehanded variants of signs are often preferred to two-handed ones, the movement is reduced, and the hands move closer to the body and are lowered. Sign shouting, on the other hand, involves larger head and body movements, and the signing space is larger than normal (both higher and wider), thus increasing the salience of the signing generally (see Emmorey, 2001, on ASL; and Crasborn, 2001, on the Sign Language of the Netherlands (Nederlandse Gebarentaal, NGT)), though of course there may still be variation in relative phonological prominence between elements within the shouted utterances. Signs articulated in neutral space are often articulated higher and further from the body, and in twohanded signs the hands are located further apart from each other (Crasborn, 2001). Often these modifications are achieved via proximalization, as defined above. Other sign characteristics such as hand opening and non-manual features such as mouthing can also be hyper-articulated to enhance the shouting effect (Crasborn, 2001). Crasborn (2001), discussing shouting in NGT, notes that in addition to movement modifications, shouting also involves a lot of movement and orientation changes. Signs without major movement typically involve repeated movement at metacarpophalangeal
7
joints (the joints that connect the fingers to the palms). These characteristics can also be observed in other registers of signing. As Baker and Cokely (1980) note, in a formal register (e.g., a lecture or sermon), signers also employ two-handed sign forms articulated in a larger signing space, but the signing space itself is not raised. Mauk (1999) found similar effects in signing to a distant addressee. Thus, with phenomena such as whispering and shouting, we see employment of sign modifications that involve changes in hand quantity; movement reduction or enhancement achieved via different joints; repetition; and location and orientation manipulations. 3.3. Summary of prominence in sign language
The preceding overview of previous research was intended to show that in order to make a sign more prominent (in a variety of different ways), the same few formational features are manipulated: a sign can be articulated in a higher location than in its citation form, its movement can be reiterated, the movement’s size can be increased via displacement or proximalization, a sign’s temporal duration can be increased, and a twohanded version of a one-handed sign can be used. Table 1 summarizes how these characteristics are distributed across different types of sign language prominence phenomena. Future investigation of the physical correlates of sign languages that lead to differences in prominence would surely benefit from having a measure that is more easily obtainable, i.e., can be extracted from video recordings of signs rather than requiring invasive motion capture equipment to be used for data collection. Along these lines, we propose that a measure derived from velocity, visible amplitude, might in fact be a more appropriate measure than the raw kinematic signals for quantifying prosodic structure by functioning in a manner analogous to acoustic amplitude in spoken language (see §4). Visible amplitude should allow us to directly quantify the effects of three of the major categories listed in Table 1; specifically, location, movement, and hand quantity. Because of the way that visible amplitude is calculated, however, duration does not have a predictable effect on it: longer durations may be the result of slower movements (which, all else being equal, result in less visible amplitude) or bigger movements (which, all else being equal, result in more visible amplitude), as will be shown in §4. Thus, to the extent that duration depends on movement, it too is encapsulated by visible amplitude, but it does not have the same straightforward interpretation. On the other hand, duration is relatively straightforwardly calculated from a video independently. The details of what visible amplitude is and how it can be calculated are provided in the next section. 4. Measuring visible amplitude
Audible energy is a general notion in the description of spoken languages (see, e.g., Diehl, 2008). As the sound wave caused by vocalization moves, the energy stored in the wave applies a force to the surrounding medium, sending it into motion. We hear (or measure) this time-varying process as acoustic intensity, which is the power delivered by the wave over some area (with physical units of Watts/m2). This property of the acoustics is reflected in the time-varying amplitude of the acoustic waveform, which is commonly computed using the
8
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
Table 1 Summary of formational features manipulated for phenomena related to prominence in sign languages. Lexical stress Location: Higher location Movement: Reiteration Larger size (displacement) Larger size (proximalization)
Prosodic structure marking
U U U
U U U
U U U
Handedness: Two-handed variant Duration: Longer duration
Phonetic enhancement
U U
root-mean square, or RMS (equation (1)) (e.g. Yehia et al., 2002), which is the square root of a signal’s power.10
RMS Amplitude ¼
rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 2 ðx þ x 22 þ þ x 2N Þ N 1
ð1Þ
By analogy to the relation between audible energy and acoustic intensity defined in terms of power, we measure visible amplitude using the root-mean-square of a motion velocity time-series extracted directly from the video of a sign. In order to compute the velocity, we make use of a video analysis technique known as Optical Flow Analysis (OFA; see e.g., Barbosa, Yehia, & Vatikiotis-Bateson, 2008; Fleet & Weiss, 2006; Hall, Letawsky, Turner, Allen, & McMullin, 2015; Hall, Smith, McMullin, Allen, & Yamane, 2017; Horn & Schunck, 1981; Moisik, Lin, & Esling, 2014). Optical flow itself is the distribution of apparent motion velocities of objects in the image plane, which can arise, for example, due to the motion of a perceiver relative to their environment (Gibson, 1966). This basic idea is leveraged to determine motion in video by using changes in pixel brightness from one frame to the next to estimate their apparent movement velocity (Horn & Schunck, 1981). This results in an optical flow field, which contains the apparent motion vectors for all pixels in the image plane; see Fig. 4. For this study, the optical flow estimation technique described in Horn and Schunck (1981) was used, as implemented in the open-source FlowAnalyzer software developed by Barbosa (2013). FlowAnalyzer computes the optical flow field, and additionally reduces it to a single scalar value per frame-step by summing the magnitudes of all of the pixel velocity vectors in the field and dividing by the total number of pixels (see the definition provided in Fig. 4 and Barbosa et al., 2008, for further details). This gives the average magnitude of motion velocity within the field for a particular framestep. This measure is computed for each frame-step in the entire video, yielding a time series that reflects the overall 10 It is worth emphasizing that the meaning of the terms energy and power in signal processing differ from their use in physics (e.g., Lathi & Green, 2014); the energy of a signal being its sum of squares, and its power a time-average of the energy measured over some interval. These quantities do not always have a clear physical interpretation. But in the case of the speech waveform, whose fluctuations are structured by the movement velocity of a pressure wave, the energy (power) of the signal can be interpreted as representing the kinetic energy of the acoustic waveform, with kinetic energy being proportional to the velocity squared.
structure of motion in the image plane at a given time step, and changes in this structure through time. We refer to this as the velocity magnitude time series. For example, two adjacent frames in a video with four apparently moving pixels11 as in Fig. 4(a-b) might result in the optical flow field shown in Fig. 4(c). In the first frame, the four pixels appear to be adjacent to each other; in the second, they have each apparently moved away from the centre point. Each arrow (vector) in the optical flow field thus represents the magnitude and direction of apparent movement of each pixel from one frame to the next. The magnitude of each pixel’s motion (zj) is calculated as the norm of the pixel’s velocity vector (vj) (Fig. 4(d)), i.e., as the square root of the sum of the squares of the horizontal (xj) and vertical (yj) vector components. These multiple separate magnitudes are reduced to a single measurement, mi, by averaging them (Fig. 4(e)). The unit for this average magnitude measure is pixels per frame step. Finally, we then calculate the visible amplitude as the rootmean-square (RMS) of the velocity magnitude time series output by OFA (Fig. 4(f); cf. Eq. (1) above). This measure squares the per-frame-step pixel velocity magnitudes, sums them, computes the temporal average of this sum of squares by dividing by the number of frame-steps in the time series, and then takes the square root (cf. Eq. (1)).12 It is important to note that in principle, the RMS can be computed within a moving window, which is how it is typically applied when it is used to compute the timevarying amplitude of the acoustic waveform. In the analysis reported in §5, we instead compute the RMS using the entire time series, which yields a measure that is proportional to the time average of visible amplitude measured over the entire sign (with the proportionality being due to the square root). While a more fine-grained temporal analysis of visible amplitude is certainly possible, we limit our preliminary application of the measure to this case, since it enables us to compare the visible amplitude of different lexical items while controlling for duration. This is an important first step toward establishing the viability of visible amplitude as a parameter in models of sign language prominence, where it would more than likely be combined with duration. Also, controlling for the duration of a sign enables us to better understand the relation between visible amplitude and a sign’s sub-lexical parameters. This is also an important step toward establishing how other sign language parameters might interact with visible amplitude in models of sign language
11 Of course, the pixels on the screen do not actually change location. Instead, as mentioned earlier, their apparent movement is captured by tracking the location of pixels with analogous luminosity from frame to frame. 12 It is worth mentioning that use of the square root in the calculation of visible amplitude is not strictly necessary, but we elect to use it here to maintain consistency with the use of RMS in the calculation of acoustic amplitude, as discussed previously. Use of the square root has the benefit of casting the measure’s interpretation in terms of energy’s contribution to the amplitude of the visible motion signal, again, in analogy to the acoustic signal. Along these lines, one can see that use of the square root gives the measure units of pixels / frame-step—the reader can recognize that this is the case by noting that the RMS is simply the standard deviation of the data, without centering (alternatively, the two are equivalent in the case where the data has zero-mean). Abandoning the square root and instead using the mean-square velocity would give the measure units of (pixels2)/(frame-step2). While this would result in a tighter analogy to the units of kinetic energy (12 m v 2 ) and directly reflect the power of the visible motion signal, we believe the benefits of maintaining consistency with the use of RMS as a measure of acoustic amplitude (cf. Schweitzer, in press, this Special Issue), and the emphasis on the importance of energy’s contribution to amplitude, outweigh this concern.
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
9
Fig. 4. Optical Flow Analysis and visible amplitude calculation.
prominence, since several of these factors are thought to work together to determine a sign’s prominence (see §3). Further, it should be noted that OFA is known to be a reliable technique for extracting the velocity signal from video. For example, velocity signals of orofacial motion extracted with OFA reliably correspond to those extracted from marker-based motion capture systems (Barbosa et al., 2008) (i.e. with correlation coefficients of 0.98), and signals from videos of different sizes (in terms of the number of pixels in the image), can easily be compared with proper normalization by the image size (as is performed automatically by the FlowAnalyzer software, discussed previously). Moreover, Barbosa et al. (2008) point out that “the temporal variation of this seemingly impoverished measure is surprisingly well-coordinated with time-varying measures made in other domains (for example, the RMS amplitude of the speech acoustics),” citing in turn Barbosa, Yehia, and Vatikiotis-Bateson (2007), who demonstrate correlations between video of the hands / face of a person speaking an oral language and the acoustic properties of their speech.
The signs in the database are all recorded by one female, middle-aged, white, deaf native signer of ASL. In addition to the accessibility of the data, this source was chosen because of its relative uniformity of presentation, which is important for several reasons. First, Mauk and Tyrone (2012) found preliminary evidence that the signing space is not uniform across signers. That is, not all signers use the same amount of space in relation to their own anatomical limits, and signers may also vary in terms of the extent to which they use the same areas of space. Thus, given that the focus here is on the relative visible amplitude of different signs, it was important that the signs all be produced by the same signer to avoid issues of signer normalization. Furthermore, although the signs in ASL-Lex were not all recorded entirely identically (e.g., there is some variation in the details of the signer’s presentation in terms of clothing, hair, and exact location relative to the camera, as can be seen in Figs. 1 to 3), they are all recorded in a consistent manner, with the signer starting each sign with her hands below the video frame, producing a sign in isolation, and then returning her hands below the frame.
5. Visible amplitude in American Sign Language
5.2. Coding
To test the viability of visible amplitude, we applied the measurement technique to a database of American Sign Language. Of interest is how each of three particular sub-lexical parameters of a sign’s structure, namely, major location, major movement type, and hand quantity, contribute to a sign’s total visible amplitude.
In addition to the set of videos of 993 signs, ASL-Lex is extensively coded for lexical and phonological information about each sign. Of particular relevance to the current study is that each sign is coded for its major movement type (with the categories “circular,” “curved,” “straight,” “back-and-forth,” “none,” or “other”), its major location (with the categories “head,” “body,” “neutral,” “arm,” and “hand”),13 and its hand quantity type (with the categories one- or two- handed, and within two-handed, what they call “symmetrical-or-alternating” vs. “asymmetrical” movements of the two hands). In the field of sign language linguistics more broadly, the terms “balanced” and “unbalanced” are used, and here, the shorthand terms “1H” for one-handed signs, “2HB” for two-handed balanced
5.1. The ASL-Lex database
The database used for this study is ASL-Lex, which is freely available online (http://asl-lex.org/) and described in Caselli et al. (2017). This is a database of 993 signs from ASL, with the choice of signs drawn from several sources and intended to provide a frequency-balanced set of words. As a general proposition, fingerspelled words and classifier constructions were not included, though see Caselli et al. (2017: 787) for details.
13 It should be noted that when the sign has multiple locations, only the first location of the dominant hand is coded in ASL-Lex. For example, the signs for MAN and AGREEMENT are each coded as head-located, but they start at the head and end in a different location, lower in the signing space.
10
O. Tkachman et al. / Journal of Phonetics 77 (2019) 100935
signs, and “2HU” for two-handed unbalanced signs are adopted. 2HB signs are defined as those in which the two hands have the same handshape, location, and movement.14 In 2HU signs, the non-dominant hand typically serves as the place of articulation for the dominant hand,15 and the non-dominant hand either matches the handshape of the dominant hand or takes on one of seven unmarked handshapes (Battison, 1974). It should be noted that while the coding schemes for location and hand quantity align directly with the properties in Table 1 that are associated with changing prominence in sign languages, the coding for major movement type is less direct. The movement-related parameters in Table 1 are about increasing the movement relative to a baseline, by adding repetitions or expanding the size of the movement in space. The ASL-Lex coding for major movement type, however, is about the path of the movement. That said, different major movement paths might themselves be expected to result in different total amounts of movement (e.g., because a straight line between two points would be a shorter path than a curved line between the same two points).
5.3. Data selection
In the original ASL-Lex videos, available for public viewing at http://asl-lex.org/, some of the videos have been clipped so that the sign does not finish with the signer’s hands back in the rest position (an example of this can be seen with the sign for WATER). Of the 993 signs in the ASL-Lex database, we found that 230 had this kind of clipping (approximately 23%). Because the approach to OFA we use here takes into account the motion of the entire sign, including transitional movements, and not just the stroke of the sign,16 we decided to simply discard these signs from further analysis, leaving 763 signs available for analysis. Of these 763 signs, 31 are coded in ASL-Lex as being compound signs, that is, sequential combinations of two or more other signs (e.g., the sign for BEDROOM is a compound of the signs for SLEEP and ROOM). These were excluded from the current analysis, in order to ensure that the analysis stays at the level of individual lexemes. Compounds, being a combination of two or more signs, would be expected to have more visible amplitude than individual signs, and for this reason at this initial stage of our investigation we opted for excluding them from the dataset.17 14 The movement in a balanced sign can be either “symmetrical” or “alternating” in manner (in ASL-Lex, CAN is an example of a symmetrical 2HB sign, and CAR is an example of an alternating 2HB sign). 15 There are exceptions to this. For example, in the sign for SHEEP, the non-dominant arm serves as the location forthe sign, and what makes this sign crucially different from a 2HB sign with an arm location (such as TABLE) is that the handshape of the non-dominant hand in the 2HU sign SHEEP is different from that of the dominant hand. In the ASL-Lex coding system, 2HU sign can also be signed in neutral space, with the non-dominant hand serving as a reference point for the dominant hand, e.g. in signs such as BAKE or SCAN. 16 We do, however, also examine some stroke-only subsets of the data in §6.2 for comparison. 17 We note that there are other multimorphemic signs that are not compounds that were nevertheless included in the analysis. For example, the sign for 5-DOLLARS involves the FIVE handshape, whereas signs for other dollar amounts (not included in ASL-Lex) would involve the same location and movement, but use the handshape of the relevant number to be indicated. It is not a sequential compound, however, and we do not know of any reason to suspect that such signs would a priori have more visible amplitude than non-multimorphemic signs.
Additionally, 12 signs were coded as having a sign type of “other” rather than 1H, 2HB, or 2HU, and so were excluded, along with the sign for TAIL, which was produced with a location that was otherwise not represented in the database. Excluding compounds, clipped signs, and signs with an atypical sign type or location, 691 total signs were included in the analysis.18
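For concreteness, this exclusion pipeline can be sketched as a handful of dataframe filters. The column names below (Clipped, Compound, SignType, MajorLocation) and the file path are hypothetical placeholders, not the actual ASL-Lex field names:

```python
# A minimal sketch of the data selection in §5.3, assuming the ASL-Lex coding
# is available as a CSV; all column and file names are hypothetical.
import pandas as pd

signs = pd.read_csv("asl_lex_coding.csv")                    # 993 signs

signs = signs[~signs["Clipped"]]                             # drop 230 clipped videos -> 763
signs = signs[~signs["Compound"]]                            # drop sequential compounds
signs = signs[signs["SignType"].isin(["1H", "2HB", "2HU"])]  # drop sign type "other"

# Drop any sign whose major location is otherwise unattested (e.g., TAIL)
loc_counts = signs["MajorLocation"].value_counts()
signs = signs[signs["MajorLocation"].map(loc_counts) > 1]

print(len(signs))  # 691 remain; exclusions overlap for some signs (footnote 18)
```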
5.4. Procedure

Optical Flow Analysis was performed on the video of each of the 691 included signs using FlowAnalyzer, which yielded the velocity magnitude time series for each sign. The RMS of this time series was computed as a measure of visible amplitude, as described in §4. The relationship of visible amplitude to the coded characteristics of signs in the database was then examined, to see how each type of characteristic contributes to visible amplitude.
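The computation itself is straightforward to sketch. The analyses here used Barbosa’s FlowAnalyzer; the fragment below substitutes OpenCV’s Farnebäck dense optical flow as a freely available stand-in, so absolute values would differ from FlowAnalyzer’s even though the logic (per-frame velocity magnitude, then RMS) is the same:

```python
# A minimal stand-in for the visible-amplitude computation, assuming OpenCV's
# Farneback optical flow in place of FlowAnalyzer (which was actually used).
import cv2
import numpy as np

def visible_amplitude(video_path: str) -> float:
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    magnitudes = []                       # one value per frame transition
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # Mean flow-vector magnitude over all pixels = frame velocity magnitude
        magnitudes.append(np.linalg.norm(flow, axis=2).mean())
        prev = curr
    cap.release()
    v = np.asarray(magnitudes)
    return float(np.sqrt(np.mean(v ** 2)))  # RMS of the time series (§4)
```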
6. Results

The primary focus of this paper is the way in which three different sub-lexical characteristics might affect visible amplitude: hand quantity, location, and movement. Before taking visible amplitude into account, however, we first examine the baseline distribution of the signs within these three categories.
6.1. Distribution of sign types

Of the 691 included signs, 282 were one-handed signs (‘1H’), 238 were two-handed balanced signs (‘2HB’), and 171 were two-handed unbalanced signs (‘2HU’). Fig. 5 illustrates the distribution of major movement across these three hand quantity types, Fig. 6 illustrates the distribution of major location, and Table 2 summarizes these distributions numerically. Of particular note in comparing Fig. 5 to Fig. 6 is that the distribution of major movement types is decidedly different from that of major locations across the three hand quantity types, a point that we will return to in the discussion in §7.

The distribution of major movement types is quite consistent, with the majority of major movements being “back and forth” for all three hand quantity types, and very similar proportions of each hand quantity using each of the other major movement categories (straight, curved, circular, and other). A Chi-square test on the distribution of all major movements across the three hand quantity types is significant, however [χ²(10) = 23.67, p = 0.01], indicating that the distribution is not in fact entirely consistent. As seen in Table 2, the primary discrepancy is that there is a higher proportion of 1H signs with no major movement (22%) than there is for either 2HB (11%) or 2HU (11%) signs. A Chi-square test on the subset of the data that excludes the category “None” for major movement shows no other significant discrepancy [χ²(8) = 7.36, p = 0.50].

For major location, however, the distribution varies much more drastically. A Chi-square test of the distribution of all major locations across the three hand quantity types shows the distributions to be highly significantly different [χ²(8) = 681.37, p < 2.2e−16].

18 Some signs were excluded for more than one reason (e.g., AGREEMENT was both a compound and coded as having a sign type of ‘other’).
Fig. 5. Distribution of major movement types across hand quantity types. 1H indicates a one-handed sign; 2HB indicates a two-handed, balanced sign; and 2HU indicates a two-handed, unbalanced sign.
Fig. 6. Distribution of major locations across hand quantity types. 1H indicates a one-handed sign; 2HB indicates a two-handed, balanced sign; and 2HU indicates a two-handed, unbalanced sign.
Table 2
Raw counts and percentages of each sign type that have each major movement or major location characteristic.

Major Movement     1H           2HB          2HU
None               63 (22%)     27 (11%)     18 (11%)
Straight           55 (20%)     51 (21%)     41 (24%)
Curved             39 (14%)     35 (15%)     17 (10%)
Circular           28 (10%)     29 (12%)     23 (14%)
Back-and-Forth     83 (29%)     90 (38%)     66 (39%)
Other              14 (5%)      6 (3%)       6 (4%)
Total              282 (100%)   238 (100%)   171 (100%)

Major Location     1H           2HB          2HU
Head               156 (55%)    38 (16%)     7 (4%)
Body               35 (12%)     34 (14%)     1 (1%)
Neutral            84 (30%)     164 (69%)    12 (7%)
Arm                7 (3%)       0 (0%)       9 (5%)
Hand               0 (0%)       2 (1%)       142 (83%)
Total              282 (100%)   238 (100%)   171 (100%)
This is partially due to the ASL-Lex coding system; the major location of “Hand” is used almost exclusively for 2HU signs, but “Hand” might be better considered as comparable to a “minor” location. Most of these hand-located signs involve both hands occurring in what might be considered the neutral space, in addition to involving one hand acting on the other, and so the 2HU signs are not as entirely different from the 2HB signs as they might seem at first
glance, at least in terms of their position in the signing space and the articulatory effort of moving the hands into their place of articulation. However, there is still a large difference between the 1H signs and either of the 2H types: the majority of the 1H signs (55%) are produced at the head, while only 16% of 2HB signs and only 4% of 2HU signs are produced at this location. Meanwhile, the majority of both 2HB (70%) and 2HU (90%) signs are in the neutral location (if one includes “Hand” as a neutral location), while only 30% of 1H signs are produced in neutral space.

We can also examine the interaction of movement type and location, as shown in Fig. 7 and summarized numerically in Table 3. Fig. 7 shows that there is no particular interaction between major movement and major location; the distribution of major movement types is relatively similar for each major location. The biggest difference is that body-located and arm-located signs tend to consistently have some sort of major movement, while a larger proportion of other-located signs have no major movement. A Chi-square test on the distribution of major movement types within just the head-, neutral-, and hand-located signs (which together encompass 88% of all the data) indicates that there is no significant difference in the distributions across these three locations [χ²(10) = 10.66, p = 0.39].
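For reproducibility, the distributional tests reported in this section can be re-derived directly from the raw counts in Table 2; the sketch below uses Python’s scipy as a stand-in for the statistical environment actually used for the analyses:

```python
# Re-deriving the Chi-square tests above from the Table 2 counts.
import numpy as np
from scipy.stats import chi2_contingency

# Columns: 1H, 2HB, 2HU
movement = np.array([[63, 27, 18],     # None
                     [55, 51, 41],     # Straight
                     [39, 35, 17],     # Curved
                     [28, 29, 23],     # Circular
                     [83, 90, 66],     # Back-and-Forth
                     [14,  6,  6]])    # Other
location = np.array([[156,  38,   7],  # Head
                     [ 35,  34,   1],  # Body
                     [ 84, 164,  12],  # Neutral
                     [  7,   0,   9],  # Arm
                     [  0,   2, 142]]) # Hand

chi2, p, dof, _ = chi2_contingency(movement)
print(chi2, dof, p)        # ~23.67 on 10 df, p ~ 0.01

chi2, p, dof, _ = chi2_contingency(movement[1:])   # excluding "None"
print(chi2, dof, p)        # ~7.36 on 8 df, p ~ 0.50

chi2, p, dof, _ = chi2_contingency(location)
print(chi2, dof, p)        # ~681.37 on 8 df, p effectively 0
```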
6.2. Visible amplitude results

We now move on to an examination of the visible amplitude in the various types of signs. First, Fig. 8 shows the average amount of visible amplitude for each of the three hand quantity types, regardless of movement or location.19 As can be seen in Fig. 8, each of the 2H types has more visible amplitude than the 1H signs. An ANOVA testing for the influence of hand quantity type on visible amplitude indicates that hand quantity type is statistically significant [F(2, 688) = 117, p < 2.2e−16], and subsequent post-hoc t-tests indicate that the 1H signs have significantly less visible amplitude than either of the 2H types [t(427.08) = 14.12, p < 2.2e−16, Cohen’s d = 1.27 for 1H vs. 2HB and t(309.22) = 11.17, p < 2.2e−16, Cohen’s d = 1.13 for 1H vs. 2HU].20 Perhaps surprisingly, there was also a statistically significant difference between the two types of 2H signs [t(389.92) = 2.19, p = 0.03, Cohen’s d = 0.22]. Given the much smaller effect size for this comparison, it is less likely to be particularly meaningful (see e.g., Cohen, 1988; Sawilowsky, 2009). The difference in effect sizes might stem from the fact that in 2HU signs, one hand is often used as a passive “place” of articulation for the other hand, and thus contributes less to visible amplitude than an active articulator in a balanced sign, but testing this hypothesis is beyond the scope of the current project.
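These comparisons follow a standard pattern (one-way ANOVA followed by Welch’s t-tests). A sketch is given below, assuming a dataframe `signs` with hypothetical columns `SignType` and `VisibleAmplitude` holding the RMS values; the Cohen’s d formula shown (mean difference over the root of the averaged group variances) is one common variant and is an assumption, not necessarily the exact formula behind the values reported above:

```python
# Sketch of the hand-quantity comparisons; column names are hypothetical.
import numpy as np
from scipy import stats

groups = {t: signs.loc[signs["SignType"] == t, "VisibleAmplitude"].to_numpy()
          for t in ("1H", "2HB", "2HU")}

# One-way ANOVA across the three hand quantity types
F, p = stats.f_oneway(*groups.values())
print(F, p)                                          # ~117, p < 2.2e-16 reported

def welch_t_and_d(a, b):
    t, p = stats.ttest_ind(a, b, equal_var=False)    # Welch's t-test
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return t, p, (a.mean() - b.mean()) / pooled_sd   # Cohen's d (assumed variant)

print(welch_t_and_d(groups["2HB"], groups["1H"]))    # d ~ 1.27 reported
print(welch_t_and_d(groups["2HU"], groups["1H"]))    # d ~ 1.13 reported
print(welch_t_and_d(groups["2HB"], groups["2HU"]))   # d ~ 0.22 reported
```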
These results support the hypothesis that hand quantity type (1H, 2HB, or 2HU), which may be one contributing factor to prominence, can be measured by visible amplitude. Note, however, that despite having twice the number of articulators, two-handed signs do not simply have twice the visible amplitude of one-handed signs. In all signs, there will be some visible amplitude that could be considered “background” energy coming from movements of the body, head, and face (see §7), which would mean that two-handed signs would not, even in idealized circumstances, have twice the visible amplitude of one-handed ones.

Next, we consider the interaction among hand quantity type, major movement, and major location, and how each contributes to visible amplitude. First, Fig. 9 shows how visible amplitude varies across different major movement types for each of the three hand quantity types. Perhaps surprisingly, visible amplitude does not vary significantly with the type of major movement involved, within a given hand quantity type. ANOVAs on each of the hand quantity type datasets shown in Fig. 9 show no significant effect of major movement on visible amplitude: for 1H signs, F(5, 276) = 0.985, p = 0.427; for 2HB signs, F(5, 232) = 1.96, p = 0.09; and for 2HU signs, F(5, 165) = 0.62, p = 0.69.21 This result is perhaps especially surprising given that one of the possible movement types is “none” – i.e., even signs with no major movement end up with approximately the same amount of visible amplitude as signs with a major movement. This may be accounted for by the presence of both minor and transitional movements in these signs. Of the 108 signs classified as having no major movement, 98 do have some minor movement, which in turn often involves multiple repetitions. And, because of the nature of this database, i.e., signs produced in isolation from a “rest” position, all signs, even those classified as having no major movement, have transitional movements that contribute to visible amplitude.

In order to probe the true contribution of movement type separately from minor and transitional movement effects, a small subset of the database was further examined. The subset was controlled for as many characteristics as possible that might affect visible amplitude, but crucially consisted of signs with two different types of major movement: “curved” vs. “straight.” Other than this difference, all of the signs were 1H signs that had only major movement and no minor movement, with only one repetition of the major movement, that were produced at the “Head” location. They did vary slightly in terms of the more specific location, i.e., some were produced at the Forehead, others at the Cheek, others at the Chin, etc. In order to completely eliminate the effects of transitional movement, however, visible amplitude was calculated only for the stroke of the sign;22 the stroke averaged 23% of the entire sign recording (though it
19 In this and all subsequent boxplots, the bottom and top of the box are at the first and third quartiles, respectively. The whiskers extend to the smallest or largest point in the dataset that falls within 1.5 * IQR below the bottom or above the top of the box, where IQR is the inter-quartile range. All points beyond the whiskers are classified as outliers.
20 We note that the distribution of residuals in the ANOVA is not Normal, as in general, the visible amplitude scores are somewhat right-skewed (as can be gleaned from Fig. 8, where there are high outliers for each hand quantity type, despite an otherwise relatively regular distribution). A non-parametric version of an ANOVA, the Kruskal-Wallis test, which does not assume Normality, confirms the ANOVA results: Kruskal-Wallis χ²(2) = 194.97, p < 2.2e−16. In general, we report the results of parametric tests here, because the data are largely in line with the relevant assumptions, but non-parametric tests were also conducted for comparison. Results were always in agreement unless otherwise noted.
21 Note that while in Fig. 9, it may seem as though signs with “Other” movement types have noticeably greater visible amplitude than other signs of the same sign type (especially within 2HU signs), there are very few such signs that actually fall into these categories. Specifically, there are 14 “Other” signs among the 1H signs, six “Other” signs among the 2HB signs, and six “Other” signs among the 2HU signs, as compared to 17–90 signs of each of the other movement type categories. So, while “Other” movement types might possibly be associated with greater visible amplitude, there are not enough examples in the current dataset to say anything conclusive. 22 Recall from §2.2 that the stroke starts when the hand(s) are in the sign’s place of articulation, the handshape is fully formed, and the major movement starts its course, and ends when the hands start moving away, towards the next sign’s location or the resting position.
Fig. 7. Distribution of major movement types across major locations.
Table 3
Raw counts and percentages of each major location that have each major movement characteristic.

Major Movement     Head         Body        Neutral      Arm         Hand
None               43 (21%)     2 (3%)      47 (18%)     1 (6%)      15 (10%)
Straight           40 (20%)     12 (17%)    59 (23%)     3 (19%)     33 (23%)
Curved             25 (12%)     20 (29%)    32 (12%)     0 (0%)      14 (10%)
Circular           22 (11%)     14 (20%)    23 (9%)      1 (6%)      20 (14%)
Back-and-Forth     64 (32%)     22 (31%)    89 (34%)     8 (50%)     56 (39%)
Other              7 (4%)       0 (0%)      10 (4%)      3 (19%)     6 (4%)
Total              201 (100%)   70 (100%)   260 (100%)   16 (100%)   144 (100%)
Fig. 8. Visible amplitude by hand quantity type. 1H indicates a one-handed sign; 2HB indicates a two-handed, balanced sign; and 2HU indicates a two-handed, unbalanced sign.
ranged from 6% to 36%). The signs also varied in terms of handshape, which, as mentioned above, is thought unlikely to have much direct effect on visible amplitude. These variations were required in order to have multiple signs of each major movement type; even with those allowances, there were only 12 “curved” signs and 16 “straight” signs that were otherwise matched. Fig. 10 shows the difference in visible amplitude within the matched subset, based on major movement path. As can be seen, the visible amplitude tends to be larger for the signs with
curved movement than it is for those with straight movement; a t-test reveals that the two are right on the cusp of being statistically significantly different from each other at an alpha level of 0.05: t(13.978) = 2.1443, p = 0.05. The Cohen’s d measure of effect size is 0.91, which is typically considered a “large” effect (Cohen, 1988; Sawilowsky, 2009). Given the relatively small samples here, which reduce the power to detect any effect, it seems reasonable to surmise that indeed, different major movement paths may lead to different visible amplitudes.
Fig. 9. Visible amplitude for each major movement type, within each hand quantity type. 1H indicates a one-handed sign; 2HB indicates a two-handed, balanced sign; and 2HU indicates a two-handed, unbalanced sign.
Fig. 10. Visible amplitude in a subset of signs that are matched as 1H signs produced at the Head, with no minor movements, and only one repetition of the major movement. Visible amplitude was measured only during the stroke of the sign, i.e., excluding any transitional movements.
Thus, the interpretation of the effect of major movement on visible amplitude must be approached carefully: while the type of major movement may have a direct effect on visible amplitude, it is perhaps easily masked by the presence of transitional (and possibly minor) movements, especially in a situation such as this one where the transitional movements are long relative to the strokes.

Returning to the original full dataset (all signs, and including transitional movements), Fig. 11 shows how visible amplitude varies across different locations for each of the three hand quantity types. This time, there is considerable variability in the amount of visible amplitude based on where the sign is located. Note that caution should be used in interpreting the graph, as not all locations are used for all sign types in the subset of the ASL-Lex database included here: there are no hand-located 1H signs, and no arm-located 2HB signs; there is also only one body-located 2HU sign. All three datasets shown in Fig. 11 show statistically significant differences on the basis of an ANOVA: for 1H signs, F(3, 278) = 17.48, p = 1.99e−10;
for 2HB signs, F(3, 234) = 19.83, p = 1.74e−11; and for 2HU signs, F(4, 166) = 2.715, p = 0.03.23 For 1H and 2HB signs, it is clear that the head-located signs have greater visible amplitude, as would be expected from the added transitional movement of taking the hand(s) up to the head as part of the sign. Indeed, the 2HB signs seem to illustrate a relatively “clean” example of the expected effects of transitional movement on visible amplitude, with visible amplitude decreasing as the major location gets lower in the signing space. For the 1H signs, the arm-located signs seem to be much more similar in terms of their visible amplitude to the head-located signs. It should first be noted that there is a relatively small number of these signs (only seven arm-located 1H signs,
23 The effect of location on 2HU signs is the only marginal result here. The non-parametric Kruskal-Wallis test gives χ²(4) = 8.21, p = 0.08, which does not quite reach significance under an assumption of α = 0.05. This is likely due to the relative scarcity of data, and will be discussed below.
Fig. 11. Visible amplitude across different major locations for each hand quantity type.
as compared to 35 body-located signs, 84 neutral-located signs, and 156 head-located signs). Additionally, arm-located 1H signs often have a movement that spans the entire length of the arm (e.g., the signs for ARM or STEAL), which tends to increase the visible amplitude of the sign, and even when this is not the case, several of the signs do involve obviously large amounts of transitional movement. For example, the sign for POWER is located near the shoulder of the non-dominant arm and so involves significant transitional movement of the dominant hand, and the sign for POOR, which is at the elbow of the non-dominant arm, requires the non-dominant hand itself to come up and reach almost shoulder height to make the elbow accessible to the dominant hand. Thus, it is not surprising that the arm-located signs pattern like head-located signs in the 1H sign type.

As for the 2HU signs, recall that the vast majority of these signs are coded as being located at the hand (142 hand-located signs, as compared to one body-located, seven head-located, nine arm-located, and 12 neutral-located signs). Here, the head-, arm-, and hand-located signs all seem to have roughly the same amount of visible amplitude, while the body- and neutral-located signs are the ones that show a significant boost. This is due less to the base location of the signs, we believe, and more to what the two hands happen to be doing within each major location. Specifically, most 2HU head-located signs involve only one hand coming up to the head, and often then moving down to the non-dominant hand (e.g., STAMP), which makes them similar to other 1H signs and even 2HU hand-located signs in terms of visible amplitude. The other 2HU locations (neutral, body), on the other hand, often involve both hands moving up into the specified signing space (e.g., WORLD, BLINDS_1), making them more like 2HB signs in terms of visible amplitude. Thus, it is not just the location that is contributing to the amount of visible amplitude but the combination of location and use of the non-dominant hand.

It is important to stress that these differences in visible amplitude on the basis of location are not necessarily about the location itself per se, but rather about the amount of movement that it takes to reach that location, i.e., the transitional
movement, which in the case of ASL-Lex is always from and to a resting position with the hands on the signer’s lap. Again, this point can be illustrated by using a small subset of the ASL-Lex signs; this time, the signs were matched on as many parameters as possible except location, and then subjected to a visible amplitude analysis of just their stroke movement, excluding the transitional frames. All of the selected signs were 1H signs with back-and-forth major movement and no minor movement, and had exactly two repetitions of the major movement. The key difference was that eight of the signs were produced at the head, specifically in the cheek/nose area, while seven of the signs were produced in neutral space. They did also differ in terms of handshape, i.e., the number and degree of finger extension/flexion, but as discussed in §2, handshape is less likely to have a direct impact on visible amplitude. Visible amplitude was calculated only for the stroke of the sign, as defined in §2.2; on average, the stroke portion was approximately 36% of the entire sign video (though it ranged between 25% and 55%24). Fig. 12 shows the results of this calculation.

As can be seen, there is no difference in visible amplitude for the two locations in this subset [t(11.239) = 0.34, p = 0.74]. This is not surprising, as the motions within the stroke should indeed have been essentially identical. Thus, the claim above that location affects the visible amplitude of a sign should be interpreted as meaning that transitioning to or from a particular location (from the default rest position in this case) affects the visible amplitude. The results for major movement and major location are thus somewhat “opposite,” though driven by the same force: the type of major movement may affect visible amplitude, but is easily overshadowed by the existence of transitional movement, while major location affects visible amplitude only insofar as a sign uses different transitional movements to achieve a particular location.
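The stroke-only analyses in this section amount to windowing the velocity magnitude time series before taking the RMS. A minimal sketch, assuming stroke boundaries are available as frame indices from manual annotation (following the stroke definition in §2.2):

```python
# Stroke-only visible amplitude: RMS over just the annotated stroke frames,
# excluding transitional movement. Boundary indices are assumed to come from
# manual annotation of each video.
import numpy as np

def stroke_amplitude(magnitudes, stroke_start: int, stroke_end: int) -> float:
    """RMS of the per-frame velocity magnitudes over [stroke_start, stroke_end)."""
    v = np.asarray(magnitudes[stroke_start:stroke_end])
    return float(np.sqrt(np.mean(v ** 2)))
```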
24 Note that the stroke accounts for a higher proportion of the sign in this subset than it did in the subset used to test the effect of major movement. This is likely due to the fact that, simply because of trying to maximize the available signs in the subset, the subset for testing movement contained signs with only one repetition of the major movement, while in this subset, all signs had two repetitions of the major movement.
Fig. 12. Visible amplitude in a subset of signs that are matched as 1H signs produced with back-and-forth major movement, no minor movement, and exactly two repetitions of the major movement. Visible amplitude was measured only during the stroke of the sign, i.e., excluding any transitional movements.
Finally, while three-way interactions among hand quantity, location, and movement type as they relate to visible amplitude could be explored, the lack of an effect of movement type on visible amplitude when the full recording is included (see Fig. 9 above) means that no further information is to be gained through such an exploration. Furthermore, because of the limited size of the dataset, many of the cells in the three-way cross-categorization are sparsely populated or even empty, making any inferences from such an examination very limited; this also prohibits any exploration of a three-way interaction in any kind of matched subset of the data. We believe that the key results are adequately represented by the above figures: if the entire sign is considered, transitional movement plays the largest role in determining visible amplitude, as reflected by the effects of location on visible amplitude, while during the stroke of the sign, major movement has an effect.

7. Discussion
In this paper, we have proposed a novel measure of a sign’s phonetic characteristics, visible amplitude, that is directly affected by a sign’s movements and hand quantity, which each affect how much motion there is in a sign. We believe that this measure will be an integral tool for allowing researchers to measure signal-based properties of the relative prominence of a sign, as it is easily extracted from video recordings and can provide an analogue to the various acoustic measures used to quantify prominence in spoken languages. As mentioned in §1, this measure is computationally analogous to acoustic amplitude, but it is perhaps conceptually more similar to the notion of articulatory displacement as discussed e.g. in Smith et al. (in press, this Special Issue), in that it is a measure of the amount of movement during the articulation of a word. This has immediate visible consequences in a signed language, while in spoken languages, greater articulatory displacement affects the acoustic signal, such that greater tongue displacement could be realized as a difference in, e.g., formant values.
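In equation form, and consistent with the computation described in §4, the visible amplitude of a sign whose video yields frame-to-frame velocity magnitudes $v[1], \ldots, v[N]$ is the root mean square of that series:

\[
A_{\mathrm{vis}} = \sqrt{\frac{1}{N}\sum_{n=1}^{N} v[n]^{2}}
\]

which is computationally parallel to RMS acoustic amplitude, with pixel velocity magnitude standing in for the pressure waveform.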
We applied the visible amplitude measure to the ASL-Lex database of American Sign Language (Caselli et al., 2017). This database consists of sign lexemes signed in isolation by a single signer, which provided us with data for a lexeme-level analysis of sign-internal visible amplitude. The results give us a number of insights into how a sign’s sub-lexical characteristics affect its visible amplitude. In particular, we see evidence that hand quantity, transitional movements, and type of major movement each contribute to visible amplitude.

First, we found that visible amplitude tends to increase when the number of hands involved in a sign increases, such that two-handed signs have more visible amplitude than one-handed signs. They do not have twice as much amplitude, however, presumably because there are movements of the body and head that also contribute to visible amplitude in both types of signs. Likewise, we found that there is a significant, though small, difference between the visible amplitude of balanced versus unbalanced two-handed signs, with the unbalanced signs having slightly less visible amplitude. While it may be intuitive that balanced signs would generate more visible amplitude than unbalanced signs, since both hands are active articulators, it is less intuitive that the difference is so small. The explanation comes, we believe, from the contribution of the transitional movement of the non-dominant hand: though not active during the stroke execution, the non-dominant hand does add visible amplitude in the pre- and post-stroke phases.

Second, we found that visible amplitude is also affected by a sign’s location. Signs higher in the signing space tend to have greater visible amplitude, but this effect seems to be an artifact of both the transitional movements used to achieve particular locations and other factors such as the specific use of each hand within a sign. There is no evidence that two signs produced with identical internal movement at two different locations have any difference in visible amplitude, which is indeed to be expected. The slight increase in visible amplitude of 2HB signs as compared to 2HU signs may then be due to the fact that more 2HB signs were signed in higher locations than 2HU signs
(31% vs. 10%), thus benefiting both from the visible amplitude generated by major movement of the hands and from that generated by transitional movement due to location. This seems especially probable since the signs that generated the most visible amplitude in the study overall were 2HB signs articulated at the head. Third, we found that major movement type may affect visible amplitude, but this effect can be seen in this database only if the effects of transitional and minor movements are removed from the sign. That is, major movements may not be large enough contributors to visible amplitude to be noticeable independently of transitional movements, at least in a database of individual words in isolation.

These results have implications for using visible amplitude as a signal-based measure of prominence. In general, visible amplitude does provide a way of quantifying the amount of movement in a sign, and we hypothesize, based on the qualitative reports discussed in §3, that one way of making a sign more prominent is to increase its visible amplitude. However, caution should be used, especially in the context of continuous signing as compared to signs in isolation. Transitional movements clearly play a role in the determination of visible amplitude, and this effect must be taken into account, regardless of whether one adopts the view that transitional movements are integral components of individual signs (see §2.2 above). For example, if transitional movement is an important characteristic to signers, we might expect to see visible amplitude being increased under prominence via manipulations of either the exact starting and ending locations of signs (cf. van der Kooij & Crasborn, 2013) or the syntactic order of signs, to juxtapose signs with more distant locations. On the other hand, if signers concentrate their efforts in the strokes of signs, we might expect to see that visible amplitude is increased under prominence by using both hands where only one is required (cf. Emmorey, 2001; Crasborn, 2001), increasing the size of the major movement (e.g., through proximalization, cf. Crasborn, 2001, or through changing the length of the movement path, cf. Friedman, 1974), or increasing the number of repetitions of the movement (cf. Sandler, 1999).

In all of these cases, it will be important to use visible amplitude as a relative measure, e.g., in the comparison of one sign to another with a different prominence profile. For example, one could compare the visible amplitude of the sign SHIRT in (2a) vs. (2b), examples that are adapted from Wilbur (1999: 238). In (2a), SHIRT is the prominent item, while in (2b), HARRY is the prominent item. As discussed in Wilbur (1999), one of the primary ways that stress is indicated in ASL is through the use of syntactic order: the prominent item is placed sentence-finally. Given the known interactions between structural and signal-based aspects of prominence in spoken languages, however, we predict that prominence is marked not just by word order but also by a physical increase of visible amplitude during the production of the sign. Given that the word order varies, the transitional movement between the preceding word and the target word may be manipulable; in this case, the sign for WHO ends at the chin, while the sign for BIRTHDAY ends at the chest; the sign for SHIRT is produced at the shoulder (see Fig. 13 for examples). One could imagine that the distance between the two signs might be increased for
emphasis, and/or that the sign for SHIRT could itself be made bigger by using both hands or by increasing the movement in the sign itself.

(2) a. ELLEN GIVE HARRY WHAT FOR BIRTHDAY, SHIRT
       ‘What Ellen gave Harry for his birthday was a shirt.’
    b. ELLEN GIVE WHO SHIRT FOR BIRTHDAY, HARRY
       ‘The person who Ellen gave a shirt for his birthday was Harry.’
We also examined the overall distribution of the different types of signs in the database. In part, this was simply informational, to establish the frequency of each different sign type. But there is also an interesting long-standing claim in the sign language literature that may be illuminated by considering the distribution of various lexical characteristics and how they relate to visible amplitude. Specifically, Siple (1978) speculated that increased perceptibility of one sub-lexical aspect of a sign would be associated with lesser perceptibility of another sub-lexical aspect, and that the lexical distribution of the formation of certain signs can be explained by this trade-off. Signers tend to look each other in the face rather than focusing on the hands during signed communication (see Muir & Richardson, 2005 for British Sign Language; Emmorey, Thompson, & Colvin, 2009 for American Sign Language), and so the region of highest visual acuity is near the signer’s face. Siple (1978: 102) conjectured that there should be a trade-off such that signs that are articulated in regions of lesser visual acuity would need a boost in their perceptibility that might, for example, come from an increase in movement during the sign. Additionally, Siple predicted that signs made in peripheral, low-acuity areas would be two-handed and symmetrical, rather than one-handed (p. 103).

These claims lend themselves well to exploration through the use of visible amplitude, if it is in fact the case that increased visible amplitude is associated with increased perceptibility. If Siple is right, then we should see that signs that are lower in visible amplitude (and hence less perceptible) should be signed higher in the signing space, closer to the region of highest visual acuity (and hence in a more perceptible location), while signs that are higher in visible amplitude should be signed lower in the signing space.

The results in §6 indicate that the biggest predictors of visible amplitude are the use of one vs. two hands and the transitional movement used to attain a particular location in the signing space. And, as was seen in §6.1, location but not major movement path shows a different lexical distributional pattern across hand quantity values for signs. Recall from Fig. 5 that back-and-forth movement was the most common movement path type for 1H, 2HB, and 2HU signs, and that the other movement paths were distributed roughly the same across the three hand quantity types. In contrast, Fig. 6 showed that locations are distributed quite differently across the three hand quantity types; the majority of 1H signs are produced at the head, while the head is a very uncommon location for 2HB and 2HU signs (see also similar results discussed in Johnston & Schembri, 2007: 103 for Australian Sign Language). Thus, signs that involve only one hand are produced in higher regions of the signing space than those that involve two hands, and we suggest that this trade-off
Fig. 13. Top: Ending hand positions for WHO and BIRTHDAY; Bottom: Starting hand position for SHIRT.

is done to increase their perceptibility. Note that prominence in this case is being affected by both visible amplitude (hand quantity and possible transitional movements to various locations) and visual acuity (location does directly affect the perceptibility of a sign). In addition to being a longstanding claim within the sign language literature, we note that the concept of the trade-off investigated here has parallels in the spoken language literature as well. Languages, regardless of modality, are message-transmission systems and are thus subject to the same kinds of considerations that govern all such systems (see Hockett, 1955; Hall, Jaeger, Hume, & Wedel, 2018; Lindblom, 1990; Pierce, 1961; Pate & Goldwater, 2015; Shannon & Weaver, 1949). It has been argued that one consequence of this fact, and particularly the need to balance considerations of communicative effectiveness with resource cost management, is a tendency for information in a signal to be presented at a uniform rate (e.g., the ‘smooth signal’ redundancy hypothesis of Aylett & Turk, 2004; the ‘uniform information density’ hypothesis of Levy & Jaeger, 2007). Aylett and Turk (2004), for example, show that the predictability of a unit
and the unit’s degree of prosodic prominence work together to regulate duration in an efficient manner. That is, less predictable units tend to be lengthened, as do prosodically prominent units, and rather than having lengthening throughout the signal from both sources, unpredictable units tend to be put into stressed positions such that the cost of lengthening does double duty. In the current case, one could imagine a similar communication-based explanation for the trade-off in “sources” of prominence. For example, perhaps there is some threshold of perceptibility that any sign needs to achieve, to which the different characteristics of hand quantity, location, and movement can contribute. In order to be resource-cost effective (and to ensure sufficient contrast at the lexical level), however, one would not need all three characteristics to contribute maximally to all signs. Thus, a higher contribution to prominence by one characteristic could be accompanied by a lower contribution by another, as seen here. Whether such a threshold actually exists and what it might be, however, will need to be investigated in future studies. Several further issues need to be discussed in relation to visible amplitude, however. One is the fact that the hands
are not the only moving objects, and yet our measure captures all movement. That is, other non-manual movements produced by the signer (torso movements, facial expressions, eye blinks, and so on) are all measured by Optical Flow Analysis. Though there are ways of removing some of this ‘background noise’ from the measure (e.g., by defining regions of interest that exclude the head, etc.), we opted not to do so in this initial investigation for two reasons. First, the overall contribution of non-manuals does not appear to be large, at least for this particular corpus (the signer is relatively still, with a composed face, and her movements are very controlled). Additionally, it may be the case that the non-manual contribution to visible amplitude is in fact an essential part of the sign and is thus important for normal sign perception.

In fact, there is some evidence that the torso and head show systematic motions during the normal production of signs, as well as evidence that these motions contribute to sign prominence. For example, Sanders and Napoli (2017) discuss “reactive effort,” which they define as the effort of resisting incidental torso movement that has been induced by movement of the manual articulators. They suggest that sign lexicons are shaped by the drive to minimize reactive effort, and thus that signs that induce twisting of the torso and so require more reactive effort are less frequent than signs that induce rocking of the torso and require less reactive effort. Even in signs requiring minimal reactive effort, however, some torso movement is nevertheless involved, and, to the best of our knowledge, no one has yet investigated the question of how such non-manual movements contribute to sign comprehension. We do know, however, that such movements contribute to sign prominence, coarticulation, and other related phenomena. Wilbur and Patschke (1998), for example, show that stressed signs in ASL can be marked by the body leaning forward, and body leans can also play a role in contrastive focus, inclusion and exclusion, and affirmation. In Swiss German Sign Language, sideward movements of the torso can mark prosodic and discourse units (Boyes Braem, 1999). The head plays a role in topic marking (see Kimmelman, 2015 and references therein) and also can move forward to reduce effort in the production of head-anchored signs (Mauk & Tyrone, 2012). And the face, which is highly perceptible because the eye gaze of the addressee is fixated on it, also contributes relevant articulatory information. Most intriguing perhaps is the phenomenon called echo phonology (Woll, 2008 on British Sign Language), where mouth actions mimic or “echo” certain articulatory actions of the hands. All of these actions no doubt increase the redundancy of the sign, making it more perceptible, and will also increase its visible amplitude. The role of these and many other phenomena related to the behavior of non-manual and passive articulators is a rich field for future research.

A second issue, introduced in §1, is that the measure of visible amplitude is a property of the linguistic signal and as such, is only an approximation of any measure of prominence. In particular, we believe that prominence as a whole is best understood as a perceptual phenomenon, i.e., that certain characteristics of signed or spoken language are perceived to be more prominent than others, rather than being more or less prominent only within the linguistic signal, regardless of modality. As illustrated in Cole et al.
(2010), Bishop (2012), Turnbull et al. (2017), and Baumann and Winter (2018), this means that non-signal-based characteristics, such as the
discourse context and a listener’s expectations, can affect prominence; see also Cole et al. (2019) and Wagner and McAuliffe (in press, this Special Issue). The measure proposed here, visible amplitude, is however simply one of many properties of the (multimodal) signal, and thus while it may influence the perception of prominence, it is neither a direct nor a complete measure of prominence, and should not be interpreted as such. In addition to such contextual considerations, there are other factors, in the visual domain in particular, that may affect perceived prominence. For example, the facts discussed above that signers tend to direct their gaze toward the face (Emmorey et al., 2009; Muir & Richardson, 2005), and that visual acuity diminishes with distance from the focused area (Mandelbaum & Sloan, 1947), both affect the perceived prominence of different signs, but are not captured by a measure of visible amplitude. What is crucially needed is an understanding of how visible amplitude affects the perceptibility of signs. We hypothesize that higher visible amplitude leads to greater perceptibility, but this is an empirical question to be investigated in future studies.

A third issue that undoubtedly plays a role in sign formation and may interact with visible amplitude is iconicity, or the relationship of resemblance between a sign’s form and its meaning. Because of the affordances of the visual-spatial modality and the flexibility of the hands to move in space and to form intricate handshapes, iconicity is pervasive at all levels of linguistic organization in sign languages (Meir & Tkachman, 2018; Pietrandrea & Russo, 2007), but it is perhaps most obvious at the lexical level. For example, Fig. 14 illustrates two iconic signs in ASL, CAT and HORSE. Each specifies a salient feature of its referent (whiskers for CAT, ears for HORSE) in its appropriate location (the face for whiskers, the head for ears), with the movement either tracing the feature (CAT) or representing the prototypical movement of the feature (flopping ears for HORSE, which is represented by a folding down of the fingers). Such forms are usually coined in order to be easily recognized, and not with considerations of ease of production.

The ASL-Lex database also codes signs for their iconicity on a scale of 1 (not iconic) to 7 (very iconic), based on the judgments of hearing non-signers. Out of the 691 signs analyzed, 191 were judged to be of relatively high iconicity in that they were rated as 4 or higher (the mean score for all signs was 3.09). We did some preliminary testing of the possible effect of rated iconicity on visible amplitude, and found no significant effect of iconicity on visible amplitude for any of the three hand quantity types (relatedly, the correlation between visible amplitude and iconicity was close to zero for all three). This is not to say that visible amplitude and iconicity do not interact at all, and some observations on historical change in ASL may suggest avenues for future research. Frishberg (1975) describes changes in signs from the beginning of the 20th century to the 1970s, and many of those changes resemble ones that Siple (1978) hypothesized to be preferred due to ease of visual perception. For example, one-handed signs produced below the neck tended to become two-handed (“body displacement”; note that this should increase visible amplitude), and two-handed signs made with contact to the face/head became one-handed (“head displacement”; note that this should decrease visible amplitude). In fact, the
Fig. 14. Iconic ASL signs for CAT (left image) and HORSE (right image).

signs in Fig. 14 illustrate this latter tendency. Whiskers grow on both sides of a cat’s face, and horses have two ears, yet both of the signs are one-handed. It seems that, while considerations of iconic recognition may motivate sign forms without regard to production and perception consequences, so too may production and perception considerations motivate sign forms at the cost of decreasing iconicity. The details of this mutual influence, however, are a subject for future research.

A fourth factor that was not investigated here, but which will affect future studies of the role of visible amplitude in continuous signing, is coarticulation. In continuous signing, signs are not articulated the way they are in isolation, which means their visible amplitude will vary depending on their environments. Such differences are not available to be investigated in the ASL-Lex database, given that it consists of signs in isolation. That said, there are a few studies on sign coarticulation that can provide clues as to how it happens and which kinds of signs are affected more or less than others. Studies on American (Lucas, Bayley, Rose, & Wulf, 2002), Australian, and New Zealand Sign Languages (Auslan and NZSL; Schembri et al., 2009) report that signs articulated at the forehead are subject to lowering if the preceding sign is articulated at a lower location, or if the following sign is body-anchored. Russell, Wilkinson, and Janzen (2011) confirmed this effect for signs located at the forehead or eye level, but not for signs lower on the face, which, they argue, demonstrates that location coarticulation is not simply a matter of lowering, and that coarticulatory effects should be investigated not with a single variable but with a combination of location, contact, and movement type. Signs in high locations are not the only ones affected by coarticulation; signs articulated in neutral space can raise if the preceding or following sign is articulated in a high location, and the effect is stronger in fast signing (see Mauk, 2003; Mauk, Lindblom, & Meier, 2008 for ASL; Ormel, Crasborn, & van der Kooij, 2013 for NGT). Interestingly, body-anchored signs are not affected by coarticulatory effects, but may influence the location of following or preceding signs (Mauk, 2003). In NGT, even two-handed unbalanced signs (usually treated as specified for location) can raise due to coarticulation, since only the location of the dominant hand is specified, whereas the non-dominant hand is free to raise (Ormel
et al., 2013). And studies that compared signers of different age groups (Lucas et al., 2002; Schembri et al., 2009) report that those effects are stronger for younger signers, which both groups of authors suggest reflects a change in progress. This change resembles the historical changes in ASL reported by Frishberg (1975). All of this suggests that there are pressures (not necessarily articulatory in nature; see Russell et al., 2011) that drive change in some signs but not in others. We tentatively hypothesize that at least one of the driving forces may be a consideration of the visible amplitude contained in the signs, but this hypothesis is subject to future investigation.

As mentioned above, then, the next step would be to apply the measure of visible amplitude to syntagmatic data (e.g., continuous signing), to see whether particular signs or phrases are measurably more prominent than their neighbours, in a way analogous to studies of prosodic prominence in spoken languages. Likewise, for sign languages with a longer history of documentation, it would be useful to compare the visible amplitude of earlier signs to their contemporary counterparts.

8. Conclusion
The measure of visible amplitude has been introduced as a way of quantifying the amount of movement in signs, as a step toward a consistent, objective, easily obtainable phonetic measure of prominence. It has been shown that hand quantity and the transitional movement associated with different major locations each directly affect visible amplitude, and that these effects may help explain observations about lexical distribution. Major movement may also affect visible amplitude, with longer movements having greater visible amplitude than shorter movements, but this effect can be seen only in the absence of interference from transitional and minor movements. Now that these baseline effects have been established, future studies can focus on understanding how other types of movement may affect visible amplitude, how visible amplitude can be used as a measure of prominence in continuous signing, how visible amplitude is related to perceptibility, and how other characteristics of signing, such as iconicity and coarticulation, affect visible amplitude.
Acknowledgements

The authors gratefully acknowledge the feedback of the UBC Language and Learning Lab, the Vancouver Phonology Group, the audience at LabPhon 16, and our editors and reviewers. We are also grateful for financial support for this project from the Social Sciences and Humanities Research Council of Canada (Grant #430-2016-01218).
Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.wocn.2019.100935.

References

Ambrazaitis, G., & House, D. (2017). Multimodal prominences: Exploring the patterning and usage of focal pitch accents, head beats, and eyebrow beats in Swedish television news readings. Speech Communication, 95, 100–113.
Aylett, M., & Turk, A. (2004). The smooth signal redundancy hypothesis: A functional explanation for relationships between redundancy, prosodic prominence, and duration in spontaneous speech. Language and Speech, 47(1), 31–56.
Baker, C., & Cokely, D. (1980). American Sign Language: A teacher's resource text on grammar and culture. Silver Spring, MD: TJ Publishers.
Baker-Shenk, C. (1983). A microanalysis of the nonmanual components of questions in American Sign Language (Doctoral dissertation). Berkeley, CA: UC Berkeley.
Barbosa, A. V., Yehia, H. C., & Vatikiotis-Bateson, E. (2007). Temporal characterization of auditory-visual coupling in speech. Proceedings of Meetings on Acoustics. https://doi.org/10.1121/1.2920162.
Barbosa, A. V., Yehia, H. C., & Vatikiotis-Bateson, E. (2008). Linguistically valid movement behavior measured non-invasively. In AVSP (pp. 173–177).
Barbosa, A. V. (2013). FlowAnalyzer [Computer program]. Available online: https://www.cefala.org/~adriano/optical_flow/.
Battison, R. (1974). Phonological deletion in American Sign Language. Sign Language Studies, 5, 1–19.
Baumann, S., & Winter, B. (2018). What makes a word prominent? Predicting untrained German listeners' perceptual judgments. Journal of Phonetics, 70, 20–38.
Beckman, M. E., & Venditti, J. J. (2010). Tone and intonation. In W. J. Hardcastle, J. Laver, & F. E. Gibbon (Eds.), The handbook of phonetic sciences (2nd ed., pp. 603–652). Somerset, NJ: Wiley.
Bishop, J. (2012). Information structural expectations in the perception of prosodic prominence. In G. Elordieta & P. Prieto (Eds.), Prosody and meaning (pp. 239–270). Berlin: Mouton de Gruyter.
Blondel, M., & Miller, C. (2001). Movement and rhythm in nursery rhymes in LSF. Sign Language Studies, 2(1), 24–61.
Borràs-Comes, J., Vanrell, M. M., & Prieto, P. (2014). The role of pitch range in establishing intonational contrasts. Journal of the International Phonetic Association, 44(1), 1–20.
Boyes Braem, P. (1999). Rhythmic temporal patterns in the signing of deaf early and late learners of Swiss German Sign Language. Language and Speech, 42(2–3), 177–208.
Brentari, D. (1998). A prosodic model of sign language phonology. Cambridge, MA: MIT Press.
Brentari, D., & Crossley, L. (2004). Prosody on the hands and face: Evidence from American Sign Language. Sign Language & Linguistics, 5(2), 105–130.
Bruce, V., & Green, P. (1990). Visual perception: Physiology, psychology and ecology (2nd ed.). London/Hillsdale, NJ: Lawrence Erlbaum Associates.
Caselli, N. K., Sehyr, Z. S., Cohen-Goldberg, A. M., & Emmorey, K. (2017). ASL-LEX: A lexical database of American Sign Language. Behavior Research Methods, 49(2), 784–801.
Clark, L. E., & Grosjean, F. (1982). Sign recognition processes in American Sign Language: The effect of context. Language and Speech, 25(4), 325–340.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. New York: Lawrence Erlbaum.
Cole, J., Hualde, J. I., Smith, C. L., Eager, C., Mahrt, T., & de Souza, R. N. (2019). Sound, structure, and meaning: The bases of prominence ratings in English, French, and Spanish. Journal of Phonetics, 75, 113–147.
Cole, J., Mo, Y., & Baek, S. (2010).
The role of syntactic structure in guiding prosody perception with ordinary listeners and everyday speech. Language and Cognitive Processes, 25, 1141–1177.
Corina, D. (1990). Reassessing the role of sonority in syllable structure: Evidence from a visual-gestural language. CLS, 26(2), 31–44.
Coulter, G. R. (1990). Emphatic stress in ASL. In S. D. Fischer & P. Siple (Eds.), Theoretical issues in sign language research, Vol. 1: Linguistics (pp. 109–125). Chicago, IL: University of Chicago Press.
Coulter, G. R. (1993). Phrase-level prosody in ASL: Final lengthening and phrasal contours. In Current issues in ASL phonology (pp. 263–272). Academic Press.
Crasborn, O. (2001). Phonetic implementation of phonological categories in Sign Language of the Netherlands (Doctoral dissertation). Universiteit Leiden.
Crasborn, O. (2011). The other hand in sign language phonology. In M. van Oostendorp, C. J. Ewen, E. Hume, & K. Rice (Eds.), The Blackwell companion to phonology (Vol. I, pp. 223–240). Oxford: Wiley-Blackwell.
Crasborn, O. A., & Zwitserlood, I. E. P. (2008). The Corpus NGT: An online corpus for professionals and laymen.
Dachkovsky, S., & Sandler, W. (2009). Visual intonation in the prosody of a sign language. Language and Speech, 52(2–3), 287–314.
Dachkovsky, S. (2008). Signs of the time: Selected papers from TISLR 2004.
de Ruiter, J. P. (1998). Gesture and speech production (Doctoral dissertation). Radboud University Nijmegen.
Diehl, R. L. (2008). Acoustic and auditory phonetics: The adaptive design of speech sound systems. Philosophical Transactions of the Royal Society B: Biological Sciences, 363(1493), 965–978.
Emmorey, K. (2001). Language, cognition, and the brain: Insights from sign language research. Psychology Press.
Emmorey, K., & Corina, D. (1990). Lexical recognition in sign language: Effects of phonetic structure and morphology. Perceptual and Motor Skills, 71(3_suppl), 1227–1252.
Emmorey, K., Thompson, R., & Colvin, R. (2009). Eye gaze during comprehension of American Sign Language by native and beginning signers. Journal of Deaf Studies and Deaf Education, 14(2), 237–243.
Engberg-Pedersen, E. (1990). Pragmatics of nonmanual behaviour in Danish Sign Language. Sign Language Research, 87, 121–128.
Flanagan, J. R., & Johansson, R. S. (2002). Hand movements. Encyclopedia of the Human Brain, 2, 399–414.
Fleet, D., & Weiss, Y. (2006). Optical flow estimation. In Handbook of mathematical models in computer vision (pp. 237–257). Boston, MA: Springer.
Friedman, L. (1974). On the physical manifestation of stress in the American Sign Language. Unpublished manuscript. Berkeley: University of California.
Frishberg, N. (1975). Arbitrariness and iconicity: Historical change in American Sign Language. Language, 696–719.
Fung, H. S. H., & Mok, P. P. K. (2018). Temporal coordination between focus prosody and pointing gestures in Cantonese. Journal of Phonetics, 71, 113–125.
Geraci, C. (2011). Epenthesis in Italian Sign Language. Sign Language & Linguistics, 12(1), 3–51.
Gibson, J. J. (1966). The senses considered as perceptual systems. Boston: Houghton Mifflin.
Gordon, M. (2011). Stress: Phonotactic and phonetic evidence. In M. van Oostendorp, C. J. Ewen, E. Hume, & K. Rice (Eds.), The Blackwell companion to phonology (Vol. II, pp. 924–948). Oxford: Wiley-Blackwell.
Grosjean, F. (1979). The production of sign language: Psycholinguistic perspectives. Sign Language Studies, 25(1), 317–329.
Grosjean, F. (1981). Sign & word recognition: A first comparison. Sign Language Studies, 195–220.
Hall, K. C., Jaeger, T. F., Hume, E., & Wedel, A. (2018). The role of predictability in shaping phonological patterns. Linguistics Vanguard, 4, 1–15.
Hall, K. C., Letawsky, V., Turner, A., Allen, C., & McMullin, K. (2015). Effects of predictability of distribution on within-language perception. In Proceedings of the 2015 annual conference of the Canadian Linguistic Association (pp. 1–14).
Hall, K. C., Smith, H., McMullin, K., Allen, B., & Yamane, N. (2017). Using optical flow analysis on ultrasound of the tongue to examine phonological relationships. Canadian Acoustics, 45(1), 15–24.
Hall, K. C., Tkachman, O., & Aonuki, Y. (2019). Lexical competition and articulatory enhancement in American Sign Language. Paper presented at the Annual Meeting of the Canadian Linguistics Association, Vancouver, BC.
Himmelmann, N. P., & Primus, B. (2015). Prominence beyond prosody: A first approximation. In pS-prominenceS: Prominences in linguistics (pp. 38–58). DISUCOM Press, University of Tuscia, Viterbo.
Hockett, C. F. (1955). A manual of phonology. International Journal of American Linguistics, 21(4).
Horn, B. K., & Schunck, B. G. (1981). Determining optical flow. Artificial Intelligence, 17(1–3), 185–203.
Hosemann, J., Herrmann, A., Steinbach, M., Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2013). Lexical prediction via forward models: N400 evidence from German Sign Language. Neuropsychologia, 51(11), 2224–2237.
Jantunen, T. (2013). Signs and transitions: Do they differ phonetically and does it matter? Sign Language Studies, 13(2), 211–237.
Johnston, T. (2009). Creating a corpus of Auslan within an Australian National Corpus. In Selected Proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus (pp. 87–95). Somerville, MA: Cascadilla Proceedings Project.
Johnston, T., & Schembri, A. C. (1999). On defining lexeme in a signed language. Sign Language & Linguistics, 2(2), 115–185.
Johnston, T., & Schembri, A. (2007). Australian Sign Language (Auslan): An introduction to sign language linguistics. Cambridge University Press.
Kendon, A. (1972). Some relationships between body motion and speech. Studies in Dyadic Communication, 7, 177.
Kendon, A. (2000). Language and gesture: Unity or duality. Language and Gesture, 2.
Kimmelman, V. (2015). Topics and topic prominence in two sign languages. Journal of Pragmatics, 87, 156–170.
Krebs, J., Wilbur, R. B., Alday, P. M., & Roehm, D. (2018). The impact of transitional movements and non-manual markings on the disambiguation of locally ambiguous argument structures in Austrian Sign Language (ÖGS). Language and Speech, 0023830918801399.
Lathi, B. P., & Green, R. A. (2014). Essentials of digital signal processing. Cambridge University Press.
Levy, R. P., & Jaeger, T. F. (2007). Speakers optimize information density through syntactic reduction. In Advances in neural information processing systems (pp. 849–856).
Liddell, S. K. (1980). American Sign Language syntax (Vol. 52). The Hague: Mouton.
Liddell, S. K., & Johnson, R. E. (1989). American Sign Language: The phonological base. Sign Language Studies, 64(1), 195–277.
Lindblom, B. (1990). Explaining phonetic variation: A sketch of the H&H theory. In Speech production and speech modelling (pp. 403–439). Dordrecht: Springer.
Loehr, D. P. (2007). Aspects of rhythm in gesture and speech. Gesture, 7, 179–214.
Lucas, C., Bayley, R., Rose, M., & Wulf, A. (2002). Location variation in American Sign Language. Sign Language Studies, 2(4), 407–440.
Mandelbaum, J., & Sloan, L. L. (1947). Peripheral visual acuity: With special reference to scotopic illumination. American Journal of Ophthalmology, 30(5), 581–588.
Mauk, C. E. (1999). The interaction of sign size with phonological form in American Sign Language. Austin: University of Texas.
Mauk, C. E. (2003). Undershoot in two modalities: Evidence from fast speech and fast signing (Doctoral dissertation). University of Texas at Austin.
Mauk, C. E., Lindblom, B., & Meier, R. P. (2008). Undershoot of ASL locations in fast signing. In Signs of the time: Selected papers from TISLR 8 (pp. 3–24).
Mauk, C. E., & Tyrone, M. E. (2012). Location in ASL: Insights from phonetic variation. Sign Language & Linguistics, 15(1), 128–146.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. University of Chicago Press.
McNeill, D. (2000). Language and gesture (Vol. 2). Cambridge University Press.
Meir, I., & Tkachman, O. (2018). Iconicity. In M. Aronoff (Ed.), The Oxford research encyclopedia of linguistics. New York: Oxford University Press.
Miller, C. (1996). Phonologie de la langue des signes québécoise: Structure simultanée et axe temporel (Doctoral dissertation). Montréal, Québec: Université du Québec à Montréal.
Moisik, S. R., Lin, H., & Esling, J. H. (2014). A study of laryngeal gestures in Mandarin citation tones using simultaneous laryngoscopy and laryngeal ultrasound (SLLUS). Journal of the International Phonetic Association, 44(1), 21–58.
Muir, L. J., & Richardson, I. E. (2005). Perception of sign language and its application to visual communications for deaf people. Journal of Deaf Studies and Deaf Education, 10(4), 390–401.
Munhall, K. G., Jones, J. A., Callan, D. E., Kuratate, T., & Vatikiotis-Bateson, E. (2004). Visual prosody and speech intelligibility: Head movement improves auditory speech perception. Psychological Science, 15(2), 133–137.
Nespor, M., & Sandler, W. (1999). Prosody in Israeli Sign Language. Language and Speech, 42(2–3), 143–176.
Orfanidou, E., McQueen, J. M., Adam, R., & Morgan, G. (2015). Segmentation of British Sign Language (BSL): Mind the gap! The Quarterly Journal of Experimental Psychology, 68(4), 641–663.
Ormel, E., & Crasborn, O. (2012). Prosodic correlates of sentences in signed languages: A literature review and suggestions for new types of studies. Sign Language Studies, 12(2), 279–315.
Ormel, E., Crasborn, O., & van der Kooij, E. (2013). Coarticulation of hand height in Sign Language of the Netherlands is affected by contact type. Journal of Phonetics, 41(3–4), 156–171.
Padden, C. A., & Perlmutter, D. M. (1987). American Sign Language and the architecture of phonological theory. Natural Language & Linguistic Theory, 5(3), 335–375.
Parker, S. (2011). Sonority. In M. van Oostendorp, C. J. Ewen, E. Hume, & K. Rice (Eds.), The Blackwell companion to phonology (Vol. II, pp. 1160–1184). Oxford: Wiley-Blackwell.
Pate, J. K., & Goldwater, S. (2015). Talkers account for listener and channel characteristics to communicate efficiently. Journal of Memory and Language, 78, 1–17.
Perlmutter, D. M. (1990). On the segmental representation of transitional and bidirectional movements in ASL phonology. In S. D. Fischer & P. Siple (Eds.), Theoretical issues in sign language research, Vol. 1: Linguistics (pp. 67–80).
Perlmutter, D. M. (1993). Sonority and syllable structure in American Sign Language. In Current issues in ASL phonology (pp. 227–261).
Pfau, R., & Quer, J. (2010). Nonmanuals: Their grammatical and prosodic roles. In D. Brentari (Ed.), Sign languages (Cambridge Language Surveys) (pp. 381–402). Cambridge University Press.
Pierce, J. R. (1961). An introduction to information theory: Symbols, signals, and noise (1980 ed.). New York, NY: Dover Publications.
Pietrandrea, P., & Russo, T. (2007). Diagrammatic and imagic hypoicons in signed and verbal languages. Empirical Approaches to Language Typology, 36, 35.
Prieto, P., Puglesi, C., Borràs-Comes, J., Arroyo, E., & Blat, J. (2015). Exploring the contribution of prosody and gesture to the perception of focus using an animated agent. Journal of Phonetics, 49(1), 41–54.
Rochet-Capellan, A., Laboissière, R., Galván, A., & Schwartz, J. L. (2008). The speech focus position effect on jaw–finger coordination in a pointing task. Journal of Speech, Language, and Hearing Research.
Roustan, B., & Dohen, M. (2010). Co-production of contrastive prosodic focus and manual gestures: Temporal coordination and effects on the acoustic and articulatory correlates of focus. In Speech Prosody 2010, Fifth International Conference.
Russell, K., Wilkinson, E., & Janzen, T. (2011). ASL sign lowering as undershoot: A corpus study. Laboratory Phonology, 2(2), 403–422.
Sanders, N., & Napoli, D. J. (2017). A cross-linguistic preference for torso stability in the lexicon. Sign Language & Linguistics, 19(2), 197–231.
Sandler, W. (1989). Phonological representation of the sign: Linearity and nonlinearity in American Sign Language (Vol. 32). Walter de Gruyter.
Sandler, W. (1990). Temporal aspects and ASL phonology. Theoretical Issues in Sign Language Research, 1, 7–35.
Sandler, W. (1999). The medium and the message: Prosodic interpretation of linguistic content in Israeli Sign Language. Sign Language & Linguistics, 2(2), 187–215.
Sandler, W. (2005). An overview of sign language linguistics. In K. Brown (Ed.), Encyclopedia of language and linguistics (2nd ed., pp. 328–338). Oxford, UK: Elsevier.
Sandler, W. (2012). The phonological organization of sign languages. Language and Linguistics Compass, 6(3), 162–182.
Sawilowsky, S. S. (2009). New effect size rules of thumb. Journal of Modern Applied Statistical Methods, 8(2), 597–599.
Schembri, A., McKee, D., McKee, R., Pivac, S., Johnston, T., & Goswell, D. (2009). Phonological variation and change in Australian and New Zealand Sign Languages: The location variable. Language Variation and Change, 21(2), 193–231.
Schweitzer, A. (in press, this Special Issue). Exemplar-theoretic integration of phonetics and phonology: Detecting prominence categories in phonetic space. Journal of Phonetics.
Shannon, C., & Weaver, W. (1949). The mathematical theory of communication. Urbana: University of Illinois Press.
Shattuck-Hufnagel, S., & Turk, A. (1996). A prosody tutorial for investigators of auditory sentence processing. Journal of Psycholinguistic Research, 25(2), 193–247.
Siple, P. (1978). Visual constraints for sign language communication. Sign Language Studies, 19(1), 95–110.
Smith, C. L., Erickson, D., & Savariaux, C. (in press, this Special Issue). Articulatory and acoustic correlates of prominence in French: Comparing L1 and L2 speakers. Journal of Phonetics.
Smith, R., & Rathcke, T. (in press, this Special Issue). Dialectal phonology constrains the phonetics of prominence. Journal of Phonetics.
Stevens, K., & Keyser, S. J. (1989). Primary features and their enhancement in consonants. Language, 65(1), 81–106.
Stevens, K. N., Keyser, S. J., & Kawasaki, H. (1986). Toward a phonetic and phonological theory of redundant features. In J. S. Perkell & D. H. Klatt (Eds.), Invariance and variability in speech processes (pp. 426–463). Hillsdale, NJ: Lawrence Erlbaum.
Stokoe, W. C. (1960). Sign language structure. Studies in Linguistics: Occasional Papers, 8.
Stokoe, W., Casterline, D. C., & Croneberg, C. G. (1965). A dictionary of American Sign Language on linguistic principles. Washington, DC: Gallaudet.
Streefkerk, B. M. (2002). Prominence: Acoustic and lexical/syntactic correlates (Vol. 58). LOT.
Swerts, M., & Krahmer, E. (2008). Facial expression and prosodic prominence: Effects of modality and facial area. Journal of Phonetics, 36(2), 219–238.
Swerts, M., & Krahmer, E. (2010). Visual prosody of newsreaders: Effects of information structure, emotional content, and intended audience on facial expressions. Journal of Phonetics, 38(2), 197–206.
Tatman, R. (2015). The cross-linguistic distribution of sign language parameters. In Proceedings of the Annual Meeting of the Berkeley Linguistics Society (Vol. 41).
ten Holt, G. A., van Doorn, A. J., de Ridder, H., Reinders, M. J. T., & Hendriks, E. A. (2009). Which fragments of a sign enable its recognition? Sign Language Studies, 9(2), 211–239.
Turnbull, R., Royer, A. J., Ito, K., & Speer, S. R. (2017). Prominence perception is dependent on phonology, semantics, and awareness of discourse. Language, Cognition, and Neuroscience, 32(8), 1017–1033. https://doi.org/10.1080/23273798.2017.1279341
van der Hulst, H. (1993). Units in the analysis of signs. Phonology, 10(2), 209–241.
van der Hulst, H. (2011). Pitch accent systems. In M. van Oostendorp, C. J. Ewen, E. Hume, & K. Rice (Eds.), The Blackwell companion to phonology (Vol. II, pp. 1003–1026). Oxford: Wiley-Blackwell.
van der Kooij, E., & Crasborn, O. (2008). Syllables and the word-prosodic system in Sign Language of the Netherlands. Lingua, 118(9), 1307–1327.
van der Kooij, E., & Crasborn, O. A. (2013). The phonology of focus in Sign Language of the Netherlands. Journal of Linguistics.
van der Kooij, E., Crasborn, O., & Emmerik, W. (2006). Explaining prosodic body leans in Sign Language of the Netherlands: Pragmatics required. Journal of Pragmatics, 38(10), 1598–1614.
van Duinen, H., & Gandevia, S. C. (2011). Constraints for control of the human hand. The Journal of Physiology, 589(23), 5583–5593.
Wagner, M., & McAuliffe, M. (in press, this Special Issue). The effect of emphasis on phrasing. Journal of Phonetics.
Wagner, P., Origlia, A., Avesani, C., Christodoulides, G., Cutugno, F., D'Imperio, M., ... Moniz, H. (2015). Different parts of the same elephant: A roadmap to disentangle and connect different perspectives on prosodic prominence. In The Scottish Consortium of the ICPhS (Ed.), Proceedings of the 18th International Congress of Phonetic Sciences (paper number 0202). Glasgow, UK: The University of Glasgow.
Wilbur, R. B. (1990). An experimental investigation of stressed sign production. International Journal of Sign Linguistics, 1, 41–60.
Wilbur, R. B. (1999). Stress in ASL: Empirical evidence and linguistic issues. Language and Speech, 42(2–3), 229–250.
Wilbur, R. B., & Martínez, A. M. (2002). Physical correlates of prosodic structure in American Sign Language. In M. Andronis, E. Debenport, A. Pycha, & K. Yoshimura (Eds.), Proceedings of the Chicago Linguistic Society (Vol. 38-1, pp. 693–704). Chicago: Chicago Linguistic Society.
Wilbur, R. B., & Nolen, S. B. (1986). Reading and writing. Gallaudet Encyclopedia of Deaf People and Deafness, 146–151.
Wilbur, R. B., & Patschke, C. G. (1998). Body leans and the marking of contrast in American Sign Language. Journal of Pragmatics, 30(3), 275–303.
Wilbur, R. B., & Schick, B. S. (1987). The effects of linguistic stress on ASL signs. Language and Speech, 30(4), 301–323.
Wilbur, R. B., & Zelaznik, H. N. (1997). Kinematic correlates of stress and position in ASL. Paper presented at the Annual Meeting of the Linguistic Society of America, January.
Woll, B. (2008). Do mouths sign? Do hands speak?: Echo phonology as a window on language genesis. LOT Occasional Series, 10, 203–224.
Xavier, A., Tkachman, O., & Gick, B. (2015). Towards convergence of methods for speech and sign segmentation. Canadian Acoustics, 43(3).
Yasinnik, Y., Renwick, M., & Shattuck-Hufnagel, S. (2004). The timing of speech-accompanying gestures with respect to prosody. In Proceedings of the international conference From Sound to Sense (Vol. 50, pp. 10–13).
Yehia, H. C., Kuratate, T., & Vatikiotis-Bateson, E. (2002). Linking facial animation, head motion and speech acoustics. Journal of Phonetics, 30(3), 555–568.