Fuzzy Sets and Systems 214 (2013) 65–74
Analyzing musical expressivity with a soft computing approach

Josep Lluis Arcos, Enric Guaus, Tan H. Ozaslan

Artificial Intelligence Research Institute, IIIA-CSIC, Spanish National Research Council, Campus UAB, Bellaterra, Spain

Available online 4 February 2012
Abstract

In this paper we present our research on the design of a tool to analyze musical expressivity. Musical expressivity is a human activity that is difficult to model computationally because of its nature: it is implicitly acquired by musicians through a long process of listening and imitation. We propose the use of soft computing techniques to deal with this problem. Specifically, from a collection of sound features obtained by using state-of-the-art audio analysis algorithms, we apply a soft computing process to generate a compact and powerful representation. Moreover, we have designed a graphical user interface to provide a flexible analysis tool. We are using the analysis tool in the guitarLab project, focused on the study of the musical expressivity of classical guitar.

© 2012 Elsevier B.V. All rights reserved.

Keywords: Soft computing; Musical expressivity; Musical analysis
1. Introduction

When musicians play a musical piece, they depart from the musical score and incorporate many nuances that are not explicitly written in it. This contribution of the musicians comes from their musical knowledge and their personal understanding of music, and is known as musical expressivity. Indeed, the most recognized music performers are those able to convey a personal character to a piece while preserving the original composition. However, the analysis and understanding of musical expressivity is still an open research problem.

In the last decade, advances in audio processing techniques have provided many tools able to extract musical features from a recording (see [7] for an overview). However, these extracted features are numerical measurements that are difficult to relate to their musical meaning. This phenomenon is known as the semantic gap in music: the distance between what current computer methods are able to describe and what humans are able to capture when listening to the same music (see [3] for a more detailed description).

Traditionally, expressivity analysis has been performed as an off-line process where a collection of rules is generated by the use of machine learning techniques. The goal is to make explicit the expressive trends recurrently performed by musicians. In this off-line approach the recordings are not directly analyzed by the experts. Instead, experts analyze the rules extracted by machine learning algorithms from the recordings. An example of this approach is the research of Widmer [25], who applied inductive learning techniques to acquire rules of classical piano performance. See [19] for a more complete survey on the design of systems using explicit knowledge.
The goal of our research is the design of a tool for assisting musicians in directly analyzing specific recordings. Specifically, our research in the guitarLab project [9] is focused on the study of classical guitar and aims at designing a system able to model the use of the expressive resources of that instrument. To that purpose, one of the first goals of our project is the design of analysis tools to characterize the musical expressivity of guitar performers.

The study of guitar expressivity is not new. Guitar recordings were studied in [15] with the goal of designing a realistic guitar synthesis tool; however, that work was focused on the synthesis model. Other works [10,22] propose the use of optical motion tracking systems to model guitarist gestures as an indirect way to understand musical expressivity. Although these previous results are helpful to our research, their approaches do not fully cover our goals.

Previously [1,2], we proposed the use of fuzzy techniques to store performance deviations in the SaxEx system. Specifically, performance deviations were represented using fuzzy labels with associated membership degrees. Using these techniques we were able to generate expressive melodies by performing sound re-synthesis on non-expressive recordings. However, fuzzy techniques were only used in the internal reasoning.

Following these positive results of the SaxEx system, in this paper we present the use of soft computing techniques to help in the analysis of musical performances. Specifically, we are interested in providing a graphical representation that facilitates the analysis of a musical performance for users familiar with neither statistical analysis nor low-level music analysis. Thus, we aimed at solving two issues: finding a unified representation of multiple musical features and providing a compact representation. The unified representation facilitates comprehension; the compact representation facilitates the comparison among different musical features.

Soft computing techniques have been successfully used previously in musical analysis. For instance, Kostek [13] describes applications of different soft computing methods to a wide range of problems such as the automatic classification of musical instrument sounds, musical phrase analysis, or the control of pipe organ sounds. Fuzzy sets have been used to analyze the emotional expression in music performance and body motion [6] or to model music performance rules [4]. Fuzzy set functions allow the direct use of qualitative descriptions related to different emotional expressions (e.g. Happiness–Sadness, Legato–Staccato). An interesting work by Leon and Liern [16] has applied soft computing to musical analysis: they model notes as fuzzy sets and study their compatibility using the Zadeh consistency index. The compatibility between musicians is then inferred by applying an OWA operator to the notes composing a musical piece. However, that research is not focused on analyzing musical expressivity. Lesaffre et al. [17] performed a large-scale study demonstrating the importance of semantic descriptions of music: humans use linguistic terms to describe music characteristics (either affective or structural descriptors). Their conclusions clearly support our approach of using qualitative representations to analyze musical performances.
2. System design

The analysis tool has been designed in four main modules: the acquisition of sound features from a musical recording, the transformation of these data into a more exploitable representation, the analysis of the musical score, and the graphical user interface. As we will describe in the next sections, we use soft computing techniques to represent both sound features and musical features. Then, this uniform representation is exploited to provide a flexible graphical visualization able to compact the musical information.

As a testing example, we used the jazz standard Blue Bossa by Joe Henderson [24] played by a guitarist. This musical piece is interesting because it presents a melodic structure that makes it possible to compare the variations introduced by a musician in similar melodic sequences.

2.1. Extracting sound features

Several sound features may be extracted from the analysis of a musical recording. We used state-of-the-art audio analysis algorithms to obtain the collection of sound features. Specifically, we used libraries designed for the annotation of audio signals, such as Aubio [5] and the MIR Toolbox [14], together with our own tools [23]. The extracted features capture information about different facets of a musical performance. In the current system, the analyzed features can be grouped into three different expressive dimensions: rubato, loudness, and timbre.

The first group of features is related to rubato. Rubato is a musical term that refers to the rhythmic changes introduced by the performer when playing a musical phrase.
Fig. 1. Fuzzification of rubato from onset deviations (membership degree, from 0 to 1, as a function of onset deviation in seconds).
These variations, which produce slight deviations of tempo, are used by the performer as a way to convey emotions. Specifically, rubato descriptors provide information about the notes' start times (known as note onsets) and their durations. These two values are used to calculate the deviations introduced by a musician with respect to the score (the reference notation). For instance, we calculate the deviations of played notes by comparing them with the expected onsets determined by the score. Anticipated notes have a negative rubato value whereas delayed notes have a positive one. Next, rubato values are mapped to qualitative labels (see Fig. 1), where the central label represents no deviation (in our experiments, deviations lower than 0.01 s). Deviations of note durations are calculated in a similar way.

The second group of features is related to the loudness applied by the musician when playing a note. In a musical score, the composer may introduce some marks (like crescendo or decrescendo marks) but the loudness of notes is not explicitly indicated. Nevertheless, analogously to rubato, a musician applies different loudness to the notes according to her musical knowledge and her personal aesthetic preferences.

The third group of features is related to note qualities (known as timbre). When analyzing musical notes, we may distinguish two phases: the note attack (the starting period of a note) and the sustain (the period where the note tone is stable). To characterize note attacks we use the rise time. To characterize the stable part of a note we use the Mel Frequency Cepstral Coefficients (MFCC).

The rise time measures the time (in ms) a played note takes to reach its maximum loudness. In guitar, rise time is used to characterize string plucking. Rise times range from 15 to 40 ms and were mapped to five labels, with the central label representing neutral attacks. On the extremes, rise times higher than 30 ms are mapped to very-soft attacks and rise times lower than 18 ms are mapped to very-harsh attacks.

Although each note may be described by its fundamental frequency (the note pitch), the notes generated by a musical instrument incorporate additional frequencies that characterize (and distinguish) the instrument. Many of these frequencies are present due to the physical characteristics of each instrument. However, the way a musician plays a note may alter some of them. For instance, in a guitar, changing the place where a given string is plucked affects the presence of higher or lower frequencies in the resulting note. Mel Frequency Cepstral Coefficients (MFCC) were proposed for music modeling by Logan [18] and are used in music information retrieval applications as a descriptor that characterizes tone qualities (timbre). MFCCs roughly approximate the frequency resolution of the human auditory system by mapping frequencies onto the perceptually motivated Mel scale. Thus, we have used MFCC to capture the timbre qualities of the notes during the sustain phase.

Fig. 2 shows the evolution of three different sound features (loudness, rubato, and rise time) for the first five bars of the recorded theme. Notice that, although the changes in these three features are clear, it is not easy to extract conclusions from this numerical (i.e. direct) representation. As we will describe in the next section, soft computing techniques help to transform the values of the extracted features into a more appropriate representation.
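As an illustration of how such descriptors can be obtained in practice, the sketch below computes a per-note loudness, rise time, and MFCC vector, and maps the rise time to the qualitative attack labels. It is a minimal example and not the pipeline used in the paper: librosa is used as a stand-in for Aubio and the MIR Toolbox, the note boundaries are assumed to be already known from the score-to-performance alignment, and the intermediate label boundaries in `attack_label` are illustrative assumptions (only the 18 ms and 30 ms limits come from the text).

```python
# A minimal sketch (not the paper's pipeline): per-note loudness, rise time and
# MFCC, using librosa as a stand-in for Aubio / the MIR Toolbox.
import numpy as np
import librosa

HOP = 512  # analysis hop length in samples (librosa's default)

def note_features(y, sr, onset_time, offset_time, n_mfcc=10):
    """Describe one note; its boundaries (in seconds) are assumed to be known
    from the score-to-performance alignment."""
    note = y[int(onset_time * sr):int(offset_time * sr)]

    # Loudness: peak of the RMS energy envelope, expressed in dB.
    rms = librosa.feature.rms(y=note, hop_length=HOP)[0]
    loudness_db = float(np.max(librosa.amplitude_to_db(rms, ref=1.0)))

    # Rise time: approximate time (ms) from note start to the envelope maximum.
    peak_frame = int(np.argmax(rms))
    rise_time_ms = 1000.0 * peak_frame * HOP / sr

    # Timbre: mean of 10 MFCCs over the (roughly) stable part after the peak.
    mfcc = librosa.feature.mfcc(y=note, sr=sr, n_mfcc=n_mfcc, hop_length=HOP)
    sustain_mfcc = mfcc[:, peak_frame:].mean(axis=1)

    return loudness_db, rise_time_ms, sustain_mfcc

def attack_label(rise_time_ms):
    """Map rise time to five qualitative attack labels. The 18 ms and 30 ms
    limits are the ones reported in the text; the intermediate boundaries are
    illustrative assumptions."""
    if rise_time_ms < 18:
        return "very-harsh"
    if rise_time_ms < 22:
        return "harsh"
    if rise_time_ms <= 26:
        return "neutral"
    if rise_time_ms <= 30:
        return "soft"
    return "very-soft"
```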
Additionally, all these deviations may be analyzed individually, at the note level, or by grouping notes at a higher musical level such as melodic motives (short sequences of notes). For instance, ritardandi or accelerandi are expressive resources that affect several consecutive notes. In Section 2.3 we will describe the capabilities of this higher analysis level.

2.2. Representing sound features

From the description of the sound features presented in the previous section, it is clear that the analysis tool has to manage different types of data. First, note onsets and note durations (rubato) are related to time.
Fig. 2. Numerical representation of three sound features.
Fig. 3. Rubato.
Fig. 4. Zooming rubato.
Moreover, because they are directly related to the reference provided by the score, their deviations with respect to that reference are more informative than their absolute values. Analyzing the direct representation of the evolution of these features presented in Fig. 2, it is difficult to extract musical patterns. Fortunately, since rubato captures temporal information close to the score representation, we may represent it using temporal boxes. Fig. 3 presents a first attempt to relate a performance with its score. The first observation is that, although real note durations may be partially perceived, onset deviations are really difficult to analyze. We may zoom into the score (see Fig. 4), and the deviations then become more evident. Nevertheless, the musical perspective is quickly lost. As we will propose below, the intuition is that we should use a qualitative solution to present rubato information.

The second type of features measures the amount of an audio characteristic in a given recording window. Each of these features may be characterized by its own numerical range (maximum and minimum). However, to allow comparison among them, the ranges are normalized between 0 and 1. Examples of this type of feature are note loudness and attack rise time. Analogously to the previous features (see Fig. 2), it is very difficult to extract meaningful information from their direct representation. Therefore, we also apply a fuzzification process to these feature values.
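The normalization step can be sketched as follows. This is an illustrative min-max rescaling only, not necessarily the tool's exact code; the optional context argument anticipates the context-dependent ranges discussed later in this section.

```python
# Sketch: min-max normalization of a feature to [0, 1] within a chosen
# analysis context (a motive, a phrase, or the whole recording).

def normalize(values, context=None):
    """values: per-note feature values; context: the values used to fix the
    range (defaults to `values` itself, i.e. a global context)."""
    context = values if context is None else context
    lo, hi = min(context), max(context)
    if hi == lo:                      # constant feature: map everything to 0.5
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```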
Fig. 5. Qualitative transformation of sound features.
The third type of sound features are qualitative features. The range of values of these features is a collection of unordered labels. Thus, each label represents a feature class and no direct relation exists among the classes. For instance, the MFCC are 10 different numerical coefficients that estimate timbre qualities. Because the coefficients are not independent of each other, they cannot be compared individually. Instead, the MFCC are exploited using clustering techniques. Thus, by grouping the samples according to their coefficients, we characterize guitar body resonances and restrict the analysis to the most representative clusters. Specifically, the MFCC of note samples are grouped into five clusters with the Expectation Maximization algorithm using the Weka toolbox [26].

The analysis tool provides several configuration alternatives. First, sound features may be characterized using either three or five fuzzy labels. Choosing three fuzzy labels, users may easily identify the places where a specific sound feature plays an important role. Increasing the representation to five fuzzy labels, users may perform a more detailed analysis. We used a trapezoidal representation, as described in the previous section for rubato (see Fig. 1). Additionally, the fuzzification may be performed with respect to a local context (e.g. a melodic motive) or to a broader context (e.g. the whole recording). Specifically, the numerical range of a given sound feature (i.e. its minimum and maximum) is calculated taking into account only the values of the specified context. This capability provides useful flexibility to the analysis tool. For instance, when analyzing the evolution of the loudness in a musical phrase, the relevant dynamic scope is the musical phrase itself (in the recording, ranging from −19 to −24 dB). However, when analyzing the loudness in a melodic motive, both the phrase context and the motive context may be relevant. The phrase context reveals values that stand out with respect to the local context, whereas the motive context brings out finer nuances. Specifically, because loudness differences within motives are around 4 dB, we uniformly distribute the fuzzy labels over this smaller scope.

After transforming all the features to a qualitative level, there are several alternatives for displaying them. First, since all the features refer to a period of time (e.g. a note), values are shown as boxes occupying that time period. Next, each fuzzy label has an associated color, with intensity used to grade from the lowest to the highest value. Finally, each feature may be displayed either by stacking the boxes along the y-axis or by placing them all on the same horizontal line. The first representation (see Fig. 5) may be useful to analyze the evolution of a single sound feature. However, with this representation it is difficult to compare the correlations among different sound features because the features are distant from each other. Placing each feature on a single line (see Fig. 6) provides a more compact representation. Moreover, since fuzzy labels are graded using color intensities, the colors already point to their distinct values. Then, as we will explain in Section 2.4, this mosaicing representation is a powerful analysis tool.
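To make the clustering step concrete, the following sketch groups per-note MFCC vectors into five clusters. It is a hedged example rather than the system's implementation: the paper uses the Expectation Maximization clusterer of the Weka toolbox, while here scikit-learn's GaussianMixture (which is likewise fitted with EM) is used as a stand-in, and `note_mfccs` is an assumed array with one 10-coefficient row per note.

```python
# Sketch of the MFCC clustering step. The paper uses the Expectation
# Maximization clusterer of the Weka toolbox; scikit-learn's GaussianMixture
# (also fitted with EM) is used here as a stand-in.
import numpy as np
from sklearn.mixture import GaussianMixture

def cluster_timbre(note_mfccs, n_clusters=5, seed=0):
    """note_mfccs: array of shape (n_notes, 10), one MFCC vector per note.
    Returns one cluster label per note; the labels are unordered classes."""
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed)
    return gmm.fit_predict(np.asarray(note_mfccs))

# Usage with random data standing in for real note descriptors:
# labels = cluster_timbre(np.random.default_rng(0).normal(size=(60, 10)))
```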
Fig. 6. Mosaic representation.
Fig. 6 presents four graded features (onset, energy, duration, and rise time) and one non-graded feature (MFCC). Positive deviations are represented with lighter colors (e.g. notes played with a longer duration) whereas negative deviations are represented with darker colors (e.g. shortened notes). In contrast, the colors used for the MFCC feature merely distinguish different clusters.

2.3. Analyzing the melodic context

In the previous section we presented the sound features used to describe musical expressivity. However, to understand the use of expressive resources in a musical piece we have to characterize the musical context where the notes are played, i.e. analyze the musical piece itself. Musical analysis is performed at different musical levels, ranging from the note level to the phrase level.

At the note level, a first characteristic to consider is the metrical strength of the notes. The metrical strength of a note can be calculated with respect to its position in a bar and/or its position at the phrase level. For instance, in a four-beat meter the metrical strength associated with the beats is, respectively, strong, weak, semi-strong, and weak. Additionally, we have to consider the syncopation effect, where a note that would fall on a strong beat is anticipated onto a weak beat. A typical example of this effect is produced by the notes starting at the end of bars (see the examples in the Blue Bossa score in the previous figures). This rhythmic resource intensifies the accent of the note.

A second characteristic to consider at the note level is the underlying harmony. Notes play a different role depending on whether or not they belong to the underlying chord. For instance, notes not belonging to the chord are often used in the melody to connect chord notes (and are therefore called passing notes). Moreover, these passing notes tend to be placed on weak metrical beats. Exploiting harmonic knowledge, the harmonic stability of each note is calculated. In the next section, we will explain how melodic features, such as metrical strength and harmonic stability, are displayed together with sound features to show their correlations.

Melodic motives comprise a second analysis level. Melodic motives are short sequences of notes, mainly ranging from three to five notes, that are combined to build larger musical structures such as musical phrases. Melodic motives are related to the human perception of melodies and have been studied from the perspective of Gestalt theory [11,12]. Specifically, Meyer [20] applied the principles of Gestalt theory to model melody perception. Gestalt theory states that perceptual elements are grouped together to form a single perceived whole (called a 'gestalt'). This grouping follows some principles: proximity (two elements are perceived as a whole when they are perceptually close), similarity (two elements are perceived as a whole when they have similar perceptual features, e.g. color in visual perception), and good continuation (two elements are perceived as a whole if one is a 'natural' continuation of the other). The Implication–Realization (IR) model of Narmour [21] claims that similar principles hold for the perception of melodic sequences. These principles take the form of implications and involve two main principles: registral direction (PRD) and intervallic difference (PID).
Fig. 7. Eight of the basic structures of the IR model.

Fig. 8. Combining melodic and sound features.

The PRD principle states that small intervals create the expectation of a following interval in the same registral direction (for instance, a small upward interval generates an expectation of another upward interval), whereas large intervals create the expectation of a change in registral direction (for instance, a large upward interval generates an expectation of a downward interval). The PID principle states that a small interval (five semitones or less) creates an expectation of a following similarly sized interval (plus or minus two semitones), and a large interval (seven semitones or more) creates an expectation of a following smaller interval.
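The two expectation principles just described can be summarized in a small rule function. The sketch below is a simplified illustration of the PRD and PID thresholds quoted above only; it is not the system's full eight-pattern IR classifier, and intervals are assumed to be given in semitones with positive values for upward motion.

```python
# A simplified sketch of the PRD and PID expectation principles as quoted
# above; it is not the system's full eight-pattern IR classifier.

def ir_expectations(interval):
    """interval: realized melodic interval in semitones (positive = upward).
    Returns the implied direction (+1 up, -1 down) and size range (in
    semitones) of the following interval, or None where the quoted thresholds
    leave the case undefined (an interval of six semitones)."""
    size = abs(interval)
    direction = (interval > 0) - (interval < 0)  # +1, -1, or 0 for a repetition

    if size <= 5:
        # PRD: a small interval implies continuation in the same direction.
        # PID: it implies a similarly sized interval (plus or minus 2 semitones).
        return {"direction": direction, "size_range": (max(0, size - 2), size + 2)}
    if size >= 7:
        # PRD: a large interval implies a change of registral direction.
        # PID: it implies a smaller following interval.
        return {"direction": -direction, "size_range": (0, size - 1)}
    return {"direction": None, "size_range": None}

# Example: a large upward leap of a minor seventh (10 semitones) implies a
# smaller, downward interval:
# ir_expectations(10) -> {'direction': -1, 'size_range': (0, 9)}
```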
Based on these principles, the IR model proposes eight basic melodic patterns (see Fig. 7) that are used to characterize the local structure of a melody. Following our successful experience in using the IR model to capture melodic similarities [8], we incorporated this analysis capability into our system. Specifically, the IR analysis is used to compare the way similar melodic motives are played. First, melodies are fragmented into melodic motives; then, the motives are grouped according to the IR patterns shown in Fig. 7. For instance, in the next section we will see how we may compare the way different descending melodies (P patterns) are played.

The last analysis level we are currently considering is the phrase level. The typical length of a musical phrase is eight bars. At the phrase level, we can apply the notions of metrical strength to the bars. For instance, we can consider a musical phrase as four pairs of bars and apply the metrical strength analogously to a four-beat bar. In addition, the underlying harmony at the bar level may emphasize the melodic structure of the musical phrase.

2.4. The analysis tool

Having introduced how sound features are represented and how melodic features are calculated from the score, we are ready to describe the capabilities of the analysis tool. Besides some technical capabilities that allow score navigation and performance playback, we will focus the explanation on how the mosaicing representation provides a powerful analysis perspective on musical expressivity.

The analysis tool (see Fig. 8) provides three information layers: on top, the score, including melodic and harmonic information; below the score, the musical features; and finally, the sound features. Musical and sound features are aligned with their corresponding notes. Moreover, instead of displaying a single melody fragment, we may display melodic fragments that share the same IR pattern (see Fig. 9).

As a result of the uniform representation of sound features, identifying feature correlations becomes very natural. Additionally, since melodic features are represented in the same way, the relation between notes and expressive resources can be established easily. As an example, in Fig. 8 metrical strength and harmonic stability (melodic features) are presented together with sound features. Thus, it becomes easy to identify feature correlations wherever colors coincide vertically. For instance, notes played on strong beats tend to be played with high loudness. However, we can also observe that musicians do not apply the same combination of expressive resources to all notes with a similar melodic role.
Fig. 9. Analyzing similar melodic motives.
A detailed analysis of the previous observation can be performed by focusing the analysis only on similar melodic motives (the alternative display mode). Specifically, using the IR model we may display together different melodic motives that present the same perception pattern. For instance, Fig. 9 shows the analysis of three similar descending melodic sequences. Specifically, the second and third sequences are just a melodic transposition of the first sequence. The metrical strength is the same and there are only minor differences in the underlying harmony. As can be observed, the way the starting and ending notes are played is very similar. The middle notes do not present exactly the same values but share similar envelopes. That is to say, the notes present the same ascending/descending pattern in almost all sound features.

2.5. Technical details

In the previous sections we have presented the intuitions behind our tool, described the melodic and sound features we exploit, and provided some examples. The purpose of this section is to introduce, in a more mathematical notation, the elements used in our proposal.

First, for each recording we have two components: the score S and the performance P. The score is represented as a tuple of notes, where each note has five melodic features: pitch, onset, duration, metrical strength, and harmonic stability. The performance is represented as a tuple of played notes, where each played note is characterized by five features: loudness, onset, duration, rise time, and MFCC. Each note in S has its peer performed note in P. We will use the notation $S_i$ and $P_i$ to refer to note $i$ in the score (and performance, respectively). Then, to access feature $j$ of note $i$ we use $S_{i,j}$.

Given a new recording $(S, P)$ containing $n$ notes, we have to define the analysis window $(a, b)$, where $a < b \le n$. For instance, using $(1, n)$ as the analysis window, we use the whole recording as the analysis context. If we are interested only in a specific melodic motive, we may use the IR melodic analysis provided by the analysis tool to choose a specific melodic context.

To calculate the qualitative values of a feature, we have to calculate the deviations from the expected values (determined by the score) and their range. Deviations may be calculated either directly, by comparing score and performance, or from the mean. For instance, the duration deviation of a given note $i$ is calculated as
$$\Delta_{i,\mathrm{duration}} = P_{i,\mathrm{duration}} - S_{i,\mathrm{duration}}.$$
Loudness deviations, instead, are calculated from the mean. That is, given an analysis window $(a, b)$, we first calculate the mean as
$$m_{\mathrm{loudness}} = \frac{\sum_{i=a}^{b} P_{i,\mathrm{loudness}}}{b - a + 1}$$
and then the deviations are calculated as
$$\Delta_{i,\mathrm{loudness}} = P_{i,\mathrm{loudness}} - m_{\mathrm{loudness}}.$$
In a similar way, rubato is calculated by comparing onsets (score versus performance), and deviations in rise time are calculated from the mean. Notice that the MFCC are represented differently because they are characterized by a clustering process. Deviations are centered at zero, i.e. neutral performances are represented by the central label (see Fig. 1 for an example).

The next step is to decide the number of qualitative labels to use for the analysis. In the current approach we may choose between three or five labels. Since labels are represented as trapezoidal fuzzy numbers $(a, b, c, d)$, they are constructed taking into account the range of each feature. The analysis provides three options: the predefined range of the feature (for instance, the range of rise times in guitar varies from 15 to 40 ms); the range of the feature in the recording (for instance, the loudness in the Blue Bossa recording varies from −19 to −24 dB); or the range in the analysis window. We uniformly distribute the trapezoids over the range of each feature. For instance, suppose that the values of a given feature are distributed in the interval [−1, 1]. We define the three trapezoidal fuzzy numbers as (−1.0, −1.0, −0.5, −0.25), (−0.5, −0.25, 0.25, 0.5), and (0.25, 0.5, 1.0, 1.0). Then, we calculate the memberships to each trapezoid.
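The membership computation can be sketched as follows. This is an illustrative implementation of a standard trapezoidal membership function applied to the three example trapezoids above, not necessarily the tool's exact code; the label names are illustrative.

```python
# Sketch: membership of a (normalized) feature value in trapezoidal fuzzy
# labels. The three trapezoids are the ones given as an example above; the
# label names are illustrative.

def trapezoid(x, a, b, c, d):
    """Standard trapezoidal membership function defined by (a, b, c, d)."""
    if x <= a or x >= d:
        # Shoulder trapezoids (a == b or c == d) keep membership 1 at the edge.
        return 1.0 if (x == a == b or x == d == c) else 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

THREE_LABELS = {
    "low":    (-1.0, -1.0, -0.5, -0.25),
    "medium": (-0.5, -0.25, 0.25, 0.5),
    "high":   (0.25, 0.5, 1.0, 1.0),
}

def fuzzify(value, labels=THREE_LABELS):
    """Membership degree of a value (a deviation normalized to [-1, 1]) in each label."""
    return {name: trapezoid(value, *abcd) for name, abcd in labels.items()}

# Example: a positive deviation of 0.3 gives memberships of roughly
# 0.0 / 0.8 / 0.2 for the low / medium / high labels.
```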
3. Conclusions

In this paper we presented a system to analyze musical expressivity that benefits from soft computing techniques to summarize information and to provide a friendly representation to users familiar with neither statistical analysis nor low-level music analysis. Our system has been designed with the aim of understanding the musical expressivity of guitarists.

An important characteristic of the tool is the uniform representation of sound and musical features. This uniform representation makes the relations between a performance and its score explicit. Moreover, the incorporation of the Implication–Realization melodic analysis provides a powerful mechanism to compare how similar musical motives are played.

The current design of the system provides great flexibility and many interaction capabilities to users. However, the tool is not proactive in pointing out information to users. As future research, we plan to automatically exploit the relations between melodic features and performance features to guide the user interaction. Additionally, we want to explore the possibility of comparing multiple recordings of a given musical piece. This functionality will allow us to compare different musicians or to analyze how different affective intentions (e.g. tender versus aggressive) change the way a musician plays the same piece.

Acknowledgments

This work was partially funded by projects NEXT-CBR (TIN2009-13692-C03-01), IL4LTS (CSIC-200450E557) and by the Generalitat de Catalunya under Grant 2009-SGR-1434. We also want to thank Benjamí Abad for his contribution with the guitar recordings.

References

[1] J. Arcos, R. López de Mántaras, An interactive case-based reasoning approach for generating expressive music, Appl. Intell. 14 (1) (2001) 115–129.
[2] J. Arcos, R. López de Mántaras, Combining fuzzy and case-based reasoning to generate human-like music performances, in: Technologies for Constructing Intelligent Systems: Tasks, Physica-Verlag GmbH, 2002, pp. 21–31.
[3] N. Bernardini, X. Serra, M. Leman, G. Widmer, G.D. Poli (Eds.), Challenges and Strategies, A Roadmap for Sound and Music Computing, URL http://smcnetwork.org/roadmap/challenges, 2007.
[4] R. Bresin, G. De Poli, R. Ghetta, A fuzzy formulation of the KTH performance rule system, in: 2nd International Conference on Acoustics and Musical Research, 1995, pp. 15–36.
[5] P. Brossier, Automatic Annotation of Musical Audio for Interactive Systems, Ph.D. Thesis, Centre for Digital Music, Queen Mary University of London, 2006.
[6] A. Friberg, A fuzzy analyzer of emotional expression in music performance and body motion, in: Proceedings of Music and Music Science, 2004, pp. 1–13.
[7] F. Gouyon, P. Herrera, E. Gómez, P. Cano, J. Bonada, A. Loscos, X. Amatriain, X. Serra, Content Processing of Music Audio Signals, Logos Verlag Berlin GmbH, 2008, pp. 83–160 (Chapter 3).
[8] M. Grachten, J.L. Arcos, R.L. de Mantaras, Melody retrieval using the implication/realization model, in: 6th International Conference on Music Information Retrieval (ISMIR 2005), 2005 (First prize of the MIREX Symbolic Melodic Similarity Contest).
[9] guitarLab, URL http://www.iiia.csic.es/guitarLab, 2011.
[10] H. Heijink, R.G.J. Meulenbroek, On the complexity of classical guitar playing: functional adaptations to task constraints, J. Motor Behav. 34 (4) (2002) 339–351.
[11] K. Koffka, Principles of Gestalt Psychology, Routledge & Kegan Paul, London, 1935.
[12] W. Köhler, Gestalt Psychology: An Introduction to New Concepts of Modern Psychology, Liveright, New York, 1947.
[13] B. Kostek, Soft Computing in Acoustics: Applications of Neural Networks, Fuzzy Logic and Rough Sets to Musical Acoustics, Studies in Fuzziness and Soft Computing, vol. 31, Springer-Verlag, 2009.
[14] O. Lartillot, P. Toiviainen, A Matlab toolbox for musical feature extraction from audio, in: International Conference on Digital Audio Effects (DAFx-07), 2007, pp. 237–244.
[15] M. Laurson, C. Erkut, V. Välimäki, M. Kuuskankare, Methods for modeling realistic playing in acoustic guitar synthesis, Comput. Music J. 25 (3) (2001) 38–49.
[16] T. Leon, V. Liern, Obtaining the compatibility between musicians using soft computing, in: E. Hüllermeier, R. Kruse, F. Hoffmann (Eds.), IPMU 2010, Part II, CCIS, vol. 81, Springer Verlag, 2010, pp. 75–84.
[17] M. Lesaffre, L. De Voogdt, M. Leman, B. De Baets, H. De Meyer, J. Martens, How potential users of music search and retrieval systems describe the semantic quality of music, J. Am. Soc. Inf. Sci. Technol. 59 (5) (2008) 695–707.
[18] B. Logan, Mel frequency cepstral coefficients for music modeling, in: International Symposium on Music Information Retrieval, ISMIR, 2000.
[19] R. López de Mántaras, J. Arcos, AI and music, from composition to expressive performance, AI Mag. 23 (3) (2002) 43–57.
[20] L. Meyer, Emotion and Meaning in Music, University of Chicago Press, Chicago, IL, 1956.
[21] E. Narmour, The Analysis and Cognition of Basic Melodic Structures: The Implication–Realization Model, University of Chicago Press, 1990.
[22] J. Norton, Motion Capture to Build a Foundation for a Computer-Controlled Instrument by Study of Classical Guitar Performance, Ph.D. Thesis, Stanford University, September 2008.
[23] T.H. Ozaslan, E. Guaus, E. Palacios, J.L. Arcos, Identifying attack articulations in classical guitar, in: S. Ystad, M. Aramaki, R. Kronland-Martinet, K. Jensen (Eds.), Exploring Music Contents, Lecture Notes in Computer Science, vol. 6684, Springer-Verlag, 2011, pp. 219–241.
[24] Sher Music Co (Ed.), The New Real Book, Sher Music Co, 1988.
[25] G. Widmer, Discovering simple rules in complex data: a meta-learning algorithm and some surprising musical discoveries, Artif. Intell. 146 (2) (2003) 129–148.
[26] I.H. Witten, E. Frank, M.A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2000.