This is largely an exploration of “Calculation of the Acoustical Properties of Triadic Harmonies” by Norman Cook¹ (JASA, 2017). I started looking into this work several years ago and have since ported my implementation from Matlab to Julia for the performance boost (and mostly for fun).

In the original work, a triad is characterized by $\Delta$, the signed difference in semitones between its two stacked intervals. By convention, the upper interval $d_{23}$ is subtracted from the lower interval $d_{12}$: $$ \Delta=d_{12}-d_{23} $$ For example, a major chord in root position consists of a minor third $(d_{23}=3)$ atop a major third $(d_{12}=4)$, yielding a difference $\Delta=4-3=1$.
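
As a quick check of the sign convention, here is a small Python sketch that tabulates $\Delta$ for a few close-voiced triads. The interval stacks are standard music theory; the names `TRIADS` and `interval_difference` are just illustrative choices of mine, not anything from the paper.

```python
# Interval stacks (lower, upper) in semitones for common close-voiced triads.
TRIADS = {
    "major, root position": (4, 3),
    "major, first inversion": (3, 5),
    "major, second inversion": (5, 4),
    "minor, root position": (3, 4),
    "minor, first inversion": (4, 5),
    "minor, second inversion": (5, 3),
    "diminished": (3, 3),
    "augmented": (4, 4),
}


def interval_difference(d12, d23):
    """Signed difference: lower interval minus upper interval."""
    return d12 - d23


for name, (d12, d23) in TRIADS.items():
    delta = interval_difference(d12, d23)
    print(f"{name:26s} {d12}-{d23}  delta = {delta:+d}")
```

Symmetric stacks such as the diminished and augmented triads land on $\Delta=0$, and swapping the two intervals flips the sign.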

From the intervals $d_{12}$ and $d_{23}$, two metrics are constructed: tension $T$ and valence $V$. Both are scalar fields, parameterized by the constants $\alpha$ and $\varepsilon$ respectively. $$ T(d_{12},d_{23}|\alpha)=\exp\left[ -\left(\frac{d_{12}-d_{23}}{\alpha}\right)^2 \right] $$ $$ V(d_{12},d_{23}|\varepsilon)= \frac{2(d_{12}-d_{23})}{\varepsilon}\exp\left[ -\frac{(d_{12}-d_{23})^4}{4} \right] $$
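
The two formulas translate almost directly into code. Below is a minimal Python sketch (not the Julia code behind the plots), using the default constants quoted later in this post. The function names are my own.

```python
import math

ALPHA = 0.60     # width of the tension peak (default quoted in this post)
EPSILON = 1.558  # valence scaling constant (default quoted in this post)


def tension(d12, d23, alpha=ALPHA):
    """Gaussian bump centred on equal intervals (d12 == d23)."""
    delta = d12 - d23
    return math.exp(-((delta / alpha) ** 2))


def valence(d12, d23, epsilon=EPSILON):
    """Odd function of the interval difference: positive when d12 > d23."""
    delta = d12 - d23
    return (2.0 * delta / epsilon) * math.exp(-(delta ** 4) / 4.0)


# Tension is maximal for symmetric stacks; valence changes sign with delta.
print(tension(4, 4), tension(4, 3))
print(valence(4, 3), valence(3, 4))
```

With these defaults, the valence of a one-semitone difference comes out very close to $\pm 1$, which suggests that $\varepsilon$ acts as a normalization constant.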

The current models for tension and valence draw on several dissonance theories (Plomp and Levelt, Sethares, Kameoka and Kuriyagawa), which hold that perceived dissonance rises and falls in a characteristic pattern as the harmonicity of a dyad interval increases. This is also consistent with Hermann von Helmholtz’s more fundamental beat theory, in which the intervals deemed “consonant” scale loosely with the critical bandwidths of human audition.

The above functions can be interpreted as surfaces over the $(d_{12},d_{23})$ plane. We evaluate them over a dense grid of intervals by converting semitones to frequencies. A twelve-tone equal temperament (12-TET) scale is used for ease of calculation: because the metrics depend only on intervals (frequency ratios), the reference frequency $f_0$ factors out. $$ f=f_0\cdot 2^{d/12} $$ Writing the interval between two frequencies in semitones as $d_{ab}=12\log_2(f_b/f_a)$, the metrics become $$ T(f_1,f_2,f_3|\alpha)=\exp\left[ -\left(\frac{d_{12}-d_{23}}{\alpha}\right)^2 \right] $$ $$ V(f_1,f_2,f_3|\varepsilon)= \frac{2(d_{12}-d_{23})}{\varepsilon}\exp\left[ -\frac{(d_{12}-d_{23})^4}{4} \right] $$
Note: All of the plots use the default values of $\alpha=0.60$ and $\varepsilon=1.558$.
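
To illustrate the grid evaluation, here is a rough Python sketch that converts semitone offsets to 12-TET frequencies, measures the intervals back in semitones, and fills a tension grid. It assumes the interval between two frequencies is taken as a log frequency ratio, which is what lets $f_0$ drop out. The helper names, the grid step, and the default $f_0$ are all arbitrary choices of mine.

```python
import math


def semitones_to_freq(d, f0=261.63):
    """12-TET frequency d semitones above the reference f0."""
    return f0 * 2.0 ** (d / 12.0)


def interval_semitones(f_low, f_high):
    """Interval between two frequencies, in semitones (a log ratio)."""
    return 12.0 * math.log2(f_high / f_low)


def tension_from_freqs(f1, f2, f3, alpha=0.60):
    d12 = interval_semitones(f1, f2)
    d23 = interval_semitones(f2, f3)
    return math.exp(-(((d12 - d23) / alpha) ** 2))


def tension_grid(max_semitones=12.0, step=0.25, f0=261.63):
    """Return (axis, grid) with grid[i][j] = T at d12=axis[i], d23=axis[j]."""
    n = int(round(max_semitones / step)) + 1
    axis = [i * step for i in range(n)]
    f1 = semitones_to_freq(0.0, f0)
    grid = []
    for d12 in axis:
        f2 = semitones_to_freq(d12, f0)
        row = [tension_from_freqs(f1, f2, semitones_to_freq(d12 + d23, f0))
               for d23 in axis]
        grid.append(row)
    return axis, grid


axis, grid = tension_grid(step=1.0)
print(len(axis), grid[4][4], grid[4][3])
```

Running the grid with two different reference frequencies should give the same surface up to floating-point error, which is a handy sanity check on the conversion.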

In the original work, one harmonic is added to each chord tone (F1, or $N=1$ below) and each metric is recalculated as a sum over all triples of distinct frequency components, where $af_i$ denotes the $a$-th partial of chord tone $i$: $$ T_{FN}(f_1,f_2,f_3|\alpha) = \sum_{\substack{i,j,k\in\{1,2,3\} \\ a,b,c\in\{1,\dots,N+1\} \\ (a,i),\,(b,j),\,(c,k)\ \text{distinct}}} T(af_i,bf_j,cf_k|\alpha) $$ $$ V_{FN}(f_1,f_2,f_3|\varepsilon) = \sum_{\substack{i,j,k\in\{1,2,3\} \\ a,b,c\in\{1,\dots,N+1\} \\ (a,i),\,(b,j),\,(c,k)\ \text{distinct}}} V(af_i,bf_j,cf_k|\varepsilon) $$ The original work shows that the resulting tension surface increases from diminished $(d_{12}=d_{23}=3)$, to augmented $(d_{12}=d_{23}=4)$, to tritone $(d_{12}=d_{23}=6)$ chords. Furthermore, the valence surface exhibits local maxima that coincide with the major chord inversions and local minima that coincide with the minor chord inversions.
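
One way to read the harmonic sums is sketched below in Python. It rests on several assumptions of mine that the text above does not pin down: each triple of partials is sorted by frequency before its lower and upper intervals are measured, each unordered triple is counted once (the ordered sum would then differ only by a constant factor), intervals are measured in semitones, and all partials are weighted equally. `partials` and `summed_metrics` are hypothetical names.

```python
import math
from itertools import combinations


def partials(freqs, n_harmonics):
    """Fundamental plus n_harmonics upper partials for every chord tone."""
    return [a * f for f in freqs for a in range(1, n_harmonics + 2)]


def summed_metrics(d12, d23, n_harmonics=1, alpha=0.60, epsilon=1.558, f0=1.0):
    """Return (tension, valence) summed over all triples of partials."""
    freqs = [f0,
             f0 * 2.0 ** (d12 / 12.0),
             f0 * 2.0 ** ((d12 + d23) / 12.0)]
    components = partials(freqs, n_harmonics)
    t_total = 0.0
    v_total = 0.0
    for triple in combinations(components, 3):
        lo, mid, hi = sorted(triple)  # assumption: order by frequency
        lower = 12.0 * math.log2(mid / lo)
        upper = 12.0 * math.log2(hi / mid)
        delta = lower - upper
        t_total += math.exp(-((delta / alpha) ** 2))
        v_total += (2.0 * delta / epsilon) * math.exp(-(delta ** 4) / 4.0)
    return t_total, v_total


# n_harmonics=0 reduces to the plain three-tone metrics.
print(summed_metrics(4, 3, n_harmonics=0))
print(summed_metrics(4, 3, n_harmonics=1))
```

With one added harmonic there are six components and twenty triples per chord, so the cost grows quickly with $N$. That is part of why a fast implementation is convenient for dense grids.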

Adding a second harmonic (F2) to the calculation yields more peaks and valleys, though not all of them coincide with discrete semitone intervals.

Some open comments and questions I have about this work:

  • I did try to conduct an informal triangle test: I would choose two chords that have the same level of tension/valence (according to the F1 model) and a third chord with markedly different tension/valence. Subjects were then asked if they were able to pick the odd one out.
    • The sample size was far too small for me to draw any significant conclusions. I would be interested in performing more subjective validation of these models. It probably wouldn’t hurt to implement fade-ins and fade-outs in my audio samples as well.
  • The concept of chords from different chord centers seems to be outside of the scope of this model. Incorporating triads from different chord centers would require a dimensionality increase.
  • There were some extrema in the F1 surfaces that merit some thought (marked with ‘X’ on the above 2D figure).
    • On the tension plot, stacks of 6-0 and 0-6 would not result in triads, since one of the two intervals is a unison, while an 8-8 triad is constructed from minor 6ths (which does admittedly sound tense).
    • On the valence plot, the 8-7 triad represents an Ab major in an open voicing, while the 7-8 triad represents a C minor in an open voicing.
  • This work further suggests that chordal properties can be interpreted as a differentiable surface! This may have further implications for learning methods in musical analysis.
  • It would be interesting to further develop this work to include different harmonic temperaments and harmonic rolloffs.
  • A web applet would be fun.
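
On the fade-in/fade-out point from the triangle-test notes, one possible approach is sketched below in Python: synthesize the three chord tones as equal-amplitude sine waves and apply a linear ramp at both ends to avoid clicks. Everything here (pure sine tones, the linear ramp, the 50 ms fade, the function name `triad_samples`) is an assumption for illustration, not how my original samples were made.

```python
import math


def triad_samples(d12, d23, f0=261.63, duration=1.0, fade=0.05,
                  sample_rate=44100):
    """Mono samples in [-1, 1] for a triad with linear fade-in and fade-out."""
    freqs = [f0 * 2.0 ** (d / 12.0) for d in (0, d12, d12 + d23)]
    n = int(duration * sample_rate)
    n_fade = max(1, int(fade * sample_rate))
    samples = []
    for i in range(n):
        t = i / sample_rate
        # Average of three sines keeps the mix within [-1, 1].
        value = sum(math.sin(2.0 * math.pi * f * t) for f in freqs) / len(freqs)
        # Gain ramps 0 -> 1 over the first n_fade samples and 1 -> 0 over the last.
        gain = min(1.0, i / n_fade, (n - 1 - i) / n_fade)
        samples.append(gain * value)
    return samples


major = triad_samples(4, 3, duration=0.5)
print(len(major), major[0], major[-1])
```

The resulting list could then be scaled to 16-bit integers and written out with the standard-library `wave` module for playback.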

  1. Prof. Puckette brought to my attention that this was also the name of Fatboy Slim.