Introduction
Let's take a look at how to use ReAline to align phonological sequences and then analyze errors in the style of Jiao et al. (2019).
For this example, we'll need the following imports:
from clu.phontools.lang.en import EnglishUtils
from clu.phontools.struct import *
from clu.phontools.pronouncing import ConverterUtils
from clu.phontools.alignment.realine import *
from clu.phontools.alignment.lbe import *
We'll be aligning a target phrase with a transcript phrase.
In our example, our target phrase will be ...
balance clamp and bottle
... and our transcribed phrase will be ...
bell is glad a bottle
Phrases and syllables
clu-phontools makes it easy to find all attested pronunciations for a phrase via a pronouncing dictionary. In the case of English, we can use the CMU pronouncing dictionary:
# prompt/target is "balance clamp and bottle"
# get all pronunciations for the phrase
# return a sequence of `clu.phontools.struct.Phrase`
target_phrases = EnglishUtils.all_possible_phrases_for(["balance", "clamp", "and", "bottle"])
How many pronunciations did we find? In this case, there should be 2:
assert len(target_phrases) == 2
Let's examine the stress patterns. Perhaps we only care about the distinction between strong (S) and weak (W) syllables:
# let's examine the stress patterns...
for phrase in target_phrases:
    print(phrase.coarse_stress)
Perhaps we have a particular stress pattern we're interested in examining. For this example, we'll pretend we're interested in finding phrases with the stress pattern SW S W SW:
# our stress pattern of interest
pattern = "SW S W SW"
# find the first Phrase that matches our pattern.
# we'll use the first entry (stress pattern = ["SW", "S", "W", "SW"])
# as our target.
match_stress = lambda phrase: phrase.match_coarse_stress_pattern(pattern)
target: Phrase = next(filter(match_stress, target_phrases))
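One caveat with this idiom: `next` raises `StopIteration` when no phrase matches the pattern. Below is a minimal sketch of a more defensive variant. It uses plain strings as hypothetical stand-ins for `Phrase` objects; the candidate patterns and the `first_match` helper are illustrative and are not part of clu-phontools.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_match(items: Iterable[T], predicate: Callable[[T], bool]) -> Optional[T]:
    """Return the first item satisfying `predicate`, or None if there is none."""
    # Passing a default to next() avoids StopIteration on an empty result.
    return next(filter(predicate, items), None)


# Hypothetical stand-ins for the coarse stress patterns of candidate phrases.
candidates = ["SW S S SW", "SW S W SW"]

found = first_match(candidates, lambda s: s == "SW S W SW")
missing = first_match(candidates, lambda s: s == "WS W W WS")

print(found)
print(missing)
```

Checking the result for `None` before continuing makes a missing pattern an explicit case rather than an uncaught exception.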
We'll apply the same steps to find transcript phrases:
# transcript says "bell is glad a bottle"
# get all pronunciations for the phrase
transcript_phrases = EnglishUtils.all_possible_phrases_for(
    ["bell", "is", "glad", "a", "bottle"]
)
# in this case, there should be 4:
assert len(transcript_phrases) == 4
Earlier, we searched for a particular syllable-stress pattern. Let's ignore stress for a moment and simply look for a pattern of syllables. Imagine we're interested in phrases composed of four monosyllabic words followed by a disyllabic word. We can represent this pattern as X X X X XX, where X denotes a syllable and whitespace represents a lexical boundary:
# all of these phrases have 6 syllables with the structure "X X X X XX".
# If you're unfamiliar with regular expressions, keep in mind that ...
# ^ in the pattern below means "starts with"
# $ in the pattern below denotes the "end of sequence"
assert all(phrase.match_masked_syllables(pattern="^X X X X XX$", mask="X") for phrase in transcript_phrases)
# for our comparisons, let's use the first one:
transcript: Phrase = transcript_phrases[0]
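To make the role of the anchors concrete, here is a minimal sketch using Python's `re` module on plain masked strings. It only illustrates what `^` and `$` contribute to the pattern; it is not a description of how `match_masked_syllables` is implemented, and the masked strings below are hand-written stand-ins rather than output from the library.

```python
import re

# The same pattern with and without the start/end anchors.
anchored = re.compile(r"^X X X X XX$")
unanchored = re.compile(r"X X X X XX")

exact = "X X X X XX"      # four monosyllabic words + one disyllabic word
longer = "X X X X X XX"   # five monosyllabic words + one disyllabic word

exact_anchored = bool(anchored.search(exact))
longer_anchored = bool(anchored.search(longer))
longer_unanchored = bool(unanchored.search(longer))

print(exact_anchored, longer_anchored, longer_unanchored)
```

Without the anchors, the six-word string still contains the five-word pattern as a substring, so only the anchored version restricts the match to phrases with exactly that shape.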
Alignment
Now that we have both a target and transcript, let's use ReAline to align the two phonological sequences:
# create the aligner
aligner = ReAline()
By default, ReAline expects to align IPA, so we'll first want to convert our ARPABet-based representations to IPA:
target_phones = [ConverterUtils.arpabet_to_ipa(phone) for phone in target.phones]
transcript_phones = [ConverterUtils.arpabet_to_ipa(phone) for phone in transcript.phones]
alignment = aligner.align(target_phones, transcript_phones)
alignment should have the following value in this case (NOTE: newlines added for better legibility):
[
('b', 'b'), ('æ', 'ɛ'), ('l', 'l'), ('ʌ', 'i'), ('n', '-'), ('s', 'z'),
('k', 'g'), ('l', 'l'), ('æ', 'æ'), ('m', '-'), ('p', '-'),
('ʌ', '-'), ('n', '-'), ('d', 'd'),
('-', 'ʌ'), ('b', 'b'), ('ɒ', 'ɒ'), ('t', 't'), ('ʌ', 'ʌ'), ('l', 'l')
]
Error analysis
Now that we have an automatically aligned sequence, let's analyze the phonological and lexical boundary errors in our transcript.
Phoneme errors
First, let's calculate phoneme errors:
# compute phoneme errors from the alignment
phoneme_errors = aligner.phoneme_errors(alignment)
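The structure returned by `aligner.phoneme_errors` is not shown here. As a library-independent illustration of what can be read off an alignment, the sketch below tallies each aligned pair as a match, substitution, deletion, or insertion. It assumes that "-" marks a gap, as in the alignment shown earlier; the `classify` helper and its labels are hypothetical and are not part of clu-phontools.

```python
from collections import Counter

GAP = "-"  # assumption: gaps are written as "-" in the aligned pairs

# (target, transcript) pairs copied from the alignment shown earlier.
alignment = [
    ("b", "b"), ("æ", "ɛ"), ("l", "l"), ("ʌ", "i"), ("n", "-"), ("s", "z"),
    ("k", "g"), ("l", "l"), ("æ", "æ"), ("m", "-"), ("p", "-"),
    ("ʌ", "-"), ("n", "-"), ("d", "d"),
    ("-", "ʌ"), ("b", "b"), ("ɒ", "ɒ"), ("t", "t"), ("ʌ", "ʌ"), ("l", "l"),
]


def classify(target_phone: str, transcript_phone: str) -> str:
    """Label one aligned pair (hypothetical labels, for illustration only)."""
    if target_phone == GAP:
        return "insertion"      # phone present only in the transcript
    if transcript_phone == GAP:
        return "deletion"       # phone present only in the target
    if target_phone == transcript_phone:
        return "match"
    return "substitution"


counts = Counter(classify(t, h) for t, h in alignment)
print(counts)
```

For this alignment the tally is 10 matches, 4 substitutions, 5 deletions, and 1 insertion across the 20 aligned positions.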
Lexical boundary errors
Next, let's calculate lexical boundary errors (LBEs):
target_stress = target.coarse_stress
# should produce ['SW', 'S', 'W', 'SW']
transcript_masked_stress = transcript.mask_syllables(mask="X")
# should produce ['X', 'X', 'X', 'X', 'XX']
lbe_errors = calculate_lbes_from_stress(target_stress, transcript_masked_stress)
# should produce [LexicalBoundaryError(error_type=LexicalBoundaryErrorType.INSERTION_WEAK, target_index=0, transcript_index=0)]
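To see why a single weak insertion is expected, compare where the word boundaries fall in each sequence. The sketch below is a simplified illustration under stated assumptions: both sequences have the same number of syllables, and a boundary error is classified by the stress of the target syllable that follows it. The `boundary_positions` and `sketch_lbes` helpers and the tuple output are hypothetical; they are not the implementation or the return type of `calculate_lbes_from_stress`.

```python
from itertools import accumulate
from typing import List, Set, Tuple


def boundary_positions(words: List[str]) -> Set[int]:
    """Syllable offsets at which a word-internal boundary falls (phrase edges excluded)."""
    offsets = list(accumulate(len(word) for word in words))
    return set(offsets[:-1])


def sketch_lbes(target_stress: List[str], transcript_masked: List[str]) -> List[Tuple[str, str, int]]:
    """Compare boundary positions; assumes equal syllable counts in both sequences."""
    flat_stress = "".join(target_stress)
    if len(flat_stress) != sum(len(word) for word in transcript_masked):
        raise ValueError("this sketch only handles equal syllable counts")
    target_b = boundary_positions(target_stress)
    transcript_b = boundary_positions(transcript_masked)
    errors = []
    # Boundaries present in exactly one of the two sequences are errors.
    for pos in sorted(target_b ^ transcript_b):
        kind = "insertion" if pos in transcript_b else "deletion"
        # Classify by the stress of the syllable that follows the boundary.
        strength = "strong" if flat_stress[pos] == "S" else "weak"
        errors.append((kind, strength, pos))
    return errors


errors = sketch_lbes(["SW", "S", "W", "SW"], ["X", "X", "X", "X", "XX"])
print(errors)
```

Here the transcript places a boundary after the first syllable, inside the target word with stress SW, so the extra boundary precedes a weak syllable. The syllable offset reported by this sketch follows its own convention and does not correspond to the `target_index` and `transcript_index` fields shown above.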
Wrapping up
A script containing this same example can be found at examples/asu-use-case.py