This package collects convenience functions for working with emuR and EMU-SDMS. For more information on those tools, see the EMU-SDMS manual.
For now, functions are available to bulk extract and pre-process SSFF tracks, with specific functions for processing fundamental frequency, formants, and measures that depend directly on these. This can all be done in a single step with the function import_ssfftracks(), but the different steps can also be carried out independently. Some of the independent functions may well be useful for raw data that wasn’t generated with emuR. These functions are adapted from the data processing used in Kirby et al. (2023).
The package also provides functions for adding SSFF tracks to existing EMU databases. One such function is praatsauce2ssff(), which allows users to add the output from PraatSauce to an EMU database. The function was written on the basis of PraatSauce output, but should in principle also work for output from VoiceSauce, although this hasn’t been tested. Additionally, there are functions for calculating spectral moments and DCT coefficients of spectra generated over equidistant time steps for all the sound files of an EMU database. These are called moments2ssff() and dct2ssff(). See the help files for more.
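To make the notion of spectral moments concrete, here is a minimal base R sketch that computes the first four moments of a single spectrum by treating the normalized power spectrum as a probability distribution over frequency. The helper spectral_moments() and the toy spectrum are made up for illustration; this is not the code used by moments2ssff(), which may differ in its details.

```r
# Hypothetical helper, not part of emuhelpeR: spectral moments of one spectrum.
spectral_moments <- function(freq, power) {
  p <- power / sum(power)                  # normalize so the weights sum to 1
  cog <- sum(freq * p)                     # 1st moment: centre of gravity
  variance <- sum((freq - cog)^2 * p)      # 2nd central moment
  skew <- sum((freq - cog)^3 * p) / variance^1.5
  kurt <- sum((freq - cog)^4 * p) / variance^2 - 3  # excess kurtosis
  c(cog = cog, sd = sqrt(variance), skewness = skew, kurtosis = kurt)
}

# Toy spectrum (made-up values): a symmetric peak centred on 4000 Hz
freq <- seq(0, 8000, by = 100)
power <- exp(-((freq - 4000) / 1000)^2)
m <- spectral_moments(freq, power)
m
```

Because the toy spectrum is symmetric, the centre of gravity lands on the peak and the skewness is (numerically) zero.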
The function import_ssfftracks() assumes that you have raw data stored in an EMU database which has already been loaded into R, and that you have generated a segment list with relevant portions of the data using the querying system in emuR. Let’s load some example data from Kirby et al. (2023) into R.
datapath <- system.file('extdata/db', package='emuhelpeR')
raw <- emuR::load_emuDB(datapath)
#> INFO: Checking if cache needs update for 2 sessions and 10 bundles ...
#> INFO: Performing precheck and calculating checksums (== MD5 sums) for _annot.json files ...
#> INFO: Nothing to update!
This data can be inspected in EMU-SDMS by typing emuR::serve(raw) in the R console. Let’s have a look at a segment list I prepared:
library(emuhelpeR)
seg_list
#> # A tibble: 10 × 16
#> labels start end db_uuid session bundle start…¹ end_i…² level attri…³
#> <chr> <dbl> <dbl> <chr> <chr> <chr> <int> <int> <chr> <chr>
#> 1 op 715. 929. 33157d9f-4a3… f1 F1-00… 5 5 ORL ORL
#> 2 op 302. 611. 33157d9f-4a3… f1 F1-00… 5 5 ORL ORL
#> 3 op 482. 733. 33157d9f-4a3… f1 F1-00… 5 5 ORL ORL
#> 4 op 1461. 1665. 33157d9f-4a3… f1 F1-00… 5 5 ORL ORL
#> 5 op 897. 1165. 33157d9f-4a3… f1 F1-00… 7 7 ORL ORL
#> 6 op 1026. 1261. 33157d9f-4a3… m1 M1-00… 4 4 ORL ORL
#> 7 op 775. 1194. 33157d9f-4a3… m1 M1-00… 5 5 ORL ORL
#> 8 op 1289. 1547. 33157d9f-4a3… m1 M1-00… 7 7 ORL ORL
#> 9 op 1226. 1544. 33157d9f-4a3… m1 M1-00… 6 6 ORL ORL
#> 10 op 983. 1134. 33157d9f-4a3… m1 M1-00… 7 7 ORL ORL
#> # … with 6 more variables: start_item_seq_idx <int>, end_item_seq_idx <int>,
#> # type <chr>, sample_start <int>, sample_end <int>, sample_rate <int>, and
#> # abbreviated variable names ¹start_item_id, ²end_item_id, ³attribute
dplyr::glimpse(seg_list)
#> Rows: 10
#> Columns: 16
#> $ labels <chr> "op", "op", "op", "op", "op", "op", "op", "op", "op…
#> $ start <dbl> 714.5465, 301.8254, 481.9161, 1460.7596, 897.0181, …
#> $ end <dbl> 928.9229, 611.3265, 733.1633, 1665.3628, 1165.0907,…
#> $ db_uuid <chr> "33157d9f-4a3a-468a-882b-60d3b10ea771", "33157d9f-4…
#> $ session <chr> "f1", "f1", "f1", "f1", "f1", "m1", "m1", "m1", "m1…
#> $ bundle <chr> "F1-0002-car-rep1-Naam-37", "F1-0002-car-rep1-baa-8…
#> $ start_item_id <int> 5, 5, 5, 5, 7, 4, 5, 7, 6, 7
#> $ end_item_id <int> 5, 5, 5, 5, 7, 4, 5, 7, 6, 7
#> $ level <chr> "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "O…
#> $ attribute <chr> "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "O…
#> $ start_item_seq_idx <int> 4, 4, 4, 4, 6, 3, 4, 6, 5, 6
#> $ end_item_seq_idx <int> 4, 4, 4, 4, 6, 3, 4, 6, 5, 6
#> $ type <chr> "SEGMENT", "SEGMENT", "SEGMENT", "SEGMENT", "SEGMEN…
#> $ sample_start <int> 31512, 13311, 21253, 64420, 39559, 45237, 34182, 56…
#> $ sample_end <int> 40965, 26959, 32332, 73442, 51380, 55596, 52655, 68…
#> $ sample_rate <int> 44100, 44100, 44100, 44100, 44100, 44100, 44100, 44…
There are a bunch of measures available as SSFF tracks for this database, as the following call will tell us:
emuR::list_ssffTrackDefinitions(raw)
#> name columnName fileExtension
#> 1 praatF0 pF0 pF0
#> 2 eggF0 pdF0 pdF0
#> 3 H1H2c H1H2c H1H2c
#> 4 H1A1c H1A1c H1A1c
#> 5 H1A3c H1A3c H1A3c
#> 6 CPP CPP CPP
#> 7 CQ_PH CQ_PH CQ_PH
#> 8 CQ_PD CQ_PD CQ_PD
#> 9 praatF1 pF1 pF1
#> 10 praatF2 pF2 pF2
#> 11 praatF3 pF3 pF3
Using import_ssfftracks() we can bulk extract all these measures from the segments in seg_list into a single data frame. I set proc=FALSE, because right now we just want to extract the raw measures. verbose is set to FALSE to avoid printing progress bars that look ugly on GitHub.
x <- import_ssfftracks(db_handle=raw, seg_list=seg_list, proc=FALSE, verbose=FALSE)
x
#> # A tibble: 2,632 × 31
#> sl_rowIdx labels start end db_uuid session bundle start…¹ end_i…² level
#> <int> <chr> <dbl> <dbl> <chr> <chr> <chr> <int> <int> <chr>
#> 1 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 2 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 3 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 4 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 5 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 6 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 7 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 8 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 9 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 10 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> # … with 2,622 more rows, 21 more variables: attribute <chr>,
#> # start_item_seq_idx <int>, end_item_seq_idx <int>, type <chr>,
#> # sample_start <int>, sample_end <int>, sample_rate <int>, times_orig <dbl>,
#> # times_rel <dbl>, times_norm <dbl>, praatF0 <dbl>, eggF0 <dbl>, H1H2c <dbl>,
#> # H1A1c <dbl>, H1A3c <dbl>, CPP <dbl>, CQ_PH <dbl>, CQ_PD <dbl>,
#> # praatF1 <dbl>, praatF2 <dbl>, praatF3 <dbl>, and abbreviated variable names
#> # ¹start_item_id, ²end_item_id
dplyr::glimpse(x)
#> Rows: 2,632
#> Columns: 31
#> $ sl_rowIdx <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
#> $ labels <chr> "op", "op", "op", "op", "op", "op", "op", "op", "op…
#> $ start <dbl> 714.5465, 714.5465, 714.5465, 714.5465, 714.5465, 7…
#> $ end <dbl> 928.9229, 928.9229, 928.9229, 928.9229, 928.9229, 9…
#> $ db_uuid <chr> "33157d9f-4a3a-468a-882b-60d3b10ea771", "33157d9f-4…
#> $ session <chr> "f1", "f1", "f1", "f1", "f1", "f1", "f1", "f1", "f1…
#> $ bundle <chr> "F1-0002-car-rep1-Naam-37", "F1-0002-car-rep1-Naam-…
#> $ start_item_id <int> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, …
#> $ end_item_id <int> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, …
#> $ level <chr> "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "O…
#> $ attribute <chr> "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "O…
#> $ start_item_seq_idx <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
#> $ end_item_seq_idx <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
#> $ type <chr> "SEGMENT", "SEGMENT", "SEGMENT", "SEGMENT", "SEGMEN…
#> $ sample_start <int> 31512, 31512, 31512, 31512, 31512, 31512, 31512, 31…
#> $ sample_end <int> 40965, 40965, 40965, 40965, 40965, 40965, 40965, 40…
#> $ sample_rate <int> 44100, 44100, 44100, 44100, 44100, 44100, 44100, 44…
#> $ times_orig <dbl> 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 7…
#> $ times_rel <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1…
#> $ times_norm <dbl> 0.000000000, 0.004694836, 0.009389671, 0.014084507,…
#> $ praatF0 <dbl> 216.523, 216.423, 216.719, 217.147, 217.576, 218.00…
#> $ eggF0 <dbl> 215.0127, 215.0127, 215.0127, 218.6272, 218.6272, 2…
#> $ H1H2c <dbl> 15.851, 16.093, 16.287, 16.513, 16.633, 16.495, 16.…
#> $ H1A1c <dbl> 19.376, 19.110, 18.868, 18.655, 18.433, 18.122, 17.…
#> $ H1A3c <dbl> 22.017, 21.680, 21.623, 21.852, 21.368, 20.545, 20.…
#> $ CPP <dbl> 19.499, 20.861, 24.094, 23.843, 25.368, 24.960, 25.…
#> $ CQ_PH <dbl> 0.7078310, 0.7078310, 0.7078310, 0.7105345, 0.71053…
#> $ CQ_PD <dbl> 0.7461344, 0.7461344, 0.7461344, 0.7482461, 0.74824…
#> $ praatF1 <dbl> 788.254, 788.753, 789.251, 789.750, 790.601, 792.51…
#> $ praatF2 <dbl> 1678.267, 1670.355, 1662.443, 1654.532, 1648.756, 1…
#> $ praatF3 <dbl> 3585.059, 3585.030, 3585.001, 3584.972, 3582.238, 3…
Neat! But if we skip proc=FALSE and set some more parameters, we can also do a bunch of preprocessing in the same step, such as by-speaker normalization and automated removal of outliers that fall outside three standard deviations from the mean within the same group. When the function is called, it will also print a message telling us how many outliers were removed from each track.
f0col='praatF0' specifies that F0 values are stored in the SSFF track praatF0. In this track, values of 0 are recoded as NA, and outliers are automatically removed afterwards.
f0dep='H1H2c' specifies that the track H1H2c (the difference between the first two harmonics) is directly dependent on the F0 measurements, so for each F0 measure coded as NA, the corresponding H1H2c value is also coded as NA.
fncol=c('praatF1', 'praatF2', 'praatF3') specifies that the available formant measures F1-F3 are stored in the SSFF tracks praatF1, praatF2, and praatF3. Outliers are automatically removed.
fndep=list(c('H1A1c', 'F1'), c('H1A3c', 'F3')) specifies that H1A1c is a spectral measure that directly depends on F1 (and F0), and that H1A3c is a spectral measure that directly depends on F3 (and F0). H1A1c values will be coded as NA if the corresponding F1 or F0 measure is NA, etc.
speaker='session' specifies that the speaker information in the seg_list data frame is stored in the column labeled session. This is used for by-speaker normalization.
group_var='session' specifies that the column session in seg_list should be used for determining which tokens should be automatically removed; only tokens that are more than three standard deviations from the mean within that group are removed. Several columns can be given, e.g. group_var=c('speaker', 'vowel') for a segment list that has the columns speaker and vowel.
timing_rm=list('cl', 250) specifies that F0 measurements that are more than 250 ms removed from a cl label in the data should be removed.
outlier_rm='eggF0' specifies that, in addition to the automated outlier procedures that have already been applied, outliers should also be automatically removed from the SSFF track eggF0.
y <- import_ssfftracks(db_handle=raw, seg_list=seg_list,
f0col='praatF0', f0dep='H1H2c',
fncol=c('praatF1', 'praatF2', 'praatF3'),
fndep=list(c('H1A1c', 'F1'), c('H1A3c', 'F3')),
speaker='session', group_var='session',
timing_rm=list('cl', 250), outlier_rm='eggF0',
verbose=FALSE)
#> [1] "Initial number of NAs in F0 track: 114"
#> [1] "Number of NAs removed from F0 track during automated outlier removal: 0"
#> [1] "Number of NAs removed from H1H2c track during automated outlier removal: 114"
#> [1] "Number of NAs removed from F1 track during automated outlier removal: 0"
#> [1] "Number of NAs removed from F2 track during automated outlier removal: 1"
#> [1] "Number of NAs removed from F3 track during automated outlier removal: 6"
#> [1] "Number of NAs removed from H1A1c track during automated outlier removal: 114"
#> [1] "Number of NAs removed from H1A3c track during automated outlier removal: 114"
#> [1] "Number of NAs removed from eggF0 track during automated outlier removal: 26"
y
#> # A tibble: 2,632 × 61
#> sl_rowIdx labels start end db_uuid session bundle start…¹ end_i…² level
#> <int> <chr> <dbl> <dbl> <chr> <chr> <chr> <int> <int> <chr>
#> 1 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 2 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 3 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 4 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 5 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 6 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 7 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 8 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 9 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> 10 1 op 715. 929. 33157d9f-4… f1 F1-00… 5 5 ORL
#> # … with 2,622 more rows, 51 more variables: attribute <chr>,
#> # start_item_seq_idx <int>, end_item_seq_idx <int>, type <chr>,
#> # sample_start <int>, sample_end <int>, sample_rate <int>, times_orig <dbl>,
#> # times_rel <dbl>, times_norm <dbl>, eggF0 <dbl>, H1H2c <dbl>, H1A1c <dbl>,
#> # H1A3c <dbl>, CPP <dbl>, CQ_PH <dbl>, CQ_PD <dbl>, F0 <dbl>, uppF0 <dbl>,
#> # lowF0 <dbl>, zF0 <dbl>, normF0 <dbl>, zH1H2c <dbl>, normH1H2c <dbl>,
#> # F1 <dbl>, uppF1 <dbl>, lowF1 <dbl>, zF1 <dbl>, normF1 <dbl>, F2 <dbl>, …
dplyr::glimpse(y)
#> Rows: 2,632
#> Columns: 61
#> $ sl_rowIdx <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
#> $ labels <chr> "op", "op", "op", "op", "op", "op", "op", "op", "op…
#> $ start <dbl> 714.5465, 714.5465, 714.5465, 714.5465, 714.5465, 7…
#> $ end <dbl> 928.9229, 928.9229, 928.9229, 928.9229, 928.9229, 9…
#> $ db_uuid <chr> "33157d9f-4a3a-468a-882b-60d3b10ea771", "33157d9f-4…
#> $ session <chr> "f1", "f1", "f1", "f1", "f1", "f1", "f1", "f1", "f1…
#> $ bundle <chr> "F1-0002-car-rep1-Naam-37", "F1-0002-car-rep1-Naam-…
#> $ start_item_id <int> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, …
#> $ end_item_id <int> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, …
#> $ level <chr> "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "O…
#> $ attribute <chr> "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "ORL", "O…
#> $ start_item_seq_idx <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
#> $ end_item_seq_idx <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
#> $ type <chr> "SEGMENT", "SEGMENT", "SEGMENT", "SEGMENT", "SEGMEN…
#> $ sample_start <int> 31512, 31512, 31512, 31512, 31512, 31512, 31512, 31…
#> $ sample_end <int> 40965, 40965, 40965, 40965, 40965, 40965, 40965, 40…
#> $ sample_rate <int> 44100, 44100, 44100, 44100, 44100, 44100, 44100, 44…
#> $ times_orig <dbl> 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 7…
#> $ times_rel <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1…
#> $ times_norm <dbl> 0.000000000, 0.004694836, 0.009389671, 0.014084507,…
#> $ eggF0 <dbl> 215.0127, 215.0127, 215.0127, 218.6272, 218.6272, 2…
#> $ H1H2c <dbl> 15.851, 16.093, 16.287, 16.513, 16.633, 16.495, 16.…
#> $ H1A1c <dbl> 19.376, 19.110, 18.868, 18.655, 18.433, 18.122, 17.…
#> $ H1A3c <dbl> 22.017, 21.680, 21.623, 21.852, 21.368, 20.545, 20.…
#> $ CPP <dbl> 19.499, 20.861, 24.094, 23.843, 25.368, 24.960, 25.…
#> $ CQ_PH <dbl> 0.7078310, 0.7078310, 0.7078310, 0.7105345, 0.71053…
#> $ CQ_PD <dbl> 0.7461344, 0.7461344, 0.7461344, 0.7482461, 0.74824…
#> $ F0 <dbl> 216.523, 216.423, 216.719, 217.147, 217.576, 218.00…
#> $ uppF0 <dbl> 294.5784, 294.5784, 294.5784, 294.5784, 294.5784, 2…
#> $ lowF0 <dbl> 123.5667, 123.5667, 123.5667, 123.5667, 123.5667, 1…
#> $ zF0 <dbl> 0.2614007, 0.2578924, 0.2682773, 0.2832942, 0.29834…
#> $ normF0 <dbl> 177.3753, 177.2133, 177.6928, 178.3861, 179.0810, 1…
#> $ zH1H2c <dbl> 1.738322, 1.796663, 1.843433, 1.897917, 1.926846, 1…
#> $ normH1H2c <dbl> 13.95344, 14.31125, 14.59808, 14.93223, 15.10965, 1…
#> $ F1 <dbl> 788.254, 788.753, 789.251, 789.750, 790.601, 792.51…
#> $ uppF1 <dbl> 1298.965, 1298.965, 1298.965, 1298.965, 1298.965, 1…
#> $ lowF1 <dbl> 351.1913, 351.1913, 351.1913, 351.1913, 351.1913, 3…
#> $ zF1 <dbl> -0.233120763, -0.229962020, -0.226809459, -0.223650…
#> $ normF1 <dbl> 745.3951, 745.9944, 746.5925, 747.1918, 748.2139, 7…
#> $ F2 <dbl> 1678.267, 1670.355, 1662.443, 1654.532, 1648.756, 1…
#> $ uppF2 <dbl> 2479.633, 2479.633, 2479.633, 2479.633, 2479.633, 2…
#> $ lowF2 <dbl> 683.3898, 683.3898, 683.3898, 683.3898, 683.3898, 6…
#> $ zF2 <dbl> 0.3231926, 0.2967641, 0.2703357, 0.2439105, 0.22461…
#> $ normF2 <dbl> 1515.805, 1506.660, 1497.515, 1488.371, 1481.694, 1…
#> $ F3 <dbl> 3585.059, 3585.030, 3585.001, 3584.972, 3582.238, 3…
#> $ uppF3 <dbl> 3842.108, 3842.108, 3842.108, 3842.108, 3842.108, 3…
#> $ lowF3 <dbl> 2429.872, 2429.872, 2429.872, 2429.872, 2429.872, 2…
#> $ zF3 <dbl> 1.9843558, 1.9842259, 1.9840961, 1.9839662, 1.97174…
#> $ normF3 <dbl> 3868.009, 3867.929, 3867.848, 3867.767, 3860.190, 3…
#> $ zH1A1c <dbl> 0.5591552574, 0.4962194196, 0.4389616163, 0.3885655…
#> $ normH1A1c <dbl> 16.50144, 16.07416, 15.68542, 15.34326, 14.98665, 1…
#> $ zH1A3c <dbl> 0.68503828, 0.63057184, 0.62135923, 0.65837058, 0.5…
#> $ normH1A3c <dbl> 18.53489, 17.99424, 17.90279, 18.27017, 17.49369, 1…
#> $ uppeggF0 <dbl> 348.6988, 348.6988, 348.6988, 348.6988, 348.6988, 3…
#> $ loweggF0 <dbl> 43.39191, 43.39191, 43.39191, 43.39191, 43.39191, 4…
#> $ zCPP <dbl> -0.228769020, 0.007061036, 0.566854573, 0.523394048…
#> $ normCPP <dbl> 20.37302, 21.71946, 24.91550, 24.66737, 26.17494, 2…
#> $ zCQ_PH <dbl> 0.4689086, 0.4689086, 0.4689086, 0.4873716, 0.48737…
#> $ normCQ_PH <dbl> 0.6190917, 0.6190917, 0.6190917, 0.6217139, 0.62171…
#> $ zCQ_PD <dbl> 0.5354962, 0.5354962, 0.5354962, 0.5508538, 0.55085…
#> $ normCQ_PD <dbl> 0.6558349, 0.6558349, 0.6558349, 0.6579906, 0.65799…
Notice that the praatF0, praatF1 columns etc. have been renamed to F0, F1, etc. Notice also that each SSFF track has a corresponding column with z-score normalized values (e.g. zF1) and a corresponding column where these normalized values have been rescaled based on the overall mean and standard deviation of the data (e.g. normF1).
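The relationship between the raw, z-scored, and rescaled columns can be sketched in base R as follows. This is a minimal illustration on made-up values, assuming that the rescaling multiplies the by-speaker z-scores by the overall standard deviation and adds the overall mean, as the description above suggests; the package's own implementation may differ in detail.

```r
# Toy data (made-up values): two speakers with different F0 ranges
d <- data.frame(
  speaker = rep(c("f1", "m1"), each = 4),
  F0 = c(210, 220, 230, 240, 100, 110, 120, 130)
)

# z-score within speaker; ave() applies the function separately per group
d$zF0 <- ave(d$F0, d$speaker,
             FUN = function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE))

# Rescale the z-scores to the original unit using the overall mean and SD
# (assumption: this mirrors the prose description, not the package source)
d$normF0 <- d$zF0 * sd(d$F0, na.rm = TRUE) + mean(d$F0, na.rm = TRUE)
d
```

In this toy example both speakers have the same distribution shape, so their rescaled values coincide even though their raw F0 ranges do not overlap.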
import_ssfftracks() is very dependent on emuR and EMU-SDMS, but it incorporates several independent functions which can in principle be used on raw data generated with other software: f0_proc() for processing F0 and dependencies, fn_proc() for processing formants and dependencies, outlier_rm() for automated removal of outliers, and normz() for z-score normalizing and rescaling by speaker. The syntax of these functions is similar to that of import_ssfftracks().
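To give a rough idea of what these processing steps amount to on a plain data frame, here is a base R sketch on made-up data. It does not call the package functions and is not their implementation; the helper rm_outliers() and all column names and values are hypothetical.

```r
# Toy data (made-up values); 0 in f0 marks frames where no F0 was found
d <- data.frame(
  speaker = rep("s1", 20),
  f0 = c(0, rep(c(198, 200, 202), 6), 600),
  H1H2c = seq(4, 5.9, length.out = 20)
)

# 1. Recode F0 values of 0 as NA
d$f0[d$f0 == 0] <- NA

# 2. Recode values more than three SDs from the group mean as NA
#    (rm_outliers() is a hypothetical helper)
rm_outliers <- function(x, n_sd = 3) {
  z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  x[!is.na(z) & abs(z) > n_sd] <- NA
  x
}
d$f0 <- ave(d$f0, d$speaker, FUN = rm_outliers)

# 3. Propagate missing F0 values to the track that depends on F0
d$H1H2c[is.na(d$f0)] <- NA
d
```

After these steps, the first frame (originally 0) and the last frame (the 600 Hz outlier) are NA in both columns.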
If the output of PraatSauce is loaded into R, it will look roughly like this:
dplyr::glimpse(ps)
#> Rows: 3,546
#> Columns: 14
#> $ Filename <chr> "F1-0002-car-rep1-Naam-37", "F1-0002-car-rep1-Naam-37", "F1-…
#> $ session <chr> "f1", "f1", "f1", "f1", "f1", "f1", "f1", "f1", "f1", "f1", …
#> $ seg_Start <dbl> 0.5919388, 0.5919388, 0.5919388, 0.5919388, 0.5919388, 0.591…
#> $ seg_End <dbl> 0.7145465, 0.7145465, 0.7145465, 0.7145465, 0.7145465, 0.714…
#> $ t <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1…
#> $ t_ms <dbl> 0.5929388, 0.5939388, 0.5949388, 0.5959388, 0.5969388, 0.597…
#> $ f0 <dbl> 184.974, 184.847, 184.720, 184.593, 184.466, 184.444, 184.45…
#> $ F1 <dbl> 658.327, 650.616, 640.280, 622.049, 603.819, 585.588, 567.35…
#> $ F2 <dbl> 1127.051, 1114.348, 1103.061, 1096.036, 1089.011, 1081.986, …
#> $ F3 <dbl> 3382.894, 3396.817, 3392.458, 3333.087, 3273.716, 3214.345, …
#> $ CPP <dbl> 18.038, 19.296, 19.707, 19.974, 20.051, 21.084, 20.277, 19.0…
#> $ H1H2c <dbl> 8.541, 8.694, 8.880, 9.044, 9.029, 8.866, 8.593, 8.272, 7.90…
#> $ H1A1c <dbl> 16.492, 16.159, 15.834, 15.492, 15.206, 15.038, 14.897, 14.8…
#> $ H1A3c <dbl> 7.896, 7.166, 5.750, 3.910, 2.540, 1.576, 4.947, 5.527, 8.22…
In order to add this to an existing EMU database with no SSFF tracks, you can use praatsauce2ssff() like so:
datapath_ps <- system.file('extdata/ps', package='emuhelpeR')
ps_db <- emuR::load_emuDB(datapath_ps)
praatsauce2ssff(ps_output=ps, db_handle=ps_db, session_col='session')
Note that the session_col argument is only necessary if there are multiple sessions in the database.
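If the PraatSauce output does not already contain a session column, one can sometimes be derived from the file names. The sketch below assumes, purely for illustration, that the part of Filename before the first hyphen identifies the session once lower-cased; the data frame ps_toy, its values, and the second file name are made up, and your own naming scheme may well be different.

```r
# Toy PraatSauce-like output without a session column (made-up rows)
ps_toy <- data.frame(
  Filename = c("F1-0002-car-rep1-Naam-37", "M1-0001-example"),
  f0 = c(184.974, 110.5)
)

# Assumption: the text before the first hyphen, lower-cased, names the session
ps_toy$session <- tolower(sub("-.*$", "", ps_toy$Filename))
ps_toy
```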
Subsequently, you can have a look at the SSFF tracks, such as the F0 track, in EMU by running e.g. the following:
sco <- emuR::get_signalCanvasesOrder(ps_db, 'default')
emuR::set_signalCanvasesOrder(ps_db, 'default', c(sco, 'f0'))
emuR::serve(ps_db)
You can install the development version of emuhelpeR from GitHub with:
#install.packages("devtools")
devtools::install_github("rpuggaardrode/emuhelpeR")
Kirby, James, Marc Brunelle & Pittayawat Pittayaporn (2023) Transphonologization of onset voicing: Revisiting Northern and Eastern Kmhmu. Phonetica. DOI: 10.1515/phon-2022-0029.