Digitised comparative word list of Malay, Nias, Toba-Batak, and Enggano in Modigliani’s “L’isola Delle Donne” from 1894
Gede Primahadi Wijaya Rajeg
University of Oxford/CIRHSS, Udayana University
This work is funded by the Arts and Humanities Research Council (AHRC) (Grant ID: AH/W007290/1, “Lexical resources for Enggano, a threatened language of Indonesia”), led by the Faculty of Linguistics, Philology and Phonetics at the University of Oxford, UK. Visit the central webpage of the Enggano project.
Digitised, annotated comparative word list in Modigliani’s “L’isola delle donne” from 1894 by Gede Primahadi W. Rajeg is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
Please cite the source of the data set (Modigliani 1894) (in APA 7th style) and the particular version of this repository (Rajeg 2025) (in DataCite style) as follows:
Modigliani, E. (1894). L’isola delle donne. Ulrico Hoepli. https://www.google.co.uk/books/edition/L_isola_delle_donne/gksCAAAAMAAJ?hl=en&gbpv=0
Rajeg, Gede Primahadi Wijaya (2025). Digitised comparative word list of Malay, Nias, Toba-Batak, and Enggano in Modigliani’s “L’isola Delle Donne” from 1894. University of Oxford. Dataset. https://doi.org/10.25446/oxford.28330022.v1
For future updates and versions of record, please check the Releases page of this GitHub repository and its Zenodo archive.
The data-source directory contains the original data as an .xlsx file that the first author hand-digitised from the original source (Modigliani 1894). The light annotation reflects the content of the original source and covers two aspects. First, any part of a string printed in italics in the original source is marked with the XML tag <i> so that it can be traced computationally. Second, remarks given for a language column in the original source are annotated with <rm...>, and notes on aspects of meaning with <sem...>. These annotations are retained in the WORD column of the long-table format in data-output; the WORD2 column excludes them, and they are transferred to the REMARK column of the long-table format. In the wide-table format of data-output, the language columns named ..._1 do not contain these annotations, which have been moved to separate columns with the ..._rm and ..._sem labels. The REMARK column in both data-output formats also contains annotations that the first author added while transcribing from the source: a cell beginning with M-- is a comment on the Malay column, while one beginning with I-- is a comment on the Italian column (the reference language).
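The annotation scheme described above can be processed with a few regular expressions. The following is only a minimal Python sketch and not part of the repository. It assumes that italics are wrapped as <i>...</i> and that each <rm...> or <sem...> annotation is a single tag whose content runs up to the next ">". The exact tag syntax is not spelled out here, so check it against the actual files. The function names and the example strings are hypothetical.

```python
import re

# Assumed tag shapes (not confirmed by the README): <i>...</i> wraps italic text;
# <rm...> and <sem...> are single tags whose content ends at the next ">".
ITALIC_TAG = re.compile(r"</?i>")
NOTE_TAG = re.compile(r"<(rm|sem)([^>]*)>")


def split_annotations(cell):
    """Return (plain_word, notes) for one annotated cell such as a WORD value."""
    # Collect each remark/meaning annotation as a (kind, content) pair.
    notes = [(m.group(1), m.group(2).strip(" :=")) for m in NOTE_TAG.finditer(cell)]
    # Remove the annotations, then the italic markers, then tidy the whitespace.
    plain = NOTE_TAG.sub("", cell)
    plain = ITALIC_TAG.sub("", plain)
    plain = " ".join(plain.split())
    return plain, notes


def remark_target(remark):
    """Map a REMARK cell to the column it comments on, using the M--/I-- prefixes."""
    if remark.startswith("M--"):
        return "Malay"
    if remark.startswith("I--"):
        return "Italian"
    return None  # no transcriber prefix, e.g. a transferred <rm...>/<sem...> note


# Invented example strings, for illustration only.
word, notes = split_annotations("<i>ba</i>gu <rm: uncertain>")
print(word, notes)
print(remark_target("M-- spelling differs in the source"))
```

Both functions expect plain strings, so empty or missing cells read from the .xlsx or data-output files would need to be filtered out or converted first.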
Information concerning the orthography standardisation is available on the README page of the ortho directory.
Modigliani, Elio. 1894. L’isola Delle Donne. Milano: Ulrico Hoepli. https://www.google.co.uk/books/edition/L_isola_delle_donne/gksCAAAAMAAJ?hl=en&gbpv=0.
Rajeg, Gede Primahadi Wijaya. 2025. “Digitised Comparative Word List of Malay, Nias, Toba-Batak, and Enggano in Modigliani’s ‘L’isola Delle Donne’ from 1894.” Dataset. University of Oxford. https://doi.org/10.25446/oxford.28330022.v1.