CEES fieldwork in 2016

Interviews on the history of Estonian folkloristics (Pille Kippar, Tiiu Jaago, and others) – Mari Sarv, Meelis Roll

Petserimaa: language and folklore – Andreas Kalkun (University of Tartu project)

Estonians from Krasnojarski krai in Estonia – Anu Korb

Matsalu and Vilsandi national parks, project on the memory landscapes of national parks: 03-08.07.2016 – Valdo Valper, Lona Päll, Jüri Metssalu, Mari-Ann Remmel

Nabala in Jüri parish, plus Loksa and Animägi: 15-16.06, 29-30.06, 27-28.08, 11.09, 16.09 – Mari-Ann Remmel, Lona Päll

Inventory of natural sacred sites of historical Harjumaa in Rapla and Hageri parishes, based on archival sources and commissioned by the National Heritage Board (Muinsuskaitseamet), October-December – Jüri Metssalu

Ethnobotanical fieldwork: Saaremaa, Ukraine, Belarus

Preparatory work for collecting kindergarten and school lore, pilot surveys

Lore and language of minorities, in collaboration with CEES researchers and foreign partners (Belarus, Udmurt Republic, and others).

„Balkan and Baltic Holiness – Modern Religiosity and National Identity“ – South-eastern Bulgaria, selected sites in Estonia

„Creativity and tradition in Estonian and Polish cultural communication“ – fieldwork among various communities in Estonia and Poland

Baltic Archives of the Swedish National Archives in Stockholm (Riksarkivet, Baltiska Arkivet, personal archive of Juhan Aavik) – Triinu Ojamaa

CEES fieldwork: Estonian Folklore Archives 2015-2023

2015

● Fieldwork recordings on childhood memories for research on the interrelations of individuals and communities in heritage creation and transmission

2016

● Fieldwork interviews on the history of Estonian folkloristics (Pille Kippar, Tiiu Jaago, and others) – Mari Sarv, Meelis Roll

● Fieldwork in Petserimaa, Russia, to document Seto language and folklore in collaboration with the University of Tartu – Andreas Kalkun

● Fieldwork among Estonians born in Krasnojarski krai, Siberia – Anu Korb

● Fieldwork expedition in Matsalu and Vilsandi national parks to document place lore and memory landscapes – Valdo Valper, Lona Päll, Jüri Metssalu, Mari-Ann Remmel

● Fieldwork on place lore and natural sacred sites in Northern Estonia on the basis of archival materials – Mari-Ann Remmel, Lona Päll, Jüri Metssalu

● Life story collecting campaign „EV 100. My life and love“ – Rutt Hinrikus, Astrid Tuisk

2017

● Fieldwork among Estonians born in Estonian communities in Russia – Anu Korb

● Fieldwork on the history of Estonian folkloristics, Estonia – Mari Sarv

● Fieldwork with Seto singers in different places in Estonia – Janika Oras

● Fieldwork in Northern Udmurtia, Russian Federation – Janika Oras

● Fieldwork in Pärnumaa, Estonia – Ingrid Rüütel, Helen Kõmmus

● Fieldwork in Läänemaa, Estonia – Mari Sarv, Marek Saumann

● Fieldwork on narrative folklore, Setomaa, Estonia – Risto Järv, Piret Päär

● Fieldwork at the Folk Music Festival in Viljandi – Helen Kõmmus, Taive Särg

● Place lore fieldwork in different places in Estonia – Mari-Ann Remmel and the place lore work group

● Collecting campaign for family name stories, 2017-2018 – Risto Järv, Mari-Ann Remmel, Valdo Valper

● Collecting campaign on contacts between humans and animals – Mall Hiiemäe

2018

● Collecting campaign in collaboration with the Environmental Board of Estonia to document Christmas customs in Western Estonia – Mari-Ann Remmel

● Youth collecting campaign „My grandmother’s story“ – Mari Sarv

● Fieldwork in Harjumaa, Estonia, on sacred natural sites – Valdo Valper, Jüri Metssalu

● Fieldwork among Seto groups in Krasnojarsk, Russia – Andreas Kalkun

● Fieldwork among Ingrians and Votians in the Ingrian region, Russia, on traditional singing, sacred places, and village life – Mari Sarv, Janika Oras

● Fieldwork at the Folk Music Festival in Viljandi on festival traditions, informal performances, and singing occasions – Mari Sarv, Kätlin Luht

● Fieldwork in Setomaa, Estonia, on song culture and feast traditions – Janika Oras

● Fieldwork in Pärnu, Estonia, on singing traditions and the organization of culture in Soviet times – Mari Sarv

● Fieldwork recording in Kuremaa Mill, Estonia on mills and millers – Astrid Tuisk, Janno Simm, Mairi Kaasik, Inge Annom

2019

● Collecting campaign „My garden“ – Anu Korb, Maarja Hollo

● Collecting campaign „Things in our travels“ – Tiiu Jaago, Astrid Tuisk

● Life stories collecting campaign „Tartu in my life“ – Astrid Tuisk, Anu Korb

● Multi-channel recording of Seto multipart traditional singing, fieldwork in Põlva and Setomaa, Estonia – Janika Oras, Žanna Pärtlas, Jaan Tamm

● Fieldwork in Setomaa on singing choirs, traditional festivals, and singers – Janika Oras, Celia Roose

● Fieldwork at the runosong interpretation festival „Regi vägi“ in Keila, Estonia – Taive Särg

● Place lore fieldwork in Harjumaa, Estonia – Jüri Metssalu

● Fieldwork in the Estonian settlement of Simititsa in Ingria, Russia – Astrid Tuisk

● Fieldwork in Western Estonia at the memorial event for the local builder Adam Kellmann – Mari Sarv

● Fieldwork on place lore and natural sacred sites in Western Estonia and Northern Estonia – Mari-Ann Remmel, Lona Päll, Valdo Valper, Jüri Metssalu, Kristel Kivari

● Fieldwork recording at the Pilistvere care home – Mari Sarv, Risto Järv, Piret Päär

● Fieldwork at the runosong feast in Western Estonia – Taive Särg, Helen Kõmmus, Kadri Tamm

● Fieldwork at the international folklore festival Baltica – Mari Sarv

● Fieldwork in Mulgimaa on the contemporary family singing tradition – Taive Särg, Inge Annom, Olga Ivaškevitš

2020

● Fieldwork in Pärnu, Estonia, to record the older traditions of Saaremaa – Taive Särg, Janno Simm

● Collecting campaign „Violence in Estonian culture“ – Mari Sarv

● Fieldwork recording at the Seto singing competition organized on the occasion of Shrove Tuesday in Tartu – Janika Oras

● Collecting campaign on corona traditions and health care during the corona period – Astrid Tuisk, Ave Goršič

● Field recording of the Forest singing feast in Hüpassaare, Estonia – Janika Oras, Taive Särg

● Fieldwork documenting the condition of sacred natural sites in Harjumaa, Pärnumaa, and Hiiumaa, Estonia – Mari-Ann Remmel, Valdo Valper, Jüri Metssalu

● Fieldwork expedition in Mõisaküla, Estonia, on childhood memories, children’s games, and vernacular pedagogy – Astrid Tuisk, Mari Sarv, Janika Oras, Kadri Tamm, Kristi Metste. The expedition was related to the work of the CEES work groups on historical cultural practices and biographics.

● Fieldwork in and around Tallinn, Estonia, on games and memories of youth among people born in Estonian villages in Siberia – Anu Korb

● Fieldwork on place lore in Lahemaa National Park – Jüri Metssalu

● Fieldwork on Kihnu island, Estonia, a UNESCO cultural heritage region – Janika Oras

● Fieldwork in Setomaa, Estonia, on singing and instrumental music, and folk calendar customs – Janika Oras

● Fieldwork interview with the folk singer Marika Oja in Tallinn, Estonia – Janika Oras

● Fieldwork interviews on place lore in Kose, Estonia – Jüri Metssalu

● Collecting campaign on counting rhymes in Virumaa, Estonia, in collaboration with the Viru Institute – Mall Hiiemäe

2021

● Collecting campaign and fieldwork on changing and adapting family traditions during the coronavirus period – Olga Ivaškevitš

● Fieldwork expedition in Iisaku, Estonia, on childhood memories, interethnic relations, and calendar customs – Astrid Tuisk, Mari Sarv, Kadri Tamm, Liina Saarlo

● Fieldwork recordings at the free singing events of the Seto singing camp, Setomaa, Estonia – Janika Oras, Andreas Kalkun

● Fieldwork expedition in Setomaa in collaboration with linguists from the University of Tartu – Andreas Kalkun

● Fieldwork documenting the condition of sacred natural sites in Pärnumaa – Mari-Ann Remmel, Kristel Kivari

2022

● Collecting campaign and fieldwork recordings on the topic „Music in my life“ – Taive Särg, Janika Oras, Helen Kõmmus, Natali Ponetajev, Liina Saarlo

● Fieldwork documenting the condition of sacred natural sites in Pärnumaa and Valgamaa – Mari-Ann Remmel, Kristel Kivari, Reeli Reinaus, Valdo Valper

● Fieldwork expeditions in the planned Härgla limestone mining area to document the cultural heritage of the area and local people’s relations with the landscape – Jüri Metssalu

● Fieldwork on changing and adapting family traditions during the coronavirus period – Olga Ivaškevitš

● Fieldwork in Setomaa, Estonia, on personal musical biographies, the Seto singing tradition, and Seto festivities; fieldwork recordings of the improvisation contest and traditional feast in Saatse – Janika Oras

● Fieldwork expedition in Karula, Estonia, to collect information and photographs on singers and folklore collectors in preparation for the academic runosong publication „Vana Kannel“. The aim of the fieldwork was to collect information on musical self-expression, entertainment, hobby activities, and children’s games in the 1940s and 1950s. The fieldwork is related to the research of the CEES work group on historical cultural practices – Liina Saarlo, Mari Sarv, Enn-Kalev Tarto, Kadri Tamm, Andreas Kalkun, Janika Oras, Valdo Valper, Astrid Tuisk, Kärri Toomeos-Orglaan, Mathilda Matjus, Margit Kooser

● Collecting campaign „Letters in my life“ – Maarja Hollo, Astrid Tuisk

● Fieldwork recording on the singing traditions of student organizations – Taive Särg, Janno Simm, Kadri Vider

● Social media collecting campaign on war memes: creating, introducing, and managing the Facebook group „Ukraina meemid“ and developing archiving strategies. The collecting is related to the work of the CEES work group on contemporary culture and media studies.

Development of speech corpora at the language technology laboratory of Tallinn University of Technology

Einar Meister

2016: 20 hours of various speech recordings (oral presentations, interviews) were transcribed, and approx. 15 hours of recordings from the Estonian L2 Corpus and the Estonian Adolescent Corpus were manually segmented and annotated.

2017: 10 hours of radio interviews and 10 hours of the Riigikogu (Parliament of Estonia) recordings were transcribed and approx. 9 hours of recordings of the Estonian L2 Corpus were segmented and annotated manually.

2018: Several speech corpora were further developed: the Estonian L2 Corpus (51 hours), the Estonian Adolescent Corpus (72 hours), and the speech recognition training corpus (237 hours; includes recordings of news, interviews, talk shows, lectures, conference presentations, and the Riigikogu (Parliament of Estonia)).

2019: Spontaneous speech of approx. 50 speakers (aged 60-90 years) was recorded and transcribed for the Elderly Speech Corpus, averaging about 25 minutes per speaker.
For the Meeting Corpus, 9 recordings were collected, with 5–8 participants in each recording, each recording lasting 50–130 minutes.
10 hours of podcast recordings containing a lot of spontaneous speech and English words were also collected and transcribed.

2020: A speech corpus VoxLingua107, collected semi-automatically from YouTube, was created for spoken language identification; the corpus contains speech samples of 107 languages, with an average of 62 hours per language.
A new speech corpus ERR2020 was collected, which contains 389 hours of TV and radio broadcasts from the ERR archive with manually created transcriptions.
The development of the Elderly Speech Corpus and the Meeting Corpus continued. The spontaneous speech of 100 speakers (aged 60–90 years) was recorded and transcribed, and approx. 14 hours of meeting recordings were collected.

2021: Development of the Elderly Speech Corpus and the Meeting Corpus continued. The spontaneous speech of 25 speakers (aged 60–90 years) was recorded and transcribed, and approx. 50 hours of meeting recordings were collected.

2022: The development of the Elderly Speech Corpus was completed. Spontaneous speech samples from 25 speakers (aged 60–90 years) were recorded and transcribed. In total, the Elderly Speech Corpus contains transcribed recordings of spontaneous speech from 200 speakers (100 men, 100 women); the volume of the corpus is about 72 hours.
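As a quick consistency check on the figures reported for the Elderly Speech Corpus, the yearly speaker counts can be summed and compared with the stated total. The sketch below is only illustrative: the variable names are hypothetical, the 2019 count is given in the text as approximate, and the corpus volume is stated as "about 72 hours", so the derived per-speaker average is a rough estimate rather than a reported result.

```python
# Speakers recorded per year for the Elderly Speech Corpus, as reported in the
# yearly summaries. The 2019 figure is approximate ("approx. 50") in the text.
speakers_per_year = {2019: 50, 2020: 100, 2021: 25, 2022: 25}

# Sum of the yearly counts; the report states a total of 200 speakers.
total_speakers = sum(speakers_per_year.values())

# Reported corpus volume ("about 72 hours"), converted to minutes per speaker.
corpus_hours = 72
avg_minutes_per_speaker = corpus_hours * 60 / total_speakers

print(f"Total speakers: {total_speakers}")
print(f"Average speech per speaker: {avg_minutes_per_speaker:.1f} minutes")
```

The yearly counts add up to the stated total of 200 speakers, and the derived average is of the same order as the roughly 25 minutes per speaker mentioned for the 2019 recordings.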

All speech corpora developed at the language technology lab are used to train acoustic models for various speech recognition applications.
The Estonian Adolescent Corpus, the Estonian L2 Corpus and the Elderly Speech Corpus have also been used for phonetic studies.

The speech corpora have been created as part of the projects "Speech recognition 2" (01.01.2015–31.12.2017, project number: EKT87) and "Speech recognition" (01.01.2018–30.06.2023, project number: EKTB24) funded by the national program for Estonian language technology.

The articles dealing directly with corpus development:
Valk, Jörgen; Alumäe, Tanel (2021). VoxLingua107: A dataset for spoken language recognition. 2021 IEEE Spoken Language Technology Workshop (SLT), SLT 2021: Proceedings, January 19-22, 2021, Online Conference. Piscataway, NJ: IEEE, 652−658. DOI: 10.1109/SLT48900.2021.9383459.

Meister, Einar; Meister, Lya (2022). Estonian elderly speech corpus – design, collection and preliminary acoustic analysis. Baltic Journal of Modern Computing, 10 (3), 360−371. DOI: 10.22364/bjmc.2022.10.3.09.

Besides these, there are numerous articles on speech recognition development and phonetics research, where different corpora have been used.

Fieldwork for the project IUT35-1 "Speech styles, sentence prosody and phonological variation: description, theory and modelling"

Liisi Piits

Reading experiments to study phonological variation.

2015: Pilot recordings of the reading experiment carried out in Tallinn

Article by Kalvik, Mari-Liis, Liisi Piits, 2015, Lugemiseksperiment fonoloogilise varieerumise uurimiseks. – Eesti ja soome-ugri keeleteaduse ajakiri. Journal of Estonian and Finno-Ugric Linguistics 6 (3), 49−77. https://doi.org/10.12697/jeful.2015.6.3.02

2016: Recordings in Tallinn

2017: Fieldwork for recordings in Kolga, Kiviõli, Toila, Jõhvi, Karksi-Nuia, Võru, Palamuse, Alatskivi, and Tartu.

Articles published:

Kalvik, Mari-Liis, Liisi Piits, 2017, Varieeruva vältega sõnad: häälduseelistused ja määramisraskused. – Mäetagused 68, 83−100. https://doi.org/10.7592/Mt2017.68.

Piits, Liisi, Mari-Liis Kalvik 2017. Varieeruva vältega sõnade hääldusuuringud kõnesünteesi teenistuses. – Eesti Rakenduslingvistika Ühingu aastaraamat 13. Eds. Margit Langemets, Maria-Maren Linkgreim, Helle Metslang. Tallinn: Eesti Rakenduslingvistika Ühing, 123−140. http://dx.doi.org/10.5128/ErYa13.08

2018-2021: Compilation and organisation of data, corpus annotation.

The corpus includes reading experiment recordings from 191 informants. All data is automatically segmented at the word and phoneme levels. In addition, for half of the data the segmentation was manually corrected and additional annotation levels were created for each phenomenon (quantity degree, word-initial h, and palatalization). The corpus can be used for research, see Kalvik, Mari-Liis, Liisi Piits, 2021, Speech Corpus of Phonological Variation. https://doi.org/10.15155/3-00-0000-0000-0000-08971L.

Articles published:

Kalvik, Mari-Liis, Liisi Piits 2019. Sõna esinemissagedus ja tähenduste eristamise vajadus häälduse mõjutajana. – Eesti ja soome-ugri keeleteaduse ajakiri. Journal of Estonian and Finno-Ugric Linguistics 10 (1), 71−88. https://doi.org/10.12697/jeful.2019.10.1.04

Piits, Liisi, Mari-Liis Kalvik 2019. Palatalisatsioonist ühesilbilistes i-tüvelistes pika vokaaliga sõnades. Roos närtsis, sest vaas oli tühi. – Keel ja Kirjandus 7, 513−533.

Piits, Liisi; Kalvik, Mari-Liis 2021. Fonoloogiline varieerumine eesti keeles kolme nähtuse näitel. Emakeele Seltsi aastaraamat, 66 (1), 177−201. DOI: 10.3176/esa66.08.

A summary of the main research findings: The studies showed how the phenomena covered vary and which pronunciation variants readers prefer. The results show that word-initial h is almost always pronounced when reading aloud in Estonian: in 92% of cases on average, and even more often in frequently used words. Palatalization of the consonants l, n, s, t, d at the end of i-stemmed Estonian words with a (CC)V̅C structure is generally atypical. For almost 50 of the examined words, it was possible to determine which quantity degree (second or third) the informants preferred. For four types, we also identified the pronunciation trend. The findings allow us to discuss the possible causes and draw practical conclusions on how to make synthetic speech more natural.