Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
About me
This is a page not in the main menu
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Published in Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2020
The task of grapheme-to-phoneme (G2P) conversion is important for both speech recognition and synthesis. Similar to other speech and language processing tasks, in a scenario where only small-sized training data are available, learning G2P models is challenging. We describe a simple approach of exploiting model ensembles, based on multilingual Transformers and self-training, to develop a highly effective G2P solution for 15 languages. Our models are developed as part of our participation in the SIGMORPHON 2020 Shared Task 1 focused on G2P. Our best models achieve 14.99 word error rate (WER) and 3.30 phoneme error rate (PER), a sizeable improvement over the shared task competitive baselines.
Recommended citation: Vesik, K., Abdul-Mageed, M., & Silfverberg, M. (2020). One Model to Pronounce Them All: Multilingual Grapheme-to-Phoneme Conversion With a Transformer Ensemble. In Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, Seattle. https://www.aclweb.org/anthology/2020.sigmorphon-1.16/
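For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how WER and PER are commonly computed for G2P output. It assumes WER is the percentage of words whose predicted transcription differs from the reference, and PER is the total phoneme-level edit distance divided by the total reference length; the shared task's own evaluation script may differ in detail. The function names and the toy data are illustrative and do not come from the paper.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(refs: List[Sequence[str]], hyps: List[Sequence[str]]) -> float:
    """Percentage of words with at least one phoneme error."""
    wrong = sum(list(r) != list(h) for r, h in zip(refs, hyps))
    return 100.0 * wrong / len(refs)


def per(refs: List[Sequence[str]], hyps: List[Sequence[str]]) -> float:
    """Total edit distance as a percentage of total reference length."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * edits / total


# Toy example (made-up data): one of two words has one wrong phoneme.
refs = [["k", "æ", "t"], ["d", "ɔ", "ɡ"]]
hyps = [["k", "æ", "t"], ["d", "ɑ", "ɡ"]]
print(wer(refs, hyps))  # 50.0
print(per(refs, hyps))  # one edit out of six reference phonemes
```

Under these definitions a single wrong phoneme makes the whole word count as a WER error, which is why WER is typically much higher than PER.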
Published in University of British Columbia, 2021
This paper investigates representation and perception of contrastive vowel length in Estonian songs. In speech, vowels in Quantity 1 (short; Q1) and Quantity 2 (long; Q2) are correlated with σ1/σ2 duration ratios of 2:3 and 3:2, respectively (Lehiste, 1960). However, in atypical timing contexts such as song, adhering to speech ratios may not always be feasible. I aim to determine whether native Estonian speakers perceive vowel quantity in song adhering to the same criteria that they are shown to use in speech. A small corpus exploration suggests that there is a general trend for composers to assign Q1 words to shorter:longer note pairs and Q2 words to longer:shorter pairs, without specific adherence to the ratios typical of speech. Given that text setting in musical composition does not necessarily follow expected speech ratios, a perception study was implemented with the purpose of determining whether listeners nevertheless perceive vowel quantity in song according to the speech ratios. Native Estonian listeners were asked to identify sung bisyllabic nonce words of varying ratios as either Q1 or Q2, and results showed that participants identify sung tokens according to the same ratios as they do in speech. This suggests that in the absence of clues from lexical information, native listeners use the same perceptual tools from speech to identify vowel length in song, even though words in composed music are not necessarily always presented in the same way.
Recommended citation: Vesik, K. (2021). Perception of vowel quantity in sung Estonian. Unpublished manuscript, University of British Columbia.
Published in University of British Columbia, 2022
This paper investigates how the vowel patterns of two closely-related dialects of Estonian can be described using as much shared representation as possible, as well as what parameters and biases are necessary for the Gradual Learning Algorithm (Boersma & Hayes, 2001) to be able to learn restrictive grammars for both dialects. Standard Estonian and the minority Kihnu dialect share the same vowel inventory but differ in their distribution of those vowels, with Standard Estonian being subject to positional restrictions and Kihnu Estonian demonstrating front-back vowel harmony. I extend the constraint set that Kiparsky and Pajusalu (2003) propose to account for the vowel harmony typology in Balto-Finnic languages, and show via Low-Faithfulness Constraint Demotion (Hayes, 2004) that there exist rankings of these constraints that account for the patterns in both dialects. I also process the Estonian Dialect Corpus (Lindström, 2013) and use the contents as learning data for runs of the Gradual Learning Algorithm implemented by both OTSoft (Hayes et al., 2013) and the author. Convergence on grammars for both dialects is contingent on three biases being employed during the learning process: low initial faithfulness, specific over general faithfulness, and the Magri (2012) update rule. This suggests that even for relatively simple phonological patterns, the learning environment must be delicately balanced in order to account for the behaviour of two different grammars grounded in the same shared framework.
Recommended citation: Vesik, K. (2022). Laying a foundation for bidialectalism: necessary biases for algorithmic learning of two dialects of Estonian. Unpublished manuscript, University of British Columbia.
Published in Proceedings of the Fifty-seventh Annual Meeting of the Chicago Linguistic Society, 2022
This paper investigates representation and perception of contrastive vowel length in Estonian songs. In speech, vowels in Quantity 1 (short; Q1) and Quantity 2 (long; Q2) are correlated with σ1:σ2 duration ratios of 2:3 and 3:2, respectively (Lehiste, 1960). However, in atypical timing contexts such as song, adhering to speech ratios may not always be feasible. I aim to determine whether native Estonian speakers perceive vowel quantity in song adhering to the same criteria that they are shown to use in speech. A small corpus exploration suggests that there is a general trend for composers to assign Q1 words to shorter:longer note pairs and Q2 words to longer:shorter pairs, without specific adherence to the ratios typical of speech. Given that text setting in musical composition does not necessarily follow expected speech ratios, a perception study was implemented with the purpose of determining whether listeners nevertheless perceive vowel quantity in song according to the speech ratios. Native Estonian listeners were asked to identify sung bisyllabic nonce words of varying ratios as either Q1 or Q2, and results showed that participants identify sung tokens according to the same ratios as they do in speech. This suggests that in the absence of clues from lexical information, native listeners use the same perceptual tools from speech to identify vowel length in song, even though words in composed music are not necessarily always presented in the same way.
Recommended citation: Vesik, K. (2022). Perception of vowel quantity in sung Estonian. In Proceedings of the Fifty-seventh Annual Meeting of the Chicago Linguistic Society, Chicago, 433-444. https://drive.google.com/drive/folders/1o09xbZ2fsvXmxTDe0P84hPlcoopvrzKd
Published in Proceedings of the LREC 2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources, 2022
This paper provides an introduction to the Sign Language Phonetic Annotator-Analyzer (SLP-AA) software, a free and open-source tool currently under development, for facilitating detailed form-based transcription of signs. The software is designed to have a user-friendly interface that allows coders to transcribe a great deal of phonetic detail without being constrained to a particular phonetic annotation system or phonological framework. Here, we focus on the ‘annotator’ component of the software, outlining the functionality for transcribing movement, location, hand configuration, orientation, and contact, as well as the timing relations between them.
Recommended citation: Hall, K. C., Aonuki, Y., Vesik, K., Poy, A., & Tolmie, N. (2022). Sign Language Phonetic Annotator-Analyzer: Open-Source Software for Form-Based Analysis of Sign Languages. Proceedings of the LREC 2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources, Marseille, France, 59-66. https://aclanthology.org/2022.signlang-1.10/
Published in Supplemental Proceedings of the 2022 Annual Meeting on Phonology, 2023
This paper investigates the learning of Kihnu Estonian, a minority dialect of Estonian (Balto-Finnic). I propose a set of constraints to account for Kihnu Estonian vowel harmony patterns, and show that they can be used to produce a restrictive grammar for Kihnu Estonian vowel harmony. With this constraint set, I model the acquisition of Kihnu Estonian vowel harmony via the application of the Gradual Learning Algorithm (Boersma and Hayes, 2001). Antagonistic constraints in the set I adopt pose obstacles to successful learning of the vowel patterns attested in the learning data. These obstacles can be circumvented via use of the update rule from the Calibrated Error-Driven Ranking Algorithm (Magri, 2012). This update rule has been argued to be detrimental to learning variation in stochastic OT. However, though it was originally proposed to address the Credit Problem (Dresher, 1999), I show that it is in fact an elegant solution to the learning problems caused by oscillating constraints when modeling acquisition of Kihnu Estonian vowel harmony.
Recommended citation: Vesik, K. (2023). The Calibrated Error-Driven Ranking Algorithm as a Solution to Oscillation in Antagonistic Constraints: A Necessary Bias for Algorithmic Learning of Kihnu Estonian. In N. Elkins, B. Hayes, J. Jo, & J. L. Siah (Eds.), Supplemental Proceedings of the 2022 Annual Meeting on Phonology. Washington, DC: Linguistic Society of America. https://doi.org/10.3765/amp.v10i0.5429 https://journals.linguisticsociety.org/proceedings/index.php/amphonology/article/view/5429
Published in Proceedings of the 2022 Annual Meeting on Phonology, 2023
There are competing views in contemporary phonological theory about how to best represent processes that are pervasive, frequent, and phonologically motivated, yet still lexically sensitive. To what extent can – or should – a process that applies idiosyncratically to different morphemes, words, and even phrases, be represented in a way that allows it to generalize to novel forms? We examine this question by looking at prenominal liaison as it is used in contemporary Laurentian French, spoken in Canada. We present the results of an online production study that compares application of liaison in real vs. nonce nouns, and that considers the effect of nonce nouns’ phonological properties and morphosyntactic context on the process. We interpret our results as evidence that liaison behaviour is driven jointly by lexical representations and an abstract grammar, with properties of the real-word lexicon affecting liaison rates in nonce words. We further show that there is considerable variation in the population in the extent to which speakers produce liaison with real h-aspiré words, but that all speakers nonetheless share an understanding of what types of words are more vs. less likely to undergo liaison.
Recommended citation: Tessier, A. M., Jesney, K., Vesik, K., Lo, R., & Bouchard, M. E. (2023). The Productive Status of Laurentian French Liaison: Variation across Words and Grammar. In N. Elkins, B. Hayes, J. Jo, & J. L. Siah (Eds.), Proceedings of the 2022 Annual Meeting on Phonology. Washington, DC: Linguistic Society of America. https://doi.org/10.3765/amp.v10i0.5447 https://journals.linguisticsociety.org/proceedings/index.php/amphonology/article/view/5447
Published in Linguistica Uralica, 2023
This paper investigates back/front vowel harmony in the Kihnu variety of Estonian. Data from the Estonian Dialect Corpus are analyzed to inform the description of harmony in this dialect, a phenomenon that has been understudied in the literature. Previously reported patterns of categorical harmony (/u/-/y/ and /ɑ/-/æ/ pairs) and transparency (/i/) are confirmed. However, the corpus provides insufficient direct evidence to either support or refute previous descriptions of the /o/-/ø/ pair as non-participatory. Subtleties of a relationship previously described as variable (/e/-/ɤ/ pair) are explored in more depth, with /e/ proposed as a second transparent vowel. Vowel harmony is also explored in Kihnu Estonian’s rich inventory of diphthongs, with intra-syllabic harmony in diphthongs shown to occur at a similar rate to that of inter-syllabic harmony between monophthongs.
Recommended citation: Vesik, K. (2023). Vowel harmony in the Kihnu variety of Estonian: A corpus study. Linguistica Uralica, 59(3), 181-199. https://dx.doi.org/10.3176/lu.2023.3.02 https://kirj.ee/wp-content/plugins/kirj/pub/ling-2023-3-181-199_20230910135940.pdf?v=3e8d115eb4b3
Published in Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources, 2024
This paper introduces the ongoing project of digitizing and phonologically transcribing The Canadian Dictionary of ASL to be used as a language resource. We describe the contents of the dictionary and the procedure used for creating the transcribed version, using the Sign Language Phonetic Annotator-Analyzer software. We also outline the benefits of creating a resource with such a detailed representation of the formational structure of signs.
Recommended citation: Hall, K. C., Asthana, A., Reid, M., Gao, Y., Hobby, G., Tkachman, O., & Vesik, K. (2024). Phonological Transcription of the Canadian Dictionary of ASL as a Language Resource. Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources, Torino, Italy, 81-89. https://www.sign-lang.uni-hamburg.de/lrec/pub/24010.html
Not yet published
Please contact me directly for more information about this manuscript.
Recommended citation: Vesik, K., & Hall, K.C. (in press). Improved student learning through active retrieval practice and random-sampled exams. Canadian Journal of Linguistics/Revue Canadienne de Linguistique.
Published:
Vesik, K., Abdul-Mageed, M., & Silfverberg, M. 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology. July 10, 2020.
Published:
Vesik, K. Second UBC Language Sciences Graduate Student and Postdoctoral Fellow Research Day. October 16, 2020.
Published:
Vesik, K. 57th Annual Meeting of the Chicago Linguistic Society. May 6-8, 2021.
Published:
Vesik, K. 37th Annual Northwest Linguistics Conference. May 15, 2021.
Published:
Vesik, K., & Hall, K. C. Special Session on Pedagogy at the annual meeting of the Canadian Linguistic Association. June 4-7, 2021.
Published:
Hall, K.C., Aonuki, Y., Vesik, K., Poy, A., & Tolmie, N. Sign Language Phonetic Annotator-Analyzer: Open-Source Software for Form-Based Analysis of Sign Languages. LREC 2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources. June 25, 2022.
Published:
Vesik, K. Necessary biases for algorithmic learning of Kihnu Estonian vowel harmony. 2022 Annual Meeting on Phonology. October 23, 2022.
Post-baccalaureate course, Queen's University at Kingston Continuing Teacher Education, 2019
CONT 931 is a 9-week course for students in the BC Post-Graduate Certificate program in Mathematics Education. I was the instructor for this course, assessing student work and facilitating a lively and positive online environment with productive discussion.
Undergraduate course, UBC Linguistics, 2020
LING 100 is a one-term overview of linguistics for students in any discipline. I was a graduate TA for this course, marking student assignments, monitoring discussion boards, running tutorials, and holding office hours.
Undergraduate course, UBC Linguistics, 2020
LING 200 is a one-term introduction to phonology and phonetics. I was a graduate TA for this course, supporting the instructor during lecture activities, marking student assignments, and holding office hours. Working from specifications drawn up collaboratively with the instructor, I also developed a database and Python script used to produce individualized random-sampled exams.
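As a rough illustration of what an individualized random-sampled exam can look like in code, here is a minimal sketch. It is not the script used in the course: the question bank is a plain in-memory dictionary standing in for the database, and the names `build_exam`, `bank`, `per_topic`, and `exam_name` are all hypothetical. The sketch assumes each student receives a fixed number of questions per topic, drawn with a random generator seeded from the student ID so that the same exam can be regenerated later.

```python
import hashlib
import random
from typing import Dict, List


def build_exam(student_id: str,
               bank: Dict[str, List[str]],
               per_topic: int,
               exam_name: str = "midterm") -> List[str]:
    """Draw `per_topic` questions from every topic for one student.

    The seed is derived from the exam name and student ID, so the draw is
    reproducible for a given student but differs between students and exams.
    """
    digest = hashlib.sha256(f"{exam_name}:{student_id}".encode("utf-8")).hexdigest()
    rng = random.Random(int(digest, 16))
    exam: List[str] = []
    for topic in sorted(bank):  # fixed topic order keeps the draw deterministic
        exam.extend(rng.sample(bank[topic], per_topic))
    return exam


# Hypothetical question bank: topic -> question identifiers.
bank = {
    "phonetics": ["P1", "P2", "P3", "P4"],
    "phonology": ["F1", "F2", "F3", "F4"],
}
exam_a = build_exam("student-001", bank, per_topic=2)
exam_b = build_exam("student-001", bank, per_topic=2)
print(exam_a == exam_b)  # True: same student, same exam
```

Seeding from a hash of the student ID, rather than storing each draw, is one simple way to make every exam reproducible for marking; a real system would more likely record the draws in its database.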
Post-baccalaureate course, Queen's University at Kingston Continuing Teacher Education, 2020
CONT 933 is a 9-week course for students in the BC Post-Graduate Certificate program in Mathematics Education. I was the instructor for this course, assessing student work and facilitating a lively and positive online environment with productive discussion.
Undergraduate course, UBC Linguistics, 2020
LING 200 is a one-term introduction to phonology and phonetics. I was a graduate TA for this course, supporting the instructor during lecture activities, marking student assignments, running tutorials, and holding office hours. I also continued to expand and refine the Python script, developed in the previous term, that produces individualized random-sampled exams.
Undergraduate course, UBC Linguistics, 2021
LING 100 is a one-term overview of linguistics for students in any discipline. I was a senior graduate TA for this course, developing course material, marking and facilitating discussion boards, and holding office hours.
Post-baccalaureate course, Queen's University at Kingston Continuing Teacher Education, 2021
CONT 933 is a 9-week course for students in the BC Post-Graduate Certificate program in Mathematics Education. I was the instructor for this course, assessing student work and facilitating a lively and positive online environment with productive discussion.
Undergraduate course, UBC Linguistics, 2021
LING 313 is a one-term course in phonetics for upper-year undergraduate students. I was a graduate TA for this course, marking student assignments, monitoring discussion boards, running labs, and holding office hours.
Undergraduate course, UBC Linguistics, 2021
LING 200 is a one-term introduction to phonology and phonetics. I was a graduate TA for this course, marking student assignments, running tutorials, and holding office hours.
Post-baccalaureate course, Queen's University at Kingston Continuing Teacher Education, 2021
CONT 933 is a 9-week course for students in the BC Post-Graduate Certificate program in Mathematics Education. I was the instructor for this course, assessing student work and facilitating a lively and positive online environment with productive discussion.
Undergraduate course, UBC Linguistics, 2022
LING 222 is a one-term introduction to language acquisition that covers audition and speech perception, phonological organization, word learning, syntax, and pragmatics. I was a graduate TA for this course, marking student assignments and facilitating class discussions.
Undergraduate course, UBC Linguistics, 2023
LING 222 is a one-term introduction to language acquisition that covers audition and speech perception, phonological organization, word learning, syntax, and pragmatics. I was a graduate TA for this course, marking student assignments and facilitating student presentations.