- How to use the default system recognizer’s results in your own Android projects,
- How to use the NDK in your projects,
- How to use PocketSphinx (a lightweight recognizer library written in C) on Android
This week Joel and Gina presented some of the work that lab members Josh, Theresa, Tobin and Gina and interns ME, Louisa, Elise, Yuliya and Hisako have done on the LingSync project. It was part of their 20-minute presentation “LingSync & the Online Linguistic Database: New models for the collection and management of data for language communities, linguists and language learners” at the Computational Approaches to Endangered Languages workshop at the 52nd Annual Meeting of the Association for Computational Linguistics (ACL).
LingSync and the Online Linguistic Database (OLD) are new models for the collection and management of data in endangered language settings. The LingSync and OLD projects seek to close a feedback loop between field linguists, language communities, software developers, and computational linguists by creating web services and user interfaces (UIs) which facilitate collaborative and inclusive language documentation. This paper presents the architectures of these tools and the resources generated thus far. We also briefly discuss some of the features of the systems which are particularly helpful to endangered languages fieldwork and which should also be of interest to computational linguists, these being a service that automates the identification of utterances within audio/video, another that automates the alignment of audio recordings and transcriptions, and a number of services that automate the morphological parsing task. The paper discusses the requirements of software used for endangered language documentation, and presents novel data which demonstrates that users are actively seeking alternatives despite existing software.
Lab member Gina led a hands-on workshop at Google Montreal as part of the All Girls Hack Night. The workshop showed how to get “up and running” with Android Intents in a three-part tutorial, resulting in a gesture and/or voice remote control Android app.
Try it out! Here are the installers for each step of the workshop:
Step 1: Make it Talk
Step 2: Make it Listen
Twelve years after NAPhC1, the lab is proud to become the North American Phonology Conference’s first industry sponsor!
Mutsumi Oi, a grad student at the University of Ottawa, won the sponsored door prize of a Praat script customized to her needs. The prize includes commented source code and a screencast explaining how the script works and how to modify and tweak it for future use. Oi has until October 31, 2012 to claim her prize. We are excited to work with her and will keep you posted if she decides to OpenSource her script.
Praat is an OpenSource phonetics program by Paul Boersma and David Weenink. It has been used by generations of linguists to automate phonetic analysis and to export aligned transcriptions (TextGrids), spectrograms, intonation contours and many other visualizations of sound. Praat can be run in a GUI or on the command line. iLanguage Lab has integrated Praat into many of its Node.js experimentation server projects.
For some sample scripts: http://www.linguistics.ucla.edu/faciliti/facilities/acoustic/praat.html
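As a rough illustration of the command-line route, here is a minimal Python sketch of calling Praat without the GUI. It assumes a recent Praat build that accepts the `--run` option and a `praat` executable on the PATH. The script name `extract_pitch.praat` and its arguments are hypothetical placeholders, and this is not the lab's Node.js integration, only the same idea in a few lines.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_praat_command(script: str, args: Sequence[object] = (),
                        praat_bin: str = "praat") -> List[str]:
    """Build the argument list for running a Praat script without the GUI."""
    # `--run` asks Praat to execute the script and exit instead of opening windows.
    # Script arguments are passed as plain strings after the script path.
    return [praat_bin, "--run", script, *[str(a) for a in args]]


def run_praat(script: str, args: Sequence[object] = (),
              praat_bin: str = "praat") -> str:
    """Run the script and return whatever it wrote to standard output."""
    if shutil.which(praat_bin) is None:
        raise FileNotFoundError(f"Praat executable not found: {praat_bin}")
    result = subprocess.run(
        build_praat_command(script, args, praat_bin),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Praat exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical script and arguments, for illustration only.
    print(build_praat_command("extract_pitch.praat", ["recording.wav", 75, 600]))
```

Keeping the command construction separate from the call makes it easy to log or inspect the exact command before handing it to a server process.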
Emmy and Hisako are proud to present the release of Spy or Not, a gamified psycholinguistics experiment made in collaboration with the Accents Research Lab at Concordia University headed by Dr. Spinu.
It is commonly observed that some people are “Good with Accents.” Some people can easily imitate various accents of their native language, while others appear to struggle with imitation. This research is dedicated to building free OpenSource phonetics scripts to extract the acoustic components of native speakers and “Good with Accents” speakers, and to transfer the technical details in a visualizable format to applied linguists on the ground who are working with accented (clinical and non-native) speakers.
In order to collect non-biased judgements from native speakers, a pilot study was designed and run by Dr. Spinu and her students. Images and supporting sound effects were created, and the perceptual side of the pilot was disguised as the game “Spy or Not?” The game has since gathered over 8,000 data points by crowdsourcing the judgements to determine the degree (on an 11-point scale) to which participants were “Good with Accents.” This is a novel approach to the coding problems that experimenters frequently encounter.
Participation in this project furthers research in phonetics and phonology, in addition to experimental methodology in the age of the social web. Our hope is that our readers will Tweet their “Good with Accents” scores and help us get more participants, especially native speakers with Russian English, Sussex English and South African accents, which we could never access at the scale we need in a lab setting. Visit the free online game, or play offline by downloading the game from the Chrome Store or from Google Play as an Android app.
Thanks everyone for coming to the Wine and Cheese 0.5! We had more people than expected (counting by wine glasses, over 30). I had a ton of fun, and it was great to see you again, some of whom I haven’t seen in nearly 10 years! I know you were a diverse group, but I’m really proud of you all for mingling and trying the games after the talks, even walking around collaborating with members of the other Set. My goal was to bring you back to the initial love you felt for your respective fields, to remember the spark that got you started.
The games seemed hard, but not so hard when you realized the tools you needed were either in the experts in the room, or in the data itself…
The linguistic party games bottle-taking-home-winners:
The winner: Olivier
-Bonus points for identifying three extraneous words in his English Wordle
-Full credit for correctly identifying his spectrogram!!! We suspect he consulted with Nadya? If so she gets a bottle too next time I see her 🙂
-Full credit for asking a linguist to teach him (rather than giving him the answer) how to correctly identify the disambiguation point in “The pit bull attacked by the cat was annoyed” (after ‘by’ it is clear that ‘attacked’ modifies ‘pit bull’, rather than ‘pit bull’ being the subject of ‘attacked’).
-Full credit for identifying Groovy as an agglutinative verb-final language in the code example. Although the example does not demonstrate Groovy as an agglutinative language, this could be claimed to be true (ironically, the only Groovy expert in the room got a Groovy example…). Full credit would also have been given for saying Groovy is a Creole with lexical items from Java and syntax from its substrates of Ruby/Python/Perl/Java speakers.
Gina’s pick for Creativity: Julien
-Full credit for guessing LaTeX is an isolating language, and that he had the code to draw a vowel chart.
-Full credit for correctly guessing the ‘unlockable’ tree for the reading “The door is locked and you have the key to open it, thus the door is unlockable.”
-Full credit for 5 min of researching and yet incorrectly concluding that Indonesian was Malay. For those who don’t know, Indonesian and Malay are arguably the same E-Language, proof that E-Language might be a useful socio-political concept but maybe not a linguistic/NLP concept. They have the same stop/function words and the same morphemes; to really distinguish them, he needed to look for Named Entities, aka Proper Nouns.
-Partial credit for trying the spectrograms and using number of syllables as a heuristic to guess (it was a pretty good try at justifying his answer: his spectrogram had around 14 syllables, his answer had 18)
Hisako’s pick for Creativity: Dr. Witte
-Full credit for guessing that Croatian was Slovakian (they are both Slavic and have the letter ž, although a quick Ctrl-F in Wikipedia for the suffix -ija might have helped rule out Slovakian)
-Full credit for reading the comments in his source code and guessing that Java was a creole language, although I kind of doubt that creole is the best way to describe it..
-Partial credit for excessive creativity for giving the exact wrong answer: Linguists examine various linguistic systems and develop a) prescriptive grammars. I think Hisako’s judgment has been warped by TA-ing too many LING courses…
Honorable mention: Dr. Hale
Honorable mention: Emmy (MdotEdot)
-Partial credit for insisting that her Finnish wordle wasn’t Finnish due to the rampant mentions of Newcastle, but realizing it was indeed Finnish due to the productive morphology on Newcastle, Newcastlessa, Newcastlen..
Honorable mention: Dr. Bergler
-I remember Hisako was very impressed but I think some of the cards got thrown away in the clean up so I can’t remember… 🙂
Honorable mention: Peter
-Hisako was also impressed with your answers but I can’t remember which one it was 🙂
Hisako and I didn’t have a chance to go around to everyone to give them their answers, but email me a picture of your game card and we will email you the answers for your card and explain anything you didn’t get or anything that isn’t from the domain of your Set (i.e., linguistics for non-linguists, code for non-programmers) 🙂