
Presentation at ComputEL workshop @ ACL 2014

This week Joel and Gina presented some of the work that lab members Josh, Theresa, Tobin and Gina and interns ME, Louisa, Elise, Yuliya and Hisako have done on the LingSync project. Their 20-minute presentation, “LingSync & the Online Linguistic Database: New models for the collection and management of data for language communities, linguists and language learners,” was given at the Computational Approaches to Endangered Languages workshop at the 52nd Annual Meeting of the Association for Computational Linguistics (ACL).




LingSync and the Online Linguistic Database (OLD) are new models for the collection and management of data in endangered language settings. The LingSync and OLD projects seek to close a feedback loop between field linguists, language communities, software developers, and computational linguists by creating web services and user interfaces (UIs) which facilitate collaborative and inclusive language documentation. This paper presents the architectures of these tools and the resources generated thus far. We also briefly discuss some of the features of the systems which are particularly helpful to endangered languages fieldwork and which should also be of interest to computational linguists, these being a service that automates the identification of utterances within audio/video, another that automates the alignment of audio recordings and transcriptions, and a number of services that automate the morphological parsing task. The paper discusses the requirements of software used for endangered language documentation, and presents novel data which demonstrates that users are actively seeking alternatives despite existing software.

Download full paper as .pdf or .tex



Using Technology to Bridge Gaps @Carleton University

Hisako and Elise went to Carleton University this week to present at FEL 2013, the 17th Conference of the Foundation for Endangered Languages. The theme of this year’s conference was Endangered Languages Beyond Boundaries: Community Connections, Collaborative Approaches, and Cross-Disciplinary Research.

Elise McClay (BA ’12), Erin Olson (BA ’12), Carol Little (BA ’12), Hisako Noguchi (Concordia), Alan Bale (Concordia), Jessica Coon (McGill) and Gina (iLanguage Lab) presented an electronic poster titled “LingSync: Using Technology to Bridge Gaps between Speakers, Learners, and Linguists.”

Hisako and Elise demo Tobin’s app at FEL 2013

Source code available on GitHub.

Computational Field Workshop @McGill

On May 27th the Mi’gmaq Partnership (Listuguj, McGill, iLanguage) will be hosting its first Computational Field Workshop at McGill. Lab members Hisako, Gina, Josh and Tobin, along with Louisa and Carol, will present some of the recent scripts and tools they developed as part of the partnership.

The workshop will focus on computational tools for transcribing, storing and searching linguistic data. There is a special focus on fieldwork, but it should be of broader interest as well, and no background is required.

In addition to work by Montreal-based iLanguage Lab, a key partner in the Mi’gmaq Partnership, the workshop will feature a talk and workshop by keynote speaker Alexis Palmer.

More details can be found in the workshop program.

The workshop will be held at the Thompson House, McGill.

Making your apps smarter @Notman House

Next Wednesday our software engineering intern Bahar Sateli will be presenting her OpenSource Named Entity Recognition library for Android. The library is powered by the Semantic Software Lab’s Semantic Assistants web services platform, which in turn is powered by GATE, an OpenSource General Architecture for Text Engineering developed at the University of Sheffield.

As part of her MITACS/NRC-IRAP-funded project in collaboration with iLanguage Lab, she created an Android library that makes it possible to recognize people, locations, dates and other useful pieces of text on Android phones. The sky is the limit, as it can run any GATE-powered pipeline.

The current open source pipelines range from very specialized (recognizing Bacteria and Fungi entities in bio-medical texts) to very general (recognizing people, places and dates).

She will be presenting her app iForgot Who, which takes in some text and automatically creates new contacts for you, a handy app for all those party planners out there. It is a demo application that shows new developers how they can use her OpenSource system to make their own apps smarter and automate tasks for users.

The presentations start at 6:30, and we will be going out for drinks afterwards at around 8:30/9:30 at Pub Quartier Latin (next to La Distillerie, corner of Ontario and Sanguinet, a one-block walk from the talk).

Come one, come all, for the presentation and/or for drinks!

Code is open sourced on SourceForge.


The Google+ event




Bahar presents to a record breaking crowd at Android Montreal

iLanguage Lab Sponsors NAPhC 7

Twelve years after NAPhC1, the lab is proud to become the North American Phonology Conference’s first industry sponsor!


Mutsumi Oi, a grad student at the University of Ottawa, won the sponsored door prize of a Praat script customized to her needs. The prize includes commented source code and a screencast explaining how the script works and how to tweak it for future use. Oi has until October 31, 2012 to claim her prize. We are excited to work with her and will keep you posted if she decides to OpenSource her script.

Praat is an OpenSource phonetics software package by Paul Boersma and David Weenink. It has been used by generations of linguists to automate phonetic analysis and to export aligned transcriptions (TextGrids), spectrograms, intonation contours and many other visualizations of sound. Praat can be run in a GUI or on the command line. iLanguage Lab has integrated Praat into many of its Node.js experimentation server projects.
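To give a rough idea of what running Praat on the command line from a server process can look like, here is a minimal sketch. It is written in Python rather than Node.js and is not the lab's own integration. It assumes a Praat binary named `praat` is on the PATH and that it accepts the `--run` option (available in Praat 6 and later). The script name `extract_pitch.praat` and its argument are hypothetical placeholders.

```python
import shutil
import subprocess
from typing import List, Optional


def build_praat_command(script: str, args: List[str],
                        praat_binary: str = "praat") -> List[str]:
    """Build the argument list for running a Praat script without the GUI."""
    # Assumption: `--run` (Praat 6+) executes the script and then exits.
    return [praat_binary, "--run", script, *args]


def run_praat_script(script: str, args: List[str],
                     praat_binary: str = "praat") -> Optional[str]:
    """Run a Praat script and return its standard output.

    Returns None when the Praat binary cannot be found on the PATH.
    """
    if shutil.which(praat_binary) is None:
        return None
    completed = subprocess.run(
        build_praat_command(script, args, praat_binary),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Praat reports a failure
    )
    return completed.stdout


if __name__ == "__main__":
    # Hypothetical script and recording names, for illustration only.
    output = run_praat_script("extract_pitch.praat", ["recording.wav"])
    print(output if output is not None else "Praat is not installed")
```

Keeping the command construction separate from the process call makes it possible to check the arguments without Praat installed.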

For some sample scripts:

Spy or Not?

So you think you would make a good spy?

Emmy and Hisako are proud to present the release of Spy or Not, a gamified psycholinguistics experiment made in collaboration with the Accents Research Lab at Concordia University headed by Dr. Spinu.

It is commonly observed that some people are “Good with Accents.” Some people can easily imitate various accents of their native language, while others appear to struggle with imitation. This research is dedicated to building free OpenSource phonetics scripts that extract the acoustic components of the speech of native speakers and “Good with Accents” speakers, and to passing the technical details in a visualizable format to applied linguists on the ground who work with accented (clinical and non-native) speakers.

In order to collect non-biased judgements from native speakers, a pilot study was designed and run by Dr. Spinu and her students. Images and supporting sound effects were created, and the perceptual side of the pilot was disguised as the game “Spy or Not?” The game has since gathered over 8,000 data points by crowdsourcing the judgements that determine the degree (on an 11-point scale) to which participants were “Good with Accents.” This is a novel approach to the coding problems that experimenters frequently encounter.

Participation in this project furthers research in phonetics and phonology, as well as experimental methodology in the age of the social web. Our hope is that our readers will Tweet their “Good with Accents” scores and help us get more participants, especially native speakers of Russian English accents, Sussex English accents and South African accents, which we could never access at the scale we need in a lab setting. Visit the free online game, or play offline by downloading the game from the Chrome Store or from Google Play as an Android app.


Volunteers wanted to work on the development of several OpenSource Android projects for speech-language pathology tools (1,2), e.g. AndroidBAT (3), WebBAT (3), AuBlog (4)

Who is eligible?

  • Students in speech-language pathology, linguistics, psychology, software engineering, audiology or related disciplines.

Tasks:

  • data entry (eye movements, phonemic data, response choices)
  • transcription (of language samples, of texts)
  • assessment of clinical and non-clinical participants
  • programming for Android or HTML5

Possibilities:

  • employment in programming
  • grants
  • paid internships (master's and doctoral level)

Opportunities:

  • project management experience
  • experience with Open Data and OpenSource
  • experience with technical tools in speech-language pathology and phonetics

Benefits:

  • flexible hours
  • experience in the field of new media
  • applied experience in the field of language technology
  • technical experience leading to new funding opportunities aimed specifically at Open Data
  • letters of recommendation (with minimal involvement)

Commitment:

  • a minimum of 2 to 5 hours per week
  • a minimum of 20 hours of commitment in total (no maximum)
  1. OpenSource projects in collaboration with researchers from the universities UdeM, UQAM, Concordia, McGill
  2. Contact persons: Emmy Cathcart Ph.D FieldLinguistics @ iLanguage Lab, Alexandra Marquis Ph.D. Developmental Psycholinguistics @ ÉOA UdeM
  3. BAT: Bilingual Aphasia Test
  4. AuBlog: research tools for prosody