EXPANDING LANGUAGE POSSIBILITIES WITH AI

OUR TECHNOLOGY

01 / LANGUAGE IDENTIFICATION

Most commercially available systems for text language identification support 50-100 languages. SIL has built a language identification system that currently supports more than 1,300 languages! In addition, we are deploying custom spoken-language and accent identification systems around the world.

02 / MULTILINGUAL SPEECH SYSTEMS

SIL is collaborating with Intel to develop and deploy patent-pending, AI-driven audio technology that expands possibilities for contactless and multilingual applications in linguistically diverse contexts. With this technology, speech in input audio streams can be dynamically recognized, even when multiple languages are spoken.

03 / LOCAL LANGUAGE CHAT

Organizations have increasingly turned to dialogue systems to help them engage with users. As these organizations move into emerging markets, supporting chat interactions becomes challenging due to linguistic diversity. The SIL chat platform natively supports natural language understanding (NLU) and conversation design in 1,600+ languages.

04 / TRANSLATION QUALITY CHECKING

For highly important information (e.g., COVID-19 health information), translation quality must be ensured. At the same time, those wishing to create translations have limited options for checking quality. Harnessing the latest artificial intelligence (AI) techniques, SIL seeks to automatically and objectively assess multiple facets of translation quality and to validate the usability of those assessments with translation experts.

05 / MULTIMODAL LANGUAGE MODELS

With widespread focus on text-only techniques, audio represents a largely untapped source of information. Based on this, SIL is exploring multimodal (i.e., text and audio) language models. Such pre-trained models could be combined with a universal phone recognizer to lower the data requirements for a variety of NLP tasks (machine translation, speech recognition, dialogue, etc.) in low-resource languages.

06 / HUMAN-IN-THE-LOOP MT

People often wonder why machine translation is not used for local language translation. Well, as it turns out, commercial machine translation technology only supports around 100 of the 7,000+ living languages. SIL is leveraging and extending new research from the NLP community to demonstrate how machine translation technology can be applied (with a human in the loop) to low-resource language translation.

AI @ SIL

SIL INTERNATIONAL

SIL is a global, faith-based nonprofit that works with local communities around the world to develop language solutions that expand possibilities for a better life. We are involved in approximately 1,350 active language projects in 104 countries. These projects impact more than 1.1 billion people within 1,600 local communities.

OUR VISION

We long to see people flourishing in community using the languages they value most. This includes flourishing in the digital sphere, where AI and Natural Language Processing (NLP) are increasingly driving the features of products. We work to make sure that the benefits of AI and NLP extend to local language communities.

OUR TECHNOLOGY

Because we are focused on AI and NLP for local languages, the tech that we build is uniquely suited for emerging markets. We create models and systems that can be run at the edge (disconnected from the Internet) on low-powered devices. We also leverage and extend the latest research on NLP tasks in data-scarce scenarios.

PUBLICATIONS

  • Meyer, J. et al., "BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus." Submitted to InterSpeech 2022. (Related Article, Hugging Face Space)

  • Leong, C. and Whitenack, D., "Phone-ing it in: Towards Flexible Multi-Modal Language Model Training by Phonetic Representations of Data." ACL (2022). (Paper)

  • Serianni, A. and Whitenack, D., "Exploring Transfer Learning Pathways for Neural Machine Back Translation of Eskimo-Aleut, Chicham, and Classical Languages." (Paper)

  • Whitenack, D., Nemecek, J., & Manepalli, S., "Dyn-ASR: Compact, Multilingual Speech Recognition via Spoken Language and Accent Identification." Accepted to IEEE WF-IOT (2021).

  • Nekoto, W. et al., "Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages." EMNLP Findings (2020). (Paper)

  • Hirekodi, S., Sunny, S., Topno, L., Daniel, A., Whitenack, D., Skewes, R., & Cranney, S., "Katecheo: A Portable and Modular System for Multi-Topic Question Answering." ArXiv, abs/1907.00854 (2019).