National Runner Up

interpretAR

interpretAR is a mobile application set to improve the quality of life for individuals who rely on sign language as their main medium of communication.

What it does

interpretAR's main purpose is twofold: (1) to solve everyday inconveniences for the hearing impaired, and (2) to act as a learning aid for those who want to learn sign language. interpretAR aims to accomplish this through an Augmented Reality (AR) approach.


Your inspiration

When an individual suffers from hearing impairment, their ability to interact with the world severely declines. Conversations become inconvenient: getting the point across is challenging, and meaning can be lost. Big players like Netflix, Google, and Microsoft try to do their part by offering Closed Captioning (CC) services on popular media. While CC is a viable solution for those with hearing loss, at times it can feel robotic and expressionless. Having the option to incorporate sign language alongside these services would allow individuals to use their preferred method of communication. interpretAR was created to provide that option.


How it works

interpretAR uses the Microsoft Cognitive Services Speech SDK, Blender animations, the Unity game engine, OpenCV, and Android Studio to create an application that translates speech input into American Sign Language (ASL) in real time through an Augmented Reality (AR) approach. Currently, the user can use the application in two ways: (1) the ASL Feature (powered by the Unity game engine), in which speech is translated into ASL in real time. The sign language animations were created by our team in Blender, and the user can translate over 100 words into ASL. Our team also incorporated a face recognition option to promote lip reading for an immersive experience. The ASL animations are integrated with the application's camera view. (2) the Multi-language Feature (powered by Microsoft Azure), in which the user can choose from over 9 different languages to translate. Speech is translated in real time and displayed as captions in the application's camera view.
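As a rough illustration of the ASL Feature described above, the step from recognized speech to sign animations can be thought of as a vocabulary lookup. The sketch below is not the team's implementation (the app itself is built in Unity): the clip names, the tiny `SIGN_CLIPS` vocabulary, and the choice to set aside unmatched words for captions are all assumptions made for illustration.

```python
import re

# Hypothetical vocabulary mapping spoken words to animation clip names.
# The real app is described as covering roughly 100 words.
SIGN_CLIPS = {
    "hello": "clip_hello",
    "please": "clip_please",
    "help": "clip_help",
    "you": "clip_you",
}


def transcript_to_clips(transcript):
    """Return (clips, unmatched) for a recognized speech transcript.

    clips     -- animation clip names to play, in spoken order
    unmatched -- words with no sign in the vocabulary (set aside here;
                 the app's actual fallback is not documented)
    """
    # Lowercase the transcript and keep only word-like tokens.
    words = re.findall(r"[a-z']+", transcript.lower())
    clips, unmatched = [], []
    for word in words:
        clip = SIGN_CLIPS.get(word)
        if clip is not None:
            clips.append(clip)
        else:
            unmatched.append(word)
    return clips, unmatched


if __name__ == "__main__":
    print(transcript_to_clips("Hello, can you help?"))
```

With a lookup of this kind, expanding the vocabulary amounts to adding entries and their matching animations, without touching the speech recognition step.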


Design process

To begin our design process, we started with hand modelling and determined that Blender was the best software for the job. Blender is a 3D graphics package used to model and animate objects for different applications. A wireframe template of the hand was first obtained from the Blender website. The hand model has the following specifications: (1) 398 vertices and (2) 393 faces. We inserted bones into the hand mesh using Blender’s armature tool so that the hand could be posed and controlled, a process called rigging. Using Blender’s Action Editor feature, the hand was then animated with keyframes. With the hand animations made in Blender, we next had to bring them to life within our application. This was accomplished with Unity. Unity allows developers to create a virtual scene in which their animations interact with the environment based on input received from the user. We used Unity for two purposes: (1) to create an AR scene in which the hand animations and the user's view of the real world are joined together, and (2) to trigger particular hand animations based on the received speech input. Microsoft Azure's speech-to-text models were used to minimize errors in translation. Lastly, Android Studio is used to bring interpretAR to market as an Android application.
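To give a feel for what keyframe animation of a rigged hand involves, here is a minimal sketch that computes a single bone's bend angle between keyframes. It is a simplification, not Blender's or the team's method: it uses plain linear interpolation, and the keyframe values and the `angle_at` helper are hypothetical.

```python
from bisect import bisect_right

# Hypothetical keyframes for one finger bone: (frame number, bend angle in degrees).
KEYFRAMES = [(0, 0.0), (10, 90.0), (20, 45.0)]


def angle_at(frame, keyframes=KEYFRAMES):
    """Linearly interpolate the bone angle at `frame`, clamping outside the range."""
    frames = [f for f, _ in keyframes]
    if frame <= frames[0]:
        return keyframes[0][1]
    if frame >= frames[-1]:
        return keyframes[-1][1]
    # Index of the first keyframe whose frame number is strictly greater than `frame`.
    i = bisect_right(frames, frame)
    (f0, a0), (f1, a1) = keyframes[i - 1], keyframes[i]
    t = (frame - f0) / (f1 - f0)  # fraction of the way from f0 to f1
    return a0 + t * (a1 - a0)


if __name__ == "__main__":
    for frame in (0, 5, 10, 15, 20):
        print(frame, angle_at(frame))
```

The animator only sets the poses at the keyframes; the in-between frames are filled in by interpolation, which is what keeps hand-made sign animations manageable.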


How it is different

Hand Talk Translator is an application on the Google Play Store that can translate text and audio into Brazilian Sign Language, displayed via an avatar on the screen of a mobile device. This application targets two main audiences: those who are interested in learning sign language, and those who rely on it for accessibility. interpretAR takes this one step further by converting audio into sign language while also providing a more immersive experience through augmented reality and a more practical solution for direct conversation. As mentioned in the inspiration section, big players such as Google also offer speech services: Google’s Cloud Speech-to-Text converts audio to text using neural network models in an API. It can recognize 120 languages and return text results in real time. interpretAR differs by incorporating sign language, which allows individuals to understand speech in their preferred mode of communication; we hope it can also be used as an educational tool.


Future plans

Naturally, the first improvement that could be made to interpretAR is expansion of vocabulary. Currently, interpretAR hosts a library of 100 basic ASL words, but ideally the entire ASL dictionary could be pushed onto the app. Similarly, the addition of other sign languages (e.g. British Sign Language) could be explored. A Google Cardboard option could also be added to promote its educational use for children. We would also like to make interpretAR available on iOS devices. In its final form, interpretAR would translate complex speech input from multiple languages into multiple sign languages accurately and in real time.


Awards

interpretAR was awarded 'Most Innovative Design' at the 2019 McMaster University Electrical and Computer Engineering Expo. With approximately 50 groups at the expo, it was humbling to see interpretAR win the top prize.

