AKA Story
Deep learning facial recognition


Humans have always had the innate ability to recognize and distinguish between faces. Now, it is fair to say that robots can do the same.

As Cindy starts talking, Musio locates her face, extracts facial cues, and identifies that she is, in fact, Cindy. Admittedly, this technology doesn’t seem so cutting-edge considering that most modern smartphones already have a facial recognition feature built in. However, Musio’s vision team believes that visual cues are an important component of communication and is investing a large amount of resources in creating a more refined identification system.


Lead computer vision engineer Justin says, “We are very aware that there already exists a vast amount of both open-source and commercial software which enables what we are doing. However, creating a commercially viable engine is an entirely different matter.” When asked about specific hardships, Justin pinpoints runtime and accuracy as the crucial problems. “Sure, we can create an engine which has very high performance. The problem is making sure the algorithm can run with limited resources in real time such that the consumer doesn’t experience lag.” To address this, the vision team has implemented various techniques, ranging from the traditional LBP (local binary patterns) algorithm to more advanced deep learning algorithms. However, according to vision engineer JK, there are many more problems to address. “We also have to take into account the differences in brightness the user might be exposed to, not to mention extracting additional information that might facilitate communication, such as facial expressions and object detection.” Despite the vast number of obstacles that the vision team must tackle, Justin remains upbeat. “I enjoy challenges. We’ll just take them on one at a time and see how it goes.”
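For readers curious about what the "traditional LBP algorithm" looks like, here is a minimal sketch of the basic 3x3 local binary patterns operator in plain Python. It is only an illustration and not Musio's actual engine: the function names, the neighbour ordering, and the tiny sample patch are all assumptions made for this example. Each interior pixel is compared with its eight neighbours, and the results are packed into an 8-bit code. Because only comparisons matter, a uniform brightness shift leaves the codes unchanged, which hints at why LBP is a common starting point when lighting varies. Real lighting changes are rarely uniform, so this is a simplification.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel
# (the ordering is a convention chosen for this sketch).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image):
    """Return the 8-bit LBP code of every interior pixel.

    `image` is a 2D list of grayscale intensities (hypothetical input format).
    """
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # Set the bit when the neighbour is at least as bright as the center.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(image):
    """Count how often each of the 256 possible codes occurs in the image."""
    hist = [0] * 256
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
    return hist


# A made-up 4x4 grayscale patch and a uniformly brighter copy of it.
patch = [
    [52, 60, 61, 70],
    [55, 58, 90, 72],
    [40, 57, 59, 80],
    [41, 45, 62, 85],
]
brighter = [[p + 40 for p in row] for row in patch]

# The codes depend only on comparisons, so the uniform shift does not change them.
same_codes = lbp_codes(patch) == lbp_codes(brighter)
```

In a typical LBP-based recognizer, histograms like the one from `lbp_histogram` are computed over a grid of face regions and compared between faces. That step is cheap to compute, which fits the limited-resources concern Justin describes, though a deep learning model would generally be expected to be more accurate.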


The ultimate goal of the vision team is to incorporate such visual cues into conversation. This can range from Musio identifying the speaker and saying “Hello Cindy,” to saying something more complex like “It looks like Cindy is crying. Can you hand the tissue on the desk over to her, Ryan?” Furthermore, the team is also working on a vision security project which authorizes use of Musio based on facial information. Incorporating vision technology into Musio in this way is an exciting prospect and something to keep an eye on as Musio continues to develop.
