In November, Amazon announced its vision for its home assistant Alexa to take a more active role as a personal assistant. In interviews with The McGill Tribune, Will Hamilton and Jackie Cheung, professors in McGill’s Department of Computer Science, outlined the basics of the technology behind voice-controlled home assistants like the Amazon Alexa and Google Home.
Alexa’s algorithm first transcribes incoming sound waves into text. Known as raw speech processing, this step relies on supervised machine learning, whereby computers are trained to recognize and interpret speech from labelled input data. An artificial neural network takes in the speech waveform and outputs the matching textual transcript.
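As a rough illustration only, since Amazon has not published Alexa’s architecture, a supervised training loop of this kind might resemble the Python sketch below. The feature sizes, the character set, and the randomly generated “labelled data” are all placeholder assumptions.

```python
# A toy sketch of supervised speech-to-text, not Amazon's actual system.
# Assumes the audio has already been converted into fixed-size
# spectrogram frames, each labelled with the character being spoken.
import torch
import torch.nn as nn

N_FEATURES = 40   # e.g. 40 spectrogram bins per audio frame (assumption)
N_CHARS = 28      # a-z, space, and a "silence" symbol (assumption)

model = nn.Sequential(
    nn.Linear(N_FEATURES, 128),   # a layer of artificial "neurons"
    nn.ReLU(),
    nn.Linear(128, N_CHARS),      # one score per possible character
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Labelled training data: random stand-ins for real (frame, character) pairs.
frames = torch.randn(64, N_FEATURES)        # 64 audio frames
labels = torch.randint(0, N_CHARS, (64,))   # the character spoken in each

for _ in range(100):                        # the supervised learning loop
    optimizer.zero_grad()
    loss = loss_fn(model(frames), labels)   # compare predictions to labels
    loss.backward()                         # compute how to reduce the error
    optimizer.step()                        # nudge the network's weights
```

Real systems operate on whole audio sequences rather than isolated frames, but the core idea is the same: labelled examples steer the network toward the correct transcript.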
The transcript is then translated into a form the computer can act on, so that the speech input becomes an instruction for the home assistant’s algorithm to follow. Unlike raw speech processing, this step places much of the workload on AI systems designers, who must supply the home assistant with a response for every possible voice command a human user might give.
“There is a lot of manually curated, manually specified knowledge [involved in this step],” Cheung said.
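A toy sketch of what such manually specified knowledge might look like appears below. The intent names, patterns, and slots are invented for illustration and are far simpler than anything in a production assistant.

```python
# A hand-curated intent table: each rule maps a spoken phrase to a
# structured instruction the assistant's algorithm can act on.
import re

INTENTS = [
    (re.compile(r"turn (on|off) the (\w+)"), "SetDeviceState"),
    (re.compile(r"set a timer for (\d+) minutes"), "SetTimer"),
    (re.compile(r"what('s| is) the weather"), "GetWeather"),
]

def parse(transcript: str) -> dict:
    """Map a transcript to a structured instruction, if any rule matches."""
    for pattern, intent in INTENTS:
        match = pattern.search(transcript.lower())
        if match:
            return {"intent": intent, "slots": match.groups()}
    return {"intent": "Unknown", "slots": ()}   # safe, predictable fallback

print(parse("Alexa, turn on the lights"))
# {'intent': 'SetDeviceState', 'slots': ('on', 'lights')}
```

Every rule in a table like this has to be written and maintained by a designer, which is the manual curation Cheung describes.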
As with other voice-controlled devices, reliability significantly influences Alexa’s design.
“[Tech companies] really don’t want the model to do weird things,” Hamilton said.
To avoid responses that are unexpected or that would make the user uncomfortable, Alexa is designed to provide a clear answer for every command it is given. As a result, the programming behind Alexa’s responses is considerably less advanced than its raw speech processing. Alexa’s recent upgrade, which allows it to make better predictions, operates on the same cautious principle.
Training the computer to match verbal commands to specific responses is not fundamentally different from traditional dialogue programming. For example, if a user were to make a dinner reservation for two on a Friday night, Alexa’s algorithm would be prompted to ask whether the user also wants to book a movie after the meal.
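The sketch below illustrates that kind of scripted follow-up. The intent names and wording are hypothetical and are not drawn from Alexa’s actual dialogue rules.

```python
# Hand-scripted follow-ups: after fulfilling one command, the assistant
# asks a pre-written question chosen by its designers, nothing riskier.
FOLLOW_UPS = {
    "ReserveDinner": "Would you also like to book a movie after the meal?",
    "SetTimer": None,   # no follow-up was scripted for this command
}

def respond(intent: str, confirmation: str) -> str:
    """Confirm the command, then append any hand-scripted follow-up."""
    follow_up = FOLLOW_UPS.get(intent)
    return confirmation if follow_up is None else f"{confirmation} {follow_up}"

print(respond("ReserveDinner", "Your table for two is booked for Friday."))
# Your table for two is booked for Friday. Would you also like to book a
# movie after the meal?
```

Because every follow-up is written out in advance, the model cannot say anything its designers did not anticipate, which is precisely the cautious behaviour Hamilton describes.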
Regarding the privacy implications of home assistants, both Hamilton and Cheung noted that while privacy is a valid concern, much of users’ fear about data security may stem from common misconceptions about the home assistant industry. For companies like Amazon and Google, collecting data is not in itself a nefarious act: They need these data to train their algorithms and improve their models.
Both Hamilton and Cheung agreed that concerns over user privacy also depend on the user’s own risk tolerance.
“Generally, people are aware that these companies are collecting data,” Cheung said. “If they accept this, they will buy and use these devices of their own volition.”
Once customers grant access to their information by purchasing these devices, however, Canadian law places very few restrictions on the types of data that companies can collect. In an email to the Tribune, Ignacio Cofone, a professor in the McGill Faculty of Law, discussed the problem of making consumers responsible for protecting their own privacy.
“The idea that, if we rely on consumer consent, consumers will […] manage their privacy risks by consenting only to those things that are beneficial to them has proven, at minimum, ineffective, and at maximum, harmful [to consumer interests],” Cofone wrote.
Cofone’s comments echo statements made by other legal scholars on the role of personal consent in the use of data mining technology. Such scholars maintain that the concept of ‘privacy self-management’ is not enough to protect users from exploitation.
Neither the technology behind Alexa’s upgrade nor its accompanying privacy issues are particularly revolutionary. Perhaps more groundbreaking are their implications: As these devices become more intimately involved in the private lives of users, the data they collect will build an increasingly detailed picture of users’ identities and behaviours. While Amazon continues to profit from user data, the long-term consequences of this degree of data collection are not yet understood.