About Google Duplex, Robots, And Beyond

Last week’s I/O demo, in which the virtual assistant Google Duplex scheduled a haircut appointment with a salon over a phone call, was jaw-dropping and nerve-racking at the same time. You can check this YouTube video of the demo if you haven’t seen it. It was an inventive synthesis of natural language understanding, deep learning, and text-to-speech. Though it reminded me of the unseen implications of AI, the demo, in all honesty, was stupendous! The voice simulation, the conversational responsiveness of the algorithm, and the emotional connection with the person on the other end of the line were so convincing that it felt like a natural conversation between two real people, with no way to tell the difference, except that this wasn’t the case here. It’s a great time to see all the varied experiments around automation and machine learning coming to life, but their implications are obscure and might go beyond the role of ‘assisting’ humans. And here’s why.

I was reading Alec Ross’ insightful non-fiction book ‘The Industries of the Future’, about the pervasive effects of digital transformation, automation, and technology on our culture and jobs. In it, he presents a vivid picture of tech innovations, à la Google Duplex, and of the industries that would define employment prospects in a tech-oriented world, where we would deal with subservient robots, use big data for ‘predictive analytics’, and commonly use genome sequencing for a deeper understanding of our biological composition in areas such as preventive healthcare. In a similar vein, Don Norman’s evocative ‘Emotional Design’ devotes an entire chapter to robotics and outlines a future in which humanoid robots would have access to our homes and personal spaces. (In a sense, this has already happened with a device such as the Roomba vacuum.) The concept seems far-fetched today, but not if you consider the events of last week’s Google I/O. In ‘The Industries of the Future’, Ross also makes a passing reference to artificial intelligence’s damaging effect on voice-based interactions, including a scenario of committing fraud. He says in the book…

[perfectpullquote align=”right” bordertop=”false” cite=”” link=”” color=”” class=”” size=””][…] A downside is the increased risk of fraud. If my voice can be reconstructed in a way that makes the reconstruction difficult to distinguish from my “real” voice, then it opens up new opportunities for fraud — fraud in dozens of languages, no less. In a world with near-universal translation and communication, an ironic side effect may be that we’ll need to be able to look somebody in the eye to believe what he or she is saying.[/perfectpullquote]

Although we are not quite there yet, today marks the beginning of an exciting journey. For instance, Duplex isn’t yet displaying a natural one-to-one conversation between a human and a machine. This was a short demo showcasing Duplex’s ability to place a call in the background, and the information triggering the task could have been sourced from a pre-scheduled event, probably a calendar event or a voice-based command. In other words, assume the event of booking a haircut appointment triggered Duplex to (a) understand the situation or task, then (b) form a judgement based on the request, and finally (c) place a call to the salon asking for an appointment. Now, could we see a future iteration of Duplex looking beyond a pre-booking scenario and exploring a natural conversation? For instance, let’s say I missed a call from my doctor’s clinic while I was busy in a meeting. During that time, Duplex calls the clinic to understand the situation with my medical reports. And, just as would happen during a natural phone conversation, Duplex takes text notes on the doctor’s recommendations. Later, the moment I’m out of my meeting, I could listen to the recorded conversation between the doctor and the virtual assistant and get the full details. It’d indeed be noteworthy to see Duplex discussing the patient’s condition with the doctor while translating and summarizing the technical terms during the conversation, based on natural language processing alone. Eventually, I would expect it to carry out this serious discussion in a local language such as French or German, which is still a long way into the future. So until this scenario reaches its logical conclusion, the chances of someone committing “fraud”, as mentioned in the quote from the book, seem like a long shot.

But let’s assume someone does have the gumption to use Duplex to impersonate a caller with the intention of committing fraud. How might we build the necessary safeguards to prevent such misuse? Should we simply add a unique marker during a phone call, such as audible “beeps” at varying intervals (call it a ‘Morse code’ that signals the individual’s identity), to differentiate between a genuine caller and an impersonator? Or how about we aim to make Duplex sentient? It could then develop an autonomous profile and an emotional judgement of a person’s character at a visceral and behavioural level, interpret whether a phone call someone is making is destructive or constructive in nature, and prevent repeated misuse by locking itself down and making a detailed report about the abnormalities available to the relevant authorities for further action. Our machines today already work on the simple logic of not carrying out our command when things don’t seem right. An elevator, for instance, won’t budge if a person or an object obstructs the closing doors after we have pressed our floor button. But let’s also imagine that, in the future, this elevator was programmed with face recognition and neural network capabilities. If that programming integrated a sentient character, it could also judge whether the passengers in the elevator are dangerous to the residents of the building, and to what extent. Based on that judgement, it would decide whether to lock them inside until the police arrived at the scene or let them move on to their destinations. Of course, the elevator would be governed by liability regulations so that the machine could be controlled without any glitches in the event of an oversight.

This feels like a narrative from a sci-fi movie today. Yet the true potential of virtual assistants like Google’s Duplex, or of the technologies going beyond it, including robots, can only be unlocked if their makers intend for them to become autonomous while integrating well-defined safeguards and liability clauses. This would go some way towards alleviating consumers’ fear of Terminator-style machines. As a multitude of tech innovations begin surfacing on the horizon and the line between the virtual and real worlds gradually starts to blur, it’s becoming more evident that the time for humans and machines to coexist has finally arrived!