Speech comes of age
Voice recognition technology has matured enough to be a powerful tool for government services, says Douglas Perris of Uniphore
One day soon, we will talk to our gadgets the way we talk to everyone else. And our gadgets will talk back. They will be able to hear what we say and figure out what we mean.
This is not a premonition but a reality. If you’re steeped in today’s technology, you’ll understand how the rise of voice recognition is revolutionizing our lives. These new tools are extending the reach of our digital lives into new places and situations. Our relationship with devices is becoming increasingly conversational, and that will make our relationship with technology even more loyal and intimate.
The first speech recognition system could only understand digits. It evolved through the decades to a point where we now have a Siri or a Cortana. The explosion of voice recognition apps indicates that speech recognition’s time has come, and that you can expect plenty more of it.
This disruptive technology is already spilling into the public sector, and rightly so. A government doesn’t have a say in choosing its customers. It has to provide a range of services to a wide spectrum of people and, in the process, faces bigger challenges because of that diversity. A government is constantly answerable to the people. It is a different dynamic altogether.
The notion of developing technology to improve a city’s performance, liveability, bureaucracy and infrastructure for its citizens is full of potential for entrepreneurs and the digital world alike. Daily transactions can shift to computer-driven interactions at a fraction of the cost of an on-site employee. The UAE, for instance, can take the lead in developing such solutions for the government to improve the quality of life of citizens, stimulate business and drive economic growth.
In the past few months, the term ‘smart’ has become prevalent and exciting for technology developers across the globe. Dubai is a prime example of that trend. It has seen tremendous growth in the past 10 years. It is perhaps most famous for its construction projects: the largest hotel in the world, the 162-story Burj Khalifa skyscraper (formerly Burj Dubai) and a chain of 300 islands shaped like countries of the world. The government is already in the midst of making many more services ‘smart’ in sectors including transportation, society, the economy, government and the environment.
But for all its fairy-tale growth, some of its sectors have lagged behind. Call centre service is one of them, and Dubai Airport was a case in point. With traffic increasing and the strain on its call centre growing with it, change was inevitable. Given that the airport handles more than 100 airlines flying to destinations across the world, call volume proved to be a big headache. It’s no surprise, then, that the airport installed a voice-enabled system that could relay flight information to travellers.
As human specialists continue to feed high-quality translations into speech recognition technologies, machine translations will become more refined. The automatic speech recognizer will have better information to work from when it issues its translations, whether vocalized or put into text. This is important for localisation efforts, which depend on understanding cultural nuances in order to communicate clearly and avoid mistakes.
This, in turn, becomes increasingly important in the government’s intelligence efforts, specifically for real-time intelligence. Machine translation is king when it comes to real-time intelligence, because humans take too long to turn things around. The government can use a number of sensors to look for suspicious language. If a sensor finds something suspicious in another language, machine translation can quickly digest the content and provide a summary of what is being said.
In the healthcare sector, too, this technology can be a boon for the Dubai government. Adopting voice recognition across the hospital enterprise brings many benefits. Voice recognition software can improve the quality and efficiency of medical reporting. It can remove the need to write longhand, and thus eliminate the errors that creep in when secretaries transcribe notes, saving time and improving accuracy. It can also provide a localised service for customers across the Middle East, offering a voice recognition solution for the multi-lingual, multi-cultural community of doctors in the region.
Where this emerging technology can really prove its mettle is in localising to the regional market. This means dealing with a plethora of regional Arabic dialects. Being able to handle multiple dialects will help the Dubai government validate identities, build commercial and transportation hubs, and identify and resolve basic citizen queries. Speech is really starting to take off in this market. Going forward, the government can deploy applications that handle transactions and even voice biometrics.
As exciting as these developments sound, there’s still a lot to be learned by automatic speech recognition and translation. It is difficult to perfect translation technology because everyone speaks differently, even if they’re speaking the same language. We use different tones, tempos and accents. These things confuse the automatic speech recognition engine, and it takes a lot of quality data to help the engine understand that it’s the same language that is being spoken.
Moreover, language is evolving over time. The government will need to learn continuously if it wants to improve the life of its citizens. Dubai can then slowly move to delivering truly ‘smart’ services to the city. The government also constantly needs to adapt to the ever-changing technological advances and needs of its citizens to deliver on its promises.
The power to change the way people live rests in the hands of the government, not business enterprises. The adoption of speech recognition will not only save on resource costs, but also improve communication between the government and its people. It’s time to take this grown-up child seriously.
Douglas Peris is General Manager, APAC & Middle East, Uniphore Software Systems.