TechCentral


    Why tech giants love the sound of your voice

    By Agency Staff, 13 December 2016
    Amazon’s Echo

    Amazon’s Echo has made tangible the promise of an artificially intelligent personal assistant in every home.

    Those who own the voice-activated gadget (known colloquially as Alexa, after its female interlocutor) are prone to proselytising “her” charms, applauding Alexa’s ability to call an Uber, order pizza or check the kid’s maths homework. The company says more than 5 000 people a day profess their love for Alexa.

    On the other hand, Alexa devotees also know that unless you speak to her very clearly — and slowly — she’s likely to say: “Sorry, I don’t have the answer to that question.”

    “I love her. I hate her, I love her,” one customer wrote on Amazon’s website, while still awarding Alexa five stars. “You will very quickly learn how to talk to her in a way that she will understand and it’s not unlike speaking to a small frustrating toddler.”

    Voice recognition has come a long way in the past few years. But it’s still not good enough to popularise the technology for everyday use and usher in a new era of human-machine interaction, allowing us to talk to all our gadgets — cars, washing machines, televisions. Despite advances in speech recognition, most people continue to swipe, tap and click. And probably will for the foreseeable future.

    What’s holding back progress? Partly, the artificial intelligence that powers the technology still has room to improve. There’s also a serious deficit of data, specifically audio of human voices speaking in multiple languages, accents and dialects, often in noisy circumstances that can defeat the code.

    So Amazon, Apple, Microsoft and China’s Baidu have embarked on a worldwide hunt for terabytes of human speech.

    Microsoft has set up mock apartments in cities around the globe to record volunteers speaking in a home setting. Every hour, Amazon uploads Alexa queries to a vast digital warehouse. Baidu is busily collecting every dialect in China. Then they take all that data and use it to teach their computers how to parse, understand and respond to commands and queries.

    The challenge is finding a way to capture natural, real-world conversations. Even 95% accuracy isn’t enough, says Adam Coates, who runs Baidu’s artificial intelligence lab in Sunnyvale, California. “Our goal is to push the error rate down to 1%,” he says. “That’s where you can really trust the device to understand what you’re saying, and that will be transformative.”
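
    To make those figures concrete: speech researchers commonly report accuracy as a word error rate, the number of word substitutions, insertions and deletions needed to turn the machine's transcript into the correct one, divided by the number of words actually spoken. The article does not say which metric Coates has in mind, so the sketch below is only an illustration under that assumption; the function name and the sample sentences are hypothetical. On this measure, 95% accuracy means roughly one word in 20 is wrong, while a 1% error rate means one in 100.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: real evaluations also normalise punctuation,
    numbers and casing before comparing transcripts.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical four-word command with one word misheard.
    print(word_error_rate("call my mom now", "call my aunt now"))
```

    One misheard word in a four-word command already gives an error rate of 25%, which is why short spoken commands are so sensitive to small recognition mistakes.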

    Not so long ago, voice recognition was comically rudimentary. An early version of Microsoft’s technology running in Windows transcribed “mom” as “aunt” during a 2006 demo before an auditorium of analysts and investors. When Apple debuted Siri five years ago, the personal assistant’s gaffes were widely mocked because it, too, routinely spat out incorrect results or didn’t hear the question correctly. When asked if Gillian Anderson is British, Siri provided a list of English restaurants. Now Microsoft says its speech engine makes the same number or fewer errors than professional transcribers, Siri is winning grudging respect, and Alexa has given us a tantalising glimpse of the future.

    Much of that progress owes a debt to the magic of neural networks, a form of artificial intelligence based loosely on the architecture of the human brain. Neural networks learn without being explicitly programmed but generally require an enormous breadth and diversity of data. The more a speech recognition engine consumes, the better it gets at understanding different voices and the closer it gets to the eventual goal of having a natural conversation in many languages and situations.

    Hence the global scramble to capture a multitude of voices. “The more data we shove in our systems the better it performs,” says Andrew Ng, Baidu’s chief scientist. “This is why speech is such a capital-intensive exercise; not a lot of organisations have this much data.”

    When the industry began working seriously on voice recognition in the 1990s, companies like Microsoft relied on publicly available data from research institutes such as the Linguistic Data Consortium, a storehouse of voice and text data founded in 1992 with backing from the US government and located at the University of Pennsylvania.

    Then tech companies started collecting their own voice data, some of it garnered from volunteers who came in to read and be recorded. Now, with the popularity of speech-controlled software gaining ground, they harvest much of the data from their own products and services.

    When you tell your phone to search for something, play a song or guide you to a destination, chances are a company is recording it. (Apple, Google, Microsoft and Amazon emphasise that they anonymise user data to protect customer privacy.)

    When you ask Alexa what the weather is or the latest football score, the gadget uses the queries to improve its understanding of natural language (although “she” isn’t listening to your conversations unless you say her name). “By design, Alexa gets smarter as you use her,” says Nikko Strom, senior principal scientist for the program.

    Microsoft’s Cortana software

    One of the key challenges is getting the technology conversant with multiple languages, accents and dialects. Nowhere, perhaps, is this more crucial than in China. Seeking to harvest dialects from all over the country, Baidu launched a marketing campaign during Chinese New Year earlier this year. Calling the push a “dialect conservation initiative,” the search giant promised people that if they contributed they would help usher in a future when they would talk to Baidu using their dialect.

    In two weeks, the company recorded more than a thousand hours of speech to plug into its computers. Many people did it for free simply because they were proud of their hometown dialects. A high school teacher in Sichuan was so excited about the program, he asked a class of students to record more than a thousand ancient poems in Sichuanese.

    Another challenge: teaching voice recognition technology to pick up commands over background noise — the clamour of happy hour, say, or the cacophony of a sports stadium.

    Microsoft has deployed an Xbox app called Voice Studio to harvest conversation over the din of users shooting villains or watching movies. The company offered rewards including points and digital apparel for avatars and lured hundreds of subjects willing to contribute their game chatter to Microsoft’s speech efforts. The program worked gangbusters in Brazil, where the local subsidiary promoted the app heavily on the main Xbox page. The data was used to create the Brazilian Portuguese version of Cortana, released earlier this year.

    Companies are also designing voice recognition systems for specific situations. Microsoft has been testing technology that can answer travellers’ queries without being distracted by the constant barrage of flight announcements at airports. The company’s technology is also being used in an automated ordering system for McDonald’s drive-thrus. Trained to ignore scratchy audio, screaming kids and “ums”, it can spit out a complicated order, getting even the condiments right. Amazon is conducting tests in automobiles, challenging Alexa to work well with road noise and open windows.

    Apple’s Siri

    Even as companies scour the world for data, they’re figuring out ways to improve voice recognition with less of it. The technology being tested at McDonald’s is more accurate than other systems that use much more data, says Xuedong Huang, Microsoft’s chief speech scientist, who has been working on voice recognition at the company for more than two decades. “You can always have breakthroughs even without using the most data.”

    Google generally subscribes to a less-is-more philosophy, deploying a piecemeal approach that uses unintelligible units of sound to build words and phrases. With its speech recognition system, the company aims to solve multiple problems with just one change.

    For its data sets, Google strings together tens of thousands of audio snippets that are typically two to five seconds long. The process requires less computing power and can be more easily tested and tweaked, says Google researcher Françoise Beaufays.

    For its part, Baidu is working on more efficient algorithms where learning one language makes it easier to learn the next 12. That’s particularly important for those spoken by tens of thousands of people rather than millions, where there just won’t be huge swaths of data no matter what, says Ng, the company’s chief scientist.

    Ask researchers like Ng when it will be possible to speak naturally to your digital assistant and they get wistful. No one really knows. Neural networks remain mysterious even to those who understand them best. And much of the work is trial and error; make a tweak here and you’re never quite sure what will happen there.

    Based on the current technology and methods, the process will probably take years. But Ng, Huang, Beaufays and other scientists say you never know when a breakthrough will arrive, catapulting research forward and turning Alexa and Siri into true conversationalists.  — (c) 2016 Bloomberg LP



    © 2009 - 2025 NewsCentral Media