TechCentral


    Google’s ‘Babel fish’ heralds future of translation

    By Editor, 9 January 2012
    Ashish Venugopal

    In Douglas Adams’s famous Hitchhiker’s Guide to the Galaxy series of science-fiction books, interstellar species use Babel fish — “small, yellow, leech-like” creatures that feed on “brain-wave energy” — to translate speech in real time.

    A team of developers at Google is working on the real thing, using statistical models to translate different languages, including Afrikaans, on the Web and on mobile phones, using voice input and output as well as text.

    TechCentral sat down with Google Translate research scientist Ashish Venugopal at Google’s headquarters in Silicon Valley last week and asked him about the stumbling blocks to effective real-time translation and the future of the technology. This is an edited transcript of that interview.

    TechCentral: How many languages does Google Translate now support?

    Ashish Venugopal: There are 63 languages supported. That’s a lot of languages. How do we get all that data in there? If we tried manually to give the system those languages, it would be a hopeless task. The only possible way we could do this is to harness the power of machine computation. We build statistical models that are automatically training themselves and learning all the time. As people translate new content on the Web, our systems pick this up and add the words. The system is constantly reading and analysing the Web. It’s a statistical approach. The idea is that once we learn the essential model of how to speak a word, we can apply that to every word. We haven’t memorised every word.

    Are there any difficult languages that make it hard to get translation right?

    Yes, there are some incredibly tough languages. If your language is very different from English, for example, then it will be very difficult to translate it to English. We use English as an intermediate language and so if you were translating from Russian to Japanese, we’d translate the Russian to English and then to Japanese.
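The idea of routing a translation through an intermediate language can be sketched in a few lines. This is only a toy illustration, not Google's system: the two dictionaries, their entries and the function names are assumptions made up for the example, and a real system translates whole sentences with statistical models rather than looking up single words.

```python
# Toy word-for-word dictionaries (illustrative entries, not real system data).
RU_TO_EN = {"кошка": "cat", "собака": "dog"}
EN_TO_JA = {"cat": "猫", "dog": "犬"}


def translate(word, table):
    """Look up one word; return None if the table has no entry for it."""
    return table.get(word)


def pivot_translate(word, source_to_pivot, pivot_to_target):
    """Translate via an intermediate (pivot) language, here English."""
    pivot = translate(word, source_to_pivot)
    if pivot is None:
        # Without a pivot word there is nothing to pass to the second step.
        return None
    return translate(pivot, pivot_to_target)


print(pivot_translate("кошка", RU_TO_EN, EN_TO_JA))
```

One consequence of this design is visible even in the toy: any word the first step cannot handle is lost before the second step ever runs, so errors in the pivot language carry through to the final output.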

    When we talk about a “tough” language, it’s one that is really different compared to English. There are languages that are very different in multiple dimensions.

    The first question to ask is, what is the order of words in one language compared to English? In English, we’d put the subject first, then the verb and then the object, whereas the Japanese have the subject first, then the object and finally the verb. We have to teach computers how to recognise this reordering pattern.
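The subject-verb-object to subject-object-verb reordering described here can be shown with a hand-written rule. This is a simplified sketch under stated assumptions: the role tags ("S", "V", "O") are supplied by hand and the function name is hypothetical, whereas the interview makes clear that the real system learns such patterns from data rather than being given them.

```python
# Position of each grammatical role in a subject-object-verb sentence.
SOV_ORDER = {"S": 0, "O": 1, "V": 2}


def reorder_to_sov(tagged_words):
    """Reorder (word, role) pairs from any order into subject-object-verb.

    sorted() is stable, so words sharing a role keep their relative order.
    """
    ordered = sorted(tagged_words, key=lambda pair: SOV_ORDER[pair[1]])
    return [word for word, _role in ordered]


english = [("I", "S"), ("eat", "V"), ("apples", "O")]
print(reorder_to_sov(english))
```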

    We don’t tell the computer how to translate every sentence. We give it general patterns to look for. When it sees new data, it uses those patterns, matches that to data and then comes up with a model that it uses to translate sentences.
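To make "matching patterns to data" concrete, here is a minimal sketch of learning word translations from a handful of paired sentences. It is not the method Google uses: the tiny corpus, the function name and the choice of the Dice coefficient as the scoring pattern are all assumptions picked to keep the example short.

```python
from collections import Counter
from itertools import product

# Tiny made-up parallel corpus (English, German), lower-cased for simplicity.
PARALLEL = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
]


def learn_word_table(pairs):
    """For each source word, pick the target word it co-occurs with most
    consistently, scored with the Dice coefficient."""
    src_count, tgt_count, cooc = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        # Every source word is a candidate match for every target word
        # that appears in the same sentence pair.
        cooc.update(product(s_words, t_words))

    table = {}
    for s in src_count:
        def dice(t):
            return 2 * cooc[(s, t)] / (src_count[s] + tgt_count[t])
        table[s] = max(tgt_count, key=dice)
    return table


print(learn_word_table(PARALLEL))
```

Nobody told the program that "book" means "buch". It fell out of the counts, and adding more sentence pairs would sharpen the scores, which is the sense in which more source data makes the problem easier.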

    When we say languages are harder, they’re harder because of the ordering of words, they’re harder because there may be different notions of what a word even is. In English, you say you put the phone on the table — “phone” and “table” are objects and “on” is an additional word that explains what’s happening. In other languages, the “on” could be glued onto the word “phone” or “table” and we have to teach the computer that “on” could be connected to the object or be separate from it.

    All these issues get easier when there’s more source data. We launch languages when we feel they are adding value to somebody. We have “alpha” or experimental languages where we were just able to launch the system, as opposed to it being fluent and correct. The alpha languages tend to have less source data available online.

    What are the main stumbling blocks to this technology and what will be possible in the future?

    We are really reliant on the source data. The first stumbling block for a new language is, is there data on the Web? Once there’s enough content on the Web and as we build our system … on average it works really well. On average, you’ll be very impressed with it. But every once in a while you’ll be irritated with it.

    Because of the statistical approach, you may enter something and get some crazy translation. What we are trying to do is limit those crazy translations and ensure in all cases we are providing a reasonable translation.

    This really comes from the fact that this is a statistical system. We’ve built it so you can literally put anything into it. We will translate anything you give us. It might be good or it might be bad, but on average it will be quite impressive.

    What we are really working on now is clipping the bottom end of the cases where we make mistakes. We see these issues in languages that are very different compared to English. Russian, for example, adds a lot of information to words and they get longer and longer and when we translate we sometimes make mistakes there.

    In the future, in a reasonably short time, we will take machine translation for granted, as part of our everyday lives. I mean that from an 80-20 standpoint, where 80% of the use cases we’ll be able to address effectively. The last 20% will be incredibly hard. That speaks to the fact that machine translation won’t be a substitute for a human translator.

    No one is going to take an important political speech and put it into machine translation to publish it in 20 different languages. Our goal is not to create artificial intelligence; our goal is to provide an 80% solution where you’ll be able to understand the political speech’s point, but not its rhetoric, not its beauty necessarily.

    Is the future of this technology instant voice translation using devices like mobile phones to facilitate real-time translation of conversations?

    We can do that already, but not simultaneously. It’s not an immediate goal. It’s a matter of where we are focusing now. There’s still more work to be done on the quality side before we can start to develop this continuous form of operation.

    Will you continue to do translation in the cloud (online on servers) or will it move down to devices like phones as they get more powerful?

    We make all our decisions purely based on quality. We want to ensure the highest quality translations are delivered to our users in the shortest possible time and that’s leaning towards the cloud for now, but that might change.

    What sort of computing power does Google Translate require?

    We use the full power of Google’s search engine. The reason Google Translate exists is because of the investments made in search. We sit on top of that search infrastructure.

    Do you have a team of linguists working all over the world?

    We have a team of statisticians, all working right over there [points and laughs]. It’s less linguistically orientated. There are linguistic ideas that influence our decisions. To give you an example, when I was working on the last set of Indian languages that were launched, I didn’t use any linguistic knowledge; I used Wikipedia and my grandmother. So, it’s Wikipedia, my grandmother and statistics. That’s what we use to put a language together.  — Duncan McLeod, TechCentral
