TechCentral

    The insanely powerful supercomputer Microsoft built for AI workloads

    When Microsoft invested $1-billion in OpenAI in 2019, it agreed to build a cutting-edge supercomputer for the AI research start-up.
    By Dina Bass, 13 March 2023

    When Microsoft invested US$1-billion in OpenAI in 2019, it agreed to build a massive, cutting-edge supercomputer for the artificial intelligence research start-up. The only problem: Microsoft didn’t have anything like what OpenAI needed and wasn’t totally sure it could build something that big in its Azure cloud service without it breaking.

    OpenAI was trying to train an increasingly large set of AI programs called models, which were ingesting greater volumes of data and learning more and more parameters, the variables the AI system has sussed out through training and retraining. That meant OpenAI needed access to powerful cloud computing services for long periods of time.

    To meet that challenge, Microsoft had to find ways to string together tens of thousands of Nvidia’s A100 graphics chips — the workhorse for training AI models — and change how it positions servers on racks to prevent power outages. Scott Guthrie, the Microsoft executive vice president who oversees cloud and AI, wouldn’t give a specific cost for the project, but said “it’s probably larger” than several hundred million dollars.


    “We built a system architecture that could operate and be reliable at a very large scale. That’s what resulted in ChatGPT being possible,” said Nidhi Chappell, Microsoft GM of Azure AI infrastructure. “That’s one model that came out of it. There’s going to be many, many others.”

    The technology allowed OpenAI to release ChatGPT, the viral chatbot that attracted more than a million users within days of going public in November and is now getting pulled into other companies’ business models. As generative AI tools such as ChatGPT gain interest from businesses and consumers, more pressure will be put on cloud services providers such as Microsoft, Amazon.com and Google to ensure their data centres can provide the enormous computing power needed.

    Now Microsoft uses that same set of resources it built for OpenAI to train and run its own large AI models, including the new Bing search bot introduced last month. It also sells the system to other customers. The software giant is already at work on the next generation of the AI supercomputer, part of an expanded deal with OpenAI in which Microsoft added $10-billion to its investment.

    Better for AI

    “We didn’t build them a custom thing — it started off as a custom thing, but we always built it in a way to generalise it so that anyone that wants to train a large language model can leverage the same improvements,” said Guthrie in an interview. “That’s really helped us become a better cloud for AI broadly.”

    Training a massive AI model requires a large pool of connected graphics processing units in one place like the AI supercomputer Microsoft assembled. Once a model is in use, answering all the queries users pose — called inference — requires a slightly different setup. Microsoft also deploys graphics chips for inference but those processors — hundreds of thousands of them — are geographically dispersed throughout the company’s more than 60 regions of data centres. Now the company is adding the latest Nvidia graphics chip for AI workloads — the H100 — and the newest version of Nvidia’s Infiniband networking technology to share data even faster, Microsoft said on Monday in a blog post.

    The new Bing is still in preview with Microsoft gradually adding more users from a waitlist. Guthrie’s team holds a daily meeting with about two dozen employees they’ve dubbed the “pit crew”, after the group of mechanics that tune race cars in the middle of the race. The group’s job is to figure out how to bring greater amounts of computing capacity online quickly, as well as fix problems that crop up.

    “It’s very much a kind of a huddle, where it’s like, ‘Hey, anyone has a good idea, let’s put it on the table today, and let’s discuss it and let’s figure out, okay, can we shave a few minutes here? Can we shave a few hours? A few days?’” Guthrie said.

    A cloud service depends on thousands of different parts and items — the individual pieces of servers, pipes, concrete for the buildings, different metals and minerals — and a delay or short supply of any one component, no matter how tiny, can throw everything off. Recently, the pit crew had to deal with a shortage of cable trays — the basket-like contraptions that hold the cables coming off the machines. So they designed a new cable tray that Microsoft could manufacture itself or find somewhere to buy. They’ve also worked on ways to squish as many servers as possible in existing data centres around the world so they don’t have to wait for new buildings, Guthrie said.

    When OpenAI or Microsoft is training a large AI model, the work happens at one time. It’s divided across all the GPUs and at certain points, the units need to talk to each other to share the work they’ve done. For the AI supercomputer, Microsoft had to make sure the networking gear that handles the communication among all the chips could handle that load, and it had to develop software that gets the best use out of the GPUs and the networking equipment. The company has now come up with software that lets it train models with tens of trillions of parameters.
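The pattern described above, where work is split across many processors that pause at set points to share results, can be illustrated with a toy sketch of synchronous data-parallel training. This is not Microsoft's or OpenAI's software, which the article does not describe. The function names, the four simulated "workers" and the one-parameter model are all assumptions chosen for illustration, and the `all_reduce_mean` function merely stands in for the exchange that real systems perform over the network.

```python
from typing import List, Tuple

Shard = List[Tuple[float, float]]

def local_gradient(w: float, shard: Shard) -> float:
    """Gradient of mean squared error for the model y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values: List[float]) -> float:
    """Hypothetical stand-in for the network step: every worker receives the average."""
    return sum(values) / len(values)

def train(shards: List[Shard], w: float = 0.0, lr: float = 0.01, steps: int = 200) -> float:
    for _ in range(steps):
        # On real hardware each worker would compute this at the same time.
        grads = [local_gradient(w, shard) for shard in shards]
        # Synchronisation point: workers exchange and average their gradients.
        g = all_reduce_mean(grads)
        # Every worker applies the same update, so all copies of w stay identical.
        w -= lr * g
    return w

# Toy data following y = 3x, split evenly across four simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]
w = train(shards)
```

Because every step waits for the averaging to finish, the speed of the exchange between workers limits how quickly training proceeds, which is why the networking gear matters as much as the chips.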

    Because all the machines fire up at once, Microsoft had to think about where they were placed and where the power supplies were located. Otherwise you end up with the data centre version of what happens when you turn on a microwave, toaster and vacuum cleaner at the same time in the kitchen, Guthrie said.


    The company also had to make sure it could cool off all of those machines and chips, and uses evaporation, outside air in cooler climates and high-tech swamp coolers in hot ones, said Alistair Speirs, director of Azure global infrastructure.

    Microsoft is going to keep working on customised server and chip designs and ways to optimise its supply chain in order to wring any speed gains, efficiency and cost-savings it can, Guthrie said.


    “The model that is wowing the world right now is built on the supercomputer we started building a couple of years ago. The new models will be built on the new supercomputer we’re training now, which is much bigger and will enable even more sophistication,” he said.  — Reported with Max Chafkin and Ian King, (c) 2023 Bloomberg LP

    © 2009 - 2026 NewsCentral Media