    How the AI behind ChatGPT actually works

    The language models powering modern AI tools have a much longer history than most people realise.
    By The Conversation, 16 December 2024

    The arrival of AI systems called large language models (LLMs), like OpenAI’s ChatGPT chatbot, has been heralded as the start of a new technological era. And they may indeed have significant impacts on how we live and work in future.

    But they haven’t appeared from nowhere and have a much longer history than most people realise. In fact, most of us have already been using the approaches they are based on for years in our existing technology.

    LLMs are a particular type of language model, which is a mathematical representation of language based on probabilities. If you’ve ever used predictive text on a mobile phone or asked a smart speaker a question, then you have almost certainly already used a language model. But what do they actually do and what does it take to make one?

    Language models are designed to estimate how likely it would be to see a particular sequence of words. This is where probabilities come in. For example, a good language model for English would assign a high probability to a well-formed sentence like “the old black cat slept soundly” and a low probability to a random sequence of words such as “library a or the quantum some”.

    Most language models can also reverse this process to generate plausible-looking text. The predictive text in your smartphone uses language models to anticipate how you might want to complete text as you are typing.
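
A rough sketch of how predictive text could work is to count which words have followed each word in earlier messages and suggest the most frequent ones. The message history and the function name `suggest` below are invented for illustration; real keyboards use far larger models and more context than a single previous word.

```python
from collections import Counter, defaultdict

# Hypothetical message history standing in for the text a phone learns from.
history = [
    "see you at the office",
    "see you at the gym",
    "see you at the office tomorrow",
    "meet me at the office",
]

# For every word, count which words have followed it.
followers = defaultdict(Counter)
for message in history:
    words = message.split()
    for prev, nxt in zip(words, words[1:]):
        followers[prev][nxt] += 1

def suggest(prev_word, k=3):
    """Return up to k of the most frequent continuations of prev_word."""
    return [word for word, _ in followers[prev_word].most_common(k)]

print(suggest("the"))  # "office" ranks first: it followed "the" three times
print(suggest("see"))
```

A word the model has never seen simply produces no suggestions, which is one reason practical systems need a great deal of example text.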

    The earliest method for creating language models was described in 1951 by Claude Shannon, a researcher working at Bell Labs. His approach was based on sequences of words known as n-grams – say, “old black” or “cat slept soundly”. The probability of n-grams occurring within text was estimated by looking for examples in existing documents. These mathematical probabilities were then combined to calculate the overall probability of longer sequences of words, such as complete sentences.
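
The counting approach described above can be sketched in a few lines of Python. This is a minimal illustration rather than Shannon's original procedure: the tiny corpus, the function names and the add-one smoothing (used so that unseen word pairs do not get a probability of exactly zero) are all assumptions made for the example.

```python
from collections import Counter

# Toy corpus (hypothetical); real models are estimated from vast document sets.
corpus = [
    "the old black cat slept soundly",
    "the old dog slept soundly",
    "the black cat sat on the mat",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    unigrams.update(words[:-1])            # count each word as a context
    bigrams.update(zip(words, words[1:]))  # count adjacent word pairs

# Every word that can follow another word, including the end marker.
vocab = {w for s in corpus for w in s.split()} | {"</s>"}

def bigram_prob(prev, word):
    """P(word | prev) with add-one smoothing (an assumption of this sketch)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

def sentence_prob(sentence):
    """Multiply the bigram probabilities to score a whole sentence."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        prob *= bigram_prob(prev, word)
    return prob

well_formed = sentence_prob("the old black cat slept soundly")
scrambled = sentence_prob("soundly cat the slept black old")
print(well_formed > scrambled)  # the well-formed word order scores higher
```

The scrambled sentence uses exactly the same words, so the difference in score comes only from how often each adjacent pair was seen in the example text.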

    Neural networks

    Estimating probabilities for n-grams becomes much more difficult as the n-gram gets longer, so it is much harder to estimate accurate probabilities for 4-grams (sequences of four words) than for bi-grams (sequences of two words). Consequently, early language models of this type were often based on short n-grams.
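
One way to see why longer n-grams are harder to estimate is to count how many of them appear only once in a piece of text: a probability based on a single sighting is not very reliable. The sample text and the helper names below are made up for this sketch.

```python
from collections import Counter

# Hypothetical sample text; any passage shows the same trend.
text = (
    "the old black cat slept soundly while the old dog slept soundly "
    "and the old black cat sat on the mat while the old dog sat on the rug"
).split()

def ngram_counts(words, n):
    """Count every run of n consecutive words."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def singleton_share(words, n):
    """Fraction of distinct n-grams that were seen exactly once."""
    counts = ngram_counts(words, n)
    return sum(1 for c in counts.values() if c == 1) / len(counts)

for n in (2, 4):
    print(f"{n}-grams: {len(ngram_counts(text, n))} distinct, "
          f"{singleton_share(text, n):.0%} seen only once")
```

In this sample, most 4-grams occur a single time, while many bi-grams repeat and so give the model several examples to learn from.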

    However, this meant that they often struggled to represent the connection between words that occurred far apart. This could result in the start and end of a sentence not matching up when the language model was used to generate a sentence.

    To avoid this problem, researchers created language models based on neural networks – AI systems that are modelled on the way the human brain works. These language models are able to represent connections between words that may not be close together. Neural networks rely on large numbers of numerical values (known as parameters) to help understand these connections between words. These parameters must be set correctly for the model to work well.

    The neural network learns the appropriate values for these parameters by looking at large numbers of example documents, in a similar way that n-gram probabilities are learned by n-gram language models. During this “training” process, the neural network looks through the training documents and learns to predict the next word based on the ones that have come before.
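
The training process can be illustrated with about the simplest network imaginable: a single layer of parameters holding one score for every pair of previous word and next word. Everything here is an assumption made for the sketch (the two training sentences, the learning rate, the number of passes); real language models have many layers and look at far more than one preceding word.

```python
import math

# Hypothetical training documents.
sentences = ["the old black cat slept soundly", "the old dog slept soundly"]
pairs = []
for s in sentences:
    w = s.split()
    pairs.extend(zip(w, w[1:]))  # (previous word, next word) examples

vocab = sorted({word for pair in pairs for word in pair})
index = {word: i for i, word in enumerate(vocab)}
V = len(vocab)

# The parameters: one score per (previous word, next word) pair.
# They start at zero, so every next word is initially equally likely.
weights = [[0.0] * V for _ in range(V)]

def predict(prev):
    """Turn the scores in the row for prev into probabilities (softmax)."""
    row = weights[index[prev]]
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def average_loss():
    """Mean negative log-probability given to the correct next word."""
    return -sum(math.log(predict(p)[index[n]]) for p, n in pairs) / len(pairs)

loss_before = average_loss()
learning_rate = 0.5
for epoch in range(200):
    for prev, nxt in pairs:
        probs = predict(prev)
        row = weights[index[prev]]
        for j in range(V):
            # Cross-entropy gradient for each score: probability minus target.
            target = 1.0 if j == index[nxt] else 0.0
            row[j] -= learning_rate * (probs[j] - target)
loss_after = average_loss()

slept_probs = predict("slept")
best = vocab[max(range(V), key=lambda j: slept_probs[j])]
print(f"loss {loss_before:.3f} -> {loss_after:.3f}; after 'slept': '{best}'")
```

Each pass nudges the parameters so that the word that actually came next becomes a little more probable, which is the sense in which the network "learns" from its training documents.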

    These models work well but have some disadvantages. Although, in theory, the neural network is able to represent connections between words that occur far apart, in practice more importance is placed on those that are closer together.

    More importantly, words in the training documents have to be processed in sequence to learn appropriate values for the network’s parameters. This limits how quickly the network can be trained.

    The dawn of transformers

    A new type of neural network, called a transformer, was introduced in 2017 and avoided these problems by processing all the words in the input at the same time. This allowed transformers to be trained in parallel, meaning that the calculations required can be spread across multiple computers and carried out at the same time.
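
The article does not name the mechanism, but transformers achieve this with a calculation known as self-attention, in which each word's new representation is a weighted blend of all the words in the input. The sketch below is heavily simplified and rests on assumptions: the two-dimensional word vectors are invented, and the same vector is used as query, key and value, whereas real models learn separate projections for each role.

```python
import math

# Hypothetical 2-dimensional vectors standing in for three input words.
vectors = {
    "cat": [1.0, 0.0],
    "slept": [0.0, 1.0],
    "soundly": [0.2, 0.9],
}
words = list(vectors)

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query_word):
    """Attention weights and blended vector for one word.

    The result depends only on the input vectors, never on the output
    computed for another position, so all positions can be handled at once.
    """
    q = vectors[query_word]
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, vectors[w])) / scale for w in words]
    weights = softmax(scores)
    blended = [sum(wt * vectors[w][d] for wt, w in zip(weights, words))
               for d in range(len(q))]
    return weights, blended

# These calls are independent of one another and could run in parallel.
outputs = {w: attend(w) for w in words}
for w, (weights, _) in outputs.items():
    print(w, [round(x, 2) for x in weights])
```

Because no step waits on the result for an earlier word, the work can be divided across many processors, in contrast to the word-by-word processing of earlier networks.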

    A side effect of this change is that it allowed transformers to be trained on vastly more documents than was possible for previous approaches, producing larger language models.

    Transformers also learn from examples of text but can be trained to solve a wider range of problems than only predicting the next word. One is a kind of “fill in the blanks” problem where some words in the training text have been removed. The goal here is to guess which words are missing.
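
Preparing a “fill in the blanks” training example might look something like the following. The `[MASK]` placeholder, the proportion of words hidden and the function name are assumptions of this sketch, not details given in the article.

```python
import random

def make_masked_example(sentence, mask_rate=0.3, seed=0):
    """Hide a random subset of words and record what was hidden."""
    rng = random.Random(seed)
    words = sentence.split()
    how_many = max(1, round(len(words) * mask_rate))
    hidden_positions = sorted(rng.sample(range(len(words)), how_many))
    answers = {pos: words[pos] for pos in hidden_positions}
    masked = ["[MASK]" if i in answers else w for i, w in enumerate(words)]
    return " ".join(masked), answers

masked_text, answers = make_masked_example("the old black cat slept soundly")
print(masked_text)  # what the model is shown
print(answers)      # what it is asked to guess, keyed by word position
```

No human labelling is needed: the original text supplies both the question and the correct answer.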

    Another problem is where the transformer is given a pair of sentences and asked to decide whether the second should follow the first. Training on problems like these has made transformers more flexible and powerful than previous language models.
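
Training examples for the sentence-pair problem can also be built automatically from ordinary text. In this sketch, whose sample document and function name are invented, each sentence is paired once with its true successor and once with a different sentence drawn at random.

```python
import random

def make_sentence_pairs(sentences, seed=0):
    """Build (first, second, follows) examples from sentences in order."""
    rng = random.Random(seed)
    examples = []
    for i in range(len(sentences) - 1):
        first, true_next = sentences[i], sentences[i + 1]
        examples.append((first, true_next, True))
        # Negative example: any sentence except first and its true successor.
        others = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
        examples.append((first, rng.choice(others), False))
    return examples

# Hypothetical document; it needs at least three sentences.
document = [
    "The cat was tired.",
    "It curled up on the sofa.",
    "Soon it was fast asleep.",
    "Outside, the rain kept falling.",
]
for first, second, follows in make_sentence_pairs(document):
    print(follows, "|", first, "->", second)
```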

    The use of transformers has allowed the development of modern large language models. They are referred to as large in part because they are trained using vastly more text examples than previous models.

    Some of these AI models are trained on over a trillion words. It would take an adult reading at average speed more than 7 600 years to read that much. These models are also based on very large neural networks, some with more than 100 billion parameters.
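
The reading-time figure can be checked with simple arithmetic. The article does not say which reading speed it assumes; the 250 words per minute used here, read non-stop with no sleep or breaks, is an assumption that happens to give a result in line with the number quoted.

```python
WORDS = 1_000_000_000_000         # "over a trillion words", from the article
WORDS_PER_MINUTE = 250            # assumed average adult reading speed
MINUTES_PER_YEAR = 60 * 24 * 365  # reading around the clock

years = WORDS / WORDS_PER_MINUTE / MINUTES_PER_YEAR
print(f"{years:,.0f} years")
```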

    In the last few years, an extra component has been added to large language models that allows users to interact with them using prompts. These prompts can be questions or instructions.

    Reinforcement learning

    This has enabled the development of generative AI systems such as ChatGPT, Google’s Gemini and Meta’s Llama. Models learn to respond to the prompts using a process called reinforcement learning, which is similar to the way computers are taught to play games like chess.

    Humans provide the language model with prompts, and the humans’ feedback on the replies produced by the AI model is used by the model’s learning algorithm to guide further output. Generating all these questions and rating the replies requires a lot of human input, which can be expensive to obtain.
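
The idea of steering a model with ratings can be shown with a toy example. Here the "model" merely chooses between three canned replies to one prompt, and the ratings are invented. To keep the result repeatable, the sketch also updates the scores using the average rating across all replies rather than sampling one reply at a time, as a real system would. It is an illustration of the principle, not of how any production chatbot is trained.

```python
import math

# Candidate replies to the prompt "What is the capital of France?"
replies = ["I don't know.", "Paris.", "The capital of France is Paris."]
ratings = [-1.0, 0.5, 1.0]   # hypothetical human ratings, one per reply
scores = [0.0, 0.0, 0.0]     # the model's adjustable preference per reply

def reply_probs():
    """Probability of giving each reply (softmax over the scores)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

learning_rate = 0.5
for step in range(500):
    probs = reply_probs()
    expected_rating = sum(p * r for p, r in zip(probs, ratings))
    for j in range(len(scores)):
        # Gradient of the expected rating with respect to each score:
        # replies rated above the current average become more likely.
        scores[j] += learning_rate * probs[j] * (ratings[j] - expected_rating)

final = reply_probs()
print({reply: round(p, 3) for reply, p in zip(replies, final)})
```

After training, nearly all of the probability sits on the reply that people rated most highly.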

    One way of reducing this cost is to create examples using a language model in order to simulate human-AI interaction. This AI-generated feedback is then used to train the system.

    Creating a large language model is still an expensive undertaking, though. The cost of training some recent models has been estimated to run into hundreds of millions of dollars. There is also an environmental cost, with the carbon dioxide emissions associated with creating LLMs estimated to be equivalent to multiple transatlantic flights.

    These are things that we will need to find solutions to amid an AI revolution that, for now, shows no sign of slowing down.

    • The author, Mark Stevenson, is senior lecturer, University of Sheffield
    • This article is republished from The Conversation under a Creative Commons licence.
