
A few years ago, I wrote about how product teams shouldn’t care about Kubernetes. The goal was to build “golden paths” through internal developer platforms that abstract away infrastructure complexity, allowing engineers to focus on shipping value rather than the underlying logistics of how that value is delivered.
Today, this challenge has entered a new phase. While much of the initial enterprise AI focus was centred on conversational interfaces, we are now experiencing the wave of agentic AI integration. For organisations to unlock the value of generative AI and realise the true impact of large language models, the requirement has evolved from “AI that chats” to “AI that acts”.
The transition to agentic AI – systems where large language models serve as a reasoning engine that agents use to navigate and execute both technical and business workflows – requires more than just a better prompt. It requires a pragmatic, hybrid infrastructure that balances the frontier capabilities of the hyperscalers with the sovereignty and predictable unit economics of the private data centre.
The case for hybrid AI
As agents begin to act on sensitive data and execute high-frequency workflows, the “where” of your AI stack becomes as important as the “what”. This isn’t a new debate. It mirrors the cloud adoption journey. Just as we realised that data gravity, compliance and sovereignty concerns, and the sheer inertia of legacy estates meant not every workload belonged in a hyperscaler, we are discovering that agentic AI requires a similarly nuanced and hybrid approach.
Read: The cloud paradox: are you using the cloud, or just paying for it?
There is an emerging and powerful argument for enterprises to establish and evolve a sovereign AI core. By leveraging stacks like Red Hat OpenShift AI to serve open-source models on your own infrastructure and GPUs, you address three critical enterprise hurdles:
- Data sovereignty: The agent, the model and the IT ecosystem – along with the organisational data they integrate – all reside within the boundary of your data centre and private network. This allows you to treat the sovereign AI core as a native citizen of your existing IT estate, directly inheriting your established security controls and data governance frameworks, rather than attempting to bolt them onto a public API. Additionally, it immediately mitigates any potential concern that your organisational data may inadvertently end up in a frontier model provider’s training dataset.
- Predictable economics: A sovereign capability provides a hedge against the “success tax” of spiralling, consumption-based token costs. While the initial investment in GPUs and infrastructure is significant (although we’ve personally seen good mileage combining narrow agentic use cases with smaller models on cost-effective GPUs), these assets offer a predictable cost-per-inference that can be amortised over time. This becomes a strategic advantage as you scale – you are no longer penalised for high-volume agentic workflows. Furthermore, proximity to large, immovable data – those massive on-premises data warehouses that are too costly to move to a hyperscaler – minimises latency and egress costs.
- Overcoming AI innovation FUD (fear, uncertainty, doubt): Lastly, and perhaps most impactful to the business case, establishing a sovereign AI capability lowers the barrier to entry for internal experimentation. By providing engineers (and more broadly, prospective AI integrators across the organisation) with scale-out model inference and well-documented APIs – ideally architecturally consistent with frontier providers by using open-source inference engines such as vLLM – you create a “soft landing” for innovation.
When the unit economics are well understood, and the infrastructure is already “sunk cost”, the perceived risk of a failed experiment vanishes. Instead of every new agentic use case requiring a business case to justify unknown software-as-a-service token spend, teams can pull from an extensive catalogue of best-of-breed open-source models and start building immediately – accelerating innovation and time-to-value for agentic use cases.
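To make the amortisation argument concrete, here is a rough back-of-the-envelope comparison of owned-GPU inference against consumption-based API pricing. Every figure below is a hypothetical assumption for illustration, not a vendor quote; real numbers depend heavily on hardware, utilisation and workload shape.

```python
# Illustrative break-even comparison: owned-GPU inference vs per-token API pricing.
# All figures are hypothetical assumptions, not vendor quotes.

def amortised_cost_per_million_tokens(
    capex: float,            # up-front GPU/server cost
    months: int,             # amortisation period
    opex_per_month: float,   # power, cooling, hosting
    tokens_per_month: float, # sustained inference volume
) -> float:
    """Total cost of ownership spread across every token served."""
    total_cost = capex + opex_per_month * months
    total_tokens = tokens_per_month * months
    return total_cost / total_tokens * 1_000_000

# Hypothetical sovereign setup: $60k of GPUs amortised over 36 months,
# $500/month in running costs, serving a sustained 5bn tokens/month.
sovereign = amortised_cost_per_million_tokens(60_000, 36, 500, 5_000_000_000)

# Hypothetical hyperscaler price: $1.00 per million tokens, flat.
api_price = 1.00

print(f"sovereign: ${sovereign:.3f}/M tokens vs API: ${api_price:.2f}/M tokens")
```

The key property is that the sovereign cost per token falls as volume grows, while the API price stays flat – which is exactly the “success tax” dynamic described above.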
This allows for a seamless transition: validate the logic within your sovereign AI core, and should the use case require the unique reasoning of a hyperscale frontier model, the API-consistent nature of the hybrid stack makes that shift a configuration change rather than a refactoring exercise.
However, a hybrid AI strategy isn’t about ignoring the cloud; it’s about choosing the right tool for the job while weighing the desired end state and expected value against locality, privacy and cost constraints. Whether we are leveraging the Llama Stack within Red Hat OpenShift AI for a sovereign implementation or AWS Bedrock AgentCore and the Strands SDK for hyperscale velocity and access to frontier reasoning, we are looking for the same agentic primitives.
These are the core infrastructure for industrialising agentic AI – agent identity and authorisation, persistent memory, and secure tool integration leveraging established standards such as the Model Context Protocol – allowing teams to accelerate from experiments and prototypes to “productionised” agentic architectures.
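To illustrate two of these primitives together – agent identity/authorisation and secure tool integration – here is a deliberately simplified plain-Python sketch of a tool registry that checks an agent’s role before dispatching a call. This approximates the pattern that standards like the Model Context Protocol formalise; all names and the example tool are hypothetical.

```python
# Illustrative sketch: a tool registry enforcing a role check before dispatch.
# This is a toy approximation of the pattern MCP-style standards formalise.

from typing import Callable

class ToolRegistry:
    def __init__(self) -> None:
        # tool name -> (callable, roles permitted to invoke it)
        self._tools: dict[str, tuple[Callable, set[str]]] = {}

    def register(self, name: str, fn: Callable, allowed_roles: set[str]) -> None:
        self._tools[name] = (fn, allowed_roles)

    def call(self, agent_role: str, name: str, **kwargs):
        fn, allowed = self._tools[name]
        if agent_role not in allowed:  # agent identity and authorisation
            raise PermissionError(f"{agent_role} may not call {name}")
        return fn(**kwargs)

registry = ToolRegistry()
registry.register(
    "lookup_invoice",
    lambda invoice_id: {"id": invoice_id, "status": "paid"},  # stub tool
    allowed_roles={"finance-agent"},
)

print(registry.call("finance-agent", "lookup_invoice", invoice_id="INV-42"))
```

In a production stack these concerns would be handled by the platform (OpenShift AI, Bedrock AgentCore and the like) rather than hand-rolled – the point is that the primitive exists below the agent, not inside every agent.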
Read: Why cloud projects fail – and how three days can fix this
By utilising established platforms to handle AI infrastructure scaffolding and integration, engineers are liberated from the toil and cognitive overhead of managing AI plumbing. They are empowered to focus directly on the high-order work of creating value and achieving agentic AI’s promised step-change in organisational bandwidth and value creation – this is the agentic golden path.
- The author, Julian Gericke, is chief technology officer at LSD Open
- Read more articles by LSD Open on TechCentral
- This promoted content was paid for by the party concerned




