Beginning in the first quarter of 2026, Amazon Web Services (AWS) will make the Neuron Kernel Interface (NKI) it developed to optimize compute kernels running on its Trainium and Inferentia processors available under an open source license.

The NKI software development kit (SDK) for AWS Inferentia and Trainium processors integrates machine learning frameworks such as PyTorch with a compiler for a domain-specific language (DSL), a runtime, tools, and libraries.

The primary goal is to provide developers of AI training and inference models with an SDK that makes it possible to maximize the performance of AWS processors, says David Nalley, director of developer experience at AWS. “It’s for developers that need to squeeze every ounce of performance,” he says.
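That kernel-level tuning happens in Python using a tile-based programming model: data is explicitly loaded from device memory into on-chip buffers, computed on, and stored back. As a rough sketch, here is a minimal element-wise addition kernel modeled on the examples in AWS’s current NKI documentation; the module paths (`neuronxcc.nki`) and call signatures below reflect today’s Neuron SDK and could change in the open source release:

```python
import neuronxcc.nki as nki
import neuronxcc.nki.language as nl

@nki.jit
def tensor_add_kernel(a_input, b_input):
    """Element-wise addition of two equally shaped tensors on a NeuronCore.

    Inputs must fit within NKI's on-chip tile-size limits
    (e.g., at most 128 partitions in the leading dimension).
    """
    # Allocate the output tensor in device HBM.
    c_output = nl.ndarray(a_input.shape, dtype=a_input.dtype,
                          buffer=nl.shared_hbm)
    # Load both inputs from HBM into on-chip SBUF memory.
    a_tile = nl.load(a_input)
    b_tile = nl.load(b_input)
    # Compute the sum on-chip, then write the result back to HBM.
    nl.store(c_output, value=a_tile + b_tile)
    return c_output
```

In AWS’s documentation, a kernel like this is invoked directly on PyTorch tensors that have been moved to an XLA device, which is how the SDK’s framework integration surfaces to developers.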

AWS is betting that Inferentia and Trainium processors will provide an attractive alternative to graphics processing units (GPUs) for training artificial intelligence (AI) models and then deploying them for inference.

Additionally, AWS expects AI workloads to become more distributed following an agreement with Google to create an open source network pathway between cloud computing environments, Nalley adds.

Earlier this month, AWS also revealed plans to build a Trainium 4 processor that will provide six times the performance of the Trainium 3 processors it is now making available, which deliver up to 362 FP8 PFLOPs in an AWS UltraServer. Organizations using Trainium processors to reduce AI inference costs by up to 50% include Anthropic, Decart, Karakuri, Metagenomics, Neto.ai, Ricoh, and Splashmusic.

It’s not clear to what degree an open source edition of NKI might foster additional adoption of Trainium and Inferentia processors. In the meantime, AWS is relying heavily on these processors to infuse AI capabilities across its entire managed service portfolio.

Each organization will need to determine to what degree to rely on alternatives to GPUs to train models or deploy inference engines. There is an understandable tendency to equate AI with GPUs, but as organizations build and deploy a wider range of models, many will look for ways to reduce the cost of training and deployment. In theory, the greater the need for performance, the more likely organizations are to rely on GPUs, but AWS is making it clear that even in those instances it is possible to use Inferentia or Trainium processors.

Regardless of approach, organizations should carefully think through what type of AI model makes the most sense for their application. In many instances, a customized small model that doesn’t require anywhere near the level of processing horsepower of a foundation model will be more than sufficient. In fact, in many instances the inference model will be deployed and managed by internal IT teams that tend to have a lot of experience with cost optimization.

In the meantime, the Futurum Group projects the global AI platforms market, valued at $24.9 billion last year, will reach $292 billion by 2030, a compound annual growth rate (CAGR) of approximately 50.8%. At that pace of adoption, the need for IT infrastructure expertise to manage those platforms, even as advances in applying AI to IT operations are made, is only going to intensify as the number of workloads continues to rapidly expand.
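That growth rate can be sanity-checked with a quick calculation, assuming the $24.9 billion baseline refers to 2024 and therefore six years of compounding through 2030:

```python
# Implied compound annual growth rate (CAGR) for the Futurum Group
# projection: $24.9B (assumed 2024 baseline) growing to $292B by 2030.
start, end, years = 24.9, 292.0, 6
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~50.7%, in line with the cited 50.8%
```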
