Intel has officially introduced its next-generation Xeon 6 server processors alongside the Gaudi 3 AI accelerators, making bold claims about their performance improvements.
The Xeon 6 6900P stands out as the first in the Xeon 6 lineup built on performance cores (P-cores) aimed at high-demand computing tasks, while other models in the family use efficiency cores (E-cores) to handle lighter workloads with reduced power consumption. Intel's strategy is to offer either P-core or E-core designs across the Xeon 6 series, depending on the intended application. The Xeon 6 6700E, an E-core model focused on energy efficiency, was previously launched in June.
Intel asserts that the Xeon 6 6900P delivers double the performance of its predecessor, thanks to a higher core count, enhanced memory bandwidth, and built-in AI acceleration on each core.
In addition to the P-cores, Intel has incorporated AI inference capabilities directly into the Xeon 6900P series, effectively integrating AI coprocessors within the CPU itself. This mirrors the approach of rival AMD, as both companies aim to enable more energy-efficient AI inferencing that can be executed on local machines, reducing the need for dedicated GPU server processors.
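The article doesn't name the specific instruction set, but the on-core AI acceleration in recent Xeons is exposed through Advanced Matrix Extensions (AMX), and on Linux its availability shows up as CPU feature flags (amx_tile, amx_bf16, amx_int8) in /proc/cpuinfo. A minimal sketch of checking for those flags (the sample flag line below is illustrative, not taken from a real machine):

```python
# Sketch: detect Intel AMX support from Linux CPU feature flags.
# The flag names (amx_tile, amx_bf16, amx_int8) are the ones the Linux
# kernel reports in /proc/cpuinfo; SAMPLE is an illustrative line.

def amx_features(cpuinfo_text: str) -> set:
    """Return the AMX-related feature flags present in cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            return {f for f in flags if f.startswith("amx_")}
    return set()

SAMPLE = "flags\t\t: fpu sse2 avx512f amx_bf16 amx_tile amx_int8"
print(amx_features(SAMPLE))  # the AMX capabilities the kernel reports
```

On a real system the same function can be pointed at the contents of /proc/cpuinfo; an empty result means software must fall back to AVX-512 or plain vector code.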
The technical specifications for the Xeon 6900P chips are notably impressive. Compared to the previous generation, Intel has doubled the maximum core count to 128 by adopting a chiplet architecture, which divides the processor into smaller, easier-to-manufacture segments rather than using a single large silicon piece.
Xeon 6 also supports Micron’s new MRDIMM memory modules, which improve both bandwidth and latency. For the Xeon 6900P, memory speeds can reach up to 8,800 MT/s, marking a 57% improvement over previous models.
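The 57% figure checks out if the comparison point is the prior generation's DDR5-5600 support, which is our assumption; the article doesn't state the baseline:

```python
# Sanity check of the quoted ~57% memory-speed gain, assuming the
# prior generation's DDR5-5600 (5,600 MT/s) as the baseline (an
# assumption; the article does not state the comparison point).
mrdimm_mts = 8800
ddr5_baseline_mts = 5600
gain_pct = (mrdimm_mts / ddr5_baseline_mts - 1) * 100
print(f"{gain_pct:.0f}% faster")  # → 57% faster
```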
Further advancements include six Ultra Path Interconnect (UPI) 2.0 links for faster CPU-to-CPU communication at speeds of up to 24 GT/s, support for 96 lanes of PCIe 5.0 and CXL 2.0, and the addition of new vector and matrix extensions aimed at accelerating high-performance computing and AI tasks. The matrix extension, in particular, supports 16-bit floating point calculations, which are critical for AI inference workloads.
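To see why 16-bit floats matter for inference: a format like bfloat16 keeps float32's 8-bit exponent but only 8 bits of mantissa, halving the memory and bandwidth per value at the cost of precision. A self-contained sketch of the conversion (illustrative software emulation, not the hardware path; the extension the article alludes to may use bf16 or fp16):

```python
import struct

# bfloat16 is simply the top 16 bits of an IEEE-754 float32.
# Round-to-nearest-even truncation, emulated in software for illustration.

def fp32_to_bf16_bits(x: float) -> int:
    """Keep the top 16 bits of the float32 encoding, rounding to nearest even."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding = 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even bias
    return ((bits + rounding) >> 16) & 0xFFFF

def bf16_bits_to_fp32(b: int) -> float:
    """Re-expand bf16 bits to float32 by zero-filling the low mantissa bits."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

x = 3.14159265
approx = bf16_bits_to_fp32(fp32_to_bf16_bits(x))
print(approx)  # ≈ 3.140625: about 3 decimal digits survive the round trip
```

The halved storage is what lets a CPU stream twice as many weights per unit of memory bandwidth, which is exactly where inference on large models tends to bottleneck.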
The increased core count and performance do come with a trade-off in terms of power consumption. Four of the five processors in the 6900P family have a thermal design power (TDP) of 500 watts, while one is rated at 400 watts. For comparison, the previous generation Xeon processors topped out at 350 watts.
Nevertheless, the performance gains are significant. In one benchmark test involving the Llama 2 chatbot, which uses a 7-billion-parameter model, Intel’s 96-core Xeon 6972P outperformed AMD’s 96-core EPYC 9654 by more than three times and was 128% faster than the prior-generation Xeon. In another test focused on the BERT language processing model, the Xeon 6972P was 4.3 times faster than AMD’s EPYC and 2.2 times quicker than its Xeon predecessor.
However, it’s worth mentioning that the AMD EPYC processor used in the comparison has been available for nearly two years, and AMD is expected to release its next-generation processors soon, which could alter the competitive landscape.