Arm’s Neoverse server chips generate at least 40% better performance

This post was originally published by Dean Takahashi at VentureBeat

Arm is going after server infrastructure.

Arm unveiled the performance numbers for its Arm Neoverse V1 and N2 server chip platforms, with processing boosts ranging from 40% to 50% over the previous generation.

The demands of data center workloads and internet traffic are growing exponentially, and new solutions are needed to keep up with these demands while reducing the current and anticipated growth of power consumption. But Arm said the variety of workloads and applications being run today means the traditional one-size-fits-all approach to computing is not the answer. That’s a jab at the dominant server vendors Intel and Advanced Micro Devices, which use the x86 architecture.

The Arm Neoverse V1 is a server chip microarchitecture that Arm’s customers, the big chip makers of the world, can design chips around for servers in the big datacenters that power the internet. The V1 supports the Scalable Vector Extension (SVE) and delivers performance increases of more than 50% for high-performance computing and machine learning workloads.

“The time for Neoverse across all infrastructure is now,” said Chris Bergey, senior vice president for the infrastructure line of business at Arm, in a press briefing.

Another chip microarchitecture, the Arm Neoverse N2 platform, uses the new Armv9 architecture that Cambridge, United Kingdom-based Arm recently announced. It can deliver 40% more performance for a variety of workloads.

“I think the N2 will pleasantly surprise people how performant designs will be in single-threaded designs,” said Patrick Moorhead, an analyst at Moor Insights & Strategy. “V1 looks to be a strong start in a nichey market, HPC. Overall, Arm is raising its game in the compute market.”

Bergey said the journey to producing competitive server chips has been a decade in the making. Chips based on the designs should be hitting the market either late this year or early next year.

Above: Arm chips are bringing power efficiency to datacenters.

Image Credit: Arm

Arm also said the Arm Neoverse CMN-700 is the industry’s most advanced mesh interconnect, designed to unleash the performance and performance-per-watt benefits of the Neoverse V1 and N2 platforms. It is a key element for constructing high-performance systems-on-chip (SoCs) based on Neoverse V1 and Neoverse N2, and it enables higher core counts and larger cache memory sizes.

As Moore’s Law comes to an end, solution providers are seeking specialized processing. Enabling specialized processing has been a focal point since the inception of the Neoverse line of platforms, and Arm expects these latest additions to accelerate this trend.

Back in September, Arm unveiled the new Neoverse N2 and Neoverse V1 platforms without talking about performance. Now the company is talking about the performance per watt, the total cost of ownership benefits and partners adopting the designs.

“We believe Arm processors are coming to servers in a big way. We believe Arm is actually going to be everywhere, from the edge to the cloud,” said Bev Crair, senior vice president at Oracle, in a press briefing.

Customers

Above: Alibaba says its chips get a 50% boost with the latest Arm designs.

Image Credit: Arm

Among the customers:

  • Marvell revealed its Octeon family of networking solutions based on Neoverse N2 will begin sampling by end of 2021, providing a three times performance uplift over previous generation Octeon chips.
  • India’s Ministry of Electronics and Information Technology (MeitY) has announced it will join SiPearl and ETRI in licensing Neoverse V1 for its national exascale HPC project.
  • Oracle plans to adopt Ampere Altra central processing units (CPUs) for Oracle Cloud Infrastructure, as the price/performance leader across a wide range of workloads.
  • Arm-powered Amazon Web Services Graviton2 continues to rapidly expand its EC2 footprint with steady growth and regional expansion.
  • Alibaba Cloud just tested the upcoming Alibaba Cloud ECS Arm instances, showing a 50% performance improvement for the DragonWell JDK on Neoverse N1.
  • Tencent is investing in both hardware testing and software enablement that will allow it to adopt Neoverse technology for cloud applications. Bergey said the tests are showing great performance per watt for the Arm-based designs.
  • Nvidia’s Grace is using an unannounced Arm processor. But Arm didn’t say if the new Neoverse designs are being used in Grace.

These partners are taking full advantage of what is under the hood of Neoverse platforms. This is just the tip of the iceberg, both for infrastructure workload benefits and for how partners plan to implement Neoverse IP and take it to market, Bergey said. Arm argues that innovators shouldn’t have to choose between performance and power efficiency.

The chips can target a range of cloud-to-edge uses.

“The Neoverse V1 and N2 are huge improvements for Arm,” said Kevin Krewell, an analyst at Tirias Research, in an email to VentureBeat. “The V1 with the Scalable Vector Extensions (SVE) are powerful enough to be the CPU core for supercomputers. Even though Arm didn’t provide performance numbers against AMD and Intel, it seems to be very competitive based on Arm’s data. The N2 is not an insignificant improvement over the N1. It’s the core to use for designs with very high core count, trading off some performance and a narrower SVE implementation for a smaller core size and lower power. These improvements are in line with Nvidia’s goals for the Arm architecture in the data center and one of these cores could well be the core used in Nvidia’s Project Grace CPU.”

The Neoverse V1

Above: Arm’s ecosystem

Image Credit: Arm

This chip design delivers a 50% performance uplift over N1, as well as a 1.8 times improvement for a range of vector workloads and a four times improvement for machine learning workloads.
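
The figures above mix a percentage ("50%") with multipliers ("1.8 times", "four times"), which can make them hard to compare at a glance. As a rough sketch, the Python below puts all three on the same scale. The function names and the dictionary labels are made up for illustration; only the three numbers come from Arm's claims, and the "share of original time" column assumes the uplift translates directly into shorter run time, which real workloads may not show.

```python
def percent_to_multiplier(percent_gain: float) -> float:
    """Convert a 'X% faster' claim into a speedup multiplier (50% -> 1.5x)."""
    return 1.0 + percent_gain / 100.0


def multiplier_to_percent(multiplier: float) -> float:
    """Convert a speedup multiplier into a 'X% faster' figure (1.8x -> 80%)."""
    return (multiplier - 1.0) * 100.0


# Arm's stated Neoverse V1 gains over N1, expressed as multipliers.
# The labels are illustrative; the numbers are the ones quoted in the article.
v1_claims = {
    "general uplift": percent_to_multiplier(50),
    "vector workloads": 1.8,
    "machine learning": 4.0,
}

for label, mult in v1_claims.items():
    gain = multiplier_to_percent(mult)
    # Assumption: a speedup of m means the same work takes 1/m of the time.
    time_share = 100.0 / mult
    print(f"{label}: {mult:.1f}x = {gain:.0f}% faster, "
          f"about {time_share:.0f}% of the original run time")
```

Read this way, the "four times" machine learning figure is a 300% gain, a much larger jump than the headline 50% number.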

Neoverse V1 is the first in a new performance-first computing tier for Arm. Neoverse V1 gives silicon partners the flexibility to build compute for applications more reliant on CPU performance and bandwidth while providing system-on-chip (SoC) design flexibility.

With the performance-first mindset, the design philosophy behind Neoverse V1 was to build the widest microarchitecture Arm has ever produced, accommodating more instructions in flight in support of markets like high-performance and exascale computing. The wide and deep architecture, together with the addition of the Scalable Vector Extension (SVE), gives Neoverse V1 the lead in per-core performance, offers code longevity through SVE, and provides SoC designers with implementation flexibility, Arm said.
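
The "code longevity" claim rests on SVE being vector-length agnostic: as we understand the architecture, the same loop is written once and the hardware decides how many elements it processes per step, with a predicate masking off lanes that run past the end of the data. The Python below is only a toy simulation of that idea, not SVE code. The function `vla_add` and the `lanes` parameter are hypothetical names, and the lane counts of 4 and 8 are arbitrary examples rather than the widths of any particular Neoverse core.

```python
def vla_add(a, b, lanes):
    """Add two equal-length lists the way a vector-length-agnostic loop would.

    `lanes` stands in for the hardware vector width (elements per vector).
    The loop body never hard-codes it, so the same code works for any width.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0] * len(a)
    i = 0
    while i < len(a):
        # Predicate: only the lanes that still map to real elements are active.
        active = min(lanes, len(a) - i)
        for lane in range(active):
            out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes  # advance by one full vector, whatever its width is
    return out


x = [1, 2, 3, 4, 5]
y = [10, 20, 30, 40, 50]

# Same source, two simulated vector widths, identical results.
narrow = vla_add(x, y, lanes=4)
wide = vla_add(x, y, lanes=8)
print(narrow == wide)
```

The point of the sketch is that nothing in the loop needs to change when the simulated vector gets wider, which is the property that lets one binary run across chips with different vector implementations.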

You can see the benefits of some of these design elements in SiPearl and ETRI’s HPC SoCs, and Arm thinks this is the direction HPC compute is heading, Bergey said.

The Neoverse N2

Above: Amazon Web Services is using Arm’s latest designs in its Graviton2 processor.

Image Credit: Arm

The Neoverse N2 is aimed at cloud-to-edge performance. A few weeks ago, Arm introduced the Armv9 architecture to address global demand for ubiquitous specialized processing. The Neoverse N2 platform is the first based on the Armv9 architecture with improvements to security, power efficiency, and performance.

Neoverse N2 delivers 40% higher single-threaded performance than N1 while retaining the same level of power and area efficiency. Its scalability covers high-throughput computing, such as hyperscale cloud, where Arm sees a 1.3 times improvement on NGINX over N1.

The Neoverse N2 platform delivers superior per-thread performance and industry-leading performance per watt, driving a reduced total cost of ownership for users. Neoverse N2 is the first platform to feature SVE2, an Armv9 feature that drives a significant uplift in cloud-to-edge performance efficiency.

For a broader set of use cases like machine learning, digital signal processing, multimedia and 5G systems, SVE2 brings the performance, ease-of-programming and portability benefits of SVE.
