SiFive unveils XM Series to accelerate high performance AI workloads


SiFive has announced the SiFive Intelligence XM Series, designed to accelerate high-performance AI workloads.


This is the first IP from SiFive to include a highly scalable AI matrix engine. It is intended to accelerate time to market for companies building system-on-chip solutions for edge IoT, consumer devices, next-generation electric and/or autonomous vehicles, and data centres.

As part of SiFive’s commitment to supporting its customers and the broader RISC-V ecosystem, SiFive has also announced its intention to open source a reference implementation of its SiFive Kernel Library (SKL).

SiFive’s new XM Series offers a scalable and efficient AI compute engine. By integrating scalar, vector, and matrix engines, it allows customers to make highly efficient use of memory bandwidth. The XM Series also continues SiFive’s legacy of offering extremely high performance per watt for compute-intensive applications.

“Many companies are seeing the benefits of an open processor standard while they race to keep up with the rapid pace of change with AI. AI plays to SiFive’s strengths with performance per watt and our unique ability to help customers customise their solutions,” said Patrick Little, CEO of SiFive. “We’re already supplying our RISC-V solutions to five of the ‘Magnificent 7’ companies, and as companies pivot to a ‘software first’ design strategy we are working on new AI solutions with a wide variety of companies from automotive to datacentre and the intelligent edge and IoT.”

“RISC-V was originally developed to efficiently support specialised computing engines including mixed-precision operations,” explained Krste Asanovic, SiFive Founder and Chief Architect. “This, coupled with the inclusion of efficient vector instructions and the support of specialised AI extensions, are the reasons why many of the largest datacentre companies have already adopted RISC-V AI accelerators.”

The XM Series features four X-Cores per cluster, and each cluster can deliver 16 TOPS (INT8) or 8 TFLOPS (BF16) per GHz. Each cluster has 1 TB/s of sustained memory bandwidth and can access memory via a high-bandwidth port or, for coherent memory access, via a CHI port.
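
Because the throughput figures are quoted per GHz, a quick calculation shows what they could mean at a given clock speed. The following is a minimal sketch, not a SiFive tool: the function name `cluster_peak`, the example clock frequency and cluster count, and the assumption that throughput and bandwidth scale linearly with clock and cluster count are all illustrative and do not come from the announcement.

```python
# Per-cluster figures quoted in the article.
INT8_TOPS_PER_GHZ = 16.0       # INT8 tera-operations per second, per GHz
BF16_TFLOPS_PER_GHZ = 8.0      # BF16 teraFLOPS, per GHz
MEM_BANDWIDTH_TB_PER_S = 1.0   # sustained memory bandwidth per cluster


def cluster_peak(clock_ghz: float, clusters: int = 1) -> dict:
    """Estimate peak figures for a hypothetical configuration.

    Assumption (not from the article): throughput scales linearly with
    clock frequency and cluster count, and bandwidth with cluster count.
    """
    int8_tops = INT8_TOPS_PER_GHZ * clock_ghz * clusters
    bf16_tflops = BF16_TFLOPS_PER_GHZ * clock_ghz * clusters
    bandwidth_tb_s = MEM_BANDWIDTH_TB_PER_S * clusters
    return {
        "int8_tops": int8_tops,
        "bf16_tflops": bf16_tflops,
        "bandwidth_tb_s": bandwidth_tb_s,
        # Tera-ops per terabyte reduces to operations per byte of memory traffic.
        "int8_ops_per_byte": int8_tops / bandwidth_tb_s,
    }


if __name__ == "__main__":
    # Hypothetical example: two clusters clocked at 2 GHz.
    for name, value in cluster_peak(clock_ghz=2.0, clusters=2).items():
        print(f"{name}: {value}")
```

The operations-per-byte ratio is a rough indicator of how much data reuse a workload would need in order to be limited by compute rather than by memory bandwidth under these assumptions.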

SiFive said that it expects systems to be built either with no host CPU or with a host CPU based on RISC-V, x86 or Arm.