The Cloud AI 100 is Qualcomm Technologies’ AI inference solution, designed for generative AI and large language models (LLMs). This working relationship between the two companies is intended to make AI accessible to developers across a much wider range of AI-powered applications.
“Together with Qualcomm Technologies we are pushing the boundaries of what's possible in AI efficiency and performance,” said Yonatan Geifman, CEO and co-founder of Deci. “Our joint efforts streamline the deployment of advanced AI models on Qualcomm Technologies’ hardware, making AI more accessible and cost-effective, and economically viable for a wider range of applications. Our work together is a testament to our vision of making the transformational power of generative AI available to all.”
Through the relationship, Deci will work with Qualcomm Technologies to launch two new models.
The first is DeciCoder-6B, a 6-billion-parameter code generation model engineered with a focus on performance at scale. Supporting eight programming languages (C, C#, C++, Go, Rust, Python, Java and JavaScript), it is said to outperform established models such as CodeGen2.5-7B, StarCoder-7B and CodeLlama-7B. According to Deci, in Python, DeciCoder achieves a 3-point lead over models more than twice its size, such as StarCoderBase 15.5B.
The model also offers significantly better memory and computational efficiency, boasting 19x higher throughput compared to similar models when running on Qualcomm’s Cloud AI 100.
The second model is DeciDiffusion 2.0, a 732-million-parameter text-to-image diffusion model that is said to outperform Stable Diffusion v1.5, operating at 2.6 times the speed with on-par image quality.
Both models have been optimised to leverage the full potential of the Qualcomm Cloud AI 100 solution and are designed to enable users across various industries to experience exceptional performance from the outset at a more competitive price point.
Both DeciCoder-6B and DeciDiffusion 2.0 were developed using AutoNAC, Deci’s proprietary, hardware-aware Neural Architecture Search technology, which is aimed at enterprises of all sizes.
The distinctive architecture of both models is said to let batch sizes scale efficiently while keeping memory usage minimal and avoiding any increase in latency.
Additionally, the models were designed to handle large batches, enabling maximal utilisation of the computational power of Qualcomm’s Cloud AI 100 cores.
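To illustrate why batch handling matters here, the following minimal sketch shows the basic arithmetic: if a larger batch completes in the same time as a smaller one, throughput rises in proportion to batch size. The function name and the latency figure are hypothetical and chosen purely for illustration; they are not measurements from Deci or Qualcomm.

```python
def throughput(batch_size: int, latency_s: float) -> float:
    """Items completed per second when one batch finishes in latency_s seconds."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return batch_size / latency_s


# Assumed, illustrative latency per batch; NOT a published benchmark figure.
ASSUMED_LATENCY_S = 0.5

# If latency stays flat as the batch grows, throughput grows linearly.
for batch_size in (1, 8, 32):
    items_per_second = throughput(batch_size, ASSUMED_LATENCY_S)
    print(f"batch={batch_size:>2}  throughput={items_per_second:.1f} items/s")
```

In practice latency usually rises with batch size at some point, so the real gain depends on how long the hardware can absorb larger batches without slowing down.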
DeciCoder-6B and DeciDiffusion 2.0 have been released under the Apache-2.0 and CreativeML Open RAIL++-M licenses, respectively.