According to Deci, DeciLM-7B is setting new benchmarks in the large language model (LLM) space, outperforming prominent open-source models such as Llama 2 7B and Mistral 7B in both accuracy and efficiency.
DeciLM-7B delivers improved accuracy and speed with lower computational demand, achieving 1.83x and 2.39x higher throughput than Mistral 7B and Llama 2 7B, respectively.
Its compact design makes it suitable for cost-effective GPUs, combining affordability with high-end performance.
The performance of DeciLM-7B can be further accelerated when used in tandem with Infery-LLM, Deci's inference engine, reaching speeds 4.4 times greater than Mistral 7B with vLLM without sacrificing quality.
Leveraging DeciLM-7B in conjunction with Infery-LLM enables teams to drastically reduce their LLM compute expenses while benefiting from quicker inference times. This integration facilitates the efficient scaling of generative AI workloads and supports the transition to more cost-effective hardware.
This synergy enables the efficient serving of multiple clients simultaneously without excessive compute costs or latency issues. That capability is especially crucial in sectors such as telecommunications, online retail, and cloud services, where responding to a massive influx of concurrent customer inquiries in real time can significantly enhance user experience and operational efficiency.
Licensed under Apache 2.0, DeciLM-7B is available for use and deployment anywhere, including local setups, enabling teams to fine-tune it for specific industry applications without compromising on data security or privacy.
Designed to be versatile, it allows teams to tailor it to unique use cases across a wide range of business applications, including content creation, translation, conversation modeling, data categorization, summarization, sentiment analysis, and chatbot development, among others.
When fine-tuned on specific datasets, DeciLM-7B can deliver quality similar to that of much larger models such as GPT-3.5 at approximately 97% lower cost and with better speed.
"With the increasing use of Generative AI in various business sectors, there's a growing demand for models that are not only highly performant but also operationally cost efficient," said Yonatan Geifman, CEO and co-founder of Deci. "Our latest innovation, DeciLM-7B, combined with Infery-LLM, is a game-changer in this regard. It's adaptable to diverse settings, including on-premise solutions, and its exceptional inference efficiency makes high-quality large language models more accessible to a wider range of users."