Large language models (LLMs) are all the rage in the AI world right now, but training them can be difficult and expensive: getting a model with several billion parameters to run reliably and accurately can take experienced engineers months of work.
A new joint offering from Cerebras Systems and Cirrascale Cloud Services aims to democratize AI by letting users train GPT-class models far more cheaply than with existing providers, and with just a few lines of code.
“We think LLMs are underhyped,” said Andrew Feldman, CEO and co-founder of Cerebras Systems, during a pre-briefing. “Over the next year, we will see a huge increase in the impact of LLMs across various sectors of the economy.”
Likewise, generative AI is perhaps one of the most significant technological advances in recent history, as it allows documents to be written, images to be created and software to be coded from ordinary text input.
To help accelerate adoption and improve the accuracy of generative AI, Cerebras today also announced a new partnership with AI content platform Jasper AI.
“We really feel like the next chapter in generative AI is custom models that are continually improving,” said Dave Rogenmoser, CEO of Jasper.
The first stage of the technology was “really exciting,” he said, but “it’s about to get much, much more exciting.”
Unleash research opportunities
When it comes to LLMs, traditional cloud providers can struggle because they are unable to guarantee latency between large numbers of GPUs. Feldman explained that variable latency makes distributing a large AI model across GPUs complex and time-consuming, and that it leads to “great variations in time to train.”
The new Cerebras AI Model Studio, which is hosted on the Cirrascale AI Innovation Cloud, allows users to train generative pre-trained transformer (GPT)-class models, including GPT-J, GPT-3 and GPT-NeoX, on Cerebras Wafer-Scale Clusters. This includes the recently announced Andromeda AI supercomputer.
Users can choose from state-of-the-art GPT-class models ranging from 1.3 billion to 175 billion parameters, and complete training with a time to accuracy eight times faster than on an A100, at half the price of traditional cloud providers, Feldman said.
For example, training GPT-J from scratch on a traditional cloud takes about 64 days; the Cerebras AI Model Studio reduces that to eight days. Similarly, on traditional clouds, production costs for GPUs alone can be as high as $61,000, while on Cerebras the full production run costs $45,000.
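To put the GPT-J figures side by side, the short Python sketch below works out the implied speedup and cost reduction. The constants are the numbers quoted above; the function and variable names are illustrative and not part of any Cerebras or Cirrascale tooling.

```python
# Figures as quoted in the article for training GPT-J from scratch.
GPU_CLOUD_DAYS = 64          # traditional cloud
CEREBRAS_DAYS = 8            # Cerebras AI Model Studio
GPU_CLOUD_COST_USD = 61_000  # upper bound quoted, GPUs alone
CEREBRAS_COST_USD = 45_000   # full production run


def speedup(baseline_days: float, new_days: float) -> float:
    """Return how many times faster the new run is than the baseline."""
    return baseline_days / new_days


def cost_reduction(baseline_cost: float, new_cost: float) -> float:
    """Return the fraction of the baseline cost that is saved."""
    return (baseline_cost - new_cost) / baseline_cost


if __name__ == "__main__":
    print(f"Speedup: {speedup(GPU_CLOUD_DAYS, CEREBRAS_DAYS):.0f}x")
    print(
        "Cost reduction: "
        f"{cost_reduction(GPU_CLOUD_COST_USD, CEREBRAS_COST_USD):.1%}"
    )
    # Speedup: 8x
    # Cost reduction: 26.2%
```

On these quoted numbers, the GPT-J run is eight times faster and roughly 26% cheaper. The GPU figure is an upper bound that covers GPUs alone, so the two totals may not be directly comparable.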
The new tool eliminates the need for devops and distributed programming, and offers push-button model scaling from one billion to 20 billion parameters. Models can also be trained with longer sequence lengths, opening new research opportunities.
“We’re unlocking a fundamentally new capability to research at this scale,” said Andy Hock, Cerebras product manager.
As Feldman noted, Cerebras’ mission is “to expand access to deep learning and rapidly accelerate the performance of AI workloads.”
Its new AI Model Studio is “easy and extremely simple,” he said. “We’ve organized this so you can jump on it, you can point, you can click.”
Accelerating the potential of AI
Meanwhile, the young company Jasper (founded in 2021) will use Cerebras’ Andromeda AI supercomputer to train its compute-intensive models in “a fraction of the time,” Rogenmoser said.
As he noted, companies want custom models, “and they really do.”
“They want these models to improve, to self-optimize based on past usage data, based on performance,” he said.
In its initial work on small workloads with Andromeda, which was announced this month at SC22, the international conference on high-performance computing, networking, storage and analytics, Jasper found that the supercomputer could do work that thousands of GPUs couldn’t.
The company expects to “significantly advance AI work,” including training GPT networks to scale AI outputs to all levels of end-user complexity and granularity. This will allow Jasper to quickly and easily customize content for multiple categories of customers, Rogenmoser said.
The partnership “allows us to invent the future of generative AI by doing things that aren’t practical or just not possible with traditional infrastructure,” he said.
Jasper’s products are used by 100,000 customers to write marketing copy, advertisements, books and other materials. Rogenmoser described the company as eliminating “the tyranny of the blank page” by serving as an “AI co-pilot.”
As he said, it allows creators to focus on the key elements of their story, “not the mundane.”