G42, the Abu Dhabi-based artificial intelligence and cloud computing company, has partnered with a consortium of engineers and researchers and with the Silicon Valley-based AI chip startup Cerebras to launch advanced Arabic language software that can power generative AI applications.
The new software, called Jais, is a large language model with 13 billion parameters that was trained on a massive dataset of Arabic and English text, including computer code. Timothy Baldwin, a professor at Mohamed bin Zayed University of Artificial Intelligence, explained that because there was not enough Arabic data to train a model of Jais’ size, the computer code within the English-language data played a pivotal role in developing the model’s ability to reason.
In a statement, Baldwin told Reuters: “(Code) gives the model a big leg up in terms of reasoning abilities, because it spells out the (logical) steps.”
Named after the highest peak in the United Arab Emirates, Jais is the first bilingual large language model to be released from the Middle East. It is designed to be used in a variety of applications, including machine translation, text generation, and question-answering.
The Jais model was trained on Condor Galaxy, the world’s largest supercomputer for AI training, which Cerebras developed in partnership with G42. Condor Galaxy will also be used to address other major world challenges, including health care, energy, and climate action. Cerebras recently revealed plans to build three such units in collaboration with G42, with the first unit slated for arrival this year and two more to follow in 2024.
“This model was trained, from start to finish, of 13 billion (parameters), in three and a half days,” said Cerebras CEO Andrew Feldman. “But there was months of work before that.”
The initiative was a collaborative effort between Cerebras, the Mohamed bin Zayed University of Artificial Intelligence, and Inception, an AI-focused subsidiary of the Abu Dhabi-based tech conglomerate G42. The partnership was formed to address the lack of large language models available in Arabic.
The release of Jais marks a significant step in the development of Arabic language AI. The model has the potential to improve communication, education, and other aspects of life in the Middle East and North Africa region.
“Jais is a significant milestone for the development of Arabic language AI,” said Karim El-Solh, CEO of G42. “It will help to accelerate the adoption of AI in the Middle East and North Africa region.”
Jais is an open-source language model, which means that it is freely available for anyone to use and modify. This will allow researchers and developers to build on the model and improve it.
The news comes less than a year after G42 launched a $10 billion fund to invest in late-stage technology startup companies in the areas of cloud computing, clean technology and renewables, digital infrastructure, life sciences, healthcare, new materials, and fintech, among others.
Founded in 2018 and based in Abu Dhabi, United Arab Emirates (UAE), G42 focuses on the development of AI industries in the government sector, healthcare, finance, oil and gas, aviation, and hospitality.
Founded in 2016 and based in Los Altos, California, Cerebras specializes in developing and manufacturing high-performance artificial intelligence (AI) computer systems used by organizations such as Argonne National Laboratory, Lawrence Livermore National Laboratory, and the Pittsburgh Supercomputing Center to accelerate their AI research and development.