Since last year, large language models (LLMs) such as ChatGPT, Gemini, and HyperClovaX have been in the spotlight. Amid this interest, generative AI has turned into a race to develop ever-larger LLMs, while growing market demand for lightweight models has put the sLLM in the spotlight as well. This year, many industry insiders point to the cost-effective sLLM as a major trend. sLLM stands for "small large language model," an LLM that is relatively small in size. Because sLLMs cost less to operate and offer stronger data security, more and more companies prefer them.
Google launched the sLLM "Gemini Nano" late last year, and followed it this year with "Gemma," an open-model sLLM that is a lightweight counterpart to its large language model Gemini. Google Cloud customers can now customize and build Gemma models on Vertex AI and run them on Google Kubernetes Engine (GKE).
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and its name reflects the Latin gemma, meaning "precious stone." Alongside the model weights, Google is also releasing tools to support developer innovation, foster collaboration, and guide responsible use of Gemma models.
Here are the key details to know:
Gemma joins over 130 models in Vertex AI Model Garden, including Google's recently announced expanded access to Gemini: the Gemini 1.0 Pro, 1.0 Ultra, and 1.5 Pro models.
By using Gemma models on Vertex AI, developers can take advantage of an end-to-end ML platform that makes tuning, managing, and monitoring models simple and intuitive. With Vertex AI, builders can reduce operational overhead and focus on creating bespoke versions of Gemma that are optimized for their use case.
Vertex AI makes it easy for developers to turn their own tuned models into scalable endpoints that can power AI applications of all sizes.
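As an illustration, here is a minimal sketch of that workflow using the google-cloud-aiplatform Python SDK. The project ID, Cloud Storage path, serving container image, and machine shapes below are hypothetical placeholders, not values from Google's announcement.

```python
# A minimal sketch of registering a tuned model and deploying it as a
# Vertex AI endpoint with the google-cloud-aiplatform SDK. The project,
# bucket, image, and machine shape are hypothetical placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-gcp-project", location="us-central1")

# Register the tuned artifacts with the Vertex AI Model Registry.
model = aiplatform.Model.upload(
    display_name="gemma-tuned",
    artifact_uri="gs://my-bucket/gemma-tuned/",  # hypothetical GCS path
    serving_container_image_uri=(
        "us-docker.pkg.dev/my-project/serving/gemma:latest"  # hypothetical image
    ),
)

# Deploy to a GPU-backed endpoint; Vertex AI autoscales between the
# configured replica bounds as traffic changes.
endpoint = model.deploy(
    machine_type="g2-standard-8",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=3,
)

# Query the endpoint like any other Vertex AI prediction service.
response = endpoint.predict(
    instances=[{"prompt": "Summarize Kubernetes in one sentence."}]
)
print(response.predictions)
```

The same upload-then-deploy pattern applies to any tuned model artifact, which is what makes the Model Registry a convenient hand-off point between tuning and serving.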
GKE provides tools to build custom apps, from prototyping simple projects to rolling them out at enterprise scale. Today, developers can also deploy Gemma directly on GKE to create their own generative AI apps, whether for prototyping or for testing model capabilities.
GKE delivers efficient resource management, consistent operations environments, and autoscaling. It also makes it easy to orchestrate Google Cloud AI accelerators, including GPUs and TPUs, for faster training and inference when building generative AI models.
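As a sketch of what such a deployment might look like, the snippet below uses the official kubernetes Python client to create a GPU-backed Deployment for a Gemma serving container. The container image, port, and GPU count are assumptions for illustration; an actual deployment would use a published serving image (for example, one based on vLLM or Text Generation Inference) and a GKE node pool with matching accelerators.

```python
# A minimal sketch of deploying a Gemma serving container to a GKE cluster
# with the official kubernetes Python client. The image name and resource
# shape are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # assumes kubectl already points at the GKE cluster

container = client.V1Container(
    name="gemma-server",
    image="us-docker.pkg.dev/my-project/serving/gemma:latest",  # hypothetical image
    ports=[client.V1ContainerPort(container_port=8080)],
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1"}  # request one GPU from the node pool
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="gemma-server"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "gemma-server"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "gemma-server"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

# Create the Deployment; GKE schedules the pod onto a GPU node and the
# cluster autoscaler can add nodes as replica counts grow.
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```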
Cloocus is a Google Cloud Premier Partner, the highest tier of Google Cloud partnership, and provides comprehensive cloud services based on Google Cloud. In particular, Cloocus is rapidly building expertise in generative AI, which has recently been in the spotlight, through its team of skilled Data & AI experts. If you need expert consultation on deploying generative AI, please request a consultation through the button below!