Technologies Driving Enhanced On-device Generative AI Experiences: LoRA

This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm.

Utilize low-rank adaptation (LoRA) to provide customized experiences across use cases

Enhancing contextualization and customization has always been a driving force in the realm of user experience. While generative artificial intelligence (AI) has already demonstrated its transformative potential, there remains ample opportunity for further advancements.

To cater to the growing demand for customized and contextually relevant experiences, this blog post, a follow-up to part 1 on multimodal generative AI, explores how LoRA is poised to make a significant impact.

Similar to a tailor providing a custom fit for a suit, LoRA adapters enable customized generative AI experiences.

Customization with LoRA adapters

Foundation models and pretrained generative AI models have broad-based knowledge and can respond to many prompts well. However, they can sometimes miss the mark because they have not been customized or fine-tuned with additional data for detailed knowledge.

“Jack of all trades, master of none” describes this problem pretty well: A generative AI model can demonstrate many adequate skills but can lack expertise.

For example, a large language foundation model that is asked to act as a fitness and health coach may struggle with providing accurate feedback on exercises or meal suggestions. Fine-tuning the model by training it with additional examples of proper exercise form and accurate calorie counts for dishes can significantly increase accuracy.

Training the original foundation model requires significant data, compute, budget and expertise. Fine-tuning uses a much smaller amount of domain-specific data, but full fine-tuning of every model parameter can still be too challenging for many AI companies, developers and practitioners.

To address this challenge, researchers developed a technique called LoRA that significantly reduces the number of trainable parameters of AI models (e.g., a 98% reduction), cutting training cost while still improving the model's accuracy on the fine-tuned task. The original model weights stay frozen, and the learned changes are encapsulated in the LoRA adapter, whose values are added to the original weights to create the fine-tuned model.
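To make the parameter reduction concrete, here is a minimal sketch (with illustrative, assumed layer sizes, not figures from this post): instead of updating a full d_out x d_in weight matrix W, LoRA trains two low-rank factors B (d_out x r) and A (r x d_in), and the adapted weight is W plus the product of B and A.

```python
# Hypothetical illustration of LoRA's trainable-parameter savings
# on a single weight matrix.

def lora_trainable_params(d_out: int, d_in: int, rank: int) -> tuple:
    """Return (full fine-tune params, LoRA adapter params) for one layer."""
    full = d_out * d_in                       # every entry of W is trainable
    adapter = d_out * rank + rank * d_in      # only B and A are trainable
    return full, adapter

# Example: a 4096 x 4096 projection layer with a rank-8 adapter
# (sizes typical of a ~7B-parameter LLM; illustrative only).
full, adapter = lora_trainable_params(4096, 4096, 8)
print(f"full: {full:,}  adapter: {adapter:,}  reduction: {1 - adapter / full:.1%}")
```

For this layer the adapter trains roughly 0.4% of the original parameters; stacking such adapters across a model is how reductions on the order of 98% arise.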

Beyond making it easier to train the model, LoRA also enables greater efficiency, scalability and customization of on-device generative AI use cases. LoRA is broadly applicable to generative and traditional AI models.

As an example, generative AI models like LLMs can be fine-tuned to create tailored personal assistants, improved language translation and more. LoRA adapters are being generated by developers and the broader AI community to create custom experiences, and consumers can choose the one that matches their preferences.

LoRA is how generative AI can scale to provide more customized, personalized and accurate experiences based on consumer and business preferences.

Customized Stable Diffusion is now possible on device

We have also recently demonstrated Stable Diffusion with LoRA adapters running on an Android smartphone. The LoRA adapters enabled the creation of high-quality custom images for Stable Diffusion based on personal or artistic preferences. Users could select a LoRA adapter and set the adapter strength to produce the desired image.

For example, we demonstrated a “noodles” adapter that would create an image similar to the base Stable Diffusion output, except that pasta, such as spaghetti, would be integrated into the drawing style.

Beyond enabling fine-tuned language vision models (LVMs) for different artistic styles, the LoRA technique is broadly applicable to any AI model. It is especially crucial for on-device generative AI due to the size of the models and constraints in DRAM and flash storage — the adapters are small, often less than 2% of base model size, and quick to switch.
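The quick-switching property mentioned above can be sketched as follows (an assumed flow, not a description of any shipping runtime): because an adapter is only a small low-rank delta, changing styles means swapping a few small matrices per layer while the multi-gigabyte base weights stay resident in memory.

```python
# Sketch (assumed design): switching LoRA adapters without
# reloading the base model.

class LoraLayer:
    def __init__(self, base_weight):
        self.base = base_weight   # frozen, shared across all adapters
        self.delta = None         # current adapter's precomputed B @ A

    def set_adapter(self, delta):
        # Cost scales with the adapter size, not the base model size.
        self.delta = delta

    def effective_weight(self):
        if self.delta is None:
            return self.base
        return [[b + d for b, d in zip(row_b, row_d)]
                for row_b, row_d in zip(self.base, self.delta)]

layer = LoraLayer([[1.0, 0.0], [0.0, 1.0]])
layer.set_adapter([[0.0, 0.5], [0.0, 0.0]])   # e.g., a style adapter
print(layer.effective_weight())
```

This is why the small adapter size matters so much under on-device DRAM and flash constraints: many styles can ship and swap cheaply against one shared base model.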

Running LoRA adapters on device can provide enhanced privacy, security, reliability, personalization and cost benefits.

We are at the beginning of the generative AI era with much more innovation to come.

Driving continued innovation in on-device generative AI

LoRA and multimodal AI are great examples of the technologies coming next to on-device generative AI. They address existing challenges to provide contextual, custom and personalized experiences at scale for consumers and businesses.

We are in exciting times, and I look forward to seeing how this technology is used by developers and the rest of the AI ecosystem to provide enhanced user experiences.

Pat Lawlor
Director, Technical Marketing, Qualcomm Technologies, Inc.
