Build Your Business-Specific LLMs Using RAG


AI Tech Circle

Hey Reader!

When we talk about Large Language Model (LLM) implementations in a business context, you will often hear the term Retrieval-Augmented Generation (RAG), presented as a magic wand for scenarios where generative AI needs to rely on your own data. RAG is the solution for connecting your business data to the LLM so that you get the desired outputs.

So I thought I would go through the fundamentals of RAG, purely for understanding and clarity. In a 2020 paper, “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” Meta introduced a retrieval-augmented generation framework that gives LLMs access to information beyond their training data. RAG allows LLMs to draw on a specialized body of knowledge to answer questions more accurately.

Retrieval-augmented generation (RAG) in Large Language Models (LLMs) enhances the model’s ability to generate responses by dynamically retrieving relevant information from a large dataset or database at the time of the query. This approach combines the generative power of LLMs with the specificity and accuracy provided by external data sources, enabling the model to produce more accurate, detailed, and contextually relevant outputs.

How RAG Works:

  1. Query Processing: When a query or prompt is received, the RAG system interprets the request.
  2. Data Retrieval: It then searches a connected database or knowledge base (which could be a collection of PDFs or other documents) to find information relevant to the query.
  3. Content Generation: The retrieved information is fed into the LLM, which uses this context to generate a more informed and accurate response.
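The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production system: the keyword-overlap scoring stands in for a real retriever, and the final prompt would be sent to an LLM of your choice rather than printed.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Step 2 (Data Retrieval): rank documents by word overlap with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def build_prompt(query, context_docs):
    """Step 3 (Content Generation): combine retrieved context with the query
    into an augmented prompt for the LLM."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

# Usage: a tiny in-memory knowledge base stands in for PDFs or a database.
docs = [
    "Company policy allows 25 days of annual leave.",
    "Type 2 diabetes treatment guidelines were updated recently.",
    "The cafeteria opens at 8 am.",
]
query = "What is the annual leave policy?"
prompt = build_prompt(query, retrieve(query, docs))
```

In a real deployment the retriever would typically be a vector search over embedded documents, but the control flow — interpret, retrieve, augment, generate — is the same.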

Example:

Suppose you are using a RAG-enhanced LLM for a medical information system. A user asks, “What are the latest treatment options for type 2 diabetes?”

  • Interpretation: The RAG system interprets the query to understand that it needs information on recent diabetes treatments.
  • Retrieval: It queries the connected medical database or sources of medical information stored in its knowledge base, retrieving articles, studies, and guidelines related to the latest treatment options for type 2 diabetes.
  • Generation: The LLM, now equipped with the latest retrieved information, generates a response summarizing the current treatment options, perhaps mentioning new drugs, lifestyle modification strategies, and the latest findings from recent studies.

Without RAG, an LLM would have to rely solely on the information it was trained on, which might be outdated or lack the specific details in newly published research. RAG ensures the model’s output is current and deeply informed by the most relevant available data, significantly enhancing the quality and utility of the response.

What are the use cases for RAG (Retrieval-Augmented Generation)?

  • Question-Answering Chatbots: Integrating LLMs into chatbots lets them generate more precise answers by accessing company documents and knowledge bases. This approach is primarily used to enhance customer support, automate website responses, and add business context and data, so inquiries get quick solutions and issues are resolved efficiently.
  • Enhanced Search Capabilities: When combined with search engines, LLMs can enrich search outcomes with generated responses, improving the accuracy of informational queries. This advancement makes it simpler for users to locate the necessary information for their tasks.
  • Data Query Engines: Using company data as context for LLMs enables employees to get answers to their queries effortlessly. This application is handy for accessing documents across divisions such as HR, Finance, Procurement, and Legal, for example, questions about company policies, benefits, and compliance standards.

These use cases demonstrate the versatility and potential of RAG to transform information retrieval and interaction within organizations. Next week, I will go through the technical aspects of RAG and how it works.

Weekly News & Updates…

This week’s unveiling of new AI tools and products drives the technology revolution forward.

  1. Aya, Cohere's open-source multilingual LLM, is now available on Kaggle, so head over and start exploring.
  2. Gemma, Google's family of open language models, is now available in the KerasNLP collection.
  3. Gemini Business from Google will be available in the Google Workspace apps.
  4. The EU’s AI Act and How Companies Can Achieve Compliance

The Cloud: the backbone of the AI revolution

Favorite Tip Of The Week:

Here’s my favorite resource of the week.

Potential of AI

  • Experiment: Figma to Replit Plugin: This experimental plugin turns static designs into responsive React components. Export the generated code to Replit to share an instantly deployable React app.

Things to Know

  • Stable Diffusion 3 is out in early preview: a text-to-image model with significantly improved performance in multi-subject prompts, image quality, and spelling abilities.

The Opportunity…

Podcast:

  • This week’s Open Tech Talks episode 126 is “Web3 Unveiled: Revolutionizing Digital Engagement with Viktoriia Miracle”

Apple | Spotify | Google Podcast

Courses to attend:

Events:

Tech and Tools…

  • Gemma in PyTorch: PyTorch implementation of Gemma models
  • SoraWebui is an open-source project that simplifies video creation by allowing users to generate videos online with OpenAI’s Sora model using text
  • ChatGPT + Enterprise data with Azure OpenAI and AI Search

Data Sets…

  • fastMRI Dataset from NYU School of Medicine and NYU Langone Health
  • ROSE: A Retinal OCT-Angiography Vessel SEgmentation Dataset

Other Technology News

Want to stay on the cutting edge?

Here’s what else is happening in Information Technology you should know about:

  • Google apologizes for ‘missing the mark’ after Gemini generated racially diverse Nazis, as reported by The Verge
  • Cyberattacks are the No. 1 worry for business leaders—and AI may be able to help, as reported by Fortune

Earlier editions of the newsletter

That’s it!

As always, thanks for reading.

Hit reply and let me know what you found most helpful this week – I’d love to hear from you!

Until next week,

Kashif Manzoor

The opinions expressed here are solely my conjecture based on experience, practice, and observation. They do not represent the thoughts, intentions, plans, or strategies of my current or previous employers or their clients/customers. The objective of this newsletter is to share and learn with the community.