AI Simplified - Decoding the Jargon

In this blog post, I explain some core AI/ML concepts in simple terms.

Understanding these concepts is useful for anyone developing or working with Generative AI tools like ChatGPT, Gemini, etc.

These are the topics I cover in this post:

  • Prompt Engineering: Enhance AI responses with refined inputs.
  • Fine-tuning: Tailor AI models for specific tasks.
  • Retrieval-Augmented Generation (RAG): Improve AI using custom data sources.
  • Vector Database: Advanced storage system for AI.
  • Retriever: Link user queries with stored information in vector databases.

Prompt Engineering

The questions or instructions you give a Generative AI model (e.g., ChatGPT, Gemini) are called "prompts." To get better answers, you can "engineer," or refine, your prompts.

Example: Instead of saying "Write about dogs," say "Write a 200-word paragraph about the history of domesticated dogs, focusing on their roles in human society."
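Refining a prompt is mostly about adding concrete constraints. As a minimal sketch, the helper below turns a vague request into a specific one (the function name and parameters are illustrative, not part of any real API):

```python
def refine_prompt(topic, word_count, focus):
    """Build a specific prompt from a vague topic.

    An illustrative sketch: real prompt engineering is done by hand
    or with templating, but the idea is the same - add constraints
    such as length, subject, and focus.
    """
    return (
        f"Write a {word_count}-word paragraph about {topic}, "
        f"focusing on {focus}."
    )

simple = "Write about dogs"
refined = refine_prompt(
    "the history of domesticated dogs",
    200,
    "their roles in human society",
)
print(refined)
```

The refined version tells the model exactly how long the answer should be, what to cover, and what angle to take.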

graph LR
        A[User Input] --> B[Simple Prompting]
        A --> C[Prompt Engineering]

        B --> D[Direct Output]

        C --> E{Refine Prompt}
        E -->|Improve| F[Better Results]
        E -->|Iterate| C
        F --> G[Final Output]

        subgraph Simple Prompting
        B
        D
        end

        subgraph Prompt Engineering
        C
        E
        F
        end

Fine-tuning

Fine-tuning makes an AI model better at specific tasks. It is like teaching a smart student to become an expert in a new subject.

Let us say a law firm needs to create many legal documents every day. ChatGPT can write these documents, but it might not use the exact wording or format that the law firm needs.

To fix this, the law firm can "fine-tune" the AI model. This means teaching the AI to write documents exactly how the law firm wants them.

To do this, the firm shows the AI model examples of perfect legal documents written by its best lawyers. The AI learns from these examples and gets better at producing documents in exactly the style the firm wants.
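In practice, those example documents are collected into a training file of prompt/response pairs. Here is a hedged sketch of preparing such a file (the example documents and the prompt/completion field names are illustrative; the exact schema depends on the fine-tuning provider):

```python
import json

# Hypothetical examples of "perfect" documents from the firm's lawyers.
# Each pairs an instruction (prompt) with the desired output (completion).
examples = [
    {
        "prompt": "Draft a non-disclosure agreement for a software vendor.",
        "completion": "NON-DISCLOSURE AGREEMENT\nThis Agreement is made...",
    },
    {
        "prompt": "Draft an employment contract for a paralegal.",
        "completion": "EMPLOYMENT CONTRACT\nThis Contract is entered into...",
    },
]

# Fine-tuning services commonly accept one JSON object per line (JSONL).
with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

The resulting file is then uploaded to the fine-tuning service, which adjusts the base model's weights using these examples.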

flowchart LR
        A[("Base LLM")] -->|Fine-tuning| B
        subgraph FT [Fine-tuned LLM]
                B[("Fine-tuned Model")]
                D[("Fine-tuned Knowledge")]
        end
        C[("Law-firm specific examples")] --> FT
        FT <-->|Prompt/Response| E[("User")]

Retrieval-Augmented Generation (RAG)

RAG allows you to feed the AI model your own data sources, enabling it to give more relevant and tailored responses.

Imagine a pizza restaurant's chatbot using RAG. It is like giving the chatbot a constantly updated menu card. When customers ask about today's specials, changed delivery zones, or new toppings, the chatbot can instantly access this fresh information. It does not just rely on old data but can pull up the latest details.
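The pizza-chatbot idea can be sketched in a few lines. This is a toy version: the menu "database" is a plain dictionary and the retriever just matches keywords, whereas a real RAG system would use a vector database and an actual LLM call. All names here are made up for illustration:

```python
# Fresh restaurant data the base model was never trained on.
menu_db = {
    "specials": "Today's special: truffle mushroom pizza, 12 EUR.",
    "delivery": "We now deliver to the Riverside district.",
    "toppings": "New topping: smoked burrata.",
}

def retrieve(query):
    """Toy retriever: return entries whose key appears in the query."""
    return [text for key, text in menu_db.items() if key in query.lower()]

def answer(query):
    """Augment the user's question with retrieved, up-to-date context
    before it is sent to the AI model."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(answer("What are today's specials?"))
```

Updating the chatbot's knowledge now means editing `menu_db`, not retraining the model, which is the core appeal of RAG.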

graph TB
        subgraph "Traditional AI Model"
        A1[User Query] --> B1[AI Model]
        B1 --> C1[General Response]
        end
        subgraph "RAG-Enabled AI Model"
        A2[User Query] --> R[Retriever]
        R <--> V[Vector DB]
        V <--> D[Custom Data Sources]
        R --> B2[AI Model]
        A2 --> B2
        B2 --> C2[Tailored Response]
        end

Vector Database

A Vector Database is a smart storage system for AI. It helps AI access new or specific information not included in its original training.

A vector database is like a smart library where, instead of searching for books by title or author, you search by the ideas inside them. This helps the AI find and compare information more effectively.

Let me reuse the example from the previous section about the chatbot for a pizza restaurant. The restaurant keeps its latest menu in a Vector Database. When customers ask about new pizzas, the chatbot can quickly check this database for current information. This way, the restaurant does not need to constantly update the chatbot. They just add new pizza details to the database, and the chatbot can access this information when needed.

Vector databases store data differently from relational databases like MySQL. Instead of using rows and columns, vector databases convert each piece of data into a numerical format called an embedding. These embeddings are placed in a multi-dimensional space. Similar items are positioned closer together.

For example, the embeddings for "cat" and "dog" would sit near each other, while "cat" and "table" would be further apart. This closeness lets AI models find related information and give more relevant answers.
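"Closeness" between embeddings is usually measured with cosine similarity. Here is a toy computation with made-up 2-D coordinates (real embeddings have hundreds of dimensions and are produced by a trained model):

```python
import math

# Toy 2-D "embeddings": coordinates invented so that related words
# point in a similar direction.
embeddings = {
    "cat":   [0.90, 0.80],
    "dog":   [0.85, 0.75],
    "table": [0.10, 0.90],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# "cat" scores much higher against "dog" than against "table".
print(cosine_similarity(embeddings["cat"], embeddings["dog"]))
print(cosine_similarity(embeddings["cat"], embeddings["table"]))
```

A vector database is essentially an index that makes this kind of nearest-neighbor comparison fast over millions of embeddings.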

Retriever

The Retriever in a RAG system works like a smart search tool. It connects what users ask with the information stored in vector databases.

When someone asks a question, the Retriever does three main things:

  1. Find Similar Info: It looks for information that is close to what the user asked.

  2. Sort by Importance: It puts the found information in order, with the most useful stuff at the top.

  3. Pick the Best: It chooses the top pieces of information to send back to the AI.

The AI then uses this information to give an answer the user can easily understand.
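The three steps can be sketched with a toy retriever. Scoring here is simplified to counting shared words; a real retriever compares embedding vectors in a vector database. The documents are invented for illustration:

```python
documents = [
    "Today's special is the truffle mushroom pizza.",
    "Our opening hours are 11:00 to 22:00.",
    "We added smoked burrata as a new topping.",
    "The truffle pizza pairs well with our house salad.",
]

def retrieve(query, top_k=2):
    query_words = set(query.lower().split())
    # 1. Find Similar Info: score each document by word overlap.
    scored = [
        (len(query_words & set(doc.lower().split())), doc)
        for doc in documents
    ]
    # 2. Sort by Importance: highest score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # 3. Pick the Best: keep only the top_k matching documents.
    return [doc for score, doc in scored[:top_k] if score > 0]

print(retrieve("tell me about the truffle pizza"))
```

The selected documents are then passed to the AI model as context, exactly as in the RAG diagram below.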

graph TB
        subgraph "RAG-Enabled AI Model"
        A[User Query] --> R[Retriever]
        R <--> V[Vector DB]
        V <--> D[Custom Data Sources]
        R --> B[AI Model]
        A --> B
        B --> C[Tailored Response]
        end