
Decoding AI Jargons with Chai


After today's class, you will never be able to look at ChatGPT the same way again. ~ Piyush Garg Sir

This is what our mentor told us at the beginning of our first class of the Gen AI cohort, and it is absolutely true. This lecture was my first face-off with generative AI, and it went well beyond my expectations. Behind typing a simple “Hi” to ChatGPT and getting “Hey, what's up!” in return, there are many underlying mechanisms at work. Here are some of the key takeaways I picked up, served one sip at a time.

Nowadays, AI is everywhere, from personal assistants to everyday software applications. So, as engineers, it has become essential to know the jargon that is common in the AI domain. Let's start with some sips of Chai:


Sip 1 :

  • NLP (Natural Language Processing) → The field that enables AI to understand human language and respond back in the same human language.

  • Tokenisation → The process by which an AI model breaks our input into chunks, called tokens, which it then processes. Each AI model has its own way of tokenising input. For example, OpenAI's models (behind ChatGPT) use a tokenising library called tiktoken; similarly, other models like Gemini and Claude have their own tokenisation algorithms.

import tiktoken

# Load the tokeniser used by the gpt-4o model
encoder = tiktoken.encoding_for_model('gpt-4o')

print("Vocab Size", encoder.n_vocab)

# Encode text into token IDs
text = "The cat sat on the mat"
tokens = encoder.encode(text)

print("Tokens", tokens)

# Decode token IDs back into text
my_tokens = [976, 9059, 10139, 402, 290, 2450]
decoded = encoder.decode(my_tokens)
print("Decoded", decoded)
  • Vector Embeddings → The tokens are then converted into vector embeddings: lists of numbers that capture the meaning of the input so the model can work with it. You can picture it as a graph (projected down to 3D for visualisation), where each word is plotted as a point and points sit close together when their semantic meanings are related. For example, if we look up the word “tea” in an embedding projector, the nearby points are words like “coffee”, “beverage”, “cola” and “drink”.
from dotenv import load_dotenv
from openai import OpenAI

# Load the OPENAI_API_KEY from a .env file
load_dotenv()

client = OpenAI()

text = "Eiffel Tower is in Paris and is a famous landmark, it is 324 meters tall"

# Request an embedding vector for the text
response = client.embeddings.create(
    input=text,
    model="text-embedding-3-small"
)

print("Vector Embeddings", response.data[0].embedding)
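How do we check that two embeddings are “close”? A common measure is cosine similarity. Here is a minimal sketch using tiny made-up 3-dimensional vectors purely for illustration (real embeddings from text-embedding-3-small have 1536 dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = unrelated."""
    a, b = np.array(a), np.array(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy vectors, NOT real embeddings
tea = [0.8, 0.6, 0.1]
coffee = [0.7, 0.7, 0.2]
car = [0.1, -0.9, 0.4]

print(cosine_similarity(tea, coffee))  # close to 1: related meanings
print(cosine_similarity(tea, car))     # much lower: unrelated meanings
```

With real embeddings, “tea” and “coffee” would likewise score higher than “tea” and “car”, which is exactly what the embedding projector visualises.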

Sip 2 :

  • Attention & Multi-head Attention → A mechanism through which tokens communicate with each other. In other words, through the attention mechanism, the model can focus on the important (most relevant) parts of the input. Multi-head attention means running several of these attention mechanisms in parallel, each focusing on a different aspect of the input, which helps the model understand the input's context.
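The core idea can be sketched in a few lines of NumPy. This is a simplified single-head version of scaled dot-product attention, not the actual implementation inside any production model, and the query/key/value matrices are random stand-ins:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each token's output is a
    weighted mix of all value vectors, weighted by query-key similarity."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Toy example: 3 tokens, each represented by a 4-dimensional vector
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))

out, weights = attention(Q, K, V)
print(weights)  # row i shows how much token i "attends to" every other token
```

Multi-head attention simply runs several copies of this with different learned Q/K/V projections and concatenates the results.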

Sip 3 :

  • FeedForward → The neural network that acts as the brain of the transformer. It helps the model learn complex patterns, behaviour and context in the input, on the basis of which it produces the output.

  • Softmax → A mathematical function that turns the model's raw scores into a probability distribution over the possible next tokens. The next word is then chosen based on these probabilities, which is where the model's creativity can be tuned.

  • Temperature → The factor in an AI model that decides the creativity (i.e. randomness) of the output. At a low temperature the model sticks to the safest words and might respond with “The sky is blue”, while at a high temperature it gets more adventurous and might respond with “The sky, a canvas of swirling nebulae, whispered ancient secrets”.
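Softmax and temperature fit together in one formula: the raw scores (logits) are divided by the temperature before softmax is applied. A minimal sketch, with made-up logits for three candidate words:

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by temperature, then normalise into probabilities.
    Low temperature -> sharper (more deterministic) distribution;
    high temperature -> flatter (more random) distribution."""
    scaled = np.array(logits) / temperature
    e = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return e / e.sum()

# Hypothetical logits for the next-word candidates ["blue", "grey", "nebulae"]
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, temperature=0.5)
high = softmax_with_temperature(logits, temperature=2.0)

print("T=0.5:", low)   # top candidate dominates
print("T=2.0:", high)  # probability spreads out to unlikely words
```

At low temperature almost all the probability piles onto “blue”; at high temperature even “nebulae” gets a real chance of being sampled, which is why the output feels more creative.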


    These were some of my personal takeaways from the lecture. It was an eye-opening session, and no, I won't see ChatGPT the way I used to see it before.