02 — PYTHON & NUMPY FOUNDATION
The Numerical Foundation of AI
NumPy provides fast, vectorised N-dimensional arrays backed by optimised C code. Every major ML library — PyTorch, TensorFlow, scikit-learn — interoperates with NumPy arrays, accepting them as input or converting to and from them cheaply. Vectorised operations avoid slow Python loops, running at near-C speed. Arrays support broadcasting, slicing, reshaping, and full linear algebra operations natively, making them the universal currency of numerical AI computation.
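A minimal sketch of these ideas in NumPy (the array values are illustrative):

```python
import numpy as np

# Vectorised arithmetic: one expression replaces an explicit Python loop
a = np.arange(5)              # [0 1 2 3 4]
b = a * 2 + 1                 # [1 3 5 7 9], computed in C, no Python loop

# Broadcasting: a (3,1) column combines with a (4,) row to give a (3,4) grid
col = np.array([[0], [10], [20]])
row = np.array([0, 1, 2, 3])
grid = col + row              # shape (3, 4)

# Reshaping and slicing give views over the same data, without copying
m = np.arange(12).reshape(3, 4)
second_col = m[:, 1]          # [1 5 9]
```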
03 — DATA PREPROCESSING
Preparing Data for AI Models
Raw data is almost never ready for a machine learning model. Preprocessing transforms messy, inconsistent input data into a clean, structured format that a model can learn from effectively. The quality of preprocessing directly affects the quality of the model — a simple model trained on well-preprocessed data often outperforms a complex model fed raw, unclean data. Core steps include handling missing values, encoding categorical variables, scaling numerical features, and splitting data into training and test sets.
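The four core steps can be sketched with plain NumPy on toy data (a real pipeline would typically use pandas or scikit-learn, and the values here are illustrative):

```python
import numpy as np

# Toy numeric feature with a missing value (NaN)
x = np.array([1.0, 2.0, np.nan, 4.0])

# 1. Impute missing values with the column mean
x = np.where(np.isnan(x), np.nanmean(x), x)

# 2. One-hot encode a categorical column
cats = ["red", "blue", "red", "green"]
vocab = sorted(set(cats))                  # ['blue', 'green', 'red']
one_hot = np.array([[c == v for v in vocab] for c in cats], dtype=float)

# 3. Standardise the numeric feature to zero mean, unit variance
x_scaled = (x - x.mean()) / x.std()

# 4. Split into train and test sets (no shuffling here, for determinism)
X = np.column_stack([x_scaled[:, None], one_hot])
split = int(0.75 * len(X))
X_train, X_test = X[:split], X[split:]
```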
04 — LINEAR REGRESSION
The Fundamental ML Algorithm
Linear regression is the simplest and most fundamental ML algorithm. It models the relationship between input features and a continuous output as a straight line (or hyperplane in higher dimensions). The model finds the weights that minimise the Mean Squared Error between predictions and true values, either by solving the closed-form normal equation or iteratively via gradient descent.
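The closed-form route can be shown in a few lines of NumPy. This sketch uses toy data generated exactly from y = 2x + 1, so the learned weights are known in advance:

```python
import numpy as np

# Toy data from y = 2x + 1 (exact, so the true weights are recoverable)
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Add a bias column of ones so the intercept is learned as a weight
Xb = np.hstack([np.ones((len(X), 1)), X])

# Normal equation: solve (X^T X) w = X^T y for w = [intercept, slope]
w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# Mean Squared Error of the fitted model
mse = np.mean((Xb @ w - y) ** 2)
```

Using `np.linalg.solve` rather than explicitly inverting X^T X is the numerically preferred form; gradient descent reaches the same minimum iteratively.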
05 — K-NEAREST NEIGHBOURS
Instance-Based Learning
K-Nearest Neighbours (KNN) is a simple, non-parametric classifier that stores all training instances and predicts labels based on the majority class of the k closest points in feature space. It's particularly useful for multi-class problems and when decision boundaries are irregular.
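A minimal sketch of the idea — `knn_predict` is a hypothetical helper written here for illustration; in practice one would reach for scikit-learn's `KNeighborsClassifier`:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]

# Two well-separated clusters with labels 0 and 1
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label = knn_predict(X_train, y_train, np.array([0.5, 0.5]))
```

Note that nothing is "trained": the model is the stored data itself, which is what non-parametric, instance-based learning means.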
06 — NEURAL NETWORKS FROM SCRATCH
Building the XOR Network
A neural network consists of layers of neurons, each computing a weighted sum of its inputs followed by a non-linear activation function. The network learns by adjusting its weights through backpropagation — computing the gradient of a loss function with respect to each weight, then taking a small step in the direction that reduces the loss. This XOR example demonstrates how a simple network solves a non-linearly separable problem.
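The whole loop — forward pass, backpropagated gradients, weight update — fits in a short NumPy sketch. The layer sizes, learning rate, and iteration count here are illustrative choices, not canonical ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: not linearly separable, so a hidden layer is required
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 8 sigmoid units, one sigmoid output unit
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

lr = 1.0
for _ in range(4000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: gradients of squared-error loss through each layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent step on every weight and bias
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

# Final predictions after training
out = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
preds = (out > 0.5).astype(int).ravel()
```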
07 — NATURAL LANGUAGE PROCESSING
TF-IDF Text Representation
Before any model can process text, it must be converted into numerical form. This pipeline involves tokenisation, building a vocabulary, and representing documents as vectors — either using bag-of-words, TF-IDF, or modern contextual embeddings. TF-IDF (Term Frequency-Inverse Document Frequency) weights words by their importance in a document relative to the whole corpus.
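The three pipeline stages can be sketched in plain Python (the corpus is illustrative, and this uses the textbook TF-IDF formula; library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalisation on top):

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and the dogs",
]
tokenised = [d.split() for d in docs]                  # 1. tokenisation
vocab = sorted({w for doc in tokenised for w in doc})  # 2. vocabulary

def tf_idf(doc):
    """3. Represent one tokenised document as a TF-IDF vector over vocab."""
    counts, n_docs = Counter(doc), len(tokenised)
    vec = []
    for w in vocab:
        tf = counts[w] / len(doc)                   # term frequency
        df = sum(1 for d in tokenised if w in d)    # document frequency
        idf = math.log(n_docs / df) if df else 0.0  # inverse document frequency
        vec.append(tf * idf)
    return vec

vectors = [tf_idf(d) for d in tokenised]
```

Note how "the", which appears in every document, gets an IDF of log(1) = 0 and so carries zero weight everywhere — exactly the down-weighting of uninformative words that TF-IDF is for.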
08 — CLUSTERING (K-MEANS)
Unsupervised Learning
Unsupervised learning finds hidden structure in data without any labelled examples. K-Means clustering partitions data into K groups by iteratively assigning points to the nearest centroid and recomputing centroids until convergence. Applications include customer segmentation, document grouping, and anomaly detection.
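The assign-then-recompute loop (Lloyd's algorithm) is short enough to sketch directly in NumPy; the data and the fixed iteration count are illustrative, and a production version would also check for convergence and handle empty clusters:

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and update."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

# Two obvious clusters, around (0, 0) and (10, 10) — no labels given
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centroids = kmeans(X, k=2)
```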
09 — MODEL EVALUATION
Measuring Performance
Accuracy is intuitive but misleading on imbalanced datasets. Precision measures the fraction of positive predictions that are actually positive. Recall measures the fraction of actual positives that were detected. The F1 score balances both. ROC-AUC measures the model's ability to discriminate between classes across all thresholds.
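A small worked example shows why accuracy misleads on imbalanced data (`binary_metrics` is a hypothetical helper written for illustration; libraries like scikit-learn provide these metrics directly):

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall, and F1 from binary true and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Imbalanced data: 8 negatives, 2 positives
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]   # one FP, one TP, one FN

p, r, f1 = binary_metrics(y_true, y_pred)                          # all 0.5
accuracy = sum(t == q for t, q in zip(y_true, y_pred)) / len(y_true)  # 0.8
```

Accuracy reads a comfortable 0.8 while precision, recall, and F1 are all 0.5 — the imbalance hides how poorly the positive class is handled.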
10 — GENERATIVE AI & ETHICS
Language Models and Responsible AI
Language models learn to predict the next token in a sequence given all preceding tokens. Through this simple objective applied to enormous text corpora, they develop the ability to reason, summarise, translate, and answer questions. Building AI responsibly requires understanding Fairness (audit for bias), Transparency (document limitations), Privacy (protect sensitive data), and Robustness (test edge cases).
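The next-token objective itself can be shown with a toy count-based bigram model — predicting each word from only the single preceding one. Real language models condition on the whole context with a neural network, but the training signal is the same idea; the corpus here is illustrative:

```python
from collections import Counter, defaultdict

# Tiny corpus of tokens
corpus = "the cat sat on the mat the cat ran".split()

# Count which token follows which: a bigram "language model"
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(token):
    """Most frequent continuation of `token` in the corpus."""
    return counts[token].most_common(1)[0][0]

next_word = predict_next("the")   # "cat" follows "the" most often here
```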