02 — PYTHON & NUMPY FOUNDATION
The Numerical Foundation of AI
NumPy provides fast, vectorised N-dimensional arrays backed by optimised C code. Every major ML library — PyTorch, TensorFlow, scikit-learn — interoperates with NumPy arrays, accepting them as input or converting to and from them cheaply. Vectorised operations avoid slow Python loops, running at near-C speed. Arrays support broadcasting, slicing, reshaping, and full linear algebra operations natively, making them the universal currency of numerical AI computation.
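A minimal sketch of these ideas in NumPy (the array values are illustrative):

```python
import numpy as np

# Vectorised arithmetic: one expression replaces an explicit Python loop
a = np.arange(5)              # [0 1 2 3 4]
b = a * 2 + 1                 # [1 3 5 7 9], computed in C, no Python loop

# Broadcasting: a (3,1) column combines with a (4,) row to give a (3,4) grid
col = np.array([[0], [10], [20]])
row = np.array([0, 1, 2, 3])
grid = col + row              # shape (3, 4)

# Reshaping and slicing give views over the same data, without copying
m = np.arange(12).reshape(3, 4)
second_col = m[:, 1]          # [1 5 9]
```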
03 — DATA PREPROCESSING
Preparing Data for AI Models
Raw data is almost never ready for a machine learning model. Preprocessing transforms messy, inconsistent input data into a clean, structured format that a model can learn from effectively. The quality of preprocessing directly affects the quality of the model — a simple model trained on well-preprocessed data often outperforms a complex model fed raw, unclean data. Core steps include handling missing values, encoding categorical variables, scaling numerical features, and splitting data into training and test sets.
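The four core steps can be sketched with plain NumPy on toy data (a real pipeline would typically use pandas or scikit-learn, and the values here are illustrative):

```python
import numpy as np

# Toy numeric feature with a missing value (NaN)
x = np.array([1.0, 2.0, np.nan, 4.0])

# 1. Impute missing values with the column mean
x = np.where(np.isnan(x), np.nanmean(x), x)

# 2. One-hot encode a categorical column
cats = ["red", "blue", "red", "green"]
vocab = sorted(set(cats))                  # ['blue', 'green', 'red']
one_hot = np.array([[c == v for v in vocab] for c in cats], dtype=float)

# 3. Standardise the numeric feature to zero mean, unit variance
x_scaled = (x - x.mean()) / x.std()

# 4. Split into train and test sets (no shuffling here, for determinism)
X = np.column_stack([x_scaled[:, None], one_hot])
split = int(0.75 * len(X))
X_train, X_test = X[:split], X[split:]
```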
04 — LINEAR REGRESSION
The Fundamental ML Algorithm
Linear regression is the simplest and most fundamental ML algorithm. It models the relationship between input features and a continuous output as a straight line (or hyperplane in higher dimensions). The model finds the weights that minimise the Mean Squared Error between predictions and true values, either by solving the closed-form normal equation or iteratively via gradient descent.
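The closed-form route can be shown in a few lines of NumPy. This sketch uses toy data generated exactly from y = 2x + 1, so the learned weights are known in advance:

```python
import numpy as np

# Toy data from y = 2x + 1 (exact, so the true weights are recoverable)
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Add a bias column of ones so the intercept is learned as a weight
Xb = np.hstack([np.ones((len(X), 1)), X])

# Normal equation: solve (X^T X) w = X^T y for w = [intercept, slope]
w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# Mean Squared Error of the fitted model
mse = np.mean((Xb @ w - y) ** 2)
```

Using `np.linalg.solve` rather than explicitly inverting X^T X is the numerically preferred form; gradient descent reaches the same minimum iteratively.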
05 — K-NEAREST NEIGHBOURS
Instance-Based Learning
K-Nearest Neighbours (KNN) is a simple, non-parametric classifier that stores all training instances and predicts labels based on the majority class of the k closest points in feature space. It's particularly useful for multi-class problems and when decision boundaries are irregular.
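A minimal sketch of the idea — `knn_predict` is a hypothetical helper written here for illustration; in practice one would reach for scikit-learn's `KNeighborsClassifier`:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]

# Two well-separated clusters with labels 0 and 1
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label = knn_predict(X_train, y_train, np.array([0.5, 0.5]))
```

Note that nothing is "trained": the model is the stored data itself, which is what non-parametric, instance-based learning means.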
06 — NEURAL NETWORKS FROM SCRATCH
Building the XOR Network
A neural network consists of layers of neurons, each computing a weighted sum of its inputs followed by a non-linear activation function. The network learns by adjusting its weights through backpropagation — computing the gradient of a loss function with respect to each weight, then taking a small step in the direction that reduces the loss. This XOR example demonstrates how a simple network solves a non-linearly separable problem.
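The whole loop — forward pass, backpropagated gradients, weight update — fits in a short NumPy sketch. The layer sizes, learning rate, and iteration count here are illustrative choices, not canonical ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: not linearly separable, so a hidden layer is required
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 8 sigmoid units, one sigmoid output unit
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

lr = 1.0
for _ in range(4000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: gradients of squared-error loss through each layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent step on every weight and bias
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

# Final predictions after training
out = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
preds = (out > 0.5).astype(int).ravel()
```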
07 — NATURAL LANGUAGE PROCESSING
TF-IDF Text Representation
Before any model can process text, it must be converted into numerical form. This pipeline involves tokenisation, building a vocabulary, and representing documents as vectors — either using bag-of-words, TF-IDF, or modern contextual embeddings. TF-IDF (Term Frequency-Inverse Document Frequency) weights words by their importance in a document relative to the whole corpus.
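The three pipeline stages can be sketched in plain Python (the corpus is illustrative, and this uses the textbook TF-IDF formula; library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalisation on top):

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and the dogs",
]
tokenised = [d.split() for d in docs]                  # 1. tokenisation
vocab = sorted({w for doc in tokenised for w in doc})  # 2. vocabulary

def tf_idf(doc):
    """3. Represent one tokenised document as a TF-IDF vector over vocab."""
    counts, n_docs = Counter(doc), len(tokenised)
    vec = []
    for w in vocab:
        tf = counts[w] / len(doc)                   # term frequency
        df = sum(1 for d in tokenised if w in d)    # document frequency
        idf = math.log(n_docs / df) if df else 0.0  # inverse document frequency
        vec.append(tf * idf)
    return vec

vectors = [tf_idf(d) for d in tokenised]
```

Note how "the", which appears in every document, gets an IDF of log(1) = 0 and so carries zero weight everywhere — exactly the down-weighting of uninformative words that TF-IDF is for.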
08 — CLUSTERING (K-MEANS)
Unsupervised Learning
Unsupervised learning finds hidden structure in data without any labelled examples. K-Means clustering partitions data into K groups by iteratively assigning points to the nearest centroid and recomputing centroids until convergence. Applications include customer segmentation, document grouping, and anomaly detection.
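The assign-then-recompute loop (Lloyd's algorithm) is short enough to sketch directly in NumPy; the data and the fixed iteration count are illustrative, and a production version would also check for convergence and handle empty clusters:

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and update."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

# Two obvious clusters, around (0, 0) and (10, 10) — no labels given
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centroids = kmeans(X, k=2)
```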
09 — MODEL EVALUATION
Measuring Performance
Accuracy is intuitive but misleading on imbalanced datasets. Precision measures the fraction of positive predictions that are actually positive. Recall measures the fraction of actual positives that were detected. The F1 score balances both. ROC-AUC measures the model's ability to discriminate between classes across all thresholds.
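A small worked example shows why accuracy misleads on imbalanced data (`binary_metrics` is a hypothetical helper written for illustration; libraries like scikit-learn provide these metrics directly):

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall, and F1 from binary true and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Imbalanced data: 8 negatives, 2 positives
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]   # one FP, one TP, one FN

p, r, f1 = binary_metrics(y_true, y_pred)                          # all 0.5
accuracy = sum(t == q for t, q in zip(y_true, y_pred)) / len(y_true)  # 0.8
```

Accuracy reads a comfortable 0.8 while precision, recall, and F1 are all 0.5 — the imbalance hides how poorly the positive class is handled.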
10 — GENERATIVE AI & ETHICS
Language Models and Responsible AI
Language models learn to predict the next token in a sequence given all preceding tokens. Through this simple objective applied to enormous text corpora, they develop the ability to reason, summarise, translate, and answer questions. Building AI responsibly requires understanding Fairness (audit for bias), Transparency (document limitations), Privacy (protect sensitive data), and Robustness (test edge cases).
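The next-token objective itself can be shown with a toy count-based bigram model — predicting each word from only the single preceding one. Real language models condition on the whole context with a neural network, but the training signal is the same idea; the corpus here is illustrative:

```python
from collections import Counter, defaultdict

# Tiny corpus of tokens
corpus = "the cat sat on the mat the cat ran".split()

# Count which token follows which: a bigram "language model"
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(token):
    """Most frequent continuation of `token` in the corpus."""
    return counts[token].most_common(1)[0][0]

next_word = predict_next("the")   # "cat" follows "the" most often here
```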