Advanced Data Structures

Built-in Collections

Python's foundational data structures provide efficient storage and retrieval mechanisms for different use cases.

Type	Syntax	Features
List	`[]`	Ordered, Mutable, Allows duplicates
Tuple	`()`	Ordered, Immutable, Hashable if all elements are hashable
Set	`{1, 2}` or `set()`	Unique, Unordered, Fast lookup (`{}` alone creates an empty dict)
Dict	`{k:v}`	Key-value pairs, average O(1) access

Performance Tip: Use sets for membership testing, dicts for key-based lookups, and lists for ordered sequences.
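As a rough illustration of the tip above, the sketch below times the same membership test on a list and on a set. The collection size and the timeit repeat count are arbitrary choices made for this example, and absolute timings will vary by machine.

```python
import timeit

# The same 100,000 integers stored two ways (size chosen only for illustration).
values_list = list(range(100_000))
values_set = set(values_list)

target = 99_999  # Worst case for the list: the last element

# A list is scanned element by element; a set uses a hash lookup.
list_time = timeit.timeit(lambda: target in values_list, number=100)
set_time = timeit.timeit(lambda: target in values_set, number=100)

print(f"list: {list_time:.6f}s, set: {set_time:.6f}s")

# Both containers give the same answer; only the cost differs.
found_in_list = target in values_list
found_in_set = target in values_set
```

On most machines the set lookup should be faster by several orders of magnitude at this size.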

Collections Module

High-performance container datatypes that extend Python's built-in collections:

namedtuple()

Create tuple subclasses with named fields for better code readability:

from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(11, y=22)
print(p.x, p.y)  # 11 22

deque

Double-ended queue with O(1) appends/pops from both ends:

from collections import deque

d = deque([1, 2, 3])
d.appendleft(0)
d.append(4)
# deque([0, 1, 2, 3, 4])

Counter

Count hashable objects efficiently:

from collections import Counter

words = ['red', 'blue', 'red', 'green', 'blue', 'blue']
counter = Counter(words)
print(counter.most_common(2))
# [('blue', 3), ('red', 2)]

defaultdict

Dictionary with default values for missing keys:

from collections import defaultdict

dd = defaultdict(list)
dd['colors'].append('red')
# No KeyError, creates empty list

Control Flow Mastery

Conditional Statements

Python's control flow structures enable decision-making in your code:

# Standard if-elif-else
score = 85

if score >= 90:
    grade = 'A'
elif score >= 80:
    grade = 'B'
elif score >= 70:
    grade = 'C'
else:
    grade = 'F'

# Ternary operator
status = 'Pass' if score >= 60 else 'Fail'

Match-Case (Python 3.10+): Structural pattern matching for complex conditions.

status_code = 404  # Example value

match status_code:
    case 200:
        print("Success")
    case 404:
        print("Not Found")
    case _:
        print("Other status")

Loops & Iteration

For Loops

# Iterate over sequence
for item in [1, 2, 3]:
    print(item)

# With index using enumerate
for idx, value in enumerate(['a', 'b', 'c']):
    print(f"{idx}: {value}")

# Iterate over dictionary
my_dict = {'a': 1, 'b': 2}  # Example data
for key, value in my_dict.items():
    print(f"{key} = {value}")

While Loops

count = 0
while count < 5:
    print(count)
    count += 1

Loop Control

break - Exit loop immediately
continue - Skip to next iteration
else - Execute only if the loop finishes without hitting break

for num in range(10):
    if num == 3:
        continue  # Skip 3
    if num == 7:
        break  # Stop at 7
    print(num)
else:
    print("Loop finished")  # Not reached here: the loop exits via break at 7
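The else clause on a loop is easiest to see in a search. The sketch below uses a hypothetical helper, first_negative, in which the else branch runs only when the loop ends without a break.

```python
def first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            result = n
            break          # Found one: the else block is skipped
    else:
        result = None      # Loop finished without break
    return result


print(first_negative([3, -1, -5]))  # -1
print(first_negative([1, 2, 3]))    # None
```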

Exception Handling

Gracefully handle errors and edge cases:

try:
    result = 10 / 0
except ZeroDivisionError as e:
    print(f"Error: {e}")
except Exception as e:
    print(f"Unexpected: {e}")
else:
    print("No errors")
finally:
    print("Always runs")

Best Practice: Catch specific exceptions rather than bare except: clauses.
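A minimal sketch of that practice follows. The parse_port helper is hypothetical; it handles only the two exceptions that int() is expected to raise for bad input and lets everything else propagate.

```python
def parse_port(text):
    """Convert text to a TCP port number, or return None if it is invalid."""
    try:
        port = int(text)
    except (TypeError, ValueError):
        # Only the failures int() is expected to raise are handled here;
        # anything else (for example KeyboardInterrupt) still propagates.
        return None
    if not 0 < port < 65536:
        return None
    return port


print(parse_port("8080"))   # 8080
print(parse_port("http"))   # None
```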

Custom Exceptions

class ValidationError(Exception):
    pass

def validate_age(age):
    if age < 0:
        raise ValidationError("Age cannot be negative")

Pythonic Comprehensions

Comprehensions provide concise syntax for building lists, dicts, and sets. They are often clearer than the equivalent loop and usually somewhat faster.

List Comprehensions

Create lists in a single, readable line:

# Basic list comprehension
squares = [x**2 for x in range(10)]

# With conditional filter
even_squares = [x**2 for x in range(10) if x % 2 == 0]

# Nested comprehension
matrix = [[i*j for j in range(3)] for i in range(3)]

# If-else in comprehension
labels = ['even' if x % 2 == 0 else 'odd' for x in range(5)]

Dictionary Comprehensions

Build dictionaries efficiently:

# Create dict from two lists
keys = ['a', 'b', 'c']
values = [1, 2, 3]
mapping = {k: v for k, v in zip(keys, values)}

# Transform dictionary values
original = {'name': 'john', 'city': 'nyc'}
uppercase = {k: v.upper() for k, v in original.items()}

# Filter dictionary
scores = {'Alice': 85, 'Bob': 92, 'Charlie': 78}
high_scores = {k: v for k, v in scores.items() if v >= 80}

Set Comprehensions

Generate unique collections:

# Unique squares
unique_squares = {x**2 for x in [-2, -1, 0, 1, 2]}
# Result: {0, 1, 4}

Generator Expressions

Memory-efficient iteration for large datasets:

# Generator (uses parentheses)
gen = (x**2 for x in range(1000000))

# Lazy evaluation - values computed on demand
for value in gen:
    if value > 100:
        break

Performance Note: Generator expressions use minimal memory as they compute values on-the-fly, making them ideal for large datasets.
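To make the memory difference concrete, the sketch below compares the container size of a list comprehension with that of the equivalent generator expression. The size n is an arbitrary choice for illustration, and the exact byte counts depend on the Python build.

```python
import sys

n = 100_000  # Size chosen only for illustration

squares_list = [x**2 for x in range(n)]   # Builds every value up front
squares_gen = (x**2 for x in range(n))    # Builds nothing until iterated

# getsizeof reports the container itself, not the integers it refers to.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

# A generator can be consumed only once.
total = sum(squares_gen)
leftover = list(squares_gen)  # Already exhausted, so this is empty
```

The trade-off is that a generator supports neither indexing nor a second pass; build a list when either is needed.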

Object Oriented Programming

The Anatomy of a Class

Classes are blueprints for creating objects with shared attributes and behaviors:

class ProjectManager:
    # Class variable (shared across instances)
    company = "TechCorp"
    
    def __init__(self, name, budget):
        # Instance variables
        self.name = name
        self.__budget = budget  # Private attribute
        self._projects = []  # Protected by convention
    
    # Property decorator for controlled access
    @property
    def budget(self):
        return self.__budget
    
    @budget.setter
    def budget(self, value):
        if value < 0:
            raise ValueError("Budget cannot be negative")
        self.__budget = value
    
    # Instance method
    def add_project(self, project):
        self._projects.append(project)
        return f"Added {project}"
    
    # Class method
    @classmethod
    def from_config(cls, config_dict):
        return cls(config_dict['name'], config_dict['budget'])
    
    # Static method
    @staticmethod
    def calculate_tax(amount):
        return amount * 0.15

Usage Example:

pm = ProjectManager("Alice", 50000)
print(pm.budget)  # 50000
pm.budget = 60000  # Using setter
pm.add_project("Website Redesign")

Access Modifiers & Encapsulation

Naming	Convention	Access Level
`public_attr`	No prefix	Fully accessible
`_protected`	Single underscore	Internal use (by convention)
`__private`	Double underscore	Name mangled (pseudo-private)

Important: Python doesn't enforce true private attributes. Double underscore triggers name mangling to _ClassName__attribute.
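Name mangling can be observed directly. In this sketch the Account class is a hypothetical example; the double-underscore attribute is hidden under its short name outside the class but remains reachable under the mangled name.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance   # Stored as _Account__balance

    def balance(self):
        return self.__balance      # Inside the class the short name works


acct = Account(100)

try:
    acct.__balance                 # Outside the class: no such attribute
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

mangled_value = acct._Account__balance   # Still reachable under the mangled name
print(direct_access_failed, mangled_value)  # True 100
```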

Data Classes (Python 3.7+)

Simplified class creation for data storage:

from dataclasses import dataclass

@dataclass
class Employee:
    name: str
    department: str
    salary: float
    active: bool = True
    
# Auto-generates __init__, __repr__, __eq__, etc.
emp = Employee("Bob", "Engineering", 75000)

Inheritance & Mixins

Single Inheritance

Child classes inherit attributes and methods from parent classes:

class Vehicle:
    def __init__(self, brand, model):
        self.brand = brand
        self.model = model
    
    def start(self):
        return f"{self.brand} {self.model} is starting"

class ElectricCar(Vehicle):
    def __init__(self, brand, model, battery_capacity):
        super().__init__(brand, model)
        self.battery_capacity = battery_capacity
    
    def charge(self):
        return f"Charging {self.battery_capacity}kWh battery"

tesla = ElectricCar("Tesla", "Model 3", 75)
print(tesla.start())  # Inherited method
print(tesla.charge())  # New method

Multiple Inheritance & MRO

Python supports multiple inheritance and uses the Method Resolution Order (MRO) to decide which base class supplies each attribute or method:

class Flyable:
    def fly(self):
        return "Flying in the sky"

class Swimmable:
    def swim(self):
        return "Swimming in water"

class Duck(Flyable, Swimmable):
    def quack(self):
        return "Quack!"

duck = Duck()
print(duck.fly())
print(duck.swim())
print(Duck.__mro__)  # View resolution order

MRO Algorithm: Python uses C3 linearization to determine the order in which base classes are searched when looking for a method.
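A diamond-shaped hierarchy shows the linearization at work. The class names below are hypothetical; the point is that each class appears once, subclasses come before their bases, and the left-to-right order of the bases is preserved.

```python
class Base:
    def who(self):
        return "Base"


class Left(Base):
    def who(self):
        return "Left"


class Right(Base):
    def who(self):
        return "Right"


class Bottom(Left, Right):
    pass


mro_names = [cls.__name__ for cls in Bottom.__mro__]
print(mro_names)        # ['Bottom', 'Left', 'Right', 'Base', 'object']
print(Bottom().who())   # Left
```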

Super() Deep Dive

The super() function delegates a method call to the next class in the MRO, so parent classes do not have to be named explicitly:

class Parent:
    def __init__(self, name):
        self.name = name
        print(f"Parent init: {name}")

class Child(Parent):
    def __init__(self, name, age):
        super().__init__(name)  # Call parent __init__
        self.age = age
        print(f"Child init: {age}")

Cooperative Multiple Inheritance

class A:
    def method(self):
        print("A")
        super().method()  # Cooperative: calls the next class in the MRO (B, via C)

class B:
    def method(self):
        print("B")

class C(A, B):
    def method(self):
        print("C")
        super().method()

c = C()
c.method()  # Output: C, A, B (follows MRO)

Abstract Base Classes

Define interfaces that subclasses must implement:

from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        pass
    
    @abstractmethod
    def perimeter(self):
        pass

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height
    
    def area(self):
        return self.width * self.height
    
    def perimeter(self):
        return 2 * (self.width + self.height)

# Cannot instantiate Shape directly
# shape = Shape()  # TypeError
rect = Rectangle(5, 3)

Magic Methods (Dunder Methods)

Magic methods (double underscore methods) allow you to define how objects behave with Python's built-in operations:

Object Lifecycle

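__init__(self) - Initializer (commonly called the constructor)
__new__(cls) - Instance creation, runs before __init__
__del__(self) - Finalizer, runs when the object is about to be destroyed (timing not guaranteed)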

String Representation

__str__(self) - Human-readable string
__repr__(self) - Developer-friendly representation

Comparison Operators

__eq__(self, other) - Equal to (==)
__lt__(self, other) - Less than (<)
__le__(self, other) - Less than or equal (<=)
__gt__(self, other) - Greater than (>)
__ge__(self, other) - Greater than or equal (>=)

Arithmetic Operators

__add__(self, other) - Addition (+)
__sub__(self, other) - Subtraction (-)
__mul__(self, other) - Multiplication (*)
__truediv__(self, other) - Division (/)
__mod__(self, other) - Modulo (%)

Container Methods

__len__(self) - Length
__getitem__(self, key) - Indexing []
__setitem__(self, key, value) - Assignment []
__contains__(self, item) - Membership (in)
__iter__(self) - Make iterable

Practical Example

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    
    def __repr__(self):
        return f"Vector({self.x}, {self.y})"
    
    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)
    
    def __mul__(self, scalar):
        return Vector(self.x * scalar, self.y * scalar)
    
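    def __eq__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented  # Let Python try the other operand
        return self.x == other.x and self.y == other.y
    
    def __len__(self):
        # Magnitude truncated to an int, because __len__ must return a
        # non-negative integer; __abs__ is the more conventional hook.
        return int((self.x**2 + self.y**2)**0.5)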

# Usage
v1 = Vector(2, 3)
v2 = Vector(1, 1)
v3 = v1 + v2  # Vector(3, 4)
v4 = v1 * 2   # Vector(4, 6)
print(len(v3))  # 5

Context Managers: Use __enter__ and __exit__ to create objects that work with the with statement.
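A minimal sketch of that protocol follows. The Timer class is a hypothetical example: __enter__ runs when the with block starts, and __exit__ runs when it ends, whether or not an exception was raised.

```python
import time


class Timer:
    """Hypothetical context manager that records elapsed wall-clock time."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self                      # Bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        return False                     # Do not suppress exceptions


with Timer() as timer:
    total = sum(range(1000))

print(f"{timer.elapsed:.6f}s")
```

Returning a true value from __exit__ would swallow any exception raised inside the block, so False is the safer default.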

Standard Library: OS & Sys

OS Module

Interface with the operating system for file and directory operations:

import os

# Directory operations
current_dir = os.getcwd()  # Get current directory
os.chdir('/path/to/dir')  # Change directory
os.mkdir('new_folder')  # Create directory
os.makedirs('path/to/nested/dir', exist_ok=True)

# File operations
os.remove('file.txt')  # Delete file
os.rename('old.txt', 'new.txt')  # Rename

# Path operations
print(os.path.exists('file.txt'))  # Check existence
print(os.path.isfile('data.csv'))  # Is it a file?
print(os.path.isdir('folder'))  # Is it a directory?
print(os.path.join('dir', 'file.txt'))  # Join paths

# Environment variables
home = os.environ.get('HOME')
os.environ['CUSTOM_VAR'] = 'value'

Listing Directory Contents

# List all files and folders
items = os.listdir('.')

# Walk directory tree
for root, dirs, files in os.walk('.'):
    for file in files:
        full_path = os.path.join(root, file)
        print(full_path)

Sys Module

System-specific parameters and functions:

import sys

# Python version info
print(sys.version)  # Full version string
print(sys.version_info)  # Version tuple

# Command line arguments
print(sys.argv)  # List of arguments
# Example: python script.py arg1 arg2
# sys.argv = ['script.py', 'arg1', 'arg2']

# Module search path
print(sys.path)  # List of import paths
sys.path.append('/custom/path')

# Standard streams
sys.stdout.write('Output\n')
sys.stderr.write('Error message\n')

# Exit program
sys.exit(0)  # Exit code 0 = success

Platform Information

print(sys.platform)  # 'linux', 'darwin', 'win32'
print(sys.maxsize)  # Largest container index (Py_ssize_t max); ints themselves are unbounded
print(sys.executable)  # Python interpreter path

Memory Tip: Use sys.getsizeof(object) to check the memory used by an object itself. Objects it refers to are not included.
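To see what sys.getsizeof does and does not count, the sketch below measures a nested list. The variable names are illustrative, and the one-level total is only a rough estimate, not a full deep-size calculation.

```python
import sys

small = [0] * 10
nested = [small] * 3     # Three references to the same inner list

# getsizeof counts only the outer list's own storage (header plus references),
# not the objects those references point to.
outer_only = sys.getsizeof(nested)

# A rough total that follows one level of references and counts each
# distinct inner object once.
unique_items = {id(item): item for item in nested}.values()
rough_total = outer_only + sum(sys.getsizeof(item) for item in unique_items)
print(outer_only, rough_total)
```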

Pathlib Reference

Object-oriented filesystem paths - the modern, Pythonic way to handle file paths:

from pathlib import Path

# Create path objects
current = Path.cwd()  # Current directory
home = Path.home()  # Home directory
path = Path('data/files/report.txt')

# Path properties
print(path.name)  # 'report.txt'
print(path.stem)  # 'report'
print(path.suffix)  # '.txt'
print(path.parent)  # 'data/files'
print(path.parts)  # ('data', 'files', 'report.txt')

# Join paths (modern way)
config_path = Path('config') / 'settings.json'
log_path = Path.home() / '.logs' / 'app.log'

File Operations

Reading & Writing

path = Path('data.txt')

# Read entire file
content = path.read_text()
data = path.read_bytes()

# Write to file
path.write_text('Hello World')
path.write_bytes(b'\x89PNG')

# Append to file
with path.open('a') as f:
    f.write('New line\n')

Checking & Creating

# Check existence
if path.exists():
    print("File exists")

if path.is_file():
    print("It's a file")

if path.is_dir():
    print("It's a directory")

# Create directories
Path('logs').mkdir(exist_ok=True)
Path('a/b/c').mkdir(parents=True, exist_ok=True)  # Also creates missing parents

Globbing & Iteration

# Find all Python files
py_files = Path('.').glob('*.py')
for file in py_files:
    print(file)

# Recursive search
all_txt = Path('.').rglob('*.txt')  # Search subdirectories

# Iterate directory contents
for item in Path('.').iterdir():
    if item.is_file():
        print(f"File: {item.name}")

Advanced Operations

# Resolve absolute path
abs_path = path.resolve()

# Get file stats
stats = path.stat()
print(f"Size: {stats.st_size} bytes")
print(f"Modified: {stats.st_mtime}")

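# Rename/move file (returns the new Path on Python 3.8+)
new_path = path.rename('new_name.txt')

# Delete file (the old path no longer exists after the rename)
new_path.unlink()

# Remove an empty directory ('empty_dir' is a placeholder name)
Path('empty_dir').rmdir()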

Best Practice: Prefer pathlib.Path over os.path in new Python code. Both are cross-platform, but Path objects are usually more readable.
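A side-by-side sketch of the two styles, using a hypothetical file path. Neither version touches the filesystem; both only build and inspect a path.

```python
import os.path
from pathlib import Path

# os.path: nested function calls on plain strings
old_style = os.path.join("data", "reports", "2024.csv")
old_stem = os.path.splitext(os.path.basename(old_style))[0]

# pathlib: the / operator and attributes on a Path object
new_style = Path("data") / "reports" / "2024.csv"
new_stem = new_style.stem

print(old_style, old_stem)
print(new_style, new_stem)
```

Path objects are accepted by open() and by most standard-library functions that take a filename, so the two styles can be mixed during a migration.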

Itertools - Efficient Iteration Tools

The itertools module provides memory-efficient tools for working with iterators:

Infinite Iterators

from itertools import count, cycle, repeat

# count(start, step) - infinite counter
for i in count(10, 2):
    if i > 20:
        break
    print(i)  # 10, 12, 14, 16, 18, 20

# cycle(iterable) - infinite cycle
colors = cycle(['red', 'green', 'blue'])
# Cycles forever: red, green, blue, red...

# repeat(value, times) - repeat value
list(repeat(10, 3))  # [10, 10, 10]

Combinatoric Iterators

from itertools import (
    permutations, 
    combinations, 
    product
)

# All permutations
list(permutations([1, 2, 3], 2))
# [(1,2), (1,3), (2,1), (2,3), (3,1), (3,2)]

# Combinations (no repeats)
list(combinations([1, 2, 3], 2))
# [(1,2), (1,3), (2,3)]

# Cartesian product
list(product('AB', [1, 2]))
# [('A',1), ('A',2), ('B',1), ('B',2)]

Filtering & Slicing

from itertools import count, filterfalse, islice, takewhile, dropwhile

# filterfalse - opposite of filter
data = [1, 2, 3, 4, 5, 6]
list(filterfalse(lambda x: x % 2 == 0, data))  # [1, 3, 5]

# islice - slice iterator without loading into memory
list(islice(count(), 5, 10))  # [5, 6, 7, 8, 9]

# takewhile - take elements while condition is true
list(takewhile(lambda x: x < 5, [1, 4, 6, 2, 1]))  # [1, 4]

# dropwhile - drop elements while condition is true
list(dropwhile(lambda x: x < 5, [1, 4, 6, 2, 1]))  # [6, 2, 1]

Grouping & Chaining

from itertools import groupby, chain

# groupby - group consecutive elements
data = [1, 1, 2, 2, 2, 3, 1]
for key, group in groupby(data):
    print(f"{key}: {list(group)}")
# 1: [1, 1]
# 2: [2, 2, 2]
# 3: [3]
# 1: [1]

# chain - combine multiple iterables
list(chain([1, 2], [3, 4], [5]))  # [1, 2, 3, 4, 5]

# chain.from_iterable - flatten nested iterables
nested = [[1, 2], [3, 4], [5]]
list(chain.from_iterable(nested))  # [1, 2, 3, 4, 5]

Practical Examples

from itertools import zip_longest, accumulate
import operator

# zip_longest - zip with fillvalue for unequal lengths
list(zip_longest([1, 2], ['a', 'b', 'c'], fillvalue=0))
# [(1, 'a'), (2, 'b'), (0, 'c')]

# accumulate - running totals
list(accumulate([1, 2, 3, 4]))  # [1, 3, 6, 10]

# accumulate with custom function
list(accumulate([1, 2, 3, 4], operator.mul))  # [1, 2, 6, 24]

Memory Efficiency: Itertools functions return iterators, not lists. They generate values on-demand, saving memory for large datasets.
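Because each stage is lazy, iterators can be chained into a pipeline over an infinite source without ever materializing it. A minimal sketch, with illustrative variable names:

```python
from itertools import count, islice

# An infinite source: nothing is computed until something asks for a value.
evens = (n for n in count(1) if n % 2 == 0)
squares = (n * n for n in evens)

# islice pulls just the first five results, then stops.
first_five = list(islice(squares, 5))
print(first_five)  # [4, 16, 36, 64, 100]
```

Calling list(squares) directly would never finish, since count(1) has no end; islice is what bounds the work.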