Advanced Data Structures
Built-in Collections
Python's foundational data structures provide efficient storage and retrieval mechanisms for different use cases.
| Type | Syntax | Features |
|---|---|---|
| List | [] | Ordered, mutable, allows duplicates |
| Tuple | () | Ordered, immutable, hashable if all elements are hashable |
| Set | {1, 2} or set() | Unique elements, unordered, fast membership tests |
| Dict | {k: v} | Key-value pairs, O(1) average-case access |
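The behaviours summarised in the table can be sketched in a few lines. The variable names and values below are illustrative only and do not come from the text above.

```python
# Illustrative values only; the names are assumptions for this sketch.
items = [3, 1, 3]            # list: ordered, mutable, duplicates kept
items.append(2)

point = (1, 2)               # tuple: immutable, so usable as a dict key
locations = {point: "home"}

unique = set(items)          # set: duplicates are dropped
empty_set = set()            # note: {} creates an empty dict, not a set

ages = {"alice": 30}         # dict: key-value lookup
ages["bob"] = 25

print(items, locations, unique, ages)
```

Note that an empty set must be written as set(), because empty braces are reserved for the empty dict.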
Collections Module
High-performance container datatypes that extend Python's built-in collections:
namedtuple()
Create tuple subclasses with named fields for better code readability:
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(11, y=22)
print(p.x, p.y) # 11 22
deque
Double-ended queue with O(1) appends/pops from both ends:
from collections import deque
d = deque([1, 2, 3])
d.appendleft(0)
d.append(4)
# deque([0, 1, 2, 3, 4])
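Two other deque features often come up alongside the O(1) end operations: a bounded length and rotation. The following is a minimal sketch with made-up values.

```python
from collections import deque

# maxlen makes the deque a fixed-size window: old items fall off the left.
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)

d = deque([1, 2, 3, 4])
d.rotate(1)              # shift every element one step to the right
first = d.popleft()      # O(1) removal from the left end

print(list(recent), first, list(d))
```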
Counter
Count hashable objects efficiently:
from collections import Counter
words = ['red', 'blue', 'red', 'green', 'blue', 'blue']
counter = Counter(words)
print(counter.most_common(2))
# [('blue', 3), ('red', 2)]
defaultdict
Dictionary with default values for missing keys:
from collections import defaultdict
dd = defaultdict(list)
dd['colors'].append('red')
# No KeyError, creates empty list
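A common use of the default-value behaviour is grouping and counting without first checking whether a key exists. The word list below is invented for illustration.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# Group words by first letter; a missing key starts as an empty list.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# Count by first letter; a missing key starts at int() == 0.
letter_counts = defaultdict(int)
for word in words:
    letter_counts[word[0]] += 1

print(dict(by_letter))
print(dict(letter_counts))
```

Membership tests such as `"z" in by_letter` do not create a key; only indexing with `[]` does.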
Control Flow Mastery
Conditional Statements
Python's control flow structures enable decision-making in your code:
# Standard if-elif-else
score = 85
if score >= 90:
    grade = 'A'
elif score >= 80:
    grade = 'B'
elif score >= 70:
    grade = 'C'
else:
    grade = 'F'
# Ternary operator
status = 'Pass' if score >= 60 else 'Fail'
# Structural pattern matching (Python 3.10+)
status_code = 404  # example value
match status_code:
    case 200:
        print("Success")
    case 404:
        print("Not Found")
    case _:
        print("Other status")
Loops & Iteration
For Loops
# Iterate over sequence
for item in [1, 2, 3]:
    print(item)
# With index using enumerate
for idx, value in enumerate(['a', 'b', 'c']):
    print(f"{idx}: {value}")
# Iterate over dictionary
my_dict = {'a': 1, 'b': 2}  # example data
for key, value in my_dict.items():
    print(f"{key} = {value}")
While Loops
count = 0
while count < 5:
    print(count)
    count += 1
Loop Control
- break - Exit the loop immediately
- continue - Skip to the next iteration
- else - Execute if the loop completes normally (without break)
for num in range(10):
    if num == 3:
        continue  # Skip 3
    if num == 7:
        break  # Stop at 7
    print(num)
else:
    # Not reached here: the loop exits via break, so else is skipped
    print("Loop finished")
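The loop's else clause is most useful for searches, where it marks the "nothing found" case. A minimal sketch, using a hypothetical helper name:

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            found = n
            break            # leaving via break skips the else block
    else:
        found = None         # runs only when the loop was never broken
    return found

print(find_first_negative([1, -2, 3]))
print(find_first_negative([1, 2]))
```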
Exception Handling
Gracefully handle errors and edge cases:
try:
    result = 10 / 0
except ZeroDivisionError as e:
    print(f"Error: {e}")
except Exception as e:
    print(f"Unexpected: {e}")
else:
    print("No errors")
finally:
    print("Always runs")
Catch specific exception types rather than using bare except: clauses.
Custom Exceptions
class ValidationError(Exception):
    pass

def validate_age(age):
    if age < 0:
        raise ValidationError("Age cannot be negative")
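A custom exception pays off when callers catch it by name. The sketch below repeats the definitions so it runs on its own; the `safe_age` helper and the return value of `validate_age` are assumptions added for illustration.

```python
class ValidationError(Exception):
    """Raised when input fails validation."""

def validate_age(age):
    if age < 0:
        raise ValidationError("Age cannot be negative")
    return age  # assumption: return the value when it is valid

def safe_age(raw):
    # Hypothetical helper: report the problem and fall back to None.
    try:
        return validate_age(raw)
    except ValidationError as exc:
        print(f"Invalid input: {exc}")
        return None

print(safe_age(30))
print(safe_age(-1))
```

Because only ValidationError is caught, unrelated bugs such as a TypeError from a non-numeric argument still surface.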
Pythonic Comprehensions
Comprehensions provide concise syntax for creating sequences, offering both clarity and performance benefits over traditional loops.
List Comprehensions
Create lists in a single, readable line:
# Basic list comprehension
squares = [x**2 for x in range(10)]
# With conditional filter
even_squares = [x**2 for x in range(10) if x % 2 == 0]
# Nested comprehension
matrix = [[i*j for j in range(3)] for i in range(3)]
# If-else in comprehension
labels = ['even' if x % 2 == 0 else 'odd' for x in range(5)]
Dictionary Comprehensions
Build dictionaries efficiently:
# Create dict from two lists
keys = ['a', 'b', 'c']
values = [1, 2, 3]
mapping = {k: v for k, v in zip(keys, values)}
# Transform dictionary values
original = {'name': 'john', 'city': 'nyc'}
uppercase = {k: v.upper() for k, v in original.items()}
# Filter dictionary
scores = {'Alice': 85, 'Bob': 92, 'Charlie': 78}
high_scores = {k: v for k, v in scores.items() if v >= 80}
Set Comprehensions
Generate unique collections:
# Unique squares
unique_squares = {x**2 for x in [-2, -1, 0, 1, 2]}
# Result: {0, 1, 4}
Generator Expressions
Memory-efficient iteration for large datasets:
# Generator (uses parentheses)
gen = (x**2 for x in range(1000000))
# Lazy evaluation - values computed on demand
for value in gen:
    if value > 100:
        break
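The memory difference can be checked with sys.getsizeof, which reports the size of the container object itself. This is a rough sketch; exact byte counts vary by Python version, so only the comparison is shown.

```python
import sys

squares_list = [x**2 for x in range(100_000)]
squares_gen = (x**2 for x in range(100_000))

# The list stores every element; the generator stores only its state.
list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)
print(list_is_bigger)

# A generator can be consumed only once.
total = sum(squares_gen)
leftover = list(squares_gen)   # already exhausted, so this is empty
print(total, leftover)
```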
Object Oriented Programming
The Anatomy of a Class
Classes are blueprints for creating objects with shared attributes and behaviors:
class ProjectManager:
    # Class variable (shared across instances)
    company = "TechCorp"

    def __init__(self, name, budget):
        # Instance variables
        self.name = name
        self.__budget = budget  # Private attribute (name-mangled)
        self._projects = []  # Protected by convention

    # Property decorator for controlled access
    @property
    def budget(self):
        return self.__budget

    @budget.setter
    def budget(self, value):
        if value < 0:
            raise ValueError("Budget cannot be negative")
        self.__budget = value

    # Instance method
    def add_project(self, project):
        self._projects.append(project)
        return f"Added {project}"

    # Class method
    @classmethod
    def from_config(cls, config_dict):
        return cls(config_dict['name'], config_dict['budget'])

    # Static method
    @staticmethod
    def calculate_tax(amount):
        return amount * 0.15
Usage Example:
pm = ProjectManager("Alice", 50000)
print(pm.budget) # 50000
pm.budget = 60000 # Using setter
pm.add_project("Website Redesign")
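The property pattern also covers validation at construction time and read-only computed values. The `Thermostat` class below is a hypothetical example, not part of the code above.

```python
class Thermostat:  # hypothetical class for illustration
    def __init__(self, celsius):
        self.celsius = celsius          # routed through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("Below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):               # no setter, so it is read-only
        return self._celsius * 9 / 5 + 32

t = Thermostat(100)
try:
    t.celsius = -300                    # rejected by the setter
    rejected = False
except ValueError:
    rejected = True

print(t.celsius, t.fahrenheit, rejected)
```

Assigning through `self.celsius` in `__init__` means the same check applies to the initial value.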
Access Modifiers & Encapsulation
| Naming | Convention | Access Level |
|---|---|---|
| public_attr | No prefix | Fully accessible |
| _protected | Single underscore | Internal use (by convention) |
| __private | Double underscore | Name mangled (pseudo-private) |
Double-underscore attributes are not truly private: Python renames them to _ClassName__attribute.
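Name mangling can be observed directly. The `Account` class is a made-up example.

```python
class Account:  # hypothetical class for illustration
    def __init__(self, balance):
        self.__balance = balance        # stored as _Account__balance

acct = Account(100)

try:
    acct.__balance                      # the plain name does not exist outside the class
    direct_access_worked = True
except AttributeError:
    direct_access_worked = False

mangled_value = acct._Account__balance  # works, but discouraged
print(direct_access_worked, mangled_value)
```

Mangling is meant to avoid accidental name clashes in subclasses, not to enforce security.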
Data Classes (Python 3.7+)
Simplified class creation for data storage:
from dataclasses import dataclass

@dataclass
class Employee:
    name: str
    department: str
    salary: float
    active: bool = True

# Auto-generates __init__, __repr__, __eq__, etc.
emp = Employee("Bob", "Engineering", 75000)
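The generated methods can be seen in action as follows. The `skills` field is a hypothetical addition to show `field(default_factory=...)`, which is how mutable defaults are normally declared.

```python
from dataclasses import dataclass, field

@dataclass
class Employee:
    name: str
    department: str
    salary: float
    active: bool = True
    skills: list = field(default_factory=list)  # hypothetical extra field

a = Employee("Bob", "Engineering", 75000)
b = Employee("Bob", "Engineering", 75000)

print(a)                      # generated __repr__
same = (a == b)               # generated __eq__ compares field values
a.skills.append("python")     # each instance gets its own list
print(same, b.skills)
```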
Inheritance & Mixins
Single Inheritance
Child classes inherit attributes and methods from parent classes:
class Vehicle:
    def __init__(self, brand, model):
        self.brand = brand
        self.model = model

    def start(self):
        return f"{self.brand} {self.model} is starting"

class ElectricCar(Vehicle):
    def __init__(self, brand, model, battery_capacity):
        super().__init__(brand, model)
        self.battery_capacity = battery_capacity

    def charge(self):
        return f"Charging {self.battery_capacity}kWh battery"

tesla = ElectricCar("Tesla", "Model 3", 75)
print(tesla.start())  # Inherited method
print(tesla.charge())  # New method
Multiple Inheritance & MRO
Python supports multiple inheritance and uses the Method Resolution Order (MRO) to decide which parent class is searched first when a method is looked up:
class Flyable:
    def fly(self):
        return "Flying in the sky"

class Swimmable:
    def swim(self):
        return "Swimming in water"

class Duck(Flyable, Swimmable):
    def quack(self):
        return "Quack!"

duck = Duck()
print(duck.fly())
print(duck.swim())
print(Duck.__mro__)  # View resolution order
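The MRO matters most when two parents define the same method name. In this sketch both mixins define a hypothetical `move` method, and the order of the base classes decides which one wins.

```python
class Flyable:
    def move(self):          # hypothetical method shared by both mixins
        return "flying"

class Swimmable:
    def move(self):
        return "swimming"

class Duck(Flyable, Swimmable):
    pass

class Penguin(Swimmable, Flyable):
    pass

mro_names = [cls.__name__ for cls in Duck.__mro__]
print(Duck().move(), Penguin().move(), mro_names)
```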
Super() Deep Dive
The super() function is used to call methods from parent classes without naming them explicitly:
class Parent:
    def __init__(self, name):
        self.name = name
        print(f"Parent init: {name}")

class Child(Parent):
    def __init__(self, name, age):
        super().__init__(name)  # Call parent __init__
        self.age = age
        print(f"Child init: {age}")
Cooperative Multiple Inheritance
class A:
    def method(self):
        print("A")
        super().method()  # Cooperative: resolves to B when called on a C

class B:
    def method(self):
        print("B")

class C(A, B):
    def method(self):
        print("C")
        super().method()

c = C()
c.method()  # Output: C, A, B (follows MRO)
Abstract Base Classes
Define interfaces that subclasses must implement:
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        pass

    @abstractmethod
    def perimeter(self):
        pass

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def perimeter(self):
        return 2 * (self.width + self.height)

# Cannot instantiate Shape directly
# shape = Shape()  # TypeError
rect = Rectangle(5, 3)
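The enforcement happens at instantiation time and also applies to subclasses that leave an abstract method unimplemented. The `Square`, `Blob`, and `try_create` names below are assumptions for this sketch.

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        ...

class Square(Shape):            # implements every abstract method
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

class Blob(Shape):              # forgets to implement area()
    pass

def try_create(cls, *args):
    # Hypothetical helper: return None when the class cannot be instantiated.
    try:
        return cls(*args)
    except TypeError:
        return None

print(try_create(Shape), try_create(Blob), try_create(Square, 3).area())
```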
Magic Methods (Dunder Methods)
Magic methods (double underscore methods) allow you to define how objects behave with Python's built-in operations:
Object Lifecycle
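- __init__(self) - Initializer, sets up a newly created instance
- __new__(cls) - Instance creation, runs before __init__
- __del__(self) - Finalizer, called when the object is about to be destroyed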
String Representation
- __str__(self) - Human-readable string
- __repr__(self) - Developer-friendly representation
Comparison Operators
- __eq__(self, other) - Equal to (==)
- __lt__(self, other) - Less than (<)
- __le__(self, other) - Less than or equal (<=)
- __gt__(self, other) - Greater than (>)
- __ge__(self, other) - Greater than or equal (>=)
Arithmetic Operators
- __add__(self, other) - Addition (+)
- __sub__(self, other) - Subtraction (-)
- __mul__(self, other) - Multiplication (*)
- __truediv__(self, other) - Division (/)
- __mod__(self, other) - Modulo (%)
Container Methods
- __len__(self) - Length
- __getitem__(self, key) - Indexing []
- __setitem__(self, key, value) - Assignment []
- __contains__(self, item) - Membership (in)
- __iter__(self) - Make iterable
Practical Example
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar):
        return Vector(self.x * scalar, self.y * scalar)

    def __eq__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented  # let Python try the other operand
        return self.x == other.x and self.y == other.y

    def __len__(self):
        # Illustrative only: magnitude truncated to an int
        return int((self.x**2 + self.y**2)**0.5)

# Usage
v1 = Vector(2, 3)
v2 = Vector(1, 1)
v3 = v1 + v2  # Vector(3, 4)
v4 = v1 * 2  # Vector(4, 6)
print(len(v3))  # 5
Implement __enter__ and __exit__ to create objects that work with the with statement.
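A minimal sketch of the context manager protocol follows. The `Tracker` class is hypothetical and simply records the order in which its methods run, including when the body raises.

```python
class Tracker:
    """Hypothetical context manager that records enter/exit events."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self                     # bound to the name after 'as'

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")      # runs even if the body raised
        return False                    # False: do not suppress the exception

tracker = Tracker()
try:
    with tracker as t:
        t.events.append("body")
        raise ValueError("boom")
except ValueError:
    tracker.events.append("handled")

print(tracker.events)
```

Returning a true value from `__exit__` would swallow the exception, which is rarely what you want.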
Standard Library: OS & Sys
OS Module
Interface with the operating system for file and directory operations:
import os
# Directory operations
current_dir = os.getcwd() # Get current directory
os.chdir('/path/to/dir') # Change directory
os.mkdir('new_folder') # Create directory
os.makedirs('path/to/nested/dir', exist_ok=True)
# File operations
os.remove('file.txt') # Delete file
os.rename('old.txt', 'new.txt') # Rename
# Path operations
print(os.path.exists('file.txt')) # Check existence
print(os.path.isfile('data.csv')) # Is it a file?
print(os.path.isdir('folder')) # Is it a directory?
print(os.path.join('dir', 'file.txt')) # Join paths
# Environment variables
home = os.environ.get('HOME')
os.environ['CUSTOM_VAR'] = 'value'
Listing Directory Contents
# List all files and folders
items = os.listdir('.')
# Walk directory tree
for root, dirs, files in os.walk('.'):
    for file in files:
        full_path = os.path.join(root, file)
        print(full_path)
Sys Module
System-specific parameters and functions:
import sys
# Python version info
print(sys.version) # Full version string
print(sys.version_info) # Version tuple
# Command line arguments
print(sys.argv) # List of arguments
# Example: python script.py arg1 arg2
# sys.argv = ['script.py', 'arg1', 'arg2']
# Module search path
print(sys.path) # List of import paths
sys.path.append('/custom/path')
# Standard streams
sys.stdout.write('Output\n')
sys.stderr.write('Error message\n')
# Exit program
sys.exit(0) # Exit code 0 = success
Platform Information
print(sys.platform) # 'linux', 'darwin', 'win32'
print(sys.maxsize) # Largest container index (Py_ssize_t max); Python ints themselves are unbounded
print(sys.executable) # Python interpreter path
Use sys.getsizeof(obj) to check the memory usage of an object. The result is shallow: objects it refers to are not counted.
Pathlib Reference
Object-oriented filesystem paths - the modern, Pythonic way to handle file paths:
from pathlib import Path
# Create path objects
current = Path.cwd() # Current directory
home = Path.home() # Home directory
path = Path('data/files/report.txt')
# Path properties
print(path.name) # 'report.txt'
print(path.stem) # 'report'
print(path.suffix) # '.txt'
print(path.parent) # 'data/files'
print(path.parts) # ('data', 'files', 'report.txt')
# Join paths (modern way)
config_path = Path('config') / 'settings.json'
log_path = Path.home() / '.logs' / 'app.log'
File Operations
Reading & Writing
path = Path('data.txt')
# Read entire file
content = path.read_text()
data = path.read_bytes()
# Write to file
path.write_text('Hello World')
path.write_bytes(b'\x89PNG')
# Append to file
with path.open('a') as f:
    f.write('New line\n')
Checking & Creating
# Check existence
if path.exists():
    print("File exists")
if path.is_file():
    print("It's a file")
if path.is_dir():
    print("It's a directory")
# Create directories
Path('logs').mkdir(exist_ok=True)
Path('a/b/c').mkdir(parents=True)
Globbing & Iteration
# Find all Python files
py_files = Path('.').glob('*.py')
for file in py_files:
    print(file)
# Recursive search
all_txt = Path('.').rglob('*.txt') # Search subdirectories
# Iterate directory contents
for item in Path('.').iterdir():
    if item.is_file():
        print(f"File: {item.name}")
Advanced Operations
# Resolve absolute path
abs_path = path.resolve()
# Get file stats
stats = path.stat()
print(f"Size: {stats.st_size} bytes")
print(f"Modified: {stats.st_mtime}")
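# Rename/move file (returns the new Path on Python 3.8+)
new_path = path.rename('new_name.txt')
# Delete file (use the new path; the old name no longer exists)
new_path.unlink()
# Remove an empty directory ('empty_dir' is a placeholder name)
Path('empty_dir').rmdir()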
Prefer pathlib.Path over os.path for modern Python code. It is more readable and handles platform-specific path separators for you.
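To make the comparison concrete, here is the same small task written both ways. It is a sketch that works in a temporary directory; the `logs/app.log` layout is an invented example.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # os.path style: string paths plus free functions
    old_style = os.path.join(tmp, "logs", "app.log")
    os.makedirs(os.path.dirname(old_style), exist_ok=True)
    with open(old_style, "w") as f:
        f.write("started\n")

    # pathlib style: one object with operators, methods and properties
    new_style = Path(tmp) / "logs" / "app.log"
    same_file = (new_style == Path(old_style))
    content = new_style.read_text()
    suffix = new_style.suffix

print(same_file, repr(content), suffix)
```

Path objects are accepted by `open()` and by most `os` functions, so the two styles can be mixed during a gradual migration.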
Itertools - Efficient Iteration Tools
The itertools module provides memory-efficient tools for working with iterators:
Infinite Iterators
from itertools import count, cycle, repeat
# count(start, step) - infinite counter
for i in count(10, 2):
    if i > 20:
        break
    print(i)  # 10, 12, 14, 16, 18, 20
# cycle(iterable) - infinite cycle
colors = cycle(['red', 'green', 'blue'])
# Cycles forever: red, green, blue, red...
# repeat(value, times) - repeat value
list(repeat(10, 3)) # [10, 10, 10]
Combinatoric Iterators
from itertools import (
permutations,
combinations,
product
)
# All permutations
list(permutations([1, 2, 3], 2))
# [(1,2), (1,3), (2,1), (2,3), (3,1), (3,2)]
# Combinations (no repeats)
list(combinations([1, 2, 3], 2))
# [(1,2), (1,3), (2,3)]
# Cartesian product
list(product('AB', [1, 2]))
# [('A',1), ('A',2), ('B',1), ('B',2)]
Filtering & Slicing
from itertools import filterfalse, islice, takewhile, dropwhile
# filterfalse - opposite of filter
data = [1, 2, 3, 4, 5, 6]
list(filterfalse(lambda x: x % 2 == 0, data)) # [1, 3, 5]
# islice - slice iterator without loading into memory
list(islice(count(), 5, 10)) # [5, 6, 7, 8, 9]
# takewhile - take elements while condition is true
list(takewhile(lambda x: x < 5, [1, 4, 6, 2, 1])) # [1, 4]
# dropwhile - drop elements while condition is true
list(dropwhile(lambda x: x < 5, [1, 4, 6, 2, 1])) # [6, 2, 1]
Grouping & Chaining
from itertools import groupby, chain
# groupby - group consecutive elements
data = [1, 1, 2, 2, 2, 3, 1]
for key, group in groupby(data):
    print(f"{key}: {list(group)}")
# 1: [1, 1]
# 2: [2, 2, 2]
# 3: [3]
# 1: [1]
# chain - combine multiple iterables
list(chain([1, 2], [3, 4], [5])) # [1, 2, 3, 4, 5]
# chain.from_iterable - flatten nested iterables
nested = [[1, 2], [3, 4], [5]]
list(chain.from_iterable(nested)) # [1, 2, 3, 4, 5]
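Because groupby only merges consecutive elements, unsorted input usually needs to be sorted by the same key first. A small sketch with made-up data and a hypothetical `first_letter` key function:

```python
from itertools import groupby

words = ["banana", "apple", "blueberry", "avocado", "cherry"]

def first_letter(word):
    return word[0]

# Without sorting, the same key shows up in several separate groups.
unsorted_keys = [key for key, _ in groupby(words, key=first_letter)]

# Sorting by the same key first yields exactly one group per letter.
grouped = {
    key: list(group)
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
}

print(unsorted_keys)
print(grouped)
```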
Practical Examples
from itertools import zip_longest, accumulate
import operator
# zip_longest - zip with fillvalue for unequal lengths
list(zip_longest([1, 2], ['a', 'b', 'c'], fillvalue=0))
# [(1, 'a'), (2, 'b'), (0, 'c')]
# accumulate - running totals
list(accumulate([1, 2, 3, 4])) # [1, 3, 6, 10]
# accumulate with custom function
list(accumulate([1, 2, 3, 4], operator.mul)) # [1, 2, 6, 24]