This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This is a Python workspace for Jupyter notebooks that accompany blog posts on the parent Jekyll site (git-steven.github.io). Notebooks cover data visualization, PySpark, scikit-learn feature engineering, coupling metrics analysis, and architectural proposals (IoC/DI patterns, DIKW frameworks).
This directory lives inside the parent Jekyll repo but has its own Python toolchain (Poetry, .venv, pyproject.toml).
# Install dependencies (uses Poetry with pyproject.toml)
poetry install
# Start JupyterLab
bin/start # or: poetry run jupyter lab
# Run tests
poetry run pytest
# Run a single test file
poetry run pytest tests/test_foo.py
# Run a single test
poetry run pytest tests/test_foo.py::test_bar
# Lint
poetry run ruff check .
# Build Spark Docker image (for PySpark notebooks)
./build-spark.sh
# Start Spark cluster
docker compose up
# Shell into Spark container
./docker-sh.sh
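The pytest commands above select a file (`tests/test_foo.py`) and a single test by node ID (`tests/test_foo.py::test_bar`). As a minimal sketch of what those selectors would target, assuming a `tests/` directory is added later, a test file could look like the following. The helper `normalize_title` and the test `test_baz` are hypothetical names used only for illustration; nothing like them exists in this workspace today.

```python
# tests/test_foo.py (hypothetical file; this workspace has no tests/ directory yet)


def normalize_title(title: str) -> str:
    """Hypothetical helper: collapse runs of whitespace and lowercase a title."""
    return " ".join(title.split()).lower()


def test_bar():
    # Selected on its own by: poetry run pytest tests/test_foo.py::test_bar
    assert normalize_title("  PySpark   Tutorial ") == "pyspark tutorial"


def test_baz():
    # Collected when the whole file runs, skipped when only test_bar is selected.
    assert normalize_title("DIKW") == "dikw"
```

In a real test file the helper would normally be imported from the code under test rather than defined next to the tests; it is inlined here only to keep the sketch self-contained.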
This is not a traditional Python application. Content is organized as standalone Jupyter notebooks at the root level, each supporting a blog post topic:
pyspark01.ipynb / pyspark02.ipynb - PySpark tutorials using Harry Potter datasets
sklearn-feature-engineering*.ipynb - scikit-learn feature engineering
coupling-metrics-*.ipynb - Software coupling metrics analysis and visualization
fast-api-sqs-labmda.ipynb - FastAPI + SQS + Lambda patterns
dikw.ipynb - DIKW pyramid visualization
ioc_proposal_notebook.ipynb - IoC/dependency injection proposal

pytest coverage flag: --cov=modgud (coverage target from a related project; may need updating for local use)

PySpark notebooks use a Dockerized Spark cluster:
Dockerfile builds terracoil/spark image (bitnami/spark + graphframes)
docker-compose.yml defines master/worker topology
conf/ and spark-defaults.conf for Spark configuration
data/ contains CSV datasets (Harry Potter scripts/characters, Medium search data, Game of Thrones scripts)

terracoil/ - Python package stub (empty __init__.py)
ioc_proposal_one_class_per_file.md - Detailed IoC design document
dikw.md - DIKW framework writeup
coupling-metrics-complete.drawio - draw.io diagram for coupling metrics article

The pyproject.toml pytest config targets --cov=modgud and testpaths = ["tests"]. There is currently no tests/ directory or modgud package in this workspace; these settings carry over from a sibling project. When adding tests here, update addopts to match the correct package name or remove the coverage flag.