End-to-End ML Pipeline with SageMaker Pipelines

Overview

SageMaker Pipelines lets you define a directed acyclic graph (DAG) of ML steps that SageMaker executes, tracks, and makes reproducible. This project builds a complete pipeline over a retail sales dataset: raw data in S3 goes in, predictions come out, with every intermediate artefact versioned and auditable.

The four steps in the pipeline:

PreprocessData → TrainModel → CreateInferenceModel → BatchInference

The Dataset

Walmart retail sales data with three source tables (features, sales, stores). The target is weekly sales per store, a regression problem. ...
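To make the DAG idea from the overview concrete, here is a minimal sketch of the four-step chain as a plain dependency graph. It uses only Python's standard-library `graphlib`, not the SageMaker SDK, so it shows how an execution order falls out of declared dependencies and nothing more. The step names are the ones listed above; the `dependencies` mapping and the `execution_order` helper are hypothetical names introduced for illustration, and the strictly linear chain is an assumption read off the arrows.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
# Step names come from the overview; the linear chain is assumed from the arrows.
dependencies = {
    "PreprocessData": set(),
    "TrainModel": {"PreprocessData"},
    "CreateInferenceModel": {"TrainModel"},
    "BatchInference": {"CreateInferenceModel"},
}


def execution_order(deps):
    """Return step names in an order that respects every dependency.

    TopologicalSorter raises graphlib.CycleError if the graph has a cycle,
    which is the 'acyclic' part of a DAG being enforced.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(" -> ".join(order))
```

Because every step here has exactly one predecessor, only one valid order exists. In a graph with independent branches, several orders would be valid and an orchestrator would be free to run those branches in parallel.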

September 1, 2025 · 3 min · 495 words · Kiprono Elijah