A segmentation & early warning approach — analyzing 143 weeks of sales data across 45 stores and 81 departments to flag which store-department pairs will underperform before it happens.
As part of a 5-person analytics team, I led the underperformance definition framework and the feature engineering pipeline — designing the logic that separates true performance failure from structural or seasonal noise. My focus was building a model that gives Walmart store managers a meaningful early warning signal, not just a ranking of who sold the least.
I also drove the segmentation analysis to distinguish localized department-level issues from store-wide declines, and contributed to the final stakeholder recommendations.
Walmart's existing approach identified bottom-performing stores by raw sales rank — but "who sold the least" is not the same as underperformance. That method had three critical blind spots.
We designed a three-phase framework to move Walmart from reactive reporting to proactive, context-aware flagging of store-department combinations at risk.
Sourced from Kaggle's Walmart Recruiting: Store Sales Forecasting competition — covering nearly 3 years of anonymized weekly sales, store characteristics, and macro-economic features.
| File | Contents | Key Stats |
|---|---|---|
| sales.csv | Weekly sales by store and department | 421,570 observations · Feb 2010 – Oct 2012 |
| stores.csv | Store type (A/B/C) and physical size | 45 stores · Types: Supercenter, Discount, Neighborhood |
| features.csv | Economic & promotional markdowns, CPI, fuel prices, temperature | 5 markdown fields · Holiday flags · Macro indicators |
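The three files join on shared keys. A minimal pandas sketch of the join logic (column names follow the Kaggle schema; the tiny inline frames are stand-ins for the real CSVs):

```python
import pandas as pd

# Stand-in rows mirroring the Kaggle schema (the real sales file has 421,570 rows).
sales = pd.DataFrame({
    "Store": [1, 1], "Dept": [1, 1],
    "Date": ["2010-02-05", "2010-02-12"],
    "Weekly_Sales": [24924.50, 46039.49],
})
stores = pd.DataFrame({"Store": [1], "Type": ["A"], "Size": [151315]})
features = pd.DataFrame({
    "Store": [1, 1], "Date": ["2010-02-05", "2010-02-12"],
    "CPI": [211.10, 211.24], "Fuel_Price": [2.572, 2.548],
})

# stores joins on Store; features joins on Store + Date.
df = (sales
      .merge(stores, on="Store", how="left")
      .merge(features, on=["Store", "Date"], how="left"))
print(df.shape)  # (2, 8)
```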
We engineered 11 features across four conceptual groups — each designed to capture a different dimension of underperformance risk that raw sales data misses.
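Four of those features reappear in the importance discussion below (Residual_Z, Rolling_Std_13w, CV_13w, Drop_4w). A hedged sketch of how they might be computed per store × department series — the window lengths match the feature names, but the baseline definitions here are assumptions (in particular, Residual_Z uses the series' own rolling mean as a stand-in for a peer-group baseline):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic weekly sales for one store × department pair.
df = pd.DataFrame({
    "Store": 1, "Dept": 1,
    "Weekly_Sales": rng.normal(20_000, 2_000, size=52),
})

g = df.groupby(["Store", "Dept"])["Weekly_Sales"]
df["Rolling_Std_13w"] = g.transform(lambda s: s.rolling(13).std())
df["Rolling_Mean_13w"] = g.transform(lambda s: s.rolling(13).mean())
df["CV_13w"] = df["Rolling_Std_13w"] / df["Rolling_Mean_13w"]  # volatility relative to level
df["Drop_4w"] = g.transform(lambda s: s / s.shift(4) - 1)      # sharp 4-week decline signal
# Residual_Z: how far this week sits below its baseline, in standard deviations.
df["Residual_Z"] = (df["Weekly_Sales"] - df["Rolling_Mean_13w"]) / df["Rolling_Std_13w"]
```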
Before modeling, the exploratory analysis surfaced three critical patterns that shaped our entire prediction strategy.
We trained a Decision Tree classifier (CART) on a temporal 80/20 split — preserving the time sequence rather than randomizing — to predict whether each store × department pair would underperform the following week.
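The temporal split and CART setup can be sketched as follows — a minimal illustration on synthetic data, using scikit-learn's `DecisionTreeClassifier` (its CART implementation); the feature columns and `max_depth` are assumptions, not the project's actual configuration:

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 500
# Synthetic panel already sorted by week (stand-in for the real engineered features).
df = pd.DataFrame({
    "week": np.arange(n),
    "Residual_Z": rng.normal(0, 1, n),
    "CV_13w": rng.uniform(0.05, 0.4, n),
    "Drop_4w": rng.normal(0, 0.1, n),
    "underperform_next_week": rng.integers(0, 2, n),
})

# Temporal 80/20 split: train on the first 80% of weeks, test on the last 20%,
# so the model never sees the future during training.
cut = int(len(df) * 0.8)
features = ["Residual_Z", "CV_13w", "Drop_4w"]
X_train, y_train = df.loc[:cut - 1, features], df.loc[:cut - 1, "underperform_next_week"]
X_test, y_test = df.loc[cut:, features], df.loc[cut:, "underperform_next_week"]

clf = DecisionTreeClassifier(max_depth=5, random_state=0)  # CART
clf.fit(X_train, y_train)
risk = clf.predict_proba(X_test)[:, 1]  # underperformance probability per pair-week
```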
Base Model Performance
Residual_Z — highest importance (0.30). Detects when sales fall below peer expectations, flagging localized failure vs. market trends.
Rolling_Std_13w / CV_13w — second- and third-ranked. Departments with volatile sales face significantly higher underperformance risk.
Drop_4w — sudden sharp declines proved stronger predictors than gradual drift.
The final model outputs a store × department risk probability matrix. Store 43 – Dept 52 showed a 0.96 predicted probability — the highest in the test period, enabling proactive management intervention.
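Assembling that matrix from per-pair predictions is a straightforward pivot; the sketch below uses illustrative numbers (only the Store 43 – Dept 52 value of 0.96 comes from the text, and the column layout is an assumption):

```python
import pandas as pd

# Illustrative predictions; only the 0.96 figure is from the write-up.
preds = pd.DataFrame({
    "Store": [42, 42, 43, 43],
    "Dept":  [51, 52, 51, 52],
    "risk":  [0.12, 0.31, 0.08, 0.96],
})

# Store × department risk matrix: rows = stores, columns = departments.
matrix = preds.pivot(index="Store", columns="Dept", values="risk")
top = preds.loc[preds["risk"].idxmax()]
print(f"Highest risk: Store {top.Store} - Dept {top.Dept} ({top.risk:.2f})")
```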
A weekly early warning list of the highest-risk store-department combinations, enabling proactive inventory adjustments and targeted promotions before a sales decline becomes visible in standard reporting.
Portfolio-wide risk visibility that distinguishes systemic regional issues from isolated store problems — enabling smarter resource allocation and escalation decisions.
A reproducible, extensible feature engineering framework that can incorporate new data signals (geographic enrichment, department name mapping) to further improve prediction quality.
Moving from reactive ranking to proactive classification requires context-aware baselines. Without accounting for store type, seasonality, and momentum, you're not measuring performance — you're measuring size.