Airline Baggage Delivery Analysis

Executive Summary

This analysis examined baggage delivery times across 99,174 flight records, focusing on data quality, performance trends, and probability modeling. After identifying and removing 15,057 problematic records (15.18%) due to duplicates, unrealistic timestamps, and invalid routing, the cleaned dataset of 84,117 records provided a more reliable basis for the results below.

Average Delivery Time: 16.1 min
Under 21 Minutes (Empirical): 81.27%
Under 21 Minutes (Theoretical): 78.55%
Data Issues Identified: 15.18%

Data Quality Assessment

Issues Identified in Dataset

Total problematic records: 15,057 out of 99,174 (15.18%)

Duplicate Records

8,380 duplicate entries removed to prevent double-counting in performance metrics and average time calculations.

Unrealistic BagDropDurations

Records showing bag drop durations under 10 seconds with bag counts greater than 10, indicating scanning or recording errors.

Invalid Processing Times

AverageTimePerBag values of 0 or 1 second indicate implausibly fast processing rates from incorrect timestamps or missing intervals.

Same Origin & Destination

Some entries had identical airport codes for origin and destination, likely test records not representing real flights.

FirstBagDropTime Anomalies

First bag logged less than 1 minute after arrival—operationally unlikely given typical deplaning and unloading times.

Data Cleaning Approach

Tools Used: Excel for data cleaning and preparation, StatsTool for statistical analysis

Key Steps:

Removed duplicate records using Excel's "Remove Duplicates" tool
Validated data types across all columns (time formats, numeric bag counts)
Created calculated field: BaggageDeliveryTime = LastBagDropTime - ActualArrival
Applied filtering rules based on operational assumptions
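
The issues and steps above can be sketched in Python as a rough illustration. This is not the Excel workflow that was actually used. BagDropDuration, AverageTimePerBag, FirstBagDropTime, LastBagDropTime and ActualArrival come from the report, while BagCount, Origin, Destination, the choice of datetime/timedelta types, and the function names are assumptions made for the example.

```python
from datetime import datetime, timedelta

def rule_violations(rec):
    """Return the names of the filtering rules a record violates (empty = keep)."""
    reasons = []
    # Bag drop finished in under 10 seconds despite more than 10 bags.
    if rec["BagDropDuration"] < timedelta(seconds=10) and rec["BagCount"] > 10:
        reasons.append("unrealistic BagDropDuration")
    # Average handling time of 1 second or less per bag (covers the 0 s and 1 s cases).
    if rec["AverageTimePerBag"] <= timedelta(seconds=1):
        reasons.append("invalid processing time")
    # Identical airport codes, likely a test record.
    if rec["Origin"] == rec["Destination"]:
        reasons.append("same origin and destination")
    # First bag logged less than 1 minute after arrival.
    if rec["FirstBagDropTime"] - rec["ActualArrival"] < timedelta(minutes=1):
        reasons.append("FirstBagDropTime anomaly")
    return reasons

def clean(records):
    """Drop exact duplicates and rule violators, then add BaggageDeliveryTime."""
    seen = set()
    kept = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # whole-row identity, like "Remove Duplicates"
        if key in seen:
            continue
        seen.add(key)
        if rule_violations(rec):
            continue
        rec = dict(rec)  # copy so the input record is not modified
        rec["BaggageDeliveryTime"] = rec["LastBagDropTime"] - rec["ActualArrival"]
        kept.append(rec)
    return kept

# Made-up records, for illustration only.
arrival = datetime(2024, 1, 1, 12, 0, 0)
good = {
    "Origin": "AAA", "Destination": "BBB", "BagCount": 40,
    "ActualArrival": arrival,
    "FirstBagDropTime": arrival + timedelta(minutes=6),
    "LastBagDropTime": arrival + timedelta(minutes=16),
    "BagDropDuration": timedelta(minutes=10),
    "AverageTimePerBag": timedelta(seconds=15),
}
bad = dict(good, Destination="AAA")       # same origin and destination
cleaned = clean([good, dict(good), bad])  # one duplicate, one invalid record
print(len(cleaned), cleaned[0]["BaggageDeliveryTime"])
```

Returning the list of violated rules, rather than a single flag, makes it straightforward to count how many records each rule removes.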

Statistical Analysis

Descriptive Statistics

Sample Size: 84,117 flights (after cleaning)

Metric	Value	Interpretation
Mean	16:06 (16.1 minutes)	Average delivery time
Median	15:22 (15.37 minutes)	50th percentile delivery time
Standard Deviation	6:30 (6.5 minutes)	Typical variation from mean
Minimum	0:16 (0.27 minutes)	Fastest delivery observed
Maximum	39:59 (39.98 minutes)	Longest delivery observed
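
The Value column pairs each mm:ss time with its decimal-minute equivalent (minutes plus seconds divided by 60). A minimal sketch of that conversion follows; the helper name mmss_to_minutes is hypothetical.

```python
def mmss_to_minutes(text):
    """Convert an 'mm:ss' string to decimal minutes."""
    minutes, seconds = text.split(":")
    return int(minutes) + int(seconds) / 60

# Times taken from the descriptive statistics table.
for label, value in [("Mean", "16:06"), ("Median", "15:22"),
                     ("Standard Deviation", "6:30"),
                     ("Minimum", "0:16"), ("Maximum", "39:59")]:
    print(f"{label}: {value} = {mmss_to_minutes(value):.2f} minutes")
```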

Percentile Analysis

Percentile	Time	Interpretation
1%	03:41	1% take less than ~4 minutes
10%	08:39	10% take less than ~9 minutes
50%	15:22	Median delivery time
90%	24:39	90% are under ~25 minutes
99%	35:46	Top 1% take longer than ~36 minutes

Key Observation: The mean (16.1 min) is slightly higher than the median (15.37 min), which points to a right-skewed distribution: most deliveries are relatively fast, but a long tail of slower deliveries pulls the average upward.

Probability Analysis: Deliveries Under 21 Minutes

Comparative Results

Two methods were used to estimate the probability that baggage delivery time is less than 21 minutes:

📊 Empirical Approach

Result: 81.27%

Direct calculation counting actual observations where delivery time < 21 minutes.

= COUNTIF(CleanData, "<0:21:00") / 84117
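
The same count-and-divide calculation can be sketched in Python. The function name and the sample values are made up for illustration, and delivery times are assumed to be in decimal minutes.

```python
def empirical_probability(delivery_minutes, threshold=21.0):
    """Share of deliveries strictly faster than `threshold` minutes."""
    if not delivery_minutes:
        raise ValueError("no delivery times supplied")
    hits = sum(1 for t in delivery_minutes if t < threshold)
    return hits / len(delivery_minutes)

# Made-up sample: four of the six values are below 21 minutes.
sample = [8.5, 12.0, 15.4, 19.9, 21.0, 27.3]
print(f"{empirical_probability(sample):.4f}")
```

Dividing by the length of the data, rather than a typed-in record count, keeps the numerator and denominator tied to the same set of rows.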

Advantages:

No distributional assumptions
Simple to compute
Reflects actual observed frequencies

Disadvantages:

Sensitive to sample size
Cannot extrapolate beyond observed data

📈 Theoretical Approach (Log-Normal)

Result: 78.55%

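A log-normal distribution was used because of the right-skewness observed in the data.

Parameters (mean and standard deviation of the natural log of delivery time in minutes, as LOGNORM.DIST expects):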

LN Mean = 2.690
LN SD = 0.448

= LOGNORM.DIST(21, 2.690, 0.448, TRUE)
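
For readers without Excel, the log-normal CDF can be reproduced with Python's standard library, since P(X < x) for a log-normal variable equals the normal CDF evaluated at ln(x). This is a sketch; the function name lognormal_cdf is hypothetical. Because the parameters are rounded to three decimals, the result should land close to, but not necessarily exactly on, the reported 78.55%.

```python
from math import log
from statistics import NormalDist

def lognormal_cdf(x, mu, sigma):
    """P(X < x) when ln(X) is normal with mean `mu` and standard deviation `sigma`."""
    return NormalDist(mu, sigma).cdf(log(x))

# Parameters as reported above (rounded to three decimals).
p_under_21 = lognormal_cdf(21, 2.690, 0.448)
print(f"P(delivery < 21 min) = {p_under_21:.4f}")
```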

Advantages:

Generalizes to unseen data
Accounts for skewness better than normal distribution
Enables predictive calculations

Disadvantages:

Assumes data fits chosen distribution
Requires parameter estimation

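Why the 2.72-Point Difference?

The empirical method counts observations directly, so it reflects whatever shape the cleaned data actually has. The log-normal model replaces that shape with a smooth curve defined by two parameters. The fitted curve places more probability above 21 minutes (21.45%) than was observed (18.73%), which is why its estimate for deliveries under 21 minutes is 2.72 percentage points lower.

The empirical method captures all real-world variability in the cleaned data (including any remaining anomalies)
The log-normal model has an unbounded right tail, so it assigns some probability to times beyond the observed maximum of 39:59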

Bottom line: Both methods indicate that roughly 80% of flights (78.55% to 81.27%) meet the 21-minute target, which is a useful figure for operational planning.

Recommendations

Set a Reliability KPI Benchmark: Adopt 21 minutes as the key operational benchmark for baggage delivery, since ~80% of flights already meet this threshold. Establish regular monitoring to track performance against this KPI across airports and time windows.
Implement Real-Time Monitoring Dashboards: Build dashboards that flag outlier cases (>30 minutes) and surface trends by airport, airline, or time of day. Provide visibility to both operations teams and executives for proactive intervention.
Address Data Quality at the Source: Standardize data entry processes to reduce duplicate or unrealistic records (~15% of the dataset was invalid). Introduce automated validation rules to catch errors earlier.
Policy Adjustments for Consistency: Focus resources on the long-tail delays (top 10–20% of flights) where performance diverges significantly from the median. Pilot policy or staffing adjustments at airports with higher variance.
Continuous Improvement via A/B Testing: Use geo-based holdouts to test operational changes (e.g., staffing models, unloading processes) at select airports. Compare treatment vs. control performance to validate the impact of interventions before full rollout.
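
One possible way to compare treatment and control airports on the 21-minute KPI is a two-proportion z-test. The report does not specify a test, so this is an assumption, and the counts in the example are made up. The sketch also treats flights as independent, which may not hold when flights are grouped by airport.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for a difference between two on-time proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)  # shared rate under the null hypothesis
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: flights under 21 minutes out of all flights per group.
z, p_value = two_proportion_z_test(850, 1000, 800, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```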

Challenges & Learnings

Key Challenges

Messy data with timing errors and logical inconsistencies that required careful filtering
Some records lacked context (like flights with the same origin and destination) forcing informed assumptions
The data's skewed distribution meant a normal model was inappropriate, requiring log-normal fitting
Balancing data cleaning without over-filtering and losing valid insights

Key Insights

The analysis highlighted the importance of data quality: approximately 15% of records contained errors that would have distorted results if not addressed. The average baggage delivery time of 16.1 minutes, with about 80% of flights meeting the 21-minute benchmark, demonstrates solid operational performance. However, the right-skewed distribution reveals opportunities to address the longer tail of delays. The empirical and theoretical approaches produced estimates within 2.72 percentage points of each other (81.27% vs. 78.55%), which supports the roughly 80% figure and shows the value of choosing a statistical model that matches the shape of the data.