🚕 Why This Matters

When traffic breaks down, everything around it follows—rider experience, driver earnings, dispatch reliability, and pricing logic.

We set out to predict trip durations—but the model revealed something deeper.

Every December, it failed. Not because of bad code, but because the city behaves differently during the holiday season.

This project shows how model performance can surface hidden operational risks—and how adaptive systems can respond.

📌 Overview

This project began with a clear goal: predict NYC taxi trip durations using historical zone-pair data and weather signals.
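As a rough illustration of the setup, here is a minimal sketch of that kind of model on synthetic data. The column names, features, and choice of regressor are assumptions for the sake of the example, not the project's actual pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real trip table; the actual schema
# and feature set are not shown in this write-up.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "pickup_zone": rng.integers(1, 50, n),
    "dropoff_zone": rng.integers(1, 50, n),
    "hour": rng.integers(0, 24, n),
    "month": rng.integers(1, 13, n),
    "precipitation": rng.random(n),
})
# Toy target: base time plus a zone-distance term and a rain effect.
df["duration_sec"] = (
    600
    + 20 * np.abs(df["pickup_zone"] - df["dropoff_zone"])
    + 300 * df["precipitation"]
    + rng.normal(0, 60, n)
)

features = ["pickup_zone", "dropoff_zone", "hour", "month", "precipitation"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["duration_sec"], test_size=0.2, random_state=42)

model = GradientBoostingRegressor(random_state=42)
model.fit(X_train, y_train)
mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"MAE: {mae:.0f} sec")
```

On data like this, a held-out mean absolute error in seconds is a natural headline metric, which is also what makes month-by-month error slices easy to compare later.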

The model worked reliably across most months—except December.

This wasn’t random error—it exposed structural volatility in NYC’s transportation system that traditional data pipelines miss.

What started as a modeling task became a diagnostic tool for real-world disruption.

🔍 Problem

Can a single model reliably predict trip durations throughout the year?

We observed that while most months behaved predictably, December broke the pattern—predictions became noticeably worse despite no changes in data or pipeline.

This suggests a real-world shift in rider behavior, traffic patterns, or system-wide demand that standard features couldn’t capture.
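One way to surface this kind of month-specific failure is a simple per-month error diagnostic on held-out predictions. The sketch below is hypothetical: it uses synthetic residuals with an injected December effect, and the flagging threshold is an illustrative choice, not the project's actual method.

```python
import numpy as np
import pandas as pd

# Group held-out absolute errors by calendar month and flag months
# whose mean error sits far above the yearly norm.
rng = np.random.default_rng(1)
n = 1200
months = rng.integers(1, 13, n)
errors = np.abs(rng.normal(0, 60, n))
errors[months == 12] += 180  # simulate the December breakdown

residuals = pd.DataFrame({"month": months, "abs_error": errors})
monthly_mae = residuals.groupby("month")["abs_error"].mean()

# Flag months more than 2 median-absolute-deviations above the
# median monthly error (a robust, outlier-tolerant cutoff).
median = monthly_mae.median()
mad = (monthly_mae - median).abs().median()
flagged = monthly_mae[monthly_mae > median + 2 * mad]
print(flagged)
```

Because the median and MAD are robust to a single extreme month, December's inflated error stands out instead of dragging the baseline up with it.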

⚠️ Key Finding

We uncovered two distinct forms of volatility in the system, both concentrated in the holiday season.

Together they make December more than just a bad month. It's a recurring failure point that standard modeling doesn't capture.