🚕 Why This Matters

When traffic breaks down, everything around it follows—rider experience, driver earnings, dispatch reliability, and pricing logic.

We set out to predict trip durations—but the model revealed something deeper.

Every December, it failed. Not because of bad code, but because the city behaves differently during the holiday season.

This project shows how model performance can surface hidden operational risks—and how adaptive systems can respond.

📌 Overview

This project began with a clear goal: predict NYC taxi trip durations using historical zone-pair data and weather signals.
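As a rough illustration of the setup, here is a minimal sketch of that kind of model on synthetic data. The column names, features, and choice of regressor are assumptions for the sake of the example, not the project's actual pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real trip table; the actual schema
# and feature set are not shown in this write-up.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "pickup_zone": rng.integers(1, 50, n),
    "dropoff_zone": rng.integers(1, 50, n),
    "hour": rng.integers(0, 24, n),
    "month": rng.integers(1, 13, n),
    "precipitation": rng.random(n),
})
# Toy target: base time plus a zone-distance term and a rain effect.
df["duration_sec"] = (
    600
    + 20 * np.abs(df["pickup_zone"] - df["dropoff_zone"])
    + 300 * df["precipitation"]
    + rng.normal(0, 60, n)
)

features = ["pickup_zone", "dropoff_zone", "hour", "month", "precipitation"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["duration_sec"], test_size=0.2, random_state=42)

model = GradientBoostingRegressor(random_state=42)
model.fit(X_train, y_train)
mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"MAE: {mae:.0f} sec")
```

On data like this, a held-out mean absolute error in seconds is a natural headline metric, which is also what makes month-by-month error slices easy to compare later.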

The model worked reliably across most months—except December.

This wasn’t random error—it exposed structural volatility in NYC’s transportation system that traditional data pipelines miss.

What started as a modeling task became a diagnostic tool for real-world disruption.

🔍 Problem

Can a single model reliably predict trip durations throughout the year?

We observed that while most months behaved predictably, December broke the pattern—predictions became noticeably worse despite no changes in data or pipeline.

This suggests a real-world shift in rider behavior, traffic patterns, or system-wide demand that standard features couldn’t capture.
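One way to surface this kind of month-specific failure is a simple per-month error diagnostic on held-out predictions. The sketch below is hypothetical: it uses synthetic residuals with an injected December effect, and the flagging threshold is an illustrative choice, not the project's actual method.

```python
import numpy as np
import pandas as pd

# Group held-out absolute errors by calendar month and flag months
# whose mean error sits far above the yearly norm.
rng = np.random.default_rng(1)
n = 1200
months = rng.integers(1, 13, n)
errors = np.abs(rng.normal(0, 60, n))
errors[months == 12] += 180  # simulate the December breakdown

residuals = pd.DataFrame({"month": months, "abs_error": errors})
monthly_mae = residuals.groupby("month")["abs_error"].mean()

# Flag months more than 2 median-absolute-deviations above the
# median monthly error (a robust, outlier-tolerant cutoff).
median = monthly_mae.median()
mad = (monthly_mae - median).abs().median()
flagged = monthly_mae[monthly_mae > median + 2 * mad]
print(flagged)
```

Because the median and MAD are robust to a single extreme month, December's inflated error stands out instead of dragging the baseline up with it.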

⚠️ Key Finding

We uncovered two distinct forms of volatility in the system, both concentrated in the holiday season.

Together they make December more than just a bad month. It's a recurring failure point that standard modeling doesn't capture.