Analyzing Bike Sharing Demand by Month and Hour (EDA with Python & Pandas)

With an hourly time-series dataset such as the Seoul Bike Sharing Demand data from UCI, one of the most useful first analyses is to check how demand changes across months and hours of the day. This uncovers seasonal and daily patterns and shows when people use bikes most often.

In this post, I’ll show you step by step how to do this in Python using pandas, matplotlib, and seaborn.

🔹 Step 1: Load the data


import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import calendar

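# The file contains non-UTF-8 characters (e.g. the degree sign in the temperature
# header), so encoding="utf-8" may raise UnicodeDecodeError; "latin1" is a common workaround.
df = pd.read_csv("SeoulBikeData.csv", encoding="latin1")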

Get the dataset here: https://archive.ics.uci.edu/dataset/560/seoul+bike+sharing+demand

🔹 Step 2: Parse the datetime correctly

The dataset contains a Date column (dd/mm/yyyy) and an Hour column.
We combine them into a proper datetime field and set it as the index.

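# Parse the date in place (day comes first in this dataset).
# A separate lowercase 'date' column would duplicate 'Date' once the
# column names are lowercased below.
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)

# Create datetime column by adding hours
df['datetime'] = df['Date'] + pd.to_timedelta(df['Hour'], unit='h')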

# Set as index
df = df.set_index('datetime').sort_index()

# Clean column names for convenience
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

Now we can easily extract the month, day, and hour from the datetime index.
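
A quick sanity check is worth doing at this point, because a day/month mix-up fails silently. The sketch below uses a three-row stand-in with made-up counts (not the real file) to confirm that 01/12/2017 is read as 1 December and that the hour offset lands where expected. The `sample` frame is hypothetical; the column names follow the ones used above.

```python
import pandas as pd

# Tiny stand-in for the CSV; the counts are made up for illustration
sample = pd.DataFrame({
    "Date": ["01/12/2017", "01/12/2017", "02/12/2017"],
    "Hour": [0, 23, 0],
    "Rented Bike Count": [10, 20, 30],
})

# An explicit format is stricter than dayfirst=True: it raises on mismatches
dates = pd.to_datetime(sample["Date"], format="%d/%m/%Y")
sample["datetime"] = dates + pd.to_timedelta(sample["Hour"], unit="h")
sample = sample.set_index("datetime").sort_index()

# 01/12/2017 must be 1 December, not 12 January
print(sample.index.min())      # 2017-12-01 00:00:00
# Each (date, hour) pair should map to exactly one timestamp
print(sample.index.is_unique)  # True
```

On the real data, the same two checks (`df.index.min()` and `df.index.is_unique`) are a cheap way to catch parsing problems before plotting anything.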

🔹 Step 3: Add useful time columns

df['month'] = df.index.month
df['hour'] = df.index.hour

🔹 Step 4: Average bike demand by hour

hourly_mean = df.groupby('hour')['rented_bike_count'].mean()

plt.figure(figsize=(8,4))
hourly_mean.plot(kind='bar', color="teal")
plt.title("Average Bike Demand by Hour of Day")
plt.xlabel("Hour of Day (0–23)")
plt.ylabel("Average Rented Bike Count")
plt.show()

📈 Insight: You’ll usually see morning and evening peaks (around 8 AM and 6 PM), reflecting commutes to and from work.
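
One way to test the commute explanation is to split the hourly profile into weekdays and weekends: if the peaks come from commuting, they should be much weaker on weekends. This is a minimal sketch on synthetic data that I constructed to have weekday-only peaks, so it only demonstrates the technique, not a result. The `day_type` column and the `demo` frame are my own illustrative names.

```python
import numpy as np
import pandas as pd

# Synthetic hourly data for four weeks (a stand-in for the real df)
idx = pd.date_range("2018-04-02", periods=24 * 28, freq=pd.Timedelta(hours=1))

# Build in peaks at 8 AM and 6 PM on weekdays only
is_weekday = idx.dayofweek < 5
is_peak_hour = idx.hour.isin([8, 18])
counts = 100 + 300 * (is_weekday & is_peak_hour)

demo = pd.DataFrame({"rented_bike_count": counts}, index=idx)
demo["hour"] = demo.index.hour
demo["day_type"] = np.where(demo.index.dayofweek < 5, "weekday", "weekend")

# One column per day type, one row per hour
profile = (
    demo.groupby(["day_type", "hour"])["rented_bike_count"]
    .mean()
    .unstack("day_type")
)
print(profile.loc[[7, 8, 17, 18]])
```

With the real data, replace `demo` with `df` and call `profile.plot()` to draw both curves on one chart.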

🔹 Step 5: Average bike demand by month

monthly_mean = df.groupby('month')['rented_bike_count'].mean()

# Replace numbers with month abbreviations
monthly_mean.index = monthly_mean.index.map(lambda x: calendar.month_abbr[x])

plt.figure(figsize=(8,4))
monthly_mean.plot(kind='bar', color="coral")
plt.title("Average Bike Demand by Month")
plt.xlabel("Month")
plt.ylabel("Average Rented Bike Count")
plt.show()

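📈 Insight: Bike demand is usually high from late spring through autumn (comfortable to warm weather), while the winter months (December to February) show a clear dip due to cold and snow. Check the chart to see exactly which month the peak falls in.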
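
One caveat about these averages: as far as I can tell, the UCI file also has a Functioning Day column (`functioning_day` after the renaming in Step 2) with Yes/No values, and hours when the service was closed are recorded with a count of 0. Those zeros pull a month's mean down. The sketch below shows the effect on made-up numbers; the column name and its values are assumptions you should verify against your copy of the data.

```python
import pandas as pd

# Made-up rows: month 9 includes two closed hours recorded as 0
demo = pd.DataFrame({
    "month": [9, 9, 9, 9, 10, 10],
    "rented_bike_count": [900, 1100, 0, 0, 800, 1000],
    "functioning_day": ["Yes", "Yes", "No", "No", "Yes", "Yes"],  # assumed values
})

# Mean over all hours, closed hours included
raw_mean = demo.groupby("month")["rented_bike_count"].mean()

# Mean over hours when the service was actually running
open_only = demo[demo["functioning_day"] == "Yes"]
open_mean = open_only.groupby("month")["rented_bike_count"].mean()

comparison = pd.DataFrame({"all_hours": raw_mean, "functioning_only": open_mean})
print(comparison)
```

Whether to drop the closed hours depends on the question: keep them to measure total load on the system, drop them to measure demand when bikes were available.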

🔹 Step 6: Combine month + hour (heatmap)

Sometimes, looking at just months or just hours is not enough. A heatmap lets us see both dimensions together.

pivot = df.pivot_table(
    index='hour',
    columns='month',
    values='rented_bike_count',
    aggfunc='mean'
)

plt.figure(figsize=(10,6))
sns.heatmap(pivot, cmap="YlOrRd", annot=False)
plt.title("Heatmap of Bike Demand (Hour vs Month)")
plt.xlabel("Month")
plt.ylabel("Hour of Day")
plt.show()

📊 Insight: You’ll notice that commuting peaks (around 8 AM and 6 PM) are strong from late spring through autumn, but fade to faint bands in the winter months, when overall demand is low.
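
Because all months share one color scale, a low-demand month can look flat even when its daily shape is unchanged. A possible follow-up is to divide each month's column by its own total, so every cell becomes that hour's share of the day. The sketch below uses a made-up pivot in which both months have the same shape at very different levels; only the layout (hours as rows, months as columns) matches the pivot built above.

```python
import numpy as np
import pandas as pd

# Made-up hourly profile with peaks at 8 and 18, scaled differently per month
shape = np.ones(24)
shape[8], shape[18] = 3.0, 4.0
pivot = pd.DataFrame(
    {1: 50 * shape, 6: 400 * shape},  # month 1 = low level, month 6 = high level
    index=pd.Index(range(24), name="hour"),
)
pivot.columns.name = "month"

# Divide each month's column by its own total: values become shares of the day
share = pivot.div(pivot.sum(axis=0), axis=1)

# Both months now show the same relative peaks despite the level difference
print(share.loc[[8, 18]].round(3))
```

On the real data, pass `share` to `sns.heatmap` in place of `pivot` to compare the shape of the day across months instead of the raw level.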

🚴 Key Takeaways

Hour-of-day patterns clearly show commuter peaks.
Month-wise analysis highlights seasonality (high demand from late spring through autumn, low in winter).
Heatmaps give a combined view, showing how hourly demand varies across months.

Such insights are crucial for:

Bike redistribution planning
Maintenance scheduling
Promotions/discounts during off-season
Forecasting demand for city planning

👉 This type of EDA lays the foundation for building predictive models (such as Random Forests or time-series forecasting models).