When working with time-series datasets like the Seoul Bike Sharing Demand data from UCI, one of the most insightful analyses is to check how demand changes across months and hours. This helps us uncover seasonality patterns and understand when people use bikes most often.
In this post, I’ll show you step by step how to do this in Python using pandas, matplotlib, and seaborn.
🔹 Step 1: Load the data
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import calendar
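# The UCI file is not valid UTF-8 (the ° in headers like Temperature(°C) is a single byte), so read it as Latin-1
df = pd.read_csv("SeoulBikeData.csv", encoding="latin1")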
Get the dataset here: https://archive.ics.uci.edu/dataset/560/seoul+bike+sharing+demand
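If you are not sure which encoding your copy of the file uses, here is a minimal sketch of a loader that tries a few in turn. The helper name `read_csv_with_fallback` and the order of encodings are my own choices for illustration, not part of pandas.

```python
import pandas as pd

def read_csv_with_fallback(path, encodings=("utf-8", "latin1")):
    """Hypothetical helper: return the first DataFrame that loads without a decode error."""
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError as err:
            last_error = err  # remember the failure and try the next encoding
    raise last_error

# Usage (assumes the CSV sits next to your script):
# df = read_csv_with_fallback("SeoulBikeData.csv")
```

Latin-1 maps every byte to a character, so it never fails to decode. That makes it a safe last resort, but it also means you should glance at the column names afterwards to confirm symbols such as ° came through as expected.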
🔹 Step 2: Parse the datetime correctly
The dataset contains a Date column (dd/mm/yyyy) and an Hour column.
We combine them into a proper datetime field and set it as the index.
# Clean column names first, so the parsed columns do not collide with the raw ones
# (lower-casing 'Date' after creating a separate 'date' column would leave two 'date' columns)
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")
# Parse date correctly (day comes first in this dataset)
df['date'] = pd.to_datetime(df['date'], dayfirst=True)
# Create datetime column by adding hours
df['datetime'] = df['date'] + pd.to_timedelta(df['hour'], unit='h')
# Set as index
df = df.set_index('datetime').sort_index()
Now we can easily extract month, day, hour from the datetime index.
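To see why the day-first setting matters, here is a small sketch on two made-up rows in the same dd/mm/yyyy layout. The frame `raw` and its values are invented for illustration. It also shows an explicit `format` string, which is a stricter alternative to `dayfirst=True`.

```python
import pandas as pd

# Two made-up rows in the dataset's dd/mm/yyyy layout
raw = pd.DataFrame({"Date": ["01/12/2017", "02/12/2017"], "Hour": [0, 23]})

# Day-first parsing: both rows land in December
day_first = pd.to_datetime(raw["Date"], dayfirst=True)

# An explicit format gives the same result and rejects strings that do not match
explicit = pd.to_datetime(raw["Date"], format="%d/%m/%Y")

# Month-first parsing silently reads the same strings as 12 January and 12 February
month_first = pd.to_datetime(raw["Date"], dayfirst=False)

# Adding the hour as a timedelta gives one timestamp per row
timestamps = explicit + pd.to_timedelta(raw["Hour"], unit="h")

print(day_first.dt.month.tolist())
print(month_first.dt.month.tolist())
print(timestamps.tolist())
```

The month-first version raises no error for these dates, which is what makes the mistake easy to miss: the monthly chart would simply be wrong.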
🔹 Step 3: Add useful time columns
df['month'] = df.index.month
df['hour'] = df.index.hour
🔹 Step 4: Average bike demand by hour
hourly_mean = df.groupby('hour')['rented_bike_count'].mean()
plt.figure(figsize=(8,4))
hourly_mean.plot(kind='bar', color="teal")
plt.title("Average Bike Demand by Hour of Day")
plt.xlabel("Hour of Day (0–23)")
plt.ylabel("Average Rented Bike Count")
plt.show()
📈 Insight: You’ll usually see morning and evening peaks, reflecting office commute patterns.
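If you would rather confirm the peaks in code than read them off the chart, one simple approach is to flag hours whose average is higher than both neighbouring hours. The sketch below uses a synthetic `hourly_mean` with made-up numbers, so it runs on its own. With the real data you would reuse the `hourly_mean` computed above.

```python
import numpy as np
import pandas as pd

# Synthetic hourly profile (made-up numbers) with bumps at 08:00 and 18:00
hours = np.arange(24)
demand = 200 + 600 * np.exp(-((hours - 8) ** 2) / 2) + 900 * np.exp(-((hours - 18) ** 2) / 4)
hourly_mean = pd.Series(demand, index=pd.Index(hours, name="hour"))

# A local peak is an hour whose mean is higher than both the previous and the next hour
is_peak = (hourly_mean > hourly_mean.shift(1)) & (hourly_mean > hourly_mean.shift(-1))
peak_hours = hourly_mean[is_peak].index.tolist()

print(peak_hours)
```

Hours 0 and 23 can never be flagged here because they have only one neighbour. On noisy real data you may also get small extra peaks, so treat the result as a hint and not a definitive answer.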
🔹 Step 5: Average bike demand by month
monthly_mean = df.groupby('month')['rented_bike_count'].mean()
# Replace numbers with month abbreviations
monthly_mean.index = monthly_mean.index.map(lambda x: calendar.month_abbr[x])
plt.figure(figsize=(8,4))
monthly_mean.plot(kind='bar', color="coral")
plt.title("Average Bike Demand by Month")
plt.xlabel("Month")
plt.ylabel("Average Rented Bike Count")
plt.show()
📈 Insight: Average demand is highest in the warmer months (check the chart for the exact peak month) and lowest in winter, when cold and snow keep riders away.
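One caveat for these averages: as far as I know, the UCI file has a Functioning Day column marking days when the rental service was closed, and those rows record zero rentals. Assuming that column exists and holds "Yes"/"No" values (it would be `functioning_day` after the renaming in Step 2), here is a sketch on a made-up frame that shows how closed days pull the monthly mean down.

```python
import pandas as pd

# Made-up rows; the column name and the Yes/No values are assumptions about the real file
toy = pd.DataFrame({
    "month": [1, 1, 1, 6, 6, 6],
    "rented_bike_count": [200, 0, 220, 1200, 0, 1300],
    "functioning_day": ["Yes", "No", "Yes", "Yes", "No", "Yes"],
})

# Mean over all rows, including the closed days with zero rentals
monthly_all = toy.groupby("month")["rented_bike_count"].mean()

# Mean over rows where the service was actually running
open_days = toy[toy["functioning_day"] == "Yes"]
monthly_open = open_days.groupby("month")["rented_bike_count"].mean()

print(monthly_all)
print(monthly_open)
```

Whether to drop those rows depends on the question. If you want demand on days when the service was running, filter them out. If you want total usage over the calendar, keep them.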
🔹 Step 6: Combine month + hour (heatmap)
Sometimes, looking at just months or just hours is not enough. A heatmap lets us see both dimensions together.
pivot = df.pivot_table(
    index='hour',
    columns='month',
    values='rented_bike_count',
    aggfunc='mean'
)
plt.figure(figsize=(10,6))
sns.heatmap(pivot, cmap="YlOrRd", annot=False)
plt.title("Heatmap of Bike Demand (Hour vs Month)")
plt.xlabel("Month")
plt.ylabel("Hour of Day")
plt.show()
📊 Insight: You’ll notice that commuting peaks (around 8 AM and 6 PM) stand out most in the warmer months and are much fainter in winter, when overall demand is low.
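A raw heatmap mixes two things: how busy a month is overall, and how demand is spread across the day. To compare only the shape of the day, one option is to divide each month's column by that month's own mean. The sketch below uses a synthetic `pivot` with made-up numbers, so it runs on its own. With the real data you would apply the same `div` call to the pivot table above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: the same daily shape in two months, with January at a quarter of June's volume
hours = np.arange(24)
shape = 1 + 2 * np.exp(-((hours - 8) ** 2) / 2) + 3 * np.exp(-((hours - 18) ** 2) / 4)
pivot = pd.DataFrame({1: 100 * shape, 6: 400 * shape}, index=pd.Index(hours, name="hour"))
pivot.columns.name = "month"

# Divide each month's column by that month's mean, so 1.0 means "an average hour for this month"
relative = pivot.div(pivot.mean(axis=0), axis=1)

print(relative.round(2).head())
```

In this toy example the two columns of `relative` come out identical, because only the volume differed between the months. On the real data, passing `relative` to `sns.heatmap` shows whether the winter commute peaks are truly flatter or just smaller in absolute terms.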
🚴 Key Takeaways
- Hour-of-day patterns clearly show commuter peaks.
- Month-wise analysis highlights seasonality (high demand in the warmer months, low in winter).
- Heatmaps give a combined view, showing how hourly demand varies across months.
Such insights are crucial for:
- Bike redistribution planning
- Maintenance scheduling
- Promotions/discounts during off-season
- Forecasting demand for city planning
👉 This type of EDA is the foundation before you build predictive models (like Random Forests or time-series forecasting models).
