
Columns

| Column Name | Type | Description |
| --- | --- | --- |
| Guest_ID | Identifier | Unique guest booking ID |
| Nights_Stayed | Numerical | Number of nights guest stayed in hotel |
| Room_Fare | Numerical | Total room fare in USD for the stay |
| Additional_Spend | Numerical | Extra spend on food, spa, services, etc. |
| Guest_Age | Numerical | Age of the guest in years |
| Room_Type | Categorical | Single, Double, Suite, Deluxe |
| Booking_Channel | Categorical | Online Travel Agency, Direct, Corporate, Walk-in |
| Country | Categorical | Country of the guest |
| Feedback_Rating | Categorical | Excellent, Good, Average, Poor |
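
As a starting point, the column layout above could be read into typed records along these lines. This is a minimal sketch using only the Python standard library. The two sample rows, including the `G001`-style IDs, are invented for illustration and are not taken from the actual file, and the sketch assumes the numerical columns are stored as plain numbers.

```python
import csv
import io

# Invented sample in the documented column layout (not real dataset rows).
SAMPLE_CSV = """Guest_ID,Nights_Stayed,Room_Fare,Additional_Spend,Guest_Age,Room_Type,Booking_Channel,Country,Feedback_Rating
G001,2,300.0,50.0,34,Double,Direct,India,Good
G002,5,1250.0,200.0,52,Suite,Corporate,USA,Excellent
"""

# Casts for the columns documented as Numerical; all others stay as strings.
NUMERIC_COLUMNS = {
    "Nights_Stayed": int,
    "Room_Fare": float,
    "Additional_Spend": float,
    "Guest_Age": int,
}


def load_stays(handle):
    """Read guest stays from a CSV file object, casting the numerical columns."""
    stays = []
    for row in csv.DictReader(handle):
        for column, cast in NUMERIC_COLUMNS.items():
            row[column] = cast(row[column])
        stays.append(row)
    return stays


stays = load_stays(io.StringIO(SAMPLE_CSV))
# Total revenue of the first stay: room fare plus additional spend.
print(stays[0]["Room_Fare"] + stays[0]["Additional_Spend"])
```

To use it with the downloaded file, pass an opened file object (for example `open(path, newline="")`) to `load_stays` instead of the in-memory sample.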

Dataset Explanation

The Hotel Guest Stays dataset by Slidescope is designed to capture essential details of guest visits to a hotel, providing valuable information for analyzing guest behavior, spending patterns, and booking preferences. With 350 unique entries, this dataset balances numerical variables (quantitative guest behaviors) and categorical variables (qualitative attributes), making it an excellent resource for business intelligence dashboards, guest segmentation, predictive modeling, and customer satisfaction studies.

Get the Dataset Here: https://github.com/slidescope/Hotel-Guest-Stays-Dataset-for-Data-Analysis-and-ML-Projects

Purpose of the Dataset

Hotels generate significant amounts of guest data, but much of it is sensitive or siloed in operational systems. This dataset provides a synthetic yet realistic view of guest stays that can be freely used for practice and analysis. It is particularly useful for hospitality students, data analysts, and BI professionals who want to practice analyzing guest demographics, spending behavior, and customer satisfaction trends.

A unique strength of this dataset is that it integrates financial data (room fare, additional spend) with guest satisfaction (feedback rating), allowing learners to connect guest experiences with revenue generation. It also highlights the importance of distribution channels (e.g., online travel agencies, direct bookings, walk-ins), a critical factor in hotel revenue management.


Columns in Detail

  1. Guest_ID – A unique identifier for each guest stay.
  2. Nights_Stayed – Shows the length of stay, useful for calculating average stay duration.
  3. Room_Fare – Total cost of room bookings in USD, directly tied to revenue analysis.
  4. Additional_Spend – Captures non-room revenue streams like dining, spa, events.
  5. Guest_Age – Helps segment guests by age group (for example, younger travelers, mid-career guests, seniors).
  6. Room_Type – Indicates guest’s room preference (Single, Double, Suite, Deluxe).
  7. Booking_Channel – Distribution channel (OTA, Direct, Corporate, Walk-in) impacting commission and profitability.
  8. Country – Guest’s origin, useful for understanding international vs domestic customer base.
  9. Feedback_Rating – Guest’s qualitative feedback (Excellent, Good, Average, Poor) reflecting satisfaction.

Example KPIs to Create

To derive business insights, hotels can create several KPIs using this dataset:

📊 Revenue KPIs

  1. Average Revenue Per Stay = (Total Room Fare + Total Additional Spend) ÷ Number of Stays
  2. Average Room Fare = Total Room Fare ÷ Number of Stays
  3. Additional Spend Ratio = (Total Additional Spend ÷ Total Revenue) × 100, where Total Revenue = Total Room Fare + Total Additional Spend
  4. Revenue by Room Type – Compare earnings across Single, Double, Suite, and Deluxe rooms.
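
The revenue KPIs above can be sketched in a few lines of Python. The rows below are invented for illustration, and total revenue is assumed to be the sum of Room_Fare and Additional_Spend over all stays.

```python
from collections import defaultdict

# Invented rows for illustration: (Room_Type, Room_Fare, Additional_Spend)
stays = [
    ("Single", 200.0, 50.0),
    ("Double", 300.0, 100.0),
    ("Suite", 900.0, 150.0),
    ("Deluxe", 600.0, 100.0),
]

total_room_fare = sum(fare for _, fare, _ in stays)
total_additional = sum(extra for _, _, extra in stays)
total_revenue = total_room_fare + total_additional

# KPI 1-3: averages are taken per stay (one row = one stay).
avg_revenue_per_stay = total_revenue / len(stays)
avg_room_fare = total_room_fare / len(stays)
additional_spend_ratio = total_additional / total_revenue * 100

# KPI 4: total revenue grouped by room type.
revenue_by_room_type = defaultdict(float)
for room_type, fare, extra in stays:
    revenue_by_room_type[room_type] += fare + extra

print(avg_revenue_per_stay, avg_room_fare, round(additional_spend_ratio, 2))
print(dict(revenue_by_room_type))
```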

👥 Guest Behavior KPIs

  1. Average Nights Stayed – Identify whether guests are short-stay or long-stay.
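  2. Repeat Stay Indicator – % of returning guests. This applies only if Guest_ID values repeat; since Guest_ID is described as a unique booking ID, expect 0% unless the data contains repeated IDs.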
  3. Age Group Distribution – Which demographic spends the most nights/money.
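
A rough sketch of the guest behavior KPIs follows. The rows are invented, the age-band cut-offs are hypothetical choices rather than anything defined by the dataset, and one Guest_ID is deliberately repeated so the repeat-stay logic has something to count.

```python
from collections import Counter, defaultdict

# Invented rows for illustration: (Guest_ID, Nights_Stayed, Guest_Age)
stays = [
    ("G001", 1, 24),
    ("G002", 3, 37),
    ("G003", 7, 45),
    ("G004", 2, 63),
    ("G001", 2, 24),  # repeated ID, included only to exercise the repeat-stay logic
]


def age_group(age):
    """Map an age to a band; the cut-offs here are hypothetical."""
    if age < 30:
        return "Under 30"
    if age < 50:
        return "30-49"
    return "50+"


# Average Nights Stayed across all stays.
average_nights = sum(nights for _, nights, _ in stays) / len(stays)

# Age Group Distribution: total nights booked by each age band.
nights_by_age_group = defaultdict(int)
for _, nights, age in stays:
    nights_by_age_group[age_group(age)] += nights

# Repeat Stay Indicator: share of distinct guests with more than one stay.
stays_per_guest = Counter(guest_id for guest_id, _, _ in stays)
returning_guests = sum(1 for count in stays_per_guest.values() if count > 1)
returning_guest_pct = returning_guests * 100 / len(stays_per_guest)

print(average_nights, dict(nights_by_age_group), returning_guest_pct)
```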

🌍 Market KPIs

  1. Bookings by Channel – % of bookings via OTA, Direct, Corporate, Walk-in.
  2. Country Mix – % share of domestic vs international guests.
  3. Channel Profitability Index – Average revenue per channel, adjusted for commission rates.
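
The market KPIs can be sketched as below. The dataset contains neither commission rates nor a designated home country, so `COMMISSION_RATE` and `HOME_COUNTRY` are hypothetical placeholders to be replaced with your own figures. The rows are invented for illustration.

```python
from collections import defaultdict

# Invented rows for illustration: (Booking_Channel, Country, total revenue of the stay)
stays = [
    ("Online Travel Agency", "USA", 500.0),
    ("Online Travel Agency", "India", 300.0),
    ("Direct", "India", 400.0),
    ("Corporate", "UK", 800.0),
    ("Walk-in", "India", 200.0),
]

# Hypothetical assumptions, not part of the dataset.
HOME_COUNTRY = "India"
COMMISSION_RATE = {
    "Online Travel Agency": 0.20,
    "Direct": 0.0,
    "Corporate": 0.10,
    "Walk-in": 0.0,
}

revenues_by_channel = defaultdict(list)
for channel, _, revenue in stays:
    revenues_by_channel[channel].append(revenue)

# Bookings by Channel: percentage of stays per channel.
bookings_share_pct = {
    channel: len(revenues) * 100 / len(stays)
    for channel, revenues in revenues_by_channel.items()
}

# Country Mix: percentage of stays from the (assumed) home country.
domestic_stays = sum(1 for _, country, _ in stays if country == HOME_COUNTRY)
domestic_share_pct = domestic_stays * 100 / len(stays)

# Channel Profitability Index: average revenue per stay, net of commission.
channel_profitability = {
    channel: sum(revenues) / len(revenues) * (1 - COMMISSION_RATE[channel])
    for channel, revenues in revenues_by_channel.items()
}

print(bookings_share_pct)
print(domestic_share_pct)
print(channel_profitability)
```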

🏨 Guest Satisfaction KPIs

  1. Feedback Rating Distribution – % of Excellent, Good, Average, Poor reviews.
  2. Revenue vs. Feedback – Do higher-paying guests give better ratings?
  3. Nights Stayed vs. Feedback – Is satisfaction higher with longer stays?
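
One simple way to explore the satisfaction KPIs is to compare group averages per rating, as in this sketch. The rows are invented, and group means are only a first look at the questions above, not a statistical test.

```python
from collections import Counter
from statistics import mean

# Rating labels from best to worst, as listed in the column description.
RATING_ORDER = ["Excellent", "Good", "Average", "Poor"]

# Invented rows for illustration: (Feedback_Rating, Nights_Stayed, total revenue)
stays = [
    ("Excellent", 4, 1000.0),
    ("Excellent", 2, 600.0),
    ("Good", 3, 500.0),
    ("Average", 1, 200.0),
    ("Poor", 2, 300.0),
]

# Feedback Rating Distribution: percentage of stays per rating.
rating_counts = Counter(rating for rating, _, _ in stays)
rating_distribution_pct = {
    rating: rating_counts[rating] * 100 / len(stays) for rating in RATING_ORDER
}

# Revenue vs. Feedback and Nights Stayed vs. Feedback: mean per rating.
# mean() raises an error for a rating with no rows; this sample covers all four.
mean_revenue_by_rating = {
    rating: mean(rev for r, _, rev in stays if r == rating)
    for rating in RATING_ORDER
}
mean_nights_by_rating = {
    rating: mean(nights for r, nights, _ in stays if r == rating)
    for rating in RATING_ORDER
}

print(rating_distribution_pct)
print(mean_revenue_by_rating)
print(mean_nights_by_rating)
```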

Analytical Use Cases

  • Revenue Management: Identify which booking channels generate the most profitable customers.
  • Guest Segmentation: Group customers by age, room type, or spend behavior.
  • Customer Experience: Analyze how room fare and additional spend correlate with guest feedback.
  • Market Strategy: Track countries contributing the highest revenue or longest stays.
  • Operational Planning: Anticipate room type demand to optimize inventory.

✅ In summary, the Hotel Guest Stays dataset bridges financial, demographic, and satisfaction insights. With its structured design, it can be used to practice KPI creation, Power BI dashboard design, and predictive modeling for guest satisfaction and revenue optimization.

Target Field Selection

🔹 For Business Intelligence (KPI dashboards)

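  • Room_Fare → Ideal for revenue-related KPIs such as ADR (Average Daily Rate), which can be approximated as Room_Fare ÷ Nights_Stayed because Room_Fare is the total for the stay. RevPAR additionally requires the number of available rooms, which this dataset does not include.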
  • Additional_Spend → Useful for analyzing upselling/cross-selling impact.
  • Feedback_Rating → Helps track guest satisfaction and service quality.
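
For the rate-based KPIs mentioned above, a minimal sketch might look like this. It assumes each row represents one room, so that Nights_Stayed equals room-nights sold. `AVAILABLE_ROOM_NIGHTS` is a hypothetical figure, since room inventory is not part of the dataset, and the rows are invented.

```python
# Invented rows for illustration: (Nights_Stayed, Room_Fare for the whole stay)
stays = [
    (2, 300.0),
    (5, 1250.0),
    (3, 450.0),
]

# Per-stay nightly rate: total room fare divided by nights stayed.
nightly_rate_per_stay = [fare / nights for nights, fare in stays]

# ADR: total room revenue divided by room-nights sold
# (assumes one room per stay).
total_room_revenue = sum(fare for _, fare in stays)
room_nights_sold = sum(nights for nights, _ in stays)
adr = total_room_revenue / room_nights_sold

# RevPAR needs inventory data; this value is a hypothetical assumption.
AVAILABLE_ROOM_NIGHTS = 20
revpar = total_room_revenue / AVAILABLE_ROOM_NIGHTS

print(nightly_rate_per_stay, adr, revpar)
```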

🔹 For Predictive Modeling / Machine Learning

  • Feedback_Rating (categorical, ordinal) → Best as a classification target (predict whether a guest will rate the stay Excellent, Good, Average, or Poor).
  • Additional_Spend (numerical, continuous) → Great for a regression target (predict extra spend based on demographics, stay details, booking channel, etc.).
  • Nights_Stayed (numerical) → Can also serve as a regression target (predict length of stay from guest and booking attributes).