The Online Shoppers Purchasing Intention Dataset from the UCI Machine Learning Repository records browsing sessions from an e-commerce site. It is used to study which factors influence online shopping behavior, in particular whether a browsing session ends with a purchase.

1. Dataset Overview

  • Purpose: To predict whether a session ends with a purchase or not based on the user’s browsing behavior and other factors.
  • Domain: E-commerce and online retail.
  • Dataset Size:
    • Instances: 12,330
    • Attributes: 17 features (10 numerical, 7 categorical) + 1 target variable, 18 columns in total

2. Features (Attributes)

The dataset contains both numerical and categorical variables. Key features include:

Numerical Variables:

  1. Administrative: Number of pages visited related to administrative functions (e.g., login, account settings).
  2. Administrative_Duration: Total time spent on administrative pages.
  3. Informational: Number of pages visited related to information-seeking.
  4. Informational_Duration: Total time spent on informational pages.
  5. ProductRelated: Number of pages visited related to product browsing.
  6. ProductRelated_Duration: Total time spent on product pages.
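  7. BounceRates: Average bounce rate of the pages visited during the session.
  8. ExitRates: Average exit rate of the pages visited during the session.
  9. PageValues: Average page value of the pages visited during the session, a Google Analytics metric that reflects how much a page contributed to e-commerce transactions.
  10. SpecialDay: Closeness of the browsing date to a special day such as Valentine’s Day or Mother’s Day, on a scale from 0 (not close) to 1 (closest).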

Categorical Variables:

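  1. Month: Month of the visit, stored as an abbreviated name (e.g., Feb, Mar, Nov).
  2. OperatingSystems: User’s operating system, stored as an integer code.
  3. Browser: User’s browser type, stored as an integer code.
  4. Region: Geographic region of the user, stored as an integer code.
  5. TrafficType: Traffic source type (how the visitor reached the site), stored as an integer code.
  6. VisitorType: Whether the visitor is new or returning (Returning_Visitor, New_Visitor, or Other).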
  7. Weekend: Boolean value indicating if the visit occurred on a weekend.
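
The column layout above can be written down as a small schema and used to check a CSV export before building anything on top of it. The following is a minimal sketch using only Python's standard library. The function name read_sessions and the two sample rows are illustrative assumptions (the rows are made up, not taken from the dataset), and the split into numerical and categorical columns simply follows the lists above.

```python
import csv
import io

# Column names as listed above; the grouping follows this document.
NUMERIC_COLUMNS = [
    "Administrative", "Administrative_Duration",
    "Informational", "Informational_Duration",
    "ProductRelated", "ProductRelated_Duration",
    "BounceRates", "ExitRates", "PageValues", "SpecialDay",
]
CATEGORICAL_COLUMNS = [
    "Month", "OperatingSystems", "Browser", "Region",
    "TrafficType", "VisitorType", "Weekend",
]
TARGET_COLUMN = "Revenue"
ALL_COLUMNS = NUMERIC_COLUMNS + CATEGORICAL_COLUMNS + [TARGET_COLUMN]


def read_sessions(handle):
    """Read sessions from an open CSV file and convert numeric columns to float."""
    reader = csv.DictReader(handle)
    missing = set(ALL_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    sessions = []
    for row in reader:
        for name in NUMERIC_COLUMNS:
            row[name] = float(row[name])
        sessions.append(row)
    return sessions


# Two made-up rows for illustration only; they are NOT real dataset rows.
SAMPLE_CSV = "\n".join([
    ",".join(ALL_COLUMNS),
    "0,0.0,0,0.0,1,0.0,0.2,0.2,0.0,0.0,Feb,1,1,1,1,Returning_Visitor,FALSE,FALSE",
    "3,80.5,0,0.0,20,900.0,0.01,0.02,15.5,0.0,Nov,2,2,3,2,New_Visitor,TRUE,TRUE",
])

sessions = read_sessions(io.StringIO(SAMPLE_CSV))
print(len(sessions), sessions[1]["Month"])
```

To use it on the real file, pass an open file handle instead of the in-memory sample, for example open(path, newline=""), where path points to the CSV downloaded from the dataset link.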

3. Target Variable

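  • Revenue: A binary variable indicating whether the session ended with a purchase. It is stored as TRUE/FALSE in the CSV file and is usually encoded as 1 (purchase) or 0 (no purchase) for modeling. The classes are imbalanced: 1,908 of the 12,330 sessions (about 15.5%) ended with a purchase.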
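
As a rough illustration of the target encoding, the sketch below maps Revenue values to 1/0 and computes the share of sessions that ended with a purchase. It assumes the values arrive as TRUE/FALSE strings (or as 1/0); the helper names revenue_to_int and purchase_rate are hypothetical, and the example labels are made up.

```python
def revenue_to_int(value):
    """Map a Revenue cell ('TRUE'/'FALSE', '1'/'0', or a bool) to 1 or 0."""
    return 1 if str(value).strip().lower() in {"true", "1"} else 0


def purchase_rate(revenue_values):
    """Share of sessions that ended with a purchase."""
    labels = [revenue_to_int(v) for v in revenue_values]
    if not labels:
        raise ValueError("no sessions given")
    return sum(labels) / len(labels)


# Made-up labels for illustration only.
example_labels = ["FALSE", "FALSE", "TRUE", "FALSE"]
print(purchase_rate(example_labels))  # 0.25
```

Checking this rate first is useful because an imbalanced target makes plain accuracy a misleading metric for any model trained on it.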

4. Key Use Cases

This dataset can be used for:

  1. E-commerce Analytics: Understanding user behavior and browsing patterns.
  2. Predictive Modeling: Building machine learning models to predict the likelihood of purchase.
  3. Behavioral Insights: Analyzing factors (like time spent on pages, bounce rates, and visitor type) that influence purchasing decisions.
  4. Personalization: Identifying trends for personalized recommendations or targeted marketing.
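
One simple way to start on the behavioral analysis mentioned above is to correlate a numeric feature with the 0/1 purchase label. The sketch below implements the Pearson correlation by hand (with a binary label this is the point-biserial correlation). The function name pearson and the sample values are illustrative assumptions; the numbers are made up and say nothing about the real data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values: seconds on product pages and whether the session converted.
durations = [0.0, 120.0, 300.0, 900.0]
purchased = [0, 0, 1, 1]
print(round(pearson(durations, purchased), 3))
```

A correlation only describes a linear association; it does not show that longer sessions cause purchases.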

5. Example Visualizations for Power BI Dashboards

Power BI dashboards can be used to display insights such as:

  1. Conversion Rates by Month: Visualize how purchase likelihood changes across months.
  2. Page Interaction Analysis: Analyze the correlation between time spent on specific types of pages and the probability of purchase.
  3. Traffic Sources: Show the effectiveness of various traffic types in driving revenue.
  4. Visitor Type Comparison: Compare purchasing behavior between new and returning visitors.
  5. Region-Based Insights: Identify geographic regions with the highest conversion rates.
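
Most of the visuals above rest on the same calculation: the conversion rate (sessions with a purchase divided by all sessions) per category value. A minimal sketch of that aggregation follows; the function name conversion_rate_by is hypothetical, Revenue is assumed to be a TRUE/FALSE string, and the sample sessions are made up.

```python
from collections import defaultdict


def conversion_rate_by(sessions, column):
    """Return {category value: share of sessions with Revenue == TRUE}."""
    totals = defaultdict(int)
    purchases = defaultdict(int)
    for session in sessions:
        key = session[column]
        totals[key] += 1
        if str(session["Revenue"]).strip().lower() == "true":
            purchases[key] += 1
    return {key: purchases[key] / totals[key] for key in totals}


# Made-up sessions for illustration only.
sample_sessions = [
    {"VisitorType": "New_Visitor", "Revenue": "TRUE"},
    {"VisitorType": "New_Visitor", "Revenue": "FALSE"},
    {"VisitorType": "Returning_Visitor", "Revenue": "TRUE"},
    {"VisitorType": "Returning_Visitor", "Revenue": "FALSE"},
    {"VisitorType": "Returning_Visitor", "Revenue": "FALSE"},
    {"VisitorType": "Returning_Visitor", "Revenue": "FALSE"},
]

rates = conversion_rate_by(sample_sessions, "VisitorType")
print(rates)  # {'New_Visitor': 0.5, 'Returning_Visitor': 0.25}
```

Passing "Month", "TrafficType", or "Region" as the column gives the figures behind the other visuals. In Power BI the same ratio would typically be defined once as a measure and sliced by the chosen column.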

6. Why It’s Suitable for Power BI

  • The mix of numerical and categorical data is ideal for dynamic filtering and segmentation in Power BI.
  • Clear business insights can be derived, such as analyzing conversion rates or customer engagement patterns.
  • With 12,330 rows and 18 columns, the dataset is small enough for Power BI to load and visualize without performance concerns.

Dataset Link: https://archive.ics.uci.edu/dataset/468/online+shoppers+purchasing+intention+dataset