The Online Shoppers Purchasing Intention Dataset from the UCI Machine Learning Repository records browsing sessions on an e-commerce site. It is used to study which factors influence whether a session ends in a purchase.
Here’s a detailed explanation:
1. Dataset Overview
- Purpose: To predict whether a session ends with a purchase or not based on the user’s browsing behavior and other factors.
- Domain: E-commerce and online retail.
- Dataset Size:
  - Instances: 12,330 sessions
  - Attributes: 17 features (10 numerical, 7 categorical) + 1 target variable
2. Features (Attributes)
The dataset contains 10 numerical and 7 categorical features:
Numerical Variables:
- Administrative: Number of pages visited related to administrative functions (e.g., login, account settings).
- Administrative_Duration: Total time spent on administrative pages.
- Informational: Number of pages visited related to information-seeking.
- Informational_Duration: Total time spent on informational pages.
- ProductRelated: Number of pages visited related to product browsing.
- ProductRelated_Duration: Total time spent on product pages.
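- BounceRates: Average bounce rate of the pages visited in the session. A page's bounce rate is the percentage of visitors who enter the site on that page and leave without triggering any further requests.
- ExitRates: Average exit rate of the pages visited in the session. A page's exit rate is the percentage of its page views that were the last in a session.
- PageValues: Average page value of the pages visited in the session, a Google Analytics metric reflecting how much a page contributed to e-commerce transactions.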
- SpecialDay: Closeness of the browsing date to a special day such as Mother’s Day or Valentine’s Day, on a scale from 0 (not close) to 1 (closest).
Categorical Variables:
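- Month: Month in which the session took place, stored as a short name (e.g., Feb, Mar, Nov).
- OperatingSystems: User’s operating system, stored as an integer code.
- Browser: User’s browser type, stored as an integer code.
- Region: Geographic region of the user, stored as an integer code.
- TrafficType: Traffic source type, stored as an integer code.
- VisitorType: Whether the visitor is returning, new, or other (Returning_Visitor, New_Visitor, Other).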
- Weekend: Boolean value indicating if the visit occurred on a weekend.
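The feature list above can be turned into a small typed loader. The following is a minimal sketch using only the Python standard library. It assumes the CSV headers match the feature names listed above and that Weekend is stored as TRUE/FALSE text. `parse_row` is a hypothetical helper name, and the two sample rows are made up for illustration, not taken from the real file.

```python
import csv
import io

# Column groups, following the feature list above.
COUNT_COLS = ["Administrative", "Informational", "ProductRelated"]
FLOAT_COLS = [
    "Administrative_Duration", "Informational_Duration",
    "ProductRelated_Duration", "BounceRates", "ExitRates",
    "PageValues", "SpecialDay",
]
CATEGORICAL_COLS = [
    "Month", "OperatingSystems", "Browser", "Region",
    "TrafficType", "VisitorType",
]


def parse_row(raw):
    """Convert one csv.DictReader row (all strings) into typed values."""
    row = {}
    for col in COUNT_COLS:
        row[col] = int(raw[col])
    for col in FLOAT_COLS:
        row[col] = float(raw[col])
    for col in CATEGORICAL_COLS:
        # Keep category codes as labels so they are not treated as quantities.
        row[col] = raw[col]
    # Assumption: Weekend is stored as the text TRUE or FALSE.
    row["Weekend"] = raw["Weekend"].strip().upper() == "TRUE"
    return row


# Made-up sample rows for illustration only (not real data).
SAMPLE_CSV = (
    "Administrative,Administrative_Duration,Informational,"
    "Informational_Duration,ProductRelated,ProductRelated_Duration,"
    "BounceRates,ExitRates,PageValues,SpecialDay,Month,OperatingSystems,"
    "Browser,Region,TrafficType,VisitorType,Weekend\n"
    "0,0.0,0,0.0,1,0.0,0.2,0.2,0.0,0.0,Feb,1,1,1,1,Returning_Visitor,FALSE\n"
    "2,53.0,0,0.0,14,610.5,0.0,0.02,12.5,0.0,Nov,2,2,3,2,New_Visitor,TRUE\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(rows[1]["ProductRelated"], rows[1]["Weekend"])
```

Keeping OperatingSystems, Browser, Region, and TrafficType as labels rather than numbers avoids accidentally averaging or summing category codes later.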
3. Target Variable
- Revenue: A binary variable indicating whether the session ended with a purchase (TRUE, typically encoded as 1) or not (FALSE, typically encoded as 0).
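As a rough illustration of how the target might be prepared for analysis, the sketch below maps the raw Revenue value to 1 or 0 and computes the share of sessions that ended with a purchase. The function names `encode_revenue` and `purchase_rate` are hypothetical, the accepted spellings (TRUE/FALSE or 1/0) are an assumption about how the column may be stored, and the sample labels are made up rather than taken from the dataset.

```python
def encode_revenue(value):
    """Map a raw Revenue value to 1 (purchase) or 0 (no purchase)."""
    text = str(value).strip().upper()
    if text in ("TRUE", "1"):
        return 1
    if text in ("FALSE", "0"):
        return 0
    raise ValueError(f"unexpected Revenue value: {value!r}")


def purchase_rate(labels):
    """Share of sessions that ended with a purchase."""
    encoded = [encode_revenue(v) for v in labels]
    return sum(encoded) / len(encoded)


# Made-up labels for illustration only (not real data).
sample_labels = ["FALSE", "FALSE", "TRUE", "FALSE"]
print(purchase_rate(sample_labels))  # 0.25
```

Checking the purchase rate early is useful because a strongly imbalanced target changes which evaluation metrics are meaningful for a predictive model.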
4. Key Use Cases
This dataset can be used for:
- E-commerce Analytics: Understanding user behavior and browsing patterns.
- Predictive Modeling: Building machine learning models to predict the likelihood of purchase.
- Behavioral Insights: Analyzing factors (like time spent on pages, bounce rates, and visitor type) that influence purchasing decisions.
- Personalization: Identifying trends for personalized recommendations or targeted marketing.
5. Example Visualizations for Power BI Dashboards
Power BI dashboards can be used to display insights such as:
- Conversion Rates by Month: Visualize how purchase likelihood changes across months.
- Page Interaction Analysis: Analyze the correlation between time spent on specific types of pages and the probability of purchase.
- Traffic Sources: Show the effectiveness of various traffic types in driving revenue.
- Visitor Type Comparison: Compare purchasing behavior between new and returning visitors.
- Region-Based Insights: Identify geographic regions with the highest conversion rates.
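Most of the visuals above reduce to the same aggregation: the share of sessions with a purchase, grouped by one categorical column. The sketch below shows that calculation in plain Python, which can be handy for checking dashboard numbers. `conversion_rate_by` is a hypothetical helper, each session is assumed to be a dict with a boolean Revenue field, and the sample sessions are made up for illustration.

```python
from collections import defaultdict


def conversion_rate_by(sessions, key):
    """Return {category: purchases / sessions} for the given column."""
    totals = defaultdict(int)
    purchases = defaultdict(int)
    for session in sessions:
        category = session[key]
        totals[category] += 1
        if session["Revenue"]:
            purchases[category] += 1
    return {cat: purchases[cat] / totals[cat] for cat in totals}


# Made-up sessions for illustration only (not real data).
sessions = [
    {"Month": "Nov", "VisitorType": "Returning_Visitor", "Revenue": True},
    {"Month": "Nov", "VisitorType": "New_Visitor", "Revenue": False},
    {"Month": "May", "VisitorType": "Returning_Visitor", "Revenue": False},
    {"Month": "May", "VisitorType": "Returning_Visitor", "Revenue": False},
]

print(conversion_rate_by(sessions, "Month"))  # {'Nov': 0.5, 'May': 0.0}
print(conversion_rate_by(sessions, "VisitorType"))
```

Passing "Region" or "TrafficType" as the key would give the corresponding breakdowns for the region and traffic-source visuals, assuming those fields are present in each session.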
6. Why It’s Suitable for Power BI
- The mix of numerical and categorical data is ideal for dynamic filtering and segmentation in Power BI.
- The data maps directly onto business questions, such as which segments convert best or how engagement differs between visitor types.
- At 12,330 rows, the dataset is small enough for Power BI to load in full, so visuals and filters should stay responsive.
Dataset Link: https://archive.ics.uci.edu/dataset/468/online+shoppers+purchasing+intention+dataset