
Here’s the dataset schema:

| Column Name | Type | Description |
| --- | --- | --- |
| Review_ID | ID | Unique identifier for each review |
| Rating | Numerical | Star rating given by the customer (1–5) |
| Review_Length | Numerical | Number of words in the review |
| Helpful_Votes | Numerical | Count of “helpful” votes received |
| Days_Since_Purchase | Numerical | Days passed since purchase when review was written |
| Product_Category | Categorical | Category of product reviewed (Electronics, Fashion, Home, Books, Beauty, Sports, Grocery) |
| Review_Sentiment | Categorical | Sentiment of review (Positive, Neutral, Negative) |
| Verified_Purchase | Categorical | Whether the review is from a verified purchase (Yes/No) |
| Customer_Location | Categorical | Customer’s country/region (USA, UK, India, Canada, Australia, Germany, UAE) |
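
The schema maps naturally onto typed records. Here is a minimal sketch of loading it, assuming the data ships as a CSV whose header matches the column names listed in the schema and whose numerical columns hold whole numbers. The `Review` dataclass and the `load_reviews` helper are hypothetical names used for illustration, and the two sample rows are made up for the demo, not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Review:
    review_id: str            # kept as text; the ID format is not specified
    rating: int               # star rating, 1-5
    review_length: int        # number of words
    helpful_votes: int
    days_since_purchase: int
    product_category: str
    review_sentiment: str
    verified_purchase: bool   # "Yes" -> True, anything else -> False
    customer_location: str


def load_reviews(handle):
    """Parse CSV rows from an open text file into typed Review records."""
    reviews = []
    for row in csv.DictReader(handle):
        reviews.append(Review(
            review_id=row["Review_ID"],
            # Assumption: numerical columns are stored as whole numbers.
            rating=int(row["Rating"]),
            review_length=int(row["Review_Length"]),
            helpful_votes=int(row["Helpful_Votes"]),
            days_since_purchase=int(row["Days_Since_Purchase"]),
            product_category=row["Product_Category"],
            review_sentiment=row["Review_Sentiment"],
            verified_purchase=row["Verified_Purchase"].strip().lower() == "yes",
            customer_location=row["Customer_Location"],
        ))
    return reviews


# Made-up rows for illustration only; not taken from the real dataset.
SAMPLE_CSV = """Review_ID,Rating,Review_Length,Helpful_Votes,Days_Since_Purchase,Product_Category,Review_Sentiment,Verified_Purchase,Customer_Location
R001,5,42,3,7,Electronics,Positive,Yes,USA
R002,2,15,0,30,Books,Negative,No,India
"""

reviews = load_reviews(io.StringIO(SAMPLE_CSV))
print(reviews[0])
```

With the downloaded file, the same function can be given a real file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever you saved the CSV.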

Purpose of Dataset
This dataset can be used to analyze customer behavior, review credibility, and product performance. It supports sentiment analysis, detection of fake reviews (fraud), and product quality tracking.

Get the dataset here: https://github.com/slidescope/E-commerce-Customer-Reviews-dataset-for-Sentiment-Analysis-and-Machine-Learning/


🔍 Sample Questions to Solve

  1. What is the average rating across product categories?
  2. Do verified purchases tend to have more positive reviews than unverified ones?
  3. Which product category receives the longest reviews on average?
  4. Does review length correlate with helpful votes?
  5. Which country’s customers give the highest ratings on average?
  6. Are negative reviews more common when more days have passed since purchase?
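
Several of these questions reduce to a grouped average, a grouped share, or a correlation. The example below is a minimal sketch using only the Python standard library. It assumes rows are dictionaries keyed by the schema's column names with numerical fields already converted to integers. The helper names `mean_by`, `positive_share_by`, and `pearson` are hypothetical, and the five sample rows are made up for illustration, so the numbers they produce say nothing about the real dataset.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Made-up rows for illustration only; not taken from the real dataset.
ROWS = [
    {"Product_Category": "Books", "Rating": 5, "Review_Length": 10,
     "Helpful_Votes": 1, "Review_Sentiment": "Positive", "Verified_Purchase": "Yes"},
    {"Product_Category": "Books", "Rating": 3, "Review_Length": 20,
     "Helpful_Votes": 3, "Review_Sentiment": "Neutral", "Verified_Purchase": "Yes"},
    {"Product_Category": "Electronics", "Rating": 4, "Review_Length": 30,
     "Helpful_Votes": 2, "Review_Sentiment": "Positive", "Verified_Purchase": "No"},
    {"Product_Category": "Electronics", "Rating": 2, "Review_Length": 40,
     "Helpful_Votes": 5, "Review_Sentiment": "Negative", "Verified_Purchase": "No"},
    {"Product_Category": "Electronics", "Rating": 5, "Review_Length": 50,
     "Helpful_Votes": 4, "Review_Sentiment": "Positive", "Verified_Purchase": "Yes"},
]


def mean_by(rows, group_col, value_col):
    """Average of value_col within each distinct value of group_col (questions 1, 3, 5)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return {key: mean(values) for key, values in groups.items()}


def positive_share_by(rows, group_col):
    """Fraction of reviews labelled Positive within each group (question 2)."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        totals[row[group_col]] += 1
        if row["Review_Sentiment"] == "Positive":
            positives[row[group_col]] += 1
    return {key: positives[key] / totals[key] for key in totals}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences (question 4)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / sqrt(var_x * var_y)


avg_rating = mean_by(ROWS, "Product_Category", "Rating")
positive_share = positive_share_by(ROWS, "Verified_Purchase")
length_votes_r = pearson(
    [row["Review_Length"] for row in ROWS],
    [row["Helpful_Votes"] for row in ROWS],
)

print(avg_rating)
print(positive_share)
print(length_votes_r)
```

On full rows, the same `mean_by` helper covers question 3 with `mean_by(rows, "Product_Category", "Review_Length")` and question 5 with `mean_by(rows, "Customer_Location", "Rating")`. Note that a correlation coefficient only describes a linear association; it does not show that longer reviews cause more helpful votes.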

The E-commerce Customer Reviews dataset captures customer feedback behavior across multiple product categories. It contains 250 entries, each uniquely identified by a Review ID. Four numerical features (Rating on a 1–5 scale, Review Length in words, Helpful Votes, and Days Since Purchase) capture the quantifiable aspects of each review. Four categorical features (Product Category, Review Sentiment, Verified Purchase status, and Customer Location) add qualitative context. This mix of numerical and categorical data suits sentiment analysis, customer satisfaction studies, detection of fake reviews, and trend analysis by customer location or product category, all of which are practical business intelligence tasks.
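
The properties stated in this summary (row count, unique IDs, rating range, sentiment labels) are worth checking before any analysis. The following is a minimal sketch, assuming rows are dictionaries keyed by the schema's column names with `Rating` already converted to an integer. `check_rows` is a hypothetical helper, and the sample rows are made up for illustration.

```python
ALLOWED_SENTIMENTS = {"Positive", "Neutral", "Negative"}


def check_rows(rows, expected_count):
    """Return a list of human-readable problems; an empty list means every check passed."""
    problems = []

    if len(rows) != expected_count:
        problems.append(f"expected {expected_count} rows, found {len(rows)}")

    ids = [row["Review_ID"] for row in rows]
    if len(set(ids)) != len(ids):
        problems.append("Review_ID values are not unique")

    for row in rows:
        if not 1 <= row["Rating"] <= 5:
            problems.append(f"{row['Review_ID']}: rating {row['Rating']} is outside 1-5")
        if row["Review_Sentiment"] not in ALLOWED_SENTIMENTS:
            problems.append(
                f"{row['Review_ID']}: unknown sentiment {row['Review_Sentiment']!r}"
            )

    return problems


# Made-up rows for illustration only; not taken from the real dataset.
GOOD_ROWS = [
    {"Review_ID": "R001", "Rating": 5, "Review_Sentiment": "Positive"},
    {"Review_ID": "R002", "Rating": 2, "Review_Sentiment": "Negative"},
]
BAD_ROWS = [
    {"Review_ID": "R001", "Rating": 6, "Review_Sentiment": "Positive"},
    {"Review_ID": "R001", "Rating": 3, "Review_Sentiment": "Mixed"},
]

good_problems = check_rows(GOOD_ROWS, expected_count=2)
bad_problems = check_rows(BAD_ROWS, expected_count=2)
print(good_problems)
print(bad_problems)
```

For the full file, calling `check_rows(rows, expected_count=250)` tests the row count given above.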