This dataset focuses on factors influencing the likelihood of heart attacks in young adults (ages 18–35) in India. It encompasses demographic, lifestyle, medical, and clinical data, offering a comprehensive resource for analyzing connections between these factors and heart attack risk in this age group.

The dataset is available on Kaggle: https://www.kaggle.com/datasets/ankushpanday1/heart-attack-in-youth-of-india
License: MIT

We are exploring this dataset for educational research purposes.


Key Features

Demographics

  • Age
  • Gender
  • Region (state or locality)
  • Urban/Rural residence
  • Socioeconomic Status (SES)

Lifestyle Factors

  • Smoking and alcohol consumption
  • Dietary preferences (vegetarian/non-vegetarian)
  • Physical activity levels
  • Screen time
  • Sleep duration

Medical History

  • Family history of heart disease
  • Diabetes and hypertension history
  • Cholesterol levels
  • Body Mass Index (BMI)
  • Stress levels

Clinical and Test Results

  • Blood pressure (systolic and diastolic)
  • Resting heart rate
  • Electrocardiogram (ECG) results
  • Chest pain type
  • Maximum heart rate during exercise
  • Exercise-induced angina
  • Blood oxygen levels (SpO₂)
  • Triglyceride levels

Target Variable

  • Heart Attack Likelihood: Yes/No

Potential Use Cases

1. Machine Learning for Risk Prediction

  • Train classification models (e.g., logistic regression, random forests, neural networks) to predict heart attack likelihood from the demographic, lifestyle, medical, and clinical features listed above.
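
As a rough illustration of the classification idea above, here is a minimal logistic regression sketch using scikit-learn. It runs on synthetic data, not the Kaggle file: the column names (`age`, `bmi`, `smoker`, `heart_attack`) and the relationship used to generate the target are assumptions made only so the example is self-contained, and the real dataset's column names may differ.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-in for the real file; column names are hypothetical.
df = pd.DataFrame({
    "age": rng.integers(18, 36, n),     # ages 18 to 35 inclusive
    "bmi": rng.normal(24, 4, n),
    "smoker": rng.integers(0, 2, n),    # 0 = no, 1 = yes
})

# Made-up relationship, used only to generate a binary target.
log_odds = -1.5 + 0.1 * (df["bmi"] - 24) + 0.8 * df["smoker"]
df["heart_attack"] = (rng.random(n) < 1 / (1 + np.exp(-log_odds))).astype(int)

X = df[["age", "bmi", "smoker"]]
y = df["heart_attack"]

# Hold out 25% of rows, keeping the class balance the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Evaluate on the held-out rows with ROC AUC.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Held-out ROC AUC: {auc:.3f}")
```

With the real data, the synthetic frame would be replaced by the loaded file, and categorical columns such as region or diet would need encoding before fitting.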

2. Public Health Insights

  • Identify high-risk groups, such as sedentary individuals with a family history of heart disease, to guide targeted interventions.

3. Lifestyle Recommendations

  • Analyze the effects of modifiable factors like diet, exercise, and sleep on heart attack risk.

4. Regional and Socioeconomic Analysis

  • Study disparities in heart attack risk based on geographic or socioeconomic differences within India.

Insights to Extract

1. Feature Interactions

  • Examine non-linear relationships and interaction effects, such as whether physical activity mitigates the effect of high BMI on heart attack risk.
    • Example: High-BMI individuals with active lifestyles may have a lower risk than their sedentary counterparts.
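
One simple way to look for the BMI and activity interaction described above is to compare outcome rates across the four combinations of the two factors. The sketch below uses synthetic data with hypothetical column names (`bmi`, `active`, `heart_attack`), and the BMI cutoff of 30 is an illustrative assumption, not something the dataset specifies.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 1000

# Synthetic stand-in; column names are hypothetical.
df = pd.DataFrame({
    "bmi": rng.normal(25, 5, n),
    "active": rng.integers(0, 2, n),        # 0 = sedentary, 1 = active
    "heart_attack": rng.integers(0, 2, n),  # random target, no real signal
})

# Illustrative cutoff for "high BMI" (an assumption).
df["high_bmi"] = df["bmi"] >= 30

# Share of positive outcomes in each (high_bmi, active) cell.
rates = (
    df.groupby(["high_bmi", "active"])["heart_attack"]
    .mean()
    .unstack("active")
)
print(rates.round(3))
```

If activity mitigates the effect of high BMI, the gap between the sedentary and active columns should be larger in the high-BMI row than in the other row. The synthetic target here is random, so no such pattern is expected in this output.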

2. Clustering

  • Use clustering techniques (e.g., K-Means, DBSCAN) to group individuals with similar risk profiles for tailored interventions.
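
A minimal K-Means sketch of the clustering idea above, using scikit-learn on synthetic data. The three features (BMI, resting heart rate, sleep duration) and the choice of three clusters are assumptions for illustration; with the real data, the number of clusters would need to be chosen with a method such as the elbow or silhouette criterion.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 300

# Synthetic stand-in with three hypothetical numeric features.
X = np.column_stack([
    rng.normal(24, 4, n),    # BMI
    rng.normal(72, 10, n),   # resting heart rate (bpm)
    rng.normal(7, 1.2, n),   # sleep duration (hours)
])

# K-Means is distance-based, so put features on a common scale first.
X_scaled = StandardScaler().fit_transform(X)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Summarize each cluster by its size and mean feature values (original units).
for k in range(3):
    members = X[labels == k]
    print(k, len(members), members.mean(axis=0).round(1))
```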

3. Latent Factors

  • Apply dimensionality reduction techniques like PCA to uncover hidden health patterns influencing heart attack risk.
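
To make the PCA step concrete, the sketch below computes principal components directly from the singular value decomposition of standardized data, using only NumPy. The data is synthetic: cholesterol and triglycerides are generated from a shared hidden "lipid" factor purely as an assumption, so that one component has something to pick up.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400

# Synthetic stand-in: two columns share a hidden factor (an assumption).
lipid = rng.normal(0, 1, n)
X = np.column_stack([
    200 + 30 * lipid + rng.normal(0, 10, n),  # cholesterol
    150 + 40 * lipid + rng.normal(0, 15, n),  # triglycerides
    rng.normal(72, 10, n),                    # resting heart rate
    rng.normal(7, 1.2, n),                    # sleep hours
])

# Standardize each column, then take the SVD of the standardized matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

# Squared singular values are proportional to the variance per component.
explained = S**2 / np.sum(S**2)

# Project each row onto the first two principal directions.
scores = Z @ Vt[:2].T

print("Explained variance ratio:", explained.round(3))
print("First component loadings:", Vt[0].round(2))
```

The loadings in `Vt[0]` show which original features move together along the first component, which is how a latent pattern would be read off in practice.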

4. Causal Relationships

  • Use causal inference methods to estimate whether specific factors (e.g., stress) influence others (e.g., cholesterol levels), stating the assumptions under which the observational data supports a causal reading.

5. Regional Variation

  • Conduct geospatial analysis to study how environmental and cultural factors across regions affect lifestyle, health indicators, and heart attack risk.

Example Insights to Explore

  • Stress and Sleep Interaction: Do high-stress individuals with inadequate sleep show significantly higher heart attack risk?
  • Exercise-Induced Angina: How does exercise-induced angina correlate with abnormal ECG results and high cholesterol?
  • Gender-Specific Risk Factors: Are there differences in risk factor importance, such as smoking or BMI, between men and women?
  • Dietary Impact: How do vegetarian and non-vegetarian diets differ in their association with triglyceride levels and heart attack risk?
  • Rare Patterns: Identify rare combinations, such as individuals with low SpO₂ and normal BMI but high heart attack likelihood.
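
The "rare patterns" question above can be approached with plain boolean filtering. The sketch below uses synthetic data and hypothetical column names (`spo2`, `bmi`, `heart_attack`); the thresholds (SpO₂ below 94%, BMI between 18.5 and 24.9) are illustrative assumptions and should be replaced by whatever definitions the analysis adopts.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 1000

# Synthetic stand-in; column names are hypothetical.
df = pd.DataFrame({
    "spo2": np.minimum(rng.normal(97, 2, n), 100),  # cap at 100%
    "bmi": rng.normal(24, 4, n),
    "heart_attack": rng.integers(0, 2, n),          # 1 = Yes, 0 = No
})

# Illustrative thresholds (assumptions, not taken from the dataset).
low_spo2 = df["spo2"] < 94
normal_bmi = df["bmi"].between(18.5, 24.9)
positive = df["heart_attack"] == 1

# Rows that satisfy all three conditions at once.
rare = df[low_spo2 & normal_bmi & positive]
print(f"{len(rare)} of {len(df)} rows ({len(rare) / len(df):.1%}) match")
```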

Starting Points for Analysis

  1. Correlation Analysis:
    • Assess relationships between individual features and heart attack likelihood.
  2. Predictive Modeling:
    • Build models like Random Forest or Gradient Boosting to predict outcomes and analyze feature importance.
  3. Trend Analysis:
    • If the dataset includes time-series data, investigate trends in key health metrics (e.g., cholesterol, stress levels) over time.
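
For the correlation analysis starting point, a first pass could rank numeric features by the strength of their Pearson correlation with the target. The sketch below uses synthetic data; the column names and the relationship used to generate `heart_attack` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 800

# Synthetic stand-in; column names are hypothetical.
df = pd.DataFrame({
    "cholesterol": rng.normal(190, 30, n),
    "stress": rng.integers(1, 11, n),      # 1 to 10 scale (assumed)
    "sleep_hours": rng.normal(7, 1.2, n),
})

# Made-up relationship, used only to generate a binary target.
log_odds = -1.0 + 0.02 * (df["cholesterol"] - 190) + 0.15 * (df["stress"] - 5)
df["heart_attack"] = (rng.random(n) < 1 / (1 + np.exp(-log_odds))).astype(int)

# Pearson correlation of each feature with the 0/1 target,
# sorted by absolute strength.
corr = (
    df.corr()["heart_attack"]
    .drop("heart_attack")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(corr.round(3))
```

Pearson correlation against a binary target captures only linear association, so a weak coefficient does not rule out a non-linear or interaction effect.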

This dataset provides a foundation for exploring these questions and for supporting research into heart health in young adults.