This Power BI dashboard project analyzes health insurance data of patients across different regions of the United States. Using a dataset that includes individual medical profiles such as age, sex, BMI, number of children, smoking status, and insurance charges, the project aims to uncover patterns and insights that influence insurance costs. The data is categorized into four main U.S. regions—Northeast, Northwest, Southeast, and Southwest—enabling geographic comparisons. Through a series of three dashboards, this analysis explores demographic distribution, insurance cost patterns, and key risk factors affecting charges. The goal is to provide a clear, visual understanding of how personal and lifestyle attributes impact insurance expenses across the country.
The file used in this project has 2 Sheets.
- Insurance – likely holds patient insurance and medical profile data.
- regions – likely maps each entry to one of the four U.S. regions (Northeast, Northwest, Southeast, Southwest).
📘 Dataset Summary
✅ Sheet: Insurance
Contains patient-level insurance data with:
age,sex,bmi,children,smoker,region(already mapped), andcharges
✅ Sheet: regions
Maps U.S. states to broader regions like midwest, northwest, southeast, etc. (You may not need this since the region is already in the main data.)
You can get the dataset here: USA Insurance Data.xlsx
Project Overview:
Build 3 Dashboard to analyze the data using Power BI.
🟦 Dashboard 1: Demographic Overview

Focus: Understand patient demographics across U.S. regions.
Questions:
- What is the distribution of patients by region, gender, and age group?
- What is the average BMI and number of children per region?
- What is the gender distribution within each region?
- How does the age distribution vary across regions?
- What percentage of patients are smokers vs non-smokers in each region?
🟩 Dashboard 2: Insurance Cost Analysis
Focus: Explore how insurance charges vary by profile.
Questions:
- What is the average insurance charge by region?
- How do smokers vs. non-smokers compare in terms of average charges?
- Does the number of children affect the total insurance cost?
- What is the correlation between BMI and insurance charges?

This scatter plot visualizes the relationship between BMI (x-axis) and insurance charges (y-axis), with a trendline included. Here’s the conclusion from the chart:
📊 Conclusion:
- Positive Correlation:
The dotted trendline slopes upward, suggesting a mild positive correlation between BMI and insurance charges. As BMI increases, average charges tend to increase, though the trend is not very steep. - High Variability:
There’s a wide spread of charges across most BMI levels. For example, individuals with a BMI around 30 show charges ranging from under $10K to over $60K—indicating other factors also strongly influence costs (e.g., smoking, age, region). - Clustered Lower Charges:
Most of the data points are clustered below $20K, showing that the majority of patients incur relatively low charges regardless of BMI. - Outliers Drive the Trend:
A few high-cost outliers with high BMI are likely influencing the upward slope of the trendline. These might represent high-risk patients (e.g., obese smokers or elderly patients with conditions).
🧠 Insight:
While higher BMI is associated with higher insurance charges, it’s not a strong standalone predictor. Combining BMI with smoking status, age, and region would give more accurate insights into what drives insurance costs.
- >> Which age group or gender incurs the highest average charges?
🟥 Dashboard 3: Risk Profile & Cost Drivers
Focus: Identify high-risk profiles and factors driving up costs.
Questions:
- Which combinations (e.g., smoker + high BMI) lead to the highest charges?
- What are the top 10% costliest patients, and what traits do they share?
- How does BMI category (underweight, normal, overweight, obese) relate to insurance charges?
- Which regions have the highest density of high-cost patients?
- Is there a clear cost risk pattern based on age, BMI, and smoking status?
Answer for Question 3.2 What are the top 10% costliest patients, and what traits do they share?
Step 1: Calculate the 90th Percentile of Charges
Create a DAX measure to find the 90th percentile:
Charge_90thPercentile =
PERCENTILEX.INC(ALL(Insurance), Insurance[charges], 0.9)
This gives you the threshold value above which a patient is in the top 10% by cost.
🔧 Step 2: Identify Top 10% Patients
Create a calculated column to flag top 10% patients:
Top10Flag =
VAR TopThreshold =
PERCENTILEX.INC(ALL(Insurance), Insurance[charges], 0.9)
RETURN IF(Insurance[charges] >= TopThreshold, "Top 10%", "Others")
📊 Step 3: Create Comparison Visuals
Now analyze what makes them different:
🧩 1. Stacked Column Chart
- Axis:
Top10Flag - Legend:
smoker,sex, orregion - Values: Count of Patients
📈 2. Bar Chart for Average Age/BMI
- Axis:
Top10Flag - Value: Average of
age,bmi,children
🧾 3. Matrix Table
- Rows:
Top10Flag - Columns: Traits like
smoker,bmi category,region - Values: Average of
charges, Count of patients
📌 Interpretation You Can Give
Once the visuals are up, look for trends like:
- Most top 10% patients are smokers
- They often have BMI > 30 (Obese)
- Tend to be older
- May have more children or from a certain region
✅ Sample Summary You Might Report
The top 10% costliest patients typically exhibit traits such as high BMI (Obese), smoking status = yes, and age above 45. These individuals significantly skew overall insurance costs and represent a key group for risk-based pricing or wellness program targeting.
Q 3.5 : Which regions have the highest density of high-cost patients?
🔧 1. Calculate the 90th Percentile of Charges
Charge_90thPercentile =
PERCENTILEX.INC(ALL(Insurance), Insurance[charges], 0.9)
🔧 2. Create a Column to Flag High-Cost Patients
Is_High_Cost =
VAR threshold =
PERCENTILEX.INC(ALL(Insurance), Insurance[charges], 0.9)
RETURN IF(Insurance[charges] >= threshold, 1, 0)
🔧 3. Calculate Density per Region
Now create a DAX measure to compute % of high-cost patients per region:
HighCostDensity =
DIVIDE(
CALCULATE(COUNTROWS(Insurance), Insurance[Is_High_Cost] = 1),
COUNTROWS(Insurance)
)
➡️ Make sure this measure is evaluated in the context of each region (i.e., use it in a visual with region on axis).
📊 4. Suggested Visualizations
Visual Description
Bar Chart Axis: region
Value: HighCostDensity → Shows % of high-cost patients per region
Map (Filled or Bubble) Show relative density of high-cost patients geographically
Stacked Column Chart Axis: region
Legend: Is_High_Cost
Value: Count of Patients – Shows counts per region
📌 Interpretation
A region with a higher percentage of high-cost patients (e.g., 25% in Southwest vs 10% in Northeast) might be dealing with riskier profiles — higher BMI, smoking, older demographics.
