Exploring the Personal Finance & Demographics JSON Dataset

Categories: Practice Datasets


Data is the fuel of the modern digital world. To practice data analytics, machine learning, and visualization techniques, we often need well-structured datasets that combine numerical and categorical variables. Today, we introduce the Personal Finance & Demographics Dataset, a synthetic JSON dataset with 200 records (rows) and 9 fields (columns), designed to mimic real-world patterns of income, expenses, and demographics.

This dataset can be used by students, data enthusiasts, and professionals to learn how to analyze consumer behavior, build predictive models, and create dashboards in tools like Python, R, Tableau, or Power BI.

Personal Finance and Demographic Details JSON Dataset Snippet:

[
  {
    "id": 1,
    "age": 34,
    "income": 52000,
    "expenses": 21000,
    "savings": 8000,
    "gender": "Female",
    "region": "North",
    "marital_status": "Married",
    "occupation": "Engineer"
  },
  {
    "id": 2,
    "age": 28,
    "income": 45000,
    "expenses": 18000,
    "savings": 6000,
    "gender": "Male",
    "region": "South",
    "marital_status": "Single",
    "occupation": "Teacher"
  },
  {
    "id": 3,
    "age": 42,
    "income": 75000,
    "expenses": 32000,
    "savings": 15000,
    "gender": "Female",
    "region": "West",
    "marital_status": "Married",
    "occupation": "Doctor"
  }
  ...
]

Full Dataset Here: https://github.com/slidescope/Exploring-the-Personal-Finance-Demographics-JSON-Dataset/blob/main/personal_finance_dataset.json
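
As a starting point, here is a minimal sketch of loading the file with Python's standard library. It assumes you have downloaded the file and saved it locally as personal_finance_dataset.json (the filename at the end of the link above). The function names parse_records and load_records are illustrative, not part of the dataset. So that the example runs without the download, it parses two records copied from the snippet instead of reading from disk.

```python
import json
from pathlib import Path

# Assumed local filename, taken from the last part of the URL above.
DATA_PATH = Path("personal_finance_dataset.json")


def parse_records(text):
    """Parse JSON text and return the list of record dicts."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


def load_records(path=DATA_PATH):
    """Read the dataset file from disk and parse it."""
    return parse_records(Path(path).read_text(encoding="utf-8"))


# Two records copied from the snippet, so the example runs
# without the downloaded file.
SAMPLE_TEXT = """
[
  {"id": 1, "age": 34, "income": 52000, "expenses": 21000,
   "savings": 8000, "gender": "Female", "region": "North",
   "marital_status": "Married", "occupation": "Engineer"},
  {"id": 2, "age": 28, "income": 45000, "expenses": 18000,
   "savings": 6000, "gender": "Male", "region": "South",
   "marital_status": "Single", "occupation": "Teacher"}
]
"""

sample = parse_records(SAMPLE_TEXT)
print(len(sample), "records; fields:", sorted(sample[0]))
```

Once the file is on disk, calling load_records() should return the full list of records in the same shape.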

📂 Structure of the Dataset

The dataset contains 9 columns:

1. id

Type: Numerical (Integer), unique identifier
Description: Each record has a unique id from 1 to 200.
Purpose: To differentiate individuals.

2. age

Type: Numerical (Integer)
Example: 28, 34, 42
Description: Represents the age of the individual.
Purpose: Useful for studying age-wise trends in income, expenses, and savings.

3. income

Type: Numerical (Integer, in USD)
Example: 45000, 52000, 75000
Description: The annual income of the individual.
Purpose: Allows analysis of financial well-being and comparison across demographics.

4. expenses

Type: Numerical (Integer, in USD)
Example: 18000, 21000, 32000
Description: The yearly expenses of the individual.
Purpose: Helps measure spending habits and calculate savings.

5. savings

Type: Numerical (Integer, in USD)
Example: 6000, 8000, 15000
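Description: The amount of money the individual saves in a year. In the sample records this is lower than income minus expenses (for record 1, 52000 - 21000 = 31000, while savings is 8000), so treat savings as a separately recorded value rather than a strict difference.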
Purpose: Useful for studying financial stability and wealth accumulation.
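
One way to see how savings relates to income and expenses is to compute the difference directly. The sketch below uses the three records from the snippet, trimmed to the financial fields. The derived field names surplus and savings_rate are assumptions introduced here for illustration and do not exist in the dataset.

```python
# The three records from the snippet, trimmed to the financial fields.
records = [
    {"id": 1, "income": 52000, "expenses": 21000, "savings": 8000},
    {"id": 2, "income": 45000, "expenses": 18000, "savings": 6000},
    {"id": 3, "income": 75000, "expenses": 32000, "savings": 15000},
]


def add_derived(record):
    """Return a copy of the record with two illustrative derived fields."""
    out = dict(record)
    # Money left after expenses, regardless of what was recorded as saved.
    out["surplus"] = record["income"] - record["expenses"]
    # Share of income that was recorded as savings.
    out["savings_rate"] = record["savings"] / record["income"]
    return out


enriched = [add_derived(r) for r in records]
for r in enriched:
    print(r["id"], r["surplus"], r["savings"], f"{r['savings_rate']:.1%}")
```

Comparing surplus with savings across all 200 records is a quick way to check how the two columns relate before building anything on top of them.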

6. gender

Type: Categorical (Male / Female / Other)
Example: "Male", "Female"
Description: Gender of the individual.
Purpose: Enables demographic segmentation.

7. region

Type: Categorical (North, South, East, West)
Example: "North", "South"
Description: The geographic region where the individual lives.
Purpose: Useful for regional analysis of income and lifestyle.

8. marital_status

Type: Categorical (Single / Married / Divorced)
Example: "Single", "Married"
Description: Marital status of the individual.
Purpose: Helps analyze spending and savings patterns across different family types.

9. occupation

Type: Categorical (Engineer, Teacher, Doctor, Student, Business, etc.)
Example: "Engineer", "Doctor"
Description: The professional background of the individual.
Purpose: Enables occupation-based income and lifestyle comparisons.
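
The column descriptions above can be turned into a simple sanity check before analysis. This is a rough sketch, not an official schema: validate_record, EXPECTED_TYPES, and ALLOWED_VALUES are names made up for this example. Occupation values are not checked because the list above ends in "etc.", so the full set is not known from this article.

```python
# Field types as described in the column list above.
EXPECTED_TYPES = {
    "id": int, "age": int, "income": int, "expenses": int, "savings": int,
    "gender": str, "region": str, "marital_status": str, "occupation": str,
}

# Categories listed in the article; occupation is open-ended, so it is omitted.
ALLOWED_VALUES = {
    "gender": {"Male", "Female", "Other"},
    "region": {"North", "South", "East", "West"},
    "marital_status": {"Single", "Married", "Divorced"},
}


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif type(record[field]) is not expected:
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    for field, allowed in ALLOWED_VALUES.items():
        value = record.get(field)
        if isinstance(value, str) and value not in allowed:
            problems.append(f"{field}: unexpected value {value!r}")
    for field in record:
        if field not in EXPECTED_TYPES:
            problems.append(f"unknown field: {field}")
    return problems


good = {
    "id": 1, "age": 34, "income": 52000, "expenses": 21000, "savings": 8000,
    "gender": "Female", "region": "North", "marital_status": "Married",
    "occupation": "Engineer",
}

# A deliberately broken copy: wrong type, unknown category, missing field.
bad = {k: v for k, v in good.items() if k != "occupation"}
bad["age"] = "34"
bad["region"] = "Central"

print(validate_record(good))
print(validate_record(bad))
```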

🎯 Use Cases

With this dataset, you can:

Find income and savings patterns across age groups
Analyze regional differences in financial behavior
Study how marital status impacts expenses
Compare occupations based on average income
Build predictive models for savings estimation
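
Several of these use cases come down to grouping records and averaging a numeric field. Here is a small sketch using only the standard library. The helper names mean_by and age_band, and the choice of 10-year age bands, are assumptions for illustration. It runs on the three snippet records only, so the numbers show the mechanics and say nothing about the full dataset.

```python
from collections import defaultdict

# The three records from the snippet, trimmed to the fields used here.
records = [
    {"age": 34, "expenses": 21000, "savings": 8000, "marital_status": "Married"},
    {"age": 28, "expenses": 18000, "savings": 6000, "marital_status": "Single"},
    {"age": 42, "expenses": 32000, "savings": 15000, "marital_status": "Married"},
]


def age_band(record):
    """Map an age to a 10-year band label such as '30-39'."""
    low = record["age"] // 10 * 10
    return f"{low}-{low + 9}"


def mean_by(rows, key_func, value_field):
    """Group rows by key_func(row) and average value_field in each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key_func(row)].append(row[value_field])
    return {key: sum(values) / len(values) for key, values in groups.items()}


expenses_by_status = mean_by(records, lambda r: r["marital_status"], "expenses")
savings_by_age = mean_by(records, age_band, "savings")

print(expenses_by_status)
print(savings_by_age)
```

The same helper works for region or occupation by swapping the key function.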

🔍 Why Is This Dataset Useful?

This dataset is simple yet rich enough to cover:

Descriptive Analytics (averages, distributions, group comparisons)
Data Visualization (charts by region, occupation, gender)
Predictive Analytics (regression to predict savings from income/expenses)
Segmentation (clustering customers into financial personas)

Although synthetic, it behaves like a small real-world dataset, which makes it well suited to practice projects, assignments, and workshops.
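
To make the predictive analytics item concrete, here is a minimal sketch of a one-variable least-squares fit that predicts savings from income. The function name fit_line is hypothetical, and the fit uses only the three snippet records, so the coefficients are for demonstration and should not be read as a finding about the dataset. In practice you would fit on all 200 records, probably with a library such as scikit-learn or statsmodels, and include expenses as a second predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Income and savings from the three snippet records.
incomes = [52000, 45000, 75000]
savings = [8000, 6000, 15000]

slope, intercept = fit_line(incomes, savings)


def predict_savings(income):
    """Predict savings for a given income using the fitted line."""
    return slope * income + intercept


print(f"slope={slope:.4f}, intercept={intercept:.1f}")
print(f"predicted savings at income 60000: {predict_savings(60000):.0f}")
```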