Logistic Regression on Study Hours vs Pass or Fail Data

Categories: Machine Learning

Here is an example of fitting and evaluating a logistic regression classifier in Python using scikit-learn.

Sample Data

We’ll create a small dataset of eight students to predict whether a student passes (1) or fails (0) based on their hours of study and whether they took a preparatory course (1 = yes, 0 = no).

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Step 1: Prepare the data
data = {
    "Hours_of_Study": [2, 3, 5, 1, 4, 6, 1.5, 3.5],
    "Preparatory_Course": [0, 1, 1, 0, 0, 1, 0, 1],
    "Pass": [0, 1, 1, 0, 1, 1, 0, 1]
}

df = pd.DataFrame(data)

# Features (X) and target (y)
X = df[["Hours_of_Study", "Preparatory_Course"]]
y = df["Pass"]

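# Step 2: Split the data
# With only 8 rows, test_size=0.2 leaves just 2 test samples,
# so the accuracy reported below is illustrative rather than reliable.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)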

# Step 3: Fit a Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

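# Step 4: Predict on the held-out test set
y_pred = model.predict(X_test)

# Evaluate the predictions
accuracy = accuracy_score(y_test, y_pred)
# labels=[0, 1] keeps the matrix 2x2 even if the tiny test set contains only one class
conf_matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])

# Print results
print("Predictions:", y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)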

# Step 5: Coefficients of the model
print("Intercept:", model.intercept_)
print("Coefficients:", model.coef_)
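
The intercept and coefficients printed above define the fitted model: the predicted pass probability is the sigmoid of intercept + coefficients · features. The sketch below illustrates this by computing the probability by hand and comparing it with predict_proba. It is not part of the original example and rests on two assumptions: the model is refit on all eight rows to keep the block self-contained, and new_student (2.5 hours, took the course) is a hypothetical input.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Same sample data as in the article
df = pd.DataFrame({
    "Hours_of_Study": [2, 3, 5, 1, 4, 6, 1.5, 3.5],
    "Preparatory_Course": [0, 1, 1, 0, 0, 1, 0, 1],
    "Pass": [0, 1, 1, 0, 1, 1, 0, 1],
})
X = df[["Hours_of_Study", "Preparatory_Course"]]
y = df["Pass"]

# Assumption: fit on all rows for illustration (the article fits on the training split)
model = LogisticRegression()
model.fit(X, y)

# Hypothetical new student: 2.5 hours of study, took the preparatory course
new_student = pd.DataFrame({"Hours_of_Study": [2.5], "Preparatory_Course": [1]})

# Probability of class 1 (pass) as reported by scikit-learn
proba_sklearn = model.predict_proba(new_student)[0, 1]

# The same probability computed by hand: sigmoid(intercept + coef . x)
z = model.intercept_[0] + np.dot(model.coef_[0], new_student.iloc[0].to_numpy())
proba_manual = 1.0 / (1.0 + np.exp(-z))

# exp(coefficient) is the multiplicative change in the odds of passing
# for a one-unit increase in that feature, holding the other fixed
odds_ratios = np.exp(model.coef_[0])

print("P(pass) from predict_proba:", proba_sklearn)
print("P(pass) computed manually: ", proba_manual)
print("Odds ratios (hours, prep course):", odds_ratios)
```

The two probabilities should agree up to floating-point error. Note that LogisticRegression applies L2 regularization by default, so the coefficients are shrunk toward zero compared with an unregularized fit.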

Download the file here
https://colorstech.net/wp-content/uploads/2024/11/studyhours.csv