Dataset

Overview

| Property | Value |
| --- | --- |
| Source | Kaggle – Heart Disease UCI |
| Samples | 1,025 patients |
| Features | 13 clinical attributes |
| Target | Binary (1 = heart disease present, 0 = absent) |

The two classes are roughly balanced (~51% / 49%), so accuracy is a meaningful metric here: a classifier that always predicts the majority class scores only about 51%, and the usual pitfalls of imbalanced classification do not apply.
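
The reasoning behind the balance claim can be checked with a few lines of Python. The sketch below computes each class's share and the accuracy of a trivial majority-class predictor, which is the floor any real model should beat. The helper names `class_balance` and `majority_baseline_accuracy` are illustrative, and the label list is made-up data that only mimics the ~51% / 49% split; it is not the real dataset.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the labels as fractions that sum to 1."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def majority_baseline_accuracy(labels):
    """Accuracy of a classifier that always predicts the most common class."""
    return max(class_balance(labels).values())


# Illustrative labels only (not the real data): 51 positives, 49 negatives.
example_labels = [1] * 51 + [0] * 49

shares = class_balance(example_labels)
baseline = majority_baseline_accuracy(example_labels)

print(shares)
print(f"majority-class baseline accuracy: {baseline:.2f}")
```

With a split this close to even, the baseline sits near 0.5, so a reported accuracy well above that reflects real predictive signal rather than class skew.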

Features

| Feature | Description |
| --- | --- |
| age | Age in years |
| sex | Sex (1 = male, 0 = female) |
| cp | Chest pain type (0–3) |
| trestbps | Resting blood pressure (mm Hg) |
| chol | Serum cholesterol (mg/dl) |
| fbs | Fasting blood sugar > 120 mg/dl (1 = true) |
| restecg | Resting ECG results (0–2) |
| thalach | Maximum heart rate achieved |
| exang | Exercise-induced angina (1 = yes) |
| oldpeak | ST depression induced by exercise relative to rest |
| slope | Slope of the peak exercise ST segment |
| ca | Number of major vessels colored by fluoroscopy (0–3) |
| thal | Thalassemia type (0 = normal, 1 = fixed defect, 2 = reversible defect) |
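
The feature list above can double as a lightweight schema check before training. The following is a minimal sketch using only the standard library. It assumes the label column is named `target` (the column name is not stated here), that coded columns hold integer strings, and it checks only the value ranges written in the table. The helper `check_rows` and the two sample rows are hypothetical and made up for illustration.

```python
import csv
import io

FEATURES = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal",
]
TARGET = "target"  # assumption: the label column name is not given in the text

# Only the coded ranges that the feature table states explicitly.
DOCUMENTED_VALUES = {
    "sex": {0, 1},
    "cp": {0, 1, 2, 3},
    "fbs": {0, 1},
    "restecg": {0, 1, 2},
    "exang": {0, 1},
    "ca": {0, 1, 2, 3},
    "thal": {0, 1, 2},
}


def check_rows(csv_text):
    """Return (row_number, column, value) for each value outside its documented range."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in FEATURES + [TARGET] if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    problems = []
    for row_number, row in enumerate(reader, start=1):
        for column, allowed in DOCUMENTED_VALUES.items():
            value = int(row[column])
            if value not in allowed:
                problems.append((row_number, column, value))
    return problems


# Made-up rows for illustration; the second has a thal code outside 0-2.
sample = """age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
60,1,2,130,240,0,1,150,0,1.2,1,0,2,1
45,0,1,120,200,0,0,170,0,0.0,2,1,3,0
"""

problems = check_rows(sample)
print(problems)
```

The check reports out-of-range values instead of dropping rows, so any mismatch between the file and the documented codes stays visible and can be handled deliberately.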

Reference

Detrano, R., Janosi, A., Steinbrunn, W., Pfisterer, M., Schmid, J., Sandhu, S., Guppy, K., Lee, S., & Froelicher, V. (1989). International application of a new probability algorithm for the diagnosis of coronary artery disease. American Journal of Cardiology, 64(5), 304–310.