Prediction Of Heart Disease

Abstract: Cardiovascular disease is the leading cause of death globally. Diagnosing this disease is typically difficult, and existing diagnostic methods do not always achieve the required accuracy. In this study, we propose a methodology for the automated diagnosis of normal and coronary artery disease (CAD) conditions using heart rate variability (HRV) features extracted from the electrocardiogram (ECG). Principal component analysis (PCA) is used to reduce the dimension of the extracted features, which lowers the computational complexity and reveals hidden information in the data. Finally, a support vector machine (SVM) classifier, optimized to improve accuracy, is used to separate the two classes of data using the extracted features.


Cardiovascular diseases (CVDs) are the major cause of death around the world. According to the World Health Organization (WHO), approximately 17.7 million people died from CVDs in 2015, representing 31% of all global deaths. The European Heart Network and the European Society of Cardiology estimate that over 4 million people died of CVDs in Europe and 1.9 million in the European Union (EU), which correspond to 47% and 40% of all deaths, respectively. The human heart is the most crucial and hardest-working organ of the body; together with the blood vessels it forms the cardiovascular system. Heart disease is caused by disorders of the heart and blood vessels, which result in coronary artery disease (CAD). CAD refers to a group of diseases that includes stable angina, unstable angina and sudden cardiac death. In many cases the first sign is a heart attack, or another condition such as heart failure or an abnormal heartbeat. CAD affects all ages, and the rates are higher among men than women. Early detection of CAD is therefore important to reduce the number of affected patients.

Diagnosis of CAD generally begins after one of the common symptoms of the disease appears, such as a heart attack or a sudden cardiac arrest. The general diagnostic tests include the stress test, electrocardiogram (ECG), echocardiography, coronary angiography and cardiac catheterization. The treadmill stress test, which is also used to diagnose CAD, is painful for patients and causes them discomfort. Automatic CAD-diagnosing techniques based on machine learning algorithms and data mining methods have been developed to reduce the effort and time required of medical specialists, to lower costs and to save patients' lives. Two types of studies are found in the literature: some use recorded signals to identify CAD symptoms, for instance electrocardiography (ECG), photoplethysmography (PPG) and phonocardiography (PCG), while others use clinical parameters such as age, blood pressure and smoking habit to classify CAD patients.

In this research, the major focus is early detection, so it is very important not to miss even a small change in the heart signal. The major objective of our study is the automatic detection of CAD by applying a non-invasive tool to the ECG. PCA feature reduction is used in this study, together with 10 nonlinear features, which are more effective for early detection.

The proposed CAD diagnosis method is automatic and consists of two major steps: i) training and ii) testing. In the training stage, the nonlinear features are used to train the classifier. In the testing stage, the same features are extracted and serve as input to the previously trained classifier for automatic detection of CAD.


Faziludeen and Sankaran used KNN and SVM with a feature selection method based on an improved F-score measure. Their results indicated that KNN gave the best results with the selected features compared to SVM, in part because it is simple and easy to implement. A neural network (NN) is based on the structure and functions of biological neural networks, which are built from neurons. This algorithm computes a solution in a way similar to how the human brain works. In the literature, we observed that NNs have been used successfully in the prediction of cardiac abnormalities.

Giri performed an experiment with the least squares SVM (LS-SVM), which may be regarded as an improved and faster variant of SVM. The study used LS-SVM to train and test the parameters of the CAD risk index, and successfully classified the normal and CAD patients in the dataset with an accuracy of 99.72%.

Banerjee used an SVM with a Gaussian RBF kernel to classify CAD patients from the fingertip PPG signal. The main objective was to propose a low-cost, noninvasive screening system to detect CAD in ICU patients. The study reports remarkable sensitivity and specificity scores with the SVM classifier.

Davari Dolatabadi employed an SVM classifier and optimized its two parameters, namely cost and sigma (σ), which control the overfitting of the model and the degree of nonlinearity of the model, respectively. The study reveals that the proposed method uses a smaller number of parameters to obtain an accuracy of 99.2%.
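To illustrate how a sigma parameter governs the degree of nonlinearity, the following minimal sketch evaluates a Gaussian RBF kernel for two values of sigma. The parametrization exp(-||x - y||^2 / (2 sigma^2)) is a common convention and is assumed here; the cited study may define sigma differently, and the function name rbf_kernel and the sample points are purely illustrative.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel K(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).

    The parametrization is an assumption; some libraries use
    gamma = 1 / (2 * sigma^2) instead of sigma.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Two illustrative feature vectors (not taken from any data set).
a, b = [0.0, 0.0], [1.0, 1.0]

# Small sigma: similarity falls off quickly, giving a highly nonlinear boundary.
k_narrow = rbf_kernel(a, b, sigma=0.5)

# Large sigma: similarity falls off slowly, giving a smoother boundary.
k_wide = rbf_kernel(a, b, sigma=5.0)
```

With a small sigma only very close points are treated as similar, so the decision boundary can bend sharply around individual training samples. The cost parameter then limits how strongly the classifier is penalized for misclassified training points.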

Giri proposed three dimensionality reduction techniques, namely PCA, linear discriminant analysis (LDA) and independent component analysis (ICA). PCA calculates eigenvectors and projects the original data onto the directions sorted by eigenvalue, whereas ICA is another feature reduction method, which transforms a multivariate random signal into a signal whose components are mutually independent. LDA, in contrast, provides the highest separation between the classes present in the feature set. Feature selection techniques are used to detect the relevant attributes, which improves classifier accuracy.
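The idea behind PCA can be sketched in a few lines. The example below estimates only the first principal direction, and it uses power iteration on the sample covariance matrix instead of a full eigendecomposition; that substitution is made here for brevity and is not the method of the cited study. The function name and the toy data are assumptions for illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, variance) of the first principal component.

    rows: list of equal-length feature vectors (at least two rows).
    Uses power iteration, which converges to the dominant eigenvector
    unless the starting vector is exactly orthogonal to it.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient: variance captured along the direction v.
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance


# Toy data lying exactly on the line y = 2x (illustrative only).
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, variance = first_principal_component(rows)
```

Projecting each centered feature vector onto the returned direction gives a one-dimensional representation. Repeating the procedure on the residual would yield further components.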


Data set introduction

The data set for the CAD group was obtained from the Long-Term ST Database, which contains 86 lengthy ECG recordings of 80 human subjects. The subjects were 46 men, aged 42-85, and women, aged 23-87 years. The recordings were chosen to exhibit a variety of events of ST segment changes, including ischemic ST episodes, axis-related non-ischemic ST episodes, episodes of slow ST level drift and episodes containing mixtures of these events. Because this paper studies coronary artery disease, 35 subjects from this database who suffer only from CAD and 15 non-CAD subjects were chosen.


The ECG signals were first passed through a filter with a cut-off frequency of 0.006 Hz; this baseline correction removes the low-frequency components from the raw ECG signal. A 50 Hz notch filter is then applied to remove the power-line interference noise. Finally, the Pan-Tompkins algorithm is used to detect the R-peaks of the ECG signal and accurately extract the QRS complexes. The RR interval is calculated as the interval between two successive QRS complexes. Heart rate (in beats per minute) is calculated from the RR interval (in seconds) using:

HR_bpm = 60 / RR
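The conversion from detected R-peaks to heart rate can be sketched as follows. The R-peak positions and the 250 Hz sampling rate are hypothetical values chosen for illustration, and the function names are not taken from the study.

```python
def rr_intervals(r_peak_samples, fs):
    """Return RR intervals in seconds from R-peak sample indices."""
    if fs <= 0:
        raise ValueError("sampling rate must be positive")
    return [(b - a) / fs for a, b in zip(r_peak_samples, r_peak_samples[1:])]


def heart_rate_bpm(rr_seconds):
    """Convert RR intervals (seconds) to heart rate in beats per minute."""
    return [60.0 / rr for rr in rr_seconds]


# Hypothetical R-peak sample indices at an assumed sampling rate of 250 Hz.
peaks = [0, 200, 410, 610]
rr = rr_intervals(peaks, fs=250)  # RR intervals in seconds
hr = heart_rate_bpm(rr)           # one heart-rate value per RR interval
```

An RR interval of 0.8 s corresponds to 60 / 0.8 = 75 beats per minute.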

C. Feature Extraction

This section discusses the time and frequency domain features used to distinguish CAD patients from normal subjects.

Time domain features

Time domain features are derived from the RR intervals. The simplest features are the mean RR and the mean HR. The variability within the RR intervals is described by statistical features such as SDNN, SDSD, RMSSD and PNN50, and by the geometric features TINN and HRV triangular index. The definitions of these features are given in Table 1.

Feature                Description
SDNN                   Standard deviation of normal-to-normal (NN) R-R intervals
SDSD                   Standard deviation of successive RR interval differences
RMSSD                  Square root of the mean of the squared differences between adjacent NN intervals
PNN50                  Percentage of adjacent NN interval differences greater than 50 ms
TINN                   Baseline width of the RR histogram evaluated through triangular interpolation
HRV triangular index   Total number of NN intervals divided by the maximum of the NN interval histogram
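The statistical features in the table can be computed directly from a sequence of RR intervals. The sketch below is one possible implementation using only the Python standard library. It assumes intervals in milliseconds and uses the sample (n - 1) standard deviation, a choice the text does not specify. The geometric features (TINN and HRV triangular index) are omitted, and the input values are made up for illustration.

```python
import math
import statistics


def time_domain_features(rr_ms):
    """Compute basic statistical HRV features from RR intervals in ms."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    # Differences between successive intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_rr_ms": statistics.fmean(rr_ms),
        "mean_hr_bpm": statistics.fmean([60000.0 / rr for rr in rr_ms]),
        "sdnn_ms": statistics.stdev(rr_ms),    # sample standard deviation
        "sdsd_ms": statistics.stdev(diffs),
        "rmssd_ms": math.sqrt(statistics.fmean([d * d for d in diffs])),
        # Share of successive differences larger than 50 ms, in percent.
        "pnn50_pct": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }


# Illustrative RR intervals in milliseconds (not from the database).
features = time_domain_features([800, 810, 790, 860, 800])
```

For this input the successive differences are 10, -20, 70 and -60 ms, so two of the four exceed 50 ms in magnitude and PNN50 is 50%.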

