ICT5356 Principles of Artificial Intelligence
You will research a goal, problem or task that could be solved using:
• Supervised machine learning; with
• A dataset that can be captured within a .csv file or similar that you are able to source; and with
• A simple machine learning algorithm (one of the algorithms that you will explore during Week 2)
In your application to SAINT, you will describe:
• The goal, problem or task
• The kind of data that will be used to train the machine
• An example of the kind of dataset that could be used (by creating a small sample yourself, or sourcing an appropriate sample online)
• The features that will be explored in the training
• The learning algorithm(s) that could be used to learn from the training data
• One or more working models trained on the dataset
Specifically, the machine learning model will be developed within the Orange software package. You are required to submit three items:
1. A document (500 words) that includes:
• A description of the goal, problem or task that you want to solve
• An explanation of how this goal, problem or task could be framed as a classification or regression task
• A training dataset that you will use to address this goal/problem/task - either link to an actual dataset, or provide a sample of data that you create
• An explanation of this dataset - the features included, the target variable, and how the data relates to the classification or regression task
• Any modifications or pre-processing you did for this training data, and why
• The machine learning algorithms you used for the goal/problem/task, and why
• Any hyperparameters you specified for your machine learning models
• What you achieved with the model - describe the predictions the model made on a small sample of data
2. The Orange project.
3. The model(s) developed.
The purpose of this project is to predict the occurrence of a heart attack using a supervised machine learning approach. Cardiovascular diseases (CVDs) are the leading cause of death worldwide, and heart attacks and strokes are the most common among them. The goal is to build a model that predicts the likelihood of a heart attack from a dataset of heart disease indicators. Such a model could support earlier diagnosis, which may help save lives and reduce the overall impact of heart disease.
The dataset used in this study was obtained from Kaggle and contains summary data for more than 1,300 people.
The dataset can be accessed from: <https://www.kaggle.com/datasets/bharath011/heart-disease-classification-dataset>
Figure 1: The Data Table
(Source: Self-Created)
The prediction of heart attacks can be framed as a classification problem. In this context, the machine learning model is trained to classify individuals into two categories, positive and negative, where “positive” means that the patient is likely to have a heart attack and “negative” means that the patient is unlikely to have one. Binary classification is appropriate because the model can learn, from the patterns in the provided records, how the input attributes relate to the outcome (Ratra and Gulia 2020).
Figure 2: Orange Workflow
(Source: Self-Created)
The dataset used in this project is sourced from Kaggle and contains 1,319 samples with nine fields: eight input fields and one output field. The input fields are:
• Age
• Gender (0 for Female, 1 for Male)
• Heart rate (impulse)
• Systolic BP (pressurehight)
• Diastolic BP (pressurelow)
• Blood sugar (glucose)
• CK-MB (kcm)
• Test-Troponin (troponin)
The dependent variable, or output field, is “class”, which indicates the occurrence of a heart attack. It is binary: “negative” means the patient has not had a heart attack and “positive” means the patient has had one.
While preparing the data, observations with missing values were removed to avoid unreliable results. This step is important because missing values can substantially harm model performance (El-Hasnony et al. 2022). The target column was set to “class”, and all other columns were set as features. This preprocessing cleans the dataset before it is passed to the model.
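The preprocessing described above can be sketched outside Orange with the Python standard library. This is a minimal illustration, not the workflow used in the project: the sample rows are made up for the example and are not taken from the Kaggle file, and the helper names `load_complete_rows` and `split_features_target` are hypothetical. Only the column names follow the field list given earlier.

```python
import csv
import io

# Hypothetical sample rows for illustration only; the values are made up
# and do not come from the Kaggle dataset. Two rows have a missing field.
SAMPLE_CSV = """age,gender,impulse,pressurehight,pressurelow,glucose,kcm,troponin,class
50,1,70,120,80,100,2.0,0.01,negative
60,0,85,140,90,150,5.0,1.2,positive
55,0,,125,82,110,2.1,0.004,negative
47,1,72,130,,95,1.3,0.9,positive
"""

TARGET = "class"


def load_complete_rows(text):
    """Parse CSV text and keep only rows in which every field is non-empty."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        row for row in reader
        if all((value or "").strip() for value in row.values())
    ]


def split_features_target(rows, target=TARGET):
    """Separate numeric feature vectors from the target label."""
    feature_names = [name for name in rows[0] if name != target]
    X = [[float(row[name]) for name in feature_names] for row in rows]
    y = [row[target] for row in rows]
    return feature_names, X, y


rows = load_complete_rows(SAMPLE_CSV)
feature_names, X, y = split_features_target(rows)
print(len(rows), "complete rows;", len(feature_names), "features")
```

Rows with any empty field are dropped first, and the “class” column is then separated from the eight feature columns, mirroring the two steps described in the text.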
The machine learning model adopted for this task is a decision tree classifier. The model parameters are:
• Pruning: at least five instances in each leaf, at least 10 instances in each internal node, and a maximum tree depth of 50.
• Splitting: stop splitting when the majority class reaches 90% (classification only).
• Binary trees: Yes
The decision tree was selected because it is interpretable and can handle both numerical and categorical variables (Gárate-Escamila et al. 2020). The specified hyperparameters limit overly deep, complex trees while still allowing the tree to grow beyond a trivially simple structure.
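To make the effect of these settings concrete, the following sketch expresses them as pre-pruning checks in plain Python. It is one reading of the listed parameters, not Orange's implementation: the function names are hypothetical, and the interpretation of "at least 10 in internal nodes" as "do not split a node with fewer than 10 instances" is an assumption.

```python
from collections import Counter

# Values taken from the parameter list above.
MIN_LEAF = 5          # minimum instances in each leaf
MIN_SPLIT = 10        # assumed: nodes with fewer instances are not split
MAX_DEPTH = 50        # maximum tree depth
MAJORITY_STOP = 0.90  # stop once the majority class reaches this share


def should_stop(labels, depth):
    """Return True if a node with these labels should become a leaf."""
    n = len(labels)
    if n < MIN_SPLIT:
        return True
    if depth >= MAX_DEPTH:
        return True
    majority_share = Counter(labels).most_common(1)[0][1] / n
    return majority_share >= MAJORITY_STOP


def split_allowed(left_labels, right_labels):
    """Return True if both children would satisfy the minimum leaf size."""
    return len(left_labels) >= MIN_LEAF and len(right_labels) >= MIN_LEAF


# A node with 19 positives and 1 negative is 95% pure, so growth stops there.
print(should_stop(["positive"] * 19 + ["negative"], depth=3))
```

Each check removes one way a tree can overfit: tiny leaves, splits on very small subsets, unbounded depth, and further splitting of nodes that are already nearly pure.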
The model was evaluated using the "Test and Score" widget in Orange, with the following results:
• AUC: 0.982
• CA: 0.986
• F1: 0.986
• Precision: 0.986
• Recall: 0.986
• MCC: 0.970
Figure 3: Evaluation Results
(Source: Self-Created)
From the "Predictions" widget, on a small sample of data, the following scores were obtained:
• AUC: 0.989
• CA: 0.987
• F1: 0.987
• Precision: 0.987
• Recall: 0.987
• MCC: 0.973
Figure 4: Model Performance Score
(Source: Self-Created)
The confusion matrix for the model is as follows:
• True Negative: 506
• False Negative: 3
• False Positive: 14
• True Positive: 796
Figure 5: Confusion Matrix
(Source: Self-Created)
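As a cross-check, the summary scores can be recomputed from the four confusion matrix counts. The sketch below uses the counts exactly as labelled above and the standard definitions of accuracy, sensitivity, specificity and the Matthews correlation coefficient (MCC); the function name `binary_metrics` is hypothetical, and how Orange averages precision, recall and F1 across classes is not reproduced here.

```python
import math

# Counts as reported in the confusion matrix above.
TN, FN, FP, TP = 506, 3, 14, 796


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary classification metrics from raw counts."""
    total = tp + tn + fp + fn
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "total": total,
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # recall for the positive class
        "specificity": tn / (tn + fp),  # recall for the negative class
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


metrics = binary_metrics(TP, TN, FP, FN)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}" if name != "total" else f"{name}: {value}")
```

With these counts the four cells sum to 1,319, and the recomputed accuracy and MCC round to 0.987 and 0.973, in line with the CA and MCC values reported for the predictions.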
The decision tree model performed well in predicting heart attack cases, with an AUC of 0.989. It achieved an accuracy of 0.987 and, based on the confusion matrix counts above, a specificity of about 0.973 (506 of 520 negative cases). These results indicate the model's potential to distinguish patients at risk of a heart attack from those who are not.
El-Hasnony, I.M., Elzeki, O.M., Alshehri, A. and Salem, H., 2022. Multi-label active learning-based machine learning model for heart disease prediction. Sensors, 22(3), p.1184.
Gárate-Escamila, A.K., El Hassani, A.H. and Andrès, E., 2020. Classification models for heart disease prediction using feature selection and PCA. Informatics in Medicine Unlocked, 19, p.100330.
Ratra, R. and Gulia, P., 2020. Experimental evaluation of open source data mining tools (WEKA and Orange). International Journal of Engineering Trends and Technology, 68(8), pp.30-35.