Optimizing Predictions For Thyroid Disease Sufferers Using Correlation Matrix And Random Forest With Hyperparameter Tuning
DOI: https://doi.org/10.62205/mjgcs.v2i1.26
Keywords: Random Forest, Machine Learning, Optimizing, Prediction, Thyroid Disease
Abstract
Thyroid disease is one of the most common endocrine disorders and disrupts the body's hormone function and balance. Symptoms can include weight changes, fatigue, and problems with temperature regulation. Although its causes vary, thyroid disease can generally be treated with medication or other medical interventions. This study presents and optimizes a predictive model for thyroid disease patients by analyzing the correlations among the features and variables used and by evaluating how well the Random Forest method performs in optimizing predictions. Random Forest is one machine learning method suited to this task. The features used include age, gender, smoking history, radiotherapy history, and pathology characteristics. Hyperparameter tuning identified n_estimators = 100 and max_depth = 30 as the best parameters, and the model tuned with these values predicts the occurrence of thyroid disease with an accuracy of 95%.
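The workflow summarized in the abstract (a correlation matrix over the features, followed by a Random Forest tuned over n_estimators and max_depth) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data is synthetic, the column names, the integer encoding of categorical features, the parameter grid, the 80/20 split, and the 3-fold cross-validation are all assumptions, and the sketch makes no attempt to reproduce the reported 95% accuracy.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): the real study uses a thyroid
# disease dataset; these column names and encodings are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(15, 85, size=n),
    "gender": rng.integers(0, 2, size=n),
    "smoking_history": rng.integers(0, 2, size=n),
    "radiotherapy_history": rng.integers(0, 2, size=n),
    "pathology": rng.integers(0, 4, size=n),  # integer-coded category
})

# Hypothetical target that loosely depends on age and pathology,
# so that the correlation matrix has something to show.
score = (
    0.03 * (df["age"] - 50)
    + 0.8 * (df["pathology"] >= 2)
    + rng.normal(0, 0.5, size=n)
)
df["target"] = (score > 0.4).astype(int)

# Step 1: Pearson correlation matrix over all columns, then the
# correlation of each feature with the target, strongest first.
# Pearson correlation on integer-coded categories is only a rough guide.
corr = df.corr()
target_corr = corr["target"].drop("target").sort_values(
    key=lambda s: s.abs(), ascending=False
)

# Step 2: hold out a test set, then grid-search the two hyperparameters
# named in the abstract. The candidate values are assumptions.
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

param_grid = {"n_estimators": [50, 100], "max_depth": [10, 30]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Step 3: evaluate the refitted best model on the held-out test set.
test_accuracy = accuracy_score(y_test, search.predict(X_test))

print(target_corr)
print("best parameters:", search.best_params_)
print(f"test accuracy: {test_accuracy:.2f}")
```

GridSearchCV refits the best configuration on the full training split by default, so search.predict uses the tuned model directly. With real data, the categorical features would need to be encoded before either step, and the choice of encoding affects how the correlation matrix should be read.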
License
Copyright (c) 2025 Angelina, Nadhea Filosofia, Riyan Arga Wijaya
This work is licensed under a Creative Commons Attribution 4.0 International License.