Comparative Analysis of the Support Vector Machine and K-Nearest Neighbor Algorithms for Air Quality Classification in Jakarta

Yudha Aryadi Sani
Budi Wasito

Abstract

Air pollution is one of the biggest environmental challenges worldwide, and Jakarta, the capital of Indonesia, is no exception. As one of the most populous cities in the world, Jakarta faces serious problems in ensuring healthy air quality for its residents. The impact of air pollution is not limited to human health; it also damages the environment as a whole, including plants and aquatic ecosystems. To address this issue, many studies have developed prediction models that estimate air quality levels in Jakarta. Two models that can be used in this context are Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). This study compares the performance of the SVM and KNN algorithms in classifying air quality in Jakarta. It applies data mining, the art and science of discovering knowledge, insights, and patterns in data, following the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology. The data source is the Jakarta air pollution standard index (ISPU) data obtained from the satudata.jakarta.go.id website. Both algorithms are implemented in Python and compared by their accuracy and confusion matrix scores. Classification results are presented through a graphical user interface (GUI). The comparison shows that SVM is superior to KNN: SVM with the RBF kernel reaches an accuracy of 97.05%, whereas KNN reaches 94.74% with parameters p = 1 and k = 5. SVM also produces more correct predictions than KNN.
Over the one-year span, overall air quality appears fairly good, falling mostly in the moderate category (index values of 50 to 100). However, control measures are still needed for areas in the unhealthy category.
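The abstract reports the KNN configuration (p = 1, k = 5) and the evaluation by accuracy and confusion matrix, but not the code or the feature set. The following is a minimal sketch of those two pieces in plain Python, not the authors' implementation. The two-feature readings, the category labels, and all function names are made up for illustration. The SVM side is omitted because it would need a library; scikit-learn's `SVC(kernel="rbf")` would be one option.

```python
from collections import Counter

def manhattan(a, b):
    """Minkowski distance with p = 1 (sum of absolute differences)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k=5):
    """Majority vote among the k training points closest to `query`."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: manhattan(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Hypothetical training data: two invented pollutant readings per sample.
train_X = [
    (20, 10), (25, 12), (30, 15), (35, 11), (28, 14), (22, 13),
    (60, 25), (70, 30), (80, 28), (65, 27), (75, 32), (90, 35),
    (120, 50), (130, 55), (150, 60), (140, 52), (125, 58), (160, 65),
]
train_y = ["good"] * 6 + ["moderate"] * 6 + ["unhealthy"] * 6

# Hypothetical held-out samples; the last one sits near a class boundary.
test_X = [(27, 12), (72, 29), (135, 54), (45, 20)]
test_y = ["good", "moderate", "unhealthy", "moderate"]

labels = ["good", "moderate", "unhealthy"]
predictions = [knn_predict(train_X, train_y, x, k=5) for x in test_X]
acc = accuracy(test_y, predictions)
cm = confusion_matrix(test_y, predictions, labels)

print("predictions:", predictions)
print("accuracy:", acc)
for label, row in zip(labels, cm):
    print(label, row)
```

On this toy data the boundary sample is assigned to "good" although its label is "moderate", so the confusion matrix shows one off-diagonal count. Counts of that kind are the "correct prediction" figures a comparison like the one above would tally for each model.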


How to Cite
Sani, Y. A., & Wasito, B. (2024). Analisis Komparasi Algoritma Support Vector Machine dan K-Nearest Neighbor pada Klasifikasi Kualitas Udara Kota Jakarta. Global Research on Economy, Business, Communication, and Information, 2(1), 55–72. https://doi.org/10.46806/grebuci.v2i1.1757
Section
Research Article
