IMPLEMENTATION OF A HYBRID FUZZY JARO-WINKLER AND COSINE SIMILARITY METHOD IN A QUR'ANIC VERSE SEARCH SYSTEM BASED ON LATIN TRANSLITERATION

Gempar Perkasa Tahir
Emil Agusalim Habi Talib
Fahrim Irhamna Rachman

Abstract

This research addresses the challenge of retrieving Qur’anic verses written in Latin transliteration, a task hindered by the absence of a standardized orthography and the resulting variety of spellings. The study designs and implements a hybrid information retrieval system that combines Fuzzy Jaro-Winkler for lexical similarity with Cosine Similarity over fine-tuned DistilBERT embeddings for semantic relevance. The workflow begins with preprocessing and normalization of the dataset, continues with initial candidate selection using Jaro-Winkler, and ends with reranking of those candidates by semantic similarity score. The system was evaluated with black-box testing across four scenarios: ideal queries, spelling variations, incomplete queries, and varying query lengths. Accuracy reached 96% for ideal queries and 92% for queries with spelling variations, and it improved as queries grew longer, reaching 96% for four-word queries. The hybrid approach bridges lexical and semantic gaps, outperforms single-method baselines, and remains robust to non-standard transliteration in Qur’anic text retrieval.
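
The two-stage workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the normalization rule, the `top_k` parameter, and the four sample transliterations are assumptions made for illustration, and character-bigram count vectors stand in for the fine-tuned DistilBERT embeddings so that the example runs with the Python standard library alone.

```python
import re
from collections import Counter
from math import sqrt


def normalize(text: str) -> str:
    """Lowercase, drop non-letters, collapse whitespace (assumed normalization)."""
    return " ".join(re.sub(r"[^a-z\s]", "", text.lower()).split())


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro-Winkler: Jaro plus a bonus for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def bigram_vector(text: str) -> Counter:
    """Toy stand-in for an embedding: character-bigram counts (NOT DistilBERT)."""
    padded = f" {text} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, verses: dict, top_k: int = 2) -> list:
    """Stage 1: keep top_k verses by Jaro-Winkler. Stage 2: rerank by cosine."""
    q = normalize(query)
    by_lexical = sorted(
        verses, key=lambda vid: jaro_winkler(q, normalize(verses[vid])), reverse=True
    )
    q_vec = bigram_vector(q)
    scored = [
        (vid, cosine(q_vec, bigram_vector(normalize(verses[vid]))))
        for vid in by_lexical[:top_k]
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative transliterations only; spellings are one of many possible variants.
VERSES = {
    "1:1": "bismillahir rahmanir rahim",
    "1:2": "alhamdu lillahi rabbil alamin",
    "1:3": "ar rahmanir rahim",
    "1:4": "maliki yaumid din",
}

if __name__ == "__main__":
    print(search("alhamdulilah robbil alamin", VERSES))
```

In this sketch the misspelled query still ranks verse 1:2 first, because Jaro-Winkler tolerates the character-level differences when selecting candidates. Replacing `bigram_vector` with sentence embeddings from a fine-tuned model would bring the reranking stage closer to the semantic scoring the abstract describes.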

How to Cite
Tahir, G. P., Habi Talib, E. A., & Rachman, F. I. (2025). IMPLEMENTASI METODE HYBRID FUZZY JARO WINKLER DAN COSINE SIMILARITY PADA SISTEM PENCARIAN AYAT AL-QURAN BERBASIS TRANSLITERASI LATIN. Jurnal Informatika Progres, 17(2), 105-115. https://doi.org/10.56708/progres.v17i2.482
