IDENTIFICATION OF SOCIOECONOMIC BIAS IN INDONESIAN AI LANGUAGE MODELS THROUGH ETHICAL PROBING


Fadilah Zahra Dwi Kinanti
Ari Maulida Aprilia
Aldila Rachma Aulia
Anis Nadhirotul Mustafida
Dicky Anggriawan Nugroho

Abstract

This study evaluates socioeconomic bias in three large language models (LLMs) that support Indonesian (Nusantara, IndoGPT, and SEA-LION) using an ethical probing approach. A total of 100 short narrative prompts (4–11 words) were compiled to represent issues of poverty, informal employment, access to education, and regional contexts. Each model output was analyzed along five key indicators: emotional valence, stereotypes, narrative themes, framing, and deontic indicators. The results show that all three models tend to produce neutral responses, particularly SEA-LION, which produced the highest proportion of neutral outputs. However, stereotypes still appeared at almost the same rate across all models, indicating that a neutral tone does not guarantee bias-free output. IndoGPT showed the heaviest use of normative language, while Nusantara more often displayed structural framing and empathetic nuance. In contrast, SEA-LION was the most consistent in maintaining neutrality, though without eliminating implicit stereotypical tendencies. These findings confirm that socioeconomic bias in Indonesian-language LLMs still surfaces subtly through deterministic narratives, generalizations, and framing that normalizes the vulnerability of low-income groups. The study provides an initial overview of the direction of generative bias in Indonesian LLMs and highlights the need for broader datasets, stricter annotation methods, and continuous evaluation as steps toward fairer models.
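To make the probing workflow concrete, the sketch below shows one way the evaluation loop summarized above could be organized. It is a minimal illustration under stated assumptions, not the authors' implementation: the generate callables, the annotate function, and the label vocabulary are hypothetical placeholders standing in for the actual model interfaces and the (likely manual) coding of the five indicators.

    # Hypothetical sketch of an ethical-probing loop like the one described
    # in the abstract. generate() stands in for whatever inference interface
    # each model (Nusantara, IndoGPT, SEA-LION) exposes, and annotate()
    # stands in for the coding of each output on the five indicators.
    from collections import Counter
    from typing import Callable

    INDICATORS = ["valence", "stereotype", "theme", "framing", "deontic"]

    def probe(models: dict[str, Callable[[str], str]],
              prompts: list[str],
              annotate: Callable[[str], dict[str, str]]) -> dict[str, Counter]:
        """Run every prompt through every model and tally indicator labels."""
        tallies: dict[str, Counter] = {name: Counter() for name in models}
        for name, generate in models.items():
            for prompt in prompts:
                output = generate(prompt)      # model completion for this probe
                labels = annotate(output)      # e.g. {"valence": "neutral", ...}
                for indicator in INDICATORS:
                    tallies[name][(indicator, labels[indicator])] += 1
        return tallies

    def neutral_proportion(tally: Counter, n_prompts: int) -> float:
        """Share of a model's outputs coded as emotionally neutral."""
        return tally[("valence", "neutral")] / n_prompts

With the study's design of 100 prompts per model, neutral_proportion simply divides the count of outputs coded as emotionally neutral by 100; the abstract reports that this share was highest for SEA-LION.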


How to Cite
Dwi Kinanti, F. Z., Aprilia, A. M., Aulia, A. R., Mustafida, A. N., & Nugroho, D. A. (2026). IDENTIFIKASI BIAS SOSIAL EKONOMI DALAM MODEL BAHASA AI INDONESIA MELALUI ETHICAL PROBING. Jurnal Informatika Progres, 18(1), 30–40. https://doi.org/10.56708/progres.v18i1.508

[19] Smith, E. M., Hall, M., Pappas, N., & Williams, A. (2022). “I’m sorry to hear that”: Finding new biases in language models with a holistic descriptor dataset. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 9180–9211. https://doi.org/10.18653/v1/2022.emnlp-main.625