Ethical AI Through Bias Mitigation in Large Language Models: A Review

Authors

  • Anindita Chakraborty
  • Sampurna Mandal
  • Suvojit Mukhopadhyay
  • Tiyasa Saha
  • Durjay Barman
  • Partha Sarothi Roy
  • Gulshan Kumar Sinha

DOI:

https://doi.org/10.64252/9nfa0e86

Keywords:

Large Language Models (LLMs), Bias Detection, Bias Mitigation, Fairness in AI, Responsible AI, Natural Language Processing (NLP), Ethical AI, Alignment Strategies.

Abstract

Large Language Models (LLMs) are transforming natural language processing with applications in healthcare, education, recruitment, and civic information. Despite their benefits, LLMs risk amplifying social biases present in training data, producing unfair or harmful outcomes across dimensions such as gender, race, nationality, religion, and disability. This paper surveys the origins, manifestations, and mitigation of bias in LLMs. We review historical evidence from word embeddings to foundation models, highlighting how stereotypes persist through representation, training, and deployment. Mitigation strategies are analyzed at multiple levels: data-centric methods such as balancing and counterfactual augmentation; objective-level fairness constraints; post-training alignment through reinforcement learning and constitutional principles; and inference-time safeguards, including safety classifiers and constrained decoding. We also discuss evaluation approaches like red-teaming and multilingual audits. Finally, challenges such as data gaps, cultural variation, and fairness–accuracy trade-offs are outlined, with future directions for causal fairness, adaptive safety, and cross-cultural generalization.
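As a concrete illustration of one data-centric method surveyed here, the sketch below shows counterfactual data augmentation in its simplest form: each training sentence is paired with a copy in which gendered terms are swapped, so both variants appear in the training data. The term pairs and the augment helper are illustrative assumptions, not code from any of the works reviewed; production pipelines use curated lexicons and part-of-speech tagging to resolve ambiguous words such as "her".

    # Minimal sketch of counterfactual data augmentation (CDA).
    # The term pairs below are illustrative; "her" is mapped to "him" even
    # though it may be possessive ("his"), an ambiguity real pipelines
    # resolve with part-of-speech tagging.
    import re

    PAIRS = {"he": "she", "she": "he", "him": "her", "her": "him",
             "his": "her", "man": "woman", "woman": "man"}
    PATTERN = re.compile(r"\b(" + "|".join(PAIRS) + r")\b", re.IGNORECASE)

    def counterfactual(sentence: str) -> str:
        """Swap each gendered term for its counterpart, preserving case."""
        def swap(m: re.Match) -> str:
            word = m.group(0)
            repl = PAIRS[word.lower()]
            return repl.capitalize() if word[0].isupper() else repl
        return PATTERN.sub(swap, sentence)

    def augment(corpus: list[str]) -> list[str]:
        """Pair every sentence with its counterfactual variant."""
        return [s for orig in corpus for s in (orig, counterfactual(orig))]

    print(augment(["He is a brilliant doctor."]))
    # ['He is a brilliant doctor.', 'She is a brilliant doctor.']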


Published

2025-05-17

Issue

Section

Articles

How to Cite

“Ethical AI Through Bias Mitigation in Large Language Models: A Review”. (2025). International Journal of Environmental Sciences, 3521-3526. https://doi.org/10.64252/9nfa0e86