Ji-Ung Lee

Hi there! I am Ji-Ung, a postdoc in the research training group Neuroexplicit Models at Saarland University (Germany). My research revolves around efficient model training in natural language processing (NLP). This usually involves methods such as active learning, which handle low-resource scenarios by querying instances for users to label. I am also interested in (human) language learning, so my methods are often evaluated in the context of automated exercise generation and assessment. Finally, I am a big fan of user studies, having devised and conducted various evaluation studies involving citizen scientists. If you are interested in these research topics, feel free to drop me a message!
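
For readers less familiar with active learning, the sketch below shows a generic pool-based loop with uncertainty sampling: the model asks a user to label the instances it is least confident about. It is a toy illustration with scikit-learn, not code from any of my projects.

```python
# Toy illustration of pool-based active learning with uncertainty sampling
# (least confidence); the texts and labels below are made up.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def query_most_uncertain(model, X_pool, k=2):
    """Return indices of the k pool instances the model is least confident about."""
    probs = model.predict_proba(X_pool)
    confidence = probs.max(axis=1)      # confidence of the predicted label
    return np.argsort(confidence)[:k]   # least confident first

# A few labelled seed examples and an unlabelled pool.
labelled_texts = ["great exercise", "too hard", "well designed", "confusing task"]
labels = [1, 0, 1, 0]
pool_texts = ["nice and clear", "impossible to solve", "okay I guess", "brilliant"]

vectorizer = TfidfVectorizer().fit(labelled_texts + pool_texts)
model = LogisticRegression().fit(vectorizer.transform(labelled_texts), labels)

query_idx = query_most_uncertain(model, vectorizer.transform(pool_texts), k=2)
print("Ask the user to label:", [pool_texts[i] for i in query_idx])
```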

News

09-2025 - New Preprint (paper)

Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection? Yifan Wang, Mayank Jobanputra, Ji-Ung Lee, Soyoung Oh, Isabel Valera, Vera Demberg. 2025.

Abstract: Natural language processing (NLP) models often replicate or amplify social bias from training data, raising concerns about fairness. At the same time, their black-box nature makes it difficult for users to recognize biased predictions and for developers to effectively mitigate them. While some studies suggest that input-based explanations can help detect and mitigate bias, others question their reliability in ensuring fairness. Existing research on explainability in fair NLP has been predominantly qualitative, with limited large-scale quantitative analysis. In this work, we conduct the first systematic study of the relationship between explainability and fairness in hate speech detection, focusing on both encoder- and decoder-only models. We examine three key dimensions: (1) identifying biased predictions, (2) selecting fair models, and (3) mitigating bias during model training. Our findings show that input-based explanations can effectively detect biased predictions and serve as useful supervision for reducing bias during training, but they are unreliable for selecting fair models among candidates.
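
To make "input-based explanations" concrete, the sketch below computes gradient × input attributions for a sequence classifier with PyTorch and transformers. The checkpoint name is a generic placeholder whose classification head is randomly initialised, not one of the models studied in the paper, and gradient × input is only one of several attribution methods one could use here.

```python
# Minimal sketch of an input-based explanation (gradient x input) for a
# sequence classifier; the checkpoint is a placeholder, not a trained
# hate speech model from the paper.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "distilbert-base-uncased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
model.eval()

enc = tokenizer("some example sentence to explain", return_tensors="pt")

# Embed the tokens ourselves so we can differentiate w.r.t. the embeddings.
embeds = model.get_input_embeddings()(enc["input_ids"]).detach().requires_grad_(True)
logits = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"]).logits

# Gradient of the predicted class logit w.r.t. the input embeddings.
pred = logits.argmax(dim=-1).item()
logits[0, pred].backward()

# Gradient x input, summed over the embedding dimension: one score per token.
scores = (embeds.grad * embeds).sum(dim=-1).squeeze(0).detach()
for token, score in zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]), scores.tolist()):
    print(f"{token:>12s} {score:+.4f}")
```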

02-2025 - New Preprint (paper)

B-cos LM: Efficiently Transforming Pre-trained Language Models for Improved Explainability. Yifan Wang, Sukrut Rao, Ji-Ung Lee, Mayank Jobanputra, Vera Demberg. 2025.

Abstract: Post-hoc explanation methods for black-box models often struggle with faithfulness and human interpretability due to the lack of explainability in current neural models. Meanwhile, B-cos networks have been introduced to improve model explainability through architectural and computational adaptations, but their application has so far been limited to computer vision models and their associated training pipelines. In this work, we introduce B-cos LMs, i.e., B-cos networks empowered for NLP tasks. Our approach directly transforms pre-trained language models into B-cos LMs by combining B-cos conversion and task fine-tuning, improving efficiency compared to previous B-cos methods. Our automatic and human evaluation results demonstrate that B-cos LMs produce more faithful and human-interpretable explanations than post-hoc methods, while maintaining task performance comparable to conventional fine-tuning. Our in-depth analysis explores how B-cos LMs differ from conventionally fine-tuned models in their learning processes and explanation patterns. Finally, we provide practical guidelines for effectively building B-cos LMs based on our findings.
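
The underlying idea is easiest to see on a single layer: a B-cos linear layer rescales the usual linear response ŵᵀx by |cos(x, w)|^(B−1), so large outputs require the input to align with the weight vector. The snippet below is a simplified, generic sketch of such a layer, not the conversion code from the paper; the actual B-cos LM conversion involves further architectural changes plus task fine-tuning.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BcosLinear(nn.Module):
    """Simplified B-cos linear layer: the linear response is rescaled by
    |cos(x, w)|^(B-1), so large outputs require input-weight alignment."""

    def __init__(self, in_features: int, out_features: int, b: float = 2.0, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.b = b
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_hat = F.normalize(self.weight, dim=1)                  # unit-norm weight rows
        out = F.linear(x, w_hat)                                 # w_hat^T x
        cos = out / (x.norm(dim=-1, keepdim=True) + self.eps)    # cos(x, w)
        return out * cos.abs().pow(self.b - 1.0)                 # B-cos rescaling

# With b=1.0 the layer reduces to an ordinary bias-free linear map, which is
# what makes converting pre-trained linear layers plausible in the first place.
layer = BcosLinear(in_features=768, out_features=3, b=2.0)
print(layer(torch.randn(4, 768)).shape)  # torch.Size([4, 3])
```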

02-2025 - Full-day workshop on research data management (LinkedIn)

On February 17, I gave a workshop on research data management at our RTG. The workshop covered various aspects of collecting, processing, and storing research data, and also provided insights into how to conduct reproducible research, especially when working with neural models. It ended with a fruitful discussion in which we agreed on practical guidelines for our RTG.
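
The concrete guidelines we agreed on are internal to the RTG, but as a generic illustration of the reproducibility part: a typical first step when working with neural models is to pin down all sources of randomness, e.g. with a small helper such as the (hypothetical) one below.

```python
import os
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Seed the common sources of randomness in a PyTorch-based experiment."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade some speed for deterministic cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```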